
End of Mainstream Support for Microsoft SQL Server 2008 and 2008 R2



@DBA_ANDY: A reminder to all #sqlserver peeps out there and *especially* to our clients: #sql2008 #sql20008R2 http://blogs.msdn.com/b/sqlreleaseservices/archive/2014/07/09/end-of-mainstream-support-for-sql-server-2008-and-sql-server-2008-r2.aspx#Ntirety


--

What does this mean?

The key takeaway comes from this section of the MSDN blog:

For both SQL Server 2008 and SQL Server 2008 R2, Microsoft will continue to provide technical support which also includes security updates during the duration of extended support.  See the table below for extended support end date.  Non-security hotfixes for these versions will be offered only to customers who have an Extended Hotfix Support agreement.  Please refer to Extended Hotfix Support – Microsoft for more information.

Microsoft uses a standard cycle of mainstream support/extended support for almost all of their products, as described at http://support.microsoft.com/lifecycle/default.aspx?LN=en-us&x=14&y=6

The biggest difference between mainstream and extended support is that during the extended support period the only true support that occurs for the product is security fixes (which very rarely occur for Microsoft SQL Server) and paid support (aka 1-900-Microsoft).  There are no service packs/cumulative updates/functionality changes/online support for a product once it enters extended support.  If you pay for an Extended Support contract (I can only think of one shop I have worked with that does pay for it) then you get some added support beyond security patches, but not much.

What do we need to do?

As described in the MSDN blog, you should plan to get off Microsoft SQL Server 2008 and 2008 R2 ASAP.  Generally Microsoft keeps a product in mainstream support until the second subsequent version becomes GA (Generally Available).  In this case, now that SQL Server 2012 *and* 2014 are out, 2008/2008 R2 are now “minus two” versions and have therefore gone out of mainstream support.
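A quick way to inventory where a given instance stands (version, service pack level, and edition) is a short SERVERPROPERTY query - a minimal sketch you can run against each server as you build your upgrade list:

SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion, -- e.g. 10.50.xxxx = 2008 R2, 10.0.xxxx = 2008
       SERVERPROPERTY('ProductLevel') AS ProductLevel,     -- RTM or SP level
       SERVERPROPERTY('Edition') AS Edition;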

In many cases it is possible to piggyback on top of a hardware refresh project or virtualization project to retire as much older SQL Server as you can.

I still run lots of even older versions – SQL Server 7.0/2000/2005 – why should I care?

Many people don’t care, but I have found it is a good general IT policy to run everything – hardware/software/etc. – within vendor-supported timelines.  Not to make the vendors $$$, but for the supportability of the product *and* for the functionality/performance of the product.

Sure, your SQL Server 2005 on Windows Server 2003 runs your database (kind of) but what would you do if the server blew its motherboard – do you have a spare eight-year-old board in stock?  What about the software (Windows/SQL Server/etc.) – each new version adds increased performance options (compression, availability groups, etc.) and improved manageability (new dynamic management views (DMV’s), Extended Events, improved tools, etc.) – what could your staff be doing if they didn’t need to continue to hand-hold those old servers?

My application vendor doesn’t support SQL Server 2012/2014 yet!

This is one of the most frequently cited catches to advancing software, especially database platforms.  In most cases it is possible to upgrade/replace your existing server with a new current-version Windows Server/SQL Server and run your “unsupported” application databases under a down-level compatibility level (for example, running the database under SQL Server 2008 compatibility (100) on a SQL Server 2012 (110) instance).  This option needs to be tested before it is implemented.
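For example, after migrating a database to a SQL Server 2012 instance you can leave it at the 2008/2008 R2 behavior with a one-line change - a minimal sketch (the database name is a placeholder, and as noted above this should be tested first):

-- keep the database behaving like SQL Server 2008/2008 R2 (compatibility level 100)
ALTER DATABASE MyAppDatabase SET COMPATIBILITY_LEVEL = 100;

-- verify the current setting
SELECT name, compatibility_level FROM sys.databases WHERE name = N'MyAppDatabase';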

Where am I going to get the $$$$ to do this?

I can’t help you much there other than to refer back to my earlier point about the old server and its eight-year-old motherboard – quantify the cost to your management of the system going down, possibly for days on end, while you search eBay and Craigslist for replacement hardware (this isn’t a joke – I worked in one shop that had to go through this).


This is where many DBAs (and DBA managers) get hung up on the idea of SQL Server backups.

**IMPORTANT NOTE - you absolutely need to have SQL Server backups of all of your systems with only limited exceptions – even DEV and TEST since they are functionally PROD for your developers and QA staff**

Having said that, a wonderful stack of SQL Server backups shipped securely to your offsite facility doesn’t save you from the failed eight-year-old motherboard scenario – best case you could spin up a VMware/HyperV/etc. virtual server and restore the databases there, but do you still have all of the necessary Windows/SQL Server/service pack installation media to even install Windows 2003 SP2 and SQL Server 2005 SP4CU3 patched with MS12-070?


--

I wanted to put this out there because I know that for many of you SQL Server 2008 and 2008 R2 are your base versions – there are some SQL Server 2012’s out there (and more than a few 2005’s and 2000’s) but most servers I deal with day to day are 2008/2008 R2.

Pause and consider what I have said and work together with your team and your management in the coming months to get this done.
 


Log Shipping - Exclusive access could not be obtained because the database is in use

This story began as they almost always do - with a page at 4am local time...

--

'Alert: SQL Failed Jobs on InstanceName is Failed'

Job name: LSRestore_InstanceName_DatabaseName
Date & Message: 2014-07-31 05:24:00.0 The job failed. The Job was invoked by User DOMAIN\LOGIN. The last step to run was step 1 (Log shipping restore log job step.).


--

Sigh.  

I am not a fan of log shipping but I do acknowledge that it has its place.  In this case the client is using it to have a read-only standby database on a Standard Edition SQL Server (hence no mirroring+snapshot or Availability Group with readable secondary).  The database in question is large (>1TB) but with relatively small daily churn (~5GB of LOG backups per day).  Due to the traffic on the standby database during the day, they have log shipping on this database configured to only run restores overnight, keeping the standby database static and available for their users during the business day.

By the time I received the page, the LSRestore job had been failing for much of the night, with service desk'ers trying various things to resolve the issue.

When I signed on to the system I drilled into the job step history and found the following:

--


Microsoft (R) SQL Server Log Shipping Agent  [Assembly Version = 10.0.0.0, File Version = 10.50.1600.1 ((KJ_RTM).100402-1539 )]  Microsoft Corporation. All rights reserved.

2014-07-31 03:45:00.32 ----- START OF TRANSACTION LOG RESTORE -----
2014-07-31 03:45:00.39 Starting transaction log restore. Secondary ID: '8f45136f-fb73-4815-b849-c7c1b391831b'
2014-07-31 03:45:00.39 Retrieving restore settings. Secondary ID: '8f45136f-fb73-4815-b849-c7c1b391831b'
2014-07-31 03:45:00.41 Retrieved common restore settings. Primary Server: 'InstanceName', Primary Database: 'DatabaseName', Backup Destination Directory: 'H:\LogShip\DatabaseName', File Retention Period: 1440 minute(s)
2014-07-31 03:45:00.42 Retrieved database restore settings. Secondary Database: 'DatabaseName', Restore Delay: 10, Restore All: True, Restore Mode: Standby, Disconnect Users: False, Last Restored File: H:\LogShip\DatabaseName\DatabaseName_20140730071500.trn, Block Size: Not Specified, Buffer Count: Not Specified, Max Transfer Size: Not Specified

2014-07-31 03:45:20.77 *** Error: Could not apply log backup file 'H:\LogShip\DatabaseName\DatabaseName_20140730073000.trn' to secondary database 'DatabaseName'.(Microsoft.SqlServer.Management.LogShipping) ***
2014-07-31 03:45:20.77 *** Error: Exclusive access could not be obtained because the database is in use.  RESTORE LOG is terminating abnormally.(.Net SqlClient Data Provider) ***

2014-07-31 03:45:21.05 *** Error: The log backup file 'H:\LogShip\DatabaseName\DatabaseName_20140730073000.trn' was verified but could not be applied to secondary database 'DatabaseName'.(Microsoft.SqlServer.Management.LogShipping) ***
2014-07-31 03:45:21.05 Deleting old log backup files. Primary Database: 'DatabaseName'
2014-07-31 03:45:21.06 The restore operation completed with errors. Secondary ID: '8f45136f-fb73-4815-b849-c7c1b391831b'
2014-07-31 03:45:21.06 ----- END OF TRANSACTION LOG RESTORE -----

Exit Status: 1 (Error)


 --

Seeing the "Exclusive access could not be obtained because the database is in use" error (buried in the middle of the messages), I went looking for something connected to the DatabaseName database.  Sure enough there it was:



spid:          69
blocked:       0
dbid:          10
uid:           1
login_time:    7/30/2014 11:07
last_batch:    7/30/2014 11:07
status:        sleeping
hostname:      ServerName
program_name:  Microsoft SQL Server Management Studio - Query
hostprocess:   5404
cmd:           AWAITING COMMAND
loginame:      Login1
stmt_start:    0
stmt_end:      0
request_id:    0


Basically someone opened a Management Studio window on InstanceName as login Login1 at 11:07am the previous day, ran one or two queries (last_batch only one second after the login_time), and then left the window open, maintaining a connection to the DatabaseName database.

Log Shipping requires exclusive access (as noted in the error) to apply the log backups, so this single Management Studio connection was effectively breaking log shipping.

Because the SPID was "AWAITING COMMAND" with a last_batch time from the previous day, I went ahead and killed SPID 69 and manually started the "LSRestore_InstanceName_DatabaseName" job.  The job completed successfully, restoring the last day's worth of logs in about ten minutes.
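For reference, the cleanup boiled down to two statements - the SPID and job name are from this incident and will obviously differ on your system:

-- kill the idle session holding a connection to the standby database
KILL 69;

-- manually run the log shipping restore job to catch the standby back up
EXEC msdb.dbo.sp_start_job @job_name = N'LSRestore_InstanceName_DatabaseName';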

--

How could this be prevented?  What was configured incorrectly?

When setting up a log shipping secondary as a read-only standby, there is an option to disconnect any existing connections before attempting log shipping restores.  This option is available whether you use the Management Studio GUI or T-SQL commands.

In Management Studio, the option is "Disconnect users in the database before restoring backups."  This option is on the "Restore Transaction Log" tab on the "Secondary Database Settings" window after you select to "Add" a secondary database.  The option is only exposed if you select Standby mode - there is no need for the option in No Recovery mode since users won't be able to connect to the secondary database anyway: 



Via T-SQL, the option is a parameter of the "sp_add_log_shipping_secondary_database" stored procedure.  Set "@disconnect_users = 1" to enable the disconnect functionality.
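A trimmed-down sketch of that call on the secondary server (most parameters omitted, and the names are the same placeholders used above - in practice the log shipping wizard generates a much longer script):

EXEC sp_add_log_shipping_secondary_database
    @secondary_database = N'DatabaseName',
    @primary_server = N'InstanceName',
    @primary_database = N'DatabaseName',
    @restore_delay = 10,
    @restore_mode = 1,        -- 1 = STANDBY (read-only)
    @disconnect_users = 1;    -- disconnect existing connections before each restore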

IMPORTANT NOTE - as seen in the screenshot above, this option is not enabled by default in Management Studio, and it is also not enabled by default by the stored procedure (the default value is 0).  If you wish to use this functionality you need to go "out of your way" to turn it on.  Be aware of the impact of this option *before* turning it on.  If the usage pattern/business rules of your standby database are such that queries against the standby are more important than successful restores, do *not* enable this option, as it will terminate your queries with extreme prejudice.

http://sd.keepcalm-o-matic.co.uk/i/keep-calm-and-terminate-with-extreme-prejudice.png
Hope this helps!








POSTPONED to 02/11 - Speaking at the Omaha SQL Server User Group *Tomorrow* Night!

UPDATE - due to weather conditions today (snow this AM and sub-zero cold tonight) we are postponed to next Wednesday 02/11 still at Farm Credit - I will be giving this same talk and we hope to see you there!

Please RSVP at this new link: http://omahamtg.com/Events.aspx?ID=268

Thanks!

--

I will be giving my talk on "Performing a SQL Server Health Check" tomorrow night at our local SQL User Group.  We will be at Farm Credit Services of America at 6pm.  You can RSVP at this link:

http://www.omahasql.com/2015/01/28/omaha-ssug-february-2015-performing-a-sql-server-health-check/

Hope to see you there!



Availability Groups - Where Did My Disks Go?



The TL;DR - beware of Failover Cluster Manager trying to steal your non-shared storage!

--
At a client recently two availability groups on a single Windows cluster went down simultaneously.  Apparently the server that was the primary for the AGs (Server1) had mysteriously lost its DATA and LOG drives.  By the time the client got us involved they had faked the application into coming up by pointing it at the single SQL Server instance that was still up (Server2) directly via the instance name rather than the availability group listeners.

I found that two of the drives on Server1 had gone offline, causing the issues – sample errors from the Windows System and Application Logs respectively:



--

Log Name:      System
Source:        Microsoft-Windows-FailoverClustering
Date:          3/31/2015 2:36:25 PM
Event ID:      1635
Task Category: Resource Control Manager
Level:         Information
Keywords:     
User:          SYSTEM
Computer:      server1.mydomain.com
Description:
Cluster resource 'Cluster Disk 2' of type 'Physical Disk' in clustered role 'Available Storage' failed.

--

     Log Name:      Application
Source:        MSSQLSERVER
Date:          3/31/2015 2:36:25 PM
Event ID:      9001
Task Category: Server
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      server1.mydomain.com
Description:
The log for database 'Database1' is not available. Check the event log for related error messages. Resolve any errors and restart the database.

--



Since this is an Availability Group (AG) I was surprised that there were “Cluster Disk” resources at all – AG’s do not rely on shared disk (it is one of their many advantages) and most AG clusters don’t have any shared disk at all (occasionally a quorum drive).



This is what I saw in Failover Cluster Manager:



Cluster Disk 1 was the Quorum, but the presence of disks 2-7 did not make sense to me in a regular AG arrangement.  The two disks that were online (Disk 6 and Disk 7) were the two disks that were currently “live” on Server2, but there was still no reason for them to be in Failover Cluster Manager.

The service provider assured me that none of the drives except the Quorum are presented to more than one server from the back-end storage.

There was one reported event that happened at 2:36pm, a time that coincided with the failures – the client added a new node Server3 to the cluster (it was evicted 30 minutes later with no further impact positive or negative).

My best theory at this point was that when the engineer tried to add Server3 to the Windows cluster they mistakenly tried to add the disks as Cluster Disk resources – for a traditional SQL Server Failover Cluster Instance (FCI) this would be correct – for a SQL FCI almost all disk is shared and all nodes need to have access to all of the shared disk (although only one node can “own” it at any one time).

A cluster will “lock” disks – if cluster MySuperHugeAndAmazingCluster01 owns a particular drive then no other server or cluster can use it – the only way for a server to access it is through the cluster.  I considered that this may be the cause of the issue – even though several of the drives were flagged with “clustered storage is not connected to the node,” this may simply have been because the storage wasn’t presented to Server2, the current “owner” of the Cluster Disk objects.

--

After an application downtime was scheduled, I signed on to the server, saved the AG settings for later re-creation, deleted the AGs, and shut down SQL Server.  I then dropped six of the seven cluster disk objects (all of them except the Quorum object), which meant I needed to rescan disks in the Computer Management console on each server.  This did indeed return control of the “missing” drives to the servers.  It also validated that the only reason things had been working on Server2 was because the cluster thought that Server2 owned the disk objects (my guess is that the Add Node wizard to add Server3 to the cluster the other day was probably run from Server2 rather than Server1 – more to follow on that).

I recreated the two AGs and as a final step I performed a failover test of each of the two availability groups from Server2 to Server1 and back again, so that at the end of the process Server2 was the primary for both availability groups.  Making Server2 the primary was necessary because of the changes the client had made to the front-end applications and processes to get them to work – they had redirected the applications to talk directly to Server2 rather than to the two availability group names (this works since the availability group name is really just a redirect to the server name/IP itself under the hood).  A final step for the client was to redirect the apps to once again talk to the availability group listeners.

I then added the new node (Server3) to the cluster and stepping through the Add Node wizard showed me the likely cause of the original issue (below).
 
As of the end of the call, the client was satisfied with the end state – SQL Servers online, availability groups running, and new cluster node added.

--

Here is what *I* learned today, brought to light through adding the new node and what was almost certainly the cause of the problem:


As I noticed when adding Server3 to the cluster, on the Confirmation screen of the Add Node wizard in Windows Server 2012 there is a check box to “Add all eligible storage to the cluster” – by default it is *CHECKED*.

As described here by Clustering MVP David Bermingham, this can really cause problems:

On the confirmation screen you will see the name and IP address you selected. You will also see an option which is new with Windows Server 2012 failover clustering…”Add all eligible storage to the cluster”. Personally I’m not sure why this is selected by default, as this option can really confuse things. By default, this selection will add all shared storage (if you have it configured) to the cluster, but I have also seen it add just local, non-shared disks, to the cluster as well. I suppose they want to make it easy to support symmetric storage, but generally any host based or array based replication solutions are going to have some pretty specific instructions on how to add symmetric storage to the cluster and generally this option to add all disks to the cluster is more of a hindrance than a help when it comes to asymmetric storage. For our case, since I have no shared storage configured and I don’t want the cluster adding any local disks to the cluster for me automatically I have unchecked the Add all eligible storage to the cluster option.

(emphasis mine)

I have seen a cluster disk object reserve/”lock” a resource so that the actual servers can’t access it other than through the cluster, but I hadn’t run into this specific situation (the check box) before.  The above explanation from David shows the most likely reason *why* this happened in this case – with the offending box checked by default, whoever was adding the node probably clicked right past it, and when the process to actually add the node started, it grabbed all of the storage for the cluster, locking everybody out.  This would have impacted Server3 as well, but since it was a new server with no user databases (or anything else) on its D: and E: drives, unless someone was looking in My Computer and saw the drives disappear, there wouldn’t be any immediately apparent problem on that server.

The reason I believe the Add Node wizard was run from Server2 (not that it is important, just explanatory) is that the disk objects showed as being owned by Server2.  Since Server2 owned the cluster disk objects, it could still access them, which is why it was able to keep accessing its user databases on the two drives.

--

At the end of the day, if you are working on a cluster with no shared storage, make sure to uncheck the "Add all eligible storage to the cluster" check box - and even if you do have storage, it may not be a bad practice to uncheck the box - it isn't that hard to add the disks manually afterward, and it makes your cluster creation process consistent.

--
BONUS - I am not a PowerShell-freak myself (I keep telling myself I need to become one since #YouCanDoAnythingWithPowerShell) but if you like PS, the Add-ClusterNode cmdlet has a flag that is functionally equivalent to unchecking the box:
PS C:\> Add-ClusterNode -Name Server3 -NoStorage
 --

#TheMoreYouKnow


 

Deleting Files Older than X Hours with PowerShell

(aka "OMG I can't believe I am actually finally writing a #PowerShell blog post").

--

I currently have a situation at a client where we are running a server-side trace that is generating 6GB-7GB of trc files per day while we are watching to see what might be causing a server issue.

I need to keep enough trace files that if something happens late in the day or overnight we can run it down, but not so many as to have 6GB+ of files all of the time.

In the past for file cleanups I have relied on the ForFiles command (modified from the one used by Ola Hallengren in his Output File Cleanup job):

cmd /q /c "For /F "tokens=1 delims=" %v In ('ForFiles /P "$(ESCAPE_SQUOTE(SQLLOGDIR))" /m *_*_*_*.txt /d -30 2^>^&1') do if EXIST "$(ESCAPE_SQUOTE(SQLLOGDIR))"\%v echo del "$(ESCAPE_SQUOTE(SQLLOGDIR))"\%v& del "$(ESCAPE_SQUOTE(SQLLOGDIR))"\%v"
The problem in this case is that ForFiles takes a parameter in *days* (/d) and in my case I really want to delete files older than 18 *hours*.

Looking for another solution I figured there had to be a PowerShell solution (Remember - #YouCanDoAnythingWithPowerShell) and after some digging I found someone using the Get-ChildItem and Remove-Item cmdlets to do pretty much what I was looking for:

Get-ChildItem $env:temp/myfiles | where {$_.Lastwritetime -lt (date).addminutes(-15)} | remove-item
http://cornasdf.blogspot.com/2010/03/powershell-delete-files-older-than-x.html

This sample would remove anything from temp\myfiles older than 15 minutes.  By hardcoding the -path parameter rather than relying on the $env variable and changing addminutes to addhours I was able to accomplish my goal:

Get-ChildItem -path D:\AndyG\Traces | where {$_.Lastwritetime -lt (date).addhours(-18)} | remove-item
This command "gets" each item in the given path with a modification date-time older than 18 hours and then removes/deletes them from the folder.

After testing this in a PS window, I tried it as a PowerShell job step in a SQL Server Agent job and it worked!  (Of course, the account running the job step needs to have the necessary permissions on the D:\AndyG\Traces folder to perform the item deletes).
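If you would rather script that Agent job than build it in the GUI, a minimal sketch looks like this (the job and step names are made up for illustration, and the schedule is omitted):

USE msdb;
GO

EXEC dbo.sp_add_job
    @job_name = N'Cleanup - Old Trace Files';

EXEC dbo.sp_add_jobstep
    @job_name = N'Cleanup - Old Trace Files',
    @step_name = N'Delete trc files older than 18 hours',
    @subsystem = N'PowerShell',
    @command = N'Get-ChildItem -path D:\AndyG\Traces | where {$_.LastWriteTime -lt (Get-Date).AddHours(-18)} | Remove-Item';

-- register the job on the local server so it will actually run
EXEC dbo.sp_add_jobserver
    @job_name = N'Cleanup - Old Trace Files';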

Moving forward, I may abandon the ForFiles approach in favor of this whenever possible - it is simpler, and PowerShell seems to be more and more what makes the world go around - hope this helps!


PASS Summit 2015 Recap

Last week was the annual PASS Summit in Seattle.  The Summit is the predominant educational and networking event for SQL Server professionals of all types – DBA’s, Developers, Analysts, and many more.

I wasn’t sure I was going to be able to attend this year but after some gyrations with a friend and my employer I was able to make it work, and it was definitely worth it, as always.

In no particular order here are my top moments from the week:

  • The Day One Keynote – not for the Microsoft message, but for the introduction from outgoing PASS President Tom LaRock (blog/@SQLRockstar).  I have admired Tom for some time for the effort he is able to put into his job as a technical evangelist (Head Geek) at Solarwinds while still being married with children *and* maintaining a strong presence as a member of the #sqlfamily and the overall SQL Server community.  Being the president of PASS is no small task, with no pay and often little appreciation, and it was obvious as Tom described the journey he has taken just how invested he is in that service to the community.

 

  • The Day Two Keynote – Dr. David DeWitt and Dr. Rimma Nehme of the Microsoft Jim Gray Systems Lab never disappoint, always delivering a presentation that is almost academic – there is never any hard marketing aside from a brief mention of their project of the moment (such as Polybase) but even that is usually a tongue-in-cheek reference.  This year’s talk was on the “Internet of Things” #IoT – describing how the world is relying on more and more items with internet connectivity, such as smartphones and fitness trackers.  They described how the data generated by these objects is resulting in a data explosion that will be an overload for business intelligence and how we as data professionals need to be ready for that volume.  The slides from the talk are available on the Gray Systems Lab website.  They also made a sad announcement that they would no longer be giving PASS Summit keynotes.  Dr. DeWitt is retiring (at least mostly retiring) and Dr. Nehme is moving on to other things.  They will be sorely missed by the PASS community and Microsoft has a very high bar to clear for whomever comes next year.



  • Bob Ward’s session – Bob (blog/@bobwardms) is the “Chief Technology Officer, CSS Americas” for Microsoft’s Customer Support Services (CSS, the artist formerly known as PSS) and he always melts our brains with a deep dive internals session on a chosen area of SQL Server.  This year it was “Inside Wait Types, Latches, and Spinlocks” and was booked for a double session at the end of the day.  Bob’s session is another one of those that is always about technology and *not* marketing or fluff.  It is marked as 500-level (off the charts) and he isn’t kidding – about a third of the audience didn’t come back after the mid-session break.  I learned a lot but I was definitely in a daze at the end.





  • Slava Murygin’s photos – Slava (blog/@SlavaSQL) did an amazing job documenting the Summit with a wide array of “day in the life” style pictures from the keynotes, sessions, exhibit hall, and a variety of other locations.  He posted them on his blog at http://slavasql.blogspot.com/ and I have referenced several of them in this post.  If you haven’t already looked at the pictures you need to check them out – and if you don’t already follow Slava on Twitter, do that too! :)


  • Reconnecting with #sqlfamily – one of the top parts of any Summit is seeing people that I deal with all of the time online but who live all over the world.  From former co-workers that live in town (but that I still rarely see) to new friends from half a world away, there are hundreds of friends new and old that I only see at the Summit (and there are far too many to list here).  Part of #sqlfamily is caring for each other and for others and meeting each other in person just reinforces that.  Two items this year are #sqlcares to support the National MS Society and #ArgenisWithoutBorders to support Doctors without Borders.  This second item especially highlighted how many of us are willing to make fools of ourselves to help draw donations in – this happened at the Summit in response to the $25,000 that was donated:








Erin and her crowd of fans after her session.


I didn’t get a good picture of Kendal presenting but loved his opening slide. :)

All in all it was a great experience as always – registration is already open for next fall!



T-SQL Tuesday #72 - Implicit Conversion Problems


This month T-SQL Tuesday is hosted by Mickey Steuwe (blog/@SQLMickey) and her topic of choice is “Data Modeling Gone Wrong.”

(If you don’t know about T-SQL Tuesday check out the information here – each month there is a new topic and it is a great excuse to write each month (and to start writing!) because someone offers a topic, so you already have taken the first step!).

My first memory of a data modeling problem comes down to data types and the process of choosing appropriate data types - not because of storage concerns (although there can be problems) but because of actual performance risks.

When the data type of a field in a table doesn’t precisely match the datatype of your query (or parameter of your stored procedure) a conversion occurs.  Conversions can be implicit (automatic) or explicit (requiring an inline CONVERT or CAST statement in the query).


Data type conversion table

As you can see, a conversion from a CHAR to an NVARCHAR is “implicit” – so no big deal, right?

WRONG!

When I started as a DBA, we designed a system for payment processing for a university debit card system (an internal system, not a VISA/MC branded card at that time).  Most of us that were involved were relatively new to the design process, so we decided to use NVARCHAR for all of our character fields.  We decided it would be useful in a couple of ways:
  • It was a standard – everything was the same (my 16-year-experienced self now knows that isn’t a great definition of “standard”)
  • It allowed us to handle any eventual contingency – maybe we would need Unicode data, right?

At the beginning there wasn’t any visible impact – we only had a few thousand accounts and didn’t notice any issues.

Then we got to Fall Rush…

If you are familiar with American Universities, you know that campus bookstores make 80%-90% of their profit during the first week of each semester and often the week before.  (This is probably true at many international universities as well.) One of the primary uses of our internal card was textbook purchasing, so we were slammed during this time period, both in terms of overall quantity of transactions and rapidity of transactions during the business day.

When we hit this period, we saw performance dip. At first we assumed it was just load (remember this was SQL Server 7.0 and then 2000) but we quickly realized that there was slag in our queries that wasn’t needed – we saw this in our query plan:


Why did we have a SCAN instead of a SEEK?

In a modern query plan this is visible in the predicate as a CONVERT_IMPLICIT:


Even in our inexperienced state we knew one of those key rules of query optimizations:

SCAN BAD! (In your best Frankenstein voice)
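If you want to gauge how widespread implicit conversions are on a server before changing anything, a rough sweep of the plan cache for CONVERT_IMPLICIT looks something like this (a sketch only, not from the original incident - it can be expensive against a large plan cache, so run it with care):

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
    cp.usecounts,
    st.text AS query_text,
    qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE qp.query_plan.exist('//ScalarOperator[contains(@ScalarString, "CONVERT_IMPLICIT")]') = 1
ORDER BY cp.usecounts DESC;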

There are two ways to deal with an issue like this – you can fix the design of the table, or you can modify your code to perform an explicit CAST or CONVERT of the statement.  The best fix (“best”) is to fix the design of the table – Aaron Bertrand (blog/@AaronBertrand) has a good list of suggestions here for fields that can often be set to a smaller data type.

If you can’t change the table design (as you very often can’t) the other option is to modify the code to prevent an implicit conversion.  The first method is like this:

DECLARE @CardID as NVARCHAR(10)
SET @CardID = '123ABC456'
SELECT CardID, LastName, FirstName, Balance
FROM Credit
WHERE CardID = @CardID

In this case the CardID field in the Credit table is of the NVARCHAR data type (as in my debit card scenario above).  To get rid of the implicit conversion I am using a variable that is NVARCHAR to “pre-perform” the conversion so that by the time the query runs it is comparing apples to apples (or in this case NVARCHAR to NVARCHAR).

Another way to accomplish this is by using a CAST/CONVERT, like this:

SELECT CardID, LastName, FirstName, Balance
FROM Credit
WHERE CardID = CAST('value' as NVARCHAR)

--

This seems like a minor detail but it can be a very important one, especially in systems with lots of rows and/or high transaction rates.

--

Your homework - there is a great test harness for this problem created by Jonathan Kehayias (blog/@sqlpoolboy) in a blog post at SQLPerformance.com including charts that quantify the performance hit and CPU costs. Check it out!






SQL Server Training

Brent Ozar’s group (http://www.brentozar.com/) just announced a pretty crazy set of “Black Friday” deals:


Insane sales if you can be one of the first few at midnight, but discounts all day on various training.

--

The other top training in the field (to me) is from Kimberly Tripp & Paul Randal (& company) at SQLskills (https://www.sqlskills.com/).  

UPDATE - I originally said SQLskills wasn't offering discounts right now but their CEO Paul Randal corrected me:


Actually we do have discounts right now - up to $200 off every 2016 class if you register before 2016. Thanks


Sorry Paul :)


Make sure to include this as well:


--

None of the above training is cheap, but it is pretty much the only in-person training (as opposed to conferences) I would consider spending money on for SQL Server.  In-person classes from Microsoft and smaller providers have never provided me with much value, and even less so as a Senior DBA.

--

The conference options are the PASS Summit (http://www.sqlpass.org/summit/2016/About.aspx - what I attended last week) and SQL Intersection (formerly Connections - https://devintersection.com/SQL-Conference/#), which is usually held in Vegas around this time of year.  The Summit is put on by the non-profit PASS (formerly the Professional Association for SQL Server) and Intersection is put on by a for-profit group that runs a variety of Intersection conferences (Dev, SharePoint, etc.).

I haven’t been to Intersection so I can’t objectively compare one to the other, but the Summit definitely gets a broader array of speakers and content, while Intersection is curated to a specifically narrow group of speakers, always of the MVP/MCM level.

--

I would be remiss to omit PASS SQLSaturdays - this is free training by the same speakers that present at the PASS Summit and the other conferences. (Disclaimer - there is often a small fee for lunch.)  There are one- and two-day SQLSaturdays all over the world, often more than one event per week!  Check out the list at: http://www.sqlsaturday.com/events.aspx

--

I always feel I learn more in person, but failing that, here are some other options:

For Purchase:

Pluralsight – for a monthly membership fee you get access to all of the recorded courses Pluralsight has online, and it is quite a large library – the SQL Server courses are by a wide array of individuals but many of them are from the folks at SQLskills (see above).


FREE:

PASS Virtual Chapters – there are currently 27 virtual chapters that provide monthly or semi-monthly free webcasts on a variety of topics - http://www.sqlpass.org/PASSChapters/VirtualChapters.aspx

Vendors – the major vendors each provide some variant of regular free webcasts – some of it is marketing for their products (obviously) but there is often good content along with it – sometimes you do have to register, but at the end of the day if you have ever done anything online as a SQL Server person odds are these companies already have your info:


SQLbits – the major European SQL Server conference is SQLbits, often held in the UK.  One of the plusses to SQLbits over the PASS Summit is that they post most if not all of their recorded sessions online for free!

--

Of course these are just training content in the form of presentations, etc.  There are always blogs (and they are just as important – content produced every day from SQL Server Pros all over the world). 

Start with Tom LaRock’s (SQLRockstar) list here: http://thomaslarock.com/rankings/ but make sure you read Tom too: http://thomaslarock.com/blog/

If you are looking for more, SQL Server Central (owned by Redgate) publishes a strong list at http://www.sqlservercentral.com/blogs/

--

I apologize to anyone I omitted and all opinions are my own as to who provides top content in their arenas - thanks!




CHECKDB - The database could not be exclusively locked to perform the operation

…so the job failed….again…you know the one – that darn Integrity Check job: 
Executing query "DECLARE @Guid UNIQUEIDENTIFIER EXECUTE msdb..sp_ma...".: 100% complete  End Progress
Error: 2015-11-15 03:27:52.43     Code: 0xC002F210     Source: User Databases Integrity Check Execute SQL Task
Description: Executing the query "DBCC CHECKDB(N'master')  WITH NO_INFOMSGS" failed with the following error: "The database could not be exclusively locked to perform the operation.  Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details."

This particular SQL Server 2014 Standard server is relatively new at a client that still uses old style maintenance plans (haven’t yet been able to convince them to #GoOla).  My first instinct in these cases is always that there is a problem related to the maintenance plan itself rather than the database or the integrity check command.  Since the failure is on the small yet unbelievably important master database I decided to just run the command from the query window to find out… 
DBCC CHECKDB(N'master') WITH NO_INFOMSGS

Msg 5030, Level 16, State 12, Line 1
The database could not be exclusively locked to perform the operation.
Msg 7926, Level 16, State 1, Line 1
Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details.

I have a small private user database on the system so I tried CHECKDB there: 
DBCC CHECKDB(N'Ntirety') WITH NO_INFOMSGS

Command(s) completed successfully.

OK….now what?

A Google of the “CHECKDB master could not be exclusively locked” brought back a short list of possibles which turned into a strand of spaghetti through the usual suspects – ServerFault, StackExchange, support.microsoft.com…before I found a question on SimpleTalk that sounded pretty spot on and referenced the most reliable source on CHECKDB I know, Paul Randal (blog/@PaulRandal).

Paul’s blog post “Issues around DBCC CHECKDB and the use of hidden database snapshots” discusses the need to have certain permissions to be able to create the snapshot CHECKDB uses.  I checked the DATA directory and the SQL Server default path and found that the service account did have Full Control to those locations.

What happened next ultimately resolved my issue, and it reflects something I constantly tell people when they ask me how I research things relatively quickly (most of the time anyway :)) – whenever you read a blog post or article about a subject, MAKE SURE TO READ THE FOLLOW-UP COMMENTS!  Sometimes they are nothing beyond “Great Article!” but quite often there are questions and answers between readers and the author that add important extra information to the topic, or just “Don’t Forget This!” style comments that add more detail.

In this case, one of the commenters said this: 
Brandon Tucker says:
August 27, 2013 at 2:19 pm
Ran into some issues and did some more debugging. Check
https://connect.microsoft.com/SQLServer/feedback/details/798675/2008-r2-engine-dbcc-checkdb-fails-against-databases-on-drives-that-dont-have-certain-root-permissions 

BINGO!

The Connect item talks about how the questioner was trying to use minimum permissions for their SQL Server service account and found that it broke the CHECKDB for their system databases in the same fashion as what I was seeing.  The agreed-upon fix was to add READ permission (just READ, not WRITE or anything higher) to the root level of the drive - not the folders, but the actual root of the drive.  Most of the respondents talked about adding the permission to ServerName\Users, while one person mentioned just adding it for the service account.

I checked the root directory of the DATA drive (where the data files for both the system and user databases reside) and found that it had been stripped down to Administrators and a couple of system logins (Windows Server 2012 R2).
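If you aren't sure which account the SQL Server service runs under (and therefore which account needs the READ permission), you can check from T-SQL on SQL Server 2008 R2 SP1 and later:

SELECT servicename, service_account
FROM sys.dm_server_services;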

I added the read permission to the root of the drive for my SQL Server service account, and: 
DBCC CHECKDB(N'master')  WITH NO_INFOMSGS
 Command(s) completed successfully.

…and all was right with the world.

A subsequent attempt to run the offending Integrity Check job succeeded cleanly as well.

--

This is not the first time I have seen something like this and I always have to question the level of zeal with which some security admins/teams pursue this.  In my experience most shops (by far) have insufficient security, while the remaining shops have way too much, functionally breaking the systems by keeping them so secure that *nobody* can access them.

There are only a few people out there that seem to have that happy balance figured out where data is secure but people can still do their jobs and customers can use the systems.

In general my best experience is not to mess with permissions of the SQL Server account for files/folders/drives that are relevant to SQL Server.  If a folder is in one of the MSSQL hierarchies, let the install set the permissions and then LEAVE THEM ALONE!

Your SQL Server service account doesn’t need to be a Windows administrator, or even have that elevated of general permissions – but if the file or folder is created or managed by SQL Server, leave the service account alone.


As always, my $.02 – hope this helps!

We Are All Responsible


Today we all received an email from PASS's president Tom LaRock (blog/@SQLRockstar):


PASS Anti-Harassment Policy Reminder
It is unfortunate that I have to write this letter, but it has become necessary.
An Anti-Harassment Policy (AHP) was implemented in 2012 to help provide a harassment-free conference experience. Everyone attending PASS Summit is expected to follow the AHP at all times, including at PASS-sponsored social events. This includes, but is not limited to, PASS staff, exhibitors, speakers, attendees, and anyone affiliated with the event.
This year at PASS Summit I served on the Anti-Harassment Review Committee. As such, it was my responsibility to help review complaints and incidents reported during Summit. The PASS Summit experience should be enjoyable, exciting, and safe for everyone at all times. However, I am disappointed to say that PASS Summit was a negative experience for a few members of our #SQLFamily this year.
I expect Summit attendees and PASS members to treat others with respect at all times, whether that is inside a session room, the conference hallway, a PASS sponsored event, or even at a local coffee shop.
On a positive note, there were people actively using the policy this year and supporting one another onsite as well. I am proud to see that our community has embraced having this policy. It is wonderful to know that our #SQLFamily will not put up with these types of incidents.
If you have concerns or want to report an incident within the guidelines of the AHP, I encourage you to contact governance@sqlpass.org.
Thomas LaRock
PASS President

--


It is sad to me that we still live in a world where it is necessary to remind people of things like this.  I am not deluded that events like those in this post by Wendy Pastrick (blog/@wendy_dance) will never happen, but it is still disappointing every time I hear about one.

No one - regardless of gender/ethnicity/anything else - should have to suffer from an uncomfortable environment or even worse outright physical or mental abuse, especially in a professional setting.  When something like this does happen it is on all of us to speak up to try to end the situation and to prevent it from recurring.

I commit to personally working harder at this - when we see something unacceptable happen, Speak Up!

At the end of the day, we are all responsible for ourselves *and* for each other in our #sqlfamily.

As always, my $.02

--

Also - read Erin Stellato's (blog/@erinstellato) excellent post here giving *her* $.02.


What Do You Mean There is No Current Database Backup?!?

Yet another late night/early morning tale...

The LOG backups for multiple databases on a server failed overnight (not all databases, but many of them), and looking in the output file from the LiteSpeed Maintenance plan (the client's tool of choice) I found entries like this:

Database Database01: Transaction Log Backup...
Destination: "I:\MSSQL\Transaction_Log_Backups\Database01\Database01_tlog_201511211012.TRN"

SQL Litespeed™ Version 4.8.4.00086
Copyright (C) 2004-2006, Quest Software Inc.
Quest Software.
Registered Name: ServerA

Msg 62309, Level 19, State 1, Line 0
SQL Server has returned a failure message to LiteSpeed which has prevented the operation from succeeding.

The following message is not a LiteSpeed message. Please refer to SQL Server books online or Microsoft technical support for a solution:
BACKUP LOG is terminating abnormally.
BACKUP LOG cannot be performed because there is no current database backup.

I checked the FULL backup job, and it had completed normally that night.

Looking back in the SQL Server Error Log I saw that the problems started at or around 9 pm local server time the previous night – when I looked at that time I found paired messages like these for *all* of the databases - not just the offending ones:

Date       11/20/2015 9:08:18 PM
Log        SQL Server (Current - 11/21/2015 10:16:00 AM)
Source     spid110
Message    Setting database option RECOVERY to SIMPLE for database Database01

--

Date       11/20/2015 9:08:22 PM
Log        SQL Server (Current - 11/21/2015 10:16:00 AM)
Source     spid111
Message    Setting database option RECOVERY to FULL for database Database01.

These messages showed that a process of some kind ran just after 9 pm that switched the databases from FULL recovery to SIMPLE and then back again.  This broke the LOG recovery chain and required new FULL backups before any LOG backups could succeed, which is why the LOG backup job was failing.

The regular nightly FULL backup job on this server runs at 6 pm and normally takes 3-4 hours to complete (that night it took 4.25) – the databases that had backups completed prior to the FULL>>SIMPLE>>FULL switch were those that were failing – the databases that didn’t have backups until after the switch were still OK because they had a FULL *after* the switch to re-start their LOG recovery chain.
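A quick way to see which databases were caught on the wrong side of a switch like this is to compare the most recent FULL and LOG backup times per database from msdb - a rough sketch:

SELECT d.name,
       d.recovery_model_desc,
       MAX(CASE WHEN b.type = 'D' THEN b.backup_finish_date END) AS last_full_backup,
       MAX(CASE WHEN b.type = 'L' THEN b.backup_finish_date END) AS last_log_backup
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b ON b.database_name = d.name
GROUP BY d.name, d.recovery_model_desc
ORDER BY d.name;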

The fix in this case was to run new FULL backups of the impacted databases, which thankfully I was able to do without much impact during the morning.  (Although DIFF backups could have worked too, as discussed here by Paul Randal (blog/@PaulRandal).)

I never could get the users to admit who had made a change (although mysteriously it hasn't happened again :))

--

The cautionary tale part of this relates to the recovery switch.  It is not uncommon to see recommendations to switch the recovery model prior to a large data load, or even a nightly import/export process, as part of transaction LOG management (although depending on your operations, a switch to BULK_LOGGED can be just as effective without breaking the LOG chain - there are point-in-time recovery issues for times *during* any actual minimally logged operations).  I am not a fan of switching like this, but it is out there.
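If you do need to minimize logging for a nightly load, a hedged sketch of the BULK_LOGGED approach (the database name and backup path are placeholders) looks like this - note the LOG chain stays intact, unlike a switch to SIMPLE:

ALTER DATABASE MyDatabase SET RECOVERY BULK_LOGGED;

-- perform the bulk load here (BULK INSERT, SELECT INTO, index rebuilds, etc.)

ALTER DATABASE MyDatabase SET RECOVERY FULL;

-- take a LOG backup right away so the minimally logged window is bounded
BACKUP LOG MyDatabase TO DISK = N'I:\MSSQL\Transaction_Log_Backups\MyDatabase_postload.trn';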

The takeaway is that you need to plan your backup/restore schedule accordingly - if you use a recovery model switch mechanism make sure to schedule your FULL backups *after* the recovery switch so that this problem doesn't bite you.

Also - always scan vendor-provided scripts for recovery model changes - we all know it is good practice to backup databases before and after major changes (we know that, right?) but it also often doesn't happen for one reason or another.  If a script changes the recovery model to SIMPLE and back, you will have this problem if you don't take a new FULL or DIFF backup after the script runs.

Hope this helps!




T-SQL Tuesday #73 - Not Running CHECKDB is Naughty!


This month T-SQL Tuesday is hosted by Bradley Balls (blog/@SQLBalls) and his topic of choice is “Naughty or Nice.”

(If you don’t know about T-SQL Tuesday check out the information here – each month there is a new topic and it is a great excuse to write each month (and to start writing!) because someone offers a topic, so you already have taken the first step!).

--

Probably the number one problem (Naughty, Naughty!) I find when I run across a new server is DBCC CHECKDB - or rather, the complete lack of CHECKDB.


http://i.memecaptain.com/gend_images/fEN0ew.png

I have heard any number of excuses over the years, but they are almost all some variant of this:

It takes too much time/CPU/RAM/IO/Wheatons to run CHECKDB on my databases.

To this I say one thing - <BLEEP!>

At the end of the day you can't afford *not* to run CHECKDB - you just can't.  Your database objects are only as reliable as your last clean CHECKDB (meaning your last CHECKALLOC + CHECKCATALOG + individual CHECKTABLE's on all of your tables - but more on that in a moment).

If you are not running regular CHECKDB, you may have endless unknown corruption across your databases, and you won't find out until someone tries to access the data.

Ask yourself - wouldn't you rather find out about data corruption from a regular maintenance process than from the Senior Vice President when she tries to run her Management Big Bonuses Report and it fails?
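Not sure when (or whether) CHECKDB last completed cleanly on a database?  The last-known-good date is stored on the boot page and can be read with the undocumented (but widely used) DBCC DBINFO - a quick sketch:

-- look for the row where Field = 'dbi_dbccLastKnownGood' in the output
DBCC DBINFO ('YourDatabase') WITH TABLERESULTS;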

My recommendation is always to run CHECKDB as often as is plausible, with a practical line of at most once per day (there is no need to run CHECKDB every minute just because you can).

If you can't run daily CHECKDB's, then run it weekly (or even monthly), but run it.

One thing I have done at multiple clients that have not had "sufficient" resources to run full CHECKDB on a regular basis is to split the job up into its component parts mentioned above.  This comes from an idea originally given by Paul Randal (blog/@PaulRandal) and works out like this:

First, analyze the size of your tables to create grouping of comparable sizes.  I like a personal variant of the script at http://stackoverflow.com/questions/15896564/get-table-and-index-storage-size-in-sql-server from "marc_s" that breaks out the objects:
SELECT
s.Name AS SchemaName,
t.NAME AS TableName,
p.rows AS RowCounts,
SUM(a.total_pages) * 8 AS TotalSpaceKB,
SUM(a.used_pages) * 8 AS UsedSpaceKB,
(SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB
FROM sys.tables t
INNER JOIN sys.schemas s
ON s.schema_id = t.schema_id
INNER JOIN sys.indexes i
ON t.OBJECT_ID = i.object_id
INNER JOIN sys.partitions p
ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a
ON p.partition_id = a.container_id
WHERE
t.NAME NOT LIKE 'dt%'   /* filter out system tables for diagramming */
AND t.is_ms_shipped = 0
AND i.OBJECT_ID > 255
GROUP BY t.Name, s.Name, p.Rows
ORDER BY TotalSpaceKB desc
Next, after you have divided the tables into two to four groups, schedule them into jobs running CHECKTABLE individually on groups of tables, with a final job running a catch-all step for all tables not explicitly checked - something like this (I like Ola Hallengren maintenance so this syntax is from there, but the concept will be visible even if you aren't familiar with his code):


Job 1 - run CHECKTABLE on three big tables in BigDatabase01:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'BigDatabase01',
@CheckCommands = 'CHECKTABLE',
@Objects = 'BigDatabase01.dbo.BigTable01, BigDatabase01.dbo.BigTable02, BigDatabase01.dbo.BigTable03',
@LogToTable = 'Y'

Job 2 - run CHECKTABLE on the next two big tables in BigDatabase01:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'BigDatabase01',
@CheckCommands = 'CHECKTABLE',
@Objects = 'BigDatabase01.dbo.BigTable04, BigDatabase01.dbo.BigTable05',
@LogToTable = 'Y'

Job 3 - Step 1 - run CHECKTABLE on all tables in BigDatabase01 except the five big tables from Jobs 1 and 2 - this "catch-all" step is very important because you need to make sure new tables added to the database get checked - if all you have are steps running against individual tables, new tables never get checked!
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'BigDatabase01',
@CheckCommands = 'CHECKTABLE',
@Objects = 'ALL_OBJECTS, -BigDatabase01.dbo.BigTable01, -BigDatabase01.dbo.BigTable02, -BigDatabase01.dbo.BigTable03, -BigDatabase01.dbo.BigTable04, -BigDatabase01.dbo.BigTable05',
@LogToTable = 'Y'

Job 3 - Step 2 - run CHECKTABLE on all tables in all databases except BigDatabase01 - the same "catch-all" logic applies here - if you run checks against individual named databases (rather than all databases except BigDatabase01) then new databases don't get picked up:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'USER_DATABASES, -BigDatabase01',
@CheckCommands = 'CHECKTABLE',
@Objects = 'ALL_OBJECTS',
@LogToTable = 'Y'

Job 3 - Step 3 - run CHECKALLOC on all databases:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'USER_DATABASES',
@CheckCommands = 'CHECKALLOC',
@LogToTable = 'Y'

Job 3 - Step 4 - run CHECKCATALOG on all databases:
EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'USER_DATABASES',
@CheckCommands = 'CHECKCATALOG',
@LogToTable = 'Y'

Over the course of three jobs (usually scheduled over three nights), this plan runs CHECKTABLE on all tables in all databases, runs CHECKALLOC on all databases, and runs CHECKCATALOG on all databases - BINGO! CHECKDB!

This is a relatively simple example with a server that has one large database with multiple large tables. (Of course, this is a fairly common situation on many of these "can't CHECKDB" servers.)  The same principle - divide the CHECKDB operation into component parts - allows me to scheduled CHECKDB on multiple systems when a client says "oh we can *never* run CHECKDB here."

This same principle of exclusion is useful to start running CHECKDB (or its components) on systems that have large databases or objects even if you can't cover the offending objects yet.  Usually when I find one of these servers, there are no CHECKDB's in place on *any* databases even though the problem is ReallyBigTable01 in BiggestDatabase01.  A great first step is to set up CHECKDB or its components on all objects except the offending one(s) - as above, set up CHECKTABLE on everything on the server except ReallyBigTable01 and also CHECKCATALOG and CHECKALLOC everywhere, and you have much more protection while you figure out what to do about ReallyBigTable01.

--

At the end of the day - stay off the naughty list - 
RUN CHECKDB!


http://sugarscape.cdnds.net/14/50/980x490/nrm_1418137741-santa-naughty.jpg





What Permissions Does This Login Have?

I recently was tasked with this ticket:
Please add new login Domain\Bob to server MyServer.  Grant the login the same permissions as Domain\Mary.
On the face of it, this seems relatively straightforward, right?  It is the kind of request that we all get from time to time, whether as an ad-hoc task or as part of a larger project, such as a migration.

The catch of course is that it isn't that easy - how do you know what permissions Mary has?

  • Is Mary a member of any server-level roles?
  • What specific individual server permissions does she have?
  • What database(s) is she a member of?  What database role(s) is she in?
  • What specific object(s) does she have permissions to? (this is often the killer)
Each of these items can be manually checked, of course - you can browse to Security>>Logins and right-click on Mary to see what server-level roles she is in and which databases she has access to, and then you can browse to each database and check what roles she is in and which objects she has explicit permissions on....

Sounds like fun, right?

As with all things SQL, almost anything you can do via the GUI can also be done programmatically (although it can be ugly).  
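For example, the server-level pieces of the question can be pulled from the catalog views with something like this (a minimal sketch using the Domain\Mary login from the ticket above - the full script below goes much further):

-- server-level role memberships for the login
SELECT r.name AS server_role
FROM sys.server_role_members AS srm
JOIN sys.server_principals AS r ON r.principal_id = srm.role_principal_id
JOIN sys.server_principals AS m ON m.principal_id = srm.member_principal_id
WHERE m.name = N'Domain\Mary';

-- explicit server-level permissions granted/denied to the login
SELECT p.state_desc, p.permission_name
FROM sys.server_permissions AS p
JOIN sys.server_principals AS m ON m.principal_id = p.grantee_principal_id
WHERE m.name = N'Domain\Mary';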

If you read my blog with any regularity you know that I am a firm believer in building on the work of others while granting credit where due - that is, there is no need to reinvent the wheel when you have a freely available example to start with.  Over time I have compiled links in my personal store to code created by other people for each of the components of the task at hand.  Each of these gave me part of the answer to my question about Mary, but why not roll them all together?  After some brief digging I couldn't find someone who had taken that next step.

I took the code from each of the three sources above (thanks Kendal, Phillip, and Wayne!) and modified them to play nicely together, including wrapping them in sp_msforeachdb as relevant and adding WHERE clauses to filter for an individual login.

I considered what I might use this code for, and then took one further step - I took the code from sp_help_revlogin (thanks Microsoft!) and added it to the start of my new code block.  This allows me to script the login itself at the beginning in case I want to transfer the login and its permissions to a new server (or replace them if something goes wrong on the original server!)

--

In my Bob and Mary case, I ran the code, and ran a find/replace for Mary>>Bob - this gave me a script to create Domain\Bob and grant him all of the same role memberships and permissions as Domain\Mary.

One note - if you run this for SQL logins (rather than Windows logins) you have the issue of the login SID (Security Identifier).  One of the big bonuses of using sp_help_revlogin is that it scripts out the SID of the login, so that when you re-create that login it maintains its SID and all of the related security chains stay intact.  Of course, for what we are doing in this example you wouldn't want that - if Mary and Bob were SQL logins, you would want to take one additional step in the final script and edit the CREATE LOGIN statement to remove the SID clause (since your new login Bob couldn't use Mary's SID).
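
As a hypothetical illustration (made-up password and SID, simplified from the real sp_help_revlogin output, which uses a hashed password), that edit looks like this:

/* hypothetical generated statement for Mary - the SID clause pins the original login's SID */
CREATE LOGIN [Mary] WITH PASSWORD = 'ChangeMe123!', SID = 0x1A2B3C4D5E6F708192A3B4C5D6E7F809, DEFAULT_DATABASE = [master];

/* edited copy for Bob - name changed and SID clause removed so SQL Server generates a new SID */
CREATE LOGIN [Bob] WITH PASSWORD = 'ChangeMe123!', DEFAULT_DATABASE = [master];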

--

The resulting script takes an input parameter of a login name (or NULL) and then outputs CREATE/GRANT/etc. commands for the login, role memberships, and object permissions of that login (or of all logins if NULL is specified).
  • First, sp_help_revlogin (and sp_hexadecimal) is created if it doesn't exist - if it does exist it is ALTER'ed with this recent version
  • sp_help_revlogin is run to generate a CREATE LOGIN
  • Kendal's code (slightly modified) is run to generate sp_addsrvrolemember  and GRANT/DENY statements for server-level security
  • Wayne's code (again modified) is run to generate CREATE USER statements for the databases
  • Phillip's code (modified) is run to generate sp_AddRoleMember statements for database roles
  • Finally, another piece of Wayne's code (from that same post) is run to generate the GRANT/DENY statements for individual database object permissions
The current version of the script (v 1.0) is located here. http://tiny.cc/PermScript

A few caveats - because I include the CREATE/ALTER PROC for sp_help_revlogin and sp_hexadecimal in my script, my actual parameter (@LoginName) to specify your desired login is roughly halfway down the script (after the CREATE/ALTER PROC code) - the easiest way to deal with this is to CTRL-F for "SET @LoginName"

The code outputs via a large number of result sets, so I *strongly* recommend you change your query output from Grid to Text before you run the script.

I have run this against versions from 2005 through 2014 and it has generated successfully.

I am working on improving commenting and error handling, but check it out and let me know what you think and any holes you find (and as with all things on the Web, run it at your own risk).

Hope this helps!

--

NOTE - someone mentioned in a comment that they had trouble accessing the link - I originally didn't publish the script in the blog because the length was prohibitive but here it is:

--

/*

Permissions Scripter v1.0

All code on the web should be examined and run at your own risk!

--

2015/12/16
Andy Galbraith @DBA_ANDY
http://nebraskasql.blogspot.com/

--

IMPORTANT - CTRL-F for 'SET @LoginName' and set the name

Strongly recommend you change the Query output from Grid to Text for best output results

--

Uses code from several sources that I have combined and modified to work together.

Sources are attributed throughout but are also noted here:

**  Microsoft - sp_help_revlogin - https://support.microsoft.com/en-us/kb/918992

**  Kendal Van Dyke @SQLDBA - "Scripting Server Permissions And Role Assignments" - http://www.kendalvandyke.com/2009/01/scripting-server-permissions-and-role.html

**  Phillip Kelley - "Generating scripts for database role membership in SQL Server 2005"
http://stackoverflow.com/questions/3265526/generating-scripts-for-database-role-membership-in-sql-server-2005

**  Wayne Sheffield @DBAWayne - "script out database users for the selected database" - http://www.sqlservercentral.com/Forums/Topic977700-146-1.aspx


*/


SET NOCOUNT ON

SELECT 'SET NOCOUNT ON;'+CHAR(13)+'USE [MASTER];'+CHAR(13)+'GO'+CHAR(13) as '/* Set Database Context to master */'



USE [master]
GO

/*
Microsoft - sp_help_revlogin - https://support.microsoft.com/en-us/kb/918992
*/

/****** Object:  StoredProcedure [dbo].[sp_hexadecimal]    Script Date: 10/12/2010 13:58:44 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO


IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_NAME = 'sp_hexadecimal')
EXEC ('CREATE PROC dbo.sp_hexadecimal AS SELECT GETDATE()')
GO

ALTER PROCEDURE [dbo].[sp_hexadecimal]
@binvalue varbinary(256),
@hexvalue varchar(256) OUTPUT
AS
DECLARE @charvalue varchar(256)
DECLARE @i int
DECLARE @length int
DECLARE @hexstring char(16)
SELECT @charvalue = '0x'
SELECT @i = 1
SELECT @length = DATALENGTH (@binvalue)
SELECT @hexstring = '0123456789ABCDEF'
WHILE (@i <= @length)
BEGIN
DECLARE @tempint int
DECLARE @firstint int
DECLARE @secondint int
SELECT @tempint = CONVERT(int, SUBSTRING(@binvalue,@i,1))
SELECT @firstint = FLOOR(@tempint/16)
SELECT @secondint = @tempint - (@firstint*16)
SELECT @charvalue = @charvalue +
SUBSTRING(@hexstring, @firstint+1, 1) +
SUBSTRING(@hexstring, @secondint+1, 1)
SELECT @i = @i + 1
END
SELECT @hexvalue = @charvalue

GO


/****** Object:  StoredProcedure [dbo].[sp_help_revlogin]    Script Date: 10/12/2010 13:58:51 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO


IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_NAME = 'sp_help_revlogin')
EXEC ('CREATE PROC dbo.sp_help_revlogin AS SELECT GETDATE()')
GO

ALTER PROCEDURE [dbo].[sp_help_revlogin] @login_name sysname = NULL AS
DECLARE @name sysname
DECLARE @type varchar (1)
DECLARE @hasaccess int
DECLARE @denylogin int
DECLARE @is_disabled int
DECLARE @PWD_varbinary  varbinary (256)
DECLARE @PWD_string  varchar (514)
DECLARE @SID_varbinary varbinary (85)
DECLARE @SID_string varchar (514)
DECLARE @tmpstr  varchar (MAX)
DECLARE @is_policy_checked varchar (3)
DECLARE @is_expiration_checked varchar (3)

DECLARE @defaultdb sysname

IF (@login_name IS NULL)
  DECLARE login_curs CURSOR FOR

      SELECT p.sid, p.name, p.type, p.is_disabled, p.default_database_name, l.hasaccess, l.denylogin FROM
sys.server_principals p LEFT JOIN sys.syslogins l
      ON ( l.name = p.name ) WHERE p.type IN ( 'S', 'G', 'U' ) AND p.name <> 'sa'
ELSE
  DECLARE login_curs CURSOR FOR


      SELECT p.sid, p.name, p.type, p.is_disabled, p.default_database_name, l.hasaccess, l.denylogin FROM
sys.server_principals p LEFT JOIN sys.syslogins l
      ON ( l.name = p.name ) WHERE p.type IN ( 'S', 'G', 'U' ) AND p.name = @login_name
OPEN login_curs

FETCH NEXT FROM login_curs INTO @SID_varbinary, @name, @type, @is_disabled, @defaultdb, @hasaccess, @denylogin
IF (@@fetch_status = -1)
BEGIN
  PRINT 'No login(s) found.'
  CLOSE login_curs
  DEALLOCATE login_curs
  RETURN -1
END
SET @tmpstr = '/* sp_help_revlogin script ** Generated ' + CONVERT (varchar, GETDATE()) + ' on ' + @@SERVERNAME + ' */'
PRINT @tmpstr
PRINT ''
WHILE (@@fetch_status <> -1)
BEGIN
  IF (@@fetch_status <> -2)
  BEGIN
    PRINT ''
    SET @tmpstr = '/* Login: ' + @name+' */'
    PRINT @tmpstr
    IF (@type IN ( 'G', 'U'))
    BEGIN -- NT authenticated account/group

      SET @tmpstr = 'CREATE LOGIN ' + QUOTENAME( @name ) + ' FROM WINDOWS WITH DEFAULT_DATABASE = [' + @defaultdb + ']'
    END
    ELSE BEGIN -- SQL Server authentication
        -- obtain password and sid
            SET @PWD_varbinary = CAST( LOGINPROPERTY( @name, 'PasswordHash' ) AS varbinary (256) )
        EXEC sp_hexadecimal @PWD_varbinary, @PWD_string OUT
        EXEC sp_hexadecimal @SID_varbinary,@SID_string OUT

        -- obtain password policy state
        SELECT @is_policy_checked = CASE is_policy_checked WHEN 1 THEN 'ON' WHEN 0 THEN 'OFF' ELSE NULL END FROM sys.sql_logins WHERE name = @name
        SELECT @is_expiration_checked = CASE is_expiration_checked WHEN 1 THEN 'ON' WHEN 0 THEN 'OFF' ELSE NULL END FROM sys.sql_logins WHERE name = @name

            SET @tmpstr = 'CREATE LOGIN ' + QUOTENAME( @name ) + ' WITH PASSWORD = ' + @PWD_string + '
HASHED, SID = ' + @SID_string + ', DEFAULT_DATABASE = [' + @defaultdb + ']'

        IF ( @is_policy_checked IS NOT NULL )
        BEGIN
          SET @tmpstr = @tmpstr + ', CHECK_POLICY = ' + @is_policy_checked
        END
        IF ( @is_expiration_checked IS NOT NULL )
        BEGIN
          SET @tmpstr = @tmpstr + ', CHECK_EXPIRATION = ' + @is_expiration_checked
        END
    END
    IF (@denylogin = 1)
    BEGIN -- login is denied access
      SET @tmpstr = @tmpstr + '; DENY CONNECT SQL TO ' + QUOTENAME( @name )
    END
    ELSE IF (@hasaccess = 0)
    BEGIN -- login exists but does not have access
      SET @tmpstr = @tmpstr + '; REVOKE CONNECT SQL TO ' + QUOTENAME( @name )
    END
    IF (@is_disabled = 1)
    BEGIN -- login is disabled
      SET @tmpstr = @tmpstr + '; ALTER LOGIN ' + QUOTENAME( @name ) + ' DISABLE'
    END
    PRINT @tmpstr
  END

  FETCH NEXT FROM login_curs INTO @SID_varbinary, @name, @type, @is_disabled, @defaultdb, @hasaccess, @denylogin
   END
CLOSE login_curs
DEALLOCATE login_curs


RETURN 0


GO

DECLARE @LoginName sysname

/*

SET LOGIN NAME!!!!

*/


SET @LoginName = 'sa'

/*
For all Logins set to NULL

SET @LoginName = NULL
*/


EXEC sp_help_revlogin @login_name = @LoginName

PRINT ''

/*
Kendal Van Dyke @SQLDBA - "Scripting Server Permissions And Role Assignments" - http://www.kendalvandyke.com/2009/01/scripting-server-permissions-and-role.html

Modifed by me to filter for an individual login if one is specified
*/

/* Generate statements to create server permissions for SQL logins, Windows Logins, and Groups */

-- Role Members
IF @LoginName is NOT NULL
SELECT  'EXEC sp_addsrvrolemember @rolename =' + SPACE(1)
        + QUOTENAME(usr1.name, '''') + ', @loginame =' + SPACE(1)
        + QUOTENAME(usr2.name, '''') +';' AS '/* Server Role Memberships */'
FROM    sys.server_principals AS usr1
        INNER JOIN sys.server_role_members AS rm ON usr1.principal_id = rm.role_principal_id
        INNER JOIN sys.server_principals AS usr2 ON rm.member_principal_id = usr2.principal_id
and usr2.name = @LoginName
ORDER BY rm.role_principal_id ASC
ELSE
SELECT  'EXEC sp_addsrvrolemember @rolename =' + SPACE(1)
        + QUOTENAME(usr1.name, '''') + ', @loginame =' + SPACE(1)
        + QUOTENAME(usr2.name, '''') +';' AS '/* Server Role Memberships */'
FROM    sys.server_principals AS usr1
        INNER JOIN sys.server_role_members AS rm ON usr1.principal_id = rm.role_principal_id
        INNER JOIN sys.server_principals AS usr2 ON rm.member_principal_id = usr2.principal_id

-- Permissions
IF @LoginName is NOT NULL
SELECT  server_permissions.state_desc COLLATE SQL_Latin1_General_CP1_CI_AS
        + ' ' + server_permissions.permission_name COLLATE SQL_Latin1_General_CP1_CI_AS
        + ' TO [' + server_principals.name COLLATE SQL_Latin1_General_CP1_CI_AS
        + '];' AS '/* Server Level Permissions */'
FROM    sys.server_permissions AS server_permissions WITH ( NOLOCK )
        INNER JOIN sys.server_principals AS server_principals WITH ( NOLOCK ) ON server_permissions.grantee_principal_id = server_principals.principal_id
WHERE   server_principals.type IN ( 'S', 'U', 'G' )
and server_principals.name = @LoginName
ORDER BY server_principals.name,
        server_permissions.state_desc,
        server_permissions.permission_name
ELSE
SELECT  server_permissions.state_desc COLLATE SQL_Latin1_General_CP1_CI_AS
        + ' ' + server_permissions.permission_name COLLATE SQL_Latin1_General_CP1_CI_AS
        + ' TO [' + server_principals.name COLLATE SQL_Latin1_General_CP1_CI_AS
        + '];' AS '/* Server Level Permissions */'
FROM    sys.server_permissions AS server_permissions WITH ( NOLOCK )
        INNER JOIN sys.server_principals AS server_principals WITH ( NOLOCK ) ON server_permissions.grantee_principal_id = server_principals.principal_id
WHERE   server_principals.type IN ( 'S', 'U', 'G' )
ORDER BY server_principals.name,
        server_permissions.state_desc,
        server_permissions.permission_name


/*
Wayne Sheffield @DBAWayne - "script out database users for the selected database" - http://www.sqlservercentral.com/Forums/Topic977700-146-1.aspx

Modified by me to run inside sp_msforeachdb and to filter for an individual login if one is specified
Also added COLLATE DATABASE_DEFAULT statements to handle databases with collations different from the instance
*/


DECLARE @strsql nvarchar(4000)

PRINT '/* Database Users */'
IF @LoginName is not NULL
set @strsql = 'SELECT ''/* ? */'';SELECT ''USE [?];
GO
IF NOT EXISTS (SELECT 1 FROM sys.database_principals WHERE name = '' +
QuoteName(dp.name, char(39)) COLLATE DATABASE_DEFAULT +        '')
CREATE USER '' + QuoteName(dp.name) +
       IsNull('' FOR LOGIN '' + QuoteName(sp.name),'''') +
       IsNull('' WITH DEFAULT_SCHEMA = '' + QuoteName(dp.default_schema_name),'''') + '';''
  FROM [?].sys.database_principals dp
       LEFT JOIN [?].sys.server_principals sp
         ON sp.sid = dp.sid
 WHERE dp.type like ''[GUS]''
and dp.name = '''+@LoginName+''''
ELSE
set @strsql = 'SELECT ''/* ? */'';SELECT ''USE [?];
GO
IF NOT EXISTS (SELECT 1 FROM sys.database_principals WHERE name = '' +
QuoteName(dp.name, char(39)) COLLATE DATABASE_DEFAULT +        '')
CREATE USER '' + QuoteName(dp.name) +
       IsNull('' FOR LOGIN '' + QuoteName(sp.name),'''') +
       IsNull('' WITH DEFAULT_SCHEMA = '' + QuoteName(dp.default_schema_name),'''') + '';''
  FROM [?].sys.database_principals dp
       LEFT JOIN [?].sys.server_principals sp
         ON sp.sid = dp.sid
 WHERE dp.type like ''[GUS]'''


EXEC sp_msforeachdb @strsql


/*
Phillip Kelley - "Generating scripts for database role membership in SQL Server 2005"
http://stackoverflow.com/questions/3265526/generating-scripts-for-database-role-membership-in-sql-server-2005


Modified by me to run inside sp_msforeachdb and to filter for an individual login if one is specified
*/

PRINT '/* Database Role Memberships */'

IF @LoginName is not NULL
set @strsql = 'SELECT ''/* ? */'';
SELECT ''USE [?];
GO
EXECUTE sp_AddRoleMember '''''' + roles.name + '''''', '''''' + users.name + ''''''''+'';''
 from [?].sys.database_principals users
  inner join [?].sys.database_role_members link
   on link.member_principal_id = users.principal_id
  inner join [?].sys.database_principals roles
   on roles.principal_id = link.role_principal_id
where users.name = '''+@LoginName+''''
ELSE
set @strsql = 'SELECT ''/* ? */'';
SELECT ''USE [?];
GO
EXECUTE sp_AddRoleMember '''''' + roles.name + '''''', '''''' + users.name + ''''''''+'';''
 from [?].sys.database_principals users
  inner join [?].sys.database_role_members link
   on link.member_principal_id = users.principal_id
  inner join [?].sys.database_principals roles
   on roles.principal_id = link.role_principal_id'

EXEC sp_msforeachdb @strsql


/*
Wayne Sheffield @DBAWayne - "script out database users for the selected database" - http://www.sqlservercentral.com/Forums/Topic977700-146-1.aspx

Modified by me to run inside sp_msforeachdb and to filter for an individual login if one is specified
*/


PRINT '/* Database Object Permissions */'

IF @LoginName is not NULL
set @strsql = 'SELECT ''/* ? */'';
SELECT ''USE [?];
GO
''+ dp.state_desc COLLATE SQL_Latin1_General_CP1_CI_AS + '' '' +
       dp.permission_name COLLATE SQL_Latin1_General_CP1_CI_AS +
       '' ON '' + QuoteName(ss.name) + ''.'' + QuoteName(so.name) +
       '' TO '' + QuoteName(dp2.name) + '';''+CHAR(13)
  FROM [?].sys.database_permissions dp
       JOIN [?].sys.database_principals dp2
         ON dp2.principal_id = dp.grantee_principal_id
       JOIN [?].sys.objects so
         ON so.object_id = dp.major_id
       JOIN [?].sys.schemas ss
         ON ss.schema_id = so.schema_id
WHERE dp2.name  = '''+@LoginName+''''
ELSE
set @strsql = 'SELECT ''/* ? */'';
SELECT ''USE [?];
GO
''+ dp.state_desc COLLATE SQL_Latin1_General_CP1_CI_AS + '' '' +
       dp.permission_name COLLATE SQL_Latin1_General_CP1_CI_AS +
       '' ON '' + QuoteName(ss.name) + ''.'' + QuoteName(so.name) +
       '' TO '' + QuoteName(dp2.name) + '';''+CHAR(13)
  FROM [?].sys.database_permissions dp
       JOIN [?].sys.database_principals dp2
         ON dp2.principal_id = dp.grantee_principal_id
       JOIN [?].sys.objects so
         ON so.object_id = dp.major_id
       JOIN [?].sys.schemas ss
         ON ss.schema_id = so.schema_id'

EXEC sp_msforeachdb @strsql


PRINT '/* END OF SCRIPT */'


The transaction log for database 'ABC' is full due to 'ACTIVE_BACKUP_OR_RESTORE'

I have recently had a client with a problem each morning where they were having processes fail with this message in their SQL Server Error Log:
Date: 12/01/2015 5:58:48 AM
Log: SQL Server (Current - 12/01/2015 1:00:00 PM)
Source: spid118
Message: Error: 9002, Severity: 17, State: 3.
--
Date: 12/01/2015 5:58:48 AM
Log: SQL Server (Current - 12/01/2015 1:00:00 PM)
Source: spid118
Message: The transaction log for database 'ABC' is full due to 'ACTIVE_BACKUP_OR_RESTORE'.
Their question was how their log could be full since their database is in SIMPLE recovery.   

Their particular situation is that database ABC is a 650GB OLTP database, and the backup to the remote location was taking 7-8 hours each night over the time in question:

server_name | database_name | backup_start_date | backup_finish_date | physical_device_name | type | BackupSizeGB
Server1 | ABC | 12/4/2015 23:30 | 12/5/2015 7:11 | VNBU0-29756-4248-1450758615 | D | 630
Server1 | ABC | 12/3/2015 0:00 | 12/3/2015 7:46 | VNBU0-33644-31396-1450499421 | D | 630
Server1 | ABC | 12/1/2015 23:30 | 12/2/2015 6:22 | VNBU0-30500-35052-1450413013 | D | 424
Server1 | ABC | 11/30/2015 23:30 | 12/1/2015 6:37 | VNBU0-18236-33032-1450326613 | D | 468
Server1 | ABC | 10/30/2015 23:30 | 10/31/2015 4:51 | VNBU0-5696-14276-1447734610 | D | 386
Server1 | ABC | 10/29/2015 0:05 | 10/29/2015 5:27 | VNBU0-14976-21580-1447475427 | D | 378
Server1 | ABC | 10/27/2015 23:31 | 10/28/2015 4:59 | VNBU0-18040-27960-1447389025 | D | 367
Server1 | ABC | 10/26/2015 23:31 | 10/27/2015 4:34 | VNBU0-20180-26980-1447302625 | D | 356
Server1 | ABC | 10/25/2015 23:31 | 10/26/2015 5:00 | VNBU0-22808-28180-1447216223 | D | 372
Server1 | ABC | 10/24/2015 23:31 | 10/25/2015 5:21 | VNBU0-6160-15336-1447129821 | D | 372
Server1 | ABC | 10/23/2015 23:31 | 10/24/2015 5:01 | VNBU0-5396-24044-1447043425 | D | 359
Server1 | ABC | 10/22/2015 23:31 | 10/23/2015 7:47 | VNBU0-8796-18884-1446957027 | D | 375
Server1 | ABC | 10/21/2015 23:37 | 10/22/2015 5:29 | VNBU0-18032-28004-1446870623 | D | 364
Server1 | ABC | 10/20/2015 23:31 | 10/21/2015 5:00 | VNBU0-8692-19836-1446784216 | D | 371


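The history above comes from the backup tables in msdb; a minimal sketch of the kind of query that produces it (the GB conversion and column aliases are mine) looks like this:

SELECT bs.server_name,
       bs.database_name,
       bs.backup_start_date,
       bs.backup_finish_date,
       bmf.physical_device_name,
       bs.type,
       CAST(bs.backup_size/1024.0/1024.0/1024.0 AS decimal(10,0)) AS BackupSizeGB
FROM msdb.dbo.backupset AS bs
INNER JOIN msdb.dbo.backupmediafamily AS bmf ON bmf.media_set_id = bs.media_set_id
WHERE bs.database_name = 'ABC'  /* database name from the example */
AND bs.type = 'D'               /* D = FULL database backups */
ORDER BY bs.backup_start_date DESC;
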
The way FULL backups work in SQL Server, the transaction log is not released for re-use during a FULL backup, even if regular LOG backups are occurring or the database is in SIMPLE recovery.  This is due to the fact that the portion of the LOG that is used during the FULL has to be persisted during the FULL in order to be backed up at the end of the FULL – that is, the FULL backup includes the data at the start of the FULL (23:30) *plus* the LOG used until the end of the FULL (in the case of the 12/04-12/05 backup, the LOG used from 23:30 to 07:11).  This is the meaning of the ACTIVE_BACKUP_OR_RESTORE message – the LOG is waiting for the end of the active FULL backup before it can be released for re-use, which in this case was causing the LOG/LDF file to grow to fill its mount point.
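
You can see this wait directly while the FULL backup is running - sys.databases reports what the log is currently waiting on:

SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = 'ABC';  /* log_reuse_wait_desc shows ACTIVE_BACKUP_OR_RESTORE while the FULL backup is holding the log */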

What this means is that the LOG file has to be large enough (or be able to auto-grow large enough) to hold all of the work done during the FULL backup.  In this system (like many others) there is maintenance done overnight (the largest on this system being a purge job which runs for 7-8 hours each night against database ABC almost exactly during this same time window).  This maintenance was generating hundreds of GB of LOG each night during the FULL backup, resulting in the errors shown above.

For this client, the problem turned out to be that the NetBackup process performing these particular backups had been replaced by a snapshot style backup and disabled at the end of October but had recently been accidentally re-enabled.  Shutting off this NetBackup schedule killed these backups, directly resolving the issue.

The alternative would be to have a sufficiently large LOG/LDF file to hold the work done during the FULL backup - even in SIMPLE recovery this is required.  Another option would be to examine the nightly maintenance and schedule it to a different window, away from the FULL backup to minimize the amount of LOG/LDF that needs to persist during the FULL.

Many people don't consider this situation, and it doesn't come up frequently - usually only in this specific combination of a relatively large database (so the FULL backup runs long) with enough write traffic to fill the LOG while the FULL is running.

Hope this helps!

--

NOTE: as I was finishing polishing this for release I noticed that Erik Darling (blog/@dumb_chauffeur) released a similar post earlier today "What happens to transaction log backups during full backups?" on the Brent Ozar Unlimited blog.  His post has good detail on proving the situation's existence via the Log Sequence Numbers (LSNs) stored in the FULL and LOG backups.




Why is my SQL Server using all of the RAM on the server?

The TL;DR on this is simple:

"BECAUSE IT'S SUPPOSED TO!"

A frequent complaint we receive comes from a client that has an in-house sysadmin monitoring tool like Microsoft SCOM/SCCM.  They turn the tool on and it starts red-alerting because the Windows server hosting SQL Server is at 90%+ used RAM.  The sysadmin (or local DBA) logs on to the server and finds that there is 5GB free (~4%), and the sqlservr.exe process is using 120GB of the 128GB on the server!
http://cdn.meme.am/instances/55759451.jpg
Like almost all things SQL Server, this is a simplification of a situation that requires some serious #ItDepends.

  • What is the baseline - how much RAM is usually free on this server?  This can be seen by tracking the Perfmon counter Memory\Available MBytes.
  • What is the Buffer Manager\Page Life Expectancy number - not right this second, but over time?
  • Is there any actual performance issue on the server?
I have written before about the need for baselines, especially regarding memory and CPU use on the Windows server itself.  As a DBA, I don't care that my Windows server is at 95% used RAM *as long as* neither Windows nor SQL Server is having performance problems *over time*.  

I started my IT career on a support/help desk supporting servers and desktops, and 16 years ago being at 5% free RAM probably meant you had 500MB-1GB free, which was definitely an OMG! (although then we probably didn't actually say OMG.)

Today's servers often have 96GB or more of RAM, and the larger the server the less relevant a percentage-based measurement is - just because your server has 256GB of RAM doesn't mean Windows is suddenly going to need 25GB+ of RAM to run cleanly.


This brings us back to our three questions above.  Rather than worrying about some artificial guideline like 10%, how much memory is usually free on this server? Track Memory\Available MBytes over time and see where it tops out - if the regular operation of your server always leaves 2-4GB free, you are probably OK from a Windows point of view.
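
If you don't have a Perfmon collector in place yet, a point-in-time look at the same numbers is available from the DMVs (SQL 2008+) - this is only a spot check, not a replacement for tracking the counter over time:

SELECT total_physical_memory_kb/1024 AS TotalMemoryMB,
       available_physical_memory_kb/1024 AS AvailableMemoryMB,
       system_memory_state_desc
FROM sys.dm_os_sys_memory;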

What is the Page Life Expectancy of the SQL Server instance over time?  As many others have written, PLE is one of the most misused statistics in SQL Server.  A DBA checks Perfmon or runs a DMV query (like this one from the awesome set curated by Glenn Berry (blog/@GlennAlanBerry)):
-- Page Life Expectancy (PLE) value for each NUMA node in current instance  (Query 36) (PLE by NUMA Node)
SELECT @@SERVERNAME AS [Server Name], [object_name], instance_name, cntr_value AS [Page Life Expectancy]
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE [object_name] LIKE N'%Buffer Node%' -- Handles named instances
AND counter_name = N'Page life expectancy' OPTION (RECOMPILE);
They find that the PLE is some relatively small number and freak out, or worse, they find it is a large number and think "my SQL Server doesn't need this much RAM, I can cut back."

https://lost100pounds.files.wordpress.com/2011/05/danger-cat.jpg
Like Available MBytes, PLE is a number that needs to be measured over time.  On the servers I touch I always recommend setting up a "permanent" Perfmon collector of SQL Server-related counters so that you can look at the numbers over time. For example, a usual PLE chart over a week can look like this:


...or it can look like this:


Even over time, some people look at this second chart and say one of two things:

  • My PLE goes down to 14, what do I do?
  • My PLE maxes out at 50,000, what do I do?
Again, #ItDepends - what is going on in SQL Server during each of these times?  If you look at the above chart you can see that PLE generally climbs during the day (with some minor fluctuations), but then tanks over night pretty regularly. 

Look at when PLE dips to 14 - is that when reindexing or CheckDB runs?  If so, is performance good enough?  If CheckDB runs in two hours and your window is three to four hours, isn't that acceptable?  Are other things running at the same time that are negatively impacted, or is that window clear?

If you do need to improve the processes running at that moment (when PLE dips), what is the Available MBytes at this same time and over time?  If it is high over time, it shows that there is head room on your server to consider increasing your Max Server Memory and to help your PLE stay higher.  If your Available MBytes is not consistently high, you can't consider increasing Max Server Memory unless you increase actual server RAM.
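
If the numbers over time do support a change, Max Server Memory is adjusted via sp_configure - a hedged example (the value shown is purely illustrative for a 128GB server, not a recommendation):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 114688;  /* illustrative value only - size for your own OS head room */
RECONFIGURE;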

This is the 60-second version of quick memory troubleshooting, but think about it - isn't this (even just this small piece of it) a lot more complicated than "I have less than 10% free RAM"?


Is anybody or any process complaining (besides SCCM)? Are any numbers out of line - for example is a virus scan that normally takes 2 hours suddenly taking 4-5?  If nothing else is abnormal, is there a problem?

--

As I mentioned at the start of this post, quite often your SQL Server process(es) are simply *supposed* to be using the lion's share by far of the resources on the Windows server - especially RAM.  As long as you reserve sufficient head room for the operating system (usually 4GB-16GB as described here by Jonathan Kehayias (blog/@SQLPoolBoy) but modified by monitoring Avail MBytes) then the percentage of RAM used probably doesn't matter.

...and if you have good sysadmins they will listen to you and realize that SQL Servers (like Oracle, Exchange, and other database servers) are simply different from web servers or file servers.

Hope this helps!






T-SQL Tuesday #74 - Who Grew The Database?



This month T-SQL Tuesday is hosted by Robert Davis (blog/@SQLSoldier) and his topic of choice is “Be The Change”

(If you don’t know about T-SQL Tuesday, check out the information here - each month there is a new topic, and it is a great excuse to write each month (and to start writing!) because someone else offers the topic, so the first step is already taken for you!)

--

Robert's choice of topics threw me:
The topic for T-SQL Tuesday #74 is Be the Change. More specifically, data changes. How do you track changing data? How do you do your ETL? How do you clean or scrub your data? Anything related to changing data. Give us your most interesting post involving changing data next Tuesday, January 12, 2016.
As a mostly operational DBA, I have very limited contact with data changes and tracking them.  I have some very limited experience with Change Data Capture, but others have already written about it and done a very good job (also - read Mickey Stuewe's (blog/@SQLMickey) post "Knowing When Data Changes Occur in Standard Edition" about how to work around CDC being Enterprise Only!)

I had a breakthrough earlier today reading Jason Brimhall's (blog/@sqlrnnr) post "All about the Change" in which he writes about changes as related to SQL Audit.  I realized that a fair amount of the administrative triage that I do is directly caused by data changes, especially large scale ones. Here is a solution that I have modified over the years to help me track the changes in my systems *caused* by data changes.  (see how I spun that?)

--

The changes in question here are file size changes - as operational DBAs one of the problems we constantly deal with is files that grow out of control, often from maintenance work such as index rebuilds or from unusual ad-hoc operations such as the analyst working at 1am trying to create a personal copy of the giant Sales table.

We all know that the "best" thing to do (remember, #ItDepends) is to appropriately size your DATA and LOG files ahead of time, and if possible to manually grow those files after hours so that there isn't any production impact.  Even in this absolute best case (almost theoretical) scenario, it is still usually right to leave auto-growth enabled "just in case."

Very few of us live in that world, and we size our files as best as we can with almost no real business requirements using our DBA "Spidey Sense." Our files are then "managed by auto-grow" as we try to find the best steady state for the DATA and LOG files and minimize impact while getting the best performance.

Does that sound familiar?

http://cdn.meme.am/instances/53655925.jpg
As a service provider, we monitor the drive free space and SQL Server Error Logs on our client servers (along with dozens of other things) - auto-growth problems can often be seen through messages like this:

Could not allocate space for object 'dbo.Table1'.'PK_Table1' in database 'myDatabase' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
...or...
The transaction log for database 'SomeOtherDatabase' is full due to 'ACTIVE_TRANSACTION'.
These messages usually mean that either the file has hit its MAXSIZE cap or that the drive hosting the file is full (or at least sufficiently full that it can't hold another FILEGROWTH increment worth of space). The first question that comes up in the root cause analysis is often "Who grew the database?"
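
Before chasing the "who", it is worth confirming how the file is configured - a quick sketch against sys.master_files shows current size, MAXSIZE, and growth increment for every file on the instance:

SELECT DB_NAME(mf.database_id) AS DatabaseName,
       mf.name AS LogicalFileName,
       mf.type_desc,
       mf.size*8/1024 AS CurrentSizeMB,
       CASE mf.max_size WHEN -1 THEN 'Unlimited'
            ELSE CAST(mf.max_size*8/1024 AS varchar(20)) + ' MB' END AS MaxSize,
       CASE WHEN mf.is_percent_growth = 1 THEN CAST(mf.growth AS varchar(20)) + '%'
            ELSE CAST(mf.growth*8/1024 AS varchar(20)) + ' MB' END AS GrowthIncrement
FROM sys.master_files AS mf
ORDER BY DatabaseName, mf.type_desc;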

Sometimes there are obvious culprits - is an index maintenance job running?  Is there a nightly ETL running that had an exceptionally large file last night?  Often there is no such smoking gun...and then you need something more.

--

I feel a little dirty writing about the Default Trace in the world of Extended Events, but I also know that many people simply don't know how to use XEvents, and this can be faster if you already have it in your toolbox.  Also, it works all the way back to SQL 2005, whereas XEvents were new in SQL 2008.

I have modified this several times to improve it - I started with a query from Tibor Karaszi (blog/@TiborKaraszi), modified it with some code from Jason Strate (blog/@StrateSQL), and then modified that myself for what is included and what is filtered.  There are links to both Tibor's and Jason's source material in the code below.

--

/*

Default Trace Query

Especially useful for Auto Grow and Shrink Events 
but works for all default trace info

Modified from Tibor Karaszi at 
http://sqlblog.com/blogs/tibor_karaszi/archive/2008/06/19/did-we-have-recent-autogrow.aspx 

Default Trace Path query modified from
http://www.jasonstrate.com/2013/01/determining-default-trace-location/

*/

DECLARE @fn VARCHAR(1000), @df bit
SELECT @fn =REVERSE(SUBSTRING(REVERSE(path),CHARINDEX('\',REVERSE(path)),LEN(path)))+'log.trc',
/* reading log.trc instead of a specific file will query all current log files */
@df = is_default 
FROM sys.traces 
WHERE id = 1

IF @df = 0 OR @df IS NULL
BEGIN
  RAISERROR('No default trace running!', 16, 1)
  RETURN
END

SELECT te.name as EventName
, t.DatabaseName
, t.FileName
, t.TextData
, t.StartTime
, t.EndTime
, t.ApplicationName
, HostName
, LoginName
, Duration
, cast(Duration/1000000.0 as decimal(10,2)) as DurationSeconds
FROM fn_trace_gettable(@fn, DEFAULT) AS t  
INNER JOIN sys.trace_events AS te 
ON t.EventClass = te.trace_event_id  
WHERE 1=1 /* necessary to cleanly build WHERE clause */
/* find autogrowth events */ 
--and te.name LIKE '%grow%'
/* find autoshrink events */ 
--and te.name LIKE '%shrink%'
/* find manual shrink events */ 
--and (te.name = 'Audit DBCC Event' and (TextData like '%shrinkfile%' or TextData like '%shrinkdatabase%'))
--and DatabaseName='tempdb'
--and StartTime>'01/10/2016 00:00'
--and EndTime<='01/11/2016 13:00'
ORDER BY StartTime desc  

--

The base query pulls all events from the Default Trace.  As noted in the comment in the variable assignment for @fn, reading data from log.trc (rather than a specific file such as log_05.trc or log_44.trc) will combine the rows from all five current default trace TRC files.

The WHERE clause is built so that you can uncomment whichever lines you need.  The initial 1=1 is present so that all of the commented out lines can start with an 'AND' to allow them to flow together regardless of which lines you uncomment.

Want to find autogrowth events for the Bob database?  Uncomment the "and te.name LIKE '%grow%'" and "and DatabaseName='Bob'" lines and you are set!  Need to add time filters?  Uncomment those lines and modify the times.  And so on....

If you run the query, you can see that the autogrowth of Bob was caused by application "Sales" running on AppServer01, or by a .NET application on WebServer99 running as Domain\WebUser, or even by SQLCMD running on the server locally as the Agent service account (in my world this often means the growth is being caused by a Hallengren maintenance job, since his SQL Agent jobs run under SQLCMD).

--

Remember, this can be done via XEvents as well (a topic for another blog post) and since Trace is deprecated that is the "better" way to do this - but this still works and is quick to use.

Hope this helps!


Counting Your VLFs, or, Temp Tables Inside IF...ELSE Blocks

There are many many blog posts out there about Virtual Log Files (VLFs) - one of the absolute best is "8 Steps to Better Transaction Log Throughput" from Kimberly Tripp (blog/@KimberlyLTripp) of SQLskills as well as the several other posts she links to from that post - if you haven't read them, click the link right now and do so - my post will be right here when you get back.

--

VLFs are the logical units that make up your transaction logs, and at the end of the day the TL;DR boils down to "Usually, too many VLFs are bad" - they can decrease performance by functionally "fragmenting" your transaction log and slowing down everything transaction log-related (so basically, everything).

phil hartman frankenstein  - Too Many VLFs BAD!
http://memegenerator.net/Phil-Hartman-Frankenstein/caption
VLF count is something that most vendors check during their health checks, and many SQL Server pros recommend it as well.  Of course "too many VLFs" is a relative term, with people throwing around numbers of 50 or 100 or 200 as their threshold of concern.

--

The resource I have always fallen back on to run this check is the script from Michelle Ufford (blog/@sqlfool).  She created it back in 2010 and it is the basis for the VLF script included in Glenn Berry's (blog/@GlennAlanBerry) Diagnostic Information (DMV) Queries.

Michelle's query relies on the undocumented DBCC LogInfo command to gather its VLF data - DBCC LogInfo returns a row for each VLF, so the count(*) of that query gives the number of VLFs for the database.  The catch is that in SQL Server 2012, Microsoft added a column to the front of the resultset (RecoveryUnitID).  As the DBCC command is undocumented, this new column is undocumented as well.

Michelle's code uses INSERT...EXEC to populate a temporary table with the VLF info, and the addition of this extra column breaks the original script.  Glenn's versions of the scripts handle this issue easily since they are version-specific - in the SQL 2012/2014/2016 versions of the script, the temp table declaration is modified to include the extra RecoveryUnitID column, which allows the rest of the script to function as designed.

--

My problem was that I wanted a version of the script that could be used across all versions 2005+, and the new column made that tricky.  At first I tried to add an IF...ELSE block to the start of the script to handle the differing CREATE TABLE statements:

--

If (select LEFT(cast(serverproperty('ProductVersion') as varchar),2)) in ('8.','9.','10')
BEGIN
Create Table #stage
(
FileID int
, FileSize bigint
, StartOffset bigint
, FSeqNo bigint
, [Status] bigint
, Parity bigint
, CreateLSN numeric(38)
);
END
ELSE
BEGIN
Create Table #stage
(
RecoveryUnitID int /* This is the new column as of SQL 2012 */
, FileID int
, FileSize bigint
, StartOffset bigint
, FSeqNo bigint
, [Status] bigint
, Parity bigint
, CreateLSN numeric(38)
);
END

--

http://memegenerator.net/Grumpy-Cat

Regardless of the SQL version I tested, I received this:

Msg 2714, Level 16, State 1, Line 16
There is already an object named '#stage' in the database.

I played with it a little, including adding an IF EXISTS check to the beginning of the second block (and yes, I directed it to tempdb..#stage to reference the temp table) and none of it worked.  I poked around a little online and couldn't find a way to make it work - many people saying that it couldn't be done with temp tables, and that you should use a "regular" table or maybe a regular view instead.

My problem is that I am creating a script I want to run on lots of different servers across lots of environments, and I don't want to assume that the table name I am using doesn't already exist.  Is it likely that a client server will have a table named dbo.AndyGVLFCountReallyUniqueTableNameGUIDPurpleMonkeyDishwasher?  Well no, but you never know...  Also, many environments have rules about creating "real" objects without change control - even an object that will be created, exist for <30 seconds, and be dropped.

Besides at this point it had become a challenge of how to make it work - there had to be a different way of looking at the problem.  I fiddled with a table variable solution and had no better luck, resulting in a similar "already exists" error.

I realized part of the problem was how my script was laid out - I was checking for the lower version as my decision gate (IF 8/9/10...ELSE), and while that check needed to stay (I didn't want to hard-code 11/12/13 and have it break with future versions), I didn't need to have the CREATE be part of the check - I just needed to handle the fact that the down-level object couldn't have the offending column:

--

Create Table #stage
(
RecoveryUnitID int /* This is the new column as of SQL 2012 */
, FileID      int
, FileSize    bigint
, StartOffset bigint
, FSeqNo      bigint
, [Status]    bigint
, Parity      bigint
, CreateLSN   numeric(38)
);

If (select LEFT(cast(serverproperty('ProductVersion') as varchar),2)) in ('8.','9.','10')
ALTER TABLE #stage DROP COLUMN RecoveryUnitID

--

http://4.bp.blogspot.com/-cQvrmJkAsCk/UVQyFL-S9WI/AAAAAAAAAMA/xcT4LCxYFQ8/s1600/33617404.jpg

In this case, I was able to create the table before the version check *with* the extra column, and then run a version check to drop the column if the instance is down-level.

With this in hand, I was able to modify Michelle's script to run for all current versions of SQL:

--

/*

VLF Count Script

Modified From Michelle Ufford @sqlfool 
http://sqlfool.com/2010/06/check-vlf-counts/

Added version check code due to changes in DBCC LOGINFO

Tested on MSSQL 2005/2008/2008R2/2012/2014

*/

/*
NOTE - the output of DBCC LogInfo adds an extra 
column as of SQL 2012 so there is a version check 
to drop that column for older versions
*/

Create Table #stage
(
RecoveryUnitID int /* This is the new column as of SQL 2012 */
, FileID int
, FileSize bigint
, StartOffset bigint
, FSeqNo bigint
, [Status] bigint
, Parity bigint
, CreateLSN numeric(38)
);

If (select LEFT(cast(serverproperty('ProductVersion') as varchar),2)) in ('8.','9.','10')
ALTER TABLE #stage DROP COLUMN RecoveryUnitID

Create Table #results(
Database_Name sysname
, VLF_count int 
);


Exec sp_msforeachdb N'Use [?]; 

Insert Into #stage 
Exec sp_executeSQL N''DBCC LogInfo([?])''; 

Insert Into #results 
Select DB_Name(), Count(*) 
From #stage; 

Truncate Table #stage;'

Select * 
From #results
Order By VLF_count Desc;

Drop Table #stage;
Drop Table #results;

--

I am happy with the final product (the modified VLF count script) but also with my brief path of discovery on handling Temp Tables in IF...ELSE blocks - I know I have had similar problems before as outputs vary from version to version and now I have another idea to try the next time it comes up!

Hope this helps!


Speaking at SQL Saturday #500 in Boston!


SQLSaturday #500 - Boston 2016

I have been selected to speak at SQL Saturday #500 in Boston on March 19th - I will be speaking (at 9am in the morning!) on "Getting Started with Extended Events":

--
Few subjects in Microsoft SQL Server inspire the same amount of Fear, Uncertainty, and Doubt (FUD) as Extended Events.  Many DBA's continue to use Profiler and SQL Trace even though they have been deprecated for years.  Why is this?
Extended Events started out in SQL Server 2008 with no user interface and only a few voices in the community documenting the features as they found them.  Since then it has blossomed into a full feature of SQL Server and an amazingly low-impact replacement for Profiler and Trace. 
Come learn how to get started - the basics of sessions, events, actions, targets, packages, and more.  We will look at some base scenarios where Extended Events can be very useful as well as considering a few gotchas along the way.  You may never go back to Profiler again!
--

SQL Saturdays are put on by PASS and are amazing full-day events of free training and networking ($10 for lunch), with sessions presented by a wide array of speakers.  Many of the speakers are SQL Server MVPs and the topics cover a wide array of fields from DBA to Developer to BI and beyond.

--

If you are interested there are also full day sessions (albeit not free!) Thursday and Friday from SQL Server MVP Jes Borland and MVP Denny Cherry respectively.  Jes will be presenting "How to Get Started Using Microsoft Azure" and Denny is covering "SQL Server Performance Tuning and Optimization" - if you are coming out Saturday and have the availability Thursday and/or Friday, consider them as well.

--

Hope to see you there!


Copying SSIS packages With DTUTIL

A frequent need when performing a server migration is to copy the SSIS packages from one server to a new server.  There are a couple of different ways to do this, including a wizard in SSMS. (See https://www.mssqltips.com/sqlservertip/2061/how-to-manage-ssis-packages-stored-in-multiple-sql-server-database-instances/).  The catch to this is that these are manual and they only move one package at a time.

I recently had to migrate a server with over twenty packages, and I knew I didn't want to click-click-click over and over again.  :)

I looked around and was reminded of dtutil, the utility designed to manage DTS and then SSIS packages from the command line.  I found a comment at http://www.sqlservercentral.com/Forums/Topic1068518-1550-1.aspx that included a SELECT statement to generate dtutil commands based on the contents of msdb.dbo.sysssispackages:

select 'DTUTIL /SQL "'+f.foldername+'"/"'+ name +'" /DestServer [YOURSQLSERVERDEST] /COPY SQL;"'+f.foldername+'"/"'+name+'" /QUIET' 
from msdb.dbo.sysssispackages p
inner join msdb.dbo.sysssispackagefolders f
on p.folderid = f.folderid

I played with it a little and it did serve my purpose - I was able to generate twenty dtutil commands, drop them in a Notepad batch file, and successfully run that batch from Windows to move the packages.
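
Each row of output is one complete command; a hypothetical generated line (server, folder, and package names made up, in the format the scripts below produce) looks something like this:

DTUTIL /SQL "MyFolder\LoadSalesData" /DestServer "ServerB" /COPY SQL;"MyFolder\LoadSalesData" /QUIET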

I fiddled with the script and started testing it on different SQL Server versions.  The biggest gotcha I found was that on SQL Server 2005 there is no sysssispackages table - the comparable tables are sysdtspackages90 and sysdtspackagefolders90.  A quick modification to the script to add a version check dealt with this:

-------

/*

SSIS Package Copy with DTUTIL in xp_cmdshell

Run on source server where packages are stored
Set parameter @TargetServer to server name where packages are moving

Modified by Andy Galbraith @DBA_Andy from an idea at http://www.sqlservercentral.com/Forums/Topic1068518-1550-1.aspx

Tested on MSSQL 2005/2008/2008R2/2012/2014

*/

SET NOCOUNT ON

DECLARE @TargetServer sysname,  @SQLVersion char(4)

SET @TargetServer = 'ServerB' 

SET @SQLVersion = left(cast(SERVERPROPERTY('productversion') as varchar),4)

/* PRINT @SQLVersion */

IF LEFT(@SQLVersion,1) NOT IN ('1','9') /* Not 2005+ */
BEGIN
PRINT 'SQL Server Version Not Supported By This Script'
END
ELSE
BEGIN
IF @SQLVersion = '9.00' /* 2005 */
BEGIN
select 'EXEC xp_cmdshell ''DTUTIL /SQL "'+f.foldername+'\'+ name 
+'" /DestServer "'+@TargetServer+'" /COPY SQL;"'+f.foldername+'\'+name+'" /QUIET''' 
from msdb.dbo.sysdtspackages90 p
inner join msdb.dbo.sysdtspackagefolders90 f
on p.folderid = f.folderid
END
ELSE /* 2008+ */
BEGIN
select 'EXEC xp_cmdshell ''DTUTIL /SQL "'+f.foldername+'\'+ name 
+'" /DestServer "'+@TargetServer+'" /COPY SQL;"'+f.foldername+'\'+name+'" /QUIET''' 
from msdb.dbo.sysssispackages p
inner join msdb.dbo.sysssispackagefolders f
on p.folderid = f.folderid
END
END

-------

In the above script I wrapped the dtutil statements in xp_cmdshell calls so that I could run it from SQL Server rather than the Windows command line (or batch files).

If your environment doesn't support xp_cmdshell (which is a completely different best practices discussion - see a great post by K. Brian Kelley (blog/@kbriankelley) here about the risks of enabling xp_cmdshell in your environment) then it is easy to remove the xp_cmdshell piece and return the results to simple dtutil calls:

-------

/*

SSIS Package Copy with DTUTIL

Run on source server where packages are stored
Set parameter @TargetServer to server name where packages are moving

Modified by Andy Galbraith @DBA_Andy from an idea at http://www.sqlservercentral.com/Forums/Topic1068518-1550-1.aspx

Tested on MSSQL 2005/2008/2008R2/2012/2014

*/

SET NOCOUNT ON

DECLARE @TargetServer sysname,  @SQLVersion char(4)

SET @TargetServer = 'ServerB' 

SET @SQLVersion = left(cast(SERVERPROPERTY('productversion') as varchar),4)

/* PRINT @SQLVersion */

IF LEFT(@SQLVersion,1) NOT IN ('1','9') /* Not 2005+ */
BEGIN
PRINT 'SQL Server Version Not Supported By This Script'
END
ELSE
BEGIN
IF @SQLVersion = '9.00' /* 2005 */
BEGIN
select 'DTUTIL /SQL "'+f.foldername+'\'+ name 
+'" /DestServer "'+@TargetServer+'" /COPY SQL;"'+f.foldername+'\'+name+'" /QUIET' 
from msdb.dbo.sysdtspackages90 p
inner join msdb.dbo.sysdtspackagefolders90 f
on p.folderid = f.folderid
END
ELSE /* 2008+ */
BEGIN
select 'DTUTIL /SQL "'+f.foldername+'\'+ name 
+'" /DestServer "'+@TargetServer+'" /COPY SQL;"'+f.foldername+'\'+name+'" /QUIET' 
from msdb.dbo.sysssispackages p
inner join msdb.dbo.sysssispackagefolders f
on p.folderid = f.folderid
END
END

-------

Hope this helps!



How Bad Are Your Indexes?

In my last post "Copying SSIS packages With DTUTIL" I described a "gotcha" where the relevant system tables were different in SQL Server 2005 compared to 2008+, and I showed a version check I had built into the script to handle it.

It made me think about other places this check would be useful, and the first thing that came to mind was the Bad Indexes DMV query.

If you read my blog or have seen me speak at a SQL Saturday, you know I am a *big* fan of Glenn Berry (@GlennAlanBerry/blog) and his DMV Diagnostic Queries.

http://img.memecdn.com/im-your-biggest-fan_o_668248.jpg
Glenn has put many hours of work into deciphering and joining the dynamic management views/functions that have been in SQL Server since 2005 into useful query frameworks, and I leverage his work (with credit) whenever I can.

The concept of a "bad" index in this context is an index with many more writes than reads - that is, the index requires more effort to maintain than there is benefit from its existence.

http://www.lexisnexis.com/legalnewsroom/resized-image.ashx/__size/500x400/__key/telligent-evolution-components-attachments/13-12-00-00-00-00-00-89/ContentImage_2D00_MugShot.jpg

** IMPORTANT ** - any time you consider removing an index, always verify via use patterns, user interviews, etc. that the index is truly "removable" - you do *not* want to be the one to remove a "bad" index only to find that it is needed for the monthly payroll run, the quarterly bonus reports, or some other critical business process.  Quite often "bad" indexes are only used periodically but are very crucial.  An alternative to consider rather than dropping the index outright is to check whether the index can be dropped and then recreated when it is needed, but often the index creation process incurs too much overhead for this to be viable.
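
If, after that verification, you do decide to act, the drop (and the recreate ahead of the periodic process that needs it) is plain DDL - hypothetical table, index, and column names:

/* hypothetical names - drop the write-heavy index... */
DROP INDEX IX_SalesDetail_ModifiedDate ON dbo.SalesDetail;

/* ...and recreate it ahead of the periodic process that turns out to need it */
CREATE NONCLUSTERED INDEX IX_SalesDetail_ModifiedDate
ON dbo.SalesDetail (ModifiedDate)
INCLUDE (SalesAmount);
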
Out of the box, the bad index query from Glenn's script is database-specific (rather than instance-wide):
-- Possible Bad NC Indexes (writes > reads)  (Query 47) (Bad NC Indexes)
SELECT OBJECT_NAME(s.[object_id]) AS [Table Name], i.name AS [Index Name], i.index_id,
i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor,
user_updates AS [Total Writes], user_seeks + user_scans + user_lookups AS [Total Reads],
user_updates - (user_seeks + user_scans + user_lookups) AS [Difference]
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
AND s.database_id = DB_ID()
AND user_updates > (user_seeks + user_scans + user_lookups)
AND i.index_id > 1
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC OPTION (RECOMPILE);
-- Look for indexes with high numbers of writes and zero or very low numbers of reads
-- Consider your complete workload, and how long your instance has been running
-- Investigate further before dropping an index!
The query relies on database-specific tables/views (such as sys.indexes) and therefore returns results for the current database context.

The first thing I wanted to do was to wrap the query in my undocumented little friend sp_msforeachdb.
http://mergeralpha.com/blog/wp-content/uploads/2015/10/resized_one-does-not-simply-meme-generator-one-does-not-simply-say-hello-to-my-little-friend-fdd94d.jpg
The query turned out like this:
EXEC sp_msforeachdb '
/* MODIFIED from Glenn - Possible Bad NC Indexes (writes > reads)  (Query 58) (Bad NC Indexes) */
SELECT ''?'' as DBName,o.Name AS [Table Name], i.name AS [Index Name],
user_updates AS [Total Writes], user_seeks + user_scans + user_lookups AS [Total Reads],
user_updates - (user_seeks + user_scans + user_lookups) AS [Difference],
i.index_id,
i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN [?].sys.indexes AS i WITH (NOLOCK)
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
INNER JOIN [?].sys.objects as o WITH (nolock)
on i.object_ID=o.Object_ID
WHERE o.type = ''U''
AND s.database_id = DB_ID(''?'')
/* AND user_updates > (user_seeks + user_scans + user_lookups) */
AND i.index_id > 1
AND user_updates - (user_seeks + user_scans + user_lookups) >75000
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC;
'
The query has all of the standard artifacts of sp_msforeachdb, such as the question mark placeholder for the database name sprinkled throughout to set the proper context for all of the database-specific tables and views ( such as [?].sys.indexes).

This was where the version-specific problem came up - SQL Server 2008 introduced the concept of filtered indexes, and therefore a new column (has_filter) was added to sys.indexes.  The result is that running the above query (which came from Glenn's SQL 2008 query script) on a SQL Server 2005 instance errors out with a non-existent column error.

A fix to this could have been to have a modified version of the query without the offending column, and it would line up with how Glenn publishes his queries, with different scripts for each SQL Server version.

For *my* purpose I wanted a single script that I could run against any SQL Server 2005+, and the version check logic allows for that.

Here is the version checked version of the Bad Indexes For All Databases script:
/*
Bad Indexes DMV For All Databases
Modified by Andy Galbraith to run across all databases on the instance
Modified version of the Bad Indexes query in the Glenn Berry DMV scripts
http://www.sqlskills.com/blogs/glenn/category/dmv-queries/
Tested on MSSQL 2005/2008/2008R2/2012/2014
*/
SET NOCOUNT ON
DECLARE @SQLVersion char(4)
SET @SQLVersion = left(cast(SERVERPROPERTY('productversion') as varchar),4)
/* PRINT @SQLVersion */
IF LEFT(@SQLVersion,1) NOT IN ('1','9') /* Not 2005+ */
BEGIN
 PRINT 'SQL Server Version Not Supported By This Script'
END
ELSE
BEGIN
IF @SQLVersion = '9.00' /* 2005 */
BEGIN
/* SQL 2005 Version - removes i.has_filter column */
EXEC sp_msforeachdb '
/*
MODIFIED from Glenn - Possible Bad NC Indexes (writes > reads)  (Query 58) (Bad NC Indexes)
*/
SELECT ''?'' as DBName,o.Name AS [Table Name], i.name AS [Index Name],
user_updates AS [Total Writes], user_seeks + user_scans + user_lookups AS [Total Reads],
user_updates - (user_seeks + user_scans + user_lookups) AS [Difference],
i.index_id,
i.is_disabled, i.is_hypothetical, i.fill_factor
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN [?].sys.indexes AS i WITH (NOLOCK)
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
INNER JOIN [?].sys.objects as o WITH (nolock)
on i.object_ID=o.Object_ID
WHERE o.type = ''U''
AND s.database_id = DB_ID(''?'')
/* AND user_updates > (user_seeks + user_scans + user_lookups) */
AND i.index_id > 1
AND user_updates - (user_seeks + user_scans + user_lookups) >75000
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC;'
END
ELSE
BEGIN
EXEC sp_msforeachdb '
/*
MODIFIED from Glenn - Possible Bad NC Indexes (writes > reads)  (Query 58) (Bad NC Indexes)
*/
SELECT ''?'' as DBName,o.Name AS [Table Name], i.name AS [Index Name],
user_updates AS [Total Writes], user_seeks + user_scans + user_lookups AS [Total Reads],
user_updates - (user_seeks + user_scans + user_lookups) AS [Difference],
i.index_id,
i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN [?].sys.indexes AS i WITH (NOLOCK)
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
INNER JOIN [?].sys.objects as o WITH (nolock)
on i.object_ID=o.Object_ID
WHERE o.type = ''U''
AND s.database_id = DB_ID(''?'')
/* AND user_updates > (user_seeks + user_scans + user_lookups) */
AND i.index_id > 1
AND user_updates - (user_seeks + user_scans + user_lookups) >75000
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC;'
END
END
This did exactly what I wanted, returning all non-clustered indexes with at least 75,000 more writes than reads (my chosen threshold) across all databases on the SQL Server 2005+ instance.

Hope this helps!


