I had a great time at SQL Saturday #256 in Kalamazoo. Thanks to Josh Fennessy (b|t), Joe Fleming (t), Tim Ford (b|t) and everyone else who put on such a wonderful event. And a big thank you to everyone who attended my session. You can find my slides and scripts here.
It’s hard to believe that I wrote this post almost a year ago. It’s been my most popular post, so I wanted to let people know about a change to one of the scripts. Many thanks to reader Francesco for pointing out an error. The Attach Script uses sp_msforeachdb, and in the original script I failed to wrap the ? wildcard in square brackets. As a result, database names containing dashes (I’m looking at you, SharePoint) would not resolve properly. The script has been updated.
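To illustrate the fix, here is the pattern in question. Without the brackets, a dash in the database name breaks the generated USE statement; the database name My-Database below is a hypothetical example:

```sql
-- Original version: fails for database names containing dashes,
-- because USE My-Database is not valid without delimiters
EXEC sp_msforeachdb 'USE ?; SELECT DB_NAME() AS DatabaseName;';

-- Corrected version: wrapping the ? wildcard in square brackets
-- delimits the name, so USE [My-Database] resolves properly
EXEC sp_msforeachdb 'USE [?]; SELECT DB_NAME() AS DatabaseName;';
```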
Having seen Kevin Kline (b|t) and Aaron Bertrand (b|t) present 10 Query Tuning Techniques Every SQL Programmer Should Know at this year’s PASS Summit, I know I should update the code to use Aaron’s sp_foreachdb. I will put that on the list of things to learn. That is why I love this job, because there is always something new to learn.
I’m writing from Charlotte, North Carolina where I’m attending my fifth PASS Summit. I hope to write a series of posts recapping my experiences, but I wanted to write about something I learned in Paul White’s (b|t) pre-conference session, Understanding the Optimizer and Interpreting Execution Plans. The session was amazing and I think I’ll spend the next year going through his scripts.
If you’ve read my blog you know I love internals. Paul introduced me to a function called sys.fn_physlocformatter. In SQL Server 2008 and beyond, %%physloc%% is a virtual column that returns the file, page and slot of each row of a result set in byte-reversed, hex format. That means it is hard to read. If you don’t believe me, here is an example of the query and its result:
USE MYTEST
GO

SELECT %%physloc%% AS [physloc], *
FROM MYTestTable;
fn_physlocformatter takes that hex value and formats it into a colon-delimited value that is much more readable. Here’s the query above, replacing %%physloc%% with fn_physlocformatter:
SELECT sys.fn_PhysLocFormatter(%%physloc%%), *
FROM MYTestTable;
After Paul showed us the function, I did a search to get more information and came across this post. Using the example provided, I wrote the following query to return a count of the number of rows written to each page:
SELECT SUBSTRING(sys.fn_PhysLocFormatter(%%physloc%%), 4,
           (CHARINDEX(':', sys.fn_PhysLocFormatter(%%physloc%%), 4) - 4)) AS [Page ID],
       COUNT(*) AS [Row Count]
FROM MYTestTable
GROUP BY SUBSTRING(sys.fn_PhysLocFormatter(%%physloc%%), 4,
           (CHARINDEX(':', sys.fn_PhysLocFormatter(%%physloc%%), 4) - 4))
ORDER BY [Row Count];
The sample code from the link above inserts nine rows at a time into MYTestTable2. I’ve listed the table definition and insert code below.
USE MYTEST
GO

CREATE TABLE MYTestTable2
(MYID INT IDENTITY,
 NAME CHAR(500),
 LNAME CHAR(500))
GO

INSERT INTO MYTestTable2 VALUES ('NAME-1','LNAME-1');
GO
INSERT INTO MYTestTable2 VALUES ('NAME-2','LNAME-2');
GO
INSERT INTO MYTestTable2 VALUES ('NAME-3','LNAME-3');
GO
INSERT INTO MYTestTable2 VALUES ('NAME-4','LNAME-4');
GO
INSERT INTO MYTestTable2 VALUES ('NAME-5','LNAME-5');
GO
INSERT INTO MYTestTable2 VALUES ('NAME-6','LNAME-6');
GO
INSERT INTO MYTestTable2 VALUES ('NAME-7','LNAME-7');
GO
INSERT INTO MYTestTable2 VALUES ('NAME-8','LNAME-8');
GO
INSERT INTO MYTestTable2 VALUES ('NAME-9','LNAME-9');
GO
Because the rows are 1008 bytes long, seven rows will fit on one 8kb data page. After the initial insert of nine rows, I expected to see a page containing seven rows followed by a page with two rows. Here are the results after the initial insert:
I continued running the nine inserts to confirm that my code was working as expected. And for the next five runs, it did. The strange results didn’t show up until the seventh run, where I saw this:
After the seventh insert, something was causing seven pages to be allocated each with a single row inserted. This didn’t make any sense to me, so I continued investigating. I’ll show you the results of those investigations in my next post, sys.fn_physlocformatter – The Mystery Deepens!
In this recent post, Paul Randal identified a bug in SQL Server versions 2005 and newer that prevents DBCC CHECKDB from running data purity checks against the master and model databases. Paul provides code to identify and fix the issue. The code below can be run through CMS to check the settings for master and model databases throughout your environment.
Many thanks to Paul for sharing this information.
--Declare local variable
DECLARE @sqlstr VARCHAR(2000);

--Create temp table
CREATE TABLE #dbccinfo
(ParentObject VARCHAR(256)
,Object VARCHAR(256)
,field SYSNAME
,value VARCHAR(256));

--Build @sqlstr variable to exec DBCC DBINFO for the master database
--Use the WITH TABLERESULTS parm to return results in a tabular format
SET @sqlstr = 'DBCC DBINFO (N''master'') WITH TABLERESULTS';

--Execute the @sqlstr, inserting the results into the temp table
INSERT INTO #dbccinfo
EXEC(@sqlstr);

--Build @sqlstr variable to exec DBCC DBINFO for the model database
--Use the WITH TABLERESULTS parm to return results in a tabular format
SET @sqlstr = 'DBCC DBINFO (N''model'') WITH TABLERESULTS';

--Execute the @sqlstr, inserting the results into the temp table
INSERT INTO #dbccinfo
EXEC(@sqlstr);

SELECT * FROM #dbccinfo;

--Select the dbi_dbccFlags and dbi_dbid fields from the temp table
--dbi_dbccFlags shows whether data purity checks are enabled
SELECT field, value
FROM #dbccinfo
WHERE field IN ('dbi_dbccFlags', 'dbi_dbid');

DROP TABLE #dbccinfo;
I’ve posted previously about log space reservation (here and here). I’ve also posted about running dynamic T-SQL in a WHILE loop (here). This morning, I wrote a procedure that combines the two topics.
Because SQL Server may have to roll an active transaction back, it reserves space in the log file in addition to the space used to hold the log records. This reserved space is released when the transaction commits.
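You can watch this reservation happen with the sys.dm_tran_database_transactions DMV, which exposes both the log bytes used and the log bytes reserved for each active transaction. A quick sketch, assuming the historytable from the example below exists and run from its database:

```sql
BEGIN TRANSACTION;

-- Delete some rows but do not commit yet
DELETE TOP (10000) FROM historytable;

-- While the transaction is open, check log space used and reserved
SELECT database_transaction_log_bytes_used,
       database_transaction_log_bytes_reserved
FROM sys.dm_tran_database_transactions
WHERE database_id = DB_ID();

-- The reserved bytes are released once the transaction commits
COMMIT TRANSACTION;
```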
One of the places I’ve seen this cause problems is a routine that deletes historical data from a table. Imagine you have an application that has been active for two years. It contains 150 million rows and has never been purged. A business decision is made to purge all data older than two months, which accounts for approximately 140 million rows. The simplest way to code this is as follows:
USE [databasename]
GO

DELETE FROM historytable
WHERE insertdate < DATEADD(MONTH,-2,GETDATE());
GO
This will work, but it will consume a huge amount of log space, because all 140 million deletes happen in a single transaction. So, in addition to the log space required for the deletes and any associated allocations, SQL Server will hold the reserved space for all 140 million records for the duration of the transaction. A better solution is to delete the records in smaller batches, with each batch running in its own transaction.
That brings us to the procedure I wrote this morning for a situation very similar to the one I described above. The procedure will delete rows from a table based on the value of a date column. It allows you to set the number of rows to delete in each batch along with the threshold for deletion. See below for a sample execution.
-- =============================================
-- Create uspBatchDelete
-- Created : 2013-08-08
-- Created By: Frank Gill
-- Performs a batch delete of rows from a table
-- Inputs - @tablename - the table to delete from
--          @datecolumn - the date column to compare against
--          @increment - the number of rows to delete in each batch
--          @numberofdays - the number of days of data you want to maintain
--          (unless your database has a time machine, the last parm should be negative)
-- =============================================

-- Drop stored procedure if it already exists
IF EXISTS (
    SELECT *
    FROM INFORMATION_SCHEMA.ROUTINES
    WHERE SPECIFIC_SCHEMA = N'dbo'
    AND SPECIFIC_NAME = N'uspBatchDelete'
)
DROP PROCEDURE dbo.uspBatchDelete
GO

CREATE PROCEDURE dbo.uspBatchDelete
    @tablename SYSNAME,
    @datecolumn SYSNAME,
    @increment BIGINT,
    @numberofdays INT
AS
BEGIN TRY

    -- Declare local variables
    DECLARE @sqlstr NVARCHAR(2000);
    DECLARE @rowcount BIGINT;
    DECLARE @loopcount BIGINT;
    DECLARE @ParmDefinition NVARCHAR(500);

    -- Set the parameters for the sp_executesql statement
    SET @ParmDefinition = N'@rowcountOUT BIGINT OUTPUT';

    -- Initialize the loop counter
    SET @loopcount = 1;

    -- Build the dynamic SQL string to return the row count
    -- Note that the input parameters are concatenated into the string,
    -- while the output parameter is contained in the string
    -- Also note that running a COUNT(*) on a large table can take a long time
    SET @sqlstr = N'SELECT @rowcountOUT = COUNT(*) FROM ' + @tablename +
        ' WITH (NOLOCK) WHERE ' + @datecolumn + ' < DATEADD(DAY,' +
        CAST(@numberofdays AS VARCHAR(4)) + ',GETDATE())';

    -- Execute the SQL string using sp_executesql, passing in the
    -- parameter definition and defining the output variable
    EXECUTE sp_executesql @sqlstr
        ,@ParmDefinition
        ,@rowcountOUT = @rowcount OUTPUT;

    -- Perform the loop while there are rows to delete
    WHILE @loopcount <= @rowcount
    BEGIN

        -- Build a dynamic SQL string to delete a batch of rows
        SET @sqlstr = 'DELETE TOP (' + CAST(@increment AS VARCHAR(10)) +
            ') FROM ' + @tablename + ' WHERE ' + @datecolumn +
            ' < DATEADD(DAY,' + CAST(@numberofdays AS VARCHAR(4)) + ',GETDATE())';

        -- Execute the dynamic SQL string to delete a batch of rows
        EXEC(@sqlstr);

        -- Add the @increment value to @loopcount
        SET @loopcount = @loopcount + @increment;

        -- @increment must be cast to a string before concatenating with the message
        PRINT CAST(@increment AS VARCHAR(20)) + ' rows deleted.';

    END

END TRY
BEGIN CATCH

    SELECT ERROR_NUMBER() AS ErrorNumber
        ,ERROR_SEVERITY() AS ErrorSeverity
        ,ERROR_STATE() AS ErrorState
        ,ERROR_PROCEDURE() AS ErrorProcedure
        ,ERROR_LINE() AS ErrorLine
        ,ERROR_MESSAGE() AS ErrorMessage;

END CATCH
GO
In this example, the procedure will delete rows from tablename in batches of 50,000 where datecolumnname is older than two months. If you are dealing with a scenario similar to the one described above, deleting 22 months of data in a single run may take longer than you want. In that case, you can run the procedure multiple times changing the @numberofdays parm with each run. You will need to do some testing to determine the optimal values for the @increment and @numberofdays parameters.
-- =============================================
-- Example to execute the stored procedure
-- =============================================
EXECUTE dbo.uspBatchDelete
    @tablename = 'tablename',
    @datecolumn = 'datecolumnname',
    @increment = 50000,
    @numberofdays = -60;
GO
I hope you can put this code to use.
In my last post, I showed how to define a policy using Policy-Based Management (PBM). In this post, I will show how to evaluate this policy. While a policy can be evaluated against a single instance of SQL Server, PBM is especially powerful when used with Central Management Server. For more information about setting up Central Management Server, see my previous post on the topic.
To begin evaluating your policy, open the Registered Servers window in Management Studio and expand the Central Management Servers node. Right-click the server group you want to evaluate and select Evaluate Policies. In this example, I will be using the 2005-2008 group.
You should see your policy in the Policies list. To evaluate it, check the checkbox next to the policy and click Evaluate. Central Management Server will connect to each of the instances defined in the selected server group and will evaluate the policy, returning the results in the Target Details pane.
In my example, a number of databases failed the test. As a result, a red X is displayed next to the policy name in the Results pane. The Target Details pane will show the status for each database. I’ve obscured the results, to protect the innocent (and my job), but the Server column contains the name of the instance connected to and the Target column contains the following: SQLSERVER:\SQL\<Machine Name>\<Instance Name>\Database\<Database Name>.
Clicking the View link next to a failed database will show you the reason for the failure. In this case, the PAGE_VERIFY setting for the database is set to NONE. That, in technical terms, is real bad! Fortunately, Policy-Based Management gives us an easy method of fixing all databases in violation of the policy.
To apply the policy against all databases in violation, check the checkbox next to the policy name in the Results pane. This will select all of the databases in violation. If you want to apply the policy to a subset of the violators, you can select those manually in the Target Details pane. Once you’ve selected the databases you want to apply the policy to, click the Apply button.
Central Management Server will connect to the instances containing databases in violation of the policy and change the PAGE_VERIFY setting to CHECKSUM. When it completes, you should see a green check mark next to the policy name in the Results pane, as well as green check marks next to all databases in the detail pane. If so, congratulations, you’ve successfully evaluated and applied your policy and PAGE_VERIFY is set to CHECKSUM for all databases.
These posts have demonstrated how to define and evaluate a Policy-Based Management policy with a single condition. There are many other facets and conditions that can be evaluated. You can get more information on Policy-Based Management at Books Online. And if you are really interested in learning more, I recommend Pro SQL Server 2008 Policy-Based Management.
Policy-Based Management provides a powerful tool for evaluating and standardizing your environment. I hope this introduction is helpful to you.
After my last post on using Dynamic SQL, I tweaked the script to check for databases with a PAGE_VERIFY setting other than CHECKSUM. If you are running SQL Server 2005 or higher, CHECKSUM should be set for all of your databases. With PAGE_VERIFY set to CHECKSUM, when SQL Server writes a dirty page back to disk, it first calculates a checksum value for the contents of the page and stores the value in the page header. When the page is read back into memory, the checksum is recalculated and compared with the values in the page header. If they are different, you have corruption. This is the best method for detecting corruption caused by the IO subsystem.
To determine which databases have PAGE_VERIFY values other than CHECKSUM, you could query sys.databases and change the offenders manually. Or you could write a dynamic SQL script like the one from my previous post to identify the right databases and update them. I was planning on posting the script that I wrote to do just that, and then I remembered Policy-Based Management. Policy-Based Management (PBM) first shipped with SQL Server 2008 and is available in 2008 R2 and 2012. PBM allows you to define policies for your environment, audit your environment for adherence to those policies and make changes to ensure the environment is compliant. Policies can be run against SQL Server 2005 instances, but some functionality is not available. You can get more detailed information about PBM from Books Online.
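For reference, the manual check could look something like this, using the page_verify_option_desc column of sys.databases to find the offenders (a quick sketch; the ALTER DATABASE statement is commented out so nothing changes until you choose a database):

```sql
-- Find databases whose PAGE_VERIFY setting is not CHECKSUM
SELECT name, page_verify_option_desc
FROM sys.databases
WHERE page_verify_option_desc <> 'CHECKSUM';

-- Each offender could then be fixed individually, e.g.:
-- ALTER DATABASE [databasename] SET PAGE_VERIFY CHECKSUM;
```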
For this example, I will set up a policy to check the PAGE_VERIFY setting for each user database in my environment. To start, expand the Management node in Object Explorer, followed by the Policy Management node. Right-click on Policies and select New Policy.
Enter a name for your policy in the Name field of the Create New Policy dialog. I’ve called mine PageVerify. Next click the Check Condition dropdown and select New Condition.
Enter a name for your condition and click the Facet dropdown, selecting Database Options. Click the lookup button next to the Field text box and select @PageVerify. Click in the Operator text box and select the equals sign. Finally, click the Value dropdown and select CHECKSUM. Your dialog should look like the one below. Click the OK button and you will return to the Create New Policy dialog.
From here, you can set the policy to run against a subset of databases or servers. You also have the opportunity to set the Execution Policy. For this example, I will leave it set to On Demand. The other options for this example are On Schedule and On Change: Log Only. A fourth option, On Change: Prevent, is available for some policies. It will roll back a change that violates the policy and output an error message to the user. Click OK, and you’ve created your first policy.
I will cover evaluating this policy in my next post.