Tuesday, May 31, 2011

SQL Profiler, Database Tuning Advisor and optimizing the DB server

About a year and a half ago, I did some work on tuning my production database. The database was SQL Server 2005, but what I write below should work for SQL Server 2008 as well.

My notes on using SQL Profiler and the Database Tuning Advisor (tips, if you will):

  • Common columns to use are TextData, Duration, CPU, Reads, Writes, ApplicationName, StartTime and EndTime.

  • Do not trace directly to a table. If you want the results in a table, import the trace into one afterwards.

  • Right-click on a column to apply a filter starting with that column.

  • Not all events within a group are important.

  • The EventClass and SPID columns cannot be deselected.

  • Do not run Profiler on the machine where the database resides. Run it from a different machine.

  • If your server is busy, do not check the 'Server processes trace data' option.

  • Turn Auto Scroll off so you can examine an earlier event without being scrolled to the bottom.

  • Bookmarking is useful for marking which events to come back to at a later time.

  • In order to minimize the load on the SQL server, reduce the number of events traced/captured.

  • The same goes for data columns.

  • Useful events to track slow running stored procedures are RPC:Completed, SP:StmtCompleted, SQL:BatchStarting, SQL:BatchCompleted and ShowPlan XML.

  • Useful data columns are Duration, ObjectName, TextData, CPU, Reads, Writes, IntegerData, DatabaseName, ApplicationName, StartTime, EndTime, SPID, LoginName, EventSequence, BinaryData.

  • Test which queries run most frequently by saving the trace to a table (imported here as Identify_query_counts) and querying it. This should be measured on the production server.
    SELECT [ObjectName], COUNT(*) AS [SP Count]
    FROM [dbo].[Identify_query_counts]
    WHERE [Duration] > 100
    AND [ObjectName] IS NOT NULL
    GROUP BY [ObjectName]
    ORDER BY [SP Count] DESC

  • To test for deadlocks, use events like Deadlock graph, Lock:Deadlock, Lock:Deadlock Chain, RPC:Completed, SP:StmtCompleted, SQL:BatchCompleted and SQL:BatchStarting (see the first sketch after this list).

  • Useful data columns are TextData, EventSequence, DatabaseName.

  • To test for blocking issues, use the Blocked Process Report event, but also set the blocked process threshold so the event actually fires:
    SP_CONFIGURE 'show advanced options', 1 ;
    GO
    RECONFIGURE ;
    GO
    SP_CONFIGURE 'blocked process threshold', 10 ;
    GO
    RECONFIGURE ;
    GO
    -- run this to turn it off when you are done
    SP_CONFIGURE 'blocked process threshold', 0 ;
    GO
    RECONFIGURE ;
    GO

  • Useful data columns are EventClass, TextData, Duration, IndexID, Mode, DatabaseID, EndTime.

  • For a production environment, set the threshold to 1800 (30 minutes) and be sure to turn it off afterwards.

  • To test for excessive index/table scans, use events like Scan:Started along with RPC:Completed, SP:StmtCompleted, SQL:BatchStarting, SQL:BatchCompleted and Showplan XML.

  • Useful data columns are ObjectID, ObjectName, Duration, EventClass, TextData, CPU, Reads, Writes, IntegerData, StartTime, EndTime, EventSequence and BinaryData.

  • DTA: Provide a representative workload in order to receive optimal recommendations.

  • Only the RPC:Completed, SP:StmtCompleted and SQL:BatchCompleted events are needed for the workload trace.

  • Data Columns used are TextData, Duration, SPID, DatabaseName and LoginName.

  • Check 'Server processes trace data' to capture all trace events.

  • Run traces quarterly or monthly to feed to DTA to ensure indexes are up to date.

  • Create baseline traces so you can compare them against traces taken after indexing, to check which queries run most often and their average duration.

  • Run only one trace at a time.

  • Do not run Profiler when running DB backup.

  • Set the trace stop time when you run a trace.

  • Run DTA at low traffic times.

  • Reduce the use of cursors in the application. Cursors that run inside jobs, or that are rarely used, can be left alone as long as they finish on time and run during the hours when the application is least accessed.

  • Indexes created without knowledge of the queries that will use them serve little purpose.

  • Sort the (tuning) trace by CPU and Reads. This surfaces the costly queries (see the second sketch after this list).

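Here's a rough sketch of querying a saved deadlock trace, as mentioned in the deadlock bullet above. The file path, and the idea of loading the .trc file with fn_trace_gettable, are my own assumptions; adjust to wherever you saved your trace.

-- Load a saved trace file and keep only the deadlock-related events.
-- The path below is hypothetical; point it at your own .trc file.
SELECT te.name AS EventName,
       t.TextData,       -- the deadlock graph XML ends up here
       t.EventSequence,
       t.DatabaseName
FROM sys.fn_trace_gettable(N'C:\Traces\Deadlocks.trc', DEFAULT) AS t
JOIN sys.trace_events AS te
  ON t.EventClass = te.trace_event_id
WHERE te.name IN ('Deadlock graph', 'Lock:Deadlock', 'Lock:Deadlock Chain')
ORDER BY t.EventSequence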

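And a minimal sketch of the last bullet, sorting an imported trace table by CPU and Reads to surface the costly queries. I'm reusing the Identify_query_counts table name from above; use whatever table you imported your tuning trace into.

-- Costliest statements in the imported trace, by CPU and logical reads.
SELECT TOP 20
       ObjectName,
       TextData,
       Duration,
       CPU,
       Reads,
       Writes
FROM [dbo].[Identify_query_counts]
ORDER BY CPU DESC, Reads DESC
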
Basically, there are different parameters to look at. First, I optimized the most frequently used queries by creating indexes and, where possible, rewriting them after looking at their execution plans. Since I knew these indexes would become fragmented as data in the underlying tables was updated or deleted, I set up a defragmentation plan as a job on the database server: indexes with fragmentation between 0 and 20 percent were left untouched, those between 20 and 40 percent were reorganized, and those above 40 percent were rebuilt.
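
The full defragmentation job is listed at the end of this post, but for a quick spot-check you can query sys.dm_db_index_physical_stats for just the database you are in. A rough sketch, using the same 20/40 thresholds as above:

-- Fragmentation per index in the current database, worst first.
-- Reorganize between 20 and 40 percent, rebuild above 40.
SELECT OBJECT_NAME(s.object_id) AS TableName,
       i.name AS IndexName,
       s.avg_fragmentation_in_percent,
       s.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS s
JOIN sys.indexes AS i
  ON s.object_id = i.object_id
 AND s.index_id = i.index_id
WHERE s.avg_fragmentation_in_percent > 20
  AND s.page_count > 8
ORDER BY s.avg_fragmentation_in_percent DESC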

Secondly, I examined queries and stored procedures that were hogging the CPU, meaning they were not finishing and were causing other queries to wait on them. I found one that was not written very well, so I rewrote it.
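
To spot those, you can sort sys.dm_exec_query_stats by worker time instead of logical reads (the reference query at the end of this post sorts by reads); something along these lines:

-- Top CPU consumers since the plan cache was last cleared (worker/elapsed times are in microseconds).
SELECT TOP 10
       qs.total_worker_time / 1000 AS total_cpu_ms,
       qs.execution_count,
       qs.total_elapsed_time / 1000 AS total_elapsed_ms,
       SUBSTRING(qt.text, (qs.statement_start_offset/2)+1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(qt.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset)/2)+1) AS statement_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
ORDER BY qs.total_worker_time DESC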

After that, I checked other server parameters to see whether the server actually met the standard, such as 'Memory -> Pages/sec' and 'Memory -> Available Bytes'. We had a 32-bit server, so we could not simply add RAM; we had to upgrade to a 64-bit server, initially with 8 GB of RAM and 3 GB of process space enabled. The reason for upgrading to only 8 GB was, of course, that we wanted to see a gradual performance improvement.
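
Pages/sec and Available Bytes come from PerfMon, but SQL Server also exposes its own memory counters. A small sketch of the ones I would glance at (a low and falling Page life expectancy is the usual sign of memory pressure):

-- SQL Server's own memory counters, as a complement to the PerfMon counters above.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Page life expectancy',
                       'Total Server Memory (KB)',
                       'Target Server Memory (KB)')
ORDER BY counter_name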

Then I adjusted the connection pooling parameters of my application server (JBoss 4.0.3 SP1) after doing a lot of load testing; I should write an article on that later. The setup had Apache serving the static content (images and HTML) and forwarding all other requests to JBoss. I won't dwell on this too much for now.

Lastly, all the developers on the team focused their attention on checking the source code to see whether connections were being opened and closed properly. The application used JDBC, so this was quite a tedious task. We even managed to write code that flushed a connection after it had been inactive past a certain threshold, and logged whenever it did so. I know most DBAs would ask for this step to be done first, but either way our queries and database server needed optimization and an upgrade.
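
From the database side you can at least see which application connections are sitting idle and for how long. Here is a hedged sketch; the 30-minute cutoff is an arbitrary number of mine:

-- User sessions that have been idle for more than 30 minutes (possible leaked connections).
SELECT session_id,
       login_name,
       host_name,
       program_name,
       last_request_end_time
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
  AND status = 'sleeping'
  AND last_request_end_time < DATEADD(MINUTE, -30, GETDATE())
ORDER BY last_request_end_time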

There was an emailing app which sent newsletters to over 150k users at the time. It normally took 5-6 hours to run, depending on the traffic on the application. That drastically dropped to less than an hour! :)

References:

Query to get the top 20 queries by total logical reads (with their execution counts) in the database

SELECT TOP 20
       SUBSTRING(qt.text, (qs.statement_start_offset/2)+1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(qt.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset)/2)+1) AS statement_text,
       qs.execution_count,
       qs.total_logical_reads, qs.last_logical_reads,
       qs.min_logical_reads, qs.max_logical_reads,
       qs.total_elapsed_time, qs.last_elapsed_time,
       qs.min_elapsed_time, qs.max_elapsed_time,
       qs.last_execution_time,
       qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE qt.encrypted = 0
ORDER BY qs.total_logical_reads DESC


Query to identify wait times

Select top 10 *
from sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC


The defragmentation logic used by the job

USE [xxxx] -- your DB name
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[sp_IndexDefrag]
AS
DECLARE @DBName NVARCHAR(255),
        @TableName NVARCHAR(255),
        @SchemaName NVARCHAR(255),
        @IndexName NVARCHAR(255),
        @PctFrag DECIMAL,
        @Defrag NVARCHAR(MAX)

-- temp tables live in tempdb, so check for it there
IF OBJECT_ID(N'tempdb..#Frag') IS NOT NULL
    DROP TABLE #Frag

CREATE TABLE #Frag
        (DBName NVARCHAR(255),
         TableName NVARCHAR(255),
         SchemaName NVARCHAR(255),
         IndexName NVARCHAR(255),
         AvgFragment DECIMAL)

EXEC sp_msforeachdb 'INSERT INTO #Frag(DBName,
                                       TableName,
                                       SchemaName,
                                       IndexName,
                                       AvgFragment)
                     Select ''?'' As DBNAME,
                            t.Name As TableName,
                            sc.Name As SchemaName,
                            i.name As IndexName,
                            s.avg_fragmentation_in_percent
                     FROM ?.sys.dm_db_index_physical_stats(DB_ID(''?''), NULL, NULL, NULL, ''Sampled'') AS s
                     JOIN ?.sys.indexes i
                     ON s.Object_Id = i.Object_id
                        AND s.Index_id = i.Index_id
                     JOIN ?.sys.tables t
                     ON i.Object_id = t.Object_id
                     JOIN ?.sys.schemas sc
                     ON t.schema_id = sc.SCHEMA_ID
                     WHERE s.avg_fragmentation_in_percent > 20
                     AND t.TYPE = ''U''
                     AND s.page_count > 8                    
                     ORDER BY TableName, IndexName'

DECLARE cList CURSOR FOR
    SELECT * FROM #Frag
    WHERE DBName = 'XXXX' -- your DB

OPEN cList
FETCH NEXT FROM cList
INTO @DBName, @TableName, @SchemaName, @IndexName, @PctFrag

WHILE @@FETCH_STATUS = 0
BEGIN
    IF @PctFrag BETWEEN 20.0 AND 40.0
    BEGIN
        SET @Defrag = N'ALTER INDEX ' + @IndexName + ' ON ' +
                      @DBName + '.' + @SchemaName + '.' + @TableName + ' REORGANIZE'
        EXEC sp_executesql @Defrag
        PRINT 'Reorganize index: ' + @DBName + '.' + @SchemaName + '.' +
              @TableName + '.' + @IndexName
    END
    ELSE IF @PctFrag > 40.0
    BEGIN
        SET @Defrag = N'ALTER INDEX ' + @IndexName + ' ON ' +
                      @DBName + '.' + @SchemaName + '.' + @TableName + ' REBUILD'
        EXEC sp_executesql @Defrag
        PRINT 'Rebuild index: ' + @DBName + '.' + @SchemaName + '.' +
              @TableName + '.' + @IndexName
    END

    FETCH NEXT FROM cList
    INTO @DBName, @TableName, @SchemaName, @IndexName, @PctFrag
END
CLOSE cList
DEALLOCATE cList

DROP TABLE #Frag
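
The SQL Agent job simply calls this procedure on a schedule; running it by hand is just (assuming you created it in the database above):

USE [xxxx] -- your DB name
GO
EXEC [dbo].[sp_IndexDefrag]
GO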


So that's it. ;) I know this is very long for a post, but trust me, this work takes days if not weeks. And optimization is an ongoing process; you cannot sit back and relax once you have done it the first time.

 
