only My site

Friday, April 29, 2011

SQL Best practices and Tips

1. Avoid DISTINCT



2. TOP 1 ORDER BY ID desc (to get the last record) - Use WHERE ID = (SELECT (MAX(ID) FROM Tbl) - TOP N * or TOP N PERCENTAGE or TOP N WITH TIES (tank)



3. Table Scan needs to eliminate if the cost is high... Always Table Seek is better for IO operation and Performance.



4. INNER join is better than Equi join ECT col1, col2, col3... FROM table_name1, table_name2 WHERE table_name1.col2 = table_name2.col1;



5. Use PRIMARY key and then use Non clustered index as Non clustered index uses the Clustered index key



6. Use Order by only if required



7. Use composite index keys if the Where condition using more than one filters CDN



8. Use INCLUDE
CREATE NONCLUSTERED INDEX [] ON [dbo].[order] ([Contract_No],[Delete_Flag],[Status]) INCLUDE ([Order_No],[Exchange_Id])



Index fragmentation - Health check



SELECT CAST(DB_NAME(database_id) AS varchar(20)) AS [Database Name],CAST(OBJECT_NAME(object_id) AS varchar(20)) AS [TABLE NAME], Index_id, Index_type_desc, Avg_fragmentation_in_percent, Avg_page_space_used_in_percentFROM sys.dm_db_index_physical_stats(DB_ID('SOLO'),OBJECT_ID('dbo.order'),NULL,NULL,'Detailed')
SELECT index_id, index_depth, cast(avg_fragmentation_in_percent as int), page_count FROM sys.dm_db_index_physical_stats(DB_ID('SOLO'), OBJECT_ID('dbo.order'), NULL, NULL, 'DETAILED') order by 1,2




replace the NOT IN clause with a != , if you are comparing with 1 value
Avoid Index for a fixed value columns such as "Boolean", "Status" which will not give a great benifit.




WHERE SUBSTRING(firstname,1,1) = 'm'
can be rewritten like this:
WHERE firstname like 'm%




If you currently have a query that uses NOT IN, which offers poor performance because the SQL Server optimizer has to use a nested table scan to perform this activity, instead try to use one of the following options instead, all of which offer better performance:
-Use EXISTS or NOT EXISTS -Use IN -Perform a LEFT OUTER JOIN and check for a NULL condition




WHERE clauses that use NOT are not sargable, but can often be rewritten to remove the NOT from the WHERE clause, for example:
WHERE NOT column_name > 5
to
WHERE column_name <= 5





if you can rewrite the WHERE clause so that the column and function are separate, then the query can use an available index, greatly boosting performance. for example: Function Acts Directly on Column, and Index Cannot Be Used:
SELECT member_number, first_name, last_nameFROM membersWHERE DATEDIFF(yy,datofbirth,GETDATE()) > 21
Function Has Been Separated From Column, and an Index Can Be Used:
SELECT member_number, first_name, last_nameFROM membersWHERE dateofbirth < DATEADD(yy,-21,GETDATE





SELECT column_name FROM table_nameWHERE LOWER(column_name) = 'name'
SELECT column_name FROM table_nameWHERE column_name = 'NAME' or column_name = 'name'
This code will run much faster than the first example.



Here are the key operators used in the WHERE clause, ordered by their performance. Those operators at the top will produce results faster than those listed at the bottom.
= >, >=, <, <= LIKE <>This lesson here is to use = as much as possible, and <> as least as possible.





Carefully evaluate whether your SELECT query needs the DISTINCT clause or not



While is most cases, using either option works equally efficiently, there are some instances (such as rows returned from an unsorted heap) where the TOP operator is more efficient than using SET ROWCOUNT. Because of this, using the TOP operator is preferable to using SET ROWCOUNT to limit the number of rows returned by a query.





Whenever SQL Server has to perform a sorting operation, additional resources have to be used to perform this task. Sorting often occurs when any of the following Transact-SQL statements are executed:
ORDER BY GROUP BY SELECT DISTINCT UNION CREATE INDEX (generally not as critical as happens much less often)In many cases, these commands cannot be avoided. On the other hand, there are few ways that sorting overhead can be reduced. These include:
Keep the number of rows to be sorted to a minimum. Do this by only returning those rows that absolutely need to be sorted. Keep the number of columns to be sorted to the minimum. In other words, don't sort more columns that required. Keep the width (physical size) of the columns to be sorted to a minimum. Sort column with number datatypes instead of character datatypes.When using any of the above Transact-SQL commands, try to keep the above performance-boosting suggestions in mind.





USE NorthwindSELECT OrderIDFROM [Order Details]WHERE UnitPrice > 10GROUP BY OrderID
or
USE NorthwindSELECT DISTINCT OrderIDFROM [Order Details]WHERE UnitPrice > 10
Both of the above queries produce the same results, but the second one will use less resources and perform faster.





It is fairly common request to write a Transact-SQL query to to compare a parent table and a child table and find out if there are any parent records that don't have a match in the child table. Generally, there are three ways this can be done:
Using a NOT EXISTS
SELECT a.hdr_keyFROM hdr_tbl aWHERE NOT EXISTS (SELECT * FROM dtl_tbl b WHERE a.hdr_key = b.hdr_key)
Using a Left Join
SELECT a.hdr_keyFROM hdr_tbl aLEFT JOIN dtl_tbl b ON a.hdr_key = b.hdr_keyWHERE b.hdr_key IS NULL
Using a NOT IN
SELECT hdr_keyFROM hdr_tblWHERE hdr_key NOT IN (SELECT hdr_key FROM dtl_tbl)





Here's how you might use COUNT(*):
IF (SELECT COUNT(*) FROM table_name WHERE column_name = 'xxx')
Here's a faster way, using IF EXISTS:
IF EXISTS (SELECT * FROM table_name WHERE column_name = 'xxx')
The reason IF EXISTS is faster than COUNT(*) is because the query can end immediately when the text is proven true, while COUNT(*) must count go through every record, whether there is only one, or thousands, before it can be found to be true.






Performing UPDATES takes extra resources for SQL Server to perform. When performing an UPDATE, try to do as many of the following recommendations as you can in order to reduce the amount of resources required to perform an UPDATE. The more of the following suggestions you can do, the faster the UPDATE will perform.
If you are UPDATing a column of a row that has an unique index, try to only update one row at a time. Try not to change the value of a column that is also the primary key. When updating VARCHAR columns, try to replace the contents with contents of the same length. Try to minimize the UPDATing of tables that have UPDATE triggers. Try to avoid UPDATing columns that will be replicated to other databases. Try to avoid UPDATing heavily indexed columns. Try to avoid UPDATing a column that has a reference in the WHERE clause to the column being updated. Of course, you may have very little choice when UPDATing your data, but at least give the above suggestions a thought. [6.5, 7.0, 2000]
If you have created a complex transaction that includes several parts, one part of which has a higher probability of rolling back the transaction than the others, better performance will be provided if you locate the most likely to fail part of the transaction at the front of the greater transaction. This way, if this more-likely-to-fail transaction has to roll back because of a failure, there has been no resources wasted on the other less-likely-to-fail transactions.





you often need to INSERT the same value into a column. For example, perhaps you have to perform 100,000 INSERTs a day into a particular table, and that 90% of the time the data INSERTed into one of the columns of the table is the same value.If this the case, you can reduce network traffic (along with some SQL Server overhead) by creating this particular column with a default value of the most common value. This way, when you INSERT your data, and the data is the default value, you don't INSERT any data into this column, instead allowing the default value to automatically be filled in for you.





we discussed various T-SQL discovery, optimization and tuningtechniques using the SET and DBCC commands. SQL Server performance is acomposite of many factors. And while T-SQL p

No comments: