SQL Server: Query Optimization And SQL Tuning

To get better and time efficient result from SQL server, one need to be careful while designing DB and working with Queries. SQL Statements are used to retrieve data from the database. We can get same results by writing different sql queries. But use of the best query is important when performance is considered. So one need to write SQL query with performance and requirement in mind.

Here I am sharing some performance optimization techniques from my experience.

SQL Tuning/SQL Optimization Techniques:

Every Table should have primary key, if primary column is of integer type, then it should have Identity set to true.
Table should have appropriate amount of non-clustered index, Non-clustered index should be created on columns of table based on query which is running
Following priority order should be followed when any index is created a) WHERE clause, b) JOIN clause, c) ORDER BY clause, d) SELECT clause

Always use the actual columns names in SELECT statement instead of than '*' for better performance.

Example: Write the query as

SELECT Id, Roll_No, First_Name, Last_Name FROM Student_Details

Instead of:

SELECT * FROM Student_Details

Sql Server is not case sensitive

Example: Write the query as

SELECT * FROM Student_Details WHERE First_Name='sandeep'

Instead of
SELECT * FROM Student_Details WHERE LOWER(First_Name)='sandeep'

Always use WHERE Clause in SELECT Queries while we don’t need all the rows to be returned. This will help to narrow the return rows else it will perform a whole table scan and waste the Sql server resources with increasing the network traffic. While scanning the whole it will lock the Table which may prevent other users to access the table.

The queries having WHERE clause connected by AND operators are evaluated from left to right in the order they are written. So certain things should be taken care of like,

Provide the least likely true expressions first in the AND. By doing this if the AND expression is false at the initial stage the clause will end immediately. So it will save execution time
If all the parts of the AND expression are equally like being false then better to put the Complex expression first. So if the complex works are false then less works to be done.
While running a query, the operators used with the WHERE clause directly affect the performance. The operators shown are in their decreasing order of their performance. =, >,>=,<, <=, LIKE, <>

HAVING clause is used to filter the rows after all the rows are selected. It is just like a filter. Do not use HAVING clause for any other purposes. While the select statement contains a HAVING clause, its better to make the WHERE clause to do most of the works (removing the undesired rows) for the Query instead of letting the HAVING clause to do the works. In a SELECT statement with GROUP BY and HAVING clause, things happens like first WHERE clause will select appropriate rows then GROUP BY divide them to group of rows and finally the HAVING clause have less works to perform, which will boost the performance.

Example: Write the query as

SELECT Subject_Name, Count(Subject_Name)

FROM Student_details

WHERE Subject_Name <> 'English'

AND Subject_Name <> 'Maths'

GROUP BY Subject_Name

Instead of:

SELECT Subject_Name, Count(Subject_Name)

FROM Student_details

GROUP BY Subject_Name

HAVING Subject_Name <> 'English' AND Subject_Name <> 'Maths'

Sometimes you may have more than one subqueries in your main query. Try to minimize the number of subquery block in your query.

Example: Write the query as

SELECT Name

FROM Employee

WHERE (Salary, Age) = (SELECT MAX(Salary), MAX(Age) FROM Employee_Details)

Instead of:

SELECT Name
FROM Employee
WHERE Salary = (SELECT MAX(Salary) FROM Employee_details) AND Age = (SELECT MAX(Age) FROM Employee_Details)

While there is case to use IN or BETWEEN clauses in the query, it is always advisable to use BETWEEN for better result.

Example: Write the query as

SELECT * FROM Student_Details

WHERE Roll_No BETWEEN (10 AND 15)

Instead of:

SELECT * FROM Student_Details

WHERE Roll_No IN (10,11,12,13,14,15)

Use operator EXISTS, IN and table joins appropriately in your query.

i) Usually IN has the slowest performance.

ii) IN is efficient when most of the filter criteria is in the sub-query.

iii) EXISTS is efficient when most of the filter criteria is in the main query.

When there is a choice to use IN or EXIST, we should go with EXIST clause for better performance.Same case for NOT IN or EXIST.

For Example: Write the query as

Select * From Product p

Where EXISTS (Select * From Order_Detail o Where o.ProductId = p.Id)

Instead of:

Select * From Product p Where Id IN (Select ProductId From Order_Detail)

Use EXISTS instead of DISTINCT when using joins which involves tables having one-to-many relationship.

Example: Write the query as

SELECT d.Dept_Id, d.Dept_Name

FROM Department d

WHERE EXISTS (SELECT 'X' FROM Employee e WHERE e.Dept_Name = d.Dept_Name)

Instead of:

SELECT DISTINCT d.Dept_Id, d.Dept_Name

FROM Department d,Employee e

WHERE e.Dept_Name = e.Dept_Name

It is always best practice to use the Index seek while the columns are covered by an index, this will force the Query Optimizer to use the index while using IN or OR clauses as a part of our WHERE clause.

Example: Write the query as

SELECT * FROM Student_Details (INDEX=IX_Roll_no) WHERE Roll_No IN (22,43,55)

Instead of:

SELECT * FROM Student_Details WHERE Roll_No IN (22,43,55)

While we use Like, in the sql query it better to use one or more leading characters in the clause instead of using the wildcard character at the starting.
SELECT * FROM Student_Details WHERE First_Name LIKE 's%'
SELECT * FROM Student_Details WHERE First_Name LIKE '%s'
In the first query the Query optimizer is having the ability to use an index to perform the query and there by reducing the load on sql server. But in the second query, no suitable index can be created while running the query.
Always try to use UNION ALL in place of UNION.
Its sometimes better to combine queries using UNION ALL instead of using many OR clauses.
SELECT Id, First_Name, Last_Name FROM Student_Details
WHERE First_Name = 'Sandeep' OR Age = 25
The above query to use and index, it is required to have indexes on all the 3 columns. The same query can be written as

SELECT Id, First_Name, Last_Name FROM Student_Details
WHERE First_Name = 'Sandeep'

UNION ALL

SELECT Id, First_Name, Last_Name FROM Student_Details
WHERE Age = 25

Both the queries will provide same results but if there is only an index on First_Name and no indexes on the Age, then the first query will not use the index and a table scan is performed. But the 2nd one will use the index as the part of the query.

Always avoid the use of SUBSTRING function in the query.

Example: Write the query as

SELECT Id, First_Name, Last_Name

FROM Student_Details

WHERE First_Name LIKE 'San%'

Instead of:

SELECT Id, First_Name, Last_Name

FROM Student_Details

WHERE SUBSTRING(First_Name,1,3) = 'San'

Be careful while using conditions in WHERE clause.

Example:

Write the query as

SELECT Id, First_Name

FROM Student_Details

WHERE Age BETWEEN MIN(Age) And MAX(Age)

Instead of:

SELECT Id, First_Name

FROM Student_Details

WHERE Age > MIN(Age) And Age <= MAX(Age)

Write the query as

SELECT Id, First_Name

FROM Student_Details

WHERE First_Name = 'Sandeep' AND Last_Name = 'Kumar'

Instead of:

SELECT Id, First_Name

FROM Student_Details

WHERE First_Name || Last_Name = 'SandeepKumar'

Write the query as

SELECT Id, First_Name

FROM Student_Details

WHERE First_Name LIKE NVL (:name, '%')

Instead of:

SELECT Id, First_Name

FROM Student_Details

WHERE First_Name = NVL ( :name, First_Name)

Use non-column expression on one side of the query because it will be processed earlier.

Example:Write the query as

SELECT Id, First_Name

FROM Student_Details

WHERE Age < 15;

Instead of:

SELECT Id, First_Name

FROM Student_Details

WHERE Age + 5 < 15

Use DECODE to avoid the scanning of same rows or joining the same table repetitively. DECODE can also be made used in place of GROUP BY or ORDER BY clause.

Example: Write the query as

SELECT Id, First_Name

FROM Student_Details

WHERE First_Name LIKE 'San%' And Last_Name = 'Kumar'

Instead of:

SELECT DECODE(Last_Name,'Kumar',Id,NULL) Id

FROM Student_Details

WHERE First_Name LIKE 'San%'

Remove any unnecessary joins from table. Joins are expensive in terms of time. Make sure that you use all the keys that relate the two tables together and don't join to unused tables -- always try to join on indexed fields. The join type is important as well (INNER, OUTER,... ).
Remove any adhoc queries and use Stored Procedure instead, Ad Hoc queries are a security hole.
Dynamic queries are cached but they may not be reused and would be taking up space in the plan cache. If you’re going to use dynamic SQL, use parameterized queries which will promote plan reuse.
If possible move the logic of UDF(User Defined Function) to SP(Stored Procedure) as well
To store large binary objects, first place them in the file system and add the file path in the database.
To write queries which provide efficient performance follow the general SQL standard rules.

i) Use single case for all SQL verbs
ii) Begin all SQL verbs on a new line
iii) Separate all words with a single space
iv) Right or left aligning verbs within the initial SQL verb

Do not to use Views or replace views with original source table, In simple terms views like any other sql statements need to be compiled then executed which consumes time and space for large amount of data unlike stored procs which are already precompiled. Views perform about as well as accessing the tables directly as long as the query in the view has been optimized and the tables indexed appropriately. However, using a view may require slightly more overhead than using the tables directly because the view must be rendered before the results are returned, similar to a derived table.
Be careful when using indexes views as these views are materialized in the database. This means that every time a base table in the index view is updated, the materialized table must also be updated (along with it’s indexes). Indexed views are great for read-only reporting or OLAP databases, but may cause performance issues with OLTP databases.

If there is cursor used in query, see if there is any other way to avoid the usage of this (either by SELECT … INTO or INSERT … INTO, etc)

Triggers should not be used if possible, incorporate the logic of trigger in stored procedure. Using triggers requires more overhead in the database than just incorporating the logic in a stored procedure. That does not mean you should not use triggers, just use them wisely and only were necessary.

Techie's Diary

Search This Blog

SQL Server: Query Optimization And SQL Tuning

Labels

Comments

Post a Comment

Popular posts from this blog

C#: Merging Excel cells with NPOI HSSFWorkbook

Error 405 : ASP.NET Core Web API PUT and DELETE Methods not allowed

How to set Swagger as the default start page for API hosted on the Azure web app?