SQL Server Performance Tuning for SQL Server Developers

By Brad McGehee

Don't think that performance tuning your SQL Server applications is relegated to the end
of the development process. If you want your SQL Server-based applications to scale
and run at their full potential, you must begin considering scalability and performance
issues during the early stages of your application's development.

If you have been a DBA or SQL developer for very long, then you have probably run
across some slow SQL Server-based applications. And often when this happens,
everybody begins blaming everybody else for the problem. It's the network. It's the
server hardware. It's SQL Server. It's the users. It's the database. And this goes on and
on, but unfortunately, blame doesn't fix slow applications. The cure for most slow SQL
Server-based applications is prevention, which includes careful up-front analysis of the
users' needs, thoughtful design, optimal coding, and appropriate implementation.

For any application, SQL Server or otherwise, scalability and performance have to be
built in from the very beginning. Once the application is rolled out, it is very difficult and
expensive to resolve most scalability and performance issues.

In this article you are going to learn the fundamentals of how to design, code, and
implement scalable and performance optimized SQL Server applications. You won't learn
everything, as that would take an entire book. The focus of this article is on learning the
very minimum you must know in order to produce scalable and performance tuned SQL
Server-based applications. Here's what you will learn:

• What Every Developer and DBA Must Know About SQL Server Performance Tuning
• How to Optimize Your Server's Hardware
• How to Optimize SQL Server's Configuration Settings
• How to Optimize Your Application's Design
• How to Optimize Your Database's Design
• How to Optimize Your Application Code for SQL Server
• How to Optimize Your Transact-SQL Code
• How to Select Indexes for Optimal Database Performance
• How to Take Advantage of SQL Server Performance Tuning Tools

At the very least, if you take advantage of the advice and information in this article, you
will find that performance tuning your SQL Server-related applications is not as big a
mystery as you might think. So let's get to work.

What Every Developer Must Know About SQL Server Performance Tuning
As a developer, there are some overriding principles on which to proceed. This section
introduces these principles. Keep these in mind as you read about specific performance
tuning details discussed later in this article, and whenever performance tuning your SQL
Server applications.

Performance Tuning is Not a Science


SQL Server performance tuning is more art than science. I am sure you didn't want to
hear that, but this is a fact of life. I wish I could tell you exactly, step-by-step, what you
need to do to make your applications scale and perform well. The problem, as you
probably already know, is that any modern software application is a combination of
many complex variables. Unfortunately, no matter how hard you try, you don't have full
control of your application and the environment it runs under. For example, here are
some (not all) of the factors that affect an application's performance:

• SQL Server (the program itself)
• SQL Server's Configuration Settings
• The Application's Transact-SQL Code
• The Application's non-Transact-SQL Code
• The Database's Design
• The Operating System (server and client)
• The Middleware (Microsoft Transaction Server, Microsoft Message Queue Server)
• The Hardware (server and client)
• The Network Hardware and Bandwidth (LAN and WAN)
• The Number of Clients
• The Client Usage Patterns
• The Type and Quantity of Data Stored in SQL Server
• Whether the Application is OLTP- or OLAP-based

While it is virtually impossible to control every factor that influences SQL Server's
scalability and performance, what you can do is make the most of what you can control.

Test During All Stages of Your Application's Development


Scalability and performance testing should not be done only after the application is
written and ready to be rolled out. Testing should be an integral part of the development
process, beginning with the earliest stages of the application and database design, and
continuing as appropriate throughout the entire process. Most scalability and
performance issues are a result of poor initial design, and can only be prevented early in
the game. If you wait until after the application is complete, you will either have to live
with performance problems, or rewrite the application.

When performing tests, always proceed scientifically, changing only one variable at a
time. For example, if you suspect that you need to add an index to a table to boost
performance, but you are not sure which index, or which type of index, is best,
experiment with only one change at a time, testing each change individually to see if it
produces the results you expect. If you change more than one thing at a time, you won't
know which of your changes worked and which didn't. This goes for all testing, whether
it is adding indexes, making SQL Server configuration changes, or testing various
hardware configurations.
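
As a rough illustration of testing one change at a time, here is a minimal Python sketch. It uses an in-memory SQLite database purely as a stand-in for SQL Server, and the table, column, and index names (orders, customer_id, ix_orders_customer) are hypothetical. The same query is measured before and after a single change, the addition of one index, so any difference can be attributed to that change alone.

```python
import sqlite3
import time

# ASSUMPTION: SQLite stands in for SQL Server here; all names are hypothetical.
QUERY = "SELECT order_id, total FROM orders WHERE customer_id = ?"

def build_db(rows=20000):
    """Create a small test table with 1,000 distinct customer_id values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        ((i % 1000, i * 0.5) for i in range(rows)),
    )
    conn.commit()
    return conn

def measure(conn, repeats=50):
    """Return (query plan text, elapsed seconds, row count) for the test query."""
    plan = " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)))
    start = time.perf_counter()
    for _ in range(repeats):
        rows = conn.execute(QUERY, (42,)).fetchall()
    return plan, time.perf_counter() - start, len(rows)

conn = build_db()
before = measure(conn)  # baseline: nothing has been changed yet
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")  # the ONE change
after = measure(conn)   # same query, same data, same measurement

print("before:", before)
print("after: ", after)
```

Because only the index differs between the two measurements, the change in the query plan and in the elapsed time can be traced to it. Had a second change been made at the same time, the comparison would say nothing about either one.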

Always try to test under realistic conditions. This means using "real" data, testing
against the largest expected data sets, and using hardware similar to the hardware that
will be used when the application goes into production. If you don't, you may be
surprised to find that while your application works well for 10 simultaneous users during
testing, it fails miserably when 500 users are online.

Not All Performance Issues are Obvious


If you do much performance tuning, you will soon realize that many performance tuning
and scalability issues are not so obvious. This is because many performance-related
problems have two or more underlying causes, not a single obvious one. This
makes it difficult to isolate and fix the problem. While there are no easy solutions, one of
the best approaches to take is to isolate and correct one problem at a time, until you
have found and fixed them all.

Not All Performance Tuning Suggestions Work In All Cases


In this article, and from other performance tuning sources, you will find dozens of ideas
and tips on performance tuning. Keep in mind that in many cases a performance tuning
suggestion will work great in one particular situation, but could actually reduce
performance under a different situation. This is because many performance tuning
suggestions are situation specific. As the person responsible for performance tuning, you
will need to evaluate each tip or suggestion you run across and decide whether it is
applicable to your particular situation. In other words, don't blindly proceed with a
performance tuning tip. Be sure you understand its implications before you use it.

SQL Server Performance Tuning is a Learned Skill


Mastering SQL Server 2000 performance tuning cannot be done overnight. In fact,
experience, more than book learning, is how you will master this skill. But in order to
take advantage of the experience you gain over time, it is also important to be
conversant in the fundamentals of the technologies that affect your application's
performance.

For example, you need to fully understand the programming language used to write your
application, database design, application design, Transact-SQL, how SQL Server stores
and indexes data, and how networks and server hardware really work. The better
understanding you have of the basics of the applicable technologies used to develop and
roll out your application, the better position you will be in to understand what is causing
performance and scalability problems and how to resolve them. Learn all you can.

How to Optimize Your Server's Hardware


When it comes time to blame poor application performance on something, server
hardware gets a disproportionate amount of blame. What is ironic is that in most cases
the hardware is not the main cause of the problem. In fact, server hardware plays a
much smaller role than most people think when it comes to SQL Server-based
application performance and scalability.

The reason for this is that most slow applications are slow because of poor up front
design, not because of slow hardware. The reason hardware is often blamed is that
performance problems often don't show themselves until after the application is rolled
out. And since the application's design can't be changed at this time, about the only
thing you can try to help boost performance is to throw hardware at it. While hardware
can help, it usually doesn't fully resolve the problem, and this is why hardware is often
blamed for slow performance. While hardware can sometimes be an issue, most likely it
is not.

In order to prevent your server hardware from being a drag on your SQL Server-based
application (which it can be if it is inappropriately selected or configured), let's take a brief
look at some of the most common hardware selection and tuning issues.

Selecting Hardware
Selecting the optimum hardware for your SQL Server-based application depends on a
variety of factors, such as the size of the database, the number of users, how the
database is used (OLTP or OLAP), and others. While there is no sure-fire formula for
sizing server hardware, the best way to get a feel for sizing is to test your application
early in the development stage. Ah, testing is mentioned again. That's right. While many
experienced DBAs can probably give you a good estimate on the optimum hardware you
need, only through realistic testing will you know for sure what hardware is required to
meet your application's needs.

When it comes to server hardware, here are some things to keep in mind:
• CPU: Always purchase a server with the ability to expand its number of CPUs. For
example, if testing indicates that a single CPU server will be adequate, purchase a
server with at least room for two CPUs, even if you only use one of the slots. The
same goes for larger servers with four or more CPUs. Always leave room for
growth.
• Memory: This is probably the most significant piece of hardware that affects SQL
Server's performance. Ideally, your entire database should fit into RAM.
Unfortunately, this is not often possible. At the very minimum, try to get enough
RAM to hold the largest table you expect to have, and if you can afford it, get all
the RAM your server can handle, which is often 2GB or more. There is no such
thing as too much RAM.
• I/O Subsystem: After RAM, the I/O subsystem is the most important piece of
hardware to affect SQL Server's performance. At the very minimum, purchase
hardware-based RAID for your databases. As a rule of thumb, you will want to
purchase more, smaller drives, not fewer, larger drives, for your array. The more
disks that are in an array, the faster I/O will be.
• Network Connection: At the server, have at least one 100 Mbps network card, and
it should be connected to a switch. Ideally, you should have two network cards in
the server connected to a switch in full-duplex mode.

Tuning the Server


Even the most expensive server hardware won't perform well if it is not configured and
tuned correctly. I have seen many hardware-related performance problems caused as
the result of not using Microsoft NT Server approved hardware and drivers. Some of
these types of hardware performance-related issues are very difficult to trace and
resolve. Ideally, ensure that your hardware, including NT, is correctly installed and
configured by a competent technician. Then test your application under controlled
conditions to test for potential performance issues before it is used in production.

Your operating system must also be configured correctly. This includes many things, too
many to describe here. Just as with the hardware, ensure that the operating system is
properly configured and tested before it is put into production.

For best performance on a server, SQL Server should be the only application running on
the server, other than management utilities. Don't try to save a few bucks by putting
your IIS or MTS server on the same server as SQL Server. Not only does this hurt SQL
Server's performance, but it also makes it more difficult to performance tune and
troubleshoot SQL Server.

How to Optimize SQL Server's Configuration Settings


Another common misconception about tuning SQL Server is that you must fine-tune its
various configuration settings in order to get optimum performance. While there was
some truth to this in earlier versions of SQL Server, this is no longer much of an issue, except
on the very largest and busiest of servers.

For the most part, SQL Server is self-tuning. What does this mean? It means that SQL
Server observes what is running on itself, and automatically makes internal adjustments
which, for the most part, keep SQL Server running as optimally as possible given the
tasks at hand and the given hardware.

When you perform performance testing on SQL Server, keep in mind that SQL Server
can take some time before it adjusts itself optimally. In other words, the performance
you get immediately after starting the SQL Server service, and the performance you get
a couple of hours later after a typical workload has been running, can be different.
Always perform your testing after SQL Server has had a chance to adjust itself to your
workload.

There are 36 SQL Server configuration options that can be changed using either the
Enterprise Manager or the sp_configure stored procedure. Unless you have a lot of
experience tuning SQL Server, I don't recommend you change any of SQL Server's
settings. As a novice, you may make a change that could in fact reduce performance.
This is because when you change a setting, you are "hard-coding" the setting from then
on. SQL Server has the ability to change its setting on the fly, based on the current
workload. But once you "hard-code" a setting, you partially remove SQL Server's ability
to tune itself.

If after serious consideration you feel that making a change to one or more SQL Server
configuration settings can boost performance in your particular environment, then you
will want to proceed slowly and cautiously. Before you make the setting change, you will
first want to get a good baseline on the SQL Server's performance, under a typical
workload, using a tool such as Performance Monitor (discussed later). Then make only
one change at a time. Never make more than one change at a time, because if you do,
you won't know which change, if any of them, made a difference.

Once the one change is made, again measure SQL Server's performance under the same
workload to see if performance was actually boosted. If it wasn't, which will often be the
case, then change back to the default setting. If there was a performance boost, then
continue to check to see if the boost in performance continues under other workloads the
server experiences over time. Your later testing may show that your change helps some
workloads, but hinders others. This is why changing most configuration settings is not
recommended.
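
The baseline, change, and re-measure cycle described above can be reduced to a small decision rule. The Python sketch below is one possible form of it, not a prescribed method. The function name evaluate_change, the 5% threshold, and the timing figures are all made up for illustration and are not measurements from any server.

```python
import statistics

def evaluate_change(baseline, changed, min_gain=0.05):
    """Compare timings (seconds) for the same workload before and after ONE setting change.

    Returns ("keep" or "revert", fractional gain). min_gain is a hypothetical
    threshold: improvements smaller than 5% are treated as noise.
    """
    base = statistics.median(baseline)
    new = statistics.median(changed)
    gain = (base - new) / base
    return ("keep" if gain >= min_gain else "revert"), gain

# Made-up timings, for illustration only.
daytime = evaluate_change([1.00, 1.10, 0.90], [0.50, 0.60, 0.55])
nightly = evaluate_change([2.00, 2.10, 1.90], [2.40, 2.50, 2.30])
print("daytime workload:", daytime)
print("nightly workload:", nightly)
```

In this invented example the same change helps the daytime workload and hurts the nightly one, which is exactly the situation in which returning to the default setting is usually the safer choice.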

In any event, if your application is suffering from a performance-related issue, the odds
of a configuration change resolving it are quite low.

How to Optimize Your Application's Design


If you are using an n-tier design for your application, and who isn't for most large-scale
applications these days, SQL Server is just one part of a larger application. And, perhaps
more than you realize, how you implement your n-tier design affects your
application's performance more than SQL Server itself does. Unfortunately, SQL Server often
gets more of the blame for poor performance than the application design, even when it
is usually the application's design that is causing most of the performance problems.
What I hope to do here is offer a few suggestions that may aid you in your application
design, helping to prevent SQL Server from getting all the blame for poor performance.
So let's start.

One of the first steps you must take when designing an n-tier application is to select
the logical and physical design. Of the two, the physical design is where most of the
mistakes are made when it comes to performance. This is because this is where the
theory (based on the logical design) has to be implemented in the real world. And just
like anything else, you have many choices to make. And many of these choices don't
lend themselves to scalability or high performance.

For example, do you want to implement a physical two-tier implementation with fat
clients, a physical two-tier implementation with a fat server, a physical three-tier
implementation, an Internet implementation, or some other implementation? Once you
have answered this question, you must then decide what development language will be
used, what browser will be used, whether you will use Microsoft Transaction Server
(MTS), whether you will use Microsoft Message Queue Server (MSMQ), and on and on.

Each of these many decisions can and will affect performance and scalability. Because
there are so many options, it is again important to test potential designs early in the
design stage, using rapid prototyping, to see which implementation will best meet your
users' needs.

More specifically, as you design your physical implementation, try to follow these general
recommendations to help ensure scalability and optimal performance in your application:

• Perform as many data-centered tasks as possible on SQL Server in the form of
stored procedures. Avoid manipulating data at the presentation and business
services tiers.
• Don't maintain state (don't store data from the database) in the business services
tier. Maintain state in the database as much as possible.
• Don't create complex or deep object hierarchies. The creation and use of complex
classes or a large number of objects used to model complex business rules can be
resource intensive and reduce the performance and scalability of your application.
This is because the memory allocation when creating and freeing these objects is
costly.
• Consider designing the application to take advantage of database connection
pooling and object pooling using Microsoft Transaction Server (MTS). MTS allows
both database connections and objects to be pooled, greatly increasing the
overall performance and scalability of your application.
• If your application runs queries against SQL Server that by nature are long,
design the application to be able to run queries asynchronously. This way, one
query does not have to wait for another to finish before it can run. One way to
build this functionality into your n-tier application is to use the Microsoft Message
Queue Server (MSMQ).

While following these suggestions won't guarantee a scalable and fast performing
application, they are a good start.
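
The asynchronous pattern in the last suggestion can be sketched in a few lines of Python. Note the assumptions: an in-process queue and a worker thread stand in for a message queue such as MSMQ (this is not the MSMQ API), and slow_report is a hypothetical placeholder for a long-running query.

```python
import queue
import threading

def worker(requests, results):
    """Take queued requests one at a time and post each result when it is ready."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: no more work
            break
        name, func = item
        results.put((name, func()))

def slow_report():
    """HYPOTHETICAL stand-in for a long-running query."""
    return sum(i * i for i in range(200000))

requests, results = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(requests, results), daemon=True)
thread.start()

requests.put(("monthly_report", slow_report))  # returns immediately
# The caller is free to do other work here while the request is processed.
requests.put(None)
thread.join()

name, value = results.get()
print(name, value)
```

The caller hands the request to the queue and moves on, and the result is collected later. A real message queue adds durability and cross-machine delivery, which this sketch does not attempt to show.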

How to Optimize Your Database's Design


Like application design, database design is very critical to the scalability and
performance of your SQL Server applications. And also like application design, if you
don't do a good job in the first place, it is very hard and expensive to make changes
after your application has gone into production. Here are some key things to keep in
mind when designing SQL Server databases for scalability and performance.

As always, you will want to test your design as early as possible using realistic data. This
means you will need to develop prototype databases with sample data, and test the
design using the type of activity you expect to see in the database once production
starts.

One of the first design decisions you must make is whether the database will be used for
OLTP or OLAP. Notice that I said "or". One of the biggest mistakes you can make when
designing a database is to try to meet the needs of both OLTP and OLAP. These two
types of applications are mutually exclusive if you are interested in any sense of high
performance and scalability.

OLTP databases are generally highly normalized, helping to reduce the amount of data
that has to be stored. The less data you store, the less I/O SQL Server will have to
perform, and the faster database access will be. Transactions are also kept as short as
possible in order to reduce locking conflicts. And last of all, indexing is generally
minimized to reduce the overhead of high levels of INSERTs, UPDATEs, and DELETEs.
OLAP databases, on the other hand, are highly de-normalized. In addition, transactions
are not used, and because the database is read-only, record locking is not an issue. And
of course, heavy indexing is used in order to meet the wide variety of reporting needs.

As you can see, OLTP and OLAP databases serve two completely different purposes, and
it is virtually impossible to design a database to handle both needs. While OLAP database
design is outside this article's scope, I do want to mention a couple of performance-related
suggestions in regard to OLTP database design.

When you go through the normalization process when designing your OLTP databases,
your initial goal should be to fully normalize them according to the three general principles
of normalization. The next step is to perform some preliminary performance testing,
especially if you foresee having to perform joins on four or more tables at a time. Be
sure to test using realistic sample data.

If performance is acceptable, then don't worry about having to join four or more tables
in a query. But if performance is not acceptable, then you may want to do some
selective de-normalization of the tables involved in order to reduce the number of joins
used in the query, and to speed performance.
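
To make the trade-off concrete, here is a small Python sketch using an in-memory SQLite database as a stand-in for SQL Server. All table and column names are hypothetical. It builds a normalized four-table design, then a selectively de-normalized table that answers the same question without any joins.

```python
import sqlite3

# ASSUMPTION: SQLite stands in for SQL Server; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers   (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products    (product_id  INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE orders      (order_id    INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE order_items (order_id INTEGER, product_id INTEGER, qty INTEGER);
INSERT INTO customers   VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO products    VALUES (10, 'Widget'), (11, 'Gadget');
INSERT INTO orders      VALUES (100, 1), (101, 2);
INSERT INTO order_items VALUES (100, 10, 2), (100, 11, 1), (101, 10, 5);
""")

# Fully normalized: the report needs a four-table join.
NORMALIZED = """
SELECT c.name, p.title, i.qty
FROM orders o
JOIN customers c   ON c.customer_id = o.customer_id
JOIN order_items i ON i.order_id    = o.order_id
JOIN products p    ON p.product_id  = i.product_id
ORDER BY c.name, p.title
"""

# Selectively de-normalized: names and titles are copied into one wider table.
conn.execute("""
CREATE TABLE order_lines_flat AS
SELECT o.order_id, c.name AS customer_name, p.title AS product_title, i.qty
FROM orders o
JOIN customers c   ON c.customer_id = o.customer_id
JOIN order_items i ON i.order_id    = o.order_id
JOIN products p    ON p.product_id  = i.product_id
""")
DENORMALIZED = """
SELECT customer_name, product_title, qty
FROM order_lines_flat
ORDER BY customer_name, product_title
"""

normalized_rows = conn.execute(NORMALIZED).fetchall()
flat_rows = conn.execute(DENORMALIZED).fetchall()
print(normalized_rows)
print(flat_rows)
```

Both queries return the same rows, but the second avoids the joins at the price of storing customer names and product titles redundantly. Whether that trade is worth making is something only testing with realistic data volumes can show.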

It is much easier to catch a problem in the early database design stage, rather than after
the finished application has been rolled out. De-normalization of tables after the
application is complete is nearly impossible. One word of warning. Don't be tempted to
de-normalize your database without thorough testing. It is very hard to deduce logically
what de-normalization will do to performance. Only through realistic testing can you
know for sure if de-normalization will gain you anything in regard to performance.

How to Optimize Your Application Code for SQL Server


At some point during the development process you will have to begin coding your
application to work with SQL Server. By this time, the application and database designs
should have already been completed and tested for performance and scalability using
rapid prototyping techniques.

How you code your application has a significant bearing on performance and scalability,
just as the database design and the overall application design do. Sometimes, something
as simple as choosing one coding technique over another can make a significant
difference. Rarely is there only one way to code a task, but
often there is only one way to code a task for optimum performance and scalability.
What I want to do in this section is focus on some essential techniques that can affect
the performance of your application and SQL Server.

Since I don't know what development language you will be using, I am going to assume
here that you will be using Microsoft's ADO (ActiveX Data Objects) object model to access
SQL Server from your application. The examples I use here should work for most Visual
Basic and ASP developers. So let's just dive in and look at some specific techniques you
should implement in your application code when accessing SQL Server data to help
ensure high performance.

Use OLE DB to Access SQL Server


You can access SQL Server data using either ODBC or OLE DB. Which method you use
depends on how you specify the connection string when you use ADO to connect to SQL
Server. For best performance, always select OLE DB. OLE DB is used natively by SQL
Server, and is the most effective way to access any SQL Server data.

Along these same lines, when creating an ADO connection to SQL Server, you can either
use a DSN in the connection string, or you can use a DSN-less connection. For optimal
performance, use DSN-less connections. Using them avoids the need for the OLE DB
driver to look up connection string information in the registry of the client the application
code is running on, saving some overhead.
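
For reference, the difference between the two styles of connection string looks roughly like this. The Python below only builds and prints the strings that would be handed to ADO's Connection.Open; it does not connect to anything. The helper names, the server name (SALES01), the database name (Orders), and the DSN name (SalesDSN) are hypothetical.

```python
def oledb_connection_string(server, database):
    """DSN-less string naming the SQL Server OLE DB provider (SQLOLEDB) directly.

    Uses Windows authentication (Integrated Security=SSPI).
    """
    return (
        "Provider=SQLOLEDB;"
        f"Data Source={server};"
        f"Initial Catalog={database};"
        "Integrated Security=SSPI;"
    )

def odbc_dsn_connection_string(dsn):
    """String that refers to a DSN, which must be looked up on the client at connect time."""
    return f"DSN={dsn};"

# HYPOTHETICAL server, database, and DSN names.
print(oledb_connection_string("SALES01", "Orders"))
print(odbc_dsn_connection_string("SalesDSN"))
```

The first form carries everything the provider needs in the string itself. The second depends on a DSN defined on the client, which is the extra lookup described above.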

Encapsulate your DML (Data Manipulation Language) in Stored Procedures


ADO gives you three different ways to SELECT, INSERT, UPDATE, or DELETE data in a
SQL Server database. You can use ADO's methods, you can use dynamic SQL, or you
can use stored procedures. Let's take a brief look at each of these.

The easiest way to manipulate data from your application is to use ADO's various
methods, such as rs.AddNew, rs.Update, or rs.Delete. While using these methods is easy
to learn and implement, you pay a relatively steep penalty in overhead for using them.
ADO's methods often create slow cursors and generate large amounts of network traffic.
If your application is very small, you would never notice the difference. But if your
application has much data at all, your application's performance could suffer greatly.

Another way to manipulate data stored in SQL Server using ADO is to use dynamic SQL
(also sometimes referred to as ad hoc queries). Here, what you do is send Transact-SQL
in the form of strings from ADO in your application to be run on SQL Server. Using
dynamic SQL is generally much faster than using ADO's methods, although it does not
offer the greatest performance. When SQL Server receives the dynamic SQL from your
ADO-based application, it has to compile the Transact-SQL code, create a query plan for
it, and then execute it. Compiling the code and creating the query plan the first time
takes a little overhead. But once the Transact-SQL code has been compiled and a query
plan created, it can be reused over and over assuming the Transact-SQL code sent later
is nearly identical, which saves overhead.
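
Why the wording of the Transact-SQL string matters for plan reuse can be shown with a toy model. The Python class below is not how SQL Server's cache works internally. It is a deliberately simplified illustration that reuses a "plan" only when the statement text matches exactly, which is stricter than SQL Server itself (SQL Server can also reuse plans for some statements that differ only in literal values). The table, column, and parameter names are hypothetical.

```python
class ToyPlanCache:
    """TOY MODEL ONLY: caches one 'plan' per exact statement text."""

    def __init__(self):
        self.plans = {}
        self.compiles = 0

    def execute(self, sql_text):
        if sql_text not in self.plans:  # unseen text: pay the compile cost
            self.compiles += 1
            self.plans[sql_text] = f"plan#{self.compiles}"
        return self.plans[sql_text]     # seen text: reuse the stored plan

# Dynamic SQL built by pasting a different literal into the string each time.
literal = ToyPlanCache()
for customer_id in range(100):
    literal.execute(f"SELECT name FROM customers WHERE customer_id = {customer_id}")

# The same request sent with a parameter marker: the text never changes.
parameterized = ToyPlanCache()
for customer_id in range(100):
    parameterized.execute("SELECT name FROM customers WHERE customer_id = @customer_id")

print("compiles with literals:  ", literal.compiles)
print("compiles with parameters:", parameterized.compiles)
```

In this model, the more consistent the statement text your application sends, the more often an existing plan can be reused instead of compiled again.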

For optimal performance, you will want to use ADO to call stored procedures on your
server to perform all your data manipulation. The advantages of stored procedures are
many. Stored procedures are pre-compiled and optimized, so this step doesn't
have to be repeated every time the stored procedure is run. The first time a stored
procedure is run, a query plan is created and stored in SQL Server's memory, so it can
be reused, saving even more time. Another benefit of stored procedures is that they help
reduce network traffic and latency. When your application's ADO code calls a stored
procedure on SQL Server, it makes a single network call. Then any required data
processing is performed on SQL Server, where data processing is most efficiently
performed, and then if appropriate, it will return any results to your application. This
greatly reduces network traffic and increases scalability and performance.

While stored procedures handle basic data manipulation like a champ, they can also
handle much more. Stored procedures can run virtually any Transact-SQL
code, and since Transact-SQL code is the most efficient way to manipulate data, all of
your application's data manipulations should be done inside of stored procedures on SQL
Server, not in COM components in the business-tier or on the client.

When you use ADO to execute stored procedures on SQL Server, you have two major
ways to proceed. You can use ADO to call the Refresh method of the Parameters
collection in order to save you a little coding. ADO needs to know what parameters are
used by the stored procedure, and the Refresh method can query the stored procedure
on SQL Server to find out the parameters. But as you might expect, this produces
additional network traffic and overhead. While it takes a little more coding, a more
efficient way to call a SQL Server stored procedure is to create the parameters explicitly
in your code. This eliminates the extra overhead caused by the Refresh method and
speeds up your application.

Encapsulate Your ADO Code in COM Components

As part of creating scalable and optimized n-tier applications, put the ADO code that
accesses SQL Server data into COM components. This is true whether your front end is a
Visual Basic application or a web-based ASP application. This gives you all the standard
benefits of COM components, such as object pooling using MTS. And for ASP-based
applications, it provides greater speed because the ADO code in COM objects is already
compiled, unlike ADO code found in ASP pages. How you implement your data
manipulation code in COM components should be considered when the application is first
designed.

When designing COM objects, make them as stateless as possible, avoiding the use of
properties. Instead, use methods to perform your data-related tasks. This is especially
critical if you use MTS, as any objects that preserve state can significantly reduce MTS's
ability to scale, while at the same time, increasing overhead and hurting performance.

For optimum performance, COM objects should be compiled as in-process DLLs (which is
required if they are to run under MTS). You should always employ early binding when
referencing COM objects, and create them explicitly, not implicitly.

How to Optimize Your Transact-SQL Code


Transact-SQL, just like any programming language, offers more than one way to perform
many tasks. And as you might imagine, some techniques offer better performance than
others. In this section you will learn some of the "tricks-of-the-trade" when it comes to
writing high performing Transact-SQL code.

Choose the Appropriate Data Types


While you might think that this topic should be under database design, I have decided to
discuss it here because Transact-SQL is used to create the physical tables that were
designed during the earlier database design stage.

Choosing the appropriate data types can affect how quickly SQL Server can SELECT,
INSERT, UPDATE, and DELETE data, and choosing the optimum data type is not
always obvious. Here are some suggestions you should implement when creating
physical SQL Server tables to help ensure optimum performance.

• Always choose the smallest data type you need to hold the data you need to store
in a column. For example, if all you are going to be storing in a column are the
numbers 1 through 10, then the TINYINT data type is more appropriate than the
INT data type. The same goes for CHAR and VARCHAR data types. Don't specify
more characters for character columns than you need. This allows SQL Server to
store more rows in its data and index pages, reducing the amount of I/O needed
to read them. Also, it reduces the amount of data moved from the server to the
client, reducing network traffic and latency.
• If the text data in a column varies greatly in length, use a VARCHAR data type
instead of a CHAR data type. Although the VARCHAR data type has slightly more
overhead than the CHAR data type, the amount of space saved by using
VARCHAR over CHAR on variable length columns can reduce I/O, improving
overall SQL Server performance.
• Don't use the NVARCHAR or NCHAR data types unless you need to store 16-bit
character (Unicode) data. They take up twice as much space as VARCHAR or
CHAR data types, increasing server I/O overhead.
• If you need to store large strings of data, and they are less than 8,000
characters, use a VARCHAR data type instead of a TEXT data type. TEXT data
types have extra overhead that drags down performance.
• If you have a column that is designed to hold only numbers, use a numeric data
type, such as INTEGER, instead of a VARCHAR or CHAR data type. Numeric data
types generally require less space to hold the same numeric value as does a
character data type. This helps to reduce the size of the columns, and can boost
performance when the column is searched (WHERE clause) or joined to another
column.
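
The effect of column sizes on I/O can be estimated with simple arithmetic. The Python sketch below is a back-of-the-envelope calculation, not an exact model of SQL Server's storage format. It assumes an 8 KB data page with 8,096 bytes usable for rows and a flat per-row overhead of 9 bytes, which is an approximation for a row of up to eight fixed-length columns. The two example table layouts are hypothetical.

```python
USABLE_PAGE_BYTES = 8096  # 8 KB page minus its 96-byte header
ROW_OVERHEAD = 9          # ASSUMED approximate per-row overhead (see lead-in)

FIXED_SIZES = {"TINYINT": 1, "SMALLINT": 2, "INT": 4}

def column_bytes(type_name, length=None):
    """Storage bytes for one fixed-length column."""
    if type_name == "CHAR":
        return length        # one byte per character
    if type_name == "NCHAR":
        return 2 * length    # two bytes per character (Unicode)
    return FIXED_SIZES[type_name]

def rows_per_page(columns):
    row_size = sum(column_bytes(*col) for col in columns) + ROW_OVERHEAD
    return USABLE_PAGE_BYTES // row_size

def pages_needed(row_count, per_page):
    return -(-row_count // per_page)  # ceiling division

# HYPOTHETICAL layouts holding the same information.
oversized = [("INT",), ("CHAR", 100), ("NCHAR", 50)]
right_sized = [("TINYINT",), ("CHAR", 20), ("CHAR", 50)]

for label, layout in (("oversized", oversized), ("right-sized", right_sized)):
    per_page = rows_per_page(layout)
    print(label, per_page, "rows/page,", pages_needed(1_000_000, per_page), "pages per million rows")
```

Under these assumptions the right-sized layout fits roughly two and a half times as many rows on each page, so a scan of the table reads correspondingly fewer pages.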

Use Triggers Cautiously


Triggers can be a powerful tool in Transact-SQL, but since they execute every time an
INSERT, UPDATE, or DELETE statement runs against a table (depending on how the
trigger is created), they can produce a lot of overhead. Here are some tips on how to
optimize trigger performance.

 Keep the code in your triggers to the very minimum to reduce overhead. The
more code that runs in the trigger, the slower each INSERT, UPDATE, and DELETE
that fires it will be.
 Don't use triggers to perform tasks that can be performed using more efficient
techniques. For example, don't use a trigger to enforce referential integrity if SQL
Server's built-in referential integrity is available to accomplish your goal. The same
goes if you have a choice between using a trigger or a CHECK constraint to
enforce rules or defaults. You will generally want to choose a CHECK constraint,
because constraints are faster than triggers when performing the same task.
 Try to avoid rolling back triggers because of the overhead involved. Instead of
letting the trigger find a problem and roll back a transaction, catch the error
before it can get to the trigger (if possible based on your code). Catching an error
early (before the trigger fires) consumes fewer server resources than letting the
trigger roll back.
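
The constraint-based alternative to triggers can be sketched as follows. This is a minimal example under assumed names (the tables, columns, and constraint names are hypothetical): a FOREIGN KEY constraint enforces referential integrity and a CHECK constraint enforces a simple rule, so no trigger code has to run on each INSERT or UPDATE.

```sql
-- Hypothetical parent table.
CREATE TABLE dbo.Customers
(
    CustomerID INT NOT NULL PRIMARY KEY
);

-- Hypothetical child table: the rules are declared as constraints
-- instead of being coded in an INSERT/UPDATE trigger.
CREATE TABLE dbo.Orders
(
    OrderID    INT NOT NULL PRIMARY KEY,
    CustomerID INT NOT NULL
        CONSTRAINT FK_Orders_Customers
        REFERENCES dbo.Customers (CustomerID),  -- built-in referential integrity
    Quantity   INT NOT NULL
        CONSTRAINT CK_Orders_Quantity
        CHECK (Quantity > 0)                    -- rule enforced without a trigger
);
```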

Don't Access More Data Than You Need


While this suggestion may sound obvious, it must not be, because this is a common
performance-related issue I find over and over again in many SQL Server-based
applications. Here are some ideas on how to minimize the amount of data that is
returned to the client.

 Don't return more columns or rows of data to the client than absolutely
necessary. Doing so just increases disk I/O on the server and network traffic, both
of which hurt performance. In SELECT statements, don't use SELECT * to return
rows. Always specify in your SELECT statement exactly which columns are needed
for this particular query, and not a column more. In most cases, be sure to
include a WHERE clause to reduce the number of rows sent to only those rows the
client needs to perform the task immediately at hand.
 If your application allows users to run queries, but you are unable in your
application to easily prevent users from returning hundreds, even thousands of
rows they don't need, consider using the TOP operator within the SELECT
statement. This way, you can limit how many rows are returned, even if the user
doesn't enter any criteria to help reduce the number of rows returned to the
client.
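
The two suggestions above can be combined in a single query. The sketch below assumes a hypothetical dbo.Orders table with OrderID, OrderDate, Status, and CustomerID columns; the names and the limit of 100 rows are illustrative choices, not recommendations.

```sql
-- Value supplied by the application (hypothetical).
DECLARE @CustomerID INT;
SET @CustomerID = 42;

-- Name only the columns the client needs instead of using SELECT *,
-- filter with WHERE, and cap the result with TOP as a safety net.
SELECT TOP 100
       OrderID, OrderDate, Status
FROM   dbo.Orders
WHERE  CustomerID = @CustomerID
ORDER BY OrderDate DESC;
```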

Avoid Using Cursors


Transact-SQL is designed to work best on result sets, not on individual records. That's
where cursors come into play. They allow you to process individual records. The only
problem with individual record processing is that it is slow. Ideally, for high-performing
SQL Server-based applications, cursors should be avoided.

If you need to perform row-by-row operations, try to find another method to perform the
task. Some options are to perform row-by-row tasks at the client instead of the server,
to use temporary tables in tempdb at the server, or to use a correlated sub-query.
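
As one possible illustration of the correlated sub-query option, the statement below updates every customer row in a single set-based operation instead of looping over customers with a cursor. The dbo.Customers and dbo.Orders tables and the OrderCount column are hypothetical names assumed for this sketch.

```sql
-- Set-based replacement for a cursor that would visit each customer
-- and count that customer's orders one row at a time.
UPDATE c
SET    OrderCount = (SELECT COUNT(*)
                     FROM   dbo.Orders AS o
                     WHERE  o.CustomerID = c.CustomerID)  -- correlated to the outer row
FROM   dbo.Customers AS c;
```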

Unfortunately, these are not always possible, and you have to use a cursor. If you find it
impossible to avoid using cursors in your applications, then perhaps one of these
suggestions will help.

 SQL Server offers you several different types of cursors, each with different
performance characteristics. Always select the cursor with the least amount of
overhead that has the features you need to accomplish your goals. The most
efficient cursor you can choose is the fast forward-only cursor.
 When using a server-side cursor, always try to fetch as small a result set as
possible. This includes fetching only those rows and columns the client needs
immediately. The smaller the cursor, no matter what type of server-side cursor it
is, the fewer resources it will use, and performance will benefit.
 When you are done using a cursor, don't just CLOSE it. You must also
DEALLOCATE it. Deallocation is required to free up the SQL Server resources used
by the cursor. If you only CLOSE the cursor, locks are freed, but SQL Server
resources are not. If you don't DEALLOCATE your cursors, the resources used by
the cursor will stay allocated, degrading the performance of your server until they
are released.
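
When a cursor cannot be avoided, the tips above might look like the following sketch. It assumes a hypothetical dbo.Orders table, and the cursor and variable names are made up for the example. The cursor is declared FAST_FORWARD, selects only the one column it needs, and is both closed and deallocated at the end.

```sql
DECLARE @OrderID INT;

-- FAST_FORWARD requests a forward-only, read-only cursor.
DECLARE order_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT OrderID            -- fetch only the column that is needed
    FROM   dbo.Orders
    WHERE  CustomerID = 42;   -- and only the rows that are needed

OPEN order_cursor;
FETCH NEXT FROM order_cursor INTO @OrderID;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Placeholder for the per-row work.
    PRINT 'Processing order ' + CAST(@OrderID AS VARCHAR(12));
    FETCH NEXT FROM order_cursor INTO @OrderID;
END

CLOSE order_cursor;       -- closes the cursor and its result set
DEALLOCATE order_cursor;  -- releases the resources the cursor used
```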

Use Joins Appropriately


Table joins can be a big contributor to performance problems, especially if the joins
include more than two tables, or if the tables are very large. Unfortunately, joins are a
fact of life in relational databases. Because they are so common, you will need to take
extra time to help ensure that your joins are as optimal as possible. Here are some tips
to help.

 If you have two or more tables that are frequently joined together, then the
columns used for the joins should have an appropriate index. If the columns used
for the joins are not naturally compact, then consider adding surrogate keys to
the tables that are compact in order to reduce the size of the keys, thus
decreasing read I/O during the join process, and increasing overall performance.
You will learn more about indexing in the next section of this article.
 For best performance, the columns used in joins should be of the same data
types. And if possible, they should be numeric data types rather than character
types.
 Avoid joining tables based on columns with few unique values. If the columns used
for joining aren't mostly unique, then the SQL Server optimizer may perform a
table scan for the join, even if an index exists on the columns. For best
performance, joins should be done on columns that have unique indexes.
 If you have to regularly join four or more tables to get the recordset you need,
consider denormalizing the tables so that the number of joined tables is reduced.
Often, by adding one or two columns from one table to another, joins can be
reduced.
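
A minimal sketch of the first two tips, assuming hypothetical dbo.Customers and dbo.Orders tables in which CustomerID is declared as INT on both sides and is the primary key of dbo.Customers. The index name and the CustomerName column are also assumptions made for this example.

```sql
-- Index the join column on the "many" side of the relationship.
-- (The primary key on dbo.Customers.CustomerID is already indexed.)
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID);

-- Join on compact integer columns of the same data type.
SELECT c.CustomerName, o.OrderID
FROM   dbo.Customers AS c
       INNER JOIN dbo.Orders AS o
               ON o.CustomerID = c.CustomerID
WHERE  c.CustomerID = 42;
```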

Encapsulate Your Code in Stored Procedures


Virtually all of the Transact-SQL used in your SQL Server-based applications should be
encapsulated in stored procedures, not run as dynamic SQL or scripts. This not only
reduces network traffic (only the EXECUTE or CALL is issued over the network between
the client and SQL Server), but it speeds up the Transact-SQL because the code in the
stored procedure residing on the server is already pre-compiled. Here are a couple of
things to keep in mind when writing stored procedures for optimal performance.

When a stored procedure is first executed (and it does not have the WITH RECOMPILE
option specified), it is optimized and a query plan is compiled and cached in SQL Server's
memory. If the same stored procedure is called again, it will use the cached query plan
instead of creating a new one, saving time and boosting performance. This may or may
not be what you want. If the query in the stored procedure is the same each time, then
this is a good thing. But if the query is dynamic (the WHERE clause changes
substantially from one execution of the stored procedure to the next), then this is a bad
thing, as the query will not be re-optimized when it is run, and the performance of the
query can suffer.

If you know that your query will vary each time it is run from the stored procedure, you
will want to add the WITH RECOMPILE option when you create the stored procedure.
This will force the stored procedure to be re-compiled each time it is run, ensuring the
query is optimized each time it is run.
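
A sketch of what this might look like, using a hypothetical search procedure whose effective WHERE clause depends on which parameters the caller supplies. The procedure, table, and column names are assumptions made for this example.

```sql
-- Hypothetical search procedure: the useful filter changes from call to call,
-- so WITH RECOMPILE asks SQL Server to build a fresh plan on every execution.
CREATE PROCEDURE dbo.usp_SearchOrders
    @CustomerID INT      = NULL,
    @FromDate   DATETIME = NULL
WITH RECOMPILE
AS
BEGIN
    SELECT OrderID, CustomerID, OrderDate
    FROM   dbo.Orders
    WHERE  (@CustomerID IS NULL OR CustomerID = @CustomerID)
      AND  (@FromDate   IS NULL OR OrderDate >= @FromDate);
END
```

The trade-off is that the compilation cost is paid on every call, so this option is only worth it when the cached plan would otherwise be a poor fit.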

Always include in your stored procedures the statement, "SET NOCOUNT ON". If you
don't turn this feature on, then every time a SQL statement is executed, SQL Server will
send a response to the client indicating the number of rows affected by the statement. It
is rare that the client will ever need this information. Using this statement helps reduce
the traffic between the server and the client.
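
For example, a procedure might begin as shown below. This is only a sketch; the procedure, table, and column names are hypothetical.

```sql
CREATE PROCEDURE dbo.usp_UpdateOrderStatus
    @OrderID INT,
    @Status  VARCHAR(20)
AS
BEGIN
    -- Suppress the "rows affected" message that would otherwise be
    -- sent to the client after each statement in the procedure.
    SET NOCOUNT ON;

    UPDATE dbo.Orders
    SET    Status = @Status
    WHERE  OrderID = @OrderID;
END
```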

Deadlocking can occur within a stored procedure when two user processes have locks on
separate objects and each process is trying to acquire a lock on the object that the other
process has. When this happens, SQL Server ends the deadlock by automatically
choosing one process as the victim and aborting it, allowing the other process to continue. The
aborted transaction is rolled back and an error message is sent to the user of the
aborted process.

To help avoid deadlocking in your SQL Server application, try to design your application
using these suggestions: 1) have the application access server objects in the same order
each time; 2) don't allow any user input during transactions, but collect it before the
transaction begins; 3) keep transactions short and within a single batch; and 4) if
appropriate, use as low an isolation level as possible for the user connection running
the transaction.
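
The four suggestions above might be sketched as follows. The tables, columns, and values are hypothetical, and READ COMMITTED is shown only as an example of choosing the isolation level explicitly; pick whatever level your application can safely tolerate.

```sql
-- All input is collected before the transaction starts (hypothetical values).
DECLARE @CustomerID INT, @OrderID INT;
SET @CustomerID = 42;
SET @OrderID    = 1001;

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Short transaction in a single batch. Every piece of code in the
-- application touches Customers first and Orders second.
BEGIN TRANSACTION;

    UPDATE dbo.Customers
    SET    LastOrderDate = GETDATE()
    WHERE  CustomerID = @CustomerID;

    UPDATE dbo.Orders
    SET    Status = 'Shipped'
    WHERE  OrderID = @OrderID;

COMMIT TRANSACTION;
```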

How to Select Indexes for Optimal Database Performance


Index selection is a mystery for many SQL Server DBAs and developers. Sure, we know
what they do and how they boost performance. The problem often is how to select the
ideal type of index (clustered vs. non-clustered), the number of columns to index (do I
need multi-column indexes?), and which columns should be indexed.

In this section we will take a brief look at how to answer the above questions.
Unfortunately, there is no absolute answer for every occasion. Like much of SQL Server
performance tuning and optimization, you may have to do some experimenting to find
the ideal indexes. So let's begin by looking at some general index creation guidelines,
then we will take a more detailed look at selecting clustered and non-clustered indexes.

Is There Such a Thing as Too Many Indexes?


Yes. Some people think that all you have to do is index everything, and then all of your
performance issues will go away. It doesn't work that way. Just as an index can speed
data access, it can also degrade access if it is inappropriately selected. The problem with
extra indexes is that SQL Server must maintain them every time that a record is
inserted into, updated in, or deleted from a table. While maintaining one or two indexes
on a table is not too much overhead for SQL Server to deal with, if you have four, five,
or more indexes, they can be a large performance burden on tables. Ideally, you want to
have as few indexes as you can. It is often a balancing act to select the ideal number
of indexes for a table in order to find optimal performance.

As a general rule of thumb, don't automatically add indexes to a table because it seems
like the right thing to do. Only add indexes if you know that they will be used by the
queries run against the table. If you don't know what queries will be run against your
table, then don't add any indexes until you know for sure. It is too easy to make a guess
on what queries will be run, create indexes, and then later find out your guesses were
wrong. You must know the type of queries that will be run against your data, and then
these need to be analyzed to determine the most appropriate indexes, and then the
indexes must be created and tested to see if they really help or not.

The problem of selecting optimal indexes is often difficult for OLTP applications because
they tend to experience high levels of INSERT, UPDATE, and DELETE activity. While you
need good indexes to quickly locate records that need to be SELECTED, UPDATED, or
DELETED, you don't want every INSERT, UPDATE, or DELETE to result in too much
overhead because you have too many indexes. On the other hand, if you have an OLAP
application that is virtually read-only, then adding as many indexes as you need is not a
problem because you don't have to worry about INSERT, UPDATE, or DELETE activity. As
you can see, how your application is used makes a large difference in your indexing
strategy.

Another thing to think about when selecting indexes is that the SQL Server Query
Optimizer may not use the indexes you select. If the Query Optimizer chooses not to use
your indexes, then they are a burden on SQL Server and should be deleted. So how
come the SQL Server Query Optimizer won't always use an index if one is available?

This is too large a question to answer in detail here, but suffice it to say, sometimes it is
faster for SQL Server to perform a table scan on a table than it is to use an available
index to access data in the table. Two reasons this may happen are that the table is
small (not many rows), or that the indexed column isn't at least 95% unique. How do
you know if SQL Server won't use the indexes you create? We will answer this question
later in this article, when we take a look at how to use the SQL Server Query Analyzer.
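
One rough way to gauge how unique a column is before indexing it is to compare its number of distinct values with the total number of rows. The query below is a sketch against a hypothetical dbo.Orders table and CustomerID column; it scans the whole table, so run it with care on large tables.

```sql
-- Percentage of distinct values in the candidate column.
-- NULLIF avoids a divide-by-zero error on an empty table.
SELECT COUNT(DISTINCT CustomerID) * 100.0
       / NULLIF(COUNT(*), 0) AS PercentUnique
FROM   dbo.Orders;
```

A result well below the 95% mentioned above suggests the Query Optimizer is unlikely to favor a non-clustered index on that column.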

Tips for Selecting a Clustered Index


Since you can only create one clustered index per table, take extra time to carefully
consider how it will be used. Consider the type of queries that will be used against the
table, and make an educated guess as to which query is the most critical, and if this
query will benefit from having a clustered index.

In general, use these rules of thumb when selecting a column for a possible clustered
index.

 The primary key you select for your table should not always be a clustered index.
If you create the primary key and don't specify otherwise, then SQL Server
automatically makes the primary key a clustered index. Only make the primary
key a clustered index if it meets one of the following recommendations.
 Clustered indexes are ideal for queries that select by a range of values or where
you need sorted results. This is because the data is already presorted in the index
for you. Examples of this include when you are using BETWEEN, <, >, GROUP BY,
ORDER BY, and aggregates such as MAX, MIN, and COUNT in your queries.
 Clustered indexes are good for queries that look up a record with a unique value
(such as an employee number) and when you need to retrieve most or all of the
data in the record. This is because the query is covered by the index.
 Clustered indexes are good for queries that access columns with a limited number
of distinct values, such as a column that holds country or state data. But if
column data has little distinctiveness, such as columns with a yes or no, or male
or female, then these columns should not be indexed at all.
 Clustered indexes are good for queries that use the JOIN or GROUP BY clauses.
 Clustered indexes are good for queries where you want to return a lot of rows,
not just a few. This is because the data is in the index and does not have to be
looked up elsewhere.
 Avoid putting a clustered index on columns that increment, such as an identity,
date, or similarly incrementing columns, if your table is subject to a high level of
INSERTS. Since clustered indexes force the data to be physically ordered, a
clustered index on an incrementing column forces new data to be inserted at the
same page in the table, creating a table hot spot, which can create disk I/O
bottlenecks. Ideally, find another column or columns to become your clustered
index.

What can be frustrating about the above advice is that there might be more than one
column that should be clustered. But as we know, we can only have one clustered index
per table. What you have to do is evaluate all the possibilities (assuming more than one
column is a good candidate for a clustered index) and then select the one that provides
the best overall benefit.
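
As a sketch of how this plays out, the hypothetical table below keeps its primary key as a non-clustered index and spends the single clustered index on the column that is assumed to serve the most important range and sort queries. All object names are invented for this example.

```sql
-- Declare the primary key NONCLUSTERED so that it does not
-- automatically claim the table's one clustered index.
CREATE TABLE dbo.Employees
(
    EmployeeID INT         NOT NULL
        CONSTRAINT PK_Employees PRIMARY KEY NONCLUSTERED,
    LastName   VARCHAR(50) NOT NULL,
    FirstName  VARCHAR(50) NOT NULL
);

-- Cluster on the column used for range and ORDER BY queries.
CREATE CLUSTERED INDEX IX_Employees_LastName
    ON dbo.Employees (LastName);

-- A range query with sorted output, which the clustered index can serve directly.
SELECT EmployeeID, LastName, FirstName
FROM   dbo.Employees
WHERE  LastName BETWEEN 'M' AND 'N'
ORDER BY LastName;
```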

Tips for Selecting Non-Clustered Indexes


Selecting non-clustered indexes is somewhat easier than clustered indexes because you
can create as many as is appropriate for your table. Here are some tips for selecting
which columns in your tables might be helped by adding non-clustered indexes.

 Non-clustered indexes are best for queries that return few rows (including just
one row) and where the index has good selectivity (above 95%).
 If a column in a table is not at least 95% unique, then most likely the SQL Server
Query Optimizer will not use a non-clustered index based on that column.
Because of this, don't add non-clustered indexes to columns that aren't at least
95% unique. For example, a column with "yes" or "no" as the data won't be at
least 95% unique.
 Keep the "width" of your indexes as narrow as possible, especially when creating
composite (multi-column) indexes. This reduces the size of the index and reduces
the number of reads required to read the index, boosting performance.
 If possible, try to create indexes on columns that have integer values instead of
characters. Integer values have less overhead than character values.
 If you know that your application will be performing the same query over and
over on the same table, consider creating a covering index on the table. A
covering index includes all of the columns referenced in the query. Because of
this, the index contains the data you are looking for and SQL Server doesn't have
to look up the actual data in the table, reducing logical and/or physical I/O. On
the other hand, if the index gets too big (too many columns), this can increase
I/O and degrade performance.
 An index is only useful to a query if the WHERE clause of the query matches the
column(s) that are leftmost in the index. So if you create a composite index, such
as "City, State", then a query such as "WHERE City = 'Houston'" will use the
index, but the query "WHERE State = 'TX'" will not use the index.
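
The leftmost-column rule from the last tip can be sketched as follows, assuming a hypothetical dbo.Customers table with CustomerID, City, and State columns. The index name is also an assumption made for this example.

```sql
-- Composite index with City as the leading (leftmost) column.
CREATE NONCLUSTERED INDEX IX_Customers_City_State
    ON dbo.Customers (City, State);

-- Filters on the leading column, so the index can be used to locate the rows.
SELECT CustomerID
FROM   dbo.Customers
WHERE  City = 'Houston';

-- Filters only on the second column. State is not the leading column,
-- so SQL Server cannot seek on this index to find the matching rows.
SELECT CustomerID
FROM   dbo.Customers
WHERE  State = 'TX';
```

If queries on State alone are common, a separate index that leads with State would be one option to test.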

Generally, if a table needs only one index, make it a clustered index. If a table needs
more than one index, then you have no choice but to use non-clustered indexes. By
following the above recommendations, you will be well on your way to selecting the
optimum indexes for your tables.

What to Do Next?


This article has a lot of information, but on the other hand, it has barely touched the
surface when it comes to performance tuning and optimization. Reading this article has
been your first step. Now it's time for you to take the next step, and that is to learn all
you can about how SQL Server works internally and the tools described in this article.
Remember, SQL Server performance tuning is more of an art than a science, and only
through the mastery of the basics, and experience, can you become an expert at SQL
Server performance tuning.
