Indexing is a fundamental part of managing database applications. An index in a database works much like the index in a book: it points to the location of information so the database can skip over data it does not need. Just as a book index helps a reader find a specific topic, a database index can significantly improve query performance. For a selective query on a large table, the right index can reduce the amount of data the server has to read by 99% or more.
Not all indexes, however, are beneficial. Indexes that are not designed correctly, or are not properly maintained, can result in worse performance than no index at all.
To administer Microsoft Azure SQL solutions, you must understand how to identify and implement index changes for queries. This involves finding missing indexes, identifying unused or duplicate indexes, and updating statistics.
Identifying Missing Indexes
A missing index is an index that SQL Server has determined could help improve performance of a specific query but does not exist. SQL Server logs information about these potential indexes in the missing index DMVs (Dynamic Management Views), which can be queried to identify missing indexes.
The following query will return details about the missing indexes SQL Server has identified.
SELECT
    MID.database_id,
    MID.[object_id],
    -- Rough ranking: estimated cost saved, weighted by how often the index would have been used
    migs.avg_total_user_cost * migs.avg_user_impact * migs.user_seeks AS improving_cost_factor,
    -- Build a CREATE INDEX statement from the suggested columns
    'CREATE INDEX missing_index_' + CONVERT(varchar(10), MID.index_handle)
        + ' ON ' + MID.statement
        + ' (' + ISNULL(MID.equality_columns, '')
        + CASE WHEN MID.equality_columns IS NOT NULL AND MID.inequality_columns IS NOT NULL THEN ',' ELSE '' END
        + ISNULL(MID.inequality_columns, '')
        + ')' + ISNULL(' INCLUDE (' + MID.included_columns + ')', '') AS create_index_statement,
    migs.user_seeks,
    migs.user_scans,
    migs.avg_total_user_cost,
    migs.avg_user_impact,
    migs.last_user_seek
FROM sys.dm_db_missing_index_details MID
JOIN sys.dm_db_missing_index_groups MIG ON MID.index_handle = MIG.index_handle
JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = MIG.index_group_handle
ORDER BY improving_cost_factor DESC;
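The generated statement is a starting point rather than something to run as-is. As a minimal sketch, assume the DMVs reported `[CustomerID]` as an equality column, `[OrderDate]` as an inequality column, and `[TotalDue]` as an included column for a table named `dbo.Orders`. The table, its columns, and the index name below are all hypothetical and appear only for illustration.

```sql
-- Hypothetical table, used only for illustration
CREATE TABLE dbo.Orders (
    OrderID    int IDENTITY(1,1) PRIMARY KEY,
    CustomerID int NOT NULL,
    OrderDate  date NOT NULL,
    TotalDue   money NOT NULL
);

-- Equality columns go first in the key, then inequality columns;
-- included columns are stored at the leaf level only.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
    ON dbo.Orders (CustomerID, OrderDate)
    INCLUDE (TotalDue);

-- The kind of query this index is meant to support
SELECT OrderDate, TotalDue
FROM dbo.Orders
WHERE CustomerID = 42
  AND OrderDate >= '2024-01-01';
```

Giving the index a descriptive name instead of the generated `missing_index_<handle>` name makes it easier to recognize later, and it is worth checking whether an existing index could be extended before adding a new one.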
Identifying Unused or Duplicate Indexes
Indexes that are not being used, or duplicate indexes that serve the same purpose, can lead to unnecessary overhead on the server. Such indexes could potentially be dropped to improve performance.
SQL Server maintains DMVs that track use of indexes. For example, `sys.dm_db_index_usage_stats` can be queried to find indexes that may not be needed.
SELECT
    o.name AS ObjectName,
    i.name AS IndexName,
    i.index_id AS IndexID,
    us.user_seeks + us.user_scans + us.user_lookups AS UserReads,
    us.user_updates AS UserWrites
FROM sys.dm_db_index_usage_stats us
JOIN sys.indexes i ON i.index_id = us.index_id AND us.object_id = i.object_id
JOIN sys.objects o ON us.object_id = o.object_id
WHERE us.database_id = DB_ID()   -- keep only rows for the current database
  AND o.name = 'your table name';
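An index with far more UserWrites than UserReads might be a candidate for being dropped. Keep in mind that these counters are reset when the database engine restarts, and an index that has not been touched since the restart has no row in this view at all, so judge usage over a representative period.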
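Duplicate indexes do not show up in the usage statistics as such, so they need a separate check. The following is one possible sketch, not a definitive method: it compares the key columns of every rowstore index on each user table and reports pairs whose keys match exactly in column, order, and sort direction. It relies on `STRING_AGG`, which is available in Azure SQL Database and SQL Server 2017 and later, and it ignores included columns and filter predicates, so each reported pair still needs a manual review.

```sql
WITH index_keys AS (
    SELECT
        i.object_id,
        i.index_id,
        i.name AS index_name,
        -- Concatenate the key columns in key order, e.g. '2 ASC,5 DESC'
        STRING_AGG(
            CONVERT(varchar(max), ic.column_id)
                + CASE WHEN ic.is_descending_key = 1 THEN ' DESC' ELSE ' ASC' END,
            ','
        ) WITHIN GROUP (ORDER BY ic.key_ordinal) AS key_columns
    FROM sys.indexes AS i
    JOIN sys.index_columns AS ic
        ON ic.object_id = i.object_id
       AND ic.index_id = i.index_id
    WHERE ic.key_ordinal > 0          -- key columns only, no included columns
      AND i.type IN (1, 2)            -- clustered and nonclustered rowstore indexes
      AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
    GROUP BY i.object_id, i.index_id, i.name
)
SELECT
    OBJECT_SCHEMA_NAME(a.object_id) AS SchemaName,
    OBJECT_NAME(a.object_id) AS TableName,
    a.index_name AS IndexA,
    b.index_name AS IndexB,
    a.key_columns AS KeyColumnIds
FROM index_keys AS a
JOIN index_keys AS b
    ON b.object_id = a.object_id
   AND b.key_columns = a.key_columns
   AND b.index_id > a.index_id        -- report each pair once
ORDER BY SchemaName, TableName, IndexA;
```

Before dropping either index of a pair, check whether one of them backs a primary key or unique constraint, and whether the two differ in included columns.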
Updating Statistics
SQL Server uses statistics to estimate how values are distributed in the columns of a table or index, and from that how many rows a query is likely to touch and how best to execute it. If the statistics are out of date, the query optimizer can make poor decisions, leading to suboptimal execution plans.
Statistics are automatically updated by SQL Server, but in some circumstances, it may be beneficial to manually update them. This can be done using the `UPDATE STATISTICS` command.
UPDATE STATISTICS your_table_name WITH FULLSCAN;
The FULLSCAN option scans all rows in a table and updates the statistics based on that scan.
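A full scan can be expensive on a large table, so it helps to check first which statistics are actually stale. The sketch below uses `sys.dm_db_stats_properties` to show when each statistics object on a table was last updated and how many modifications have accumulated since. The table name `dbo.your_table_name` and the statistics name `your_statistics_name` are placeholders.

```sql
-- List statistics on one table, most-modified first
SELECT
    s.name AS StatsName,
    sp.last_updated,
    sp.rows,
    sp.rows_sampled,
    sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.your_table_name')
ORDER BY sp.modification_counter DESC;

-- Refresh a single statistics object instead of every statistic on the table
UPDATE STATISTICS dbo.your_table_name your_statistics_name WITH FULLSCAN;
```

A high `modification_counter` relative to `rows`, combined with an old `last_updated` value, suggests that a manual update may be worthwhile.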
Managing indexes well can significantly improve the performance of an Azure SQL Database, and it is one part of a wider query optimization strategy. DP-300 candidates should understand these techniques thoroughly: finding missing indexes, removing unused or duplicate ones, and keeping statistics current.
Practice Test
True/False. Adding indexes brings benefits for SELECT queries but has no impact on INSERT, UPDATE, and DELETE statements.
- Answer: FALSE.
Explanation: While indexes help in query performance for SELECT statements by helping to locate the data swiftly, they can slow down INSERT, UPDATE, and DELETE operations because they need to be updated each time data is modified.
Multiple select. Which of the following operations can be improved by implementing index changes?
- a) SELECT statements
- b) UPDATE statements
- c) INSERT statements
- d) DELETE statements
- Answer: a, b, d.
Explanation: Indexes can improve SELECT, UPDATE, and DELETE statements by reducing the amount of data that must be read to locate the affected rows, although UPDATE and DELETE still pay the cost of maintaining each index they touch. INSERT statements have no existing rows to locate, so indexes generally only add work: every index on the table must be updated each time a row is inserted.
True/False: A clustered index sorts and stores the data rows in a table or view based on their key values.
- Answer: TRUE.
Explanation: A clustered index determines the physical order of data inside a table which is why a table can have only one clustered index.
Single select. How many clustered indexes can a table have?
- a) 1
- b) 2
- c) 3
- d) Unlimited
- Answer: a) 1
Explanation: A clustered index determines the physical order of the data in a table, and the rows can be stored in only one order, so a table can have only one clustered index.
True/False: The order of columns in an index has no impact on the query performance.
- Answer: FALSE.
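Explanation: The order of columns in an index matters because the index is sorted by its first key column, then by the second, and so on. A query can seek on the index only if it filters on the leading column or columns, so both the choice of columns and their order can significantly affect how useful the index is.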
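To make the column-order point concrete, here is a small sketch with a hypothetical `dbo.Employees` table and a two-column index. The comments describe what the optimizer is typically able to do with the index. The plan it actually chooses depends on table size and statistics.

```sql
-- Hypothetical table, used only for illustration
CREATE TABLE dbo.Employees (
    EmployeeID int IDENTITY(1,1) PRIMARY KEY,
    LastName   nvarchar(50) NOT NULL,
    FirstName  nvarchar(50) NOT NULL,
    HireDate   date NOT NULL
);

-- Key is sorted by LastName first, then FirstName
CREATE NONCLUSTERED INDEX IX_Employees_LastName_FirstName
    ON dbo.Employees (LastName, FirstName);

-- Filters on the leading column: the index can be used for a seek
SELECT EmployeeID FROM dbo.Employees
WHERE LastName = N'Smith';

-- Filters on both key columns: also seekable
SELECT EmployeeID FROM dbo.Employees
WHERE LastName = N'Smith' AND FirstName = N'Ana';

-- Skips the leading column: the index can at best be scanned
SELECT EmployeeID FROM dbo.Employees
WHERE FirstName = N'Ana';
```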
Single select. What measures might you consider to optimize an SQL query?
- a) Adding an index
- b) Removing unnecessary columns from the index
- c) Removing an index
- d) Both a and b
- Answer: d) Both a and b
Explanation: Adding an index can speed up data retrieval, and removing unnecessary columns from an index makes it smaller, so it takes less storage and is cheaper to read and maintain.
True/False: Non-clustered indexes have a structure separate from the data rows.
- Answer: TRUE.
Explanation: A non-clustered index has a structure independent of the data rows, storing a copy of the indexed data with a pointer to the data row.
Single select. What is the maximum number of non-clustered indexes a table can have in SQL Server?
- a) 999
- b) 249
- c) 100
- d) 64
- Answer: a) 999
Explanation: In SQL Server, a table can have up to 999 non-clustered indexes.
Multiple select. Which are factors to consider when choosing columns for an index?
- a) Column’s cardinality
- b) Size of the column data
- c) The column is often used in a WHERE clause
- d) The column is never used in a SELECT clause
- Answer: a, b, c
Explanation: Cardinality, the size of the column data, and whether the column is often used in a WHERE clause are all important factors to consider. A column that never appears in a SELECT clause can still be a good index candidate.
True/False: Adding more indexes always increases the performance of SQL Server.
- Answer: FALSE.
Explanation: Though indexes generally improve read operations, adding too many indexes can slow down write operations as every modification needs to update all related indexes. Hence, the number of indexes needs to be determined with a balanced approach.
Interview Questions
1. What is a non-clustered index in Azure SQL Database?
A non-clustered index in Azure SQL Database is a separate structure from the table data that stores the index key values along with a pointer to the actual table row.
2. How can you identify missing indexes in Azure SQL Database?
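You can identify missing indexes in Azure SQL Database by querying the missing index dynamic management views: sys.dm_db_missing_index_details, together with sys.dm_db_missing_index_groups and sys.dm_db_missing_index_group_stats.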
3. What is included in an execution plan in Azure SQL Database?
An execution plan in Azure SQL Database shows how the query optimizer decided to run a query: the indexes and data access methods (seeks, scans, lookups) it uses, the join methods, and the estimated cost and row count of each operator.
4. How can you determine the index usage statistics in Azure SQL Database?
You can determine the index usage statistics in Azure SQL Database by querying the dynamic management view sys.dm_db_index_usage_stats.
5. What is the impact of adding too many indexes to a table in Azure SQL Database?
Adding too many indexes to a table in Azure SQL Database can lead to decreased write performance, increased storage requirements, and additional maintenance overhead.
6. How can you modify an existing index in Azure SQL Database?
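You can modify an existing index in Azure SQL Database by using the ALTER INDEX statement to rebuild, reorganize, or disable it, or to set options such as fill factor as part of a rebuild. ALTER INDEX cannot change the key or included columns; for that, re-create the index with CREATE INDEX and the DROP_EXISTING = ON option.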
7. In Azure SQL Database, what is the purpose of the fill factor property for an index?
The fill factor property for an index in Azure SQL Database specifies the percentage of space on each leaf-level page to fill with data when the index is created or rebuilt, leaving the remainder free for future growth.
8. What is a clustered index in Azure SQL Database?
A clustered index in Azure SQL Database determines the order in which the table's data is stored. Its leaf level is the table's data rows themselves, sorted by the index key.
9. How can you drop an existing index in Azure SQL Database?
You can drop an existing index in Azure SQL Database by using the DROP INDEX statement with the index name and the table it belongs to, in the form DROP INDEX index_name ON table_name.
10. What is the function of included columns in indexes in Azure SQL Database?
Included columns in indexes in Azure SQL Database allow additional non-key columns to be stored at the leaf level of the index, which can cover more queries without adding them to the key columns.
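Several of the answers above map directly onto T-SQL statements. The following is a minimal sketch using a hypothetical `dbo.Products` table. The table, column, and index names are assumptions made for illustration, and the fill factor value is arbitrary.

```sql
-- Hypothetical table, used only for illustration
CREATE TABLE dbo.Products (
    ProductID  int IDENTITY(1,1) PRIMARY KEY,   -- clustered index by default
    CategoryID int NOT NULL,
    Name       nvarchar(100) NOT NULL,
    ListPrice  money NOT NULL
);

-- Non-clustered index with an included (non-key) column
CREATE NONCLUSTERED INDEX IX_Products_CategoryID
    ON dbo.Products (CategoryID)
    INCLUDE (Name);

-- Rebuild the index, filling each leaf page to 80% and leaving 20% free
ALTER INDEX IX_Products_CategoryID ON dbo.Products
    REBUILD WITH (FILLFACTOR = 80);

-- One way to change the included columns: re-create the index in place
CREATE NONCLUSTERED INDEX IX_Products_CategoryID
    ON dbo.Products (CategoryID)
    INCLUDE (Name, ListPrice)
    WITH (DROP_EXISTING = ON);

-- Remove the index when it is no longer needed
DROP INDEX IX_Products_CategoryID ON dbo.Products;
```

With `Name` and `ListPrice` included, a query that filters on `CategoryID` and returns only those columns can be answered from the index alone, without looking up the base table rows.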