SQL/Query Optimization
How SQL queries are executed in DB
explain plan, performance tuning.
The EXPLAIN PLAN statement displays execution plans chosen by the Oracle optimizer for SELECT, UPDATE, INSERT, and DELETE statements.
A statement’s execution plan is the sequence of operations Oracle performs to run the statement.
Syntax for finding indexes in tuning.
Indexes are special lookup tables that the database search engine can use to speed up data retrieval.
An index is a pointer to data in a table.
Indexes are a performance drag when the time comes to modify records.
Multiple Conditions : AND
SELECT * from EMP WHERE salary > 10000 AND dateofjoining > ‘1-Jan-1990’ ;
Range Selection - BETWEEN
SELECT * from EMP WHERE salary between 9000 AND 20000;
Exact List Matching
SELECT *
from EMP
WHERE employeeID IN (3001, 3002);
Full table scan
- A full table scan occurs when an index is either not used or there is no index on the table(s)
- Full table scans should be avoided when reading large tables.For example, a full table scan is performed when a table that does not have an index is read
- FTS will be performed even though an index is present on that table.
- If a query does have a WHERE clause, but none of the columns in that WHERE clause match.
- when WHERE clause prevents the use of an index like below.
- If the NOT EQUAL (the “<>“) operator is used.
- If the NOT operator is used.
- If the wildcard operator is used in the first position of a comparison string. An example is “WHERE NAME LIKE ‘%INTERVIEW%’”.
DB Index
A database index is like an efficient lookup table that allows a database to find data much faster. It uses Data structues like B-Tree, HashMap, Bitmaps etc.
- index helps to avoid full scan for a single row in a massive table
- with index, the pointer helps to get the exact address of the indexed column
An index on a non-primary column allows the database to quickly locate rows based on the indexed column’s values, without scanning the entire table
- columns that are frequently used in WHERE clauses, JOIN conditions, or ORDER BY clauses are good candidates for indexing.
Creating index for Employees Table
CREATE
INDEX idx_department ON employees(department);
-- Analyze Query Performance
EXPLAIN
SELECT *
FROM employees
WHERE department = 'Sales';
Composite Index
CREATE
INDEX idx_first_last_name ON employees(first_name, last_name);
-- Analyze Query Performance
EXPLAIN
SELECT *
FROM employees
WHERE first_name = 'John'
AND last_name = 'Doe';
Types of Database Indexes
Key Attributes based:
Primary Index: Automatically created when a primary key constraint is defined on a table.
Clustered Index A clustered index determines the physical order of data in a table. A table can have only one clustered index because the data rows can be stored in only one order.
The primary index is usually a clustered index, but a clustered index can be created on other columns as well.
When a table has a clustered index, the data rows are stored in the same order as the index.
A clustered index is most useful for searching in a range. Only one clustered index can exist per table.
CREATE
CLUSTERED INDEX idx_department ON employees(department);
-- The data rows in the employees table will be stored in the order of the department values.
Non-clustered or Secondary Index: This index does not store data in the order of the index. Instead, it provides a list of virtual pointers or references to the location where the data is actually stored.
Data Coverage based:
Dense index: Has an entry for every search key value in the table. Suitable for situations where the data has a small number of distinct search key values or when fast access to individual records is required.
Sparse index: Has entries only for some of the search key values. Suitable for situations where the data has a large number of distinct search key values.
Specialized Index Types: Bitmap Index: Excellent for columns with low cardinality (few distinct values). Common in data warehousing.
Hash Index: A index that uses a hash function to map values to specific locations. Great for exact match queries.
Filtered Index: Indexes a subset of rows based on a specific filter condition. Useful to improve query speed on commonly filtered columns.
Covering Index: Includes all the columns required by a query in the index itself, eliminating the need to access the underlying table data.
Function-based index: Indexes that are created based on the result of a function or expression applied to one or more columns of a table.
Full-Text Index: A index designed for full-text search, allowing for efficient searching of text data.
Spatial Index: Used for indexing geographical data types.
Datastructures used in Indexing
Indices can be saved in
- B-Tree (Balanced Tree) or variation of it
- HashMaps
- Bitmap Index
For where clause, comparisons should be SARGABLE - Search ARGument * *ABLE**
- refers to queries that can use indices for faster execution
B-Tree Index
B-Tree indices are commonly used for columns with a wide range of values and support equality and range queries efficiently.
Range Query
-- Assuming an index on the `salary` column
SELECT *
FROM employees
WHERE salary BETWEEN 50000 AND 100000;
Composite Index Query
-- Assuming a composite index on the `last_name` and `first_name` columns
SELECT *
FROM employees
WHERE last_name = 'Smith'
AND first_name = 'John';
Bitmap Index
Bitmap indices are more efficient for columns with a low cardinality, such as gender or status columns.
Equality Query on Low Cardinality Column
-- Assuming a bitmap index on the `gender` column
SELECT *
FROM employees
WHERE gender = 'F';
Ensuring SARGABLE Queries
Do’s and Dont’s
Avoid Functions on Indexed Columns: Functions on indexed columns prevent the use of the index.
-- Not SARGABLE
SELECT *
FROM employees
WHERE YEAR(hire_date) = 2020;
-- SARGABLE
SELECT *
FROM employees
WHERE hire_date BETWEEN '2020-01-01' AND '2020-12-31';
Avoid Operations on Indexed Columns: Operations on indexed columns also prevent the use of the index.
-- Not SARGABLE
SELECT *
FROM employees
WHERE salary + 1000 = 50000;
-- SARGABLE
SELECT *
FROM employees
WHERE salary = 49000;
Use AND for Composite Indexes: Ensure all parts of the composite index are used in the query.
-- Assuming a composite index on (last_name, first_name)
-- Not fully SARGABLE (only uses part of the index)
SELECT *
FROM employees
WHERE last_name = 'Smith';
-- Fully SARGABLE
SELECT *
FROM employees
WHERE last_name = 'Smith'
AND first_name = 'John';