Home » SQL CLUSTERED INDEX

SQL CLUSTERED INDEX

SQL Server Clustered index


A clustered index defines the order in which data is physically stored in a table. A table can have only one clustered index because data rows can be only sorted in one order.

When a PRIMARY KEY constraint on a table is created, a unique clustered index will be created automatically on the column.

 

SYNTAX

CREATE CLUSTERED INDEX index_name
ON schema_name.table_name (column_list);

Clustered Index design and implementation

A clustured indexes are made up of a set of pages called as index nodes that are organized in a B-tree structure.

The top node of the B-tree is called as root node. The bottom nodes in the index are called as leaf nodes. Any index levels between the root and the leaf nodes are called as intermediate levels.

The leaf nodes contain the data pages of the underlying table. The root and intermediate level nodes contain index pages holding index rows.

Each index row contains a key value and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf level of the index.

Lets understand the working of clustured index , if you query to search for the value 125 in an indexed column, the query engine would first look in the root level to determine which page to reference in the top intermediate level.

Lets take a reference of above image , the first page of intermediate level indicates the value which is between 1 to 250, and the second page indicates the value is between 251-500, so the query engine would go to the first page on that level as your are searching for a value =125.

Now the query engine would then determine that it must go to the first page at the next intermediate level. From there, the query engine would navigate to the leaf node for value=125.

The leaf node will contain either the entire row of data or a pointer to that row, depending on whether the index is clustered or nonclustered.

Lets create a Employee table, and insert some records into it.

CREATE table EMPLOYEE
(EMPID INT NOT NULL, 
EMPNAME VARCHAR(50) NOT NULL, 
EMPCODE VARCHAR(15))

INSERT INTO EMPLOYEE(EMPID, EMPNAME, EMPCODE)
VALUES (501, 'RAJESH MITAL', 'XCV567'),
(502, 'SUJOY SINGH', 'XCB787'),
(503, 'ABHISHEK MISHRA', 'FJV542'),
(504, 'HARSHIT KAPOOR','NMD557'),
(505, 'MANISH RAWAT', 'QJB987'),
(506, 'DEEPAK KUMAR', 'FZV704'),
(507, 'HARSH KUMAWAT','NET657'),
(508, 'ROZER JR', 'DDB987'),
(509, 'BALAJI NEGI', 'KL0542'),
(510, 'NARAYANA SWAMI','NSD447'),
(511, 'LOKESH MANTRI', 'BBB007'),
(512, 'SUMIT VIYAS', 'FFNM04'),
(513, 'BABA NAYAK','NEG157')

Remember, we have not created any primary key or clustered index on table yet.

Lets execute the following statement also turned on the estimated execution plan for the query .

SELECT * FROM EMPLOYEE WHERE EMPID =511

As you can see, a table scanning (scanned whole table) is performed by query optimizer to get the employee record for EMPID =511, Although it returns data very fast as table are having only 13 records.

In case,  if the table are having a huge number of records , It will take a good amount of time to searching a records in table.

Using SQL Server index , you can speed up retrieval of rows from the table.

Lets create a clustered index on cloumn EMPID

CREATE CLUSTERED INDEX idx_emp_id
ON EMPLOYEE (EMPID);

SELECT * FROM EMPLOYEE WHERE EMPID =511

Now when above statement is executed , SQL Server clustered index seek to get the row rather than scanning a whole table.

 

 

 

 

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.