Database Index: An Introduction for Beginners

"Database Index" refers to a special kind of data structure that speeds up retrieving records from a database table. Database indices make sure that you can locate and access the data in a database table efficiently without having to search every row each time a database query is processed.

A database index can be likened to a book’s index. Indices in databases point you to the record you're looking for in the database, just like a book’s index page points you to your desired topic or chapter.

However, while database indices are essential for quick and efficient data lookup and access, they come at a cost: every index requires extra write operations and additional storage space.

What Is an Index?

A database index is a special lookup table consisting of two columns. The first column is the search key, and the second one is the data pointer. The keys are the values you want to search and retrieve from your database table, and the pointer or reference stores the disk block address in the database for that specific search key. The key fields are kept sorted, which accelerates data retrieval for queries that search on those keys.
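The two-column layout described above can be sketched in a few lines of Python. This is only an illustration: the names, the block addresses, and the lookup helper are all made up for the example, and a real DBMS stores its indexes in on-disk structures rather than in Python lists.

```python
import bisect

# Hypothetical index on a "name" column: sorted search keys, each paired
# with a made-up disk block address (the data pointer).
index_keys = ["Alice", "Bob", "Carol", "Dave", "Zack"]
index_pointers = [0x3A0, 0x1F4, 0x2B8, 0x0C8, 0x410]


def lookup(key):
    """Return the data pointer for key, or None if the key is not indexed."""
    pos = bisect.bisect_left(index_keys, key)  # binary search on the sorted keys
    if pos < len(index_keys) and index_keys[pos] == key:
        return index_pointers[pos]
    return None


print(hex(lookup("Zack")))  # 0x410
print(lookup("Eve"))        # None
```

Because the keys are sorted, the lookup never has to look at every entry; it only follows the pointer once the matching key is found.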

Why Use Database Indexing?

I'm going to show you how database indices work in a simplified way here. Let's assume you have a database table of the eight employees working in a company, and you want to find the last entry in the table. Without an index, you need to check each row of the table in turn to reach that last entry.

However, suppose you've alphabetically sorted the table based on the first name of the employees. So, here the indexing keys are based on the "name" column. In that case, if you search for the last entry, "Zack," you can jump to the middle of the table and decide whether the entry comes before or after the middle row.

Because "Zack" comes after the middle row, you can again divide the rows after the middle row in half and make a similar comparison. This way, you don't need to traverse each row to find the last entry.

If the company had 1,000,000 employees and the last entry was "Zack," you would have to search up to 1,000,000 rows to find his name without an index. With alphabetical indexing, you can do it in about 20 steps, because each comparison halves the number of rows left to search. You can now imagine how much faster data lookup and access can become with database indexing.
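To make the difference concrete, here is a small Python sketch that counts how many rows each approach examines for a 1,000,000-row table. The table is an assumption made for the example: sorted integer IDs stand in for the alphabetically sorted names, and both helper functions are hypothetical.

```python
import math


def linear_search_steps(rows, target):
    """Count the rows examined when scanning the table from the top."""
    for steps, value in enumerate(rows, start=1):
        if value == target:
            return steps
    return len(rows)


def binary_search_steps(rows, target):
    """Count the rows examined when repeatedly halving a sorted table."""
    low, high, steps = 0, len(rows) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        steps += 1
        if rows[mid] == target:
            return steps
        if rows[mid] < target:
            low = mid + 1   # target is in the upper half
        else:
            high = mid - 1  # target is in the lower half
    return steps


# Hypothetical table: IDs 1..1,000,000 stand in for sorted employee names.
rows = list(range(1, 1_000_001))
last = rows[-1]

print(linear_search_steps(rows, last))            # 1000000
print(binary_search_steps(rows, last))            # at most 20
print(math.floor(math.log2(len(rows))) + 1)       # worst-case bound: 20
```

The worst case for the halving approach grows with the logarithm of the table size, so even doubling the table adds only one more step.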

Different File Organization Methods for Database Indexes

Indexing depends heavily on the file organization mechanism used. Usually, there are two types of file organization methods used in database indexing to store data. They are discussed below:

1. Ordered Index File: This is the traditional method of storing index data. In this method, the key values are sorted in a particular order. Data in an ordered index file can be stored in two ways.

Dense Index: In this type of indexing, an index entry is created for each record.
Sparse Index: In sparse indexing, an index entry is created for only some records. To find a record in this method, you first have to find the largest search key value among the index entries that is less than or equal to the search key value you're looking for, and then scan forward from the record that entry points to.

2. Hash File Organization: In this file organization method, a hash function applied to the search key determines the location or disk block where a record is stored.
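The sketch below contrasts the three lookups using the standard definitions: a dense index holds an entry for every search key, a sparse index holds an entry for only some keys (here, the first key of each block), and hash file organization computes a bucket from the key. The records, block size, bucket count, and the modulo hash function are all assumptions chosen to keep the example small.

```python
import bisect

# Hypothetical data file: records sorted by key, grouped into blocks of 3.
blocks = [
    [(10, "Ann"), (20, "Ben"), (30, "Cal")],
    [(40, "Dee"), (50, "Eli"), (60, "Fay")],
    [(70, "Gus"), (80, "Hal"), (90, "Ivy")],
]

# Dense index: one sorted entry per search key -> (block number, slot).
dense_keys = []
dense_locations = []
for b, block in enumerate(blocks):
    for s, (key, _) in enumerate(block):
        dense_keys.append(key)
        dense_locations.append((b, s))

# Sparse index: one entry per block, holding that block's first key.
sparse_keys = [block[0][0] for block in blocks]  # [10, 40, 70]


def dense_lookup(key):
    i = bisect.bisect_left(dense_keys, key)
    if i < len(dense_keys) and dense_keys[i] == key:
        b, s = dense_locations[i]
        return blocks[b][s][1]  # jump straight to the record
    return None


def sparse_lookup(key):
    # Largest indexed key <= the search key, then scan that block.
    b = bisect.bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    for k, name in blocks[b]:
        if k == key:
            return name
    return None


# Hash file organization: the hash function (key mod 4 here) picks the bucket.
NUM_BUCKETS = 4
buckets = [[] for _ in range(NUM_BUCKETS)]
for block in blocks:
    for key, name in block:
        buckets[key % NUM_BUCKETS].append((key, name))


def hash_lookup(key):
    for k, name in buckets[key % NUM_BUCKETS]:
        if k == key:
            return name
    return None


print(dense_lookup(50), sparse_lookup(50), hash_lookup(50))  # Eli Eli Eli
```

The sparse index is a third the size of the dense one in this example, at the price of a short scan inside the block it points to.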

Types of Database Indexing

There are generally three methods of database indexing. They are:

Clustered Indexing
Non-clustered Indexing
Multi-Level Indexing

1. Clustered Indexing

In clustered indexing, the records are physically stored in the order of the index key, so a single file or block can hold many related records together. The system keeps the actual data in a clustered index rather than pointers to it. Searching is cost-efficient with clustered indexing as it stores all the related data in the same place.

A clustering index is defined on an ordered data file. Also, joining multiple database tables is very common with this type of indexing.

It's also possible to create a clustering index on non-primary columns whose values are not unique for each record. In such cases, multiple columns are combined to form unique key values for the clustered index.

So, in short, a clustering index groups records with similar characteristics together and creates index entries for those groups.

Example: Suppose there's a company that has over 1,000 employees in 10 different departments. In this case, the company should create a clustering index in its DBMS to group the employees who work in the same department.

The employees working in the same department will be treated as a single cluster, and the data pointers in the index will refer to each cluster as a whole entity.
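The department example can be sketched as follows. This is a simplified model, not how any particular DBMS implements clustering: the employee rows are invented, the "data file" is just a Python list sorted by department, and the index stores one (start, end) range per department.

```python
from itertools import groupby

# Hypothetical employee rows: (employee_id, name, department).
employees = [
    (1, "Zack", "Sales"),
    (2, "Maya", "Engineering"),
    (3, "Omar", "Sales"),
    (4, "Lena", "Finance"),
    (5, "Ivan", "Engineering"),
]

# Clustering: physically order the rows by the clustering key (department).
data_file = sorted(employees, key=lambda row: row[2])

# Clustering index: one entry per department -> (start, end) range in the file.
cluster_index = {}
position = 0
for department, group in groupby(data_file, key=lambda row: row[2]):
    size = len(list(group))
    cluster_index[department] = (position, position + size)
    position += size


def employees_in(department):
    """Return every row of a department with one contiguous read."""
    start, end = cluster_index.get(department, (0, 0))
    return data_file[start:end]


print(cluster_index)
# {'Engineering': (0, 2), 'Finance': (2, 3), 'Sales': (3, 5)}
print(employees_in("Sales"))
# [(1, 'Zack', 'Sales'), (3, 'Omar', 'Sales')]
```

Each index entry refers to a whole cluster, so fetching a department never touches rows that belong to another one.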

2. Non-clustered Indexing

Non-clustered indexing refers to a type of indexing where the order of the index rows is not the same as how the original data is physically stored. Instead, a non-clustered index points to the data storage in the database.

Example: Non-clustered indexing is similar to a book that has an alphabetical index page. Here, the data pointer or reference is the index page, which is alphabetically sorted, and the actual data is the information on the book's pages. The book's pages themselves are not arranged in the alphabetical order of the index; the index only tells you where to find each topic.
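As a hedged illustration, the snippet below uses Python's built-in sqlite3 module to create a secondary index on a name column. The table, column, and index names are made up for the example. In SQLite, an index created with CREATE INDEX is stored separately from the table rows and refers back to them, which matches the non-clustered idea; other database systems differ in the details, and the exact wording of the query plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [("Zack", "Sales"), ("Maya", "Engineering"), ("Omar", "Sales")],
)

# Secondary index: a separate sorted structure that points back at the rows.
conn.execute("CREATE INDEX idx_employees_name ON employees (name)")

# Ask SQLite how it plans to run the query; the last column is the description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM employees WHERE name = ?", ("Zack",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)  # mentions idx_employees_name; wording depends on the version

row = conn.execute(
    "SELECT id, name, department FROM employees WHERE name = ?", ("Zack",)
).fetchone()
print(row)  # (1, 'Zack', 'Sales')
```

The rows stay where they were inserted; only the index is ordered by name.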

3. Multi-level Indexing

Multi-level indexing is used when the number of index entries is very high and the primary index can't be stored in the main memory. As you may know, database indices comprise search keys and data pointers. When the size of the database increases, the number of index entries also grows.

However, to ensure quick search operations, index records need to be kept in memory. If a single-level index is used when the number of entries is high, the index is unlikely to fit in memory because of its size, and searching it requires multiple disk accesses.

This is where multi-level indexing comes into play. This technique breaks the single-level index into multiple smaller blocks. After the breakdown, the outer-level block becomes small enough to be stored easily in the main memory.
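A two-level version of this idea might look like the sketch below. The block size, the keys, and the pointer strings are assumptions for illustration; the point is only that the outer level is much smaller than the full index and directs each search to a single inner block.

```python
import bisect

BLOCK_SIZE = 4  # hypothetical number of index entries per disk block

# Single-level index: sorted (key, pointer) entries; the pointers are made up.
entries = [(key, f"row@{key}") for key in range(0, 160, 10)]  # 16 entries

# Inner level: the single-level index split into fixed-size blocks ("on disk").
inner_blocks = [
    entries[i:i + BLOCK_SIZE] for i in range(0, len(entries), BLOCK_SIZE)
]

# Outer level: the first key of each inner block, small enough for memory.
outer_keys = [block[0][0] for block in inner_blocks]  # [0, 40, 80, 120]


def lookup(key):
    # Step 1: search the in-memory outer level for the right inner block.
    b = bisect.bisect_right(outer_keys, key) - 1
    if b < 0:
        return None
    # Step 2: fetch that one inner block and search inside it.
    block = inner_blocks[b]
    keys = [k for k, _ in block]
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return block[i][1]
    return None


print(lookup(90))  # row@90
print(lookup(95))  # None
```

With 16 entries the saving is small, but the outer level grows only by one key per inner block, so it stays compact as the index grows.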

What Is SQL Index Fragmentation?

SQL index fragmentation occurs when the logical order of the index pages doesn't match their physical order in the data file. Initially, all SQL indexes are free of fragmentation, but as you repeatedly use the database (insert, delete, or alter data), fragmentation may build up.
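One way fragmentation can arise is through page splits, which the toy model below simulates. Everything here is an assumption made for illustration: the page capacity, the rule that a split writes the new page at the end of the file, and the fragmentation measure (the share of logical page-to-page hops that are not physically adjacent). Real database engines track and report fragmentation in their own ways.

```python
import bisect

PAGE_CAPACITY = 4  # hypothetical number of keys per index page


class Page:
    def __init__(self, keys):
        self.keys = keys
        self.next = None  # physical position of the next page in key order


def build(keys):
    """Bulk-load sorted keys into full pages laid out in key order."""
    keys = sorted(keys)
    pages = [
        Page(keys[i:i + PAGE_CAPACITY]) for i in range(0, len(keys), PAGE_CAPACITY)
    ]
    for pos in range(len(pages) - 1):
        pages[pos].next = pos + 1
    return pages


def insert(pages, key):
    """Insert a key; a full page splits and its upper half goes to the file's end."""
    pos = 0
    while pages[pos].next is not None and pages[pages[pos].next].keys[0] <= key:
        pos = pages[pos].next
    page = pages[pos]
    bisect.insort(page.keys, key)
    if len(page.keys) > PAGE_CAPACITY:
        half = len(page.keys) // 2
        new_page = Page(page.keys[half:])
        page.keys = page.keys[:half]
        new_page.next = page.next
        pages.append(new_page)         # physically last in the file...
        page.next = len(pages) - 1     # ...but logically right after this page


def fragmentation(pages):
    """Fraction of logical page-to-page hops that are not physically adjacent."""
    hops = out_of_order = 0
    pos = 0
    while pages[pos].next is not None:
        hops += 1
        if pages[pos].next != pos + 1:
            out_of_order += 1
        pos = pages[pos].next
    return out_of_order / hops if hops else 0.0


pages = build(range(0, 160, 10))  # 4 full pages, stored in key order
print(fragmentation(pages))       # 0.0
for key in (15, 55, 95):          # each insert lands in a full page and splits it
    insert(pages, key)
print(fragmentation(pages))       # greater than 0: logical and physical order differ
```

A freshly built index reads its pages in one sequential pass, while the fragmented one has to jump back and forth across the file to follow the key order.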

Apart from fragmentation, your database can also face other serious issues, such as database corruption. Corruption can lead to lost data and a broken website. If you run a business through your website, it can be a fatal blow for you.
