create table ctbl_facts (id int not null, value int) go
alter table ctbl_facts add constraint PK_ ctbl_facts primary key clustered (Id) go |
Listing 3. The creation of clustered index on primary key of one table
2. Indexes
An index on a table provides a mechanism for locating one or more rows without searching the entire table. The property to be located is specificied by a set of columns of the indexed table called search key or index key which is way of faster access path to data and increases the overall performance of query execution.
The structure of an index consists of a set of entries together with a mechanism for locating the rows based on the search key value. The use of an index can reduce the number of data file pages retrieve, although accessing the index itself is another overhead. In the case of thousand of millions of rows using an index, it might be possible to access to a row in one or two I/O operations. It is remarkable to say that index needs maintenance and when a row is inserted or updated in the table, the index must be modified in order to be adapted from the changes.
There are two type of index: clustered and unclustered. In a clustered index, the entries which are close implies proximity among the corresponding data rows. A sorted index is clustered if the index entries and data rows are sorted on the same key; otherwise the index is unclustered.
In a clustered index, the leaf node of the index tree has the actual data pages itself, while in an unclustered index the leaf node contains pointers to data pages or clustered index data pages. For this reason of physical order of rows in the data according to the index, you can have only one clustered index on a given table. Clustered index concept correspond to index organized table (IOT) in Oracle Database and clustered index on a table in Microsoft SQL Server which we discussed in the last section.
There are fundamentally two types of index structure implemented in most of the database systems: B*Tree and Bitmap indexes. A B*Tree index or Balance Tree index is the most commonly used index structure and default when you create an index using the create index SQL statement. B*Tree is similar in structure to binary tree. One property of B*Tree is that the tree is balanced, and it implies that despite the insertion and deletion of rows, any path from the tree root to a leaf node has the same length, and from the point of view of database systems the selection of a row has the same performance overhead independently of the searched row. Most B*Tree indexes will have a length of 2 or 3 levels even for thousands of millions of rows.
A bitmap index is a sort of one or more bit which stores pointers to many rows with a single index key entry and typically we find a very small number of entries pointing to many rows. Bitmap index is mainly designed for OLAP databases to be used in data warehouse and data mining applications; and not for OLTP solutions which changes the data frequently. Bitmap index is appropiate for table attributes that can take on only a small number of values. A Bitmap index should not be selective unlike a B*Tree index that should be selective for example attributes storing boolean values. Oracle Databases only implement this type of index unlike SQL Server databases.
2.1 SQL statement for the creation of an index
The basic statement for the creation of an index is (see Listing 4)
create index [schema].index_name on [schema].table_name ({column}[ASC|DESC] [,{column}[ASC|DESC]]) |
Listing 4. SQL statement for the creation of an index
Microsoft SQL Server and Oracle Database have their own extension to the former SQL statement as it is resumed in the following tables.
Argument |
Description |
Unique |
To enforce the uniqueness in the index key |
clustered | unclustered |
To specify if the index is clustered or unclustered |
asc | desc |
To specify the columns order of indexing |
Table 1. Microsoft SQL Server extension to the creation of index
Argument |
Description |
Unique |
To enforce the uniqueness in the index key |
Bitmap |
To specify that the index is created as bitmap rather than using the normal structure based on B*Tree |
asc | desc |
To specify the columns order of indexing |
Compress | uncompress |
To specify that key compression is enable |
Table 2. Oracle Database extension to the creation of index
2.3 Examples of index creation
In this section we are going to analyze some type of index and its underlying implementation in Microsoft SQL Server and Oracle Database.
The creation of a simple index
In order to create a simple index in Microsoft SQL Server and Oracle Database, you must use the CREATE INDEX statement (see Listing 5).
create index ndx_myindex on table_name(attr1_name, attr2_name, ...) |
Listing 5. Generic form for creating an index using SQL
The creation of a bitmap index
To create a bitmap index in Oracle Database based on the attribute job of the table emp, you must use the CREATE BITMAP INDEX statement (see Listing 6)
create bitmap index bndx_dept_dname on dept(dname) |
Listing 6. The creation of a bitmap index on Oracle Database
The creation of a unique index
The creation of a unique index on table in Oracle Database and in Microsoft SQL Server is similar by using the CREATE UNIQUE INDEX statement. Let's illustrate it with a generic example (see Listing 7).
create unique index on tbl_table_name on (attr1_name, attr2_name, ...) |
Listing 7. The creation of a unique index in Oracle database.
The creation of composite indexes
You can create index on one or more attributes of a table as well as setting up the order for each attribute. Let's suppose that we have a table named tbl_composite with three attributes attr1, attr2 and attr3. This table has thousands of millions of rows. We are going to query this table to display its columns and filtering some rows (see Listing 8).
select attr1, attr2, attr3 from tbl_composite where attr1=value1 and attr2=value2 and attr3=value3 |
Listing 8. Expensive query to the table tbl_composite
As you can see this query is very expensive because we are using different filtering criteria. If you want to improve the overall performance, you should create a composite index on the table tbl_composite and on attributes attr1, attr2, attr3 and specify the order (see Listing 9).
create index on tbl_composite (attr1 asc, attr2 asc, attr3 desc) |
Listing 9. The creation of a composite index
There are common situation when we've found a lot of repetitive values in composite values. You can use the compress clause in Oracle Database to eliminate these repetitive values (see Listing 10).
create index on tbl_composite (attr1 asc, attr2 asc, attr3 desc) compress &1; |
Listing 10. A composite index with the index key's compression enable
The & 1 clause indicates to create the index using compression on just the leading columns which are the first part of the index key values and thus will have many repetitive values.
Join indexes in Oracle Database
A join index is used to increase the performance of the join of several tables. You can think of a join index as a pre-computed join. An index is normally created on a table but a join index allows indexing a given table using the columns from another table. This index can be organized as a B*Tree or as a Bitmap. Oracle Database only implements bitmap join indexes.
This type of index can be used particularly in data-warehouse environment where the main schema is star-based thus having a lot of relationships between entities. Although, it's recommended not to use this type of index in OLTP databases because of the overhead associated to its maintenance.
Let's illustrate these concepts with an example. Suppose that we have a data warehouse that contains a star schema with a fact table named sales and a dimension table named product. We want to create a report (see Listing 11) to display the sales amount given a product category.
select sum(sales.amount) from sales, product where sales.prod_id = product.prod_id and product.category = 'dbms'; |
Listing 11. Report displaying sales amount by product category
Because the sales table might be very large as any other transaction table, the performance of this query might be not good at all. In order to eliminate some key interactions and merge work, we might define a bitmap join index to increase the response time and improve the overall application performance (see Listing 12). You can notice that in the index creation, we can refer to the clause where using the index creation syntax on sales(product.category).
create bitmap index bidx_sales_product on sales(product.category) from sales, product where sales.prod_id = product.prod_id |
Listing 12. The Oracle SQL code for the creation of join indexes
Function based index
This type of index is used to store computed columns value and it's only implemented in Oracle Database.
Let's suppose that we write a SELECT statement for a report in order to display information about a particular employee using the emp table (see Listing 13).
select * from emp where upper(ename)='JOHN' |
Listing 13. Query displaying information about the employee whose name is JOHN
You can improve this query response time creating a function based index on the string function upper (see Listing 14).
Create index f_ndx_dept_loc_upper on emp( upper(ename) ) |
Listing 14. The creation of function based index
Conclusion
This article has illustrated the main concepts about the two most important storage structures in database system such as tables and indexes. We have discussed structures used to store rows in tables as well as index mechanisms and several index types used to implement some strategies to access rows in tables in order to improve the overall performance of the application and decreasing the overhead of I/O operations. After you read this article, you are able to apply these concepts and techniques to your own business scenario.