Search For Missing Values Within a Numerical Sequence

Scope

The purpose of this short article is to show a method for identifying, within a SQL Server table, gaps in a numerical sequence, to quickly locate absences of consequence within given data. This approach is useful when there is a need, for example, to verify that there are no missing documents in a given list (think of the possibility of having to monitor a flow of invoices, when you must check if in a given range some of them are missing, basing your analysis on the document's registration number).

Basic concept

The script is based on a simple concept. If we could have a table that contains each number among those possible for the management of our case, we could compare our actual list (where some numbers will potentially not be present) with the ideal list, that contains them all and then verify the deficiencies.

It is highly recommended to have a numbers table in each instance of the SQL Server. There is no reason to create a new table each time, in most cases. Moreover, we can use one table for any query in any database in the server instance. There are many queries that can be improved using this numbers table.

* Tip!

You can create an Accessories database that includes accessories, such as numbers table, dates tables, CLR functions, T-SQL functions and more. This database should be on a read only mode and will make the queries faster with less locks and more security. All users should have privileges to read from this database.

If you have not yet created a numbers table yet, then this is the time to do so.

  1. ---------------------------------------------------> Numbers table  
  2. CREATE TABLE AriNumbersTbl (Number  int  not null)  
  3. GO  
  4.    
  5. INSERT INTO AriNumbersTbl(Number)  
  6. SELECT TOP 10000000 row_number() over(order by (select NULL)) as N  
  7. FROM master..spt_values t1  
  8. CROSS JOIN master..spt_values t2  
  9. CROSS JOIN master..spt_values t3  
  10. GO  
  11.    
  12. ALTER TABLE AriNumbersTbl ADD CONSTRAINT PK_AriNumbersTbl PRIMARY KEY CLUSTERED (Number)  
  13. GO  

Application

We'll now create a table named Products to be used for our comparisons. It will have the following two fields, the first is a hypothetical unique ID of a product, whereas the second is a description of the product itself. In a table like this, we will insert some test records.

  1. CREATE TABLE Products( ProductID INT PRIMARY KEY, ProductDes VARCHAR (50))  
  2. INSERT INTO Products VALUES(1 , 'Product 1' )  
  3. INSERT INTO Products VALUES(2 , 'Product 2' )  
  4. INSERT INTO Products VALUES(3 , 'Product 3' )  
  5. INSERT INTO Products VALUES(4 , 'Product 4' )  
  6. INSERT INTO Products VALUES(5 , 'Product 5' )  
  7. INSERT INTO Products VALUES(10 , 'Product 10' )  
We have 5 consecutive records, and the sixth has an ID = 10. The following implements the preceding logic. If we wish to spot potential missing values then the result we should expect will consist of a record having an ID from 6 to 9, missing in the Products table, but not in the AriNumbersTbl. Let's see how to do this.
  1. SELECT number  
  2. FROM AriNumbersTbl  
  3. WHERE number NOT IN (SELECT ProductID FROM Products) AND  
  4.       number <= (SELECT MAX(ProductID) FROM Products )  

This simple SELECT contains two subqueries in the WHERE clause. What we will do is extract from the AriNumbersTbl table the values not contained in (NOT IN) the selection of all ID values from the Products table (SELECT ProductID FROM Products). If we stop here, we will obtain the values 6, 7, 8 and 9 but also all those values between 11 and 99999, being absent from the Products table, but not into AriNumbersTbl. As an upper limit, we can then put the maximum value readable from the Products table. Since the higher ID is 10 at the moment, we are not interested in spotting the values that will go beyond that limit, being obvious for them to satisfy the basic conditions for our analysis. So, we simply add a second condition to extract, among all the values of AriNumbersTbl, those that won't go beyond the maximum ID present in the Products table (SELECT MAX(ProductID) FROM Products).

Executing our script in Management Studio, using the example data, we'll obtain a list like this:

Complete script

  1. --> Don't forget that you need a numbers table which is not part of this script, since it is something that should be in any server instance.  
  2.    
  3. -- Products Table creation and population  
  4. CREATE TABLE Products( ProductID INT PRIMARY KEY, ProductDes VARCHAR (50))  
  5. INSERT INTO Products VALUES(1 , 'Product 1' )  
  6. INSERT INTO Products VALUES(2 , 'Product 2' )  
  7. INSERT INTO Products VALUES(3 , 'Product 3' )  
  8. INSERT INTO Products VALUES(4 , 'Product 4' )  
  9. INSERT INTO Products VALUES(5 , 'Product 5' )  
  10. INSERT INTO Products VALUES(10 , 'Product 10' )  
  11.     
  12. -- Identifying numerical holes in products list  
  13. SELECT *  
  14. FROM AriNumbersTbl  
  15. WHERE number NOT IN (SELECT ProductID FROM Products) AND  
  16.       number <= (SELECT MAX( ProductID) FROM Products )  
  17.     
  18. -- Deleting sample tables  
  19. DROP TABLE Products  

Up Next
    Ebook Download
    View all
    Learn
    View all