SQL Server Analytic Functions

Article

SQL Server contains several analytic functions, analytic functions compute an aggregate value based on a group of rows. Analytic functions can return single or multiple rows for each group. Today, we learn about the following SQL Server analytic functions.

CUME_DIST
FIRST_VALUE
LAST_VALUE
LEAD
LAG
PERCENTILE_COUNT
PERCENTILE_DISC
PERCENT_RANK

Now that we have read about each analytic function in brief, we will use the following table for our queries.

table

COME_DIST:

The CUME_DIST function calculates the cumulative distribution of a value in a group of values. CUME_DIST computes the relative position of a specified value in a group of values. For example, the cumulative distribution for a given row in group is equal to the number of rows with values lower than or equal to the value of r, divided by the total number of rows in groups. The range of values returned by CUME_DIST is greater than 0 and less than or equal to 1.

Syntax:

CUME_DIST( )
OVER ( [ partition_by_clause ] order_by_clause )

Example:

Example

FIRST_VALUE:

FIRST_VALUE function returns the first value in an ordered set of values. Return type of this function is same type as scalar_expression.

Syntax:

FIRST_VALUE( )
OVER ( [ partition_by_clause ] order_by_clause )

Example:

Example

LAST_VALUE:

The LAST_VAlUE function return the last value in an ordered set of values. Return type of this function is same type as scalar_expression.

Syntax:

LAST_VALUE( )
OVER ( [ partition_by_clause ] order_by_clause )

Example:

Example

Example:

Example

In above result some thing is interesting that ,if we go from row 1 to 10 then value of FIRST_VALUE is same for each group but value of LAST_VALUE is different for each row. The reason behind this that as we go from row 1 to 10 each time value of last row is change, the last value will be of the row where we are currently looking at.

LEAD:

The LEAD function accesses data from a subsequent row in the same result set without the use of a self-join. LEAD provides access to a row at a given physical offset that follows the current row.

Syntax:

LEAD ( scalar_expression ,offset , default )
OVER ( partition_by_clause [order_by_clause] )

Parameters:

Offset: The number of rows forward from the current row from which to obtain a value.
Default: The value to return when scalar_expression at offset is NULL.

Example:

Example

LAG:

Working of LAG function is just opposite of LEAD function. LAG Accesses data from a previous row in the same result set without the use of a self-join. AG provides access to a row at a given physical offset that comes before the current row.

Syntax:

LAG( scalar_expression ,offset ,default)
OVER ( partition_by_clause [order_by_clause] )

Parameters:

Offset: The number of rows forward from the current row from which to obtain a value.
Default: The value to return when scalar_expression at offset is NULL.

Example:

Example

PERCENTILE_CONT:

The PERCENTILE_CONT analytic function calculates a percentile based on a continuous distribution of the column value. It is similar to finding median with percentile value.

If number of element is event then PERCENTILE_CONT will be calculated by add the two digits from the middle and divide by 2 and in case of odd numbers of element PERCENTILE_CONT will be calculated by select the digits from the middle in order list.

Example

Example:

Example

PERCENTILE_DISC:

The PERCENTILE_DISC analytic function Computes a specific percentile for sorted values in an entire rowset or within distinct partitions of a rowset . For a given percentile value P, PERCENTILE_DISC sorts the values of the expression in the ORDER BY clause and returns the value with the smallest CUME_DIST value (with respect to the same sort specification) that is greater than or equal to P.

Example:

Example

PERCENT_RANK:

The PERCENT_RANK analytic function calculates the relative rank of a row within a group of rows. Use PERCENT_RANK to evaluate the relative standing of a value within a query result set or partition. PERCENT_RANK is similar to the CUME_DIST function.

Formula:

PERCENT_RANK() = (RANK() – 1) / (Total Rows – 1)

Syntax:

PERCENT_RANK( partition_by_clause order_by_clause)

Example:

Example

In above table for Emp_Id 6 ,RANK=4 so PERCENT_RANK will be,

PERCENT_RANK=(RANK-1)/(Total_Row-1)
=(4-1)/(10-1)
=3/9
=0.3333333

Thanks for reading the article.