Tag Archiv für Analytic Function

Oracle Analytic Functions III: RANK() & DENSE_RANK()

Today I’ve prepared a lesson in Oracle analytic functions. Basicly these functions are an enhancement of the aggregation functions like SUM, MAX, MIN, etc. With the help of this enhancement it’s possible to calculate different aggregations within the same query. With a classic group by you can only have one aggregation per query. With analytic functions you can have more than this. It’s also possible to calculate windows, i.e. to access rows before or after the current row which gives you the ability to create dynamic window sums, etc. This basic functionality is described in part 1 and part 2 of this tutorial. Beside these basics functions there are a lot of additional functions in the database.

Today I want to demonstrate the use of the functions RANK() and DENSE_RANK(). These are used to calculate rankings in lists. The ranking depends on the sorting order which you can define within each calculation, i.e. you can create one ranking based on turnover descending and another on turnover ascending within the same query. As with all analytical functions you can also use partitioning with the ranking functions. This gives you the ability to start a new ranking for each partition. We will see that in the example. Basicly RANK() and DENSE_RANK() work in the same way but the difference lays in the way how they deal with rows that have the same ranking. If you have two rows with the same turnover they receive the same rank. The function RANK() then skips the next rank and continues with the following one. In comparision with that DENSE_RANK() continues with the next rank. Here comes the first example to demonstrate the basic stuff. This example is based on the TBL_TEST table (it consists of four columns: DAY, PRODUCT_NO, PRODUCT_NAME, TURNOVER. See also the example files):

SELECT x.*,
       RANK() OVER(ORDER BY PRODUCT_NO) AS RANK_NO,
       DENSE_RANK() OVER(ORDER BY PRODUCT_NO) AS DRANK_NO
FROM (
      SELECT to_char(DAY,’YYYY-MM’) AS MONTH,
             PRODUCT_NO,
             SUM(TURNOVER) as TURNOVER
      FROM TBL_TEST
      GROUP BY to_char(DAY,’YYYY-MM’), PRODUCT_NO
) x

Result:

BLOG_0028_PIC01_Example_RANK_DENSE_RANK

This example works with a sub-query to aggregate the data to a month based level. Then in the main query RANK() and DENSE_RANK() are used to calculate a ranking based on the product-no column. For that PRODUCT_NO appears in the ORDER BY of the RANK and DENSE_RANK functions. In the results you can see that the first four rows have the same rank. The function RANK() continues with rank 5 then but DENSE_RANK with 2. That’s the difference between these two functions when rows have the same rank.

I’ve mentioned that it’s possible to create a new ranking which each partition. The next example deals with that:

SELECT x.*,
       RANK() OVER(PARTITION BY MONTH ORDER BY PRODUCT_NO) AS RANK_NO
FROM (
      SELECT to_char(DAY,’YYYY-MM’) AS MONTH,
                    PRODUCT_NO,
                    SUM(TURNOVER) as TURNOVER
      FROM TBL_TEST
      GROUP BY to_char(DAY,’YYYY-MM’), PRODUCT_NO
) x;

Result:

BLOG_0028_PIC02_Example_RANK_PARTITION_BY

Again we have a sub-query to aggregate daily values in the table to monthly values. In the main query we just added a PARTITION BY MONTH to the definition of the RANK_NO column and that changes the list in that way that we have a ranking per month now.

To finish todays article I’ve created a small task. The solution can be found in the download area:

Produce a list with products, days and turnover. Create a ranking based on the turnover (of product and day). The data should be filtered to January 2017. And in the result the 3 top days should be displayed. The list should look like this one:

BLOG_0028_PIC03_Task1_Result

That’s it for RANK and DENSE_RANK. I hope you enjoyed today’s lesson.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training on my corporate homepage. At the moment my open courses are only available in German. Inhouse training and online courses are also available in English language. So check out my open trainings (German) or my online courses (English)!

Download

BLOG_0028_RANK_DENSE_RANK_Examples_TASK.zip

Oracle Analytic Functions – Part II

As this is the second part of my series about Oracle’s analytic functions I would like to step ahead with two more functions:

  • LEAD-function
  • LAG-function

Sometimes you need to access the preceding or following data row. In these cases you can use the above functions. Let’s use the following table to build some examples:

MONTH QTY TURNOVER
1 201211 3 125
2 201212 5 500
3 201301 1 100
4 201302 2 150
5 201303 3 250
6 201304 4 450

Fig.1: TBL_SALES

 

If you want to show the turnover of every month compared with the turnover of the month before you can use the Lag-Function. The general syntax for both functions is the following:

Lead/Lag (<Value>, <No of Rows>, <Default>) Over (Partition by … Order by ….)

Please note that you have to use the order by clause for these two functions. Lead returns the following row and Lag the preceding row. <Value> describes the Column of which the following or preceding value is returned. <No of rows> sets how many rows in front or back the system should look for and return a value. If there isn’t a value available (within the partition) then the <default> value is returned.

Let’s develop the first example:

       SELECT MONTH,
                TURNOVER,
                LAG(TURNOVER,1,-1) OVER (ORDER BY MONTH ASC) AS TURNOVER_2
FROM TBL_SALES

The result should look like this:

MONTH               TURNOVER            TURNOVER_2
——————————————————-
201211                125                   -1
201212                500                  125
201301                100                   500
201302                150                   100
201303                250                   150
201304                450                   250

As you can see the TURNOVER_2  column contains the value of the turnover column of the row before. The very first row of our little example TURNOVER_2 is -1 cause there is no row before and because of this a value can’t be found and the -1 is returned instead. If you wouldn’t define a default value then NULL is returned.

Now we change our statement a little:

       SELECT MONTH,
                TURNOVER,
                LAG(TURNOVER,1,-1) OVER (PARTITION BY substr(MONTH,1,4)
                                                      ORDER BY MONTH ASC)
                   AS TURNOVER_2
FROM TBL_SALES

The result looks this way:

MONTH               TURNOVER            TURNOVER_2
——————————————————
201211                 125                 -1
201212                 500                 125
201301                 100                 -1
201302                 150                 100

The difference is the partition by clause. That means that the function is only used within the partitions we have defined. In our case we have partitioned our data by years. To do so we use the function substr on the column MONTH. Now the DBMS uses the LAG function only within the corresponding year. That means for 2012 the first row in our table is the 201211. We have a -1 there. The next row is the second row within the partition. So we see the value of the row before there. The third row in our table is the 201301. It’s a new partition because substr(201301,1,4)=2013 and that is different from 2012. So it starts again and the first row of this partition (i.e. the third row in the table) is -1 again.

I hope the concept of partitioning became clearer with this issue too. Have fun experimenting with the two new functions.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training on my corporate homepage. At the moment my open courses are only available in German. Inhouse training and online courses are also available in English language. So check out my open trainings (German) or my online courses (English)!

Oracle Analytic Functions – Part I

In today’s issue I would like to address a very useful topic in an oracle environment called analytic functions. These functions are extremely useful to build complex calculations in an easy manner. In older oracle versions you had to create compley sub-queries and joins what can be now done by analytic functions.

To give you a brief introduction in this topic we will build YearToDate sums with these functions, that means for our monthly turnovers we want to calculate the YearToMonth turnover (turnover from January until each month) for each month with the help of analytic functions. For this our base table TBL_SALES looks like this:

MONTH QTY TURNOVER
1 201212 5 500
2 201301 1 100
3 201302 2 150
4 201303 3 250
5 201304 4 450

Fig.1: TBL_SALES

The general syntax fort he analytic functions is like this:

Function(….) Over( Partition by ….. Order by ….. Rows between …. And …..)

There are a lot of functions that can be used with this, but today we will focus on the SUM function, which is also known as aggregate function. The disadvantage of these aggregates is, that they can only work on the grouping of the whole query. But sometimes you need different groupings for different calculations. Maybe we want to list every month with its quantity and then the complete quantity. Without analytic functions one could create a sub query that sums all quantities, maybe like this:

SELECT MONTH,
QTY,
(SELECT SUM(QTY) FROM TBL_SALES) AS TOTAL_QTY
FROM TBL_SALES

 

With analytical functions the same would look like this:

SELECT MONTH,
                QTY,
                SUM(QTY) OVER() AS TOTAL_QTY
FROM TBL_SALES

The result in both cases is this:

MONTH               QTY   TOTAL_QTY
———————————————–
201212                  5             15
201304                  4             15
201303                  3             15
201301                  1             15
201302                  2             15

 

As this example is pretty simple the advantage isn’t that big. But the more complex the aggregates are, the higher is the advantage using analytic functions.

Back to the syntax of analytic functions – In our case above we didn’t use the PARTITION BY neither the ORDER BY clause. The reason is because we wanted to calculate a SUM over everything in the table. We can use the PARTITION BY clause to define on what level the function is working, i.e. on what level the total quantities are calculated. If we want to calculate them for each year then we would define a partition on the year like this:

SELECT MONTH,
                QTY,
                SUM(QTY) OVER(substr(MONTH,0,4) AS TOTAL_QTY_YEAR
FROM TBL_SALES

 

The result would be like this:

MONTH               QTY   TOTAL_QTY_YEAR
———————————————–
201212                  5             5
201304                  4             10
201303                  3             10
201301                  1             10
201302                  2             10

 

We now want to calculate the YearToMonth-Sum for each Month. For this we need to tell oracle that in Jan 2013 he has to summarize just the January qty, in Feb 2013 he has to sum the Jan and Feb quantity and so on. We can give this information with the ROWS BETWEEN …. AND …. Statement. With this we can define a dynamic window. But when we use it we need to order our datasets too. This is done with the ORDER BY clause within the Function. The complete Statement for the YearToMonth SUM would look like this:

 

SELECT  MONTH,
        QTY,
        SUM(QTY) OVER(PARTITION BY substr(MONTH,0,4)
                      ORDER BY MONTH ASC
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS YTM
FROM TBL_SALES

 

The result is this:

MONTH               QTY   TOTAL_QTY_YEAR
———————————————–
201212                  5             5
201301                  1             1
201302                  2             3
201303                  3             6
201304                  4             10

 

In this case UNBOUNDED PRECEDING means that all rows before the current row but within the partition are taken.

This was an example to show the basic principle how these functions work. You can easily do the same on partitions over several columns. And of course you can combine them all. For example you could combine the YTM Calculation and the Total_Qty_Year in one statement.

This article was the first part of the series on the analytic functions. In the next parts I will describe more functions and give you further examples on how to use them correctly.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training on my corporate homepage. At the moment my open courses are only available in German. Inhouse training and online courses are also available in English language. So check out my open trainings (German) or my online courses (English)!