Archiv für Oracle SQL

Group By extensions IV: Functions

Hi,

This is the last part of my article series on SQL’s Group By extensions. In the first part we dealt with the ROLLUP statement, the second part then covered the CUBE statement and in the third part we discovered flexibility with the GROUPING SETS. Today we will take a look at two functions that are sometimes used to filter some of the auto-generated sums of ROLLUP and CUBE. The names of the functions are:

  • GROUPING(<expression>)
  • GROUPING_ID(<expression1>, <expression2>, …, <expression n>)

The examples are based on the test data of part 2, i.e. the table looks something like this:

Day Product No Product Name Turnover
01.01.2017 1 Product 1 20
2 Product 2 30
3 Product 3 40
02.01.2017 1 Product 1 30
2 Product 2 40
3 Product 3 50
….
30.04.2017 3 Product 3 330

 

The GROUPING() function is used to check whether an expression was used in the grouping or not. Here comes a small example for that:

SELECT  to_char(DAY, ‘YYYY-Q’) AS QUARTER_ID,
              to_char(DAY, ‘YYYY-MM’) AS MONTH_ID,
              SUM(TURNOVER) AS TURNOVER,
              GROUPING(to_char(DAY, ‘YYYY-Q’)) AS GRP_QUARTER,
              GROUPING(to_char(DAY, ‘YYYY-MM’)) AS GRP_MONTH
     
FROM TBL_TEST
GROUP BY ROLLUP(to_char(DAY, ‘YYYY-Q’), to_char(DAY, ‘YYYY-MM’))
ORDER BY QUARTER_ID, MONTH_ID;
 

Result:

BLOG_0023_PIC01_Result_Query_1

In this example you can see how the GROUPING() function basically works. For the most detailed level (I.e. month) both functions return 0 as the month_id and the quarter_id columns are grouped by. For the quarter sums the month_id is not grouped by and therefore the GROUPING() function returns 1 for this column. For the total sum both id-columns are not in the GROUP BY and that leads to a result of 1 for both GROUPIN() functions.

You can use these results to decide what sums you want to keep or remove. For that you usually need these functions when you use CUBE or ROLLUP. With GROUPING SETS you normally don’t have the need for these functions as you can exactly define what sums you need.

The GROUPING_ID() function is the combination of several GROUPING() functions in the way that if you take several GROUPING results (which are always 1 or 0 = binary) and put them in a row you receive a binary string which can be interpreted as a decimal number and that one is returned by GROUPING_ID. You receive a different ID for each different kind of sum. You then only have one column with the ID instead of several columns as with the GROUPING() function. Now here is an example for that:

SELECT  to_char(DAY, ‘YYYY-Q’) AS QUARTER_ID,
              to_char(DAY, ‘YYYY-MM’) AS MONTH_ID,
              SUM(TURNOVER) AS TURNOVER,
              GROUPING(to_char(DAY, ‘YYYY-Q’)) AS GRP_QUARTER,
              GROUPING(to_char(DAY, ‘YYYY-MM’)) AS GRP_MONTH,
              GROUPING_ID(to_char(DAY, ‘YYYY-Q’), to_char(DAY, ‘YYYY-MM’)) AS GRP_ID
 
FROM TBL_TEST
GROUP BY ROLLUP(to_char(DAY, ‘YYYY-Q’), to_char(DAY, ‘YYYY-MM’))
ORDER BY QUARTER_ID, MONTH_ID;

Result:

BLOG_0023_PIC02_Result_Query_2

 

I’ve take the first example and added only the last row in the SELECT. This is the GROUPING_ID() function which takes several expressions. In this example we put the expressions of the QUARTER_ID and MONTH_ID columns. The order of the expressions is important. It defines on what position of the binary string the result of the GROUPING of the column is put. If you take a look at row number 4. The GROUPING of Quarter is 0 and the GROUPING of Month is 1. As an binary string this would be 01 and if you transform it to a decimal number it’s: 0*2+1*2° = 0 + 1 = 1. If we check line no. 7 we have 11 as a binary string and decimal it’s: 1*2 + 1*2° = 2 + 1 = 3.

Now another example for the GROUPING():

SELECT  to_char(DAY, ‘YYYY-Q’) AS QUARTER_ID,
              to_char(DAY, ‘YYYY-MM’) AS MONTH_ID,
              DAY,
              SUM(TURNOVER) AS TURNOVER,
              GROUPING(to_char(DAY, ‘YYYY-Q’)) AS GRP_QUARTER,
              GROUPING(to_char(DAY, ‘YYYY-MM’)) AS GRP_MONTH,
              GROUPING(DAY) AS GRP_DAY,
              GROUPING_ID(to_char(DAY, ‘YYYY-Q’), to_char(DAY, ‘YYYY-MM’),
                                      DAY) AS GRP_ID     
 
FROM TBL_TEST
WHERE to_char(DAY,’DD’) IN (’01′, ’02′, ’03′)
GROUP BY ROLLUP(to_char(DAY, ‘YYYY-Q’), to_char(DAY, ‘YYYY-MM’), DAY)
ORDER BY QUARTER_ID, MONTH_ID, DAY;

Result:

BLOG_0023_PIC03_Result_Query_3

This example is based on the second example. In addition I’ve added the DAY column in the SELECT and ROLLUP. Also I’ve added a GROUPING() for DAY and added it to the GROUPING_ID() function. A filter in the WHERE clause just limit the rows in the result. It’s pretty much the same as in the second example. But you have one column more in this example. The daily sums have a GROUPING_ID of 0, the monthly SUMS of 1, the quarterly sums of 3 and the total sum a GROUPING_ID of 7. The 7 is calculated like this: 111 = 1*2² + 1*2 + 1*2° = 4 + 2 + 1 = 7.

I hope it became clear a little how the functions GROUPING() and GROUPING_ID() works and what they can be used for. These functions can be used in the HAVING clause to filter out some sums. Also these functions can be used with CUBE and also GROUPING SETS.

The examples used in this article can be found at the very end of this article as zip-file. I will continue this series soon and we will then deal with some useful functions you can use if you are working with the rollup or cube statement.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training on my corporate homepage. At the moment my open courses are only available in German. Inhouse training and online courses are also available in English language. So check out my open trainings (German) or my online courses (English)!

 

Downloads

BLOG_0023_Testdata_&_Examples.zip

Share and Enjoy: Diese Icons verlinken auf Bookmark Dienste bei denen Nutzer neue Inhalte finden und mit anderen teilen können.
  • LinkedIn
  • XING
  • Facebook
  • Google Bookmarks
  • Twitter

Group By extensions III: GROUPING SETS

Hi,

again something on the GROUP BY extensions. Today we will deal with GROUPING SETS. Probably the most flexible and easiest way to create multiple GROUP BYs within one query. In part 1 of this series we discovered the ROLLUP clause, which let’s us automatically calculate totals along some kind of dimension, which is defined by the grouping columns. Part 2 described the CUBE clause which just simply calculate all possible sums based on the grouping columns.

GROUPING SETS let you exactly specify the totals you want to see in your query. If you just want a report total and some monthly totals then you can just define that and you get it. If you need all the totals for all levels of a dimension a ROLLUP might be easier, but in most cases GROUPING SETS will probably be.

Now let’s get to coding. The examples are based on the test data of part 2, i.e. the table looks something like this:

Day Product No Product Name Turnover
01.01.2017 1 Product 1 20
2 Product 2 30
3 Product 3 40
02.01.2017 1 Product 1 30
2 Product 2 40
3 Product 3 50
….
30.04.2017 3 Product 3 330

 

In my first example I just want a monthly toal and a report total. With GROUPING SETS it looks like this:

SELECT to_char(DAY,’MM/YYYY’) AS MONTH,
            SUM(TURNOVER) AS TURNOVER
FROM TBL_TEST
GROUP BY GROUPING SETS( (to_char(DAY,’MM/YYYY’)),
                                                    ()
                                           );

Result:

BLOG_0022_PIC01_Result_Query1

As you can see in this example you just put all the different GROUP BYs in different sets of brackets. All the columns that belong to a specific GROUP BY are put to one pair of brackets. The different GROUP BYs are then separated by comma. If a GROUP BY only consist of one column you could theoretically leave out the brackets. If you want a report total you leave the brackets empty.

Now another example with a monthly total, a report total and a total per month and product:
SELECT to_char(DAY,’MM/YYYY’) AS MONTH,
            PRODUCT_NO,
            PRODUCT_NAME,
            SUM(TURNOVER) AS TURNOVER
FROM TBL_TEST
GROUP BY GROUPING SETS( to_char(DAY,’MM/YYYY’),
                                            (),
                                           (to_char(DAY,’MM/YYYY’), PRODUCT_NO, PRODUCT_NAME)
                                          )
ORDER BY MONTH, PRODUCT_NO NULLS LAST

Result:

BLOG_0022_PIC02_Result_Query2

In this example we have calculated three sums as you can see directly in the GROUPING SETS section of the SQL. This example could have been produced also with a ROLLUP if you change the clauses and leave out the first two rows of the GROUPING SETS section. As you can see often there are several ways to reach the target. For larger reports and if you need a lot of sums maybe ROLLUP makes it easier for you and the database. But sometimes you only need a few specific totals out of a lot of different combinations then maybe GROUPING SETS are easier to use.

The examples used in this article can be found at the very end of this article as zip-file. I will continue this series soon and we will then deal with some useful functions you can use if you are working with the GROUP BY extensions.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training on my corporate homepage. At the moment my open courses are only available in German. Inhouse training and online courses are also available in English language. So check out my open trainings (German) or my online courses (English)!

 

Downloads

BLOG_0022_Testdata_Examples.zip

Share and Enjoy: Diese Icons verlinken auf Bookmark Dienste bei denen Nutzer neue Inhalte finden und mit anderen teilen können.
  • LinkedIn
  • XING
  • Facebook
  • Google Bookmarks
  • Twitter

Group By extensions II: CUBE

This is the second part of the series about GROUP BY extensions. In the first part of this series we discovered the ROLLUP statement which lets you easily calculate totals on different levels on granularity without using several queries. ROLLUP just calculates the totals along an artificial dimension which can be defined through expressions for each level.

Beside ROLLUP there is another extension which is called CUBE. CUBE calculates all possible totals for the given expressions. If you have Month and Product in your query and using CUBE then there calculated for totals: Month, Product, Month and Product, Total Query.

Let’s think of the following table with test data:

Day
Product No
Product Name Turnover
01.01.2017 1 Product 1 20
2 Product 2 30
3 Product 3 40
02.01.2017 1 Product 1 30
2 Product 2 40
3 Product 3 50
….
30.04.2017 3 Product 3 330

 

The first example creates a cube for day and product (no + name):

SELECT to_char(DAY,’MM/YYYY’) AS MONTH,
              PRODUCT_NO,
              PRODUCT_NAME,
              SUM(TURNOVER)
FROM TBL_TEST
GROUP BY CUBE ( to_char(DAY,’MM/YYYY’),
                                (PRODUCT_NO, PRODUCT_NAME)
                              );

Result:

BLOG_0020_PIC01_Result_Query1

This query generates 20 rows. The first 12 rows are the most detailed totals, i.e. a total for month and product. Line 13 – 15 are the totals per product, line 16 – 19 are the totals per month and line 20 is the query total.

To get this result you only have to use the CUBE statement instead of ROLLUP. CUBE then generates all possible sums. If you want two columns to be treated as one (e.g. product no and product name) then you use extra brackets for it ( take a look at the example). Of course you could add more expressions to the CUBE but this always leads to a lot more sums. If you have three expressions in the CUBE then you’ll get 2³=8 totals.

That leads sometimes to the question what to do if you need only specific totals but not all. You could use a CUBE then and use the HAVING to filter out the not needed totals based on values in the grouping columns.

The second example shows how to do that:

SELECT to_char(DAY,’MM/YYYY’) AS MONTH,
             PRODUCT_NO,
             PRODUCT_NAME,
             SUM(TURNOVER)
FROM TBL_TEST
GROUP BY CUBE ( to_char(DAY,’MM/YYYY’),

                                (PRODUCT_NO, PRODUCT_NAME)

                               )
HAVING PRODUCT_NO IS NOT NULL AND to_char(DAY, ‘MM/YYYY’) IS NULL
                OR PRODUCT_NO IS NULL AND to_char(DAY, ‘MM/YYYY’) IS NOT NULL;

Result:

BLOG_0020_PIC02_Result_Query2

In this SQL I have added the two lines for the HAVING clause. With these two lines I tell the database what sums out of all possible sums I am really interested in. In order to do that I just check that either the month is null and product not or month is not null but product is. Only in these two cases I want the line to be displayed.

Like the ROLLUP CUBE is available on Oracle, MS SQL Server, DB2, Postgres, etc., but not on MySQL. I hope you enjoyed today’s lesson again. The examples used in this article can be found at the very end of this article as zip-file. I will continue this series soon and we will then deal with the so called GROUPING SETS which let you define sums and totals in a very detailed way.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training on my corporate homepage. At the moment my open courses are only available in German. Inhouse training and online courses are also available in English language. So check out my open trainings (German) or my online courses (English)!

Download

BLOG_0020_Testdata_Examples_zip

Share and Enjoy: Diese Icons verlinken auf Bookmark Dienste bei denen Nutzer neue Inhalte finden und mit anderen teilen können.
  • LinkedIn
  • XING
  • Facebook
  • Google Bookmarks
  • Twitter

Group By extensions I: ROLLUP

In todays article I will deal with an extension of the classical Group By clause. The Group By clause is used to aggregate data to a specific level of granularity. Let’s say the data in the database is daily data and you need a report on monthly level. Then you have to aggregate the data, i.e. you have to sum up all lines that belong to the same month. The result will be that you have one line per month. For that you would introduce a GROUP BY for defining the needed level of granularity (in this case: monthly level). Furthermore you would use the SUM() aggregation function to sum up the kpi.

That all works fine so far but it has one limitation: You can only have one GROUP BY per SQL statement and therefore you can only aggregate the data to one level of granularity within one query. What if you need monthly data and quarterly data in one query? Then you would use two queries and the set operation UNION ALL. Or after this article you will use the Group By extension ROLLUP. This has two advantages:

  1. Performance
  2. One additional clause vs. another query + UNION ALL

In the first approach (second query + UNION ALL) the tables are accessed several times (depending on how many queries are used). That can lead to bad performance. Furthermore the sql will grow because you have several queries. When using the Group By extension you can see in the execution plan that the tables are only accessed once. And the SQL isn’t growing that much because all you have to do is to extent the GROUP BY clause.

Okay, let’s write some code. Given the following table (see the zip-file for a creation script):

Day (DD.MM.YY)
TURNOVER
01.01.17 1000
1000
31.01.17 1000
01.02.17 1200
1200
28.02.17 1200
01.03.17 1500
1500
30.04.17 1500

 

Now we want to create a query with monthly data and a total over everything. Without the Group By extension it would look something like this:

SELECT  to_char(DAY,’MM/YYYY’) AS MONTH,
             SUM(TURNOVER) AS TURNOVER
FROM TBL_TEST
GROUP BY to_char(DAY,’MM/YYYY’) 
 
UNION ALL
 
SELECT  ‘TOTAL 2017′ AS MONTH,
              SUM(TURNOVER) AS TURNOVER
FROM TBL_TEST;

Result:

BLOG_0019_PIC01_Result_Query1

As you can see we have two queries that are connected via UNION ALL. The first sql creates the monthly totals. The second one creates the query total.

With ROLLUP the same query would look like this:

SELECT  nvl(to_char(DAY,’MM/YYYY’),’TOTAL 2017′) AS MONTH,
            SUM(TURNOVER) AS TURNOVER
FROM TBL_TEST
GROUP BY ROLLUP (to_char(DAY,’MM/YYYY’))

Result:

BLOG_0019_PIC01_Result_Query1

The result is exactly the same as in the first query. But this one is a little shorter than the first one. In the select you put all columns you need to distinguish the totals. In our case we have the month and the turnover total. The Group By is extended by ROLLUP ( …. ). Within the brackets you put one or more expressions. They have to be consistent to the SELECT clause. The ROLLUP clause is creating rollups along some kind of dimension. In this dimensions the levels are defined through the expressions. Totals over all is added automatically. This means in our case that we just put the expression for the month (the same as in the select) in the ROLLUP and the total over all is added automatically.

Now we want to create a more complex example. We are adding totals for the quarter and the year. That makes four different totals (month, quarter, year, query). Our query looks like this one:

SELECT  to_char(DAY,’YYYY’) AS YEAR,
             ‘Q’||to_char(DAY,’Q/YYYY’) AS QUARTER,
             to_char(DAY,’MM/YYYY’) AS MONTH,
             SUM(TURNOVER) AS TURNOVER
FROM TBL_TEST
GROUP BY ROLLUP ( to_char(DAY,’YYYY’),
                                ‘Q’||to_char(DAY,’Q/YYYY’),
                                 to_char(DAY,’MM/YYYY’)
)

Result:

BLOG_0019_PIC02_Result_Query3

For each total we added one expression to define the level. The ROLLUP clause is processed from the right to the left, i.e. first all columns will be grouped by, then the very right one is removed and the group by only consists of quarter and year, then the next one is removed from the right and so on. In that way all the sums are created.

Sometimes you have the need to display two columns for a level cause they belong logically together, e.g. product no and product name. If you just add both to the ROLLUP you receive two total lines for the product. What you can do is to put these two columns in brackets within the ROLLUP clause like this: ROLLUP( bla, (Product_No, Product_Name), … )

We call this composite columns. If they are put in brackets they behave as one total line.

The ROLLUP is available on Oracle, MS SQL Server, DB2, Postgres, etc. Unfortunately it’s not available on MySQL. I hope you enjoyed today’s lesson. The examples used in this article can be found at the very end of this article as zip-file. I will continue this series soon and we will then deal with the CUBE statement.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training on my corporate homepage. At the moment my open courses are only available in German. Inhouse training and online courses are also available in English language. So check out my open trainings (German) or my online courses (English)!

Download

BLOG_0019_GroupBy_Ext_ROLLUP_examples.zip

Share and Enjoy: Diese Icons verlinken auf Bookmark Dienste bei denen Nutzer neue Inhalte finden und mit anderen teilen können.
  • LinkedIn
  • XING
  • Facebook
  • Google Bookmarks
  • Twitter

String Comparison Differences ( MySQL vs. Oracle)

When comparing strings in MySQL you find out that something is different compared with Oracle. Basicly said it’s that Oracle is case-sensitive when comparing strings and MySQL is not. Let’s say you have a table TBL_TEST that looks like this:

TBL_TEST
ID Name City
1 Schmidt kiel
2 Maier KIEL

 

Given the following SQL:

SELECT *
FROM  TBL_TEST
WHERE City=‘kiel‘;

In MySQL you receive both rows as the string comparison is not case-sensitive. In Oracle and MS SQL you would receive only the first row. It is the same if you use the LIKE operator.

So how will you accomplish a case sensitive comparison in MySQL then?

The trick is to use the operator binary in your equation. That converts a normal string into a binary string and the result is that the strings are compared by character code rather than by character. The SQL changes like this:

SELECT *
FROM  TBL_TEST
WHERE City=binary ‘kiel‘;

Now in MySQL you also only receive the first row. The binary operator can be placed on both sides, the effect is always the same. By having one binary string in your equation the strings are compared on byte level.

I hope this helps and explains this behaviour a little bit.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training on my corporate homepage. At the moment my open courses are only available in German. Inhouse training and online courses are also available in English language. So check out my open trainings (German) or my online courses (English)!

Share and Enjoy: Diese Icons verlinken auf Bookmark Dienste bei denen Nutzer neue Inhalte finden und mit anderen teilen können.
  • LinkedIn
  • XING
  • Facebook
  • Google Bookmarks
  • Twitter