Tag Archive for SQL

Analytical Functions IV: RANGE vs. ROWS

Hi,

today’s article is the fourth part of my tutorial on analytical functions. This time I will deal with the differences between RANGE and ROWS windows. We already learned about ROWS windows in Part I of this tutorial. Today we will take a closer look at RANGE windows and how they differ from ROWS.

I’ve prepared some test data (you can download it from the download section). The data looks like this:

[Image: BLOG_0031_PIC01_Result_Query]

It contains a DAY column running from 01/01/2017 to 30/04/2017 and PRODUCT_NO values from 1 to 3.

Now let’s take a look at the first example. We will create a query that calculates the monthly sums of the turnover and then compares the previous month with the current one. We could do that, for example, with window functions.

SELECT MONTH,
       TURNOVER AS CUR_MONTH,
       SUM(TURNOVER) OVER (ORDER BY MONTH
                           ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
                          ) AS PREV_MONTH
FROM (
      SELECT DISTINCT
             to_char(DAY, 'YYYY-MM') AS MONTH,
             SUM(TURNOVER) OVER (PARTITION BY to_char(DAY, 'YYYY-MM')
                                ) AS TURNOVER
      FROM tbl_test
     )

What we are doing here is calculating the monthly values first. Then we work with a ROWS window, which means that for each row returned by the subquery a separate window is calculated, defined in this case as ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING. That means we are simply accessing the previous row. For that it is important that the data is sorted first, which is done by ORDER BY MONTH.

ROWS also means that the window definition is based on physical rows. That’s why we need the subquery: first we have to create the monthly rows “physically”, then we can work with them. If we skip this step we get strange results, because the physical level would still be the day level, so the window would access and show daily data.
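Just to illustrate the problem, here is a quick sketch of the naive attempt without the subquery. Since the physical rows are still daily rows, the window only reaches back one daily record in the sort order, not one month:

SELECT to_char(DAY, 'YYYY-MM') AS MONTH,
       TURNOVER,
       SUM(TURNOVER) OVER (ORDER BY DAY
                           ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
                          ) AS PREV_ROW   -- just the preceding daily record, not the previous month
FROM tbl_test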

Now, if we use RANGE instead of ROWS, we don’t need the subquery at all, i.e. we get the same result with this:

SELECT DISTINCT
       to_char(DAY, 'YYYY-MM') AS MONTH,
       SUM(TURNOVER) OVER (PARTITION BY to_char(DAY, 'YYYY-MM')
                          ) AS CUR_MONTH,
       SUM(TURNOVER) OVER (ORDER BY to_number(to_char(DAY, 'YYYYMM'))
                           RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING
                          ) AS PREV_MONTH
FROM tbl_test
 

Test it yourself; the results are the same. There are two differences: we are using RANGE instead of ROWS, and we don’t have a subquery. We don’t need the subquery any longer because RANGE windows are based on logical blocks, i.e. on logical value changes. When you take a closer look at the ORDER BY clause in the last example you will notice that we are not using to_char(…) but to_number(to_char(…)).

We do that because RANGE windows only work with dates or numbers in the ORDER BY clause. The expression used there defines the logical value change: all rows with the same value of that expression are treated as one block. They are summed up first (because we are using SUM; that’s what we did manually with the subquery in the first example) and only afterwards is the window applied. If we go one block back we step back one logical value change, which in our case means one month and not one row.

I know that is a little difficult to understand. In short: ROWS means we define our window based on physical rows, RANGE means we define it based on logical value changes of the ORDER BY expression.

That also leads to another point. If you use ROWS and define your window on physical rows, the database doesn’t care whether the previous month is actually available. If we look at April and March is missing, February is taken instead. With RANGE it is different: for April a NULL would be shown, because its previous month, March, has no value, and that is the correct behavior.
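If you want to see both behaviors side by side, here is a small sketch that combines the two approaches on the pre-aggregated monthly rows (MONTH_NO is just an alias I introduce here; the numeric month key works because all test data lies within 2017, across a year boundary 1 PRECEDING would not reach the previous December):

SELECT MONTH,
       TURNOVER AS CUR_MONTH,
       SUM(TURNOVER) OVER (ORDER BY MONTH_NO
                           ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
                          ) AS PREV_BY_ROWS,
       SUM(TURNOVER) OVER (ORDER BY MONTH_NO
                           RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING
                          ) AS PREV_BY_RANGE
FROM (
      SELECT DISTINCT
             to_char(DAY, 'YYYY-MM') AS MONTH,
             to_number(to_char(DAY, 'YYYYMM')) AS MONTH_NO,
             SUM(TURNOVER) OVER (PARTITION BY to_char(DAY, 'YYYY-MM')) AS TURNOVER
      FROM tbl_test
     )

With the complete test data both PREV columns return the same values; as soon as a month is missing, PREV_BY_ROWS still shows the month before the gap while PREV_BY_RANGE shows NULL.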

This was just a small introduction to RANGE windows. If you are interested in further information, just check out my BLOG; maybe I will write another article on the topic with more examples. Or take a look at my new SQL book; I’ve added a free excerpt to this BLOG as well.

You can find the example code for this article in the download section.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training via my corporate homepage. At the moment my open courses are only available in German. In-house training and online courses are also available in English. So check out my trainings (German) or my online courses (English)!

Download

BLOG_0031_ROWS_RANGE_Examples.zip

Oracle External Tables

Hi,

In today’s article I want to explain how to use Oracle’s external tables to access flat files in the file system. Since Oracle 9i it has been possible to make files in the file system accessible inside the database. Those files appear as normal tables within the database and can be accessed in read-only mode. You cannot write to those files or create indexes on these tables, but you can read data from them and join them with other tables.

My example file (cities.txt) looks like this:

[Image: BLOG_0030_PIC01_Example_File]

First your administrator (or you) has to create a directory in the file system and make it known to the database. Then the user who is supposed to access the external table needs read and write rights on this directory. The following two statements do both:

CREATE OR REPLACE DIRECTORY ext_file_data AS 'c:\';

GRANT read, write ON DIRECTORY ext_file_data TO test;

With these two lines the user TEST will receive read and write access to c:\.

After you’ve created the directory you can define a table with four columns. When you create the table you have to add the external table clauses, and the statement looks like this:

CREATE TABLE tbl_cities_ext (
  city_name     VARCHAR2(25),
  population    NUMBER,
  country_name  VARCHAR2(25),
  country_code  VARCHAR2(3)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY ext_file_data
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
    (
        city_name     CHAR(25),
        population    CHAR(10),
        country_name  CHAR(25),
        country_code  CHAR(3)
    )
  )
  LOCATION ('cities.txt')
)
REJECT LIMIT UNLIMITED;

You can define a lot of things in the external table definition. Normally you use the ORACLE_LOADER driver (as above), but it’s also possible to use the data pump driver. With DEFAULT DIRECTORY you define the directory in which the files are located. Then you can define several access parameters, which describe how the data is stored in the files. In my case records are delimited by a new line, fields are terminated by ',' and missing field values should be NULL. Then we have the column list: for each column within the file there is a column in the external table definition. You could also specify fixed column lengths, column positions and so on. The file names themselves go into LOCATION; you can also list more than one file name here, separated by commas. This is just a brief introduction. If you need further information, just google for Oracle external tables; you will find a lot of tutorials.

Once everything is set up like this you can access the table with normal SQL operations. This technique can, for example, be used to import data into the database.
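For example, a simple query like this reads the cities directly from the file and sorts them by population (nothing special is needed, it is plain SQL against the external table defined above):

SELECT city_name,
       country_name,
       population
FROM tbl_cities_ext
ORDER BY population DESC;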

You can find the example code in the download section of this article.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training via my corporate homepage. At the moment my open courses are only available in German. In-house training and online courses are also available in English. So check out my open trainings (German) or my online courses (English)!

Downloads

BLOG_0030_External_Tables_Examples.zip

How to create a database trigger

Hi,

In today’s article I will explain how you can create a trigger and what you can use it for. A trigger is a procedure within the database that is executed automatically when an event takes place. I will focus on table triggers, where you have three events: INSERT, UPDATE and DELETE. That means when a row is inserted an event takes place, and if you have a trigger defined for this event it is executed. What do you need this for? You can use it for several things, for example to fill some columns based on other columns or to log something in another table.

In the following example I will create two tables: one with the data itself and one that acts as a logging table. Whenever I insert a new row into the main table, a row with a timestamp and the performed action will also be added to the logging table.

First we need the two tables:

CREATE TABLE TBL_TRIGGER_TEST(
    ROW_ID NUMBER,
    ROW_VALUE NUMBER,
    ROW_COMMENT VARCHAR2(1000)
);
 
CREATE TABLE TBL_TRIGGER_TEST_LOG(
    ROW_ID NUMBER,
    ROW_VALUE NUMBER,
    ROW_COMMENT VARCHAR2(1000),
    ROW_TS DATE,
    ROW_ACTION VARCHAR2(1)
);

The table TBL_TRIGGER_TEST is the main table. We have a ROW_ID, a ROW_VALUE and a ROW_COMMENT, just to have some columns. In the table TBL_TRIGGER_TEST_LOG we will log all actions on the main table. When a new record is inserted we will write a copy of it, together with date and time, to the logging table. In the ROW_ACTION column we will store an 'I' (= INSERT).

Now we need the trigger:

CREATE OR REPLACE TRIGGER TRG_TEST_INS BEFORE INSERT
ON TBL_TRIGGER_TEST FOR EACH ROW
BEGIN
  INSERT INTO TBL_TRIGGER_TEST_LOG
  VALUES(:new.ROW_ID, :new.ROW_VALUE, :new.ROW_COMMENT, sysdate, 'I');
END;

A trigger is created via the CREATE OR REPLACE TRIGGER statement; if a trigger with the same name already exists it is overwritten. The next important keyword is BEFORE. It defines whether the trigger is executed before or after the DML operation takes place; for the latter you simply write AFTER instead of BEFORE. INSERT is the kind of DML event; you can choose between INSERT, DELETE and UPDATE, and for UPDATE you can additionally specify a column. With ON followed by the table name you define the table on which this trigger is implemented. The FOR EACH ROW option specifies that the trigger is executed for each inserted row and not only once per statement.

After BEGIN the code itself starts. We are just inserting a new row into the logging table. The first three columns are filled with the values that are about to be inserted into the main table; you can access them via :new, which behaves like a pseudo row. Besides :new you can use :old to access the old values, which is important when you delete or update a row.
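Just as a quick sketch of how :old is used, here is an update trigger that logs the values as they were before the change (the trigger name TRG_TEST_UPD and the 'U' action flag are simply my own choices, not part of the example files):

CREATE OR REPLACE TRIGGER TRG_TEST_UPD BEFORE UPDATE
ON TBL_TRIGGER_TEST FOR EACH ROW
BEGIN
  -- log the row as it looked BEFORE the update, flagged with 'U'
  INSERT INTO TBL_TRIGGER_TEST_LOG
  VALUES(:old.ROW_ID, :old.ROW_VALUE, :old.ROW_COMMENT, sysdate, 'U');
END;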

You are not allowed to put a COMMIT into a trigger. The trigger’s work is committed by the transaction that inserted the row into the main table, so a COMMIT is not needed here.

Now you can test the trigger. Just insert a new row into the main table and commit it. Afterwards, query the logging table: you should find an exact copy of the row you’ve inserted into the main table, and the timestamp should match the date and time of the insert.
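For example (the inserted values are of course arbitrary):

INSERT INTO TBL_TRIGGER_TEST VALUES (1, 100, 'first test row');
COMMIT;

SELECT * FROM TBL_TRIGGER_TEST_LOG;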

As a task you can try to implement a trigger for the DELETE event: when a row is deleted you want a copy of it in the logging table, with a 'D' in the action flag. The examples and the solution to this task can be found in the download area. Have fun!

If you are interested in this or other advanced Cognos/SQL topics you can also attend training via my corporate homepage. At the moment my open courses are only available in German. In-house training and online courses are also available in English. So check out my open trainings (German) or my online courses (English)!

Downloads

BLOG_0029_Create_Trigger_examples_&_task.zip

How to generate a list of dates

Hi,

it’s been a long time since my last article in this BLOG, but from today on I want to reactivate it. I’ll start with an article on how to generate a list of rows, e.g. a sequential list of dates between two given dates (say, 1st of January 2015 and 31st of December 2015). In that case you need to generate 365 rows, each incremented by one day.

To do that in Oracle SQL you can use the concept of hierarchical queries, more precisely the CONNECT BY clause, which normally tells the DBMS how the different levels of a hierarchy are connected (i.e. via a child and a parent key). In our case we use it to generate a new row for each level, and the level serves as a counter for each row: every row is connected to a parent until we reach a hierarchy depth of, e.g., 365. That’s why this trick works here.

Here comes the code:

SELECT (to_date('01.01.2015', 'DD.MM.YYYY') + (LEVEL - 1)) AS GEN_DATE
FROM DUAL
CONNECT BY LEVEL <= to_date('31.12.2015', 'DD.MM.YYYY') - to_date('01.01.2015', 'DD.MM.YYYY') + 1;

With the expression to_date('31.12.2015', 'DD.MM.YYYY') - to_date('01.01.2015', 'DD.MM.YYYY') we tell the database how many rows we need: subtracting two dates gives the difference in days, and we add one, otherwise the end date would be missing. In the SELECT we take the start date and add the level. As LEVEL counts up by one with each new row, we add one more day in each row (adding a number to a date simply adds that many days). In this way we generate the complete list of dates between the two given dates.
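If the LEVEL mechanism itself feels unfamiliar, this minimal query shows it in isolation; it simply returns the numbers 1 to 10, one per generated row:

SELECT LEVEL AS N
FROM DUAL
CONNECT BY LEVEL <= 10;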

If you are interested in this or other advanced Cognos/SQL topics you can also attend training via my corporate homepage. At the moment my open courses are only available in German. In-house training and online courses are also available in English. So check out my open trainings (German) or my online courses (English)!

Inner joins explained

In today’s issue I want to explain the different options for combining data from multiple tables (joins) in Oracle. There are several options; the following are the most common ones:

  • Inner join / equi join (this article)
  • Left/right outer join
  • Cross Join / cross product

Let’s say you have two tables with data:

ID  Name       Age
1   Gaussling  32
2   Smith      45
3   Meier      25

Table1: TBL_CUSTOMER

Date        Customer_ID  Turnover
2013-12-01  1            100 €
2013-12-02  1             50 €
2013-12-02  2            200 €
2013-12-03  3             75 €

Table2: TBL_SALES

The most common case is the inner join. In the above example we may want to know how much turnover we made with each customer. For this we can use an inner join:

SELECT c.Name, SUM(t.Turnover)
FROM TBL_CUSTOMER c JOIN TBL_SALES t ON c.ID=t.Customer_ID
GROUP BY c.Name

Conceptually, the database goes through the TBL_SALES table and for each row checks in TBL_CUSTOMER whether there is an ID matching the Customer_ID of that row. If so, it connects that row of TBL_CUSTOMER with the row of TBL_SALES. If it finds multiple rows with ID = 1 in TBL_CUSTOMER, it combines each of these rows with the matching TBL_SALES row.

The syntax is pretty easy; the relevant keywords are JOIN and ON. To join two tables you just write the keyword JOIN between them. After the second table you write the keyword ON followed by the join condition, which tells the database on which columns of the two tables the join is to be performed. You can also join over two or more columns; the additional conditions are combined with AND (a two-column join is sketched a bit further below).

In the above example we want to join the two tables over the ID and Customer_ID columns, so we just write … ON c.ID=t.Customer_ID. After the two tables have been combined, the results are aggregated.
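Just to sketch the multi-column case mentioned above (Company_No is a made-up additional key column that does not exist in the example tables):

SELECT c.Name, SUM(t.Turnover)
FROM TBL_CUSTOMER c JOIN TBL_SALES t ON c.ID=t.Customer_ID
                                    AND c.Company_No=t.Company_No
GROUP BY c.Name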

Now imagine the Customer table looks like this (because of an error or whatever):

ID  Name        Age
1   Gaussling   32
2   Smith       45
3   Meier       25
2   Gaussling2  33

Table1: TBL_CUSTOMER

Our SQL looks like this:

SELECT *
FROM TBL_CUSTOMER c JOIN TBL_SALES t ON c.ID=t.Customer_ID

Now the result set would look like this:

ID  Name        Age  Date        Customer_ID  Turnover
1   Gaussling   32   2013-12-01  1            100 €
1   Gaussling   32   2013-12-02  1             50 €
2   Smith       45   2013-12-02  2            200 €
3   Meier       25   2013-12-03  3             75 €
2   Gaussling2  33   2013-12-02  2            200 €

The last row shows the TBL_SALES record that is now duplicated, because TBL_CUSTOMER contains two rows with ID = 2.

I hope it has become a little clearer what an inner join is and how it works, and why some strange results (with duplicated rows) might occur. In future articles I will also explain the other join types.

If you are interested in this or other advanced Cognos/SQL topics you can also attend training via my corporate homepage. At the moment my open courses are only available in German. In-house training and online courses are also available in English. So check out my open trainings (German) or my online courses (English)!