Category Archives: dwh-consult

Bulk Load vs Direct Path Load

Overview about the main differences between direct path loads and bulk loads.
Direct path load (insert /*+append*/ as select…)

  • very fast (especially if no indexes on the target table)
  • always loads ABOVE the high water mark, so any existing free space not reclaimed
  • locks the object
  • you must commit/rollback at completion

Bulk Load (bulk collect into/forall insert)

  • fast (better than row at a time), but not as fast as direct load
  • can utilise existing free space
  • does not lock the object
  • additional work can be done in same transaction

Found by asktom.

ORACLE – redo-log by using append-hint, nologging and archive-log-mode

Here a very good overview about the combinations when Redo-Log is generated or not for inserts with and without logging/append-hint/archive-log-mode. Found by asktom.

Check DB for archive-log-mode turned on or off:
select log_mode from v$database;

Check for logging-mode for table:
select logging from all_tables
where owner = '<USER>'
and table_name = '<TABLE_NAME>';

Example for Direct-Path-Insert with append-hint:
insert into '<TABLE_NAME>' /*+ append */
select * from all_objects;


TableMode  Insert Mode ArchiveLogMode  result
---------- ----------- --------------- ----------
LOGGING    APPEND      ARCHIVE LOG     redogenerated
NOLOGGING  APPEND      ARCHIVE LOG     no redo
LOGGING    no append   ""              redogenerated
NOLOGGING  no append   ""              redogenerated
LOGGING    APPEND      noarchive       no redo
NOLOGGING  APPEND      noarchive       no redo
LOGGING    no append   noarchive       redogenerated
NOLOGGING  no append   noarchive       redogenerated

If “FORCE LOGGING” is used, than of course redo is generated!

Optimizer-Modes ALL_ROWS vs FIRST_ROWS

FIRST_ROWS and ALL_ROWS are cost based optimizer features.

Overview:

  • Default: ALL_ROWS
  • Change: alter [system|session] OPTIMIZER_MODE = [ALL_ROWS|FIRST_ROWS|FIRST_ROWS_1|FIRST_ROWS_10|FIRST_ROWS_100|FIRST_ROWS_1000]
  • Check: SHOW PARAMETER OPTIMIZER_MODE (with dba-role)
  • SQL – Hint for retrieving a few rows: /*+ FIRST_ROWS*/

 

http://docs.oracle.com/cd/E18283_01/server.112/e17110/initparams167.htm

first_rows_n
The optimizer uses a cost-based approach and optimizes with a goal of best response time to return the first n rows (where n = 1, 10, 100, 1000).

first_rows
The optimizer uses a mix of costs and heuristics to find a best plan for fast delivery of the first few rows.

all_rows
The optimizer uses a cost-based approach for all SQL statements in the session and optimizes with a goal of best throughput (minimum resource use to complete the entire statement).


http://oracle-online-help.blogspot.de/2007/03/optimizermode-allrows-or-firstrows.html

Foreign Key Constraints in a ORACLE DWH

There are several possibilities to have Foreign Key Constraints in a ORACLE-DWH-Environment.

ENABLE VALIDATE
ALTER TABLE sales ADD CONSTRAINT sales_time_fk
FOREIGN KEY (time_id) REFERENCES times (time_id)
ENABLE VALIDATE;

Enabled and validated means on every commit the constraint is validated to be correct.

ENABLE NOVALIDATE
ALTER TABLE sales ADD CONSTRAINT sales_time_fk
FOREIGN KEY (time_id) REFERENCES times (time_id)
ENABLE NOVALIDATE;

You can use this option if the tables are not correct corresponding to the constraint, but you need to have a enforced constraint for future inserts.
That means the current state of the existing data is not validated but you can’t insert data in the child-table without having a parent-key in the parent table.
That means that the constraint is checked on inserts/updates/deletes.

RELY DISABLE NOVALIDATE
ALTER TABLE sales ADD CONSTRAINT sales_time_fk
FOREIGN KEY (time_id) REFERENCES times (time_id)
RELY DISABLE NOVALIDATE;

This constraint is NOT used for data validation. If you know data is correct by running the ETLs than you can use this option.
DWH-Tools can use this info for retrieving information on this constraint. And advanced query-rewriting can be possible for materialized views.

  • No data validation -> no cpu-consumption.
  • No data validation -> inexpensive and no overhead on running dml.

More Infos can be found in the
ORACLE Database Data Warehousing Guide 11g or
ORACLE Database Data Warehousing Guide 12c.

Data Vault Modeling

Datamodelling the core-dwh

Bill Inmon: Corporate Information Factory,
Ralph Kimball: Dimensional Modeling
Dan Linstedt: Data Vault

Data Vault Modeling:
New Modeling Method for the Core Enterprise Data Warehouse including the complete history.
This technique allows to follow a complete agile approach.

Here a few links for more information about Data Vault:

English:
Data Vault Modeling specification (DanLinstedt)
Data Vault Loading Specification v1.2 (DanLinstedt)

Data Vault Modeling guide (Hans Hultgren 2012)

A very good overview and real-life-example from Source to Raw/Information-Mart by Kent Graziano:
Part 1: Agile Modeling: Not an Option Anymore
Part 2: Data Vault 2.0 Modeling Basics
Part 3: The Business Data Vault
Part 4: Building an Information Mart With Your Data Vault

Data Matters (Unseen)

Modeling One Model for persistence and access with Data Vault

Web Archive for MakingDataMeaningful: Data Vault: Hubs, Links, and Satellites With Associated Loading Patterns

Forum:

Forum: dvforums.danlinstedt.com (not active anymore)

German:
Core DWH Modellierung: Vergleich Data Vault und konsolidierte Ankermodellierung (nicht Anchormodelling!)

Anker – Modellierung im Core Data Warehouse

Additional resources:
www.datavaultmodeling.de

danlinstedt.com

Visual Data Vault – DV-extension for Microsoft Visio

www.dwh42.de

roelantvos.com: DWH-Automatisation

kentgraziano.com: The Data Warrior

Scalefree: BIG DATA CONSULTING

 

PL/SQL-Function for SVN-Revision

If you are using Subversion-Keywords in you PLSQL-Project you can automatically get the info for all these Keywords for Packages, Procedures and Functions.

Just use a function get_svn_info to find out, which version, date and so on are located actually in you database:

Funcion-Declaration for the Package-Header:

/**
*
* Gibt die Subversion-Informationen des Package-Bodies zurueck
* @is_type mögliche Ausprägungen:
* 'version','url','autor','datum','all'
* default: 'version'
* @return angeforderte Subversion-Info des Package Bodies
*/
FUNCTION  get_svn_Info
(is_type                     IN VARCHAR2 DEFAULT 'version')
RETURN VARCHAR2;

 

Record-declaration in the Package-Body:

TYPE typ_svn_info IS RECORD (
version  VARCHAR2 (256) := '$Rev: 209690 $',
autor    VARCHAR2 (256) := '$Author: authorname $',
datum    VARCHAR2 (256) := '$Date: 2016-02-09 09:32:47 +0100 (Di, 09. Feb 2016) $',
url      VARCHAR2 (256) := '$URL: svn://dir1/oracle/trunk/myschema/MYSCHEMA.MYPACKAGE.packbody $'
);

And the function itself in the Package-Body:

/**
*
* Gibt die Subversion-Informationen des Package-Bodies zurueck
* @is_type mögliche Ausprägungen:
* 'version' oder 'revision','url','autor','datum','name','all'
* default: 'version'
* @return angeforderte Subversion-Info des Package Bodies
*/
FUNCTION  get_svn_Info
(is_type                     IN VARCHAR2 DEFAULT 'version')
RETURN VARCHAR2
is

c_convert   varchar2 (4000);
infos       typ_svn_info;

begin

case lower(is_type)
when 'version' then     c_convert := infos.version;
when 'revision' then    c_convert := infos.version;
when 'url' then         c_convert := infos.url;
when 'autor' then       c_convert := infos.autor;
when 'datum' then       c_convert := infos.datum;
when 'name'  then       c_convert := substr (infos.url,instr(infos.url,'/',-1)+1); -- name = part after the last slash
when 'all' then
c_convert := get_svn_Info('url')
||' Ver.'||get_svn_Info('version')
||' von:'||upper(get_svn_Info('autor'))
||' letzte Aenderung:'||get_svn_Info('datum')
;
return c_convert;
else
return 'Parameter existiert nicht! (all, autor, url, datum, version, name)';
end case;

c_convert := replace(c_convert,'$','');
c_convert := trim(substr(c_convert,instr(c_convert,':',1,1)+1));
return c_convert;
exception
when others then return '-1';
end get_svn_info;

Examplecall:

set serveroutput on
exec dbms_output.put_line(myschema.mypackage.get_svn_Info('name')||' '||myschema.mypackage.get_svn_Info('Revision')||' vom '||myschema.mypackage.get_svn_Info('datum')||' ('||myschema.mypackage.get_svn_Info('autor')||')');

Result:

myschema.my_package.packbody 209690 vom 2016-02-09 09:32:47 +0100 (Di, 09. Feb 2016) (authorname)

Regular Expression RegEx Example for use in notepad ++ plus plus

For the generation of HTML-Doc out of the PLSQL-Sourcecode with PLDOC I had the problem to remove the trailing colon ‘:’ at the end of listing parameters.

Examplecode:

/**
*This is the comment about the functionality of this function,<br/>
* which is of cours incredible<br/>
*<br/>
*@param para_1:
* This is the long desription for parameter para_1…<br/>
* And this means:<br/>
* – everything is as fine<br/>
* – as it could be<br/>
*<br/>
*@param para_2:
* This is the long desription for parameter para_1…<br/>
* And this means:<br/>
* – everything is as fine<br/>
* – as it could be<br/>
*@return somethin beautiful
*/
FUNCTION set_something
( para_1 IN VARCHAR2
, para_2 IN NUMBER
)
RETURN NUMBER;

Solution:
The solution is inside the replace-gui to search for:
(.*\*.*@.*)(:)(.*)
and replace it with
$1$3

Explanation:
$1 represents the matched char inside the first pair of brackets.
All chars before the colon. (.*\*.*@.*)
$2 represents the matched char inside the first pair of brackets.
The colon itself. (:)
$3 represents the matched char inside the first pair of brackets.
All chars after the colon. (.*)
And so we replace it only with the first and the third part.
The second part ($2) means the colon itself we don’t care about or better we want to delete it…

Notepad++ Language-file for PLSQL

Notepad++ is one of the best free Text-Editors.

Here you can download a file for the Syntax-Highlightning of PLSQL here local (rename userDefineLang_xml.txt to userDefineLang.xml) or on the original site:
http://notepad-plus.sourceforge.net/commun/userDefinedLang/userDefineLang_plsql.xml

An Overview of all available Language-Files is here:
http://docs.notepad-plus-plus.org/index.php?title=User_Defined_Language_Files

This file must be located (in Windows 7) in your Profile in the Notepad++-Dir besides the normal Language-File langs.xml and must be named userDefineLang.xml.

PLDOC

PLDOC is a generator which generates HTML-Docs in the format of JAVA-Doc for PL/SQL or PLSQL-Code like Packages, Procedures an Functions.
It is possible to add comments and explanations for parameter, returnvalues, exceptions and variables.

Projectsite: http://pldoc.sourceforge.net/maven-site/
Download: http://pldoc.sourceforge.net/maven-site/downloads.html
Documentation/User Guide: http://pldoc.sourceforge.net/maven-site/docs/Users_Guide/index.html


 

Example for Adding Comment to a Package(Header or Body)
CREATE OR REPLACE PACKAGE myschema.packagename
IS
/**
* <pre>Dieses Package steuert den Ablauf der Befüllung des DWHs.
*
* URL: $UR$
* Revision: $Rev$
* Author: $Author$
* Date: $Date$
* ID: $Id$
*
* MODIFICATION HISTORY
* Person Date Comments<br/>
* --------- ---------- ----------------------------------------
* Name001 01.01.2016 - Neuerstellung.
* Name002 01.02.2016 - PLDOC-Formattierung der Kommentare
* - SVN-Keywords
* </pre>
* @headcom
*/


 

Inside of the tag <pre></pre> there is no need for the line-break-tags (<br/>)

Example for Adding Comment to a Procedure/Function
/** This is a comment shown til the dot at the end of the comment or an At Sign in the next line.
* @param id This is the description of parameter "id"
* @return This is the description of the returnvalue
* @throws NO_PARTNER_FOUND This is the description of the Exception thrown
*/
FUNCTION check_object(
id IN VARCHAR2
) RETURN NUMBER;

There must be no colon (‘:’) after the name of the parameter behind @param!
This is wrong:
* @param id: This is the description of parameter "id"

You can remove the colons for example with notepad++ like described here.


 

Example-Call on Windows to generate the PL-DOC:
call pldoc.bat -doctitle 'Special Database Project Name' -d DirectoryInWhichTheDocWillBeStored -inputencoding ISO-8859-15 --overview OverviewToBeIncludedAutomatically.html SourceDirectory/*.sql SourceDirectory/*.pkb SourceDirectory/*.pks


 

Alternative doc-generators for PLSQL to evaluate can be:

 

Subversion SVN: Set Keyword-Properties for existing files

If you want to use in an already existing Project the keyword-substitution in Subversin as described here: Subversion Keywords (Revision,Author,URL,Date,ID) you have to set the properties to substitute these keywords for all existing files, too.

This can be done e.g. in Tortoise SVN in the contextmenu-properties-new
Just check in the needed Keywords an click OK.
You can the properties for every single file or recursively for an directory:

Subversion - TortiseSVN set Properties-Dialog
Subversion – TortiseSVN set Properties-Dialog

After setting this properties, alle the files have to be checked in!

More information can be found here.