
Teradata Columnar
The purpose of a Columnar (NoPI) table is to spread the rows evenly across the AMPs, which is why a NoPI table is often used as a staging table.

Columnar Table Fundamentals
  • Columnar Tables must be NoPI Tables, meaning they have No Primary Index (NoPI).
  • The NoPI design brings even distribution to the table.
  • Columnar Tables allow Columns to be Partitioned.
  • An AMP still holds the entire row, but partitions it vertically.
  • Columns are placed inside their own individual Container.
  • All Containers have the same number of rows in the exact same order.
  • Single Columns or Multi-Columns can be placed inside containers.
  • Each container looks like a small table for I/O purposes.
  • Add up all the containers and you rebuild the row.
  • Columnar Tables make sense when users query only certain columns.
  • When a row is deleted, it is NOT physically deleted but marked as deleted.


Normal table vs. columnar table
The two tables pictured here contain the same Employee data, but one is a columnar table. Employee_Normal has placed 3 rows on each AMP, each row with 5 columns. The other table, Employee_Columnar, has 5 Containers, each holding one column.
[Image: Teradata Columnar containers]


Example 
CREATE Table Employee
(
  Emp_Id        Integer
 ,Dept_Id       Integer
 ,First_Name    Varchar(20)
 ,Last_Name     Char(20)
 ,Salary        Decimal (10,2)
)
No Primary Index
PARTITION BY COLUMN;


Columnar table
[Image: Teradata Columnar AMP]



  • This AMP is assigned 3 Employee Rows.
  • Every other AMP also holds 3 different Employee Rows.
  • Each Row has 5 Columns.
  • This Columnar Table partitions into 5 separate containers.
  • Each container row has a relative row number (1, 2, 3).
  • Each container has the exact same number of rows.


NoPI Table Capabilities:
  • Are always Multi-Set Tables
  • Can have Secondary Indexes (USI or NUSI)
  • Can have Join Indexes
  • Can be Volatile or Global Temporary Tables
  • Can have COLLECT STATISTICS
  • Can be FALLBACK Protected
  • Can have Triggers
  • Can contain Large Objects (LOBs)
  • Can have Primary Key / Foreign Key Constraints


NoPI Table Restrictions
  • No Primary Indexes allowed
  • No SET Tables
  • No Partitioned Primary Index (PPI) tables
  • No Queue Tables
  • No Hash Indexes
  • No Identity Columns
  • No Permanent Journaling
  • Can't be the Target Table for any UPDATE, UPSERT or MERGE-INTO Statements



Examples of Columnar tables

Multi-Columnar

CREATE Table Employee
(
  Emp_Id        Integer
 ,Dept_Id       Integer
 ,First_Name    Varchar(20)
 ,Last_Name     Char(20)
 ,Salary        Decimal (10,2)
)
No Primary Index
PARTITION BY COLUMN
(Emp_Id
,Dept_Id
,(First_Name, Last_Name, Salary));

Row Hybrid Columnar

CREATE Table Employee
(
  Emp_Id        Integer
 ,Dept_Id       Integer
 ,First_Name    Varchar(20)
 ,Last_Name     Char(20)
 ,Salary        Decimal (10,2)
)
No Primary Index
PARTITION BY COLUMN
(Emp_Id No Auto Compress
,Dept_Id
,ROW(First_Name, Last_Name, Salary)
No Auto Compress);

Columnar Partitions

CREATE Table Order_table_PPI_Col
(Order_no integer not null
,Customer_no integer
,Order_date date
,Order_total decimal(10,2)
)
NO PRIMARY INDEX
PARTITION BY (Column
,Range_N(Order_date BETWEEN DATE '2015-01-01' AND DATE '2015-12-31'
EACH INTERVAL '1' Month));
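
Because each column lives in its own container, a query that references only a few columns reads only those containers. A minimal sketch against the Employee columnar table defined earlier (the filter value is illustrative):

SELECT Emp_Id, Salary
FROM Employee
WHERE Salary > 50000.00;

Only the Emp_Id and Salary containers are scanned; the Dept_Id, First_Name, and Last_Name containers are never read.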


Creating Tables

Below are common table-creation types and the SQL for each:

Creating a Table with a Unique Primary Index
CREATE TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX (EMP_NO);

Creating a Table with a Non-Unique Primary Index
CREATE TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
PRIMARY INDEX(EMP_NO);

Creating a Table without entering a Primary Index
CREATE TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
);

Creating a Table with NO Primary Index
CREATE TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
NO PRIMARY INDEX;

Creating a Set Table
CREATE SET TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
PRIMARY INDEX(EMP_NO);

Creating a Multiset Table
CREATE MULTISET TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
PRIMARY INDEX(EMP_NO);

Creating a Set Table with a Unique Primary Index
CREATE SET TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO);

Creating a Set Table with a Unique Secondary Index
CREATE SET TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
PRIMARY INDEX(EMP_NO)
UNIQUE INDEX(DEPT_NO);

Creating a Table with an UPI and USI
CREATE SET TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO)
UNIQUE INDEX(DEPT_NO);

Creating a Table with a Multicolumn Primary Index
CREATE TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
PRIMARY INDEX(EMP_NO,DEPT_NO,SALARY);

Creating a Unique Secondary Index (USI) after a table is created
CREATE UNIQUE INDEX IDXNAME(EMP_NO) ON EMP_TABLE;

Creating a Value-Ordered NUSI
CREATE INDEX(DEPT_NO) ORDER BY VALUES ON EMP_TABLE;

Making an exact copy of a Table with data
CREATE TABLE EMP_TABLE_BACKUP AS EMP_TABLE WITH DATA;

Making an exact copy of a Table without data
CREATE TABLE EMP_TABLE_BACKUP AS EMP_TABLE WITH NO DATA;

Copying a Table
CREATE TABLE EMP_TABLE_BACKUP AS (SELECT * FROM EMP_TABLE) WITH DATA;

Copying only specific columns of a table
CREATE TABLE EMP_TABLE_BACKUP AS
(SELECT
EMP_NO
,DEPT_NO
,SALARY
FROM EMP_TABLE )
WITH DATA;

Copying a Table and Keeping the Statistics
CREATE TABLE EMP_TABLE_BACKUP AS EMP_TABLE WITH DATA AND STATS;

Copying a Table without data but with Statistics
CREATE TABLE EMP_TABLE_BACKUP AS EMP_TABLE WITH NO DATA AND STATS;

Creating a Table with Fallback
CREATE TABLE EMP_TABLE, FALLBACK
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO) ;

Creating a Table with No Fallback
CREATE TABLE EMP_TABLE, NO FALLBACK
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO) ;

Creating a Table with a Before Journal
CREATE TABLE EMP_TABLE
,NO FALLBACK
,BEFORE JOURNAL
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO) ;

Creating a Table with a Dual Before Journal
CREATE TABLE EMP_TABLE
,NO FALLBACK
,DUAL JOURNAL
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO) ;

Creating a Table with an After Journal
CREATE TABLE EMP_TABLE
,NO FALLBACK
,AFTER JOURNAL
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO) ;

Creating a Table with a Dual After Journal
CREATE TABLE EMP_TABLE
,NO FALLBACK
,DUAL AFTER JOURNAL
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO) ;

Creating a Table with a Journal Keyword Alone
CREATE TABLE EMP_TABLE
,NO FALLBACK
,JOURNAL
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO) ;

Creating a Table with Customization of the Data Block Size
CREATE TABLE EMP_TABLE
,FALLBACK
,DATABLOCKSIZE= 21248 BYTES
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO) ;

Creating a Table with Customization with FREESPACE Percent
CREATE TABLE EMP_TABLE
,FALLBACK
,DATABLOCKSIZE= 21248 BYTES
,FREESPACE = 20 PERCENT
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
UNIQUE PRIMARY INDEX(EMP_NO) ;

Creating a Columnar Table
CREATE TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
NO PRIMARY INDEX
PARTITION BY COLUMN
;

Creating a Columnar Table with Multi-Column Containers
CREATE TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
NO PRIMARY INDEX
PARTITION BY COLUMN (
EMP_NO,
DEPT_NO,
(FIRST_NAME, LAST_NAME, SALARY))
;

Creating a Columnar Table with a Row Hybrid
CREATE TABLE EMP_TABLE
(
EMP_NO INTEGER
,DEPT_NO INTEGER
,FIRST_NAME VARCHAR(20)
,LAST_NAME CHAR(20)
,SALARY DECIMAL(10,2)
,ADDRESS VARCHAR(100)
)
NO PRIMARY INDEX
PARTITION BY COLUMN (EMP_NO NO AUTO COMPRESS, DEPT_NO
,ROW(FIRST_NAME, LAST_NAME, SALARY) NO AUTO COMPRESS);

Creating a Columnar Table with both Row and Column Partitions
CREATE TABLE ORDER_TABLE
(
ORDER_NO INTEGER
,CUSTOMER_NO INTEGER
,ORDER_DATE DATE
,ORDER_TOTAL DECIMAL(10,2)
)
NO PRIMARY INDEX
PARTITION BY (COLUMN,
RANGE_N (ORDER_DATE BETWEEN DATE '2015-01-01' AND DATE '2015-06-30' EACH INTERVAL '1' MONTH));




TPT in Detail

TPT Syntax
Teradata Parallel Transporter supports the following types of SQL statements:

  • Data Manipulation Language (DML): Insert, Update, Delete, Upsert, Merge, and Select
  • Data Control Language (DCL): Give, Grant, and Revoke
  • Data Definition Language (DDL): Create, Drop, Alter, Modify, Delete Database, Delete User, and Rename

OS Commands
Use the OS Command operator to send commands to the operating system on the client system.

Delimited Data
Delimited data are variable-length text records with each field or column separated by one or more delimiter characters. Delimited data are also known as VARTEXT.
Use the Data Connector operator to read or write delimited data.

Large Object Data Types
There are two kinds of large object data types:
Character large object (CLOB)
Binary large object (BLOB)
Three operators support the CLOB and BLOB data types.
The Inserter operator can insert CLOB and BLOB data types into a Teradata table.
The Selector operator can export CLOB and BLOB data types from a Teradata table.
The Data Connector operator can read/write CLOB and BLOB data types from/to a file.
Selecting the wrong operator to process the CLOB or BLOB data type terminates the job.

TPT script Structure

Building TPT Scripts 
TPT uses a SQL-like scripting language for extract, basic transformation, and load functions. This easy-to-use language is based on SQL, making it familiar to most database users. All operators use the same scripting language. This represents an improvement over the individual utilities, each of which has its own unique scripting language. A single script can be used to define multiple operators and schemas to create complex extracting and loading jobs. 
There are only a few statements that are needed to build a TPT script. A quick look at the basic statements can be seen here:

DEFINE JOB
Defines the overall job and packages together all following DEFINE and APPLY statements. 

DEFINE SCHEMA 
Defines the structure of a data object in terms of columns of specific data types. A given schema definition can be used to describe multiple data objects. Also, multiple schemas can be defined in a given script.

DEFINE OPERATOR 
Defines a specific TPT operator to be used in this job.

DEFINE DBMS 
Defines an instance of a database server to be used in this job.

APPLY 
A processing statement used to initiate a TPT load, update, or delete operation.

Generally, a TPT script has two sections:
1) Declarative section
2) Executable section

Declarative section: all schema and operator definitions.

Define the job name for the script
Provide the description of the job
Schema Definition
Operator Definition #1
….
Operator Definition #n

Executable section: specifies all the SQL statement processing (extract, load, filter, delete, and update) done by the APPLY definitions.
It also requires a job variable file.

The structure of a TPT script is shown in the example below.

Script example 

DEFINE JOB FILE_LOAD
DESCRIPTION 'Load a Teradata table from a file'
(
/* TPT declarative section */

DEFINE SCHEMA Trans_n_Accts_Schema
(
Account_Number VARCHAR(50),
Trans_Number VARCHAR(50),
Trans_Date VARCHAR(50),
Trans_ID VARCHAR(50),
Trans_Amount VARCHAR(50)
);
DEFINE OPERATOR DDL_OPERATOR
TYPE DDL
ATTRIBUTES
(
VARCHAR PrivateLogName = 'ddl_log',
VARCHAR TdpId = @jobvar_tdpid,
VARCHAR UserName = @jobvar_username,
VARCHAR UserPassword = @jobvar_password,
VARCHAR ErrorList = '3807'
);
DEFINE OPERATOR FILE_READER
TYPE DATACONNECTOR PRODUCER
SCHEMA Trans_n_Accts_Schema
ATTRIBUTES
(
VARCHAR PrivateLogName = 'dataconnector_log',
VARCHAR DirectoryPath = @jobvar_datafiles_path,
VARCHAR FileName = 'accounts.txt',
VARCHAR Format = 'Delimited',
VARCHAR OpenMode = 'Read',
VARCHAR TextDelimiter = '|'
);
DEFINE OPERATOR LOAD_OPERATOR
TYPE LOAD
SCHEMA *
ATTRIBUTES
(
VARCHAR PrivateLogName = 'load_log',
VARCHAR TdpId = @jobvar_tdpid,
VARCHAR UserName = @jobvar_username,
VARCHAR UserPassword = @jobvar_password,
VARCHAR TargetTable = @jobvar_tgt_dbname || '.Trans',
VARCHAR LogTable = @jobvar_wrk_dbname || '.LG_Trans',
VARCHAR ErrorTable1 = @jobvar_wrk_dbname || '.ET_Trans',
VARCHAR ErrorTable2 = @jobvar_wrk_dbname || '.UV_Trans'
);
STEP Setup_Tables
(
/* TPT executable section */
APPLY
('DROP TABLE ' || @jobvar_wrk_dbname || '.ET_Trans;'),
('DROP TABLE ' || @jobvar_wrk_dbname || '.UV_Trans;'),
('DROP TABLE ' || @jobvar_tgt_dbname || '.Trans;'),
('CREATE TABLE ' || @jobvar_tgt_dbname
|| '.Trans (Account_Number VARCHAR(50),
Trans_Number VARCHAR(50),
Trans_Date VARCHAR(50),
Trans_ID VARCHAR(50),
Trans_Amount VARCHAR(50));')
TO OPERATOR (DDL_OPERATOR);
);
STEP Load_Trans_Table
(
APPLY
('INSERT INTO ' || @jobvar_tgt_dbname || '.Trans(Account_Number,
Trans_Number,
Trans_Date,
Trans_ID,
Trans_Amount)
VALUES(:Account_Number,
:Trans_Number,
:Trans_Date,
:Trans_ID,
:Trans_Amount);')
TO OPERATOR (LOAD_OPERATOR[2])
SELECT * FROM OPERATOR (FILE_READER[2]);
);
);

Note: in this example, job variables can be maintained in a separate job variable file or passed directly on the command line.

Job execution
tbuild -f <script file name> -z <checkpoint interval>

The -z option sets the checkpoint interval to the number of seconds specified.

SET CHECKPOINT INTERVAL 160 SEC
Or
SET CHECKPOINT INTERVAL 12 MINUTES

The checkpoint interval can be specified in a job script between the last DEFINE statement
and the APPLY statement(s).

tbuild
We have seen the tbuild command in many of the previous examples. This command is used to initiate a TPT job. The following key options may be used with tbuild:
-f Specifies the filename to be used as input.
-u Specifies job variable values which are to be applied.
-z Specifies a checkpoint interval to be used for the client side.
-s Specifies that job execution is to start at a specific job step.
-v Specifies that job attributes are to be read from an external file.
-l Specifies latency interval - how often to flush stale buffers.
-n Specifies that the job should continue, even if a step return code is greater than 4 

Note: If the checkpoint interval is specified both in the job script and with the tbuild -z command option, the -z option takes precedence.
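
As a hedged illustration, a typical invocation might combine these options (the script and variable file names here are hypothetical):

tbuild -f file_load.tpt -v jobvars.txt -z 120

Here file_load.tpt contains the script shown earlier and jobvars.txt supplies values for @jobvar_tdpid, @jobvar_username, and the other job variables.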

Troubleshooting a Failed Job

Common Job Failures and Remedies
There are two categories of job failures. The evaluation and correction of each type of failure must be handled differently:
• Some jobs fail at launch, during execution of the tbuild statement, but before the initial job step has run.
• Some jobs launch successfully, and one or more job steps may execute successfully, but the job fails to run to completion.

The following sections describe common errors encountered by Teradata PT jobs.
When the Job Fails to Begin Running
When a job is launched but fails to begin execution, the associated errors appear in the public log. Errors are detected according to the launch sequence:
1. Teradata PT first processes the options specified in the tbuild command. If it detects tbuild command errors, the job stops.
Error types encountered: tbuild command errors.
2. If Teradata PT encounters no tbuild command errors, it then parses the job script and creates a parallel job execution plan that will perform the operations specified in the APPLY statement(s) in the job script.
Error types encountered:
  • Preprocessor errors: incorrect use of job variables or the INCLUDE directive.
  • Job script compilation errors: syntactic and semantic errors.
3. Only when script compilation is successful and the execution plan has been generated does Teradata PT allocate resources for, and launch, the various internal tasks required to execute the job plan.

Error types encountered: system resource errors.

The following common types of tbuild errors may occur at job launch:
  • User errors executing the tbuild command
  • Script compiler errors
  • System resource errors, such as semaphore errors, socket errors, and shared memory errors


Teradata 15 Features
Teradata 15.0 has come up with many new exciting features and enhanced capabilities.
This post provides brief descriptions of the new features in this release at the database level.

JavaScript Object Notation:
JSON (JavaScript Object Notation) support is new functionality for storing and processing JSON data in the Teradata database.
  • A new JSON data type stores JSON documents or JSON records in relational format.
  • JSON data can be used with all methods, functions, and procedures for processing, shredding, and publishing.
  • JSON documents can be up to 16MB in size.
  • JSONPath support provides simple traversal and regular expressions with wildcards to filter and navigate complex JSON documents.
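
As a minimal sketch of how the JSON type might be used (the table, column, document contents, and dot-notation retrieval below are illustrative assumptions, not from the original post):

CREATE TABLE Customer_JSON
 (Cust_Id  INTEGER,
  Cust_Doc JSON(16000))
PRIMARY INDEX (Cust_Id);

INSERT INTO Customer_JSON (1, '{"name":"Smith","city":"Dayton"}');

/* Dot notation extracts a single attribute from the stored document */
SELECT Cust_Doc.name FROM Customer_JSON;
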
Teradata QueryGrid:
With QueryGrid, the Teradata database is now able to connect to Hadoop, so this capability is called Teradata Database-to-Hadoop, also referred to as the Teradata-to-Hadoop connector.

  • It provides a SQL interface for transferring data between Teradata Database and remote Hadoop hosts.
  • Import Hadoop data into a temporary or permanent Teradata table.
  • Export data from temporary or permanent Teradata tables into existing Hadoop tables.
  • Create or drop tables in Hadoop from Teradata Database.
  • Reference tables on the remote hosts in SELECT and INSERT statements.
  • Select Hadoop data for use with a business tool.
  • Select and join Hadoop data with data from independent data warehouses for analytical use.
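
A rough sketch of the import capability described above (the foreign server sales_hadoop and both table names are hypothetical, and the foreign server must already have been defined):

/* Import Hadoop data into a permanent Teradata table */
INSERT INTO sales_td
SELECT * FROM sales_web_log@sales_hadoop;
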
XML Data Type:
Teradata also provides a new XML data type that stores and processes XML data.
It supports methods, functions, and stored procedures for operations such as parsing, validation, transformation, and querying.
The XML type stores values up to 2GB in size.

Sample Table creation

CREATE TABLE EMP_TABLE (
   EMPID INTEGER,
   EMPName VARCHAR(256),
   EMPXML XML )
PRIMARY INDEX (EMPID);

EMPXML is an XML type column.
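
A minimal sketch of loading a value into the XML column (the row values are made up, and the implicit conversion of the character literal to XML on insert is an assumption):

INSERT INTO EMP_TABLE (1001, 'Smith',
 '<Employee><Role>DBA</Role><Location>Dayton</Location></Employee>');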

DBQL Show Parameters
A parameterized query uses placeholders for parameters, and the parameter values are provided in a separate statement at the time of execution.

The main purpose is to distinguish between code and data. This also prevents attackers from changing the query by inserting SQL commands.

Two new Data Dictionary tables are:
  • DBC.DBQLParamTbl
  • DBC.DBQLParamTbl logs

Light-Weight Redistribution:
Light-Weight Redistribution (LWR) is also referred to as Single Sender Redistribution (SSR). With this feature, the Teradata Database optimizer can switch from an all-AMP row redistribution to a few-AMP row redistribution.

While executing the query, if the Teradata optimizer determines that a query step is eligible for SSR, the retrieved rows are redistributed by hash code to the receiving AMPs, which are few in number. Without SSR, all rows are normally redistributed across all AMPs.
The chart below shows the overall enhancements in the Teradata 15 release.

Big Data & Analytics
  • JSON Integration
  • Scripting and Language Support
  • Table Operator Enhancements
  • 3D Geospatial Data Type
  • Geospatial Performance Enhancements
  • SQL-H Enhancements

Performance
  • Light-Weight Redistribution
  • Software Efficiency Improvements

Ecosystem
  • SQL Interface For ShowBlocks
  • TASM/TIWM - Classification by Datablock Selectivity
  • Utility Statistics Logging
  • TASM - Global Weights
  • Unity Callback Support
  • TASM - Virtual Partitions Enhancement
  • TIWM - Operating Periods For Appliance

Quality
  • Smart Retry During Parsing
  • HSN Health
  • New PI On Access Rights Table
  • Unresponsive Node Isolation
  • Onsite System and Dump Analysis
  • DBQL - Show Parameters

Industry Compatibility
  • Sequenced Aggregate Enhancements and Aggregate JI for Temporal
  • 1MB Phase 2
  • ANSI Temporal Syntax Support
  • Teradata Directory Manager



DATE Functions

Teradata Date Functions
As part of its Domain Specific Functions, Teradata Database 14 released new DATE functions.
Below is a list of these functions.

Function / Description
LAST_DAY: Returns the date of the last day of the month that contains the timestamp value.
NEXT_DAY: Returns the first weekday named by day_value that is later than the date specified by the date/timestamp value.
NUMTODSINTERVAL: Converts a numeric value into an INTERVAL DAY(4) TO SECOND(6) value.
NUMTOYMINTERVAL: Converts a numeric value into an INTERVAL YEAR(4) TO MONTH value.
TO_DSINTERVAL: Converts a string value into an INTERVAL DAY(4) TO SECOND(6) value.
TO_YMINTERVAL: Converts a string value into an INTERVAL YEAR(4) TO MONTH value.
MONTHS_BETWEEN: Returns the number of months between two date/timestamp values.
OADD_MONTHS: Adds a specified number of months to a specified date/timestamp value and returns the resulting date.
TO_DATE: Converts a string into a DATE value via a format string.
TO_TIMESTAMP: Converts a string into a TIMESTAMP value via a format string.
TO_TIMESTAMP_TZ: Converts a string into a TIMESTAMP WITH TIME ZONE value via a format string.
TRUNC: Returns a DATE value with the time portion truncated to the unit specified by a format string.
ROUND: Returns a DATE value with the time portion rounded to the unit specified by a format string.



Examples:

SELECT LAST_DAY (DATE);
-----------------------
LAST_DAY(Date)
2014-06-30


SELECT NEXT_DAY(DATE '2014-06-10' , 'FRIDAY');
----------------------------------------------
NEXT_DAY(2014-06-10,'FRIDAY')
2014-06-13


SELECT NUMTODSINTERVAL(86405,'SECOND'), NUMTOYMINTERVAL(100, 'MONTH' );
-------------------------------------------------------------------
NUMTODSINTERVAL(86405,'SECOND')    NUMTOYMINTERVAL(100,'MONTH')
  1 00:00:05.000000                  8-04


SELECT TO_DSINTERVAL('150 08:30:00') , TO_YMINTERVAL( '2-11') ;
-------------------------------------------------------------------
TO_DSINTERVAL('150 08:30:00')    TO_YMINTERVAL('2-11')
 150 08:30:00.000000               2-11
The above functions can be helpful when adding intervals to date columns.

Samples:
SELECT ORDER_DATE, ORDER_DATE + TO_YMINTERVAL('02-11') FROM ORDERS;
SELECT EMP_ID, LAST_NAME FROM EMP_TABLE WHERE HIRE_DATE + TO_DSINTERVAL('100 00:00:00') < CURRENT_TIMESTAMP;


SELECT MONTHS_BETWEEN(DATE'2014-06-01', DATE'2014-02-01');
-------------------------------------------------------------
MONTHS_BETWEEN(2014-06-01,2014-02-01)
4.00


SELECT OADD_MONTHS (DATE '2014-04-15', 2), OADD_MONTHS (DATE '2008-02-29', 1);
-------------------------------------------------------------------
OADD_MONTHS(2014-04-15,2)    OADD_MONTHS(2008-02-29,1)
2014-06-15                   2008-03-31
Because February 29 is the last day of February, March 31 is returned, since 31 is the last day of March.


SELECT TRUNC(CAST('2014/06/05' AS DATE), 'D') (FORMAT 'yyyy-mm-dd');
-----------------------
TRUNC('2014/06/05','D')
2014-06-01
The date was truncated to the first day of that week.


SELECT ROUND(CAST('2003/09/20' AS DATE), 'RM') (FORMAT 'yyyy-mm-dd');
----------------------------------------------------------------------------------------------------------
ROUND('2003/09/20','RM')
2003-10-01
Since the day is greater than or equal to 16, the date is rounded to the beginning of the next month.


Numeric Functions

Teradata Numeric Functions
Teradata 14 released many Domain Specific Functions. Here we discuss the new numeric functions, which are equivalent to their Oracle counterparts.
The following numeric functions are available in the Teradata 14 release.



Name / Description
SIGN: Returns the sign of a value.
TRUNC: Truncates a numeric value.
ROUND: Rounds a numeric value.
GREATEST: Returns the highest value from a list of given values.
LEAST: Returns the lowest value from a list of given values.
TO_NUMBER: Converts a string to a number via a format string.
CEILING: Returns the smallest integer not less than the input parameter.
FLOOR: Returns the largest integer equal to or less than the input parameter.

Example for all functions.

SELECT
  SIGN(-123)                       as SIGN
 ,TRUNC(32.976)                    as TRUNC
 ,ROUND(345.175)                   as ROUND
 ,GREATEST(12, 56, 10.1, 155.6)    as GREATEST
 ,LEAST(12, 56, 10.1, 155.6)       as LEAST
 ,TO_NUMBER('4769.96', '9999.99')  as TO_NUMBER
 ,CEIL(5.4)                        as CEIL
 ,FLOOR(3.86)                      as FLOOR
;




Regular Expression Functions

Regular Expression Functions
Teradata 14 released many domain-specific functions, adding a NUMERIC data type and string functions, and many of the functions support regular expressions. These new functions are designed to be compatible with Oracle.
Here we are going to discuss the following domain-specific regular expression functions.
  1. REGEXP_SUBSTR
  2. REGEXP_REPLACE 
  3. REGEXP_INSTR 
  4. REGEXP_SIMILAR
  5. REGEXP_SPLIT_TO_TABLE

Each of these functions is discussed in detail below.

REGEXP_SUBSTR
This function extracts a substring from a source string that matches a regular expression pattern.

General Syntax:
REGEXP_SUBSTR (source_string, regexp_string, position_arg, occurrence_arg, match_arg);

REGEXP_SUBSTR Function Example:

SELECT REGEXP_SUBSTR ('God Bless Mummy God Bless Daddy', 'bless', 1, 2, 'i');

In the above example:
source_string = 'God Bless Mummy God Bless Daddy'
regexp_string = 'bless'
The remaining arguments find the second occurrence of a match for the regular expression, starting from position 1; 'i' indicates a case-insensitive match.


REGEXP_REPLACE
This function replaces portions of the source string parameter that match a regular expression pattern with a replace string.

General Syntax:
REGEXP_REPLACE (source_string, regexp_string, replace_string, position_arg, occurrence_arg, match_arg);

REGEXP_REPLACE Function Examples:



SELECT REGEXP_REPLACE ('Hello World World', 'world', 'My', 1, 1,'i');
In this we are replacing the 1st occurrence of 'World' with 'My', ignoring case ('i').




SELECT REGEXP_REPLACE ('God''s Love is so wonderful.', 'God''s', 'Mother', 1, 1, 'c');
In this we are replacing the 1st occurrence of 'God''s' with 'Mother', using case-sensitive matching ('c').




SELECT REGEXP_REPLACE('I love the buzz-buzz buzzing of the bee', 'buzz', 'BUZZ', 1, 2, 'c');
In this we are replacing the 2nd occurrence of 'buzz' with 'BUZZ', using case-sensitive matching ('c').







SELECT REGEXP_REPLACE ('ABCD123-$567xy', '[^0-9]*', '', 1, 0, 'i');

In this we are replacing every non-numeric character with the empty string, so only the numeric characters remain.



REGEXP_INSTR
This function searches the source string for a match to a regular expression pattern and returns the beginning or ending position of that match.

General Syntax:
REGEXP_INSTR (source_string, regexp_string, position_arg, occurrence_arg, return_opt, match_arg);

return_opt: 0 = the function returns the beginning position of the match (default)
            1 = the function returns the end position of the match

REGEXP_INSTR Function Examples:

SELECT REGEXP_INSTR('Happy Birthday to you', 'Happy Birthday', 1, 1, 0, 'c');
It returns 1


SELECT REGEXP_INSTR('Happy Birthday to you', 'Happy Birthday', 1, 1, 1, 'c');

It returns 15




REGEXP_SIMILAR
This function compares a source string to a regular expression and returns an integer value:
 1 (true) if the entire string matches regexp_arg
 0 (false) if the entire string does not match regexp_arg

General Syntax:
REGEXP_SIMILAR(source_string, regexp_string, match_arg)

REGEXP_SIMILAR Function Examples:

SELECT    Emp_Name
FROM    Emp_Table
WHERE    REGEXP_SIMILAR (emp_name, '(Smith B(i|y)rd)|(John B(i|y)rd)', 'c') = 1;


It returns the names of employees that match
Smith Byrd
Smith Bird
John Bird
John Byrd



REGEXP_SPLIT_TO_TABLE
This table function splits a source string into a table of strings using a regular expression as the delimiter.

General Syntax:

REGEXP_SPLIT_TO_TABLE (source_string, regexp_string, match_arg)

REGEXP_SPLIT_TO_TABLE Function Examples:

CREATE TABLE split_table_latin
 (id      INTEGER,
  src     VARCHAR(100) CHARACTER SET LATIN,
  pattern VARCHAR(100) CHARACTER SET LATIN,
  match   VARCHAR(100) CHARACTER SET LATIN);

INSERT INTO split_table_latin (3, 'The2134quick234brown234fox987jumps8743over342the03487lazy089734red972384dog2343.', '[0-9]*', 'c');

Then the following SELECT statement:

SELECT * FROM TABLE (
  regexp_split_to_table(split_table_latin.src, split_table_latin.pattern, split_table_latin.match)
  RETURNS (res VARCHAR(100) CHARACTER SET LATIN)) AS t1;



Returns a table with the following rows


  • The
  • quick
  • brown
  • fox
  • jumps
  • over
  • the
  • lazy
  • red
  • dog
  • .

This is all about the regular expression functions. I hope you enjoyed learning them.


A to Z Performance Tuning

Teradata Performance Tuning
A) Explain the EXPLAIN: Check the EXPLAIN plan to see how exactly Teradata will execute the query. Try to understand the basic keywords in the EXPLAIN plan, like confidence level, the join strategy used, and whether re-distribution is happening or not.

B) Collect STATS: The stats of the columns used in join conditions should be updated. Secondary Indexes without proper STATS can be of little or no help. Check the STATS status of the table.

C) Use Proper PI: If the Primary Index is not properly defined in any one or all of the tables in the query, performance suffers. Check whether the PI of the loaded target table is unique.

D) Use PPI: If there is a Partitioned Primary Index created on a table, try to use it. If you are not using it in the filter condition, it will degrade the performance.

E) No FUNCTIONS in Conditions: Try to avoid using functions in join conditions. For example, applying COALESCE or TRIM causes high CPU consumption.

F) Use PPI: If a Partitioned Primary Index is defined on the tables, try to use it. If you are not using it in the filter condition, it will degrade the performance.

G) Same column DATA TYPES:
Define the same data type for the joining columns.

H) Avoid IN clause in filter conditions: When there can be a huge number of values in the WHERE conditions, a better option can be to insert such values into a volatile table and use the volatile table with an INNER JOIN in the main query, as sketched below.
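A minimal sketch of this pattern (all table and column names are made up):

CREATE VOLATILE TABLE vt_ids
 (cust_id INTEGER)
PRIMARY INDEX (cust_id)
ON COMMIT PRESERVE ROWS;

INSERT INTO vt_ids (1001);
INSERT INTO vt_ids (1002);

SELECT s.*
FROM sales s
INNER JOIN vt_ids v
ON s.cust_id = v.cust_id;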

I) Use Same PI in Source & Target: PI columns can also help in saving the data to disk. If the Source and Target have the same PI, the data dump can happen very efficiently from source to target.

J) Collect STATS on VOLATILE tables: Collecting stats on volatile tables where required can save AMPCPU. Remove stats, if already present, where they are not getting used.
If the volatile table contains a UNIQUE PI, then go for sample stats rather than full stats, as sketched below.
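A hedged sketch of sample statistics on a volatile table (the names continue the illustrative example above):

COLLECT STATISTICS USING SAMPLE ON vt_ids COLUMN (cust_id);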

K) DROPPING volatile tables explicitly: Once volatile tables are no longer required, you can drop them. Don't wait for the complete procedure to be over. This will free some spool space immediately and could prove to be very helpful in avoiding the No More Spool Space error.

L) NO LOG for volatile tables: Create volatile tables with the NO LOG option, as shown below, to skip transient journaling.
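For example (a sketch; the table name and columns are made up):

CREATE VOLATILE TABLE vt_stage, NO LOG
 (emp_no INTEGER,
  salary DECIMAL(10,2))
PRIMARY INDEX (emp_no)
ON COMMIT PRESERVE ROWS;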

M) Check DBQL Stats: Keep your performance stats accessible. Target the most AMPCPU consuming query first.

N) UPDATE clause: Do not write an UPDATE statement with just a SET clause and no WHERE condition. Even if the Target/Source has just one row, add a WHERE clause on the PI column, as in the sketch below.
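A minimal sketch (the values are illustrative):

UPDATE EMP_TABLE
SET SALARY = 55000.00
WHERE EMP_NO = 1001;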

O) DELETE & INSERT: Sometimes replacing an UPDATE with a DELETE & INSERT can save a good amount of AMPCPU. Check if this holds good for your query.

P) Query SPLITS: Split queries into several smaller queries logically and use volatile tables with proper PI.

Q) Try MSR: If the same target table is loaded multiple times, try a Multi-Statement Request (MSR) for the several sections. This will speed up the final MERGE step into the target table and you may see a good CPU gain.

R) Try OLAP Functions:
Check if replacing a correlated subquery with an OLAP function may result in AMPCPU savings.

S) Avoid DUPLICATE data: If the join columns in the tables involved in the query have duplicates, use DISTINCT or GROUP BY, load the result into a volatile table, collect stats, and use the volatile table.

T) Use Proper JOINS: If joins are used, don't use right, left, or full outer joins where an inner join is sufficient.

U) Use proper ALIAS: Check the aliases in the joins. A small mistake could lead to a product join.

V) Avoid CAST: Avoid unnecessary casting for DATE columns. Once defined as DATE, you can compare date columns against each other even when they are in different formats. Internally, DATE is stored as an INTEGER. CAST is required mainly when you have to compare a VARCHAR value as a DATE.

W) Avoid UDF:
Most of the functions needed for data manipulation are available in Teradata, so avoid User Defined Functions.

X) Avoid FULL TABLE SCAN: Try to avoid FTS scenarios; for example, an SI should be defined on the columns which are used as part of joins or as an alternate access path. Collect stats on SI columns, else there are chances the optimizer might go for an FTS even when an SI is defined on that particular column.

Y) Avoid using IN/NOT IN: For a large list of values, avoid using IN/NOT IN in SQLs. Write the large list of values to a temporary table and use this table in the query.

Z) Use CONSTANTS:
Use constants to specify index column contents whenever possible, instead of specifying the constant once and joining the tables. This may provide a small savings on performance.