Friday 27 July 2012

TERADATA 14 NEW FEATURES


Teradata Database 14 is the high performing analytical engine that powers all of Teradata’s “purpose-built” platform family members.
Teradata brings analytical processing to all corners of your organization.
 Its proven analytical power and flexibility, scalability, and ease of use allows your entire organization to benefit from your valuable corporate information assets.

Teradata Database 14 additions - New features in the Teradata Database make it even easier to migrate data and associated applications from competitive databases. Teradata 14 has added several new functions and data types that ease migrations from OLTP databases, including:
  • Regular Expression Functions - eliminate the complex programming necessary to interpret web logs and other text strings in structured query language (SQL).
  • Date Functions – offer a shortcut that reduces programming time by converting and manipulating information related to date and time, which may be stored in numerous formats, into a consistent format, and validates the data.
  • New data types (ARRAY and NUMBER data types) – Teradata is now able to migrate, store and analyze data in additional data types or formats not previously used by the Teradata database, without converting it. This simplifies and speeds the migration.
Teradata provides the analytical performance you need through its “parallel everything” design and industry leading optimizer to intelligently process all types of analytical and business intelligence queries.
The renowned multi-dimensional scalability allows not only large amounts of data to be stored and processed, but the most concurrent queries, the most complex requests, and the widest variety of work to be accomplished. And, with its easy “set and go” optimization options, Teradata provides powerful embedded analytics and advanced workload management.

Features & Benefits
Fast Query Performance 
Parallel Everything” design and smart Teradata Optimizer enables fast query execution across platforms.
Quick Time to Value 
Simple set up steps with automatic “hands off” distribution of data, along with integrated load utilities result in rapid installations.
Simple to Manage.
DBAs never have to set parameters, manage table space, or reorganize data.
Responsive to Business Change
Fully parallel MPP “shared nothing” architecture scales linearly across data, users, and applications providing consistent and predictable performance and growth
Powerful, Embedded Analytics 
In-database data mining, virtual OLAP/cubes, geospatial and temporal analytics, custom and embedded services in an extensible open parallel framework drive efficient and differentiated business insight
Advanced Workload Management 
Workload management options by user, application, time of day and CPU exceptions.
Intelligent Scan Elimination 
“Set and Go” options reduce full file scanning (Primary, Secondary, Multi-level Partitioned Primary, Aggregate Join Index, Sync Scan).
Teradata Database 14 helps you create a consolidated data environment that gives you a single source for enterprise-wide decision making with these features and more:
  • Teradata Columnar
  • Compress on cold and intelligent multi-temperature data management
  • Enhanced security
  • Customer driven innovation in application migration and temporal
  • Teradata Temporal: Time aware for automated temporal data management and intelligent temporal query processing. Reduces data errors, simplifies application development, and opens temporal data analysis up to business users.
  • Teradata Columnar: Hybrid row and column oriented data for extreme performance and dramatic compression without sacrificing the other benefits and industry leading attributes of Teradata.
DETAIL SUMMARY
Teradata Viewpoint 14.0
However there are other items of interest in this release. First related to Teradata DB 14.0 release, is the introduction of a significant number of new TASM state matrix events. 
As a quick review, the state matrix allows automated adjustment of TASM ruleset values associated to a state change, being either planned or unplanned. 
  • System CPU Utilization
  • System-wide Node Skew
  • By Workload Definition (WD) Events:
  • CPU Utilization
  • Missed Service Level Goal (SLG)
  • Arrival Rate
  • Concurrent Active Requests
  • Delay Queue Size
  • Delay Queue Depth
  • Message Delay Time

The primary goal of Teradata Viewpoint 14.0 is to support the Teradata Database 14.0 releases, both TD 14.0 on SLES10 and TD 14.0 on SLES11. The majority of this work was to provide the enabling interfaces for the TD 14.0 SLES11 new workload management stack. As such, the amount of "general" features was limited but there are still some additions of interest. 
Lastly as always, the new Viewpoint release represents an integration of patch releases into the base release.
Aligned with this Viewpoint release is the very exciting new Teradata offering called Teradata Data Lab
Teradata Data Lab refers to work spaces or maybe better known as "sandboxes" within the data warehouse where users can explore, analyze new data, and test data value theories. 
Teradata Data Lab is a separately sold product that provides Viewpoint portlets that assist in governance, auto-provisioning, and monitoring of these "sandboxes". 
See below for more details on Teradata Data Lab and the Lab Group Setup and Data Labs portlets in this initial release.


Viewpoint 14.0 main focus was on the interface support of the Teradata Database 14.0 releases, and in particular the new workload management coming with TD 14.0 SLES11.Additional details on this new foundation will be included in future article aligned with the actual Teradata DB 14.0 SLES11 release.The new state matrix events are:


A general feature addition (meaning no dependency on a specific Teradata Database version) worth noting is the new "Skew" tab in Query Monitor and My Queries portlets which displays details about the level of the skew in the query or session. 
See below for more information. Another administrative change worth mentioning is addition of a "SQL Explain" control for the Sessions data collector.
This allows one to stop collection of the SQL and Explain information as part of the Sessions collection. This might be considered for performance or security considerations.

"Skew" tab

The skew tab is a new option in the Query Monitor and My Queries portlets details view that provides CPU and I/O skew information and how it relates to specific AMPs.
 It lists the high and low impacted AMPs as well as an overall AMP participation count.
 So drilling down on a skew query provides you the new skew tab which will present information as shown below


Data Lab Concept

A challenge for any organization with a production data warehouse is how to allow for data proofing and new data value investigations while maintaining the relationships to production data.

Bringing this concept data into production often becomes a contention between the need for flexibility and the IT processes necessary for protecting and staging data in the production environment.

There is also the obstacle of the DBA feeling that they cannot take on the additional responsibility of managing a development environment and in particular, one that may have negative impact on production operations performance and service levels.

The end result often becomes exporting the production data out of the data warehouse consuming precious resources with very little value to the production environment itself.
This data then moves into hidden infrastructure servers and client systems resulting in stale data, conflicting reports, outdated data models, as well as security and privacy adherence concerns.

Teradata Data Lab advocates providing space in the production environment for this effort with automated and distributed management and protected by built-in hooks into Teradata workload management strategies.
Email is leveraged for notifications and intuitive Viewpoint portlet interfaces for everything. That's Teradata Data Lab. Continue reading for more details on the available portlets for this initial product release.

Lab Group Setup

The first Data Lab portlet to discuss is the setup and configuration portlet called Lab Group Setup. Upon loading of this portlet, it will ask for Teradata credentials as you will need the necessary Teradata system permissions to define the lab space.

 Clicking on the "Add Lab Group" gets you started and then leads you through an easy five step process for setting up the lab group.

The first step allows configuration of the name, available space, default lab size, and a lab prefix that will be an easy integration mechanism with Teradata workload management.


Also notice the email notifications setup for easy automated "hands free" communications to the Data Labs users.
Subsequent steps will allow configuration of Viewpoint ownership users and roles as well as Teradata user and roles access.
 Lastly step #5 will be setup of thresholds and approvers for various requests. 
This is where you decide on how open or controlled the lab is going to be. 

The example below is a mix of automation and manual controls.


If you have a plan on how you want the data lab environment to be established, setup through this Viewpoint portlet takes just a few minutes to complete. 
You can return to this portlet any time you need to view, edit, or delete lab group environments.

Data Labs

The Data Labs portlet will be used by DBAs, lab group owners, and user/consumers of data labs for creation, monitoring, and administration of data labs, in other words the on-going operations. Once connected to the associated Teradata system, the Data Labs portlet offers three main tabs (Monitoring, Approvals, Requests).

The Monitoring tab provides views at the lab group, lab, and table levels through easy point and click operations. Specific information is given at each of the levels for things such as size, ownership, expirations, etc accompanied by cylinder views for space usage.

 This is a view at the lab level:




The Approvals tab is what you would expect it to be. When a data lab request is made, it is added to a request queue for approval.
You can view these unapproved requests within this tab if you are a lab group or lab owner.
This is also where approvals or denials of requests are managed.

The third tab, Requests, is also what you might anticipate it is. This portlet is going to be commonly used by the user/consumers of data labs.
Requests can be for a variety of data lab operations and necessities including:
  • Add or delete labs
  • Increase or decrease lab size
  • Add or remove users from labs
  • Add or remove owners from labs
  • Extend lab expiration dates
  • Promote or drop tables
  • Change lab details
  • Send other requests
Users can also review their requests within this portlet to understand status. Has it been approved, denied, or still pending?
The portlet provides pull downs for easy lab group and lab filtering as well as operations for managing your requests.
A simple example is shown below with various requests, in different states, and preparation for deleting one of the older requests.



 The preferences for both Lab Group Setup and Data Labs portlets allow setting of thresholds for highlighting in the portlet views as well as configuration of optional email notifications. 

Wednesday 13 June 2012

TERADATA INTERVIEW QUESTIONS

Recent collection of questions




1.How do you find the list of employees named "Madhu" in an Party table without using Like operator??

Above question looks like confused

Yes ,another way by which we can find names/patterns without using like operator.

By using "BETWEEN" , we can find the list of customer named Madhu

sel * from party where name between 'M' and 'N';

But at times usage of between is tricky, if there are other customer Names starts with M also will be listed.

2.What is the acceptable range for skew factor in a table?
There is no particular range for skew factor.  In case of production systems, it is suggested to keep skew factor between 5-10.  

There are various considerations for skew factor
- Number of AMPS
- Size of row in a table
- number of records in a table
- PI of a table
- Frequent access of table (Performance consideration)
- whether table getting loaded daily /monthly or how frequently data is being refreshed.

3.How do you list all the objects available in given database?

 select * from dbc.tables where databasename='<DATABASENAME>';

 By running a normal help command on that database as follows.  
help database <DATABASENAME';

4.How to rename columns using views?

Create a view which is a subset of the Party table as follows.
Renaming columns in views will help increase security of sensitive tables and hiding columns under alias names

Replace  view Partyview ( number,fullname,addr,phono,codenum) as 
locking row for access  
sel
partyNo ,                        
Name  ,                        
Address,                       
Phone   ,                      
CodeNo                   
from Party;                

If we Run this below query
 sel * form party.view, Only alias columns are displayed .

5.What are Restrictions on Views in Teradata?
An index cannot be Created on a view.
It cannot contain an ORDER BY clause.
All the derived columns and aggregate columns used in the view  must have an AS clause (alias defined).
A view cannot be used to UPDATE if it contains the following :
        – Data from more than one table (JOIN VIEW)
        – The same column twice
        – Derived columns
        – A DISTINCT clause
        – A GROUP BY clause

6. Built-in functions used in teradata ?

The main functionality of built in functions is  that they do not need any arguments or paramaters and can be used directly with select to return system values.

• SESSION: – Returns a number for the session for current user .
• TIME: – this function provides the current time based on a 24-hour day
• USER: –  This one gives the user name of the current user.
• ACCOUNT: – display's your Teradata Account information 
• CURRENT_DATE: – Returns the current system date 
• CURRENT_TIME: - This function returns the current system time and current session ‘Time Zone’ displacement. 
• CURRENT_TIMESTAMP: - Returns the current system timestamp  with TimeZone
• DATABASE: –  It  returns the name of the default database for the current user.
• DATE: – same as Current_DATE   and is teradata built in .

7.How do you know whether table is locked or not?

Simple just run this query on the table.

Lock Table DBNAME.TABLENAME write nowait
Select * from DBNAME.TABLENAME;

If this table is locked currently then , then the above statement would fail as Nowait will not wait for locks to be released on that  table .

one more way


You can find it by using Show locks utility in Teradata Manager Tool.

You can use Teradata Manager -> Administrator -> Database Console Utility -> Show locks

8.what are the  different date formats available in Teradata system?

Two different Date formats are

 The Teradata default format is: 
YY/MM/DD

The ANSI display format is: 
YYYY-MM-DD


9.How to handle nulls in Teradata?How many columns can be there in a table???

1. Use zeroifnull, nullifzero in select and NULL in insert directly.
2. 256 columns max per table.

10.What are the scenarios in which Full Table Scans occurs?

1. The where clause in SELECT statement does not use either primary index or secondary index
2. SQL Statement which uses a partial value (like or not like), in the WHERE statement.
3. SQL Statement which does not contain where clause.
4. SQL statement using range in where clause. Ex. (col1 > 40 or col1 < =10000)

11.What is a role? Profile?

A role is a set of access rights which can be assigned to the users.   They indirectly help in performance by reducing the number of rows entered in DBC.accessrights

A profile contains set of user parameters  like accounts, default database, spool
space, and temporary space for a group of users

To assign the profile to a user, use the AS PROFILE modified in a CREATE USER or
MODIFY USER statement:
MODIFY USER username AS PROFILE=profilename ;
To remove a profile from a member but retain the profile itself:
MODIFY USER username AS PROFILE=NULL ;

12.How can you track Login Parameters of users in Teradata?

You can view all these parameters in this Data Dictionary Table DBC.LOGONOFF
SELECT LOGDATE, LOGTIME, USERNAME, EVENT FROM DBC.LOGONOFF;

13.Why are AMPs and PEs called as vprocs ?

AMPs and PEs are implemented as “virtual processors - vprocs”. 
They run under the control of PDE and their number is software configurable.
AMPs are associated with “virtual disks – vdisks” which are associated with logical units (LUNs) within a disk array.

Vprocs:Virtual process From PE to AMP (This is the network root via MSP(message passing layer),The processing data will store in Disks(These are Physical disks),Each Amp have too many P.disks,to migrate these P.disks The Bynet Network maintains Virtual disks.These V.disks will responsible for data migration.hence they are called as Virtual Process(VPROCS).

14.How does Hashing happens in Teradata?

  Hashing is the mechanism through which data is distributed and retrieved to/from AMPs.
 Primary Index (PI) value of a row is the input to the Hashing Algorithm.
 Row Hash (32-bit number) value is the output from this Algorithm.
 Table Id + Row Hash is used to locate Cylinder and Data block.
 Same Primary Index value and data type will always produce same hash value.
 Rows with the same hash value will go to the same AMP.

So data distribution depends directly on the Row Hash uniqueness; be careful while Choosing Indexes in Teradata.


15.Define POSITION Function?

The POSITION function Returns the actual position of the character which occurs first. POSITION function is ANSI standard.

Teradata has an equivalent function called INDEX.

Both the POSITION and INDEX functions returns position of character's first occurrence in a string.

Examples for the POSITION function

SELECT POSITION( 'e' IN 'Read');  Displays Result as '2' 
SELECT POSITION( 'ad' IN 'Read'); Displays Result as '3' 
SELECT POSITION( 's' IN 'Read');  Displays Result as '0'  

Examples for the INDEX function.

SELECT INDEX('Write', 'i');    Displays Result as '3'   
SELECT INDEX('Write', 'te');   Displays Result as '4'   

16.Explain LOB?

It is possible to create functions which operate on large object data types.
 The following library functions exists to provide the ability use LOB's as function input or output:

• FNC_GetLobLength - acquire Lob length
• FNC_LobAppend - add to Lob in memory
• FNC_LobClose - close a context to a Lob
• FNC_LobOpen - open a context to a Lob
• FNC_LobRead - read some or all
• FNC_LobRef2Loc - convert a Lob Reference to a Locator 
• FNC_LobLoc2Ref - convert a Locator to a Lob Reference

Example :
The CREATE FUNCTION syntax permits BLOB's or CLOB's to be the data type of an argument or a return value

CREATE FUNCTION JPEG_CROP
(
Image BLOB AS LOCATOR 
Width INTEGER, 
Height INTEGER, 
X_Offset INTEGER, 
Y_Offset INTEGER
RETURNS BLOB AS LOCATOR
LANGUAGE C
NO SQL
EXTERNAL;