Monday 10 October 2011

Teradata Typical Questions Set-2

Hi Students

In this 2 nd module I covered Important questions from Journals, Indexes,Utilities.
Teradata Typical Questions 2
1.What is Journaling? why teradata requires journaling?

Journling is a data protection mechanism in teradata.It prevents from Image failure.
Journals are generated to maintain pre-images and post images of a DML transaction starting/ending, at/from a checkpoint.
When a DML transaction fails,the table is restored back to the last available checkpoint using the journal Images.

There are 3 types of Journals
Permanent
Transient
Recovery Journal

2.How many types of Image supports Journaling?

Four types of Images are supported by Journaling they are

a. Single Image---->one copy of data will be taken.

b.Dual Image----->Two copies of data will be taken.

c.Before Image----->Before changes occur on the row data will be taken.

d.After Image---->After changes happen on the row taking copy of data.

3.What is Transient Journal?

Transient Journal - an area of space in the DBC database which is used primarily for storing of roll-back information during inserts/deletes/updates of tables.

Detail Explanation:
The Transient Journal maintains a copy of all before images of all rows affected
by the transaction.
If the event of transaction failure, the before images are reapplied to the affected tables, the images are deleted from the journal and a rollback operation is completed.
In the event of transaction success, at the point of transaction commit,
the before images for the transaction are discarded from the journal.
In Summary, if a Transaction fails (for whatever reason), the
before images in the transient journal are used to return the data (in the
tables involved in the transaction) to its original state.

4.What is Permanent Journal?

The purpose of the Permanent Journal is to provide selective or full database recovery to a specified point in time.It permits recovery from unexpected hardware or software disasters.
The Permanent Journal also has the effect of reducing the need for full table backups which can be costly both in time and resource.

5.What are the different image options for Permanent Journal?

There are four image options for the Permanent Journal:

· Before Journal

· After Journal

· Dual Before Journal

· Dual After Journal

6.Permanent Journals are automatically purged in teradata. True / False?

False.

The Permanent Journal must be manually purged from time to time.

7.Where does TD store transient journal?

In perm space -> dbc.transient journal
But that special table can grow over dbc's perm limit, until the whole system runs out of perm space.

8.What are the different return codes(severity errors) in Teradata utilities?

There are 3 basic return codes (severity errors) in teradata utilities.
4 - Warning
8 - User error
12 - System error


9.How will you connect a database server to other server?

We can connect from one server to another server in UNIX using the command
ssh or FTP or SU
ssh - ssh user_id@server_name

10.What is the meaning of skewness in Teradata?

Data or Amp skew occurs in teradata due to uneven distribution of data across all the amps. Often this leads to spool space error too. To avoid skewness try to select a Primary Index which has as many unique values as possible.

PI columns like month day etc. will have very few unique values. So during data distribution only a few amps will hold all the data resulting in skew. If a column (or a combination of columns) is chosen a PI which enforces uniqueness on the table then the data distribution will be even and the data will not be skewed.

11.Does Primary Index column choosing is important?

The Success of teradata warehouse starts with choosing of correct column
to creation of Primary index.

Try to Choose a column which has unique values. so that data can be distributed evenly among all AMPs.
Otherwise Skewness will come into picture.
Primary index is useful to get a straight path to retrieve data.

12.What are the basic rules that define how PI is defined in Teradata?

These are the following rules how Primary Index defined in Teradata
a. Only one Primary Index per table.
b.Its a physical mechanism which assigns Row to AMP.
c.A Primary Index value can be unique or non-unique.
d.A Primary Index can be composite till 64 columns.
e.The Primary Index of a populated table cannot be modified.

13.What are the basic criteria to select Primary Index column for a given table?
Answer:

A thumb rule of ADV demographics is followed.
Access Demographics
Identify index candidates that maximize one-AMP operations.
Columns most frequently used for access (Value and Join).

Distribution Demographics
Identify index candidates that optimize parallel processing.
Columns that provide good distribution.

Volatility Demographics
Identify index candidates with low maintenance I/O.

14.can you explain and PE and AMPs communicate?


When user connects to teradata database he opened a session with parsing Engine(PE)

there after when user submits a query,

1. First PE takes query, checks syntax, verifies his access rigthts
2. If every thing looks okay, PE prepare an action plan for AMP.
Which AMP to respond , what is row ID to read ...etc
3. Then PE sends action plan to AMP via BYNET signals.
4. Then corresponding AMP takes action plan and reads data and send back to PE
Then PE sends data to USER

15.Does Permanent Journals and Secondary indexes require Perm Space?

Ans: Yes



16.which objects require Perm space in teradata?

Tables and Stored Procedures objects require Perm Space

Views, Macros , Triggers doesn't require Perm space.



17.What is LOG TABLE?
A log table maintains record of all checkpoints related to the load job, it is essential/mandatory to specify a log table in a job. This table will be useful in case you have a job abort or restart due to any reason.


18.what is the use of Partition?

If you create PPI on table, then data at AMPs is ordered using Partition column.

Example
If we have Partition on deptno then all dept 10 records at one place at AMP and dept 20 records at one place.

19.Is it possible to alter NO RANGE and Unknown partition on a table?

Yes if table is empty,we can alter NO RANGE and Unknown Partition of a table.

20.can you apply a read lock on table where write lock is already applied?

No


21.How many partition in given SQ L?

Assume SQ L,

Partition by CASE_N
( partycode = 10,
partycode = 20,
partycode = 30,
partycode = 40 ,
NO CASE OR UNKNOWN)
)


Answer -- 5 partitions

22.How many partions in given SQL?
Assume SQL,

Partition by CASE_N
( partycode = 10,
partycode = 20,
partycode = 30,
partycode = 40 ,
NO CASE, UNKNOWN)
)

Answer --- 6 partitions

Here it creates 6 Partitions

If a value partycode entered is not NULL and first 4 cases are not satisfied then data goes
to "NO CASE" partition.

NULL values goes to UNKNOWN partition.


23.How teradata makes sure that there are no duplicate rows being inserted when its a SET table?


Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate.
If its a duplicate it silently skips it without throwing any error.



24. Explain What are Low and High Confidentiality in Explain command?

The explain generally displays the plan which would be prepared by Optimizer.
Confidence levels indicate how well the optimizer is aware about the demographic data for a particular step.
High confidence: Means the optimizer know about the no of rows that would be returned as a result of that step. Examples would be when PI statistics exist when the column or range stats exist or no join involved.

Low confidence: Some stats available. Join and stats available on both sides of the join.

No confidence: No stats available. Join involved.

25.Why Fastload Utility does not support multiset table and duplication?

Multi set table supports duplicate rows.
Fastload utility does not support duplicate rows.

Restart logic is one of the reason. Fastload is one of the utility which loads data in blocks(64K). During restart Fastload sends some of the rows for the second time which occur after a checkpoint. Due to this Fastload rejects duplicates.

example
consider 20 rows to be loaded in a table. FL utility is used to load the table and ckpt is 5 rows. If restart occurs after 7th row FL may send 6 and 7th to AMPs during the restart. This records would be considered as duplicates and rejected.


26.Can you Recover the password of a User in Teradata?

No , you cant recover the password of a User in Teradata. Passwords are stored in Data Dictionary table (DBC.DBASE).Using a one way encryption method.
You can view the encrypted passwords using the following query.

SEL * FROM DBC.DBASE;

27.what is the differnce between Sub Query and Corelated Sub Query?

Sub Query

If Queries written in a nested manner then its termed as a sub query.
Inner query executed First and executed Only one time.

Corelated Sub Query

Co-Related Sub query get executed once for each row of the parent query.
Inner Query executed many based on Outer query.


28.what is FILLER command in Teradata?

while running Fastload or Multiload if you dont want to load a particular field from the datafile to the target table then use the FILLER command to achieve.


29.Difference between Access Logging and Query Logging in Teradata?

1.Access Logging is concerned with security ( who is doing what)
In access logging you ask the database to log who is doing what on a given object.
The information stored is based on the object not the SQL fired or the user who fired it.

2.Query Logging (DBQL) is used for debugging (what is happening around)
Incase of DBQL database keep on tracking various parameters like SQLs, Resource, Spool Usage and other things which help you to understand whats going on, the information is fruitful to debug a problem.

30.What is basic Teradata Query language?

1.It allows us to write SQL statements along with BTEQ commands.
we can use BTEQ for Importing,Exporting and Reporting Purposes.
2.The commands start with a (.) dot can be terminated by using (;) it is not mandatory to use(;)
3.BTEQ will assume any thing written with out dot as a SQL statement and requeries a (;)
to terminate it.

31.How can you track Login parameters of users in teradata?

You can view all these parameters in this Data Dictionary Table DBC.LOGONOFF

//SELECT LOGDATE,LOGTIME,USERNAME,EVENT FROM DBC.LOGONOFF;//


32.How can you use HASH FUNCTIONS to View Data Distribution across all AMPs in Teradata?

Hash Functions can be used to view the data distribution of rows for a chosen Primary Index.

SELECT HASHAMP(HASHBUCKET(HASHROW())) AS "AMP#",COUNT(*)
FROM GROUP BY 1 ORDER BY 2 DESC;

HASHROW --- returns the row hash value for a given value.
HASHBUCKET --- the grouping of a specific Hash value.
HASHAMP-----the AMP that is associated with the Hash Bucket.

This is realy good, by looking into the result set of above written query you can easily
find out the Data distribution across all AMPs in your system and further you can easily identify un-even data distribution.


33.How do you transfer large Amount of Data in Teradata?

Transfering of large Amount of data can be done using various Applications like Teradata Utilities.
BTEQ,FastLoad,MultiLoad,Tpump and FastExport.

BTEQ (Basic Teradata Query) supports all 4 DML s : SELECT, INSERT,UPDATE and DELETE.
BTEQ also support IMPORT/EXPORT protocols.

Fastload ,Multiload and Tpump transfer the data from Host to Teradata.

FastExport is used to export data from Teradata to the Host.


34.How can you determine I/O and CPU usage at a user level in Teradata?

You can find out I/O and CPU Usage from this data Dictionary Table

DBC.AMPUSAGE;

SELECT ACCOUNTNAME,USERNAME,SUM(CPUTIME) AS CPU,SUM (DISKIO)
AS DISKIO FROM DBC.AMPUSAGE GROUP BY 1,2 ORDER BY 3 DESC;


35.what is Normalization?

Normalization is the process of reducing a complex data structure into a simple, stable one.

Generally this process involves removing redundant attributes, keys, and relationships from the
conceptual data model.

36. How many types of Indexes are present in teradata?

There are 5 different indexes present in Teradata
1. Primary Index
a.Unique primary index
b. Non Unique primary index
2. Secondary Index
a. Unique Secondary index
b. non Unique Secondary index
3. Partitioned Primary Index
a. Case partition
b. Range partition
4. Join index
a. Single table join index
b. multiple table join index
c. Sparse Join index ( constraint applied on join index in where clause)
5. Hash index
37.what teradata supports 6A's means?
Active Load
Active Access
Active Events
Active Workload Management
Active Enterprise Integration
Active Availability.
38.Which is Faster – MultiLoad Delete or Normal Delete command?
MultiLoad delete is faster then normal Delete command, since the deletion happens in data blocks of 64Kbytes, where as delete command deletes data row by row.
Transient journal maintains entries only for Delete command since Teradata utilities doesn’t support Transient journal loading.
39.what tools can be used for Active Load in teradata?

ETL Tools can use queue tables and triggers, and use FastLoad, MultiLoad and TPump utilities .

40.How to Skip or Get first and Last Record from Flat File through MultiLoad?

In .IMPORT command in Mload we have a option to give record no. from which processing should begin.
i.e. ‘FROM m’ ‘m’ is a logical record number, as an integer, of the record in the identified data source where processing is to begin.
You can mention ’m’ as 2 and processing will start from second record.
THRU k and FOR n are two options in the same Mload command, functions same towards the end of the processing.
Adding to the above, if from n"start record" and for n "stop record" are not mentioned, mload considers records from start till the end of the file.
42.what is the use of TEMP space?

TEMPORARY (TEMP) space :
A database may or may not have TEMP space, however, it is required if Global Temporary Tables are used.

43. What is the default join strategy in Teradata???
Answer
# 1
The Teradata Cost based optimizer will decide the join strategy based on the optimum path. The common strategies followed are from Merge, Hash & Nested Join
Answer
# 2
Three strategies followed by optimizer are:
1. Duplication of rows of one table to every amp
--> This one is opted by optimizer when the non-PI column is on a small table.
2. Redistribution of the non PI rows of the table to the amp containing the matching PI row amp.
--> This one is opted when the tables are large. The non PI column is in one table is redistributed to the amp containing the matching PI.
3. Redistribute both the rows of the table by hash values.
--> This is chosen when the join is on a column that is not the PI in either table. This will require the most spool space.
44. what is common practice in creating views and tables in teradata?

It is a common practice in Teradata to have some databases with PERM space that contain only tables.

Then, other databases contain only views. These view databases require no PERM space and are the only databases that users have privileges to access.

The views in these databases control all access to the real tables in other databases. They insulate the actual tables from user access .
45.PERM space is a pre allocated space? True/ False?

False. PERM space is not pre allocated space. its a maximum allowed space
to user or a database.
46.How to handle nulls in Teradata?
How many columns can be there in a table?
How to find average sal with out using avg function?
Answer
1. Use zeroifnull, nullifzero in select and NULL in insert directly.
2. 256 columns max per table.
3. Without using "avg" we can find the avg salary by using sum (sal)/count (sal);
sel sum(sal)/count(sal) as avgsal from tablename.

47.what is FALLBACK in teradata?

FALLBACK is a table protection feature used in case an AMP fails. You can use FALLBACK on all tables.

FALLBACK tables use twice as much disk space as NON-FALLBACK rows .

48. How many types of Skew exist?
If you utilized unequally TD resources (CPU,AMP,IO,Disk and etc) this is called skew exists. Major are 3 types of skews (CPU skew, AMP/Data skew, IO Skew).
-Data skew?
When data is not distributed equally on all the AMPs.
-Cpu skew?
Who is taking/consuming more CPU called cpu skew.
-IO skew?
Who perform more IO Operation? Resulting in IO Skew.
49.What are Differences between Teradata and ANSI Session modes in Teradata?
TERADATA MODE
-------------
1. Comparison is not Case sensitive
2. Create table are default to SET tables
3. Each transaction is committed implicitly
4. Supports all Teradata commands
5. It follows BTET (Begin and End Transaction) Mode
ANSI MODE
-------------
1. Comparison is CASE sensitive
2. Create table are default to MULTISET tables
3. Each transaction has to be committed explicitly
4. Does not support all Teradata commands .
5. It does not follow BTET Mode.
50.How to skip 1st record while using Bteq IMPORT?
By using SKIP=1 ; , we can skip first record.
.import infile=, skip=1;

No comments:

Post a Comment