Friday, 15 April 2016

TERADATA INTERVIEW QUESTIONS AND ANSWERS

NOTE: ANSWERS FOR THE QUESTIONS ARE NOT VERIFIED (FEW MAY BE WRONG/IN ACCURATE)
--------------------------------------------------------------------------------------------------------------------------

 

1.How do you find the list of employees named "Madhu" in an Party table without using Like operator??

 Above question looks like confused
Yes ,another way by which we can find names/patterns without using like operator.
By using "BETWEEN" , we can find the list of customer named Madhu
sel * from party where name between 'M' and 'N';
But at times usage of between is tricky, if there are other customer Names starts with M also will be listed.

2.What is the acceptable range for skew factor in a table?

There is no particular range for skew factor.  In case of production systems, it is suggested to keep skew factor between 5-10.  
There are various considerations for skew factor

- Number of AMPS

- Size of row in a table

- number of records in a table

- PI of a table

- Frequent access of table (Performance consideration)

- whether table getting loaded daily /monthly or how frequently data is being refreshed.


3.How do you list all the objects available in given database?

 

 select * from dbc.tables where databasename='<DATABASENAME>';


 By running a normal help command on that database as follows.  

help database <DATABASENAME';

 

4.How to rename columns using views?

 

Create a view which is a subset of the Party table as follows.

Renaming columns in views will help increase security of sensitive tables and hiding columns under alias names

 

Replace  view Partyview ( number,fullname,addr,phono,codenum) as 

locking row for access  

sel

partyNo ,                        

Name  ,                        

Address,                       

Phone   ,                      

CodeNo                   

from Party;                

 

If we Run this below query

 sel * form party.view, Only alias columns are displayed .

 

5.What are Restrictions on Views in Teradata?

An index cannot be Created on a view.

It cannot contain an ORDER BY clause.

All the derived columns and aggregate columns used in the view  must have an AS clause (alias defined).

A view cannot be used to UPDATE if it contains the following :

        – Data from more than one table (JOIN VIEW)

        – The same column twice

        – Derived columns

        – A DISTINCT clause

        – A GROUP BY clause

 

6. Built-in functions used in teradata ?

 

The main functionality of built in functions is  that they do not need any arguments or paramaters and can be used directly with select to return system values.

 

• SESSION: – Returns a number for the session for current user .

• TIME: – this function provides the current time based on a 24-hour day

• USER: –  This one gives the user name of the current user.

• ACCOUNT: – display's your Teradata Account information 

• CURRENT_DATE: – Returns the current system date 

• CURRENT_TIME: - This function returns the current system time and current session ‘Time Zone’ displacement. 

• CURRENT_TIMESTAMP: - Returns the current system timestamp  with TimeZone

• DATABASE: –  It  returns the name of the default database for the current user.

• DATE: – same as Current_DATE   and is teradata built in .

 

7.How do you know whether table is locked or not?

 

Simple just run this query on the table.

 

Lock Table DBNAME.TABLENAME write nowait

Select * from DBNAME.TABLENAME;

 

If this table is locked currently then , then the above statement would fail as Nowait will not wait for locks to be released on that  table .

 

one more way

 

 

You can find it by using Show locks utility in Teradata Manager Tool.

 

You can use Teradata Manager -> Administrator -> Database Console Utility -> Show locks

 

8.what are the  different date formats available in Teradata system?

 

Two different Date formats are

 

 The Teradata default format is: 

YY/MM/DD

 

The ANSI display format is: 

YYYY-MM-DD

 

 

9.How to handle nulls in Teradata?How many columns can be there in a table???

 

1. Use zeroifnull, nullifzero in select and NULL in insert directly.

2. 256 columns max per table.

 

10.What are the scenarios in which Full Table Scans occurs?

 

1. The where clause in SELECT statement does not use either primary index or secondary index

2. SQL Statement which uses a partial value (like or not like), in the WHERE statement.

3. SQL Statement which does not contain where clause.

4. SQL statement using range in where clause. Ex. (col1 > 40 or col1 < =10000)

 

11.What is a role? Profile?

 

A role is a set of access rights which can be assigned to the users.   They indirectly help in performance by reducing the number of rows entered in DBC.accessrights

 

A profile contains set of user parameters  like accounts, default database, spool

space, and temporary space for a group of users

 

To assign the profile to a user, use the AS PROFILE modified in a CREATE USER or

MODIFY USER statement:

MODIFY USER username AS PROFILE=profilename ;

To remove a profile from a member but retain the profile itself:

MODIFY USER username AS PROFILE=NULL ;

 

12.How can you track Login Parameters of users in Teradata?

 

You can view all these parameters in this Data Dictionary Table DBC.LOGONOFF

SELECT LOGDATE, LOGTIME, USERNAME, EVENT FROM DBC.LOGONOFF;

 

13.Why are AMPs and PEs called as vprocs ?

 

AMPs and PEs are implemented as “virtual processors - vprocs”. 

They run under the control of PDE and their number is software configurable.

AMPs are associated with “virtual disks – vdisks” which are associated with logical units (LUNs) within a disk array.

 

Vprocs:Virtual process From PE to AMP (This is the network root via MSP(message passing layer),The processing data will store in Disks(These are Physical disks),Each Amp have too many P.disks,to migrate these P.disks The Bynet Network maintains Virtual disks.These V.disks will responsible for data migration.hence they are called as Virtual Process(VPROCS).

 

14.How does Hashing happens in Teradata?

 

  Hashing is the mechanism through which data is distributed and retrieved to/from AMPs.

 Primary Index (PI) value of a row is the input to the Hashing Algorithm.

 Row Hash (32-bit number) value is the output from this Algorithm.

 Table Id + Row Hash is used to locate Cylinder and Data block.

 Same Primary Index value and data type will always produce same hash value.

 Rows with the same hash value will go to the same AMP.

 

So data distribution depends directly on the Row Hash uniqueness; be careful while Choosing Indexes in Teradata.


15.Define POSITION Function?

 

The POSITION function Returns the actual position of the character which occurs first. POSITION function is ANSI standard.

 

Teradata has an equivalent function called INDEX.

 

Both the POSITION and INDEX functions returns position of character's first occurrence in a string.

 

Examples for the POSITION function

 

SELECT POSITION( 'e' IN 'Read');  Displays Result as '2' 

SELECT POSITION( 'ad' IN 'Read'); Displays Result as '3' 

SELECT POSITION( 's' IN 'Read');  Displays Result as '0'  

 

Examples for the INDEX function.

 

SELECT INDEX('Write', 'i');    Displays Result as '3'   

SELECT INDEX('Write', 'te');   Displays Result as '4'   

 

16.Explain LOB?

 

It is possible to create functions which operate on large object data types.

 The following library functions exists to provide the ability use LOB's as function input or output:

 

• FNC_GetLobLength - acquire Lob length

• FNC_LobAppend - add to Lob in memory

• FNC_LobClose - close a context to a Lob

• FNC_LobOpen - open a context to a Lob

• FNC_LobRead - read some or all

• FNC_LobRef2Loc - convert a Lob Reference to a Locator 

• FNC_LobLoc2Ref - convert a Locator to a Lob Reference

 

Example :

The CREATE FUNCTION syntax permits BLOB's or CLOB's to be the data type of an argument or a return value

 

CREATE FUNCTION JPEG_CROP

(

Image BLOB AS LOCATOR 

Width INTEGER, 

Height INTEGER, 

X_Offset INTEGER, 

Y_Offset INTEGER


RETURNS BLOB AS LOCATOR

LANGUAGE C

NO SQL

EXTERNAL;

 

 

 

1.Definition of Teradata? Tell me important components of Teradata?

Definition:
It is an RDBMS that drives company's Datawarehouse,Datamart,OLTP,OLAP,DSS Appliances.
a.An "Open system which executes on UNIX MR-RAS(Discontinued in teradata 13),WIN2K, Suse Linux ETC & Above OS.
b.Compatible with ANSI standards
c.Runs on Single or Multi nodes.
d.Built in Parallelism
e.Acts Like a Server.

New Featues (Teradata 13):
Designed with the below appliances.
a.Extreme Data warehouse 1550.
d.Active Enterprise Datawarehouse 5550 H,5555 C/H
c.Internet and E-business
d.Customer Relationship Management.
e.Datamart Appliance 2500,2550,2555

Important Components:
a.PE(Parsing Engine )
b.Bynet
c.Amp (Access Module Processor)
d.VSS(Virtual Storage System) (from Teradata13)
e.Vdisk(Virtual Disk)

2.How to run the Teradata jobs in unix environment ?
At unix prompt,Please execute in the below way.
$Sh&gt;BTEQ&lt;[Script Path]&gt;[Logfile Path]
or
$Sh&gt;BTEQ&lt;[Script Path] TEE [Logfile Path]

Not only these ways ,there are many ways.
Similarly to other utilities.


3.There is a load to the Table every one hour and 24/7.Morning trafic is high, afternoon trafiic is less, Night trafiic is high.According to this situation which Utility you use and how do you load,which utility used?
Tpump is suggestable here
By Using packet size increasing or decreasing we can handle traffic.



4.Fast Load Script is failed and error tables are available then how do you restart?
There are 2 ways
1.Incase of Old file to Run
Dont drop error tables simply rectify the error in the script or file and Run again.so that it runs from last configuration.
2.Incase of new file to Run
Drop Error tables
Try to run the script with only begin and end loading statements. ,so that it releases the lock on target table If possible remove the record from fastlog table.
Run the script with new file freshly.


5.What are the advantages of other ETL tools(Informatica,DataStage,SSIS etc...) than Teradata Utilities or vice versa?
TD Utilities run faster than other ETL tools incase of File to Table(Import) or Table to File (Export) Operations. Inmod and Outmod also help us to do better programing while Importing and Exporting.
Major advantages of Other ETL tools than TD Utilities:

1.Heterogeneous Sources and Destinations we can Operate. (Oracle,Sqlserver,Excel,Flat Files etc....)
2.As they are providing full GUI support,degugging is so easier.


 

3.Reusability of components(Informatica:mapplets,worklets etc...) available in ETL tools so if we change in main location automatically other applications(mappings) which are using these components can update instantly. 4.Pivoting(Normalization),Unpivot(Denormalization) we can implement very easily in ETL tools.
5.Caching is one more advantage when we work with a source (Heterogenous) which is not changing frequently.Some times cache can be shared across applications etc....

6.Aborted in Phase 2 data acquisition completed in

?
Simply take Begin and End loading in scripts and Run again.so that it runs from Amp to table.

7.How do you Generate sequence at the time of Display?
By Using CSUM

8.How do you Generate Sequence in Teradata?

By Using Identity Column
1-for storing purpose using identity.
2-for display purpose using csum.

9.How do you load Multiple files to a table by using fast load scripts?
Remove End Loading statement in the script and Replace the file one by one in the script till last file and submit every time so that data appended in Amp Level.For the last file specify End Loading statement in the script and Run.so that it runs from Amp to table.

10.Why Multiload and Fastload does not supports SI,JI,RI and TRIGGERS?
Above all concepts require communication between Multiple Amps. As per Fastload and Multiload doesnot provide any communication b/w multiple Amps and they Operate Independently.As concept is violating as well as it takes time to implement the above operation they are not allowing them.

11.Why Multiload does not supports USI and supports NUSI?
Index subtable row is on the same Amp as the data row in NUSI.so it operates each Amp independently and in parallel.

12.While Executing MLOAD Client System Restarted?
We need to Manualy Submit the script,so that it loads data from last checkpoint.

13.While Executing MLOAD Teradata Server Restarted?
Along with the server MLAOD script will also restarted and Runs from last check point.

14.There is file it contains 100 records,need to load 60 records by skipping first 20 records and last 20 records.
use BTEQ Utility to do this task by Skip = 20 and Repeat 60 in script.

15.How to see current teradata version?
.SHOW VERSION

16.what is Node?
A Node is nothing but Collection of Hardware and Software components.Typically a Server is called as node.

17.What is PDE?

Parallel Data Extension
A software interface layer on top of the operating system that enables the database to operate in a parallel environment. It was created by NCR to support the parallel environment.

18.What is Trusted parallel database (TPD)?
A database is called TPD if it runs under PDE.
Teradata is a database which runs under PDE.so we call Teradata as pure Parallel database or Trusted parallel database.

19.What is channel driver?

A Software that communicates between the PEs and applications running on channel-attached clients.

20.What is Teradata Gateway?
Terdata Gateway software gives communication between the application and the PEs assigned to network-attached clients. There is one Gateway per node.

21.What is Virtual disk?


A collection of Cylinders(physical disks) arranged in an Array fashion is called Vdisk or Virtual disk.Traditionally this is called as disk Array or Array of disk.

22.what is Amp?
Access Module Processor
It is a Virtual processor responsible for managing one portion of the database(collection of virtual
disks).This portion is not sharable by any other AMP.so well call this Architecture as shared nothing Architecture.
Amp contains Database Manager subsystem and it performs the below operations
a.Performing DDL
b.Performing DML
c.Implementing Joins,Aggregations.
d.Applying and Releasing Locks etc.

23.What is Parsing Engine?

PE is type of Vproc it takes SQL request and delivers SQL response.
It has software components to break SQL into steps, and send the steps to the AMPs.

Session Control
A session is nothing but logical connection between User and Application.Here it controls Authorization if its a Valid it does log on otherwise log off.
Parser
checks syntactical errors
checks semantical errors
checks existence of objects.
Dispatcher
It takes set of request and keep in a queue ,delivers set of responses by keeping the same queue that means it does request response flow control.

24.How many MAX session PE handles at a time?

PE handles MAX 120 sessions.

25.What is BYNET?

It acts like a "Message Communication" happens between Components.It is responsible for
1. Sending messages
2. Merging data
3.Sorting data

26.what is Clique?
It prevents from Node Failure.
a.A Clique is a collection of Nodes will Share Common Disk drives.
b.whenever any node went down automatically Vprocs all migrate from fail node to other node to retrieve data from common disk drives.

27.List different types of LOCKS in teradata?
Teradata can apply four types of LOCKS
a.Access Lock
b.Read lock
c.Write Lock
d.Exclusive Lock

28.At what level teradata can apply a LOCK?

1.Database level lock --- All objects inside database will be locked.
2.Table level -- All rows inside a table will be locked.
3,Row hash level lock-- Only Corresponding row will be locked.

29.How many AMPs involved in case of Primary Index?
Its always one AMP.


30.what about UPSERT command in teradata?


UPSERT means update else insert.
In Teradata we have this option.

31.What is Advantage of PPI?

Mainly we use for Range based data storing or category based Data storing.
Range queries don't have to utilize a Full Table Scan.Its directly goes to the corresponding partition and skips other partitions.


 and Multiload work with PPI tables, but not with all Secondary Indexes.

32.what is the size of BYTEINT,SMALLINT,INTEGER?


BYTEINT - 1 Bytes - 8 bits -- &gt; -128 to 127
SMALLINT - 2 bytes - 16 bites: ---&gt; -32768 to 32767

INTEGER - 4 BYTES - 32 BITS ---&gt; -2,147,483,648 to 2,147,483,647

33.Difference between user and database in teradata?


Database User

A database is Passive Repository User is a Active.

It stores all database objects It stores any object such as table,macro,view etc.

It does not contain Password. It contains password

34..Difference between primary key and Primary Index?

Primary Index Primary Key


It is mandatory It is optional

Limit of 64 columns/table No limit

Allows Nulls & Duplicates Doesnt allows

It is physical mechanism Logical mechanism

Effects Data Distribution Does not effect Data Distribution

35.What is the use of Spool Space?

Teradata spool space is unused Perm space that is used for running queries.Teradata recommend 20 % of the available perm space is allocated for spool space.
This is used to hold intermittent results of the queries and volatile tables.

36.What is the need of Performance tuning?
To identify bottlenecks and to resolve it we go for Performance tuning.
Bottle neck is not an error but it causes system delay in Performance.
example: There is a query it is suppose to run in 2 mins but executed for half an hour and finally succeeded.
In this situation we need to identify bottlenecks and resolve it.
To Identify bottlenecks we go for
a.Explain Request Modifier
b.Teradata Visual Explain
c.Performance monitor
d.Teradata Manager

37.Define Explain plan?
Explain plan displays the execution plan of SQL statement that is going to be executed by the database.This plan
will be specified by the component called optimiser.Generaly it displays below information
a.Number of Amps
b.Amount of spool memory it is occupying.
c.Number of Rows its affecting.
d.Type of Join strategy it is taking.
e.Time it takes to execute.
f.Locks it is Using etc.
Syntax : EXPLAIN
Example : EXPLAIN SEL * FROM PARTY;

38.What is Collect statistics?
Collect stats just derives the data demographics of the table.
Collect stats is an important concept in teradata, Collect stats gives PE to come up with a plan with least cost plan for an requested query.
Collect stats defines the confidence level of PE in estimating
"how many rows it is going to access ?
how many unique values does a table have , null values etc and all this info is stored in data dictionary.
Once you submit a Explain plan query in TD the parsing engine checks if the stats are available for the requested table .
If collected stats available already PE generates a plan with "high confidence" .
If Collect stats unavailable gives "low confidence" .

Syntax : COLLECT STATISTICS ON INDEX/COLUMN NAME

39.What is Least Cost Plan?

It executes in shortest path with less time.

40.What is Highest Cost Plan?

It executes in Longest path with more time.

41.How many Confidence Level present?
a.Low
b.No
c.High
d.Join

42. If collect stats is not done on the table what will happen?


Teradata uses a cost based optimizer and cost estimates are done based on statistics.
So if you donot have statistics collected then optimizer will use a Dynamic AMP Sampling method to get the stats. If your table is big and data was unevenly distributed then dynamic sampling may not get right information and your performance will suffer.


43.What are the 5 phases in a MultiLoad Utility?
* Preliminary Phase – Basic Setup *
* DML Phase – Get DML steps down on AMPs*
* Data Acquisition Phase – Send the input data to the AMPs and sort it*
* Application Phase – Apply the input data to the appropriate Target Tables*
* End Phase – Basic Cleanup*


44.What are the MultiLoad Utility limitations?

MultiLoad is a very powerful utility; it has following limitations:

* MultiLoad Utility doesn’t support SELECT statement.
* Concatenation of multiple input data files is not allowed.
* MultiLoad doesn’t support Arithmatic Functions i.e. ABS, LOG etc. in Mload Script.
* MultiLoad doesn’t support Exponentiation and Aggregator Operators i.e. AVG, SUM etc. in Mload Script.
* MultiLoad doesn’t support USIs (Unique Secondary Indexes), Refrential Integrity, Join Indexes, Hash Indexes and Triggers.
* Import task require use of PI (Primary Index).



45.What are TPUMP Utility Limitations?

Following are the limitations of Teradata TPUMP Utility:
* Use of SELECT statement is not allowed.
* Concatenation of Data Files is not supported.
* Exponential & Aggregate Operators are not allowed.
* Arithmatic functions are not supported.


46.Explain Teradata Competetive Advantages detail?

1.Automatic,Even Data Distribution
In Teradata Even uniform or Parallel or Random distrubution is Automatic.
2.High Scalability
If you Increase the number of Nodes or users or work teradata does not sacrifice any performance and it scales Linearly.we are calling this scalabilty as linear scalability.
3.Mature Optimizer
As we are having powerful optimiser in teradata it supports 64 joins/query, 64 subquery/query,
formating commands and aggregate commands.
4.Models the business
Teradata supports any business models for star,snowflake schema,Hybrid schema,Normalisation etc
5.Low Cost Tco(Total cost of Ownership)
Its easy to Install,Manage,work.
Full Support Gui and Cheaper in price.


47.How do you set the session mode parameters in BTEQ?

.set session transaction ANSI-----&gt; this is to set ANSI mode
.set session transaction BTET-----&gt; this is to set Teradata transaction mode.

These commands have to be entered before logging to the session.



48.How Teradata makes sure that there are no duplicate rows being inserted when its a SET table?

Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate.

If it’s a duplicate it silently skips it without throwing any error.


49.List types of HASH functions used in teradata?

There are HASHROW, HASHBUCKET, HASHAMP and HASHBAKAMP.

The SQL hash functions are:

HASHROW (column(s))

HASHBUCKET (hashrow)

HASHAMP (hashbucket)

HASHBAKAMP (hashbucket)


50.what is derived table?

a.It stores intermediate results and calculations.
b.You can specify derived table in an SQL statement(Preferrably Select).
c.This table created and dropped as a part of the query.
d.Its stored under spool memory.
e.Once the query finishes execution table is not available.
f.This types of tables are called as Inline Query tables.

 

1.What is Journaling? why teradata requires journaling?

Journling is a data protection mechanism in teradata.It prevents from Image failure.
Journals are generated to maintain pre-images and post images of a DML transaction starting/ending, at/from a checkpoint.
When a DML transaction fails,the table is restored back to the last available checkpoint using the journal Images.

There are 3 types of Journals
Permanent
Transient
Recovery Journal

2.How many types of Image supports Journaling?

Four types of Images are supported by Journaling they are

a. Single Image----&gt;one copy of data will be taken.

b.Dual Image-----&gt;Two copies of data will be taken.

c.Before Image-----&gt;Before changes occur on the row data will be taken.

d.After Image----&gt;After changes happen on the row taking copy of data.

3.What is Transient Journal?

Transient Journal - an area of space in the DBC database which is used primarily for storing of roll-back information during inserts/deletes/updates of tables.

Detail Explanation:
The Transient Journal maintains a copy of all before images of all rows affected
by the transaction.
If the event of transaction failure, the before images are reapplied to the affected tables, the images are deleted from the journal and a rollback operation is completed.
In the event of transaction success, at the point of transaction commit,
the before images for the transaction are discarded from the journal.
In Summary, if a Transaction fails (for whatever reason), the
before images in the transient journal are used to return the data (in the
tables involved in the transaction) to its original state.

4.What is Permanent Journal?

The purpose of the Permanent Journal is to provide selective or full database recovery to a specified point in time.It permits recovery from unexpected hardware or software disasters.
The Permanent Journal also has the effect of reducing the need for full table backups which can be costly both in time and resource.

5.What are the different image options for Permanent Journal?

There are four image options for the Permanent Journal:

· Before Journal

· After Journal

· Dual Before Journal

· Dual After Journal

6.Permanent Journals are automatically purged in teradata. True / False?

False.

The Permanent Journal must be manually purged from time to time.

7.Where does TD store transient journal?

In perm space -&gt; dbc.transient journal
But that special table can grow over dbc's perm limit, until the whole system runs out of perm space.

8.What are the different return codes(severity errors) in Teradata utilities?

There are 3 basic return codes (severity errors) in teradata utilities.
4 - Warning
8 - User error
12 - System error


9.How will you connect a database server to other server?

We can connect from one server to another server in UNIX using the command
ssh or FTP or SU
ssh - ssh user_id@server_name

10.What is the meaning of skewness in Teradata?

Data or Amp skew occurs in teradata due to uneven distribution of data across all the amps. Often this leads to spool space error too. To avoid skewness try to select a Primary Index which has as many unique values as possible.

PI columns like month day etc. will have very few unique values. So during data distribution only a few amps will hold all the data resulting in skew. If a column (or a combination of columns) is chosen a PI which enforces uniqueness on the table then the data distribution will be even and the data will not be skewed.

11.Does Primary Index column choosing is important?

The Success of teradata warehouse starts with choosing of correct column
to creation of Primary index.

Try to Choose a column which has unique values. so that data can be distributed evenly among all AMPs.
Otherwise Skewness will come into picture.
Primary index is useful to get a straight path to retrieve data.

12.What are the basic rules that define how PI is defined in Teradata?

These are the following rules how Primary Index defined in Teradata
a. Only one Primary Index per table.
b.Its a physical mechanism which assigns Row to AMP.
c.A Primary Index value can be unique or non-unique.
d.A Primary Index can be composite till 64 columns.
e.The Primary Index of a populated table cannot be modified.

13.What are the basic criteria to select Primary Index column for a given table?
Answer:

A thumb rule of ADV demographics is followed.
Access Demographics
Identify index candidates that maximize one-AMP operations.
Columns most frequently used for access (Value and Join).

Distribution Demographics
Identify index candidates that optimize parallel processing.
Columns that provide good distribution.

Volatility Demographics
Identify index candidates with low maintenance I/O.

14.can you explain and PE and AMPs communicate?


When user connects to teradata database he opened a session with parsing Engine(PE)

there after when user submits a query,

1. First PE takes query, checks syntax, verifies his access rigthts
2. If every thing looks okay, PE prepare an action plan for AMP.
Which AMP to respond , what is row ID to read ...etc
3. Then PE sends action plan to AMP via BYNET signals.
4. Then corresponding AMP takes action plan and reads data and send back to PE
Then PE sends data to USER

15.Does Permanent Journals and Secondary indexes require Perm Space?

Ans: Yes



16.which objects require Perm space in teradata?

Tables and Stored Procedures objects require Perm Space

Views, Macros , Triggers doesn't require Perm space.



17.What is LOG TABLE?
A log table maintains record of all checkpoints related to the load job, it is essential/mandatory to specify a log table in a job. This table will be useful in case you have a job abort or restart due to any reason.


18.what is the use of Partition?

If you create PPI on table, then data at AMPs is ordered using Partition column.

Example
If we have Partition on deptno then all dept 10 records at one place at AMP and dept 20 records at one place.

19.Is it possible to alter NO RANGE and Unknown partition on a table?

Yes if table is empty,we can alter NO RANGE and Unknown Partition of a table.

20.can you apply a read lock on table where write lock is already applied?

No


21.How many partition in given SQ L?

Assume SQ L,

Partition by CASE_N
( partycode = 10,
partycode = 20,
partycode = 30,
partycode = 40 ,
NO CASE OR UNKNOWN)
)


Answer -- 5 partitions

22.How many partions in given SQL?
Assume SQL,

Partition by CASE_N
( partycode = 10,
partycode = 20,
partycode = 30,
partycode = 40 ,
NO CASE, UNKNOWN)
)

Answer --- 6 partitions

Here it creates 6 Partitions

If a value partycode entered is not NULL and first 4 cases are not satisfied then data goes
to "NO CASE" partition.

NULL values goes to UNKNOWN partition.


23.How teradata makes sure that there are no duplicate rows being inserted when its a SET table?


Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate.
If its a duplicate it silently skips it without throwing any error.



24. Explain What are Low and High Confidentiality in Explain command?

The explain generally displays the plan which would be prepared by Optimizer.
Confidence levels indicate how well the optimizer is aware about the demographic data for a particular step.
High confidence: Means the optimizer know about the no of rows that would be returned as a result of that step. Examples would be when PI statistics exist when the column or range stats exist or no join involved.

Low confidence: Some stats available. Join and stats available on both sides of the join.

No confidence: No stats available. Join involved.

25.Why Fastload Utility does not support multiset table and duplication?

Multi set table supports duplicate rows.
Fastload utility does not support duplicate rows.

Restart logic is one of the reason. Fastload is one of the utility which loads data in blocks(64K). During restart Fastload sends some of the rows for the second time which occur after a checkpoint. Due to this Fastload rejects duplicates.

example
consider 20 rows to be loaded in a table. FL utility is used to load the table and ckpt is 5 rows. If restart occurs after 7th row FL may send 6 and 7th to AMPs during the restart. This records would be considered as duplicates and rejected.


26.Can you Recover the password of a User in Teradata?

No , you cant recover the password of a User in Teradata. Passwords are stored in Data Dictionary table (DBC.DBASE).Using a one way encryption method.
You can view the encrypted passwords using the following query.

SEL * FROM DBC.DBASE;

27.what is the differnce between Sub Query and Corelated Sub Query?

Sub Query

If Queries written in a nested manner then its termed as a sub query.
Inner query executed First and executed Only one time.

Corelated Sub Query

Co-Related Sub query get executed once for each row of the parent query.
Inner Query executed many based on Outer query.


28.what is FILLER command in Teradata?

while running Fastload or Multiload if you dont want to load a particular field from the datafile to the target table then use the FILLER command to achieve.


29.Difference between Access Logging and Query Logging in Teradata?

1.Access Logging is concerned with security ( who is doing what)
In access logging you ask the database to log who is doing what on a given object.
The information stored is based on the object not the SQL fired or the user who fired it.

2.Query Logging (DBQL) is used for debugging (what is happening around)
Incase of DBQL database keep on tracking various parameters like SQLs, Resource, Spool Usage and other things which help you to understand whats going on, the information is fruitful to debug a problem.

30.What is basic Teradata Query language?

1.It allows us to write SQL statements along with BTEQ commands.
we can use BTEQ for Importing,Exporting and Reporting Purposes.
2.The commands start with a (.) dot can be terminated by using (;) it is not mandatory to use(;)
3.BTEQ will assume any thing written with out dot as a SQL statement and requeries a (;)
to terminate it.

31.How can you track Login parameters of users in teradata?

You can view all these parameters in this Data Dictionary Table DBC.LOGONOFF

//SELECT LOGDATE,LOGTIME,USERNAME,EVENT FROM DBC.LOGONOFF;//


32.How can you use HASH FUNCTIONS to View Data Distribution across all AMPs in Teradata?

Hash Functions can be used to view the data distribution of rows for a chosen Primary Index.

SELECT HASHAMP(HASHBUCKET(HASHROW())) AS "AMP#",COUNT(*)
FROM GROUP BY 1 ORDER BY 2 DESC;

HASHROW --- returns the row hash value for a given value.
HASHBUCKET --- the grouping of a specific Hash value.
HASHAMP-----the AMP that is associated with the Hash Bucket.

This is realy good, by looking into the result set of above written query you can easily
find out the Data distribution across all AMPs in your system and further you can easily identify un-even data distribution.


33.How do you transfer large Amount of Data in Teradata?

Transfering of large Amount of data can be done using various Applications like Teradata Utilities.
BTEQ,FastLoad,MultiLoad,Tpump and FastExport.

BTEQ (Basic Teradata Query) supports all 4 DML s : SELECT, INSERT,UPDATE and DELETE.
BTEQ also support IMPORT/EXPORT protocols.

Fastload ,Multiload and Tpump transfer the data from Host to Teradata.

FastExport is used to export data from Teradata to the Host.


34.How can you determine I/O and CPU usage at a user level in Teradata?

You can find out I/O and CPU Usage from this data Dictionary Table

DBC.AMPUSAGE;

SELECT ACCOUNTNAME,USERNAME,SUM(CPUTIME) AS CPU,SUM (DISKIO)
AS DISKIO FROM DBC.AMPUSAGE GROUP BY 1,2 ORDER BY 3 DESC;


35.what is Normalization?

Normalization is the process of reducing a complex data structure into a simple, stable one.

Generally this process involves removing redundant attributes, keys, and relationships from the
conceptual data model.

36. How many types of Indexes are present in teradata?

There are 5 different indexes present in Teradata

1. Primary Index

a.Unique primary index

b. Non Unique primary index

2. Secondary Index

a. Unique Secondary index

b. non Unique Secondary index

3. Partitioned Primary Index

a. Case partition

b. Range partition

4. Join index

a. Single table join index

b. multiple table join index

c. Sparse Join index ( constraint applied on join index in where clause)

5. Hash index

37.what teradata supports 6A's means?

Active Load
Active Access
Active Events
Active Workload Management
Active Enterprise Integration
Active Availability.

38.Which is Faster – MultiLoad Delete or Normal Delete command?

MultiLoad delete is faster then normal Delete command, since the deletion happens in data blocks of 64Kbytes, where as delete command deletes data row by row.

Transient journal maintains entries only for Delete command since Teradata utilities doesn’t support Transient journal loading.

39.what tools can be used for Active Load in teradata?


ETL Tools can use queue tables and triggers, and use FastLoad, MultiLoad and TPump utilities .

 

40.How to Skip or Get first and Last Record from Flat File through MultiLoad?


In .IMPORT command in Mload we have a option to give record no. from which processing should begin.
i.e. ‘FROM m’ ‘m’ is a logical record number, as an integer, of the record in the identified data source where processing is to begin.
You can mention ’m’ as 2 and processing will start from second record.

THRU k and FOR n are two options in the same Mload command, functions same towards the end of the processing.

Adding to the above, if from n"start record" and for n "stop record" are not mentioned, mload considers records from start till the end of the file.

42.what is the use of TEMP space?


TEMPORARY (TEMP) space :
A database may or may not have TEMP space, however, it is required if Global Temporary Tables are used.

 

43. What is the default join strategy in Teradata???

Answer
# 1

The Teradata Cost based optimizer will decide the join strategy based on the optimum path. The common strategies followed are from Merge, Hash & Nested Join

Answer
# 2

Three strategies followed by optimizer are:

1. Duplication of rows of one table to every amp

--&gt; This one is opted by optimizer when the non-PI column is on a small table.

2. Redistribution of the non PI rows of the table to the amp containing the matching PI row amp.

--&gt; This one is opted when the tables are large. The non PI column is in one table is redistributed to the amp containing the matching PI.

3. Redistribute both the rows of the table by hash values.

--&gt; This is chosen when the join is on a column that is not the PI in either table. This will require the most spool space.

44. what is common practice in creating views and tables in teradata?


It is a common practice in Teradata to have some databases with PERM space that contain only tables.

Then, other databases contain only views. These view databases require no PERM space and are the only databases that users have privileges to access.

The views in these databases control all access to the real tables in other databases. They insulate the actual tables from user access .

45.PERM space is a pre allocated space? True/ False?


False. PERM space is not pre allocated space. its a maximum allowed space
to user or a database.

46.How to handle nulls in Teradata?

How many columns can be there in a table?

How to find average sal with out using avg function?

Answer

1. Use zeroifnull, nullifzero in select and NULL in insert directly.

2. 256 columns max per table.

3. Without using "avg" we can find the avg salary by using sum (sal)/count (sal);

sel sum(sal)/count(sal) as avgsal from tablename.

47.what is FALLBACK in teradata?

FALLBACK is a table protection feature used in case an AMP fails. You can use FALLBACK on all tables.

FALLBACK tables use twice as much disk space as NON-FALLBACK rows .

48. How many types of Skew exist?

If you utilized unequally TD resources (CPU,AMP,IO,Disk and etc) this is called skew exists. Major are 3 types of skews (CPU skew, AMP/Data skew, IO Skew).

-Data skew?

When data is not distributed equally on all the AMPs.

-Cpu skew?

Who is taking/consuming more CPU called cpu skew.

-IO skew?

Who perform more IO Operation? Resulting in IO Skew.

49.What are Differences between Teradata and ANSI Session modes in Teradata?

TERADATA MODE

-------------

1. Comparison is not Case sensitive

2. Create table are default to SET tables

3. Each transaction is committed implicitly

4. Supports all Teradata commands

5. It follows BTET (Begin and End Transaction) Mode

ANSI MODE

-------------

1. Comparison is CASE sensitive

2. Create table are default to MULTISET tables

3. Each transaction has to be committed explicitly

4. Does not support all Teradata commands .

5. It does not follow BTET Mode.

50.How to skip 1st record while using Bteq IMPORT?

By using SKIP=1 ; , we can skip first record.

.import infile=, skip=1;

 

.Which statement about spool space is true?

A.) Spool space must be specified during creation of the user
B.) Spool space is used to hold the intermediate result set.
C.) Spool space is permanent space.
D.) Spool space is limited to the size of the table.

Answer--- B

2. Which Utility used to copy a table and restore it to another database?

A.) BTEQ
B.) Archive Recovery
C.) Fastload
D.) Table rebuild

Answer ---B

3.The Given below, which tool should you use to collect and analyze database performance?

A.) BTEQ
B.) TDQM
C.) Teradata SQL Assistant
D.) Teradata Manager

Answer --- D

4. In Which way is data distributed with a partitioned primary index?

A.) based on the secondary unique index
B.) based on the Primary key of the table
C.) Based on the Primary Index of the table
D.) based on the partitioning column .

Answer --- c

5. Batch process block level loading are performed against populated tables by which utility?

A.) Multiload
B.) bulkload
C.) Tpump
D.) Fastload

Answer-- A

6. . Teradata Index wizard used for to

A.) compare performance of multiple workloads
B.) recommend secondary indexes for a table
C.) recommend primay indexes for a table
D.) compare primay and secondary indexes for a table

Answer --B

7. Which statement is true if using NUPIs to accomplish data distribution?

A.) The increase of hash buckets has resulted in more hash collisions.
B.) Duplicate NUPI values will produce the same row hash.
C.) Uniform row distribution is accomplished using UPIs.
D.) The low-order 16 bits of the row hash are input to the hash map.

Ans -- B

8. Which is the component limits queries based on set thresholds

A.) TDQM
B.) Teradata Manager
C.) BTEQ
D.) Query Man

Answer --- A

9.How many BYNETs in a teradata system?

Well, BYNET is both a hard ware and software BYNET.
Hard ware BYNET connects every PE and AMP in the sytem . It can be scalable at any time.
Every teradata system has two BYNETS . one for backup in case of one failed

However both BYNETS can be used at same time by AMP or PES .

10. What is TENACITY? What is its default value?

TENACITY specifies the amount of time in hours, to retry to obtain a loader slot or to establish all requested sessions to logon.

The default for Fast Load is “no tenacity”, meaning that it will not retry at all.

If several FastLoad jobs are executed at the same time, we recommend setting the TENACITY to 4, meaning that the system will continue trying to logon for the number of sessions requested for up to four hours.

11. How do you create materialized view in Teradata?

There is no such thing as a "materialized view" in Teradata.

The equivalent in Teradata would be a join index (or aggregate index) with a corresponding view put on top of it. The command to create one is "CREATE JOIN INDEX...(lots of options)".

Join indexes are maintained when the data is maintained. They cannot be accessed directly, but are invoked when the optimizer determines it is beneficial.

Even though you cannot invoke a join index directly, you create a view that looks like the join index and in that way, it would be similar to having direct access.

However, this approach does not guarantee that the join index will be used.

12.who will read and write data in teradata?


AMP works like a hard worker. AMP read and write data to their own assigned
disks.

13. Failure 7547 Target row updated by multiple source rows.


-updating the same target row by multiple source rows

-Remove the duplicate data.

14.How do you execute the given SQL statement repeatedly in BTEQ?

Select top 1* from database.table1;

=n

Here "=n” is to run the previous sql statement, "n" number of times.

15. Failure 2646 No more spool space

Check select query well tuned.
Collecting stats on the table may fix this problem.

16. How do you declare a variable inside a Teradata macro and Teradata stored procedure?

No facility to declare variables inside a Teradata macro, however you can pass values into the macro by means of input parameters. These values will then be embedded in the SQL statements that the macro contains.

Teradata stored procedure supports IN, OUT and INOUT parameters, as well as the ability to declare variables within the procedure itself. Such variables could be useful as counters when you have looping logic.

17. What does SLEEP function does in Fast load?

The SLEEP command specifies the amount minutes to wait before retrying to logon and establish all sessions.

Sleep command can be used with all load utilities not only fastload.

This situation can occur if all of the loader slots are used or if the number of requested sessions is not available. The default value is 6 minutes. If tenacity was set to 2 hours and sleep 10 minutes, Sleep command will try to logon for every 10 minutes up to 2 hours duration.

18. Given below, What is the maximum level of data protection?


A.) RAID 1 and Fallback
B.) RAID 5 and Fallback
C.) RAID 5 and No Fallback
D.) RAID 1 and No Fallback

ANswer - A

19. When utilizing a unique secondary index, how many AMP(s) are most probably accessed?

A.) 1 AMP
B.) Multi_AMP
C.) 2 AMP
D.) All-AMP

Answer --- c

20. Difference between count(*) and count(1) ?

Count is one of very important functions used in any database .

But what many dont know that is the result count we get from COUNT function might be different based on how it is used.

Let us consider the example of count(*) and Count(1) and understand how there is some difference the usage.

Say for a given table with data


X y
------------
21 31
Null 32
Null 33
41 34
Null Null

And this query:


select count(*), count(x), count(1) from table;

Results: 5, 2, 4

* Count(*) counts all occurrences including nulls. This is a row count for a given table.
* Count(x) counts all non null occurrences of values in column x. If x has null values, count(x) will be less than count(*).
* Count(1) is similar to count(x) .It counts all null values in a "pseudo column" created by the constant. Since the constant 1 will never be null, this would, be like count(*) i.e row count .

21.What is use of compress in teradata?Explain?

Compression is used to Minimize the table size, for example when the table size is increasing anonymously We can use Compression to reduce the size of the table.

Conditions:
1.Compression can be declared at the time of table creation
2.We can compress up to 256 column values(not columns)
3.We can't compress variable length fields (vartext,varchar..)

22. What is the difference between start schema and Fsldm?

FSLDM --&gt; Financial Services Logical Data Model (Teradata FS-LDM 7.0) its developed by Teradata for the financial sectors (Specifically for Bank). it can be customised based on the user requirement.

StarSchema --&gt; Its a relational database schema for representing multidimensional data. The data is stored in a central fact table, with one or more tables holding information on each dimension.

23. Teradata performance tuning and optimization steps?


1. collecting statistics
2. Explain Statements
3. Avoid Product Joins when possible
4. select appropriate primary index to avoid skewness in storage
5. Avoid Redistribution when possible
6. Use sub-selects instead of big "IN" lists
7. Use derived tables
8. Use GROUP BY instead of DISTINCT ( GROUP BY sorts the data locally on the VPROC. DISTINCT sorts the data after it is redistributed)
9. Use Compression on large tables.

24. Explain command displays the execution plan of the query chosen by the Teradata Optimizer

a) False b) True

Answer: b




25. Success/Failure/Error responses from Teradata are received by

a) Request Parcel b) Data parcel c) Response Parcel

Answer: c

26. Teradata Database facilities for software fault tolerance on

a) Vproc Migration & Fallback tables b) AMP Clusters & Journaling

c) Backup/Archive/Restore d) Table Rebuild Utility e) All of the above

f) None of the above

Answer: e

27.Total Number of nodes suppport by BYNET are

a)510 SMP Nodes a) 512 SMP Nodes b) 513 SMP Nodes c) 514 SMP Nodes

Answer: b

28.SMP Stands for

a) Symmetric Multi-processor b) Symmetric Massive-Processor

c) System Multi-processor d) Symmetric Multi-Processor

Answer: a

29.Terabytes represent

a) 10 power 15 b) 10 power 12 c) 10 power 18 d) 10 power 24

Answer: b

30.Global Temporary Tables are stored in the TD are

a) Perm Space b) Spool Space c) Temp Space

Answer : c

31.What are the following object required for Teradata Database-Prem space?.

a) Tables & Stored Procedure b) Tables & Views

c) Macros & Triggers d) Views & Stored Procedures

Answer: a.

32. HASHROW funtions is

a. Returns the hash bucket number that corresponds to a specified row hash value.

b. Returns the hexadecimal row hash value for an expression

c. Returns the AMP number of the primary AMP corresponding to the specified hash bucket number.

d. Returns the identification number of the fallback AMP corresponding to the specified hash bucket.

Answer: b

33.Teradata—A Brief History ?

1979 — Teradata Corp founded in Los Angeles, California.

Development begins on a massively parallel database computer.

1982 — BYNET technology is patented.

1984 — Teradata sells first DBC/1012 to Wells Fargo Bank of California. Total revenue for year – $3 million.

1986 — Product of the Year!

1989 — Teradata and NCR partner on next generation of DBC.

1990 — First Terabyte system installed and in production.

1991 — NCR is acquired by AT&T. Teradata revenue $280M.

1992 — Teradata is merged into NCR.

1995 — Teradata Version 2 for UNIX operating systems released.

1996 — NCR only vendor to run multi-user TPC-D Benchmarks.

1996 — AT&T spins off NCR Corporation with Teradata product.

1997 — Teradata database becomes industry leader in scalable data warehousing.

1997 — 100 Gigabyte TPC-D Benchmark leader.

1998 — 24 Terabyte system in production.

1999 — 1 TB TPC-D winner

2000 — 50+ TB system in production! Teradata Division created.

VN:F [1.9.8_1114]

34.What are the functions of a Tera data DBA?

Following are the different functions which a DBA can perform:

1. User Management – Creation and managing Users, Databases, Roles, Profiles and Accounts.

2. Space Allocation – Assigning Permanent Space, Spool Space and Temporary Space.

3. Access of Database Objects – Granting and Revoking Access Rights on different database objects.

4. Security Control – Handling logon and logoff rules for Users.

5. System Maintenance – Specification of system defaults, restart etc.

6. System Performance – Use of Performance Monitor(PMON), Priority Scheduler and Job Scheduling.

7. Resource Monitoring – Database Query Log(DBQL) and Access Logging.

8. Data Archives, Restores and Recovery – ARC Utility and Permanent Journals.

35. what is Full table scan (FTS) ?

What are the scenarios in which Full Table Scans occurs?

A Full Table Scan (FTS) is a query that reads every row of a table.

Scenarios Full table scan occurs

1. Where clause in SELECT statement does not use either primary index or secondary index.

2. SQL Statement which uses a partial value (like or not like), in the WHERE statement.

3. SQL Statement which does not contain where clause.

4.SQL statement using range in where clause. Ex. (col1 &gt; 40 or col1 &lt; =10000)

36. How to identify PPI columns?

Select databasename , tablename , columnposition ,columnname from dbc.indexes

where indextype ='Q'

order by 1 ,2,3 ;

37.Can you create a table with no Primary Index in teradata?

NO. Its not possible.... if you dont mention external primary index, teradata creates one for you.

Teradata uses this Primary index to distribute data among AMPs.

38.In case of NUSI, does AMP has local subtable?

Yes. In case of NUSI, AMP is going to have a subtable

which has only information for the rows in the same AMP.

But in case of USI, Each row in AMP is hashed and decide AMP number

to store its information in Secondary index Sub table.

39. How long ROW ID is?

64 Bits...

AMP receives 32 bit hash value and row data from PE. AMP then append a 32 bit unique value to row HASH to make 64 bit ROW ID.

ROW ID is used to sort rows in AMP

First AMP checks is there any row with same HASH value? If yes, then it append32 bit 1 to (0000 0000 0000 0000 0000 0000 0000 0001) to row hash and make it 64 bit ROW ID... If one more row comes with same hash value then it appends 2( 0000 0000 0000 0000 0000 0000 0000 0010) to Row HASH.

40.Can a user creates any object who assigned no perm space?

A user with No permspace can't create any tables in his user space.

But He can create Macro, Views and triggers.

Because these objects will be stored in Global Space .

41. When a user query fails because of spool space?

Query may fails because of spool space in two cases

When there is no empty perm space. i.e complete system is full.

User query requires more spool space than user spool space limit .

42.Does AMP will do row by row search once it has Hash value to retrieve a row?

AMP receives a hash value from PE for which data is required.

AMP applies a Binary Search to find that row.... Its never do linear search.

43.A Table has UPI on EMP no. I wrote a query in which i used deptno in emp table?

How many amps gets this data?

Even though table has Primary Index ,

my SQL query written on

dept number on which there is not index.... So its a Full table scan.

44.While Multiload is running client system restarted?

We need to manually resubmit the script so that it runs from Last configuration check point.

45.While Executing Multiload script teradata server restarted?

Along with server multiload script restarted and Runs from last checkpoint.

46.Can we concatenate Multiple files with same structure and load in multiple tables?

NO ,only in fastload its possible.

47.How do you load data from a file to multiple tables MLOAD?

In this apply class we use where conditions while loading data into multiple tables.

Apply Ins_table where Recind = ‘T1’

Apply Ins_table where Recind = ‘T2’

48.If Multiload is fail in Acquistion or Application phase how do you restart?

a.Incase of old file to Run

Dont drop ET,UV,WT and log tables

Dont Release locks on target tables

Rectify errors in the File or script and Run again.

b.Incase of new file to Run

Drop ET,UV,WT and log table.

Release locks on target tables by using Release MLOAD command.

RELEASE MLOAD Table1, ...Table 5

RELEASE MLOAD Table1, ...Table 5 IN APPLY

 

49.Terdata Data Dictonary Directory explain?

The DD/D ( Data Dictionary Directory)

is an integrated set of system tables

contains definitions of and information about all objects in the system

is entirely maintained by the RDBMS

is “data about the data” or “metadata”

is distributed across all AMPs like all tables

may be queried by administrators or support staff

is accessed via Teradata supplied views

Examples of DD/D views:

DBC.Tables —&gt; information about all tables
DBC.Users —&gt; information about all users
DBC.AllRights —&gt; information about access rights
DBC.AllSpace —&gt; information about space utilization.

 

50.Can I use drop statement in Fload utility?

YES,

But you have to declare it out of the FLOAD Block it means
it should not come between .begin loading,.end loading

FLOAD also supports DELETE,CREATE,DROP statements which we
have to declare out of FLOAD block.

In the FLOAD Block we can give only INSERT

===================================

1. How do you define Teradata? Give some of the primary characteristics of the same.

Teradata is basically an RDMS which is used to drive the Datamart, Datawarehouse, OLAP,

OLTP, as well as DSS Appliances of the company. Some of the primary characteristics of

Teradata are given below.

Is capable of running on Single-nodes, as well as multi-nodes.

Parallelism is built into the system.

Very much compatible with the standards of ANSI.

Tends to act in the same way as a server.

It is an Open System that basically executes for UNIX MR-RAS, Suse Linux ETC,

WIN2K, etc.

2. What are the newly developed features of Teradata?

Some of the newly developed features of Teradata are: -

Automated temporal analytics

Extension in the compression capabilities which allows flexible compression of data

about 20 times more data than the previous version.

Customer associated innovation like tetradata viewpoint.

3. Highlight a few of the important components of Teradata.

Some of the important components of Teradata are: -

Bynet

Access Module Processor (AMP)

Parsing Engine (PE)

Virtual Disk (vDisk)

Virtual Storage System (VSS)

 

4. Mention the procedure via which, we can run Teradata jobs in a UNIX environment.

All you have to do is perform execution in UNIX in the way as mentioned below.

$Sh > BTEQ [Logfile Path]

or

$Sh > BTEQ

5. In Teradata, how do we Generate Sequence?

In Teradata, we Generate Sequence by making use of Identity Column

6. During the Display time, how is the sequence generated by Teradata?

All you have to do is use CSUM.

7. A certain load is being imposed on the table and that too, every hour. The traffic in the

morning is relatively low, and that of the night is very high. As per this situation, which is

the most advisable utility and how is that utility supposed to be loaded?

The most suggestible utility here has to be Tpump. By making use of packet size decreasing or

increasing, the traffic can be easily handled.

8. If Fast Load Script fails and only the error tables are made available to you, then how

will you restart?

There are basically two ways of restarting in this case.

Making the old file to run – Make sure that you do not completely drop the error tables.

Instead, try to rectify the errors that are present in the script or the file and then execute

again.

Running a new file – In this process, the script is executed simply using end loading and

beginning statements. This will help in removing the lock that has been put up on the

target table and might also remove the given record from the fast-log table. Once this is

done, you are free to run the whole script once again.

 

9. Mention a few of the ETL tools that come under Teradata.

Some of the ETL tools which are commonly used in Teradata are DataStage, Informatica, SSIS,

etc.

10. Highlight a few of the advantages that ETL tools have over TD.

Some of the advantages that ETL tools have over TD are: -

Multiple heterogeneous destinations, as well as sources can be operated.

Debugging process is much easier with the help of ETL tools owing to full-fledged GUI

support.

Components of ETL tools can be easily reused, and as a result, if there is an update to

the main server, then all the corresponding applications connected to the server are

updated automatically.

De-pivoting and pivoting can be easily done using ETL tools.

11. What is the meaning of Caching in Teradata?

Caching is considered as an added advantage of using Teradata as it primarily works with the

source which stays in the same order i.e. does not change on a frequent basis. At times, Cache

is usually shared amongst applications.

12. How can we check the version of Teradata that we are using currently?

Just give the command .SHOW VERSION.

13. Give a justifiable reason why Multi-load supports NUSI instead of USI.

The index sub-table row happens to be on the same Amp in the same way as the data row in

NUSI. Thus, each Amp is operated separately and in a parallel manner.

14. How is MLOAD Client System restarted after execution?

The script has to be submitted manually so that it can easily load the data from the checkpoint

that comes last.

15. How is MLOAD Teradata Server restarted after execution?

The process is basically carried out from the last known checkpoint, and once the data has

been carried out after execution of MLOAD script, the server is restarted.

16. What is meant by a node?

A node basically is termed as an assortment of components of hardware and software. Usually

a server is referred to as a node.

17. Let us say there is a file that consists of 100 records out of which we need to skip the

first and the last 20 records. What will the code snippet?

We need to use BTEQ Utility in order to do this task. Skip 20, as well as Repeat 60 will be used

in the script.

18. Explain PDE.

PDE basically stands for Parallel Data Extension. PDE basically happens to be an interface

layer of software present above the operation system and gives the database a chance to

operate in a parallel milieu.

19. What is TPD?

TPD basically stands for Trusted Parallel Database, and it basically works under PDE. Teradata

happens to be a database that primarily works under PDE. This is the reason why Teradata is

usually referred to as Trusted Parallel or Pure Parallel database.

20. What is meant by a Channel Driver?

A channel driver is software that acts as a medium of communication between PEs and all the

applications that are running on channels which are attached to the clients.

21. What is meant by Teradata Gateway?

Just like channel driver, Teradata Gateway acts as a medium of communication between the

Parse Engine and applications that are attached to network clients. Only one Gateway is

assigned per node.

22. What is meant by a Virtual Disk?

Virtual Disk is basically a compilation of a whole array of cylinders which are physical disks. It is

sometimes referred to as disk Array.

23. Explain the meaning of Amp?

Amp basically stands for Access Module Processor and happens to be a processor working

virtually and is basically used for managing a single portion of the database. This particular

portion of database cannot be shared by any other Amp. Thus, this form of architecture is

commonly referred to as shared-nothing architecture.

24. What does Amp contain and what are all the operations that it performs?

Amp basically consists of a Database Manager Subsystem and is capable of performing the

operations mentioned below.

Performing DML

Performing DDL

Implementing Aggregations and Joins.

Releasing and applying locks, etc.

25. What is meant by a Parsing Engine?

PE happens to be a kind Vproc. Its primary function is to take SQL requests and deliver

responses in SQL. It consists of a wide array of software components that are used to break

SQL into various steps and then send those steps to AMPs.

26.What do you mean by parsing?

Parsing is a process concerned with analysis of symbols of string that are either in computer

language or in natural language.

27. What are the functions of a Parser?

A Parser: -

Checks semantics errors

Checks syntactical errors

Checks object existence

28. What is meant by a dispatcher?

Dispatcher takes a whole collection of requests and then keeps them stored in a queue. The

same queue is being kept throughout the process in order to deliver multiple sets of responses.

29. How many sessions of MAX is PE capable of handling at a particular time?

PE can handle a total of 120 sessions at a particular point of time.

30. Explain BYNET.

BYNET basically serves as a medium of communication between the components. It is primarily

responsible for sending messages and also responsible for performing merging, as well as

sorting operations.

31. What is meant by a Clique?

A Clique is basically known to be an assortment of nodes that is being shared amongst common

disk drives. Presence of Clique is immensely important since it helps in avoiding node failures.

32. What happens when a node suffers a downfall?

Whenever there is a downfall in the performance level of a node, all the corresponding Vprocs

immediately migrate to a new node from the fail node in order to get all the data back from

common drives.

33. List out all forms of LOCKS that are available in Teradata.

There are basically four types of LOCKS that fall under Teradata. These are: -

Read Lock

Access Lock

Exclusive Lock

Write Lock

34. What is the particular designated level at which a LOCK is liable to be applied in

Teradata?

Table Level – All the rows that are present inside a table will certainly be locked.

Database Level Lock – All the objects that are present inside the database will be

locked.

Row Hash Level Lock – Only those rows will be locked which are corresponding to the

particular row.

35. In the Primary Index, what is the score of AMPs that are actively involved?

Only one AMP is actively involved in a Primary Index.

36. In Teradata, what is the significance of UPSERT command?

UPSERT basically stands for Update Else Insert. This option is available only in Teradata.

37. Highlight the advantages of PPI(Partition Primary Index).

PPI is basically used for Range-based or Category-based data storage purposes. When it

comes to Range queries, there is no need of Full table scan utilization as it straightaway moves

to the consequent partition thus skipping all the other partitions.

38. Give the sizes of SMALLINT, BYTEINT and INTEGER.

SMALLINT – 2 Bytes – 16 Bites -> -32768 to 32767

BYTEINT – 1 Bytes – 8 Bits -> -128 to 127

INTEGER – 4 Bytes – 32 Bits -> -2,147,483,648 to 2,147,483,647

39. What is meant by a Least Cost Plan?

A Least Cost Plan basically executes in less time across the shortest path.

40. Highlight the points of differences between the database and user in Teradata.

A database is basically passive, whereas a user is active.

A database primarily stores all the objects of database, whereas a user can store any

object whether that is a macro, table, view, etc.

Database does not has password while the user has to enter password.

41. Highlight the differences between Primary Key and Primary Index.

Primary index is quite mandatory, whereas Primary Key is optional.

Primary Index has a limit of 64 tables/columns, whereas Primary Key does not have any

limit.

Primary Index allows duplicates and nulls, whereas Primary Key doesn’t.

Primary Index is a physical mechanism, whereas Primary Key is purely logical

mechanism.

42. Explain how spool space is used.

Spool space in Teradata is basically used for running queries. Out of the total space that is

available in Teradata, 20% of the space is basically allocated to spool space.

43. Highlight the need for Performance Tuning.

Performance tuning in Teradata is basically done to identify all the bottlenecks and then resolve

them.

44. Comment whether bottleneck is an error or not.

Technically, bottleneck is not a form of error, but it certainly causes a certain amount of delay in

the system.

45. How can bottlenecks be identified?

There are basically four ways of identifying a bottleneck. These are: -

Teradata Visual Explain

Explain Request Modifier

Teradata Manager

Performance Monitor

7 / 9

http://career.guru99.com/

46. What is meant by a Highest Cost Plan?

As per Highest Cost Plan, the time taken to execute the process is more, and it takes the

longest path available.

47. Highlight all the modes that are present under Confidence Level.

Low, No, High and Join are the four modes that are present under Confidence Level.

48. Name the five phases that come under MultiLoad Utility.

Preliminary Phase, DML Phase, Data Acquisition Phase, Application Phase and End Phase.

49. Highlight the limitations of TPUMP Utility.

Following are the limitations of TPUMP utility: -

We cannot use SELECT statement.

Data Files cannot be concatenated.

Aggregate and Exponential operators are not supported.

Arithmetic functions cannot be supported.

50. In BTEQ, how are the session-mode parameters being set?

.set session transaction BTET -> Teradata transaction mode

.set session transaction ANSI -> ANSI mode

These commands will work only when they are entered before logging into the session.

---------------------------------------------------------------------------------------------------------------------

 

 

 

 

 

 

 

 

 

 

 

    

1 comment:

  1. It is nice blog Thank you provide important information and I am searching for the same information to save my time Big Data Hadoop Online Course

    ReplyDelete