Tuesday, November 23, 2010

SSIS Interview Questions and answers

Here are some SSIS related Interview Questions with answers. hope they help.

Question - 1 What is the control flow and data flow

In SSIS a workflow is called a control-flow. A control-flow links together our modular data-flows as a series of operations in order to achieve a desired result.
A control flow consists of one or more tasks and containers that execute when the package runs. To control order or define the conditions for running the next task or container in the package control flow, you use precedence constraints to connect the tasks and containers in a package. A subset of tasks and containers can also be grouped and run repeatedly as a unit within the package control flow.

SQL Server 2005 Integration Services (SSIS) provides three different types of control flow elements: containers that provide structures in packages, tasks that provide functionality, and precedence constraints that connect the executables, containers, and tasks into an ordered control flow.

A data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data, and the paths that link sources, transformations, and destinations. Before you can add a data flow to a package, the package control flow must include a Data Flow task. The Data Flow task is the executable within the SSIS package that creates, orders, and runs the data flow. A separate instance of the data flow engine is opened for each Data Flow task in a package.

SQL Server 2005 Integration Services (SSIS) provides three different types of data flow components: sources, transformations, and destinations. Sources extract data from data stores such as tables and views in relational databases, files, and Analysis Services databases. Transformations modify, summarize, and clean data. Destinations load data into data stores or create in-memory datasets.

Question -2 How do you do error handling in SSIS

When a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations, errors can occur. Errors frequently occur because of unexpected data values.

For example, a data conversion fails because a column contains a string instead of a number, an insertion into a database column fails because the data is a date and the column has a numeric data type, or an expression fails to evaluate because a column value is zero, resulting in a mathematical operation that is not valid.

Errors typically fall into one the following categories:

-Data conversion errors, which occur if a conversion results in loss of significant digits, the loss of insignificant digits, and the truncation of strings. Data conversion errors also occur if the requested conversion is not supported.
-Expression evaluation errors, which occur if expressions that are evaluated at run time perform invalid operations or become syntactically incorrect because of missing or incorrect data values.
-Lookup errors, which occur if a lookup operation fails to locate a match in the lookup table.

Many data flow components support error outputs, which let you control how the component handles row-level errors in both incoming and outgoing data. You specify how the component behaves when truncation or an error occurs by setting options on individual columns in the input or output.

For example, you can specify that the component should fail if customer name data is truncated, but ignore errors on another column that contains less important data.

Question- 3 How do you do logging in ssis

SSIS includes logging features that write log entries when run-time events occur and can also write custom messages.

Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files.

Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not.

To customize the logging of an event or custom message, Integration Services provides a schema of commonly logged information to include in log entries. The Integration Services log schema defines the information that you can log. You can select elements from the log schema for each log entry.

To enable logging in a package
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
2. On the SSIS menu, click Logging.
3. Select a log provider in the Provider type list, and then click Add.


Question- 4 How do you deploy ssis packages.

SQL Server 2005 Integration Services (SSIS) makes it simple to deploy packages to any computer.
There are two steps in the package deployment process:
-The first step is to build the Integration Services project to create a package deployment utility.
-The second step is to copy the deployment folder that was created when you built the Integration Services project to the target computer, and then run the Package Installation Wizard to install the packages.

Question -5 How do you schedule ssis packages to run on the fly
SQL Server 2005 Integration Services (SSIS) makes it simple to deploy packages to any computer.There are two steps in the package deployment process:-The first step is to build the Integration Services project to create a package deployment utility.-The second step is to copy the deployment folder that was created when you built the IntegrationServices project to the target computer, and then run the Package Installation Wizard to install thepackages.

Question - 6 How do you run stored procedure and get data

Question -7 A scenario: Want to insert a text file into database table, but during the upload want to change a column called as months - January, Feb, etc to a code, - 1,2,3.. .This code can be read from another database table called months. After the conversion of the data , upload the file. If there are any errors, write to error table. Then for all errors, read errors from database, create a file, and mail it to the supervisor.
How would you accomplish this task in SSIS?


Question 8- What are variables and what is variable scope ?

Variables store values that a SSIS package and its containers, tasks, and event handlers can use at run time. The scripts in the Script task and the Script component can also use variables. The precedence constraints that sequence tasks and containers into a workflow can use variables when their constraint definitions include expressions.

Integration Services supports two types of variables: user-defined variables and system variables. User-defined variables are defined by package developers, and system variables are defined by Integration Services. You can create as many user-defined variables as a package requires, but you cannot create additional system variables.

Scope :

A variable is created within the scope of a package or within the scope of a container, task, or event handler in the package. Because the package container is at the top of the container hierarchy, variables with package scope function like global variables and can be used by all containers in the package. Similarly, variables defined within the scope of a container such as a For Loop container can be used by all tasks or containers within the For Loop container.


Question 9 - True or False - Using a checkpoint file in SSIS is just like issuing the CHECKPOINT command against the relational engine. It commits all of the data to the database.

False. SSIS provides a Checkpoint capability which allows a package to restart at the point of failure.

Question 10 - Can you explain what the Import\Export tool does and the basic steps in the wizard?

The Import\Export tool is accessible via BIDS or executing the dtswizard command.
The tool identifies a data source and a destination to move data either within 1 database, between instances or even from a database to a file (or vice versa).


Question 11 - What are the command line tools to execute SQL Server Integration Services packages?

DTSEXECUI - When this command line tool is run a user interface is loaded in order to configure each of the applicable parameters to execute an SSIS package.
DTEXEC - This is a pure command line tool where all of the needed switches must be passed into the command for successful execution of the SSIS package.


Question 12 - Can you explain the SQL Server Integration Services functionality in Management Studio?

You have the ability to do the following:
Login to the SQL Server Integration Services instance
View the SSIS log
View the packages that are currently running on that instance
Browse the packages stored in MSDB or the file system
Import or export packages
Delete packages
Run packages

Question 13 - Can you name some of the core SSIS components in the Business Intelligence

Development Studio you work with on a regular basis when building an SSIS package?
Connection Managers
Control Flow
Data Flow
Event Handlers
Variables window
Toolbox window
Output window
Logging
Package Configurations

Question Difficulty = Moderate

Question 14 - True or False: SSIS has a default means to log all records updated, deleted or inserted on a per table basis.

False, but a custom solution can be built to meet these needs.

Question 15 - What is a breakpoint in SSIS? How is it setup? How do you disable it?

A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the object where you want to set the breakpoint and select the 'Edit Breakpoints...' option.


Question 16- Can you name 5 or more of the native SSIS connection managers?

OLEDB connection - Used to connect to any data source requiring an OLEDB connection (i.e., SQL Server 2000)
Flat file connection - Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
ADO.Net connection - Uses the .Net Provider to make a connection to SQL Server 2005 or other connection exposed through managed code (like C#) in a custom task
Analysis Services connection - Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
File connection - Used to reference a file or folder. The options are to either use or create a file or folder
Excel
FTP
HTTP
MSMQ
SMO
SMTP
SQLMobile
WMI


Question 17 - How do you eliminate quotes from being uploaded from a flat file to SQL Server?

In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.
Additional information: How to strip out double quotes from an import file in SQL Server Integration Services

Question 18- Can you name 5 or more of the main SSIS tool box widgets and their functionality?
For Loop Container
Foreach Loop Container
Sequence Container
ActiveX Script Task
Analysis Services Execute DDL Task
Analysis Services Processing Task
Bulk Insert Task
Data Flow Task
Data Mining Query Task
Execute DTS 2000 Package Task
Execute Package Task
Execute Process Task
Execute SQL Task
etc.

Question 19- Can you explain one approach to deploy an SSIS package?

One option is to build a deployment manifest file in BIDS, then copy the directory to the applicable SQL Server then work through the steps of the package installation wizard
A second option is using the dtutil utility to copy, paste, rename, delete an SSIS Package
A third option is to login to SQL Server Integration Services via SQL Server Management Studio then navigate to the 'Stored Packages' folder then right click on the one of the children folders or an SSIS package to access the 'Import Packages...' or 'Export Packages...'option.
A fourth option in BIDS is to navigate to File Save Copy of Package and complete the interface.



Question 20- Can you explain how to setup a checkpoint file in SSIS?

The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage - Determines if/how checkpoints are used. Choose from these options: Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints - Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.

Question 21- Can you explain different options for dynamic configurations in SSIS?

Use an XML file
Use custom variables
Use a database per environment with the variables
Use a centralized database with all variables

Question 22 - How do you upgrade an SSIS Package?

Depending on the complexity of the package, one or two techniques are typically used:
Recode the package based on the functionality in SQL Server DTS
Use the Migrate DTS 2000 Package wizard in BIDS then recode any portion of the package that is not accurate


Question 23 - Can you name five of the Performance counters for SSIS and the value they provide?

SQLServer:SSIS Service
SSIS Package Instances - Total number of simultaneous SSIS Packages running
SQLServer:SSIS Pipeline
BLOB bytes read - Total bytes read from binary large objects during the monitoring period.
BLOB bytes written - Total bytes written to binary large objects during the monitoring period.
BLOB files in use - Number of binary large objects files used during the data flow task during the monitoring period.
Buffer memory - The amount of physical or virtual memory used by the data flow task during the monitoring period.
Buffers in use - The number of buffers in use during the data flow task during the monitoring period.
Buffers spooled - The number of buffers written to disk during the data flow task during the monitoring period.
Flat buffer memory - The total number of blocks of memory in use by the data flow task during the monitoring period.
Flat buffers in use - The number of blocks of memory in use by the data flow task at a point in time.
Private buffer memory - The total amount of physical or virtual memory used by data transformation tasks in the data flow engine during the monitoring period.
Private buffers in use - The number of blocks of memory in use by the transformations in the data flow task at a point in time.
Rows read - Total number of input rows in use by the data flow task at a point in time.
Rows written - Total number of output rows in use by the data flow task at a point in time.

Question 24. Explain what ETL is?

Question 25. Which task would you use to copy, move or delete files?

Question 26. Which transform would you use to split your data based on
conditions you define?


Question 27. Explain the pros and cons of deploying to a file system vs msdb?

Question 28 Explain architecture of SSIS?

Question 29 Difference between Control Flow and Data Flow?

Question 30 How to pass property value at Run time? How do you implement Package Configuration?

Question 31 How would you deploy a SSIS Package on production?
1. Create deployment utility by setting its propery as true .
2. It will be created in the bin folder of the solution as soon as package is build.
3. Copy all the files in the utility and use manifest file to deply it on the Prod.

Question 32 Difference between DTS and SSIS?

Question 33 What are new features in SSIS 2008?

Question 34 How would you pass a variable value to Child Package?


Question 35 What is Execution Tree?

Question 36 What are the points to keep in mind for performance improvement of the package?

Question 37 You may get a question stating a scenario and then asking you how would you create a package forthat e.g. How would you configure a data flow task so that it can transfer data to different table based on thecity name in a source table column?

Question 38 Difference between Unionall and Merge Join?

Question 39 May get question regarding what X transformation do?Lookup, fuzzy lookup, fuzzy grouping

Question 40 How would you restart package from previous failure point?What are Checkpoints and how can we implement in SSIS?

Question 41 Where are SSIS package stored in the SQL Server?

MSDB.sysdtspackages90 stores the actual content and ssydtscategories, sysdtslog90,
sysdtspackagefolders90, sysdtspackagelog, sysdtssteplog, and sysdtstasklog do the
supporting roles.

Question 42 How would you schedule a SSIS packages?

Using SQL Server Agent. There are few third party applications also by which we can deploy the SSIS packages with greater flexibility.

Question 43 Difference between asynchronous and synchronos transformations?

Asynchronous transformation have different Input and Output buffers and it is up to the component designerin an Async component to provide a column structure to the output buffer and hook up the data from theinput.

Question 44 How to achieve multiple threading in SSIS? Passing variable to Child package from Parent package in SSIS


Question 45 why do we use execute sql task?

Execute SQL task is used in SSIS for running SQL queries and stored procedures

Question 46 What are the points to keep in mind for performance improvement of the package?

See answer

Question 47 Where are SSIS package stored in the SQL Server?

MSDB.sysdtspackages90 stores the actual content and ssydtscategories, sysdtslog90, sysdtspackagefolders90, sysdtspackagelog, sysdtssteplog, and sysdtstasklog do the supporting roles.

Question 48 Will trigger fire when inserting data through SSIS package?

By default it wont.
To change this behavior we have to follow following step:
In the data flow task, go to the Advanced Editor of OLEDB Destination, and there should be a property "FastLoadOptions". Specify FIRE_TRIGGERS as an additional option. That would cause it to fire triggers.

Question 49 How to unzip a File in SSIS?

The following is the process for Unzip a file in SSIS.

- Use Execute Process Task in the Control Flow Task.
- From BIDS, drag and drop an Execute Process Task to the control flow and configure.

In the Execute Process, perform the following configurations:

- Executable: The path of the application that is being used.
- Arguments: Need to supply the arguments to extract the zipped files. –o+ is the default parameter for overwriting files if they exist.
- Working Directory: The current directory for all process.

Question 50 Why Lookups are a key component in SQL Server Integration Services (SSIS)?

Data from two sources is combined by Lookup Transformation. There are matching fields from these sources. Case insensitive transformations is performed by the lookups. For the purpose of accessing additional information in relation to the tables is one of the purposes of lookups. Lookups can be used in data warehousing.

Lookups are used for the following purposes

- Data Cleansing
- Error Tolerance
- Data search in data warehouse
- Optimizations using cache modes

Question 51 How to achieve parallelism in SSIS?

Parallelism is achieved using MaxConcurrentExecutable property of the package. Its default is -1 and is calculated as number of processors + 2.

Question 52 How to Generate an Auto Incremental Number in a SSIS Package?


SSIS package does not support any function to generate auto incremental numbers. A script component can be used for the designated task. The steps are as follows:

1. Drag and drop the Script Document to the Data flow and select the Script Component Type as Transformation.
2. Double click the Script Component.
3. Select the column which is to pass through the script component, in the Input Columns tab.
4. Add a column with an integer data type, in the Inputs and Outputs tab.

Question 53 How to Generate an Auto Incremental Number in a SSIS Package?

Want to insert a text file into database table, but during the upload want to change a column called as months - January, Feb, etc to a code, - 1,2,3.. .This code can be read from another database table called months. After the conversion of the data , upload the file. If there are any errors, write to error table. Then for all errors, read errors from database, create a file, and mail it to the supervisor.
How would you accomplish this task in SSIS?

Question 54 What are new features in SSIS 2008?

With the release of SQL SERVER 2008 comes improved SSIS 2008.

1) Improved Parallelism of Execution Trees:
The biggest performance improvement in the SSIS 2008 is incorporation of parallelism in the processing of execution tree. In SSIS 2005, each execution tree used a single thread whereas in SSIS 2008 , the Data flow engine is redesigned to utilize multiple threads and take advantage of dynamic scheduling to execute multiple components in parallel, including components within the same execution tree

2) Any .NET language for Scripting:
SSIS 2008 is incorporated with new Visual Studio Tool for Application(VSTA) scripting engine. Advantage of VSTA is it enables user to use any .NET language for scripting.

3) New ADO.NET Source and Destination Component:
SSIS 2008 gets a new Source and Destination Component for ADO.NET Record sets.

4) Improved Lookup Transformation:
In SSIS 2008, the Lookuo Transformation has faster cache loading and lookup operations. It has new caching options, including the ability for the reference dataset to use a cache file(.caw) accessed by the Cache Connectin Manager. In addition same cache can be shared between multiple Lookup Transformations.

5) New Data Profiling Task:
SSIS 2008 has a new debugging aid Data Profiling Task that can help user analyze the data flows occurring in the package.In many cases, execution errors are caused by unexpected variations in the data that is being transferred. The Data Profiling Task can help users to discover the cource of these errors by giving better visibility into the data flow.

6) New Connections Project Wizard:
One of the main usability enhancement to SSIS 2008 is the new Connections Project Wizard. The Connections Project Wizard guides user through the steps required to create source and destinations.

Question 55 How to pass property value at Run time? How do you implement Package Configuration?

A property value like connection string for a Connection Manager can be passed to the pkg using package configurations.Package Configuration provides different options like XML File, Environment Variables, SQL Server Table, Registry Value or Parent package variable.


Question 56 How would you do Logging in SSIS?

SSIS includes logging features that write log entries when run-time events occur and can also write custom messages.

Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files.

Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not.

To customize the logging of an event or custom message, Integration Services provides a schema of commonly logged information to include in log entries. The Integration Services log schema defines the information that you can log. You can select elements from the log schema for each log entry.

To enable logging in a package
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
2. On the SSIS menu, click Logging.
3. Select a log provider in the Provider type list, and then click Add.

Question 57 Explain architecture of SSIS?

SSIS architecture consists of four key parts:
a) Integration Services service: monitors running Integration Services packages and manages the storage of packages.

b) Integration Services object model: includes managed API for accessing Integration Services tools, command-line utilities, and custom applications.

c) Integration Services runtime and run-time executables: it saves the layout of packages, runs packages, and provides support for logging, breakpoints, configuration, connections, and transactions. The Integration Services run-time executables are the package, containers, tasks, and event handlers that Integration Services includes, and custom tasks.

d) Data flow engine: provides the in-memory buffers that move data from source to destination.

4 comments:

soumya said...

Great collection of content on MSBI Online Training Bangalore, Check it once.

rmouniak said...

Very nice article Thanks for sharing
.Net Online Training Hyderabad

thulasiram said...

I found so many interesting stuff in your blog especially its discussion. From the tons of comments on your articles, I guess I am not the only one having all the enjoyment here! Keep up the good work
JAVA Training in Chennai |
JAVA Course in Chennai |
Best JAVA Training in Chennai

thulasiram said...

I discovered so many exciting things in your weblog site especially its conversation. From the plenty of feedback on your content, I think I am not the only one having all the entertainment here! Keep up the great perform
DOT NET Training in Chennai |
DOT NET Course in Chennai |
DOT NET Course Chennai