Readme_Capture Data Lineage Package Sample

11/05/2008 21:36:06


This sample works only with SQL Server 2005 and SQL Server 2008. It will not work with any version of SQL Server earlier than SQL Server 2005.
This sample works with the SQL Server 2005 version of the AdventureWorks OLTP database. To install this database, see Sample Databases for Microsoft SQL Server 2008.
The Capture Data Lineage sample is a package that captures audit information. When the package is run, the package loads five identically configured files, and then uses the following Integration Services components to process these files:
  • An Audit transformation adds columns of historical information, such as the file names, to the data before loading the data into a table.
  • An OLE DB destination loads the data from the files into a table, LineageFactTable, in the AdventureWorks database.
  • An Execute SQL task creates the LineageFactTable table the first time that the package runs, and truncates the table on every later run.
  • A second Execute SQL task queries the LineageFactTable and stores the table rows in a variable of the Object data type.
  • A Foreach Loop container extracts the table row values, which are stored in the variable of the Object data type, into separate variables. The container has a Script task that writes the values of the separate variables to a text file.
If you run the sample on a non-English version of Windows, you may have to substitute the localized name of the Program Files folder to open or run the sample.
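The data flow steps in the list above (reading the files, then adding lineage columns) can be pictured with a minimal Python sketch. It is an illustration only, not the package's implementation: the function name add_lineage_columns, the comma delimiter, and the choice to store only the base file name are all assumptions, because this readme does not describe the file layout.

```python
import csv
import getpass
import os
from datetime import datetime


def add_lineage_columns(paths, user_name=None, start_time=None):
    """Yield each data row extended with file name, user name, and start time."""
    # One user name and one start time are shared by every row of a run,
    # so they are resolved once, before any file is read.
    user_name = user_name or getpass.getuser()
    start_time = start_time or datetime.now()
    for path in paths:
        # Assumption: keep only the base name of the file, not its full path.
        file_name = os.path.basename(path)
        with open(path, newline="") as handle:
            # Assumption: the sample files are comma-delimited.
            for row in csv.reader(handle):
                yield row + [file_name, user_name, start_time.isoformat()]
```

Calling list(add_lineage_columns(["Data732.txt", "Data733.txt"])) would return every row from both files with three extra values appended, which corresponds to the extra columns that the package loads into the table.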

Important:
Samples are provided for educational purposes only. They are not intended to be used in a production environment and have not been tested in a production environment. Microsoft does not provide technical support for these samples.



Requirements

Running this sample package requires the following:
  • The sample package and data files that it uses must be installed on the local hard disk drive.
  • You must have installed the AdventureWorks OLTP database, and you must have administrative permissions on it.
  • If you intend only to run the sample package from the command line, you must install Integration Services.
  • If you intend to open the package in SSIS Designer and run the sample package, you must install Business Intelligence Development Studio.
For more information about how to install samples, see "Installing Sample Integration Services Packages" in SQL Server Books Online.

Location of the Sample Package

If the samples were installed to the default installation location, the Capture Data Lineage sample package, CaptureDataLineage.dtsx, is located in the following folder:
C:\Program Files\Microsoft SQL Server\100\Samples\Integration Services\Package Samples\CaptureDataLineage Sample\CaptureDataLineage\
The following files are required to run this sample package.

File                      Description
CaptureDataLineage.dtsx   The sample package.
Data732.txt               Flat file sample data.
Data733.txt               Flat file sample data.
Data734.txt               Flat file sample data.
Data735.txt               Flat file sample data.
Data736.txt               Flat file sample data.
CheckQueryResults.txt     Text file that contains the results of the query that the second Execute SQL task runs.


Running the Sample

The package can be run from the command line by using the dtexec utility, or can be run in Business Intelligence Development Studio.
If you are using a non-English version of Windows, or if you have installed the samples to a non-default location, you may have to update the ConnectionString property of any file connection managers used in the package to run the sample package successfully. You should verify that the path used in the connection manager is valid on your computer, and if you need to, modify the path so that it uses the correct path to the sample files.
For this sample, you may have to update "Program Files" in the ConnectionString property for the Sample Data and Check Query Results connection managers.
To run the package by using dtexec
  1. Open a Command Prompt window.
  2. Change the directory to C:\Program Files\Microsoft SQL Server\100\DTS\Binn, the location of dtexec.
  3. Type the following command: dtexec /f "C:\Program Files\Microsoft SQL Server\100\Samples\Integration Services\Package Samples\CaptureDataLineage Sample\CaptureDataLineage\CaptureDataLineage.dtsx"
  4. Press Enter.
For more information about how to run the package by using the dtexec utility, see the topic "dtexec Utility" in SQL Server Books Online.
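On a non-English version of Windows the Program Files folder may have a localized name, so the paths in the steps above may need adjusting. The following Python sketch shows one possible way to assemble the same dtexec command from the ProgramFiles environment variable. The function name build_dtexec_command is hypothetical, and the sketch assumes that both SQL Server and the samples are in their default locations under Program Files.

```python
import ntpath
import os

# Relative locations taken from this readme; only the Program Files root varies.
DTEXEC_RELATIVE = r"Microsoft SQL Server\100\DTS\Binn\dtexec.exe"
PACKAGE_RELATIVE = (
    r"Microsoft SQL Server\100\Samples\Integration Services\Package Samples"
    r"\CaptureDataLineage Sample\CaptureDataLineage\CaptureDataLineage.dtsx"
)


def build_dtexec_command(program_files=None):
    """Return the argument list for running the sample package with dtexec /f."""
    # ProgramFiles is the standard Windows environment variable for this folder;
    # the English default is used only when the variable is not set.
    root = program_files or os.environ.get("ProgramFiles", r"C:\Program Files")
    # ntpath joins with backslashes even when the sketch is read on another OS.
    return [
        ntpath.join(root, DTEXEC_RELATIVE),
        "/f",
        ntpath.join(root, PACKAGE_RELATIVE),
    ]


if __name__ == "__main__":
    # Printing instead of running keeps the sketch free of side effects.
    print(build_dtexec_command())
```

On a computer that has dtexec installed, the returned list could be passed to subprocess.run. If the samples were installed to a non-default location, the package path would have to be supplied separately.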
To run the package in Business Intelligence Development Studio
  1. Open Business Intelligence Development Studio.
  2. On the File menu, point to Open, and click Project/Solution.
  3. Locate the CaptureDataLineage Sample folder, and then double-click the file named CaptureDataLineage.sln.
  4. In Solution Explorer, right-click CaptureDataLineage.dtsx in the SSIS Packages folder, and then click Execute Package.

Components in Sample

The following table lists the Integration Services tasks, containers, data adapters, and transformations that are used within the sample.

Element                                 Purpose
Execute SQL task                        The Execute SQL task, Create LineageFactTable, runs an SQL statement that creates the LineageFactTable table the first time that you run the package, and then truncates the table when you rerun the package.
Data Flow task                          The Data Flow task, Get Data Lineage Information, executes the data flow in the package.
Flat File source                        The Flat File source, Extract Data from Files, loads the flat file source data and adds a column for the file name to each output row.
Audit transformation                    The Audit transformation, Add Data Lineage Information, adds two new columns for lineage information to each output row. The columns contain user name and start time.
                                        Note: The default length of the column for the user name is 64 characters. If your organization might have user names that exceed 64 characters, you must update the column length by using the Advanced Editor dialog box.
OLE DB destination                      The OLE DB destination, Load Data into LineageFactTable, loads the results to the LineageFactTable in the AdventureWorks database.
Execute SQL task                        The Execute SQL task, Query LineageFactTable, queries the LineageFactTable table. The task then stores the table rows, as a full result set, in the SQLResults variable of the Object data type.
Foreach Loop container                  The Foreach Loop container, Enumerate Rows in LineageFactTable, iterates through each table row that is stored in the SQLResults variable. The container then extracts column values into package variables that are mapped to the columns. To enumerate the table rows, the Foreach Loop container uses the Foreach ADO enumerator.
Script task                             The Script task, Write Query Results to Text File, writes the values of the variables that are mapped to the LineageFactTable columns to a text file.
Flat File connection manager            The Flat File connection manager, Check Query Results, connects to the file to which the Script task writes the values of variables.
Multiple Flat Files connection manager  The Multiple Flat Files connection manager, Sample Data, connects to files that have the .txt extension.
OLE DB connection manager               The OLE DB connection manager, (local).AdventureWorks, connects to the AdventureWorks database on the local server.
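The way the Foreach Loop container and the Script task work together can be sketched in Python as a loop that maps each row's column values to named variables and writes them out as one line of text. This is only an analogy under stated assumptions: the function name write_rows_to_text_file, the column names in the usage note, and the name=value line format are hypothetical, because this readme does not show the Script task code or the layout of CheckQueryResults.txt.

```python
def write_rows_to_text_file(rows, column_names, out_path):
    """Write one line per row, listing each column value by name."""
    with open(out_path, "w", encoding="utf-8") as out:
        # One iteration per table row, as a row enumerator would do.
        for row in rows:
            # Map the column values of this row to named variables.
            variables = dict(zip(column_names, row))
            # Assumed output format: comma-separated name=value pairs.
            line = ", ".join(f"{name}={value}" for name, value in variables.items())
            out.write(line + "\n")
    return out_path
```

For example, write_rows_to_text_file([(1, "Data732.txt", "tester")], ["Id", "FileName", "UserName"], "out.txt") would write a single line that lists the three values by name.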


Sample Results

To see the execution results of the Capture Data Lineage sample package, run the following Transact-SQL query:

Select * from AdventureWorks.dbo.Lineage_Fact_Table


In these results, you will see the columns populated with the data retrieved from the flat files, with the addition of generated lineage information in the File Name, User Name, and Execution Start Time columns.
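Because the first Execute SQL task creates the table once and empties it on later runs, the query returns only the rows from the most recent run. The following Python sketch imitates that create, clear, and reload pattern. It uses an in-memory SQLite database as a stand-in for SQL Server, so DELETE replaces TRUNCATE TABLE, which SQLite does not have. The table name LineageDemo, its columns, and the function name reload_lineage_table are hypothetical and are not taken from the package.

```python
import sqlite3


def reload_lineage_table(conn, rows):
    """Create the table if it is missing, clear it, load rows, and return them."""
    # Create the table only when it does not exist yet (first run).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS LineageDemo ("
        "Value TEXT, FileName TEXT, UserName TEXT, ExecutionStartTime TEXT)"
    )
    # SQLite has no TRUNCATE TABLE, so DELETE stands in for the truncate step.
    conn.execute("DELETE FROM LineageDemo")
    conn.executemany("INSERT INTO LineageDemo VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    # Equivalent in spirit to the SELECT * query shown above.
    return conn.execute("SELECT * FROM LineageDemo").fetchall()
```

Calling the function twice on the same connection, for example one opened with sqlite3.connect(":memory:"), returns only the rows passed to the second call, which mirrors what you see when you rerun the package and repeat the query.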
© 2008 Microsoft Corporation. All rights reserved.

Last edited Feb 16, 2009 at 10:16 PM by sabottaca, version 10

Comments

linqiurong Apr 17, 2012 at 10:21 PM 
Or set "AlwaysUseDefaultCodePage" to true on component "Load Data into LineageFactTable".

bezhart Jan 23, 2010 at 6:01 PM 
In a non-English environment, the task "Create Lineage_Fact_Table" needs to use the collation "Latin1_General_CI_AS" so that the columns use code page 1252.