BIML 101
A collection of a my BIML / SSIS scripts and examples. BIML (Business Intelligence Markup Language) is a product of varigence: http://varigence.com/
Tuesday, October 22, 2013
BIML Adding a description to your packages
<Packages>
<Package Name="Demo" ConstraintMode="Parallel">
<Annotations>
<Annotation AnnotationType ="Description">(c) i8Solutions</Annotation>
</Annotations>
</Package>
</Packages>
</Biml>
Till Next Time
Tuesday, October 9, 2012
BIML Transfer DataBase Objects
Quick demo on moving tables and other object between MS-SQL environments:
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Annotations>
<Annotation>
File: Transfer.DataBase.Objects.biml
Demo on transferring database objects
between MSSQL environments
Enviroment:
DB: MS-SQL2012
BIML: 1.6 VS2010 BIDS Helper
(c) John Minkjan biml101.blogspot.com
</Annotation>
</Annotations>
<Connections>
<SqlServerManagementConnection
Name="SSMCLocalHostMSSQL2012"
SqlServerName=".\MSSQL2012"
UseWindowsAuthentication="true"></SqlServerManagementConnection>
</Connections>
<Packages>
<Package Name ="Transfer.DataBase.Objects" ConstraintMode="Linear">
<Tasks>
<TransferSqlServerObjects
Name ="TransferSqlServerObjects"
SourceConnectionName="SSMCLocalHostMSSQL2012"
SourceDatabase="AdventureWorks2012"
DestinationConnectionName="SSMCLocalHostMSSQL2012"
DestinationDatabase="AdventureWorksDW2012"
DropObjectsFirst="true"
CopyData="true"
CopyAllTables="false">
<Annotations>
<Annotation>
Notation for tables:
[schema].[TableName]
(including bracktes}
Target schema must exist!
</Annotation>
</Annotations>
<Tables>
<Table>[HumanResources].[Department]</Table>
<Table>[HumanResources].[Employee]</Table>
<Table>[HumanResources].[EmployeeDepartmentHistory]</Table>
</Tables>
</TransferSqlServerObjects>
</Tasks>
</Package>
</Packages>
</Biml>
The biml can als be downloaded here: Transfer.SqlServerObjects.biml
This will get you:
Till Next Time
This article is also published on www.bimlscript.com
Saturday, October 6, 2012
BIML BigData: Reading a RSS feed
Bigdata is HOT!
Based on the twitter feed reader I made here (Biml101: Reading Twitter Feed), I made a simple RSS feed reader using BIML.
The Feed URI term is passed trough the variable RSSURI. Be aware not to poll the RSS feeder every second, it probably will exclude your IP. For most feeds once every hour is enough! The package will only retrieve “new” feed items which are not jet in your table.
The Script:
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Annotations>
<Annotation>
File: RSS.Feed.Reader.biml
Demo on reading a twitter feed
Enviroment:
DB: MS-SQL2012
BIML: 1.6 VS2010 BIDS Helper
(c) John Minkjan biml101.blogspot.com
</Annotation>
<Annotation>
RSSFeedReader Table Create script:
CREATE TABLE [dbo].[RSSFeedReader](
[ChannelTitle] [nvarchar](255) NULL,
[ChannelDescription] [nvarchar](2048) NULL,
[ChannelLink] [nvarchar](255) NULL,
[ChannelLanguage] [nvarchar](255) NULL,
[ItemDescription] [nvarchar](2048) NULL,
[ItemGuid] [nvarchar](128) NOT NULL,
[ItemTitle] [nvarchar](255) NOT NULL,
[ItemPubDate] [nvarchar](50) NOT NULL,
[ItemLink] [nvarchar](255) NULL,
[RSSURI] [nvarchar](255) NULL,
CONSTRAINT [PK_RSSFeedReader] PRIMARY KEY CLUSTERED
(
[ItemGuid] ASC,
[ItemTitle] ASC,
[ItemPubDate] ASC
)
)
</Annotation>
</Annotations>
<Connections>
<OleDbConnection
Name="CnOleDBAdventureWorks2012"
ConnectionString="Data Source=.\MSSQL2012;Initial Catalog=AdventureWorks2012;Provider=SQLNCLI10.1;Integrated Security=SSPI;Connect Timeout=30;"
RetainSameConnection="true">
</OleDbConnection>
</Connections>
<ScriptProjects>
<ScriptComponentProject Name="SC_RSS_READER">
<AssemblyReferences>
<AssemblyReference AssemblyPath="Microsoft.SqlServer.DTSPipelineWrap" />
<AssemblyReference AssemblyPath="Microsoft.SqlServer.DTSRuntimeWrap" />
<AssemblyReference AssemblyPath="Microsoft.SqlServer.PipelineHost" />
<AssemblyReference AssemblyPath="Microsoft.SqlServer.TxScript" />
<AssemblyReference AssemblyPath="System.dll" />
<AssemblyReference AssemblyPath="System.AddIn.dll" />
<AssemblyReference AssemblyPath="System.Data.dll" />
<AssemblyReference AssemblyPath="System.Xml.dll" />
</AssemblyReferences>
<OutputBuffers>
<OutputBuffer Name="Output0" IsSynchronous ="false">
<Annotations>
<Annotation>
IsSynchronous="false" ==>
When there are no non-synchronus buffers,
the compiler does not emit the CreateNewOutputRows
virtual base method (to match the BIDS/SSDT behavior)
</Annotation>
</Annotations>
<Columns>
<Column Name="ChannelTitle" DataType="String" Length="255"></Column>
<Column Name="ChannelDescription" DataType="String" Length="2048"></Column>
<Column Name="ChannelLink" DataType="String" Length="255"></Column>
<Column Name="ChannelLanguage" DataType="String" Length="255"></Column>
<Column Name="ItemDescription" DataType="String" Length="2048"></Column>
<Column Name="ItemGuid" DataType="String" Length="128"></Column>
<Column Name="ItemTitle" DataType="String" Length="255"></Column>
<Column Name="ItemPubDate" DataType="String" Length="50"></Column>
<Column Name="ItemLink" DataType="String" Length="255"></Column>
<Column Name="RSSURI" DataType="String" Length="255"></Column>
</Columns>
</OutputBuffer>
</OutputBuffers>
<ReadOnlyVariables>
<Variable Namespace="User" VariableName="RSSURI" DataType="String"></Variable>
</ReadOnlyVariables>
<Files>
<File Path="AssemblyInfo.cs">
using System.Reflection;
using System.Runtime.CompilerServices;
[assembly: AssemblyTitle("SC_RSS_READER")]
[assembly: AssemblyDescription("")]
[assembly: AssemblyConfiguration("")]
[assembly: AssemblyCompany("")]
[assembly: AssemblyProduct("SC_RSS_READER")]
[assembly: AssemblyCopyright("Copyright @ 2012")]
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]
[assembly: AssemblyVersion("1.0.*")]
</File>
<File Path="main.cs">
<![CDATA[
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Xml;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
public override void CreateNewOutputRows()
{
XmlTextReader rssReader;
XmlDocument rssDoc;
XmlNode nodeRss;
XmlNode nodeChannel;
XmlNode nodeItem;
string RSSURI = this.Variables.RSSURI.ToString();
rssReader = new XmlTextReader(RSSURI);
rssDoc = new XmlDocument();
rssDoc.Load(rssReader);
for (int i = 0; i < rssDoc.ChildNodes.Count; i++)
{
if (rssDoc.ChildNodes[i].Name == "rss")
{
nodeRss = rssDoc.ChildNodes[i];
for (int j = 0; j < nodeRss.ChildNodes.Count; j++)
{
if (nodeRss.ChildNodes[j].Name == "channel")
{
nodeChannel = nodeRss.ChildNodes[j];
for (int k = 0; k < nodeChannel.ChildNodes.Count; k++)
{
if (nodeChannel.ChildNodes[k].Name == "item")
{
nodeItem = nodeChannel.ChildNodes[k];
Output0Buffer.AddRow();
Output0Buffer.ChannelTitle = nodeChannel["title"].InnerText;
Output0Buffer.ChannelDescription = nodeChannel["description"].InnerText;
Output0Buffer.ChannelLink = nodeChannel["link"].InnerText;
Output0Buffer.ChannelLanguage = nodeChannel["language"].InnerText;
Output0Buffer.ItemDescription = nodeItem["description"].InnerText;
Output0Buffer.ItemGuid = nodeItem["guid"].InnerText;
Output0Buffer.ItemTitle = nodeItem["title"].InnerText;
Output0Buffer.ItemPubDate = nodeItem["pubDate"].InnerText;
Output0Buffer.ItemLink = nodeItem["link"].InnerText;
Output0Buffer.RSSURI = RSSURI;
}
}
}
}
}
}
}
}
]]>
</File>
</Files>
</ScriptComponentProject>
</ScriptProjects>
<Packages>
<Package Name ="RSS.Feed.Reader" ConstraintMode="Linear">
<Variables>
<Variable Name="RSSURI" DataType="String">http://rss.cnn.com/rss/edition.rss</Variable>
</Variables>
<Tasks>
<Dataflow Name ="DFT Read RSS Feed" >
<Transformations>
<ScriptComponentSource Name="SC RSS Feed Reader">
<ScriptComponentProjectReference ScriptComponentProjectName="SC_RSS_READER"></ScriptComponentProjectReference>
</ScriptComponentSource>
<Lookup Name="LKP Existing ItemGuids"
NoMatchBehavior="RedirectRowsToNoMatchOutput"
OleDbConnectionName="CnOleDBAdventureWorks2012"
CacheMode="Partial">
<ExternalTableInput Table="[dbo].[RSSFeedReader]"></ExternalTableInput>
<Inputs>
<Column SourceColumn="ItemGuid" TargetColumn="ItemGuid"></Column>
<Column SourceColumn="ItemPubDate" TargetColumn="ItemPubDate"></Column>
<Column SourceColumn="ItemTitle" TargetColumn="ItemTitle"></Column>
</Inputs>
</Lookup>
<OleDbDestination Name="ODD dbo RSSFeedReader" ConnectionName="CnOleDBAdventureWorks2012">
<InputPath OutputPathName="LKP Existing ItemGuids.NoMatch"></InputPath>
<ExternalTableOutput Table="[dbo].[RSSFeedReader]"></ExternalTableOutput>
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
</Packages>
</Biml>
The Script can also be found here: RSS.Feed.Reader.biml
Will give you:
Till Next Time (also published on BIMLSCRIPT.com ==> http://bimlscript.com/Snippet/Details/59)
Friday, October 5, 2012
BIML BigData: Reading a twitter feed
Bigdata is HOT! I made a simple twitter feed reader using BIML. The search term is passed trough the variable SearchTerm. Be aware not to poll twitter every second, twitter will exclude your IP. For most search terms once every hour is enough! The package will only retrieve “new” tweets which are not jet in your table. There is a maximum of 100 tweets per run.
The script uses a ScriptComponentSource. Pay attention to the <IsSynchronous="false" ==> When there are no non-synchronus buffers, the compiler does not emit the CreateNewOutputRows virtual base method (to match the BIDS/SSDT behavior)
Kudos to Scott Curie of Varigence for helping me take this hurdle!
Script:
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Annotations>
<Annotation>
File: Twitter.Reader.biml
Demo on reading a twitter feed
Enviroment:
DB: MS-SQL2012
BIML: 1.6 VS2010 BIDS Helper
(c) John Minkjan biml101.blogspot.com
</Annotation>
<Annotation>
Twitter Log Table Script:
CREATE TABLE [dbo].[TwitterLog](
[TweetNumber] [bigint] NULL,
[TweetID] [nvarchar](128) NULL,
[PublishedDateTime] [nvarchar](50) NULL,
[UpdatedDateTime] [nvarchar](50) NULL,
[TweetContent] [nvarchar](2048) NULL,
[TweetTitle] [nvarchar](2048) NULL,
[TweetURI] [nvarchar](2048) NULL,
[TweeterName] [nvarchar](2048) NULL,
[TweetLanguage] [nvarchar](128) NULL,
[TweetSource] [nvarchar](128) NULL,
[TweetResultType] [nvarchar](128) NULL,
[TweetGeoInfo] [nvarchar](128) NULL,
[TweeterImageLink] [nvarchar](128) NULL,
[TweetLink] [nvarchar](128) NULL,
[SearchTerm] [nvarchar](128) NULL
)
</Annotation>
</Annotations>
<Connections>
<OleDbConnection
Name="CnOleDBAdventureWorks2012"
ConnectionString="Data Source=.\MSSQL2012;Initial Catalog=AdventureWorks2012;Provider=SQLNCLI10.1;Integrated Security=SSPI;Connect Timeout=30;"/>
</Connections>
<ScriptProjects>
<ScriptComponentProject Name ="SCS_Twitter_Feed">
<AssemblyReferences>
<AssemblyReference AssemblyPath="Microsoft.SqlServer.DTSPipelineWrap" />
<AssemblyReference AssemblyPath="Microsoft.SqlServer.DTSRuntimeWrap" />
<AssemblyReference AssemblyPath="Microsoft.SqlServer.PipelineHost" />
<AssemblyReference AssemblyPath="Microsoft.SqlServer.TxScript" />
<AssemblyReference AssemblyPath="System.dll" />
<AssemblyReference AssemblyPath="System.AddIn.dll" />
<AssemblyReference AssemblyPath="System.Data.dll" />
<AssemblyReference AssemblyPath="System.Xml.dll" />
</AssemblyReferences>
<OutputBuffers>
<OutputBuffer Name="Output0" IsSynchronous="false">
<Annotations>
<Annotation>
IsSynchronous="false" ==>
When there are no non-synchronus buffers,
the compiler does not emit the CreateNewOutputRows
virtual base method (to match the BIDS/SSDT behavior)
</Annotation>
</Annotations>
<Columns>
<Column Name="TweetNumber" DataType ="Int64"></Column>
<Column Name="TweetID" DataType="String" Length="128"></Column>
<Column Name="PublishedDateTime" DataType="String" Length="50"></Column>
<Column Name="UpdatedDateTime" DataType="String" Length="50"></Column>
<Column Name="TweetContent" DataType="String" Length="2048"></Column>
<Column Name="TweetTitle" DataType="String" Length="2048"></Column>
<Column Name="TweetURI" DataType="String" Length="2048"></Column>
<Column Name="TweeterName" DataType="String" Length="2048"></Column>
<Column Name="TweetLanguage" DataType="String" Length="128"></Column>
<Column Name="TweetSource" DataType="String" Length="128"></Column>
<Column Name="TweetResultType" DataType="String" Length="128"></Column>
<Column Name="TweetGeoInfo" DataType="String" Length="128"></Column>
<Column Name="TweeterImageLink" DataType="String" Length="128"></Column>
<Column Name="TweetLink" DataType="String" Length="128"></Column>
</Columns>
</OutputBuffer>
</OutputBuffers>
<ReadOnlyVariables>
<Variable Namespace="User" VariableName="LastTweetNumber" DataType="Int64"></Variable>
<Variable Namespace="User" VariableName="SearchTerm" DataType="String"></Variable>
</ReadOnlyVariables>
<Files>
<File Path="AssemblyInfo.cs">
using System.Reflection;
using System.Runtime.CompilerServices;
[assembly: AssemblyTitle("SCS_Twitter_Feed")]
[assembly: AssemblyDescription("")]
[assembly: AssemblyConfiguration("")]
[assembly: AssemblyCompany("")]
[assembly: AssemblyProduct("SCS_Twitter_Feed")]
[assembly: AssemblyCopyright("Copyright @ 2012")]
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]
[assembly: AssemblyVersion("1.0.*")]
</File>
<File Path="main.cs">
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Xml;
using System.Web;
using System.Net;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
public override void CreateNewOutputRows()
{
System.Xml.XmlDocument xml_doc = new System.Xml.XmlDocument();
string TwitterUri = "http://search.twitter.com/search.atom?q=%23"
+ this.Variables.SearchTerm
+ <![CDATA["&rpp=100&result_type=recent&since_id="]]>
+ this.Variables.LastTweetNumber ;
xml_doc = GetResponse(TwitterUri);
XmlNodeList child_nodes = xml_doc.GetElementsByTagName("entry");
string[] temp;
foreach (XmlNode child in child_nodes)
{
Output0Buffer.AddRow();
Output0Buffer.TweetID = child.ChildNodes.Item(0).InnerText;
temp = child.ChildNodes.Item(0).InnerText.Split(':');
Output0Buffer.TweetNumber = Convert.ToInt64(temp[2]);
Output0Buffer.PublishedDateTime= child.ChildNodes.Item(1).InnerText;
Output0Buffer.TweetLink = child.ChildNodes.Item(2).Attributes["href"].Value.ToString();
Output0Buffer.TweetTitle = child.ChildNodes.Item(3).InnerText;
Output0Buffer.TweetContent =child.ChildNodes.Item(4).InnerText;
Output0Buffer.UpdatedDateTime = child.ChildNodes.Item(5).InnerText;
Output0Buffer.TweeterImageLink = child.ChildNodes.Item(6).Attributes["href"].Value.ToString();
Output0Buffer.TweetGeoInfo = child.ChildNodes.Item(7).InnerText;
Output0Buffer.TweetResultType = child.ChildNodes.Item(8).ChildNodes.Item(0).InnerText;
Output0Buffer.TweetSource = child.ChildNodes.Item(9).InnerText;
Output0Buffer.TweetLanguage= child.ChildNodes.Item(10).InnerText;
Output0Buffer.TweeterName = child.ChildNodes.Item(11).ChildNodes.Item(0).InnerText;
Output0Buffer.TweetURI = child.ChildNodes.Item(11).ChildNodes.Item(1).InnerText;
}
}
public XmlDocument GetResponse(string uri)
{
XmlDocument doc = new XmlDocument();
WebRequest myRequest = WebRequest.Create(new Uri(uri));
IWebProxy proxy = myRequest.Proxy;
if (proxy != null)
{
proxy.GetProxy(myRequest.RequestUri);
}
doc.Load(myRequest.GetResponse().GetResponseStream());
return doc;
}
}
</File>
</Files>
</ScriptComponentProject>
</ScriptProjects>
<Packages>
<Package Name ="Twitter.Reader" ConstraintMode="Linear">
<Variables>
<Variable Name ="LastTweetNumber" DataType="Int64" >0</Variable>
<Variable Name="SearchTerm" DataType="String">SSIS</Variable>
</Variables>
<Tasks>
<ExecuteSQL Name ="EST Get Last TweetNumber" ConnectionName="CnOleDBAdventureWorks2012" ResultSet="SingleRow">
<DirectInput> SELECT isnull(max([TweetNumber]),0) FROM [dbo].[TwitterLog] where [SearchTerm] =? </DirectInput>
<Parameters>
<Parameter Name="0" DataType="String" Length="128" Direction ="Input" VariableName ="User.SearchTerm"></Parameter>
</Parameters>
<Results>
<Result Name="0" VariableName="User.LastTweetNumber"></Result>
</Results>
</ExecuteSQL>
<Dataflow Name ="DFT Get Tweets">
<Transformations>
<ScriptComponentSource Name="SCS Twitter Feed">
<ScriptComponentProjectReference ScriptComponentProjectName="SCS_Twitter_Feed"></ScriptComponentProjectReference>
</ScriptComponentSource>
<DerivedColumns Name="DC SearchTerm">
<Columns>
<Column Name="SearchTerm" DataType ="String" Length ="128">@[User::SearchTerm]</Column>
</Columns>
</DerivedColumns>
<OleDbDestination Name="ODD dbo TwitterLog" ConnectionName="CnOleDBAdventureWorks2012">
<ExternalTableOutput Table="[dbo].[TwitterLog]"></ExternalTableOutput>
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
</Packages>
</Biml>
Will get you:
Till Next Time