WebsiteSpark

Tuesday, 31 January 2012

Listing FTP directory content using c# and .Net

Requirements:
Ø  Visual Studio 2010
Ø  A FTP site
What we are going to do here is to connect to a FTP site and list down the directory content in that FTP site. This uses FtpWebRequest class to connect and get response back from FTP site. The response stream is read using StreamReader class. The code and UI is as given below.





The UI consists of the following elements
Ø  Text box for URL, User ID & Password.
Ø  Multiline text box to display FTP response.
Ø  Command button to connect and get data from FTP location.
The code for command button click is given below, followed by the output UI.


using System;
using System.Linq;
using System.Windows.Forms;
using System.Net;
using System.IO;

namespace FTPWindow
{
    public partial class frmMain : Form
    {
        public frmMain()
        {
            InitializeComponent();
        }
       
        private void btnConnect_Click(object sender, EventArgs e)
        {
            FtpWebRequest ftpReq;
            string ftpURI = "ftp://" + txtServer.Text;
            ftpReq = (FtpWebRequest)FtpWebRequest.Create(ftpURI);
            ftpReq.Credentials = new NetworkCredential(txtUID.Text, txtPwd.Text);
            ftpReq.KeepAlive = true;
            ftpReq.UseBinary = true;
            ftpReq.Method = WebRequestMethods.Ftp.ListDirectory ;
           
            StreamReader reader =
                    new StreamReader(ftpReq.GetResponse().GetResponseStream());

            txtResp.Text = reader.ReadToEnd();
            //MessageBox.Show( (txtResp.Lines.Count() -1).ToString() + " item(s). " );
            reader.Close();
        }
    }
}






Enjoy!!!

Monday, 30 January 2012

Bulk insert using SQLBulkCopy, SQL Server and tab delimited text file

Here we are going to load data from a tab delimited text file and then insert this data into SQL Server table using the .Net SQLBulkCopy class.
Here also we will use one the Microsoft sample database tables and that will be AdventureWorksDW2008R2 database and the table we selected for this demo is DimCurrency. The create script for this table is taken from the sample database as reproduce here.


/****** Object:  Table [dbo].[DimCurrency]    Script Date: 01/30/2012 17:04:20 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[DimCurrency](
      [CurrencyKey] [int] IDENTITY(1,1) NOT NULL,
      [CurrencyAlternateKey] [nchar](3) NOT NULL,
      [CurrencyName] [nvarchar](50) NOT NULL,
 CONSTRAINT [PK_DimCurrency_CurrencyKey] PRIMARY KEY CLUSTERED
(
      [CurrencyKey] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY],
 CONSTRAINT [AK_DimCurrency_CurrencyAlternateKey] UNIQUE NONCLUSTERED
(
      [CurrencyAlternateKey] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO


You can create this table in any database. I created this table within my TestDB database.

Now the .Net part. What we have in .Net is a Windows forms application with two forms, one main form for selecting the text file and the other to display the data within a grid, design images as shown. The second image shows the running version of the application.








The main form contains the following components
Ø  An open file dialog, to select the text file.
Ø  3 text boxes
o    TxtFile holds the complete file name.
o   TxtPath holds the file folder structure from the root drive letter
o   TxtFileName holds the file name with extension
Ø  Two buttons
o   One for selecting the file
o   Other for inserting the data into SQL server.

To bulk insert data through this application, user needs to do the following
1)   Select the source file
a.    Application deciphers the file name and folder name for the user.
b.    Application opens the text file and fills the data into a data table using Jet OLEDB provider and OLEDB connection.
2)   Upload to SQL Server
a.    Using SQL Connection and .Net SQL Bulk Copy utility.
Note: There are still modifications required to make this application a full-fledged bulk copy application as there are lots of hardcoding within the application, like this will insert data only to the DimCurrency table (BCP’s DestinationTableName
 is hardcoded with DimCurrency), also this as of now only accepts tab delimited (hardcoded within OLEDB connection string). To make this work, you need to place schema.ini in the folder where the tab delimited text file resides. The content od schema.ini is given as follows.


[DimCurrency.csv]
Format=TabDelimited
ColNameHeader=False
CharacterSet=ANSI



Now the .Net code for frmMain

using System;
using System.Data;
using System.Windows.Forms;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Data.OleDb;


namespace BCPWindow
{
    public partial class frmMain : Form
    {
        public frmMain()
        {
            InitializeComponent();
        }

        private void btnSelect_Click(object sender, EventArgs e)
        {
            openFile.Reset();
            openFile.Multiselect = false;
            openFile.ShowDialog(this);
            openFile.Title = "Select";
           
            string tmp = openFile.FileName;
            txtFile.Text = tmp;

            txtFolderName.Text = tmp.Substring(0, tmp.LastIndexOf("\\"));
            txtFileName.Text = tmp.Substring(tmp.LastIndexOf("\\") + 1);
        }

        private void btnUpload_Click(object sender, EventArgs e)
        {
            SqlConnection sqlCn = new SqlConnection("server=(local); Initial catalog=TestDB; integrated security=SSPI");
            SqlBulkCopy bcp = new SqlBulkCopy(sqlCn);

            string oledbConStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
                                            txtFolderName.Text +
                                        "; Extended Properties='text;FMT=TabDelimited;HDR=No;'";
            string oledbSQL = string.Format("Select * from [{0}]", txtFileName.Text);
           
            OleDbConnection oledbCN = new OleDbConnection(oledbConStr);
            OleDbDataAdapter oledbDA = new OleDbDataAdapter(oledbSQL, oledbCN);
           
            DataTable dt = new DataTable();
            try
            {
                sqlCn.Open();
                bcp.BatchSize = 100;
                bcp.DestinationTableName = "dbo.DimCurrency";

                oledbCN.Open();
                oledbDA.Fill(dt);

                frmGrid fg = new frmGrid(dt);
                fg.Show();

                for (int i = 0; i < dt.Columns.Count; i++)
                    bcp.ColumnMappings.Add(new SqlBulkCopyColumnMapping(i, i));
               
                bcp.WriteToServer(dt);

                MessageBox.Show(string.Format(" {0} row(s) inserted successfully.", dt.Rows.Count));
            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.Message);
            }
            finally
            {
                if (sqlCn != null)
                    if (sqlCn.State != ConnectionState.Closed)
                        sqlCn.Close();

                if (oledbCN != null)
                    if (oledbCN.State != ConnectionState.Closed)
                        oledbCN.Close();
            }
        }

        private void frmMain_Load(object sender, EventArgs e)
        {

        }
    }
}


Similarly code for frmGrid

using System;
using System.Data;
using System.Windows.Forms;

namespace BCPWindow
{
    public partial class frmGrid : Form
    {
        public frmGrid()
        {
            InitializeComponent();
        }
        DataTable dt;
        public frmGrid( DataTable dt )
        {
            InitializeComponent();
            this.dt = dt;
        }

        private void frmGrid_Load(object sender, EventArgs e)
        {
            dg.DataSource = dt;
          
        }
    }
}


Hope this helps you.

Enjoy!!

Friday, 27 January 2012

SQL Server 2008 Bulk Insert and database recovery model

The Bulk insert is one of the bulk load TSQL commands, used to insert loads of data into a table form a verity of source. Source can be a text file, a CSV file it can also be a OPENROWSET , XML files etc. The other bulk load commands are BCP and Insert Into.

This post describes the bulk insert activity under various recovery model of database. One of the three recovery models can be set for a database based on recoverability NFR, the models are
1)   Simple recovery
Ø  Minimal logging and recovery is up to last full backup of database
2)   Full recovery and
Ø  Logs each and every activity, recovery is till the last full backup + last differential backup + last transaction log backup, uses maximum amount of transaction log space.
3)   Bulk-logged recovery model.
Ø  As similar to full recovery model, but minimally logs the bulk insert action. For details http://msdn.microsoft.com/en-us/library/ms190692.aspx
To experiment, what we need is data and database.
Let’s create a test database for our testing purpose and the best place to get data would be the Microsoft sample database AdventureWorks2008R2. When you download the installation package (available here http://sqlserversamples.codeplex.com/ ) and extract it, you can find the scripts for creating the schema, bulk insert scripts and the data files.

We will select the DimCustomers table for our experiment.
Let’s start by creating a new database BulkInsertTest and set the recovery to simple mode.

USE master;
GO


IF EXISTS (SELECT name FROM master.dbo.sysdatabases WHERE name = N'TestBulkInsert')
BEGIN
      ALTER DATABASE TestBulkInsert SET OFFLINE WITH ROLLBACK IMMEDIATE ;
      ALTER DATABASE TestBulkInsert SET ONLINE ;
      DROP DATABASE TestBulkInsert ;
END;
GO

CREATE DATABASE TestBulkInsert ;
GO

Alter database TestBulkInsert set recovery SIMPLE;
GO


Now let’s create the table for bulk insert

USE TestBulkInsert;
GO

-- from AdventureWorksDW2008R2

CREATE TABLE [dbo].[DimCustomer] (
    [CustomerKey] [int] IDENTITY(1, 1) NOT NULL,
    [GeographyKey] [int] NULL,
    [CustomerAlternateKey] [nvarchar] (15) NOT NULL,
    [Title] [nvarchar] (8) NULL,
    [FirstName] [nvarchar] (50) NULL,
    [MiddleName] [nvarchar] (50) NULL,
    [LastName] [nvarchar] (50) NULL,
    [NameStyle] [bit] NULL,
    [BirthDate] [date] NULL,
    [MaritalStatus] [nchar] (1) NULL,
    [Suffix] [nvarchar] (10) NULL,
    [Gender] [nvarchar] (1) NULL,
    [EmailAddress] [nvarchar] (50) NULL,
    [YearlyIncome] [money] NULL,
    [TotalChildren] [tinyint] NULL,
    [NumberChildrenAtHome] [tinyint] NULL,
    [EnglishEducation] [nvarchar] (40) NULL,
    [SpanishEducation] [nvarchar] (40) NULL,
    [FrenchEducation] [nvarchar] (40) NULL,
    [EnglishOccupation] [nvarchar] (100) NULL,
    [SpanishOccupation] [nvarchar] (100) NULL,
    [FrenchOccupation] [nvarchar] (100) NULL,
    [HouseOwnerFlag] [nchar] (1) NULL,
    [NumberCarsOwned] [tinyint] NULL,
    [AddressLine1] [nvarchar] (120) NULL,
    [AddressLine2] [nvarchar] (120) NULL,
    [Phone] [nvarchar] (20) NULL,
    [DateFirstPurchase] [date] NULL,
    [CommuteDistance] [nvarchar] (15) NULL
) ON [PRIMARY];
GO

ALTER TABLE [dbo].[DimCustomer] WITH CHECK ADD
    CONSTRAINT [PK_DimCustomer_CustomerKey] PRIMARY KEY CLUSTERED
    (
        [CustomerKey]
    )  ON [PRIMARY];
GO
   
ALTER TABLE [dbo].[DimCustomer] ADD
    CONSTRAINT [IX_DimCustomer_CustomerAlternateKey] UNIQUE NONCLUSTERED
    (
        [CustomerAlternateKey]
    )  ON [PRIMARY];
GO

CREATE INDEX [IX_DimCustomer_GeographyKey] ON [dbo].[DimCustomer]([GeographyKey]) ON [PRIMARY];

GO


And now let’s view our log info

DBCC LogInfo
GO



Now we are ready for Bulk insert

BULK INSERT DimCustomer FROM N'D:\<Path>\AdventureWorks2008R2\AdventureWorks 2008R2 Data Warehouse\DimCustomer.csv'
WITH (
   CODEPAGE='ACP',
   DATAFILETYPE = 'widechar',
   FIELDTERMINATOR= '\t',
   ROWTERMINATOR = '\n' ,
   KEEPIDENTITY
)


(18484 row(s) affected)



Now again looking into our log file

DBCC LogInfo
GO

/*------------------------
DBCC LogInfo
GO
------------------------*/

(67 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.




We can see our log file getting increased to 22 MB from 2 MB initial size. All the entries having status as 2 are active. Issue a CHECKPOINT, then we can see these getting written to respective data file and respective log VLFs getting invalidated to status 0.
Now execute the bulk insert with “TABLOCK” Hint

BULK INSERT DimCustomer FROM N'D:\<Path>\AdventureWorks2008R2\AdventureWorks 2008R2 Data Warehouse\DimCustomer.csv'
WITH (
   CODEPAGE='ACP',
   DATAFILETYPE = 'widechar',
   FIELDTERMINATOR= '\t',
   ROWTERMINATOR = '\n' ,
   KEEPIDENTITY,
   TABLOCK
)

 
Now if we look at log file, we can see that the amount of VLFs generated is just 6 instead of 67.

DBCC LogInfo
GO


Now let’s drop the database, recreate with Full recovery, then again bulk insert and then see the effect on Log file.

USE master;
GO


IF EXISTS (SELECT name FROM master.dbo.sysdatabases WHERE name = N'TestBulkInsert')
BEGIN
      ALTER DATABASE TestBulkInsert SET OFFLINE WITH ROLLBACK IMMEDIATE ;
      ALTER DATABASE TestBulkInsert SET ONLINE ;
      DROP DATABASE TestBulkInsert ;
END;
GO

CREATE DATABASE TestBulkInsert ;
GO

Alter database TestBulkInsert set recovery SIMPLE;
GO

USE TestBulkInsert;
GO
--Rest of the script remains same


Again database has logged everything to log file.
Now let’s try the same thing with “TABLOCK” hint in BULK insert command

BULK INSERT DimCustomer FROM N'D:\<Path>\AdventureWorks2008R2\AdventureWorks 2008R2 Data Warehouse\DimCustomer.csv'
WITH (
   CODEPAGE='ACP',
   DATAFILETYPE = 'widechar',
   FIELDTERMINATOR= '\t',
   ROWTERMINATOR = '\n' ,
   KEEPIDENTITY,
   TABLOCK
)

Now the VLFs has reduced to 39
Now repeat the same with Bulk logged recovery model. Result is as similar to Simple recovery model. And here too the “TABLOCK” hint effects the manner in which logging is done by SQL server.

So in summary, more than selecting a recovery model, hints are of much more importance for the particular operation for better performance. 
Enjoy!