재우니의 블로그


Schema.ini 로 CSV 파일을 DB 에 저장하기 - ASP.NET


Introduction
In real time web applications, there come several situations to import data from an external data sources such as Excel file, CSV file, text file etc. Asp.Net Framework provides a simple solution to import the data from the external sources to the database with the Microsoft.Jet.OLEDB provider. But difficulties arise when the external data file contains, many columns with different data types. Basically, the provider cannot differentiate data types between the columns or the rows, blindly it will consider them as a data type based on first few rows and leave all the data which does not match the data type. To overcome this problem, we use schema.ini file to define the data type of the CSV or text file and allow the provider to read that and recognize the exact data types of each column. And this article mainly concentrates on how to import a CSV file with different data types into database, using Microsoft Jet Database Engine and also with Schema.ini.


What is Schema.ini file? 

Schema.ini is a information file, used to define the data structure and format of each column that contains data in the CSV file. If schema.ini file exists in the directory, Microsoft.Jet.OLEDB provider automatically reads it and recognizes the data type information of each column in the CSV file. Thus, the provider intelligently avoids the misinterpretation of data types before inserting the data into the database.

Points to remember before creating Schema.ini

1. The schema information file, must always named as 'schema.ini'. 
2. The schema.ini file must be kept in the same directory where the CSV file exists. 
3. The schema.ini file must be created before reading the CSV file. 
4. The first line of the schema.ini, must the name of the CSV file, followed by the properties of the CSV file, and then the properties of the each column in the CSV file. 

For demonstration, consider a CSV file with 5 columns such as Date, First Name, Last Name, Age and Salary. The data structure of these columns are given below

Column Name Data Type Width Format
Date DateTime   dd-MMM-yyyy
First Name Text 100  
Last Name Text 100  
Age Long    
Salary Double    

Let us consider that we need to upload the CSV file in two different scenarios, with same CSV file name and with different CSV file name whenever we upload.

Import a CSV file every time with the SAME CSV file name. 

If you want to import a CSV file every time with same file name, just do the following steps. 

1. Open a Notepad file. 

2. copy and paste the below content in the notepad and replace 'YourCSVFileName.csv' with your CSV file name.

[YourCSVFileName.csv] 
ColNameHeader=True 
Format=CSVDelimited 
DateTimeFormat=dd-MMM-yyyy 
Col1=A DateTime 
Col2=B Text Width 100 
Col3=C Text Width 100 
Col4=D Long 
Col5=E Double

3. Save the Notepad file as 'schema.ini' in the same location of your CSV file. 

4. In your web page, place a FileUpload control, an Upload button and a GridView control. In the Upload button's click event write the following code.

if (filUpload.HasFile) 

  FileInfo fileinfo = new FileInfo(filUpload.PostedFile.FileName); 

  string strCsvFilePath = Server.MapPath("MyCSVFolder") + "\\" + fileinfo.Name; 

  //Save the CSV file in the Server inside 'MyCSVFolder' 
  filUpload.SaveAs(strCsvFilePath); 

  //Fetch the location of CSV file 
  string strFilePath = Server.MapPath("MyCSVFolder") + "\\"; 

  string strSql = "SELECT * FROM [" + fileinfo.Name + "]"; 

  string strCSVConnString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strFilePath + ";" + "Extended Properties='text;HDR=YES;'"; 

  // load the data from CSV to DataTable 

  OleDbDataAdapter oleda = new OleDbDataAdapter(strSql, strCSVConnString); 
  DataTable dtbCSV = new DataTable(); 
  oleda.Fill(dtbCSV); 

  //Display data in a GridView control 
  GridView1.DataSource = dtbCSV; 
  GridView1.DataBind(); 
}

5. Now you can insert the data in the DataTable into the database in your own way.

Import a CSV file every time with DIFFERENT CSV file name or
Dynamically creating Schema.ini File

If you want to upload a CSV file with different CSV file name every time, then you have to create the schema.ini file dynamically using the FileStream and StreamWriter object before you import the CSV file. To know how, follow the simple steps below.

In your web page, place a FileUpload control, an Upload button and a GridView control. In the Upload button's click event write the following code.

if (filUpload.HasFile) 

  FileInfo fileinfo = new FileInfo(filUpload.PostedFile.FileName); 

  string strCsvFilePath = Server.MapPath("MyCSVFolder") + "\\" + fileinfo.Name; 

  filUpload.SaveAs(strCsvFilePath); 

  string strFilePath = Server.MapPath("MyCSVFolder"); 

  using (FileStream filestr = new FileStream(strFilePath + "\\schema.ini", 
      FileMode.Create, FileAccess.Write)) 
  { 
      using (StreamWriter writer=new StreamWriter(filestr)) 
      { 
          writer.WriteLine("[" + fileinfo.Name + "]"); 
          writer.WriteLine("ColNameHeader=True"); 
          writer.WriteLine("Format=CSVDelimited"); 
          writer.WriteLine("DateTimeFormat=dd-MMM-yy"); 
          writer.WriteLine("Col1=A DateTime"); 
          writer.WriteLine("Col2=B Text Width 100"); 
          writer.WriteLine("Col3=C Text Width 100"); 
          writer.WriteLine("Col4=E Long"); 
          writer.WriteLine("Col5=F Double"); 
          writer.Close(); 
          writer.Dispose(); 
      } 
      filestr.Close(); 
      filestr.Dispose(); 
  } 


  string strSql = "SELECT * FROM [" + fileinfo.Name + "]"; 
  string strCSVConnString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strFilePath + ";" 
    + "Extended Properties='text;HDR=YES;'"; 

  OleDbDataAdapter oleda = new OleDbDataAdapter(strSql, strCSVConnString); 
  DataTable dtbBankStmt = new DataTable(); 
  oleda.Fill(dtbBankStmt); 

  GridView1.DataSource = dtbBankStmt; 
  GridView1.DataBind(); 

}


he above code uses FileInfo object to get the file name of the CSV file to write in the schema.ini file. Using the FileUpload control, we save the CSV file in the server’s directory “MyCSVFolder”. The schema.ini file has also to be created in this same directory. By using the FileStream object, we create the schema.ini file, and then by using StreamWriter object, we write the content of the schema.ini file. Rest everything is simple to read the CSV file, with the Microsoft.Jet.OLEDB provider, load the content into a DataTable, then data manipulation can be done. For this article purpose, we displayed the data in a GridView control.


Performance tips to import data is, pass the CSV file’s work sheet name between square brackets (“[]”) in the SELECT statement. This will avoid the problem if the CSV file’s work sheet name has two or more words separated by spaces.

Understanding the Schema.ini file line by line

Line 1: Name of the CSV file enclosed between square brackets (‘[‘ and ‘]’ ). 
Line 2: Specify the CSV file contains column header in first row. If column header exists, then specify ColNameHeader as True, otherwise False. 
Line 3: Specify the CSV file delimited format. It can be CSVDelimited, TabDelimited, Delimited(), FixedLength. 
Line 4: If your CSV file contains any date column, then specify the format of the Date value. Furthermore, if the CSV file contains any other columns that has to be formatted properly before data manipulation, then it can be specified here. 
Line 5 and above: Specify each column’s Name, Data type, Width if applicable. The general syntax is

Col(n)=<column name> <data type> <Width width>
 

Where n is the position of the column in the CSV file, and Width is mandatory only for Text. 

The main concept of this article is to provide the knowledge of reading, importing or exporting a CSV file with the help of schema.ini file. The scope the schema.ini file can be extendable to any limit. 


To learn more about Schema.ini, please refer 

http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx

원문사이트 : http://www.aspdotnetcodes.com/Importing_CSV_Database_Schema.ini.aspx