SSIS: Lesson 2
Import data from CSV into database
In the first lesson we exported data from a table in our SQL Server database to a CSV file. Today we are going to do just the opposite, we are going to bring data from a CSV file into our database.
I like to try to provide close to real world scenarios when I make my tutorials. Obviously in the simpler intro lessons, that can be difficult. That being said, for this lesson, imagine you manage a database for a school. Every week 9 students are selected to be on either the Red Team (Hall Safety Monitors) or the Blue Team (Lunch Room Aids). Every Friday, the Vice Principal Smith picks the student names at random and puts them into a CSV file that is emailed out to all the teachers. You point out that if you could just get the names into your database, you could put the information on the school’s main Intranet page, so the teachers don’t need to download and open the same CSV file 300 times a week.
So our task is to upload the new weekly CSV file into the database every Friday at the end of the school day.
Here is this week’s CSV file:
First, before heading into SSIS, we need to create a destination table in our database for the CSV file. Here is the file we want to import into our database, you’ll note it has three columns, two columns are strings and one is an integer:
So go into SSMS (SQL Server Management Studio) and run the following query to create the table:
CREATE TABLE STUDENT_TEAM ( STUDENT_NM varchar(50), STUDENT_AGE int, STUDENT_TEAM varchar(50)) GO
This code will create an empty table with 3 columns named STUDENT_NM, STUDENT_AGE, STUDENT_TEAM.
Now you can minimize SSMS and open up Visual Studios Data Tools
Once open, go into your Training Project and create a new SSIS Package. I’m naming this one Lesson_2
Now grab an Execute SQL Task from the SSIS Toolbox and drag it to the Design window.
Notice the red arrow at the bottom, there is already a Connection set up for the Sandbox database. This is because when we made this connection in Lesson_1, we made it by clicking on the Connection Managers in the Solution Explorer window.
This creates a (Project) connection – one that can be seen in every package. If you only want your connection to exist inside a single package (this becomes important as the number of packages you have grows, otherwise you’d have thousands of connections in Solution Explorer) – you can right click anywhere inside the Connection Managers box at bottom of the design window.
But this particular package will work fine just using the existing connection, we can move on without creating a new one.
Double Click on your Execute SQL Task icon in the design window.
In the window that pops up, you can rename your Task, provide a description of the task if you would like. This does come in handy when you have to go back to old package. It makes it easier to understand what is going on.
Down at the green arrow in the picture, you’ll have to selection a Connection from the drop down. Luckily were currently only have one to choose from.
Next go to SQL Statement and click on the little ‘…’ box that appears in the right corner.
In the pop up window, this is where you enter your SQL Query. In this case were going to delete all data from table in this box. The reason for this step is so that last week’s students won’t still be in the table on Monday morning. We only want the current students in our database.
Now click Okay on this window and click OKAY again to get back to the Design Window
Next drag over a Data Flow Task. Connect the Create Table task by dragging the arrow over from the Create Table task. Note the direction of the arrow, this is the order in which tasks will run.
Now open Data Flow Task. In the SSIS Toolbox, go to Other Sources and drag a Flat File Source over to the Design Window.
Double click on your new Flat File Source box and a new window will open. Click on New… next to Flat File Connection Manager
On the new window, click Browse
Browse to your CSV File. I created a folder called SSIS Training to make it easy to store all my files.
If you click on Columns in the left window, you can see what the CSV file contains.
Finally, before existing this window, Click back on General and make sure Column names in the first data row is checked.
Now go back to the Design window and drag an OLE DB Destination box onto the design window. Connect the blue arrow from the Flat File Source to the OLE DB Destination
Now double click on OLE DB Destination
Select STUDENT_TEAM from Name of table or the view
Now Click on Mappings
Everything should line up nicely since we named the table columns to match the CSV file.
Now click okay to go back to the Design Window, Click on Control Flow in the upper left to get back to the main page of the package
Right click the package and click Execute Package
If all goes well, you should end up with two green checks
If this happens, check to make sure your CSV file is not currently open, that can cause errors. If so, close the file and try executing the package again
Finally, check the table in your SQL Server to see if it populated
If you want to see how it would work the next week, go into the CSV file and change the data. When you run the package again, the names in the SQL table will change too.