For this tutorial, you will need to have Power Query installed. If you are running Office 2016, Power Query should already be available. For Excel 2010 and 2013, here is a link to the download: Power Query
Here is a link to the practice file for this Lesson: Data Cleaning Power Query
If you open the Excel file, you will see 3 sheets with 3 tables (Work Order, Vehicles, Prices). We want to populate a single sheet combining the three tables into one data set.
Open a new Excel workbook (a new, blank file; do not work from inside the practice file). From the Ribbon bar select Power Query > From File > From Excel
Select the practice file: dataCleaningPQuery.xlsx and when the Navigator pops up, select Work Orders
Select Load. Once the work order loads, repeat the process, this time loading Prices.
**Note: if the Query Editor window pops up, just click Close & Load. We will be working in the Query Editor window later.
Now go to the sheet with the Work Order table on it and select Power Query > Merge
A) The top drop down box should already be populated. If not, select Work Orders for top and then select Prices for the bottom.
B) Now highlight the Service Columns in both tables.
C) Leave the Join Kind at Left Outer.
A Left Outer Join displays the left table in full, and the right table adds data only on rows where its selected column matches the selected column in the left table.
Note the text at the bottom of the Merge window: “The selection has matched 29 out of the first 29 rows.” This is because the two columns we selected contain matching text. It is through this matching that we are able to line up the two tables into one.
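The Left Outer Join described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not what Power Query runs internally. The sample rows, the left_outer_join helper and the Detailing service are made up for the example; the real practice file has 29 rows that all match.

```python
# Hypothetical sample rows (not taken from the practice file).
work_orders = [
    {"WO": 1, "Service": "Oil Change"},
    {"WO": 2, "Service": "Brakes"},
    {"WO": 3, "Service": "Detailing"},  # deliberately has no price row
]
prices = [
    {"Service": "Oil Change", "Price": 40},
    {"Service": "Brakes", "Price": 150},
]


def left_outer_join(left, right, left_key, right_key):
    """Keep every left row; attach right-hand columns where the keys match."""
    # Group the right-hand rows by their key so each lookup is cheap.
    lookup = {}
    for row in right:
        lookup.setdefault(row[right_key], []).append(row)

    # Right-hand columns to bring across (the key itself is not duplicated).
    right_columns = [c for c in (right[0] if right else {}) if c != right_key]

    joined = []
    for row in left:
        matches = lookup.get(row[left_key])
        if matches:
            for match in matches:
                joined.append({**row, **{c: match[c] for c in right_columns}})
        else:
            # No match: the left row is still kept, with empty right-hand values.
            joined.append({**row, **{c: None for c in right_columns}})
    return joined


merged = left_outer_join(work_orders, prices, "Service", "Service")
matched = sum(1 for row in merged if row["Price"] is not None)
print(f"The selection has matched {matched} out of {len(work_orders)} rows.")
```

The unmatched work order stays in the result with an empty Price, which is the behaviour that separates a Left Outer join from an Inner join.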
Click OK and the Query Editor window will pop up.
Select the dual arrow icon in the NewColumn header and deselect Service from the list.
Right click the NewColumn header and rename it Price. Then click on the header, hold down the mouse button, and drag the column so that Price sits between Service and Mech.
Hit Close & Load and your new merged table should look like this:
Ok, now to load the Vehicle sheet.
Power Query > From File > From Excel. Select the Excel practice file and select Vehicles. DO NOT HIT LOAD this time. Instead, select Edit.
If you look at the original data set, the Lic Plate column in the Work Order sheet and the Licence Plate column in Vehicles are the two columns we need to match up. Unfortunately, they currently do not match. In the second table, the licence plates are preceded by the letters Lic. We need to remove this.
Right click on the Licence Plate column header > Replace Values
Type Lic followed by a space into Value to Find and leave Replace With blank. **Note: the Value to Find is “Lic ” with a space after it. Make sure you add the space, or the plates will be left with a leading space and will not match.
Click OK and the Lic prefix will be removed from the Licence Plate column. Click Close & Load
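To see why the trailing space matters, here is a small Python sketch of the same Replace Values step. The plate numbers and the replace_values helper are hypothetical and only meant to show the effect.

```python
# Hypothetical Vehicles rows with the "Lic " prefix still attached.
vehicles = [
    {"Licence Plate": "Lic ABC123", "Make": "Ford", "Model": "Focus"},
    {"Licence Plate": "Lic XYZ789", "Make": "Toyota", "Model": "Corolla"},
]


def replace_values(rows, column, value_to_find, replace_with=""):
    """Return new rows with value_to_find replaced in one text column."""
    return [
        {**row, column: row[column].replace(value_to_find, replace_with)}
        for row in rows
    ]


# With the trailing space, the plate is left clean.
cleaned = replace_values(vehicles, "Licence Plate", "Lic ")

# Without the space, a leading space is left behind and the merge would not match.
without_space = replace_values(vehicles, "Licence Plate", "Lic")

print(repr(cleaned[0]["Licence Plate"]))
print(repr(without_space[0]["Licence Plate"]))
```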
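Now it is time to merge. Go back to the sheet holding the merged table from the first merge and click Power Query > Merge
Set the top table to the query created by the first merge (the result of merging Work Orders and Prices), not the original Work Orders query, so that the Price column carries through. Set the bottom to Vehicles. Highlight the Lic Plate and Licence Plate columns. Set Join Kind to Left Outer and click OK.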
Click the double arrows on the new column and un-check column 1.
Rename the new columns Make and Model. Highlight both columns and move them in between Lic Plate and Service. Click Close & Load
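To end up with all three tables in one, the second merge has to start from the output of the first merge. The Python sketch below walks through the whole chain under that assumption. All rows, the mechanic names and the left_join helper are hypothetical, and the Vehicles plates are assumed to be already cleaned.

```python
def left_join(left, right, left_key, right_key, keep):
    """Left outer join that brings across only the listed right-hand columns."""
    # Assumes each key appears at most once in the right-hand table.
    lookup = {row[right_key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[left_key], {})
        joined.append({**row, **{col: match.get(col) for col in keep}})
    return joined


# Hypothetical sample rows (not taken from the practice file).
work_orders = [
    {"Lic Plate": "ABC123", "Service": "Oil Change", "Mech": "Bob"},
    {"Lic Plate": "XYZ789", "Service": "Brakes", "Mech": "Ann"},
]
prices = [
    {"Service": "Oil Change", "Price": 40},
    {"Service": "Brakes", "Price": 150},
]
vehicles = [
    {"Licence Plate": "ABC123", "Make": "Ford", "Model": "Focus"},
    {"Licence Plate": "XYZ789", "Make": "Toyota", "Model": "Corolla"},
]

# First merge: Work Orders + Prices, matched on Service.
merge1 = left_join(work_orders, prices, "Service", "Service", keep=["Price"])

# Second merge: start from merge1 (not work_orders) so Price is carried along.
final = left_join(merge1, vehicles, "Lic Plate", "Licence Plate", keep=["Make", "Model"])

# Reorder the columns the same way the tutorial drags them into place.
column_order = ["Lic Plate", "Make", "Model", "Service", "Price", "Mech"]
final = [{col: row[col] for col in column_order} for row in final]

for row in final:
    print(row)
```

Starting the second join from work_orders instead of merge1 would produce a table with Make and Model but no Price column.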
Congratulations. You have now successfully merged 3 tables into 1 using Power Query. Now you can go forth and analyze the data.
3 thoughts on “Power Query: Merge Data Sets”
Thanks for step-by-step guide.
Just a note: for the second merge instead of “Work Orders” as the first table the “Merge 1” [i.e., the result of merging “Work Orders” and “Prices”] has to be selected.
Thanks for the great blog and videos on YouTube.
Good catch. Thanks. I am going to have to go back and fix this. Perils of not having an editor to double check my work.
Awesome stuff! thanks for the guide. I’m an aspiring data analyst (maybe one day scientist), so I have much to learn.
I just came across this site and it looks like it has a lot of stuff I can use. I’m currently learning python, SQL, Excel, etc, so this stuff is awesome for me.