This past weekend I attended SQL Saturday in Portland, OR. While I was there, I attended Reza Rad’s session on Advanced Data Transformations with Power Query. During that session, Reza showed a cool trick to merge data based on two columns through the user interface… without concatenating the columns first.
The Issue
Assume for a second that we have data that looks like this:
There’s two tables, and we want to join the account name to the transaction. The problem is that the unique key to join these two tables (which isn’t super obvious here) is a combination of the Acct and Dept fields. (Elsewhere in the data the same account exists in multiple departments.
To get started, I created two connection only queries, one to each table.
- Select a cell in the left table (Transactions) –> create a new query –> From Table –> Close & Load To… Connection only
- Select a cell in the right table (COA) –> create a new query –> From Table –> Close & Load To… Connection only
My Original Approach
Now, with both of those created, I want to merge the data so I get the account name on each row of the Transactions table. So how…?
Originally I would have edited each query, selected the Acct and Dept columns, and merged the two columns together, probably separating them with a custom delimiter. (This can be done via the Merge command on the Transform or the Add Column tab.)
Essentially, by concatenating the columns, I end up with a single column that I can use to dictate the matches.
Reza’s presentation showed that this isn’t actually necessary, and I don’t need to merge those columns at all…
Merge Data Based on Two Columns
So here’s how we can get those records from the COA Table into the Transactions table:
- Right click the Transactions query in the Workbook Queries pane
- Choose Merge
- Select the COA query
The data now looks like this, asking for us to select the column(s) we wish to use for the merge:
So here’s the secret:
- Under Transactions, click the Acct column
- Hold down the CTRL key
- Click the Dept column
And Power Query indicates the order of the columns you selected. It will essentially use this as a temporary concatenated value!
So now do the same to the COA table:
And then complete the merge. As you can see, you get a new column of data in your query:
of course, we can expand NewColumn to get just the Name field, and everything is working perfectly!
End Thoughts
This is pretty cool, although not super discoverable. The really nice piece here is that it can save you the work of creating extra columns if you only need them to merge your data.
I should also mention that Reza showed this trick in Power BI Desktop, not Excel. But because it’s Power Query dealing with the data in both, it works in both. How cool is that?
The post Merge Data Based on Two Columns appeared first on The Ken Puls (Excelguru) Blog.