Quantcast
Channel: PowerBI Archives - The Excelguru Blog
Viewing all 178 articles
Browse latest View live

What the #? Literally…

$
0
0

I was chatting with my friend Scott of www.tinylizard.com today, discussing the structure of the M language, and how it’s kind of weird compared to VBA, C# and such. Well, “what the #?”, I figure.  I’m going to do a blog post on what the # sign actually signifies today, as it’s kind of confusing when you’re just starting with M.

A Little M Code:

I knocked up a bit of M code today to illustrate the challenge for someone just getting into this language.  Here’s a snap of the beginning of each line:

SNAGHTML1cfb30

Okay, so every line starts with something followed the = character… but…

What the #?

The question that I’m looking at specifically is… why do some lines start with # and some don’t?  (The second lines starts Source = and the fourth start ReplacedValue = but all others start with the # character.)

Literally…

The secret is all in the step name.  Check out the recorded steps for this query:

image

Notice anything about the two highlighted entries compared to the others?  No spaces!

When Power Query comes up against a space in the code it expects that there is another command or parameter coming next.  To deal with this, we need to tell Power Query that “Hey, I want you to literally use this…”  I’ve always heard this referred to as “escaping literals”, and M isn’t the only language that this happens in.

But it’s not sufficient to just use the # sign either.  It needs to be the # sign followed by two sets of quotes… kind of like how we refer to text in formulas in Excel.  So basically what you get is:

#”Filtered Rows” =

I Don’t Like It!

Yeah, me either.  I think it makes the code look horribly ugly.  It’s easy to avoid if you want to though, although it takes a little work.  Just right click the step, choose Rename, and remove the space:

image

At this point your M code will be updated to remove the literals, and you’ll be back to cleaner code.

You could also edit the M code manually to remove the # and the leading and trailing quotes.  If you do this, however, just make sure that you get all instances of them, as there will always be two in your recorded code, and possibly more if you’ve been manually tweaking it:

SNAGHTML744002

While I’m not a big fun of ugly code, I’ve also got to accept that I’m one of a very small percentage of users who will actually read it.  Most people will use the UI for this kind of stuff, so making it read easier there is probably the right design decision.

At any rate, now you know what the # character is all about.

The post What the #? Literally… appeared first on The Ken Puls (Excelguru) Blog.


Dynamic Database Queries

$
0
0

Some time ago I posted a technique to pass parameters to Power Query via an Excel table.  In a comment on that post, Anand asked “Can I pass the parameter to a SQL Table as well?”  The question could mean one of two things, but the answer to both is “yes”.

Dynamic Database Queries – Native Queries

The first thing it could mean is where you connect to a SQL database and feed it a native query like this:

SELECT  [ClubActivityTypeCode], [ClubMemberCategoryCode]
FROM  [tblClubFeeBillingSchedules]
WHERE  [ClubActivityTypeCode]='Golf'

And where we’d like to make the “Golf” part in the WHERE clause dynamic.  Power Query gives us code for this step as follows:

Source = Sql.Database("servername", "datbasename", [Query="SELECT  [ClubActivityTypeCode], [ClubMemberCategoryCode]#(lf)FROM  [tblClubFeeBillingSchedules]#(lf)WHERE  [ClubActivityTypeCode]='Golf'"])

Pretty ugly again, but we can absolutely have it compile if we use our parameter function.  Without using the parameter function at all, it might look like this:

typecode=fnGetParameter(“ActivityType”),

Source = Sql.Database("servername", "datbasename", [Query="SELECT  [ClubActivityTypeCode], [ClubMemberCategoryCode]#(lf)FROM  [tblClubFeeBillingSchedules]#(lf)WHERE  [ClubActivityTypeCode]='” & typecode & “'"])

Notice that we’ve just inserted a variable to hold the value we pulled via the fnGetParameter function, and then replaced the existing Golf term with the variable between quotes, joining it with the & characters on both sides.

One caveat here: In the case of the query above, we’re passing a custom Select query, which gets flagged as a Native Database Query by Power Query.  This has security permission implications, and Power Query won’t even attempt to fold more steps into this query.

Dynamic Database Queries – Folded Queries

And here’s the other option… with a caveat up front: I’m probably going to butcher the explanation of Query Folding.  My intent isn’t to get this perfectly factually correct, but rather to give you a general sense of what’s going on.  (In my impression, us Excel folks don’t actually NEED to understand the inner working of this piece to make use of it.)

So here goes… In contrast to the Native Database Query, we can avoid security prompts and let Power Query take advantage of query folding at the database level (pushing the process back to the database to complete.)  In layman’s terms this means we connect to the database, we pull in the data, then we do some operations.  Anything Power Query can get the database to do for us, it will “fold” into a single query.  And it will continue to fold more steps in until you try to do something that the database can’t do.  For that and all subsequent steps, Power Query’s engine takes over.

Dealing with these kind of queries works in exactly the same fashion as what you saw in the original parameter table article.  We connect to the database, then perform subsequent steps in Power Query, like filtering out records.  And we use variables and get fnGetParameter function to feed those variables in to the subsequent query steps.

An example of this (using a different database) is as follows, where the original code is:

Source = Access.Database(File.Contents("D:\Excelguru Courses\_Example Files\pos.accdb")),
_tblChits = Source{[Schema="",Item="tblChits"]}[Data],
FilteredRows = Table.SelectRows(_tblChits, each ([POSCategoryCode] ="Beer"))

And the modified example where we’ve made it dynamic:

category = fnGetParameter("POSCatCode"),
Source = Access.Database(File.Contents("D:\Excelguru Courses\_Example Files\pos.accdb")),
_tblChits = Source{[Schema="",Item="tblChits"]}[Data],
FilteredRows = Table.SelectRows(_tblChits, each ([POSCategoryCode] =category))

In this case we are still making dynamic database queries, we’re just feeding Power Query the info in a way that’s a bit easier for us to construct/review, and Power Query will encourage the database to fold it up and execute a query that is equivalent to SELECT * FROM tblChits WHERE POSCategoryCode = “Beer”.

Which Route Should You Go?

Unless you’re a SQL Ninja, use the latter route.  We know that Power Query won’t fold into the most efficient query in all cases.  On the other hand, providing a native database query as your first step immediately shuts down query folding so that any subsequent steps can’t be folded.  So unless you can write wickedly efficient SQL, you could actually hurt your performance.

Connecting to the database, exploring your tables and using Power Query’s filtering is easier for most Excel pros anyway.  Easier to build, and easier to audit.  So why not go down the path of least resistance?

The post Dynamic Database Queries appeared first on The Ken Puls (Excelguru) Blog.

Combine Multiple Excel Workbooks in Power Query

$
0
0

I got a comment on a previous post today, which made me realize I’d promised this but never posted it.  So let’s look at how to combine multiple workbooks together in Power Query, providing they have the same format.  Yes, it’s been covered before, (even linked to in the comments of the previous posts,) but I’m going to put my own flavour on it.  By the time we’re done, you’ll see how similar it is to working with non-Excel files.

Background

For our example we’re going to assume that we have four (or more) Excel files which you can download here.  I’ve stored in a folder called “Data”, which is a subfolder of the “Combine Workbooks” folder (more on why I store them in a subfolder a little later.)

Each file has a similar structure, which looks like this:

SNAGHTML1984743a

Notice that no one has bothered to set up a table style or anything, they are just raw worksheets of data.  Having said that, they are consistent in the fact that

  • The data starts in row 5
  • Each files is set up across same number of columns
  • The column headers and data types are consistent across files

(Just as a quick note, if they DID have tables set up, that would be okay too.  I’m just demoing that it isn’t necessary.)

The End Goal

The end goal we’re after is fairly common.  We basically want to grab the data from each file, strip out the first 4 rows, and append the tables together (without repeating header info.)  This will essentially create a nice data source that we can use in PivotTables, charts and other tools.

The Process

My preferred tool to combine multiple workbooks into one data source – where it used to be VBA – is most definitely Power Query today.  And here’s the steps we need to put together to make it work.

  1. Import a single workbook
  2. Convert it to a function
  3. Import all file contents (using our function)
  4. Combine all the data

Step 2 does involve a VERY minor manipulation of M code today, but as you’ll see it’s quite easy.

Let’s Combine Multiple Excel Workbooks

Step 1: Import a single workbook

To begin we’ll go to the Power Query menu and choose:

  • From File –> From Excel –> Sales-July2014.xlsx
  • Select Sheet1 and choose to Edit it

You’ll now see your query showing in the Query Editor:

SNAGHTML199ca279

We’ll need to do a bit of cleanup here to get the data just the way we need it:

  • Home –> Remove Rows –> Remove Top Rows –> 4 –> OK
  • Transform –> Use First Row As Headers
  • Select InventoryID and SalesPersonID –> Transform –> Data Type –> Whole Number
  • Select Cost, Price and Commission –> Transform –> Data Type –> Decimal Number
  • Select Date –> Transform –> Data Type –> Date
  • Select Date –> Home –> Remove Errors

That last one might be a bit odd, but I like to do that to my date columns.  The reason is that this protects me when I stack another table and it has headers.  I know that converting text to a date format will throw an error, so I’m guaranteed that any subsequent header rows will be nuked out.

At this point we’ve got a nice tidy import that would look pretty good if we decided to land it in a workbook.

Next up…

Step 2: Convert it to a function

Converting our nice query to a function is actually SUPER easy.  To begin, while still in the Power Query editor, we need to go to the View tab and click Advanced Editor.

When we do, we’ll see code similar to this, with the highlighted portion being the most important part:

SNAGHTML19b495d2

Okay, now follow carefully:  Right before the let statement at the very beginning, type:

(filepath)=>

The ()=> indicate to Power Query that this is a function, not a regular query.  And the “filepath” is the name of a parameter that we want to pass to the function.

The second part is that we want to replace the entire hard coded file path highlighted in yellow in the image above – including the quotes – with the name of our variable.  When we do, the lead three lines should look like this:

image

That’s all the code editing you need to do.  Let’s finalize this.  Click Done.  At which point you’ll see this:

image

No too inspiring or exciting really, but it IS what you want.  Final thing to do here is give the function a better name than Sheet1.  I’m going to use “fnGetContents”.  Choose your name, replace Sheet1 in the name box, then click File –> Close and Load.

If the query shows up in your Queries Pane saying “Load is Disabled”, don’t freak out.  That’s just a really poorly worded message that is indicating it is only a connection.  It will still work.  :)

You’ll also notice that, even though we imported this file, NOTHING has landed in any worksheet.  Huh?

Oddly enough, that’s as designed…

Step 3: Import all file contents (using our function)

All right, now it’s show time.  Let’s make some magic happen.  Let’s go get all of the files we need:

  • Go to the Power Query tab –> From File –> From Folder
  • Browse and select the folder that holds the data files
  • Click OK

You’ll now end up in the Query Editor and be staring at this:

SNAGHTML19d697a1

Now, normally we’d be tempted to click the double arrows beside the Content header to combine them all… but that doesn’t work here.  (It just returns the first file when we’re working with Excel files.)  So this is where we need to resort to our function.

  • Go to Add Column –> Add Custom Column
  • Enter the following:  fnGetContents([Folder Path]&[Name])
  • Click Enter
  • Right click the new “Custom” column –> Remove Other Columns

Almost done!

Step 4: Combine all the data

You should now be seeing:

image

Click the little double headed arrow, turn off the “Use original column name as prefix”, and click OK.

You should now have a nice table of data.  Give it a name, click Close and Load and you’re done!

Final Thoughts

Working with Excel files is pretty easy once you realize how easy creating that function really is.

One caveat though… If you store your “consolidating” workbook in the same folder as your data files, your fnGetContents() function will throw an error.  The reason for this is that your consolidation workbook will also get pulled in to the query, as well as the temp file version (prefixed with a ~).  You’ll need to filter out both files.

The post Combine Multiple Excel Workbooks in Power Query appeared first on The Ken Puls (Excelguru) Blog.

Consolidating Worksheet Print Areas With Power Query

$
0
0

I’ve posted on consolidating worksheets before, but in a task I got asked to do last week, I took a slightly different approach.  This approach allows consolidating worksheet print areas to grab the data that I need.

Background

In the previous post, I used some custom M code to reach outside the current workbook, and attach to the cached copy of the same workbook to read the data.  It works, but it’s a bit ridiculous that you need to.  One of the problems with that approach is that new sheets don’t show up in your solution until you save the workbook and refresh the query.  Not a big issue if you safeguard against it, but it’s a gotcha you want to be aware of.

In the situation I was working through, I wanted to consolidate some of the worksheets in the workbook, and also wanted to “future proof” it for new sheets that would be created later.  But I also wanted to make sure that all sheets would be pulled into the solution, even if they’d been newly created and the workbook hadn’t been saved.

So my data (which you can download here) looks like this:

image

Yes, that’s right.  It’s pivoted, even though it’s not in a Pivot Table.  And I’ve got one sheet per month.  Naturally we’d like to turn this into a proper data source.

My Approach

I decided to go seriously hardcore, and just went right into a blank query:

  • Power Query –> From Other Sources –> Blank Query

Not much to see yet (it is a blank query, after all) but I typed the following into the formula bar (which you can activate from the “View” tab if it’s not there.)

  • =Excel.CurrentWorkbook()
  • Enter

SNAGHTML1be24864

And check that out… I set some print areas!

What is in Excel.CurrentWorkbook()?

The Excel.CurrentWorkbook() command (and yes it is case sensitive) will give you a list of all tables, named ranges and power queries that are stored in your workbook.  And, as it happens, I can take advantage of this.  Because the print range is a named range, I can employ that to pick up my data.

Now, to be fair, I TOTALLY wish that Excel.CurrentWorkbook() also included all the raw worksheets.  That would be awesome.  You can get them by going to Excel.Workbook(File.Contents(full_file_path)) but that means it grabs them from the last saved version of the file in the workbook path.  Not ideal if your data is “live”.

At any rate, I’m good now.  So long as my users copy the previously set up sheet each month, and rename it to the current month, it will inherit the print area.  I now have something I can work with.

Consolidating Worksheet Print Areas

So let’s make use of this…

Something we need to recognize is that – as soon as we load this query to the worksheet – we are going to get a query and a table showing up in this list.  So let’s protect against that.  Click the Filter area on the “Name” column, and filter to records that contain “Print_Area”.

image

Now I don’t actually need the worksheet name for anything here, so I can

  • Select the Name column –> Right Click –> Remove
  • Click the Expand button (at the top of the the Content column)
  • Un-check the option to use original column name as prefix and click OK

And I’m left with this:

SNAGHTML1c49ca60

So now it’s just time for cleanup.  Let’s do that:

  • Go to Add Column –> Add Custom Column
  • Call the column Date and enter the following formula:
    • if Text.Start([Column1],5) = "Sales" then [Column3] else null
  • Select the Date column –> Transform –> Fill Down
  • Go to Home –> Remove Rows –> Remove Top Rows –> 2
  • Go to Transform –> Use First Row as Headers
  • Rename:
    • “Column1” to “Major Group”
    • “Column2” to “SubClass”
    • “Column7” to “Date”
  • Select the “Date” column –> Transform –> Data Type –> Date
  • Select “Major Group” column
    • Go to Transform –> Fill Down
    • Filter to only rows that “does not contain” “Sales”
  • Select “SubClass”
    • Filter to remove “null” records
  • Remove the “Total” column
  • Select the “Major Group”, “SubClass” and “Date” columns
  • Go to Transform –> Unpivot Columns –> Unpivot Other Columns
  • Rename
    • “Attribute” to “Category”
    • “Value” to “Units Sold”
  • Rename the query to “Sales” and load it to the worksheet

Create a Simple Pivot

Now that we have our data, we can create a simple Pivot Table from it:

image

Which is great, but what about next month?  Let’s see what happens…

  • Right click the “Feb Sales” worksheet –> Move or Copy
  • Check the “Create a copy” checkbox and click OK
  • Change C1 to 3/31/2015

Yuck… we’re simulating that we can’t trust our users to even rename the worksheet properly, but the do get the data right.  Now let’s see how our query reacts…

Go back to the Power Query worksheet, refresh the query, and the PivotTable if necessary…

image

Awesome!

What’s Safe?

This worked because we only have the print area defined for our input sheets.  If someone added a print area on other worksheets we’d have to add additional/adjust our logic to deal with that, as we’d get data in our query that possibly shouldn’t be there.

If someone adds something outside the print area, then you have an issue.  In those cases a table may be a better option to start with.  And naturally tables also show up in the Excel.CurrentWorkbook() call.  Having said that, it’s harder to control the default names there, so that could be an issue for you.

If someone modifies the shape of a range, that could also be a problem.  Say a new column gets added or something.

At the end of the day you need to think through an anticipate the scope of what your users are likely to do, and what shape the data will likely stay in (or not) so that you can plan accordingly.  In the scope of the solution that I built, I’m fairly certain it will work without a challenge.

The post Consolidating Worksheet Print Areas With Power Query appeared first on The Ken Puls (Excelguru) Blog.

Power Query Errors: Please Rebuild This Data Combination

$
0
0

I got sent this today from a friend.  He was a bit frustrated, as he got a message from Power Query that read “Formula.Firewall: Query 'QueryName' (step 'StepName') references other queries or steps and so may not directly access a data source. Please rebuild this data combination.”

What Does This Mean?

This message is a bit confusing at first.  For anyone who has worked with Power Query for a bit, you know that it’s perfectly acceptable to merge two queries together.  So why when you do some merges does Power Query throw this error?

image

The answer is that you cannot combine an external data source with another query.

Looking At Some Code

Let’s have a quick look at the code that my friend sent:

image

Notice the issue here?  The Merge step near the end references a query called DimShipper.  No issue there.  But it tries to merge that to an external data source which is called in the first line.

This is actually a bit irritating.  But let’s break this down a bit further.  Looking at the first line, it is made up as follows:

Source=Excel.Workbook(File.Contents(Filename())),

The Filename() portion is calling a function to return the desired filename from a workbook table, (based somewhat on this approach.)  We already know that works, but we also know that this is definitely pulling data from an external workbook (even it the file path is for this current workbook!)  And to be fair, this would be the same if we were pulling from a web page, database or whatever.  As long as it’s an external source being combined with another query, we’re going to get smacked.

So it’s suddenly starting to become a bit clearer what Power Query is complaining about (even if we think it’s frustrating that it does complain about it!)

Let’s “rebuild this data combination”

Right.  Who cares why, we care how to deal with this.  No problem.  It’s actually fairly straightforward.  Here’s how I dealt with it:

  • Right click the Query in Excel’s Query window
  • Choose Edit

I’m now in the Power Query window.  On the left side, I found the Queries pane and clicked the arrow to expand it:

image

Duplicate the Existing Query

The query in question here is “Purchase”…

  • Right click “Purchase” and choose Duplicate
  • Immediately rename the query to PurchaseList
  • Go to View –> Advanced Editor
  • Selected everything from the comma on the second line down to the last row of the query:

image

  • Press Delete
  • Change the final line of “Merge” to “Purchase_Sheet”

Perfect… so my new PurchaseList query looks like this:

image

  • Click Done

This query will now load.  Even though it is pointing to an external data source, it’s not being combined with anything else.  So it’s essentially just a data stage.

Modify the Existing Query

Now we need to go back and modify our existing query.  So in that left pane we’ll again select the Purchase query.

  • Go to View –> Advanced Editor
  • Select the first two lines and delete them
  • Put a new line after the let statement that reads as follows

Source = PurchaseList,

NOTE:  Don’t forget that comma!

And what we end up with is as follows:

image

So What’s The Difference?

All we’ve really done is strip the “external” data source and put it in it’s own query.  Yet that is enough for Power Query to be happy.  The new Purchase query above now has two references that it is comfortable with, and it works.  (And that’s probably the main thing.)

Designing To Avoid This Issue

I make it a practice to land all of my data sources into specific “Staging Tables”, which are set to load as connections only.  From there I build my “finalization” tables before pushing them into my Data Model (or worksheet).  You can visualize it like this:

SNAGHTML409dfd44

The key takeaways here are:

  • I always go from data source into a Staging Query (and do a bit of reshaping here)
  • My staging queries are always set to load to “Connection Only” so that the aren’t executed until called upon
  • All combining of data from disparate sources is done in queries that reference other queries, not external data sources

I’ve found this approach works quite well, and always avoids the “rebuild this data combination” error.

The post Power Query Errors: Please Rebuild This Data Combination appeared first on The Ken Puls (Excelguru) Blog.

Prompt for a Folder at Refresh

$
0
0

I got a comment on my blog the other day from David, asking if we could make Power Query prompt for a folder at refresh, allowing us to choose which folder to should be used to consolidate files.  That sounds like a pretty cool idea, and my first thought was “I think we can totally do that!”

I headed over to Chris Webb’s blog, as he’s done some cool things with Power Query prompts. Ultimately though, I got stuck on one thing… we have to go into the Query to invoke it.  My (albeit limited) tests left me unable to have that prompt when we refresh the query.

Not to be outdone by a lacking feature though, I cooked up a solution using some Power Query and VBA techniques that I’ve shared before.  This post wraps them up into a neat little package.

Background

The idea I’m going for here is to be able to click a button which:

  • Prompts the user to select a folder,
  • Feeds that folder into Power Query, and
  • Refreshes the Power Query output table.

To set it up, I used 3 techniques that I’ve shared before:

Initial Setup

To begin with, I created a blank workbook, and immediately added a Power Query parameter table as outline in Building a Parameter Table for Power Query.  I followed the steps there exactly, so I’m not going to detail that here.  The only things of note were:

  • I named the worksheet “Parameters”, and
  • I used the following for my “File Path” value:  D:\Consolidate\Begin

image

Next, I created a query to consolidate all the files in a folder.  Basically I ran through the following steps:

  • Power Query –> From File –> From Folder
  • Chose my folder:  D:\Consolidate\Begin
  • Did a bit of cleanup on the file
  • Implemented the “fnGetParameter” function call in place of the folder (as described in the aforementioned blog post)

The end result was some M code that looked like this:

let
Source = Folder.Files(fnGetParameter("File Path")),
#"Combined Binaries" = Binary.Combine(Source[Content]),
#"Imported CSV" = Csv.Document(#"Combined Binaries",null,",",null,1252),
#"First Row as Header" = Table.PromoteHeaders(#"Imported CSV"),
#"Changed Type" = Table.TransformColumnTypes(#"First Row as Header",{{"TranDate", type date}, {"Account", Int64.Type}, {"Dept", Int64.Type}, {"Sum of Amount", type number}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Sum of Amount", "Amount"}}),
#"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"Amount", type number}}),
#"Removed Errors" = Table.RemoveRowsWithErrors(#"Changed Type1", {"TranDate"})
in
#"Removed Errors"

And returned results like this:

SNAGHTML250112a0

I then loaded the Power Query to the worksheet, and called the worksheet “Data”.

So far so good?  Nothing really unusual yet.

Laying the Folder Change groundwork

With the basic functionality in place, I wanted to now give the user the ability to change the folder.

Rather than write the Browse For Folder routine from scratch, I headed over the VBAExpress.com and grabbed the one I submitted to the KB… um… a long time ago.  :)

Once I had that code copied I:

  • Opened the VBE
  • Browsed into my workbook
  • Right clicked the project and added a new Module
  • Pasted in all the code (shown below):

Function BrowseForFolder(Optional OpenAt As Variant) As Variant
'Function purpose:  To Browser for a user selected folder.
'If the "OpenAt" path is provided, open the browser at that directory
'NOTE:  If invalid, it will open at the Desktop level
Dim ShellApp As Object
'Create a file browser window at the default folder
Set ShellApp = CreateObject("Shell.Application"). _
BrowseForFolder(0, "Please choose a folder", 0, OpenAt)
'Set the folder to that selected.  (On error in case cancelled)
On Error Resume Next
BrowseForFolder = ShellApp.self.Path
On Error GoTo 0
'Destroy the Shell Application
Set ShellApp = Nothing
'Check for invalid or non-entries and send to the Invalid error
'handler if found
'Valid selections can begin L: (where L is a letter) or
'\\ (as in \\servername\sharename.  All others are invalid
Select Case Mid(BrowseForFolder, 2, 1)
Case Is = ":"
If Left(BrowseForFolder, 1) = ":" Then GoTo Invalid
Case Is = "\"
If Not Left(BrowseForFolder, 1) = "\" Then GoTo Invalid
Case Else
GoTo Invalid
End Select
Exit Function
Invalid:
'If it was determined that the selection was invalid, set to False
BrowseForFolder = False
End Function

With that in place, I just needed to link that function into a routine that can use it…

Rolling the Final Routine

I write a lot of VBA, so this was pretty quick to knock up for me.  The code itself looks like this:

Sub UpdatePQ()
Dim vFolder As Variant
Dim wsParameters As Worksheet
Dim wsData As Worksheet
'Set the worksheets
Set wsParameters = Worksheets("Parameters")
Set wsData = Worksheets("Data")
'Update the folder to import
vFolder = BrowseForFolder
If CStr(vFolder) = "False" Then
'No folder chosen, exit the routine
Exit Sub
End If
'Update parameter table with folder
wsParameters.Range("B2").Value = vFolder
wsData.Range("A1").ListObject.QueryTable.Refresh BackgroundQuery:=False
End Sub

In short, this routine basically:

  • Sets references to the two worksheets
  • Prompts the user for the folder.  The output here could be a folder path or “False” which means the user cancelled.  If it’s “False” we bail out
  • If the path was provided, it gets placed in B2 of the Parameter worksheet (which is the “Value” field for the “File Path” variable.  (So if you didn’t create your table starting in A1, then you’d need to update this.)
  • Finally, we refresh the Power Query output table which I landed in cell A1 of the Data worksheet.

Final Setup Step

The very last step is to link the UpdatePQ macro to a button.  This is fairly straight forward as well, and the steps are illustrated well in the Refresh Power Query with VBA post.

Make Power Query Prompt for a Folder at Refresh

And now, the only thing we need to do is click a button…

image

Choose a folder…

SNAGHTML2525f706

And the query will refresh the data in the landing page.

Try it out

You can download the completed workbook, as well as two folders with the source data, from my Skydrive. Click here to do so.

Caveats

I hope it goes without saying that the data structure of the files in the selected folders must be the same.  If they’re not, the query load will fail, of course.  But so long as you’re just trying to deal with different months/years, this should work nicely.

The post Prompt for a Folder at Refresh appeared first on The Ken Puls (Excelguru) Blog.

Create Running Totals in Power Query

$
0
0

I was presented with an interesting problem last week; how to create running totals in Power Query.  There doesn’t appear to be a built in method to do this, so I thought it would make an interesting use case study.

Background

Let’s take a look at the following table, which can be found in the sample file found here (if you’d like to follow along.)

SNAGHTML1ec4a255

Now, we could certainly use a Pivot Table to create running totals, but consider the scenario where we want to preserve the original list.  Pivot Tables aggregate data and apply running totals to the aggregation, so that might not suit our needs.  (In the case of the table above, it would aggregate both records for 1/2/2015 into a single line.)

We could also set up a formula on our table that summed the row about plus the row to the left, for example.  Of course that means you MUST land it in the worksheet, which may not be ideal either.

This solution is entirely possible to solve with Power Query, but with no one click feature in the user interface, how do we pull it off?

Creating the Staging Query

The first thing I did is create a query to pull in my data source and land it in a “Staging” table inside Power Query, as per this post.  To do that I:

  • Selected a cell in the table
  • Went to Power Query –> From Table
  • Selected the Date column –> Transform –> Data Type –> Date
  • Went to Add Column –> Add Index Column –> From 0

At this point the query looks like this:

image

Next I finalized the staging query.  I:

  • Changed the table name to “Staging-Sales”
  • Went to File –> Close and Load to… –> Only Create Connection
  • Right clicked the query in the Workbook Queries pane –> Move to Group –> Staging

Building a Running Total function - Groundwork

With the staging query in place, it was time to tackle the real problem; the method to create running totals in Power Query.  That work requires a bit of down and dirty M work, and here’s how I approached this to build up my end solution:

  • Right click the “Sales” staging query and choose Edit

This will land you back in the Power Query editor.  On the left side of the window, expand the vertical “Queries” window by click on the little arrow until you see this:

image

With this window expanded, we can begin creating our function:

  • Right click “Sales” –> Reference
  • Right click “Sales (2)” –> Move to Group –> Create Group –> “Functions”
  • Change the name from “Sales (2)” to “fnRunningTotal”

Things should now look as follows:

image

Perfect.  We now have the query set up the way we need to take our next steps.  Right now it’s just pointing to our Sales query and pulling the data in, but that will change.

From a logic point of view, here’s what I want to happen:

  1. I want to take a row from the Sales table
  2. I want to send a copy of the table to my function
  3. I want my function to remove all rows where the index is higher than the submitted row
  4. I want to sum the remaining values
  5. I want to return that total back to the calling row

So let’s start working through that process… we’ll come back to steps 1 and 2 in a bit… let’s pretend those have already happened, and focus on the rest.

  • Select the Sale and Index columns –> Home –> Remove Columns –> Remove Other Columns
  • Click the Index column filter –> Number Filters –> Less than or equal to
  • We’ll pick an arbitrary starting point like 2 for right nowimage
  • Rename the “Sale” column to “Temp”

Now we’re going to take the final step with this data before we convert it into a function.  We are going to group it, so go to:

  • Transform –> Group By

Now, we need to tweak the default a bit to get it right.  The very first thing we need to do is remove the “Group By” level that groups by Temp, as we want to group all remaining rows together.  To do that, you click the – button beside Temp.

Next we need to set our data correctly.  We are going to set it to:

  • New column name:  RunningTotal
  • Operation:  Sum
  • Column:  Temp

And at this point, our output should look like this:

image

Building a Running Total function – Finalization

With this in place, we now need to turn this into a real function.  And to do that, we need to step into the M code itself.  So go to View –> Advanced Editor to do that.

Now, at the very top of the M code (before the let line), we are going to insert the following code:

(tbl as table, sumcolumn as text, rowindex as number) =>

This code will allow us to pass in the table, which column we want to sum, and which row we are working with when we call the function.

Next we need to plumb these variable into our code.  To do that we will:

  • Remove the “Source = Sales,” line of the M query
  • Replace every instance of Source in query (there is only one) with the name of our table: tbl

image

  • Replace every instance of “Sale” in the query with the name of the variable which holds our column name; sumcolumn.

image

Note that we do not need to wrap sumcolumn in quotes, as Power Query knows that it is text based on the parameter definition at the head of the function.

  • In the #”Filtered Rows” step, replace the hard coded 2 with the name of the variable which holds our index number; rowindex.

image

  • Place a comma at the end of the #”Grouped Rows” line (to make way for the next step)
  • Replace the last two rows with the following to return the single value back to the caller:

   RunningTotal = Record.Field(#"Grouped Rows"{0},"RunningTotal")
in
RunningTotal

When complete the function looks as follows:

image

And now click Done.  At this point, your query should flip to appear as a function.  This is evident in three ways:

  1. The icon in the left pane changes to an fx logo
  2. The table disappears and is replaced with the function parameters and an invoke box
  3. The Applied Steps collapse to a single line (I wish this didn’t happen)

Commit it by choosing File –> Close and Load

Create Running Totals in Power Query

So now… did it work?  Let’s find out.

  • Right click the Sales query in the Workbook Queries pane and choose Edit
  • Expand the queries pane on the left
  • Right click Sales –> Reference
  • Right click Sales (2) –> Move to Group –> Other Queries
  • Rename Sales (2) to “Landing”
  • Go to Add Column –> Add Custom Column
  • Call the column “Running Total”

And now we need to enter our formula.  The actual formula itself will be:

=fnRunningTotal(Source,"Sale",[Index])

Here’s how it breaks down:

  • fnRunningTotal is the name of the function
  • Source is the name of the previous step in the Applied Steps section.  This allows us to pass the entire table that is generated up to that point
  • Sale” is the name of the column we want to summarize.  The quotes here indicated that we are going after the column’s name, not the data in that column’s row.
  • [Index] is surrounded in square brackets as we want to pass the value for that row of the index column.  That is the row that is then used to filter down the data.

And the result, if you followed along exactly, is this:

image

So at this point, we can remove the Index column and call it done.  You’ll find a complete version here if you’d like to review it.

Final Thoughts

This looks like a very long and cumbersome route, but overall it’s actually not that complicated once you understand it.  The key points that I want to highlight here are:

  • We started with a simple Staging query
  • We used that as a reference and stepped through the process of filtering down the data for one record
  • We converted the new query to a function by:
    • Defining a line at the beginning to receive 3 variables
    • Integrated the variables into the code in place of hard values
    • Added an ending to feed back a single value
  • We used our function

The first line of the query is essentially just () => and any variables we need go between the brackets.  And while I declared there types, you don’t actually even have to do that.

The last line of code before the in is the trickiest part.  And here’s a secret… I just come back to a blog post and copy this part:

RunningTotal = Record.Field(#"Grouped Rows"{0},"RunningTotal")

The secret is understanding that the #"Grouped Rows" portion is just the name of the previous step in the M code, and the final “RunningTotal” is the name of the column you want to return.  That lets me retrofit it pretty easily into any other function.

So… would it be faster to just flip it into a pivot and add running totals there?  Absolutely.  Would it be faster to just land the data in a table and create a running total formula down the table?  Absolutely.  But if those aren’t options, then Power Query may be a viable solution for you.  :)

The post Create Running Totals in Power Query appeared first on The Ken Puls (Excelguru) Blog.

Looking for Power Query samples

$
0
0

How would you like to make your data famous?  Or at least – how would you like a solution built for your own data using Power Query… for free?

I have a couple of Power Query related things I’m working on right now, like these:

Now, if you’ve ever been to one of my courses, you’ll know that one of the most important things to me is to use data that is modelled on real world examples.  That’s right, no canned AdventureWorks samples.  I like data that real users face in the real world.

To that end, I’m looking for Power Query samples.  Specifically, for sample files of data to use as examples for my Power Query blog posts, courses and other educational products.  And while I’ve got lots of great data sets of my own, I’m particularly interested in stuff that I haven’t seen or demonstrated on the blog before.

So if you’ve got a nasty file that’s painful to import, or you have a few files that you’re having trouble combining, or you just can’t figure out a starting point with Power Query, I’d like to have a crack at your scenario.  It’s super easy… you email me the file, I’ll take a look at it, and cook up a solution.  And any new or interesting techniques that are needed become the subject of a future blog post, article, or even sample in the courses I teach.

I hope it goes without saying, but for legal reasons I’m going to say it anyway… any files you send me can and will be used in public.  In order to use them in my materials, by sending them to my you are, by default, assigning me copyright to those specific source files.  Because of that, it is VERY important that the data you include is not sensitive.  To be fair, I’m not interested in your data specifics, as much as the form and shape of it, how it starts out, and what the end product should become.

“Okay, that’s great for you, Ken, but what do we get out of sending you our files?”

How about a working solution?  Does that sound reasonable?  You send me your data, and I send you back something that uses Power Query to complete the goal?

If you’re interested (and who wouldn’t be), here’s what you need to include:

  • A sanitized copy of your source data.  (i.e. Do a mass find/replace on employee names/numbers, phone numbers, addresses and such.  If you have particularly troubling data that is interesting, I may be willing to even help here.)
  • A copy of what you want the output to look like.  (I’ll build to that end goal.)
  • A summary of any “rules” you use along the way to determine if something needs to change one way or another.  (I’d rather not spend my time guessing about your end goals.)
  • Your direction as to whether you’d like your/your company name included or withheld should I decide to publish a post using your data.

In return for that, I will send you a copy of the solution once I’ve crafted it.  This may happen days/weeks before any solution is posted publicly, or it may come in the form of an email to let you know that the post is live, at my discretion.

Please be aware that I also reserve the right to NOT use your submission at my discretion.  This could happen if my inbox gets overloaded with requests, if I’ve already dealt with your scenario on the blog, if your data would be better handled in another fashion, if I get too busy with paying projects, if the data is just too difficult to work with, or any other reason that I haven’t listed. (I’ll do my best, but this is free work, so it’s totally at my schedule and discretion.)

If that sounds interesting to you, my email is ken at excelguru dot ca.  I’m going to limit this to the first 10 data sets that land in my inbox, so if you want to try and take advantage of it, act quick!

The post Looking for Power Query samples appeared first on The Ken Puls (Excelguru) Blog.


Use Dynamic Ranges in Power Query

$
0
0

This was a great question that came up in my M is for Data Monkey session at the Amsterdam Excel summit:  Can we use dynamic ranges in Power Query?

If I recall correctly, this was asked by Gašper Kamenšek, after I demonstrated how to use the =Excel.CurrentWorkbook() method to consolidate worksheet tables.  So we gave it a shot to find out.

Setup:

Nick Hodge jumped up and built us a quick little dynamic named range solution that looked like this:

SNAGHTML21f653dd

And defined a named range call “Animals”, as follows:

=OFFSET(Sheet1!$A$4,0,0,COUNTA(Sheet1!$A$4:$A$12),2)

The next step was to pull it into Power Query, but I didn’t have a Table or normal Named Range to pull from, so I went to Power Query –> From Other Sources –> Blank Query.

The Normal Piece

As I’d shown earlier, I typed the following in the formula bar in the Power Query editor:

=Excel.CurrentWorkbook()

And at this point I was pretty chuffed as it looked pretty straight forward:

image

This is exactly what I was hoping I would see.  Fantastic!  So I did a little normal cleanup:

  • Expanded the content
  • Moved the first row to a header
  • Deleted the “Animals” column that was carried down the table

I then proudly announced that “Yes!  It works!”, clicked Close and Load to land it in a table… and it failed:

image

The Data Fun House of Mirrors

That was a bit of a shock.  It worked in the Query window, but when loading it failed.  There’s something not adding up here.  So I jumped back in to edit the query:

I was immediately presented with this:

image

Doesn’t make sense… so I started to step through the process. I clicked on the Source step:

image

Aha!

Use Dynamic Ranges in Power Query

What I forgot was that I’d used =Excel.CurrentWorkbook().  That  function returns a table which contains all tables, connections and named ranges in the workbook.  And when we create a new Power Query, it adds an output table and a connection.  So we basically got a circular reference.

No problem.  We can easily deal with this.  All we need to do is filter down to the Animals table that we need.  (That way we don’t have to worry about any new tables being created.)

So I added that filter immediately after the Source step.  And boom, everything works again because it is restricted to just the table I originally had.  And now, when I commit it the the worksheet, it loads:

SNAGHTML229c0b05

Proof Positive

Perfect.  It loaded, now lets just go add some data and make sure it’s working properly at update.  I added a new record to the table:

SNAGHTML22839637

And voila!

SNAGHTML22a09947

So it looks like we can use dynamic ranges in Power Query without any issues.

Alternate Method

When I was writing this up, I couldn’t replicate the error at first.  The method I used was just slightly different.  Here’s what I did instead:

  • Pulled in the table
  • Clicked the green Table to drill into it
  • Promoted headers
  • Landed it

It worked perfectly at update.  Why?  Because I drilled into a specific table, avoiding the issue at update.  And in all honesty, it’s probably a better method. ;)

The post Use Dynamic Ranges in Power Query appeared first on The Ken Puls (Excelguru) Blog.

Native Database Query Security Prompts

$
0
0

In a comment on my Parameter Tables post, Talha asks why we get prompted with a Native Database Query message each time the value of the parameter used in a SQL query changes:

I had a question. The value I grab from my parameter table, I store it in a variable and use it in my SQL query. However, every time I change the parameter, I get the following prompt:

------------------------------------------------

NATIVE DATABASE QUERY

You are about to run a native database query. Native queries can make changes to the database. Would you like to run the following query?

------------------------------------------------

Is there any way I can make this go away?

To answer this, it helps to understand why this message comes up.

Replicating The Issue

When you connect to a SQL server database in Power Query, there are three parts to the initial dialog:

image

Server and Database are pretty straight forward, but the SQL Statement is marked as optional, and is actually hidden under the arrow at the left.  Now, for anyone who has written SQL in the past, you open up that little window, pop in your SQL and click OK.  And you’ll get prompted with a message about trusting a Native Database Query.

On the other hand… if you DON’T fill in a SQL statement, you’ll be taken to a list of all tables and views in the database that you have access to, and will be able to bring those in to start working with them.  Do that, and you’ll NEVER SEE THIS PROMPT.

Why Prompt About a Native Database Query?

A native database query is basically SQL code that you provide to pass to the database.  Power Query treats this as potentially dangerous, so gives you a warning when you do this.  And every time you change the underlying SQL query (pass one parameter to it to change it in some way), the Query is seen as “new” and again, potentially dangerous.

I tend to agree that this is a bit overkill.  I’m not a SQL expert, by any means, but if the query is just a SELECT query, then I’m not sure why we need to be warned about it.  You’d think that would be easy enough for the PQ team to parse out, looking for keywords like ALTER, DROP, UPDATE, DELETE and such, but regardless, that’s the way it is.

Avoiding the Error

Now that we know what is doing it, how do we avoid it?  The answer is actually more simple, but harder to swallow than you might like. Don’t use the Native Database Query functionality!

I really struggled with this at first.  I saw Power Query as a “SQL Sever Management Studio Light” for Excel pros.  It’s not.  It’s totally different. What the PQ team has built is a way for non-power users to be able to query, clean, restrict and reshape their data.  And if you do it their way, then you won’t get this irritating prompt.

In the case of grabbing data from SQL Server, you’d simply connect to the server, grab the table/view you want, then start setting your filters in Power Query to restrict down the data you want.  The UI makes this easy, and it’s a lot easier to tweak the data to get it right than forcing an Excel person to go back and tweak their SQL.

And here’s the cool part… once you’ve done this, Power Query takes advantage of a technology called “Query Folding”, which folds up all the reshaping commands you’ve issued.  It passes those to the SQL server as a single command to bring back your data.  In layman’s terms (although not perfectly technically correct), you can look at this as if Power Query is allowing you a graphical way to write your SQL for you behind the scenes, then pass it to the server.  And because Power Query knows it won’t do anything dangerous, it doesn’t prompt you about potential problems.

What About Performance?

A few months ago I asked someone at Microsoft this question… the thrust of my question was if Power Query would be more or less efficient than passing an optimal SQL query.  His reply was that no, if someone was a SQL Ninja, they could certainly come up with something that performed better in some situations.  Having said that, for the majority of Excel pros, Power Query will do a better job, as we don’t tend to be SQL optimization experts.

Further to this, the very worst thing you can do for performance is provide a base query then try to do further filtering.  Look at these three cases:

Case 1 Use Power Query to:
Connect to table Transactions
Filter to Dept <> 150
Case 2 Native Database Query:
Select * From Transactions Where Dept <> 150
Case 3 Native Database Query:
Select * From Transactions
Use Power Query to filter:
Dept <> 150

Case 1 is the preferred method when dealing with Power Query.  It will roll both steps up into a single statement, and retrieve the requested set from the server.

Case 2 returns the same data as Case 1, but prompts you with a Native Database Query on each computer you run the solution on.  (You can’t even save the connection with the workbook, it’s a user specific trust flag.)

Case 3 is the worst performing.  The reason is that the initial table has to be brought to Power Query in it’s entirety, and THEN you can start filtering.  Why?  Because Power Query cannot query fold it’s commands into the SQL you provided.

Recommendation

Even though connecting to a table, then re-creating all the filtering steps seems less graceful at first, this is the method I’d recommend you take.  In fact, I’ve actually made it a habit to remove any Native Database Queries from project I wrote in the past, replacing them with the recommended method.

To me this offers two benefits:

  1. No security prompting.  May seem minor, but when you put a Power Query error in front of non-tech users, they tend to freak out.  Why put them through that?
  2. It makes the query easier to follow/tweak when you need to debug it later.  And let’s face it, sooner or later you’ll need to do this.

The post Native Database Query Security Prompts appeared first on The Ken Puls (Excelguru) Blog.

Load Power Query directly to Power Pivot in Excel 2010

$
0
0

One of the cool features in Excel 2013’s Power Query is being able to load to the Data Model (PowerPivot) directly.  But Excel 2010 doesn’t appear to have this feature.  Interestingly, you can still load Power Query directly to Power Pivot in Excel 2010, it just takes a bit of a careful workaround.

Let’s look at the required steps

Step 1: Create Your Connection

First, I’m going to load in the content of a text file.  So I:

  • Go to Power Query –> From File –> From Text
  • I browsed to the file I needed, and imported it into Power Query
  • I do whatever cleanup is needed and name the query Sales
  • Next, we go to the Home tab –> Close and Load –> Close and Load To…

And here’s the important part:

  • Choose “Only Create Connection” –> Load

And I’ve now got a basic connection to my sales table without landing it in a worksheet:

image

Step 2: Grab the Connection String

Now, here comes the secret.  We need to get the connection string that Excel uses to connect to the Power Query.  Here’s how:

  • Go to the Data tab –> Connections

In there, you’ll see the name of your new connection:

image

  • Select your Query and click Properties
  • Click the Definition tab

Now you’ll be looking at something like this:

image

Notice that this query is actually an OLE DB Query that is simply “SELECT * FROM [Sales]”  That seems easy to work with.  But the key for us is the connection string shown (#2 in the image above).

  • Select the ENTIRE connection string
  • Press CTRL + C to copy it
  • Click Cancel

Note:  Make sure you start at “Provider=” and highlight all the way to the end.  (It’s much longer than what you see in that little box.)

Load Power Query directly to Power Pivot

Finally, we’re going to pull this into Power Pivot.  To do this:

  • Go to the Power Pivot tab –> PowerPivot Window
  • From Other Sources –> Others (OLEDB/ODBC) –> Next

image

  • Name your table
  • Paste your Connection String in the box

image

  • Click Next –> Next –> Finish –> Close

And voila!  We have our Power Query linked directly into Power Pivot in Excel 2010!

image

Just remember… if you do this, NEVER modify this table in Power Pivot.  Always go back to modify the table in the Power Query stage.  Failure to do so could set the table into a non-refreshable state.

The post Load Power Query directly to Power Pivot in Excel 2010 appeared first on The Ken Puls (Excelguru) Blog.

Announcing Online Power Query Training!

$
0
0

I’m really pleased to announce that a new project I’ve been working on is live: powerquery.training, a website that offers online Power Query training!

online Power Query training

PowerQuery.Training

About PowerQuery.Training

PowerQuery.Training is a joint effort between Miguel Escobar (of PoweredSolutions.co) and myself: the two guys who are writing that Power Query book – M is for Data Monkey book.

M is for Data Monkey

M is for Data Monkey

What you can find at PowerQuery.Training

We believe that Power Query is a super important tool in the toolbox of every Excel user out there.  It's so easy to use, and so powerful, that we think everyone should know about it.  To that end, we've decided to include the following two areas on the site:

Power Query patterns

One of the areas of focus is to showcase Power Query patterns (techniques).  These patterns are well illustrated articles with supporting workbooks, which will help you build solutions for common scenarios.  There will be more coming as we have time to add them, but today you'll already find:

Online Power Query Training

We know that not everyone wants to read pages of documentation, trying to figure out how to use a new technology.  Some people want to carve out a block of time, link up with an instructor, and be taught the basics, and how to avoid the inevitable pitfalls.  That's our aim with our online Power Query training workshops.

That’s right, there will be live online offerings in order to help get you skilled up on Power Query without ever leaving the comfort of your own office!  All you need to do is dedicate 3 days of your time – well okay… and some space on your credit card – and you’ll be off to the races with this software, taming and automating the cleanup and refresh of your data.

Interested in seeing what the course covers?  Download the Course Agenda here.

The post Announcing Online Power Query Training! appeared first on The Ken Puls (Excelguru) Blog.

What Power Query Functions Exist?

$
0
0

I know that this topic has been covered before by others, but I think it’s still pretty valuable for a user to be able to figure out what Power Query functions exist, especially since they are often different than what we’re used to in Excel.

Power Query Functions Documentation on the web

There’s a pretty good resource site available on the Microsoft Support site.  Personally I have that one bookmarked and head over there often when I’m looking for a new function to do something.  I find that with a quick CTRL + F on the page, I can quickly search and narrow in on the function I think I need in order to learn it’s syntax.

To be fair, I’m not always in love with the actual examples (many lack a power query UI view), but overall the site is fairly useful.

Power Query Functions Documentation in the client

Now that’s all good, but what if you’re working on a plane with no WIFI, and you need to figure out the syntax for a new function?

As luck has it, there is a way to pull up the list for most functions right in the client.  To do this, I:

  • Clicked Add Column –> Add Custom Column
  • Typed a 1 and clicked OK
  • Went to the Power Query formula bar and typed the formula below.  (Notice that this is case sensitive)

 =#shared

(Why the custom column? Because typing in the formula bar replaces the previous step, and I want to be able to revert to that since it’s part of my logic:

image

Now, you’ll see you get a list of (almost) all the functions that you can access:

SNAGHTML894ffc53

Now, let’s assume I’m trying to find a formula to remove certain characters from a text string.  I really need to search for “Text.”, but there isn’t a search option.  No big deal, let’s convert this list into a table:

SNAGHTML89522a10

Once we’ve done that, we get a nice table of all of the functions, and we can filter them to our heart’s content.  Here’s my table filtered down to just rows that begin with “Text”:

SNAGHTML897b95c3

And a page or so down, I found something that looks like it might work:  Text.Remove.  So I clicked on the green Function beside it’s entry

image

It pops up an Invoke Function box to try it out, which I did.  And behind that is the syntax for how it’s supposed to work.  So that’s pretty cool.

I tried it out with some text, then reverted to the previous step, as I wanted to look at the syntax page that popped up behind the Invoke Function dialog:

image

My only complaint here is that once you land in this window, the only indicator of the actual function name is in the smallest font on the page, buried in the middle.  You’d think that the name would should up a little more prominently.

At any rate, now that I’ve been able to explore the functions and found one I’d like to use, I can just knock off the extra steps shown below in yellow, returning me back to my original source query:

image

Learning more about Power Query functions

For reference, this is one of the many things that Miguel and I will be covering in our upcoming Power Query training workshops.  Learn more about the workshop and register here:  http://powerquery.training/course/

The post What Power Query Functions Exist? appeared first on The Ken Puls (Excelguru) Blog.

Data From Different TimeZones

$
0
0

A friend of mine emailed yesterday asking how to compare data from different timezones.  With how good the UI is in Power Query, you’d think this would be easy.  Unfortunately it’s a bit less than that, so I thought it would make a good example for today’s post.

Background

Let’s assume that we’ve got two columns of data; an Order Date and a Shipping Date.  We’d like to work out the number of days it took to ship our order.  Easy enough, we just need to subtract one from the other… except… the system that holds the Order Date reports it in UTC +0:00, and the shipping date is done from my home time zone (UTC –7:00).

The data table we’re starting with looks like this:

image

And you can download a copy of the workbook from my OneDrive here if you’d like to follow along.

Avoiding Temptation

So the first thing to do is pull the data in to Power Query.  So I clicked in the table, went to the Power Query tab, and chose From Table.  At this point we’re greeted with a nice table, and our first temptation is to go directly to the Transform tab and set the Data Type to Date/Time/Timezone:

image

And herein lies a problem.  The system has forced my local TimeZone on the data.  As specified in the initial problem, I need this to carry a UTC +0:00 distinction.

It’s a shame that there is no intermediate step here (how often do I ask for MORE clicks?) which allowed you to specify WHICH TimeZone.  If you’re into working with data from different regions (I.e. this feature), I’d don’t think I’m venturing out on a limb to say that this is pretty important.

To further complicate things, that is the extent of the TimeZone functionality in the UI.  And that’s not going to help us.  So let’s knock off the “Changed Type” step and look at this another way.

Using M to Deal with Data From Different TimeZones

The secret to making this work is to take explicit control of the time zone settings using some Power Query M functions.  It’s not as hard as it sounds.  In fact, we’re only going to use two in this example:

  • DateTime.AddZone to add a time zone to a DateTime data type
  • DateTimeZone.SwitchZone to convert from one time zone to another

I discovered both of these functions by searching the Power Query formula categories article on Microsoft’s site.

Forcing a DateTime to a Specific Time Zone

So we’re currently looking at this data in Power Query:

image

Let’s create a new column to convert the OrderDate:

  • Add Column –> Add Custom Column
    • Name:  Order Date (UTC +0:00)
    • Formula:  =DateTime.AddZone([OrderDate],0)

The secret here is in the last parameter, as we get to specify the time zone.  Since we know these dates/times come out of our system in UTC +0:00, we’re good to not add anything to it.  The result is shown below:

image

Converting a DateTime to a Different Time Zone

Now, in order to be able to compare our DateTimes easily, we want them both to be based in our own time zone.  Since my business works in UTC –7:00, I really want my Order Date represented in that time zone as well.  So let’s convert it.

  • Add Column –> Add Custom Column
    • Name:  Order Date (UTC -7:00)
    • Formula:  =DateTimeZone.SwitchZone([#"OrderDate (UTC +0:00)"],-7)

SNAGHTML1f066319

Beautiful.

Just a note here… It may have been tempting to force this data to UTC –7:00 when we added the time zone above, but that would have assigned the date based in the wrong time zone.  I.e. our first record would have returned 7/4/1996 1:12:00 PM –07:00, which is not the same as what we ended up with.

Forcing another DateTime to a Different Time Zone

Now we need to deal with the ShippedDate column, forcing that to my local time.  I could just select the column and turn it into a Date/Time/Timezone data type, but I won’t.  Why?  What if I send this workbook to another user?  It will return THEIR time zone, not mine.  And that could be different.  Much better to explicitly set it.

  • Add Column –> Add Custom Column
    • Name:  ShippedDate (UTC –7:00)
    • Formula:  DateTime.AddZone([ShippedDate],-7)

Notice that this time we do force it to be in the –7 time zone, as these DateTimes originated from that time zone. The result:

SNAGHTML1f0e9e46

Fantastic.  We’ve added time zone data, without changing the original times.

Let’s just go do a little bit of cleanup now:

  • Select the OrderDate and ShippedDate columns
  • Transform –> Data Type –> Date/Time
  • Select OrderDate (UTC +0:00) through ShippedDate (UTC –7:00)
  • Transform –> Date Type –> Date/Time/Timezone

Excellent.  Now they should show up correctly when we load them to an Excel table instead of losing their formatting.

Making Comparisons

We’re at the final step now: Working out the time to ship.  This is relatively straight forward:

  • Add Column –> Add Custom Column
    • Name:  Days to Ship
    • Formula:  [#"ShippedDate (UTC -7:00)"]-[#"OrderDate (UTC -7:00)"]
  • Select the Days to Ship column
  • Transform –> Data Type –> Duration

Note:  You can just double click the column names in the formula wizard and it will put the # characters in there for you.

And the final look in Power Query:

SNAGHTML1f1602fa

With that all complete, the final step is to give the query a name (I chose ShippingTimes) and load it to a worksheet:

image

Final Thoughts

Personally, I like to take explicit control over my data types.  Call me a control freak if you like (I’ve been called much worse) but relying on implicit conversions that set to “local time” scare me a bit, particularly if I’m going to be sending my workbook off to someone who lives in a different zone than I do.  Once you know how to do this it’s not super difficult, and I now know EXACTLY how it will represent on their side.

I’ll admit also that I’m a bit disappointed in the UI for datetime conversions.  To me, anyone playing in this field needs very granular control over every column.  An extra step in the Transform to Date/Time/Timzone step would go a long way to solving this, as you’d be able to skip writing custom formulas.  Hopefully that’s on the Power Query team’s radar for the future, as well as a full datetime menu that would allow us to easily choose from/add/convert to the majority of the formulas found in the article referenced above.

Power Query Training

Also don't forget.  If you love Power Query or are intrigued by the things you can do with it, we have an online training course coming up soon.  Check it out and register at www.powerquery.training/course

The post Data From Different TimeZones appeared first on The Ken Puls (Excelguru) Blog.

Retrieve Related Tables in Power Query

$
0
0

As I was working on one of the assignments for our upcoming Power Query training course, it occurred to me that I’ve never blogged about this feature: how to retrieve related tables in Power Query.

Retrieve Related Tables in Access

If you’re using Access, you’re stuffed.  I mean… you can do it manually then merge the tables.  Here’s the database I connected to:

image

Yet when I pull the tblChits table, I get the following columns:

SNAGHTML24047fb1

All the columns from the tblChits table, but nothing from the other tables.  Too bad really.

So basically, you can still retrieve related tables in Access, you just need to create a connection to each table, then merge the queries manually.  The steps to do this, in brief:

  • Create a “Connection Only” query called Chits that points to the tblChits table
  • Create a “Connection Only” query called Items that points to the tblItems table
  • Create a “Connection Only” query called Categories that points to the tblCategories table
  • Create a new query that references the Chits query
  • Merge the Items query, making the relationship between the POSItemCode in both queries
  • Merge the Categories query, making the relationship between the POSCategoryCode in both queries

So basically, we can still retrieve related tables in Access, we just need to understand the relationships in our database and merge them manually.

Retrieve Related Tables in SQL

In SQL server, it actually pulls some things over for you nicely.

In this case I connected to the AdventureWorks database that we’re hosting in Windows Azure for course participants to use.  (Does anyone else provide an Azure hosted database to practice with?)  Specifically, I:

  • Connected to the Azure hosted AdventureWorks database
  • Selected the SalesOrderHeader table (only)
  • Filtered down the dataset to get a shorter table
  • Removed a bunch of unnecessary columns from the SalesOrderHeader table

And look what I’m left with:

SNAGHTML240c8e12

The Sales.SalesOrderDetail contains a full table of related records for each row in my table.  Those related records come from a completely separate table that I didn’t even ask for, how awesome is that?

SNAGHTML240e1703

And the Sales.SalesTerritory column shows a “value” for each, which also has more data that can be expanded (including sub tables):

SNAGHTML240f07eb

Pretty slick, and saves me the effort of having to sleuth out and perform the joins manually.  I sure wish Access had this ability as well!

Don’t have an Azure database?

This technique works for SQL server (on prem) as well.  I’m not 100% sure which other databases support this technique, as I don’t have access to others, but if you do know, please comment here and I’ll add them to the list.

And if you really wish you had access to try out sourcing data from an Azure database… it’s not too late to sign up for our course!

The post Retrieve Related Tables in Power Query appeared first on The Ken Puls (Excelguru) Blog.


Excel Text Function Equivalents

$
0
0

For some time it’s bothered me that we don’t have full parity between Excel and Power Query functions.  In this post I’m going to look at four Excel text function equivalents, intended to allow an Excel user to just work with formulas as they’re used to.

=LEFT()

The first function is the LEFT function which, if implemented in M, would be:

(text as text,num_chars as number)=>
let
ReturnValue = Text.Start(text,num_chars)
in
ReturnValue

You’d call it like so:

=LEFT([Word],4)

And it would return the Excel text function equivalent as shown here.  (Excel’s output is in teh LEFT(x,4) column, and the Power Query LEFT function is shown in the pqLeft column.)

image

This one may not be a big deal, as it’s just a new function name, but it now works exactly as an Excel user would expect.  (Except for the fact that you MUST type the function name in uppercase.)

=RIGHT()

We can do the same thing with the RIGHT function:

(text as text,num_chars as number)=>
let
ReturnValue = Text.End(text,num_chars)
in
ReturnValue

And call it with:

=RIGHT([Word],4)

Returning:

image

=LEN()

This is yet another simple one:

(text as text)=>
let
ReturnValue = Text.Length(text)
in
ReturnValue

Called with:

=LEN([Word])

Returning:

image

Why I care about Excel text function equivalents?

Beyond allowing a user to just use the knowledge they’ve been investing in for years, it’s the next function that makes a difference.  With the MID function we have a bunch of things that are different, and to replicate our expectations we need to do some manipulation and conditional testing.

=MID()

This is what it takes to replicate the MID() function in Power Query:

(text as text,start_num as number,num_chars as number)=>
let

tLength = Text.Length(text),
StartCorrected = if start_num>tLength then -1 else start_num-1,
ReturnCorrected = if (StartCorrected+num_chars) > tLength then tLength - start_num + 1 else num_chars,
ReturnValue = if StartCorrected >-1
then Text.Range(text,StartCorrected,ReturnCorrected)
else null
in
ReturnValue

That’s a fairly big pain to just knock up in a couple of seconds!  But once done, it’s easy to use:

=MID([Word],5,4)

And it returns results consistent to Excel as well:

image

Final Thoughts

I really wish these functions were built under and consistent with Excel’s function names to begin with.  While we can replicate their form and function, it takes a lot of work to create/debug and implement them in a power query solution.  In addition, some of the functions are VERY difficult to built.  But regardless, at least we can.

The post Excel Text Function Equivalents appeared first on The Ken Puls (Excelguru) Blog.

Calculate Hours Worked

$
0
0

Chandoo posted an interesting challenge on his blog last Friday, challenging users to calculated hours worked for an employee name Billy.  This example resonated with me for a couple of reasons.  The first is that I’ve had to do this kind of stuff in the past, the second is because I’ve got a new toy I’d use to do it.  (Yup… that toy would be Power Query.)

It always blows my mind how many people respond on Chandoo’s blog.  As the answers were pouring in, I decided to tackle the issue my own way too.  I thought I’d share a bit more detailed version of that here as I think many users still struggle with time in Excel.

Background and Excel Formula Solution

Chandoo provided a sample file on his blog, so I downloaded it.  The basic table of data looks like this:

SNAGHTMLeb2f08

Now, for anyone who has done this a long time, there a few key pieces to solving this:

  • The recognition that all times are fractions of days,
  • The recognition that if you omit the day it defaults to 1900-01-01, and
  • The data includes End times that are intended to be the day following the Start time

The tricks we use to deal with this are:

  • Test if the End time is less than the start time.  If so, add a day.  (This allows us to subtract the Start from the End and get the difference in hours.
  • Multiply the hours by 24.  (This allows us to convert the fractional time into a number that represents hours instead of fractions of a day.)

Easy enough, and the following submitted formula (copied down from F4:F9 and summed) works:

=(D4+IF(C4>D4,1,0)-C4)*24

SNAGHTMLea6d3e

Also, there was a great comment that Billy shouldn’t get paid for his lunch break.  Where I used to work (before I went out on my own), we had a rule that if you worked any more than 4 hours you MUST take a lunch break.  Plumbing in that logic, we’d would need a different formula.  There’s lots that would work, and this is one:

=((D4+IF(C4>D4,1,0)-C4)*24)-IF(((D4+IF(C4>D4,1,0)-C4)*24)>4,1,0)

SNAGHTMLf008e9

So why Power Query?

If we can do this in Excel, why would we cook up a Power Query solution?  Easy.  Because I’m tired of having to actually write the formula every time Billy sends me his timesheet. Formula work is subject to error, so why not essentially automate the solution?

Using Power Query to Calculate Hours Worked

Okay, first thing I’m going to do is set up a system.  I’m set up a template, email to Billy and get him to fill it out and email it to me every two weeks.  I’ll save the file in a folder, and get to work.

  • Open a blank workbook –> Power Query –> From File –> From Excel
  • Browse and locate the file
  • Select the “Billy” worksheet (Ok, to be fair, it would probably be called Sheet1 in my template)

image

  • Click Edit

And now the fun begins…

  • Home –> Remove Rows –> Remove Top Rows –> 2
  • Transform –> Use First Row as Headers
  • Filter the Day column to remove (null) values
  • Select the Day:End columns –> right click –> Remove Other Columns

And we’ve now got a nice table of data to start with:

SNAGHTMLfb3d71

Not bad, but the data type for the Start and End columns is set to “any”.  That’s bad news to me, as I want to do some Date/Time math.  The secret here is that we need our values to be Date/Times (not just times), so let’s force that format on them, just to be safe:

  • Select Start:End –> Transform –> Date/Time

Next, we need to test if the Start Date occurs after the End Date.  Let’s use one step to test that and add one day if it’s true:

  • Add Column –> Add Custom Column
    • Name:  Custom
    • Formula:  =if [Start]>[End] then Date.AddDays([End],1) else [End]

So basically, we add 1 day to the End data if the Start time is greater than the end time.  Once we’ve done that, we can:

  • Right click the End column –> Remove
  • Right click the Custom column –> Rename –> End

And, as you can see, we’ve got 3 records that have been increased by a day (they are showing 12/31/1899 instead of 12/30/1899

image

Good stuff, let’s figure out the difference between these two. The order of the next 3 steps is important…

  • Select the End column
  • Hold down the CTRL key and select the Start column
  • Go to Add Column –> Time –> Subtract

Because we selected the Start column second, it is subtracted from the End column we selected first:

image

Now we can set the Start and End columns so that only show times, as we don’t need the date portion any more.  In addition, we want to convert the TimeDifference to hours:

  • Select the Start:End columns –> Transform –> Time
  • Select the TimeDifference column –> Transform –> Decimal Number

Hmm… that didn’t work quite as cleanly as we’d like:

image

Ah… but times are fractions of days, right?  Let’s multiply this column by 24 and see what happens:

  • With the TimeDifference column selected:  Transform –> Standard –> Multiply –> 24
  • Right click the TimeDifference column –> Rename –> Hours

Nice!

image

Oh… but what about those breaks?

  • Add Column –> Add Custom Column
    • Name:  Breaks
    • Formula:  =if [Hours]>4 then -1 else 0
  • Add Column –> Add Custom Column
    • Name:  Net Hours
    • Formula:  =[Hours]+[Breaks]

And here we go:

image

At this point I would generally:

  • Change the name of the query to something like:  Timesheet
  • Close and Load to a Table
  • Add a total row to the table

SNAGHTML12fe424

But just in case you only cared about the total of the Net Hours column, we could totally do that in Power Query as well.  Even though it’s not something I would do (I’m sure Billy would trust YOU implicitly and never want to see the support that proved you added things up correctly…), here’s how you’d do it:

  • Go to Transform –> Group By
  • Click the – character next to the Day label to remove that grouping level
  • Set up the Grouping column:
    • Name:  Net Hours
    • Operation:  Sum
    • Column:  Net Hours

Here’s what it looks like if you set the column details up first, indicating where to click to remove the grouping level:

image

And the result after you click OK:

SNAGHTML1344d4b

Holy Cow that’s a LOT of Work!?!

Not really.  Honestly, it took me about a minute to cook it up.  (And a LOT longer to write this post.)  But even better, this work was actually an investment.  Next time I get a timesheet, I just save it over the old one, open this file, right click the table and click Refresh.  Done, dusted, finished and time to move on to more challenging problems.

Even better, if I wanted to get really serious with it, I could implement a parameter function to make the file path relative to the file, and then I could pass it off to someone else to refresh.  Or automate the refresh completely.  After all, why write formulas every month if you don’t have to?

:)

The post Calculate Hours Worked appeared first on The Ken Puls (Excelguru) Blog.

Listing Outstanding Cheques

$
0
0

Power Query is all about transforming and filtering data, and automating the process.  One of the tedious tasks that accountants get to deal with all the time is bank reconciliations, which is essentially the process of filtering and matching items to see what is left over.  It’s been on my list for a while now, but I’ve been thinking that we can use Power Query for listing outstanding cheques (or checks if you’re in the USA).

The completed workbook is available for download by clicking here.

Background

Because I don’t want to run on to 100 pages, I’m going to start with two lists that show just the cheques, in two tables:

SNAGHTML1901ec84

The table on the left is the table of cheques that have been issued, as per the list maintained in the General Ledger. We’re making the assumption that we’ve dumped that list into an Excel worksheet, formatted it as a table, and given the table the name “GLListing”.

The table on the right is the table of the cheques that have cleared the bank.  Again, the assumption is that we’ve been able to download a list of the transactions, and filtered them down to show just the cheques that have cleared.  (Maybe we’d even use Power Query to do this.)  This table has been named “Bank”

Creating the data Staging queries

The first step is to create the staging queries to connect to these two tables.  One of the important things I wanted to ensure is that I can match transactions where both the cheque number and amount are identical.  (If a cheque clears for the wrong amount, I want to list it as outstanding at this point, as I need to review it.)  I’m going to keep that in mind as I create my staging tables.

The GLListing table:

To set this up I:

  • Clicked inside the GLListing table –> Power Query –> From Table
  • Set the data types on each column (Whole Number, Date, Decimal)
  • Selected the Cheque and Amount columns –> Add Column –> Merge
    • Separator:  Custom (I used a dash)
    • Name:  Issued

The end result in Power Query:

image

  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection

The Bank table:

It’s virtually the identical process:

  • Click inside the Bank table –> Power Query –> From Table
  • Set the data types on each column (Whole Number, Date, Decimal)
  • Selected the Cheque and Amount columns –> Add Column –> Merge
    • Separator:  Custom (I used a dash)
    • Name:  Cleared

The end result in Power Query:

image

  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection

Listing Outstanding Cheques

Now to build the important part.

  • In the Workbook Queries pane, right click the GLListing query—> Reference

At this point you’ll have a pointer to the GLListing table.  We also want a pointer to the Bank table.  To do that, let’s click the fx icon on the formula bar:

image

This will create a new step in your query.  The formula in the formula bar will read =Source (it refers to the previous step), and you’ll see a new step in your Applied Steps area called Custom1.  Let’s update both of those:

  • Change the formula to =Bank
  • Right click and rename the step to “Bank”

image

The key things we now have are a Source step (which contains the output of the GLListing query) and a Bank step (which contains the output of the Bank query).  The Source step has an Issued column, the Bank query a Cleared column, and we’ll like to know which items between those two columns are different.

To work this out we’re going to knock up a little M code.  Here’s how:

  • Click the fx icon on the formula bar
  • Replace the formula (it will read =Bank) with this:

=List.Difference(Source[Issued],Bank[Cleared])

The result will be as follows:

image

So what happened here?  Let’s break this down.

List.Difference generates the items that are different between two provided lists.  And fortunately, when we feed a column to the function, it passes it in as a list.  So that’s what you’re seeing there:

  • Source[Issued] is the Issued column from the Source step of our query
  • Bank[Cleared] is the Cleared column from the Bank step of our query

And the result is the only items that don’t exist in both lists.

Expanding the Details

As great as this is, it is returning a list of values.  We want to convert this back into a table, and get the original data of issue as well.  So let’s do that.

  • Go to Transform –> To Table

image

  • When prompted, select that the list has a Custom Delimiter of a dash and click OK

You should now have a nice table split into two columns:

image

Let’s clean this up:

  • Right click Column1 –> Rename –> Cheque
  • Right click Column2 –> Rename –> Amount

Now, the next tricky part is getting the issue date back in.  I’d like to feed this from the current query – just to keep it self contained – but it’s easier to start by merging it with another query.

  • Go to Home –> Merge Queries –> GLListing
  • Choose to merge based on the Cheque field on both tables
  • Make sure you check to “Only include matching rows”

image

  • Click OK

This works nicely to add our column, but we’ve already pulled this data into this query once, so why reach outside it again?  If you look in the formula bar, you can see that the formula reads as follows:

= Table.NestedJoin(#"Renamed Columns",{"Cheque"},GLListing,{"Cheque"},"NewColumn",JoinKind.Inner)

Highlighted in the middle of the text is the name of the table we merged into this one.  So why not just replace that with the step name from this query that has the same table?  Modify the formula in the formula bar to read:

= Table.NestedJoin(#"Renamed Columns",{"Cheque"},Source,{"Cheque"},"NewColumn",JoinKind.Inner)

It doesn’t look like anything happened, does it?  That’s okay.  Remember that the source step just pulls in the data from the GLListing query.  Since we didn’t do anything to it in that step, it SHOULD look identical.

Now we can continue on and finalize the query:

  • Expand the “NewColumn” column:
    • Only expand the Date column, as we have the others we need
    • Uncheck the “Use original column name as prefix” setting
  • Move the Date column between then Cheque and Amount columns

image

  • Rename the query to “Outstanding”
  • Go to Home –> Close & Load

And the final result:

SNAGHTML1b93e715

Final Thoughts

Figuring out which records match is actually pretty easy.  We simply merge two tables, and choose to only include matching rows.  Working out differences is obviously a bit harder.  (Wouldn’t it be awesome if there was an inverse setting on that merge dialog that let us only include unmatched rows?)

I left my full time controllership job before I ever got the chance to implement this technique for our bank reconciliations.  Currently there is a lot of VBA and manual work needed to clear both the cheques and deposits on a monthly basis.  Given this, however, I know that I could have re-written the bank reconciliation to very quickly eliminate all the records that match, leaving me with only the transactions that I actually needed to focus on.

Taking it even one step further, with another table of adjustments added in to the mix, I’m sure I could build it to actually produce an ever diminishing listing of un-reconciled transactions, and most likely even an output report replicating a full bank reconciliation.  Pretty cool, especially when you consider how much could be refreshed when you start the process next month!

The post Listing Outstanding Cheques appeared first on The Ken Puls (Excelguru) Blog.

Create a Dynamic Calendar Table

$
0
0

I know this topic has been covered before, but I’m teaching a course on Power Pivot tomorrow, and it’s something that I’ll probably be brining up.  As we need a calendar table for our Power Pivot solutions, a method to create a dynamic calendar table is pretty important.  If you haven’t seen this before, I think you’ll be surprised at how easy it is to create a complete calendar driven by only a few Excel formulas and Power Query.

Setting up a dynamic source

The first key we need to do is set up a parameter table in order to hold the start and end date.  To do that, I’m creating a basic parameter table as described in this post.  Mine looks like this:

SNAGHTML3f7266cf

  • B3 is simply a hard coded January 1, 2014
  • B4 contains the =TODAY() function, returning today’s date

Once I created this table, I named it “Parameters” (as described in the aforementioned blog post), then created the fnGetParameter function in Power Query (again, as described in the aforementioned blog post.)

With that work done, it was time to move on to creating my calendar.

How to create a dynamic calendar table in Power Query

What I did at this point was create a new blank Power Query:

  • Power Query –> From Other Sources –> Blank Query

In the formula bar, I created a simple list by typing the following:

={1..10}

SNAGHTML3f7892f2

As you’ll see if you try this, it creates a simple list from 1 through 10.  That’s great, but it’s just a temporary placeholder.  Now I need to get my hands a bit dirty… I want to use the fnGetParameter function to load in my start and end dates as date serial numbers (not actual dates.)  To do this, I’ll retrieve them and explicitly force them to be Numbers.

  • Go to View –> Advanced Editor
  • Insert two new lines as follows:

let
    StartDate = Number.From(fnGetParameter("Start Date")),
EndDate = Number.From(fnGetParameter("End Date")),

Source = {1..10}
in
Source

So, as you can see, we’ve used the fnGetParameter function to retrieve the Start Date and End Date values from the Excel table, then converted them to the date serial numbers (values) by using Number.From.

With that in place, we can then sub the StartDate and EndDate variables into the list that we created in the Source step:

let
StartDate = Number.From(fnGetParameter("Start Date")),
EndDate = Number.From(fnGetParameter("End Date")),
Source = {StartDate..EndDate}
in
Source

And when we click OK, we now have a nice list of numbers that spans from the first date serial number to the last from the Excel table:

SNAGHTML3f810c61

“Great,” you’re thinking, “but I really want dates, not the date serial numbers.”  No problem, lets do that.

First we need to convert our list to a table:

  • Go to Transform –> Convert to Table –> Click OK (with the default options)

image

  • Select Column1 –> Transform –> Data Type –> Date
  • Right click Column1 –> Rename –> Date

And look at that!

SNAGHTML3f84f29c

And now it’s just a matter of adding the different date columns we need.  If you select the Date column, you’ll find a great variety of formats available under Add Column –> Date.  Just browse into the subcategory you want (year, month, day, week) and choose the piece you want to add.  In the table below, I added:

  • Date –> Year –> Year
  • Date –> Month –> Month
  • Date –> Month –> End of Month

SNAGHTML3f87a345

There are a lot of transformations for a variety of dates built in… for numeric or date values.  One thing that’s missing though, is text versions.  For those you need to add a custom column.  Here’s 3 formulas that you may find useful if you want to add text dates to your table:

  • Date.ToText([Date],"ddd")
  • Date.ToText([Date],"MMM")
  • Date.ToText([Date],"MMMM dd, yyyy")

To use them, go to Add Column –> Add Custom Column and provide those as the formula.  Their results add a bit more useful data to our query:

SNAGHTML3f907072

As you can see, they work like Excel’s TEXT function, except that the characters are case sensitive.

Conclusion

Overall, it’s super easy to create a dynamic calendar table using Power Query to read the start and end date from Excel cells.  This makes it very easy to scope your calendar to only have the date range you need, and also gives you the ability to quick add columns on the fly for formats that you discover you need, rather than importing a massive calendar with a ton of formats that you will never use.

In addition to being easy, it’s also lightening quick if you’re prepared.  It takes seconds to create the Excel parameter table, a few more seconds to set up the fnGetParameter function (if you have the code stored in a text file/bookmarked), and only a little while longer to create the original list and plumb in the variables once you’re used to it.  I can knock up a calendar like this in less than two minutes, and let it serve my data model every after.  :)

PowerQuery.Training

I should also mention that this is one of the techniques (amongst MANY others) that we cover in our PowerQuery.Training course.  We’ll be announcing a new intake soon, so don’t forget to sign up for the newsletter in the footer of the site so you’ll know when that happens!

The post Create a Dynamic Calendar Table appeared first on The Ken Puls (Excelguru) Blog.

MultipleUnclassified/Trusted error

$
0
0

If you’ve been using my fnGetParameter function to build dynamic content into your queries, you may have noticed that passing some variables recently started triggering an error message that reads as follows:

We couldn’t refresh the connection ‘Power Query – Categories’. Here the error message we got:

Query ‘Staging-Categories’ (Step ‘Added Index’) used ‘SingleUnclassified/Trusted/CurrentWorkbook/’ data when last evaluated, but now is attempting to use ‘MultipleUnclassified/Trusted’ data.

If you’re like me, that was promptly followed by cursing and head scratching in the pursuit of a solution.  (I would suggest clicking the “was this information helpful?” link in the bottom of the error, and submitting a “no”.

image

As far as I can tell, this error started showing up in Power Query around v 2.21 or so.  (I don’t have an exact version.)  I know that it existed in 2.22 for sure, and still exists in version 2.23.

Triggering the MultipleUnclassified/Trusted error

Triggering the error isn’t difficult, but it does seem to show up out of the blue.  To throw it, I built a solution that uses my fnGetParameter function to read the file path from a cell, and feed it into a query as follows:

let
fPath = fnGetParameter("FilePath"),
    Source = Csv.Document(File.Contents(fPath & "SalesCategories.txt"),null,",",null,1252),
#"First Row as Header" = Table.PromoteHeaders(Source),
#"Changed Type" = Table.TransformColumnTypes(#"First Row as Header",{{"POSPartitionCode", Int64.Type}, {"POSCategoryCode", type text}, {"POSCategoryDescription", type text}, {"POSReportingGroupCode", type text}, {"POSTaxTypeCode", type text}}),
#"Inserted Merged Column" = Table.AddColumn(#"Changed Type", "Lnk_Category", each Text.Combine({Text.From([POSPartitionCode], "en-US"), [POSCategoryCode]}, "-"), type text),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"POSCategoryCode"}),
#"Added Index" = Table.AddIndexColumn(#"Removed Columns", "Index", 0, 1)
in
#"Added Index"

The killer here is that – when you originally build the query – it works just fine!  But when you refresh it you trigger the error.

Fixing the MultipleUnclassified/Trusted error

There are two options to fix this particular error:

Option 1

The first method to fix this problem is to avoid the fnGetParameter function all together and just hard code the file paths.  While this works, you cut all dynamic capability from the query that you went to the effort of implementing.  In my opinion, this option is awful.

Option 2

If you want to preserve the dynamic nature of the fnGetParameter function, the only way to fix this error today is to perform the steps below in this EXACT order!

  1. Turn on Fast Combine (Power Query –> Options –> Privacy –> Ignore Privacy Levels)
  2. Save the workbook
  3. Close Excel
  4. Restart Excel
  5. Refresh the query

Each of the steps above are critical steps to ensure that transient caches are repopulated with the “fast combine” option enabled – merely setting the “fast combine” option is not enough on its own.

Expected Bug Fix

Microsoft is aware of this, and has been working on a fix.  I believe (although no guarantees), that it should be released in version 2.24.  Given the update schedule in the past, we should see it any day now.  Fingers crossed, as this is just killing my solutions!

I’ll keep checking the Power Query download page and post back when the new version goes live.

The post MultipleUnclassified/Trusted error appeared first on The Ken Puls (Excelguru) Blog.

Viewing all 178 articles
Browse latest View live