piątek, 8 grudnia 2017

How to create dynamic OData query in Power BI?

Power BI allows you to import data from any service that exposes OData endpoint. You can either select "OData feed" in the "Get Data" window or go directly to Power Query editor and use OData.Feed function. Although documentation doesn't explain it clearly this function can operate in two different modes. You can pass either URL to the feed or complete OData query. The latter option is particularly interesting as it allows us to build dynamic queries, however, there is a price!

Let's start with a simple example that loads work items from Visual Studio Team Services using recently announced OData endpoint [2]. In the example I load all the work items that have been created in the last 7 days. Please notice that I use DateTimeZone.UtcNow function to get current date, subtract 7 days and convert it to the ISO 8601 format.

let
    since = DateTimeZone.UtcNow() - #duration(7, 0, 0, 0),
    Source = OData.Feed("https://stansw.analytics.visualstudio.com/_odata/WorkItems?" 
        & "$filter=CreatedDate lt " & DateTimeZone.ToText(since, "yyyy-MM-ddThh:mm:ssZ")
        & "&$select=WorkItemId, Title"),
    #"Removed Other Columns" = Table.SelectColumns(Source,{"WorkItemId", "Title"})
in
    #"Removed Other Columns"

This code will work great in Power BI Desktop, but when you publish it, you won't be able to schedule refresh, which defeats the purpose of publishing. Why would you share stale data?

The problem is with string concatenation. Power BI refuses to detect host for dynamic URL, thus, it cannot figure out which credentials should be applied to the dataset.

In order to fix it we need to switch to the other mode and use feed. This might not work for an arbitrarily complex OData query, but it should be sufficient for the majority of real-world queries.

This solution boils down to starting with feed and relying on Power Query feature called "folding" to translate function calls to the operations natively supported by the data source. In our case we are interested in functions Table.SelectRows and Table.SelectColumns which can be folded. Please notice that not every function can be folded! A good example is DateTimeZone.UtcNow which has to be called upfront and only its result can be passed to the filtering expression.

let
    since = DateTimeZone.UtcNow() - #duration(7, 0, 0, 0),
    Source = OData.Feed("https://stansw.analytics.visualstudio.com/_odata"),
    WorkItems_table = Source{[Name="WorkItems", Signature="table"]}[Data],
    #"Filter Rows" = Table.SelectRows(WorkItems_table, each [CreatedDate] < since),
    #"Select Columns" = Table.SelectColumns(#"Filter Rows",{"WorkItemId", "Title"})
in
    #"Select Columns"

During execution, this code will be translated to the following OData query, which is exactly what we wanted.

https://stansw.analytics.visualstudio.com/_odata/WorkItems?
  $filter=CreatedDate lt 2017-12-01T15:54:56.8747859Z
  &$select=WorkItemId,Title

If you publish the dataset to Power BI, the data source will be correctly detected and after you specify credentials you will be able to refresh the data and define a schedule.

I showed how to apply this solution to Visual Studio Team Services feed. It is, however, general and will for your feeds as well.

References

  1. Power Query M function reference - OData.Feed
  2. VSTS Analytics OData now publicly available
  3. Date and time format - ISO 8601

czwartek, 7 grudnia 2017

Syntax highlighting for DAX

I'm not sure if you noticed it, but one of my previous posts had code snippet in DAX and apart from being nicely formatted with Dax Formatter the syntax was highlighed as well. In order to create that snippet I added DAX support to the amazing library highlight.js I use to build my blog.

With this contribution you can format your DAX code as well.

MaxLength =
VAR Separator = "; "
RETURN
    MAXX (
        'Work Items - Today',
        1 + LEN ( [Tags] )
            - LEN ( SUBSTITUTE ( [Tags], Separator, "" ) )
    )

The pull request to the main project is still pending so you might need to cherry-pick my commit and build it.

https://github.com/isagalaev/highlight.js/pull/1560

Apart from adding it to the main project, I've also contributed it to the fork which is used by Microsoft in their documentation pipeline. Using this fork is another option you have to enable this feature in your websites.

https://github.com/DuncanmaMSFT/highlight.js

Hopefully all pull request will be completed soon and everyone will be able to use it by default.

References:

  1. It is not overengineering! - How to filter multivalued column in Power BI?
  2. DAX Formatter
  3. highlight.js