Category: Power Query

Power Query Decathlon – Beginner 02: Importing Data

NOTE: This post was created using the November 2014 version of Power Query. In later versions, some of the screens or functionality could be different.

In this post, we will demonstrate a few of the options available for importing data into Excel with Power Query. There are a lot of options available, many with their own options. An exhaustive examination of all of them is outside the scope of this post and this series. Since a common use for self-service BI tools in Excel is to mash up data from your Enterprise data sources with data from elsewhere, we will focus on a few sources common in that situation.  For now, we will be covering the following:

1. From Table

2. From File (Text File)

3. From Database (From SQL Server Database)

4. From Web

If we are not covering a source you are wanting to learn about, please follow this link to the Office Online documentation from Microsoft. Also, feel free to comment on this post as well. A series that walks through different sources and how to use them is certainly not out of the question.

1. From Table

The From Table source allows you to take data residing in an Excel worksheet and import it into the Excel Data Model where you can easily relate it to other data using Power Pivot.

In this example, we have a list of the Nobel Prize categories along with ID values that have been assigned to them matching our internal database. This list is formatted as a table, but it does not have to be as the From Table source allows for specifying a range of cells as well. Figure 1-1 shows the table of data in Excel.

Figure 1-1


Since the data in this table is a list of Categories, I named the table Category. I highly recommend naming objects appropriately right away. The sooner you give objects the names you want them to have, the easier your life will be (as it relates to that data anyway).

1-1. With a cell in the table selected, click the Power Query tab of the Excel Ribbon, as shown in Figure 1-2.

Figure 1-2


1-2. On the Power Query Ribbon, click the From Table button, as shown in Figure 1-3.

Figure 1-3


This will launch a new query based on that table and open the Query Editor, as shown in Figure 1-4.

Figure 1-4


It’s that simple. Note that the name of the Query matches the name of the Table in Excel. Bonus.

2. From File (Text File)

There are a handful of options for using the From File source. Those options are shown in Figure 2-1.

Figure 2-1


For this example, we will use the From Text option to use a .txt file as a source.

2-1. Click the From Text option as shown in Figure 2-1. This will open a file browser window for selecting the source file as shown in Figure 2-2.

Figure 2-2


We will be using a text file I created listing out the Nobel Prize winners, but you can use any text file you want that has tabular data in it. By “tabular” I mean that a file containing paragraphs of text is not a great choice. Rather, you want something where the information is laid out in columns and rows.

2-2. To select the file and continue, you have a few options.

a. Double-click the file

b. Click the file and hit the <Enter> key on your keyboard

c. Click the file and click the Open button

Whichever option you choose, Power Query will launch a new query and open the Query Editor for it, as shown in Figure 2-3.

Figure 2-3


Notice that the name of the Query is Query1. Unlike the From Table source, the From File source does not, as of the November 2014 release, automatically use the file name as the Query name. Also notice that there are three steps in the APPLIED STEPS pane. The From File source WILL automatically detect (with great accuracy in my experience) whether the first row in the file makes suitable column headers. If so, it will apply that transformation so you do not have to. The Change Type step looked for numeric data and converted any columns with just numbers into an appropriate numeric format.

3. From Database (From SQL Server Database)

Power Query can extract data from many different database systems. The full list is shown in Figure 3-1.

Figure 3-1


In this example, we will pull data from a SQL Server database.

3-1. Click the From SQL Server Database Option. This will launch the dialog shown in Figure 3-2.

Figure 3-2


You will need the name of a SQL Server database engine instance. In my case, I have a default instance running on my laptop, so I can use “localhost” as my server name. You have the option to enter a database name as well. If not, you will have to navigate to the database you want in a future step. You also have the option of entering a Transact SQL query to extract your data. For this example, we will start by just entering the server name and continuing.

3-2. Enter the name of the database server (or localhost if you are working on a machine with the database engine instance you want) and either hit your <Enter> key or click OK. This will launch the window shown in Figure 3-3.

Figure 3-3


Connecting to a SQL Server database requires authentication. The default is to just use the credentials of the user currently logged on. However, if you have specific credentials for SQL Server, you can click the Database option shown in Figure 3-3 and enter that username and password there. In this example, I will use my Windows login so I can just click Save.

3-3. Click Save to use the method you have chosen.

If you are connecting to a SQL Server instance that has not been configured to use encrypted connections, you should receive the message shown in Figure 3-4 the first time you connect to that server.

Figure 3-4


If you do see this message, and are not going to use data that would put privacy at risk, you can click OK and continue. In my case, I am connecting to a database engine instance on my laptop and using public data, so this is not an issue.

If the server you are connecting to has been configured to use encrypted connections, then this message will not appear. Whether you get this message and click OK or you do not get this message at all, you should see a Navigator pane at the right listing the databases on the server, as shown in Figure 3-5.

Note: The instructions for configuring your SQL Server database engine to allow encrypted connections can be found here.

Figure 3-5


On my server, I will use the AdventureWorksDW2014 database. You can use whatever database you like.

3-4. Click the triangle to the left of the database you want to use, as shown by the arrow in Figure 3-5 above. This will expand that node to reveal the tables and views in that database, which, for the AdventureWorksDW2014 database will resemble Figure 3-6.

Figure 3-6


3-5. Hover the cursor over the object of your choice to reveal the Peek for that object. The Peek for the DimDate table is shown in Figure 3-7.

Figure 3-7


3-6. Click the Edit button, shown by the arrow in Figure 3-7, to open the query editor for a new query based on the selected object. A new query based on the DimDate table is shown in
Figure 3-8.

Figure 3-8


Notice that Power Query automatically used the name of the source object as the name of query. Cool. You can also see that there are two query steps in this query so far, the Source and the Navigation.

Clicking on the gear icon to the right of the Source in the APPLIED STEPS will show the server connection screen shown above in Figure 3-2 and repeated here as Figure 3-9.

Figure 3-9


Clicking the gear icon to the right of the Navigation step will reveal the object explorer from
Figure 3-6 repeated here as Figure 3-10.

Figure 3-10


Steps with gears are really helpful as they allow you to see the specific choices that were made in that step. Many steps perform simple enough actions that a gear icon is not necessary.

4. From Web

Before I get into this one, I think a quick note is in order. I used this as the foundation for my PASS Summit 2014 presentation, Power Query: Data Chemistry for the Masses. Less than two weeks before Summit, I followed (largely by accident) the following steps to consume a API using JSON,  without really knowing how to explain what JSON even was, with ZERO code. I was amazed. Right then and there, I revamped my demo to have this as a foundation. I had the great pleasure of sharing my joy about this with members of the Power Query team during Summit. That was awesome to let people who work extremely hard to produce great products know that their work is appreciated. I recommend you do that same if you get the opportunity.

NOTE: Revamping a demo really close to when the presentation is scheduled is not a great idea, particularly for an inexperienced speaker. If I may borrow a line from Charlie in Top Gun when analyzing flight footage of Maverick performing a split S, “The encounter was a victory, but I think we have shown it as an example of what not to do.”

The From Web source is useful when we already have a URL for a web page from which we want to pull data. It is through this source that we can utilize data sources that are not in the list of sites indexed for use by the Online Search source. The list of public data sources used by Online Search can be found here.

The following is the URL for the API from the Nobel Prize website allowing us to pull information directly from the authority on this data. The URL will communicate with the API using JSON (Java Script Object Notation). Feel free to follow the link in your browser and see the raw data.

4-1. Click the From Web button on the Power Query Ribbon, shown with the arrow in Figure 4-1.

Figure 4-1


This will open the From Web dialog shown in Figure 4-2.

Figure 4-2


4-2. Enter the URL as shown in Figure 4-2 then hit the <Enter> key or click OK. This will establish a new query using that data returned as a source, as shown in Figure 4-3.

Figure 4-3


I know what you’re thinking. “Mark, dude, what the heck kind of dataset is that?” JSON is hierarchical. What we see right here is just the top level of the hierarchy, ie the List itself.

4-3. Click the Into Table button shown with the arrow in Figure 4-3. This will convert that JSON List object into a Table for easier use in Power Query and result in what is shown in Figure 4-4.

Figure 4-4


That’s not much better. However, we now see a handy little “Expand” button, shown by the arrow in Figure 4-4. This will “expand” our list items into Records to make them easier to use.

4-4. Click the Expand button. This will reveal the Records, as shown in Figure 4-5.

Figure 4-5


OK. We’re getting there.

4-5. Click the Expand button again to expand another level. This will produce a new pane as shown in Figure 4-6.

Figure 4-6


There we go. That will give us far more useful columns than we have seen thus far. If we click OK with the “Use original column name as prefix” option checked, each column name will be prefaced with “Value.” As such, the firstname column would be imported as Value.firstname. As a matter of personal preference, I prefer NOT to have that happen here, even though I do appreciate having the option. So, for this example, we will uncheck that option.

4-6. Click the “Use original column name as prefix” checkbox, shown by the arrow in Figure 4-6, to deselect it.

4-7. Hit your <Enter> key or click OK. This will produce what is shown in Figure 4-7.

Figure 4-7


We are starting to see some meaningful data now. We have a single row for each Nobel Laureate.

4-8. Scroll over all the way to the right to see the columns that are currently on visible. The result should resemble Figure 4-8.

Figure 4-8


We see our friendly Expand button again. The reason there is another level here is that some Laureates have one more than one Nobel Prize. Imagine that. “Oh, you only have ONE Nobel Prize? How cute.”

4-9. Click the Expand button, resulting in a separate record for each awarding of a Nobel Prize. The result should resemble Figure 4-9.

Figure 4-9


Now we have a row for each awarding of each prize. Grand.

4-10. Click the Expand button again to reveal additional columns about each awarding of each prize, as shown in Figure 4-10.

Figure 4-10


Well, how cool is that? Power Query remembered our preference from last time and left the “Use original column name as prefix” deselected.

4-11 Click OK of hit the <Enter> key to add the new columns to our dataset. The result is shown in Figure 4-11.

Figure 4-11


We now have all the columns we need. We could expand the affiliations column but I think we have made the point here.

Just to recap, we just consumed a JSON API using just a few clicks of the mouse and ZERO code. That’s right. We didn’t write a single line of code ourselves. That is just awesome in my humble opinion. We will be using the dataset create in steps 4-1 through 4-11 for much of this series, so if you want to save this by clicking the Close & Load button, shown by the arrow in Figure 4-11, you can do so. You can then save the Excel file and the query will be saved as well.

For now, we will move on to managing columns and rows.

Power Query Decathlon – Beginner 01: Installation and Tour

NOTE: This post was created using the November 2014 version of Power Query. In later versions, some of the screens or functionality could be different.

Power Query, a free add-in for Excel 2010 and Excel 2013 is easily installed. Just search for “power query download” in your search engine of choice. Or click on Figure 1.

Figure 1


It is important that you download the version that matches your version of Excel, 32-bit or 64-bit.

Figure 2


To determine which version of Office you are running, see this link.

The installation wizard will walk you through the process. The steps are clear so I will not go into it here. Once the installation is complete, there will be a new option for you in Excel.

Figure 3


Clicking this option will activate the Power Query ribbon. This is where the journey of working with Power Query begins.

Figure 4


Figure 5 shows the basic options for sourcing data. We will go into these options, and demonstrate a few, and the next post in this series.

Figure 5


Figure 6

SNAGHTML64ad9eThe options in Figure 6 are used for combining queries that already exist, Merge and Append. We will cover those later in the series as well.






Figure 7


The option shown in Figure 7, Workbook Queries, is used to turn the Workbook Queries pane on or off.






Turning it on allows you see the queries that already exist in the workbook, as shown in Figure 8.

Figure 8


Once you have started a query by choosing a data source, or you have opened an existing query, you will have access to the Query Editor, as shown in Figure 9.

Figure 9


The sections of the Query Editor are as follows (corresponding to the labels in Figure 9):

1. Query Editor Toolbar/Ribbon: This is where we find the various buttons we can use for taking actions on our data. For a while, it was just called the Query Editor Toolbar. Lately, I have been seeing documentation from Microsoft referring to it as the Query Editor Ribbon. I prefer to call it the Ribbon since the similar structure in the rest of Office is called the Ribbon.

2. Formula Bar: All actions taken in the Power Query interface actually generate code using the Power Query Formula Language, known informally as M. In the Formula Bar, you can see the M code the currently selected query step. This bar is turned off by default. When you are first getting started, leaving it off is just fine. Once you feel ready to start learning more about M, then turning it on can help you with that by showing you the M code for each step you take.

3. Results Pane: This window shows you what your data looks like AFTER the execution of the currently selected query step.

4. Query Settings: This window holds a few properties of the query as a whole. The first the name of the query. I encourage you to choose an appropriate name for each query you make. Not only will it make more sense when you or someone else looks at it later, but it will also determine the name of the resulting table that gets created once you have loaded the results of the query to Excel.

This concludes this first post on the series. In the next post, we will start putting Power Query to use for importing data.

Power Query Decathlon – Beginner: Introduction

Power Query is an add-in for Excel that helps us easily import data from many different sources into Excel. During that process, we have the option to create new columns, change existing ones, filter out rows, and generally shape the data before it arrives in Excel. This is different from the capabilities we have had in Power Pivot as Power Query allows us to work with data BEFORE it arrives in our data model. With Power Pivot, you must import the data before you can do anything else with it. I don’t mean this as a dig against Power Pivot. Power Pivot is an amazing tool and I absolutely love it. Power Query merely fills a gap in the user story. If you are familiar with SQL Server Integration Services, you can think of Power Query as the End User version. That is not totally fair to Power Query, but if that helps with your foundation, then go for it.

A Power Query query is a job that imports data from one or more sources, optionally modifies or shapes that data, then loads the results to Excel. A query is composed of Steps arranged in a specific order. Each action you take in Power Query will produce steps. Sometimes multiple similar actions will be combined into a single step. Occasionally, a single action will result in multiple steps.

This series will go through various capabilities of Power Query and show how to use some of them. As this is meant for beginners, I will not be covering the Power Query Formula Language (informally known as “M”). That is the language behind the scenes in Power Query. Everything you do in the tool generates M code that will be executed against your data. M is tremendously powerful and takes Power Query to exciting places. However, I think that is best left for a deeper dive. Since the vast majority of what people will need to do with Power Query is possible directly in the user interface provided, we will focus there.

Being a Decathlon, this series will have 10 installments:

Power Query Decathlon – Beginner 01: Installation and Tour

Power Query Decathlon – Beginner 02: Importing Data

Power Query Decathlon – Beginner 03: Managing Columns and Rows

Power Query Decathlon – Beginner 04: General Transformations

Power Query Decathlon – Beginner 05: Text Columns

Power Query Decathlon – Beginner 06: Number Columns

Power Query Decathlon – Beginner 07: Date Columns

Power Query Decathlon – Beginner 08: Error Handling

Power Query Decathlon – Beginner 09: Loading Results

Power Query Decathlon – Beginner 10: Query Management

By the end of this series, you should have a solid foundation with which to start working with Power Query.

Note: At the time of this writing, automating the refresh of data in Power Query is only possible if you have a Power BI subscription. As such, it is difficult to consider Power Query to be ready as an Enterprise resource. It is still a great tool for end users and for targeted solutions where total automation is not necessary. That said, it is safe to assume that Microsoft is aware of this limitation and we will likely see things improve as Power Query matures as a product.

Power Query is not bound to the development cycle of any other tool. As such, it is updated pretty frequently, sometimes monthly. This series was put together using the November 2014 version of Power Query. In other versions, screens and functionality could be different (likely better).

Power to the Power of Power

2886083112_68c27e8a60_oI love Power tools. My father was a carpenter in his spare time and used a table saw, drill press, band saw, etc to make furniture and novelty products like a pencil holder shaped like the head of E.T. (I drew that design when I was kid). He sold quite a few of those. Times change. My power tools are quite different from his. I use Power Query, Power Pivot, Power View, and Power BI. However, the names of those tools don’t relate as directly to their function as my dad’s tools did. For most people, when you mention a table saw, they have a pretty solid idea that it is used for cutting. And a drill press, well, that puts holes in things. That’s pretty clear, too. From talking with other people in the Data world, particularly some of my friends on the DBA side of the house, it is just not clear what the Power tools are and what each one is for. So, my goal with this post is to help those folks out.

In the Excel BI stack, sometimes called the Power BI stack, we have a number of power tools:

  • Power Query
  • Power Pivot
  • Power View
  • Power Map
  • Power BI
  • Power Azure (Just kidding… This one does not exist. I just couldn’t resist.)

Let’s take a brief look at each one.

Power Query

Power Query is an add-in for Excel (both 2010 and 2013) that provides the ability to not only bring data into Excel from a number of sources, but also the ability to shape it, clean it up, and do some modeling on the way. Many people (myself included) have defined it as a Self-Service ETL (Extract, Transform, and Load) tool, a much simpler to use SQL Server Integration Services. And it is that. But it is so much more. I feel like that description just does not capture the power (can’t help it) of this fabulous tool. When all is said and done, Power Query is like a machine that takes data in on one side, does stuff to it, and spits it out the other end into Excel.

Power Pivot

Power Pivot was first released as a free add-in for Excel 2010. It was baked right into Excel 2013. Power Pivot lets you build an Analysis Services style data model IN Excel. You can augment that model with DAX (Data Analysis Expressions) to do powerful (I’m at it again, aren’t I?) calculations. It is Power Pivot that lets you drag join columns together to create relationships between tables in your model, thereby eliminating the need for doing VLOOKUP expressions. Power Pivot uses the xVelocity column-store engine to hold the data in the model you design. The data in Power Pivot is extracted from several different data sources (or sent to the model via Power Query). Power Pivot is a container that you design and fill with data. In addition to providing a means of self-service BI, Power Pivot is a fantastic tool for creating prototypes of larger BI solutions. You can put something in front of users in a few hours and make sure you are on the right track with what they are looking for. Just awesome.

Power View

Power View is a data visualization tool that is meant for end users. It is VERY simple to use. It was first release as a web part in SharePoint 2010 and later baked directly into Excel 2013. A key design tenet of Power View was “Two click to ROI.” If it takes more than two clicks to use a feature, it’s probably not worth it. This is a huge contrast from tools like Report Designer (in Business Intelligence Development Studio and SQL Server Data Tools) and Report Builder. Both of those tools have lots of menus within menus within menus. It can take a dozen clicks or more to do one thing. These tools provide much more functionality than Power View, but that functionality adds complexity. Power View is designed to be simple. A key feature of Power View is the cross filtering that happens automatically. Sometimes Power View gets a bad wrap because it is not as robust are Reporting Services; but it is not meant to be. It was not meant to replace (and does not replace) existing tools like Reporting Services. It complements them and is a great tool when used appropriately.

Power Map

Power Map is a tool for creating visualizations of geo-spatial data. While Power View supports creating maps, Power Map supports creating 3-dimensional maps. You can record the use of your maps as short videos called Scenes and play them back. While Power Map looks really cool and provides great eye candy, I must confess I have not played with it all that much.

Power BI

Power BI is an add-on service for Office 365. It provides a specially configured SharePoint Online site that enables some very exciting functionality. Once you have created a data model in Excel with Power Pivot and optionally created Power View reports against it, you can upload that file to a Power BI site to share it with others within and outside of your organization. An exciting feature of Power BI is called Q & A. It lets you use natural language to query your data and get results formatted in a appropriate visualization. If you have sales data up in Power BI and ask for sales over time, if will put your results into a line chart in order to highlight the trend you are likely seeking. The licensing of Power BI is currently very confusing. I can assure you that Microsoft has heard the feedback on this.

So, to sum up VERY simply:

Power Query extracts, transforms, and loads data into Excel.

Power Pivot models and stores data in Excel.

Power View and Power Map visualize data in Excel (Power View is available in SharePoint as well, though).

Power BI shares what you created with the tools above and adds other cool functionality, too.

I hope this helps clarify things a bit. My goal is to do a blog series on each of these tools. I never thought I would say this, but, any day I get to work with data in Excel is a good day for me. These tools provide so much capability for helping people make better use of what they already have. These tools certainly don’t replace the Enterprise BI tools like SQL Server Analysis services. We will need that for quite some time to come. But they do offer an option for enabling better decisions with a much lower barrier to entry than learning to develop cubes and write MDX. Microsoft Senior Technical Fellow Amir Netz has said many times, “BI doesn’t just have to mean Business Intelligence, it can also mean Basic Intelligence.” So much of BI is about making things simpler. These tools do that and more.

Power BI Demo Contest Entry

Behold! I hereby present my entry into the Power BI Demo Contest! I am really pumped about this set of tools and hope this demo helps show off what Power BI can do.

You can view it here on my YouTube channel.

Getting a prize would be cool, but I have to say the fun I had making this video and learning more about Power BI was awesome.