Updated MDX Trek: First Contact Downloads

Greetings. After delivering my MDX Trek: First Contact presentation as part of the Pragmatic Works Free Training series on 3/11, I got some great feedback from an attendee. He pointed out that my single zip file download on my home page for the presentation only contained the SQL Server 2008 R2 version and that I may want to include upgrade instructions for people that have SQL 2012. That was a great point. I have neglected to do much with that download, even after I started delivering this presentation in the SQL Server 2012 tools some time ago. While the MDX syntax is the same, the project would have to be upgraded to be opened in SQL Server Data Tools as the existing zip was in BI Studio. This would require some extra steps and create more work for the target audience (which includes people just getting started with SSAS).

So, to rectify this situation, the home page for my MDX Trek: First Contact presentation now has separate downloads for SQL 2008 R2 and SQL 2012. You can get there now by clicking on the image below. I should have done this a long time ago and apologize for being so late.

Thanks.

image

Introduction To Analysis Services Extended Events

I started digging into using Extended Events to trace Analysis Services recently for a client. They wanted to do some tracing of their SSAS instances, and with the deprecation of SQL Profiler, Extended Events was the best long term solution.

I have to admit, when I first started looking at this topic, I was overwhelmed. Other than a few blog posts, which I will list out below, there was very little to go on. I believe, on the whole, SQL Server Books Online (MSDN, TechNet, etc.) has pretty solid content. But for using Extended Events on Analysis Services, I have to agree with Chris Webb (Blog|Twitter) that BOL provides little value. Note: Although the examples I have seen in the wild, as well as my example below, have used SSAS Multidimensional, I implemented this for SSAS Tabular at my client. So, it works for both.

I will not be advising you on what events to trace for different purposes. I am afraid that is beyond the scope of this post and not something I have deep knowledge about at this point.

In researching this topic, I used the following blog posts:

Chris Webb (Blog|Twitter) – Using XEvents In SSAS 2012

Bill Anton (Blog|Twitter) – Extended Events For Analysis Services

Andreas Wolter (Blog|Twitter) – Tracing Analysis Services (SSAS) with Extended Events – Yes it works and this is how

Francesco De Chirico (Blog|Twitter) – Identify Storage Engine and Formula Engine bottlenecks with new SSAS XEvents

These posts were all helpful in one way or another. In some cases, I used a post as the source upon which I based the queries I used. When that is the case, I will make it clear where my base code came from. I do this because I am a vehement supporter of giving credit where it is due.

Extended Events for Analysis Services, unlike that for the database engine, lacks a graphical user interface. You have to work in code. Not only that, but the code happens to be XMLA. Yikes. I know there are people who are good with XMLA, but I am not among them. That was part of what gave me trepidation as I started down the path of working with Extended Events for SSAS.

For the CREATE script for my Extended Events trace, I turned to Bill Anton’s blog post listed above. That script not only includes the base syntax, but he also includes every event (I think it is all of them anyway) commented out. This allowed me to just uncomment the events I wanted to trace, but leave the others intact for easy use later. For this script, make sure you are connected to an SSAS instance in Management Studio, not Database Engine. Also, you will ideally be in an XMLA query window; I was able to run this code in an MDX window as well, but my examples below will assume an XMLA window.

Note: In XMLA, lines beginning with <!-- and ending with --> are comments.

   1:  <!-- This script supplied by Bill Anton http://byobi.com/blog/2013/06/extended-events-for-analysis-services/ -->
   2:  
   3:  <Create
   4:      xmlns="http://schemas.microsoft.com/analysisservices/2003/engine"
   5:      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   6:      xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"
   7:      xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"
   8:      xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100"
   9:      xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200"
  10:      xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300">
  11:      <ObjectDefinition>
  12:          <Trace>
  13:              <ID>MyTrace</ID>
  14:              <!--Example: <ID>QueryTuning_20130624</ID>-->
  15:              <Name>MyTrace</Name>
  16:              <!--Example: <Name>QueryTuning_20130624</Name>-->
  17:              <ddl300_300:XEvent>
  18:                  <event_session    name="xeas"
  19:                                  dispatchLatency="1"
  20:                                  maxEventSize="4"
  21:                                  maxMemory="4"
  22:                                  memoryPartitionMode="none"
  23:                                  eventRetentionMode="allowSingleEventLoss"
  24:                                  trackCausality="true">
  25:  
  26:                      <!-- ### COMMAND EVENTS ### -->
  27:                      <!--<event package="AS" name="CommandBegin" />-->
  28:                      <!--<event package="AS" name="CommandEnd" />-->
  29:  
  30:                      <!-- ### DISCOVER EVENTS ### -->
  31:                      <!--<event package="AS" name="DiscoverBegin" />-->
  32:                      <!--<event package="AS" name="DiscoverEnd" />-->
  33:  
  34:                      <!-- ### DISCOVER SERVER STATE EVENTS ### -->
  35:                      <!--<event package="AS" name="ServerStateDiscoverBegin" />-->
  36:                      <!--<event package="AS" name="ServerStateDiscoverEnd" />-->
  37:  
  38:                      <!-- ### ERRORS AND WARNING ### -->
  39:                      <!--<event package="AS" name="Error" />-->
  40:  
  41:                      <!-- ### FILE LOAD AND SAVE ### -->
  42:                      <!--<event package="AS" name="FileLoadBegin" />-->
  43:                      <!--<event package="AS" name="FileLoadEnd" />-->
  44:                      <!--<event package="AS" name="FileSaveBegin" />-->
  45:                      <!--<event package="AS" name="FileSaveEnd" />-->
  46:                      <!--<event package="AS" name="PageInBegin" />-->
  47:                      <!--<event package="AS" name="PageInEnd" />-->
  48:                      <!--<event package="AS" name="PageOutBegin" />-->
  49:                      <!--<event package="AS" name="PageOutEnd" />-->
  50:  
  51:                      <!-- ### LOCKS ### -->
  52:                      <!--<event package="AS" name="Deadlock" />-->
  53:                      <!--<event package="AS" name="LockAcquired" />-->
  54:                      <!--<event package="AS" name="LockReleased" />-->
  55:                      <!--<event package="AS" name="LockTimeout" />-->
  56:                      <!--<event package="AS" name="LockWaiting" />-->
  57:  
  58:                      <!-- ### NOTIFICATION EVENTS ### -->
  59:                      <!--<event package="AS" name="Notification" />-->
  60:                      <!--<event package="AS" name="UserDefined" />-->
  61:  
  62:                      <!-- ### PROGRESS REPORTS ### -->
  63:                      <!--<event package="AS" name="ProgressReportBegin" />-->
  64:                      <!--<event package="AS" name="ProgressReportCurrent" />-->
  65:                      <!--<event package="AS" name="ProgressReportEnd" />-->
  66:                      <!--<event package="AS" name="ProgressReportError" />-->
  67:  
  68:                      <!-- ### QUERY EVENTS ### -->
  69:                      <!--<event package="AS" name="QueryBegin" />-->
  70:                      <event package="AS" name="QueryEnd" />
  71:  
  72:                      <!-- ### QUERY PROCESSING ### -->
  73:                      <!--<event package="AS" name="CalculateNonEmptyBegin" />-->
  74:                      <!--<event package="AS" name="CalculateNonEmptyCurrent" />-->
  75:                      <!--<event package="AS" name="CalculateNonEmptyEnd" />-->
  76:                      <!--<event package="AS" name="CalculationEvaluation" />-->
  77:                      <!--<event package="AS" name="CalculationEvaluationDetailedInformation" />-->
  78:                      <!--<event package="AS" name="DaxQueryPlan" />-->
  79:                      <!--<event package="AS" name="DirectQueryBegin" />-->
  80:                      <!--<event package="AS" name="DirectQueryEnd" />-->
  81:                      <!--<event package="AS" name="ExecuteMDXScriptBegin" />-->
  82:                      <!--<event package="AS" name="ExecuteMDXScriptCurrent" />-->
  83:                      <!--<event package="AS" name="ExecuteMDXScriptEnd" />-->
  84:                      <!--<event package="AS" name="GetDataFromAggregation" />-->
  85:                      <!--<event package="AS" name="GetDataFromCache" />-->
  86:                      <!--<event package="AS" name="QueryCubeBegin" />-->
  87:                      <!--<event package="AS" name="QueryCubeEnd" />-->
  88:                      <!--<event package="AS" name="QueryDimension" />-->
  89:                      <!--<event package="AS" name="QuerySubcube" />-->
  90:                      <!--<event package="AS" name="ResourceUsage" />-->
  91:                      <!--<event package="AS" name="QuerySubcubeVerbose" />-->
  92:                      <!--<event package="AS" name="SerializeResultsBegin" />-->
  93:                      <!--<event package="AS" name="SerializeResultsCurrent" />-->
  94:                      <!--<event package="AS" name="SerializeResultsEnd" />-->
  95:                      <!--<event package="AS" name="VertiPaqSEQueryBegin" />-->
  96:                      <!--<event package="AS" name="VertiPaqSEQueryCacheMatch" />-->
  97:                      <!--<event package="AS" name="VertiPaqSEQueryEnd" />-->
  98:  
  99:                      <!-- ### SECURITY AUDIT ### -->
 100:                      <!--<event package="AS" name="AuditAdminOperationsEvent" />-->
 101:                      <event package="AS" name="AuditLogin" />
 102:                      <!--<event package="AS" name="AuditLogout" />-->
 103:                      <!--<event package="AS" name="AuditObjectPermissionEvent" />-->
 104:                      <!--<event package="AS" name="AuditServerStartsAndStops" />-->
 105:  
 106:                      <!-- ### SESSION EVENTS ### -->
 107:                      <!--<event package="AS" name="ExistingConnection" />-->
 108:                      <!--<event package="AS" name="ExistingSession" />-->
 109:                      <!--<event package="AS" name="SessionInitialize" />-->
 110:  
 111:  
 112:                      <target package="Package0" name="event_file">
 113:                          <!-- Make sure SSAS instance Service Account can write to this location -->
 114:                          <parameter name="filename" value="C:\SSASExtendedEvents\MyTrace.xel" />
 115:                          <!--Example: <parameter name="filename" value="C:\Program Files\Microsoft SQL Server\MSAS11.SSAS_MD\OLAP\Log\trace_results.xel" />-->
 116:                      </target>
 117:                  </event_session>
 118:              </ddl300_300:XEvent>
 119:          </Trace>
 120:      </ObjectDefinition>
 121:  </Create>

You can download a version of this script without line numbers here.

I modified Bill’s original script for my own purposes in a few places.

I used my own Trace ID and Trace Name in lines 13 and 15 respectively.

  12:          <Trace>
  13:              <ID>MyTrace</ID>
  14:              <!--Example: <ID>QueryTuning_20130624</ID>-->
  15:              <Name>MyTrace</Name>
  16:              <!--Example: <Name>QueryTuning_20130624</Name>-->

I uncommented the Query End event on line 70 as well as the AuditLogin event on line 101 since those were the events I wanted to trace, to keep things simple.

  70:                      <event package="AS" name="QueryEnd" />
 101:                      <event package="AS" name="AuditLogin" />

I put my own output file path on line 114.

 114:                          <parameter name="filename" value="C:\SSASExtendedEvents\MyTrace.xel" />

I also added a comment on line 113.

 113:                          <!-- Make sure SSAS instance Service Account can write to this location -->

I did this because I tripped over this myself. I initially got an Access Denied message when running the script above. Once I gave my SSAS instance service account rights to modify the C:\SSASExtendedEvents folder, I was good to go and the trace started just fine.

When you execute the query, your Results pane should look like the screenshot below. This indicates success. Gotta love XMLA, huh?

image

You can verify your Extended Events trace is running by executing the following query in an MDX query window connected to the same instance in which you started the trace. The query below is in all of the blog posts referenced above.

SELECT * FROM $system.discover_traces

 

My results for this query looked like this:

image

Note the line highlighted in the red rectangle indicates “MyTrace” and the type is XEvent. Huzzah! You can also take a look at the destination folder specified for your output file. In my case, that is C:\SSASExtendedEvents, shown below.

image

There are two files here because I kept the output file from a test run earlier. I did that to show you that the function I will use to import this information into a tabular form in the database engine can iterate over multiple files easily. You will note that the engine added lots of numbers to the filename. I have not run this long enough to spill over into multiple files, but I am assuming the _0_ would refer to the first file in a tracing session. As in, the next file would have the same name, but with _1_, the next file _2_, and so on. But, that is just a guess. The long string of numbers after that seems to just be there to make sure the trace file name is unique.

OK. So, we have an Extended Events trace running. Now what? Well, let’s run some queries. In my case, I just ran some of the MDX queries from my MDX Trek: First Contact presentation. The queries themselves don’t really matter. Just query a database in your SSAS instance in some fashion.

Reading from the xel file in code (as opposed to manually in Management Studio) involves one of two processes I am aware of.

1. The sys.fn_xe_file_target_read_file function followed by shredding some XML. This function was mentioned by Bill Anton and Francesco De Chirico in their posts.

2. Jonathan Kehayias (Blog|Twitter) mentioned to me, on Twitter, the use of the QueryableXEventData class via .Net code. He stressed that this is his preferred method as it is much faster than using the sys.fn_xe_file_target_read_file function and then the XML shredding.

Trusting Jonathan on Extended Events, among many other topics, is a good idea. However, not being a .Net person, and wanting to post this while it is fresh in my mind, I am going to demonstrate the first method. I did find that method 1 is not ultra speedy, to be sure. But for the moment, even at my client, it will serve. I do intend to dig into the .Net and perhaps blog that when I do. 🙂

In Francesco De Chirico’s post, he not only discusses the use of the sys.fn_xe_file_target_read_file function to read in the xel files, but also provides great examples of the XML shredding. XML and I have an understanding: we both understand that I am horrible at XML. 🙂 So, the fact that Francesco provided the XML shredding syntax was a great find for me.

   1:  /****
   2:  Base query provided by Francesco De Chirico
   3:  http://francescodechirico.wordpress.com/2012/08/03/identify-storage-engine-and-formula-engine-bottlenecks-with-new-ssas-xevents-5/
   4:  
   5:  ****/
   6:  
   7:  SELECT
   8:        xe.TraceFileName
   9:      , xe.TraceEvent
  10:      , xe.EventDataXML.value('(/event/data[@name="EventSubclass"]/value)[1]','int') AS EventSubclass
  11:      , xe.EventDataXML.value('(/event/data[@name="ServerName"]/value)[1]','varchar(50)') AS ServerName
  12:      , xe.EventDataXML.value('(/event/data[@name="DatabaseName"]/value)[1]','varchar(50)') AS DatabaseName
  13:      , xe.EventDataXML.value('(/event/data[@name="NTUserName"]/value)[1]','varchar(50)') AS NTUserName
  14:      , xe.EventDataXML.value('(/event/data[@name="ConnectionID"]/value)[1]','int') AS ConnectionID
  15:      , xe.EventDataXML.value('(/event/data[@name="StartTime"]/value)[1]','datetime') AS StartTime
  16:      , xe.EventDataXML.value('(/event/data[@name="EndTime"]/value)[1]','datetime') AS EndTime
  17:      , xe.EventDataXML.value('(/event/data[@name="Duration"]/value)[1]','bigint') AS Duration
  18:      , xe.EventDataXML.value('(/event/data[@name="TextData"]/value)[1]','varchar(max)') AS TextData
  19:  FROM
  20:  (
  21:  SELECT
  22:        [FILE_NAME] AS TraceFileName
  23:      , OBJECT_NAME AS TraceEvent
  24:      , CONVERT(XML,Event_data) AS EventDataXML
  25:  FROM sys.fn_xe_file_target_read_file ( 'C:\SSASExtendedEvents\MyTrace*.xel', null, null, null )
  26:  ) xe
  27:  

 

In line 25, note that the file target indicates MyTrace*.xel. This is because the latter part of the file name(s) will not necessarily be known. The MyTrace*.xel tells the function to iterate over all files matching that spec. Thus, when I run this query, it will pull the data from both of the files shown earlier in my C:\SSASExtendedEvents folder.

In line 24, we are converting the Event_Data column, which the function returns as an nvarchar(max), into XML to enable us to use the value() method.

Please note that I am not pulling all of the information available in the xel file. I am just pulling the fields I cared about for my purposes. There is more in there. And that will vary depending on the events you choose when creating the trace.
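One way to see what else is in there is to list the distinct field names captured for each event. The query below is a minimal sketch of that idea and is not part of Francesco's original code. It assumes the same C:\SSASExtendedEvents\MyTrace*.xel file spec used above, and that every field is stored as a /event/data element with a name attribute, which is the shape the shredding query above relies on.

```sql
-- Sketch: list the field names present in the trace files, per event.
-- Assumes the file path used earlier in this post.
SELECT DISTINCT
      xe.TraceEvent
    , d.n.value('@name', 'varchar(100)') AS FieldName
FROM
(
    SELECT
          [object_name] AS TraceEvent
        , CONVERT(XML, event_data) AS EventDataXML
    FROM sys.fn_xe_file_target_read_file('C:\SSASExtendedEvents\MyTrace*.xel', NULL, NULL, NULL)
) xe
-- nodes() returns one row per <data> element inside each event
CROSS APPLY xe.EventDataXML.nodes('/event/data') AS d(n)
ORDER BY
      xe.TraceEvent
    , FieldName;
```

Any FieldName returned here can be added to the main query by copying one of the existing value() lines and swapping in the new name and a suitable data type.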

When I run this query, I get the following:

image

I can use this, like I did at my client, to insert into a SQL Server database table for later analysis. We are actually planning a Tabular model on this data to help track usage of their BI offerings in their organization. That will be fun to play with.
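Loading the shredded rows into a table could look something like the sketch below. The table name dbo.SSASTraceEvents and the column sizes are hypothetical, chosen only for illustration. The SELECT is a trimmed-down version of the query above, and it loads every event in the files without filtering.

```sql
-- Hypothetical destination table; the name and column sizes are assumptions.
CREATE TABLE dbo.SSASTraceEvents
(
      TraceFileName nvarchar(260) NOT NULL
    , TraceEvent    nvarchar(256) NOT NULL
    , DatabaseName  varchar(50)   NULL
    , NTUserName    varchar(50)   NULL
    , StartTime     datetime      NULL
    , EndTime       datetime      NULL
    , Duration      bigint        NULL
    , TextData      varchar(max)  NULL
);

-- Shred the xel files and load the rows in one statement.
INSERT INTO dbo.SSASTraceEvents
    (TraceFileName, TraceEvent, DatabaseName, NTUserName, StartTime, EndTime, Duration, TextData)
SELECT
      xe.TraceFileName
    , xe.TraceEvent
    , xe.EventDataXML.value('(/event/data[@name="DatabaseName"]/value)[1]','varchar(50)')
    , xe.EventDataXML.value('(/event/data[@name="NTUserName"]/value)[1]','varchar(50)')
    , xe.EventDataXML.value('(/event/data[@name="StartTime"]/value)[1]','datetime')
    , xe.EventDataXML.value('(/event/data[@name="EndTime"]/value)[1]','datetime')
    , xe.EventDataXML.value('(/event/data[@name="Duration"]/value)[1]','bigint')
    , xe.EventDataXML.value('(/event/data[@name="TextData"]/value)[1]','varchar(max)')
FROM
(
    SELECT
          [file_name] AS TraceFileName
        , [object_name] AS TraceEvent
        , CONVERT(XML, event_data) AS EventDataXML
    FROM sys.fn_xe_file_target_read_file('C:\SSASExtendedEvents\MyTrace*.xel', NULL, NULL, NULL)
) xe;
```

Run as written, repeated loads would insert the same events again, so a real process would need some way to skip files or rows it has already loaded.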

Once you are ready to stop the trace, execute the following XMLA:

   1:  <Delete xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
   2:      <Object>
   3:          <TraceID>MyTrace</TraceID>
   4:      </Object>
   5:  </Delete>

 

That’s about it. I hope this proves to be a meaningful addition to what is available for working with Extended Events on Analysis Services. It was certainly a great learning experience for me.

“Winning” The Power BI Demo Contest

First things first. According to the official rules, I did not win. My video did not even make it to the Top 15 Semi-Finalists. Not even close. The number of votes I got was laughable compared to others. But it was never about the votes for me. I never really had any illusions of winning the contest. However, this contest had #winning all over the place for me anyway. I shall explain.

#winning : I got to play with some really exciting tools. From Power Query to Power Pivot to Power View to the Power BI Team Site I played with for my demo, I had a total blast.

#winning : Holy crap is Power Query awesome! Even the base options in the tool’s ribbon make common things really easy. I only dabbled a tiny bit with M (Officially: The Power Query Formula Language), but that was really cool as well. I will certainly be delving more deeply into Power Query and M.

#winning : I went from never having done a video to recording and editing a video I can be proud of. I used Camtasia Studio (got a 30-day free trial) and LOVED that software. I watched about 30 minutes worth of training videos and then went to town. That was a great experience I would love to repeat. One day I shall get my own license and make some videos… ideas are already churning.

NOTE: Techsmith, the maker of Camtasia Studio, has not compensated me in any way for saying these things. I have used their SnagIt software for years and love it. And I loved using Camtasia Studio as well. This is my own honest assessment.

#winning : I feel the need to mention Power Query again.

#winning : I learned my DAX needs some attention. After posting my video, I got a Tweet from the mighty Dan English (Blog|Twitter): “I think all three of your DAX calcs i would have done differently:)” This turned into a little back and forth discussion about how I could have done them differently. And it was not just about the awesome DIVIDE() function that I only remembered after submitting my entry. As such, I have recommitted myself to really digging deeper on this exciting expression language. I want to thank Dan for sparking that again.

NOTE: I beg you not to be afraid of posting your work for fear of embarrassment. The feedback we get from others helps us grow and improve. When I post my work/code, I have learned NEVER to say “This is the BEST way” when I do so. I leave it open to others to provide different suggestions. I sometimes flat out ask for exactly that. This is on purpose and part of what keeps me learning.

#winning : The mighty Paul Turley (Blog|Twitter) included my demo in his list of his favorites. See his Power BI Contest post. That was a great compliment from someone I have long respected.

#winning : More Power Query.

#winning : My entry got 218 views. That is a paltry sum when compared to others, to be sure. But from my perspective, that is 218 people who may not have seen my work otherwise. That is 218 people who may choose to come to a session of mine at a SQL Saturday, PASS Summit, or other event at which I speak. That is 218 people who may not have known I exist before that have now been introduced to me via something I am really proud of.

With the Winter Olympics in Sochi having just gotten under way, I remembered a swimmer in the 2000 Summer Olympics in Sydney. Eric Moussambani represented Equatorial Guinea in the 100 Meter Freestyle. His two competitors both had false starts and were disqualified. Eric swam alone and put in a time that, while more than double the fastest times for that event, set a national record for Equatorial Guinea. That race, at the Olympic Games, was the first time he had been in an Olympic size swimming pool. When he finished, the crowd cheered like mad. He was interviewed afterward and asked how he felt. He replied, “I’m happy.” Eric’s definition of #winning was different from that of the others. I would encourage you to watch this video about this race and Eric’s #winning attitude. I am not comparing myself to Eric. Rather, I am calling attention to the idea that the only way to really lose is to stop learning and stop having worthwhile experiences.

I really want to encourage you to jump at opportunities like the Power BI Demo Contest. There are great experiences waiting for you. There are great learning opportunities waiting for you. And don’t be afraid to create your own definitions of #winning.

Upcoming Presentations: SQL Saturday #241 Cleveland

It is with great joy that I announce that I will be presenting at SQL Saturday in Cleveland on February 8th. I have driven through Cleveland before, but never stopped for long. So, this will be my first real visit. Although, if it helps, I used to love to play as the Cleveland Browns in Tecmo Bowl back in my Nintendo days. I will be giving two sessions.

MDX Trek: First Contact

Cube space; the final frontier. In this Star Trek themed introduction to MDX, we will discuss the fundamentals of cube structure and vocabulary, including tuples, members, sets, hierarchies, and more. We will introduce and demonstrate the basic syntax of MDX with queries that include navigating hierarchies and even some time-based expressions. This session will give you the tools you need to write simple, yet meaningful, MDX queries in your own environment.

Session Level: Intermediate

I love this MDX session. I have given it many times over the past few years. The feedback has been overwhelmingly positive. It turns out that my view of the Cube space is a bit revolutionary. I have heard that writing MDX was like trying to solve a Rubik’s Cube in your head. When I first started dealing with MDX, I understood what that meant. But I soon found that it need not be that hard. In this session, before diving into code, I explain my model of looking at the Cube space that is much easier to deal with and understand. The Star Trek theme also keeps this really fun.

DANGER: The Art and Science of Presenting

Is there a great difference in the brain chemistry of someone fleeing a hungry mountain lion and someone presenting to a group of colleagues in a corporate board room? The answer is: NO. Over the past decade, a lot has been learned about the chemistry of the brain and why humans react the way we do to events in our environment. The concept of EQ (Emotional Intelligence) is a compelling and growing concept that applies this knowledge in a set of learnable, improvable skills for leading human beings. While EQ is often applied to corporate leadership, the parallels to presenting are fantastic. This session will explain the basics of EQ and demonstrate how you can apply it to make your presentations better in the following areas:

* Crafting better slide decks
* Preparing yourself for presenting
* Delivering your content
* Dealing with the unexpected

Understanding and practicing the concepts of EQ can make your presentations a better experience for everyone in the room–including you.

Session Level: Beginner

In this session, which I gave at the PASS Summit in Charlotte, I introduce the concepts and skills of Emotional Intelligence as they relate to presenting. This, too, has been incredibly well received and the feedback has been spectacular. Presenting is definitely a strength of mine and this session shows some of the mechanics behind my philosophy. This session can not only help you with presentations and their delivery, but also lays a great foundation for leadership and working with other humans.

I am also excited to announce that Digineer, the consulting firm I work for and adore, is a Gold Sponsor for this SQL Saturday. As such, I will also be giving a shorter presentation during lunch. This presentation, “Keeping The Business In Business Intelligence,” lays out our philosophy around BI. While this session will touch a bit on Digineer and who we are, it will also be grounded in solid content for achieving success in Business Intelligence initiatives.

SQL Saturday has been a hugely successful program. I have participated in as many SQL Saturdays as I could over the past several years. You can read about many of my experiences in previous posts on this blog. I have to say that SQL Saturdays have been a hugely important part of my growth in working with SQL Server and related tools. The idea of members of the SQL Community (dubbed SQLFamily with good reason) sharing their expertise with others at free events is just exciting and inspiring. I am proud to be a part of these events. I also consider it part of my own personal mission to help encourage new speakers. If you have questions about speaking (or blogging), please come chat with me. I love helping people get started. The more people we have sharing their knowledge and passion, the stronger a community we are.

Oprah And The 2014 PASS Business Analytics Conference

After the success of the 2013 PASS Business Analytics Conference, PASS is doing another one. The 2014 PASS Business Analytics Conference will take place May 7-9 in San Jose, CA.

Last year, I was a speaker as well as part of the official Blogger Core for the event. You can read my posts on this topic:

Who’s Got Two Thumbs And Is Speaking At The PASS Business Analytics Conference?

Business Analytics And PASS: Yes, Please!

PASS Business Analytics Conference – Live Blogging – Keynote Day 1

PASS Business Analytics Conference – Live Blogging – Keynote Day 2

PASS Business Analytics Conference Recap

Alas, I am unable to attend this year. But I wanted to help spread the word about what I feel is a hugely valuable learning opportunity.

In the 1990s, you often heard people talk about the Information Age. This was essentially the revolution of computerization and the adoption of our new digital world. You could argue that we are still in the Information Age, but I think we have transcended that simple definition. Even in the Information Age, information was something to be tightly controlled and protected as an asset; something to be used by the privileged ones.

<<INSERT YOUR OWN JOKES ABOUT WIKILEAKS, ETC HERE>>

Analytics solutions were there to be used by senior people in companies in order to drive strategic decisions, etc. It was not something to be shared with just anyone, even within those organizations. What we have seen over the past several years is the adoption of the idea that everyone should have access to better information. The concepts of the Democratization of Data and bringing BI to the Masses have taken root and are driving a lot of the innovation that we have been seeing. With this movement, people are truly realizing that it is not only CXOs and senior managers that need better information to make better decisions.

I picture Oprah standing before all of us, as her audience, saying “YOU get access to better information! And YOU get access to better information! And YOU! and YOU! You ALL get access to better information!”

Image Source: http://www.flickr.com/photos/puroticorico/2129229071/sizes/l/

From the release of Power Pivot for Excel 2010 to the incorporation of Power View into Excel 2013 to the launch of Power BI for Office 365, Microsoft has certainly embraced this viewpoint. Anyone who needs to make decisions can benefit from better information. As such, the role of the Information Worker has expanded to more and more people as the tools of the trade have become much simpler to use. What is key, though, is that people understand how to use this information, and the tools involved, effectively. I have to applaud PASS for creating a Business Analytics Conference at such an important time and continuing to help us make better use of such a highly prized asset.

Although I cannot attend PASSBAC this year, I really want to encourage you to do so if you can. My own experience last year was just fantastic. PASS consistently puts on quality events with great speakers and networking opportunities. And I have no doubt the 2014 PASS Business Analytics Conference will live up to expectations.

NOTE: If you had told me back when I first started blogging that I would feature Oprah in a post, I never would have believed you. But, here we are…

Survey: Changing Model.bim Filename In SSAS Tabular Projects

I am working for a client that has several Tabular models and is developing more. Even though the process of developing Tabular models in SSDT could use some improvement, I am happy to see this exciting technology being adopted.

I noticed that the models here are pretty much all called Model.bim in the project. I have typically renamed mine to provide better context and never encountered an issue. My thinking was based on the multi-dimensional world in which a Cube called Cube is pretty ambiguous as to what information it contains. Likewise, a table called Table or a database called Database. Those examples are a little different, though, since a tabular project can only contain ONE .bim file at the moment.

William Weber (Blog|Twitter), with whom I am working on this project, pointed out that Books Online indicates that the name of the bim file should not be changed:

image

There is so little detail here as to make me question what could happen. I reached out in general on Twitter and no one seemed to have a good explanation. Today I asked SSAS Program Manager Kasper de Jonge (Blog|Twitter) directly. Kasper knew of no specific issue, either, and suggested it was probably just not tested. Fair enough.

Still, there does seem to be some gray area here. With this post, my hope is that we can eliminate some of the gray and provide better clarity around this for all of us. I would appreciate responses to this in comments.

1. Do you rename your Model.bim file and why/why not?

2. If you do rename it, have you had issues as a result? If so, what issues?

Thanks.

Power BI Demo Contest Entry

Behold! I hereby present my entry into the Power BI Demo Contest! I am really pumped about this set of tools and hope this demo helps show off what Power BI can do.

You can view it here on my YouTube channel.

Getting a prize would be cool, but I have to say the fun I had making this video and learning more about Power BI was awesome.

PASS Summit Interview With Kamal Hathi

For the third, and final, installment in my PASS Summit Interview series, I present my interview with Kamal Hathi, Director of Program Management for Business Intelligence Tools at Microsoft. Kamal is the one who is ultimately responsible for the direction of Microsoft BI.

As with my other interviews, the byproducts of casual conversation have been edited out for better flow in writing.

Transcript

Mark V:

Things are changing very rapidly in the BI space. There are so many tools coming out and Microsoft is delivering so many awesome new features. As a BI Developer, what does my job look like 5 years down the road?

Kamal:

That’s a great question. I think we’ve been on a journey where BI was something central that somebody built and everybody sort of just clicked and used, or maybe had a report. Now, we’re coming to a point where a lot of it is empowered by the business. The business guy can go off and do lots of things. But there’s two parts that are interesting that a Developer, a Professional, if you will, needs to be in the middle of. One, I think, is the data. Where does the data come from? The first requirement is having the right data. And call it what you want, Data Stewardship, Data Sanctioning. And it’s not just Data meaning rows and columns or key-value pair kinds of things. I think there’s Data, as in the Data Model; building a Data Model. And a Data Model, sometimes, is not different than what people do building Cubes. They have Hierarchies, and they have Calculations, and they have interesting drill paths. All kinds of things. So, someone has to go do that work, many times. Even though, I think, end users can go and mash up data themselves. But there are times when you need a more supervisory nature of work; or a more complicated nature, if it turns out that Calculations are difficult, or whatever. Someone’s going to always do THAT work. And that’s interesting. The second piece that is interesting is that I think there’s going to be a new workflow. We’re already seeing this pattern in many places. The workflow is going to be like this. An End User decides they want to build a solution. And they go off and, essentially, put something together that’s very close to what they want. But it’s not really high performance; it’s not perfect. Maybe it’s got some missing things. And they use it. And their peers use it. It gets popular. And then some Developer comes in and says, “Let me take that over.” And then they improve it, make it better, polish it up. Maybe completely re-write it. But they have the right requirements built in right there.
A third piece you will start seeing is going to be the Cloud services based world. That is taking bits and pieces of finished Services, and composing them together to build solutions. And that’s something we don’t see much today. But I can imagine someone saying, “Hey, I’ve got this Power BI piece which gives my visualization, or some way to interact. I can take some part of it and plug it in to another solution.” They can provide a vertical solution, provide a customized thing for whatever audience it is. And be able to do so. I think those kinds of things will be much more likely than doing the whole thing.

Mark V:

So, instead of necessarily the BI Developer working with the users to help map out requirements, with these End User tools, they can build something that’s their requirement.

Kamal:

More or less. And then the Developer can jump in and make it better. Maybe re-write better expressions and queries and make it faster; all kinds of interesting things. It just adds more value to their job. Instead of sitting there talking to people and getting “Oh, this is wrong. Re-do it.”

Mark V:

When Tabular came out, there was some uproar. When a new feature comes out, there are always “the sky is falling” people that say that something else must be going away. It happened when Power View came out. People said, “Report Builder is going away.” Then, when Tabular came out, “Oh! Multidimensional is going away.” I still hear it out there sometimes that Multidimensional is dead or dying; MDX is dead or dying. What message do you have for those people?

Kamal:

Two things here. MDX and Multidimensional are not the same thing. We should be very careful. Multidimensional is very important. Lots of customers use it. And we’re actually making improvements to it. This latest thing, DAX over MD, which allowed Power View to work over Multidimensional, is a great example. We know this is important. There are many customers who have very large scale systems in Multidimensional. And it’s important to us. We’ve just come to a point with Multidimensional where large jumps in functionality are just harder to do. You can’t really go on and say we can 10X anything. And so the In-memory stuff, the Tabular, has been the approach we’ve taken to give more kinds of scenarios; the more malleable, flexible stories, the “no schema design up front” kind of approach. But Multidimensional is super important. It isn’t going anywhere. So, when someone asks, “What should I use?” we say, “You pick.” Our tools, our aim is, should work on top of either. We would like Tabular to be on parity with Multidimensional in terms of capabilities. And we’re making progress. We’re getting there. We haven’t quite gotten there. But users shouldn’t have to worry, in our opinion, about Multidimensional or Tabular. These should be things that you worry about as a tuning parameter, but you shouldn’t have to worry about them as a big choice.

Mark V:

So, the Business shouldn’t be concerned about it.

Kamal:

Right. It’s a store. It’s a model. It’s a calculation engine. And you might want to switch one to the other. And in the future, that could be a possibility. There are lots of limitations that might not make that practically possible, but notionally speaking. I’m not saying you could just pull a switch. But you should be able to have similar capabilities and then decide which one to use.

Mark V:

Or possibly some kind of migration tool?

Kamal:

Yeah, but those kinds of things are harder sometimes. They’re not so easy to do because... who knows what’s involved? What kind of calculations did you write? Etc. Those are much harder to do. Migration is always hard. But comparable capabilities make a lot more sense. So, I can take this guy, and build the same thing and not have to worry about a dead end.

Mark V:

When I was at TechEd, Kay Unkroth did a great Managed BI session. And he started with a demo of taking in data from the Internet and combining it with some business logic. It was tracking the purchasing habits of Pink Panthers buying cell phones. And in his scenario, they went through a big investment at this Company only to find out that Pink Panthers don’t exist. So, in the realm we have, with data becoming more self-serve, with Power Query, etc, being able to reach out to more and more places, what is the thinking [from Microsoft] on the ways we can continue to have some governance over the realm that users have?

Kamal:

Fantastic. This question goes back to the first question on what Developers do. We talked about Data Stewards and Sanctioned Data and all that. And even with Power Query, if you work with that, the catalog you find things from isn’t just the Internet. The Internet is one catalog. We are also enabling a Corporate, internal catalog, which today, we showed in the keynote. And you saw what we can do. It’s in Power BI, and you can actually try it out. And the goal there is to find a way for someone, it could be a business user, maybe a Developer, the IT professional, to go off and add to a catalog that is internal to the company. They can add something they think is trustworthy and worth sharing. And then someone else can come in. And they want the shopping habits of certain constituencies or a certain segment, they can find it. As opposed to “I heard on YouTube that 20-inch phones are now hot.” Who knows, right? Just because it’s on the Internet, doesn’t mean it’s true. That’s the idea of having this info catalog, essentially. It knows how to provide people the capability of publishing. And that can be a mechanism for whoever one deems fit in the process to provide that sanctioned publishing. And maybe have a data curator look at things and make sure users have a place they can go for trusted sources as opposed to “wide open.” And we actually enable that. People can do that.

Mark V:

Could you then disable the Internet piece of that catalog in certain organizations?

Kamal:

Potentially. The question is Why? And, again, it’s a process thing. You ask people Why and what they want to do. And that’s the danger of governance. The minute you make something such that it becomes that Forbidden Fruit, someone will find a way of doing it. They’ll go out and Cut and Paste. They’ll do something. It’s almost impossible to really lock these things down. What you do want is people to be aware that there’s an alternate, a sanctioned place, where they can go compare and they can understand. And that’s what they should be doing. I think it’s much harder to lock it down and say you can only get data from inside. But that’s a process view; an organization will decide how they want to do it.

Mark V:

Early in my career, I built several report models for clients. Obviously, you know, they’re going away. They’re not being invested in. I recently wrote a Technical Article about alternatives to Report Models in SQL 2012. And I laid out the different options that you have, from Power Pivot for Excel, Power Pivot for SharePoint, Tabular, or full on Multidimensional, etc. One of the things that was pointed out to me after that is that one of the things that goes away with report models is the ability to make that regular detail style table report in Reporting Services that can go against a drag and drop model that’s not a Pivot table. Is there something coming down the road that could alleviate that?

Kamal:

There are ranges of this, right? You can do a drag and drop tablix, even, in Power View. So, it’s not like it’s completely gone.

Mark V:

Right. It’s not completely eliminated. But it’s not the flexibility of a Report that you can do subscriptions with, etc.

Kamal:

I think in terms of those kinds of things, like Subscriptions, mailings… Two things. One. Reporting Services is still there. It’s not like it’s gone away. And number two, we obviously think it’s important to have the same capabilities in other places and so we’re looking into making that possible. A lot of people ask for it. And it seems like a reasonable thing to ask for. Why would I not want to go schedule even my Excel sheets? A report’s a report. For some people a report is a Reporting Services report. For some people a report’s an Excel report. It’s an important point and certainly we’re looking into that as something we would do. But I don’t know When, Why, How, Where. As with all things Microsoft, it’s open to interpretation by whichever tea leaf you read about.

Mark V:

From the visibility you have within Microsoft to see what’s going on, could you describe, in a general sense, how a feature for, say, Analysis Services, goes from ideation, with someone saying, “Hey, wouldn’t this be cool?” to making it into the product?

Kamal:

Absolutely. There are multiple ways this happens. One is when we had an idea before or a customer had asked for it, and we decided to ship it and couldn’t. And then it becomes a “backlog” feature. So, next time, when the next version comes, this one is left over and we say, “Hey, come join the train.” We just pull them back on. That’s easy. Done. Everyone knew we wanted, or tons of people had asked for it, and we just couldn’t finish it, so we just put it over. The second, which is much more likely, is customers have been asking for something. For example, we heard the feedback loud and clear, “We need support for Power View of Multidimensional models.” That’s a pretty clear ask. The work is not easy, but it’s a pretty clear ask. So then you go and say, “What does it take?” We figure out the details, the design, the technical part of it. And then you figure out the release timeframe, the testing. All that stuff. And you do it. The third one is when somebody says, “I have an idea.” For example, “I want to do something. I have an idea for an in-memory engine and it could be like a Power Pivot.” And that’s just like, “Wow! Where’d that come from?” And then you say, “Will it work?” And nobody knows, right? So then we start going in and asking customers what they think. Or, more likely, that idea had come from talking to somebody. And this Power Pivot case actually came from talking to customers that said, “Hey, we want end users to be more empowered to do analysis themselves. What can you do?” That was essentially the germination of that idea. When that happens, there’s usually some customer input, some personal input. And then they start to come closer to fleshing it out and ask who’s going to use it. So we ask customers, get some replies. Sometimes we just do a design on paper and say, “Would you use this?” And then people try it out, give it some feedback, and say, “Yeah. That’s good.” So, we go down this path getting more and more input.
And then we come up with a real product around it. Sometimes, we have an idea that we think is great and we float it around and we hear vocal feedback saying that’s a bad idea. And we go back to the drawing board and re-do it. And this happens all the time. Many of these features show up, and people don’t realize that we went back and re-did it because the community told us. And many people ask if we use focus groups here [at PASS Summit]. There’s one tomorrow, actually, to go sit down and say, “What do you guys think about XYZ?” There’s feedback; and we listen. So, there’s an idea, there’s feedback, there’s evolution. And we iterate over it all the time until we come to a solution on that. Very rarely do we just build something where we didn’t ask anyone and we just did it. We come up with a brainstorm and flesh it out, but we always get feedback.

Mark V:

So, Matt Masson has discussed the debate within the SSIS team regarding the Single Package deployment model versus the new Project Deployment model. I would assume that happens throughout all the products. Maybe there’s somebody championing this feature and people asking “What are we going to do?”

Kamal:

Yeah. My first memory of working on this team is people shouting in hallways… in a friendly way. People shouting, “Hey! What about THIS?” or “What about THAT?” And it turns out to be a very vocal and a very passionate environment. People don’t just work because they walk in in the morning and write code. They work on this because they are just in love with this product. They are just deeply, deeply involved and committed to what they’re developing. And these people just have opinions. And I don’t mean opinions like “Yeah, I want to have a hamburger.” They have an OPINION. And there are very passionate discussions that go back and forth and back and forth sometimes. Typically what happens is a Senior Architect or someone who has some tie-breaking capability can come in and say, “Look, great ideas everybody, but we’re going to do THIS.” And then people listen, there’s some more argument, and after that, it’s done. And we go and do it. And Power Pivot is a great example of that. People were like, “Are you crazy? What are you doing??” And it was like, “No. We’re going to do it.” And that was it. And the team just rallied behind it and built a great product, and off we go. But, the good part about that story is that, because people have such vocal opinions, they rarely remain silent about these things. We don’t lose anything. And so we have an answer that we can listen to. And we can decide from multiple options as opposed to just going down one path. And then we end up with a decent, rigorously debated solution.

Mark V:

So, there’s a lot going on right now with Cloud and Hybrid solutions; there’s Chuck Heinzelman’s great white paper on building BI solutions in the Cloud in Azure virtual machines. When I talk about that with clients or with colleagues, there’s still a lot of trepidation and people say, “For some industries, that will never ever happen.” What kind of message would you have for people that just don’t see how it’s feasible?

Kamal:

Cloud has many aspects to it. There is the “Everything’s in the Cloud. I want to put my financial, regulatory data in the cloud.” Which, is not going to happen for some people. Then there is the “I can get value from the Cloud on a temporary or on-demand basis. I want to go burst capacity. I want to go off and do a backup.” Whatever it is. There’s that. Then there is the “I’m going to do some parts of my workload in the cloud and the rest will remain on premise.” And for all these options, whether it’s completely in, hybrid, temporary bursting, that value provided is typically what customers decided makes sense for them. If it’s the value of all the infrastructure just being taken care of and I can do more things to add value to a solution, so be it. If it happens to be that I can get extra capacity when I need it, great. If it happens that I can keep my mission critical or compliance related data on premise and lock it up, and hybridly work with the cloud, that also works. And for most customers, most users, there is value in many of these scenarios. And there isn’t any one answer. You can’t just say that “The Cloud” means that you do X. It just means that you have many options. And interestingly, Microsoft provides many options. We have Platform [PaaS], fully deployed platforms that you pretty much just deploy your database and have to do nothing. Azure SQL Database, good example. All you do is worry about building your database, setting your tables, and you’re done. We take care of all kinds of things in the background. Don’t like that? Go to a VM. And you can do all kinds of fancy things. Chuck’s paper is a great example. We have solutions. Office 365 gives you end-to-end; from data to interfaces to management, all in one. Each of these things have hybrid solutions. You can have data on premise. As we saw today in the keynote, you can backup to Azure. With 365 and Power BI, you can actually do hybrid data connectivity.
So, all of these things work in different ways. I think, for every customer out there, it typically is just a trial solution, they want to show something, or they want to have part of their solution that works. Either way, it adds value. Typically, what I have found as I talk to customers, is that many of them come from the “I would never do that” to “Oh. I might try that.” Because they have come to the realization that it’s not an all or nothing proposition. And you can do parts, try it out, and it works.

Mark V:

Over the Summer, I did a proof of concept for a client using Tabular. Just because, with the size of it, and looking at its characteristics, I said, “This would be a great Tabular solution. Let me demo it for you.” I have talked to several people in the industry. And the process of developing Tabular in SQL Server Data Tools can be… less than awesome. I had some bumps in the road. There was a great add-in I got from CodePlex [DAX Editor] that helped me deal with DAX in more of a Visual Studio environment. It didn’t apply well to Service Pack 1 [of SSAS 2012 Tabular] and that kind of stuff. There was something that Marco Russo had put forth on Connect that suggested more or less a DDL language for Tabular models. Is something like that feasible?

Kamal:

I don’t know. I don’t have a good answer for that. The reason for that is we’re looking at many options. What would move the ball forward in that direction for a design environment for Tabular models? And there are many options. Some would say let’s do it in Excel; take Power Pivot and put it on steroids. It’s a possibility. Or a DDL language. Go off and take the things you had in MOLAP and apply it here, maybe. Maybe something brand new. I don’t know. We’re trying to figure out what that is. I do know that we do want to take care of people who do large scale models, complex models in this environment. I just don’t know how and where and when. But it’s an important constituency, an important set of customers. And we’ll figure out how to do it.

Mark V:

As a BI developer, it’s important to know that the discussion’s being had.

Kamal:

All the time. This happens in a lot of hallway discussions. This is one of those.

Wrapping Up

There’s a little bit of a story to this one. When I decided I wanted to do a blog series composed of interviews conducted with Microsoft folks at the Summit, I wanted to get different perspectives. With Matt Masson (Blog|Twitter) and Kasper de Jonge (Blog|Twitter), I already had members of teams in the trenches of development of the tools. I had then reached out to the awesome Cindy Gross (Blog|Twitter) to get the perspective of someone on the CAT (Customer Advisory Team). Cindy got back to me with a contact for Microsoft PR, Erin Olson, saying that she was told to send me there. Upon contacting Erin, she responded by offering to have me sit down with Kamal Hathi, who would already be on site that day. That was an offer I couldn’t refuse. In hindsight, I am wishing I had asked about sitting down with Cindy as well, but I had already decided that my first series of this sort would be capped at 3 since I had never attempted anything like this before and didn’t know what to expect. If this series proves to be popular and of value to the Community, then I will certainly consider doing it again and asking Cindy to participate.

You will notice some overlap in the questions posed to my fantastic interviewees, particularly between Kasper and Kamal. I wanted to get different perspectives from within Microsoft on some similar topics. I also made sure to branch out in each interview and ask some questions targeted to a particular person.

In response to my “5 years down the road” question, Kamal echoed the importance of Data Stewardship. It is clear that this is an area that Microsoft is taking very seriously. Having done a lot of reporting in my career, my motto has always been, “It HAS to be right.” Clients have appreciated that. As we open up more and more avenues for users to get data, we must keep in mind that the data needs to be trustworthy.

I really want to highlight the ways in which Kamal described how a feature makes it into the product. Make special note of the fact that customer feedback is vitally important to Microsoft. Sometimes, the idea itself comes from Customers. I think Microsoft often gets a bad rap as some kind of bully or something merely because it is big. It is certainly not perfect; no company is. But I think it is really important to make note of how Microsoft DOES listen to customer feedback when it comes to the products they provide.

Kamal’s description of the internal debates that occur within Microsoft over features is important. It also echoes what we heard from Matt and Kasper. The people working on these products for us care VERY deeply about what they are doing. The work and passion that go into creating these tools we use every day is staggering to me. While I have never been a “fan boy” of any company, I have chosen the SQL Server related technologies upon which to base my career. And I have no regrets. This is a hugely exciting time to be a BI professional. The investments that Microsoft has been making in this space over the past several years make it even better.

This concludes my PASS Summit Interview series. Thanks so much to Matt Masson, Kasper de Jonge, and Kamal Hathi for taking time out of their very busy schedules to sit down with me and answer my questions. Thanks also to Cindy Gross and Erin Olson for their assistance in connecting me with Kamal. This series turned out even better than I had ever expected thanks to the generosity of those involved.

PASS Summit Interview With Kasper de Jonge

I continue on with my Interview series with Analysis Services Program Manager Kasper de Jonge (Blog|Twitter). As before, some edits were made, with Kasper’s permission, to eliminate byproducts of casual conversation and make things flow better in writing.

Transcript

Mark V:

How would you say my job as an SSAS developer would be different in five years?

Kasper:

Before I joined Microsoft, I was a developer, myself. I developed Analysis Services Cubes and SSRS reports on top of them. And they never seemed to work very well together. One of the things I have seen over the years, since I joined Microsoft, is the Teams started working together better, much better. So, teams like Power View and Analysis Services are coming together in releases, and now Power Query and the Data Steward experience join the mix. But I think that is one of the key aspects going forward.

I have been trying to sell MS BI before joining Microsoft, and it was hard. What do you need if you want to buy MS BI? You need Excel, so you need an Office license key, you need SharePoint, you need Analysis Services, you need Enterprise Edition, or BI Edition now, luckily we have that. So, you need to sell four different products. Now you can just say, we have one product: Power BI.

It’s gradually going. Power Query is still a little bit separate. The M language is there, then there’s the DAX language, and what do you do where? But at least we’re landing. The first thing we said two years ago was that there’s only going to be one model. And that’s the Analysis Services model. In the past, Reporting Services had their own model, right? The SMDLs [Semantic Model Definition Language]. Performance Point had their own models. They all had their own stuff. So we said, “No More. There’s only going to be one model, and that’s going to be Analysis Services.” That’s already a big step. You see people like Power Map come into the picture. The initial versions that were not public were not really connected to our stuff. We sat down together and said, “Let’s be sure we all do the same thing.” So, if you go into Power Pivot, and you say this column is a country, tag it as a country, not only can Power View use it, but Power Map will now use it as well. I think that’s one of the biggest benefits and it was really needed: to make one product, and make them work much better together.

Mark V:

Do you see big changes in the skills of people like myself, not an Information Worker, but someone who sets up the environments in which the Information Workers play?

Kasper:

I don’t really think so. I think the role is going to change a little. And that’s not necessarily to say that you’re going to have to do different things. But in the recent years, as there’s less IT, more cutbacks in IT, you have to do more things in less time. So, enabling the Business User is becoming more and more important. And not just by giving them canned reports, but by giving them better models, which we already did with Multidimensional Models for years. But make it even easier, and that means making good models in either multidimensional or Tabular, and have a good analytical platform on top of that. So, that’s one kind of user who only wants to do template reports or ad hoc visualization on top of models. That kind of stays the same, I think. I do hope that with Tabular models, it’s becoming easier to do shorter iterations, and we can grow the Tabular model over time and make it easier to use and make it easier to do larger things. For example, I have seen people that have six to seven hundred measures in their Tabular model. And that’s pretty hard to maintain. So, we need to come up with stuff to make that easier. I met someone yesterday that had 120 tables and five hundred measures. Well, right now, we don’t have a great experience for you to build and manage that. So we need to think about what that means. It’s more about how the tools change. I’m a PM [Program Manager] who works on the tools side of things. So, that is one aspect of the BI Pro as we know them today.

On the other side of things, with data movement, as Matt Masson was showing earlier today, you can expose data for your users to start using inside Power Query. And you can enable data stewards to start creating data. So, you, as IT, are not necessarily building it, but you are starting to enable people. And I remember, back in the day, when I was building cubes myself, I built an application in .Net that allowed business users to add data to the data warehouse. Master Data Services does it now pretty well. So, the two types of Business Users, one being the user that just wants to do reporting, doesn’t want to do any modeling themselves or any calculations. So, that’s one. The other is the actual Power Pivot/Power Query user and we can help them get to the right data easily and make them confident that the data is right. And that’s an important venue. And I think that’s also an important part BI pros have been doing for years. They can shift a little bit into that mindset, and enable that as well.

Mark V:

From a tools perspective, one of the questions I have is around enabling the end user to get more and more data, including data directly from the Internet. One of the things you talked about is the experience for the data steward with Master Data Services. Is there discussion around a solution that allows users to get data from the internet, but only so much? Kay Unkroth, at TechEd, did a great session around Managed BI. In that session, a fictitious company tracked the purchasing habits of Pink Panthers. And it wasn’t until a large investment had been made that someone realized, “Oh no. Pink Panthers aren’t real.” So, the experience of getting to more data. But how do we make sure it’s good?

Kasper:

There are definitely discussions about all of that. And you already see it a little bit in the portals. If you saw Matt Masson’s session today, you saw that you can track how many times different data has been used, and by whom. And we have that On-Prem today. And, in my mind, that is one of the most popular things. To allow you to understand what the data means. And I sincerely hope, and I am not sure if this is coming, but things like Data Lineage would make a lot of sense in here as well. I don’t know if you’re familiar with Prodiance? That’s something that the Excel team has. And it’s already released in Excel 2013. And it allows them to do, sort of, Excel spreadsheet lineage focused on the financial markets. I don’t know if you remember, this was a few years ago, and someone made an error in some calculation in an Excel spreadsheet and they lost a few billion dollars. So now all banks, etc, are saying, “OK. We need to manage this.” So they [Excel Team] have a product they bought, I think two years ago, called Prodiance. And it’s now available inside Excel. They only discover Excel workbooks for now and they don’t know anything about data models and everything that goes into that. So, it would be great if we could “hook that up” for example. I’m not saying that we’re doing that. But that’s something that would make sense.

Mark V:

So, with the way that Office and Analysis Services are dovetailing more, like in Power BI, is there sometimes contention between the teams?

Kasper:

No. The Office team loves what we’re doing. We’re adding value to Office. We’re giving them all kinds of new features. And we’re innovating in the BI space. And they love that. They do give us some hints and tips on what they want to see and we try to accommodate that. It’s more like working together. Our directors are working together and they see what is needed and say, “How do we work together on doing this?” We all see we’re working together in the Excel code base. But what do you think about Power BI? It’s one completely shared code base. You have Office 365, SharePoint Online, all the infrastructure. It’s one big surface that lives and breathes together. So, it’s a lot of working together.

Mark V:

That has to be pretty exciting.

Kasper:

Yes. I mean, it’s a big company. The Office team has its own building. It’s a little bit different. Each team has its own rules, and how it works, and it’s different. Office has a longer planning period. We don’t have a long planning period. In the past, we also had different shipping vehicles. Now this is more streamlined.

Mark V:

So, with the evolution of Analysis Services to feature both the Multidimensional and now the Tabular model, I encounter people who say, or have heard others say, “Multidimensional is dying” and “Don’t bother learning MDX because it’s not going to matter anymore” and so on. What kind of message would you have for those people?

Kasper:

My next session, in an hour, is about all the investments that we made in multidimensional that allow you to do Power View over Cubes. And that was not an easy improvement. So, we now support DAX queries on top of Multidimensional Cubes. That is some major major work that has happened. We’re saying, now you have all the good stuff with Power View. And whenever Power View does something going forward: you will get it. Automatically. So, it’s definitely not that. Having said that, it’s still a hard decision on when to go for what. Multidimensional is just a much more mature product. It’s been in the market for so long. People have worked with it for all these years. With Multidimensional, we’ve seen all these different usage types. We’ve seen the Yahoo cubes, the huge ones, the small ones, we’ve seen people do Writeback, and all those kinds of things. So, it’s been around the block. Tabular has not been around the block for long. It just started the journey. So, we’ll see where that ends up. I’ve heard some feedback from people here as well. They did multidimensional cubes and they started Tabular and said, “Well, it’s just great because it makes it so easy and makes it so fast to build something.” But it doesn’t have certain features. That’s for sure. Calculated Members would make my life so much easier. I wouldn’t have to do 400 measures. If I have Calculated Members, I could just have a few Calculated Members, and I’m done. I don’t have to do YTD for this measure, and this measure, and this measure. And when I do custom rollups, you can’t do it in Tabular. There’s just some things in Tabular that you cannot do yet. For example, Hierarchies. Get me the Parent of something. In Multidimensional, it makes sense because you have Attribute Relationships and you have Hierarchy structures. In Tabular, we don’t. We just have Tables. We have Hierarchies there, but hierarchies are more an “ease of use” feature instead of a structural feature, like it is in Multidimensional.
So, there’s just a lot of things that haven’t made it. We don’t know if we want to bring that in to Tabular. So, it’s not that, that’s for sure.

Mark V:

Multidimensional is not going away.

Kasper:

No. It’s certainly not going away.

Mark V:

So, with MDX being as complicated as it is, and even though it would take years to get really good at MDX, is it still a worthwhile path to go down since there is still so much multidimensional out there?

Kasper:

Yes.

Mark V:

And there are still so many use cases for Multidimensional, even with Tabular.

Kasper:

And Excel still talks MDX even to Tabular. There are so many tools out there that talk MDX. But, having said that, I’ve heard a lot of people here that said, “I’ve migrated a lot of Multidimensional Cubes to Tabular Cubes. It makes my life so much easier.” So, I’m not sure I can give an answer. But, I think you can get away with just learning the basics of MDX. Or learning the basics of both. Because, I think, that’s probably what you’re going to need. You probably think about, “What do I need to become an expert in?” I’m not sure what the answer is.

Mark V:

It’s kind of tough. That’s the position I’m in, personally. I’ve done a little MDX. I have a blog series and stuff like that; went really well. And I’m like, “Well, do I dive deeper into that? Do I do something similar for DAX?”

Kasper:

It kind of depends on the situation you’re in, I would think. If you have the opportunity to push Tabular, it fits much more into the Agile world. I mean, it’s so much easier to make some changes. But, if your customer demands are not Agile, if they want to stick to the old world methods, then Multidimensional is probably preferred, I would think.

Mark V:

So, having been on the [Analysis Services] team for a few years, are there features of Tabular, of Power Pivot, or anything that you championed and are really proud of? Anything where you’re like “Hey, I stood up for this, it’s in the product, and I’m really pumped?”

Kasper:

It’s so much of the little things. I have business, myself, with everything. Things like this particular DAX function; I need to make sure this works correctly. All the small things like Sort by other Column; making sure that came in.

Mark V:

I love that, by the way.

Kasper:

It’s so many of those little things that make the product complete.

Mark V:

I did a POC for a client using Tabular because it’s really a good fit and it was kind of a cool solution. One of the things I found when I was working on it was that, working within SQL Server Data Tools…. It’s not “awesome.” You can do it. You’ve seen some of my Tweets about changing Column names and things of that nature. There was a great tool that Cathy Dumas had written and put on Codeplex.

Kasper:

The DAX Editor one?

Mark V:

Yeah. The DAX Editor. Are there any thoughts to maybe upgrading that? Because, even though it was not fully compatible with [SSAS Tabular] Service Pack 1, and it had “issues,” it was awesome enough, that I used it anyway.

Kasper:

That was a personal prototype, together with someone else. I cannot speak for that person.

Mark V:

OK. But something like that. Writing DAX in THAT environment, even with it not working perfectly, was awesome.

Kasper:

I know. I get that. They found a quicker way to do it. Of course it was Codeplex, so it was not officially supported. But with SP1, a lot of things changed in the model, so it [DAX Editor] broke.

But, I totally get it. I really sincerely hope we can come up with a better example in the product. I’m not saying that we’re doing it right now, but definitely would love to do something like that. This is part of what I was saying about having larger models. In Excel, it’s a different view point. If you are in Excel, you work to solve “a” problem, and then you throw it away. In a Tabular model, as a BI Developer, you have to solve a problem for 40 people. So, you need to look at it from all different angles, and different viewpoints. So, it’s bigger and more complex. So, you need bigger and better tools, and not just the Measure Grid.

Kasper:

One of the other examples of teams working together, and we almost had this on the Keynote: did you know you could have Excel 2013 with Power View and query Hadoop, with no caching, with our existing products today? I mean, this is awesome; it’s teams working together again. Excel 2013 Power View connects to a Tabular model in Direct Query mode. The Tabular model in Direct Query mode connects to PDW [Parallel Data Warehouse]. That sends Polybase queries directly to Hadoop. And we worked with the PDW team to make sure the queries that we send are supported in Polybase. So that they understand the queries that we send. It’s not going to be as fast as putting into Vertipaq [xVelocity]. But, there’s no caching. I directly go from your Excel spreadsheet, in Power View, to data in Hadoop and you return it.

Mark V:

How long has this been supported?

Kasper:

This has been supported for quite some time. One of my colleagues is getting in line to write a blog post about it. He still hasn’t done it. This is one of those things where, before we say anything is “supported,” we have to test it. And that costs money, right? So, that takes money away from a Power BI feature or anything like that. But, in this case, we thought, “OK. This is going to be so cool!” And you can imagine, PDW just started going down this path. But, I can imagine, this will become faster in the future. So, this is going to be awesome.

Wrapping Up

I really liked hearing how teams within Microsoft are working together. Kasper has a great point regarding traditional Microsoft BI requiring you to purchase several different products. Power BI really tosses that model on its head. If Microsoft really wants to democratize BI and bring it to the masses, the simplification of the process is a key step.

I have to confess that I had never heard of Prodiance until Kasper mentioned it. That sounds like some cool functionality that I will want to play with.

It seems that when new technologies come out, there always have to be people that say that some other technology must be dying in consequence. When Power View came out, there were people that decided Report Builder would go away. When Tabular came out, people panicked that Multidimensional must be going away. The sky is always falling, isn’t it? When Kasper made his point about the work that went into having Multidimensional Cubes support Power View, it made a lot of sense. Why would Microsoft invest time and effort in such a difficult task just to sunset Multidimensional soon after? That would make no sense. Kasper was pretty clear: Multidimensional is going to be around a while. As will MDX.

I really like Kasper’s point about Tabular being more in tune with the Agile development cycles of today. It is a lot easier to make iterative changes to Tabular than it is in Multidimensional. At the same time, his point about Tabular not having been around the block yet is a great one. There were cool aspects to my choice of Tabular for a client project last year. There were also a few surprises that I had to deal with. I look forward to getting strong expertise with it so that I am in a better position to work around difficulties and take better advantage of new features when they come out. I was heartened by the fact that Kasper saw where I was coming from with a better environment for DAX development. Hopefully, there is more support for that within the team.

Kasper’s example of using Tabular in Direct Query mode hitting PDW is a great example of the future I would like to work in. Taking disparate technologies and putting them together to make a cool solution is just a blast.

Thanks so much to Kasper de Jonge for taking time out of his busy schedule (I think he presented 4 sessions at Summit) to sit down with me. My final interview post, with Director of Program Management for Microsoft BI Kamal Hathi, should come next week.

PASS Summit 2013 Interview with Matt Masson

At PASS Summit 2013 in Charlotte, I had the opportunity to sit down with Matt Masson (Blog|Twitter), Senior Program Manager on the Integration Services Team at Microsoft. I was really honored when Matt explained how busy his week was and then offered me a half hour anyway. I want to give a tremendous THANK YOU to Matt for being so generous with his time.

I had no grand plan/agenda for my series of interviews of Microsoft folk at PASS Summit 2013. As such, I plan to just display the transcript of my conversation with Matt as it occurred. NOTE: With Matt’s permission, I have edited out “Um” and “Ah” and other byproducts of casual conversation so that it flows better in writing.

The Transcript

Mark V

With the way things are going, with Cloud, and everything else going on, what does the future of SSIS development look like 5 years down the road?

Matt

I think five years out is a bit too far. We’re seeing a lot of big changes, especially around Hadoop. I think Hadoop and Big Data processing have been a big disruptor to the ETL space. I think there’s still a lot of what we call “traditional” ETL work, what people do today with SSIS. That’s where SSIS’ strength is. But we’re getting more and more requests about Cloud processing. That’s actually one of the things I’m going to talk about at PASS today, at the SSIS Roadmap session. One of the interesting things is, say, go back two or three years ago, we had people asking, “Can I have SSIS running in the cloud? Can you make SSIS run in the cloud?” And we’re like, “Yeah, that’s a great idea. Let’s go build it.” And then we started asking, “What scenarios?” and “Why do you want to run SSIS in the cloud?” Customers didn’t know. OK. Where’s your data? Data is all on prem. If your data’s all on prem, running in the cloud doesn’t necessarily make sense, right? I think, as we’re seeing a shift of more and more data to cloud sources, so they’re landing in places like Azure, or even pulling in from remote sites or pulling in from different cloud providers like Salesforce.com or something like that. If your data’s already IN the cloud, then doing your ETL processing closer to that data makes a lot of sense. So, today, you can run SSIS in an Azure VM and we’re having a lot of customers do that. So, you’re using your traditional On-Prem tools. It’s just running in the Cloud.

Other things we’re considering and looking at is, basically, what if SSIS could run as a service? What if you didn’t need your VMs? You could just deploy your packages and run things like that?

In addition to traditional ETL, we’re also looking at other technologies. There’s other data movement technologies out there like Azure Data Sync, which is very simple: I want to keep my On-prem databases and my Azure databases in sync. So, you don’t need a full ETL framework. You don’t need an ETL developer. Sync just takes care of it for you automatically.

So that leads us to a couple of different angles. We’re trying to make ETL easier, more automatic. Just keep schemas in sync. While for the more advanced scenarios, your traditional ETL scenarios, SSIS still makes a lot of sense. We need to evolve SSIS to better fit in the “Cloud” world.

Then there’s Big Data and Big Data processing. You’re seeing an evolution of technologies on Hadoop, right? There’s a lot of different technologies, lots of things going on. You’re seeing lots of tools at different stages of maturity. It’s a really interesting space to see how it’s evolving. One of the things I’m going to talk about today is to show SSIS integration with HDInsight, for example. So, from SSIS, you can provision HDInsight clusters, you can run Hive jobs, Pig jobs. You basically orchestrate everything you want to do on Hadoop from SSIS. You get the nice visual experience which is lacking from Hadoop and the Big Data system today.

Mark V

So, when you think about Hadoop, and the Cloud, and the Democratization of data; bringing BI to the Masses; the revolution of Self-Serve, one of the things you have is Users looking at data that they may not know how to vet properly. So, when I think of tools like DQS (Data Quality Services) that are often integrated into ETL, what are some of the things that we could look for in the future? Not necessarily products, but just concepts for how Microsoft is going to help handle that with moving data around to enable that Self-Service, while still keeping it easy to get to.

Matt

So, Self-Service is an interesting space. We have Power Query coming out, which gives you self-service, light-weight ETL. I think our self-service vision has been resonating really well. We’re seeing more and more customers picking up on that. But, just like there’s a space for Self-Service BI, but also a need for traditional BI modelers to take that raw data into a model concept so that the “self-service” people can actually build their reports from there, I think the same thing applies in the ETL space as well. There’s Power Query for that light-weight, self-serve ETL, but there’s still the need for traditional ETL development as well for IT to automate these processes, make them reliable, do the complex transformations, apply business logic, apply filtering, etc. I think there’s going to be that “professional” or “corporate” ETL as well as self-serve ETL. The challenge for us is figuring out whether that is a single tool that does both; perhaps a single tool with different faces or personas, for different roles. I think we’re going to see a lot of convergence in our tools going forward. I think one of Microsoft’s strengths is the rapid time to results, making it as easy as possible to get it, and also have that functionality there that you can extend to do the more complex ETL scenarios as well.

Mark V

One of the other things you’re really known for is the BI Power Hour. Can you talk a little bit about how that was born and how it’s evolved and what it’s like to be a part of something like that?

Matt

Sure. The BI Power Hour is really interesting and I was nowhere near the beginning of it. I think it was Bob Baker who started the original Power Hour and it was focused around Office BI. And then the SQL folks eventually took over. But the idea was to let the Product Team have fun and show off the power of the products in your non-typical scenarios, with no business value whatsoever. And we’ve sort of made it more and more ridiculous as time goes on. There are certain teams, like Reporting Services, that have always been there since the beginning, and they always did a game. Every year they did a game. I think they did Tic Tac Toe, and then Hangman; the game got more and more complex as they went through the years. I think I saw my first Power Hour in 2009 and I immediately wanted to be a part of it. I had never seen one before and I just thought it was really exciting. And the next year, I asked the organizer, Pej Javaheri, if I could participate. He wasn’t sure; “SSIS doesn’t usually do a Power Hour” and “it’s not very interesting.” So, I decided to prove him wrong. Since Pej left Microsoft, I’ve taken over the Power Hour. I do most of the coordinating and stuff. It’s always really interesting to make sure there is a business message there. We’re not as explicit about it anymore. But, afterwards, we always have people coming up to us and saying, “I didn’t know the tools could do that” and “I want to know more.” That’s really the whole point, essentially. And if we can get laughs doing it, then that’s even better. We usually try to balance out presenters showing new technology, show off some valuable things. I typically just do ridiculous demos. I have a whole story that goes along with it. It’s a lot of fun. The hardest part is justifying the days of work that goes into a ten minute demo.

Mark V

It’s really exciting to see people who were involved in building the tools and are just so excited about features getting to go play with them.

Matt

With my demos, which usually revolve around cats, I had spent some time in SSIS and built some custom transformations. I’ve had someone ask me afterwards, “Why do you spend so much time on this? Why aren’t you doing work for the real product?” Yeah… it is a good point, but usually I limit Power Hour stuff to my “free time.” So flights, at home, things like that is usually when I work on those things. I try to really time box it, to justify to myself, devoting time to this really fun thing.

Mark V

When I saw you at TechEd and you were talking about the SSIS Catalog, one of the things you said was that there was some debate within Microsoft regarding the Package Deployment Model and the new Project Deployment Model. Even within the team, people were arguing about which way to go, and you were finally brought around to the Project Deployment Model. Is that something that is common when you are getting features ready for a product that you have that kind of debate? Is there a lot of that?

Matt

Yes, there’s a LOT of debate. The bigger the team, the more debate there is. 2012 was really interesting because that was as big as the SSIS team has really been. We actually had half our team located in Shanghai and they were really driving the Server components. And half our team located in Redmond. So, doing the coordination and making sure both teams agreed on the scenarios of what we were trying to go toward was really important. Doing development is all about resource constraints, right? You have a ton of stuff you want to do and you have to figure out, “Where is my time best spent?” Sometimes you’re making guesses. If you only do exactly what the customers want, you’re not necessarily moving your platform forward far enough. If we only focused on bug fixing, we probably wouldn’t have gotten a lot of the great functionality that we did out of 2012.

Mark V

…And the rounded corners…

Matt

Well, the rounded corners, yeah. Actually the rounded corners joke was just a random Power Hour joke that I just came up with on the fly. I’ve been using it since. Although I was in somebody’s session and they spent ten minutes building up that joke and it was really painful to watch. But the rounded corners was just WPF, that’s just the way it looked. But I made the joke about Interns coming in and sanding down the corners for three months. And I actually had an angry customer come up to me afterwards and say, “You guys spent three months working on rounded corners and yet you didn’t fix the Web Services Task” and storm off. “It was a JOKE!” At PASS, people usually get that something’s a joke. At Tech Ed, people expect Microsoft presenters to be more serious and jokes don’t always go over well.

Mark V

Even at a BI Power Hour?

Matt

When I did my first BI Power Hour at Tech Ed, I got a standing ovation when I did some of my lines, not because it was a great presentation, but I think the line was “I’m a programmer. What do I need real friends for when I can create them programmatically?” Standing ovation. And it wasn’t because it was funny. It was because the audience felt the same way. And I just felt really sad at that point. And the next day, I had people coming up to me offering to be my friend and saying, “I don’t have any friends on Facebook either. I had to stop using it.” And they just didn’t get that it was a joke. I did my Power Hour at the Boston user group and nobody laughed. There were some chuckles, but that was it. But then I realized afterwards, when I was talking with somebody else, that the audience actually thought it was real and that they felt sorry for me. So, they didn’t know they were supposed to laugh.

Back to planning. There are definitely different viewpoints on the team. One thing was related to Package Deployment versus Project Deployment. Every time you change functionality, but keep supporting a feature, your Test Matrix increases. So, the number of scenarios you have to test goes up. And we were really short on Test resources. And you can’t release something unless it’s properly tested. So, at one point, they wanted to say “No more Package Deployment Model; we’re just going to do Project because it means we can add more functionality because we’re not supporting these other things anymore.” It just did not make sense to take that approach. I think the thing I had mentioned at Tech Ed was Single Package Deployment versus Full Package deployment. Long debates. But it came down to the architectural difference. We showed how much it would cost to implement Single Package Deployment and how much it would cost without. If it’s an extra month in development time, how many bugs can we fix in a month? How many other improvements can we make in a month? So, it’s a balancing act. I still think it’s the right decision. At the same time that we’re making those decisions internally, we’re talking to our MVPs, getting their feedback. I know the MVPs felt really strongly about Project Deployment, keeping it all together. And we were trusting in that. They’re basically the voice of our customers.

Wrapping Up

With Matt being so busy, and prepping for a session, I left the interview off there.

I have only had the chance to use SSIS 2012 on one project. And even with that small taste of this fabulous tool, I was tempted to just give Matt some applause and call it a day. I really appreciate the work and time that went into making SSIS 2012 such a tremendous improvement over previous versions of Integration Services.

I think Matt made some really great points here. The Big Data revolution was certainly a “disruptor” to common ETL. When dealing with data that is aging too quickly or in quantities that make taking the time to bring it into a data warehouse impractical, that certainly would disrupt common thinking around traditional ETL. While, as Matt points out, the need for traditional ETL will remain, there is some need on the part of those of us in the industry to re-assess what ETL looks like in some cases. It’s not always going to be a series of SSIS packages running on a server and populating a data warehouse. Sometimes, it will be information workers using Power Query to bring data from many sources into Excel.

As far as the Power Hour, that holds so many features that I strive to put into my own presentations. Humor is a huge one. There is a lot of research that shows that people learn better when they are having fun. Not to mention that an audience that is having a good time is less likely to throw rotten tomatoes; they stain, you know. Combine that with using features of the tools in creative ways, and you’ve really got something. I love finding new and exciting uses for technology. I often think of Ed Harris’ great line as NASA’s Gene Kranz in Apollo 13, “I don’t care what anything was DESIGNED to do; I care what it CAN do.”

I liked hearing from Matt that there is often a lot of debate within the SSIS team when it comes to features. It should remind all of us of time spent on project teams in our own work. The point this raises is that we need to remember that Microsoft, like any other organization, has finite resources that need to be spent in the best way they can. I hope we can all keep that in mind when we wonder why certain features haven’t gotten much love or don’t work the way we would want them to.

Matt’s point about MVPs is an important one. Along with what prestige may come from receiving the MVP award, there is also responsibility to serve as a voice for the Community as a whole. Being an MVP is not about getting to wear that MVP ribbon at Summit or a pretty trophy; it’s about leadership, with benefits and obligations along with it.

That brings us to the end. Even though my second interview was with Kamal Hathi, that happens to be the longest one as well. Since I have the typing skills of a rainbow trout, transcribing the audio for these interviews is a long process. As such, I will aim to have the post on my interview with Kasper de Jonge (Blog|Twitter) next week and the one with Kamal the week after. Thanks for your patience.