Category: Azure

Processing Azure Analysis Services Using Azure Automation

Recently, I was working with a customer who was deploying a Tabular model to Azure Analysis Services. They were wondering how they would automate the processing of that model. There are certainly a few ways to do that. You can do it with C# or JavaScript via Azure Functions, as described in this post by Orsolya Gal. Since my background is not in C#, and the customer didn’t have that background either, I decided to keep that option on a shelf for later.
 
I have done some PowerShell and I have worked with Azure Automation, so that seemed like a better place to start for me. Two posts in particular were helpful here:
 
Schedule Process Azure Analysis Services Database by the folks at microsoft-bitools.blogspot.com
 
Automation of Azure Analysis Services with Service Principals and PowerShell by Christian Wade, Senior Program Manager on the Analysis Services product team
 
My own solution ended up combining features from both of the above walk-throughs. My goal with this post is not only to walk through my own solution, but to also share, at each stage, why I made the choices I made. Note that I am still very much a student of PowerShell, so experts in that area will likely know of better ways to accomplish what I did. If you are a PowerShell guru and see where my scripts could be improved upon, I would love for you to share that in comments so I (and anyone else reading this) can learn. Mmmmm… learning…
 
Note: If you put all of this PowerShell into a single script file, you can declare each variable a single time and use it as many times as you need. For example, you will be entering a Password and storing it in a secure credential. You will need that in more than one place for this solution. If you have separate scripts for each location, you will have to enter that password more than once and increase the chance of having them not match, which will cause errors. I will proceed as though we are putting all of the PowerShell in a single script file to avoid having to repeat declarations and value assignments for PowerShell variables involved.
 

Creating Azure Active Directory Service Principal Using PowerShell

As Christian points out in his post above, Service Principals are essentially the Azure implementation of the Service Accounts we have on-premises. They allow applications and services to authenticate and access resources without manual intervention. While you can create service principals in the Azure Portal, I chose to go with PowerShell so that I could save the script and easily copy it for use again and again. In order for this to work, it is necessary to install the Azure PowerShell modules on the machine from which you want to run the scripts. In my case, it is my Surface Book. I, personally, use the Windows PowerShell ISE (Integrated Scripting Environment) that is native to Windows 10. Once you have that open, you can use the Login-AzureRmAccount cmdlet to log into Azure so you can interact with the objects, etc., in the Azure subscriptions you have access to.
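If you want to script that first step, here is a minimal sketch. It assumes the AzureRM modules are available from the PowerShell Gallery and that the Tenant ID is exposed as Tenant.Id on the current context (the property name has shifted a bit across AzureRM versions):

# One-time install of the AzureRM modules, if you do not already have them
# Install-Module -Name AzureRM -Scope CurrentUser

# Log into Azure; the output includes your Tenant ID
Login-AzureRmAccount

# Optionally capture the Tenant ID from the current context for later use
$tenantID = (Get-AzureRmContext).Tenant.Id
Write-Output "Tenant ID: $tenantID"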
 
Once you log in, the output will include your Tenant ID. Capture that value for later.
 
image
 
From here, we will just assume you have done that.
 
$subscriptionID = "<your subscription ID GUID>"

$rootNameValue = "mvazureasdemo"

$servicePrincipalName = "$($rootNameValue)serviceprincipal"
$servicePrincipalPassword = Read-Host `
-Prompt "Enter Password to use for Service Principal" `
-AsSecureString

Write-Output "Selecting Azure Subscription..."
Select-AzureRmSubscription -Subscriptionid $subscriptionID

I started by creating a few variables and assigning them values. I will end up specifically selecting which Azure subscription I want to work in. By specifying this, I can make sure I am creating and working with objects in the right place. To find your subscription ID, you can follow these steps in the Azure Portal.
 
I use the $rootNameValue as a prefix for all objects created as part of a solution. So, my server names, Azure Automation Assets (we’ll get to these in a bit), etc, all will start with this value. It helps me keep clear what goes with what. The $servicePrincipalName then derives its value by appending “serviceprincipal” onto the root name value.
 
The expression for the $servicePrincipalPassword will generate a box for me to securely enter in the password I want my Service Principal to have. This method allows me to make sure I don’t have something as sensitive as a password sitting in my code anywhere.
 
That last line above selects the subscription whose ID I put in the $subscriptionID variable.
 
# Get the list of service principals that have the value in 
# $servicePrincipalName as part of their names
$sp=Get-AzureRmADServicePrincipal `
-SearchString $servicePrincipalName

# Filter the list of just the ONE service principal with a DisplayName
# exactly matching $servicePrincipalName
$mysp = $sp | Where{$_.DisplayName -eq $servicePrincipalName}

# Isolate the ApplicationID of that service principal
$appSPN = $mysp.ApplicationId

This next section checks to see if a Service Principal already exists with the name I want to use. It is a little bit involved since I need to filter on a search string and not a specific name (the Get-AzureRmADServicePrincipal does not currently support filtering by DisplayName directly). So, I isolate all existing (if any) service principals that have my $servicePrincipalName value IN their names. For example, searching for “Foo” would return both “Fool” and “Foot.” Then I filter that list for the one that has a DisplayName property that matches my $servicePrincipalName value exactly. I then grab the ApplicationID property of THAT principal for use later.
 
# Create a service principal if it does not exist
Write-Output "Creating Service Principal $servicePrincipalName..."

try
{
    If(!($appSPN))
    {
        New-AzureRmADServicePrincipal `
            -DisplayName $servicePrincipalName `
            -Password $servicePrincipalPassword

        # Sleep for 20 seconds to make sure principal is propagated in Azure AD
        Sleep 20
    }
    Else
    {
        Write-Output "Service Principal $servicePrincipalName already exists."
    }
}
catch
{
    Write-Error -Message $_.Exception
    throw $_.Exception
}

In the code section above, I first check to see if my $appSPN variable has a value. If the Service Principal I want to create already exists, then its ApplicationID property value will be in that variable. If that variable is empty, I go ahead and create the Service Principal using the value in my $servicePrincipalName variable for the DisplayName and the password in my $servicePrincipalPassword variable for the password. Service Principals can be created with certificates as well. I chose to use the password for now for simplicity. If the Service Principal already exists, I skip the creation and just return a message indicating it already exists. This means I can run my script multiple times without causing issues or generating errors from trying to create an object that already exists.
 
If the creation step DOES execute, my script waits 20 seconds to make sure that new Service Principal is alive and well in Azure Active Directory before taking my next action below.
 
# Assigning Service Principal $servicePrincipalName to Contributor role
Write-Output "Assigning Service Principal $servicePrincipalName to Contributor role..."

try
{
    If(!(Get-AzureRmRoleAssignment `
            -RoleDefinitionName Contributor `
            -ServicePrincipalName $appSPN `
            -ErrorAction SilentlyContinue))
    {
        New-AzureRmRoleAssignment `
            -RoleDefinitionName Contributor `
            -ServicePrincipalName $appSPN
    }
    Else
    {
        Write-Output "Service Principal $servicePrincipalName already assigned to Contributor role."
    }
}
catch
{
    Write-Error -Message $_.Exception
    throw $_.Exception
}

 
In the section of code above, I check to see if my Service Principal has already been assigned to the Contributor role in my Azure Subscription. If so, I just return a message to that effect. If not, I add it. Notice again that this prevents my script from trying to create something that already exists.
 
Once creation of the Service Principal completes, the output will provide you with the Application ID GUID value. Take note of that as we will need it shortly.
 
image
 

Assign Service Principal to Administrator Role on Azure Analysis Services Server

The newly created service principal needs to be added to the Administrators role on the server via the Security tab in Server Properties. This provides it with the necessary rights to perform Processing operations. To add the principal, use the account picker, or do what I did: create the string in a text editor and paste it in:
 
app:<Application ID>@<Azure tenant ID>

image
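If you would rather script this step too, the AzureRM.AnalysisServices module has a Set-AzureRmAnalysisServicesServer cmdlet with an -Administrator parameter. The sketch below is an assumption-heavy illustration: the server name, resource group, and admin values are placeholders, and the parameter replaces the entire administrator list, so include any existing admins.

# Hypothetical values; replace with your own Application ID, tenant ID,
# server name, and resource group
$appID = "<Application ID GUID>"
$tenantID = "<Azure tenant ID GUID>"

# -Administrator takes a comma-separated list and replaces the existing list,
# so keep any current admins in the string as well
Set-AzureRmAnalysisServicesServer `
-ResourceGroupName "<AS server resource group>" `
-Name "<AS server name>" `
-Administrator "<existing admin UPN>,app:$($appID)@$($tenantID)"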
 

Azure Automation Modules

If you don’t already have an Azure Automation account, you can follow the steps here to create one. PowerShell cmdlets are grouped together in Modules that share a common context. For example, cmdlets intended for the management of SQL Server are grouped together in the SqlServer module. Whether it be on your computer for use in the PowerShell ISE or in Azure Automation, the use of cmdlets is only possible when the Modules that contain those cmdlets are installed. You can read more about installing Modules here. For the purposes of processing Azure Analysis Services models with Azure Automation, it will be necessary to install the Azure.AnalysisServices and SqlServer modules in your Automation account. Note that each of those pages includes a Deploy to Azure Automation button.

Azure Automation Basics

Azure Automation allows you to execute scripts against your Azure environment. The components that perform the work, that contain the scripts, are called Runbooks. There are a few different scripting languages supported (including Python, which is now in Preview). For this process, we will use PowerShell Runbooks.

Often, Runbooks will have various parameters or values that are needed at runtime. For example, if you have a Runbook that boots up a virtual machine, you will need to tell it WHICH virtual machine to start, etc. These values can be passed into the Runbooks at runtime as Parameters, OR you can store these values in Assets like Variables and Credentials. For this process, we will be using one Variable Asset and one Credential Asset. Assets allow you to store values once but make those values available across all of your Runbooks, etc. Super cool.

Azure Automation Variable Asset

One value we will need to pass to our Runbook for processing our Analysis Services model is the Tenant ID of our Azure Subscription. This is a GUID that you are welcome to type/paste in every time you want to use it. Or you can hard-code it into your Runbooks that need it. But there is a far better way. You can store it in a Variable Asset. The instructions there show you how to create the variable via the Azure Portal. Since we are already doing PowerShell, let’s stick with it.

# Create function to create variable assets
function CreateVariableAsset
{
    param
    (
        [string] $variableName
        , [string] $variableValue
    )

    Write-Host "Creating Variable Asset $variableName..."

    try
    {
        If(!(Get-AzureRmAutomationVariable `
                -AutomationAccountName $automationAccountName `
                -ResourceGroupName $automationResourceGroup `
                -Name $variableName `
                -ErrorAction SilentlyContinue))
        {
            New-AzureRmAutomationVariable `
                -AutomationAccountName $automationAccountName `
                -ResourceGroupName $automationResourceGroup `
                -Name $variableName `
                -Value $variableValue `
                -Encrypted $false `
                -ErrorAction Stop

            Write-Host "$variableName created successfully. `r`n"
        }
        Else
        {
            Write-Host "Variable Asset $variableName already exists. `r`n"
        }
    }
    catch
    {
        Write-Error -Message $_.Exception
        throw $_.Exception
    }
}

The PowerShell above creates a Function that includes the necessary script to create an Azure Automation Variable Asset. Creating functions can be super helpful when you need to run the same block of code several times, with only minor changes to parameter values, etc. I created this function when building out a solution that involved a couple dozen Variables. Rather than repeat this block of code a couple dozen times, I created it once and then called it for each Variable I needed to create. This makes for much cleaner and easier-to-manage code.

$automationAccountName = "AutomationB"
$automationResourceGroup = "AutomationBRG"
$variableTenantIDName = "$($rootNameValue)tenantid"
$variableTenantIDValue = "<your Tenant ID GUID>"

First, we need to create a few more PowerShell variables. The first two specify the name and resource group of the Azure Automation account we are using. When creating Azure Automation components via PowerShell, you must specify which Automation account will house the objects, as you can have more than one Automation account.

The next two variables provide the Name and Value of our Variable Asset, respectively.

CreateVariableAsset `
-variableName $variableTenantIDName `
-variableValue $variableTenantIDValue

The block above calls our function to create the Variable Asset, if it does not already exist.

image

Hazzah!

Azure Automation Credential Asset

Since we are going to automate the use of a Service Principal with a password, we want to store that password securely so that our Runbook can get to it without additional intervention. To do that, we use a Credential Asset. That link provides instructions on creating a Credential Asset via the Azure Portal.

# Create function to create credential assets
function CreateCredentialAsset
{
    param
    (
        [string] $credentialName
        , [PSCredential] $credentialValue
    )

    Write-Host "Creating Credential Asset $credentialName..."

    try
    {
        If(!(Get-AzureRmAutomationCredential `
                -AutomationAccountName $automationAccountName `
                -ResourceGroupName $automationResourceGroup `
                -Name $credentialName `
                -ErrorAction SilentlyContinue))
        {
            New-AzureRmAutomationCredential `
                -AutomationAccountName $automationAccountName `
                -ResourceGroupName $automationResourceGroup `
                -Name $credentialName `
                -Value $credentialValue `
                -ErrorAction Stop

            Write-Host "$credentialName created successfully. `r`n"
        }
        Else
        {
            Write-Host "Credential Asset $credentialName already exists. `r`n"
        }
    }
    catch
    {
        Write-Error -Message $_.Exception
        throw $_.Exception
    }
}

I once again create a Function for the Asset creation.

$credentialServicePrincipalName = "$($rootNameValue)serviceprincipal"
$credentialServicePrincipalUserNameValue = "<your Application ID GUID>"

We need to create two more PowerShell variables; one for the Credential Asset name, another for the Credential Asset User Name value. We will need a Password but we already have that variable from when we created the Service Principal above. We can reuse that same variable so we don’t have to enter in a password again.

# Create a PSCredential object using the User Name and Password
$credentialServicePrincipalValue = `
New-Object System.Management.Automation.PSCredential`
(
$credentialServicePrincipalUserNameValue `
, $servicePrincipalPassword
)


# Call function to create credential assets
CreateCredentialAsset `
-credentialName $credentialServicePrincipalName `
-credentialValue $credentialServicePrincipalValue

We have to do one extra step here. We start by taking our User Name (Application ID) and Password and using them to create a PSCredential object. Then we call our function to create our Credential Asset.

image

More Hazzah!

Azure Automation Runbook

As I stated before, Runbooks contain the scripts that perform the work in Azure Automation. You can see how to create a Runbook in the Azure Portal. We will create our Runbook with PowerShell. However, we will not be using PowerShell to actually put the script inside the Runbook. You can do that with PowerShell by importing a saved .ps1 file (in the case of a PowerShell Runbook) that contains the script you want the Runbook to run. I decided that was more complex than I wanted to get here. Besides, the steps after that take place in the Portal anyway. So, we will create the Runbook using PowerShell and then go to the Azure Portal to edit it and paste in our script.
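For completeness, here is what the import route might look like with the Import-AzureRmAutomationRunbook cmdlet. This is a hedged sketch only: the file path is hypothetical, the Runbook name comes from the file name in this form, and the exact parameter set can vary by module version.

# Hypothetical path to a saved script file; imports it as a PowerShell Runbook
Import-AzureRmAutomationRunbook `
-AutomationAccountName $automationAccountName `
-ResourceGroupName $automationResourceGroup `
-Path "C:\Scripts\mvazureasdemoprocessasdatabase.ps1" `
-Type PowerShell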

$runbookName = "$($rootNameValue)processasdatabase"

We start by declaring a PowerShell variable to hold the name for our Runbook.

# Create empty Runbook
Write-Host "Creating Runbook $runbookName..."

try
{
    If(!(Get-AzureRmAutomationRunbook `
            -AutomationAccountName $automationAccountName `
            -ResourceGroupName $automationResourceGroup `
            -Name $runbookName `
            -ErrorAction SilentlyContinue))
    {
        New-AzureRmAutomationRunbook `
            -AutomationAccountName $automationAccountName `
            -ResourceGroupName $automationResourceGroup `
            -Name $runbookName `
            -Type PowerShell `
            -ErrorAction Stop

        Write-Host "Runbook $runbookName created successfully. `r`n"
    }
    Else
    {
        Write-Host "Runbook $runbookName already exists. `r`n"
    }
}
catch
{
    Write-Error -Message $_.Exception
    throw $_.Exception
}

Here we create the Runbook, using our shiny new $runbookName PowerShell variable along with a couple we declared previously.

image

Hazzah!

When we go to the Runbooks blade of our Automation account in the Azure Portal, we see our shiny new Runbook. It still has that new Runbook smell.

image

Clicking on the name of the Runbook will open the Runbook blade. Here we click the Edit button.

image

Then we paste our PowerShell script into the window.

# Get the values stored in the Assets
$TenantId = Get-AutomationVariable -Name "mvazureasdemotenantid"
$Credential = Get-AutomationPSCredential -Name "mvazureasdemoserviceprincipal"

# Log in to Azure Analysis Services using the Azure AD Service Principal
Add-AzureAnalysisServicesAccount `
-Credential $Credential `
-ServicePrincipal `
-TenantId $TenantId `
-RolloutEnvironment "southcentralus.asazure.windows.net"

# Perform a Process Full on the Azure Analysis Services database
Invoke-ProcessASDatabase `
-Server "asazure://southcentralus.asazure.windows.net/mvazureasdemoserver" `
-DatabaseName "AzureASDemo" `
-RefreshType "Full"

Once we do, it should look like the figure below.

image

There are a few things to note here.

1. The names in the top section of the runbook must match the names you used when creating the Assets.

2. I am using the Add-AzureAnalysisServicesAccount cmdlet instead of the Login-AzureAsAccount cmdlet Christian used. I had trouble making that one work and more trouble trying to find documentation on it. So, I punted and used a different one. You can find documentation on the Add-AzureAnalysisServicesAccount cmdlet here. This is used to pass our Service Principal into the Azure Analysis Services server to connect to it.

3. Notice that my Azure Analysis Services server name matches my naming convention for this entire solution. Yeah. I know. I’m pretty cool. Yours does not have to match. Mine only does since I created it solely for this demo/post. This step performs the processing operation on the database. In this example, I am performing a Process Full on the entire database. You have many other options as well. You can find more information on the Invoke-ProcessASDatabase cmdlet here. On that page, the left sidebar includes other cmdlets related to processing Analysis Services objects.
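For example, if you wanted to load data without recalculating dependent structures, the same cmdlet accepts other refresh types. Here is a hedged sketch using DataOnly, with the same server and database names as above:

# Load data only, without recalculation, for the same database
Invoke-ProcessASDatabase `
-Server "asazure://southcentralus.asazure.windows.net/mvazureasdemoserver" `
-DatabaseName "AzureASDemo" `
-RefreshType "DataOnly"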

When you have finished reveling in the awesomeness of this Runbook, click on the Test Pane button, shown by the arrow in the image above.

image

If our Runbook had actual parameters, they would show up in the Parameters section and we would be able to enter values for them. Since our Runbook is configured to just go grab what it needs from Assets, all we have to do is click Start.

image

Our Test run will go into queue to await its turn to run. Since we do not each have our own dedicated capacity in Azure Automation, each job needs to wait for a worker to pick it up to execute it. I have not seen the wait exceed ten seconds, personally. But note that starting a Runbook is not an instantaneous process. You can read more about Runbook execution in Azure Automation here.

image

Once a worker has picked it up, the status will change to Running.

image

Victory! Note that although this was a Test Run in the context of Azure Automation, this ACTUALLY PERFORMS THE ACTIONS in the Runbook. It does not “pretend” to call the Runbook. It executes it. In full.

You can check the status of the database via its Properties in SQL Server Management Studio.

image

The Last Data Refresh status should closely match the time you ran the Test Run.

Close the Test Pane via the X in the upper right-hand corner to return to the Edit pane.

image

Click Publish. This will fully operationalize your Runbook, replacing the version of the Runbook that exists if it was published previously. Once it is published, we can start it manually or give it a Schedule.

image

Note: If you just want to look at the contents of the Runbook, use the View button. Entering Edit mode on a Published Runbook will change its status to In Edit, and it will not execute. If you do enter Edit mode by accident, and don’t want to change anything, choose the Revert to Published button to exit Edit mode and return the Runbook to the Published state.

Click the Schedule button.

image

You can modify Parameter settings, etc, from the bottom option. In our case here, our Runbook has no parameters, so we can click to Link a schedule to our Runbook.

image

This will launch the Schedule blade. If you already have existing Schedules in your Automation Account, like “Every Monday at 6 AM” or something, those will show up here and you can choose to tack your Runbook execution onto one of those. I will just create a new one.

image

Give your schedule a meaningful name, ideally, one more meaningful than what I chose above. Note that you have to make sure the schedule is set to kick off at least 5 minutes into the future. Also, for this purpose, I am just choosing to have this execute Once.
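As an aside, the schedule and the link to the Runbook can also be scripted. Here is a minimal sketch using the AzureRM.Automation cmdlets and the variables declared earlier in this post (the schedule name itself is my own assumption):

# Create a one-time schedule starting ten minutes from now
$scheduleName = "$($rootNameValue)processschedule"

New-AzureRmAutomationSchedule `
-AutomationAccountName $automationAccountName `
-ResourceGroupName $automationResourceGroup `
-Name $scheduleName `
-StartTime (Get-Date).AddMinutes(10) `
-OneTime

# Link the schedule to the Runbook
Register-AzureRmAutomationScheduledRunbook `
-AutomationAccountName $automationAccountName `
-ResourceGroupName $automationResourceGroup `
-RunbookName $runbookName `
-ScheduleName $scheduleName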

Now we wait. Or, rather I wait, I guess. Anyway, back in a sec. Talk amongst yourselves. Here’s a topic: If Wile E Coyote could buy all that ACME stuff to try to catch the Roadrunner, why couldn’t he just buy some food?

And we’re back.

image

We can check the status of our scheduled Runbook execution by clicking on the Jobs button in the Runbook blade.

image

We can see it completed successfully. Hazzah! If we click on it, we can view more details.

image

The Job Status is Completed. There are no Errors or Warnings. We can click on the Output to see the actual job Output, just like in the Test Pane.

image

Hazzah!
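If you would rather check on jobs without clicking through the Portal, a quick hedged sketch using Get-AzureRmAutomationJob (again reusing the variables from earlier) would look something like this:

# List recent jobs for the Runbook and show their status and timing
Get-AzureRmAutomationJob `
-AutomationAccountName $automationAccountName `
-ResourceGroupName $automationResourceGroup `
-RunbookName $runbookName |
Select-Object Status, StartTime, EndTime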

image

And we can see in the Database properties that the Last Data Refresh matches the time of our scheduled Runbook execution.

Hazzah! Hazzah!

Wrapping Up

There you have it. We just automated the processing of an Azure Analysis Services database using an Azure Active Directory Service Principal, PowerShell, and Azure Automation. Although this post is a bit long, the actual steps to make this happen aren’t very difficult. Hopefully, this will help walk you through trying this yourself.

NEW Elastic Database Jobs: SQL Agent For Azure SQL Database

Azure SQL Database is part of a strong, and strengthening overall Data story in Microsoft Azure. The flexibility of the platform, the scalability, both up and out, and the fact that so much functionality is just included all combine to show Azure SQL Database as a great offering. One thing that has been missing that the on-prem (or Azure VM) world continues to enjoy is SQL Server Agent. SQL Agent has long been used for the scheduling and execution of database maintenance tasks relating to indexes, statistics, and on and on. You’ve certainly been able to perform tasks against Azure SQL using PowerShell, Azure Automation, etc. But that familiar ability to schedule the execution of plain old TSQL using plain old TSQL just wasn’t there.

Debra Dove, Principal Program Manager on the SQL Engineering team at Microsoft, announced at Ignite 2016 in September that a shiny NEW Elastic Database Jobs was entering Private Preview. Note that there is an earlier version of Elastic Database Jobs currently in public preview in the Azure Portal. This new version I am showing today will be the path going forward and will replace that current version in the portal. The NEW Elastic Jobs does not include a Portal experience at the time of this writing.

The new Elastic Database Jobs are designed to echo the functionality that folks working with SQL Server are accustomed to on-prem with SQL Agent. But it’s even better than that. There are many features that are just baked in that you no longer have to worry about. I’ve presented on the new Elastic Jobs as part of a larger presentation on the overall Elastic tools associated with Azure SQL Database a handful of times. That presentation is called Azure SQL Database Elastic Boogie and references Marcia Griffiths’ hit song Electric Boogie (The Electric Slide). Yeah. I know. That will explain the use of the word boogie all over the place.

Even with it just being a very new private preview, my experience has been a great one. Huge kudos to Debra and her team on that.

Note: There are several offerings around Azure SQL Database that include the word “Elastic” in their name: Elastic Database Pools, Elastic Database Query, the Elastic Scale Client .Net library, and Elastic Database Jobs. They all include “Elastic” in order to note how they help you with flexibility in a scaled out/up data tier in Azure SQL Database. You CAN use them all together; it often makes sense to do so. But none of them are requirements for the others. For example, your databases DO NOT need to be in Elastic Database Pools to take advantage of Elastic Database Query, the Elastic Scale Client, or the NEW Elastic Database Jobs.

If you would like to participate in the Private Preview, then feel free to skip down to the Requesting Private Preview Access section at the bottom. I will understand if you want to bypass the transcendent experience of my walkthrough… I guess…

Let’s dig in.

High Level Features of Elastic Database Jobs

1. Asynchronous execution of scripts

This is not meant for interactive querying. You can use Elastic Database Query (currently in public preview) for that. The scripts in your jobs will execute in the background against each target database.

2. Dynamic resolution of target databases

Elastic Database Jobs introduces a feature called a Target Group. When you create a job step, you specify the Target Group for which the job step should execute. Target groups can contain “include” or “exclude” references to Azure SQL Database Servers, Databases, and Elastic Database Pools. When using a Server or a Pool, the targets will be the Databases within those Servers or Pools. Since you can specifically create “exclude” entries in Target Groups, this lets you run your job against, for example, all databases on Server01 EXCEPT DatabaseA.

When the job runs, it starts by querying the entries in the Target Group in order to determine which databases should be targeted. It does this with every execution to make sure the list of targets is accurate at that moment. This is a fantastic feature. In the future, the team plans to allow you to target Resource Groups and Tags as well. I KNOW! RIGHT?! So cool.

Give or take certain special cases around data sovereignty (like Germany and China, for examples), you can execute jobs against databases in any Azure region using a single Elastic Jobs account.

3. Management using TSQL

Currently, during the private preview, you can only manage Jobs using TSQL. However, in the future, they plan to add the ability to do so via REST APIs, PowerShell, and the Portal. REMINDER: The Elastic Database Jobs offering available in the Azure Portal at the time of this writing is the older version, not this new version.

4. Script execution guarantees

The Elastic Jobs engine guarantees “at least once” execution against all target databases. So, if a connection to a particular target database fails, the engine itself includes configurable retry logic to keep making attempts to connect to that database to execute the job script.

The engine also guarantees that each job will only support one concurrent execution at a time. Suppose you have a job that typically executes in 30 seconds and you run that job every five minutes. If the execution of that job against DatabaseA is taking longer than expected and is still running when the five minutes pass and the next execution should begin, the engine will skip DatabaseA and make note in the execution history that a previous execution is still running.

5. Reliable collection of query results to a single table (no duplicates)

You can execute a job that will query all the target databases and bring the results back to a single table for use in reporting, etc. Even if that job must retry against a particular database, it will ensure that no duplicates from multiple attempts are saved to the final destination table. Hereafter, I will refer to the database you use for this purpose as the Jobs Results DB. You can write to a database you use for other purposes as well. I just keep mine dedicated for organizational purposes.

This is a fantastic way to pull information from a scaled out data tier into a central repository for reporting and analytics scenarios.

A Quick Note About Scale

We have teams within Microsoft that rely on this technology to stay within strict reporting SLAs. Those teams have a large infrastructure they use with Azure to manage their business. The Elastic abilities within Azure SQL Database in general and Elastic Database Jobs in particular serve them well. We also have a customer using Elastic Database Jobs to manage a data tier of more than 130,000 Azure SQL Databases. This technology is HUGE for them.

Elastic Jobs Management Database

Once the private preview is activated for your Azure subscription (I will cover this at the end), you will execute some PowerShell to create an Azure SQL Database for the orchestration of Elastic Database Jobs. You can liken this to the role that the msdb system database plays in the on-prem SQL Server world.

This database must be an S2 (currently $75/month) or higher. So, if you want to play with this using Azure credits from an MSDN subscription, you will likely want to create this only when you will be working on it and destroy it when you are done. For the private preview, that is the only cost you really have. You pay for the database, and that’s it. I don’t have information to share on what the pricing will look like at general availability.

Note that you do not need to have the management database, which I will hereafter refer to as the Jobs DB, on a dedicated server. I just chose to do so in my own environment for organizational purposes.

In the Jobs DB you will notice some familiar stored procedures, as shown in my Jobs DB in Figure 1.

Figure 1

image

In Figure 1, you can see sp_add_job, sp_add_job_step, etc., just like in msdb. You use these stored procedures for creating and managing your jobs, job steps, etc.

This database also contains a view you can use for querying the history of job executions, aptly named job_executions. You can see this along with some other useful views in Figure 2.

Figure 2

image

Security Setup

For Elastic Jobs to work properly, there are some credentials, logins, and users that must be created.

First, in the context of the Job DB, you need to create a Database Master Key and two database scoped credentials.

 
-- Create a db master key using your own password. 
CREATE MASTER KEY ENCRYPTION BY PASSWORD='NOT 12345';
GO

-- Create a database scoped credential for job execution.
CREATE DATABASE SCOPED CREDENTIAL elasticjobcred
WITH IDENTITY = 'jobcred', SECRET = 'NOT 12345';
GO

-- Create a database scoped credential for the master database of servers.
CREATE DATABASE SCOPED CREDENTIAL elasticmastercred
WITH IDENTITY = 'mastercred', SECRET = 'NOT 12345';
GO

The jobcred credential will be used for executing the jobs on the target databases. The mastercred credential is used for the dynamic determination of Target Group members on target Servers (including the servers housing target Elastic Pools).
 
Next, in the context of the master database on each server that houses databases that will be targets (including the server housing the Jobs Results DB), you need to create a login for the jobcred and mastercred as well as a user mapped to the mastercred login.
 
/* EACH TARGET SERVER (including the server to house the Jobs Results DB) */

/* master database */

CREATE LOGIN jobcred
WITH PASSWORD = 'same password used for jobcred secret';
GO

CREATE LOGIN mastercred
WITH PASSWORD = 'same password used for mastercred secret';
GO

CREATE USER mastercred
FOR LOGIN mastercred
WITH DEFAULT_SCHEMA = dbo;
GO

Then, on each database that will serve as a target, including the Jobs Results DB, create a jobcred user and add it to the role or roles appropriate for the tasks to be executed on that database.
 
/* each target database (including Jobs Results DB) */

CREATE USER jobcred
FOR LOGIN jobcred
WITH DEFAULT_SCHEMA = dbo;
GO

/* Add user to the database appropriate role(s) */
EXEC sp_addrolemember N'db_owner', N'jobcred';
GO
 
Note: I used db_owner merely for simplicity of my own demo environment. You can and should make sure the jobcred has only the rights necessary to perform the intended operations.

My Environment

I think it makes sense to quickly show the environment I use for this demo, which is shown in Figure 3.

Figure 3

image

My elasticjobsserver houses two databases: elasticjobsdb, which is my Jobs DB, and ElasticJobsResults, which is my Jobs Results DB, to which I save query results.
 
Boogieserver01 houses BoogieDB, and NoBoogieDB, both of which are empty.
 
Boogieserver02 houses BoogieDBIA, which contains data I fabricated about school children selling chocolate in Iowa. BoogieDBMN is the same for Minnesota and BoogieDBWI is for Wisconsin.
 
It is important for the walkthrough to note that I have databases on more than one server.
 

The Schema Update Walkthrough

Once the environment is ready to go, I need to create a Target Group. My group is called BoogieServerGroup.
 
-- Add a target group containing server(s)
EXEC jobs.sp_add_target_group 'BoogieServerGroup';

Then we need to add a member to this group, in this case boogieserver02.
 
-- Add a server target member
EXEC jobs.sp_add_target_group_member
'BoogieServerGroup'
, @target_type = 'SqlServer'
, @refresh_credential_name='elasticmastercred'
, @server_name='boogieserver02.database.windows.net';
 
You can see in the code above that the @target_type is SqlServer and I am specifying the master credential (elasticmastercred) that will be used for querying the list of databases on that server when jobs using this Target Group execute. We can view the members of that target group via the jobs.target_group_members view.
 
-- View members of target group
SELECT
target_group_name
, membership_type
, target_type
, server_name
FROM jobs.target_group_members
WHERE target_group_name='BoogieServerGroup';

This leads to the results in Figure 4.
 
Figure 4
 
image
 
Notice that this is an Include member of the SqlServer target type. As noted earlier, you can create Exclude members as well. Those will be skipped when the Target Group is used for job execution.
 
Now, it’s time to create a job. This job will house a step to create a new table, dbo.ElectricSlide, in all target databases.
 
-- Add job for create table
EXEC jobs.sp_add_job @job_name='CreateTableElectricSlide'
, @description='Create Table ElectricSlide';

Then we need to create a job step for that job. At this time in the private preview, each job can have a maximum of one job step and that job step must be TSQL. Given that this step could be executed against databases multiple times, it is important that the query be idempotent.
 
-- Add job step for create table
EXEC jobs.sp_add_jobstep
@job_name='CreateTableElectricSlide'
, @command = N'IF NOT EXISTS (SELECT * FROM sys.tables
WHERE object_id = object_id(''ElectricSlide''))
CREATE TABLE [dbo].[ElectricSlide]([ElectricSlideId] [int] NOT NULL);'
, @credential_name= 'elasticjobcred'
, @target_group_name='BoogieServerGroup';

We can view the job step definition.
 
-- View the job step definition
SELECT
[job_name]
, [step_id]
, [command_type]
, [command]
, [credential_name]
, [target_group_name]
FROM [jobs].[jobsteps]
WHERE job_name = 'CreateTableElectricSlide';

This produces the results in Figure 5.
 
Figure 5
 
image
 
A quick look at the Boogie databases on both servers shows none of them contain a table called dbo.ElectricSlide, as shown in Figure 6.
 
Figure 6
 
image
 
Let’s execute our job.
 
-- Execute the latest version of a job 
exec jobs.sp_start_job 'CreateTableElectricSlide'

Once our job is running, we can query the latest (in this case, currently running) execution of the job.
 
-- View latest execution
SELECT
is_active
, lifecycle
, last_message
, target_type
, target_group_name
, target_server_name
, target_database_name
FROM jobs.job_executions
WHERE job_name = 'CreateTableElectricSlide'
AND job_execution_id =
(
SELECT job_execution_id FROM jobs.job_executions
WHERE step_id IS NULL and create_time =
(
SELECT MAX(create_time)
FROM jobs.job_executions WHERE step_id IS NULL
)
)
ORDER BY start_time DESC;
GO

If we execute the above query immediately after starting the job, our results look like Figure 7.
 
Figure 7
 
image
 
Notice that the bottom row is WaitingForChildJobExecutions. In this case, the one child job running is the first row with the target_type of TargetGroup and the target_group_name of BoogieServerGroup. This is the child job that is querying the master database on boogieserver02, the only member of BoogieServerGroup, to find the databases on which to run the script in the job step.
 
After a few seconds, running that same query produces the results in Figure 8.
 
Figure 8
 
image
 
The list of databases has been determined and a child job execution has been created for each target database.
 
Once the job completes, the results of that query look like Figure 9.
 
Figure 9
 
image
 
In this case, all child job executions have succeeded, and therefore the overall job execution has succeeded.
 
If we look again at our 4 Boogie databases (3 on boogieserver02 and 1 on boogieserver01), we see what is in Figure 10.
 
Figure 10
 
image
 
All of the databases on boogieserver02 now have a dbo.ElectricSlide table. BoogieDB, since it is not on boogieserver02, does not.
 
Let’s add the BoogieDB database on boogieserver01, but only that database, to our BoogieServerGroup target group.
 
-- Add a database target member
EXEC jobs.sp_add_target_group_member
'BoogieServerGroup'
, @target_type = 'SqlDatabase'
, @server_name='boogieserver01.database.windows.net'
, @database_name='BoogieDB'
 
Once we execute this, we can query our BoogieServerGroup again to list its members.
 
--View the recently added database in target group
SELECT
target_group_name
, membership_type
, target_type
, server_name
, [database_name]
FROM jobs.target_group_members
WHERE target_group_name='BoogieServerGroup';

The results now look like Figure 11.
 
Figure 11
 
image
 
Notice we have two different target_type values, SqlServer for boogieserver02 and SqlDatabase for the row for the BoogieDB database member. This shows that the same target group can contain members of multiple types.
 
Let’s execute our job again.
 
-- Execute the latest version of a job 
exec jobs.sp_start_job 'CreateTableElectricSlide';
 
This time, I’ll just show the final results of the job execution. See Figure 12.
 
Figure 12
 
image
 
Notice that only the BoogieDB database from boogieserver01, not NoBoogieDB, is included. After all, how can you have an ElectricSlide if you have NoBoogie?
 
If we look at all of the Boogie databases again, we see that they all now have the dbo.ElectricSlide table, as shown in Figure 13.
 
Figure 13
 
image
 
As you can see, the new Elastic Database Jobs can be a huge boon when managing a scaled out data tier.
 

Query Results Walkthrough

In this walkthrough, we will query all the databases in our target group and bring the results back to a single table in my Jobs Results DB, ElasticJobsResults.
 
We will use the BoogieServerGroup target group for this next phase as well. However, we will remove the BoogieDB target member since that is an empty database and therefore does not have the table that is the target of our query.
 
-- Remove a database target member
DECLARE @mytarget_id uniqueidentifier

SET @mytarget_id =
(
SELECT target_id FROM jobs.target_group_members
WHERE [database_name] = 'BoogieDB'
);

EXEC jobs.sp_delete_target_group_member
@target_group_name = 'BoogieServerGroup'
, @target_id = @mytarget_id;

Now, when we query the BoogieServerGroup, it should only have 1 member, boogieserver02, as shown in Figure 14.
 
Figure 14
 
image
 
Let’s create a job for collecting results from the data tier.
 
-- Add a job to collect results
EXEC jobs.sp_add_job
@job_name ='ResultsJob'
, @description='Collection data from all databases';

Now we can add a job step for collecting those results.
 
-- Add a job step to collect results
EXEC jobs.sp_add_jobstep
@job_name='ResultsJob',
@command= N'SELECT DB_NAME() AS DatabaseName, SchoolID, SchoolName FROM dbo.School;',
@credential_name='elasticjobcred',
@target_group_name='BoogieServerGroup',
@output_type='SqlDatabase',
@output_credential_name='elasticjobcred',
@output_server_name='elasticjobsserver.database.windows.net',
@output_database_name='ElasticJobsResults',
@output_schema_name='dbo',
@output_table_name='School';
 
You can see this job step uses more parameters than our table creation example above. This one specifies the server, database, schema, and table into which the results of the query are to be stored. In this simple example, I am merely querying the dbo.School table in each target database. I will bring the results back to my ElasticJobsResults database into a table called dbo.School. Note that this will create the destination table if it does not already exist. This is a good thing since the query results include a field that I do not have in my query.
 
Let’s go ahead and execute this job.
 
-- Execute the latest version of a job 
exec jobs.sp_start_job 'ResultsJob';

Here is my job tracking query again, this time looking at ResultsJob.
 
-- View latest execution
SELECT
is_active
, lifecycle
, last_message
, target_type
, target_group_name
, target_server_name
, target_database_name
FROM jobs.job_executions
WHERE job_name = 'ResultsJob'
AND job_execution_id =
(
SELECT job_execution_id FROM jobs.job_executions
WHERE step_id IS NULL and create_time =
(
SELECT MAX(create_time)
FROM jobs.job_executions WHERE step_id IS NULL
)
)
ORDER BY start_time DESC;
GO
 
The final results are shown in Figure 15.
 
Figure 15
image
 
Sweet. We now have a dbo.School table in the ElasticJobsResults database, as shown in Figure 16.
 
Figure 16
 
image
 
Let’s query it.
 
SELECT * FROM dbo.School;
 
The results are shown in Figure 17.
 
Figure 17
 
image
 
Note that the results include the internal_execution_id of the child job that retrieved each record. So, the Elastic Jobs engine is providing some ready-made lineage information for us. I also want to point out that the SchoolID in my dbo.School table in all of my Boogie databases is an identity column starting at 10000. Each one includes only a single record. Therefore, in each database, the single school present has a SchoolID of 10001.
 
This is a super simple example of querying an entire data tier with a single query using Elastic Database Jobs.
 
I hope you can see how exciting this is for the Azure SQL Database story. So far, the reception has been great. I demoed it for a customer the other day and the response was, “WOW! This is EXACTLY what we needed.” The Data story in Azure just keeps getting better and better.
 

Requesting Private Preview Access

Debra and her team are looking for participants for the Private Preview of the shiny new Elastic Database Jobs. If you would like to participate, and I would encourage you to do so, please email me using my Microsoft address, which is mark dot vaillancourt at microsoft dot com. Include a brief description of your role and your environment. I will connect you with Debra, who can get you started.

Passing Credential Asset in Azure Automation Test Pane

Lately, I have been learning Azure Automation. If you are not familiar with it, I encourage you to follow the link in the previous sentence. The remainder of this post will assume you have some familiarity with it.

Note: The solution presented here works today. As Azure Automation continues to evolve, there is no guarantee this will always be the answer.

Also Note: If you just want the answer, and would prefer to skip over my brilliant description of this epic journey contained in the Saga section, feel free to scroll down to the Out With It, Man! section at the bottom of this post.

Saga

Today, I was working on a Runbook to create a shiny new Azure SQL Database Server, as long as that server does not already exist. I have other Runbooks for creating the Server Firewall Rules and the Azure SQL Database itself. Keeping Runbooks small and very targeted is a recommended practice. Later, I will create another Runbook that calls each of the ones above to create the overall Azure SQL Database solution I am after.

One of my favorite features of Azure Automation is the ability to use Assets, which essentially allow you to store Variables, Credentials, and other pieces of information you can reuse across Runbooks. The key for this post is the Credential. This allows you to store a Username and Password pair, which is kept encrypted for you in Azure. This allows you to have your Azure Automation Runbooks use this Credential to authenticate to some other service without embedding ANY part of the Username or Password in the Automation script itself. Boom. This is huge. It also allows you to utilize the Credential without having to know that Username or Password. I know, right?

To create this Azure SQL Database Server, my Runbook uses the New-AzureRmSqlServer PowerShell cmdlet. Rather than paste my entire Runbook here, I will just show the key pieces that are necessary for this post.

In accordance with recommended practices, my Runbook has a section at the top that lays out the parameters, shown in Figure 1.

Figure 1

param
(
      [Parameter(Mandatory=$True)] `
      [string] `
      $resourceGroupName

    , [Parameter(Mandatory=$True)] `
      [string] `
      $location

    , [Parameter(Mandatory=$True)] `
      [string] `
      $serverName

    , [Parameter(Mandatory=$True)] `
      [string] `
      $serverVersion

    , [Parameter(Mandatory=$True)] `
      [PSCredential] `
      $sqlAdministratorCredentials
)

 

The last parameter, $sqlAdministratorCredentials, has a type of PSCredential. The other parameters are just strings.

The call that actually creates the Server is shown in Figure 2.

Figure 2

New-AzureRmSqlServer `
    -ResourceGroupName $resourceGroupName `
    -ServerName $serverName `
    -Location $location `
    -ServerVersion $serverVersion `
    -SqlAdministratorCredentials $sqlAdministratorCredentials

 

There is certainly more code in the Runbook. I am just focusing on the key areas related to this particular post.

In Figure 2, you can see that I am mapping the parameters in my param section in Figure 1 with the parameters of the New-AzureRmSqlServer PowerShell cmdlet. The first parameter in Figure 1, $resourceGroupName, will be passed to the -ResourceGroupName parameter of the cmdlet. And so on.

As we all know, and sometimes even practice, Testing is important. Thus, I fully test each of my Runbooks multiple times to make sure they do what I want them to. For example, this Runbook should create a new Azure SQL Database Server. It should most definitely NOT put Max in Space. Azure Automation includes a feature in Edit mode of a Runbook called the Test Pane. It allows you to enter in parameters and run the Runbook before you publish it to make sure it works the way you want it to. Note that the Test Pane does not pretend to do things. It ACTUALLY performs the actions. So, running a Runbook that creates a server in the Test Pane will ACTUALLY create that server. Just keep that in mind.

The Parameters section of the Test Pane for my Runbook looked like Figure 3.

Figure 3

image

With the String parameters, it is easy to just type what I want in the textbox. But, for the PSCredential parameter at the bottom, I was unsure what to do. I figured, since this is just a textbox, I couldn’t just magically pass in a Credential Asset. So, I tried passing in an expression that used the Get-AzureRmAutomationCredential cmdlet that returns the Credential Asset specified. That didn’t work very well. I then started digging through documentation, figuring I would see references and examples of passing a Credential Asset in the Test Pane. It took me a bit, but I finally landed on an Azure.com blog post titled Azure Automation: Runbook Input, Output, and Nested Runbooks. While this post didn’t match all that closely with the search criteria around the Test Pane, it did contain my answer in a section on starting Runbooks:

If a runbook parameter takes a PSCredential type then you need to pass the string name of a Azure Automation credential asset.  Behind the scenes, the Azure Automation credential asset with that name will be retrieved and passed to the runbook.

It turns out, I was not giving the super smart folks on Azure Automation team enough credit. It IS as magical as just putting the name of the Credential Asset in the textbox. While that section of text from the blog post did not mention the Test Pane, the Test Pane is essentially a means of running the Runbook currently in Edit mode without having to Publish it first. So, using the Test Pane IS running the Runbook. Therefore, the above worked just fine.
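To make the documentation quote above concrete, here is a hedged sketch of starting a published Runbook from PowerShell, passing the name of the Credential Asset as a plain string for the PSCredential parameter. The Runbook name and parameter values here are hypothetical stand-ins for the ones in my own Runbook:

# Hypothetical Runbook and parameter values; the PSCredential parameter receives
# the string name of the Credential Asset, which Azure Automation resolves for us
Start-AzureRmAutomationRunbook `
-AutomationAccountName "AutomationB" `
-ResourceGroupName "AutomationBRG" `
-Name "CreateAzureSqlServer" `
-Parameters @{
    resourceGroupName = "MyResourceGroup"
    location = "South Central US"
    serverName = "myazuresqlserver01"
    serverVersion = "12.0"
    sqlAdministratorCredentials = "MyCredentialAsset"
}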

I learned some valuable things here:

1. Sometimes, the simple answer you dismiss because it is too simple is ACTUALLY the answer.

2. The Azure Automation team is very smart. I knew this one already. But this was a great reminder.

3. My kids have never seen Space Camp.

 

Out With It, Man!

Just put the name of the Credential Asset into the textbox for the Credential parameter in the Test Pane. For example, if your Credential Asset is named MyCredentialAsset, then the Test Pane could look like Figure 4.

Figure 4

image

Yeah. That’s really it.

Connect Suggestion: SSAS Tabular Images In Azure VM Gallery

Greetings, noble readers.

Currently, when utilizing the SQL Server images in the VM Gallery in Azure, any installations of SQL Server Analysis Services default to Multidimensional. Thus, if you want SSAS Tabular, you have additional work to perform.

I was just chatting with a Senior Program Manager on the SQL Server Analysis Services product team. They currently don’t have anything in their plans for providing SQL Server Gallery Images with SSAS Tabular instead of Multidimensional. We agreed that it is a good idea for that to happen. We also agreed that a Connect suggestion would be a great way to gauge broader community support/appetite for providing Gallery images with Tabular installed.

<insert fanfare and trumpets and stuff>

Below is the link to the suggestion I made on Connect.Microsoft.com. If you agree that having SQL Server Images in the Azure VM Gallery with SSAS Tabular already installed would be a good thing, please vote up the suggestion as it will help the Product Team to justify the effort it would take to do so. If you don’t have strong feelings either way, please vote it up for those of us that do.

Provide SQL Server Image in Azure VM Gallery With SSAS Tabular Installed

Your participation in this endeavor is much appreciated.

Thank you for your support.

PASS Summit Interview With Kamal Hathi

For the third, and final, installment in my PASS Summit Interview series, I present my interview with Kamal Hathi, Director of Program Management for Business Intelligence Tools at Microsoft. Kamal is the one who is ultimately responsible for the direction of Microsoft BI.

As with my other interviews, the byproducts of casual conversation have been edited out for better flow in writing.

Transcript

Mark V:

Things are changing very rapidly in the BI space. There are so many tools coming out and Microsoft is delivering so many awesome new features. As a BI Developer, what does my job look like 5 years down the road?

Kamal:

That’s a great question. I think we’ve been on a journey where BI was something central that somebody built and everybody sort of just clicked and used, or maybe had a report. Now, we’re coming to a point where a lot of it is empowered by the business. The business guy can go off and do lots of things. But there’s two parts that are interesting that a Developer, a Professional, if you will, needs to be in the middle of. One, I think, is the data. Where does the data come from? The first requirement is having the right data. And call it what you want, Data Stewardship, Data Sanctioning. And it’s not just Data meaning rows and columns or key-value pair kinds of things. I think there’s Data, as in the Data Model; building a Data Model. And a Data Model, sometimes, is not different than what people do building Cubes. They have Hierarchies, and they have Calculations, and they have interesting drill paths. All kinds of things. So, someone has to go do that work, many times. Even though, I think, end users can go and mash up data themselves. But there are times when you need a more supervisory nature of work; or a more complicated nature, if it turns out that Calculations are difficult, or whatever. Someone’s going to always do THAT work. And that’s interesting. The second piece that is interesting is that I think there’s going to be a new workflow. We’re already seeing this pattern in many places. The workflow is going to be like this. An End User decides they want to build a solution. And they go off and, essentially, put something together that’s very close to what they want. But it’s not really high performance; it’s not perfect. Maybe it’s got some missing things. And they use it. And their peers use it. It gets popular. And then some Developer comes in and says, “Let me take that over.” And then they improve it, make it better, polish it up. Maybe completely re-write it. But they have the right requirements built in right there. A third piece you will start seeing is going to be the Cloud services based world. That is taking bits and pieces of finished Services, and composing them together to build solutions. And that’s something we don’t see much today. But I can imagine someone saying, “Hey, I’ve got this Power BI piece which gives my visualization, or some way to interact. I can take some part of it and plug it in to another solution.” They can provide a vertical solution, provide a customized thing for whatever audience it is. And be able to do so. I think those kinds of things will be much more likely than doing the whole thing.

Mark V:

So, instead of necessarily the BI Developer working with the users to help map out requirements, with these End User tools, they can build something that’s their requirement.

Kamal:

More or less. And then the Developer can jump in and make it better. Maybe re-write better expressions and queries and make it faster; all kinds of interesting things. It just adds more value to their job. Instead of sitting there talking to people and getting “Oh, this is wrong. Re-do it.”

Mark V:

When Tabular came out, there was some uproar. When a new feature comes out, there are always “the sky is falling” people that say that something else must be going away. It happened when Power View came out. People said, “Report Builder is going away.” Then, when Tabular came out, “Oh! Multidimensional is going away.” I still hear it out there sometimes that Multidimensional is dead or dying; MDX is dead or dying. What message do you have for those people?

Kamal:

Two things here. MDX and Multidimensional are not the same thing. We should be very careful. Multidimensional is very important. Lots of customers use it. And we’re actually making improvements to it. This latest thing, DAX over MD, which allowed Power View to work over Multidimensional, is a great example. We know this is important. There are many customers who have very large scale systems in Multidimensional. And it’s important to us. We’ve just come to a point with Multidimensional where large jumps in functionality are just harder to do. You can’t really go in and say we can 10X anything. And so the In-memory stuff, the Tabular, has been the approach we’ve taken to give more kinds of scenarios; the more malleable, flexible stories, the “no schema design up front” kind of approach. But Multidimensional is super important. It isn’t going anywhere. So, when someone asks, “What should I use?” we say, “You pick.” Our tools, our aim is, should work on top of either. We would like Tabular to be at parity with Multidimensional in terms of capabilities. And we’re making progress. We’re getting there. We haven’t quite gotten there. But users shouldn’t have to worry, in our opinion, about Multidimensional or Tabular. These should be things that you worry about as a tuning parameter, but you shouldn’t have to worry about them as a big choice.

Mark V:

So, the Business shouldn’t be concerned about it.

Kamal:

Right. It’s a store. It’s a model. It’s a calculation engine. And you might want to switch one to the other. And in the future, that could be a possibility. There are lots of limitations that might not make that practically possible, but notionally speaking. I’m not saying you could just pull a switch. But you should be able to have similar capabilities and then decide which one to use.

Mark V:

Or possibly some kind of migration tool?

Kamal:

Yeah, but those kinds of things are harder sometimes. They’re not so easy to do because… who knows what’s involved? What kind of calculations did you write? Etc. Those are much harder to do. Migration is always hard. But comparable capabilities make a lot more sense. So, I can take this guy, and build the same thing and not have to worry about a dead end.

Mark V:

When I was at TechEd, Kay Unkroth did a great Managed BI session. And he started with a demo of taking in data from the Internet and combining it with some business logic. It was tracking the purchasing habits of Pink Panthers buying cell phones. And in his scenario, they went through a big investment at this Company only to find out that Pink Panthers don’t exist. So, in the realm we have, with data becoming more self-serve, with Power Query, etc, being able to reach out to more and more places, what is the thinking [from Microsoft] on the ways we can continue to have some governance over the realm that users have?

Kamal:

Fantastic. This question goes back to the first question on what Developers do. We talked about Data Stewards and Sanctioned Data and all that. And even with Power Query, if you work with that, the catalog you find things from isn’t just the Internet. The Internet is one catalog. We are also enabling a Corporate, internal catalog, which we showed in the keynote today. And you saw what we can do. It’s in Power BI, and you can actually try it out. And the goal there is to find a way for someone, it could be a business user, maybe a Developer, the IT professional, to go off and add to a catalog that is internal to the company. They can add something they think is trustworthy and worth sharing. And then someone else can come in. And if they want the shopping habits of certain constituencies or a certain segment, they can find it. As opposed to “I heard on YouTube that 20-inch phones are now hot.” Who knows, right? Just because it’s on the Internet doesn’t mean it’s true. That’s the idea of having this info catalog, essentially. It knows how to provide people the capability of publishing. And that can be a mechanism for whoever is deemed fit in the process to provide that sanctioned publishing. And maybe have a data curator look at things and make sure users have a place they can go for trusted sources as opposed to “wide open.” And we actually enable that. People can do that.

Mark V:

Could you then disable the Internet piece of that catalog in certain organizations?

Kamal:

Potentially. The question is Why? And, again, it’s a process thing. You ask people Why and what they want to do. And that’s the danger of governance. The minute you make something such that it becomes that Forbidden Fruit, someone will find a way of doing it. They’ll go out and Cut and Paste. They’ll do something. It’s almost impossible to really lock these things down. What you do want is people to be aware that there’s an alternate, a sanctioned place, where they can go compare and they can understand. And that’s what they should be doing. I think it’s much harder to lock it down and say you can only get data from inside. But that’s a process view; an organization will decide how they want to do it.

Mark V:

Early in my career, I built several report models for clients. Obviously, you know, they’re going away. They’re not being invested in. I recently wrote a Technical Article about alternatives to Report Models in SQL 2012. And I laid out the different options that you have, from Power Pivot for Excel, Power Pivot for SharePoint, Tabular, or full-on Multidimensional, etc. One of the things that was pointed out to me afterward is that one of the things that goes away with report models is the ability to make that regular detail-style table report in Reporting Services that can go against a drag-and-drop model that’s not a Pivot table. Is there something coming down the road that could alleviate that?

Kamal:

There are ranges of this, right? You can do a drag and drop tablix, even, in Power View. So, it’s not like it’s completely gone.

Mark V:

Right. It’s not completely eliminated. But it doesn’t have the flexibility of a Report that you can do subscriptions with, etc.

Kamal:

I think in terms of those kinds of things, like Subscriptions, mailings… Two things. One. Reporting Services is still there. It’s not like it’s gone away. And number two, we obviously think it’s important to have the same capabilities in other places and so we’re looking into making that possible. A lot of people ask for it. And it seems like a reasonable thing to ask for. Why would I not want to go schedule even my Excel sheets? A report’s a report. For some people a report is a Reporting Services report. For some people a report’s an Excel report. It’s an important point and certainly we’re looking into that as something we would do. But I don’t know When, Why, How, Where. As with all things Microsoft, it’s open to interpretation by whichever tea leaf you read about.

Mark V:

From the visibility you have within Microsoft to see what’s going on, could you describe, in a general sense, how a feature for, say, Analysis Services, goes from ideation, with someone saying, “Hey, wouldn’t this be cool?” to making it into the product?

Kamal:

Absolutely. There are multiple ways this happens. One is when we had an idea before or a customer had asked for it, and we decided to ship it and couldn’t. And then it becomes a “backlog” feature. So, next time, when the next version comes, this one is left over and we say, “Hey, come join the train.” We just pull them back on. That’s easy. Done. Everyone knew we wanted it, or tons of people had asked for it, and we just couldn’t finish it, so we just put it over. The second, which is much more likely, is customers have been asking for something. For example, we heard the feedback loud and clear, “We need support for Power View over Multidimensional models.” That’s a pretty clear ask. The work is not easy, but it’s a pretty clear ask. So then you go and say, “What does it take?” We figure out the details, the design, the technical part of it. And then you figure out the release timeframe, the testing. All that stuff. And you do it. The third one is when somebody says, “I have an idea.” For example, “I want to do something. I have an idea for an in-memory engine and it could be like a Power Pivot.” And that’s just like, “Wow! Where’d that come from?” And then you say, “Will it work?” And nobody knows, right? So then we start going in and asking customers what they think. Or, more likely, that idea had come from talking to somebody. And this Power Pivot case actually came from talking to customers that said, “Hey, we want end users to be more empowered to do analysis themselves. What can you do?” That was essentially the germination of that idea. When that happens, there’s usually some customer input, some personal input. And then they start to come closer to fleshing it out and ask who’s going to use it. So we ask customers, get some replies. Sometimes we just do a design on paper and say, “Would you use this?” And then people try it out, give it some feedback, and say, “Yeah. That’s good.” So, we go down this path getting more and more input. And then we come up with a real product around it. Sometimes, we have an idea that we think is great and we float it around and we hear vocal feedback saying that’s a bad idea. And we go back to the drawing board and re-do it. And this happens all the time. Many of these features show up, and people don’t realize that we went back and re-did it because the community told us. And many people ask if we use focus groups here [at PASS Summit]. There’s one tomorrow, actually, to go sit down and say, “What do you guys think about XYZ?” There’s feedback; and we listen. So, there’s an idea, there’s feedback, there’s evolution. And we iterate over it all the time until we come to a solution on that. Very rarely do we just build something where we didn’t ask anyone and we just did it. We come up with a brainstorm and flesh it out, but we always get feedback.

Mark V:

So, Matt Masson has discussed the debate within the SSIS team regarding the Single Package deployment model versus the new Project Deployment model. I would assume that happens throughout all the products. Maybe there’s somebody championing this feature and people asking “What are we going to do?”

Kamal:

Yeah. My first memory of working on this team is people shouting in hallways… in a friendly way. People shouting, “Hey! What about THIS?” or “What about THAT?” And it turns out to be a very vocal and a very passionate environment. People don’t just work because they walk in in the morning and write code. They work on this because they are just in love with this product. They are just deeply, deeply involved and committed to what they’re developing. And these people just have opinions. And I don’t mean opinions like “Yeah, I want to have a hamburger.” They have an OPINION. And there are very passionate discussions that go back and forth and back and forth sometimes. Typically what happens is a Senior Architect or someone who has some tie-breaking capability can come in and say, “Look, great ideas everybody, but we’re going to do THIS.” And then people listen, there’s some more argument, and after that, it’s done. And we go and do it. And Power Pivot is a great example of that. People were like, “Are you crazy? What are you doing??” And it was like, “No. We’re going to do it.” And that was it. And the team just rallied behind it and built a great product, and off we go. But, the good part about that story is that, because people have such vocal opinions, they rarely remain silent about these things. We don’t lose anything. And so we have an answer that we can listen to. And we can decide from multiple options as opposed to just going down one path. And then we end up with a decent, rigorously debated solution.

Mark V:

So, there’s a lot going on right now with Cloud and Hybrid solutions; there’s Chuck Heinzelman’s great white paper on building BI solutions in the Cloud in Azure virtual machines. When I talk about that with clients or with colleagues, there’s still a lot of trepidation and people say, “For some industries, that will never ever happen.” What kind of message would you have for people that just don’t see how it’s feasible?

Kamal:

Cloud has many aspects to it. There is the “Everything’s in the Cloud. I want to put my financial, regulatory data in the cloud.” Which is not going to happen for some people. Then there is the “I can get value from the Cloud on a temporary or on-demand basis. I want to go burst capacity. I want to go off and do a backup.” Whatever it is. There’s that. Then there is the “I’m going to do some parts of my workload in the cloud and the rest will remain on premise.” And for all these options, whether it’s completely in, hybrid, temporary bursting, the value provided is typically what customers decide makes sense for them. If it’s the value of all the infrastructure just being taken care of and I can do more things to add value to a solution, so be it. If it happens to be that I can get extra capacity when I need it, great. If it happens that I can keep my mission critical or compliance related data on premise and lock it up, and work hybridly with the cloud, that also works. And for most customers, most users, there is value in many of these scenarios. And there isn’t any one answer. You can’t just say that “The Cloud” means that you do X. It just means that you have many options. And interestingly, Microsoft provides many options. We have Platform [PaaS], fully deployed platforms where you pretty much just deploy your database and have to do nothing. Azure SQL Database, good example. All you do is worry about building your database, setting your tables, and you’re done. We take care of all kinds of things in the background. Don’t like that? Go to a VM. And you can do all kinds of fancy things. Chuck’s paper is a great example. We have solutions. Office 365 gives you end-to-end; from data to interfaces to management, all in one. Each of these things has hybrid solutions. You can have data on premise. As we saw today in the keynote, you can back up to Azure. With 365 and Power BI, you can actually do hybrid data connectivity. So, all of these things work in different ways. I think, for every customer out there, it typically is just a trial solution, they want to show something, or they want to have part of their solution that works. Either way, it adds value. Typically, what I have found as I talk to customers, is that many of them come from the “I would never do that” to “Oh. I might try that.” Because they have come to the realization that it’s not an all or nothing proposition. And you can do parts, try it out, and it works.

Mark V:

Over the Summer, I did a proof of concept for a client using Tabular. Just because, with the size of it, and looking at its characteristics, I said, “This would be a great Tabular solution. Let me demo it for you.” I have talked to several people in the industry. And the process of developing Tabular in SQL Server Data Tools can be… less than awesome. I had some bumps in the road. There was a great add-in I got from CodePlex [DAX Editor] that helped me deal with DAX in more of a Visual Studio environment. It didn’t apply well to Service Pack 1 [of SSAS 2012 Tabular] and that kind of stuff. There was something that Marco Russo had put forth on Connect that suggested more or less a DDL language for Tabular models. Is something like that feasible?

Kamal:

I don’t know. I don’t have a good answer for that. The reason for that is we’re looking at many options. What would move the ball forward in that direction for a design environment for Tabular models? And there are many options. Some would say let’s do it in Excel; take Power Pivot and put it on steroids. It’s a possibility. Or a DDL language. Go off and take the things you had in MOLAP and apply it here, maybe. Maybe something brand new. I don’t know. We’re trying to figure out what that is. I do know that we do want to take care of people who do large scale models, complex models in this environment. I just don’t know how and where and when. But it’s an important constituency, an important set of customers. And we’ll figure out how to do it.

Mark V:

As a BI developer, it’s important to know that the discussion’s being had.

Kamal:

All the time. This happens in a lot of hallway discussions. This is one of those.

Wrapping Up

There’s a little bit of a story to this one. When I decided I wanted to do a blog series composed of interviews conducted with Microsoft folks at the Summit, I wanted to get different perspectives. With Matt Masson (Blog|Twitter) and Kasper de Jonge (Blog|Twitter), I already had members of teams in the trenches of development of the tools. I had then reached out to the awesome Cindy Gross (Blog|Twitter) to get the perspective of someone on the CAT (Customer Advisory Team). Cindy got back to me with a contact for Microsoft PR, Erin Olson, saying that she was told to send me there. Upon contacting Erin, she responded by offering to have me sit down with Kamal Hathi, who would already be on site that day. That was an offer I couldn’t refuse. In hindsight, I am wishing I had asked about sitting down with Cindy as well, but I had already decided that my first series of this sort would be capped at 3 since I had never attempted anything like this before and didn’t know what to expect. If this series proves to be popular and of value to the Community, then I will certainly consider doing it again and asking Cindy to participate.

You will notice some overlap in the questions posed to my fantastic interviewees, particularly between Kasper and Kamal. I wanted to get different perspectives from within Microsoft on some similar topics. I also made sure to branch out in each interview and ask some questions targeted to a particular person.

In response to my “5 years down the road” question, Kamal echoed the importance of Data Stewardship. It is clear that this is an area that Microsoft is taking very seriously. Having done a lot of reporting in my career, my motto has always been, “It HAS to be right.” Clients have appreciated that. As we open up more and more avenues for users to get data, we must keep in mind that the data needs to be trustworthy.

I really want to highlight the ways in which Kamal described how a feature makes it into the product. Make special note of the fact that customer feedback is vitally important to Microsoft. Sometimes, the idea itself comes from Customers. I think Microsoft often gets a bad rap as some kind of bully or something merely because it is big. It is certainly not perfect; no company is. But I think it is really important to make note of how Microsoft DOES listen to customer feedback when it comes to the products they provide.

Kamal’s description of the internal debates that occur within Microsoft over features is important. It also echoes what we heard from Matt and Kasper. The people working on these products for us care VERY deeply about what they are doing. The work and passion that go into creating these tools we use every day are staggering to me. While I have never been a “fan boy” of any company, I have chosen the SQL Server related technologies upon which to base my career. And I have no regrets. This is a hugely exciting time to be a BI professional. The investments that Microsoft has been making in this space over the past several years make it even better.

This concludes my PASS Summit Interview series. Thanks so much to Matt Masson, Kasper de Jonge, and Kamal Hathi for taking time out of their very busy schedules to sit down with me and answer my questions. Thanks also to Cindy Gross and Erin Olson for their assistance in connecting me with Kamal. This series turned out even better than I had ever expected thanks to the generosity of those involved.