Intro To MDX Decathlon–NON EMPTY and CROSSJOIN (7/10)

10 September, 2012 (14:00) | Decathlon, MDX | By: Mark V

Since you have been following this series with the utmost, life-changing, interest, then you have most certainly read the previous post, Intro To MDX Decathlon – Hierarchy Navigation Basics (6/10). In that post, I used the NON EMPTY keyword and mentioned I would cover it in the next post. Tada! The next post is right here. This one will also cover the CROSSJOIN function.

NON EMPTY Keyword

The NON EMPTY keyword is used on a query axis, placed just before that axis’ expression. It tells MDX to eliminate from the results any axis members that are empty for the entire axis. For example, if place on the COLUMNS axis, any columns in the result for which all of the returned cells are empty would be excluded from being returned.

My Borg cube awaits.

In Query 1, we declare a named set called The Eighties made up of all ten years that appear in that decade. We then return the overall Episode Count of episodes that first aired during each of those years.

Query 1

WITH SET TheEighties AS
{
      (
      [Date].[Date Hierarchy].[Calendar Year].[1980]
      :[Date].[Date Hierarchy].[Calendar Year].[1989]
      )
}
SELECT
      [Measures].[Episode Count] ON COLUMNS
    , TheEighties ON ROWS
FROM [Borg]

 

Result 1

image

You will notice in Result 1 that the cells for 1980 thru 1986 have no contents, as shown by the (null). This may be the result you want to see. However, you may just want to see the years in which there actually were Star Trek episodes. There is a very easy way to do that. Your skills of predictive analytics likely already lead you to the answer.

 

 

 

 

 

 

 

 

 

 

 

In Query 2, we use the NON EMPTY keyword in front of the ROWS axis expression to tell MDX to eliminate the (null) cells from the Rows axis or our result.

Query 2

WITH SET TheEighties AS
{
      (
      [Date].[Date Hierarchy].[Calendar Year].[1980]
      :[Date].[Date Hierarchy].[Calendar Year].[1989]
      )
}
SELECT
      [Measures].[Episode Count] ON COLUMNS
    , NON EMPTY TheEighties ON ROWS
FROM [Borg]

 

Result 2

image

In Result 2, the (null) cells have been beamed out of our result and we only see cells that actually have values for the Episode Count measure.

 

 

 

 

 

This is a very basic usage of NON EMPTY. You will see later on, in one of my CROSSJOIN queries, that you can still get empty cells on the axis for which you have used the NON EMPTY keyword. It only eliminates members of the axis for each every cell on the axis is empty. If there is at least one that has a value, then that member of the axis will be returned. That will make more sense a little bit later.

I also want to make a very important point here. The NON EMPTY keyword is very easy to use. But it is also very easy to misuse. Make sure you fully understand how it will affect your result before you use it. There is also a function, NONEMPTY(), that can eliminate empty cells. That function is much more flexible in that it does not pertain to an entire axis. But usage of that, in my opinion, is outside the scope of an Intro to MDX. The key point I urge you to remember is that use of the NON EMPTY keyword or the NONEMPTY() function must be carefully considered. Understanding how they work is vital to making sure the results you get from your queries is correct and valid for your purpose. My example here is just the tip of the iceberg.

CROSSJOIN Function

The CROSSJOIN function takes as arguments two or more sets. It outputs a super set that is a cross product of every member of every set with every member of every other set. The order of the tuples in the resulting set is based on the order in which you provide the sets to the function as well as the order of the tuples within those sets.

NOTE: While all of the documentation (and examples) I can find indicates that CROSSJOIN takes SETS as arguments, there are instances when it accepts members instead. Both of my CROSSJOIN example queries (Query 3 and Query 4) pass members, not exclusively sets. You may ask how that can be. I have the same question. As of the time of this writing, I don’t know that answer. My search engine skills have thus far failed me on this. If you know the answer, I implore you to post a comment, or a link, or contact me in some fashion. Once I find that answer, I will write up a post that either explains it or points you to a location that does. Until then, just bear with me that it works.

Figure 1 shows a simple illustration of a CROSSJOIN between two sets.

Figure 1

image

Since Set A was listed first, its members are in the first position of the result. Likewise, since the order of the members within Set A is X and then Y, the X member’s CROSSJOIN results appear first. Notice two that the parts of the result from Set B appear in the order they exist in Set B.

There are three different ways of using a CROSSJOIN in your MDX.

Option 1

CROSSJOIN({<<Set A>>}, {<<Set B>>})

 

The first method is to just use the word CROSSJOIN and then list each of the sets, separated by commas, within parentheses.

Option 2

({<<Set A>>},{<<Set B>>})

 

The second is to list each of the sets, separated by commas, within parentheses, leaving off the word CROSSJOIN.

Option 3

{<<Set A>>} * {<<Set B>>}

 

The third is to list out each set and separate them with an asterisk. The asterisk, used in this way, is sometimes referred to as the CROSSJOIN operator.

MVP Chris Webb (b | t) has an excellent blog post on this topic: The rather-too-many ways to crossjoin in MDX. In that post, he examines each option and settles on Option 3 as his preferred method. He lays out great reasons for his choice. I highly recommend that post as well as following Chris’s blog. Having played with the CROSSJOIN options a bit, I agree with him completely. So, for this post, I will be using Option 3, listing out the sets to be cross joined, separating them with an asterisk.

Let’s head to my Borg cube to see CROSSJOIN in action.

CROSSJOIN Example 1: Basic CROSSJOIN

In Query 3, we are cross joining the members of the [Season].[Season Name] attribute hierarchy with the [Measures].[Episode Count] measure. The NON EMPTY keyword usage in Query 3 is just to make sure I am only returning Seasons that actually have episodes. My [Season].[Season Name] dimension ranges from 1 to 15 (Why did I do that?). Since no Star Trek series went into an eighth season, we stop at Season 7. The NON EMPTY is preventing the completely empty Seasons 8 – 15 from muttering up our results.

Query 3

SELECT
      NON EMPTY [Season].[Season Name].children 
        * [Measures].[Episode Count] ON COLUMNS
    , [Series].[Series Name].children ON ROWS
FROM [Borg]

 

Result 3

image

Notice how short and simple Query 3 is. Then notice how cool Result 3 is. MDX and CROSSJOIN rock. While you technically CAN produce results like this using T-SQL, it is going to require far more complexity than the straightforward, compact, MDX of Query 3.

By cross joining the seasons with the episode count, you can see that each column of the episode count for a different season. Since I put the series on the rows, we get, with a very simple query, the total episode count of every (live action) Star Trek series by season. Dig it.

You will note that, despite the NON EMPTY key word on the COLUMNS axis, there are empty cells in the result. Take a close look at the Season 04 column. Since that column has at least one value that is NOT empty, that column is returned. However, there is no column for Season 08 because ALL episode counts that for that column are empty.

CROSSJOIN Example 2: CROSSJOIN And The Autoexists Concept

When you use cross join to get the cross product of members of hierarchies in the same dimension, your result will only include cells that actually exist. That is due to the Autoexists concept you can read more about here. In Query 4, we are cross joining the members of the [Calendar Month] level of the [Date Hierarchy] with the [1966] member of the [Calendar Year] attribute hierarchy. Both of these hierarchies are in the [Date] dimension. This brings the Autoexists feature into play.

Query 4

SELECT
      [Measures].[Episode Count] ON COLUMNS
    , [Date].[Date Hierarchy].[Calendar Month].members
        * [Date].[Calendar Year].[1966] ON ROWS
FROM [Borg]

 

Results 4

image

You can see in Result 4 that we only get [Calendar Month] members that are actually in the year 1966 returned. Using CROSSJOIN in this way can accomplish great things for you if you play around with it a bit. Just be careful since creating cross products can quickly spiral out of control.

 

 

 

 

 

 

 

 

 

 

In this post, we looked at a very simple example of using the NON EMPTY keyword to eliminate null results across an entire axis. We also took a quick look at some simple examples of using the CROSSJOIN function. In the next post, we will take a look at doing a little time travel with the LEAD and LAG functions.

Comments

Pingback from Intro To MDX Decathlon–Introduction | Mark V SQL
Time September 17, 2012 at 1:33 pm

[…] .members Function (5/10) 6. Intro To MDX Decathlon – Hierarchy Navigation Basics (6/10) 7. Intro To MDX Decathlon – NON EMPTY and CROSSJOIN (7/10) 8. Intro To MDX Decathlon – LEAD() and LAG() Functions […]

Comment from yahoo search
Time April 8, 2014 at 10:56 am

Hello just wanted to give you a quick heads up. The text in your article seem
to be running off the screen in Chrome. I’m not sure if this is a
formatting issue or something to do with internet browser compatibility but I figured I’d post to let you know.
The design and style look great though! Hope you get the issue fixed soon.
Many thanks

Comment from Asghar SHAH
Time December 7, 2015 at 10:36 am

Hi Mark,
I have created one CUBE, Having dispatch and sales columns, FROM SSAS CUBE when I Select date dimension from 7th July 2015 to 15 July 2015 then I get only the transaction in between the selected date range. I need to get the Opening Balance (Closing as ON 6th July 2015). On the basis of Opening I can calculate Closing. Kindly help….

Comment from Mark V
Time December 8, 2015 at 6:27 pm

Hi, Asghar. My MDX skills are not much beyond what is in this series, I’m afraid. I would recommend posting this on StackOverflow. Sorry I can’t help at the moment, but I am swamped and would not be able to dig into it at this time.

Write a comment