Enforcing Uniqueness On A Nullable Column

I recently found myself needing to enforce uniqueness on a nullable column. To make things interesting, there were already thousands of records for which the field was NULL and would remain NULL. A unique index on the column was not possible since I had more than one NULL value. I therefore had to come up with another option. After many deep, profound, brain things inside my head, I came up with a horribly convoluted solution involving duct tape, a Swiss Army knife, 17 sporks, and a hamster named Ginger. I think we can all see why this would not be the best option.

A colleague pointed me to this blog, featuring a comment by Mr. Adam Machanic.

http://weblogs.sqlteam.com/brettk/archive/2005/04/20/4592.aspx

It involved creating a view that returns only non-NULL values for the column in question and then placing an index on that view. Fabtastic! WAY easier than the solution I was hatching myself. Note that there are rules to be aware of regarding the creation of indexed views; they are detailed here: http://msdn.microsoft.com/en-us/library/ms188783.aspx.

Here is how this works.

We start with a database table containing super heroes. One thing to note about these particular super heroes is that none of them have sidekicks.

USE tempdb

GO

 

CREATE TABLE Superhero

(

       SuperheroID int NOT NULL IDENTITY

     , SuperheroName varchar(25) NULL

)

 

INSERT INTO Superhero (SuperheroName) VALUES('Superman')

INSERT INTO Superhero (SuperheroName) VALUES('Wonder Woman')

INSERT INTO Superhero (SuperheroName) VALUES('Spiderman')

 

SELECT 

       SuperheroID

     , SuperheroName

FROM Superhero

 

/*

RESULTS:

 

SuperheroID SuperheroName

----------- -------------------------

1           Superman

2           Wonder Woman

3           Spiderman

 

*/

 

Now, we need to import super heroes with sidekicks and bring their sidekicks into our table as well. We are also tasked with ensuring that no two super heroes can have the same sidekick name.

Let’s add a SidekickName column to our table and toss some sidekicked superheroes into our table.

ALTER TABLE Superhero ADD SidekickName varchar(25)

 

INSERT INTO Superhero (SuperheroName, SidekickName) VALUES('Batman', 'Robin')

INSERT INTO Superhero (SuperheroName, SidekickName) VALUES('Dangermouse', 'Penfold')

 

SELECT 

       SuperheroID

     , SuperheroName

     , SidekickName

FROM Superhero

 

/*

RESULTS:

 

SuperheroID SuperheroName             SidekickName

----------- ------------------------- -------------------------

1           Superman                  NULL

2           Wonder Woman              NULL

3           Spiderman                 NULL

4           Batman                    Robin

5           Dangermouse               Penfold

 

*/

 

So now we have some sidekicks and some superheroes who tend to work alone and have NULL in the SidekickName field. Just to show I am not making things up when it comes to unique indexes, let’s go ahead and try putting a unique index on the Superhero table and see what happens.

CREATE UNIQUE NONCLUSTERED INDEX ixSidekick ON Superhero (SidekickName ASC)

 

Here is the error returned:

Msg 1505, Level 16, State 1, Line 51

The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.Superhero' and the index name 'ixSidekick'. The duplicate key value is (<NULL>).

The statement has been terminated.

 

Blast!

< insert villain-style monologue here concerning the feebleness of the superhero of your choice >

Let’s create our view. There are two important things to notice about the statement below: the view must be created using the WITH SCHEMABINDING option, and although I have not been using the two-part table name dbo.Superhero thus far, I MUST do so in order to create this schemabound view.

CREATE VIEW dbo.vSuperheroSidekick

WITH SCHEMABINDING

AS

SELECT

       SuperheroID

     , SuperheroName

     , SidekickName

FROM dbo.Superhero

WHERE SidekickName IS NOT NULL

 

Let’s query our view.

 

SELECT

       SuperheroID

     , SuperheroName

     , SidekickName

FROM vSuperheroSidekick

 

/*

RESULTS:

 

SuperheroID SuperheroName             SidekickName

----------- ------------------------- -------------------------

4           Batman                    Robin

5           Dangermouse               Penfold

 

*/

 

Fabtastical. So far so good. Now this is where it gets cool. Let’s create our UNIQUE CLUSTERED index on our new view.

CREATE UNIQUE CLUSTERED INDEX ix_vSuperheroSidekick_SidekickName_Unique

ON vSuperheroSidekick (SidekickName ASC)

GO

 

Now, let’s try inserting a new superhero, Safetyrat, with his diminutive sidekick, Penfold (sound familiar to you?):

 

INSERT INTO Superhero (SuperheroName, SidekickName) VALUES('Safetyrat', 'Penfold')

 

We get the following error:

Msg 2601, Level 14, State 1, Line 1

Cannot insert duplicate key row in object 'dbo.vSuperheroSidekick' with unique index 'ix_vSuperheroSidekick_SidekickName_Unique'.

The statement has been terminated.

 

HAZZAH! The attempt to inject a lame mockery of the ace, the greatest, has been foiled by our fabulous indexed view. Take THAT Baron Greenback!
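And, just to prove the other half of the requirement still holds, superheroes without sidekicks remain welcome. The view’s WHERE clause filters NULL values out, so they never touch the unique index. Here is a quick check using a hero I made up for the occasion:

-- NULL SidekickName values never reach the indexed view,

-- so this insert succeeds despite the NULLs already in the table

INSERT INTO Superhero (SuperheroName) VALUES('Captain Solo')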

NOTE: In my situation, I was dealing with a table that did not have a lot of writes, so the added overhead of maintaining this indexed view was not a concern. You will want to take this into consideration in your own environment.

Don’t forget to clean up after yourself:

DROP VIEW dbo.vSuperheroSidekick

DROP TABLE Superhero

Using Medians For Quartiling When NTILE Is Too Arbitrary

So, in a previous post on The Coolness of the T-SQL NTILE Function, I discuss that NTILE can perform quartiling (or any other -tiling not requiring a trip to Home Depot) for you. Very cool. However, there is something worth noting about this function that I discovered the hard way. I will continue with my quartile example.

If you are quartiling based on a single field (or a small number of fields) and there is a possibility that more than one record could have the same values in the fields by which you are quartiling, it is possible for records with the same value to end up in different quartiles. Doh.

The reason for this is that NTILE essentially just orders your records by the field specified in the ORDER BY clause and then places the first 25% of the records in quartile 1, the second 25% in quartile 2, etc. So, if you have 8 records, as in my example below, the first two records go in quartile 1, the second two go in quartile 2, etc.

Let’s see this in action to make sure I am not just making this stuff up.

USE tempdb

GO

 

CREATE TABLE TestValue

(

       Record char(1) NOT NULL 

     , Value1 int NOT NULL

)

 

INSERT INTO TestValue (Record, Value1) VALUES('A',2)

INSERT INTO TestValue (Record, Value1) VALUES('B',2)

INSERT INTO TestValue (Record, Value1) VALUES('C',2)

INSERT INTO TestValue (Record, Value1) VALUES('D',4)

INSERT INTO TestValue (Record, Value1) VALUES('E',6)

INSERT INTO TestValue (Record, Value1) VALUES('F',6)

INSERT INTO TestValue (Record, Value1) VALUES('G',8)

INSERT INTO TestValue (Record, Value1) VALUES('H',8)

 

In the code above, note that Record B and Record C have the same exact value in Value1.

/*

Quartiling with NTILE

*/

SELECT

       tv.Record

     , tv.Value1

     , NTILE(4) OVER(ORDER BY tv.Value1 ASC) AS Quartile_NTILE

FROM TestValue tv

 

The query above produces this result set:

 

Record Value1      Quartile_NTILE

------ ----------- --------------------

A      2           1

B      2           1

C      2           2

D      4           2

E      6           3

F      6           3

G      8           4

H      8           4

 

Note that although Records B and C have the same Value1 value, NTILE puts them in different quartiles. As previously stated, doh. Since my ORDER BY uses only a single column in which duplicate values are present, NTILE makes somewhat arbitrary decisions within the subsets of each Value1 value and puts Record C in quartile 2 instead of 1. If uniqueness were assured on my Value1 field, then NTILE would be an easy and accurate way to go. But not so much in this case.

I decided that the median would be my champion. Itzik Ben-Gan provides the basis for my median calculation here: http://www.sqlmag.com/Article/ArticleID/49827/sql_server_49827.html.

I ended up calculating 3 median values:

1. Median50: the median across Value1 for all records, forming the 50th percentile boundary and hence the boundary between quartiles 2 and 3.

2. Median25: the median across records whose Value1 <= Median50, forming the 25th percentile boundary and hence the boundary between quartiles 1 and 2.

3. Median75: the median across records whose Value1 > Median50, forming the 75th percentile boundary and hence the boundary between quartiles 3 and 4.

Here is the final code for calculating my own quartiles and ensuring that records with the same value end up in the same quartile:

;

WITH

 

/*

Calculate the median over all records

This forms the boundary between quartiles 2 and 3

This must be calculated first since it is needed to calculate the 

other quartile boundaries

*/

 

Median50 AS
(
  SELECT Value1
       , ROW_NUMBER() OVER(ORDER BY Value1) AS RowNum
       , COUNT(*) OVER() AS Cnt
  FROM TestValue
)
,
MedianResults50 AS
(
  SELECT 1 AS JoinField, CAST(AVG(1.0 * Value1) AS dec(7,3)) AS Median50
  FROM Median50
  WHERE RowNum IN ((Cnt + 1) / 2, (Cnt + 2) / 2) -- keep only the middle row(s)
)

,

 

/*

Calculate the median over records whose value is <= the overall median

This forms the boundary between quartiles 1 and 2

*/

 

Median25 AS
(
  SELECT Value1
       , ROW_NUMBER() OVER(ORDER BY Value1) AS RowNum
       , COUNT(*) OVER() AS Cnt
  FROM TestValue
  INNER JOIN MedianResults50 m
    ON m.JoinField = 1
  WHERE Value1 <= m.Median50
)
,
MedianResults25 AS
(
  SELECT 1 AS JoinField, CAST(AVG(1.0 * Value1) AS dec(7,3)) AS Median25
  FROM Median25
  WHERE RowNum IN ((Cnt + 1) / 2, (Cnt + 2) / 2)
)

,

 

/*

Calculate the median over records whose value is > the overall median

This forms the boundary between quartiles 3 and 4

*/

 

Median75 AS
(
  SELECT Value1
       , ROW_NUMBER() OVER(ORDER BY Value1) AS RowNum
       , COUNT(*) OVER() AS Cnt
  FROM TestValue
  INNER JOIN MedianResults50 m
    ON m.JoinField = 1
  WHERE Value1 > m.Median50
)
,
MedianResults75 AS
(
  SELECT 1 AS JoinField, CAST(AVG(1.0 * Value1) AS dec(7,3)) AS Median75
  FROM Median75
  WHERE RowNum IN ((Cnt + 1) / 2, (Cnt + 2) / 2)
)

 

/*

Final SELECT using a CASE statement to assign quartiles based on 

my various medians

*/

SELECT

       tv.Record

     , tv.Value1

     , NTILE(4) OVER(ORDER BY tv.Value1 ASC) AS Quartile_NTILE

     , CASE

          WHEN tv.Value1 <= m25.Median25 THEN 1

          WHEN tv.Value1 BETWEEN m25.Median25 AND m50.Median50 THEN 2

          WHEN tv.Value1 BETWEEN m50.Median50 AND m75.Median75 THEN 3

          ELSE 4

       END AS MyQuartile

FROM TestValue tv

INNER JOIN MedianResults25 m25

ON m25.JoinField = 1

INNER JOIN MedianResults50 m50

ON m50.JoinField = 1

INNER JOIN MedianResults75 m75

ON m75.JoinField = 1

 

-- Cleaning up

DROP TABLE TestValue

 

Here is the result set produced:

 

Record Value1      Quartile_NTILE       MyQuartile

------ ----------- -------------------- -----------

A      2           1                    1

B      2           1                    1

C      2           2                    1

D      4           2                    2

E      6           3                    3

F      6           3                    3

G      8           4                    4

H      8           4                    4

 

You can see in the result set above that my median-style quartiling ensures that records with the same Value1 value end up in the same quartile.

Thus, in situations where the fields over which you are quartiling are guaranteed to produce unique values across your data set, then NTILE can be a great way to go. However, if there can be dupes, you may want to consider my approach above.

Using Dynamic Values in the PIVOT Operator

I recently worked on a report for a client in which a date input parameter was used to determine the end dates of the previous four calendar quarters. Each time the report ran, it returned data for all four of those quarters.

I ended up using PIVOT in the stored procedure for the report. The first time I did so, I hard-coded the four quarter end dates I needed to work with. This worked fine for my testing, but I needed the PIVOT to use whatever four quarter end dates were derived from the input parameter. I was fortunate to know ahead of time that I would always be pivoting on four and only four date values.

With that consistency in mind, I was able to create a common table expression with a derived column to consistently “alias” the first quarter as 1, the second quarter as 2, and so forth.

Here is the breakdown of how this works using data on the dining habits of Hobbits. In this example, the report involved requires looking at the first three days of the month. This provides the consistency required to make this particular technique work. It may be possible to modify this approach to work on values with less consistency, but I have not yet tried it.

-- Create the Hobbit table to house the meal data

CREATE TABLE Hobbit

(

        Hobbit varchar(20)

      , MealDate date

      , MealCount int

)

 

-- Populate Hobbit table with meal data

INSERT INTO Hobbit VALUES('Bilbo', '2009-01-01', 4)

INSERT INTO Hobbit VALUES('Bilbo', '2009-01-02', 5)

INSERT INTO Hobbit VALUES('Bilbo', '2009-01-03', 3)

INSERT INTO Hobbit VALUES('Bilbo', '2009-02-01', 6)

INSERT INTO Hobbit VALUES('Bilbo', '2009-02-02', 7)

INSERT INTO Hobbit VALUES('Bilbo', '2009-02-03', 3)

INSERT INTO Hobbit VALUES('Samwise', '2009-01-01', 4)

INSERT INTO Hobbit VALUES('Samwise', '2009-01-02', 8)

INSERT INTO Hobbit VALUES('Samwise', '2009-01-03', 7)

INSERT INTO Hobbit VALUES('Samwise', '2009-02-01', 4)

INSERT INTO Hobbit VALUES('Samwise', '2009-02-02', 8)

INSERT INTO Hobbit VALUES('Samwise', '2009-02-03', 7)

INSERT INTO Hobbit VALUES('Meriadoc', '2009-01-01', 6)

INSERT INTO Hobbit VALUES('Meriadoc', '2009-01-02', 9)

INSERT INTO Hobbit VALUES('Meriadoc', '2009-01-03', 12)

INSERT INTO Hobbit VALUES('Meriadoc', '2009-02-01', 4)

INSERT INTO Hobbit VALUES('Meriadoc', '2009-02-02', 8)

INSERT INTO Hobbit VALUES('Meriadoc', '2009-02-03', 7)

 

-- Pivot with hard coded January values

SELECT

        Hobbit

      , [2009-01-01]

      , [2009-01-02]

      , [2009-01-03]

FROM 

(

SELECT 

        Hobbit

      , MealDate

      , MealCount

FROM Hobbit) h

PIVOT

(

SUM(MealCount)

FOR MealDate IN

(

        [2009-01-01]

      , [2009-01-02]

      , [2009-01-03]

)

) AS pvt

 

Results of January hard coded PIVOT:

 

Hobbit               2009-01-01  2009-01-02  2009-01-03

-------------------- ----------- ----------- -----------

Bilbo                4           5           3

Meriadoc             6           9           12

Samwise              4           8           7

 

-- Pivot with hard coded February values

SELECT

        Hobbit

      , [2009-02-01]

      , [2009-02-02]

      , [2009-02-03]

FROM 

(

SELECT 

        Hobbit

      , MealDate

      , MealCount

FROM Hobbit) h

PIVOT

(

SUM(MealCount)

FOR MealDate IN

(

        [2009-02-01]

      , [2009-02-02]

      , [2009-02-03]

)

) AS pvt

 

Results of February hard coded PIVOT:

 

Hobbit               2009-02-01  2009-02-02  2009-02-03

-------------------- ----------- ----------- -----------

Bilbo                6           7           3

Meriadoc             4           8           7

Samwise              4           8           7

 

As you can see from the pivot tables above, Hobbits eat often. The hard coded date values form the column headers. This is nice and all but requires a code change to report on a different set of dates.

Here we see how you can make a CTE with a derived column to pass in to the PIVOT operator. This allows the actual PIVOT to remain exactly the same while reporting on different dates.

-- Variable to hold the month number on which to PIVOT

DECLARE @MealMonth int

 

SET @MealMonth = 1 -- Using 1 to represent January

 

-- Common table expression to derive MealDay column

;

WITH HobbitMealDays AS

(

SELECT

        Hobbit

      , MealDate

      , DAY(MealDate) AS MealDay -- Using day datepart as integer value

      , MealCount

FROM Hobbit

WHERE MONTH(MealDate) = @MealMonth

)

 

SELECT

        Hobbit

      , [1] AS Day1 -- Alias to stand in for hard coded column headers

      , [2] AS Day2

      , [3] AS Day3

FROM 

(

SELECT 

        Hobbit

      , MealDay

      , MealCount

FROM HobbitMealDays) h

PIVOT

(

SUM(MealCount)

FOR MealDay IN

(

        [1] -- these values are known possible values of MealDay derived column

      , [2]

      , [3]

)

) AS pvt

 

Results of the dynamic PIVOT using @MealMonth = 1 to represent January:

 

Hobbit               Day1        Day2        Day3

-------------------- ----------- ----------- -----------

Bilbo                4           5           3

Meriadoc             6           9           12

Samwise              4           8           7

 

Just to make the comparison easier, here are the hard coded January PIVOT results again.

Results of January hard coded PIVOT:

 

Hobbit               2009-01-01  2009-01-02  2009-01-03

-------------------- ----------- ----------- -----------

Bilbo                4           5           3

Meriadoc             6           9           12

Samwise              4           8           7

 

If you set @MealMonth = 2 and run the PIVOT again, you will get the February results:

Results of the dynamic PIVOT using @MealMonth = 2 to represent February:

 

 

Hobbit               Day1        Day2        Day3

-------------------- ----------- ----------- -----------

Bilbo                6           7           3

Meriadoc             4           8           7

Samwise              4           8           7

 

Again, for ease of comparison, here are the hard coded February results again.

Results of February hard coded PIVOT:

 

Hobbit               2009-02-01  2009-02-02  2009-02-03

-------------------- ----------- ----------- -----------

Bilbo                6           7           3

Meriadoc             4           8           7

Samwise              4           8           7

 

-- Cleaning up

DROP TABLE Hobbit

GO

 

As you can see above, the “dynamic” PIVOT returns the exact same values as the hard coded versions but offers the flexibility of reporting on different periods without changing code. You could just reference a parameter in the CTE instead of a variable as I did.
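To illustrate that last point, here is a sketch of what the parameterized version might look like wrapped in a stored procedure (the procedure name is my own invention; the body is just the CTE and PIVOT from above with the variable swapped for a parameter):

CREATE PROCEDURE dbo.GetHobbitMealPivot

    @MealMonth int -- the report parameter takes the place of the variable

AS

BEGIN

    SET NOCOUNT ON;

    WITH HobbitMealDays AS
    (
        SELECT Hobbit, DAY(MealDate) AS MealDay, MealCount
        FROM Hobbit
        WHERE MONTH(MealDate) = @MealMonth
    )
    SELECT Hobbit, [1] AS Day1, [2] AS Day2, [3] AS Day3
    FROM (SELECT Hobbit, MealDay, MealCount FROM HobbitMealDays) h
    PIVOT (SUM(MealCount) FOR MealDay IN ([1], [2], [3])) AS pvt;

END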

Have fun.

Things You Know Now…

I was tagged by Jason Strate for a cool blog chain. I guess I’m it now.

  

Get the Last Date in a Calendar Quarter

I needed to accurately calculate the end dates for the four most recently completed calendar quarters prior to the date provided. I found this snippet online and thought it was a really slick way to find the quarter end date. I tried to find the site again so I could give props to the person who posted it, but cannot. So, my clever friend, I salute thee.

Let’s start by declaring our date variable and setting it to a fine day in history. My parents told me this was my birthday. I don’t really remember but I will give them the benefit of the doubt.

I have dissected the expression below.

DECLARE @testdate date

 

SET @testdate = '1975-01-26'

 

SELECT

 

-- Get the number of calendar quarters between date 0 (1900-01-01) and @testdate

        DATEDIFF(qq,0,@testdate) AS Step1

 

-- Add 1 quarter

      , DATEDIFF(qq,0,@testdate)+1 AS Step2

 

-- Add the number of quarters from Step2 to date 0 to get the first date of the next quarter

      , DATEADD(qq, DATEDIFF(qq,0,@testdate)+1,0) AS Step3

 

-- Subtract 3 milliseconds to get the last datetime value of the current quarter

      , DATEADD(ms,-3,DATEADD(qq, DATEDIFF(qq,0,@testdate)+1,0)) AS Step4

 

-- Since I was working in SQL 2008, which supports separate data types for DATE and TIME,

-- I converted the result in Step4 to DATE

      , CONVERT(date,DATEADD(ms,-3,DATEADD(qq, DATEDIFF(qq,0,@testdate)+1,0))) AS QuarterEndDate

The query results are as follows:

Step1       Step2       Step3                   Step4                   QuarterEndDate

----------- ----------- ----------------------- ----------------------- --------------

300         301         1975-04-01 00:00:00.000 1975-03-31 23:59:59.997 1975-03-31

 

So, with this nifty expression, we have determined the end date of the quarter for my birthday (no gifts necessary).

We can also change the number of quarters added (or even subtracted) in Step2 to get the quarter end date for future or past quarters, too.
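As a sketch of how this served the report described above, the same expression can be combined with a small derived table of offsets (the VALUES table constructor requires SQL 2008) to return all four most recently completed quarter end dates at once:

DECLARE @testdate date

SET @testdate = '1975-01-26'

-- One row per completed quarter; offset 0 is the most recent

SELECT CONVERT(date,
         DATEADD(ms, -3,
           DATEADD(qq, DATEDIFF(qq, 0, @testdate) - q.Offset, 0))) AS QuarterEndDate
FROM (VALUES (0), (1), (2), (3)) AS q(Offset)

For @testdate = '1975-01-26', this returns 1974-12-31, 1974-09-30, 1974-06-30, and 1974-03-31.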

Make room for FLOOR and CEILING

On the same report referenced above, I also had to implement a business rule by rounding a value up to the nearest thousand. A colleague suggested the CEILING function, which returns the smallest integer that is greater than or equal to the provided numeric expression. As it turns out, the FLOOR function does the exact opposite, returning the largest integer that is less than or equal to the numeric expression.

Let’s look at both functions in action.

-- Let's create a tiny table

CREATE TABLE Smurf

(

        Smurf varchar(10)

      , HeightInApples dec(2,1)

)

 

-- Populate the table with a small number of records

INSERT INTO Smurf VALUES('Brainy',3.1)

INSERT INTO Smurf VALUES('Hefty',3.5)

INSERT INTO Smurf VALUES('Painter', 3.7)

 

-- Watch these smurfy functions do their thing

SELECT

        Smurf

      , HeightInApples

      , FLOOR(HeightInApples) AS FloorValue

      , CEILING(HeightInApples) AS CeilingValue

FROM Smurf

 

Results:

 

Smurf      HeightInApples    FloorValue  CeilingValue

---------- ----------------- ----------- -------------

Brainy     3.1               3           4

Hefty      3.5               3           4

Painter    3.7               3           4

 

-- Cleaning up after yourself is smurftastic

DROP TABLE Smurf

Notice that all values were rounded down by the FLOOR function and up by the CEILING function.

For my report, I divided my input values by 1000, ran them through CEILING, and then multiplied by 1000. This rounded my values up to the nearest thousand. I just figured the example above would be a little more fun.
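For the curious, that rounding rule can be sketched like this (the input value is made up; note the division by 1000.0 rather than 1000, which avoids integer division truncating the result):

SELECT CEILING(12345 / 1000.0) * 1000 AS RoundedUpToThousand

-- 12345 / 1000.0 = 12.345, CEILING gives 13, times 1000 gives 13000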

I am tagging Dan English

The Coolness of the T-SQL NTILE Function

I had need of calculating metrics across an industry and then returning the quartile in which each company scored. So, I wrote a function that took MinValue, MaxValue, and CompanyMetricValue as inputs and returned the quartile for the company. Great. Then, when I asked someone else about this new function, the response was, “Why not just use NTILE?” Ug.

Here is the description from SQL 2008 Books Online:

Distributes the rows in an ordered partition into a specified number of groups. The groups are numbered, starting at one. For each row, NTILE returns the number of the group to which the row belongs.

NTILE (integer_expression)    OVER ( [ <partition_by_clause> ] < order_by_clause > )

So, using NTILE, I was able to do the following:

SELECT NTILE(4) OVER(ORDER BY CompanyMetricValue ASC) AS Quartile

In the example above, I want quartiles, so I use 4 as the integer_expression.

Lesson Learned: Before inventing a new wheel, check Books Online for wheels that already exist. 🙂

OUTER JOIN Functionality in SSRS 2005 Report Builder

When we link from one entity to another entity in SSRS 2005 Report Builder, we get an INNER JOIN between the two entities. The Report Builder tool does not natively support OUTER JOIN functionality. However, there is a way to create OUTER JOINs in Report Builder models and overcome this limitation.

We have two tables, CUSTOMER(with a PK of CustomerID) and ORDER(with a PK of OrderID and a FK of CustomerID), with a one-to-many relationship. Each CUSTOMER record could have 0 or more ORDER records. The relationship created between these entities in our data source view will support INNER JOIN links only. Any time we select CUSTOMER records and look to the ORDER entity, our results will only contain CUSTOMER records that have corresponding ORDER records. Our poor Orderless Customers are the victims of foreign key discrimination. In some reporting scenarios, this may serve our purpose. However, if we want to be sure our disenfranchised Customers lacking Orders see the light of day in our report, the default data source view design is not adequate.

We must create a third entity to advocate for the rights of our Orderless Customers. We have a few options as to where to create this entity. We can use a Named Query directly in our Data Source View. Or we can create a view in our database to serve this purpose, which has the added bonus of supporting indexes. Whichever path we choose, our SELECT statement will be the same:

SELECT C.CustomerID, O.OrderID
FROM Customer C
LEFT JOIN [Order] O
ON O.CustomerID = C.CustomerID

Let’s assume we chose to create a view called vwAdvocate. Once we add vwAdvocate to our data source view, we create two relationships:

1. vwAdvocate.CustomerID to CUSTOMER.CustomerID

2. vwAdvocate.OrderID to ORDER.OrderID

In our Report Model, the CUSTOMER and ORDER entities will each contain a role pointing to vwAdvocate, which we will call CUSTOMER-vwAdvocate and ORDER-vwAdvocate respectively. Likewise, vwAdvocate will have one role pointing to CUSTOMER (vwAdvocate-CUSTOMER) and another pointing to ORDER (vwAdvocate-ORDER). In the Report Builder tool, when we want OUTER JOIN functionality between CUSTOMER and ORDER, we use the CUSTOMER-vwAdvocate role to get to vwAdvocate. From there, we use the vwAdvocate-ORDER role to get to ORDER. Now we can add ORDER fields to our report without eliminating CUSTOMER records that lack ORDER records.

Below are some things to note about using this method:

1. When we utilize our CUSTOMER-vwAdvocate role to get to vwAdvocate, our report’s primary entity will change from CUSTOMER to vwAdvocate. Thus, we can only add fields from entities that have a relationship with vwAdvocate, namely CUSTOMER and ORDER in our example. So, we are not able to use our OUTER JOIN method to get to more than one lookup table.

2. Now that we have two paths (roles) from CUSTOMER to ORDER (the direct role as well as the path through the vwAdvocate roles), we have created a bit of a loop. Users can click through the roles from CUSTOMER to vwAdvocate to ORDER and back to CUSTOMER again, indefinitely. This can be addressed by educating our users about it in our end-user training/documentation.

3. OUTER JOIN queries require more server resources than INNER JOIN queries. Thus, with the method above, users are now able to have a larger impact on server performance. So, as part of end-user training/documentation, we would want to stress to our users that they should only use this OUTER JOIN functionality when they really need it.

While using this OUTER JOIN functionality in our Report Builder models is not completely without cost, it is still, in my opinion, a worthwhile option to consider when INNER JOINs will not get us the results we desire.