SQL

NullIf

NullIfThis post is about a handy little function that works across many database systems, including Sql Server, Oracle, and MySql. I often forget about it and at times even end up coding some workaround that would have been a lot easier if I had just used NullIf(). NullIf takes two parameters. The first parameter is the value to check and the second parameter is the value that should equate to NULL. Let’s take a look at an example to hopefully make this clearer.

First, we will create a table to work from, populate it with some data, and select the results out so that we can see what we have visually.

USE tempdb
GO

CREATE TABLE dbo.JunkData (
	JunkDataId INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
	Name VARCHAR(100) NOT NULL,
	StreetAddress VARCHAR(100) NULL
);

INSERT dbo.JunkData (Name, StreetAddress)
SELECT 'Pete', '777 Main St';

INSERT dbo.JunkData (Name, StreetAddress)
SELECT 'Jeff', NULL;

INSERT dbo.JunkData (Name, StreetAddress)
SELECT 'Dustin', '999 Oak St';

INSERT dbo.JunkData (Name, StreetAddress)
SELECT 'Ron', NULL;

INSERT dbo.JunkData (Name, StreetAddress)
SELECT 'Keith', '';

SELECT Name, StreetAddress
FROM dbo.JunkData;

Our table's contents

In this case, the Street Address column is nullable, but for Keith’s record the developer chose to insert a blank space instead of a NULL for a missing address. If we try to just write a query to get rid of the nulls, we still have the blank space issue. Here is a query that uses COALESCE to get rid of the NULLS and its results.

SELECT Name, COALESCE(StreetAddress, 'Not Provided') AS StreetAddress
FROM dbo.JunkData;

Our results with COALESCE Only

You see that we still have the blank address to deal with. One work around is to use a case statement. Something like “CASE WHEN COALESCE(StreetAddress, ‘Not Provided’) = ” THEN ‘Not Provided’ ELSE COALESCE(StreetAddress, ‘Not Provided’) END AS StreetAddress”. But, that is quite a mouthful and we repeat ourselves several times. However, if we could get COALESCE to treat a blank address like a NULL, we’d have been in business in the first place. That is what NULLIF() does. It evaluates the value we specify as a NULL (in this case a blank space) and then the rest of the query can treat it like a NULL. Here is that example

SELECT Name, COALESCE(NULLIF(StreetAddress, ''), 'Not Provided') AS StreetAddress
FROM dbo.JunkData;

Our results with COALESCE and NULLIF

That’s much more succinct and I feel like it conveys our intent much more easily. Here’s another practical use case for NULLIF(). Given the following table and values, we want to find the average sale on each given day.

CREATE TABLE dbo.Reporting(
  RowId INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
  SalesDate DATETIME NOT NULL,
  TotalSales MONEY NOT NULL,
  QuantitySold NUMERIC(9,2) NOT NULL
);

INSERT dbo.Reporting(SalesDate, TotalSales, QuantitySold)
SELECT '4/1/16', 432.50, 2;

INSERT dbo.Reporting(SalesDate, TotalSales, QuantitySold)
SELECT '4/2/16', 0.00, 0;

INSERT dbo.Reporting(SalesDate, TotalSales, QuantitySold)
SELECT '4/3/16', 5422.10, 10;

We might write something like the following

SELECT SalesDate, (TotalSales/QuantitySold) as AverageSale
FROM dbo.Reporting;

However, when you do, you get a divide-by-zero error. Sometimes in more complicated situations, that can be a bear to track down exactly what is evaluating to zero in an equation. However, if you ask the denominator to evaluate as NULL whenever it is 0, SQL is much happier. Here’s our query now:

SELECT SalesDate, (TotalSales/NULLIF(QuantitySold, 0)) as AverageSale
FROM dbo.Reporting;

Now, we easily get our results.

Results of Dividing by NULL

This helps identify which records are causing our issue and can now easily be dealt with. If we don’t like NULL in our results, we can use ISNULL() or COALESCE() to put a sensible default answer. Done and done.

Podcasts

Podcast Episode 42 – The Answer to Life, the Universe, and Everything, just kidding – MS Build 2016

MS Build 2016In episode 42, I don’t have the answer to Life, the Universe, and Everything, but I do have thoughts and reactions to Microsoft’s 2016 Build conference. I cover Bash on Windows, Ink, Bots, and Free Xamarin for everyone!


Links Mentioned in this Show:
XKCD Thing Explainer
Cleartext

You can also subscribe to the podcast at any of these places:
iTunes Link RSS Feed

Thanks to all the people who listen, and a special thanks to those who have rated me. I really appreciate it.

The episodes have been archived. Click Here to see the archive page.

SQL

SQL Pivot – Static or Dynamic Columns

Pivot ArrowThis is going to be another one of those posts that will surely help some people, but it is far more likely to help me later 😉 I have culled these techniques from a variety of sources across the Internet, but I’m not entirely sure who taught me what in order to give proper credit.

If you want to follow along, I went to the Microsoft Sql Product Sample Site and downloaded “Adventure Works 2014 Full Database Backup.zip” and unzipped and restored it onto my database server. That gives me the OLTP version (rather than the OLAP) of the 2014 Adventure Works database. To start off, let’s take a look at the following query:

USE AdventureWorks2014
GO

SELECT c.AccountNumber, DATENAME(month, h.OrderDate) as OrderMonthName, SUM(d.LineTotal) as TotalSales
FROM Sales.SalesOrderHeader h 
INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
WHERE YEAR(h.OrderDate) = 2011
GROUP BY c.AccountNumber, DATENAME(month, h.OrderDate)
ORDER BY c.AccountNumber

This returns results like these:
Initial Month-based Results

However, what if you wanted to easily see how each of those customers’ sales looked month over month in comparison to other customers? Typically, a data analyst might take a result set like that and use a tool like Microsoft Excel to create a pivot table. That would let your rows be per customer and your columns would be per month. Reporting wonks could also use a tool like SSRS to pivot the data for presentation to users. However, if you are using Sql Server 2005 or later, you could also use the PIVOT command.

If your data that you’d like to pivot on has a finite and known quantity, this is very easy. I’m going to change the query above to show how to do that:

USE AdventureWorks2014
GO

;WITH SalesData AS (
	SELECT c.AccountNumber, DATENAME(month, h.OrderDate) as OrderMonthName, CAST(SUM(d.LineTotal) AS NUMERIC(12,2)) as TotalSales
	FROM Sales.SalesOrderHeader h 
	INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
	INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
	WHERE YEAR(h.OrderDate) = 2011
	GROUP BY c.AccountNumber, DATENAME(month, h.OrderDate))

SELECT *
FROM SalesData
PIVOT (SUM(TotalSales) FOR OrderMonthName IN (January, February, March, April, May, June, July, August, September, October, November, December)) pvt

A few peculiar things to notice. First, the values that become the new columns are not quoted. They are treated like literals. I didn’t have to say ‘January’, ‘February’, etc. Secondly, notice that I had to name the pivot table (as pvt here). Even though I don’t reference it, if I omit it, the Sql statement won’t parse and execute. Look at the results below. Notice a third strange thing. Even though I selected out all of the columns (using SELECT *), it did not actually return all columns like you would normally see in a join situation. It showed the AccountNumber column from the SalesData CTE, but then it ignored the OrderMonthNames and TotalSales columns from SalesData. The PIVOT was smart enough to know that those values were being represented in other ways in our new dataset.

Pivoted Month Results

If you want/need to get rid of the NULLS in the results, you can do it the same way that you would do it in a regular query. I could change the “SELECT *” to be “SELECT AccountNumber, ISNULL(January, 0.00) AS January, ISNULL(February, 0.00) AS February” and just continue that pattern for each month.

Simple enough and that gets us pretty far along the way. But, what if I wanted to instead see customer sales by product category? I could see every product category that exists in the system right now and do a static pivot. However, everytime someone updated the system with a new product category, we would have to modify our query (potentially many queries across the enterprise). Is there a way to do that?

No, not directly.

However, we can use Dynamic Sql to dynamically determine our static columns, if that makes sense 😉 Let’s take a look at the query needed to just return the standard results, grouped by customer and product category.

USE AdventureWorks2014
GO

SELECT c.AccountNumber, pc.Name + '-' + ps.Name AS CategoryName, CAST(SUM(d.LineTotal) AS NUMERIC(12,2)) as TotalSales
FROM Sales.SalesOrderHeader h 
INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
INNER JOIN Production.Product p ON d.ProductID = p.ProductID
INNER JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
INNER JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
WHERE YEAR(h.OrderDate) = 2011
GROUP BY c.AccountNumber, pc.Name, ps.Name

This returns the following results:
Initial Dynamic Pivot Results

There are 8 unique category – subcategory combinations in these results, but it doesn’t take much imagination to see how new products can be added all the time. Probably several new subcategories a year. How can we handle that? The key is those columns listed out in the PIVOT operator definition. Let’s see how we can build that dynamically using a temp table to make working with the data easier.

USE AdventureWorks2014
GO

SELECT c.AccountNumber, pc.Name + '-' + ps.Name AS CategoryName, CAST(SUM(d.LineTotal) AS NUMERIC(12,2)) as TotalSales
INTO #SalesData
FROM Sales.SalesOrderHeader h 
INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
INNER JOIN Production.Product p ON d.ProductID = p.ProductID
INNER JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
INNER JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
WHERE YEAR(h.OrderDate) = 2011
GROUP BY c.AccountNumber, pc.Name, ps.Name

DECLARE @PivotColumns NVARCHAR(MAX), @SqlStatement NVARCHAR(MAX)

SELECT @PivotColumns= ISNULL(@PivotColumns + ',','') + QUOTENAME(CategoryName) FROM (SELECT DISTINCT CategoryName FROM #SalesData) AS Categories

SET @SqlStatement = 
	N'SELECT AccountNumber, ' + @PivotColumns + '
    FROM #SalesData
    PIVOT(SUM(TotalSales) 
          FOR CategoryName IN (' + @PivotColumns + ')) AS pvt'

EXEC sp_executesql @SqlStatement

DROP TABLE #SalesData

The bulk of the work here is just being done in this line

SELECT @PivotColumns= ISNULL(@PivotColumns + ',','') + QUOTENAME(CategoryName) FROM (SELECT DISTINCT CategoryName FROM #SalesData) AS Categories

That just select a distinct list of categories and builds a comma delimited string out of them. The value in that variable looks like this:
[Bikes-Road Bikes],[Clothing-Socks],[Clothing-Jerseys],[Clothing-Caps],[Components-Mountain Frames],[Accessories-Helmets],[Bikes-Mountain Bikes],[Components-Road Frames]

Because of the spacing in the categories, that’s why we used QUOTENAME. That makes sure the name will work in SQL statements. Ultimately, the executed query looks like this:

SELECT AccountNumber, [Bikes-Road Bikes],[Clothing-Socks],[Clothing-Jerseys],[Clothing-Caps],[Components-Mountain Frames],[Accessories-Helmets],[Bikes-Mountain Bikes],[Components-Road Frames]
FROM #SalesData
PIVOT(SUM(TotalSales) FOR CategoryName IN ([Bikes-Road Bikes],[Clothing-Socks],[Clothing-Jerseys],[Clothing-Caps],[Components-Mountain Frames],[Accessories-Helmets],[Bikes-Mountain Bikes],[Components-Road Frames])) AS pvt

You can see how that is now very much like our initial static example with the months. The query still has the NULL problem, though, so to fix that you can create another variable so the SELECTed columns can contain ISNULL but the PIVOTed columns will not. That looks like this:

USE AdventureWorks2014
GO

SELECT c.AccountNumber, pc.Name + '-' + ps.Name AS CategoryName, CAST(SUM(d.LineTotal) AS NUMERIC(12,2)) as TotalSales
INTO #SalesData
FROM Sales.SalesOrderHeader h 
INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
INNER JOIN Production.Product p ON d.ProductID = p.ProductID
INNER JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
INNER JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
WHERE YEAR(h.OrderDate) = 2011
GROUP BY c.AccountNumber, pc.Name, ps.Name


DECLARE @PivotColumns NVARCHAR(MAX), @SelectColumns NVARCHAR(MAX), @SqlStatement NVARCHAR(MAX)

SELECT @PivotColumns= ISNULL(@PivotColumns + ',','') + QUOTENAME(CategoryName) FROM (SELECT DISTINCT CategoryName FROM #SalesData) AS Categories
SELECT @SelectColumns = ISNULL(@SelectColumns + ',','') + CategoryName FROM (SELECT DISTINCT 'ISNULL(' + QUOTENAME(CategoryName) + ', ''0.00'') AS ' + QUOTENAME(CategoryName) AS CategoryName FROM #SalesData) AS Categories

SET @SqlStatement = 
	N'SELECT AccountNumber, ' + @SelectColumns + '
    FROM #SalesData
    PIVOT(SUM(TotalSales) 
          FOR CategoryName IN (' + @PivotColumns + ')) AS pvt'

EXEC sp_executesql @SqlStatement

DROP TABLE #SalesData

Basically, I just made the distinct list of category names to be a distinct list of category names wrapped in ISNULL. Then I used the @SelectColumns variable in the select statement and the @PivotColumns variable in the pivot list. The resulting Sql looks like this:

SELECT AccountNumber, ISNULL([Accessories-Helmets], '0.00') AS [Accessories-Helmets],ISNULL([Components-Mountain Frames], '0.00') AS [Components-Mountain Frames],ISNULL([Clothing-Caps], '0.00') AS [Clothing-Caps],ISNULL([Components-Road Frames], '0.00') AS [Components-Road Frames],ISNULL([Bikes-Road Bikes], '0.00') AS [Bikes-Road Bikes],ISNULL([Clothing-Socks], '0.00') AS [Clothing-Socks],ISNULL([Clothing-Jerseys], '0.00') AS [Clothing-Jerseys],ISNULL([Bikes-Mountain Bikes], '0.00') AS [Bikes-Mountain Bikes]
    FROM #SalesData
    PIVOT(SUM(TotalSales) 
          FOR CategoryName IN ([Bikes-Road Bikes],[Clothing-Socks],[Clothing-Jerseys],[Clothing-Caps],[Components-Mountain Frames],[Accessories-Helmets],[Bikes-Mountain Bikes],
[Components-Road Frames])) AS pvt

And the resulting data looks like this:
Dynamic Pivot Final Results

Hopefully you find this helpful (I’m sure future me will be back here). If you have any questions, let me know in the comments.

Podcasts

Podcast Episode 41 – Show Update and the Module Dependency Fiasco

DominoesLast time, I was looking for some feedback about how this show should go in the future (and if it should go on at all). You guys responded and I talk about that response and what to expect from me in the future. I also start talking about the left-pad NPM module that “broke the Internet” and a blog post I found about it. Then, I get a little worked up about it and go on a little rant of my own 😉


Links Mentioned in this Show:
Haney – Have We Forgotten How to Program?
Azer – I’ve Just Liberated My Modules
Kik – A Discussion About the Breaking of the Internet

You can also subscribe to the podcast at any of these places:
iTunes Link RSS Feed

Thanks to all the people who listen, and a special thanks to those who have rated me. I really appreciate it.

The episodes have been archived. Click Here to see the archive page.

Swift

An Introduction to Swift Optionals

Swift OptionalsIf you are anything like me, the concept of Swift Optionals can be a little confusing. At first blush, it seems fine, but then the way that you have to work with them can leave you scratching your head.

The idea is simple. Swift Optionals are essentially nullable/nillable types from other languages. Obviously, reference types can always be null or nil (depending on the language), but value types cannot. In most (if not all) languages that I am familiar with, you cannot have a null integer or a null boolean. You used to have to be left with creating some logic or understanding that if an Id column was 0 or -1, that meant that it had never been set. Or, you had to never fully rely on a boolean’s false value. It could mean no one had ever set some conditional, or it could mean they had “chosen” and they had chosen “no” or “off”.

In C#, they implemented nullable types to solve this problem. I have an example below of using nullable types. You’ll note that you can actually declare them two ways. Like many things in modern languages, there is the “defined” way and the “shorthand” way. There are a few things I want you to note. First, before you use it, you should check and see if it has a value by consulting its “HasValue” property. But, how could you do that if it is null? It isn’t actually null. The nullable type is actually a container. You are asking the container if anything is inside. The other thing to note is that when I want to use it, .Net infers the call to id.Value for instance. I left it as implicit for id and explicit for otherId so you can see the difference.

Nullable<Int32> id = null;
int? otherId = 7; // int? is just "syntatic sugar" for Nullable<Int32>

if (id.HasValue)
{
	Console.WriteLine("The id's value is {0}", id);
}
else
{
	Console.WriteLine("Id has a null value");
}

// We could also be more concise, but maybe less readable
Console.WriteLine(otherId.HasValue ? string.Format("Other Id's value is {0}", otherId.Value) : "Other Id has a null value");

If I had called .Value on id while it was set to null, I would have gotten an error that said “Nullable object must have a value”. That’s why you must check.

So, how does this relate to Swift? The syntax is actually very similar to C#’s. I use the question mark to denote that I’ve declared an optional variable. However, if I try to use it like C# and just print it out, I get something that can be confusing. Take a look at this code:

var id: Int? = 7

print(id)

Here is what is output, “Optional(7)”. Unlike C#, it didn’t just auto-unwrap and show 7. It is showing you visibly that that 7 is “wrapped inside an Optional”. That is because – like C# – the ? is “syntactic sugar”. What you are really declaring is this:

var id: Optional<Int> = Optional<Int>(7)

print(id)

Now it is really no wonder why we see “Optional(7)” as the output. Swift is literally just giving us back what is there. What if you wanted to unwrap it, then? To unwrap it, you use the ! symbol, like this:

var id: Int? = 7

print(id!)

That will actually output “7” to your console. But, what will this code do?

var id: Int? = nil

print(id!)

If I try to force unwrap a nil value, I get this error, “fatal error: unexpectedly found nil while unwrapping an Optional value”. Bad news. One way that you often see this dealt with is like this:

var id: Int? = nil

if let unwrappedId = id {
    print("Id unwrapped is \(unwrappedId)")
}
else {
    print("That was nil")
}

In our case, this code prints “That was nil”. If you change the first line to var id: Int? = 7, it will now output “Id unwrapped is 7”. But… why? This syntax makes use of a few things. First, nil evaluates as false in Swift. Secondly, the “if let” is special syntax to Swift. The variable is only unwrapped in this way in an “if let” scenario. And “if let” only works if you have an Optional being assigned.

For instance, I can’t do either of these:

if let seven = 7 {
    print ("Worked")
}

var id: Int = 8

if let eight = id {
    print("Bingo")
}

The compiler errors on both items say, “initializer for conditional binding must have Optional type, not ‘Int'”.

So, “if let” is what you have to work with if you want to check variables and do something different in execution if they are not there. However, you have other options. What if you are handed an Optional, but you just want to work with a “sensible default”? You can stick with “if let” if that makes sense to you. That would look like this:

var incomingVariable: Int? = nil

var myInternalVariable:Int

if let unwrappedIncomingVariable = incomingVariable {
    myInternalVariable = unwrappedIncomingVariable
}
else {
    myInternalVariable = 0
}

print(myInternalVariable)

Another option is to skip the “else” entirely and just preset your variable to what the default should be. If the Optional variable has a value, it will just be overwritten. The previous code and this code both print out “0” to the console.

var incomingVariable: Int? = nil

var myInternalVariable:Int = 0

if let unwrappedIncomingVariable = incomingVariable {
    myInternalVariable = unwrappedIncomingVariable
}

print(myInternalVariable)

What if you would like to be a little more concise? You do have the option of using the Swift coalesce operator, ??. That would reduce the previous example to something like this:

var incomingVariable: Int? = nil

var myInternalVariable:Int = incomingVariable ?? 0

print(myInternalVariable)

That’s all there is to Optionals at their core. In another post, I’ll revisit Optionals and see how you can use them in iOS applications to deal with controls and casts. If you have any questions, let me know.