Skip to content

T-SQL Tuesday #125: Unit testing databases – we need to do this!!

April 14, 2020

This is my second time taking part in T-SQL Tuesday This week the topic is all about unit testing in databases, and whether it is valuable or not.

This is a topic that is quite close to me heart. I don’t come from a computing background before I started working with SQL Server, so I was quite ignorant when it came to a lot of best practices that other developers who have worked with other languages are aware of. Because of this, I had no idea about unit testing until I attended a talk at a SQL Saturday all about tSQLt. If anyone isn’t aware (as I wasn’t) tSQLt is a free open source unit testing framework for use in SQL Server databases. It is the basis of Redgate’s SQL Test software, and is the most used framework for writing unit tests in SQL Server.

Since then I’ve worked to try and get employers to adopt this as part of a standard development life cycle, with mixed success at best. My current employer is quite keen, but there are two major problems. First, we have a huge amount of legacy code that obviously has no unit tests in place; and second, the way people code is not conducive to unit testing.

It’s the second issue I want to talk about today, maybe I will cover writing unit tests for legacy systems in another blog someday but for now I want to discuss how the way you code may need to change if you adopt unit testing, in particular how you need to start adopting a more modular style of coding.

What is a unit test?

Because there can be a lot of confusion about this, I thought it was best to start by defining unit tests. I see good unit tests as having a few main characteristics:

  1. They test against units of code (stored procedures, functions, views etc.)
  2. Each unit test only tests one thing.
  3. A unit test isolates all the dependencies before testing.
  4. Unit tests can be rerun in an automated way.
  5. Unit tests always pass or always fail for a given set of inputs.

In the original T-SQL Tuesday prompt post, these are listed as:

  1. Decisive – the unit test has all info to determine success/failure
  2. Valid – it produces a result that matches the intention of the code written
  3. Complete – it contains all information it needs to run correctly within the test harness
  4. Repeatable – always gives the same results if the test harness and code are same
  5. Isolated – is not affected by other tests run before nor affects the tests run after it
  6. Automated – requires only a start signal in order to run to completion

How this affects the code I write

Whichever set of criteria you want to use, the results for your style of coding start to look the same. Put simply, unit tests are easier to write when a single unit of code does one thing and one thing only. That thing can be extremely complicated (in which case you will probably have quite a few unit tests around it) or very simple (in which case it may not even need testing at all) but your unit of code should not be attempting to do more than one thing at once.

What do I mean by this? Well, if we look at the Wide World Importers Microsoft sample database, we see a stored procedure called Website.RecordColdRoomTemperatures:

CREATE PROCEDURE Website.RecordColdRoomTemperatures
      @SensorReadings Website.SensorDataList READONLY
AS
    BEGIN TRY

		DECLARE @NumberOfReadings int = (SELECT MAX(SensorDataListID) FROM @SensorReadings);
		DECLARE @Counter int = (SELECT MIN(SensorDataListID) FROM @SensorReadings);

		DECLARE @ColdRoomSensorNumber int;
		DECLARE @RecordedWhen datetime2(7);
		DECLARE @Temperature decimal(18,2);

		-- note that we cannot use a merge here because multiple readings might exist for each sensor

		WHILE @Counter <= @NumberOfReadings
		BEGIN
			SELECT @ColdRoomSensorNumber = ColdRoomSensorNumber,
			       @RecordedWhen = RecordedWhen,
				   @Temperature = Temperature
			FROM @SensorReadings
			WHERE SensorDataListID = @Counter;

			UPDATE Warehouse.ColdRoomTemperatures
				SET RecordedWhen = @RecordedWhen,
				    Temperature = @Temperature
			WHERE ColdRoomSensorNumber = @ColdRoomSensorNumber;

			IF @@ROWCOUNT = 0
			BEGIN
				INSERT Warehouse.ColdRoomTemperatures
					(ColdRoomSensorNumber, RecordedWhen, Temperature)
				VALUES (@ColdRoomSensorNumber, @RecordedWhen, @Temperature);
			END;

			SET @Counter += 1;
		END;

    END TRY
    BEGIN CATCH
        THROW 51000, N'Unable to apply the sensor data', 2;

        RETURN 1;
    END CATCH;
END;

We can see the procedure is doing a couple of things here. First it takes the input parameter @SensorReadings as a custom table data type and iterates through it. As it does this, it inserts or updates the Warehouse.ColdRoomTemperatures table with the values from the current row of the table variable. This is not the most awkward thing to test, but it could be made simpler if the code between rows 23-39 is put into its own stored procedure. Then the outer procedure would look more like this:

CREATE PROCEDURE Website.RecordColdRoomTemperatures
      @SensorReadings Website.SensorDataList READONLY
AS
    BEGIN TRY

		DECLARE @NumberOfReadings int = (SELECT MAX(SensorDataListID) FROM @SensorReadings);
		DECLARE @Counter int = (SELECT MIN(SensorDataListID) FROM @SensorReadings);

		DECLARE @ColdRoomSensorNumber int;
		DECLARE @RecordedWhen datetime2(7);
		DECLARE @Temperature decimal(18,2);

		WHILE @Counter <= @NumberOfReadings
		BEGIN
			SELECT @ColdRoomSensorNumber = ColdRoomSensorNumber,
			       @RecordedWhen = RecordedWhen,
				   @Temperature = Temperature
			FROM @SensorReadings
			WHERE SensorDataListID = @Counter;

			EXEC Warehouse.UpdateColdRoomTemperatureBySensorNumber
                  @ColdRoomSensorNumber = @ColdRoomSensorNumber
                , @RecoredeWhen = @RecordedWhen
                , @Temperature = @Temperature

			SET @Counter += 1;
		END;

    END TRY
    BEGIN CATCH
        THROW 51000, N'Unable to apply the sensor data', 2;

        RETURN 1;
    END CATCH;
END;

This way, we can write unit tests against the outer procedure to see if it is looping effectively, and separate unit tests against the inner procedure to test if it is updating the table correctly. A nice side-effect is if we want to write any other code that wants to merge a single row into the Warehouse.ColdRoomTemperatures table we can use the Warehouse.UpdateColdRoomTemperatureBySensorNumber stored procedure.

This, really, is what modular coding is all about. Making sure that each module in your code base is only trying to do one thing. You can then wrap unit tests around that module to make sure it does that thing well, and re-use it over and over again throughout your code whenever you need to do that one thing.

To keep with the example above, I don’t like the way the code does the UPDATE and then the INSERT if @@ROWCOUNT = 0. Despite the comment in the code, you can use a MERGE provided you only merge in the one row being added in the WHILE block. The end result of this change should function the same as the original code but look more elegant. If I have the code related to the Warehouse.ColdRoomTemperatures in its own stored procedure, with some unit tests around it, I can change that stored procedure however I like provided the unit tests still pass, confident that any calling stored procedures will still function the same.

Orchestration procedures and code contracts

A key part of the modular approach to code are the orchestration procedures. These are the procedures that sit on top of everything else, call other stored procedures, manage control flow, pass variables etc. They do no do much themselves but decide what will be done. The example procedure above is a simple orchestration procedure, but they can get significantly more complex. They might function as the APIs of your database, called by external applications to do something or return some values.

They are quite simple to unit test really. All you do is map the possible paths the orchestration process could take, depending on values passed in and values retrieved during the process. Then you write a test for each of these paths e.g. if Parameter A is set to 32 and function X returns less than 10,000 we should execute stored procedures 3, 5, and 17 with specific parameters. Whether function X, or any of the stored procedures, perform as expected is not something to worry about when testing an orchestration procedure, that is taken care of by the unit tests around these code units.

This idea can also be thought of as a contract between a particular unit of code and the rest of the database. Defined by a specification (hopefully well documented somewhere, but we can’t expect miracles) and enforced by the unit tests, this contract says that this unit of code will behave in this way. You can change the unit however you like, provided it doesn’t break this contract, and if at some point you find it does, then you will most likely find yourself faced with a long, tedious job of trawling through every other bit of code that calls it to make sure this new contract still fits what they expect from the code unit.

Summary

If you adopt unit testing, you may need to change your coding style to a more modular one in order to get the best from it. This does come with other benefits, however, like clearer separation of responsibilities in a database, easier code reuse, and better defined functional units of code. I think even without unit tests, more modular coding is the way to go when possible (and sometimes performance issue will make it impossible) for all of the reasons just mentioned. I also think that writing unit tests changes your perspective on the code you write, it helps you think about things like error handling, and code contracts, and code reuse, and will make you a better coder.

From → Uncategorized

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: