Avoiding coding traps

October 27, 2020

Brent Ozar put up a post last week called If Your Trigger Uses UPDATE(), It’s Probably Broken. In it he talks about the UPDATE() function and how it can cause unexpected behaviour in your code.

For anyone not aware, UPDATE() is a function that can be used in triggers, and which returns TRUE if an INSERT or UPDATE action happens on the column passed into the function. So, something like this:

CREATE TABLE dbo.Primarch
    (
          PrimarchID INT IDENTITY(1, 1) NOT NULL
        , PrmarchName VARCHAR(200) NOT NULL
        , ChatpterName VARCHAR(200) NULL
        , LastKnownStatus VARCHAR(100) NOT NULL
        , LastKnownLocation VARCHAR(100) NULL
        , YearsAtLastKnownLocation VARCHAR(100) NULL
        , NumberOfSuccessorChapters INT NOT NULL
    );
GO

CREATE TRIGGER Primarch_Update
    ON dbo.Primarch
    AFTER UPDATE
AS
    IF UPDATE(LastKnownLocation)
        UPDATE Prim
        SET
             YearsAtLastKnownLocation = 0
        FROM dbo.Primarch AS Prim
        INNER JOIN inserted AS ins
            ON Prim.PrimarchID = ins.PrimarchID

The idea here is to set the YearsAnLastKnownLocation to 0 if the LastKnownLocation is changed, with the logic being we are receiving real-time information on the last known location of the primarchs (this example is a bit contrived but go with it). However, if you want to run a full update on a primarch you might have a bit of code like this:

CREATE PROCEDURE dbo.UpdatePrimarchInformation
    (
          @PrimarchID INT
        , @LastKnownStatus VARCHAR(100)
        , @LastKnownLocation VARCHAR(100)
        , @NumberOfSuccessorChapters INT
    )
AS
    UPDATE Prim
    SET
          LastKnownStatus = @LastKnownStatus
        , LastKnownLocation = @LastKnownLocation
        , NumberOfSuccessorChapters = @NumberOfSuccessorChapters
    FROM dbo.Primarch AS Prim
    WHERE 1 = 1
        AND Prim.PrimarchID = @PrimarchID

Now, when you run this and input the current LastKnownLocation and LastKnownStatus, and a new NumberOfSuccessorChapters, you still trigger the update of YearsAtLastKnownLocation to 0.

The exact reasons why aren’t important, and Brent does a typically great job of explaining the issue on his blog. The reason for this post is I wanted to look at these types of errors a bit more, and a coding principal I like to adopt that helps to avoid them.

Coding traps, not bugs

Brent calls this a bug at one point, and while I agree with him that it’s bad practice to code like this I wouldn’t call it a bug myself. The reasoning behind that is without the second bit of code (the stored proc) the application functions as it was intended to. It’s not until someone codes in something that might update LastKnownLocation without changing the value that you start to have undesirable behaviour in the app.

It is, however, absolutely a trap for any future developer working on the app. The stored proc above is an entirely reasonable bit of code to write, most developers I know won’t check every table their code hits for triggers, and similarly won’t read every bit of documentation on a table before writing a simple bit of code against it.

Some people might say that testing should save you, and in theory it should but things do get missed in testing, and that’s assuming you’re working somewhere that encourages testing in the first place. And in any case, even if it does get caught in testing, that’s another bit of work for the dev to do to figure out why this update is causing the YearsAtLastKnownLocation column to always set to 0.

That’s extra time to do the work, extra stress on the dev, and all because of a bit of code in the trigger that could be written better.

Coding to the existing code

So, this is the principal I want to talk about, that I try to work to as much as I can. Essentially it means you code to what is possible within the existing code base, not what people tell you the existing code base will be used for.

One great example of this is always coding around NULLs if they are allowed in a column. It doesn’t matter if the business assures you there will never be a NULL in that column, you should always assume there will be. As an example, the following code assumes the LastKnownLocation will never be NULL:

SELECT
      Prim.PrmarchName 
        + ', primarch of the ' 
        + Prim.ChatpterName 
        + ' chapter. Current status: ' 
        + Prim.LastKnownStatus 
        + '. Last seen at ' 
        + Prim.LastKnownLocation 
        + ',  ' 
        + CAST(Prim.YearsAtLastKnownLocation AS VARCHAR(100)) 
        + ' years ago.'
FROM dbo.Primarch AS Prim
WHERE 1 = 1
    AND Prim.PrimarchID = @PrimarchID

If the LastKnownLocation is NULL, because we don’t have a last known location for the Primarch, the whole string returns as NULL. You may have a guarantee from the project manager, or the business sponsor, that we will always have a value for LastKnownLocation, but the code says something different. If you follow the business steer in these circumstances, and code as if the column will never be NULL, you are introducing a trap for developers working on this software in the future. One of them will see the column allows NULLs and will write some code that can add a NULL into the column, and then this bit of display code somewhere else in the app will break.

Another example that may be familiar to people who’ve written a lot of ETL processes is when you come to create an import table. In a lot of cases you are importing from files that don’t necessarily remain the same every time, so you have a process that reads the column headers from the file and creates a table to hold them all, something a bit like this:

DECLARE @ColumnList AS TABLE
    (
          ColumnName VARCHAR(255)
        , OrdinalPosition SMALLINT
    );
INSERT INTO @ColumnList
    (
          ColumnName
        , OrdinalPosition
    )
VALUES
      ('PrmarchName', 1)
    , ('ChatpterName', 2)
    , ('LastKnownStatus', 3)
    , ('LastKnownLocation', 4)
    , ('YearsAtLastKnownLocation', 5)
    , ('NumberOfSuccessorChapters', 6);

DECLARE
      @Counter AS INT
    , @SQLScript AS NVARCHAR(MAX) = N'CREATE TABLE dbo.PrimarchImport(RowNumber INT IDENTITY(1, 1) NOT NULL';

SELECT @Counter = MIN(OrdinalPosition) FROM @ColumnList;

WHILE @Counter <= (SELECT MAX(OrdinalPosition) FROM @ColumnList)
BEGIN
    SELECT
          @SQLScript = @SQLScript + N', ' + ColLst.ColumnName + N' NVARCHAR(MAX) NULL'
    FROM @ColumnList AS ColLst
    WHERE 1 = 1
        AND ColLst.OrdinalPosition = @Counter;

    SET @Counter = @Counter + 1;
END

SET @SQLScript = @SQLScript + N');';

EXEC sp_executesql
      @stmt = @SQLScript;

This is a bit of a cut down version as I’m just adding the column names direct to the @ColumnList table variable rather than generating them with C# code or something else, but it works for our purposes. This particular block of code will run just fine, but what happens when we get a column name in the source file with a space, or a column, or anything else that will cause us issues with our create table script? And again, we may be promised that this will never happen, but why take the risk? All we need to do to fix it is escape the column names in our dynamic SQL by wrapping them in [].

In both of these examples the application may work as expected for years, or even forever. But it may not, and if it it does break the root cause of the problem will be you not coding for a scenario the existing code told you was possible.

From → key concepts, T-SQL

Avoiding coding traps

Coding traps, not bugs

Coding to the existing code

Leave a comment Cancel reply

Recent Posts

Archives

Categories

Meta

Subscribe to Blog via Email

Avoiding coding traps

Coding traps, not bugs

Coding to the existing code

Share this:

Related

Leave a comment Cancel reply

Recent Posts

Archives

Categories

Meta

Subscribe to Blog via Email