Chris Johnson

June 16, 2020

Service Broker 101 Lesson 5: Stored procedures to handle messages

At this point we’ve set up our message types, our contracts, our services, and our queues. We’ve opened conversations and sent messages across them (but only initiator to target), and we’ve seen how to get those messages off the target queue. This lesson will go through the last step in the process, doing something with these messages.

Message handling from outside of the queue

This is the easier option, but it will probably cause you some issues down the line. This is where you let messages accumulate on the queue, and at set times run a stored procedure (or other statement) to do something with them. It might look something like this:

DECLARE @Messages AS TABLE
    (
          MessageType NVARCHAR(256) NOT NULL
        , MessageBody XML NULL
    );

RECEIVE
      message_type_name
    , CAST(message_body AS XML) AS XMLMessageBody
FROM dbo.TargetQueue
INTO @Messages;

INSERT INTO dbo.GoodMessages
    (
          MessageType
        , MessageBody
    )
SELECT
      Msg.MessageType
    , Msg.MessageBody
FROM @Messages AS Msg
WHERE 1 = 1
    AND Msg.MessageType IN
        (     'ServiceBrokerExample/Example1/MessageType/Outgoing'
            , 'ServiceBrokerExample/Example2/MessageType/Outgoing');

INSERT INTO dbo.ServiceBrokerErrors
    (
          MessageType
        , MessageBody
    )
SELECT
      Msg.MessageType
    , Msg.MessageBody
FROM @Messages AS Msg
WHERE 1 = 1
    AND Msg.MessageType LIKE 'ServiceBrokerExample/Example[0-9]/MessageType/Error';

This code will retrieve every message from the queue and add it to one of the two tables depending on the message type. That processes the messages and empties the queue, but what about all of the conversations that have been left open? We could execute an END CONVERSATION statement for each conversation_handle in the queue, but that will only close the conversation from the target side, because the handles on the target queue belong to the target endpoints. We could END CONVERSATION WITH CLEANUP, but that’s a brute force approach, like KILLing a connection, and should really be a last resort. In any case, we don’t know for sure which of these conversations we do want to end. If we are doing something more complex, maybe we want to keep a conversation going for some time.

Message handling using stored procedures as queue handlers

As discussed in the last lesson, you can attach a stored procedure to a queue when creating or altering it using the following syntax:

ALTER QUEUE dbo.TargetQueue
WITH
      ACTIVATION
        (
              STATUS = ON
            , PROCEDURE_NAME = dbo.TargetQueueProcedure
            , MAX_QUEUE_READERS = 4
            , EXECUTE AS SELF
        )

This stored procedure is used by the queue to handle messages that arrive. When a message arrives the queue will execute the procedure over and over until there are no more messages on the queue. Your stored procedure therefore needs to be removing messages as it goes. You have the option to have multiple instances of the procedure executing at once, to clear down a queue faster or to keep up with a high volume of messages, using the MAX_QUEUE_READERS setting. You can turn the stored procedure on or off using the STATUS setting. While this is set to OFF nothing will happen, but as soon as it is set to ON the procedure will start processing messages again. Finally, you need to specify what user the procedure will execute under. The options here are SELF, which is the current user (the person who runs the CREATE or ALTER script); OWNER, which is the person who owns the queue; or a username that the current user has impersonate permissions for.
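
As a rough sketch of switching the handler off and on again, assuming the dbo.TargetQueue name used above and that activation has already been configured on it, something like this should work:

```sql
-- Pause activation: messages still arrive on the queue,
-- but the stored procedure is not started for them.
ALTER QUEUE dbo.TargetQueue
WITH ACTIVATION (STATUS = OFF);

-- Resume activation: anything that built up in the meantime gets processed.
ALTER QUEUE dbo.TargetQueue
WITH ACTIVATION (STATUS = ON);

-- Check the current activation settings for the queue.
SELECT
      name
    , is_activation_enabled
    , activation_procedure
    , max_readers
    , execute_as_principal_id
FROM sys.service_queues
WHERE name = N'TargetQueue';
```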

Below is an example of how you might set out a stored procedure you want to run as a queue handler:

DECLARE
      @ConversationHandle UNIQUEIDENTIFIER
    , @MessageType NVARCHAR(256)
    , @MessageBody XML
    , @ReceivedMessage BIT = 1;

WHILE @ReceivedMessage = 1
BEGIN
    SET @ReceivedMessage = 0;
    WAITFOR
        (
            RECEIVE TOP (1)
                  @ConversationHandle = conversation_handle
                , @MessageBody = message_body
                , @MessageType = message_type_name
                , @ReceivedMessage = 1
            FROM dbo.TargetQueue
        ), TIMEOUT 1000;

    IF @ReceivedMessage = 1
    BEGIN
        IF @MessageType IN
                ('ServiceBrokerExample/Example1/MessageType/Outgoing'
                , 'ServiceBrokerExample/Example2/MessageType/Outgoing')
            INSERT INTO dbo.GoodMessages
                (
                      MessageType
                    , MessageBody
                )
            VALUES (@MessageType, @MessageBody);
            
        IF @MessageType LIKE 'ServiceBrokerExample/Example[0-9]/MessageType/Error'
            INSERT INTO dbo.ServiceBrokerErrors
                (
                      MessageType
                    , MessageBody
                )
            VALUES (@MessageType, @MessageBody);
        
        END CONVERSATION @ConversationHandle;
    END
END

The WAITFOR, TIMEOUT 1000 that we wrap around the RECEIVE TOP (1) does pretty much what you might expect. The timeout is given in milliseconds, so it waits up to 1000 milliseconds (one second) to see if we can retrieve a message from the queue, and returns either at the one second mark or as soon as a message is received. The WHILE @ReceivedMessage = 1 ensures that the query will keep executing as long as it finds messages, and will end as soon as it does not.

Once the message has been received we do something with it, in this case adding it to different tables depending on the message type, and then close the conversation. We could also send messages back if we wished, using the same conversation, and keep the conversation open if we expect to receive more messages from our initiator queue in response. We could even open more conversations and send messages on to further queues.

The initiator queue will need a similar query attached to it. At the simplest, this should just look for the Microsoft “http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog” message type, and end the conversation from the initiator endpoint as well. It may also do other things, depending on how complex you want the conversation to be, but the most important thing is that any possible conversation path ends with both sides executing an END CONVERSATION eventually. Otherwise conversation endpoints will persist and cause performance issues down the line.
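
A minimal sketch of what that initiator-side handler might look like is below. The procedure name dbo.SourceQueueProcedure is made up for illustration, dbo.SourceQueue is assumed to be the initiator queue, and ending the conversation on the built-in Error message type as well as EndDialog is my own assumption rather than something the lesson requires:

```sql
CREATE PROCEDURE dbo.SourceQueueProcedure -- hypothetical name
AS
BEGIN
    DECLARE
          @ConversationHandle UNIQUEIDENTIFIER
        , @MessageType NVARCHAR(256);

    WHILE 1 = 1
    BEGIN
        -- Reset so a previous message type can't leak into this pass.
        SET @MessageType = NULL;

        WAITFOR
            (
                RECEIVE TOP (1)
                      @ConversationHandle = conversation_handle
                    , @MessageType = message_type_name
                FROM dbo.SourceQueue
            ), TIMEOUT 1000;

        -- No rows means the timeout expired with nothing on the queue.
        IF @@ROWCOUNT = 0
            BREAK;

        -- The other side has ended (or errored), so close our end too.
        IF @MessageType IN
                (  N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
                 , N'http://schemas.microsoft.com/SQL/ServiceBroker/Error')
            END CONVERSATION @ConversationHandle;
    END
END
```

The procedure would then be attached to the initiator queue with an ACTIVATION block, in the same way as the target queue procedure.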

POISON_MESSAGE_HANDLING

POISON_MESSAGE_HANDLING is another queue-level property you can set with a CREATE or ALTER QUEUE statement. A poison message in the service broker context is a message that causes the activation stored procedure to roll back the transaction when it executes. This will roll back the RECEIVE statement and put the message back on the queue, then the activation stored procedure will activate, attempt to process it again, and roll back again. This can cause an infinite loop, and potentially block up the queue or at the very least consume resources.

One way round this is to not include transactions and rollbacks in your query. This might be ok depending on how you handle the error, but what if the error handling also fails? Also, what if the failure is a temporary connection problem or something similar? In that case you ideally do want to roll back and try again.

The POISON_MESSAGE_HANDLING setting decides what the queue does if you are in this situation. The default is to set it to ON, in which case the queue will disable (STATUS = OFF) after 5 consecutive ROLLBACKs. If you set it to OFF, the queue will keep on trying to execute the stored procedure forever so if you are going to disable this you need to be confident your activation stored procedure can handle this scenario.
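
As a hedged sketch of how you might change and check this setting, again assuming the dbo.TargetQueue name from the earlier examples:

```sql
-- Turn automatic poison message handling off
-- (only sensible if the activation procedure copes with failures itself).
ALTER QUEUE dbo.TargetQueue
WITH POISON_MESSAGE_HANDLING (STATUS = OFF);

-- See whether the queue has been disabled, for example after repeated rollbacks.
SELECT
      name
    , is_receive_enabled
    , is_enqueue_enabled
    , is_poison_message_handling_enabled
FROM sys.service_queues
WHERE name = N'TargetQueue';

-- Switch the queue back on once the problem message has been dealt with.
ALTER QUEUE dbo.TargetQueue
WITH STATUS = ON;
```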

That’s all for now, next time the lesson will be a brief look at how you can extract data from the XML of a message.

June 10, 2020

Service Broker 101 Lesson 4: All about Queues

So far, this series has covered what service broker is, the different components that make it up, and how we use these to open a conversation and send a message to a queue. We’ve talked about queues quite a bit in these sessions, but never really gone into detail about what they are, so that is what this lesson is all about.

Retrieving data from Queues

A queue is a full database object, like a table or a stored procedure. As such, it is part of a schema, and appears in the sys.objects view. A queue holds messages that have been sent to it, in the same way that a table does, and these messages can even be queried in the same way that you would query a table.

You can’t change the columns that are available, and there are quite a few of them. To see what there is, just run SELECT * against any queue, but a few of the key ones are service_name, service_contract_name, message_type_name, message_body, message_enqueue_time, conversation_handle.

You can also retrieve data from a queue using a RECEIVE statement. These work very much like SELECTs, except that they remove the received message from the queue. This is how you would typically process messages, to make sure that the queue does not grow too large. Also note that the RECEIVE statement, like a CTE, needs the previous statement to terminate with a ;

RECEIVE TOP (1)
      priority
    , conversation_handle
    , message_type_name
    , CAST(message_body AS XML) AS XMLMessageBody
    , message_body
FROM dbo.TargetQueue

As you can see, you need to convert the message_body to XML or a string data type for it to make any sense. When you specify a TOP (X) in your RECEIVE statement, you will get the top X messages in the order they arrived (oldest first). You can also filter RECEIVE statements, but only on conversation_handle or conversation_group_id.
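
To illustrate the filtering, here is a small sketch that picks one conversation by querying the queue and then receives only that conversation’s messages. It assumes there is at least one message on dbo.TargetQueue when it runs:

```sql
DECLARE @ConversationHandle UNIQUEIDENTIFIER;

-- Query the queue like a table (this does not remove anything)
-- to pick a conversation to work with.
SELECT TOP (1)
    @ConversationHandle = conversation_handle
FROM dbo.TargetQueue;

-- Remove only the messages that belong to that conversation.
RECEIVE
      conversation_handle
    , message_type_name
    , CAST(message_body AS XML) AS XMLMessageBody
FROM dbo.TargetQueue
WHERE conversation_handle = @ConversationHandle;
```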

Creating queues

Queues can be a bit more complex to create, and there are a few things you need to be aware of. The basic queue creation is pretty simple:

CREATE QUEUE dbo.TargetQueue

But there are a few more options than we’ve seen for the other components we’ve created. Note that queues cannot use the CREATE OR ALTER syntax that was brought in with SQL Server 2016 SP1.

CREATE QUEUE dbo.TargetQueue
WITH
      STATUS = ON
    , RETENTION = OFF
    , ACTIVATION
        (
              STATUS = ON
            , PROCEDURE_NAME = dbo.TargetQueueProcedure
            , MAX_QUEUE_READERS = 4
            , EXECUTE AS SELF
        )
    , POISON_MESSAGE_HANDLING (STATUS = ON)

Everything in the WITH block can be changed with an ALTER QUEUE statement.

STATUS – This effectively says if the queue is active or not, so whether it can send and receive messages or not. The default is ON
RETENTION – If this is on, messages on the queue are kept until the end of a conversation regardless of if they have been RECEIVEd or not. It defaults to OFF
ACTIVATION – This allows you to attach a stored procedure to the queue, that will handle messages as they arrive. This is the subject of the next lesson, so I won’t say anything more about it here.
POISON_MESSAGE_HANDLING – This relates to the ACTIVATION section, so I will cover it next lesson.

All of these settings can be changed once the queue is created, using ALTER QUEUE. That’s pretty much all there is to queues, next lesson will dive deeper into the stored procedures we attach to them and how they work.

June 2, 2020

Service Broker 101 Lesson 3: Conversations and messages

The last post went through the basics of the different components Service Broker uses in SQL Server. I talked about message types, queues that send and receive messages, contracts that specify particular message types to be sent to the target or initiator queue, and services that attach to queues and specify the contracts that can target that queue.

This lesson is all about how we fit these things together to open a conversation and send a message. I’ve seen some other posts that rush through this so I want to take my time with this and go through it step by step.

The first thing we need is an open conversation. This is generated with the BEGIN DIALOG CONVERSATION statement. Oddly, the CONVERSATION part is redundant, despite the fact that every other command refers to these items as conversations, not dialogues. Anyway, the command to open a simple conversation is as follows:

DECLARE @ConversationHandle UNIQUEIDENTIFIER;

BEGIN DIALOG CONVERSATION @ConversationHandle
    FROM SERVICE [ServiceBrokerExample/Example1/Service/ServiceSource]
    TO SERVICE 'ServiceBrokerExample/Example1/Service/ServiceTarget'
    ON CONTRACT [ServiceBrokerExample/Example1/Contract/Complicated]

There are a few key things here.

First, each conversation has its own unique dialog_handle (not conversation handle, despite everything calling it a conversation from now on, score one for consistency Microsoft). We need to capture this handle in a UNIQUEIDENTIFIER variable, as we will need it later on to send messages across the conversation. In fact, the statement will error if you don’t supply a variable to capture the handle.

Second, we need to supply both FROM and TO services. These tell the conversation which service is the source and which is the target. Remember, each service is attached to a queue, and can have one or more contracts attached to it. The source service is a database object, but the target service is an NVARCHAR. This allows the target service to live outside the database, which is something that I will cover at some point in the Service Broker 201 series.

Finally we need to give the contract the conversation will obey. This contract details what message types can be sent by the initiator and target services, and it has to be one of the contracts allowed by the target service.

At this point we have a conversation, and we can start sending messages, the code to do that is below:

SEND
    ON CONVERSATION @ConversationHandle
    MESSAGE TYPE [ServiceBrokerExample/Example1/MessageType/Outgoing]
        ('<message>hello world</message>')

This sends the specified message type across the specified conversation from the initiator to the target. You can also specify the message you want to send in brackets after the message type, but this is not necessary unless the message type forces you to. Sometimes just having a message of a specified type is all the information you need.

Note that the initiator to target rule applies to any code executed in a standard session connected to the database that holds the initiating service. Retrieving a message from the queue, and sending messages from the target to the initiator will be covered in the next session about queues.

The last thing we need to do is close our conversation. To do this we call the END CONVERSATION @ConversationHandle statement. This sends a message using the inbuilt Microsoft message type http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog and closes the conversation from one end. To fully close the conversation you need to also call END CONVERSATION from the other end.

It’s important to always fully close conversations from both ends, otherwise they will hang around eating resources in your database. They may not be much individually, but over time they can add up significantly. A conversation does have a lifetime, specified in seconds, and you can set the lifetime when you create it, but if you don’t the default is 2,147,483,647 (max value of an INT) which is roughly 68 years.
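
As a sketch of both points, the code below opens a conversation with an explicit lifetime and then lists any conversation endpoints that have not been fully closed. The one hour lifetime is just an arbitrary value for illustration:

```sql
DECLARE @ConversationHandle UNIQUEIDENTIFIER;

-- Open a conversation that expires after an hour rather than roughly 68 years.
BEGIN DIALOG CONVERSATION @ConversationHandle
    FROM SERVICE [ServiceBrokerExample/Example1/Service/ServiceSource]
    TO SERVICE 'ServiceBrokerExample/Example1/Service/ServiceTarget'
    ON CONTRACT [ServiceBrokerExample/Example1/Contract/Complicated]
    WITH LIFETIME = 3600;

-- List conversation endpoints in this database that are not yet closed.
SELECT
      conversation_handle
    , is_initiator
    , state_desc
    , far_service
    , lifetime
FROM sys.conversation_endpoints
WHERE state_desc <> 'CLOSED';
```

If the second query keeps returning more and more rows over time, that is a good sign that one side of your conversations is not calling END CONVERSATION.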

That’s it for now, hopefully this stuff is starting to come together. The next session will be all about Queues, how we retrieve these messages from them, how we process the messages once we have them, and how we can send messages back from our target to the initiator.

May 29, 2020

Custom backgrounds in Microsoft Teams

Just a quick blog post today, taking you through how to add a custom background to Microsoft Teams.

Instead of doing something through Teams itself, you need to copy the custom image to the C:\Users\YOURUSERNAME\AppData\Roaming\Microsoft\Teams\Backgrounds\Uploads folder, obviously replacing YOURUSERNAME with your actual username. Note that if you try to navigate to this folder in file explorer, AppData is a hidden folder so you need to click on View and the Hidden Items checkbox before you can see it.

Once you have added the image to that folder, you can select it in Teams by clicking on the ellipsis during a call and selecting Show Background Effects. You should see a number of possible images, with your custom images at the end.

May 26, 2020

Service Broker 101 Lesson 2: Service Broker components and how they fit together

Last lesson I gave a very quick introduction to Service Broker, and outlined a couple of scenarios where it might be useful. This time I want to talk through the different components that make up Service Broker, how they fit together, and what part they play in sending and managing messages.

There are 4 main components that Service Broker needs you to create in SQL Server:

Message Type

Any message you send will have two components, the message type and the message. The CREATE MESSAGE TYPE syntax allows you to specify a message type, and what constitutes a valid message for that type.

CREATE MESSAGE TYPE [SBE/Example1/MessageType/Outgoing]
    VALIDATION = WELL_FORMED_XML;

In this case, we create a message type of SBE/Example1/MessageType/Outgoing, that takes a well-formed XML message. Other values for VALIDATION are NONE, where the message can contain anything or be NULL; EMPTY, where there is no message; or VALID_XML WITH SCHEMA COLLECTION schema_collection_name, where the message has to be XML that conforms to the specified schema collection, which must already exist.

SQL Server already has a collection of message types it uses to signal various events, you can find them in the sys.service_message_types.
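
For example, a quick query along these lines should list the built-in message types alongside any you have created yourself:

```sql
-- List every message type in the current database with its validation setting.
SELECT
      name
    , validation
    , validation_desc
FROM sys.service_message_types
ORDER BY name;
```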

Queues

Queues are the most similar to SQL objects we may be used to. They are also the only Service Broker object to appear in the sys.objects view, and to be owned by schemas. There is also quite a bit to say about them, so I will largely leave them to Lesson 4.

Contracts

Contracts define the types of conversation that can be had between Queues. Specifically, each contract defines the message types that can be sent by the initiator and target queues. Each message type included in the contract can be sent by the initiator, the target, or both.

CREATE CONTRACT [SBE/Example1/Contract/Complicated]
    (
          [SBE/Example1/MessageType/Outgoing] SENT BY INITIATOR
        , [SBE/Example1/MessageType/Reply] SENT BY TARGET
        , [SBE/Example1/MessageType/Alert] SENT BY TARGET
        , [SBE/Example1/MessageType/Error] SENT BY ANY
    )

In this example, once a conversation has been opened under this contract, the initiator queue (the queue that sent the first message) can send Outgoing or Error message types, and the target queue can send Reply, Alert, and Error message types. Note that once a conversation begins the initiator and target roles are locked for the purposes of that conversation.

Service

A service is the way a conversation connects to a queue. Each service sits above a single queue, although a queue can have multiple services. In order to be the target of a conversation, a service must also specify at least one contract that can be used to target it.

CREATE SERVICE [SBE/Example1/Service/ServiceTarget]
    ON QUEUE dbo.TargetQueue
    (
          [SBE/Example1/Contract/Complicated]
        , [SBE/Example1/Contract/Emergency]
    )

CREATE SERVICE [SBE/Example1/Service/ServiceSource]
    ON QUEUE dbo.SourceQueue

CREATE SERVICE [SBE/Example1/Service/ServiceEmergencySource]
    ON QUEUE dbo.EmergencySourceQueue

In this example, the ServiceTarget service allows the TargetQueue to be the target of conversations using the Complicated or Emergency contracts. The ServiceSource service allows the SourceQueue to initiate conversations but not be the target of them, and the ServiceEmergencySource service does the same for the EmergencySourceQueue.
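
If you want to check how these pieces have ended up wired together, a catalog query along these lines is one way to do it. This is only a sketch; services that cannot be targeted will show a NULL contract name:

```sql
-- Show each service, the queue it sits on, and the contracts that can target it.
SELECT
      Svc.name AS ServiceName
    , Que.name AS QueueName
    , Con.name AS ContractName
FROM sys.services AS Svc
INNER JOIN sys.service_queues AS Que
    ON Que.object_id = Svc.service_queue_id
LEFT JOIN sys.service_contract_usages AS Usg
    ON Usg.service_id = Svc.service_id
LEFT JOIN sys.service_contracts AS Con
    ON Con.service_contract_id = Usg.service_contract_id
ORDER BY
      Svc.name
    , Con.name;
```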

That’s it for this lesson, I know this is probably a little confusing at the moment but next lesson I’ll take us through opening a conversation and sending a message, and at that point things should get a little clearer.

May 19, 2020

Service Broker 101 Lesson 1: What is Service Broker?

If you don’t really know what Service Broker is, you’re not alone. I had probably heard the term a couple of times in my 14 years as a SQL developer, but had never come across anyone using it until I started my latest job. Even then, I only discovered it when I imported a database into an SSDT database project, and saw a Queue object appear.

I did a little investigation after that, and it seemed an interesting, if little used, piece of functionality. I didn’t really think anything more of it, but filed it away in the bit of my brain that stores things marked “investigate someday” (that part of my brain gets pretty cluttered and is seldom cleared out).

Then, recently, I had an issue where Service Broker seemed the perfect solution, so I spent a weekend experimenting, coded the fix using Service Broker, and that release is making its way through UAT at the moment.

But what is Service Broker?

I can hear you thinking “yeah, yeah, get to the point already”, so I will.

Service Broker is a queueing system within SQL Server. It allows you to send a message from one queue to another, and handle what happens when that message arrives. That’s pretty much it.

So, why would I want that?

Well, that’s where the story I told you at the start comes in. The issue we had was that our client wanted to run a process during the day that usually gets run at night. This process is pretty long, and locks up some tables for a minute or more, including some tables it locks through an indexed view (that’s a whole other issue that I’ll maybe blog about some other day). At the same time, users are logging onto the application to do various things, including downloading one-use vouchers. The stored proc that sits behind that API reads the data ok, but wants to write the record that these vouchers have been viewed to one of the locked tables.

What I’ve done is shift the write to table from the stored procedure to a queue. Now when a user requests their vouchers the system selects them from the right table, and fires off a message to the queue with all the details of the vouchers they just viewed, and the queue adds them to the table whenever it has a chance.

So, that’s scenario #1, when you have a process that needs to do something but doesn’t need to do it immediately, you can split that part of the process out and use a queue to have it happen as soon as possible, allowing your main process to complete quicker. Typically this will be logging that something has happened, where you need to return the result of the thing immediately but the logging table might be blocked by either a big read/write, or lots of small reads/writes.

Scenario #2 is when you are running something like the ETL systems I’ve seen and heard getting built more and more. These systems work off queues in any case, typically built as tables, where you have an overarching system that dynamically decides what happens and in what order.

As an example, you start by importing 3 files, so you add those imports to the queue and they all start executing. Once the first one is finished, you add a validation for that to the queue and that starts processing. File 2 finishes importing but that’s flagged as not needing validation so you check what data loads you can run with just that file and add them to the queue. File 3 takes a long time to load, and File 1 finishes validation in that time but there’s nothing that can be populated from File 1 until File 3 is finished loading so nothing gets added to the queue at that point.

If you have a system that wants to work like that, where you are executing a set of modules in a dynamic order, then Service Broker may be useful as a way of managing that order.

I was going to post a bit more here about the different components that make up Service Broker, but I’ve gone on for longer than I expected just on this, so I think I’ll leave that for the next post.

May 19, 2020

Lessons

I’m trying something new with my blog post this week. I want to start doing different series on subjects I think people would benefit from a deep dive into. I’m starting with Service Broker, a topic I knew nothing about until a few months ago. Other possible topics include:

Different components of SQL Server
SSAS
MDS
DQS
SQLCLR

Some of these I know quite a bit about, others I don’t have much of a clue at the moment and that’s part of the reason I want to write about them, to force myself to learn. These posts will be titled something like [topic] 101 Lesson 1: [lesson subject]. Depending on how this goes I might add some 201 or 301 series in the future, but for now the idea is to assume no knowledge from the reader and try to get to a point where they can not only (in this case) code a simple service broker solution, but also understand what they are doing.

Anyway, the first service broker post goes up today. Enjoy.

May 12, 2020

TSQL Tuesday #126: Folding@Home

This month’s T-SQL Tuesday comes from Glenn Berry, and is all about what you are doing to help during the ongoing Coronavirus crisis.

He links to Folding@Home, which allows you to use your personal machine(s) to do complex protein folding calculations to help with medical research. I don’t pretend to understand all of what they’re doing, but essentially it’s taking a problem, breaking it down into lots of mini problems, and sending these mini problems out to individual computers to find solutions. I’ve signed up to that now, and joined the Tech Nottingham team, so that’s one thing I’m doing. Plus I’m trying to get my old desktop working again, so it can run on there as well as my newer machine, and I can fold double the things.

I’m finding this a hard blog to write though, because I don’t feel like I’m doing that much apart from that. The main thing I’ve done is to set up a Microsoft Teams organisation for my local creative writing group, Nottingham Writers Collective. We’ve run a few meetups on there and it’s been a fun way to keep in touch with some people who are very important to me. We’ve also set up a number of things in the team to help people share work, and set challenges for themselves, but so far nobody has really used them. They’re there though, if anyone does feel a need, and I hope as the lockdown continues we will take advantage of them a bit more.

I think this is one of the little things that a lot of us are probably in a position to do. So much of our lives are suddenly being lived online, and it’s easy for us as techie people to forget how daunting a lot of this stuff is. So, make more of an effort to be patient with your family as they ask you for the 10th time how Skype works, or want to know why they can’t see everyone on a Zoom chat. Look out for things that can help keep people connected, and tell everyone about them.

I’ve discovered Tabletop Simulator, Humble Bundle’s offer on Asmodee games, and Board Game Arena’s free to play browser version of many tabletop games, and managed to organise a gaming session with some of my friends a few weekends ago. I think it helped us feel a bit less alone, and I will try to organise similar things in the future.

This was an odd blog to write, I’ve not really written anything on here until now that wasn’t very SQL/tech focussed. I hope anyone who reads this is keeping well and coping ok with everything.

May 5, 2020

AND and OR interactions

I’ve been working through a particularly nasty bug recently, and when I eventually found the cause it turned out to be a mistake in a WHERE clause including several ANDs and ORs. I thought it’d make an interesting topic to dive into for a quick blog post.

The basic issue looked something like this:

INSERT INTO dbo.TargetTable
    (
          TableGUID
        , Column1
        , Column2
    )
SELECT DISTINCT
      TableGUID
    , Column1
    , Column2
FROM dbo.SourceTable
WHERE SourceTable.StatusColumn = 'A'
    OR (SourceTable.StatusColumn = 'B' AND SourceTable.StatusDate IS NULL)
    AND SourceTable.TableGUID NOT IN 
        (SELECT TableGUID FROM dbo.TargetTable)

The problem was we wanted to apply the last AND predicate every time, but the interactions between the ANDs and the OR meant that wasn’t happening. To see exactly what I mean, here’s a couple of simplified versions of the code where I’ve used brackets to make it clearer what is happening:

SELECT 1 -- returns successfully
WHERE 1 = 1
    OR 2 = 2
    AND 2 = 1 -- we want it to not return because of this
    
SELECT 1 -- this is what is actually happening
WHERE 1 = 1
    OR (2 = 2 AND 2 = 1)

SELECT 1 -- this is what we should have done
WHERE (1 = 1 OR 2 = 2)
    AND 2 = 1

So, basically, AND takes precedence over OR, so the two predicates joined by the AND are evaluated together first and the result becomes one side of the OR. That means when the first predicate returns true it doesn’t matter what the rest of the predicates are, because they’re all on the other side of the OR. At this point we have a diagnosis, and the solution seems pretty clear: re-write the code with some brackets to tell the query engine what to do.

INSERT INTO dbo.TargetTable
    (
          TableGUID
        , Column1
        , Column2
    )
SELECT DISTINCT
      TableGUID
    , Column1
    , Column2
FROM dbo.SourceTable
WHERE (SourceTable.StatusColumn = 'A'
    OR (SourceTable.StatusColumn = 'B' AND SourceTable.StatusDate IS NULL))
    AND SourceTable.TableGUID NOT IN 
        (SELECT TableGUID FROM dbo.TargetTable)

That gives us a functionally correct solution, but to me there’s another issue. We have re-written the code to clarify things for the query engine, but I’d argue we haven’t made it particularly clear for the next developer who has to edit this code (this is all part of the same insane block of code I wrote about in my code noise post a couple of weeks ago), and that can lead to all kinds of issues further down the line.

I have a particular approach whenever I’m writing a set of predicates connected with both ANDs and ORs. I effectively layer the predicates, starting with a top layer of either ANDs or ORs, then moving to the second layer which will be the opposite. Each sub-layer is wrapped in brackets and indented, and I usually keep each predicate on a different line. For example, this is how I would lay out the code we started this post with:

INSERT INTO dbo.TargetTable
    (
          TableGUID
        , Column1
        , Column2
    )
SELECT DISTINCT
      TableGUID
    , Column1
    , Column2
FROM dbo.SourceTable
WHERE 1 = 1
    AND (SourceTable.StatusColumn = 'A' -- top layer of ANDs
         OR (SourceTable.StatusColumn = 'B' -- second layer of ORs
             AND SourceTable.StatusDate IS NULL)) -- third layer of ANDs
    AND SourceTable.TableGUID NOT IN
        (SELECT TableGUID FROM dbo.TargetTable)

This makes it quite clear that the last AND needs to be evaluated separately to the rest of the WHERE clause.

Now you might be wondering where the 1 = 1 came from. That’s something I like to include in all of my code to make it easier to debug by allowing you to comment out the first predicate easily. Without that, if you want to comment out the first predicate and keep the second you end up having to do something awkward like this:

FROM dbo.SourceTable
WHERE --(SourceTable.StatusColumn = 'A'
       --OR (SourceTable.StatusColumn = 'B'
           --AND SourceTable.StatusDate IS NULL))
    --AND 
    SourceTable.TableGUID NOT IN
        (SELECT TableGUID FROM dbo.TargetTable)

But with the 1 = 1 you can do this instead:

FROM dbo.SourceTable
WHERE 1 = 1
    --AND (SourceTable.StatusColumn = 'A'
         --OR (SourceTable.StatusColumn = 'B'
             --AND SourceTable.StatusDate IS NULL))
    AND SourceTable.TableGUID NOT IN (SELECT TableGUID FROM dbo.TargetTable)

Which saves you from messing about with the last AND predicate at all.

Now, if your query is largely ORs so you want that to be your top layer, you can’t do quite the same thing, because starting with 1 = 1 followed by an OR means the WHERE always comes back as TRUE. So, what you use instead is 1 = 2, which achieves the same thing as far as ease of debugging is concerned:

FROM dbo.SourceTable
WHERE 1 = 2
    OR (SourceTable.StatusColumn = 'A'
        AND SourceTable.TableGUID NOT IN
            (SELECT TableGUID FROM dbo.TargetTable))
    OR (SourceTable.StatusColumn = 'B'
        AND SourceTable.StatusDate IS NULL
        AND SourceTable.TableGUID NOT IN
            (SELECT TableGUID FROM dbo.TargetTable))

This isn’t the neatest way of writing the code, because we have to repeat the NOT IN across the different OR predicates, but it does the same thing as the rest of the code we’ve been looking at. I suppose for consistency, I should include the 1 = 1 or 1 = 2 in the bracketed predicates as well, and that would help when it comes to debugging, but it would also clutter the code more than a little as we can see:

FROM dbo.SourceTable
WHERE 1 = 2
    OR (1 = 1
        AND SourceTable.StatusColumn = 'A'
        AND SourceTable.TableGUID NOT IN
            (SELECT TableGUID FROM dbo.TargetTable))
    OR (1 = 1
        AND SourceTable.StatusColumn = 'B'
        AND SourceTable.StatusDate IS NULL
        AND SourceTable.TableGUID NOT IN
            (SELECT TableGUID FROM dbo.TargetTable))

Having said that, I do quite like the way that looks. In particular, I like the way each new AND block is clearly defined with the 1 = 1. These kinds of standards are something to discuss with your team, if possible, so you can work together to standardise the way you write code.
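As a quick check of why 1 = 1 can’t be the placeholder for a top layer of ORs, this sketch counts the rows each version returns. The @Statuses table variable is made up purely for illustration:

```sql
-- Hypothetical sample data
DECLARE @Statuses AS TABLE
    (
          StatusColumn CHAR(1) NOT NULL
    );

INSERT INTO @Statuses
    (
          StatusColumn
    )
VALUES
      ('A')
    , ('B')
    , ('C');

-- 1 = 1 is always true, so the OR makes the whole WHERE clause true
-- RowsReturned = 3
SELECT
      COUNT(*) AS RowsReturned
FROM @Statuses AS Sts
WHERE 1 = 1
    OR Sts.StatusColumn = 'A';

-- 1 = 2 is always false, so only the real predicates decide
-- RowsReturned = 2 (the 'A' and 'B' rows)
SELECT
      COUNT(*) AS RowsReturned
FROM @Statuses AS Sts
WHERE 1 = 2
    OR Sts.StatusColumn = 'A'
    OR Sts.StatusColumn = 'B';
```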

Finally, here’s a made-up example with several layers to show how this can look with very complex statements. The numeric predicates (1 = 1 and 1 = 2) are there to allow the commenting out of other predicates, and everything else is there as a stand-in for actual query logic:

SELECT
      1
WHERE 1 = 2 -- false or false or true or false = true
    OR (1 = 1        -- true and true and false = false
        AND 'A' = 'A'  -- true
        AND 'B' = 'B'  -- true
        AND (1 = 2     -- false or false = false
            OR 'AB' = 'AC' -- false
            OR 'AB' IN     -- false
                ('AD', 'A', 'AA', 'ABA')))
    OR (1 = 1        -- true and true and false = false
        AND (1 = 2     -- true or false = true
            OR 'X' = 'X'  -- true
            OR 'Z' = 'A') -- false
        AND (1 = 2     -- true or false = true
            OR 'G' = 'G'  -- true
            OR 'F' = 'W') -- false
        AND (1 = 2     -- false or false = false
            OR 'F' = 'G'   -- false
            OR 'H' = 'I')) -- false
    OR (1 = 1        -- true and true = true
        AND (1 = 2     -- false or true = true
            OR (1 = 1     -- false and true = false
                AND 'E' = 'F'  -- false
                AND 'G' = 'G') -- true
            OR 'A' = 'A') -- true
        AND 'B' = 'B') -- true
    OR NOT 'A' = 'A' -- not true = false

So, in conclusion, if you include an OR in your code, be aware that AND is evaluated before OR, so the predicates on either side of the OR are each treated as if they were bracketed together. Ideally, write your code with explicit brackets and style it in a way that makes it clear what is going on.

April 28, 2020

Values blocks

Values blocks are a really useful little bit of code in SQL Server. Basically, they are a block of defined values that you can use pretty much like any other data set.

The main place you may have encountered them before is as a source for an insert. Often when people need to add a set of records to a table I see something like this:

INSERT INTO dbo.Table1
    (
          Column1
        , Column2
    )
SELECT 1, 'some value'
UNION
SELECT 2, 'some other value';

Or even worse:

INSERT INTO dbo.Table1 (Column1, Column2)
SELECT 1, 'some value';
INSERT INTO dbo.Table1 (Column1, Column2)
SELECT 2, 'some other value';

The first attempt is OK, but it’s bulky and unnecessary, and it needs you to keep typing UNION over and over. The second attempt is actively inefficient, as each row is inserted individually instead of inserting everything as a set.

The cleaner way, which uses a VALUES block, is:

INSERT INTO dbo.Table1
    (
          Column1
        , Column2
    )
VALUES
      (1, 'some value')
    , (2, 'some other value');

This saves you from typing out the UNION all the time, and in my opinion looks neater on the page and makes your block of values easier to read.
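One behavioural difference to be aware of when swapping between the two: UNION removes duplicate rows, while a VALUES block inserts every row you list. A minimal sketch, using a hypothetical table variable in place of dbo.Table1:

```sql
-- Hypothetical stand-in for dbo.Table1
DECLARE @Table1 AS TABLE
    (
          Column1 INT NOT NULL
        , Column2 VARCHAR(50) NOT NULL
    );

-- UNION de-duplicates, so two identical SELECTs insert one row
INSERT INTO @Table1
    (
          Column1
        , Column2
    )
SELECT 1, 'some value'
UNION
SELECT 1, 'some value';

SELECT
      COUNT(*) AS RowsFromUnion -- 1
FROM @Table1;

DELETE FROM @Table1;

-- A VALUES block keeps both rows
INSERT INTO @Table1
    (
          Column1
        , Column2
    )
VALUES
      (1, 'some value')
    , (1, 'some value');

SELECT
      COUNT(*) AS RowsFromValues -- 2
FROM @Table1;
```

If you are converting an existing UNION insert and relied on that de-duplication, check your values for repeats first.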

The basic rules for a values block are:

The data in each row is comma separated
Each row of data is wrapped in a set of brackets
The rows themselves are also separated by commas
Each row has the same number of values (2 in the example above)
Each value position has to hold data of the same type, or a type that can be implicitly converted to it, for every row (int, varchar in the example above)
NULLs are allowed for any value
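As a rough illustration of those rules, here is a small sketch against a hypothetical table variable. The commented-out row breaks the "same number of values" rule and would make the statement fail if it were included:

```sql
-- Hypothetical stand-in for dbo.Table1, with nullable columns
DECLARE @Table1 AS TABLE
    (
          Column1 INT NULL
        , Column2 VARCHAR(50) NULL
    );

INSERT INTO @Table1
    (
          Column1
        , Column2
    )
VALUES
      (1, 'some value')         -- int, varchar
    , (2, NULL)                 -- NULL in the second position
    , (NULL, 'no number here'); -- NULL in the first position
    --, (3)                     -- not allowed: only one value in this row

SELECT
      Tbl.Column1
    , Tbl.Column2
FROM @Table1 AS Tbl;
```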

This use case is useful to know about by itself, but I think the more powerful use of values blocks comes when you start using them in other queries. To do this you need to treat them as a subquery, like in this example:

SELECT
      val.Column2
    , tbl.Column4
FROM dbo.Table2 AS tbl
INNER JOIN
    (
        VALUES
              (1, 'some value')
            , (2, 'some other value')
    ) AS val(Column1, Column2)
    ON tbl.Column1 = val.Column1;

So, all you have to do is wrap the values block in brackets, alias it, and name the columns, and you can use it like any other subquery. The only thing that’s different here is the need to name those columns when you do the aliasing, but you do that simply by listing the names in brackets after the alias.

Interestingly, this is something you can do with regular subqueries as well. It’s probably not something you will use very often as you’re more likely to rename the column in the subquery, but it never hurts to know about these things.
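A minimal sketch of that, with made-up column names, might look like this. The subquery names its columns Column1 and Column2, but the outer query only sees the names listed after the alias:

```sql
SELECT
      sub.RenamedColumn1
    , sub.RenamedColumn2
FROM
    (
        SELECT
              1 AS Column1
            , 'some value' AS Column2
    ) AS sub(RenamedColumn1, RenamedColumn2);
```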

Finally, if I’m using a big values block in a SQL statement, it can be a bit unwieldy to have it in a subquery. It can dominate the rest of the statement, and if it’s long enough you won’t be able to see the whole statement on the screen, even though the individual values rarely add much to your understanding of the code. That’s why I will often put the values block in a common table expression at the start of the statement. That also allows you to reuse it if you need to refer to it more than once. Example below:

WITH val AS
(
    SELECT
          val.Column1
        , val.Column2
    FROM
        (
            VALUES
                  (1, 'some value')
                , (2, 'some other value')
        ) AS val(Column1, Column2)
)
SELECT
      val.Column2
    , tbl.Column4
FROM dbo.Table2 AS tbl
INNER JOIN val
    ON tbl.Column1 = val.Column1;
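To illustrate the reuse point, here is a sketch that refers to the same common table expression twice in one statement, pairing each value with the one that follows it. The v1 and v2 aliases and the output column names are just for illustration:

```sql
WITH val AS
(
    SELECT
          val.Column1
        , val.Column2
    FROM
        (
            VALUES
                  (1, 'some value')
                , (2, 'some other value')
        ) AS val(Column1, Column2)
)
SELECT
      v1.Column2 AS ThisValue
    , v2.Column2 AS NextValue -- NULL for the last row, which has no successor
FROM val AS v1
LEFT JOIN val AS v2
    ON v2.Column1 = v1.Column1 + 1;
```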

Another option here might be to put the values into a table variable or temp table, but those usually won’t be as efficient for the query to process, because the rows have to be written to the table before they can be read back. The only time I’d consider that is if there were several statements that wanted to access the same data set, and the data set was small.
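For that several-statements case, a temp table version might look something like the sketch below. The #val name and the two follow-up queries are placeholders for whatever your statements actually need to do:

```sql
CREATE TABLE #val
    (
          Column1 INT NOT NULL
        , Column2 VARCHAR(50) NOT NULL
    );

-- Load the values once
INSERT INTO #val
    (
          Column1
        , Column2
    )
VALUES
      (1, 'some value')
    , (2, 'some other value');

-- First statement using the data set
SELECT
      val.Column2
FROM #val AS val
WHERE 1 = 1
    AND val.Column1 = 1;

-- Second statement using the same data set
SELECT
      COUNT(*) AS ValueCount
FROM #val AS val;

DROP TABLE #val;
```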

So, that’s pretty much everything I know about values blocks. Hope that was useful.