The Case for Stored Procedures

In some parts of the C# community, I’ve noticed a desire to avoid writing SQL code. Some advocates for Entity Framework have even cited the ability to avoid writing SQL as one of its benefits. I admit that I am no EF expert. If I had to choose an ORM, my list stops at Dapper, and I normally roll my own data access layer.

But I’m not here to try to dissuade anyone from using Entity Framework. What I’d like to do, though, is lay out some of the benefits of embracing SQL in your solutions, specifically Transact-SQL and the use of stored procedures in SQL Server.

First let me share my typical design for a new solution based on a relational database. I not only use a database-first approach, but I build the database using scripts that are part of a repository. Everything after the CREATE DATABASE statement itself is in the repo. Each script is essentially idempotent. In other words, the script can be executed repeatedly without adverse effect. For tables, this means checking for existence before creating them. For changes to existing tables, additional logic is used to (for example) check for the existence of a new column before adding it. For other objects, this is mostly just using the “CREATE OR ALTER” syntax. I also take steps to maintain backward compatibility, but that’s for another post.
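
To make that concrete, here’s a minimal sketch of the pattern. The table and column names are hypothetical, not taken from any real project:

IF NOT EXISTS (SELECT 1 FROM sys.tables
               WHERE [name] = 'Widget' AND schema_id = SCHEMA_ID('dbo'))
BEGIN
    CREATE TABLE dbo.Widget
    (
        WidgetId INT IDENTITY(1, 1) NOT NULL CONSTRAINT PK_Widget PRIMARY KEY CLUSTERED,
        WidgetName NVARCHAR(100) NOT NULL
    );
END;

-- A later change to the same table: add the column only if it isn't already there.
IF NOT EXISTS (SELECT 1 FROM sys.columns
               WHERE object_id = OBJECT_ID('dbo.Widget') AND [name] = 'CreatedUtc')
BEGIN
    ALTER TABLE dbo.Widget
        ADD CreatedUtc DATETIME2(7) NOT NULL
            CONSTRAINT DF_Widget_CreatedUtc DEFAULT (SYSUTCDATETIME());
END;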

By the way, kudos to the attendee at my PASS Summit 2022 session who called me out for using the word idempotent. I’ve been throwing that word out for years, knowing that it’s not quite what I mean, but close enough. So, no, the scripts aren’t really idempotent, but I haven’t found a better word yet for, “I can F5 them to death.”

Anyway, I also include PowerShell scripts in the repo that execute the T-SQL scripts against a database, and I incorporate these into a CI/CD pipeline. I won’t explain the whole pipeline, but one crucial part is that a merge of an approved PR into the dev branch triggers a deployment to a reference development environment. The scripts are all executed against the database(s) in this environment, and then the application code is deployed to the servers.

The point I’m trying to make is that the database code is treated as part of the solution. It is in the repos just like the C# code. It is deployed just like the C# code. I even include unit testing, just like the C# code.

I also keep all T-SQL in the database. There is no T-SQL code inside the C# code. Stored procedures are used for all functionality, even basic CRUD functions. In fact, probably 80–90% of the stored procedures in my systems are essentially CRUD functions. The only operation that doesn’t go through stored procedures is bulk insertion, and I use SqlBulkCopy for that.
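
For what it’s worth, a typical CRUD procedure in this style is nothing fancy. A minimal sketch, again with hypothetical names rather than code from the example below:

CREATE OR ALTER PROCEDURE dbo.Widget_Update
    @WidgetId INT,
    @WidgetName NVARCHAR(100)
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE dbo.Widget
    SET WidgetName = @WidgetName
    WHERE WidgetId = @WidgetId;
END;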

Here is an example of my approach, one that I put together for that Summit session: https://dev.azure.com/downshiftdata/_git/SearchOverflow

Why Do It This Way?

  1. Changes to the database are tracked in the repo just like changes to application code. So questions like who changed what, when they changed it, who reviewed it, and when it got deployed are no more difficult to answer for database changes than for application code changes.
  2. The stored procedures act as an interface layer, keeping database code in the database and application code in the application. Radical changes to the database are possible without any effect on application code and without incurring any downtime. Performance issues in the database can be addressed quickly, through hotfixes, without redeploying the application. And the database design can take advantage of this interface layer. For example, instead of relying on AFTER triggers, audit logging can be done through OUTPUT clauses inside the stored procedures (there’s a sketch of this after the list). This works because you can reasonably guarantee that all writes come in through the procedures (and you can even back up that assumption with appropriate authorization configuration).
  3. Stored procedures are compiled, with their plans in the procedure plan cache. Yes, this happens with ad hoc queries as well (and I’m an advocate for the “Optimize for Ad Hoc Workloads” option in SQL Server). But there are issues like parameter sniffing and cache bloat that need to be handled. Routing execution through stored procedures makes it easier to manage the cache and understand what is happening in the database.
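
To illustrate the OUTPUT-clause audit logging mentioned in point 2, here’s a rough sketch. The tables and columns are hypothetical; the point is that the audit row is written by the same statement that performs the update:

CREATE OR ALTER PROCEDURE dbo.Widget_UpdateName
    @WidgetId INT,
    @WidgetName NVARCHAR(100),
    @ModifiedBy NVARCHAR(100)
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE w
    SET WidgetName = @WidgetName
    OUTPUT deleted.WidgetId,
           deleted.WidgetName,   -- old value
           inserted.WidgetName,  -- new value
           @ModifiedBy,
           SYSUTCDATETIME()
    INTO dbo.WidgetAudit (WidgetId, OldWidgetName, NewWidgetName, ModifiedBy, ModifiedUtc)
    FROM dbo.Widget AS w
    WHERE w.WidgetId = @WidgetId;
END;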

The Other Side

I’ve seen quite a few arguments against approaches like this. Here are some of the more popular ones:

  1. If I use Entity Framework and take a code-first approach, then it does all the SQL for me and I don’t have to worry about it.

Once again, I’m not an EF expert, and I don’t know the full scope of what it can and can’t do. But I’ve heard variations of this one quite a lot. And the argument falls apart like any other code-generation argument: it’ll work fine 90% of the time, and then you’ll be in trouble when you hit the other 10%. Now, in some situations, this may work out, and you’ll always stay inside that 90%. But to me it’s a calculated risk, and one I don’t have to take because it’s not the approach I use.

Where this one concerns me is in situations in which a DBA (or someone in a similar role) was essentially ignored. The DBA advocated for something less wizard-y, and had good reasons for it. But they’re often outnumbered and – frankly – not well respected, and decisions are made despite their arguments. And then the situation hits that 10% and the DBA’s list of options is limited. They often know the change they should make, but they can’t make it because that code is being generated in the application layer. Or the change they want to make will break application layer code.

  2. That’s the old way of doing things.

This makes me cringe every time. The argument goes something like: yeah, we all wrote stored procedures twenty years ago, but that’s not how you do things today. It goes hand-in-hand with the idea that RDBMSs themselves are antiques and shouldn’t be used. I’m a firm believer in the KISS principle. And despite being ancient in the tech world, the RDBMS is a straightforward solution to a lot of problems. Would a big data store be better for a certain situation? Of course. But will your app ever reach a threshold that demands the advantages of that big data store? If not, then why add to the complexity of your solution? If your tech stack already includes an RDBMS, then you have one product that can reasonably support a host of requirements.

  3. The database is the database and the application code is the application code and the two shouldn’t be mixed.

In my experience, those who make this case have had to suffer through thousand-line stored procedures that tried to do everything in the database. That is indeed a nightmare situation, and one that I’ve seen all too often.

But their response is to swing fully in the other direction. Don’t let any code reside in the database. But the truth is that there are scenarios in which it makes sense to go either direction. If you need to join several large tables, have good covering indexes, need a timely response, are faced with frequent cache invalidation, and only need a few rows in each call, it makes a mountain of sense to have that functionality in a stored procedure. Yes, that could be dynamically generated in the application layer. But there is definitely a non-zero chance of getting a bad execution plan in that scenario. And as I said before, your options then are much more limited.

Conclusion

Stored procedures form an interface layer between the database and the application code. This layer of abstraction is incredibly beneficial to application security, maintainability, and scalability. Not using it is a design decision that should be considered very carefully.

When I’m in conversations with others in the database community, or with coworkers who share some of my experience, I never even have to make these arguments. But now that I’m back in C# land more than I used to be, I’m shocked at how often my opinions are at odds with others in this community. So, I don’t know if I’ve put any of those arguments to rest – probably not – but maybe the anti-stored-procedure crowd is at least thinking a bit more about the situation.

Back

Before COVID hit the United States in 2020…

  • I was a software engineer at an Indianapolis-based SaaS start-up in the human resources sector.
  • I was a soccer coach.
  • I had spoken at 22 different PASS-affiliated events over the previous four years.
  • I was a Program Manager for PASS.
  • PASS existed.

And now…

  • I’m still a software engineer, but for a different start-up, one that deals with AI in the defense industry. It’s based out of Austin.
  • After fourteen years, I’ve put coaching on an indefinite hiatus.
  • I haven’t spoken at any event.
  • PASS, as it was, is no more.

But PASS is back! SQL Saturdays are coming back! Redgate seems to be doing an admirable job of picking up the pieces, and carrying the torch forward. And, thanks to them, the Summit is back next week! And I’m speaking at it!

It’s a session entitled “Unit Testing T-SQL”. Arguably, I take a bit of a wandering path through DevOps to get there, but – in the end – I do indeed bring the discussion around to a way to perform unit testing in Transact-SQL.

What happens after this? I honestly don’t know. One reason I put aside coaching was to focus on my work more. I want to take that part of my brain that is focused on teaching and supporting others and move it in from the soccer field. So now it’s time to figure out what to do with it.

Also, one of these days, I finally need to actually go to Austin. Everyone tells me it’s nice there.

Interviewing for Software Engineering

I do a lot of interviewing as part of my job. The most common position I’m trying to fill is essentially my own, a software engineer. Specifically, I’m a back-end specialist — someone who develops APIs, works with databases, etc. Interviewing for a role like mine is more of an art than a science. Those that treat it like a science are often focused on the technical aspects of the role. They give candidates coding exercises, and treat the interview like a pop quiz.

But you won’t get the best candidates by putting them through what amounts to a technical certification. What you really need are people who will work well with your team, who will add something of value to it. So, to that end, here are three of the things I look for in an interview.

1. Samplers and Deep-Divers

Developers generally come in two flavors, what I’ll call samplers and deep-divers. When I first see a résumé, I’ll look at how varied their experience is. A sampler generally has a very busy résumé, with references to a wide array of technology. Deep-divers are more focused on something specific.

If you’re a sampler, then I have two concerns. One is that I think this flavor is more of a "flight risk". If we invest in you, will you still be here in six months? Or will you get bored and move on to the next opportunity? I don’t want to invest my time and resources in someone who won’t stick around long enough for that investment to pay off.

And samplers also tend to be early adopters, who favor the most functionally appropriate solution over one that is more maintainable. This can often result in solutions that are more bloated, with more dependencies and complications. Especially if I think you’re a flight risk, do I really want you to saddle me with additional tech debt before you go?

If you’re a deep-diver, I first want to know if your chosen area of expertise is what I’m looking for. I generally work in C# shops, for instance, and wouldn’t have much use for a Java deep-diver. And then I want to see if you’re actually honing your craft. Years of experience doesn’t necessarily mean anything. Personally, I learned more about T-SQL in my first six months at one position than I had accumulated in the previous ten years.

I’m sure I favor deep-divers a bit, because I am one. But there is a place for samplers as well. This is less about preferring one or the other, and more about identifying what type of contributor you will be and how you will fit into our team.

2. Humility

I like this word, because it often provokes a reaction that gives away how humble a person really is. While modesty, selflessness, or even kindness might be better words, this one tends to elicit a more telling response. Those with little humility tend to view it as a shortcoming. But I think humility is a coworker’s (and especially a manager’s) greatest asset.

So I will throw the word out there and see how you react. And then I’ll also look for other clues about how much your pride may get in the way of your productivity. Will you be able to admit when you’re wrong? Are you open to change? Will you approach your work as part of a team?

Or are you the type of developer who would prefer to go hide somewhere and get your work done on your own and only come up for air when it’s time to submit your 1,000-change PR? Because, if that’s the case, then I frankly don’t want you on my team.

Another way I smoke this out is to ask about pair programming experience. Most people don’t have much, but I’m not asking to gauge the amount of experience you have. I’m looking for your reaction to the concept. If you’ve participated in it, then what did you think of it? If you haven’t, then how open are you to it? What I’m really answering in my own mind when I ask about pair programming is whether or not you’re going to be collaborative.

I’ve seen a lot of fragile egos over the years in my line of work. I’m sure I’ve even been there myself. It comes with the territory. We don’t want others to tell us that our code is ugly. But it IS ugly. Look at what you wrote six months ago and tell me there isn’t something about it you don’t like. So I’m looking for that contributor who knows that he or she isn’t perfect, that skills evolve, and that they evolve best when we can be honest with each other. And that honesty requires a certain degree of humility.

3. What are you asking me?

Whenever I go into an interview as a candidate, I always have a list of questions of my own. They’re generally in three categories.

First, I want to know what your tech stack is like, what kind of problems you’re having, how you’re solving them, etc. Second, I want to know what your team culture is like, how under water you are (e.g. working nights and weekends), your impression of your leadership, etc. And third, I want to gauge how healthy your business is.

In other words, I’m interviewing the employer as much as they’re interviewing me. So when I’m on the other side of the table, I’m expecting the interviewee to be doing the same thing. And if they’re not, then why not?

So, there’s definitely more to it than this, but these are three things I keep in mind when I’m interviewing candidates. What do you look for?

What’s in a Number?

In one of our matches last season, my girls scored 5 goals.

Now I’ll ask you a question. What’s wrong with that statistic?

It’s useless!

But why? Because, on its own, you have nothing against which you can compare it. The obvious missing ingredient is the number of goals the other team scored. With that additional piece of data, so much more is gained. What if the other team scored 0? Or 4? Or how about 12? Each of those numbers, paired with our score, tells a story. But one score alone does not.

If this is so clear in sports, though, why do we repeatedly overlook it in the business world? I was on a discovery call today with a customer who shared a metric they track. It’s a raw number. Not a percentage or anything else that can be compared – just a raw number. And then the customer went on to dismiss the metric as being largely ignored. But it was clear from the call that he hadn’t given any thought to why it was ignored, only that it was.

In his defense – and this customer shall remain nameless, anyway – I think the raw number metric was someone else’s idea and he was only relaying information in this case. But I was still concerned by the fact that it wasn’t immediately apparent to him why it’s ignored.

By the way, what I’m sharing here is not original. My own attention was first drawn to it by Rob Collie (T|L). He presented a session at IndyPASS a couple of years ago on aspects of data, and included this wisdom. When analyzing data, look for key words in the name; words like “by”, “of”, “per”, and “versus”. “Actual Widgets Produced” is just a number. “Actual Widgets Sold VERSUS Forecasted Widgets Sold” is a valuable metric.

The Dow Jones closed at 29,395.33 on Friday. So?!?! That was down 0.12% from the close on Thursday. Ok, now we’re getting somewhere. Give me Friday’s raw number and it tells me nothing. Compare it to Thursday’s and now I have something to work with. I can surmise that Friday was kind of a “meh” day for the stock market. So if my own portfolio took a hard turn one way or another, it might be something worth checking into. But if all I had was Friday’s close, I wouldn’t know that.

Anyway, the reason I bring it up now is because it is so rarely recognized. And it’s the simplest of concepts that can turn useless data into actionable metrics. Whenever you are analyzing anything, ask yourself, “What am I comparing this against?” If the answer is “nothing” then you need to adjust your metric.

I hope this helps. Oh, and that other team scored 2 goals. So we won the Shelby County Derby this year. Go Tigers!

VSCode Extensions in SQL Ops Studio

The list of extensions for SQL Ops Studio is still pretty small. But since it’s a fork of VSCode, it’s possible that any given VSCode extension just might work fine in Ops Studio. If that’s the case, here’s what you can do.

  1. Go to the VSCode Marketplace and find the extension you want. In my case, I’m adding the tsqllint extension that I mentioned in a previous post.
  2. Under the Resources heading on the extension’s Marketplace page, you’ll see a link to “Download Extension”. This is a .vsix file that you’ll want to save to your local system.
  3. In Ops Studio, under the File menu, click “Install Extension from VSIX Package”. Pick the file you downloaded. Once it’s installed, Ops Studio will prompt you to Reload for the new extension to take effect.

And that’s all there is to it! Once again, I’m indebted to ck (twitter|blog) for finding this.

If you try this with an extension and run into a problem, I’d like to hear what it was. I’m curious about which extensions don’t carry over well, and why they have a problem.

T-SQL Tuesday #106

This is a response to T-SQL Tuesday #106, Trigger Headaches or Happiness, by Steve Jones.

I can only recall one time in the past several years (at least a decade) that I’ve found triggers to be useful. It involves data migration.

The problem: You have a massive, high-activity table. Let’s call this Table_A. You need to make significant changes to it. For example, the clustered index needs to change. How do you accomplish this?

The solution: Create a script that does the following:

  1. Create a second table, which we’ll call Table_B. This one will eventually become the new Table_A, so design it with your changes in mind.
  2. Create a third table, which we’ll call Table_C. This one is like Table_A as it is now, except that it includes an additional identity column, and is clustered on that column. Assuming there’s an existing clustered index on Table_A, recreate that as a non-clustered index on Table_C. Depending on how Table_A is updated, you may need additional columns to track what updates occur.
  3. Create a trigger on Table_A. This trigger duplicates all changes in Table_A to Table_C (see the sketch after this list).
  4. Looping via a suitable batch size for your environment, write all rows from Table_A to Table_B.
  5. Looping again, write all rows from Table_C to Table_B (taking into account the appropriate insert/update/delete logic for your situation). Note where you stopped with Table_C, the “high water mark” for that identity column.
  6. Call sp_rename to change Table_A to Table_D, then again to change Table_B to Table_A.
  7. From the high water mark, write from Table_C to the newly-renamed Table_A.
  8. My favorite part: Drop Table_C, Table_D, and the trigger.
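
For illustration, here’s roughly what the trigger in Step 3 might look like. The column names are hypothetical, and the IsDelete flag is an example of the “additional columns to track what updates occur” mentioned in Step 2:

CREATE TRIGGER dbo.tr_TableA_MirrorToTableC
ON dbo.Table_A
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Inserts and updates: capture the new values.
    INSERT INTO dbo.Table_C (TableAId, SomeColumn, IsDelete)
    SELECT i.TableAId, i.SomeColumn, 0
    FROM inserted AS i;

    -- Deletes: capture the removed rows, flagged accordingly.
    INSERT INTO dbo.Table_C (TableAId, SomeColumn, IsDelete)
    SELECT d.TableAId, d.SomeColumn, 1
    FROM deleted AS d
    WHERE NOT EXISTS (SELECT 1 FROM inserted AS i WHERE i.TableAId = d.TableAId);
END;

Because Table_C is clustered on its identity column, every change lands as a new row in the order it occurred, which is what Steps 5 and 7 rely on.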

There are caveats to this method, of course, but they have been acceptable for my situations. They are:

  • The table is unavailable (non-existent, really) between the two renames in Step 6. This is an extremely brief window, but it does need to occur. Also, in order to apply a trigger, the table needs to be briefly locked, which may present a problem.
  • Step 7 is present so that changes between Steps 5 and 6 are carried over to the new table. However, these can occur after the new table is active following Step 6, meaning that the following scenario is possible:
    • A row is updated in Table_A, and the change is carried over to Table_C.
    • The renames occur.
    • The same row is updated in the new Table_A.
    • The second change is overwritten with the first change.

If the table you wish to migrate has a considerable number of updates and deletes, then this solution may present a data integrity problem. However, for insert-heavy tables, it has proven to work very well. Also, a second trigger on the new table and some additional logic could circumvent the second issue I described, depending on your situation.

So there you go – a use for triggers.

Godspeed!

Running PowerShell files in SQL Operations Studio

As I’ve used SQL Operations Studio more and more, I’ve also been finally using PowerShell in more situations. Given that I like the editor and that there’s a built-in terminal, I’ve been running those in my Ops Studio instance. But for a while I didn’t have a slick way of running an entire PowerShell file in the terminal. Usually, I’d just Ctrl+A/Ctrl+C/Ctrl+V, which is a bit awkward.

But among all the other ways you can customize Ops Studio, you have a lot of control over the key mappings. One way to edit these mappings is to pull up the Command Palette (Ctrl+Shift+P) and start typing “key”, and you’ll see “Preferences: Open Keyboard Shortcuts”. You’ll also see it mentions the Ctrl+K/Ctrl+S shortcut. This will bring you to the basic Keyboard Shortcuts window, where you’ll need to click “keybindings.json”. Either way, just like Ops Studio’s overall settings (and VSCode’s, for that matter), you get a JSON file you can now tweak. Actually, two of them, with the defaults on the left and your own settings on the right.

Here’s the mapping I’ve added:


{
    "key": "shift+f5",
    "command": "workbench.action.terminal.runActiveFile",
    "when": "editorTextFocus && editorLangId == powershell"
}

The effect of this is that pressing Shift+F5 while focus is on your PowerShell script file will cause Ops Studio to run that file in the terminal window. As an old SSMS and Visual Studio user, I find F5 natural, and I noticed that Shift+F5 wasn’t already taken.

Note that this runs the file, not necessarily what you have in the window. So you may want to precede this with Ctrl+S while you’re working.

My gratitude to ck (twitter|blog) for pointing me in the right direction here. When it comes to either Ops Studio or PowerShell these days, I just assume he’s smarter than me.

Tigers in the Rain

This is the first in what may become a series of anecdotal posts about what I’ve learned as a coach and how I think it translates to becoming a good manager…

After the match, my brother told me he knew how it was going to go when he saw them in warm-ups. The other team, he said, looked like they didn’t want to be there, shuffling around half-heartedly in the drizzle. In contrast, our girls were all smiles.

They’d been there before.

During the previous season, the TC Tigers hosted Shelbyville, a larger nearby school and a fierce rival. The rain came down in buckets that night. It was their last season on the grass, and the field held up surprisingly well. So did the girls, who beat the Golden Bears 3-1.

I’d never seen a match played in those conditions… until that night at the 2017 Sectional championship. At one point, they had to call the match on account of lightning, and the teams were sent off to the locker rooms to wait it out. But it eventually resumed, and so did the dogfight. Heritage Christian may have come into the game as heavy favorites, but our girls held their own and held the score 0-0 through regulation and two overtimes.

Then they won the shootout. And the hardware.

When he shared his observation with me afterward, I realized the team had achieved something else as well. Not only could they play well in the rain, they believed they could play well in the rain. And I also realized that it’s up to me to keep that belief alive.

We’re preparing them for a new season now. Only seven of the girls on the current roster were part of that Shelbyville game. That doesn’t matter, though. I’ve brought up those rain matches a couple of times in the off-season, when I’ve had a mix of those who were there and those who weren’t. I’d get them going, and then let the veterans reminisce about what it was like to play in those matches and to come out on top. And let the rookies soak it in.

The girls are going to win some and lose some. That’s the nature of the game. But if it rains during a match this season… well, I almost feel sorry for the other team. Are they that much better than every other team in the rain? Maybe not. But they believe they are that much better – even the girls who weren’t at those two matches.

A development team won’t ever be asked to write code outside in the rain. But they’ll have their own rain matches. A production outage, a performance issue, an angry customer… who knows what the situation will be? But there will inevitably be something – some bugaboo that they’ll be able to overcome once or twice. It’s up to the manager to spot those successes and capitalize on them. It’s up to the manager to nudge the team toward embracing them and allowing them to become part of its identity. It might take nothing more than a “Remember when…” at a team lunch. Whatever it takes, though, the important thing is to gently – imperceptibly – encourage them to believe in themselves, and in each other. Especially when the rain comes.

tsqllint

My PASS local group, IndyPASS, has its monthly meeting tonight. At my insistence, first-time presenter Nathan Boyd is showing off a SQL tool called tsqllint. Nathan, a coworker of mine at Salesforce, is the leading developer behind this GitHub project.

A lint (or linter), if you didn’t know, “analyzes source code to flag programming errors, bugs, stylistic errors, and suspicious constructs” (wikipedia). This one is designed specifically for T-SQL, is highly configurable, and includes a Visual Studio Code extension. What more could you want, right? If you want cleaner T-SQL code out of your developers, with less hassle on the part of your reviewers, it’s definitely worth your time.

If you’re in the area, keep in mind there’s a location change tonight. While IndyPASS usually meets at Virtusa, 1401 North Meridian (formerly Apparatus), this month’s meeting is at Moser Consulting in Castleton. As usual, doors open at 5:30pm, and we’ll turn it over to Nathan by about 6:15pm.

T-SQL Tuesday #104

My thanks to Bert Wagner and his chosen topic for T-SQL Tuesday, Code You Would Hate To Live Without. It was just enough of an excuse to dust off the cobwebs here and get back to posting.

Anyway, since half of my time is spent in C#, I thought I’d venture into that world for my response. I’ll share a few common extension methods that I include in most of my projects. Extension methods, as their name implies, extend the functionality of existing types. Here is a code snippet with the extensions I typically add:

namespace myproj.Extension
{
  public static class Extensions
  {
    public static bool In<T>(this T val, params T[] values) where T : struct
    {
      return ((System.Collections.Generic.IList<T>)values).Contains(val);
    }

    public static object ToDbNull(this object val)
    {
      return val ?? System.DBNull.Value;
    }

    public static object FromDbNull(this object val)
    {
      return val == System.DBNull.Value ? null : val;
    }
  }
}

The first method enables me to easily search enumerations for a given value. For example, if I’ve defined this enumeration:

namespace myRacingProject.Enum
{
  public enum Series
  {
    None = 0,
    Indycar = 1,
    IndyLights = 2,
    ProMazda = 3,
    Usf2000 = 4
  }
}

Then I could use the extension like this:

if (mySeries.In(Enum.Series.ProMazda, Enum.Series.Usf2000)) myChassis = "Tatuus";

As for the other two methods, well… When is a null not a null? When it’s a System.DBNull.Value, of course! SQL Server pros who have spent any time in the .NET Framework will recognize this awkwardness:

var p = new System.Data.SqlClient.SqlParameter("@myParam", System.Data.SqlDbType.Int);
p.Value = (object)myVar ?? System.DBNull.Value;

With the extension, the second line becomes:

p.Value = myVar.ToDbNull();

Similarly, when reading, this:

var myInt = (int?)(myDataRow[myIndex] == System.DBNull.Value ? null : myDataRow[myIndex]);

Becomes this:

var myInt = (int?)myDataRow[myIndex].FromDbNull();

They’re not earth-shattering improvements, but my real point is that extensions are an often-overlooked feature that can improve your quality of life as a developer. Anytime you find yourself writing the same bit of code over and over, especially if that bit is rather unsightly, you might consider making it an extension.

Want to know more? Here ya go: https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/extension-methods