Consider the following PostgreSQL query. There’s something wrong with it.
UPDATE users SET passwordHash = '...' WHERE email = ( SELECT email FROM resets WHERE resetSecret = '...' );
You can’t tell what’s wrong with this query unless I give you the table schema, of course. Let’s say the
users table has the columns
passwordHash, and some other fields. Let’s say the
resets table has two columns –
resetSecret. When a user asks to reset their password, we populate
resets table with a tuple with
id as their email’s ID and
resetSecret as some random string.
You happily run the above query and it goes horribly wrong and everybody’s
passwordHash got reset. Now that you have context, what’s wrong with the above query?
Answer: There’s no
resets table1. We should be using the
id column to select the row to update, not the
And do you want to know the worst part? Postgres (and basically every SQL flavour) will silently accept this query and you’ll never see a damned error. But what’s really happening?
Postgres knows that the
resets table does not have an
users table. Here’s what Postgres sees:
UPDATE users SET users.passwordHash = '...' WHERE users.email = ( SELECT users.email FROM resets WHERE resets.resetSecret = '...' );
Let’s say there’s a row in
resets that matches the
resets.resetSecret = '...' condition in the subquery. In the outer query, when going through each row in
users, the subquery will return that row’s
Why did this mistake happen? Human error and oversight. If we had qualified our query as we had originally intended, it would look like this:
UPDATE users SET users.passwordHash = '...' WHERE users.email IN ( SELECT resets.email FROM resets WHERE resets.resetSecret = '...' );
The key difference is in line 4. While we had intended to write
resets.email, omitting the table qualifier allowed Postgres to assume wrongly. The above query would properly return an error at execution because there is no
This is a real story that happened to me (thankfully, the impact wasn’t too massive). I know it’s easier to just drop the
table. prefix in the interest of brevity and clarity. I know human error isn’t completely eliminated even if you qualify your columns. But it gives us a much better chance of not committing such errors, and honestly, we could use any help we can get in the war against bugs.
If you answered correctly, would you still be able to do the same at 2:00am in the night when you’re dead tired? ↩