Postgres GUC as Session/Transaction Variables
Edit Feb 8 8:53: After playing around a bit more with the functions, I’ve added another caveat dealing with the function volatility.
Some time ago, I wrote about session variables in PostgreSQL. I’ve been using the solution for some time to address the problem of performing a more-or-less-automatic audit trail for certain important tables when using accounts defined by the system and not the database, and it’s been working pretty well so far. However, I’ve always been concerned about the idea of potentially creating a new table for every transaction, even if it’s temporary. The database in question is a very small low-throughput system just used internally, but being inefficient just cause nobody will notice doesn’t seem like a good enough excuse.
As Rails is our front-end and it uses connection pooling, there’s a possibility that two subsequent pageviews would use two different connections to the DB. While that doesn’t always generate a new table (the second connection could have already had the temporary table created), it does necessitate resetting the values in the table that I use for auditing (current_user and audit_notes) every time, just to be sure. I’m unsure where on the scale of efficiency it would fall to validate that an update must occur before actually doing it, but it seems, due to the fact that I must validate the table as a whole exists before trying to do anything, that this is the least of my problems.
However, I recently had the magic of Tom Lane shine down from on high in this recent thread on the pgsql-sql mailing list, wherein he made mention of a feature of which I had been previously unaware: custom GUC variables. I’m not actually sure what the GUC stands for, even. However, what it provides is a namespace into which you can throw arbitrary variables. It’s designed for modules that are loaded at run time and need configuration (like plperl.use_strict).
If you clicked on the link to the thread, you’ll realize that Tom suggested this as a solution to the very problem I had built my “variable” temporary table method to address. The requested solution involved transaction-level isolation, but as ActiveRecord doesn’t seem to like using them unless you beat it thoroughly about the head, I’m more concerned with connection-level isolation. Fortunately, it looks like this does both! Instead of my big long complicated functions, you can simply include “custom_variable_classes = ‘audit’” in postgresql.conf, reload, and in any connection do “SET audit.”current_user” = ‘whoever’;” There are a couple of small caveats worth noting, however:
- This is not the usage for which the GUC system was designed; as such, it is somewhat a Bad Thing to do, as it can potentially cause screwing around with modules that are loaded. This is particularly notable if you have multiple databases in your server, as it’s a global setting. Each one of your users will have that variable namespace. As far as I know, that’s not a security concern, but rather a nuisance concern, if they’ve never asked for such a thing. However, I think it’d be charitable to describe the situation where it’s problematic as an extreme edge case. And as far as the loaded modules, it would seem fairly trivial, unless you have millions being loaded and unloaded all the time (I don’t even know if you can unload without a server restart), that assigning a unique name to your variable class should not be a problem.
- SET statements in PostgreSQL allow for setting string literals with optional quoting. This may not be obvious if you’ve never used the SET command (which I tend only to use for search_path), but it means you can’t set the value using variable substitution, i.e. in a function call. You’ll have to compile the query at runtime of the function using EXECUTE and that can be unpleasant for everybody if you’re not careful about it. That is, use pg_catalog.quote_literal() to make sure your variables are safe, because any characters PostgreSQL can’t figure are part of the string will cause errors. You shouldn’t bother using pg_catalog.quote_nullable(), for why, see the last point.
- Certain values of variables must be ident-quoted. So far I’ve found that to be true of “user” and “current_user” at least (so, set audit.”user” instead of audit.user), and I presume there are others. Someone smarter than me may have an answer for this.
- If you want to access the value of a variable via a function, the function must be declared as VOLATILE. IMMUTABLE is clearly out because it doesn’t depend on inputs, and for some reason that I’m unaware of STABLE is also out. This is probably a function of the SHOW command rather than of custom GUC variables in particular. Speaking of SHOW…
- Retrieving the value of the variable, say in a PL/pgSQL trigger, can be done via “SHOW audit.”current_user” INTO some_variable;” – I’m not sure if there are more efficient ways but that’s the one I’ve found that works. At least, most of the time…
- Retrieving the value of a configuration variable that has not been set yet causes an exception to be raised. This is not an insurmountable problem, as you can simply trap the exception, but as the documentation warns, an exception-trapping block in PL/pgSQL is far more expensive than a regular block, so it shouldn’t be done if you can avoid it, which would be easy except…
- SET statements do not allow you to assign NULL values to configuration variables. This is problematic if, like me, you want to allow someone to optionally insert some notes to go along with any auditing for a particular chunk of work (“I just changed the received time on this log because it turns out that was a 2 not a squiggled-out number”), but don’t want a pile of empty strings littering everywhere. You can handle it in one of two ways: have your trigger functions call NULLIF() and always assign the return value of the variable you want to NULL if it’s set to ”, or just trap any exceptions from unused variables and return NULL. While I think NULLIF() is probably the cheaper option (without any benchmarks backing this gut feeling up), the trapping exceptions method is probably the Right Thing to do.
So, for all those things to be kept aware of, the end result can be just as simple as:
CREATE FUNCTION audit_user(OUT TEXT) LANGUAGE PLPGSQL AS $$BEGIN SHOW audit."current_user" INTO $1; EXCEPTION WHEN OTHERS THEN $1 := NULL; END;$$; CREATE FUNCTION audit_notes(OUT TEXT) LANGUAGE PLPGSQL AS $$BEGIN SHOW audit.notes INTO $1; EXCEPTION WHEN OTHERS THEN $1 := NULL; END;$$;
I trapped the exception OTHERS for two reasons: one, I assumed that OTHERS would be faster than comparing against a specific case; and two, I plain just don’t know what specific exception gets raised when this happens. I also did not create a function that would set the auditing variables, as I figured there would be little point to creating a function that would basically just be wrapping a SET call. It’s all clearly much shorter and more sane than the temporary table solution, not to mention it seems it’s likely to be a lot faster.
I think I’ve found myself a winner.