Postgres Triggers, Why Do You Hate Me?
- February 9th, 2010
- By sycobuny
- Write comment
Actually, in spite of the melodramatic post title, figuring out triggers in PostgreSQL has been relatively painless. Of course, I had a pretty firm grasp of them in MySQL, and thus the major migration headache is realizing that the code handling the trigger has to be defined separate from the trigger itself.
The issue for me, however, is that (unlike Peter Eisentraut) I have a sordid love affair with schemas. For me, it’s not just about addressing the potential naming conflicts, but breaking down the tasks our database/frontend performs into more manageable blocks. Put it this way: I have over 100 tables, functions aplenty, and datatypes (as ENUM now has to be a datatype) to spare. While I recognize that it’s not impossible to manage, and there’s many systems that probably have a volume of data far exceeding mine, it still does wonders for my sanity if I can break those down so I only have to look through at most 25 tables at a time. They’re not randomly selected, they are geared towards similar ends and logically fit together, and I think schemas really helps on that front.
My solution to Peter’s problem of localized search paths is obtuse, but it works: I deliberately name the full path in anything that’s going inside a stored procedure. It’s a very tedious and defensive posture, but it has worked pretty well, at least up until now. This is where we get back to triggers: triggers in PostgreSQL are not named according to the same conventions as almost everything else in the DBMS. While most times, in the docs “simple_name” by itself is tantamount to “public”.”simple_name” (as the default search path is ‘”$user”, public’), this is not so for triggers. They are associated with the tables for which they are defined. While in hindsight, this makes sense, it took some time to figure this out (also, Michael Graziano pointed me in the right direction after I bitched about it). What would have been far more simple is if, anywhere in the documentation, they had simply specified this strange behavior. Even a hint, when executing “CREATE TRIGGER my_schema.do_something” other than a bland “syntax error at ‘.’” would be nice.
The other part of my problem with triggers doesn’t have anything to do with the trigger mechanism itself, but an issue I’ve encountered in the database as a whole. There is, as far as I can tell, no mechanism for creating connection-level or database-level custom variables. You can make variables obviously in any of the procedural languages. However, setting a variable, like MySQL’s “SET @@my_custom_variable := ‘some custom value’;” just doesn’t seem to exist. While this may not seem like a feature that would be particularly useful (after all, there are the aforementioned procedural languages), I’ve been finding it quite problematic.
When we were on MySQL, we used wxPerl and wxRuby as front-ends to connect directly to the database using database-level logins, and I was able to write auditing fairly handily: not only could I have information about the table on what data was modified and when, I could also log who modified the data. That doesn’t seem all that amazing, except that this was all database-side. The clients had to change exactly 0 code. You could even optionally issue a “@@COMMENTS := ‘my editing comments’;” and have it apply to all of the changes automatically until you unset it, providing an easy-access way to comment the audit log. I knew this was going to be problematic switching to rails, as ActiveRecord only ever connects as one user, and uses special models to manage logins. However, with MySQL, I could just ensure that the client issued a “@@CURRENT_USER := ‘whomever@address’;” before starting work. This, while obtuse, still has the database doing the majority of the legwork. I could easily reject any statements that occured before that variable was assigned, to make sure that changes made should always include complete auditing data.
PostgreSQL has denied my attempts at a simple solution to this, however. I’ve been searching, but the easiest solution seems to be to make a special table which holds “current connection” information and somehow tie the connection ID to the user ID that way. This seems pretty complicated, and I’m not sure how safe it is to assume that the connection ID will be unique for an arbitrary length of time (as I’m not sure what mechanisms exist for periodically purging the table of stale IDs). If there is any other way around this, I’d love to hear about it.