Archive for the ‘Technology’ Category

Testing and PHP

I’ve discovered that after a certain point, I become allergic to not having tests. It doesn’t manifest in hives or in anaphylactic shock, or any of the regular symptoms. It actually results in a brain freeze. There’s only so much code I can write before my brain refuses to go any further. In a way, maybe it’s more like a phobia. I’m simply afraid to write the next piece of code, knowing how tenuously linked to reality I am by that point, and the next code could be simply the beginning of the cascade of horrible ideas and messy disgusting code.

I wasn’t always this way. For quite a while I was completely happy with not documenting, not testing, and just writing line after line of incomprehensible code. Now, I can’t stand having poorly-documented code, and while I can still stomach writing a project for a while without tests, I hit a point eventually where I just can’t go any further without knowing what lies behind is thoroughly tested.

So it is with a project I’ve been working on that I hope will help me out at work: PGModel. I have been working on it for quite some time, though rather sporadically. It’s simply an ORM for PHP that is designed to be compatible all the way back to PHP 5.1 (I know it’s been end-of-lifed forever, but the project’s documentation goes into more detail on the “why”). And it does a few particular ORM-like things already, such as basic dataset loading and associations. These are all inspired, as the README states, by Sequel.

But while all of this is relatively simple at this point, it’s still already very complicated, and the more I write the more I worry that I’m missing critical bugs. So, I finally wrote myself a testing suite. It’s simple but I think it adheres to at least a subset of the Test Anything Protocol, and allows me to write a large number of tests fairly quickly and easily.

Unfortunately, there are some drawbacks. I could allow a function to be called à la call_user_func_array() to handle checking for exceptions, but it’s rather inelegant. As it stands, though, there’s no other way to check for exceptional cases, as closures didn’t become available until PHP 5.3. It’s also currently written entirely from memory of how testing works in Perl’s Test::More, and as such is almost definitely far less feature-complete than a full testing suite.

There’s also already a unit testing suite written for PHP, but it’s sadly only compatible back to 5.2.7. So, I think, as long as I’m stupidly insistent on sticking with 5.1, I’ll keep using this. Who knows, maybe I’ll learn to love it, over time. But for now, I can finally scratch that itch and secure my code (more than it has been, at any rate) with proper testing.

My First #YAPC

This past week I finally attended my first YAPC. While previously I’d attended the excellent Surge conference hosted by OmniTI as a commuter, and the brand-spanking-new DCBPW by the DC and Baltimore Perl Mongers, this was my first full-immersion conference with flying out of town and everything, and it was quite an experience.

My work sponsored me and two coworkers to attend the three-day conference. They were going to send us to the two-day workshops beforehand but unfortunately they sold out before the long process of travel paperwork completed. Still, I think we got a very worthwhile event.

As with most conferences, there was a lot of value in simply being around other people who are passionate about Perl. If I’ve got one big regret about this past week, it’s that I didn’t spend more time talking to those people. It’s not for their lack of trying, however. Everyone I did meet was very friendly and willing to talk, and more than once we were told by conference organizers to simply introduce ourselves to others. I’m unfortunately far too quiet for my own good, even among such good company. While the Linode-sponsored beer garden certainly helped me feel more comfortable with expressing myself, I mostly wound up expressing myself through dance.

One of the big topics throughout the days, however, wasn’t even just about Perl programming, but Perl culture, and expanding that culture through diversification. Michael Schwern (@schwern) gave the keynote on Wednesday morning, and impressively dove straight into waters which have been churning and sinking many ships for a couple of years now: women and minorities in a culture dominated by white males: the “geek” culture. Still, he brought it up in a non-combative, humorous way that still got everyone talking, and I think that’s the whole point. As he said, he wasn’t going to solve the diversity problem in 50 minutes at the start of YAPC::NA, but it was quite awesome to see everyone get into the topic and discuss it throughout the conference with as much weight as how we were going to solve the next big computational problems in Perl.

Speaking of the next big problems… For me, my latest issues have been focused around web development, and I’m mentioning that almost purely so I can segue into my favorite scheduled talk of the conference: Glen Hinkle (@tempire)’s Introduction to Mojolicious. For a long while now, any time I’ve wanted to throw anything onto the web, I’ve reached straight for Sinatra, because it’s so fast to install, quick to write in, and so easy to deploy to Heroku. Now, I’m feeling like the rewrite I started of my own blog may be scrapped and redone in Mojo. So, while many talks captured my interests, nothing quite changed my whole attitude on a topic like the Mojo Intro.

Oh, yeah, and Damian Conway’s recorded talk on Regexp::Debugger? Ho-lee crap. Rocked my socks off.

So, in all, it was a great experience, and one I hope to repeat at future YAPCs. While I’m hoping to attend one abroad one day, for now I fear I will just have to stay local. If YAPC::NA proved anything to me, though, it’s that there’s a wealth of value even in that.

Postgres GUC as Session/Transaction Variables

Edit Feb 8 8:53: After playing around a bit more with the functions, I’ve added another caveat dealing with the function volatility.

Some time ago, I wrote about session variables in PostgreSQL. I’ve been using the solution for some time to address the problem of performing a more-or-less-automatic audit trail for certain important tables when using accounts defined by the system and not the database, and it’s been working pretty well so far. However, I’ve always been concerned about the idea of potentially creating a new table for every transaction, even if it’s temporary. The database in question is a very small low-throughput system just used internally, but being inefficient just cause nobody will notice doesn’t seem like a good enough excuse.

As Rails is our front-end and it uses connection pooling, there’s a possibility that two subsequent pageviews would use two different connections to the DB. While that doesn’t always generate a new table (the second connection could have already had the temporary table created), it does necessitate resetting the values in the table that I use for auditing (current_user and audit_notes) every time, just to be sure. I’m unsure where on the scale of efficiency it would fall to validate that an update must occur before actually doing it, but it seems, due to the fact that I must validate the table as a whole exists before trying to do anything, that this is the least of my problems.

However, I recently had the magic of Tom Lane shine down from on high in this recent thread on the pgsql-sql mailing list, wherein he made mention of a feature of which I had been previously unaware: custom GUC variables. I’m not actually sure what the GUC stands for, even. However, what it provides is a namespace into which you can throw arbitrary variables. It’s designed for modules that are loaded at run time and need configuration (like plperl.use_strict).

If you clicked on the link to the thread, you’ll realize that Tom suggested this as a solution to the very problem I had built my “variable” temporary table method to address. The requested solution involved transaction-level isolation, but as ActiveRecord doesn’t seem to like using transactions unless you beat it thoroughly about the head, I’m more concerned with connection-level isolation. Fortunately, it looks like this does both! Instead of my big long complicated functions, you can simply include “custom_variable_classes = 'audit'” in postgresql.conf, reload, and in any connection do “SET audit."current_user" = 'whoever';” There are a couple of small caveats worth noting, however:

  • This is not the usage for which the GUC system was designed; as such, it is somewhat a Bad Thing to do, as it can potentially interfere with modules that are loaded. This is particularly notable if you have multiple databases in your server, as it’s a global setting: each one of your users will have that variable namespace. As far as I know, that’s not a security concern, but rather a nuisance concern, if they’ve never asked for such a thing. However, I think it’d be charitable to describe the situation where it’s problematic as an extreme edge case. As for the loaded modules, unless you have millions being loaded and unloaded all the time (I don’t even know if you can unload without a server restart), assigning a unique name to your variable class should be enough to avoid conflicts.
  • SET statements in PostgreSQL take their values as string literals with optional quoting. This may not be obvious if you’ve never used the SET command (which I tend only to use for search_path), but it means you can’t set the value using variable substitution, e.g. from a function argument. You’ll have to build the query at runtime inside the function using EXECUTE, and that can be unpleasant for everybody if you’re not careful about it. That is, use pg_catalog.quote_literal() to make sure your values are safe, because any characters PostgreSQL can’t figure out are part of the string will cause errors. Don’t bother with pg_catalog.quote_nullable(); the last point explains why.
  • Certain variable names must be ident-quoted. So far I’ve found that to be true of “user” and “current_user” at least (so, SET audit."user" instead of audit.user), and I presume there are others. Someone smarter than me may have an answer for this.
  • If you want to access the value of a variable via a function, the function must be declared as VOLATILE. IMMUTABLE is clearly out because it doesn’t depend on inputs, and for some reason that I’m unaware of STABLE is also out. This is probably a function of the SHOW command rather than of custom GUC variables in particular. Speaking of SHOW…
  • Retrieving the value of the variable, say in a PL/pgSQL trigger, can be done via “SHOW audit."current_user" INTO some_variable;” – I’m not sure if there are more efficient ways but that’s the one I’ve found that works. At least, most of the time…
  • Retrieving the value of a configuration variable that has not been set yet causes an exception to be raised. This is not an insurmountable problem, as you can simply trap the exception, but as the documentation warns, an exception-trapping block in PL/pgSQL is far more expensive than a regular block, so it shouldn’t be done if you can avoid it, which would be easy except…
  • SET statements do not allow you to assign NULL values to configuration variables. This is problematic if, like me, you want to allow someone to optionally insert some notes to go along with any auditing for a particular chunk of work (“I just changed the received time on this log because it turns out that was a 2 not a squiggled-out number”), but don’t want a pile of empty strings littering everywhere. You can handle it in one of two ways: always set the variable (to '' when there is nothing to say) and have your trigger functions run the value through NULLIF() so that '' becomes NULL, or leave the variable unset, trap the resulting exception, and return NULL. While I think NULLIF() is probably the cheaper option (without any benchmarks backing this gut feeling up), the trapping exceptions method is probably the Right Thing to do.
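To make the second caveat concrete: if the value ever has to come from a function argument, the SET statement has to be assembled as a string and run through EXECUTE. What follows is only a minimal sketch under that assumption; audit_set_user is a made-up name, and the function assumes the “audit” variable class is available as described above.

```sql
-- Hypothetical helper; the name audit_set_user is an assumption for illustration.
CREATE FUNCTION audit_set_user(new_user TEXT) RETURNS VOID LANGUAGE plpgsql AS
$$BEGIN
    -- SET cannot take a PL/pgSQL variable directly, so the statement is built
    -- as a string. quote_literal() escapes any embedded quotes in the value.
    -- A NULL argument makes the whole string NULL, and EXECUTE then raises
    -- an error, which matches the "no NULLs in SET" caveat.
    EXECUTE 'SET audit."current_user" = '
         || pg_catalog.quote_literal(new_user);
END;$$;
```

Calling “SELECT audit_set_user('O''Brien');” and then “SHOW audit."current_user";” should give back the name with its apostrophe intact.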

So, for all those things to be kept aware of, the end result can be just as simple as:

CREATE FUNCTION audit_user(OUT TEXT) LANGUAGE PLPGSQL AS
$$BEGIN
    SHOW audit."current_user" INTO $1;
EXCEPTION WHEN OTHERS THEN
    $1 := NULL;
END;$$;

CREATE FUNCTION audit_notes(OUT TEXT) LANGUAGE PLPGSQL AS
$$BEGIN
    SHOW audit.notes INTO $1;
EXCEPTION WHEN OTHERS THEN
    $1 := NULL;
END;$$;

I trapped the exception OTHERS for two reasons: one, I assumed that OTHERS would be faster than comparing against a specific case; and two, I plain just don’t know what specific exception gets raised when this happens. I also did not create a function that would set the auditing variables, as I figured there would be little point to creating a function that would basically just be wrapping a SET call. It’s all clearly much shorter and more sane than the temporary table solution, not to mention it seems it’s likely to be a lot faster.
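For anyone who would rather not trap OTHERS, here is a hedged variant. As far as I can tell, the error raised for an unset parameter is undefined_object (SQLSTATE 42704), but that is an assumption worth verifying against your own server version. The sketch also swaps SHOW for current_setting(), the function form of the same lookup, and folds empty strings into NULL with NULLIF(). The name audit_notes_or_null is made up for this example.

```sql
-- Hypothetical variant of audit_notes(); the function name is an assumption.
CREATE FUNCTION audit_notes_or_null(OUT notes TEXT) LANGUAGE plpgsql AS
$$BEGIN
    -- current_setting() returns the same text that SHOW would display;
    -- NULLIF() turns an empty string into NULL.
    notes := NULLIF(pg_catalog.current_setting('audit.notes'), '');
EXCEPTION WHEN undefined_object THEN
    -- Assumed condition name for "unrecognized configuration parameter".
    notes := NULL;
END;$$;
```

This combines both options from the list: an unset variable and a variable set to '' each come back as NULL, and any unrelated error is no longer swallowed.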

I think I’ve found myself a winner.

#FollowFriday

I can’t really get behind Follow Friday. I like it when people mention me, perhaps because I enjoy the confidence boost of someone saying what I think is interesting. But honestly, I don’t think anyone new has ever followed me from being mentioned, and I don’t think I’ve ever followed someone who was mentioned. In spite of the fact that I feel guilty for not giving those people props back, I feel like it would be a disservice to all 20 people and 140 spam bots following me to simply spam a bunch of names of people who have mentioned me.

It wasn’t always that I felt so negative about this particular aspect of Friday, but these days it seems like follow lists are just that: lists of names. If you throw my name in there with about 10 other people in one of 3 tweets that is nothing but names and “#FF”, it doesn’t really show much of an effort. The first times I saw anything about Follow Friday, it had “#FollowFriday” and a single name with a reason to follow them. That’s something worth doing. It shows you’ve put some thought into it. Of course, these days if I actually spent that much time it’d practically seem like a love letter to spend that much time thinking about a single person on my list.

Twitter’s always been a pretty ephemeral medium, so it makes sense that over time processes that occur on it will be condensed. But the law of diminishing utility comes into effect nonetheless. If you give me more and more names and do that at the expense of the “why,” because it’s “more efficient” that way, then you’ve lost any sort of meaning with it. Few people will click through the list and figure out if they want to follow those people as well.

I doubt this will impact anyone and prevent them from doing their own list come Friday, and that’s fine, really. I don’t intend to convince people, but merely explain why I won’t just “hit you back,” as it were. I prefer a high signal-to-noise ratio in my personal Twitter feed, despite what it may seem like sometimes. That’s why I skip the “Good Morning!” tweets and the (to me) meaningless “#FF” list.

Now, #WhiskeyFriday is all well and good, and #FridayReads, in spite of not being alliterative, is just fine by me. Like #MusicMonday (which I haven’t seen in quite some time), I am always ready for some new media (but not New Media) recommendations. I’ve also been told about #FridayRide, though I’ve never actually seen that one before. Hey, biking to work is always good (although for me, it might take about three or four hours each way).

Dealing With YAPB

It’s been a while since I blogged. Sue me.

I fully intend to get into a regular posting schedule, one of these weeks. I’ve even got a plan mapped out. But that’s for later. For now, I’ll detail how I set up my partner’s photoblog. It was actually less than completely straightforward.

We run our sites on Dreamhost, and setting up the MySQL database, subdomain, and WordPress installation went about as easily as it ever does (that is to say, in about 5-10 minutes I had it all running). Then, I went to install a plugin called “Yet Another Photoblog” which I had read in at least one place was a pretty decent plugin for converting WordPress more easily into a photoblog. The plugin installed fine, after a couple of attempts – for whatever reason, WordPress was giving me unzip errors and I thought I’d have to do it manually; fortunately they resolved and installation proceeded.

The plugin itself makes minimal changes to the overall admin interface of WordPress. There’s basically an additional file upload field above the “Add New Post” main editbox, as well as an additional section in the Settings sidebar. I hadn’t played with it before, so I tried a few posts. Here’s where I ran into issues.

I kept on getting errors saying “Error: File does not exists!” when I would get to the preview page. I tried with just jumping straight to publishing, and that didn’t work either. The posts had thumbnails on the admin side, but nothing showed up on the front-end. Also, the pictures were in the uploads directory, so I knew they were there.

I read on the plugin’s page that themes had to be chosen specifically for YAPB, and so I loaded the site up with one. The thumbnails showed a frame, but no actual picture. Chrome said the thumbnails were being sized as 1px X 1px. I couldn’t figure out how to get the full CSS picture with Chrome’s interface, so I jumped over to Firefox where I had some neat tools, and found the thumbnails showed up just fine there. That’s odd.

Eventually, through much Googling and hair-pulling, I tried manually creating the cache directory (didn’t fix it), renaming “phpThumb.config.php.default” to “phpThumb.config.php” in the plugin directory (god only knows why it was named that way anyway, nothing mentioned it except an obscure forum that I’d link to if I could find it again; still didn’t work though), and some hackery with the PHP in the phpThumb library itself (which also didn’t work).

Eventually, somehow, I managed to find this forum page, which linked to this other forum page, and detailed exactly how to fix my problem: Going to the Settings page, then to Media (it said Miscellaneous in the forum, but I guess the name has changed since then), and manually setting the uploads folder to “wp-content/uploads/”. This shouldn’t work, as the default is ALREADY “wp-content/uploads/”, but it does. I haven’t had any other problems.

If this post was incredibly boring to you, it’s because dealing with figuring this out sucked my brain out through a straw, threw it in a blender and hit “fuck this motherfucker up.” I think it wouldn’t have been so bad if the error message had been slightly more explanatory (a file name/line number would have killed you?), or if the solution hadn’t been so mind-numbingly stupid at the end.

Gizmodo and the iPhone (Finally)

So, my blog went down for a few days. Dreamhost’s automatic scanning script detected something wrong, and disabled it. All I got in the error message was a warning to update all of my software/plugins (which everything was, except for two plugins that went out of date while the site was down), and to check the server-side code for malicious modifications. Despite WordPress being a giant hideous PHP beast, I went through it yesterday, and everything looked just about like I’d expect. I think it was triggered because I had an unencrypted/uncompressed backup copy in a subdirectory with a much older version of WordPress. I’m not sure if it was accessible, but I deleted it anyway to prevent future occurrences.

Moving on.

So, Gizmodo is apparently made up of jackasses. As anyone who’s read this is already well aware, they somehow acquired a next-generation iPhone prototype. As we all know, Apple’s ass is squeezed so tight even radio signals can’t get out, so it’s clear that Gizmodo having the device in the first place wasn’t very much on the up and up. That much was clear as soon as I read the original article.

However, they then upped their jack-assery by outing the Apple engineer whose phone it was. Now, don’t get me wrong: I have no doubt that eventually Apple was going to get their hardware back, and a simple serial number check would tell them to whom they gave it. His life at Apple, likely, was ended. That sucks for him, cause people who work at Apple tend to like it, in spite of the draconian restrictions on talking to anyone about what you do (I know people who work at NSA who are allowed to talk more about what they do for a living). Of course, that much was his own fault.

The problem for me, though, is that all of that is an internal Apple affair. In no way was it journalism to out a guy that was about to get canned. It might be a human-interest story about how evil Apple is that they’d fire someone for losing a prototype; but that might happen at any company, it’s just that much more certain at Apple. And that argument is even flawed, because if Gizmodo had simply been up front with Apple and returned the device, there might not have even been an issue. The human-interest argument, broken as it clearly is, also assumes that they were doing it for some sort of altruistic purpose.

They weren’t.

Reading through their repeated posts, it sounds like they’re trying to be funny while fingering the guy. Let me clue you in, Gizmodo: Apple isn’t going to say “well, clearly it’s this guy’s fault so we’ll just let it slide.” The whole thing reads like the following story: a nerdy guy is encouraged by his smooth-talking friends to steal his dad’s porno stash so they can all beat off in the tool shed later; the nerdy guy gets caught; the smooth-talking friends say, while snickering, “Well, shucks, Mr. Jobs, poor old Gray just made a mistake any of us could make, if we were trying to STEAL PORN MAGS TO BEAT OFF, golly goshes.” They act smugly about the entire affair, but the problem is that this wasn’t some small-time misunderstanding, and Steve Jobs doesn’t seem like the kindly hearted dad-next-door who doesn’t want to spank you with the full force of Johnny Law.

I do not like Apple’s methods of locking down all their research, the entire environment of their computers/devices, or much of anything about Apple (aside from the physical appearance and software stability of their computers, which you have to admit is sexy). However, it’s their prerogative. As a consumer, the only way you get to vote on this is with your dollars. They don’t do anything wrong legally by walling off their ecosystem, and it’s not a bout of journalistic prudence to crack open an illicitly-acquired prototype. It’s potential theft and destruction of property charges. And as much as I dislike Apple, and would relish the opportunity to know what the next iteration of their software/hardware does without the “Apple Event” Steve Jobs/media circle jerk, it’s the way they do things, and the way they’re allowed to do things.

I’m not sure what the statutes will say about any of this legally, since the device has now been returned to Apple without invoking any law enforcement thus far. However, Apple has (to my estimation) the following possible recourses:

  • Do Nothing – Unlikely, to me. They rely on extreme secrecy, and if a breach of that secrecy goes unpunished, other people will be willing to say “screw it” in the future.
  • Cockblock Gizmodo – This seems almost a given. While other media outlets are invited to the Apple Events to get first cracks at live-blogging/tweeting new hardware and software releases, Gizmodo may have to sit outside in the rain and wait for scraps in the trash can left over from more favored pets. Note that the following options are still available in conjunction with this one.
  • Red Tape – Assuming there’s nothing that Apple can eventually legally do, they can still squash Gizmodo with long-term legal problems, overmatching them with a legal team big enough to staff an aircraft carrier, as big corporations are known for having, tying them up until their funds completely dry up and they collapse.
  • Lawsuit – Like the previous one, but successful: assuming they can prove that they lost R&D money, or eventual sales due to less impact at their eventual unveiling, or anything resulting from a yet-to-be-proven-illegal “transaction” (read: theft) of a prototype, that could land Gizmodo in spicy legal waters which could prove disastrous: from major fines all the way up to jail time.

I do not like being on Apple’s side on this. If it had stopped at “they published a story which damages Apple’s bottom line,” I’d wince and look away, feeling badly as they were eviscerated and/or annihilated at Cupertino’s hands; I might even write an objection at Apple’s shitty tactics (I did say I don’t like them). But the arrogance and flippant way in which they tossed the engineer’s name out there, while still protecting the guy who sold them the phone “as a source,” like they were some sort of legitimate news organization that just happened to act like guilty 15-year-olds, makes me hope for the worst.

Postgres Session Variables – Neat.

After futzing around a bit, and once again having my suspicions confirmed, I came up with the following solution to my session variables problem: temporary tables. They are dropped at the end of the session, so it all pans out nicely. I still didn’t like having to make front-ends do all the work (plus I was all excited after figuring it out), so I slapped together a pretty basic couple of wrapper functions:

/*
 * Session Variables in PostgreSQL via PL/pgSQL
 * Written/tested on version 8.4.2, but should work anywhere
 *
 * Code written by Stephen "sycobuny" Belcher
 *
 * Free to use, I enjoy writing stuff like this.
 * Just give me some props if you do.
 *
 */

/* -- These lines may not be necessary, lang/schema may exist

   -- Remove commenting only if they need to be created

CREATE
  LANGUAGE 'plpgsql';

CREATE
  SCHEMA SV;

 */

-- Make sure the session_variables temporary table exists

-- TODO: Make sure it has the right columns, too
CREATE OR REPLACE
  FUNCTION SV.ensure_session_table_exists()
  RETURNS VOID AS
$BODY$BEGIN
  PERFORM *
    FROM pg_catalog.pg_class
    WHERE relname = 'session_variables' AND
          relnamespace = pg_catalog.pg_my_temp_schema();

  IF NOT FOUND THEN
    CREATE
      TEMPORARY TABLE session_variables (
        "key" TEXT PRIMARY KEY,
        "value" TEXT
      );
    RETURN;
  END IF;
END;$BODY$ LANGUAGE 'plpgsql';

-- Set a variable. Yep.
CREATE OR REPLACE
  FUNCTION SV.set(IN xKey TEXT, INOUT xValue TEXT) AS
$BODY$BEGIN
  PERFORM SV.ensure_session_table_exists();
  PERFORM *
    FROM session_variables
    WHERE "key" = xKey;

  IF FOUND THEN
    UPDATE session_variables
      SET "value" = xValue
      WHERE "key" = xKey;
  ELSE
    INSERT
      INTO session_variables ("key", "value")
      VALUES (xKey, xValue);
  END IF;

  RETURN;
END;$BODY$ LANGUAGE 'plpgsql';

-- Get a variable's value. It's just that easy!
CREATE OR REPLACE
  FUNCTION SV.get(IN xKey TEXT, OUT xValue TEXT) AS
$BODY$BEGIN
  PERFORM SV.ensure_session_table_exists();
  PERFORM "value"
    FROM session_variables
    WHERE "key" = xKey;

  IF NOT FOUND THEN
    RAISE WARNING 'Variable % does not exist', xKey;
  END IF;

  SELECT "value"
    INTO xValue
    FROM session_variables
    WHERE "key" = xKey;

  RETURN;
END;$BODY$ LANGUAGE 'plpgsql';

From this point, you just do:

SELECT SV.set('my session variable', 'its value');
SELECT SV.get('my session variable');

It handles creating and updating the table and values independently, without any hassles.

Well, there’s a couple hassles:

If it’s the first time you access it in a session, it barfs out warnings because PostgreSQL likes to warn you when it creates new indexes and constraints implicitly; whether this causes any libraries such as ActiveRecord to croak, I’m not sure – I will be testing that one shortly at least. Also, the get() function itself throws up a warning if you try to get a variable that hasn’t been defined yet. This is because the value of any variable can actually be NULL, but there’s nothing else to return if it hasn’t been defined yet. It’s an intrinsically implementation-specific concern whether this behavior is desired, so I split the difference: you can do it, but you’re going to be tut-tutted by the database.

So, there you have it. Of course, changing the schema into which I put these functions should be trivial; I put them there because I like the simple syntactic sugar it provides, but, as they are not tied to a permanent table it should be as simple as just changing the function names. I’ll probably wind up doing it myself, as my schemas are thematically named (individual “projects” and their “code names”).

There’s one final gotcha, which I didn’t fully account for because I ran out of ideas how to ensure it continues working right: PostgreSQL resolves unqualified table names against temporary tables first, before checking the schemas in the search_path. I’m not sure if this is an SQL-standard way of doing things; if it isn’t, I’m sure they’ll correct it at some point, and then my code will be broken. Unfortunately, I’m not entirely sure how best to perform a query on a temporary schema (which is assigned a technically-random name by the database) without constructing the query as a string, which totally removes any optimization done by preparing the statements ahead of time. If there’s some way around that, I’d be quite interested to hear it. Oh yeah, and it hides your “session_variables” table, if you’ve made one yourself. Sorry. Qualify your names and it won’t be a problem, though.
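One possible way around the dynamic-SQL problem, offered as a sketch rather than something tested against the functions above: PostgreSQL exposes the current session’s temporary schema under the alias pg_temp, so a temporary table can be schema-qualified without knowing the schema’s real name. The table below mirrors the one the functions create.

```sql
-- Same shape as the table created by SV.ensure_session_table_exists().
CREATE TEMPORARY TABLE session_variables (
    "key"   TEXT PRIMARY KEY,
    "value" TEXT
);

-- pg_temp always refers to this session's temporary schema, so these
-- statements cannot accidentally hit a permanent session_variables table.
INSERT INTO pg_temp.session_variables ("key", "value")
    VALUES ('my session variable', 'its value');

SELECT "value"
    FROM pg_temp.session_variables
    WHERE "key" = 'my session variable';
```

Whether the qualified name behaves well inside the PL/pgSQL functions before the table exists is something I would test before relying on it.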

Postgres Triggers, Why Do You Hate Me?

Actually, in spite of the melodramatic post title, figuring out triggers in PostgreSQL has been relatively painless. Of course, I had a pretty firm grasp of them in MySQL, and thus the major migration headache is realizing that the code handling the trigger has to be defined separate from the trigger itself.

The issue for me, however, is that (unlike Peter Eisentraut) I have a sordid love affair with schemas. For me, it’s not just about addressing the potential naming conflicts, but breaking down the tasks our database/frontend performs into more manageable blocks. Put it this way: I have over 100 tables, functions aplenty, and datatypes (as ENUM now has to be a datatype) to spare. While I recognize that it’s not impossible to manage, and there are many systems that probably have a volume of data far exceeding mine, it still does wonders for my sanity if I can break those down so I only have to look through at most 25 tables at a time. They’re not randomly selected, they are geared towards similar ends and logically fit together, and I think schemas really help on that front.

My solution to Peter’s problem of localized search paths is obtuse, but it works: I deliberately name the full path in anything that’s going inside a stored procedure. It’s a very tedious and defensive posture, but it has worked pretty well, at least up until now. This is where we get back to triggers: triggers in PostgreSQL are not named according to the same conventions as almost everything else in the DBMS. While most times, in the docs “simple_name” by itself is tantamount to "public"."simple_name" (as the default search path is '"$user", public'), this is not so for triggers. They are associated with the tables for which they are defined. While in hindsight, this makes sense, it took some time to figure this out (also, Michael Graziano pointed me in the right direction after I bitched about it). What would have been far simpler is if, anywhere in the documentation, they had simply specified this strange behavior. Even a hint, when executing “CREATE TRIGGER my_schema.do_something”, other than a bland “syntax error at ‘.’” would be nice.
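To spell out what finally worked, here is a minimal sketch. Every name in it (my_schema, widgets, do_something) is hypothetical; the point is only where the schema qualification goes.

```sql
-- All names below are made up for illustration.
CREATE SCHEMA my_schema;

CREATE TABLE my_schema.widgets (
    id         INTEGER PRIMARY KEY,
    updated_at TIMESTAMP WITH TIME ZONE
);

-- The trigger function is an ordinary function, so it can be schema-qualified.
CREATE FUNCTION my_schema.do_something() RETURNS TRIGGER LANGUAGE plpgsql AS
$$BEGIN
    NEW.updated_at := now();
    RETURN NEW;
END;$$;

-- The trigger name itself must be unqualified: it belongs to the table
-- named in the ON clause, and that is where the schema goes.
CREATE TRIGGER do_something
    BEFORE INSERT OR UPDATE ON my_schema.widgets
    FOR EACH ROW EXECUTE PROCEDURE my_schema.do_something();
```

Two tables in different schemas can each carry a trigger called do_something without conflict, since the name only has to be unique per table.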

The other part of my problem with triggers doesn’t have anything to do with the trigger mechanism itself, but with an issue I’ve encountered in the database as a whole. There is, as far as I can tell, no mechanism for creating connection-level or database-level custom variables. You can obviously make variables in any of the procedural languages. However, setting a variable on the connection, like MySQL’s “SET @my_custom_variable := ‘some custom value’;”, just doesn’t seem to exist. While this may not seem like a feature that would be particularly useful (after all, there are the aforementioned procedural languages), I’ve been finding its absence quite problematic.

When we were on MySQL, we used wxPerl and wxRuby as front-ends to connect directly to the database using database-level logins, and I was able to write auditing fairly handily: not only could I record what data in a table was modified and when, I could also log who modified it. That doesn’t seem all that amazing, except that this was all database-side. The clients had to change exactly 0 code. You could even optionally issue a “SET @COMMENTS := ‘my editing comments’;” and have it apply to all of the changes automatically until you unset it, providing an easy-access way to comment the audit log. I knew this was going to be problematic when switching to Rails, as ActiveRecord only ever connects as one user, and uses special models to manage logins. However, with MySQL, I could just ensure that the client issued a “SET @CURRENT_USER := ‘whomever@address’;” before starting work. This, while obtuse, still has the database doing the majority of the legwork. I could easily reject any statements that occurred before that variable was assigned, to make sure that every change included complete auditing data.
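For the curious, the MySQL side of that looked roughly like the sketch below. This is a reconstruction with made-up table and trigger names, not the production code, and it only covers updates on one table. USER() reports the account the client logged in with, and an unset user variable such as @COMMENTS simply reads as NULL.

```sql
-- Hypothetical tables, for illustration only (MySQL syntax).
CREATE TABLE widgets (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE audit_log (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    table_name VARCHAR(64)  NOT NULL,
    row_id     INT          NOT NULL,
    changed_by VARCHAR(255) NOT NULL,
    comments   TEXT,
    changed_at DATETIME     NOT NULL
);

-- One audit row per updated row. The client never mentions audit_log;
-- the trigger picks up the login and the optional session comment itself.
CREATE TRIGGER widgets_audit_update
AFTER UPDATE ON widgets
FOR EACH ROW
    INSERT INTO audit_log (table_name, row_id, changed_by, comments, changed_at)
    VALUES ('widgets', NEW.id, USER(), @COMMENTS, NOW());

-- What a client session might do:
SET @COMMENTS := 'my editing comments';
UPDATE widgets SET name = 'renamed' WHERE id = 1;
SET @COMMENTS := NULL;  -- later changes are logged without a comment
```

With a single shared login, as under ActiveRecord, the same trigger would read a user variable set by the client in place of USER().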

PostgreSQL has denied my attempts at a simple solution to this, however. I’ve been searching, but the easiest solution seems to be to make a special table which holds “current connection” information and somehow tie the connection ID to the user ID that way. This seems pretty complicated, and I’m not sure how safe it is to assume that the connection ID will be unique for an arbitrary length of time (as I’m not sure what mechanisms exist for periodically purging the table of stale IDs). If there is any other way around this, I’d love to hear about it.
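A rough sketch of that “current connection” table follows, mostly to show how much machinery it takes. Everything named here (the audit schema, the table, the whoami function) is hypothetical. It assumes PL/pgSQL is installed and uses pg_backend_pid() as the connection ID. It does not solve the stale-ID worry; it only sidesteps it by having each client clear out any leftover row for its own ID before registering.

```sql
-- Hypothetical names throughout; a sketch, not a tested design.
CREATE SCHEMA audit;

CREATE TABLE audit.current_connection (
    backend_pid   integer PRIMARY KEY,   -- the "connection ID"
    app_user      text NOT NULL,
    comments      text,
    registered_at timestamptz NOT NULL DEFAULT now()
);

-- Looks up the application user registered for this connection and
-- refuses to continue if the client never registered one.
CREATE FUNCTION audit.whoami() RETURNS text AS $$
DECLARE
    found_user text;
BEGIN
    SELECT app_user INTO found_user
      FROM audit.current_connection
     WHERE backend_pid = pg_backend_pid();

    IF found_user IS NULL THEN
        RAISE EXCEPTION 'no application user registered for this connection';
    END IF;

    RETURN found_user;
END;
$$ LANGUAGE plpgsql;

-- What the client would run once, right after connecting.
-- The DELETE discards a stale row left behind by an earlier connection
-- that happened to have the same backend PID.
DELETE FROM audit.current_connection WHERE backend_pid = pg_backend_pid();
INSERT INTO audit.current_connection (backend_pid, app_user)
VALUES (pg_backend_pid(), 'whomever@address');
```

An audit trigger function could then call audit.whoami() where the MySQL version read a variable, which also gives the “reject anything before the user is set” behavior for free. Rows from connections that have since closed still pile up until something purges them.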

Boring Life Update

Having spent a decent portion of today talking about programming with rakaur and dKingston, I feel the urge to actually do some programming.  I do that for my job, of course, but that’s not really what I’m talking about.  I’ve been doing that all day today, and while it’s nice to get a task done, writing database interfaces for government processes isn’t exactly glamorous.  I don’t hate my job, though the people I work with can be, shall we say, frustrating; I just want to do something other than this.

Recently, I went off the deep end and acquired an iPhone, which I’m almost certain heralds the apocalypse in some way.  However, it presents an interesting opportunity: I could learn another language and actually write some interesting things.  I’m assuming the ideas for those things will come later, after the learning part.  The more important thing is to actually do something new, while I have the vibe for it.

Also, I never posted those things I promised I would a week or two ago.  Turns out, I had no Internet connection.  Of course, now I’ve totally lost interest in them.  One’s written up, but it’s terrible, so I guess I’ll post it, and just trash the other.

Twitter Lists

In order to appease all of my reader, I’m writing up another blog post.  Really, I should be doing it to appease myself, but whatever.  I’ve written up, or at least started, a couple posts which I think should be interesting.  However, in an attempt to only come out with (relatively) quality work, I’m trying to edit them a bit before posting.  I think I’ll give you a hard deadline (and myself as well); I’ll post them by Monday night: one post on me complaining about myself, and one post on me complaining about militant/hardcore atheists.  Aren’t you terrifically excited?  I know I am.

Speaking of being all atwitter, it seems that they’ve gotten around to providing a feature a lot of programs already gave: lists.  Most people are excited, though I know that some people aren’t.  Of course, there’s a valid point: some people are using lists to correct for a flaw inherent in their own system.  Information overload occurs for them because they don’t know when to stop clicking the damn “follow” button, or how to prune people that add nothing to their view of the global conversation.  Here’s a free “Pro Tip”: don’t follow someone back just ’cause they followed you; statistically speaking, they’re probably a boring self-absorbed asshole (but then, likely so are you).

I think there are other possibilities for twitter lists, though.  I only follow 80 people at the moment, and I have 9 lists.  I view it as a way to figure out more interesting people to follow.  When twitter nerfed the ability to see all replies (even if you’re not following the user being replied to) in your timeline however many moons ago that was, I was pretty pissed.  I was just about as pissed as one can be about a free service that impacts virtually nothing in one’s real life.  It removed a function of “serendipitous discovery,” or whatever phrase proponents assigned to it.  I think this goes at least a little of the way towards mending that bridge.

Take a look at it from this angle: I follow a wide smattering of people because, like most humans, I find multiple topics interesting.  Someone who follows me might share one of those interests, but probably not more than 3.  With lists, they can see who I’ve grouped together as also sharing that interest.  For instance, I have a list for linguistics, and a list of people doing work with PostgreSQL.  The two lists have absolutely no one in common, but they’re both interests that someone following me might share.  Now, rather than either clicking through my following list to find similarly interesting people, or just giving up entirely on finding interesting people (the more likely course), people can have a cherry-picked list prepared for them before they even arrive.

Of course, the feature is far from perfect.  The navigation through lists is shoddy (every page seems to have a different set of links to different functions), and it’s not easy to find users in your following list without just clicking through every page (which, even with my small number of followees, is annoying).  I’d also like to see a way to provide a brief description for a list, for those situations when it’s not entirely clear, nor could it be made clear in a brief URL-like description, why these people are grouped together.  I have a list of people I’ve actually met in meat space, but I couldn’t figure out a good way to say that succinctly for my list name.  A brief description field would be awesome to clarify that, so you wouldn’t have to wade through my bizarre version of comedic naming to “get it.”

Of course, I’m sure it won’t be long before I find that most people are using it to make worthless lists and somehow spamming becomes a problem.
