PHP DateTime Class

July 2nd, 2007

With as much time as I spend talking about Rails, you’d think that’s all I do. In fact, Bitscribe - like many companies, I’m sure - maintains a number of apps which predate Rails. My team and I often find ourselves bringing concepts from Rails (and other frameworks; we have a few Django fans here, for example) into the frameworks used on these apps. A sort of backport, if you will.

In that vein, a coworker of mine created a timestamp manipulation class for PHP. Dates come out of the database as strings, which are easy enough to turn into time_t timestamps; but hard to manipulate or do comparisons on. I often find myself falling back to doing the manipulation in SQL, since Postgres has excellent date/time manipulation; but this is pretty ugly, and basically impossible to unit test.

Since it is a standalone class, it was easy to extract from the framework, so I suggested he post it on the Bitscribe open source page, which he did. Here it is: BDateTime. Next time you find yourself manipulating timestamps through mktime(), strtotime(), and (heaven forbid) regular expressions, try BDateTime out instead.

Comet with Rails + Mongrel

May 8th, 2007

In my last post I described how to create a mongrel handler. I said you might want to do this for optimization purposes, but my own interest came about in an attempt to solve the server-push problem with Rails.

Comet is the term that seems to be catching on for server-push via XmlHttpRequest. Possible applications include chat clients or a stock ticker. Anything that wants constant updates will be both responsive and less demanding of server resources if it waits for data to be pushed to it, instead of opening a new status query every few seconds.

Since the server can’t initiate a connection to the user’s browser, the only possible solution is to have the browser hold a connection open indefinitely, waiting for an update. Since Rails is single-threaded, however, this means that one whole server instance would be tied up by this connection - clearly infeasible in almost all situations.

You might say, “Why not have another small server listening on a separate port to hold on to these push-status connections?” Good idea - except that XmlHttpRequest won’t let you connect to another port. This is because the port is considered part of the hostname, and connecting to another hostname from within the javascript sandbox would be a big security no-no. (It would be trivially easy, for example, to inject a little javascript into a site which caused all of its visitors’ browsers to start hammering another unrelated site as soon as they visited the homepage.)

Juggernaut gets around this with a little hidden Flash component. This is a nifty idea, but for me it is unappealing because Flash is not readily available for my platform (Ubuntu AMD64). More importantly, I’d prefer to avoid building technology that depends on a proprietary plugin built by a monolithic, old-fashioned (i.e., shrink wrap) software company.

So holding open connections to Rails won’t work due to its controller lock. But as was demonstrated in the previous entry, a mongrel handler won’t have that problem. I’ll extend the auction example shown there to use server-push.


require 'active_record'

class StatusHandler < Mongrel::HttpHandler
   def process(request, response)
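      id = request.params['PATH_INFO'].slice(1, 20)  # trim leading slash
      current = request.params['QUERY_STRING']       # status value the client already has

      # Hold the connection open, polling until the status differs from the client's copy.
      while status(id) == current do
         sleep 0.2
      end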

      response.start(200) do |head, out|
         head["Content-Type"] = "text/html"
         out.write status(id)
      end
   end

   def status(id)
      connection.select_value("select status from auctions where id=#{id.to_i}")
   end

   def connection
      ActiveRecord::Base.connection
   end
end

uri "/status", :handler => StatusHandler.new, :in_front => true

This assumes your auctions table has a field named “status,” which I’m using as an integer, but any type should work. http://localhost:3000/status/1 now delivers just one value, the status. Where it gets interesting is something like http://localhost:3000/status/1?100, assuming that the status of auction id=1 is currently set to 100 in the database. Now, the connection will hang and wait for the value to change. (You’ll see the database queries in development.log, but no web hits.) Pop open a sql shell and run “update auctions set status=101” and the connection will resolve immediately, printing out the new value.
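The handler's loop waits with no upper bound, so a client that gives up still leaves a thread polling the database. As a rough sketch only (this is not part of the original handler), the same wait can be written as a standalone helper with a timeout. The name `wait_for_change` and its `timeout` and `interval` parameters are made up for illustration, and the block stands in for the database query.

```ruby
# Hypothetical helper: call the block repeatedly until its value differs
# from `current`, or give up once `timeout` seconds have passed.
def wait_for_change(current, timeout: 30, interval: 0.2)
  deadline = Time.now + timeout
  loop do
    value = yield
    return value if value != current      # status changed: report the new value
    return value if Time.now >= deadline  # timed out: report the unchanged value
    sleep interval
  end
end

# Stand-in for the status query: the value changes on the third poll.
polls = ["100", "100", "101"]
puts wait_for_change("100", timeout: 5, interval: 0.01) { polls.shift || "101" }
```

With a timeout in place the browser simply gets the old value back and respawns the request, which is what the page-side code already does on every completion.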

Here’s a simple example of making an ajax call to this url from within a page:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
   <%= javascript_include_tag :defaults %>
</head>
<body>
   Status is now: <span id="status"></span>

   <script language="javascript">
      function respawn()
      {
         new Ajax.Updater('status', '/status/1?' + $('status').textContent, { onComplete: respawn });
      }

      respawn();
   </script>
</body>
</html>

Experiment with updating the status value in the sql shell and you’ll see that the page always updates instantly. To watch the connections, open the Firebug console, click Options in the upper-right, and make sure “Show XMLHttpRequests” is checked. Reload the page and you’ll now see a POST each time you update the status. There will always be an active one at the bottom, waiting, waiting for the status update.

And there you go. Server-push connections with only Rails and Mongrel.

Update: Mere minutes after I finished writing this article, I came across Shooting Star, a Rails plugin for adding Comet to your apps. So far this looks a little heavy-weight for my purposes, and somewhat platform-dependent - not to mention that they push the meteor metaphor a bit far in their method naming. Still, this may be a more robust solution than my little hack, so check it out. If anyone has tried Juggernaut, Shooting Star, and my hack, I’d be curious to hear a comparison.

HOWTO: Custom Mongrel Handlers

May 6th, 2007

There doesn’t seem to be any good documentation for creating custom Mongrel handlers. The Mongrel site seems to be completely silent on the subject. I was able to extract what I needed to know from this article and Ezra Zygmuntowicz’s slides.

It’s actually quite easy to make a mongrel handler, so here’s a quick tutorial that should tell you everything you need to know.

First, why would you want to make a mongrel handler? The main reason would be speed and scalability. Mongrel is multithreaded even though Rails isn’t. Mongrel is fast, Rails is slow. Of course, Rails is Rails, and Mongrel is just a webserver; but many types of apps may have some services that are hit far more than others, and thus it would be worth writing what amounts to a stripped-down version of the controller action in order to handle just those requests.

For example, if you were writing eBay in Rails, you might want to implement the API calls which are used to check the status of an auction as a custom mongrel handler. This part of the site may be accessed with great frequency by third party apps trying to keep a current status, and chances are generating the results is pretty simple (converting a row in the database to XML and printing it out).

Here’s what that might look like:


class StatusHandler < Mongrel::HttpHandler
   def initialize
      @mutex = Mutex.new
   end

   def process(request, response)
      id = request.params['PATH_INFO'].slice(1, 20)  # trim leading slash

      response.start(200) do |head, out|
         head["Content-Type"] = "application/xml"
         out.write status(id).to_xml
      end
   end

   def status(id)
      rows = @mutex.synchronize { ActiveRecord::Base.connection.select_all("select * from auctions where id=#{id.to_i}") }
      return { 'error' => 'No such record' } if rows.length < 1
      return rows.first
   end
end

uri "/status", :handler => StatusHandler.new, :in_front => true

Name this status_handler.rb and drop it into the root dir of your Rails project. Instead of running script/server, execute this command:


mongrel_rails start -S status_handler.rb

Assuming you’ve set up a little database with some sample data in the table named by the handler (“auctions” in my example), accessing the url http://localhost:3000/status/1 will show the data for the record with id=1.

Now what’s so hot about this? For one, it’s fast - see Ezra’s slides for benchmarks. More important, to my mind, is that a long request won’t hold up any others. Try putting a “sleep 10” as the first line of the process method. Restart your server and hit the status url again. The connection will hang temporarily, but now open another tab and hit any other page in your Rails app. Notice that it displays right away, even though the other tab is still loading.

The downside is that you don’t have Rails, and as it turns out, we like Rails. So suddenly you’re stuck doing a lot of your own dirty work. Here, for example, I load up ActiveRecord and manage the database connection and raw sql manipulation. (This whole thing could be done in one line as a Rails controller: respond_to { |f| f.xml { Auction.find(params[:id]).to_xml } }) Parsing the request string can be time-consuming so I went for simplicity - String#slice instead of a regular expression or tokenization. You also have to protect against CGI parameter attacks, which I again simplify with a to_i.
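To see what the slice-then-to_i approach does with awkward input, here is a small standalone sketch. The helper name `extract_id` is hypothetical; it just mirrors the two steps the handler performs on PATH_INFO before the value reaches the SQL string.

```ruby
# Hypothetical helper mirroring the handler's parsing: drop the leading slash,
# cap the remainder at 20 characters, then coerce to an integer.
def extract_id(path_info)
  path_info.to_s.slice(1, 20).to_i
end

puts extract_id("/1")                        # 1
puts extract_id("/42; drop table auctions")  # 42
puts extract_id("/abc")                      # 0
puts extract_id("")                          # 0
```

Since to_i stops at the first non-digit, nothing but an integer is ever interpolated into the query. Garbage input collapses to 0, which presumably matches no row and falls through to the handler's error response.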

I wasn’t able to figure out how to load a custom handler from a mongrel yaml config file. It seems like the keyword should be config_script, but it doesn’t seem to produce the same result as -S. If anyone knows how to make this work, please comment.

Now that you know how to write a mongrel handler, the real fun can begin. In my next post I’ll describe how this can be used for server-push connections.

Update: Rick Olson improved the code by trimming out the superfluous establish_connection; I’ve used his version above.

Working With Rails

May 1st, 2007

I’ve been aware of Working With Rails for a while, and have even connected with a few developers who have done work for Bitscribe through it. Cool concept, definitely. Just recently I noticed that someone had recommended me (one of the contributors to Gyre - thanks, Michael!) so I decided to make a real profile. Interesting to note that, with one recommendation, my popularity is 89%. Guess there are a lot of empty records up there.

The profile form asks how long you’ve been working with Ruby, and Rails. For the former I looked at the datestamp of my first fooling-around Ruby script. For the latter I hit the svn log of my first Rails project. I was surprised to see that I’ve been doing Rails for exactly a year this month, with my Ruby tinkerings predating that by about six months. Woah, really? Ruby and Rails feel so comfortable now that it seems like years since I’ve used anything else. It’s gratifying that time seems to stretch out during periods of rapid change - I guess time doesn’t always fly when you’re having fun.

Rails / Ubuntu Feisty Quickstart

April 21st, 2007

If you’ve just installed a fresh copy of Ubuntu 7.04 (Feisty Fawn), the following sequence of commands will give you everything you need to run Rails with MySQL or Postgres and Mongrel. This should be run as root (“sudo su -” will get you a root shell).

First, core packages through apt-get:


apt-get install ruby rubygems rake ruby1.8-dev irb rdoc libopenssl-ruby1.8 postgresql-8.2 libpgsql-ruby libmysql-ruby1.8 mysql-server gcc libc6-dev make subversion openssh-server

And your gems:


gem install -y rails mongrel --no-rdoc --no-ri

If it prompts you for which version of Mongrel (or other gems) to install, the first one on the list (type “1” and press enter) is almost always right, unless it reads “win32”, in which case pick the first one that says “ruby”. (This silliness is definitely a major weak point of the gem package manager. I’ve created a patch that fixes this issue, which is being studiously ignored by the rubygems maintainers.)

Now, enable mod_rewrite and mod_proxy in your Apache modules (the latter is only necessary if you plan to proxy mongrel, but might as well have it):


a2enmod rewrite
a2enmod proxy
/etc/init.d/apache2 restart

For bonus points, you might want to install a few other useful developer tools:


apt-get install php5 php5-cli php5-pgsql php5-mysql vim-gtk vim-ruby

Javascript Text Editor

April 20th, 2007

Here’s a source code editor I wrote in javascript, inspired by CodePress. Feel free to snag and use in your application or modify to your needs. Syntax highlighting and indenting for Ruby is hardcoded, but it could easily be modified by swapping out ruby_syntax.js with your own class. I used Prototype just out of habit, but it could be factored out very easily - I just use it for a couple of simple things like Event.observe.

One thing that really struck me working on this is just how powerful javascript and DHTML have become. I mean, writing a programmer’s editor is hard, right? But I cranked this out in a few evenings. And yeah, I know it’s far from full-featured; but it really blew me away just how easy this was. The only part that was even mildly challenging was handling selections.

I’ve worked with a number of different display paradigms over the years. Early on I was doing character-based output, drawing little boxes and menus and so forth using the line-drawing characters in the upper half of the extended ASCII set. This was great because it was so easy, mostly because everything fit onto a grid.

Later I started working with graphics by manipulating the raw pixels. It took a pretty massive amount of time to do something as simple as make a button, let alone something complicated like a scrolling panel or tabs.

A few more years passed and now there were various sorts of GUI toolkits. I started with hideous, barely usable ones like Motif and the raw Win32 libs. Later I moved on to more enlightened toolkits like Qt and GTK. The box model they offered for packing widgets, coupled with some good visual design tools, made the process of building the display portion of your app pretty reasonable.

Still, it was difficult to mix freeform drawing like lines or shapes with the box-model widgets. And either way, it was just nowhere near the ease of working with those simple character-based displays. That’s the price of progress though, right?

Nope. Somehow - and I’m not sure how or when it happened - DHTML managed to evolve into a combination of the box model (for auto-adjusting layouts) and the canvas model (for freeform drawing). And somehow, working with it, I had a strange sense of deja-vu: working with DHTML is very pleasantly reminiscent of the good ol’ fashioned character grid from my old 80×24 text mode programs.

Maybe because, at its core, HTML really is just a bunch of characters. But it’s divided into container nodes that can easily be positioned however, including using box model stuff to get them to line up neatly and dynamically resize with their content. And then it’s like a canvas in that you’re free to position things however, breaking them out of the grid if you want. And then when I see the amazing things you can now do with inline SVG, all I can say is: the sky’s the limit.

On Focus

April 17th, 2007

Through most of my career as an entrepreneur, my mornings have always been devoted to checking email. This has always seemed like a good way to dive into the day. But recently, I’ve come to the conclusion that this behavior is actually quite poor for productivity. Not because there’s anything wrong with checking email - it certainly needs to be done at some point (or many points) each day. No, the reason why I prefer to do things differently now is that I think a morning email-check starts you off at the wrong level of detail.

Email tends to be very zoomed in. Little notes, FYIs, and requests related to stuff you’ve already done, or may do in the future. But one of the biggest impediments to productivity is getting too caught up in the details, and thus missing out on the big picture.

There are always an infinite number of details, and you can run yourself ragged trying to keep up with them all. It’s easy to do so, because all of these details tend to be so demanding. They sit on a todo list looking terribly not-done, or worse yet, they come in attached to subtly demanding emails from your coworkers, clients, or partners.

The start of the day is a perfect time to look at the big picture. You’re rested, and your head is clear, since it’s been 12 hours or so since you last thought about work. The whole day is ahead of you, full of promise and potential. Now’s the time to ask: what is the absolute most valuable way I could spend the next eight hours?

I’ve been amazed at the insights this produces. I’m more productive - not from doing more work, but from working on things that matter more. Humans tend to get caught up in the details so easily. Once caught up, we rarely stop to question the value of the task at hand compared with anything else we could be doing with our time and energy. But the opportunity cost of that energy may be huge. Morning is a great time to stop and think about this; it’s the one time of day that you’re not already wrapped up in something.

Not surprisingly, I’m not the first to have this insight. Getting Things Done recommends setting aside two hours each week to look at the “50,000 foot view”, or the really big picture. This is one of those “But I don’t have time for that!” → “You don’t have time not to” things. My complaint with the specific method suggested by GTD - doing this review late on a Friday - is that you don’t want to think about the big picture much then. The events of the week are still fresh in your brain, demanding your attention. I find it much more effective to think about this at a time when I’m more distant from the details.

Getting Chop-Happy with Axeman

March 16th, 2007

An important but oft-overlooked principle of software design is the aggressive culling of unused features. The best software products are slim and lean, with exactly the features their users need and few that they don’t. This is like weeding and pruning in your garden: without it, you’ll eventually be overrun.

The types of apps that I most commonly work on are internal applications used by perhaps a few hundred users in a single organization, or across several organizations. You’d think that with such a narrow audience, it would be easy to get information about what’s being used and what isn’t. Not as easy as it seems, though, because the users are not good at analyzing their own use. If you ask them whether a particular report is used, for example, they’ll make vague noises like “Oh yeah I clicked on that once” or “Huh I hadn’t seen that before, but I’ll definitely use it now that I know about it.” In most cases this stuff is not true, but it’s very hard to tell.

Historically my approach to this has been to cut a feature I think is unused and wait for someone to complain. This works well enough because my intuition is right 90% of the time; but the 10% of the time it is wrong, I can end up with cranky users. (In doing experimental cutting like this, I usually remove the link a few days prior to deleting the connected code. That makes it a cinch to put it back in when necessary.)

A burst of inspiration hit me the other evening. We don’t need to ask the users: the application should be able to track this! To this end, I’ve created Axeman. This is a tiny Rails plugin that tracks usage in a SQLite database, and displays a simple report with the results constrained by time. Screenshot:

In the left column is a traffic report, comparable to a web log analyzer like the venerable AWStats. The Axeman report is way simpler and doesn’t have any fancy graphs, though, so this isn’t too exciting. Besides, you can install AWStats or whatever to get this info about your Rails app. Where it gets more interesting is the right column.

Here we see controller actions that have not been accessed during the selected time period. These are determined by analyzing the source of your app/controllers directory, and cross-referencing it against the usage data.
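The cross-referencing idea can be sketched in a few lines. This is an illustration under stated assumptions, not Axeman's actual implementation: the helper names, the crude line-based scan, and the "controller#action" key format are all invented here, and a real version would read each file under app/controllers rather than an inline string.

```ruby
require 'set'

# Collect public method names from one controller's source text.
# Crude line-based scan: anything after a bare `private` or `protected`
# line is skipped until a bare `public` line appears.
def public_actions(source)
  actions = []
  visible = true
  source.each_line do |line|
    stripped = line.strip
    visible = false if %w[private protected].include?(stripped)
    visible = true if stripped == 'public'
    if visible && (m = stripped.match(/\Adef\s+([a-z_][a-zA-Z0-9_]*)/))
      actions << m[1]
    end
  end
  actions
end

# Actions defined in the source but absent from the usage data.
def unused_actions(controller, source, accessed)
  public_actions(source)
    .map { |action| "#{controller}##{action}" }
    .reject { |key| accessed.include?(key) }
end

books_source = <<~RUBY
  class BooksController < ApplicationController
    def list; end
    def destroy; end
    def sort_order; end

    private

    def new_and_edit; end
  end
RUBY

accessed = Set.new(["books#list"])   # stand-in for the usage table
puts unused_actions("books", books_source, accessed)
```

Run as written, this lists books#destroy and books#sort_order. The private new_and_edit is ignored, and list is filtered out because it appears in the usage data.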

As an aside, I think it’s interesting that this is only possible because of the structured and convention-based nature of modern application frameworks. Axeman is a very simple example, but I am hoping that as time progresses, we will see more self-aware / self-introspecting application components.

What does it mean when actions appear in the righthand column? Let’s look at the example shown in the screenshot. This is a tiny app and I didn’t expect there to be many dark corners, but as it turns out there’s quite a bit of dead code. First, there’s a bunch of account signup stuff which is unused - this was created by the generator for the login engine. It’s not used, so axe it.

Next, we see that categories and authors both have index actions which are never accessed. Looking at the code I see that these are just redirects to the list action. However, the list action seems to be linked directly, since that one appears in the lefthand column. No need for them then: the axe claims two more victims.

Books has a few unused actions. sort_order is a vestigial remnant of an ajax feature which is no longer used; it goes under the axe. destroy is a working method, but not linked anywhere; most likely created as part of CRUD, but then whoever did the UI didn’t feel that it was needed. We could link it, I suppose, but why bother? If the app has gone this long without anyone complaining that they can’t delete books, then there seems little need to maintain the code that implements the feature. Chop, chop. Last, it seems that there is some confusion about new, edit, and new_and_edit methods on the Books controller. Looking at the code I see that new_and_edit is called by both new and edit, but is never accessed directly by the user’s browser. Therefore it should be made private (Axeman ignores private and protected controller methods). With all of these changes, the Books controller is quite a bit cleaner.

Also on the executioner’s block should be methods with low hit counts, that is, ones that appear at the very bottom of the lefthand column. This requires more knowledge of the user story for each page than do completely unaccessed pages. For example, you could have a page which displays some tax information which is only accessed once a year by one person in the organization. Therefore a low hit count should be expected, and the page should not face the axe. But most other kinds of pages should probably be removed if they haven’t been accessed frequently. The default time period is 3 months, which I think is about the timespan in which you’d expect something to be accessed at least a few dozen times. If it’s only got one or two clicks, chances are good this was just someone who hit the wrong link, or perhaps was just curious. Truly useful pages will have hundreds or thousands of views, depending on the size of your user base.
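The low-hit-count rule, with its exemption for pages that are legitimately rare, might look something like the sketch below. Everything here is hypothetical: the `axe_candidates` name, the sample hit counts, and the threshold of 24 (a stand-in for "a few dozen") are illustrative and not taken from the plugin.

```ruby
# Return actions whose hit count falls below the threshold, least-used first,
# skipping any that are known to be rarely used on purpose.
def axe_candidates(hits, threshold: 24, exempt: [])
  hits.select { |action, count| count < threshold && !exempt.include?(action) }
      .sort_by { |_action, count| count }
      .map { |action, _count| action }
end

# Made-up hit counts for a three-month window.
hits = {
  "books#list"       => 1840,
  "books#show"       => 310,
  "reports#tax_form" => 1,   # expected: used once a year by one person
  "authors#export"   => 2
}

puts axe_candidates(hits, exempt: ["reports#tax_form"])
```

Only authors#export comes back. The exemption list is where knowledge of each page's user story goes, since the numbers alone can't tell a once-a-year page from a dead one.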

What about the idea that a page does offer useful features, but people don’t know about it? If you think that this is why it’s unused, then you need to find a better place to link it, or a better way to educate your users. The bottom line is that it doesn’t matter how theoretically useful a page is: if no one is using it, then it is not actually useful.

And keep in mind that this tool (and in fact, the entire concept of aggressive feature culling) is most effective not as a one-time event, but as a habit over time. A page which might have been extremely popular last year could fall into disuse when another page is added which provides similar but slightly improved functionality.

This plugin was the result of just an hour or two of hacking, but I’ve already been surprised at how useful it has been in my production apps. New ideas are suggesting themselves as I use it, including watching for unused partials, showing changes over time in a visual fashion, or even trying to look for unused model methods. For this last item, it’s been my experience that over time, models start to bristle with methods, many of which are remnants of historic functionality and no longer used, though this will be by far the hardest one to implement.

Another feature that I’ll try to add soon is a logfile analyzer which scans production.log in a manner similar to how AWStats processes Apache’s access.log. This will allow the importation of historic data, and will also make Axeman more suitable for use on high-traffic, public-facing sites, where hitting an external SQLite database on each pageview may not be acceptable.
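A minimal sketch of what such a log scan could look like follows. It assumes the "Processing SomeController#action (for IP at timestamp) [METHOD]" line shape that Rails wrote to production.log at the time. The regex, the `count_hits` name, and the sample log text are illustrative only, and a log format that differs would need a different pattern.

```ruby
# Assumed log line shape:
#   Processing BooksController#list (for 10.0.0.5 at 2007-03-16 09:12:44) [GET]
PROCESSING_LINE = /\AProcessing (\S+)#(\w+) \(for /

# Tally hits per "Controller#action" from raw log text.
def count_hits(log_text)
  counts = Hash.new(0)
  log_text.each_line do |line|
    next unless (m = PROCESSING_LINE.match(line))
    counts["#{m[1]}##{m[2]}"] += 1
  end
  counts
end

# Made-up sample; a real run would read the log file instead.
sample_log = <<~LOG
  Processing BooksController#list (for 10.0.0.5 at 2007-03-16 09:12:44) [GET]
    Parameters: {"action"=>"list", "controller"=>"books"}
  Completed in 0.01200 (83 reqs/sec) | 200 OK [http://example.com/books/list]
  Processing BooksController#list (for 10.0.0.9 at 2007-03-16 09:13:02) [GET]
  Processing AuthorsController#show (for 10.0.0.5 at 2007-03-16 09:13:10) [GET]
LOG

count_hits(sample_log).each { |action, n| puts "#{action}: #{n}" }
```

Because the tally is built offline from the log, the app itself never has to write to a usage database during a request.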

A subtle but powerful point that is driven home by the usefulness of this plugin is just how much design is an evolutionary process, not a one-time occurrence. Of course I know this, as do most of us, but I’m finding that Axeman makes it tangible. Here’s a piece of code which exists for no other reason than to help the application’s design change over time. The only other component I can think of that really acknowledges this is migrations, but these are more at the underlying technical level, rather than at the level of user-facing features.

New Gyre Screencast

March 8th, 2007

Exploring variables visually with the Gyre console.

Behavior-Driven Development

March 2nd, 2007

Behavior-Driven Development (BDD) has been hanging around, tugging gently at my brain for a few months now. Like most interesting ideas, it’s only a small evolutionary step from its predecessor, TDD. But I think it’s a subtly powerful concept, one that I’m starting to pay more and more attention to in my work. The introduction sums up the evolutionary shift at a personal level:

1. The developer starts writing unit tests around their code using a test framework like JUnit.
2. As the body of tests increases the developer begins to enjoy a strongly increased sense of confidence in their work.
3. At some point the developer has the insight (or are shown) that writing the tests before writing the code, helps them focus on only writing the code that they need.
4. The developer also notices that when they return to some code that they haven’t seen for a while, the tests serve to document how the code works.
5. A point of revelation occurs when the developer realises that writing tests in this way helps them to “discover” the API to their code. TDD has now become a design process.
6. Expertise in TDD begins to dawn at the point where the developer realizes that TDD is not about testing, it is about defining behaviour.
7. Behaviour is about the interactions between components of the system and so the use of mocking is a fundamental to advanced TDD.

Word.