Why Do We Still Use Relational Databases in an OO Era?
Object-Relational Mappers (ORMs) have been a hot topic for, oh I don’t know, a decade now. Mapping an object-oriented data set onto a flat relational schema is unexpectedly difficult. Inheritence, polymorphism, and even simple deviations in type handling between SQL and your language of choice make ORMs a virtual minefield. Some have even said that ORMs are the Vietnam of computer science.
Really, though, it seems rather silly when you think about it. Why go to so very much trouble to map the data from an OO format and back again? Why not just keep the objects in object format all the time?
It’s been done. But it doesn’t work terribly well, and certainly doesn’t have anywhere near the robustness necessary for large and long-lifetime applications. Why not?
As it turns out, the limitations of a relational database are the very thing that make it so robust over the long term. There’s benefit to be gained by forcing the data to be turned into a simple, transparent, structured format every time you save it. Or indeed, to force an explicit saving step at all, instead of just persisting objects flagged to do so.
ZODB and its ilk are hauntingly similar to a core dump: easy to save and load, but fragile over any significant timespan. There’s no enforced strictness to data cleanliness when you can just dump everything out effortlessly.
Another case of embracing constraints? Maybe. I’d like to think that someday this longstanding dilemma will be resolved. Until then, thinking about ORMs this way will make me feel better about using them.
January 9th, 2007 at 6:19 am
Thinking on similar lines, does it mean C++/Java/C# was a wrong step forward from C?
The only reason why there hasn’t been any more progress in the DB domain could be because of Larry Ellison and Oracle..
January 9th, 2007 at 6:43 am
When I can query an OODB with a standard language to bring back a well defined subset of data quickly and efficiently I’ll worry about this. For the moment I keep my data in a RDBMS and I use SQL when I need efficiency and I use an ORM when I don’t.
Works for me
January 9th, 2007 at 10:05 am
I have been doing web stuff since 1999. With content objects, you don’t always know what kind of data there will be associated with it. Using relational databases, the only way to deal with this is to have a table “object id, parameter, value”. Or you need to change your tables constantly by adding new columns. It’s PITA to write SQL migration scripts and test them.
On the other hand, OO databases like ZODB you mention, give more flexibility. It’s true that this flexibility comes with a price of spaghetti under bad management (think about Perl of databases). Still, sometimes it doesn’t feel worth of going through all those steps in SQL -> (Business logic layer) -> Object -> Servlet -> HTML template output -> HTML chain just to get a new variable printed on a member home page.
Since we are dealing with objects anyway (who does procedural programming raises his hand now), hiding or getting rid of those extra data layers saves my typing time, making me more productive. It’s another matter if you are doing performance sensitive code, say, a YouTube clone, but most of us are not.
January 9th, 2007 at 10:35 am
It’s interesting that you would like a 37signals article on this topic without talking about Ruby on Rails. Have you tried using it at all. It has a really great ORM that lets you persist objects with ease. In fact chances are you’ll never have to write any sql at all unless you’re doing something really complicated and you can just deal with everything as native objects, calling a .save method when you want them to be written to the database.
January 9th, 2007 at 7:25 pm
My 2c got a bit big but here it is…
OODBMS vs RDBMS
- CJ Date & Type Inheritance
- Object Graph building/traversal
- Ad Hoc query support
- Inheritance & SQL3
- Updates in a Stateless environment
CJ Date & Type Inheritance::
—————————-
There is a view by some (CJ Date included) that the only thing OO brings to
the Relational model is Type Inheritance. SQL3 includes Type Inheritance
but its not widely supported by RDBMS Vendors. I wonder if this is still
the case?
Object Graph building/traversal::
———————————
OODBMS are generally designed to do this well (better than ORM/RDBMS).
ORM/RDBMS may be able to make some optimisations that make
this faster (closer to OODBMS) using Physical ROWID’s and fetch ahead
(of related data) type strategies.
OODBMS may have a much smaller performance benefit when the database is
very large (and OODBMS behaves more like RDBMS/ORM).
http://www.avaje.org/rowid.html
Ad Hoc query support::
———————-
RDBMS/SQL is very good at Ad Hoc queries. For me, the Relational approach
shines in this area (compared with both ORM or OODBMS).
Inheritance & SQL3::
——————–
It could be said that complex inheritance hierarchies are better handled
by OODBMS as opposed to ORM/RDBMS. For me this is largely due to
RDBMS vendors not implementing all of SQL3 - specifically the
Table Inheritance part.
Postgres and Virtuoso perhaps are leading the way here.
http://www.avaje.org/typeinheritance.html
Updates in a Stateless environment::
————————————
For stateless Web development it should be noted that a Relational SQL Update
has advantages over ORM (not sure about OODBMS). Specifically in avoiding
a fetch of the Object just to update(or delete) it and accidentally bypassing
Optimistic Concurrency Checking.
For me, this scenario is best handled by a simple SQL Update.
Check it out at…
http://www.avaje.org/ormupdate.html
January 10th, 2007 at 9:02 am
The relational model (and therefore most of the RDBMS implementations) are based on mathematical principles applied to information descriptions (normalization) and a minimal and coherent set of operations to specify data transformations.
This has nothing to do with a particular brand of DBMS or the arguably horrible thing called SQL, but it is a fundamental base defining the very nature of sets of related pieces of information.
You just can’t get away from that.
Object Orientation is more a way to organize large pieces of code in a human designer/maintainer friendly way.
These two things stand on their own, with their own focus and legitimacy. You can’t really argue that one is “better” or “an evolution” of the other… The challenge is to acknowledge the value what the two ideas bring to our work, and find a way to unify them. Understanding their different focus is a good start…
January 10th, 2007 at 12:13 pm
The problem is data format and data management. It is hard to constrict a project (long term) to an OODB metaphor when you do not know if future technologies will support it or even be able to interface with it. If you add to that the amount of accumulated skill available within organizatons for managing data directly (i.e. administration, migration, etc) then it is a risky proposition.
Ultimately, until the OODB vendors either make it a seemless integration for relational tools / drivers and support the same sophistication that sql provides you won’t see it. This is simply because developers typically are the only ones that understand an oo datamodel. However, they are not structurally in a position to admin it. DBAs are comfortable with relational. So, OODB companies have to market to them…and have been extremely unsuccessful there.
January 11th, 2007 at 4:39 pm
Interesting post. I think this topic needs to be explored more. In fact, there’s an interesting interview on the Astoria Software blog about XML content management systems versus relational databases and other solutions.
Check it out here: http://www.astoriablogs.com