Friday, February 29, 2008

Quote of the day - Chris Wheeler on Values

Chris Wheeler on the Extreme Programming (XP) group:

I think values are important, but I'm almost certain I could derive XP's practices from Peace, Love, and Understanding as easily as they have been derived from the values that currently undergird them.

Incidentally, the XP group is *very* high volume. It's hard to keep up, but it's fun because there are so many names that I recognize from books and blogs - always some interesting discussion going on.

Thursday, February 28, 2008

Introducing Value-Driven Project Management

In my ramblings on Values, it came to me that it is relatively easy to redefine "traditional" and "agile" project management in terms of Values.

Traditional project management has the values of "fixed budget", "fixed scope", "fixed delivery date". Sure, the project manager can vary these (iron triangle), but they are the only values that traditional project management recognizes. (I know, that's an oversimplification of traditional project management.)

Contrast this with agile project management. It has values of "maintainability", "time-to-market", and "feature prioritization".

No wonder the two are so hard to reconcile! But that is not the point of this post.

I am proposing a new style of project management, "Value-Driven Project Management". This new style encompasses both Agile and traditional project management - they are two subsets that focus on specific sets of values.

At the core of Value-Driven Project Management is the notion that all of the practices of project management exist only to support the values. An example is in order...

Let's consider "risk". Traditional project management devotes large amounts of effort to risk management. This is because the values are inherently risky - "FIXED budget", "FIXED scope", "FIXED delivery date". Let us contrast that with Agile project management, which has far fewer risk management tools (spikes, tracer bullets). The simple Agile risk management techniques support the value of "time-to-market".

There are many more values than the ones I have mentioned. Too many to mention, but consider "high initial quality", "low development cost", "certain set of core features", "performance measures". Both traditional and agile project management fall short of addressing these values. And when you ignore them, they suffer. We need to help our clients to identify and prioritize their values. Only then can we truly design appropriate project management strategies.

Once we know the values, we need to be able to pick and choose our practices in a way that supports those values. We must pick practices that will combine to provide the client with a satisfying project experience, much as a chef must pick ingredients that combine to provide a satisfying meal.

Let me bash Agile project management for a moment. I have little doubt that Agile project management is one of the most efficient strategies for delivering software (because it prioritizes features). However, it does not directly address "quality" as a Value. Because of that, it is difficult for teams to know how much of the quality practices (TDD, code reviews, automated acceptance tests, formal QA) they should be applying. All it takes is a conversation to identify the value of quality to the client (relative to other values), and then a team can motivate the quality-effort.

That brings me to the final point of this post - it is all relative. The iron triangle is an ugly and blunt tool, but it is correct in the sense that you have to choose certain values above others. That is still the nature of project management.

Wednesday, February 27, 2008

On Values, Principles and Practices, and the suckiness of Project Management

I mentioned a few posts ago that I really liked the quote from Kent Beck

It is about values, out of which flow principles, out of which flow practices.

Some recent online discussions have made it clearer why this is. Values are things that we can all share. As a business analyst or a software architect, I can talk to a client about the value of long-term maintainability of the code, or the value of time-to-market, or the value of high initial quality, or the value of low development cost, or the value of a certain set of core features, or the value of performance measures.

As an experienced software professional, I can choose principles and practices that support the values that a client is interested in. I can explain to a client that there is a cost to applying certain practices, and he may have to weigh one value (e.g. maintainability) against another (e.g. time-to-market).

I use my own judgment in these tradeoffs, but I would much rather use hard data. I think as an industry, software development can mature if we get serious about identifying the various practices, their relative costs and their relationship with the values that a client is interested in. But that cannot happen yet, and I'll tell you why...

The iron triangle of project management grossly misrepresents the trade-offs, because it only understands the values of Time, Cost and Scope. You want to know why software quality is low? Because it has no choice - simplistic project management will steal from quality to pay scope or cost or time. Developers have very little real control over quality.

Some have suggested adding a fourth "quality" side to the triangle, or turning it into a pyramid. This misses the point. There are things that the client values. These are the things that we have to balance against each other to find the path to a successful project. Sorry, it's just not as simple as a polygon of any fixed number of sides.

It has been left to the developers to motivate good practices that support other values, but they can only apply them if they can do so without affecting the golden project management values. The end result is that fewer good practices are used, and many developers fight against good practices as much as project managers do.

The 8 T's in a developer's work-day

All developers do the following things in various proportions through the day (starting with T, for fun):
  • Typing
  • Thinking (design)
  • Twiddling thumbs (waiting)
  • Tinkering (experimenting)
  • Trying (debugging)
  • Talking (design, analysis)
  • Testing
  • Tweaking (refactoring)
Some add no direct value (Trying, Typing, Twiddling thumbs, maybe Tweaking) and some add value in moderation (the rest).

The value in this list is that you can use it to motivate the validity of various practices. For example, doing TDD will increase Typing, but it will almost eliminate Trying, significantly improve the efficiency of Thinking and do some Testing for you. For many scenarios, this means that the benefits outweigh the costs.

Filling in timesheets will increase Typing, but will not improve anything else.

Pair programming will increase Talking and Thinking, but maybe decrease Typing (only one person can type at a time).

Tuesday, February 26, 2008

List of Software Architecture Laws

There are several universally accepted software architecture laws. These have the characteristic that if you heed their principles, your software will be of better quality and will last longer. These are the ones I am aware of:

Law of Demeter
aka principle of least knowledge, aka only talk to your immediate friends, aka low coupling and high cohesion.
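
A minimal sketch of "only talk to your immediate friends". The Wallet and Customer classes and the charge function are hypothetical, invented purely for illustration:

```python
class Wallet:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        return amount


class Customer:
    def __init__(self, wallet):
        self._wallet = wallet  # an implementation detail of Customer

    def pay(self, amount):
        # Callers ask the customer to pay; they never reach into the wallet.
        return self._wallet.withdraw(amount)


def charge(customer, amount):
    # A violation would look like: customer._wallet.balance -= amount
    return customer.pay(amount)
```

Here charge knows about Customer only. If the wallet is later replaced by something else, charge does not change.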

Separation of Concerns
The notion that it is better to allow the code (and the developer) to concentrate on one concern at a time. This is the mother of many other principles, for example layering, or splitting software along logical lines.

Conway's Law
Any piece of software reflects the organizational structure that produced it. The cause is more sociological than technical. The antidote is better communication, or smaller teams.

Shalloway's Law
aka DRY. "When N things need to change and N>1, Shalloway will find at most N-1 of these things". I like Shalloway's version, because it manages to capture the essence of DRY, with the added subtlety that it is ok to duplicate stuff, as long as you don't have to manually change it.
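
One way to read the "ok to duplicate, as long as you don't change it manually" subtlety is sketched below. The field list and both helper functions are assumptions made up for this example. The field names end up in two places (an SQL statement and a set of form labels), but both are derived from a single list:

```python
# Single place where the fields are declared (hypothetical example data).
CUSTOMER_FIELDS = ["name", "email", "phone"]


def insert_sql(table, fields):
    """Derive an INSERT statement from the field list."""
    columns = ", ".join(fields)
    placeholders = ", ".join("?" for _ in fields)
    return f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"


def form_labels(fields):
    """Derive user-facing labels from the same field list."""
    return [name.replace("_", " ").title() for name in fields]
```

Adding a field now means N=1 thing to change, so there is nothing left for Shalloway to miss.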

Monday, February 25, 2008

Quote of the day - Kent Beck on defining XP

From Kent Beck on the XP yahoo group (emphasis mine):

XP is not defined by its practices. It is about values, out of which flow principles, out of which flow practices. Enough of software development is similar enough that, even in different circumstances, the practices derived from the same values and principles end up looking similar. Different values, different principles and you end up with different practices. They might still produce valuable software, but they wouldn't be XP.

Friday, February 22, 2008

Steve's 2nd Law of Good Software Architecture

My first rule of good software architecture dealt with ways of making a particular code-base last a long time. The focus of the 2nd rule is different - it assumes that a problem domain will be solved multiple times by different software, or multiple versions of the same software. It suggests ways that we can make each new re-solving of the problem easier than the last.

To review, my 2nd law of good software architecture is:

Keep as much information as possible in an accessible, declarative form. This will eliminate duplication, and enable your software to be discarded and re-written without losing quite as much.

This law is all about re-use of information, and describing how a particular problem domain can become better understood, even to the extreme where the "software" is just data.

We've all seen the tool-sets for generating entire applications - enter your requirements (mostly just your data structure) using vendor X's WonderMaker(tm) and lo and behold, out springs an application with handy generated forms for doing wonderful things. As it turns out, those wonderful things are pretty much Create, Read, Update and Delete. Not so useful after all.

Those generic tools do solve particular problems well - but it is usually not the problem we want to solve. Understandably, users demand more than just create, read, update and delete - they want to use the software to perform some task that meets their goals.

We can achieve the goal of a tool that generates most of an application, but only once we understand the problem domain well enough. We need to understand the domain, because we need to know what we can generate, and what we need to leave open to extension.

It is always a mistake to design a v1.0 system where logic is executed based on models of application logic. There are plenty of horror stories about the architect who thought he could model the business logic using XML. Don't be the next one. This post is about evolving your understanding of a particular domain to the point where you can create models with confidence they will work.

That said, even in version 1.0, there are some things we can recognize. The first is that there are at least two easily identified models of the system. From the user's perspective, there is the model that they understand and interact with. At the other end, there is the database. The important logic of the application sits between the user's model and the database. This is the origin of the old 3-tiered concept - UI + Application + Database.

What we have to realize is that we can model each of these things in a way that is declarative. In version 1.0, we may not understand the way the user wants to use the UI well enough to do much work in this regard. However, we can certainly model the database in declarative form, and have that model persist after version 1.0.

Modeling the Application layer is the last evolutionary step. You will not reach that point until the core application requirements are stable and well-understood.

For existing products, the process of modeling requires a re-write of some portion of the system. This is unavoidable, because you have to extract information from where it is hidden in the code, and represent it outside of the code. The code will no longer work. The good news is that once you have correctly modeled a part of the system, the model can be extended to capture new types of information, and need not be re-written again.


A re-usable database Model
So what does a re-usable database model look like?
  • It treats relationships as a first class concept - they have names, and they have attributes (one-to-many, cascade-delete behavior).
  • It describes fields in a rich, descriptive manner. Strings have maximum lengths, phone numbers are represented by a phone-number data type, etc.
  • It describes lookups (sets of values that are acceptable for a field)
  • It describes roles - how field values come together to represent a particular flavor of record that has meaning to the user. A particular flavor may be extended with additional fields and properties.
  • It should be directly and easily accessible to the rest of the code (re-usable).
  • It should be able to be transformed into something that the data access layer (or ORM tool) can use directly.
  • It should be able to be transformed into an empty database (it is complete).
The most obvious storage form of the model is as XML, because it is very accessible, and because it can represent hierarchical data. Other forms are ok, as long as they meet the above criteria.
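
As a rough sketch of the criteria above, here is a tiny XML model and a transformation into an empty SQLite database. The element and attribute names (model, lookup, table, field, relationship, max-length and so on) are my own invention for this example, not a standard format, and only a few of the listed features are covered:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical model format; every element and attribute name is an assumption.
MODEL_XML = """
<model>
  <lookup name="customer_status" values="active,inactive"/>
  <table name="customer">
    <field name="id" type="integer" key="true"/>
    <field name="name" type="string" max-length="80"/>
    <field name="status" type="string" max-length="10" lookup="customer_status"/>
  </table>
  <table name="invoice">
    <field name="id" type="integer" key="true"/>
    <field name="customer_id" type="integer"/>
    <relationship name="invoices_of_customer" field="customer_id"
                  references="customer" on-delete="cascade"/>
  </table>
</model>
"""

SQL_TYPES = {"integer": "INTEGER", "string": "TEXT"}


def to_ddl(model_xml):
    """Transform the declarative model into CREATE TABLE statements."""
    root = ET.fromstring(model_xml)
    lookups = {l.get("name"): l.get("values").split(",")
               for l in root.findall("lookup")}
    statements = []
    for table in root.findall("table"):
        parts = []
        for f in table.findall("field"):
            name = f.get("name")
            column = f"{name} {SQL_TYPES[f.get('type')]}"
            if f.get("key") == "true":
                column += " PRIMARY KEY"
            if f.get("max-length"):
                # Maximum lengths become CHECK constraints.
                column += f" CHECK (length({name}) <= {f.get('max-length')})"
            if f.get("lookup"):
                # Lookups become a list of acceptable values.
                allowed = ", ".join(f"'{v}'" for v in lookups[f.get("lookup")])
                column += f" CHECK ({name} IN ({allowed}))"
            parts.append(column)
        for rel in table.findall("relationship"):
            # Relationships are named in the model and carry delete behavior.
            parts.append(
                f"FOREIGN KEY ({rel.get('field')}) "
                f"REFERENCES {rel.get('references')}(id) "
                f"ON DELETE {rel.get('on-delete').upper()}"
            )
        statements.append(f"CREATE TABLE {table.get('name')} ({', '.join(parts)})")
    return statements


def create_empty_database(model_xml):
    """The model is complete enough to produce an empty database."""
    conn = sqlite3.connect(":memory:")
    for statement in to_ddl(model_xml):
        conn.execute(statement)
    return conn
```

The same parsed model could equally be handed to a data access layer or a UI, which is the point of keeping it outside the code.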

Why all of the richness? We want to capture as much information as possible in a single place. This allows us to make use of that information at higher layers of the application, in ways that enhance the user and the developer experience. DRY (Don't Repeat Yourself) is a powerful architectural technique.

Not all database structures represent the model we wish we had. We may have inherited a database, and it may be a horrible thing to behold. My first law of good software architecture applies - since we are exposing the model directly to the developer, we want it to be the one we wish we had. If the real database is too far from what we want, then we need to take steps to address that inconsistency. To do otherwise is to invite artificial complexity in the application code.


A re-usable UI Model
As mentioned previously, I do not expect that many version 1.0 products have a very good UI model. Still, if we can understand what a UI model looks like, then we can work towards it.

Firstly, a UI model has a relationship with the database model. The relationship is mapped - i.e. there is some automated transformation that can be used to relate a field on the UI back to one or more fields in the database. This is important, because the relationship is what allows us to re-use information defined at the database model (such as rich data types, lookups, maximum field lengths etc). If we're re-using information, then we are not duplicating it.
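
A small sketch of that mapping, using plain Python dictionaries instead of XML to keep it short. The field names, captions and the resolve and validate helpers are all hypothetical:

```python
# Hypothetical database-model metadata, keyed by table.field.
DB_FIELDS = {
    "customer.name": {"type": "string", "max_length": 80},
    "customer.phone": {"type": "phone", "max_length": 20},
}

# Hypothetical UI model: each UI field maps back to a database field.
UI_SCREEN = {
    "name": "Customer details",
    "fields": [
        {"caption": "Full name", "maps_to": "customer.name"},
        {"caption": "Telephone", "maps_to": "customer.phone"},
    ],
}


def resolve(screen, db_fields):
    """Merge each UI field with the metadata inherited from its database field."""
    return [{**db_fields[f["maps_to"]], **f} for f in screen["fields"]]


def validate(screen, db_fields, values):
    """Check user input against limits that are defined only in the database model."""
    errors = []
    for f in resolve(screen, db_fields):
        value = values.get(f["caption"], "")
        if len(value) > f["max_length"]:
            errors.append(f"{f['caption']}: at most {f['max_length']} characters")
    return errors
```

The maximum length of 80 appears once, in the database model, yet the screen enforces it.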

A UI model can grow in pieces. First, you can model screens, then larger pieces that describe how various screens fit together. Screen models are the easiest. (Even today, many applications make use of screen models).

Again, XML is a good choice for representing the UI model.

Beware of including layout information in the UI model. That is a different aspect that belongs in a different model. The primary purpose of the UI model is to bring together fields and screens in a way that represents how the user sees them. This may include their likely order on the screen, but should not include their actual co-ordinates.

UI models are re-usable in several ways. Security, Form Design, and Ad-hoc user queries are a few.


Layout of Forms (Views)
Form layout can be defined declaratively, but it is seldom worth the trouble to do that manually. We cannot predict the next evolution of UI well enough to design a representation that is good enough. The best you can probably do is favor form-design tools that save themselves declaratively (for example, XAML).

Your form layout should make use of the UI Model directly (via data binding and control-binding). Otherwise, you are just duplicating yourself. (Control-binding is the technique of having the final appearance of a particular control determined based on metadata. See the screen shots in my Egg UI post for an example).

Form layouts can be generated. This is a dangerous path, because it can limit your ability to satisfy the needs of the end-user.


Security
It is particularly useful to relate security to the UI model. One reason is that security is highly contextual - whether a user has rights to touch particular data elements can be driven by many factors, including the time of day. Another reason is that users need to understand security in order to effectively define it. The UI model's shared understanding of the user's perspective allows a good point of interaction for security.


Ad-hoc user queries
Often, we may want to expose the ability for users to query a database in some way that is fairly dynamic. A UI model that is mapped back to the database model provides a simple way to provide that feature.
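
To illustrate, here is a deliberately tiny sketch in which the user picks captions from the UI model and the mapping turns them into parameterized SQL. The captions, the operator names and build_query are assumptions for this example, and it handles a single table only:

```python
# Hypothetical mapping from user-visible captions to (table, column).
UI_TO_DB = {
    "Full name": ("customer", "name"),
    "City": ("customer", "city"),
}

OPERATORS = {"equals": "=", "starts with": "LIKE"}


def build_query(show, where):
    """show: list of captions; where: list of (caption, operator, value)."""
    tables = {UI_TO_DB[c][0] for c in show} | {UI_TO_DB[c][0] for c, _, _ in where}
    if len(tables) != 1:
        raise ValueError("this sketch supports exactly one table")
    columns = ", ".join(UI_TO_DB[c][1] for c in show)
    clauses, params = [], []
    for caption, op, value in where:
        clauses.append(f"{UI_TO_DB[caption][1]} {OPERATORS[op]} ?")
        # Values are passed as parameters, never spliced into the SQL text.
        params.append(value + "%" if op == "starts with" else value)
    sql = f"SELECT {columns} FROM {tables.pop()}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

The user only ever sees "Full name" and "City"; column names stay behind the mapping.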

Thursday, February 21, 2008

Steve's First Law of Good Software Architecture

First, I should touch on the intent of good architecture. The intent is to build something of quality that will last a long time. The "something" we build will not be static - it will be changed, and should be amenable to those changes without loss of quality. Small applications are easy to replace, rather than change - so good architecture is most relevant to medium to large applications.

To review, my first law of good architecture is:

Identify all core services to the application. Code against the interface of the service you wish you had, not to the implementation of the one you actually have.

Now when I talk of services here, I am specifically *not* talking about SOA. I am talking about all the pieces of your application that are not business logic. In this context, a "core service" is pretty much everything that is not the business logic itself. This includes the entire user interface, the database, the file system, and the application settings.


User Interface
Let's talk about the user interface as a service. This is a little-known technique, so I'll take the time to motivate it as best I can.

Consider this statement: Logic naturally wants to be at the points of control. By default, the main point of control of an application is its user interface. This is why it is so hard for developers to keep it out of there! For non-visual applications, this is still true - the logic wants to be on the edges. From an architectural perspective, this tendency is very dangerous - the user interface is the most likely part of the application to be discarded, and the hardest (practically impossible) to re-use.

One well-known technique for limiting the damage is layering of the user-interface on top of the application logic. With discipline, this can work well. However, good architecture does not assume discipline. It assumes team members of average talent at best, and structures the application so that they are as effective as possible. Layering is not the best answer.

Given that logic wants to be at the point of control, we can make a conscious decision to put the application logic in control. This will make the other parts of the application subservient to the application logic. As it turns out, that is the definition of a service - a part of the application that is subservient to another.

Another important characteristic of a service is that it has a well-defined API. So well defined in fact, that we can define its interface, and code against that interface rather than the actual service. So score 1 for the user interface as a service - it makes automated testing of business logic easy.

Of course, the user will still interact with the application - clicking on menus, entering data, and generally driving the flow of the application. However, they will be doing that within the context that the application logic has defined and supplied to the user interface.

In practice, this is a lot easier than it sounds. You can evolve the interface as you develop the application. Modern inversion of control techniques make it easy to inject the actual user interface at the time of execution, and to supply an appropriate context that the user interface can operate within.
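
A minimal sketch of the idea, assuming a made-up "close account" use case. UserInterface, CloseAccount and ScriptedUI are hypothetical names, and a real UI service would have a much wider interface:

```python
from typing import Protocol


class UserInterface(Protocol):
    """The UI service the application logic wishes it had (hypothetical)."""

    def ask_yes_no(self, question: str) -> bool: ...

    def show_message(self, text: str) -> None: ...


class CloseAccount:
    """Application logic: in control, and unaware of any concrete UI."""

    def __init__(self, ui: UserInterface):
        self._ui = ui  # the actual UI is injected at the time of execution

    def run(self, account: dict) -> bool:
        if account["balance"] != 0:
            self._ui.show_message("Account must be empty before closing.")
            return False
        if not self._ui.ask_yes_no(f"Close account {account['id']}?"):
            return False
        account["open"] = False
        self._ui.show_message("Account closed.")
        return True


class ScriptedUI:
    """Test double: answers questions from a script and records messages."""

    def __init__(self, answers):
        self._answers = list(answers)
        self.messages = []

    def ask_yes_no(self, question: str) -> bool:
        return self._answers.pop(0)

    def show_message(self, text: str) -> None:
        self.messages.append(text)
```

In a test, CloseAccount(ScriptedUI([True])).run(account) exercises the business logic with no forms involved; in production the same class would receive a real forms-based implementation.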


The File System
Some pre-built services, such as the file system, are very broad. Do we create an interface over that entire surface? No. We code against the interface we wish we had. The file system may provide the implementation, but we would be introducing unnecessary complexity if we dealt with the file system directly.
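
For example, suppose all the application really needs is to save and load named reports. A sketch of that wished-for interface follows; ReportStore, its two implementations and archive_summary are hypothetical names chosen for this illustration:

```python
from pathlib import Path
from typing import Protocol


class ReportStore(Protocol):
    """The narrow interface the application wishes it had (hypothetical)."""

    def save(self, name: str, text: str) -> None: ...

    def load(self, name: str) -> str: ...


class FileReportStore:
    """Implementation backed by the real file system."""

    def __init__(self, folder: Path):
        self._folder = folder

    def save(self, name: str, text: str) -> None:
        (self._folder / f"{name}.txt").write_text(text, encoding="utf-8")

    def load(self, name: str) -> str:
        return (self._folder / f"{name}.txt").read_text(encoding="utf-8")


class MemoryReportStore:
    """In-memory implementation, handy for tests."""

    def __init__(self):
        self._reports = {}

    def save(self, name: str, text: str) -> None:
        self._reports[name] = text

    def load(self, name: str) -> str:
        return self._reports[name]


def archive_summary(store: ReportStore, totals: dict) -> str:
    """Application logic sees only ReportStore, never paths or file handles."""
    text = "\n".join(f"{key}: {value}" for key, value in sorted(totals.items()))
    store.save("summary", text)
    return store.load("summary")
```

Paths, encodings and file extensions are decided in one place, and the application logic never learns that a file system exists.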


Settings
My own view is that you can combine settings with the user interface service. This is because settings can often be user-choices in one implementation, and settings in another, and hard-coded in yet another. There is no perceivable downside to having the user interface implementation control the settings.


The Database
It turns out, the most difficult aspect of the application to make into a service is the database. A database is like a pool of data. Most times, the interface we wish for is to be able to scoop up the data with a bucket, play with it, then throw the data back into the pool. A simple data access layer can be good enough for this.

Sometimes we want more - for example, we may want to have data access run on a different application server, or be scaled across multiple servers. We may want to provide the ability to have the application run disconnected from a server. Or we may want to totally insulate the application from the data structure or vendor. These are all up-front choices we must make. All come with a cost. In the more expensive cases, the interface we wish for will be more service-like than a simple data access layer would.

If we do have to make data access into a service, we should still be sure to make it the service we wish we had. This implies that design of the service should be driven based on the needs of the application.


Conclusion
Good architecture puts the important logic at the center and treats the less important logic as subservient (services). We code against the services we wish we had, because to do otherwise introduces artificial complexity. (The implementation of the services can take care of translating back to the reality of the underlying provider).

"User Interface as-a-service" is a new concept. I have implemented it with success, although I didn't understand it then as well as I do now. Others have too - Cockburn's hexagonal architecture is a similar concept to what I have described. I think I will be writing more about it in later posts, because I have treated it too high-level here. People will want to know how to actually do it before they believe it is a good idea.

Wednesday, February 20, 2008

Introducing the Egg UI Pattern

For the longest time, I have been trying to characterize the UI design I have been using for the last few years. I think I finally have a handle on explaining it. I hope you find it as interesting as I do.

To summarize up front...the Egg UI pattern is a technique for creating a rich user interface that is decoupled from specific application logic, but coupled to a large piece of infrastructure code. Some business value is invested in a common infrastructure, enabling the business to quickly add more modules that behave in similar ways. For a particular module, business value is invested in application logic, where it can be re-used independent of the infrastructure or UI. Almost zero business value is invested in the actual UI for a particular module.

If it sounds like I have frameworkitus, hold off on the judgment for a minute. A framework can be bad, because it risks coupling of your application logic to the framework. The Egg UI pattern does not do that. The UI is an egg, and the application logic is an egg. I'm calling them eggs, because they are self-contained (as opposed to layers, which have one-way dependencies).

In the diagram above, direct dependencies are shown as solid lines, and indirect dependencies are shown with dashed lines. The defining characteristic is that the UI is provided as a stateless service to the application. Everything else flows from that. The target platform is a rich-forms environment, for very large, modular applications that display and edit lots of data. (It may work for other environments too, but I have only used it in the one).

Some components:
  • Presenter - responsible for applying form-level logic, providing field metadata, and validation.
  • Menus - represent possible user actions. These may be rendered on the UI as buttons, or menus. They have captions, and metadata describing their required context.
  • Commands - represent the details of the actions that menus execute.
  • UI Service Interface - defines all of the activities that can be requested of the user interface. Also defines all settings that the application may need, and methods for sending messages to the user.
  • UI Service Implementation - an implementation of the user interface. Uses data binding and infrastructure code to interact with the context (mostly the Presenter and the Menus)
  • Views - Simple data forms, or pieces of more complex forms.
  • UI Model - metadata, representing a shared understanding of the structure of the data in the user's view of the system. Provides a means for the Views to be bound, and a mapping of UI fields to the database.
  • Context - A holder for any context that the UI may need. Includes a minimum of the Menus, the Presenter, and other supporting methods. The Application Egg owns the context, but the UI Egg can see it and add to it.
So what is this pattern good for? I'm glad you asked.

Most importantly, User Interface and Application logic are decoupled from each other as much as is feasible. This pattern is almost at the extreme end of user interface decoupling. Any further and the forms would be drawing themselves (not a good thing, in my experience).

This decoupling provides an environment where it is very, very obvious to the developers where their code should go (hint - a presenter or a command). We can partially or completely re-work the UI infrastructure (e.g. Winforms => WPF) without concern for the application logic. We can extend the application logic (e.g. add additional user choices or change the types of fields) without touching the user interface code. We can test the Application without being concerned with the UI.
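
To make the "hint" concrete, here is a stripped-down sketch of an application egg. It is not the actual infrastructure described in this post; the class names follow the component list above, but every detail (the customer record, the field metadata keys, save_command) is an assumption for illustration:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Menu:
    """A possible user action; the UI decides how to render it."""
    caption: str
    command: Callable[["Context"], None]


@dataclass
class Context:
    """Owned by the application egg; visible to the UI egg."""
    presenter: "CustomerPresenter"
    menus: list = field(default_factory=list)


class CustomerPresenter:
    """Form-level logic, field metadata and validation (hypothetical)."""

    def __init__(self, record: dict):
        self.record = record
        self.saved = False

    def field_metadata(self) -> dict:
        return {"name": {"caption": "Name", "required": True, "max_length": 80}}

    def validate(self) -> list:
        errors = []
        for name, meta in self.field_metadata().items():
            value = self.record.get(name, "")
            if meta["required"] and not value:
                errors.append(f"{meta['caption']} is required")
            elif len(value) > meta["max_length"]:
                errors.append(f"{meta['caption']} is too long")
        return errors


def save_command(context: Context) -> None:
    """Command: the detail behind the 'Save' menu."""
    if not context.presenter.validate():
        context.presenter.saved = True


def build_context(record: dict) -> Context:
    """Assemble what the application hands to whichever UI is plugged in."""
    context = Context(presenter=CustomerPresenter(record))
    context.menus.append(Menu("Save", save_command))
    return context
```

A test can call build_context, invoke the "Save" menu's command and inspect the presenter, all without a form on screen.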

We can also repeat the pattern over and over in many modules that together comprise the application as a whole. In other words, it is amenable to vertical layering of the system, a factor which increases the workable size of the application by at least an order of magnitude.

There are many benefits, but there is also a big one-time cost - a significant amount of infrastructure (framework) code. This is necessary for any pattern where you want the user interface to be dumb. (And this user interface is particularly stupid). The UI needs to be able to act as a reflection of the application logic. This requires an investment in components that can read metadata and use that metadata to extend on the "drawn" user interface.

For example, this is a screen shot of a form in design mode:
Here is the same form at runtime:
And this is the user-code behind the form (the presenter contains all the meaningful code):
And here is the grid from which the form was accessed (no user-code):
And this is an intentionally blurred image of the context in which the grid was accessed (to demonstrate that this works at multiple levels, not just a simple master-detail example):
The infrastructure code has taken metadata, and used it to show labels, buttons, menus, images, treeviews, icons, dates, times, and dropdown controls. It also applies security, handles validation errors and generally gives a very rich user interaction experience. Unfortunately, this sort of power requires an investment. To me, that investment represents direct business advantage - in the ability to provide a unique, consistent experience with the richness and stability the users demand, while still leaving the door open to future possibilities.

Friday, February 15, 2008

Steve's Laws of Good Software Architecture


Meditate on these, and you may achieve some enlightenment :)


Steve's first law of good architecture:
Identify all core services to the application. Code against the interface of the service you wish you had, not to the implementation of the one you actually have.

Steve's second law of good architecture:
Keep as much information as possible in an accessible, declarative form. This will eliminate duplication, and enable your software to be discarded and re-written without losing quite as much.

Corollary to Steve's second law of good architecture:
A particular application domain is effectively solved (and no longer requires custom code) once all information about the application can be represented in declarative form.

Another corollary to Steve's second law of good architecture:
When using a tool to generate some or all of an application, ensure that the declarative data of the tool is stored in an accessible form.

Steve's third law of good architecture:
Usable components may evolve, but practical, re-usable components must be designed.

Corollary to Steve's third law of good architecture:
Component re-use is only practical once you have designed an approachable, stable interface to the component.


Enlightenment Image by Sakka, licensed under Creative Commons ShareAlike version 2.5

Tuesday, February 12, 2008

The difficult blue eyes logic puzzle

See also xkcd version, or on wikipedia. I found the original link on Damien Katz's blog. It's difficult for me, because there is a widely accepted solution that I could not grasp for quite some time. The wikipedia link includes the solution. Read one of the others if you just want the problem.

Here is the puzzle, followed closely by the solution:

On an island, there are 100 people who have blue eyes, and the rest of the people have green eyes. If a person ever knows herself to have blue eyes, she must leave the island at dawn the next day. Each person can see every other person's eye color, there are no mirrors, and there is no discussion of eye color. At some point, an outsider comes to the island and makes the following public announcement, heard and understood by all people on the island: "at least one of you has blue eyes". The problem: Assuming all persons on the island are truthful and completely logical, what is the eventual outcome?

The accepted solution is that all 100 of the blue-eyed people leave the island after 100 days. The short, misunderstood explanation is that the outsider introduced some "common knowledge" that was not there before, which allowed all the blue-eyed people to deduce their eye color.

The proof uses induction, and goes like this. If there were only 1 blue-eyed person (n=1), then he would see that there are no other blue-eyed people, and deduce that he is the one person the outsider mentioned. He would leave the island. If there were 2 blue-eyed people (n=2), then they would both see the other and expect the other to leave on day 1. When neither leaves the island after 1 day, they will each realize that they must be the "other one" with blue eyes, and leave together on day 2. Using induction, bla bla, 100 days later all blue-eyed people leave.

Let's look at that more closely.

The argument works for day 1. Fairly obvious. Blue eyed person sees no other blue eyes, so he knows he is the one and leaves.

The argument still works for day 2. At first it seems the 2nd blue-eyed person has no reason to assume he is the "other one". But he knows that there is more than one (one would have left after 1 day), and he can only see one (so he must be the other).

Consider a green-eyed person on day 2. He would also know that there is more than 1 blue-eyed person. But he can see 2 blue-eyed people, so he will do nothing. He will not know that he has green eyes - he will simply reserve his judgment until day 3.

Eventually day 100 comes (induction allows us to jump forward like that), and all blue-eyed people are confronted with the inevitable truth, and they leave.
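
The day-by-day reasoning above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `simulate` and the way each islander is reduced to "how many blue-eyed people I can see" are my own, and the rule it encodes is the one from the induction argument (after d-1 quiet dawns everyone knows there are at least d blue-eyed people, so anyone who sees fewer than d must be blue-eyed).

```python
def simulate(blue, green):
    """Return (day, eye colors of the people who leave on that day).

    Hypothetical helper for illustration only. Assumes the outsider's
    announcement is true, so there must be at least one blue-eyed person.
    """
    if blue < 1:
        raise ValueError("the announcement requires at least one blue-eyed person")

    # Each islander is described by their own color (which they do not know)
    # and by the number of blue-eyed people they can see.
    islanders = [("blue", blue - 1)] * blue + [("green", blue)] * green

    day = 0
    while True:
        day += 1
        # Nobody has left on any earlier day, so everyone has deduced that
        # there are at least `day` blue-eyed people on the island.
        # An islander who sees fewer than that must be blue-eyed.
        leavers = [color for color, seen in islanders if seen < day]
        if leavers:
            return day, leavers


day, leavers = simulate(blue=100, green=50)
print(day, len(leavers), set(leavers))
```

With 100 blue-eyed islanders the loop runs quietly for 99 days, and on day 100 exactly the blue-eyed people (who each see only 99) leave. The green-eyed people see 100 and never trigger the condition first.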

Further truths:
  • The 1st day pronouncement that someone has blue eyes appears to add no new knowledge. This is true for everything except the simplest case of a single blue-eyed person. The pronouncement is a device to assist in the induction proof. Really, they would simply leave 100 days after they got there, no outsider pronouncement necessary. (That is just harder to explain/prove).
  • In some versions of the puzzle, the person has to know their eye color to leave (the example above is limited to blue). In those versions, if all of them know there are only 2 eye colors, then on day 101, all green-eyed people will leave too. They would leave earlier if there were more than 1 and fewer than 100 of them, and then the blue-eyed people would leave one day later.

Monday, February 11, 2008

Matt Blodgett's First Law of Software Development

See Matt Blodgett's First Law of Software Development

A development process that involves any amount of tedium will eventually be done poorly or not at all.

I like that. To me, it is yet another argument for DRY (Don't Repeat Yourself), which I consider to be the most important aspect of long-term software quality.

If you are doing DRY, then you are not repeating yourself. Therefore, you are doing the least amount that you can in order to solve the problem. Any tedium is thus inherent in the problem, and could not be avoided.

(Of course, if you find or invent the right tool, you can also mitigate the remaining tedium. For example, using a diagramming tool to draw your database relationships rather than typing them in XML or SQL).

Tuesday, February 05, 2008

The unexpected benefits of Hibernate Query Language (HQL)

I started using Castle ActiveRecord last week, in an effort to refactor the data access layer of a medium-size application that I have to re-write (or at least rinse, DRY and repeat until it is maintainable).

Firstly, kudos to the Castle team. For the most part, they have made the simple things simple. In case you're not familiar, Castle ActiveRecord is an API that supports the ActiveRecord data access pattern. Rather than re-invent the wheel, they use the robust (but somewhat complex and under-documented) NHibernate O/R Mapper for their data access.

Anyway, on to the subject at hand - HQL = Hibernate Query Language. I was hesitant to use this at first, because it is after all *text*. I dislike using text-sql in my applications because there is no type-safety, and that limits my ability to change the code. Nevertheless, as part of the refactoring process, I decided to leave some of the SQL there, for now.

I did not want to learn HQL. It seemed a waste - I don't need to work with multiple database flavors - just with SQL Server. But after many struggles, I eventually did learn it. And I am glad, because I quickly (ok, slowly) realized that HQL addressed my biggest pet peeve regarding normal SQL - it treats relations as first class citizens. This leads to wonderful, easy to read and write queries, almost exactly like I thought they should look (see my previous post for my thoughts on that).

HQL - if you have been avoiding it, consider taking a second look. It's much more than just a way of abstracting the database implementation. It is SQL, the way it should have been.

Friday, February 01, 2008

The failure of SQL

I've been using relational databases for some time now - in fact, it's the only type of database I've used professionally. I've even had a go at writing my own.

Over the years, relational databases have not substantially changed. Sure, they manage themselves a bit better, and there are a few more data types (xml), but basically the way they expose the data is unchanged.

I would like to suggest that this model (exposed via the SQL language) is missing a major piece. Adding this piece would make the SQL language much more approachable, and make databases more self-documenting.

The missing piece is "relationships". It's hard to believe, but relational databases do not treat relationships as first-order objects. How would doing so change things? Well, consider the following valid SQL query:

SELECT oi.*
FROM orders o
INNER JOIN orderItems oi ON o.Id = oi.OrderId
WHERE o.Number = 123

Notice how I had to specify the details of the relationship (the INNER JOIN line). Now consider what would happen if SQL had named the relationship "Items". To be clear, the following SQL is not valid today:

SELECT orders.Items.*
WHERE Number = 123

The above version is far more approachable. Add some intellisense, and even a non-developer could write it.
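
To make the idea concrete, here is a minimal sketch of how a named relationship could be expanded into the join that today has to be written by hand. Everything in it is hypothetical: the `RELATIONSHIPS` registry, the `expand` function and the sample rows are my own illustration (run against an in-memory SQLite database), not a feature of any real database or tool.

```python
import sqlite3

# Hypothetical registry of named relationships:
# (parent table, relationship name) -> (child table, parent key, child key)
RELATIONSHIPS = {
    ("orders", "Items"): ("orderItems", "Id", "OrderId"),
}


def expand(parent, relationship, where_column):
    """Build the INNER JOIN that a name like orders.Items stands for.

    Identifiers are interpolated directly, so this is only safe with
    trusted, hard-coded names like the ones in this example.
    """
    child, parent_key, child_key = RELATIONSHIPS[(parent, relationship)]
    return (
        f"SELECT c.* FROM {parent} p "
        f"INNER JOIN {child} c ON p.{parent_key} = c.{child_key} "
        f"WHERE p.{where_column} = ? "
        f"ORDER BY c.Id"
    )


def demo():
    """Create sample tables and fetch the items of order number 123."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (Id INTEGER PRIMARY KEY, Number INTEGER);
        CREATE TABLE orderItems (Id INTEGER PRIMARY KEY, OrderId INTEGER, Name TEXT);
        INSERT INTO orders VALUES (1, 123), (2, 456);
        INSERT INTO orderItems VALUES (10, 1, 'widget'), (11, 1, 'gadget'), (12, 2, 'other');
        """
    )
    rows = conn.execute(expand("orders", "Items", "Number"), (123,)).fetchall()
    conn.close()
    return rows


print(demo())
```

The point of the sketch is that the join details live in one place (the registry) instead of being repeated in every query, which is roughly what the database itself could do if relationships had names.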

The simple fact is that SQL is in need of a major overhaul, but no-one cares. Developers don't care, because we have been ignored by the database vendors for so long that we have built whole mini-industries around abstracting the database. Database vendors don't care, because...well who knows...they live in their own little world. Report tool vendors should probably care, but in my experience, most have a serious lack of imagination.