Posts tagged ‘OO’

Transaction Semantics

I have been lurking in a recent discussion of using [Transaction]-like attributes in C# to indicate that certain methods can participate in, or require, a transaction. Castle ActiveRecord has another technique, which allows the user to specify a TransactionContext, like

using (new TransactionContext())
{
    // ...do stuff that will automatically be in the transaction
}

The problem with all of these techniques is that they are essentially procedural. Specifically, anything that you want to participate in the transaction has to be manually called as part of the call stack. Put another way, they fail to separate the concerns of atomic persistence and the identification of what needs to be persisted (the unit of work). The result is that it becomes difficult to implement some aspects of persistence, leading to an increase in artificial complexity.

For example, an aspect of saving a deposit into an account is that there should be a “dual” entry in another account, and balances must be updated. A single aspect like this is somewhat manageable using the proposed semantics, but if you have just a few more, then they quickly lead to very wordy, procedural and possibly complex “Save” methods. You will also end up adding additional state variables to classes that contain “Save” methods, in order to support the logic of the save.

Another way of looking at this is as the problem of the typical “business entity” class that simply does too much. There are cross-cutting concerns that do not belong in one “business entity” or another. That is the major weakness of the ActiveRecord-style of data access – when you take the world-view that every business entity is a table, then you encumber your ability to clearly work with the aspects that are orthogonal or cross-cutting to the entities.

My own solution (there may be better ones) is to explicitly expose the unit of work, and have a technique that allows class instances to intelligently enlist into it. It's worth describing in a little more detail. First, you need an interface that a class can implement to enlist in the work:

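Interface IWorkEnlistee
    ' Called just before the database save, after user data has been validated.
    Sub Participate(ByVal work As UnitOfWork)
    ' Identifies the enlistee so that identical instances enlist only once.
    ReadOnly Property UniqueKey() As String
End Interface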

The Participate method is called just before the database Save, but after validation of user-data. The UniqueKey property is necessary to prevent two identical instances from participating (I usually just return the hash-code of some entity instance). You could add methods to the interface to get greater functionality, such as in-memory rollback.

For the dual account entry example, I would have an instance of the above interface that participates by adding the reverse entry and updating the balances. Any state data it needs will be passed in the constructor, which is called before the in-memory data is changed (so that it can get a clear before-picture). It will probably have several related state variables that otherwise would have found themselves complicating some other piece of code.
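The dual-entry example might be sketched roughly as follows. This is Python rather than VB.NET, purely to keep it short, and it is only an illustration under assumptions: the enlist and save methods, the pending list, the Account class and the deposit_id used to build the unique key are all hypothetical names, and the "database save" is simulated by returning the pending entries.

```python
from dataclasses import dataclass


class UnitOfWork:
    """Collects pending changes plus the enlistees that want to add to them."""

    def __init__(self):
        self.pending = []      # changes to be written in one transaction
        self._enlistees = {}   # unique_key -> enlistee, so duplicates are ignored

    def enlist(self, enlistee):
        # The first enlistee registered under a given key wins.
        self._enlistees.setdefault(enlistee.unique_key, enlistee)

    def save(self):
        # Let each enlistee contribute just before the (simulated) database save.
        for enlistee in list(self._enlistees.values()):
            enlistee.participate(self)
        saved, self.pending = self.pending, []
        self._enlistees.clear()
        return saved  # a real version would write these inside one transaction


@dataclass
class Account:
    name: str
    balance: int = 0


class DualEntryEnlistee:
    """Adds the reverse entry for a deposit and updates both balances."""

    def __init__(self, deposit_id, source, target, amount):
        # State is captured in the constructor, before in-memory data changes.
        self.deposit_id = deposit_id
        self.source, self.target, self.amount = source, target, amount

    @property
    def unique_key(self):
        return f"dual-entry:{self.deposit_id}"

    def participate(self, work):
        self.source.balance -= self.amount
        self.target.balance += self.amount
        work.pending.append((self.source.name, -self.amount))
        work.pending.append((self.target.name, self.amount))


cash = Account("cash", 100)
savings = Account("savings", 0)
work = UnitOfWork()
work.enlist(DualEntryEnlistee(1, cash, savings, 40))
work.enlist(DualEntryEnlistee(1, cash, savings, 40))  # same key, ignored
entries = work.save()
```

The code that records the deposit only has to enlist the object; the bookkeeping for the reverse entry and the balances lives in one small class of its own.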

Using this technique, the entity, presentation and flow logic of the application remains clean, and the cross-cutting aspects that participate in transactions are nicely separated and encapsulated.

What’s the most important aspect of long-term-quality software?

Just doodling… What’s the most important aspect of long-term-quality software? I’ll define long-term-quality software as a piece of software with a lifespan of many years, over which it can be extended and changed to suit new needs without compromising quality.

Some potential answers, with my best spur-of-the-moment arguments:

  • Strong typing – not just variable types, but any sort of type, like a database table. If something changes in the contract (field name changes, or a 1-1 relationship becomes 1-many), then I should be able to make the change and within minutes, know each of the places that are impacted in the code. The justified fear of making changes to a system is driven by the unknown impacts. If I know all of the impacts, then I am in a very strong place.
  • DRY (Don’t Repeat Yourself) – If logic is represented in multiple places, then someone will only change it in one, which will automatically create some inconsistency. If you are lucky, then the inconsistency will be noticed quickly. If you are not, then you will only find out later when the damage is done.
  • YAGNI (You Ain’t Gonna Need It) – Software is complex. At some point, the complexity becomes too much for us to fit in our minds at one time. The longer we can defer that point in time, the more maintainable (and learnable) the software will be. There are two distinct types of complexity though – inherent complexity (because the problem is complex) and artificial (unnecessary) complexity. By introducing functionality before we know for sure that we need it, we are creating artificial complexity. Thus, we will reach the point of too much complexity before we should have.
  • Minimized Coupling – The complexity of software is directly related to how big it is. When we couple things together, we are making something more monolithic, and thus harder to understand. We also cross a line that is difficult to un-cross (one coupling point is usually just the first of many). Minimized coupling is an antidote to complexity.

I think minimized coupling is perhaps the most important, because it has such a direct impact on complexity. I looove DRY though – it is addictive once you try it in earnest. Strong-typing is of limited use without DRY. YAGNI is good advice, although some take it too far.

Are there other candidates?

Exception Handling 101

  1. In some older languages, the runtime was not able to expose a call stack when exceptions occurred. This led programmers to pepper their code with exception-handling code, so that they would have a better idea of where exceptions occurred.
  2. Historically, programmers have sometimes used exceptions to communicate status.

Neither of these two practices is appropriate any longer.

First, in a language which exposes an exception call stack (like C# or VB.NET), the only place you need an exception handler is at the top-most point of a thread. So, you need a try…catch around your application entry point (Main sub), and around every new thread you manually spawn.
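As a rough illustration of that rule (in Python rather than C# or VB.NET, to keep it short), the sketch below has exactly two handlers: one around the entry point and one at the top of a manually spawned thread. The names guarded, report, worker and parse_quantity are made up for the example, and the failures list stands in for whatever logging a real application would use.

```python
import threading
import traceback

failures = []  # stands in for a log file in this sketch


def report(exc):
    # The traceback already records where the error happened.
    failures.append(
        "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    )


def guarded(target):
    """Wrap a thread body so that the one handler sits at the top of the thread."""
    def run(*args):
        try:
            target(*args)
        except Exception as exc:
            report(exc)
    return run


def parse_quantity(text):
    return int(text)  # no handler here; an error simply propagates upward


def worker(text):
    parse_quantity(text)


def main():
    thread = threading.Thread(target=guarded(worker), args=("not a number",))
    thread.start()
    thread.join()
    parse_quantity("also bad")


try:
    main()
except Exception as exc:  # the single handler around the entry point
    report(exc)
```

Both failures end up in the same place, each with a full call stack, and none of the lower-level functions needs a try block of its own.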

Second, using exceptions to communicate status is a terrible practice. I cannot really articulate why. Perhaps it is because it forces you to code everything very defensively, which obscures the intention of the code. In any case, experience shows that things are much simpler if you follow the rule that exceptions are only for exceptional scenarios. In other words, do everything you can to avoid having to trap exceptions.

This is harder than it sounds. It pairs well with the “fail fast” rule though. You can implement this rule by validating parameters to your methods. If they do not meet your expectations (e.g. something is NULL when it should not be), then throw an exception. The intention of the exception is always to indicate an invalid system state. (An invalid system state is unpredictable and dangerous, and should never occur – it is an exception).
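A small sketch of the two rules together, again in Python and with made-up names (find_customer and its parameters are assumptions for illustration): an ordinary outcome such as "no match" is reported through the return value, while arguments that should never occur cause an immediate exception.

```python
def find_customer(customers, customer_id):
    """Return the customer, or None when there is no match.

    'Not found' is an ordinary outcome, so it is a return value, not an exception.
    """
    # Fail fast: bad arguments mean the caller is in an invalid state.
    if customers is None:
        raise ValueError("customers must not be None")
    if not isinstance(customer_id, int) or customer_id <= 0:
        raise ValueError(
            f"customer_id must be a positive integer, got {customer_id!r}"
        )
    return customers.get(customer_id)


customers = {1: "Ada"}
found = find_customer(customers, 1)    # normal result
missing = find_customer(customers, 2)  # normal 'no match' status, no exception

# Caught here only to demonstrate the failure; in an application the
# exception would travel up to the top-level handler.
try:
    find_customer(None, 1)
    rejected = False
except ValueError:
    rejected = True
```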

API Design vs. OO Design

Traditional OO lore teaches us that objects are things that have both data and behavior. Blindly following this rule can lead us to make poor design choices, especially around what many refer to as “business objects”.

The pattern is that these objects already have data, so we seek to add behavior as well. In this way we can feel happy and content that we have a true “object”, and we are successful OO programmers.

The problem is that adding behavior as a sort of “suffix” to an object is ignoring a more important aspect of objects, which is that they should do one thing, and do it well. Add too many “suffix” behaviors, and pretty soon you can have a tightly coupled bowl of spaghetti.

This is not just theoretical – I have seen it happen, more than once. I’ve even been guilty of it.

So what is the solution? When we have classes that are primarily data, should we resist adding behavior?

My answer is “it depends”. To understand why, we need to take a small detour into API design…

Sometimes, programmers expect things to be a certain, simple way. They do not want to ask a FactoryLocator for an IObjectPersistorFactory, use that to get an IObjectPersistor, and finally tell the IObjectPersistor to Save their object to the database. They just want to write:

myObject.Save()
or
myObject.Load(id)

This ActiveRecord implementation is easy to write and easy to read. In short, it is good because it is a nice API for the client of the object. It has drawbacks (no transaction support, high risk of coupling to the database), but in many systems this API will be sufficient.
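For readers who have not met the pattern, a minimal sketch of what sits behind such an API might look like this. It is Python rather than .NET, the Customer class is invented for the example, and an in-memory dictionary stands in for the database table.

```python
class Customer:
    """ActiveRecord-style class: the object knows how to save and load itself."""

    _table = {}  # in-memory stand-in for a database table (assumption)

    def __init__(self, customer_id=None, name=""):
        self.customer_id = customer_id
        self.name = name

    def save(self):
        # The class talks to storage directly: convenient, but coupled to it.
        Customer._table[self.customer_id] = {"name": self.name}

    def load(self, customer_id):
        row = Customer._table[customer_id]
        self.customer_id = customer_id
        self.name = row["name"]


my_object = Customer(7, "Ada")
my_object.save()

copy = Customer()
copy.load(7)
```

The client code is two short calls, which is the appeal; the cost is that the storage details live inside the entity class itself.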

So the ActiveRecord “suffix” is mostly ok. What other behaviors can we add? How about validation? The save method should probably validate before it saves, so as to ensure we have good data in the database. How about some initial field values for new objects? And some event driven behavior – let field A be defaulted when field B changes? And we need properties for other objects. MyCustomer.Address.ZipCode works real nice. We can even lazy-load the Address property. Not too hard.

Hmm. Question. If we save the Customer object, should the Address save too? Probably. So we need to add some more code to the Save method for that.

etc. etc.

You get the picture (I hope). You can create a perfectly functional system in this way, but the coupling of all functionality to a single class will make it difficult to change in any substantial way. It will also have poor quality, because we are ignoring several key principles, such as DRY and Open-Closed.

There is only one way in which you can mitigate the problem: use code generation to produce your “business object” implementations. This mitigates the quality problems substantially (DRY does not apply to generated code). It also forces you either to state some things declaratively (such as required fields), or else to move them into their own dedicated area.