Posts tagged ‘.NET’

LINQ 201 – Changing the way I work with lists

LINQ brings the set-based power of SQL-like syntax to VB.NET and C#. I thought I’d highlight some of the ways this has changed and simplified my code.

Find an element in a list

Pre-LINQ:
function FindItem(key as string)
  for each item in myList
    if item.Key = key then return item
  next
  return nothing ' no match
end function

dim foundItem = FindItem(key)
if foundItem isnot nothing then
  ' ...process item
else
  ' ...not found
end if
Post-LINQ:
dim foundItems = from item in myList where item.Key = key
if foundItems.Any then
  ' ...process item (foundItems.First)
else
  ' ...not found
end if
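
When only the first match matters, FirstOrDefault avoids evaluating the query twice (once for Any, once for First). A minimal sketch, assuming a hypothetical Item class with a Key property and a list that never stores Nothing:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module FindFirstExample
  ' Hypothetical element type; the post never shows the real one.
  Public Class Item
    Public Property Key As String
    Public Property Name As String
  End Class

  ' Returns the first item whose Key matches, or Nothing when there is none.
  Function FindItem(items As IEnumerable(Of Item), key As String) As Item
    Return items.FirstOrDefault(Function(item) item.Key = key)
  End Function

  Sub Main()
    Dim myList As New List(Of Item) From {
      New Item With {.Key = "a", .Name = "Apple"},
      New Item With {.Key = "b", .Name = "Banana"}
    }

    Dim foundItem = FindItem(myList, "b")
    If foundItem IsNot Nothing Then
      Console.WriteLine("Found " & foundItem.Name) ' Found Banana
    Else
      Console.WriteLine("Not found")
    End If
  End Sub
End Module
```

For value types FirstOrDefault returns the type's default value rather than Nothing, so the Any/First pattern is still the safer choice there.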

Find matching elements in a list

Pre-LINQ:
dim newList = new List(Of SomeClass)
for each item in oldList
  if item.SomeField = "Foo" then newList.Add(item)
next
Post-LINQ:
dim newList = from item in oldList where item.SomeField = "Foo"
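
One difference is worth knowing: the pre-LINQ loop builds its List immediately, while the query is only evaluated when something enumerates it. Appending ToList gives the same eager result as the loop. A minimal sketch, with SomeClass reduced to a hypothetical single-property class:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module FilterExample
  ' Hypothetical stand-in for the post's SomeClass.
  Public Class SomeClass
    Public Property SomeField As String
  End Class

  ' Eagerly copies the matching items into a new List, like the pre-LINQ loop.
  Function FindFoos(oldList As IEnumerable(Of SomeClass)) As List(Of SomeClass)
    Dim query = From item In oldList Where item.SomeField = "Foo"
    Return query.ToList()
  End Function

  Sub Main()
    Dim oldList As New List(Of SomeClass) From {
      New SomeClass With {.SomeField = "Foo"},
      New SomeClass With {.SomeField = "Bar"},
      New SomeClass With {.SomeField = "Foo"}
    }

    Dim lazyQuery = From item In oldList Where item.SomeField = "Foo"
    Dim snapshot = FindFoos(oldList)

    ' Items added from here on appear in the lazy query but not in the snapshot.
    oldList.Add(New SomeClass With {.SomeField = "Foo"})

    Console.WriteLine(snapshot.Count)    ' 2
    Console.WriteLine(lazyQuery.Count()) ' 3
  End Sub
End Module
```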

Convert a list of items of one type into another

Pre-LINQ:
dim newList = new List(Of NewType)
for each item in oldList
  dim newItem = new NewType
  newItem.Field1 = item.FieldA
  newItem.Field2 = item.FieldB
 
  newList.Add(newItem)
next
Post-LINQ:
dim newList = from item in oldList _
  select new NewType with _
  {.Field1 = item.FieldA, _
   .Field2 = item.FieldB}

Sorting a list

Pre-LINQ:
Way too hard. Suffice it to say, it involved creating a new IComparer class or
implementing IComparable in the existing class.
Post-LINQ:
dim sortedList = from item in myList _
  order by item.Field1

(or you could use a lambda expression, but I find the above more readable)
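
For comparison, here is a sketch of both forms side by side, assuming a hypothetical Item class whose Field1 is an Integer. Both end up calling the same OrderBy operator and leave the original list untouched.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SortExample
  ' Hypothetical element type with a sortable field.
  Public Class Item
    Public Property Field1 As Integer
  End Class

  ' Query syntax, as in the post.
  Function SortWithQuery(items As IEnumerable(Of Item)) As List(Of Item)
    Return (From item In items Order By item.Field1).ToList()
  End Function

  ' Equivalent method syntax using a lambda expression.
  Function SortWithLambda(items As IEnumerable(Of Item)) As List(Of Item)
    Return items.OrderBy(Function(item) item.Field1).ToList()
  End Function

  Sub Main()
    Dim myList As New List(Of Item) From {
      New Item With {.Field1 = 3},
      New Item With {.Field1 = 1},
      New Item With {.Field1 = 2}
    }

    For Each item In SortWithLambda(myList)
      Console.WriteLine(item.Field1) ' prints 1, 2, 3 on separate lines
    Next
  End Sub
End Module
```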

Transaction Semantics

I have been lurking in a recent discussion of using [Transaction]-like attributes in C# to indicate that certain methods can participate in, or require, a transaction. Castle ActiveRecord has another technique, which allows the user to specify a TransactionContext, like this:

using (new TransactionContext())
{
    // ...do stuff that will automatically be in the transaction
}

The problem with all of these techniques is that they are essentially procedural. Specifically, anything that you want to participate in the transaction has to be manually called as part of the call stack. Put another way, they fail to separate two concerns: atomic persistence, and the identification of what needs to be persisted (the unit of work). The result is that it becomes difficult to implement some aspects of persistence, leading to an increase in artificial complexity.

For example, an aspect of saving a deposit into an account is that there should be a “dual” entry in another account, and balances must be updated. A single aspect like this is somewhat manageable using the proposed semantics, but if you have just a few more, then they quickly lead to very wordy, procedural and possibly complex “Save” methods. You will also end up adding additional state variables to classes that contain “Save” methods, in order to support the logic of the save.

Another way of looking at this is as the problem of the typical “business entity” class that simply does too much. There are cross-cutting concerns that do not belong in one “business entity” or another. That is the major weakness of the ActiveRecord style of data access – when you take the world-view that every business entity is a table, you hamper your ability to work clearly with the aspects that are orthogonal or cross-cutting to the entities.

My own solution (there may be better ones) is to explicitly expose the unit of work, and have a technique that allows class instances to intelligently enlist into it. It’s worth describing in a little more detail. First, you need an interface that a class can implement to enlist in the work:

Interface IWorkEnlistee
  Sub Participate(work As UnitOfWork)
  ReadOnly Property UniqueKey() As String
End Interface

The Participate method is called just before the database Save, but after validation of user-data. The UniqueKey property is necessary to prevent two identical instances from participating (I usually just return the hash-code of some entity instance). You could add methods to the interface to get greater functionality, such as in-memory rollback.

For the dual account entry example, I would have an instance of the above interface that participates by adding the reverse entry and updating the balances. Any state data it needs will be passed in the constructor, which is called before the in-memory data is changed (so that it can get a clear before-picture). It will probably have several related state variables that otherwise would have found themselves complicating some other piece of code.
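
To make the idea concrete, here is a minimal sketch of how such a unit of work could hang together. Everything beyond IWorkEnlistee, Participate and UniqueKey is an assumption made for illustration: the UnitOfWork members (Enlist, Commit, PendingWrites), the Account class and DualEntryEnlistee are hypothetical, and a list of strings stands in for real database writes.

```vbnet
Imports System
Imports System.Collections.Generic

Module UnitOfWorkExample
  Public Interface IWorkEnlistee
    Sub Participate(work As UnitOfWork)
    ReadOnly Property UniqueKey() As String
  End Interface

  ' Hypothetical unit of work: collects enlistees and lets each add its writes before the save.
  Public Class UnitOfWork
    Private ReadOnly _enlistees As New Dictionary(Of String, IWorkEnlistee)
    Public ReadOnly PendingWrites As New List(Of String)

    ' Enlisting the same UniqueKey a second time is ignored.
    Public Sub Enlist(enlistee As IWorkEnlistee)
      If Not _enlistees.ContainsKey(enlistee.UniqueKey) Then
        _enlistees.Add(enlistee.UniqueKey, enlistee)
      End If
    End Sub

    ' Called after validation; a real version would then save PendingWrites in one transaction.
    Public Sub Commit()
      For Each enlistee In _enlistees.Values
        enlistee.Participate(Me)
      Next
    End Sub
  End Class

  Public Class Account
    Public Property Name As String
    Public Property Balance As Decimal
  End Class

  ' Cross-cutting aspect: mirrors a deposit with a reverse entry and updates both balances.
  Public Class DualEntryEnlistee
    Implements IWorkEnlistee

    Private ReadOnly _target As Account
    Private ReadOnly _contra As Account
    Private ReadOnly _amount As Decimal

    ' State is captured here, before any in-memory data changes.
    Public Sub New(target As Account, contra As Account, amount As Decimal)
      _target = target
      _contra = contra
      _amount = amount
    End Sub

    Public Sub Participate(work As UnitOfWork) Implements IWorkEnlistee.Participate
      _target.Balance += _amount
      _contra.Balance -= _amount
      work.PendingWrites.Add("credit " & _target.Name & " " & _amount.ToString())
      work.PendingWrites.Add("debit " & _contra.Name & " " & _amount.ToString())
    End Sub

    Public ReadOnly Property UniqueKey() As String Implements IWorkEnlistee.UniqueKey
      Get
        Return "dual-entry:" & _target.Name & ":" & _contra.Name
      End Get
    End Property
  End Class

  Sub Main()
    Dim savings As New Account With {.Name = "Savings", .Balance = 100D}
    Dim cash As New Account With {.Name = "Cash", .Balance = 500D}

    Dim work As New UnitOfWork()
    Dim entry As New DualEntryEnlistee(savings, cash, 25D)
    work.Enlist(entry)
    work.Enlist(entry) ' duplicate key, ignored
    work.Commit()

    Console.WriteLine(savings.Balance)          ' 125
    Console.WriteLine(cash.Balance)             ' 475
    Console.WriteLine(work.PendingWrites.Count) ' 2
  End Sub
End Module
```

In this sketch the code that handles the deposit only has to create the enlistee and hand it to the unit of work; it never needs to know that a reverse entry exists.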

Using this technique, the entity, presentation and flow logic of the application remains clean, and the cross-cutting aspects that participate in transactions are nicely separated and encapsulated.

Exception Handling 101

  1. In some older languages, the runtime was not able to expose a call stack when exceptions occurred. This led programmers to pepper their code with exception-handling code, so that they would have a better idea of where exceptions occurred.
  2. Historically, programmers have sometimes used exceptions to communicate status.

Neither of these two practices is appropriate any longer.

First, in a language which exposes an exception call stack (like C# or VB.NET), the only place you need an exception handler is at the top-most point of a thread. So, you need a try…catch around your application entry point (Main sub), and around every new thread you manually spawn.
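
A minimal sketch of that layout, assuming a console application. RunApplication, DoBackgroundWork and LogFailure are hypothetical placeholders for real application code:

```vbnet
Imports System
Imports System.Threading

Module TopLevelHandlers
  ' Hypothetical logger; ex.ToString() includes the message and the call stack.
  Sub LogFailure(source As String, ex As Exception)
    Console.Error.WriteLine(source & " failed: " & ex.ToString())
  End Sub

  ' Hypothetical application body with no exception handling of its own.
  Sub RunApplication()
    Dim worker As New Thread(AddressOf WorkerEntryPoint)
    worker.Start()
    worker.Join()
  End Sub

  ' Every manually spawned thread gets its own top-level handler.
  Sub WorkerEntryPoint()
    Try
      DoBackgroundWork()
    Catch ex As Exception
      LogFailure("Worker thread", ex)
    End Try
  End Sub

  Sub DoBackgroundWork()
    Throw New InvalidOperationException("Something went wrong in the background.")
  End Sub

  ' The application entry point is the only other place with a handler.
  Sub Main()
    Try
      RunApplication()
    Catch ex As Exception
      LogFailure("Application", ex)
    End Try
  End Sub
End Module
```

Nothing between the two handlers catches anything, so the logged call stack points straight at DoBackgroundWork.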

Second, using exceptions to communicate status is a terrible practice. I cannot really articulate why. Perhaps it is because it forces you to code everything very defensively, which obscures the intention of the code. In any case, experience shows that things are much simpler if you follow the rule that exceptions are only for exceptional scenarios. In other words, do everything you can to avoid having to trap exceptions.

This is harder than it sounds. It pairs well with the “fail fast” rule though. You can implement fail fast by validating the parameters to your methods. If they do not meet your expectations (e.g. something is null when it should not be), then throw an exception. The intention of the exception is always to indicate an invalid system state. (An invalid system state is unpredictable and dangerous, and should never occur – it is an exception.)
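
A minimal sketch of fail-fast parameter validation. FormatGreeting is a hypothetical method invented for the example; the exception types are the standard .NET ones for invalid arguments.

```vbnet
Imports System

Module FailFastExample
  ' Hypothetical method that validates its arguments before doing any work.
  Function FormatGreeting(name As String, repeatCount As Integer) As String
    If name Is Nothing Then Throw New ArgumentNullException("name")
    If repeatCount < 1 Then Throw New ArgumentOutOfRangeException("repeatCount", "Must be at least 1.")

    Dim result As String = ""
    For i As Integer = 1 To repeatCount
      result &= "Hello, " & name & "! "
    Next
    Return result.TrimEnd()
  End Function

  Sub Main()
    Console.WriteLine(FormatGreeting("World", 2)) ' Hello, World! Hello, World!
    ' FormatGreeting(Nothing, 1) would throw ArgumentNullException before any work is done.
  End Sub
End Module
```

Callers are not expected to catch these exceptions; they signal a bug in the calling code and should travel up to the top-level handler.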

Presenter-Model View with Controllers

At my current (soon to be gone) workplace, we have a unique style of doing our UI…

I think I’ll call what we have Presenter-Model View with Controllers. (There is a View and there is a very rich Presenter Model. There are Controllers too).

We mostly drop generic container controls onto forms with zero or minimal code. We have extended properties to be able to bind those controls at design-time. (The appearance is determined at run-time). We have bi-directional deep (multiple dots) data binding, which allows the view to be completely driven by the Presenter Model.

The Presenter Model is more than simply a device for binding a form. It is a first-class object in the system, used by security. It also supplies validation.

Underlying that, we have a custom O/R Mapper with integrated support for database structure evolution.

It took a long time to set that all up, and it saddens me that the product will die soon :(

HashTable of HashTables

Today I discovered that the effective limit of a HashTable with random hashes, if you are relying on those hashes being unique, is around 65,000 items. This is because the hash-key is a 4-byte integer (32 bits). The way the stats work out (this is the birthday problem), collisions become likely from about 2^(32/2) = 65,536 items. In many scenarios (mine included), that risk is too high!
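
The estimate comes from the birthday problem: with n random keys drawn from d possible values, the chance of at least one collision is approximately 1 - exp(-n(n-1)/(2d)). A small sketch of that arithmetic, assuming uniformly random 32-bit hashes (the function name is hypothetical):

```vbnet
Imports System

Module BirthdayBound
  ' Approximate probability that at least two of itemCount random hashes collide,
  ' given hashBits bits of hash (birthday-problem approximation).
  Function CollisionProbability(itemCount As Double, hashBits As Integer) As Double
    Dim buckets As Double = Math.Pow(2, hashBits)
    Return 1.0 - Math.Exp(-itemCount * (itemCount - 1) / (2 * buckets))
  End Function

  Sub Main()
    For Each n As Double In {1000.0, 10000.0, 65536.0, 200000.0}
      Console.WriteLine(n.ToString() & " items: " & CollisionProbability(n, 32).ToString("P1"))
    Next
  End Sub
End Module
```

Under this approximation the probability of a collision is already about 39% at 65,536 items, and it passes 50% at roughly 77,000 items.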

It is not hard to come up with a unique enough random hash-key. The problem is that the HashTable will only allow that key to be an integer. So my workaround is to create *two* keys instead. This will increase the limit to 2^(64/2) = 4294967296 items. If I have that many items in memory, I will have other problems!

Once you have two keys, use the first to key into HashTable-1. Then make each element of HashTable-1 a new HashTable, and use the 2nd key in that one. (In a totally random scenario, each 2nd-level HashTable will only have a single element. So it would be best to initialize it with that in mind, so as not to use too much memory).

My particular scenario (an identity map) uses .NET types as the first key, and database IDs as the second. This is not the best-case for randomness, but it is sufficient for my purposes (actually, guaranteed unique).
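
A minimal sketch of such a two-level identity map. It uses the generic Dictionary in place of HashTable (a substitution made here for type safety), the IdentityMap class and its member names are hypothetical, and Integer database IDs are assumed.

```vbnet
Imports System
Imports System.Collections.Generic

Module IdentityMapExample
  ' Hypothetical two-level identity map: the first key is the .NET type, the second the database ID.
  Public Class IdentityMap
    Private ReadOnly _byType As New Dictionary(Of Type, Dictionary(Of Integer, Object))

    Public Sub Add(entityType As Type, id As Integer, entity As Object)
      Dim byId As Dictionary(Of Integer, Object) = Nothing
      If Not _byType.TryGetValue(entityType, byId) Then
        ' Create the second-level table the first time a type is seen.
        byId = New Dictionary(Of Integer, Object)
        _byType.Add(entityType, byId)
      End If
      byId(id) = entity
    End Sub

    ' Returns the cached instance, or Nothing when it has not been added yet.
    Public Function Find(entityType As Type, id As Integer) As Object
      Dim byId As Dictionary(Of Integer, Object) = Nothing
      If _byType.TryGetValue(entityType, byId) Then
        Dim entity As Object = Nothing
        If byId.TryGetValue(id, entity) Then Return entity
      End If
      Return Nothing
    End Function
  End Class

  Sub Main()
    Dim map As New IdentityMap()
    map.Add(GetType(String), 42, "customer 42")

    Console.WriteLine(map.Find(GetType(String), 42))             ' customer 42
    Console.WriteLine(map.Find(GetType(String), 7) Is Nothing)   ' True
    Console.WriteLine(map.Find(GetType(Integer), 42) Is Nothing) ' True
  End Sub
End Module
```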