Sunday, August 25, 2013

Inheritance is like Jenga

These last few days, I have been working on a piece of our codebase that had accidentally become very inheritance-heavy.

When it comes to inheritance versus composition, there are a few widely accepted rules of thumb out there. While 'prefer composition over inheritance' doesn't cover the nuances, it's not terrible advice; composition will statistically often be the better solution. Steve McConnell's 'composition defines a has-a relationship, while inheritance defines an is-a relationship' gives you a simpler and more nuanced tool to apply to a scenario. The Liskov substitution principle, which states that if S is a subtype of T, then objects of type T may be replaced with objects of type S without altering any of the desirable properties of the program, is probably the most complete advice.
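
As a small illustration of the is-a test - my own contrived example, not one from McConnell - a stack that inherits from a list exposes operations that break its contract, while a stack that merely has a list does not:

public class ListBackedStack : List<string>
{
    // Inherits Insert, RemoveAt, Clear... none of which a stack should expose
    public void Push(string item) { Add(item); }

    public string Pop()
    {
        var item = this[Count - 1];
        RemoveAt(Count - 1);
        return item;
    }
}

public class ComposedStack
{
    // Composition: the stack has a list, and only exposes what it supports
    private readonly List<string> _items = new List<string>();

    public void Push(string item) { _items.Add(item); }

    public string Pop()
    {
        var item = _items[_items.Count - 1];
        _items.RemoveAt(_items.Count - 1);
        return item;
    }
}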

Inheritance, when applied with the wrong motivations - reuse and such - often leads to fragile monster class hierarchies that are too big to wrap your head around and extremely hard to change.

When I was working on such a monstrous hierarchy, it reminded me of playing Jenga. Some time not that long ago, someone had built this tower from the ground up, laying block upon block, layer upon layer. On the surface it appears stable and rigid, but as soon as someone wants to winkle out one block, it becomes obvious that a single block can bring down the whole structure. The lower the block sits in the structure, the more layers rest on it, and the greater the risk of breaking everything on top. Even if you do succeed in pulling one block out, chances are you had to touch the surrounding blocks to keep the tower from tumbling over.

Instead of Jenga, I'd prefer a puzzle made of just a few pieces - one designed for toddlers. A puzzle is flat; you can see the big picture in one glance, yet you can reason about each piece individually as well. As long as the edges of the pieces fit together, you can assemble whatever picture you want.

Sunday, August 18, 2013

But I already wrote it

A few weeks ago, we set out to implement a feature that enabled back office users to set a new rate ahead of time. With our analyst and the user involved being out of the office for days, we had to rely solely on written requirements. Two of us skimmed the documents, but didn't take the time to make sure there wasn't any ambiguity - it looked trivial, really. I went back to what I was doing, while my colleague set out to implement the feature. Going over the implementation together the next day, it turned out he had built something a lot more advanced than I had anticipated. While I argued that this was a lot more than we needed, we agreed to wait for our analyst to return from her holiday and give us her feedback.

When our analyst returned, she confirmed that the implementation did a lot more than we needed. I suggested removing what we didn't really need. My colleague argued that he had already put in the effort to write it, and that we should just leave it as is.

I can relate to feeling good about freshly written code, but that shouldn't stop you from throwing it away. Code is just a means to an end; the side product of creating a solution or learning about a problem. If you really can't let go, treasure it in a gist.

In this particular scenario, one could argue that making the solution more advanced than it needs to be isn't a strong enough argument to make a big deal out of it. We're giving the users a little extra for free, right?
I cannot stress the importance of simplicity enough; to be simple is to be great, perfection is achieved not when there is nothing more to add but when there is nothing left to take away, and all of that. Nobody likes bulky software. Nobody likes fighting complexity all day.
But by only considering the cost of initially writing it, you ignore the true cost of what appears on the surface to be a small little extra. Users, developers, designers and analysts alike have yet another thing to wrap their heads around. More code is not a good thing: more code to test, more code to maintain. Each feature, however small it may seem, needs to be taken into account when planning new ones. Each feature, certainly an advanced one, makes the cost of training and support go up. The cost of implementing a feature is just a tiny portion of what it costs to support that feature through its entire lifetime.

Using this argument, I eventually succeeded in persuading my peer to dump the ballast. The real lesson for me, however, is probably that however trivial it might have seemed, we could have ruled out any possible ambiguity in advance by using one of the various tools at our disposal: a smallish whiteboard session, or maybe pairing on some high-level tests.

Thursday, August 15, 2013

Eventually consistent domain events with RavenDB and IronMQ

Working on side projects, I often find myself using RavenDB for storage and IronMQ for queueing. I wrote about that last one before here and here.

One project I'm working on right now makes use of domain events. As an example, I'll use the usual suspect: the BookingConfirmed event. When a booking has been confirmed, I want to notify my customer by sending him an email.

I want to avoid a scenario where persisting a booking fails because an eventhandler throws - say, because the mail server is unavailable. I also don't want an eventhandler to execute an operation that can't be rolled back - sending out an email - without first making sure the booking was persisted successfully. If an eventhandler fails, I want to give it the opportunity to fix what's wrong and retry.
public void Confirm()
{
    Status = BookingStatus.Accepted;

    Events.Raise(new BookingConfirmed(Id));
}
Get in line

The idea is, instead of dealing with the domain events in memory, to push them out to a queue so that eventhandlers can deal with them asynchronously. If we're trusting IronMQ with our queues, we get in trouble guaranteeing that the events aren't sent out unless the booking is persisted successfully; you can't make IronMQ enlist in a transaction.

Avoiding false events

To avoid pushing out events - and alerting our customer - without having successfully persisted the booking, I want to commit my events in the same transaction. Since IronMQ can't be enlisted in a transaction, we have to take a detour; instead of publishing the event directly, we're going to persist it as a RavenDB document. This guarantees the event is committed in the same transaction as the booking.
public class DomainEvent
{
    public DomainEvent(object body)
    {
        Guard.ForNull(body, "body");          
        
        Type = body.GetType();
        Body = body;
        Published = false;
        TimeStamp = DateTimeProvider.Now();
    }
    
    protected DomainEvent() { }

    // RavenDB assigns the Id when the document is stored
    public string Id { get; private set; }

    public DateTime TimeStamp { get; private set; }

    public Type Type { get; private set; }

    public object Body { get; private set; }

    public bool Published { get; private set; }

    public void MarkAsPublished()
    {
        Published = true;
    }
}

public class DomainEvents : IDomainEvents
{
    private readonly IDocumentSession _session;

    public DomainEvents(IDocumentSession session)
    {
        _session = session;
    }

    public void Raise<T>(T args) where T : IDomainEvent
    {       
        _session.Store(new DomainEvent(args));
    }
}
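
The Confirm() method above relies on a static Events facade and an IDomainEvent marker interface I haven't shown. A minimal sketch of how they could be wired - treat the ambient, thread-static Current as an illustration rather than the exact wiring:

public interface IDomainEvent { }

public interface IDomainEvents
{
    void Raise<T>(T args) where T : IDomainEvent;
}

public class BookingConfirmed : IDomainEvent
{
    public BookingConfirmed(string bookingId)
    {
        BookingId = bookingId;
    }

    public string BookingId { get; private set; }
}

// Hypothetical ambient facade; a DomainEvents instance wrapping the current
// RavenDB session would be assigned at the start of each request
public static class Events
{
    [ThreadStatic]
    private static IDomainEvents _current;

    public static void SetCurrent(IDomainEvents domainEvents)
    {
        _current = domainEvents;
    }

    public static void Raise<T>(T args) where T : IDomainEvent
    {
        _current.Raise(args);
    }
}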

Getting the events out

Now we still need to get the events out of RavenDB. Looking into this, I found it to be a very good use case for the Changes API. Using the Changes API, you can subscribe to all changes made to a certain document. If you're familiar with relational databases, the Changes API might remind you of triggers - except that the Changes API doesn't live in the database, nor does it run in the same transaction. In this scenario, I use it to listen for changes to the domain events collection. On every change, I'll load the document, push the content out to IronMQ, and mark it as published.
public class DomainEventPublisher
{
    private readonly IQueueFactory _queueFactory;
    
    public DomainEventPublisher(IQueueFactory queueFactory)
    {           
        _queueFactory = queueFactory;
    }

    public void Start()
    {
        // Subscribe to changes for every document whose id starts with 'DomainEvent'
        DocumentStore
          .Get()
          .Changes()
          .ForDocumentsStartingWith(typeof(DomainEvent).Name)
          .Subscribe(PublishDomainEvent);
    }

    private void PublishDomainEvent(DocumentChangeNotification change)
    {
        // Handle the notification on a background thread; only react to document puts
        Task.Factory.StartNew(() =>
        {
            if (change.Type != DocumentChangeTypes.Put)
                return;

            using (var session = DocumentStore.Get().OpenSession())
            {
                var domainEvent = session.Load<DomainEvent>(change.Id);

                // Guard against pushing the same event out twice
                if (domainEvent.Published)
                    return;

                // Push the event body to a queue named after the event type
                var queue = _queueFactory.CreateQueue(domainEvent.Type.Name);
                queue.Push(JsonConvert.SerializeObject(domainEvent.Body));

                domainEvent.MarkAsPublished();

                session.SaveChanges();
            }
        });
    }
}
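
The IQueueFactory the publisher depends on isn't shown here; its shape is roughly the following (names illustrative), with the concrete implementation being a thin wrapper around the IronMQ client:

public interface IQueue
{
    void Push(string message);
}

public interface IQueueFactory
{
    IQueue CreateQueue(string name);
}
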
I tested this by raising 10,000 events on my machine, and averaged pushing out 7 events a second. With an average of 250ms per request, the major culprit is posting messages to IronMQ. Since I'm posting these messages across the Atlantic, IronMQ is not really to blame. Once you get closer, response times go down to the 10ms - 100ms range.

A back-up plan

If the subscriber goes down, events won't be pushed out, so you need a back-up plan. I planned for missed events by scheduling a Quartz job that periodically queries for old unpublished domain events and publishes them, as sketched below.
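
A rough sketch of what that job could look like - the cut-off, batch size and query are assumptions of mine, and Quartz needs a custom job factory to inject the IQueueFactory through the constructor:

public class PublishOldDomainEventsJob : IJob
{
    private readonly IQueueFactory _queueFactory;

    public PublishOldDomainEventsJob(IQueueFactory queueFactory)
    {
        _queueFactory = queueFactory;
    }

    public void Execute(IJobExecutionContext context)
    {
        using (var session = DocumentStore.Get().OpenSession())
        {
            // Only pick up events the Changes API subscriber should already have handled
            var cutOff = DateTimeProvider.Now().AddMinutes(-5);

            var missedEvents = session
                .Query<DomainEvent>()
                .Where(e => e.Published == false && e.TimeStamp < cutOff)
                .Take(128)
                .ToList();

            foreach (var domainEvent in missedEvents)
            {
                // Same publishing logic as the Changes API subscriber
                var queue = _queueFactory.CreateQueue(domainEvent.Type.Name);
                queue.Push(JsonConvert.SerializeObject(domainEvent.Body));

                domainEvent.MarkAsPublished();
            }

            session.SaveChanges();
        }
    }
}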

In conclusion

You don't need expensive infrastructure or a framework to handle domain events in an eventually consistent fashion. Using RavenDB as an event store, the Changes API as an event listener, and IronMQ for queueing, we landed on a rather lightweight solution. It won't scale endlessly, but it doesn't have to either.

I'm interested in hearing which homegrown solutions you have come up with, or how I could improve mine.

Sunday, August 4, 2013

When your commands spell CUD

A good while ago, I blogged on commands (and queries). After exploring various flavors, I eventually settled on this one: commands, handlers and an in-memory bus that serves as a command executor.

Commands help you in supporting the ubiquitous language by explicitly capturing user intent at the boundaries of your system - think use cases. You can look at them as messages that are being sent to your domain. In this regard, they also serve as a layer over your domain - decoupling the inside from the outside, allowing you to gradually introduce concepts on the inside, without breaking the outside. The command executor gives you a nice pipeline you can take advantage of to centralize security, performance metrics, logging, session management and so on.
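
To make that shape concrete, here is a stripped-down sketch - the names and the resolver delegate are illustrative, not the actual implementation from that earlier post:

public interface ICommand { }

public interface IHandle<TCommand> where TCommand : ICommand
{
    void Handle(TCommand command);
}

public class RepaintCarCommand : ICommand
{
    public RepaintCarCommand(string carId, string color)
    {
        CarId = carId;
        Color = color;
    }

    public string CarId { get; private set; }

    public string Color { get; private set; }
}

public class RepaintCarHandler : IHandle<RepaintCarCommand>
{
    public void Handle(RepaintCarCommand command)
    {
        // Load the car from storage, repaint it, persist it...
    }
}

// The executor is the one pipeline every command passes through; this is
// where security checks, logging and performance metrics can be centralized
public class CommandExecutor
{
    private readonly Func<Type, object> _resolveHandler;

    public CommandExecutor(Func<Type, object> resolveHandler)
    {
        _resolveHandler = resolveHandler;
    }

    public void Execute<TCommand>(TCommand command) where TCommand : ICommand
    {
        // Cross-cutting concerns go here, before and after the handler runs
        var handler = (IHandle<TCommand>)_resolveHandler(typeof(IHandle<TCommand>));

        handler.Handle(command);
    }
}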

We always need to be critical of abstractions though, and regularly assess their value. A smell that might indicate commands aren't working for you, or are adding little value, is when the first letters of your commands spell CUD - Create, Update, Delete.

For example: CreateCarCommand, UpdateCarCommand and DeleteCarCommand.

The language needs attention

Possibly, your team hasn't fully grasped the power of cultivating the ubiquitous language. If you start listening to your domain experts, you might end up with totally different command names; TakeInNewCarCommand, RepaintCarCommand, InstallOptionCommand and RemoveCarFromFleetCommand.

Maybe, though, there is no language at all, and you're really just doing CRUD. If the context you're working on implements a generic or supporting subdomain, this might not be terrible.

If I'm doing CRUD, do I still need commands?

Commands help you decouple the inside from the outside. If there is no domain on the inside though, they can still help you decouple the application layer from other concerns. You might prefer another facade to separate those concerns, such as a thin service layer, but I don't think the service layer abstraction gives you anything commands don't. Maybe you don't find any value in separating things at all, and just dump everything in the application layer.

All of these approaches are valid; you just have to consider the trade-offs. With those last approaches, next to losing the decoupling of inside from outside, you also lose the central pipeline that a command executor gives you.

Doesn't my application layer give me this pipeline for free?

It sure can. Looking at modern web stacks, they all have interception points built in. For example: NancyFx allows you to hook into the request pipeline using before and after hooks; Web API gives you message handlers and action filters; and WCF has a concept of interceptors.

Not all application frameworks do though - think of frameworks targeting desktop software.

The advantage of having your own pipeline, instead of relying solely on your application framework's interception points, is that you're the boss: you don't have to study the ins and outs of each framework, or hope that the framework has thought of your needs. Also, when you need to support multiple application layers, you don't have to implement all the cross-cutting features twice.

What about using aspects instead of a pipeline to centralize all these concerns?

You can - instead of a pipeline - use aspects to take care of cross-cutting concerns such as security, logging and session management, and have them woven into your executable at compile time or runtime. I think of aspects as macros which save you lines of code, but which also often conceal the real problem: missing concepts. While they try to sell you on separation of concerns, notice that you're actually still producing procedural code; instead of writing it by hand, you're now letting an AOP framework do it for you. Add harder testing, debugging and readability to the bunch, and you can understand why I'm not a fan of AOP. There are scenarios where none of this matters much though, and to strengthen the cliché: granular logging is a good use case.

When your commands spell CUD, it might indicate you could do without them. Do realize what the consequences are of taking them away though:
  • you lose the opportunity to capture user intent at the boundaries of your system, to strengthen the ubiquitous language
  • you may need an alternative facade to decouple in from out
  • you lose that command executor serving as your own pipeline