Sunday, March 29, 2015

Checked errors in F#

In the land of C#, exceptions are king. By definition exceptions help us deal with "unexpected or exceptional situations that arise while a program is running". In that regard, we're often optimistic, overoptimistic. Most code bases treat errors as exceptional while they're often commonplace. We are so confident about the likelyhood of things going wrong, we don't even feel the need to communicate to consumers what might go wrong. If a consumer of a method wants to know what exceptions might be thrown, he needs to resort to reading the documentation (or source) and hope it's up-to-date.

Java on the other hand has a concept of unchecked and checked exceptions. Unchecked exceptions are exceptions that are caused by a programming mistake and should be left unhandled (null reference, division by zero, argument out of range etc); while checked exceptions are exceptions that your program might be able to recover from. They become part of the method signature and the Java compiler forces consumers to handle them explicitly.

While checked exceptions might bloat the method's contract and enlarge the API surface area, they might have every right to. Dealing with errors is an important part of programming. Having discoverable errors which require thoughtful care, should improve overall quality. Having said that, it also requires careful consideration from the designer to decide what's truly exceptional.

Coming up with something that can compete with the mechanics of checked exceptions in C# seems to be impossible. We could return a result with an error from a method, but the compiler doesn't force you to do anything with that result.

F# on the other hand doesn't allow for the result of an expression to be thrown away. That is, unless you explicitly ignore it, or bind it and leave it unused.

Let's look at an example. We start by defining two discriminated unions. The first type defines a generic result; it can either be success or failure. The second type defines all the errors that can be returned after deleting a file.

Then we write a function that deletes a file, but instead of throwing exceptions when an error occurs, it returns a specific error. When no errors occur, success is returned.

When I now use this function, the compiler will tell me that it has a return value which needs to ignored or binded.

While ignoring a result stands out, an unused binding is easier to go unnoticed. I wish the F# compiler had a flag to detect unused bindings.

Assuming I don't ingore the result, I can use pattern matching to address each error specifically.

By not including a wildcard pattern, extending the contract by adding errors will introduce a breaking change. We'll have to consider what to do with newly added errors.

For example, if I add the error PathTooLong, the compiler shows me this warning.

In summary, it might be more safe to be a bit less optimistic when it comes to errors. Instead of throwing exceptions, making errors part of the public interface, communicating errors explicitly, and handing responsibility on what to do with the error to the caller, might lead to more robust systems. While this can be achieved with C#, the mechanics are error-prone. Expressions and pattern matching make that F# allows for stronger, yet still not ideal, mechanics.  

Sunday, March 15, 2015

Scaling promotion codes

In our system a backoffice user can issue a promotion code for users to redeem. Redeeming a promotion code, a user receives a discount on his next purchase or a free gift. A promotion code is only active for a limited amount of time, can only be redeemed a limited amount of times and can only be redeemed once per user.

In code these requirements translated into a promotion code aggregate which would guard three invariants.

The command handler looked something like this.

Depending on the promotion code, we would often have a bunch of users doing this simultaneously, leading to a hot aggregate, leading to concurrency exceptions.

Studying the system, we discovered that the limit on the amount of times a promotion code could be redeemed was not being used in practice. Issued promotion codes all had the limit set to 999999. Just by looking at production usage, we were able to remove an invariant, saving us some trouble.

The next invariant we looked at, is the one that avoids users redeeming a promotion code multiple times. Instead of this being part of the big promotion code aggregate, a promotion code redemption now is a separate aggregate. The promotion code aggregate now picks up a new role; the role of a factory, it decides on the creation of new life.

The promotion code redemption's identifier is a composition of the promotion code identifier and the user identifier. Thus even when the aggregate is stored as a stream, we can check in a consistent fashion whether the aggregate (or stream) already exists, avoiding users redeeming a promotion code multiple times. On creation of the stream, the repository can pass to the event store that it expects no stream to be there yet, making absolutely sure we don't redeem twice. The event store would throw an exception when it would find a stream to already exist (think unique key constraint).

In this example, we were able to remove an annoying and expensive invariant by looking at the data. Even if we had to keep supporting promotion code depletion, we might have removed this invariant and replaced it with data fed into the aggregate/factory from the read model. Ask yourself, how big is the cost of having a few more people redeem a promotion code? Teasing apart the aggregate even further, we discovered that the promotion code had a second role; a creational role. It now helps us spawning promotion code redemptions while still making sure this only happens when the promotion code is active. Each promotion code redemption is now a new short-lived aggregate, while the promotion code itself stays untouched. By checking the existence of the aggregate up front and by using the stream name to enforce uniqueness, we avoid users redeeming a promotion code more than once. This has allowed us to completely avoid contention on the promotion code, making it perform without hiccups.