Friday, December 15, 2017

Passing the AWS Certified Solutions Architect exam

Before last week, the only certification exam I had ever passed was the Microsoft .NET Framework Application Development Foundation certification, almost eight years ago. My manager back then thought getting certified was the best way for me to get a raise. It would be a win-win: I would learn something along the way, and the company would have less trouble keeping its Microsoft Gold partnership. As far as I remember, I spent a good six months reading, studying and memorizing this 794-page book. Although the book did teach me a fair amount of solid .NET framework internals, most of the time was spent force-feeding myself the ins and outs of framework APIs you only need once in a blue moon and should just Google when needed.

This time around though, it was my own decision to get certified. Mid-2016, our components were becoming more and more structured in a way that allowed us to deploy them away from our on-premises data center. Components that didn't own data bound to a specific territory by regulatory requirements, and that could tolerate some downtime, were the obvious candidates.

Moving some of our infrastructure to the cloud, we had a few goals in mind:
  • Take advantage of managed cloud services to reduce operational costs significantly.
  • Gain more freedom to scale up or down. The structure of the contracts with our data center (and regulations that require us to own our own racks) generally forces us to overprovision our infrastructure. Making changes halfway through a contract takes time and is costly.
  • Ease into learning how to run software in the cloud for when we move to other markets, or when we build services that have less strict territoriality constraints.

Getting started with AWS is easy enough. Starting an EC2 instance, attaching a disk, using a managed database and configuring a load balancer are child's play. But when it came to networking, security, fault-tolerance and the properties guaranteed by AWS, I had a lot of questions. I hoped to find answers by going over the AWS Certified Solutions Architect material. Why not set myself an artificial goal and get certified while I was at it?

After three months of studying, I passed the exam with a score of 95%. Here's a list of the resources I used, including how much money and time were spent.

Exam Blueprint

You should go over the exam blueprint to understand what AWS expects you to know to pass the exam. It's a good idea to revisit the document while studying and to tick off the domains you feel comfortable with.

Money spent: €0, time spent: 30 minutes

A Cloud Guru

A good collection of short videos and mini-exams covering all the topics needed to pass the exam. Details that require extra attention to pass the exam are highlighted throughout the course.

You can get the videos from A Cloud Guru for €99. I got them through Udemy for only €10. Worth every cent.

Money spent: €10, time spent: 26 hours

FAQs and Whitepapers

AWS advises you to read a specific set of whitepapers and FAQs. The material can be a bit dry, but it's extremely useful, not just for passing the exam, but also for avoiding nasty surprises in production.

Money spent: €0, time spent: 6 hours

AWS Open Guide

An open-source effort to document real-world experiences running environments on AWS.

Money spent: €0, time spent: 1 hour

Whizzlabs

Somewhere around 500 practice questions that tease out the topics you don't completely master. When I got a question wrong, I would read up on the topic and play around in the AWS console until I felt like I got it.

Although some of the questions on the actual exam were very similar, you can't pass it by just studying these practice questions.

Money spent: €20, time spent: 12 hours

Exam Guru

A mobile app affiliated with A Cloud Guru, containing more practice questions. These are less scenario-based and less in-depth; the Whizzlabs questions are much closer to what to expect on the actual exam. Disappointing, to be honest.

Money spent: €20, time spent: 2 hours

Test exam

A small set of questions provided by AWS in the style of the actual exam. This was very much a waste of time and money. Whizzlabs had copied all of these questions word for word.

Money spent: €20, time spent: 20 minutes

Test day

The least enjoyable part of the experience. 55 multiple-choice questions need to be answered in 80 minutes. Half of the questions are quite straightforward; the other half are more involved. For the longer questions, I would first read the answers and write down the options, then read the question and strike through the options that definitely were not part of the answer. This helped me focus on the important bits of the question and gain momentum, plowing through the questions at a steady pace. I finished with 25 minutes left.

Money spent: €135, time spent: 90 minutes. Extra money spent on parking in the city center of Brussels: €10, searching for a spot: 60 minutes

In short... Go over the material, practice, take notes, practice some more, review your notes until you get sick of them.

Sunday, July 30, 2017

Fast projections

Most EventStore client libraries allow you to subscribe to a stream by passing in a callback which is invoked when an event occurs (either a live or historic event).

Let's say we subscribe to a stream of a popular video service, and we want to project a read model that shows how many videos a viewer has watched. We don't care about the bookmarked videos for now.
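
To make the exercise concrete, here's how the events could be modeled in F#. The type and field names are my own assumptions, not necessarily those used in the complete script:

    // The two kinds of events we'll see on the stream; we only
    // project the views, not the bookmarks.
    type Event =
        | VideoViewed of viewerId : string
        | VideoBookmarked of viewerId : string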

We're sitting on top of storage that can execute either a single statement or a batch of statements.

The statements supported are limited:
  • Set a checkpoint
  • Increment the view count for a viewer

The storage engine exposes a method which calculates the cost of executed statements:
  • Executing a single statement costs 1 execution unit
  • Executing a batch also costs 1 execution unit plus 0.1 execution unit per statement in the batch
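
A minimal sketch of such a storage engine in F#, matching the cost model above. The names and the mutable unit counter are assumptions of mine, just enough to measure the cost of each attempt:

    type Statement =
        | SetCheckpoint of position : int64
        | IncrementViewCount of viewerId : string * by : int

    // Keeps a running total of the cost of everything we execute.
    let mutable executionUnits = 0.0

    // Executing a single statement costs 1 execution unit.
    let execute (statement : Statement) =
        executionUnits <- executionUnits + 1.0

    // Executing a batch costs 1 execution unit, plus 0.1 per statement in it.
    let executeBatch (statements : Statement list) =
        executionUnits <- executionUnits + 1.0 + 0.1 * float (List.length statements)
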
The stream

For this exercise the stream contains 3500 historic views, 50 historic bookmarks and 100 live views.
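
One way to simulate that stream, building on the Event type above. Each recorded event carries a position to checkpoint at and a flag telling us whether it's historic or live; the viewer distribution is made up:

    type RecordedEvent =
        { Position : int64
          IsLive : bool
          Event : Event }

    let stream : RecordedEvent list =
        let historic =
            [ for i in 1 .. 3500 -> VideoViewed (sprintf "viewer-%d" (i % 5)) ]
            @ [ for i in 1 .. 50 -> VideoBookmarked (sprintf "viewer-%d" (i % 5)) ]
        let live =
            [ for i in 1 .. 100 -> VideoViewed (sprintf "viewer-%d" (i % 5)) ]
        (historic |> List.map (fun e -> false, e)) @ (live |> List.map (fun e -> true, e))
        |> List.mapi (fun i (isLive, e) -> { Position = int64 i; IsLive = isLive; Event = e })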

First attempt

The first attempt at projecting the stream to state executes a statement for each event we're interested in and checkpoints after each event, even the ones we're not interested in.

The cost of this projection is high: 7250 execution units, even though there are only 3600 events we're interested in. We execute a statement for each event we handle (3600 of them) and a checkpoint after every event in the stream (all 3650).
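
A sketch of that first attempt, using the storage and stream sketches above:

    let firstAttempt () =
        executionUnits <- 0.0
        for recorded in stream do
            // Execute a statement for the events we handle...
            let statement =
                match recorded.Event with
                | VideoViewed viewer -> Some (IncrementViewCount (viewer, 1))
                | VideoBookmarked _ -> None
            statement |> Option.iter execute
            // ...but checkpoint after every single event, handled or not.
            execute (SetCheckpoint recorded.Position)
        // 3600 increments + 3650 checkpoints = 7250 execution units
        printfn "First attempt: %.1f execution units" executionUnits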

Less checkpointing

It's not hard to get rid of some of the checkpointing though.

The cost has improved, but only marginally: we saved 50 execution units (7200 instead of 7250) by no longer checkpointing after events we do not handle. Time for a bigger improvement.
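
The change is small: only checkpoint after events we actually handle.

    let lessCheckpointing () =
        executionUnits <- 0.0
        for recorded in stream do
            match recorded.Event with
            | VideoViewed viewer ->
                execute (IncrementViewCount (viewer, 1))
                execute (SetCheckpoint recorded.Position)
            | VideoBookmarked _ -> ()
        // 3600 increments + 3600 checkpoints = 7200 execution units
        printfn "Less checkpointing: %.1f execution units" executionUnits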

Batching

Instead of handling each event individually, we buffer them as they come in. While we're catching up on historic events, we only flush the buffer every 100 events. Once we're caught up, we flush on each event; we always want to make a best attempt at showing fresh data.

When the buffer gets flushed, events are mapped into a sequence of statements, which are sent in batch to the storage engine. The checkpoint is appended to the tail of the batch.

This approach makes a significant difference: execution cost has dropped by 93%! Batching the historic events makes replays much faster, but with some extra effort we can take this optimization even further.
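
A sketch of the buffering approach. The flush maps the buffered events into statements, appends a checkpoint for the last buffered position and sends everything as one batch:

    let batching () =
        executionUnits <- 0.0
        let buffer = ResizeArray<RecordedEvent>()
        let flush () =
            if buffer.Count > 0 then
                let statements =
                    [ for recorded in buffer do
                        match recorded.Event with
                        | VideoViewed viewer -> yield IncrementViewCount (viewer, 1)
                        | VideoBookmarked _ -> () ]
                // The checkpoint is appended to the tail of the batch.
                let checkpoint = SetCheckpoint (buffer.[buffer.Count - 1].Position)
                executeBatch (statements @ [ checkpoint ])
                buffer.Clear ()
        for recorded in stream do
            buffer.Add recorded
            // Flush every 100 events while catching up, on every event once live.
            if recorded.IsLive || buffer.Count >= 100 then flush ()
        flush ()
        // Roughly 510 execution units with the simulated stream above.
        printfn "Batching: %.1f execution units" executionUnits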

Batching with transformation

It always pays off to understand the guarantees and intricacies of the storage you're using. Looking closely at the storage interface, we find that we can increment the view count by any number. If we use a local data structure to aggregate the view count up front, we can reduce the number of statements even further.

In practice, we filter for the events we're interested in, group by the viewer id, count the values and map that into a single statement per viewer.

This further reduces the cost by more than two-thirds. The optimization makes the code a bit more elaborate, but not that much more complex; it's still a local optimization.
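
Reusing the batching loop from the previous sketch, only the flush changes: aggregate first, then emit one increment statement per viewer. How much this saves depends entirely on how many distinct viewers end up in each buffer; with the made-up distribution above the result lands in the same ballpark as the numbers below, not on them exactly.

    let flushAggregated (buffer : ResizeArray<RecordedEvent>) =
        if buffer.Count > 0 then
            let statements =
                buffer
                // Filter for the events we're interested in.
                |> Seq.choose (fun recorded ->
                    match recorded.Event with
                    | VideoViewed viewer -> Some viewer
                    | VideoBookmarked _ -> None)
                // Group by viewer id and count.
                |> Seq.countBy id
                // One statement per viewer, incrementing by any number.
                |> Seq.map (fun (viewer, count) -> IncrementViewCount (viewer, count))
                |> List.ofSeq
            let checkpoint = SetCheckpoint (buffer.[buffer.Count - 1].Position)
            executeBatch (statements @ [ checkpoint ])
            buffer.Clear ()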


In three steps, we brought cost down from 7250 execution units to only 162 units. That makes me a 44x engineer, right?

In general, storage is one of the slowest components of your system. Making your system faster often involves making it do less work. Avoiding waste by batching, and doing a bit more work up front, can make a big impact when you want to make your projections faster.

You can find the complete F# script here.

Sunday, July 23, 2017

From human decisions, to suggestions to automated decisions

I've been wanting to share this experience for a while, but it took me some time to come up with a story and an example I could use in a blog post.

I help out during the weekends in a small family-run magic shop. I'm the third generation working in the shop. My great-grandfather always hoped that his only son would follow in his footsteps as a carpenter. But at only eighteen years old, my grandfather said goodbye to the chisels and sawdust, and set out for the big city to chase his dream of becoming a world-class magician. The first few years were tough; he was no Houdini. He would (hardly) get by performing at kids' birthday parties, weddings and store openings. That's how he met my late grandmother. She worked as a shop girl in one of the first malls built in the city, and happened to show up each time my grandfather performed in one of the stores. After getting married, having a baby (my dad) and saving every dime they earned, my grandfather was able to rent a hole in the wall and open up his own tiny magic shop - in that same mall. Once my dad finished school, he worked as a middle school teacher for a few years, giving up that job to join his father in the family business. He loves to tell you how he can still teach children, without the chore of grading their homework. I've been running around and helping out in the store since I could barely walk. I guess you can say that magic runs in our blood.

Since the beginning of time, our trade has relied on secrecy. However, due to the rise of the internet, magic is dying a slow death. Even the greatest of tricks and illusions are challenged and destroyed in the open by non-believers. Our craft is now reduced by many to a cheap fairground attraction.

My grandfather, even after suddenly losing grandma last year, isn't willing to give up on the business though. "There will be a time when the people need magic once more, and we will be waiting right here." He decided not to see modern technology as the nemesis of magic, but rather as a potential assistant.

Instead of fiddling in his study all night with a book of cards, a hat, a scarf and the lonely rabbit, I've been working with him, trying to show him what has happened in technology over the last 30 years. Being a programmer, I started by showing off some of my unfinished hobby projects experimenting with microcontrollers. I hadn't gotten much further than making some LEDs blink, controlled by my voice, but that was enough to spark my grandfather's creativity. "Can that chip make the lights go out? Can it blow smoke? Can it sense if I flip it around real fast?" One evening, tired after brainstorming and testing ideas all night, he told me "being able to tell these little computers what to do might not be magic, but a miracle".

I would like to tell you the details of what we came up with, but I'd have to make you disappear after I did. To make a long story short, it was an overwhelming success. Neighbourhood magicians picked it up, and even mere mortals thought it was a great gimmick to show off with. A friend of mine even told me he used our invention to pick up girls. It wasn't long before I got daily emails and tweets from all over the world, begging me to ship our gadget their way.

Thanks to some lovely open source software, I was able to set up a full-blown web shop in a matter of days. While orders started rolling in, we got a grip on how to actually produce our new product at a sufficient pace. Shipping overseas turned out to be surprisingly easy. What we didn't anticipate was how to handle returns and refunds. This is where the actual story starts...

When our usual customers visit the store, we take our sweet time to show them how to perform the trick. As a result, we know our customers pretty well and hardly ever have anyone return an item or ask for a refund. Admittedly, growing the same connection with our customers online hasn't been a success. A lot of them lack the magician mindset. You can't just buy magic; you have to put the practice in to make the magic happen. These mismatched expectations make for quite a few phoney complaints.

My dad, my grandfather and I have been taking turns at the chore of handling returns and refunds. This domain of the open-source shop is very much underdeveloped. Even something as simple as looking up a customer's details and order history requires scanning multiple pages of information, or even querying the database by hand. Making a well-informed decision takes up way more time than it should.

A use case specific view

The first thing I did in an attempt to speed up this process was to build a use-case-specific view. I asked my dad and grandfather which heuristics they use and which data is needed to feed those heuristics. To get the full picture as soon as possible, I imposed the rule that only this specific view could be used to make a decision. If a piece of data was lacking, I would add it the same day.

This process was more useful than I expected. We learned that we all used different heuristics, but were also victim to different biases. For example, I learned that my grandfather used to have a Dutch neighbour who would leave for work very early and slam the door so loudly that it woke my grandfather up each morning. He has held a dislike of the Dutch ever since. When customers were Belgian, he would lean much more towards issuing a refund, since he believes Belgians are less likely to lie about the cause of a broken item. We also discovered that we used different words for specific numbers. I would use "Items purchased sum", but my dad would use "Items purchased lifetime" for the total amount of money spent purchasing items. We decided to be more explicit and to compose all of those words into one name.

I ended up with a simple screen that rendered a read model that looked something like this.
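
Sketched as an F# record, with field names that are assumptions of mine, reconstructed from the vocabulary we settled on above:

    type ReturnRequestView =
        { OrderId : string
          CustomerName : string
          CustomerCountry : string
          CustomerSince : System.DateTime
          ItemsPurchasedLifetimeCount : int
          ItemsPurchasedLifetimeSum : decimal
          ReturnsRequestedLifetimeCount : int
          RefundsIssuedLifetimeCount : int
          ComplaintDescription : string }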

Based on the heuristics in our heads and a snapshot of the information available in the world, we would make a decision and click a button to execute a specific command.
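
The commands could be modeled as simply as this; the cases are made up for illustration:

    type Command =
        | IssueRefund of orderId : string
        | SendReplacement of orderId : string
        | DenyReturn of orderId : string * reason : string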

Making suggestions

By now, I had gotten quite interested in this domain. Each day I would go over all of the decisions and see whether I understood why each decision was made. I would call the shop each time I didn't understand, and scribble down notes whenever I discovered a new implicit rule. In the meantime, I started experimenting with codifying these rules to make automated decisions, but when I ran the older snapshots through my routine, the results were not 100% there yet. So instead of making automated decisions, I switched to making suggestions.

I rendered these suggestions on top of the existing view and observed the decisions that were made.
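
A hypothetical, heavily simplified version of such a routine: each codified rule contributes to a score, and the routine returns a suggested command together with a confidence. The rules and weights here are made up, using the view sketched earlier:

    let suggest (view : ReturnRequestView) : Command * float =
        let score =
            // Each rule that fires adds to the confidence in a refund.
            [ if view.RefundsIssuedLifetimeCount = 0 then yield 0.3
              if view.ItemsPurchasedLifetimeSum > 500m then yield 0.3
              if view.CustomerSince < System.DateTime.UtcNow.AddYears(-2) then yield 0.2 ]
            |> List.sum
        if score >= 0.5
        then IssueRefund view.OrderId, score
        else DenyReturn (view.OrderId, "Suspected misuse"), 1.0 - score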

My partners in crime were ecstatic about this new feature. They had a bit more room to breathe and could spend more time doing things they liked.

After observing and comparing the suggestions with the decisions made, I kept tweaking the routine. I got close, but I felt as if I wasn't quite there yet.

Automating decisions

It was my grandfather who eventually pushed me to fully automate making these decisions. He said "I almost always find myself picking the first option. It's fine if the machine is a bit off now and then. Just defer decisions and call in a human when the machine is not confident enough." How can I question my grandfather's wisdom? And so this happened...
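
A sketch of that last step, with a made-up confidence threshold. Below the threshold, the request is deferred to one of us, exactly as my grandfather suggested:

    type Outcome =
        | Automated of Command
        | DeferredToHuman of ReturnRequestView

    let decide (view : ReturnRequestView) =
        let command, confidence = suggest view
        if confidence >= 0.8
        then Automated command
        else DeferredToHuman view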

From human decisions, to suggestions, to automated decisions

With that, we've come full circle. I'm happy to report that my dad, grandfather and I are back to spending more time in my grandfather's study coming up with new tricks.

I regularly have a look at the data to check for anomalies and to tweak the routine a bit further. But even that is taking up less and less time. To be fair, I didn't invest in a full-blown test suite, even though this small routine has grown to 200 lines of code. I find much relief in the fact that when I replay past decisions, I hardly ever find a regression.

Maybe if we invent another popular trick like this one, we will acquire enough data to let the machine do the learning for me. For now, it's automagical enough.