Sunday, July 30, 2017

Fast projections

Most EventStore client libraries allow you to subscribe to a stream by passing in a callback which is invoked when an event occurs (either a live or historic event).

Let's say we subscribe to a stream of a popular video service, and we want to project a read model that shows how many videos a viewer has watched. We don't care about the bookmarked videos for now.

We're sitting on top of storage that can execute a single statement and a batch of statements.

The statements supported are limited:
  • Set a checkpoint
  • Increment the view count for a viewer

The storage engine exposes a method which calculates the cost of executed statements:
  • Executing a single statement costs 1 execution unit
  • Executing a batch also costs 1 execution unit plus 0.1 execution unit per statement in the batch
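
A minimal sketch of this cost model, written in Python rather than the F# of the original script. The `Storage` class and its method names are assumptions made for illustration; it only tracks cost and does not persist anything.

```python
class Storage:
    """Hypothetical stand-in for the storage engine; it only tracks cost."""

    def __init__(self):
        self.cost = 0.0

    def execute(self, statement):
        # A single statement costs 1 execution unit.
        self.cost += 1.0

    def execute_batch(self, statements):
        # A batch costs 1 execution unit plus 0.1 unit per statement in it.
        self.cost += 1.0 + 0.1 * len(statements)


storage = Storage()
storage.execute(("checkpoint", 1))
storage.execute_batch([("increment", "viewer-1", 1), ("checkpoint", 2)])
print(round(storage.cost, 1))  # 1 + (1 + 2 * 0.1) = 2.2
```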

The stream

For this exercise the stream contains 3500 historic views, 50 historic bookmarks and 100 live views.
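
Such a stream could be modelled as follows. This is only a sketch: the event shape, the `build_stream` name, the single viewer id and the even spread of bookmarks through the historic part are all assumptions, since the text only gives the event counts.

```python
from collections import Counter


def build_stream():
    """Hypothetical stream: 3500 historic views, 50 historic bookmarks, 100 live views."""
    events = []
    for i in range(3650):
        live = i >= 3550  # the last 100 events arrive once we're caught up
        # Assumption: one bookmark after every 70 historic views.
        kind = "bookmarked" if not live and i % 71 == 70 else "viewed"
        events.append({"position": i + 1, "kind": kind, "viewer": "viewer-1", "live": live})
    return events


stream = build_stream()
counts = Counter((event["kind"], event["live"]) for event in stream)
print(counts[("viewed", False)], counts[("bookmarked", False)], counts[("viewed", True)])
```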

First attempt

The first attempt at projecting the stream to state executes a statement for each event we're interested in and checkpoints after each event (even the ones we're not interested in).

The cost of this projection is high: 7250 execution units - even though there are only 3600 events we're interested in. We execute a statement for each event we handled and checkpoint immediately after, even for the events we didn't handle.
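
A rough Python sketch of this first attempt (the original script is F#). `Storage`, `build_stream` and the event shape are hypothetical helpers that only model the cost rules and event counts given above; all views are assumed to belong to one viewer.

```python
class Storage:
    """Hypothetical storage stand-in that only tracks cost."""

    def __init__(self):
        self.cost = 0.0

    def execute(self, statement):
        self.cost += 1.0  # single statement: 1 execution unit


def build_stream():
    # 3550 historic events (50 of them bookmarks), followed by 100 live views.
    return [
        {"position": i + 1,
         "kind": "bookmarked" if i < 3550 and i % 71 == 70 else "viewed",
         "viewer": "viewer-1",
         "live": i >= 3550}
        for i in range(3650)
    ]


def project(events, storage):
    for event in events:
        if event["kind"] == "viewed":
            storage.execute(("increment", event["viewer"], 1))
        # Checkpoint after every event, handled or not.
        storage.execute(("checkpoint", event["position"]))


storage = Storage()
project(build_stream(), storage)
print(storage.cost)  # 3600 increments + 3650 checkpoints = 7250.0
```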

Less checkpointing

It's not hard to get rid of some of the checkpointing though.
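
One way to do that, sketched in Python rather than the F# of the original script: only checkpoint after an event that was actually handled. `Storage`, `build_stream` and the event shape are hypothetical helpers modelling the cost rules and event counts given above.

```python
class Storage:
    """Hypothetical storage stand-in that only tracks cost."""

    def __init__(self):
        self.cost = 0.0

    def execute(self, statement):
        self.cost += 1.0  # single statement: 1 execution unit


def build_stream():
    # 3550 historic events (50 of them bookmarks), followed by 100 live views.
    return [
        {"position": i + 1,
         "kind": "bookmarked" if i < 3550 and i % 71 == 70 else "viewed",
         "viewer": "viewer-1",
         "live": i >= 3550}
        for i in range(3650)
    ]


def project(events, storage):
    for event in events:
        if event["kind"] != "viewed":
            continue  # skipped events cost nothing, not even a checkpoint
        storage.execute(("increment", event["viewer"], 1))
        storage.execute(("checkpoint", event["position"]))


storage = Storage()
project(build_stream(), storage)
print(storage.cost)  # 3600 increments + 3600 checkpoints = 7200.0
```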

The cost has improved, but only marginally. We saved 50 execution units by avoiding checkpointing after events we do not handle. Time for a bigger improvement...


Instead of handling each event individually, we will buffer them as soon as they come in. When we're catching up and seeing historic events, we only flush the buffer every 100 events. When we're caught up, we flush on each event. We want to always make a best attempt at showing fresh data.

When the buffer gets flushed, events are mapped into a sequence of statements, which are sent in batch to the storage engine. The checkpoint is appended to the tail of the batch.
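
The buffering and flushing described above might look like this in Python (the original script is F#). `Storage`, `build_stream`, `flush` and the `live` flag on events are assumptions made for this sketch; in particular, it assumes skipped events are buffered too and that the first live event flushes whatever is still buffered.

```python
class Storage:
    """Hypothetical storage stand-in that only tracks cost."""

    def __init__(self):
        self.cost = 0.0

    def execute_batch(self, statements):
        self.cost += 1.0 + 0.1 * len(statements)  # 1 unit + 0.1 per statement


def build_stream():
    # 3550 historic events (50 of them bookmarks), followed by 100 live views.
    return [
        {"position": i + 1,
         "kind": "bookmarked" if i < 3550 and i % 71 == 70 else "viewed",
         "viewer": "viewer-1",
         "live": i >= 3550}
        for i in range(3650)
    ]


def flush(buffer, storage):
    statements = [("increment", e["viewer"], 1) for e in buffer if e["kind"] == "viewed"]
    # The checkpoint goes at the tail of the batch.
    statements.append(("checkpoint", buffer[-1]["position"]))
    storage.execute_batch(statements)


def project(events, storage, batch_size=100):
    buffer = []
    for event in events:
        buffer.append(event)
        # Historic events: flush every batch_size events. Live events: flush right away.
        if event["live"] or len(buffer) >= batch_size:
            flush(buffer, storage)
            buffer = []
    if buffer:
        flush(buffer, storage)


storage = Storage()
project(build_stream(), storage)
print(round(storage.cost, 1))  # 135 batches + 0.1 * 3735 statements = 508.5
```

Under these assumptions the cost drops from 7200 to 508.5 execution units, which is in line with the reduction of roughly 93%.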

This approach makes a significant difference. Execution cost has been reduced by 93%! Batching of historic events makes replays much faster, but with some extra effort we can take this optimization even further.

Batching with transformation

It always pays off to understand the guarantees and intricacies of the storage you're using. Looking closely at the storage interface, we find that we can increment the view count by any number. If we use a local data structure to aggregate the view count up front, we can reduce the number of statements even further.

In practice, we filter for the events we're interested in, group by the viewer id, count the values and map that into a single statement per viewer.

This further reduces costs by more than two thirds. The optimization makes the code a bit more elaborate, but not necessarily that much more complex - it's still a local optimization.
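
The filter, group and count steps can be sketched with a `Counter` in Python (the original script is F#). `Storage`, `build_stream` and `flush` are hypothetical helpers. The sketch assumes all views belong to a single viewer, which happens to land on 162 execution units; with more viewers, each batch carries one statement per distinct viewer and the cost is higher.

```python
from collections import Counter


class Storage:
    """Hypothetical storage stand-in that only tracks cost."""

    def __init__(self):
        self.cost = 0.0

    def execute_batch(self, statements):
        self.cost += 1.0 + 0.1 * len(statements)  # 1 unit + 0.1 per statement


def build_stream():
    # 3550 historic events (50 of them bookmarks), followed by 100 live views.
    return [
        {"position": i + 1,
         "kind": "bookmarked" if i < 3550 and i % 71 == 70 else "viewed",
         "viewer": "viewer-1",
         "live": i >= 3550}
        for i in range(3650)
    ]


def flush(buffer, storage):
    # Filter the views, group by viewer id and count them up front.
    views = Counter(e["viewer"] for e in buffer if e["kind"] == "viewed")
    # One increment statement per viewer, carrying the aggregated count.
    statements = [("increment", viewer, count) for viewer, count in views.items()]
    statements.append(("checkpoint", buffer[-1]["position"]))
    storage.execute_batch(statements)


def project(events, storage, batch_size=100):
    buffer = []
    for event in events:
        buffer.append(event)
        if event["live"] or len(buffer) >= batch_size:
            flush(buffer, storage)
            buffer = []
    if buffer:
        flush(buffer, storage)


storage = Storage()
project(build_stream(), storage)
print(round(storage.cost, 1))  # 135 batches of 2 statements each = 162.0
```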


In three steps, we brought cost down from 7250 execution units to only 162 units. That makes me a 44x engineer, right?

In general, storage is one of the slowest components of your system. Making your system faster often involves making it do less work. Avoiding waste by batching and doing some more work up front can make a big impact when you want to make your projection faster.

You can find the complete F# script here.

Sunday, July 23, 2017

From human decisions, to suggestions, to automated decisions

I've been wanting to share this experience for a while, but it took me a while to come up with a story and example I could use in a blog post.

I help out during the weekends in a small family-run magic shop. I'm the third generation working in the shop. My great-grandfather always hoped that his only son would follow in his footsteps as a carpenter. But at only eighteen years old, my grandfather said goodbye to the chisels and sawdust, and set out for the big city to chase his dream of becoming a world-class magician. The first few years were tough; he was no Houdini. He would (barely) get by performing at kids' birthday parties, weddings and store openings. That's how he met my late grandmother. She worked as a shop girl in one of the first malls that were built in the city, and happened to show up each time my grandfather performed in one of the stores. After getting married, having a baby (my dad) and saving every dime they earned, my grandfather was able to rent a hole in the wall and open up his own tiny magic shop - in that same mall. Once my dad finished school, he worked as a middle school teacher for a few years, giving up on that job to join his father in the family business. He loves to tell you how he can now still teach children, without the chore of grading their homework. I've been running around and helping out in the store since I could barely walk. I guess you can say that magic runs in our blood.

Since the beginning of time, our trade has relied on secrecy. However, due to the rise of the internet, magic is dying a slow death. Even the greatest of tricks and illusions are challenged and destroyed in the open by non-believers. Our craft is now reduced by many to a cheap fairground attraction.

My grandfather, even after suddenly losing grandma last year, isn't willing to give up on the business though. "There will be a time when the people need magic once more, and we will be waiting right here." He decided not to see modern technology as the nemesis of magic, but rather as a potential assistant.

Instead of fiddling in his study all night with a deck of cards, a hat, a scarf and the lonely rabbit, I've been working with him trying to show him what happened in technology over the last 30 years. Being a programmer, I started by showing off some of my unfinished hobby projects experimenting with microcontrollers. I hadn't gotten much further than making some LEDs blink controlled by my voice, but that was enough to spark my grandfather's creativity. "Can that chip make the lights go out? Can it blow smoke? Can it sense if I flip it around real fast?" One evening, tired after brainstorming and testing ideas all night, he told me "being able to tell these little computers what to do might not be magic, but a miracle".

I would like to tell you the details of what we came up with, but I'd have to make you disappear after I did. To make a long story short, it was an overwhelming success. Neighbourhood magicians picked it up, and even mere mortals thought it was a great gimmick they could show off with. A friend of mine even told me he used our invention to pick up girls. It wasn't long before I got daily emails and tweets from all over the world, begging me to ship our gadget their way.

Thanks to some lovely open source software, I was able to set up a full-blown web shop in a matter of days. While orders started rolling in, we got a grip on how to actually produce our new product at a sufficient pace. Shipping overseas turned out to be surprisingly easy. What we didn't anticipate was how to handle returns and refunds. This is where the actual story starts...

When our usual customers visit the store, we take our sweet time to show them how to perform the trick. This results in us knowing our customers pretty well and hardly ever having anyone return an item or ask for a refund. Admittedly, growing the same connection with our customers online hasn't been a success. A lot of them lack the magician mindset. You can't just buy magic, you have to put the practice in to make the magic happen. These maladjusted expectations make for quite a few phoney complaints.

My dad, my grandfather and I have been taking turns at the chore of handling returns and refunds. This domain of the open source shop is very much underdeveloped. Even something as simple as looking up a customer's details and order history requires scanning multiple pages of information or even querying the database by hand. Making a well-informed decision takes up way more time than it should.

A use case specific view

The first thing I did in an attempt to speed up this process was building a use case specific view. I asked my dad and grandfather which heuristics they use and which data is needed to feed those heuristics. To get the full picture as soon as possible, I imposed the rule that only this specific view can be used to make a decision. If a piece of data was lacking, I would add it the same day.

This process was more useful than I expected. What we learned is that we all used different heuristics, but were also victims of different biases. For example, I learned that my grandfather used to have a Dutch neighbour who would leave for work very early, and slam the door so loud, it woke my grandfather up each morning. He has had a dislike for the Dutch ever since. When customers were Belgian, he would lean much more towards issuing a refund, since he believes Belgians are less likely to lie about the cause of a broken item. We also discovered that we used different words for specific numbers. I would use "Items purchased sum", but my dad would use "Items purchased lifetime" to define the total amount of money spent purchasing items. We decided on being more explicit and making a composition of all those words.

I ended up with a simple screen that rendered a read model that looked something like this.

Based on heuristics in the head and a snapshot of information available in the world, we would make a decision and click a button to execute a specific command.

Making suggestions

By now, I had gotten quite interested in this domain. Each day I would go over all of the decisions and see whether I understood why a decision was made. I would call the shop each time I didn't understand and scribble down notes whenever I discovered a new implicit rule. In the meantime, I started experimenting with codifying these rules to make automated decisions, but when I ran the older snapshots through my routine the results were not 100% there yet. Instead of making automated decisions, I switched to making suggestions.

I rendered these suggestions on top of the existing view and observed the decisions that were made.

My partners in crime were quite ecstatic with this new feature. They had a bit more room to breathe and could spend more time doing things they liked.

After observing and comparing the suggestions with the decisions made, I kept tweaking the routine a bit more. I got close but I felt as if I wasn't quite there yet.

Automating decisions

It was my grandfather who eventually pushed me to fully automate making these decisions. He said "I almost always find myself picking the first option. It's fine if the machine is a bit off now and then. Just defer decisions and call in a human when the machine is not confident enough." How can I question my grandfather's wisdom? And so this happened...

From human decisions, to suggestions, to automated decisions
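
The idea of deferring to a human when the machine is not confident enough could be sketched like this. Everything here is hypothetical: the `Suggestion` shape, the action names, the confidence scores and the 0.8 threshold are illustrative assumptions, not the rules the shop actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Suggestion:
    action: str        # hypothetical command name, e.g. "refund"
    confidence: float  # 0.0 (no idea) to 1.0 (certain)


def decide(suggestions: List[Suggestion], threshold: float = 0.8) -> Optional[str]:
    """Return the action to execute automatically, or None to defer to a human."""
    if not suggestions:
        return None
    best = max(suggestions, key=lambda s: s.confidence)
    return best.action if best.confidence >= threshold else None


confident = decide([Suggestion("refund", 0.93), Suggestion("reject", 0.05)])
unsure = decide([Suggestion("refund", 0.55), Suggestion("reject", 0.45)])
print(confident)  # refund
print(unsure)     # None
```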

With that, we've come full circle. I'm happy to report that my dad, grandfather and I are back to spending more time in my grandfather's study coming up with new tricks.

I regularly have a look at the data to check for anomalies and to tweak the routine a bit further. But even that is taking up less and less time. To be fair, I didn't invest in a full-blown test suite even though this small routine has grown into 200 lines of code. I find much relief in the fact that when I replay past decisions, I hardly ever find a regression.
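
Replaying past decisions as a poor man's regression check might look roughly like this. The `replay` helper, the snapshot fields and the toy `routine` are assumptions made for illustration; the real routine and its rules are not shown in this post.

```python
def replay(history, routine):
    """Return the past cases where the routine disagrees with the recorded decision."""
    return [(snapshot, decided) for snapshot, decided in history
            if routine(snapshot) != decided]


# Hypothetical rule and history, purely for illustration.
def routine(snapshot):
    return "refund" if snapshot["order_total"] < 50 else "defer"


history = [
    ({"order_total": 20}, "refund"),
    ({"order_total": 120}, "defer"),
    ({"order_total": 30}, "reject"),
]

mismatches = replay(history, routine)
print(mismatches)  # [({'order_total': 30}, 'reject')]
```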

Maybe if we invent another popular trick like this one, we will acquire enough data to let the machine do the learning for me. For now, it's automagical enough.

Monday, February 13, 2017

How to organize a meetup

I've organized a few DDDBE meetups in the past, and always succeed in forgetting something. Either someone points it out well in advance, or I end up stressing last minute. This post partly serves as a checklist for myself, but it would be a welcome side effect to also see it encourage others to help out organizing future meetups. Organizing a meetup is not rocket science; having a list of what to take care of is a good start.

Finding a speaker

One of the crucial ingredients of an interesting evening is the speaker and the content he or she brings. My strategy is to use Twitter to keep tabs on people that produce content that's of interest to me. A meetup is a great excuse to meet in person and to hear them out. When it's someone who isn't local, I keep an eye on whether they will be attending any conferences nearby in the future.

Contacting a speaker

The medium you use doesn't matter that much, whether it's through Twitter, email, Slack or in person, as long as you give them enough context to work with.

Things to mention when contacting someone:

  • Introduce yourself and the user group.
  • Tell them why you think they would make a great guest. Which was the talk, tweet or blog that piqued your interest? What is it that makes the content so relevant to your audience?
  • Tell them about the expected composition of the audience. Will there be 20 people or 100? Should most of them be quite familiar with the topic, or do you think an introductory talk would work better?
  • Propose a date or a small set of dates from the get-go. Even if none of those dates work, you might settle on an alternative date instead which allows you to move forward and start planning.
  • To avoid awkward situations, mention up front whether expenses can be covered or not. Not having a budget normally isn't an issue when speakers are local or when they're around for a conference or customer. When they have to get on a plane, it's another matter.

Gathering speaker requirements

Ask the speaker what he or she needs to be able to do the session:
  • A projector
  • A whiteboard
  • Modelling space
  • Markers and sticky notes
  • ...

Selecting a location sponsor

With those requirements in mind, you can start looking for a location sponsor to host the meetup.

It's a luxury to have a pool of location sponsors. Depending on the speaker and format of the session, you can make a selection on what would be the best fit. When the speaker is a big name in the community, try to find a location that fits a lot of people. A large auditorium perhaps. When it's a workshop, make sure there's plenty of modelling space. An open space with lots of walls or windows you can use to your advantage. If you think the interest in a certain topic will be rather limited, aim for something a bit smaller and more cozy, which could benefit the quality of interactions between the speaker and the audience.

Once you've come up with a short list of ideal locations, you should be aware of the type of relationship your community has with the sponsor. Even if they don't get much out of it, they might not mind doing you a favor once in a while, but hosting a weekly meetup might be a bit much. I like to order them in a way that makes a best attempt at a round-robin distribution. This also benefits attendees and avoids overly concentrated geographical communities. Not everyone's up for driving halfway across the country on a weekday for a one-hour talk.

Contacting a location sponsor

Keep a list of location sponsors and the people you need to contact, and who inside your community knows them best. You have a higher chance of getting a positive response if you have someone that knows them well contact them. You can send an email, but there's a high risk that it ends up in a low priority queue somewhere - nobody likes sending out those reminder emails. It's more efficient to just call them, to ping them on Slack or to send them a direct message on Twitter - in that particular order.

Prepare a list of things the location sponsor needs to know:
  • The date
  • The name of the speaker and the subject
  • The speaker requirements
  • The number of people you're expecting
  • Expectations towards food: sandwiches, pizzas, soda, beers?

I try to contact location sponsors one after the other. This avoids broadcasting the request, only to have to turn offers down later on. This has never been a problem, as long as response times are low. If you have to wait weeks to get a reply, you're going to end up stressing out once the deadline approaches.

You can schedule a meetup as soon as the speaker and the date are confirmed.

Opening the RSVPs is best done after the location is also confirmed and not more than one month in advance. You want to avoid people reserving their seat to eventually not show up.

Details you want to include when making the announcement:
  • The date
  • The location and the maximum number of attendees
  • The speaker and the abstract of the session
  • Where to park your car
  • Whether food is provided or not. Include what to expect if possible, so that people with a specific diet can make the necessary arrangements.
  • The agenda of the evening: doors open, start of the session, debriefing

Not all these details need to be present when you schedule the meetup; you can still enrich the announcement later on. My own experience tells me I mostly care about the speaker, date and location. I only look at the specifics one week up front.

When you have people on the waiting list, it's a good idea to send a personal reminder one week in advance, asking people to update their RSVPs in case they can't make it. These seem to be more effective than the auto-generated ones.

Even with those reminders, you will often find a percentage of people not showing up. Many factors influence the no-show rate. Count the empty chairs, and allow for some overbooking once you get a feel for it. The worst thing that can happen is that the organizers have to watch the session standing up.

Speaker gift

Like Christmas gifts, you also want to think of a speaker gift up front. This doesn't have to be something super expensive, a small token of appreciation will do. I'm going to guess this is a culture thing. In Belgium people often end up gifting each other beers or wine. This isn't always appropriate or practical. Not everyone likes to drink and when you're travelling it's quite literally a burden on your shoulders. I'm actually thinking that something small, like a book with a personal note might be a better idea.


Before recording a session, there are a few things you need to consider.

Does the speaker mind being recorded? Some bring experimental content that's rough around the edges which isn't necessarily something they're ready to show to a large audience. Others see it as a perfect opportunity to be able to pitch a conference talk for later on, or to spread their content.

When you announce up front that the talk will be recorded, more people tend to stay at home. They can catch the video later on, but often end up never doing so. Even when they do, they've missed out on the best bits: the interactions before and after the talk.

Day of the meetup

Make sure you set aside enough time in your agenda. You don't want to be stressing out last minute, or God forbid, be late.

Your job is to make the speaker as comfortable as can be, and to think fast when something falls apart last minute.

If the speaker has been travelling, go get him at the station or at the airport. Travel is tiring. Reassure him he can just relax from now on, you will take care of anything he needs. The basics first: is he thirsty, hungry, does he need to use the toilet, an internet connection? You will also help him set up his laptop and put a bottle of water nearby before the session starts.

Arrive at the venue in time, at least 20 minutes up front. Say hi to the host and quickly check whether all requirements are met.

Once the first people start pouring in, say hi, point them in the right direction and make some small talk. Once a small crowd has found its way to sandwiches and the social area, you can focus on ensuring the room and speaker are ready to go. The next batch of people to come in will be able to find their way on their own.

Once people are well fed and settled, take the stage and do a short introduction:
  • Community announcements: scheduled meetups, upcoming befriended conferences...
  • Sponsor raffles: software licenses, conference tickets...
  • Mention the types of sponsoring the community is still looking for: locations, markers, budget...
  • Thank the sponsor(s) and allow them to say a few words if they'd like to
  • The agenda for the night
  • The speaker(s)

Now that the session is well on its way, you can mostly just sit back and relax. Pay attention though. In case nobody wants to go first during Q&A, you should lead by example and have a question ready.

Once questions dry up, or people are thirsty, thank the speaker and don't forget to hand off your speaker gift.

After the session, the location sponsor usually has a fridge with beverages you're free to plunder. Out of courtesy, you don't want to make it too late and see yourself out in time. Ask the host what time that should be. Usually 30% of the people leave right after the talk, the rest sticks around for one or two drinks. When the time comes to leave, not everyone wants to head home right away. As part of your prep, find a bar that's nearby and easily accessible, where you can gather afterwards in case not everyone has had enough.

Give thanks

Whew, congratulations, you made it! The only thing left to do is to thank the speaker and the location sponsor once more. Give them a shout out in public, and send them a personal thank you note through a more personal medium.

Thanks to the whole DDDBE community for the inspiration and the platform. Thanks to Mathias, Yves, Stijn, Gien, Antonios in particular for reviewing.