High scale architecture using Event Sourcing and Eventual Consistency

by Facundo La Rocca   Last Updated May 27, 2018 18:05 PM

I'd like to share a solution I came up with for an exercise in a technical interview, and hear your opinions and suggested improvements.

The exercise:

  • You need to create an API that indicates, based on a DNA, whether a person is male or female. This API must have two endpoints:

POST /genre: It receives a DNA (an NxN matrix) and returns 200 (OK) if it is a woman and 403 (Forbidden) if it is a man. The body should look like this:

{
  "dna" : ["ATGCGA","CAGTGC","TTATGT","AGAAGG","CCCCTA","TCACTG"] 
}

The algorithm decides it is a woman if there is more than one sequence of four equal characters. For example:

[Image: example DNA matrix]
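Since the example image is missing, here is a minimal sketch of how the rule could be checked. It is only an illustration, not the algorithm used in the interview. It assumes (the exercise text does not say so) that sequences count horizontally, vertically and on both diagonals, and that every window of four equal letters counts once. The names `countSequences` and `isWoman` are hypothetical.

```javascript
// Sketch only: counts runs of four equal letters in an NxN DNA matrix.
// ASSUMPTIONS (not stated in the post): sequences are checked horizontally,
// vertically and on both diagonals, and every window of four counts once.
const DIRECTIONS = [
  [0, 1],  // left to right
  [1, 0],  // top to bottom
  [1, 1],  // diagonal, down and to the right
  [1, -1], // diagonal, down and to the left
];

function countSequences(dna, length = 4) {
  const n = dna.length;
  let count = 0;
  for (let row = 0; row < n; row++) {
    for (let col = 0; col < n; col++) {
      for (const [dr, dc] of DIRECTIONS) {
        const endRow = row + dr * (length - 1);
        const endCol = col + dc * (length - 1);
        // Skip windows that would leave the matrix.
        if (endRow >= n || endCol < 0 || endCol >= n) continue;
        let allEqual = true;
        for (let k = 1; k < length; k++) {
          if (dna[row + dr * k][col + dc * k] !== dna[row][col]) {
            allEqual = false;
            break;
          }
        }
        if (allEqual) count++;
      }
    }
  }
  return count;
}

// "More than one sequence" is the rule quoted in the exercise.
function isWoman(dna) {
  return countSequences(dna) > 1;
}
```

Under these assumptions, the sample body from the exercise contains three such sequences (one horizontal, one vertical, one diagonal), so it would be classified as a woman.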

GET /stats

It returns the results in this format:

{
  "count_men_dna": 40,
  "count_women_dna": 60,
  "ratio": 0.4
}

Instructions:

  • Build the API
  • Persist each analyzed DNA once and only once
  • Take into account that the API could be overwhelmed unexpectedly with bursts of up to ONE MILLION REQUESTS PER SECOND

Let's suppose I've got the best algorithm in the world for analyzing DNAs, and focus on the architecture.

My Architecture

[Image: architecture diagram]

  1. A request is received, and the DNA is validated and analyzed SYNCHRONOUSLY
  2. A response is sent to the client and a new message is pushed to the queue
  3. The worker gets the message and executes two steps ASYNCHRONOUSLY:
    1. Decides if the DNA is already in the store and saves it if required
    2. Pushes a new message to the Stats Event Store (NEW_WOMAN or NEW_MAN)

The API

/genre

It is written in NodeJS + ExpressJS and runs in a VM behind a load balancer. The service should be able to auto-scale based on traffic (on Azure, for example). For each request, the DNA is analyzed and a new message is added to the queue, which decouples responding to the client from the storing process. It is still a possible bottleneck or contention point, though, since the analysis takes place here.
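The request flow described here could look roughly like the sketch below. It leaves out Express so it can run standalone. `handleGenreRequest`, `isValidDna`, the injected `analyzeDna` function and the in-memory `queue` array are hypothetical stand-ins, and the 400 response for an invalid body and the A/C/G/T alphabet are assumptions, since the exercise only specifies 200 and 403.

```javascript
// Sketch only: the /genre flow without Express, so it runs standalone.
// ASSUMPTIONS: valid rows contain only A, C, G, T; invalid input gets a 400.
function isValidDna(dna) {
  return (
    Array.isArray(dna) &&
    dna.length > 0 &&
    dna.every(
      (row) =>
        typeof row === "string" &&
        row.length === dna.length && // the matrix must be NxN
        /^[ACGT]+$/.test(row)
    )
  );
}

// `analyzeDna` is the (hypothetical) analysis function; `queue` is anything
// with a push method, standing in for the real message queue.
function handleGenreRequest(body, analyzeDna, queue) {
  const dna = body && body.dna;
  if (!isValidDna(dna)) return { status: 400 };

  // The analysis runs synchronously, inside the request.
  const isWoman = analyzeDna(dna);

  // Persistence is deferred: the worker picks this message up later.
  queue.push({ dna, isWoman });

  return { status: isWoman ? 200 : 403 };
}
```

The handler never touches the store, so its latency depends only on validation, analysis and the queue write.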

/stats

Reads all events from the Stats Event Store and calculates what is required. Since this uses an Event Sourcing architecture, there must be a process responsible for creating snapshots periodically, to avoid accumulating too many logs in the store.
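As a rough illustration of reading from a snapshot plus the newer events, the sketch below folds NEW_WOMAN / NEW_MAN events into counters. The event names come from the steps above; everything else is an assumption: the snapshot shape, a monotonically increasing `seq` field on each event, and the ratio formula (men divided by total, which happens to reproduce the 0.4 in the sample response).

```javascript
// Sketch only: event names NEW_WOMAN / NEW_MAN come from the post;
// the snapshot shape, the `seq` field and the ratio formula are assumptions.
function applyEvent(state, event) {
  switch (event.type) {
    case "NEW_WOMAN":
      return { ...state, women: state.women + 1, lastSeq: event.seq };
    case "NEW_MAN":
      return { ...state, men: state.men + 1, lastSeq: event.seq };
    default:
      return state; // unknown events are ignored
  }
}

// Replays only the events that are newer than the snapshot.
function project(snapshot, events) {
  return events
    .filter((event) => event.seq > snapshot.lastSeq)
    .reduce(applyEvent, snapshot);
}

// Shapes the projected state like the GET /stats response.
function toStats(state) {
  const total = state.men + state.women;
  return {
    count_men_dna: state.men,
    count_women_dna: state.women,
    // Assumed formula: men / total.
    ratio: total === 0 ? 0 : state.men / total,
  };
}
```

A periodic job could store the result of `project` as the next snapshot, so that a later read only has to replay the events recorded after it.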

The store

It is an ElasticSearch database with two indexes: one for storing men/women and another for storing the stats.

The Worker

It is a process written in NodeJS that runs in a VM. It reads messages from the queue and checks whether the DNA already exists in the People index:

  1. Saves the DNA in the store if it does not exist yet
  2. Pushes a new log to the Stats Event Store if it is a new person.

This is where I see the most complicated point of contention and the main blocker for scaling out. While only one instance is running there is no problem, as all messages are read one by one. The problems begin if I want to run more instances of the worker, since there will be consistency problems.
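To make the consistency problem concrete, here is a small simulation of two workers handling the same DNA at the same time. It is a sketch under assumptions, not my actual worker: `FakePeopleStore` is an in-memory stand-in for the People index, `dnaId` (a SHA-256 hash of the DNA used as document id) and the two worker functions are hypothetical. `atomicWorker` shows one possible direction, where the store itself decides whether the document is new. Elasticsearch's index API has a comparable option, `op_type=create`, which rejects the write when the id already exists.

```javascript
const { createHash } = require("node:crypto");

// A short delay imitates a network round trip, so two workers can interleave.
const tick = () => new Promise((resolve) => setTimeout(resolve, 1));

// Sketch only: an in-memory stand-in for the People index.
class FakePeopleStore {
  constructor() {
    this.docs = new Map();
  }
  async exists(id) {
    await tick();
    return this.docs.has(id);
  }
  async save(id, doc) {
    await tick();
    this.docs.set(id, doc);
  }
  // Atomic "create only if absent": returns false when the id is taken.
  async createIfAbsent(id, doc) {
    await tick();
    if (this.docs.has(id)) return false;
    this.docs.set(id, doc);
    return true;
  }
}

// The DNA itself determines the document id, so duplicates map to one key.
const dnaId = (dna) =>
  createHash("sha256").update(dna.join("\n")).digest("hex");

// The worker as described above: check first, then write.
async function naiveWorker(msg, store, events) {
  const id = dnaId(msg.dna);
  if (await store.exists(id)) return;
  await store.save(id, msg);
  events.push(msg.isWoman ? "NEW_WOMAN" : "NEW_MAN");
}

// Variant in which the store decides atomically who created the document.
async function atomicWorker(msg, store, events) {
  const created = await store.createIfAbsent(dnaId(msg.dna), msg);
  if (created) events.push(msg.isWoman ? "NEW_WOMAN" : "NEW_MAN");
}

// Runs two instances of the given worker on the same message and
// resolves to the number of stats events they produced.
async function runTwoWorkers(worker) {
  const store = new FakePeopleStore();
  const events = [];
  const msg = { dna: ["AAAA", "CCCC", "GGGG", "TTTT"], isWoman: true };
  await Promise.all([worker(msg, store, events), worker(msg, store, events)]);
  return events.length;
}
```

With this fake store, `runTwoWorkers(naiveWorker)` resolves to 2, because both instances see "not found" before either one has written, so the same person is counted twice. `runTwoWorkers(atomicWorker)` resolves to 1.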

What should I change/add in the architecture and what considerations should I take into account to accomplish the target?

How could I take real advantage of patterns like Event Sourcing, Eventual Consistency and CQRS?

I'd appreciate any docs, opinions or experience you can share.

I want to add that I got the position, so I just want to learn and improve my architecture skills.

Thanks.


