Cloud Platform Scalability & Optimisation
Behind the scenes
Our customer built an AWS-based cloud platform to process smartphone sensor data collected by their mobile application's SDK.
It was designed according to a lambda architecture with both a speed layer for real-time processing and a batch layer for daily processing of sensor data.
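The split between the two layers can be sketched in a few lines of plain Python. This is only an illustrative model of the lambda pattern, not the customer's actual pipeline: the reading format and the sum aggregation are assumptions, and in the real platform these roles are played by Spark batch jobs and Kafka-fed streaming jobs.

```python
from collections import defaultdict

# A reading is modelled here as a (device_id, value, timestamp) tuple;
# this schema is an assumption for illustration only.

def batch_view(history):
    """Batch layer: recompute an aggregate view over the full history (e.g. daily)."""
    view = defaultdict(float)
    for device_id, value, _ts in history:
        view[device_id] += value
    return dict(view)

def speed_update(view, reading):
    """Speed layer: incrementally fold one new reading into the real-time view."""
    device_id, value, _ts = reading
    view[device_id] = view.get(device_id, 0.0) + value
    return view

def merged(batch, speed):
    """Serving layer: combine the batch view with the speed layer's recent delta."""
    return {k: batch.get(k, 0.0) + speed.get(k, 0.0)
            for k in batch.keys() | speed.keys()}
```

The batch layer trades latency for completeness and correctness, while the speed layer covers only the data that has arrived since the last batch run; queries merge both views.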
After a cloud-architecture analysis and load tests, we scaled the messaging layer from 4k to 200k and added support for persistent connections, using Scala, Apache Spark, Apache Kafka and Elasticsearch.
“At Klarrio we learned the hard way that the real scalability challenge goes beyond optimizing the CPU and memory of the current processing jobs. We found that the real issues lay in the resources that cannot easily be scaled up, such as the network, single points of failure, recovery from failed components, PostgreSQL read/write limitations, …
We really had to think outside the box here. Besides the technical optimizations, we also had to adjust the data science code written in Python, and solve the queuing problems caused by bursty traffic patterns, both in the APIs between microservices and in Kafka.”
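One common way to absorb bursty traffic between services is a bounded buffer that signals backpressure when full, so producers slow down instead of overwhelming the consumer. The sketch below is a minimal, hypothetical illustration of that idea in plain Python; it is not the customer's implementation, and a real Kafka deployment would also lean on producer batching settings rather than an in-process queue.

```python
from collections import deque

class BoundedBuffer:
    """Illustrative backpressure buffer between a bursty producer and a
    steady consumer. The capacity and batch-drain policy are assumptions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._q = deque()

    def offer(self, item):
        """Return False when the buffer is full, signalling backpressure upstream."""
        if len(self._q) >= self.capacity:
            return False
        self._q.append(item)
        return True

    def drain(self, max_items):
        """Consumer drains in bounded batches, smoothing the burst."""
        out = []
        while self._q and len(out) < max_items:
            out.append(self._q.popleft())
        return out
```

The design choice worth noting is that `offer` refuses work rather than growing without bound: rejecting (or blocking) at a known limit keeps memory predictable and pushes the flow-control decision back to the producer, which is the same principle Kafka's producer-side buffering applies.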
“Solving scalability issues without additional compute resources and costs is a tough but fun challenge for any data engineer.”

Bruno De Bus, CTO