The internal centralized logging services at DigitalOcean (DO) have gone through numerous substantial iterations. This talk focuses on the costs and benefits of the four primary architectures between 2015 and now:
5 server-grade machines
20 virtual machines (droplets)
80 virtual machines
96 decommissioned (aka certified pre-owned) server-grade machines orchestrated by Mesos/Marathon
While scaling may have forced the team to keep expanding the infrastructure, this does not imply that Mesos is the optimal solution for all uses of Elasticsearch. Along the way, the talk touches on the changing culture of observability at DO, the growth of the team, and the features that have been and will continue to be built for DO’s engineers.
Marc is a former physicist turned senior engineer and technical lead on DigitalOcean's Observability Platforms team. He specializes in system infrastructure designed to handle large amounts of data. His passion for numbers has led Marc to focus heavily on building observable systems and the services that expose their data. He has architected several iterations of DigitalOcean's internal centralized logging service, the most recent of which is a 300+ node Elasticsearch cluster orchestrated by Mesos. Marc is only 90% nerd. The other 10% of the time, he enjoys playing ice hockey, traveling, languages, cooking, and, most importantly, eating.
What's coming soon in Elasticsearch
We will discuss features coming soon in Elasticsearch: implications of upgrading to Lucene 8, new aggregations, Zen2, index lifecycle management, SQL, and more.
Paul Sanwald is a team lead for the Elasticsearch team at Elastic. Prior to Elastic, Paul served as CTO for RedOwl Analytics and was an early user (and big fan) of Elasticsearch.