Using ESRally & Prometheus to set Elasticsearch dynamic SLAs
Maintaining a massive Elasticsearch environment (~25 million docs/minute) is no walk in the park, and finding a clear way to set SLAs for our various clusters proved especially challenging. How do we know how many docs our ES cluster can handle? How much CPU time do we spend indexing? What can we guarantee to our users in terms of query performance? How much lag are we willing to tolerate? Which metrics are actually important? By running Rally multiple times a week on all of our clusters (leveraging Jenkins, Prometheus and Grafana), we can now answer all of these questions in real time, gain real insight into when and how to scale our clusters, and even set alerts based on our clusters' capacity.
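As a rough illustration of capacity-based alerting, the sketch below compares a cluster's current indexing rate with the maximum throughput measured in a benchmark run. It is not the speaker's implementation: the function names, the 80% threshold and the benchmark figures are all hypothetical, and only the ~25 million docs/minute rate comes from the abstract above.

```python
# Minimal sketch, assuming a benchmark run yields a maximum sustainable
# indexing throughput per cluster. All names and thresholds are hypothetical.

def utilization(current_docs_per_min: float, benchmarked_max_docs_per_min: float) -> float:
    """Return the fraction of benchmarked capacity currently in use."""
    if benchmarked_max_docs_per_min <= 0:
        raise ValueError("benchmarked maximum must be positive")
    return current_docs_per_min / benchmarked_max_docs_per_min


def should_alert(current_docs_per_min: float,
                 benchmarked_max_docs_per_min: float,
                 threshold: float = 0.8) -> bool:
    """Return True when utilization reaches the (assumed) alert threshold."""
    return utilization(current_docs_per_min, benchmarked_max_docs_per_min) >= threshold


if __name__ == "__main__":
    current = 25_000_000          # docs/minute, the figure quoted in the abstract
    benchmarked_max = 30_000_000  # docs/minute, hypothetical benchmark result
    print(f"utilization: {utilization(current, benchmarked_max):.0%}")
    print("alert:", should_alert(current, benchmarked_max))
```

Because the benchmark is re-run several times a week, the denominator follows the cluster's real capacity instead of a fixed number, which is what makes the resulting SLA dynamic.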
Nikita Ostrovsky is a DevOps Manager who leads the Visibility team at Outbrain. He has been in the SRE/DevOps space for over 10 years and focuses primarily on bringing sane metrics, logging, event streaming and alerting solutions to the organization at large.
Elastic 6.5 is one of our most ambitious releases ever, bringing many innovative and long-anticipated features into the stack, such as:
• Cross-cluster replication
• Centralized management for Beats
• Canvas for live infographic presentations
• Kibana Spaces for multitenant dashboards
• Built-in UI for real-time infra monitoring
• Built-in UI for real-time log tailing
• Distributed tracing for Elastic APM
• ODBC driver for Elasticsearch
Dave Moore, an Elastic solutions architect, will introduce and demonstrate many of these new features, and share his thoughts on what it all means for the future of the Elastic Stack.
Dave Moore is a solutions architect at Elastic, where he helps people succeed with real-time search and analytics at scale. In his past life he designed and implemented distributed data processing systems, including the patient identification system used by one of the largest health care companies in the world. At Elastic he applies his expertise to critical operational problems such as enterprise search, logging and metrics analytics, and application performance monitoring. He is the author of zentity, an Elasticsearch plugin for real-time entity resolution, and a long-time user of open source software. Dave lives in the Research Triangle and works with everyone from Charleston to Philadelphia.