Lessons Learned: Upgrading a 100-node ES Cluster from 1.7 to 7.x
- Challenges faced with 1.7: mapping explosion, large cluster state, high CPU usage, and end-of-life (EOL) status
- Estimating the optimal cluster size with performance and cost in mind
- Cluster provisioning using ECK on AWS EKS with separate hot and cold data tiers to save costs
- Designing index settings and aliases, and choosing optimal index and shard sizes
- Implementing rollover and Index Lifecycle Management
- Upgrading Java/Spring-based microservices using spring-data-elasticsearch and retiring the TransportClient
- Data migration from on-prem ES to ECK-based ES on AWS: direct data migration instead of the snapshot-restore approach
- Performance improvement measurements
Sri Dalta is an Elastic Certified Engineer, currently working as a Software Engineer at Egen. Having worked on Egen's Industrial IoT project for the last couple of years, he works with the Elastic ecosystem daily, from both an Elastic administrator's and a data engineer's perspective. He has also worked with a modern distributed stack involving Kafka, data pipelines, microservices, containers, and AWS Cloud.
Yani Julian is currently working as a Software Engineer at Egen on distributed systems across multiple cloud platforms. Yani is experienced in building microservices where Elasticsearch is the primary data store, as well as in Elasticsearch cluster administration, monitoring, and fine-tuning. Yani currently focuses on developing automated tools to improve developer productivity.