Elasticsearch is a real-time search and analytics engine and the core product of the well-known Elastic Stack. It is mainly used for log analytics and for building interactive dashboards to browse and drill down into data, usually event or time-based data. Response times with Elasticsearch are subsecond in most cases, so it is widely used for ad-hoc data investigation, often through an interactive UI or Kibana dashboards.
Presto is a high-performance, distributed SQL query engine for Big Data. Slowly but surely, it is becoming the de facto standard for implementing cost-effective Data Lakes and Data Warehouses - mainly thanks to its ability to query huge amounts of data in what we often call “interactive time”.
More often than not, we find ourselves implementing Big Data architectures that include both of these technologies. Presto is usually deployed for what we call the “cold layer”, and Elasticsearch for the “hot layer”. In most systems, real-time access isn’t required for the lion’s share of the data; there the main concern is keeping costs low, so S3 and Presto are a great fit. Ultra-low-latency queries are usually required for only a portion of the data, and that is where Elasticsearch, which is more hardware-demanding and hence costlier, really shines.
And this is where things start to get really interesting. Join us in this talk to learn more about Presto and Elasticsearch, how they can complement each other in the real world, and some really cool use cases.
About the speaker:
Itamar Syn-Hershko is the CTO and Founder of BigData Boutique, which helps companies and organizations succeed with their Big Data projects. Itamar is happy to still be an active developer and a hands-on architect. Over the years he has built and maintained several large mission-critical systems and gained a lot of experience, which he now uses to perfect systems built to deal with scale.
His background is in the Information Retrieval field - specifically text search and analytics. He has been involved with Lucene and Elasticsearch for over a decade, solved the challenge of searching Hebrew texts, and improved relevance and performance for too many search systems to remember.
One thing led to another, and his focus is now on creating cost-effective, properly modeled, high-performance data platforms: choosing the right technologies and piecing everything together to create a fully functioning Data Lake or Data Warehouse - or just a data-rich platform at scale.