8:00 PM: Closing remarks, head homewards or off-site for continued networking
Streamlining Search Indexing using Elasticsearch and Spark
Everyone who has maintained a search cluster knows the pain of keeping online update code and offline reindexing pipelines in sync. Subtle bugs can pop up when data is indexed differently depending on the context. By using Spark and Spark Streaming with Elasticsearch, we can reuse the same indexing code in both places. We will explore creating a simple offline indexing pipeline and reusing it for online indexing. If time (and wi-fi) allow, a brief demo using the Twitter firehose will be shown.
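The core idea of the abstract above, one indexing function shared by the offline and online paths, can be sketched without a cluster. This is a minimal illustration and not the speaker's code: `to_index_doc`, `InMemoryIndex`, `reindex_offline`, and `index_online` are hypothetical names, the tweet fields are assumed, and the in-memory class only stands in for an Elasticsearch index. In a Spark job the same function would typically be mapped over a batch dataset and over each streaming micro-batch.

```python
from typing import Dict, Iterable, List


def to_index_doc(tweet: Dict) -> Dict:
    """Shared indexing logic: the single place where a raw record
    becomes a search document (field names are assumptions)."""
    words = tweet["text"].split()
    hashtags = {w[1:].lower() for w in words if w.startswith("#") and len(w) > 1}
    return {
        "_id": str(tweet["id"]),
        "text": tweet["text"].strip(),
        "hashtags": sorted(hashtags),
    }


class InMemoryIndex:
    """Hypothetical stand-in for a search index; not a real client."""

    def __init__(self) -> None:
        self.docs: Dict[str, Dict] = {}

    def bulk(self, docs: Iterable[Dict]) -> None:
        # Later writes with the same _id overwrite earlier ones.
        for doc in docs:
            self.docs[doc["_id"]] = doc


def reindex_offline(records: Iterable[Dict], index: InMemoryIndex) -> None:
    """Offline path: transform and index the full dataset in one pass."""
    index.bulk(to_index_doc(r) for r in records)


def index_online(micro_batches: Iterable[List[Dict]], index: InMemoryIndex) -> None:
    """Online path: transform and index each small batch as it arrives."""
    for batch in micro_batches:
        index.bulk(to_index_doc(r) for r in batch)
```

Because both paths call the same `to_index_doc`, a change to how documents are built cannot reach one pipeline without reaching the other, which is the class of bug the talk sets out to avoid.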
Real-Time Analytics and Anomaly Detection using Elasticsearch and Apache Hadoop
Finding relevant information fast has always been a challenge, even more so in today’s growing “oceans” of data. Over the past few years, leading businesses have deployed Apache Hadoop extensively to store and process this ocean. Today’s challenge is to maximize analytical insights and return on investment from this existing Hadoop infrastructure.
Enter Elasticsearch for Apache Hadoop, affectionately known as es-hadoop. es-hadoop enables data-hungry businesses to enhance their Hadoop workflows with a full-blown search and analytics engine. Best of all, es-hadoop allows businesses to gain insights from their data in real-time.
In this talk, Costin Leau, lead developer for es-hadoop, will discuss:
• An overview of how Elasticsearch plays in the overall Hadoop ecosystem
• What is es-hadoop?
• A full feature overview and the benefits of using it
• How es-hadoop augments existing Hadoop deployments, regardless of flavor of Hadoop distro
• How Elasticsearch and es-hadoop help businesses extract insights and analytics from their Hadoop deployments, all in real time
Apache Usergrid & Elasticsearch
Apache Usergrid has evolved from beta to 1.0 and now to the 2.0 beta. This presentation will cover the problems we faced in 1.0 and how they were solved in 2.0 with the help of Elasticsearch.
Costin Leau, Software Developer, Elasticsearch Inc.
Costin Leau leads Elasticsearch’s Hadoop efforts. An open source veteran, Costin led various Spring projects and authored an OSGi spec. He has also spoken on Java-, Spring-, and Hadoop-related topics at various editions of EclipseCon/OSGi DevCon, JavaOne, Devoxx/Javapolis, JavaZone, SpringOne, and TSSJS.
Holden Karau, Software Engineer, Databricks
Holden Karau is a software development engineer at Databricks and is active in open source. She is the author of a book on Spark and has assisted with Spark workshops. Prior to Databricks, she worked on a variety of search and classification problems at Google, Foursquare, and Amazon. She graduated from the University of Waterloo with a Bachelor of Mathematics in Computer Science.
Todd Nine, Senior Software Engineer, Apigee
Todd Nine is the lead engineer/architect of Apache Usergrid, a distributed Backend-as-a-Service (“BaaS”), http://usergrid.incubator.apache.org. He has over 15 years of industry experience and has spent the last 6 designing and implementing distributed, eventually consistent systems used in commercial and safety industries.