Anurag Khandelwal will give a talk about Succinct. Kyle Mathews from RelateRocket will talk about clustering data with Elasticsearch.
Anurag's talk: This is not strictly about Elasticsearch, but I think it is very interesting.
Students at the AMPLab at UC Berkeley have been working on a project called Succinct, with the goal of enabling queries directly on compressed data (http://succinct.cs.berkeley.edu/). Integrating Succinct techniques into Elasticsearch seems like a promising direction. They'd like to give a talk at the meetup to get feedback from users and developers and to see whether Succinct would be a good fit for Elasticsearch.
"Cloud services today need to perform fast, interactive queries on large data volumes. Several recent studies have shown that data is growing faster than memory capacity, making in-memory query execution increasingly challenging. At UC Berkeley, we have built Succinct, a distributed data store that overcomes this problem by enabling a wide range of interactive queries (e.g., search, random access, range queries, and even regular expressions) directly on compressed data.
Besides its ability to execute queries on compressed data, Succinct differs from existing data stores along several dimensions. First, Succinct unifies several powerful data models (key-value stores, document stores, tables, etc.) using a single interface. Second, Succinct enables applications to choose a desired compression factor, allowing applications to use larger memory for improved performance. Finally, Succinct allows applications to change the compression factor on the fly, enabling new approaches to handling skewed query distributions, time-varying loads, and failure tolerance.
In this talk, I will describe Succinct's design, implementation and semantics. Succinct is completely open-sourced, and we have also released Succinct as a library that simplifies integration of Succinct data structures and techniques with existing data stores."
Kyle's blurb: "RelateRocket uses Elasticsearch to dynamically find 'cliques' within business social networks. We then derive what we call 'relatability narratives' from these cliques and serve these to sales reps. Using Elasticsearch has proved phenomenally successful: with it, we can score ~30 potential narratives and query detailed information about the top narratives in ~3-500 milliseconds."
Hiplead will be our hosts. Doors at 6:15, first talk (Kyle) 6:30, second talk 6:50-7:10, 15 min Q&A, go home.