Elasticsearch Learning to Rank: Search as a ML Problem & Search Logs + ML

GitHub - 88 Colin P Kelly Jr St San Francisco
Wed, Apr 10, 2019, 6:00 PM (PDT)

About this event

BIG THANK YOU TO GITHUB FOR HOSTING THIS EVENT AT THEIR OFFICE

We are excited to announce that Doug Turnbull & John Berryman, the authors of the fabulous book Relevant Search (https://www.manning.com/books/relevant-search), are reunited for this event's tech talks.

Talk 1: Elasticsearch Learning to Rank: Search as a Machine Learning Problem
Speaker: Doug Turnbull (https://twitter.com/softwaredoug)

Search relevance is how questions are answered through search. It's the process of changing the ranking of search results for a user query to return what users want. A search for 'iPhone XS' should rank documents highly when the product name matches. But a different query, 'smartphone with two cameras', would require a completely different strategy for ranking candidate results. What gives teams a headache is that all the diverse use cases for search must be handled by a single ranking algorithm.

This is where Learning to Rank comes in. We will discuss how search can be treated as a machine learning problem. 'Learning to Rank' takes the step of returning optimized results to users based on patterns in usage behavior. We will talk through where Learning to Rank has shined, as well as the limitations of a machine-learning-based solution for improving search relevance.

Doug Turnbull: Search relevance consultant. Co-Author of Relevant Search.

Talk 2: Search Logs + Machine Learning = Auto-Tagged Inventory
Speaker: John Berryman (https://twitter.com/JnBrymn)

For e-commerce applications, matching users with the items they want is the name of the game. If they can't find what they want then how can they buy anything?! Typically this functionality is provided through search and browse experiences. Search allows users to type in text and match against the text of the items in the inventory. Browse allows users to select filters and slice-and-dice the inventory down to the subset they are interested in. But with the shift toward mobile devices, no one wants to type anymore - thus browse is becoming dominant in the e-commerce experience.

But there's a problem! What if your inventory is not categorized? Perhaps your inventory is user generated or generated by external providers who don't tag and categorize the inventory. No categories and no tags means no browse experience and missed sales. You could hire an army of taxonomists and curators to tag items - but training and curation will be expensive. You can demand that your providers tag their items and adhere to your taxonomy - but providers will buck this new requirement unless they see an obvious and immediate benefit. Worse, providers might use tags to game the system - artificially placing themselves in the wrong category to drive more sales. Worst of all, creating the right taxonomy is hard. You have to structure a taxonomy to realistically represent how your customers think about the inventory.

Eventbrite is investigating a tantalizing alternative: using a combination of customer interactions and machine learning to automatically tag and categorize our inventory. As customers interact with our platform - as they search for events and click on and purchase events that interest them - we implicitly gather information about how our users think about our inventory. Search text effectively acts like a tag, and a click on an event card is a vote that the clicked event is representative of that tag. We are able to use this stream of information as training data for a machine learning classification model; and as we receive new inventory, we can automatically tag it with the text that customers will likely use when searching for it.
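
To make the "search text as tag, click as vote" idea concrete, here is a minimal Python sketch. It is not Eventbrite's model: the click log, the event titles, and the function names (train, suggest_tags) are all hypothetical, and a simple word-to-tag vote counter stands in for a real classification model.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (search text, title of the event card that was clicked).
click_log = [
    ("jazz", "Live Jazz Night at the Blue Room"),
    ("jazz", "Sunday Jazz Brunch"),
    ("yoga", "Sunrise Yoga in the Park"),
    ("yoga", "Beginner Yoga Workshop"),
    ("jazz", "Live Jazz Night at the Blue Room"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(log):
    """Each click is a vote: every word in the clicked title votes for the search text."""
    word_tag_votes = defaultdict(Counter)
    for query, title in log:
        for word in set(tokenize(title)):
            word_tag_votes[word][query] += 1
    return word_tag_votes


def suggest_tags(model, title, top_n=1):
    """Tag a new, unseen item by summing the votes attached to its title words."""
    scores = Counter()
    for word in set(tokenize(title)):
        scores.update(model.get(word, {}))
    return [tag for tag, _ in scores.most_common(top_n)]


model = train(click_log)
print(suggest_tags(model, "Late Night Jazz Jam"))
```

A new title that shares no words with any clicked event gets no tags at all, which is one reason a real system would want a proper classifier and far more interaction data than this toy log.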

In this talk I will explain the problem space in depth, along with Eventbrite's approach to solving the problem.

John Berryman is a Senior Software Engineer at Eventbrite, where he is helping build Eventbrite's event discovery platform. He also recently coauthored a tech book, Relevant Search, published by Manning.

When

Wednesday, Apr 10
6:00 PM - 9:00 PM (PDT)

Where

GitHub
88 Colin P Kelly Jr St San Francisco