Elasticsearch and Fuzzy Name Matching

Washington

Jun 25, 2015, 10:00 PM – Jun 26, 2015, 12:00 AM


About this event

We're pleased to announce an upcoming Elasticsearch meetup! This meetup features two talks.

Talk 1: "An Elasticsearch Plugin for Simple Fuzzy Name Matching"

Normalization is crucial to high-quality search results -- who wants irrelevant variations between queries and documents leading to missed hits (e.g., “celebrity” vs. “celebrities”)? Normalizing dictionary words works, but what if your application focuses on names? Whether you’re tackling log analysis, e-commerce, watchlist screening, or other applications, names are often the key. Can you find “Abdul Jabbar, Karim” if you search for “Kareem AbdalJabar” or “كريم عبد الجبار”? Applications using Elasticsearch provide some fuzziness by mixing its built-in edit-distance matching and phonetic analysis with more generic analyzers and filters (see example #1 or #2).
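As a rough illustration of the built-in approach described above, the following Python sketch builds the JSON bodies an application might send to Elasticsearch. It is not taken from the talk. The field names `full_name` and `full_name.phonetic`, the filter and analyzer names, and the helper `build_name_query` are hypothetical. The `phonetic` token filter assumes the analysis-phonetic plugin is installed, and the sketch assumes `full_name.phonetic` has been mapped as a sub-field that uses the phonetic analyzer.

```python
import json

# Index settings: a custom analyzer that lowercases tokens and then
# replaces them with phonetic codes. The "phonetic" filter type comes
# from the analysis-phonetic plugin (assumed to be installed).
INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "filter": {
                "name_phonetic": {  # hypothetical filter name
                    "type": "phonetic",
                    "encoder": "double_metaphone",
                    "replace": True,
                }
            },
            "analyzer": {
                "name_phonetic_analyzer": {  # hypothetical analyzer name
                    "tokenizer": "standard",
                    "filter": ["lowercase", "name_phonetic"],
                }
            },
        }
    }
}


def build_name_query(name):
    """Combine edit-distance and phonetic matching in one bool query."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Edit-distance matching on the plain name field
                    # (hypothetical field "full_name").
                    {"match": {"full_name": {"query": name, "fuzziness": "AUTO"}}},
                    # Phonetic matching on a sub-field assumed to use
                    # the phonetic analyzer defined above.
                    {"match": {"full_name.phonetic": {"query": name}}},
                ]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_name_query("Kareem AbdalJabar"), indent=2))
```

Even in this small form, the application has to keep the analyzer, the sub-field, and the query clauses in sync by hand, and neither clause addresses cross-script queries such as the Arabic example.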

We’ve tried to go beyond that to provide both better matching and a simpler integration. We use a custom Mapper and Score Function so that linguistic nuances can be handled behind the scenes. We’ll talk about how we built this sort of plug-in for Rosette, how it can be customized, and how it connects to the broader trend of entity-centric search.

Graham Morehead studied physics and computer science formally, and linguistics more informally.  He has 20 years of technology development experience spanning fields such as speech recognition, mobile apps, and complex adaptive systems.

Talk 2: "Content access and processing using Aspire for Elasticsearch"

An overview of Aspire and how it is being used in several projects to access, clean, enrich, and normalize content before ingestion by Elasticsearch, in order to improve search relevancy, navigation, and/or analytics.

John-Henry Gross is a Product Manager. He has worked in the software business for over 25 years for companies such as Adobe, Sun, Convera, MetaCarta, and now Search Technologies. His diverse career includes positions in engineering, training, marketing, and product management.

When

June 25 – 26, 2015
10:00 PM – 12:00 AM UTC

Organizer

  • Subash Thota

    Community Organizer
