Elasticsearch and Fuzzy Name Matching

Basis Technology - One Alewife Center Cambridge Boston
Thu, Jun 18, 2015, 6:00 PM (EDT)

About this event

Basis Technology is graciously hosting this Elasticsearch meetup at their Alewife location. Come enjoy some food, refreshments and networking at 6pm before the talk begins.

Topic: An Elasticsearch Plugin for Simple Fuzzy Name Matching

Normalization is crucial to high-quality search results -- who wants irrelevant variations between queries and documents leading to missed hits (e.g., “celebrity” vs. “celebrities”)? Normalizing dictionary words works, but what if your application focuses on names? Whether you’re tackling log analysis, e-commerce, watch list screening or other applications, names are often the key. Can you find “Abdul Jabbar, Karim” if you search for “Kareem AbdalJabar” or “كريم عبد الجبار”?

Applications using Elasticsearch provide some fuzziness by mixing its built-in edit-distance matching and phonetic analysis with more generic analyzers and filters (see example #1 or #2). We’ve tried to go beyond that to provide both better matching and a simpler integration. We use a custom Mapper and Score Function so that linguistic nuances can be handled behind the scenes. We’ll talk about how we built this sort of plugin for Rosette, how it can be customized, and how it connects to the broader trend of entity-centric search.
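
For readers unfamiliar with the baseline approach mentioned above, here is a minimal sketch of how edit-distance matching and phonetic analysis are commonly combined using only built-in Elasticsearch features. It is not the Rosette plugin and does not show its Mapper or Score Function. The index name, field names, and analyzer names are hypothetical, the phonetic token filter assumes the analysis-phonetic plugin is installed, and the request bodies use the syntax of recent Elasticsearch versions, which differs from the 2015-era syntax.

```python
import json

# Hypothetical index name, chosen for illustration only.
INDEX_NAME = "people"


def build_index_body():
    """Index settings and mapping: a 'name' field plus a phonetic sub-field."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # "phonetic" filter type comes from the analysis-phonetic plugin.
                    "name_phonetic": {
                        "type": "phonetic",
                        "encoder": "double_metaphone",
                        "replace": False,
                    }
                },
                "analyzer": {
                    "name_phonetic_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "name_phonetic"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "name": {
                    "type": "text",
                    "fields": {
                        "phonetic": {
                            "type": "text",
                            "analyzer": "name_phonetic_analyzer",
                        }
                    },
                }
            }
        },
    }


def build_name_query(name):
    """Search body: fuzzy (edit-distance) match OR phonetic match on the name."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Edit-distance matching on the plain field.
                    {"match": {"name": {"query": name, "fuzziness": "AUTO"}}},
                    # Phonetic matching on the sub-field.
                    {"match": {"name.phonetic": {"query": name}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON bodies that would be sent to Elasticsearch.
    print(json.dumps(build_index_body(), indent=2))
    print(json.dumps(build_name_query("Kareem AbdalJabar"), indent=2))
```

Even this baseline takes a custom filter, a custom analyzer, a multi-field mapping, and a compound query, and it would generally not match a Latin-script query against a name written in Arabic script.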

Speaker Bios: 

Brian Sawyer joined Basis in 2010. He is an Engineering Manager and the Product Owner of the Rosette Name Indexer (RNI), using Lucene and Lucene-backed search applications to provide name matching solutions. He holds a B.S. in Computer Science and Cognitive Psychology from Northeastern University.

Chris Mack is the Director of Customer Engineering for text analytics at Basis Technology. Chris's team designs solutions and delivers services to adapt text analytics components to a broad range of customer problems. Chris has spent the last 20 years in software development, data analytics, business strategy, and business operations. Chris received his B.S. in Management from Bentley University, where he also studied Computer Information Systems.


Thursday, Jun 18
6:00 PM - 9:00 PM (EDT)




  • Lindsay Hill, Community Organizer
  • Dan Morgan, Community Organizer
  • Richard Juknavorian, IT Squared, Community Organizer
  • Theron Roe, Community Organizer