Data Ingestion and Entity Linking

Waterloo & Kitchener

Nov 7, 2018, 11:00 PM – Nov 8, 2018, 1:30 AM


About this event

Join us for our upcoming meetup on November 7 at Green Brick Labs. Thank you very much to Green Brick Labs for hosting us!

We'll have two talks this evening (doors open at 6pm, and talks will start at 6:30pm):

1. "High volume time-series data ingestion on a budget"
Claus Strommer, Senior Security and IT Engineer, Green Brick Labs

Summary: Pushing a high volume of events per second to Elasticsearch can be a daunting task when the hardware budget is limited. We will cover common pitfalls that operators run into when attempting to scale their ingestion rate, and provide tips and tricks for getting the most out of your budget.

About the Speaker: Claus Strommer is a Senior Security and IT Engineer at Green Brick Labs, where he uses the ELK stack for logging, monitoring, and dissecting performance and security events. He also maintains an academic interest in using Elasticsearch for natural language processing and machine learning.

2. "Entity Linking@Scale Over Heterogeneous Datasets"
Atif Khan, Vice President of AI & Data Science at Messagepoint Inc.

Summary: Data is a manifestation of information about real-world entities that exhibit complex relationships along heterogeneous sets of attributes. Given the diversity of information (schema, representation, etc.) across different datasets, a robust approach is required to uniquely identify entities. In this talk we will review the use of Elasticsearch as a scalable entity linking/deduplication tool. We will present the high-level architecture and design of such a system and then review its application in two major use cases: data deduplication and attribute-based link discovery.

About the Speaker: Atif Khan is the Vice President of AI & Data Science at Messagepoint Inc. Under his guidance, his team is focused on harnessing the power of AI to enhance CCM and CX use cases, redefining "Content Intelligence". Atif has enjoyed success in building various AI-enabled Big Data platforms, combining core AI capabilities (such as machine learning, probabilistic knowledge inference, and relational graph-based data modelling) with cutting-edge distributed compute technologies (like Hadoop, Spark, HBase, Elastic, Titan, etc.) to facilitate big data integration, visualization, analytics, and knowledge engineering.
Twitter: @__AtifKhan



November 7 – 8, 2018
11:00 PM – 1:30 AM UTC
