Best Paper Award at Swiss Data Science Conference 2020
The paper “Annotating Web Tables through Knowledge Bases: A Context-Based Approach” by Yasamin Eslahi, Akansha Bhardwaj, Paolo Rosso, Kurt Stockinger and Philippe Cudré-Mauroux received the Best Paper Award at the 7th Swiss Data Science Conference in June 2020.
Abstract of the Paper: The Web has a collection of over 150 million tables, which as a whole represents an invaluable source of semi-structured knowledge. Such tables are commonly referred to as Web tables, and are considerably easier to leverage in automated processes than completely unstructured, free-format text. Understanding the semantics of Web tables is important since they are used in various applications like knowledge base augmentation, information retrieval or natural language interfaces for databases. The task of understanding the semantics of a given Web table is known as Web table annotation. In recent years, it has been tackled through methods where the table is enriched using existing knowledge bases containing valuable information on the domain at hand, its entities and their mutual relationships.
In this paper, we present two novel and unsupervised Web table annotation methods, which leverage the context of the tables to better capture their semantics. Our first method is lookup-based and exploits text similarity to find reference entities in the knowledge base. The second method uses distributional vector representations – a.k.a. embeddings – of the Web tables to elicit their context and disambiguate their semantics. Experiments show that our proposed approach outperforms the state of the art in Web table annotation by up to 18%. Another contribution of this work is a manually corrected version of one of the popular gold standard datasets, Limaye, with annotations from DBpedia. Our dataset and code are publicly available.
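To give a rough feel for what a lookup-based annotation step can look like, here is a minimal Python sketch. It is not the authors' implementation: the toy knowledge base `TOY_KB`, the function names, the similarity threshold, and the use of `difflib.SequenceMatcher` as the text-similarity measure are all assumptions made for illustration, and the sketch ignores the table context that the paper's methods exploit.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a knowledge base: entity identifier -> label.
# These entries are illustrative only and are not taken from the paper's data.
TOY_KB = {
    "dbr:Zurich": "Zurich",
    "dbr:Geneva": "Geneva",
    "dbr:Bern": "Bern",
    "dbr:Basel": "Basel",
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def lookup_entity(cell: str, kb=TOY_KB, threshold: float = 0.6):
    """Return (entity_id, score) for the most similar label, or None.

    The threshold is an arbitrary assumption; cells whose best match
    scores below it are left unannotated.
    """
    best_id, best_score = None, 0.0
    for entity_id, label in kb.items():
        score = similarity(cell, label)
        if score > best_score:
            best_id, best_score = entity_id, score
    if best_score < threshold:
        return None
    return best_id, best_score


def annotate_column(cells, kb=TOY_KB):
    """Annotate every cell of a table column independently."""
    return [lookup_entity(cell, kb) for cell in cells]


if __name__ == "__main__":
    # A misspelled cell can still be linked; an unknown one stays unannotated.
    for cell, match in zip(["Zurich", "Genva", "Tokyo"],
                           annotate_column(["Zurich", "Genva", "Tokyo"])):
        print(cell, "->", match)
```

A pure string lookup like this cannot tell apart entities that share a label, which is the kind of ambiguity the abstract says table context and embeddings are used to resolve.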
The full paper can be found at: https://digitalcollection.zhaw.ch/handle/11475/20229