
QALD 2016 - 6th Open Challenge on Question Answering over Linked Data

Date: 2016-05-29 - 2016-06-02

Deadline: 2016-03-11

Venue: Heraklion, Crete, Greece

Website: https://2016.eswc-conferences.org/6th-op...

Topics/Call for Papers

The past years have seen a growing amount of research on question answering over Semantic Web data, shaping an interaction paradigm that allows end users to profit from the expressive power of Semantic Web standards while at the same time hiding their complexity behind an intuitive and easy-to-use interface. The Question Answering over Linked Data challenge provides an up-to-date benchmark for assessing and comparing systems that mediate between a user, expressing his or her information need in natural language, and RDF data.
The key challenge for question answering over linked data is to translate the user's information need into a form that can be evaluated using standard Semantic Web query processing and inferencing techniques. In order to focus on specific aspects of this problem and the challenges involved, QALD comprises three tasks: multilingual question answering over RDF data, hybrid question answering over both RDF and free-text data, and question answering over statistical data in RDF data cubes.
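To make this concrete, the following is a minimal sketch (not part of the official challenge material) of how a translated information need is evaluated against DBpedia. The SPARQL query is hand-written here for an example question; producing such a query automatically from the natural-language input is precisely what participating systems must do. The sketch assumes Python with the requests library and the public DBpedia endpoint.

    # Minimal sketch: the query below is a hand-written translation of the
    # example question; automating this translation is the challenge task.
    import requests

    QUESTION = "Who is the mayor of Berlin?"

    # Hand-crafted SPARQL equivalent of the question (illustrative only).
    SPARQL_QUERY = """
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT DISTINCT ?mayor WHERE {
        dbr:Berlin dbo:leader ?mayor .
    }
    """

    def answer(query: str) -> list:
        """Evaluate a SPARQL query against the public DBpedia endpoint."""
        response = requests.get(
            "http://dbpedia.org/sparql",
            params={"query": query, "format": "application/sparql-results+json"},
            timeout=30,
        )
        response.raise_for_status()
        bindings = response.json()["results"]["bindings"]
        return [row["mayor"]["value"] for row in bindings]

    if __name__ == "__main__":
        print(QUESTION)
        for uri in answer(SPARQL_QUERY):
            print(" ->", uri)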
The main goal is to gain insights into the strengths and shortcomings of different approaches and into possible solutions for coping with the heterogeneous and distributed nature of Semantic Web data.
Target Audience
QALD targets all researchers and practitioners working on querying linked data, natural language processing for question answering, multilingual information retrieval and related topics.
Tasks:
Task 1: Multilingual Question Answering
Given the diversity of languages used on the web, there is a pressing need to facilitate multilingual access to semantic data. The core task of QALD is thus to retrieve answers from an RDF data repository given an information need expressed in a variety of natural languages. The underlying RDF dataset is DBpedia 2015. The training data consists of 350 questions available in eight different languages (English, Spanish, German, Italian, French, Dutch, Romanian, and Farsi). The questions are general, open-domain factual questions that vary in complexity. Each question is annotated with a manually specified SPARQL query and its answers. The test dataset will consist of 100 similar questions. Training data: http://greententacle.techfak.uni-bielefeld.de/~cun...
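The authoritative format of the training data is defined by the files linked above; the snippet below is only a hypothetical sketch of one annotated item, with invented field names, meant to illustrate how a multilingual question, its manually specified SPARQL query, and the gold answers belong together.

    # Hypothetical sketch of a single Task 1 training item (the real file
    # format is defined by the training data; field names are invented).
    example_item = {
        "id": 1,
        "question": {
            "en": "Who is the mayor of Berlin?",
            "de": "Wer ist der Bürgermeister von Berlin?",
            "es": "¿Quién es el alcalde de Berlín?",
        },
        # Manually specified SPARQL query over DBpedia 2015:
        "sparql": (
            "PREFIX dbo: <http://dbpedia.org/ontology/> "
            "PREFIX dbr: <http://dbpedia.org/resource/> "
            "SELECT DISTINCT ?uri WHERE { dbr:Berlin dbo:leader ?uri . }"
        ),
        # Gold answers are the result bindings of the query (placeholder here):
        "answers": ["<gold DBpedia resource URI(s)>"],
    }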
Task 2: Hybrid Question Answering
A lot of information is still available only in textual form, both on the web and in the form of labels and abstracts in linked data sources. Therefore, approaches are needed that can deal not only with the specific character of structured data but also with finding information in several sources, processing both structured and unstructured information, and combining the gathered information into one answer. QALD therefore includes a task on hybrid question answering, asking systems to retrieve answers for questions that require the integration of data from both RDF and textual sources. The task builds on DBpedia 2015 as the RDF knowledge base, together with its abstracts and optionally the English Wikipedia as textual data source. The training data comprises 50 English questions, each annotated with answers as well as a pseudo query that indicates which information can be obtained from RDF data and which from free text. As test questions, we will provide 50 similar questions. Training data: http://greententacle.techfak.uni-bielefeld.de/~cun...
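The precise pseudo-query notation is defined by the training data; the following hypothetical sketch (with invented field names and notation) only illustrates the idea of marking which condition can be answered from DBpedia triples and which must be verified against abstracts or Wikipedia text.

    # Hypothetical illustration of a hybrid question split into an RDF part
    # and a free-text part (the real pseudo-query notation is defined by the
    # training data; this structure is invented for illustration).
    hybrid_example = {
        "question": "Which films starring Brad Pitt were based on a novel?",
        # Answerable from DBpedia triples:
        "rdf_part": "?film dbo:starring dbr:Brad_Pitt .",
        # Must be verified against the film's abstract / Wikipedia text:
        "text_part": '?film text:"based on the novel"',
    }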
Task 3: Statistical Question Answering over RDF Data Cubes
As a new task, QALD provides a benchmark focusing on multi-dimensional, statistical data comprising several datasets from LinkedSpending, which publishes government spending as linked data modeled according to the RDF Data Cube vocabulary. Question answering over this kind of data poses challenges that differ from general, open-domain question answering as represented by the two tasks above, with respect to both the structure of the data and the amount of aggregation necessary to answer information needs. The training question set consists of 100 questions compiled in the CubeQA project, annotated with SPARQL queries and answers. As test data, we will provide 50 additional questions. Training data: http://greententacle.techfak.uni-bielefeld.de/~cun...
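The concrete datasets, dimensions, and measures come from LinkedSpending; the sketch below uses the standard qb: vocabulary but invented ls: identifiers, and only illustrates the kind of aggregation such questions typically require.

    # Hypothetical sketch of an aggregating query over an RDF data cube, e.g.
    # for a question like "How much did this budget spend in total?".
    # The ls: dataset and property IRIs are invented; only qb: is standard.
    CUBE_QUERY = """
    PREFIX qb: <http://purl.org/linked-data/cube#>
    PREFIX ls: <http://linkedspending.aksw.org/ontology/>
    SELECT (SUM(?amount) AS ?total) WHERE {
        ?obs a qb:Observation ;
             qb:dataSet ls:example-dataset ;   # invented dataset IRI
             ls:amount ?amount .               # invented measure property
    }
    """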
