
robus 2011 - Workshop on Robust Unsupervised and Semisupervised Methods in Natural Language Processing

Date: 2011-09-12

Deadline: 2011-05-10

Venue: Hissar, Bulgaria


Topics/Call for Papers

In natural language processing (NLP), supervised learning scenarios are more frequently explored than unsupervised or semi-supervised ones. Unfortunately, labeled data are often highly domain-dependent and in short supply. It has therefore become increasingly important to leverage both labeled and unlabeled data to achieve the best performance on challenging NLP problems that involve learning structured variables.

Until recently, most results in semi-supervised learning of structured variables in NLP were negative (Abney, 2008), but today the best part-of-speech taggers (Suzuki et al., 2008), named entity recognizers (Turian et al., 2010), and dependency parsers (Sagae and Tsujii, 2007; Suzuki et al., 2009; Søgaard and Rishøj, 2010) exploit mixtures of labeled and unlabeled data. Unsupervised and minimally supervised NLP is also growing rapidly.

The most commonly used semi-supervised learning algorithms in NLP are feature-based methods (Koo et al., 2008; Sagae and Gordon, 2009; Turian et al., 2010) and EM, self-training, or co-training (Mihalcea, 2004; Sagae and Tsujii, 2007; Spoustova et al., 2009). Mixture models have also been used successfully (Suzuki and Isozaki, 2008; Suzuki et al., 2009). While feature-based methods seem relatively robust, self-training and co-training are very parameter-sensitive, and parameter tuning has therefore become an important research topic (Goldberg and Zhu, 2009). This is a concern not only in NLP but also in other areas such as face recognition, e.g. Yan and Wang (2009). Parameter sensitivity is even more dramatic in unsupervised learning of structured variables, e.g. unsupervised part-of-speech tagging and grammar induction.

By more robust unsupervised or semi-supervised learning algorithms we mean algorithms with few parameters that give good results across different data sets and different applications.

Specifically, we encourage submissions on the following topics:
assessing robustness of known or new unsupervised or semi-supervised methods across different NLP problems or languages
new unsupervised or semi-supervised methods for NLP problems
positive and negative results on the use of unsupervised or semi-supervised methods in applications
application-oriented evaluation of unsupervised or semi-supervised methods
comparison and combination of unsupervised or semi-supervised methods

This workshop aims to bring together researchers dedicated to designing and evaluating robust unsupervised or semi-supervised learning algorithms for NLP problems. These problems include, but are not limited to, POS tagging, grammar induction and parsing, named entity recognition, word sense induction and disambiguation, machine translation, sentiment analysis, and taxonomy learning. Our goal is to evaluate known unsupervised and semi-supervised learning algorithms, foster novel and more robust ones, and discuss positive and negative results that might otherwise not appear in a technical paper at a major conference. We welcome submissions that address the robustness of unsupervised or semi-supervised learning algorithms for NLP, and especially encourage authors to provide results for different data sets, languages, or applications.

https://sites.google.com/site/robus2011/

Last modified: 2011-04-14 20:43:00