BUCC 2014 - 7th Workshop on Building and Using Comparable Corpora



Venue: Reykjavik, Iceland



Topics/Call for Papers

We solicit contributions including but not limited to the following topics.
Topics related to the special theme:
Methods and tools for collecting and processing MT data, including crowdsourcing
Methods and tools for quality control
Tools for efficient annotation
Bilingual term and named entity collections
Multilingual treebanks, wordnets, propbanks, etc.
Comparable corpora with parallel units annotated
Comparable corpora for under-resourced languages and specific domains
Multilingual corpora with rich annotations: POS tags, NEs, dependencies, semantic roles, etc.
Data for special applications: patent translation, movie subtitles, MOOCs, meetings, chat-rooms, social media, etc.
Legal issues with collecting and redistributing data and generating derivatives
Building Comparable Corpora:
Human translations
Automatic and semi-automatic methods
Methods to mine parallel and non-parallel corpora from the Web
Tools and criteria to evaluate the comparability of corpora
Parallel vs non-parallel corpora, monolingual corpora
Rare and minority languages, across language families
Multi-media/multi-modal comparable corpora
Applications of comparable corpora:
Human translations
Language learning
Cross-language information retrieval & document categorization
Bilingual projections
Machine translation
Writing assistance
Mining from Comparable Corpora:
Extraction of parallel segments or paraphrases from comparable corpora
Extraction of bilingual and multilingual translations of single words and multi-word expressions; proper names, named entities, etc.

Last modified: 2014-01-04 15:08:29