ScaNLP 2013 - International Workshop on Scalable Natural Language Processing
Topics/Call for Papers
The first workshop on Scalability in Natural Language Processing
This workshop, held with RANLP 2013, aims to introduce contemporary work and to discuss novel methods for natural language processing at a large scale, and explore how the resulting technology and methods can be reused in applications both on the Web and in the physical world.
What is scalable NLP?
For a processing approach to be scalable, it should be able to take on large volumes of data, work through them at high speed, and adapt smoothly to changes in these needs. We discuss this in the context of NLP, with particular focus on the core tasks of resource creation, discourse processing, and evaluation.
Why is this workshop timely?
Now is a particularly important time to develop scalable methods in our field. Big data is new, and the benefits of processing it effectively remain to be harvested by pioneers. Huge datasets are becoming available: Google Books contains 155 billion tokens, over which only shallow surveys have been conducted; the new Common Crawl web corpus contains over 60 terabytes of text and metadata. But size is not the only driver for scalable methods; the rapid rate of text creation compounds the problem. For example, Twitter alone is responsible for 500 million microtexts every day, and Wordpress.org hosts a large proportion of the 2 million blog documents we create every 24 hours.
Not only is big text data becoming prolific; demand for it is also high. The fast, un-curated nature of microtext has been shown by multiple researchers to be of value in stock valuation. User location and movement analysis enables powerful search and analysis modes, such as computational journalism and personalisation. Sentiment detection informs corporations, governance and political activities. Media monitoring requires extracting and co-referring entities and events from thousands of outlets in real time. And finally, the emerging field of deep learning places but one core demand in all its guises: large amounts of data. The pressures from all these applications create a demand for NLP that can be done quickly and broadly.
Why is this topic important?
Now more than ever, there is demand for scalable natural language processing. Many organisations are interested in the potential results as big data becomes better defined and data-intensive approaches to computational linguistics reach production-level performance. Enormous quantities of data, from user input to news archives, are being mined using more powerful and computationally demanding techniques.
Newly introduced data-intensive approaches to computational linguistics continue to thrive on input volume; we need scalable technology to handle the next order of magnitude in corpus sizes and continue advancing our field.
Technical issues
With regard to scalable NLP, we aim to encourage discussion of three key areas of natural language processing: resource creation, processing of discourse, and evaluation.
Topics include but are not limited to:
- General scalability issues
  - Application approaches
  - Performance limits
- Flexible resource creation
  - Parallelising annotation
  - Handling huge corpora
  - Crowdsourcing for corpus creation
  - Decomposing resource creation tasks
  - Rapid or realtime annotation quality assessment
- Scalable processing
  - Running NLP in the cloud
  - Privacy issues
  - NLP application parallelisation
  - NLP application optimisation
  - Scalable machine learning for NLP
  - High performance computing for NLP results
- Rapid evaluation
  - On-line learning for NLP
  - Reinforcement learning
  - Iterative and ensemble learning
  - Hypothesis generation
In addition to the invited talk and presentations, we intend to include a 30-minute hands-on demonstration slot with participants doing NLP in the cloud using GATECloud, possibly including social media processing using GATE TwitIE (supported and funded by the organisers).
Last modified: 2013-03-30 21:46:14