VL 2016 - 5th Workshop on Vision and Language (VL'16)
Topics/Call for Papers
The 5th Workshop on Vision and Language (VL'16) will be held on August 12 and hosted by the 54th Annual Meeting of the Association for Computational Linguistics (ACL) in Berlin, Germany. The workshop is organised by COST Action IC1307, the European Network on Integrating Vision and Language (iV&L Net).
Research involving both language and vision computing spans a variety of disciplines and applications, and goes back a number of decades. More recently, the big data era has given rise to a multitude of tasks in which vision and language are inherently linked. The explosive growth of visual and textual data, both online and in private repositories held by diverse institutions and companies, has created urgent requirements for the search, processing and management of digital content. Solutions for providing access to or mining such data effectively depend on the connection between visual and textual content being made interpretable, and hence on the 'semantic gap' between vision and language being bridged.
One perspective has been the integrated modelling of language and vision, with approaches located at different points along the spectrum between structured cognitive modelling at one end and unsupervised machine learning at the other. State-of-the-art results in many areas are currently being produced at the latter end, in particular by deep learning approaches.
Another perspective is exploring how knowledge about language can help with predominantly visual tasks, and vice versa. Visual interpretation can be aided by text associated with images/videos and by knowledge about the world learned from language. On the NLP side, images can help ground language in the physical world, allowing us to develop better models of semantics. Words and pictures are often naturally linked online and in the real world, and each modality can provide reinforcing information to aid the other.
TOPICS
The 5th Workshop on Vision and Language (VL'16) aims to address all of the above, with a particular focus on the integrated modelling of vision and language. We welcome papers describing original research combining language and vision. To encourage the sharing of novel and emerging ideas, we also welcome papers describing new data sets, grand challenges, open problems, benchmarks and work in progress, as well as survey papers.
Topics of interest include, but are not limited to (in alphabetical order):
Computational modeling of human vision and language
Computer graphics generation from text
Human-computer interaction in virtual worlds
Human-robot interaction
Image and video description and summarization
Image and video labeling and annotation
Image and video retrieval
Language-driven animation
Machine translation with visual enhancement
Medical image processing
Models of distributional semantics involving vision and language
Multi-modal discourse analysis
Multi-modal human-computer communication
Multi-modal temporal and spatial semantics recognition and resolution
Recognition of narratives in text and video
Recognition of semantic roles and frames in text, images and video
Retrieval models across different modalities
Text-to-image generation
Visual question answering / visual Turing challenge
Visually grounded language understanding