
Gen-Deep 2018 - Generalization in the Age of Deep Learning

Date: 2018-06-05

Deadline: 2018-03-19

Venue: New Orleans, USA


Website: https://naacl2018.org/workshops.html

Topics/Call for Papers

Deep learning has brought a wealth of state-of-the-art results and new capabilities. Although methods have achieved near human-level performance on many benchmarks, numerous recent studies suggest that these benchmarks only weakly test their intended purpose, and that simple examples, produced either by humans or machines, cause systems to fail spectacularly. For example, a recently released textual entailment demo was criticized on social media for predicting that

"John killed Mary"

entails

"Mary killed John"

with 92% confidence.
Such surprising failures, combined with the inability to interpret state-of-the-art models, have eroded confidence in our systems. While these systems are not perfect, the real flaw lies with our benchmarks, which do not adequately measure a model's ability to generalize and are thus easily gameable.
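The failure mode above can be illustrated with a toy sketch (hypothetical code, not the criticized demo's actual model): a bag-of-words overlap heuristic, the kind of shallow cue that a gameable benchmark can reward, is blind to argument order and therefore scores the pair symmetrically, wrongly treating the two sentences as entailing each other.

```python
def overlap_score(premise: str, hypothesis: str) -> float:
    """Fraction of hypothesis tokens that also appear in the premise.

    A deliberately shallow heuristic: it ignores word order entirely,
    so swapping subject and object leaves the score unchanged.
    """
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(h & p) / len(h) if h else 0.0

def predicts_entailment(premise: str, hypothesis: str, threshold: float = 0.8) -> bool:
    """Predict 'entailment' whenever lexical overlap is high enough."""
    return overlap_score(premise, hypothesis) >= threshold

premise, hypothesis = "John killed Mary", "Mary killed John"

# The heuristic assigns full overlap in both directions, so it
# "predicts entailment" no matter which sentence is the premise.
assert overlap_score(premise, hypothesis) == 1.0
assert predicts_entailment(premise, hypothesis)
assert predicts_entailment(hypothesis, premise)
```

A benchmark whose entailment pairs mostly share vocabulary can be gamed by exactly this kind of cue, which is why the sentences above make a useful stress test.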
This workshop provides a venue for exploring new approaches to measuring and enforcing generalization in models. We are soliciting work in the following areas:
- Analysis of existing models and their failings
- Creation of new evaluation paradigms, e.g. zero-shot learning, Winograd schemas, and datasets that avoid explicit types of gamification
- Modeling advances: regularization, compositionality, interpretability, inductive bias, multi-task learning, and other methods that promote generalization
Some of our goals are similar in spirit to those of the recent "Build It, Break It" shared task. However, we propose going beyond identifying areas of weakness (i.e., "breaking" existing systems) and discussing scalable evaluations that more rigorously test generalization, as well as modeling techniques for enforcing it.

Last modified: 2018-01-12 06:49:45