
EGPAI 2016 - 1st International Workshop on Evaluating General-Purpose AI

Date: 2016-08-30

Deadline: 2016-06-01

Venue: The Hague, The Netherlands

Website: https://users.dsic.upv.es/~flip/EGPAI2016

Topics/Call for Papers

The aim of this workshop is to bring to bear the expertise of a diverse set of researchers to make progress in the evaluation of general-purpose AI systems. To date, most AI systems are tested on specific tasks. However, to be considered truly intelligent, a system should exhibit enough flexibility to learn how to perform a wide variety of tasks, some of which may not be known until after the system is deployed. This workshop will examine formalisations, methodologies and testbenches for evaluating the numerous aspects of this type of general AI system. More specifically, we are interested in theoretical or experimental research focused on the development of concepts, tools and clear metrics to characterise and measure the intelligence, and other cognitive abilities, of general AI agents.
We are interested in questions such as: Can the various tasks and benchmarks in AI provide a general basis for the evaluation and comparison of a broad range of such systems? Can there be a theory of tasks, or of cognitive abilities, that enables a more direct comparison and characterisation of AI systems? How much does the specificity of an AI agent relate to how quickly it can approach optimal performance?
We welcome regular papers, demo papers about benchmarks or tools, and position papers, and we encourage discussion of a broad (non-exhaustive) list of topics:
Analysis and comparisons of AI benchmarks and competitions. Lessons learnt.
Proposals for new general tasks, evaluation environments, workbenches and general AI development platforms.
Theoretical or experimental accounts of the space of tasks, abilities and their dependencies.
Evaluation of development in robotics and other autonomous agents, and cumulative learning in general learning systems.
Tasks and methods for evaluating: transfer learning, cognitive growth, structural self-modification and self-programming.
Evaluation of social, verbal and other general abilities in multi-agent systems, video games and artificial social ecosystems.
Evaluation of autonomous systems: cognitive architectures and multi-agent systems versus general components (machine learning techniques, SAT solvers, planners, etc.).
Unified theories for evaluating intelligence and other cognitive abilities, independently of the kind of subject (humans, animals or machines): universal psychometrics.
Analysis of reward aggregation and utility functions, environment properties (Markov, ergodic, etc.) in the characterisation of reinforcement learning tasks.
Methods supporting automatic generation of tasks and problems with systematically introduced variations.
Better understanding of the characterisation of task requirements and difficulty (energy, time, number of trials needed, etc.), beyond algorithmic complexity.
Evaluation of AI systems using generalised cognitive tests for humans. Computer models taking IQ tests. Psychometric AI.
Application of (algorithmic) information theory, game theory, theoretical cognition and theoretical evolution for the definition of metrics of cognitive abilities.
Adaptation of evaluation tools from comparative psychology and psychometrics to AI: item response theory, adaptive testing, hierarchical factor analysis.
Evaluation methods for multiresolutional perception in AI systems and agents.
