
Pascal Exploration & Exploitation Challenge 2011

Date: 2011-06-28

Deadline: 2011-04-29

Venue: Washington, USA


Topics/Call for Papers

Call for submissions:

Pascal Exploration & Exploitation Challenge 2011

http://explo.cs.ucl.ac.uk/

Dataset

The dataset, provided by Adobe, comprises 20 million records. It simulates the clicks of visitors on content presented to them on a given website over a period of 9 months, during which 6 different pieces of content (“options”) were served at random. The overall Click-Through Rate (CTR) of each option is the same. The first 120 attributes of a record are features that characterise a visitor, for instance: demographics, the number of times they have looked at a particular page on the site, time of day, week, year, etc. The penultimate attribute is the id of the option, between 1 and 6, that was served to that visitor. The last attribute is a boolean that indicates whether or not the visitor clicked.
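To make the record layout concrete, here is a minimal Java sketch of how one such record might be parsed, assuming a comma-separated line with the 120 visitor features first, then the option id, then the click indicator encoded as 0/1. The class name and the CSV assumption are illustrative only; the actual file format is whatever the organisers specify.

    // Hypothetical record holder: 120 visitor features, the served
    // option (1..6), and the click indicator.
    public final class Record {
        public final double[] features;
        public final int option;
        public final boolean clicked;

        public Record(double[] features, int option, boolean clicked) {
            this.features = features;
            this.option = option;
            this.clicked = clicked;
        }

        // Parse one line, assuming comma-separated values with the click
        // indicator encoded as 0/1 (an assumption, not the official format).
        public static Record parse(String line) {
            String[] parts = line.split(",");
            double[] f = new double[parts.length - 2];
            for (int i = 0; i < f.length; i++) {
                f[i] = Double.parseDouble(parts[i]);
            }
            int option = Integer.parseInt(parts[parts.length - 2]);
            boolean clicked = Integer.parseInt(parts[parts.length - 1]) == 1;
            return new Record(f, option, clicked);
        }
    }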

Task

Each algorithm submitted to the Challenge will perform a sequence of “iterations”. At each iteration, a batch of 6 visitor-option pairs is given to the algorithm. All 6 visitors are different and have never been seen before, options range from 1 to 6, and at least one of the 6 pairs is associated with a click. The algorithm has to predict which visitor-option pair is most likely to be associated with a click. If, for the chosen visitor-option pair, the corresponding record in the dataset did indicate a click, then a reward of 1 is given; otherwise, a reward of 0 is given. The objective of the algorithm is to maximise its cumulative reward (“score”). Note that, because the CTRs of the options are the same, it is necessary to use the visitor features in order to make better predictions than random. Also, visitor responses may change over time, making it essential to keep learning their interests.
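Since submissions must be written in Java (see Evaluation Process below), the following sketch illustrates one simple way to approach the task: an epsilon-greedy policy that keeps one online logistic-regression model per option, scores each of the 6 pairs in a batch, and updates the chosen option's model with the observed 0/1 reward. The class name, the epsilon-greedy exploration, and the per-option logistic model are all illustrative assumptions, not part of the challenge specification or its official Java interface.

    import java.util.Random;

    // A minimal sketch of the kind of policy the task calls for: score
    // each of the 6 visitor-option pairs in the batch, pick the most
    // promising one, and learn online from the single observed reward.
    public final class EpsilonGreedyPolicy {
        private static final int NUM_OPTIONS = 6;
        private static final int NUM_FEATURES = 120;

        private final double[][] weights = new double[NUM_OPTIONS][NUM_FEATURES];
        private final double epsilon;      // exploration rate (assumed value)
        private final double learningRate; // step size for the online update
        private final Random rng = new Random();

        public EpsilonGreedyPolicy(double epsilon, double learningRate) {
            this.epsilon = epsilon;
            this.learningRate = learningRate;
        }

        // features[i] is the feature vector of visitor i; options[i] is the
        // option (1..6) paired with that visitor. Returns the index of the
        // chosen pair within the batch.
        public int choose(double[][] features, int[] options) {
            if (rng.nextDouble() < epsilon) {
                return rng.nextInt(features.length); // explore
            }
            int best = 0;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < features.length; i++) {
                double score = predict(features[i], options[i]);
                if (score > bestScore) {
                    bestScore = score;
                    best = i;
                }
            }
            return best; // exploit
        }

        // One step of online logistic regression on the chosen option's
        // model, using the 0/1 reward as the label.
        public void update(double[] features, int option, int reward) {
            double p = predict(features, option);
            double error = reward - p;
            double[] w = weights[option - 1];
            for (int j = 0; j < w.length; j++) {
                w[j] += learningRate * error * features[j];
            }
        }

        // Predicted click probability for one visitor-option pair.
        private double predict(double[] features, int option) {
            double z = 0.0;
            double[] w = weights[option - 1];
            for (int j = 0; j < w.length; j++) {
                z += w[j] * features[j];
            }
            return 1.0 / (1.0 + Math.exp(-z)); // sigmoid
        }
    }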

Evaluation Process

Algorithms will have to be implemented in Java and submitted to our web application as JAR archives, as indicated on the Getting Started page. The evaluation will start with a first phase corresponding to the first 500,000 records of the dataset. Submissions will be evaluated as they are received and the scores will be posted on a leaderboard. The deadline for final submissions for this first phase is 29 April 2011. Afterwards, participants will be given access to the data used for evaluation and will be able to update their algorithms until 20 May 2011, if they wish. We will then proceed to the second evaluation phase, in which we will run the algorithms on the rest of the dataset. The final scores will be announced during our ICML workshop. The best challenge entrants will be expected to present their algorithms at the workshop and to describe them in a way that would allow others to reproduce their work.

Phase 1

The evaluation for phase 1 will be performed on 500,000 iterations, with about 100 ms allowed per iteration. Batches of visitor-option pairs will be taken from this subset of the data in chronological order.
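Purely as a usage illustration, here is how an evaluation loop of this shape could drive the policy sketched above over 500,000 iterations. The synthetic batch and click generators below merely stand in for the real data access, which is handled by the organisers' harness; only the loop structure (choose, observe reward, update, accumulate) reflects the task description.

    import java.util.Random;

    // Hedged sketch of a phase 1 style evaluation loop: one choice per
    // iteration, reward 0/1, cumulative score over 500,000 iterations.
    public final class EvaluationLoopSketch {
        private static final Random RNG = new Random(42);

        public static void main(String[] args) {
            EpsilonGreedyPolicy policy = new EpsilonGreedyPolicy(0.1, 0.01);
            long score = 0;
            for (int t = 0; t < 500_000; t++) {
                double[][] features = syntheticBatch(6, 120); // stand-in for 6 real visitors
                int[] options = {1, 2, 3, 4, 5, 6};           // one pairing per option
                int choice = policy.choose(features, options);
                int reward = syntheticClick();                // stand-in for the dataset lookup
                policy.update(features[choice], options[choice], reward);
                score += reward;
            }
            System.out.println("Cumulative reward: " + score);
        }

        // Random placeholder features, standing in for real visitor records.
        private static double[][] syntheticBatch(int n, int d) {
            double[][] batch = new double[n][d];
            for (double[] row : batch) {
                for (int j = 0; j < d; j++) {
                    row[j] = RNG.nextDouble();
                }
            }
            return batch;
        }

        // Arbitrary synthetic click rate; the real reward comes from the data.
        private static int syntheticClick() {
            return RNG.nextDouble() < 0.2 ? 1 : 0;
        }
    }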

Participants will be able to submit as many times as they want until 29 April (submitting while a previous submission is still being evaluated will stop that evaluation and start a new one). Results will be given within 36 hours of submission and the leaderboard will be updated immediately. Depending on the first results, we may decide to provide more data at the end of March, so that participants can learn more about the problem.

Once the deadline has passed, the leaderboard will be frozen and the 3 best participants will be invited to present their work at the workshop and to submit a contributed paper. Submissions for phase 2 will then open, and the data for phase 1 will be made available for download.

Phase 2

From 29 April until 20 May, each participant will be able to submit a new algorithm to be considered for phase 2. Unlike in phase 1, no evaluation will be performed upon receipt of a new submission. Evaluations will start once the deadline has passed, and the final results will be given at the workshop. Up until the beginning of phase 2, we reserve the right to change the allowed time per iteration and the number of data points used for evaluation, based on the results of phase 1. Note that if a participant does not submit a new algorithm for phase 2, their phase 1 algorithm will be started “from scratch”, not in the state it was in at the end of phase 1.
