NIPS Experiment Analysis

Sorry for the relative silence on the NIPS experiment. Corinna and I have both done some analysis on the data. Over the Christmas break I focused on the ‘raw numbers’ that people have been discussing. In particular, I wanted to quantify the certainty we should attach to those numbers. There are a couple of ways of doing this: bootstrap resampling or a Bayesian analysis. I went for the latter. Corinna has also been doing a lot of work on how the scores correlate, and the ball is in my court to pick up on that. Before doing so, however, I wanted to complete the initial Bayesian analysis of the data. In doing so, we’re also releasing a little more information on the numbers.
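To illustrate the bootstrap alternative mentioned above, here is a minimal sketch. The decision data it expects are hypothetical stand-ins, not the real study numbers, and the function name is mine:

```python
import random

def bootstrap_interval(decisions, n_boot=10000, seed=0):
    """Percentile-bootstrap 95% interval for the agreement proportion.

    `decisions` is a list of (committee_1, committee_2) decision pairs.
    Any data passed in is hypothetical, standing in for the real
    study numbers.
    """
    random.seed(seed)
    n = len(decisions)
    agreements = []
    for _ in range(n_boot):
        # Resample the papers with replacement and recompute agreement.
        sample = [random.choice(decisions) for _ in range(n)]
        agreements.append(sum(a == b for a, b in sample) / n)
    agreements.sort()
    return agreements[int(0.025 * n_boot)], agreements[int(0.975 * n_boot)]
```

Either route produces an interval summarising how much the raw proportion could move under resampling; the Bayesian version additionally lets you place a prior on the underlying probabilities.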

The headline figure is that if we re-ran the conference, we would expect anywhere between 38% and 64% of the presented papers to have been the same. Several commentators identified this as the figure attendees are really interested in. Of course, when you think about it, you also realise it is a difficult figure to estimate: the power of the study is reduced because the figure is based only on papers that received at least one accept (rather than the full 168 papers used in the study).
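To give a flavour of the kind of Bayesian calculation involved (a sketch only: the function name is mine, the counts in the example are made up, and the actual model is in the notebook), a uniform Beta prior on the ‘accepted again’ probability gives a Beta posterior whose tails yield a credible interval:

```python
import random

def accept_again_interval(k, n, n_samples=100000, seed=0):
    """95% credible interval for the probability that a paper accepted
    once would be accepted again.

    Suppose k of n such papers were also accepted by the second
    committee. A uniform Beta(1, 1) prior then gives a
    Beta(k + 1, n - k + 1) posterior, summarised here by sampling.
    The counts used are hypothetical, not the real study data.
    """
    random.seed(seed)
    samples = sorted(random.betavariate(k + 1, n - k + 1)
                     for _ in range(n_samples))
    return samples[int(0.025 * n_samples)], samples[int(0.975 * n_samples)]
```

With only the small subset of papers that received at least one accept contributing, the posterior is wide, which is why the headline interval spans a range as broad as 38% to 64%.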

Anyway, details of the Bayesian analysis are available in a Jupyter notebook on GitHub.

Proceedings of Machine Learning Research

Back in 2006, when the wider machine learning community was becoming aware of Gaussian processes (mainly through the publication of the Rasmussen and Williams book), Joaquin Quinonero Candela, Anton Schwaighofer and I organised the Gaussian Processes in Practice workshop at Bletchley Park. We planned a short proceedings for the workshop, but when I contacted Springer’s LNCS series, a rather dismissive note came back quoting a prohibitive cost. Given that the ranking of LNCS wasn’t (and never has been) that high, this seemed a little presumptuous on their part. In response I contacted JMLR and asked if they’d ever considered a proceedings track. The result was that Leslie Pack Kaelbling asked me to launch one.

JMLR is not just open access: there is also no charge to authors. It is hosted on servers at MIT and managed by the community.

We launched the proceedings in March 2007 with the first volume, from the Gaussian Processes in Practice workshop. Since then there have been 38 volumes, including two currently in the pipeline. The proceedings publishes several leading machine learning conferences, including AISTATS, COLT and ICML.

From the start we felt that it was important to share the branding of JMLR with the proceedings, to show that the publication was following the same ethos as JMLR. However, this led to the rather awkward name: JMLR Workshop and Conference Proceedings, or JMLR W&CP. Following discussion with the senior editorial board of JMLR we now feel the time is right to rebrand with the shorter “Proceedings of Machine Learning Research”.

As part of the rebranding process, the editorial team for the Proceedings of Machine Learning Research (which consists of Mark Reid and myself) is launching a small consultation exercise looking for suggestions on how we can improve the service for the community. Please feel free to leave comments on this blog post, or via Facebook or Twitter, to give us your feedback!

The NIPS Experiment

Just back from NIPS, where it was really great to see the results of all the work everyone put in. I really enjoyed the program and thought the quality of the presented work was very strong. Corinna and I were particularly impressed by the effort oral presenters put in to make their work accessible to such a large and diverse audience.

We also released some of the figures from the NIPS experiment, and there was a lot of discussion at the conference about what the result meant.

As we announced at the conference, the consistency figure was 25.9%. I just wanted to confirm that, in the spirit of openness we’ve pursued across the entire conference process, Corinna and I will provide a full write-up of our analysis and conclusions in due course!

Some of the commentary in the existing debate is missing the background information we’ve tried to make available, so I wanted to write a post summarising that information to highlight its availability.

Scicast Question

With the help of Nicolo Fusi, Charles Twardy and the entire Scicast team, we launched a Scicast question a week before the results were revealed. The comment thread for that question already contained some interesting discussion before the conference. For the record: before we began reviewing, Corinna forecast this figure would be 25% and I forecast it would be 20%. A box plot summarising the Scicast predictions is below.

[Figure: box plot of Scicast forecasts]

Comment at the Conference

There was also a good deal of debate at the conference about what the results mean. A few attempts to answer this question (based only on the inconsistency score and the expected accept rate for the conference) are available in this little Facebook discussion and in this blog post.
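The back-of-the-envelope comparison in those discussions can be sketched as follows: if two committees accepted papers independently at random, each at rate r, the expected disagreement rate would be 2r(1 − r). The accept rate used below is assumed for illustration, and the function name is mine:

```python
def random_baseline_disagreement(accept_rate):
    """Expected fraction of papers on which two committees, each
    accepting independently at random at rate `accept_rate`, would
    disagree (one accepts and the other rejects, in either order)."""
    return 2 * accept_rate * (1 - accept_rate)

# Assuming a 22.5% accept rate for illustration, the random baseline
# is about 34.9% disagreement, so an observed 25.9% inconsistency sits
# between random committees and perfect agreement.
baseline = random_baseline_disagreement(0.225)
```

This is the crude calculation behind the linked discussions; the Bayesian write-up treats the uncertainty in these quantities more carefully.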

Background Information on the Process

For background, previous posts on this year’s conference are listed below:

  1. NIPS Decision Time
  2. Reviewer Calibration for NIPS
  3. Reviewer Recruitment and Experience
  4. Paper Allocation for NIPS

Software on Github

And finally, there is a large amount of code available on a GitHub site that allows our processes to be recreated. Most of it is tidied up, but the final sections, on the analysis, are not yet complete: it was always my intention to finish those once the experimental results were fully released.