News

March 29th, 2017
The CASMI 2016 Cat 2+3 paper is out!

Jan 20th, 2017
Organisation of CASMI 2017 is underway, stay tuned!

Dec 4th, 2016
The MS1 peak lists for Category 2+3 have been added for completeness.

May 6th, 2016
The winners and full results are available.

April 25th, 2016
The solutions are public now.

April 18th, 2016
The contest is now closed. The results are fantastic and will be made public soon!

April 9th, 2016
All teams that submit before the April 11th deadline will be allowed to update their submission until Friday the 15th.

February 12th, 2016
New categories 2 and 3 and data for automatic methods released. 10 new challenges in category 1.

January 25th, 2016
E. Schymanski and S. Neumann joined the organising team, additional contest data coming soon.

January 11th, 2016
New CASMI 2016 raw data files are available.


Results in Category 2

Summary of Challenge wins

Vaniya Duehrkop Verdegem Allen Brouard
Gold 70 82 44 63 86
Silver 26 21 53 71 50
Bronze 35 11 65 40 31
Gold (neg) 33 0 24 26 20
Gold (pos) 37 82 20 37 66

Summary statistics per participant

Mean rank Median rank Top Top3 Top10 Mean RRP Median RRP
Vaniya 19.75 3.0 46 79 101 0.804 0.922
Duehrkop 25.17 1.0 70 90 100 0.945 1.000
Verdegem 70.79 9.8 24 59 105 0.880 0.972
Allen 47.98 6.0 39 77 123 0.906 0.987
Brouard 127.34 5.2 62 93 118 0.874 0.988
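
The rank-based columns above can in principle be recomputed from one participant's per-challenge ranks. The following is a minimal sketch, not the organisers' evaluation script: the function name `summarise_ranks` is hypothetical, and it assumes that cells without a numeric rank (an empty cell or "-") are skipped and that Top, Top3 and Top10 count ranks of at most 1, 3 and 10. RRP is left out because it needs the candidate list sizes, which are not part of this summary.

```python
from statistics import mean, median


def summarise_ranks(cells):
    """Summarise one participant's ranks; non-numeric cells are skipped (assumption)."""
    ranks = []
    for cell in cells:
        try:
            ranks.append(float(cell))
        except (TypeError, ValueError):
            continue  # "-", "" or None: no usable rank for this challenge
    return {
        "mean_rank": mean(ranks),
        "median_rank": median(ranks),
        "top1": sum(r <= 1 for r in ranks),
        "top3": sum(r <= 3 for r in ranks),
        "top10": sum(r <= 10 for r in ranks),
    }


# Illustrative cells written in the notation of the rank table.
sample = ["29.5", "-", "7.5", "21.5", "2.0", "40.5", "2.0", "1.5", "-", "4.5"]
stats = summarise_ranks(sample)
print(stats)
```

Fractional ranks such as 1.5 count towards Top3 but not towards Top under this reading; whether the official numbers treat ties the same way is not stated here.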

Summary of Rank by Challenge and Participant

For each challenge, the rank of the winner(s) is highlighted in bold. If a submission did not contain the correct candidate, this is denoted as "-". If a participant did not take part in a challenge, the cell is left empty. The tables can be sorted by clicking on a column header.

This summary is also available as CSV download.

Vaniya Duehrkop Verdegem Allen Brouard
challenge-001 29.5 353.0 27.5 21.5
challenge-002 - 5.0 5.0 5.5
challenge-003 7.5 27.0 7.0 4.5
challenge-004 21.5 8.5 8.0 7.0
challenge-005 2.0 3.5 112.0 383.0
challenge-006 40.5 86.0 63.0 75.0
challenge-007 2.0 4.0 3.0 266.0
challenge-008 1.5 2.5 2.0 2.0
challenge-009 - 1.0 1.0 55.0
challenge-010 4.5 3.5 6.0 7.0
challenge-011 1.0 14.5 8.0 3753.0
challenge-012 15.0 28.5 47.5 2530.0
challenge-013 1.0 1.0 40.0
challenge-014 - 35.5 19.5 22.0
challenge-015 3.0 73.0 146.0 39.0
challenge-016 101.0 1.5 2.0 72.0
challenge-017 - 95.5 82.0 58.0
challenge-018 1.0 3.0 1.0 1.0
challenge-019 - 21.5 3.0 341.0
challenge-020 71.0 10.5 70.0 1.0
challenge-021 - 2.0 32.0 1217.0
challenge-022 4.5 8.0 4.5 1.0
challenge-023 2.0 6.0 7.5 917.0
challenge-024 2.5 70.5 27.0 183.0
challenge-025 8.5 5.0 7.0 65.0
challenge-026 2.5 75.5 1.5 1.0
challenge-027 - 109.5 81.5 31.0
challenge-028 26.5 14.0 14.0 1.0
challenge-029 4.0 3.5 9.0 3.0
challenge-030 19.0 139.5 2.0 81.0
challenge-031 - 9.5 6.5 3.0
challenge-032 68.5 3.0 42.0 78.0
challenge-033 - 6.0 49.5 1.0
challenge-034 1.0 1.5 1.0 6.0
challenge-035 23.5 14.5 12.5 5.0
challenge-036 8.0 1.0 1170.5 972.0
challenge-037 6.5 6.5 64.0 68.0
challenge-038 3.5 2.5 3.5 29.0
challenge-039 - 240.5 205.0 8.0
challenge-040 - 6.5 33.5 39.0
challenge-041 1.0 139.0 424.0 1.0
challenge-042 6.5 5.0 6.5 1.0
challenge-043 - 188.5 12.0 20.0
challenge-044 2.5 1.5 3.0 19.0
challenge-045 - 74.5 14.0 16.0
challenge-046 1.5 62.0 29.0 44.0
challenge-047 1.0 3.5 136.0 216.0
challenge-048 2.0 2.0 3.0 5.0
challenge-049 12.5 13.5 11.5 129.0
challenge-050 - 3.5 3.0 234.0
challenge-051 1.0 79.0 159.5 36.0
challenge-052 - 48.5 103.5 160.0
challenge-053 1.0 61.0 308.5 2014.0
challenge-054 3.0 50.0 17.0 17.0
challenge-055 1.0 11.5 4.0 21.0
challenge-056 - 84.0 5.0 14.0
challenge-057 22.5 1.5 1.0 81.0
challenge-058 1.0 1.0 11.0 5.5
challenge-059 - 2.0 2.0 4.0
challenge-060 3.0 44.5 69.0 95.0
challenge-061 2.0 21.0 319.0
challenge-062 1.0 66.5 76.0 605.0
challenge-063 1.0 1.0 1.0 20.0
challenge-064 2.0 3.0 23.0 12.0
challenge-065 - 3.5 3.5 134.0
challenge-066 17.0 23.5 4.5 14.0
challenge-067 - 17.0 1.0 5.0
challenge-068 2.0 1.5 1.0 3.0
challenge-069 5.0 84.5 21.5 101.0
challenge-070 - 3.0 2.5 367.0
challenge-071 2.0 3.0 3.0 2.0
challenge-072 1.0 1.0 70.0
challenge-073 1.0 1.0 1.0 1.0
challenge-074 1.0 1.0 1.0 90.0
challenge-075 4.5 9.0 4.0 3.0
challenge-076 1.5 17.0 4.0 57.0
challenge-077 4.5 39.0 63.0 36.0
challenge-078 16.0 7.0 112.0
challenge-079 1.0 7.5 1.0 6.0
challenge-080 1.5 2.0 28.0
challenge-081 4.0 5.5 8.5 6.0
challenge-082 17.0 1.0 4.0 1.0 1.0
challenge-083 147.0 3.0 3.5 16.0 33.0
challenge-084 11.0 14.0 48.0 17.0 63.0
challenge-085 49.0 1.0 53.0 89.0 16.0
challenge-086 76.5 1.0 53.0 72.0 1.0
challenge-087 34.5 10.0 87.0 35.5 1.0
challenge-088 41.0 1.0 50.0 65.0 1.0
challenge-089 131.5 1.0 28.0 68.0 1.0
challenge-090 12.5 3.0 12.5 38.5 6.0
challenge-091 10.0 11.0 89.5 6.5 1.0
challenge-092 - 1.0 629.0 2.0 1.0
challenge-093 79.0 1.0 13.5 22.0 26.0
challenge-094 - 81.0 1.0 1.0 85.0
challenge-095 106.0 1.0 4.0 1.0 76.0
challenge-096 2.5 1.0 2.0 2.0 1.0
challenge-097 1.0 11.0 32.0 257.5 71.0
challenge-098 1.0 1.5 48.0 2.5 1.5
challenge-099 1.0 1.0 138.5 15.0 1.0
challenge-100 - 1.0 8.5 15.5 1.0
challenge-101 9.0 5.0 14.0 2.0 4.0
challenge-102 184.0 22.0 116.0 212.0 31.0
challenge-103 238.5 1.0 158.0 5.0 1.0
challenge-104 4.0 1.0 7.0 6.0 1.0
challenge-105 1.0 1.5 7.5 3.5 128.5
challenge-106 7.0 1.0 1.0 3.0
challenge-107 44.5 2.0 41.5 2.0 2.0
challenge-108 27.0 1.0 1.0 2.0 1.0
challenge-109 3.0 1.5 9.0 3.0 1.5
challenge-110 1.0 1.0 1281.0 124.5 1.0
challenge-111 1.0 1.0 1.0 2.0 1.0
challenge-112 - 2.0 2.0 6.0 4.0
challenge-113 35.5 1.0 3.5 35.0 1.0
challenge-114 11.0 1.0 9.0 20.0 1.0
challenge-115 1.0 1.0 5.5 3.0 1.0
challenge-116 49.5 2.0 31.0 1.5 2.0
challenge-117 1.0 1.0 1.5 40.0 1.0
challenge-118 - 1.0 11.0 5.0 3.0
challenge-119 - 94.0 134.5 125.0 131.0
challenge-120 77.0 66.0 614.0 9.0
challenge-121 46.0 34.0 3.0 6.0 136.0
challenge-122 2.5 1.0 4.0 12.0 46.0
challenge-123 1.5 1.0 1.5 1.0 1.0
challenge-124 3.0 1.0 6.5 2.0 1.0
challenge-125 117.0 24.0 156.0 123.5 4.0
challenge-126 9.0 195.0 87.0 18.0 2.0
challenge-127 21.0 4.0 43.0 65.0 1.0
challenge-128 20.0 1.0 66.0 6.0 1.0
challenge-129 139.0 3.0 13.5 6.0 2.0
challenge-130 1.0 1.0 6.5 52.5 1.0
challenge-131 151.5 966.0 64.0 39.5 990.0
challenge-132 1.0 1.0 3.5 1.0 1.0
challenge-133 1.0 1.0 1.0 1.0
challenge-134 6.5 4.0 2.5 3.0 30.0
challenge-135 - 17.0 31.0 1.0 3.0
challenge-136 15.5 9.0 3.5 3.0 2.0
challenge-137 1.0 1.0 2.0 177.5 1.0
challenge-138 1.0 1.0 1.0 1.0 15.0
challenge-139 1.0 1.0 1.0 1.0 66.0
challenge-140 - 1.0 8.5 6.0 1.0
challenge-141 - 1.0 14.0 186.0 2.0
challenge-142 1.0 1.0 65.0 2.0 2.0
challenge-143 1.0 1.0 525.0 13.0 1.0
challenge-144 1.5 1.0 144.0 88.0 230.0
challenge-145 1.0 1.0 15.0 1.0 3.0
challenge-146 - 1.0 3.0 2.0 77.0
challenge-147 - 2.0 3.5 4.0 1.0
challenge-148 1.0 1.0 3.0 2.0 1.0
challenge-149 1.0 6.0 2.5 5.0 96.0
challenge-150 1.0 1.0 2.0 3.0 1.0
challenge-151 1.0 1.5 25.5 40.0 1.5
challenge-152 - 1.0 265.0 173.0 2075.0
challenge-153 - 1.0 9.0 2.0 1.0
challenge-154 - 11.0 12.0 3.0 54.0
challenge-155 9.0 1.0 252.0 27.0 1.0
challenge-156 - 1.0 1.0 1.0 1.0
challenge-157 36.0 268.0 8.5 143.5 32.0
challenge-158 2.0 1.0 1.0 1.0 1.0
challenge-159 2.0 506.0 16.0 2.0 61.0
challenge-160 33.0 1.0 68.0 121.0 2.0
challenge-161 - 1.0 193.0 21.0 1.0
challenge-162 12.0 11.0 208.0 53.0 14.0
challenge-163 6.0 55.0 227.0 135.0 26.0
challenge-164 2.0 1.0 1.0 1.0 1.0
challenge-165 1.0 1.0 168.0 29.0 1.0
challenge-166 - 1.0 102.0 72.5 1.0
challenge-167 - 1.0 205.0 1.0 3.0
challenge-168 13.5 2.0 335.5 120.0 3.0
challenge-169 1.0 3.0 1.0 1.0 3.0
challenge-170 - 3.0 33.0 4.5 1.0
challenge-171 2.0 7.0 8.5 24.0 7.0
challenge-172 11.0 1.0 186.0 64.0 1.0
challenge-173 40.0 1.0 20.5 88.0 4.0
challenge-174 - 3.0 244.0 10.0 2.0
challenge-175 15.5 44.0 136.0 5.5 8.0
challenge-176 1.0 1.0 1.5 1.0 1.0
challenge-177 1.0 1.0 28.0 213.5 24.0
challenge-178 72.5 1.0 1809.5 615.5 3101.0
challenge-179 3.0 20.0 22.5 1.0 14.0
challenge-180 19.5 44.0 186.5 4.5 6.0
challenge-181 1.0 41.0 7.0 6.0 11.0
challenge-182 - 1.5 2.0 9.0 1.0
challenge-183 6.0 33.0 217.0 9.0 40.0
challenge-184 1.0 1.0 270.0 32.0 1.0
challenge-185 - 1.0 11.5 4.0 1.0
challenge-186 1.0 1.0 2.0 1.0 3.0
challenge-187 1.0 1.0 1.0 1.0 23.0
challenge-188 2.0 1.0 81.0 1.0 1.0
challenge-189 - - 1.0 10.0 682.0
challenge-190 1.0 1.0 3.0 2.0 1.0
challenge-191 - 2.0 103.5 4.0 2.0
challenge-192 1.0 1.0 5.5 1.0 1.0
challenge-193 3.0 3.0 6.0 1.0 2.0
challenge-194 1.5 1.0 2.5 3.0 3.0
challenge-195 1.0 1.0 1.0 1.0 1.0
challenge-196 4.5 297.0 3.5 3.0 300.0
challenge-197 - 34.0 845.5 13.5 8.0
challenge-198 1.5 1.0 9.5 6.0 4.0
challenge-199 94.5 9.0 280.5 1.0 131.0
challenge-200 - 56.0 21.5 7.0 73.0
challenge-201 - 1.0 2.5 2.5 1.0
challenge-202 - 1.0 505.0 1090.0 758.0
challenge-203 1.0 1.0 1.0 1.0
challenge-204 - 6.0 233.5 6.5 5.0
challenge-205 2.0 1.0 10.0 10.0 3.0
challenge-206 1.0 2.0 1.0 1.0 1.0
challenge-207 88.0 25.0 146.0 39.0 25.0
challenge-208 2.0 1.5 2.0 2.0 1.5
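
The winner convention described above can be sketched in a few lines. This is an illustration under stated assumptions, not the official scoring code: it assumes the lowest numeric rank wins and ties share the win, the helper `winners` is hypothetical, and the CSV layout (one column per participant, an empty cell for non-participation) is a guess at how the download might look. The first two rows are taken from the table; the third is made up to show the empty-cell and "-" cases.

```python
import csv
import io


def winners(row, participants):
    """Return the participants sharing the lowest numeric rank in one challenge row."""
    ranks = {}
    for name in participants:
        cell = (row.get(name) or "").strip()
        if cell and cell != "-":  # skip non-participation and missing correct candidate
            ranks[name] = float(cell)
    if not ranks:
        return []
    best = min(ranks.values())
    return [name for name in participants if ranks.get(name) == best]


# Hypothetical CSV layout; "challenge-A" is an invented row.
text = """challenge,Vaniya,Duehrkop,Verdegem,Allen,Brouard
challenge-082,17.0,1.0,4.0,1.0,1.0
challenge-092,-,1.0,629.0,2.0,1.0
challenge-A,2.0,,3.5,-,9.0
"""
participants = ["Vaniya", "Duehrkop", "Verdegem", "Allen", "Brouard"]
result = {
    row["challenge"]: winners(row, participants)
    for row in csv.DictReader(io.StringIO(text))
}
print(result)
```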


Participant information and abstracts

Participant:	Vaniya
Authors:	Vaniya, Arpana [1], Stephanie N. Samra [1], Sajjan S. Mehta [1], 
		Diego Pedrosa [1], Hiroshi Tsugawa [2], and Oliver Fiehn [1]
Affiliations: 	[1] Genome Center, University of California, Davis 
		[2] RIKEN Center for Sustainable Resource Science (CSRS), Wako, Japan

ParticipantID:	avaniya003
Category:	Category 2
Automatic methods: Yes

Abstract: 

MS-FINDER, developed by H. Tsugawa et al., was used as the in silico software for unknown
compound identification in Category 2. MS-FINDER version 1.62 was used. MS/MS spectra
were uploaded to MS-FINDER in msp format. Precursor m/z, ion mode, mass accuracy of
instrument, and precursor type were used as metadata. Each candidate file was
converted to a structure database file which can be read by MS-FINDER. Each file was
saved in the software folder in order for it to be called by MS-FINDER. This file was
changed after each calculation in order to match the challenge data. A search of the
challenge msp against the challenge candidate list was performed on the top 500
candidates. Up to 500 top candidate structures were exported as a text file from
MS-FINDER. Final scores and SMILES were reported for submission to CASMI 2016.
Multiple candidates were submitted for each challenge.

Participant:	        Duehrkop
Authors: 		Dührkop, Kai (1) and Shen, Huibin (2) and Meusel, Marvin (1)
			and Rousu, Juho (2) and Böcker, Sebastian (1)
Affiliations:         	(1) Chair of Bioinformatics, Friedrich-Schiller University, Jena
			(2) Department of Computer Science, Aalto University

ParticipantID:	      csifingerid
Category:	      category2
Automatic pipeline:   yes
Spectral libraries:   yes (for training)

Abstract

We processed the peaklists in MGF format using a command line version
of CSI:FingerId 1.0.1. Fragmentation trees were computed with Sirius 3.1.4 
using the Q-TOF instrument settings. We computed trees for all
candidate formulas in the given structure candidate list.  Only the
top scoring trees were selected for further processing: Trees with a
score smaller than 80% of the score of the optimal tree were
discarded. Each of these trees was processed with CSI:FingerId as
described in [1]. We predicted for each tree a molecular fingerprint
(with Platt probability estimates) and compared them against the
fingerprints of all structure candidates with the same molecular
formula. The resulting hits were merged together in one list and were
sorted by score. A constant value of 10000 was added to all scores to
make them positive (as stated in the CASMI rules). Ties of compounds
with same score (and sometimes also with same 2D structure) were
ordered randomly.
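
The tree selection and score handling described above can be sketched as follows. This is not CSI:FingerID or Sirius code: the helper names `select_trees` and `rank_hits`, the formula labels and the example scores are hypothetical, and the 80% rule is applied here under the assumption that tree scores are positive.

```python
import random


def select_trees(tree_scores, fraction=0.8):
    """Keep trees scoring at least `fraction` of the best tree (assumes positive scores)."""
    best = max(tree_scores.values())
    return {formula: s for formula, s in tree_scores.items() if s >= fraction * best}


def rank_hits(hits, offset=10000.0, seed=None):
    """Add a constant to every score, then sort descending with random tie order."""
    rng = random.Random(seed)
    shifted = [(cand, score + offset) for cand, score in hits]
    rng.shuffle(shifted)  # randomise first ...
    shifted.sort(key=lambda h: h[1], reverse=True)  # ... the stable sort keeps that order for ties
    return shifted


# Illustrative values only.
kept = select_trees({"formula-1": 50.0, "formula-2": 42.0, "formula-3": 30.0})
ranked = rank_hits([("cand-a", -120.5), ("cand-b", -80.0), ("cand-c", -120.5)], seed=1)
print(kept, ranked)
```

Shuffling before a stable sort is one simple way to obtain a random order among candidates with equal scores; the abstract does not say how the ties were actually randomised.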

The machine learning method was trained on 7352 spectra (4564
compounds) downloaded from GNPS [2] and MassBank [3]. As our training
dataset contains only spectra in positive ion mode (there are too few
spectra with negative ion mode in GNPS), we omitted all challenges
with negative ion mode. As long as there are not enough publicly
available reference spectra measured in negative ion mode, our method
will only be able to process positive ion mode spectra.

We observed for 67 challenges that the top scoring structure candidate
was a compound which is also contained in our training set. If we
evaluate our method on spectra from compounds we have already trained
on we usually reach a performance comparable to spectral library
search. To avoid an overestimation of the performance of our method,
we removed all of these top scoring candidates from our training set
and retrained our classifiers. To compensate for the removed spectra, we
added the training spectra that are provided by CASMI. The submission
with the ParticipantID csifingerid_leaveout contains the search
results of these newly trained classifiers.

[1] Kai Dührkop, Huibin Shen, Marvin Meusel, Juho Rousu and Sebastian
    Böcker Searching molecular structure databases with tandem mass
    spectra using CSI:FingerID.  Proc Natl Acad Sci U S A,
    112(41):12580-12585, 2015.

[2] https://gnps.ucsd.edu

[3] Horai H, et al. MassBank: a public repository for sharing mass 
    spectral data for life sciences.  J Mass Spectrom 45(7):703–714, 2010.

Participant:		Verdegem
Authors:		Verdegem, Dries and Ghesquière, Bart
Affiliation:		Vesalius Research Center, VIB/KULeuven, Leuven, Belgium

ParticipantID:		dverdegem
Category:		category2
Automatic method:	yes

Abstract
For all assignments, we used the MAGMa+ software [1].

MAGMa+ uses MAGMa [2] under the hood. It runs MAGMa twice with two
different, fine-tuned parameter settings whose values depend on the
ionization mode. MAGMa+ then determines the molecular class of the top
ranked metabolites returned by both MAGMa runs. This latent molecular
class is determined by a trained two-class random forest
classifier. Depending on the most prevalent molecular class, one of
the two MAGMa outcomes (the one from the run with the parameters
corresponding to the most prevalent class) is returned to the user.
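
The selection step described above might look roughly like the sketch below. It is not MAGMa+ code: `choose_run`, the class labels "A" and "B", the number of top candidates inspected (`top_n`), the tie rule and the toy classifier are all assumptions made for illustration. In MAGMa+ the class comes from a trained random forest, which is replaced here by an arbitrary callable.

```python
from collections import Counter


def choose_run(run_a, run_b, classify, top_n=5):
    """Return the ranked list from the run whose parameter class is most prevalent.

    run_a, run_b: ranked candidate lists from the two parameter settings "A" and "B".
    classify: callable mapping a candidate to "A" or "B" (stand-in for the classifier).
    """
    top = run_a[:top_n] + run_b[:top_n]
    votes = Counter(classify(candidate) for candidate in top)
    majority = "A" if votes["A"] >= votes["B"] else "B"  # ties go to "A" (assumption)
    return run_a if majority == "A" else run_b


# Toy classifier: nitrogen-containing SMILES count as class "A"; purely illustrative.
toy_classify = lambda smiles: "A" if "N" in smiles else "B"
run_a = ["CCN", "CCO", "CN"]
run_b = ["CCO", "CCC", "CCN"]
chosen = choose_run(run_a, run_b, toy_classify, top_n=2)
print(chosen)
```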

As the structure database, we used the candidate list provided in the
contest. We did not perform any prefiltering.

[1] Verdegem, Dries, et al. "Improved metabolite identification with
    MIDAS and MAGMa through MS/MS spectral dataset-driven parameter
    optimization." accepted for publication in Metabolomics
[2] Ridder, Lars, et al. "Substructure‐based annotation of
    high‐resolution multistage MSn spectral trees." Rapid
    Communications in Mass Spectrometry 26.20 (2012): 2461-2471.

Participant:          Allen
Authors:              Felicity Allen, Tanvir Sajed, Russ Greiner, David Wishart
Affiliations:         Department of Computing Science
		      University of Alberta, Canada

ParticipantID:        FelicityAllenCFMOrig
Category:             category2
Automatic pipeline:   yes
Spectral libraries:   no

Abstract

We processed the list of molecules and provided candidates using cfm-id.
The original  CFM positive and negative models were used, which were trained 
on data from the Metlin database.  Mass tolerances of 10 ppm were used
and the Jaccard score was applied for spectral comparisons. The input spectrum
was repeated for the low, medium and high energies.
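
A Jaccard comparison of two peak lists with a ppm tolerance can be sketched as below. This is a generic illustration and not the cfm-id implementation: the function `jaccard_score`, the greedy matching of sorted m/z values and the example peaks are assumptions, and intensities are ignored.

```python
def jaccard_score(mz_a, mz_b, ppm=10.0):
    """Jaccard index of two peak lists, matching m/z values within a ppm tolerance."""
    a, b = sorted(mz_a), sorted(mz_b)
    i = j = matched = 0
    while i < len(a) and j < len(b):
        tolerance = a[i] * ppm / 1e6  # absolute window around the current peak of list a
        if abs(a[i] - b[j]) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    union = len(a) + len(b) - matched
    return matched / union if union else 0.0


# Illustrative m/z values only.
measured = [85.0284, 127.0390, 145.0495]
predicted = [85.0290, 127.0390, 163.0601]
score = jaccard_score(measured, predicted)
print(score)
```

Two of the three peaks in each list match within 10 ppm, so the score is 2 matched peaks over 4 distinct peaks.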

Participant:	      Brouard
Authors:              Céline Brouard(1,2), Huibin Shen(1,2), Kai Dührkop(3), 
		      Sebastian Böcker(3) and Juho Rousu(1,2)
Affiliations:         (1) Department of Computer Science, Aalto University, Espoo, Finland
                      (2) Helsinki Institute for Information Technology, Espoo, Finland
                      (3) Chair for Bioinformatics, Friedrich-Schiller University, 
		      Jena, Germany

ParticipantID:        IOKRAlignf
Category:	      category2
Automatic pipeline:   yes
Spectral libraries:   no

Abstract

We used a recent machine learning approach, called Input Output Kernel
Regression, for predicting the candidate scores. In this method, the
similarities between the MS/MS spectra and the molecular similarities
are encoded using two kernel functions. On the input side, we computed
different kernels based on MS/MS spectra and on fragmentation
trees. On the output side, we built a Gaussian kernel based on molecular
fingerprints. We used approximately 6000 molecular fingerprints from
OpenBabel. We combined the different input kernels using the Alignf
algorithm, which seeks to maximize the alignment between the
combined kernel and the output kernel.
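
To make the notion of kernel alignment concrete, the sketch below builds a Gaussian kernel over toy binary fingerprints and computes the centered alignment between two kernel matrices. It is a pure-Python illustration under stated assumptions, not the submitted pipeline: the function names, the `gamma` value, the fingerprints and the input kernel are invented, and only the alignment measure is computed. The Alignf weight optimisation itself is not shown.

```python
import math


def gaussian_kernel(fingerprints, gamma=0.1):
    """Gaussian kernel matrix over fingerprint vectors given as lists of 0/1."""
    def sqdist(x, y):
        return sum((a - b) ** 2 for a, b in zip(x, y))
    return [[math.exp(-gamma * sqdist(x, y)) for y in fingerprints] for x in fingerprints]


def center(K):
    """Center a kernel matrix: subtract row and column means, add the grand mean."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + total for j in range(n)] for i in range(n)]


def alignment(K1, K2):
    """Centered alignment: <K1c, K2c>_F / (||K1c||_F * ||K2c||_F)."""
    A, B = center(K1), center(K2)
    def frob(X, Y):
        return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))
    return frob(A, B) / math.sqrt(frob(A, A) * frob(B, B))


# Toy data: three molecules with 4-bit fingerprints and one made-up input kernel.
K_out = gaussian_kernel([[1, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 1]])
K_in = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
score = alignment(K_in, K_out)
print(score)
```

An input kernel that orders the molecules similarly to the output kernel yields an alignment close to 1, which is the quantity a kernel-combination method of this kind tries to increase.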

We trained separate models for the MS/MS spectra in positive mode and
the MS/MS spectra in negative mode.  We considered additional MS/MS
spectra from GNPS and MassBank for training the models.

Details per Challenge and Participant. See legend at bottom for more details

The details table is also available as HTML and as CSV download. The individual submissions are also available for download.