News

Oct 31st, 2017
The results are now available.

Oct 30th, 2017
The solutions are now available.

Sept 8th, 2017
Update for Challenge 15 available, but will not count in evaluation.

Sept 4th, 2017
Updated mailing list and submission information.

Aug 23rd, 2017
The preliminary results have been sent out to participants, and are now available.

July 09th, 2017
We fixed the intensities in the TSV archive for challenges 046-243.

June 22nd, 2017
We added Category 4 for a subset of the data files.

May 22nd, 2017
We have improved challenges 29, 42, 71, 89, 105, 106 and 144.

April 26th, 2017
The rules and challenges of CASMI 2017 are now public!

Jan 20th, 2017
Organisation of CASMI 2017 is underway, stay tuned!


Results in Category 2

Summary of participant performance

Participant    F1 score  Mean rank  Median rank  Top  Top3  Top10  Misses  TopPos  TopNeg  Mean RRP  Median RRP    N
kai_iso            2775     585.67          5.0   77   110    147       4      55      22     0.922       0.999  243
kai112             1373     513.93          9.0   38    55     74     102      28      10     0.937       0.999  243
yuanyuesimple       942     214.54         31.0   16    31     72      10       9       7     0.970       0.990  243
yuanyuesqrt         912     203.51         34.0   16    32     68      10      11       5     0.970       0.990  243
yuanyuelogsum       368    1165.47        260.0    7    14     27      10       5       2     0.831       0.941  243
Rakesh              109     939.83        333.0    0     0      5      28       0       0     0.852       0.906  243
This summary is also available as CSV download.

Table legend:

F1 score
The Formula 1 score awards points for each challenge based on the rank of the correct solution, similar to the scheme in F1 racing. In the participant table, these points are summed over all challenges. Please note that the F1 score is thus not necessarily comparable across categories.
Mean/Median rank
Mean and median rank of the correct solution. For tied ranks with other candidates, the average rank of the ties is used.
Top, Top3, Top10
Number of challenges where the correct solution is ranked first, within the top 3, and within the top 10, respectively.
Misses
Number of challenges where the correct solution is missing.
TopPos, TopNeg
Number of challenges where the correct solution is ranked first in positive or negative ionization mode.
Mean/Median RRP
The relative ranking position, which also takes the length of the candidate list into account.
N
Number of submissions that have passed the evaluation scripts.
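
As an illustration of the legend above, the following minimal sketch computes the summary columns from a list of per-challenge ranks. It is not the CASMI evaluation script. The point list is the Formula 1 scheme used in racing since 2010; rounding fractional tie ranks up, counting only ranks of exactly 1 as "Top", and excluding misses from the mean and median are assumptions made for this example, and the function names are hypothetical.

```python
import math
from statistics import mean, median

# Points for places 1-10 in the Formula 1 racing scheme used since 2010.
F1_POINTS = [25, 18, 15, 12, 10, 8, 6, 4, 2, 1]


def f1_points(rank):
    """Points for one challenge; None means the correct solution was missing."""
    if rank is None:
        return 0
    place = math.ceil(rank)  # assumption: fractional tie ranks are rounded up
    return F1_POINTS[place - 1] if place <= len(F1_POINTS) else 0


def summarize(ranks):
    """Summarize per-challenge ranks (None marks a miss).

    Assumption: misses are excluded from the mean and median rank.
    Raises statistics.StatisticsError if every challenge is a miss.
    """
    found = [r for r in ranks if r is not None]
    return {
        "f1": sum(f1_points(r) for r in ranks),
        "mean_rank": mean(found),
        "median_rank": median(found),
        "top": sum(r <= 1 for r in found),
        "top3": sum(r <= 3 for r in found),
        "top10": sum(r <= 10 for r in found),
        "misses": ranks.count(None),
    }


# Ranks of kai_iso for challenges 001-005 and 027 (a miss), from the rank table.
example = summarize([31.0, 11.0, 1.0, 1.0, 95.0, None])
```

With these six challenges, only the two first places earn points, so the summed score is 2 x 25.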

Summary of Rank by Challenge

For each challenge, the lowest rank among participants is highlighted in bold. If a submission did not contain the correct candidate, this is denoted as "-". If a participant did not take part in a challenge, the table cell is empty. The tables can be sorted by clicking on a column header.

Category2:

Challenge kai_iso kai112 Rakesh yuanyuelogsum yuanyuesimple yuanyuesqrt
challenge-001 31.0 9.0 461.5 96.0 103.0 130.0
challenge-002 11.0 - 709.0 - - -
challenge-003 1.0 47.0 5911.0 458.0 7.0 9.0
challenge-004 1.0 1.0 197.5 829.0 322.0 372.0
challenge-005 95.0 101.0 197.5 367.0 1.0 1.0
challenge-006 1.0 1.0 197.5 308.0 8.0 17.0
challenge-007 1.0 1.0 197.5 9.0 1.0 1.0
challenge-008 1.0 1.0 1238.5 768.0 3.0 5.0
challenge-009 1.0 1.0 731.5 262.0 50.0 59.0
challenge-010 8584.0 - 8449.5 8226.5 7371.0 12403.0
challenge-011 2.0 2.0 - - - -
challenge-012 4.0 2556.0 3050.0 478.0 183.0 182.0
challenge-013 6.0 1.0 425.0 85.5 2.0 2.0
challenge-014 180.0 118.0 197.5 407.0 70.0 53.0
challenge-015 14.0 21.0 197.5 1.0 1.0 1.0
challenge-016 1.0 1.0 530.0 262.0 236.0 158.0
challenge-017 1.0 19.0 1200.0 10.0 4.0 3.0
challenge-018 92.0 - 604.0 59.0 113.0 135.0
challenge-019 2.0 - 4807.0 100.0 31.0 6.0
challenge-020 42.0 - 3783.5 523.0 486.0 491.0
challenge-021 372.0 3805.0 6625.0 1581.0 1886.0 649.0
challenge-022 3.0 - 446.5 - - -
challenge-023 25.0 252.0 5076.0 59.0 10.0 9.0
challenge-024 7.0 258.0 1755.0 938.0 15.0 78.0
challenge-025 10.0 - 302.5 22.0 38.0 37.0
challenge-026 2.0 1.0 618.5 269.0 79.0 79.0
challenge-027 - 122.0 7241.0 4550.0 1.0 1.0
challenge-028 10.0 34.0 - - - -
challenge-029 7.0 - 28.0 380.0 6.0 4.0
challenge-030 3.0 4.0 - - - -
challenge-031 4.0 2.0 436.5 219.0 40.0 51.0
challenge-032 2.0 - 461.5 980.0 210.0 196.0
challenge-033 59.0 - 262.5 1261.0 199.0 154.0
challenge-034 - - 484.5 87.0 104.0 107.0
challenge-035 1.0 - - - - -
challenge-036 2.0 1.0 2832.0 1.0 1.0 1.0
challenge-037 14.0 1.0 411.5 - - -
challenge-038 25.0 1.0 709.0 301.0 1.0 1.0
challenge-039 10.0 - 944.5 126.0 19.0 33.0
challenge-040 3.0 - 52.0 19.0 11.0 8.0
challenge-041 1.0 1.0 - - - -
challenge-042 1.0 - 18061.0 5.0 5.0 5.0
challenge-043 - - - - - -
challenge-044 4.0 86.0 1439.5 206.0 237.0 130.0
challenge-045 - - - - - -
challenge-046 950.0 - 1312.0 907.0 6.0 4.0
challenge-047 10.0 - 18.0 322.0 8.0 12.0
challenge-048 1316.0 - 747.0 155.0 135.0 97.0
challenge-049 642.5 - 807.5 851.0 344.0 681.0
challenge-050 2.0 - 82.5 32.0 12.0 10.0
challenge-051 23.0 37.0 - 96.0 5.0 6.0
challenge-052 99.0 25.0 706.0 2382.0 326.0 311.0
challenge-053 1659.5 - 16.0 38.0 10.0 8.0
challenge-054 480.0 510.0 266.0 1896.0 36.0 53.0
challenge-055 1.0 136.0 - 66.0 8.0 8.0
challenge-056 1.0 1.0 1156.0 186.0 247.0 215.0
challenge-057 10074.0 - 174.0 8153.0 3318.0 117.0
challenge-058 100.0 119.0 1759.5 709.0 143.0 114.0
challenge-059 1.0 - - 1.5 1.5 1.5
challenge-060 989.0 - 2271.5 2786.0 510.0 391.0
challenge-061 65.0 32.0 303.5 1104.0 79.0 112.0
challenge-062 1.0 1.0 1134.0 86.0 191.0 192.0
challenge-063 1.0 1.0 146.5 388.0 113.0 137.0
challenge-064 5.0 10.0 398.5 11063.5 1.0 1.0
challenge-065 32.0 - 35.0 252.0 15.0 14.0
challenge-066 6553.5 34.0 664.5 89.0 28.0 33.0
challenge-067 213.0 114.0 807.5 2831.0 1193.0 1163.0
challenge-068 6.5 90.5 - 2.0 4.0 3.0
challenge-069 56.0 133.0 17.0 39.0 11.0 12.0
challenge-070 2841.0 5937.0 637.5 5955.0 3088.0 2793.0
challenge-071 4.0 - 42.5 619.0 25.0 28.0
challenge-072 529.0 749.0 432.0 329.0 25.0 21.0
challenge-073 1.0 1.0 12.0 1146.0 25.0 32.0
challenge-074 4.0 - 26.5 2.0 11.5 13.0
challenge-075 99.0 202.0 1298.5 35.0 47.0 64.0
challenge-076 9417.0 192.0 653.5 3960.0 3.0 2.0
challenge-077 1608.0 - 866.5 883.0 366.0 121.0
challenge-078 575.0 7599.0 896.0 2210.0 213.0 182.0
challenge-079 1043.5 3.0 75.0 3.0 4.0 3.0
challenge-080 1.0 - 94.5 287.0 167.0 118.0
challenge-081 2.0 1.0 4.0 40.0 25.0 10.0
challenge-082 19.0 18.0 111.0 36.0 7.0 10.0
challenge-083 2.5 - 98.0 9.0 2.0 2.0
challenge-084 1012.0 4376.0 160.0 184.0 76.0 136.0
challenge-085 89.0 73.0 283.0 1.0 13.0 7.0
challenge-086 556.0 - 944.5 64.0 50.0 61.0
challenge-087 2.0 - 962.0 2070.0 4.0 4.0
challenge-088 35.5 - 79.5 3281.0 1.5 1.5
challenge-089 1.0 - 86.0 120.0 20.0 24.0
challenge-090 71.0 676.0 23.5 20.0 25.0 30.0
challenge-091 1.0 22.0 40.5 33.0 3.0 3.0
challenge-092 5.0 - - 252.5 10.0 5.0
challenge-093 1.0 - 43.5 265.0 197.0 116.0
challenge-094 21.0 21.0 249.5 7063.0 25.0 29.0
challenge-095 1146.0 28.0 327.5 172.0 383.0 365.0
challenge-096 866.0 13073.0 3018.0 309.0 701.0 616.0
challenge-097 1.0 1.0 36.0 283.0 19.0 32.0
challenge-098 1.0 - 35.0 1577.0 117.0 148.0
challenge-099 6480.0 - 298.0 7091.0 1.0 2.0
challenge-100 1349.0 1936.0 1156.0 341.0 530.0 471.0
challenge-101 8312.5 12.0 250.0 1791.0 1.0 1.0
challenge-102 5.5 - 163.0 68.0 11.0 19.0
challenge-103 14.0 2.0 297.0 358.0 284.0 461.0
challenge-104 41.0 111.0 547.0 598.0 493.0 568.0
challenge-105 4.0 - 51.5 23.0 11.0 11.0
challenge-106 8.0 8.0 82.5 13.0 1.0 1.0
challenge-107 141.0 116.0 - 4298.0 12.0 11.0
challenge-108 3.0 2.0 341.5 89.0 33.0 31.0
challenge-109 89.0 120.0 258.0 1818.0 59.0 60.0
challenge-110 471.0 - 3451.5 1970.0 53.0 38.0
challenge-111 1.0 32.0 377.5 106.0 125.0 148.0
challenge-112 1088.0 7219.0 107.0 513.0 93.0 161.0
challenge-113 5.0 - 15.5 9.5 14.0 9.0
challenge-114 1.0 - 22.5 17.0 1.0 2.0
challenge-115 1.0 - - 11.0 31.0 122.0
challenge-116 325.5 - 15.5 8.0 7.0 6.0
challenge-117 5.0 4.0 39.5 54.0 168.0 153.0
challenge-118 4370.5 63.0 21.0 41.0 7.0 9.0
challenge-119 1996.0 - - 2559.0 319.0 349.0
challenge-120 1.0 1.0 333.0 156.0 47.0 52.0
challenge-121 3.0 4.0 9.0 10.0 13.0 11.0
challenge-122 1.0 - 46.5 222.0 28.0 37.0
challenge-123 2187.0 5892.0 832.5 1357.0 226.0 257.0
challenge-124 1.0 - 7.5 4.0 6.0 9.0
challenge-125 14.5 7.5 48.0 60.5 36.0 17.0
challenge-126 5.0 122.0 - 3.0 6.0 21.0
challenge-127 6.0 - 49.0 2.0 2.0 2.0
challenge-128 1.0 4.0 76.5 52.0 193.0 119.0
challenge-129 1.0 - 151.0 1493.0 6.0 24.0
challenge-130 5.0 9.0 - 31.0 14.0 15.0
challenge-131 5.0 8.0 18.0 87.0 10.0 12.0
challenge-132 201.0 - 1312.0 1294.0 90.0 65.0
challenge-133 1.0 - 18.0 26.0 12.0 13.0
challenge-134 9.0 - 747.0 86.0 103.0 88.0
challenge-135 297.5 - 807.5 1211.0 171.0 178.0
challenge-136 1.0 3.0 1759.5 89.0 98.0 105.0
challenge-137 1.5 - - 116.0 18.0 26.0
challenge-138 8.0 - 866.5 62.0 90.0 135.0
challenge-139 2.0 2.0 27.5 13.0 95.0 19.0
challenge-140 10454.0 - 3848.0 2242.0 209.0 194.0
challenge-141 1.0 27.0 547.0 8.0 60.0 69.0
challenge-142 1.0 - 22.5 3768.0 13.0 19.0
challenge-143 6.0 32.0 1950.5 298.0 409.0 356.0
challenge-144 18.0 - 615.5 1347.0 109.0 91.0
challenge-145 1.0 - 772.5 82.0 138.0 140.0
challenge-146 1.0 - 80.5 2665.0 5.0 5.0
challenge-147 1039.0 3176.0 2770.5 133.0 7.0 9.0
challenge-148 2.0 - 5180.5 6906.0 27.0 17.0
challenge-149 2.0 2.0 - 2.0 3.0 5.0
challenge-150 1.0 - - 283.0 8.0 8.0
challenge-151 5997.5 2.0 266.0 8568.0 168.0 129.0
challenge-152 3.0 91.5 - 1.0 1.0 1.0
challenge-153 32.0 28.0 1156.0 51.0 219.0 267.0
challenge-154 1.0 - 174.0 36.0 12.0 17.0
challenge-155 1.0 - 66.5 1364.5 26.5 26.5
challenge-156 16.0 9.0 303.5 8231.0 23.5 101.5
challenge-157 23.0 - 2271.5 179.0 443.0 469.0
challenge-158 4.0 3.0 303.5 36.5 4.0 8.0
challenge-159 1.0 1.0 1134.0 115.0 90.0 160.0
challenge-160 6866.0 1.0 398.5 3006.0 105.0 121.0
challenge-161 587.0 2924.0 590.0 1069.0 906.0 889.0
challenge-162 1.0 1.0 652.0 192.0 12.0 11.0
challenge-163 1.0 - 1331.0 310.0 4.0 3.0
challenge-164 1.0 1.0 2271.5 560.0 727.0 672.0
challenge-165 1256.5 - 35.0 858.0 40.0 48.0
challenge-166 1.0 2.0 664.5 2378.0 6.0 12.0
challenge-167 2.0 - 6948.5 284.0 145.0 67.0
challenge-168 2248.5 12.0 807.5 130.0 54.0 137.0
challenge-169 1.0 11.0 19.0 28.0 28.0 14.0
challenge-170 2.0 1.0 1156.0 270.0 348.0 492.0
challenge-171 3.0 3.0 869.5 45.0 167.0 218.0
challenge-172 289.0 219.0 3853.0 277.0 277.0 251.0
challenge-173 1.0 9.0 637.5 2193.0 1365.0 1779.0
challenge-174 7.0 - 42.5 5667.0 50.0 47.0
challenge-175 17.0 15.0 432.0 108.0 13.0 12.0
challenge-176 9.0 - 2271.5 2505.0 323.0 290.0
challenge-177 1.0 1.0 43.0 260.0 1.0 1.0
challenge-178 2072.0 - 1449.0 1407.0 696.0 600.0
challenge-179 1.0 - 87.0 1.0 1.0 1.0
challenge-180 4.0 - 26.5 93.0 20.0 8.0
challenge-181 125.0 6.0 3018.0 128.0 138.0 112.0
challenge-182 21.0 55.0 896.0 21924.0 7408.0 4694.0
challenge-183 1.5 1.5 913.0 1498.0 2.0 1.0
challenge-184 1.0 2.0 75.0 526.0 8.0 86.0
challenge-185 1.0 - 94.5 284.0 243.0 225.0
challenge-186 2.0 - 959.0 453.0 149.0 183.0
challenge-187 1.0 1.0 4.0 195.0 22.0 29.0
challenge-188 1.0 1.0 111.0 11135.0 26.0 24.0
challenge-189 1.0 3.0 283.0 3387.0 6.0 6.0
challenge-190 74.0 295.0 869.5 663.0 378.0 360.0
challenge-191 323.0 730.0 1025.5 217.0 62.0 85.0
challenge-192 1.0 - 944.5 41.0 55.0 29.0
challenge-193 4.0 - 962.0 404.0 5.0 5.0
challenge-194 1.0 1.0 1007.0 1106.0 520.0 730.0
challenge-195 1.5 - 111.5 1449.5 1.5 1.5
challenge-196 1.0 - 213.0 70.0 50.0 34.0
challenge-197 36.0 37.0 40.5 29.0 5.0 5.0
challenge-198 1.0 4.0 138.0 3613.0 927.0 120.0
challenge-199 1893.0 - 39.5 1529.0 650.0 516.0
challenge-200 1.0 1.0 2443.5 23.0 2.0 1.0
challenge-201 1.0 4.0 249.5 73.0 65.0 85.0
challenge-202 2.0 1.0 327.5 49.0 72.0 122.0
challenge-203 1.0 3.0 3018.0 722.0 591.0 582.0
challenge-204 1.0 1.0 36.0 5387.0 104.0 79.0
challenge-205 2.0 - 757.5 86.0 57.0 83.0
challenge-206 1465.5 - 35.0 1141.0 274.0 362.0
challenge-207 6783.0 - 38.0 383.0 43.0 55.0
challenge-208 129.0 45.0 1156.0 549.0 589.0 477.0
challenge-209 2.0 3.0 250.0 137.0 2.0 2.0
challenge-210 10.0 - 54.0 1891.0 92.0 90.0
challenge-211 1.0 - 720.5 237.0 96.0 130.0
challenge-212 12955.5 12.0 297.0 7904.0 14.0 16.0
challenge-213 1.0 5.0 1438.0 37.0 1.0 1.0
challenge-214 7.0 20.0 410.5 890.0 230.0 126.0
challenge-215 25.0 - 2121.0 3057.0 69.0 103.0
challenge-216 10.0 - 80.5 8.0 3.0 14.0
challenge-217 2.0 - 301.5 8.0 6.0 5.0
challenge-218 2.0 1.0 82.5 1190.0 1.0 2.0
challenge-219 1.0 1.0 68.5 4062.0 115.0 126.0
challenge-220 1.0 - 301.5 1.0 4.0 1.0
challenge-221 5.0 9.0 1031.5 757.0 9.0 6.0
challenge-222 1.0 1.0 258.0 1241.0 33.0 47.0
challenge-223 1.0 - 2046.5 11.0 9.0 10.0
challenge-224 2.0 1.0 213.0 5774.0 164.0 103.0
challenge-225 392.5 - - 552.0 342.0 328.0
challenge-226 1150.5 - - 153.0 72.0 286.0
challenge-227 44.0 263.0 39.5 1988.0 305.0 169.0
challenge-228 16.0 291.0 21.0 89.0 8.0 6.0
challenge-229 1.0 - 496.5 252.0 299.0 253.0
challenge-230 3.0 7.0 333.0 902.0 184.0 132.0
challenge-231 1.0 - - 254.0 15.0 15.0
challenge-232 1.0 1.0 9.0 4.0 6.0 7.0
challenge-233 1.0 - 46.5 1721.0 28.0 32.0
challenge-234 268.0 3450.0 832.5 1500.0 19.0 18.0
challenge-235 10.0 3199.0 2332.0 133.0 222.0 270.0
challenge-236 1.0 - 2055.5 3439.0 25.0 30.0
challenge-237 1.0 90.5 - 1.0 8.0 14.0
challenge-238 419.5 - - 62.0 8.0 14.0
challenge-239 2449.5 - 27.5 36.0 3.0 5.0
challenge-240 1.0 1.0 76.5 37.0 32.0 37.0
challenge-241 1.0 1.0 - 5.0 10.0 15.0
challenge-242 1.0 - 73.0 24.0 50.0 55.0
challenge-243 2.0 4.0 - 128.0 18.0 12.0
This summary is also available as CSV download.
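
The cell conventions described above ("-" for a miss, an empty cell for no participation) could be handled as in the following sketch. The CSV layout used here (a `challenge` column followed by one column per participant) is an assumption for illustration and may differ from the actual download; the function names are hypothetical.

```python
import csv
import io


def parse_rank_table(text):
    """Return {participant: {challenge: rank or None}}.

    None marks a miss ("-"); challenges without a submission are left out.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    participants = header[1:]
    table = {name: {} for name in participants}
    for row in reader:
        if not row:
            continue  # skip blank lines
        challenge, cells = row[0], row[1:]
        for name, cell in zip(participants, cells):
            cell = cell.strip()
            if cell == "":
                continue  # participant did not take part in this challenge
            table[name][challenge] = None if cell == "-" else float(cell)
    return table


def best_participants(table, challenge):
    """Names of the participants with the lowest rank for one challenge."""
    ranks = {name: r[challenge] for name, r in table.items()
             if r.get(challenge) is not None}
    if not ranks:
        return []
    best = min(ranks.values())
    return sorted(name for name, r in ranks.items() if r == best)


# Three rows and three participants taken from the table above.
sample = """challenge,kai_iso,kai112,Rakesh
challenge-001,31.0,9.0,461.5
challenge-002,11.0,-,709.0
challenge-004,1.0,1.0,197.5
"""
ranks = parse_rank_table(sample)
```

`best_participants(ranks, "challenge-004")` returns both kai112 and kai_iso, which corresponds to two cells highlighted in the same row.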


Participant information and abstracts

ParticipantID:        
Category:	      Category 2
Authors:              Rakesh Kumar [1], Nilesh Kumar [1], Ranjan Nanda [1],
		      Dinesh Gupta [1]
Affiliations:         [1] International Centre for Genetic Engineering
		      and Biotechnology (ICGEB), New Delhi
Automatic pipeline:   Automated
Spectral libraries:   no

Abstract:

1) The molecular formulas from PubChem were stored in a local
   library. The MS/MS spectra files (MGF format) were used directly
   for the analysis.

2) First, all possible candidates for each principal peak were
   fetched and ranked on the basis of natural-product likeness,
   employing our algorithm and an in-house script (Python 2.7). The
   formulas for the remaining sub-peaks were also fetched. For each peak
   explained, an explanation score was appended to the candidate
   formula. The molecular formula with the highest score was ranked top
   and the remaining formulas were ranked accordingly.

3) Finally, the InChIs for each molecular formula were arranged, and the
   same rank was awarded to all InChIs of a particular compound.


ParticipantID:        yuanyuelogsum
Category:             category2
Authors:              Yuanyue Li, Michael Kuhn and Peer Bork
Affiliations:         European Molecular Biology Laboratory, 69117 Heidelberg, Germany
Automatic pipeline:   yes
Spectral libraries:   no

Abstract:

For challenges 1-45, the candidates were retrieved as InChI structures
from PubChem within +/- 6 ppm. For challenges 46-243, the molecules from
Category 4 (nonredundant) were used as the candidates. We used a newly
developed machine learning approach to predict the probability spectrum
for each candidate molecule. The score was then calculated based on the
similarity between the probability spectrum and the real spectrum, with
the log of the number of molecules in the model used as a weight. In this
approach, all possible adducts were considered: [M+H]+, [M+Na]+ and
[M+NH4]+ for positive ions, and [M-H]-, [M+Cl]- and [M+COO]- for
negative ions.
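
As a rough illustration of how a +/- 6 ppm candidate search per adduct might be set up, the sketch below converts a precursor m/z into a neutral-mass window for each adduct. It is not the participants' code: the mass shifts are approximate standard values for singly charged ions, applying the ppm tolerance to the neutral mass is an assumption, and [M+COO]- is left out because the abstract does not spell out its composition. All names are hypothetical.

```python
# Approximate shift (in Da) between the neutral monoisotopic mass M and the
# m/z of the singly charged ion, i.e. m/z = M + shift.
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+NH4]+": 18.033823,
    "[M-H]-": -1.007276,
    "[M+Cl]-": 34.969402,
}


def neutral_mass(precursor_mz, adduct):
    """Neutral monoisotopic mass implied by a precursor m/z and an adduct."""
    return precursor_mz - ADDUCT_SHIFT[adduct]


def ppm_window(mass, ppm=6.0):
    """Lower and upper mass bounds for a symmetric ppm tolerance."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


def mass_windows(precursor_mz, adducts, ppm=6.0):
    """Search window of neutral masses for every adduct under consideration."""
    return {a: ppm_window(neutral_mass(precursor_mz, a), ppm) for a in adducts}


# Hypothetical positive-mode precursor at m/z 195.0877.
windows = mass_windows(195.0877, ["[M+H]+", "[M+Na]+", "[M+NH4]+"])
```

Each window could then be used to query a structure database by monoisotopic mass, giving one candidate set per adduct hypothesis.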

ParticipantID:        yuanyuesimple
Category:             category2
Authors:              Yuanyue Li, Michael Kuhn and Peer Bork
Affiliations:         European Molecular Biology Laboratory, 69117 Heidelberg, Germany
Automatic pipeline:   yes
Spectral libraries:   no

Abstract:

For challenges 1-45, the candidates were retrieved as InChI structures
from PubChem within +/- 6 ppm. For challenges 46-243, the molecules from
Category 4 (nonredundant) were used as the candidates. We used a newly
developed machine learning approach to predict the probability spectrum
for each candidate molecule. The score was then calculated based on the
similarity between the probability spectrum and the real spectrum. In
this approach, only [M+H]+ was considered for positive ions and only
[M-H]- for negative ions.

ParticipantID:        yuanyuesqrt
Category:             category2
Authors:              Yuanyue Li, Michael Kuhn and Peer Bork
Affiliations:         European Molecular Biology Laboratory, 69117 Heidelberg, Germany
Automatic pipeline:   yes
Spectral libraries:   no

Abstract:

For challenges 1-45, the candidates were retrieved as InChI structures
from PubChem within +/- 6 ppm. For challenges 46-243, the molecules from
Category 4 (nonredundant) were used as the candidates. We used a newly
developed machine learning approach to predict the probability spectrum
for each candidate molecule. The score was then calculated based on the
similarity between the probability spectrum and the real spectrum, with
the square root of the intensity used as a weight. In this approach,
only [M+H]+ was considered for positive ions and only [M-H]- for
negative ions.
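
The abstract does not give the exact similarity function. The sketch below shows one hypothetical way a measured spectrum could be compared with a predicted probability spectrum using square-root intensity weights: each measured peak contributes its weight multiplied by the predicted probability in the same m/z bin. The binning width, the normalization, and all names are assumptions for illustration only.

```python
import math


def bin_mz(mz, width=0.01):
    """Integer bin index for an m/z value (hypothetical fixed-width binning)."""
    return round(mz / width)


def weighted_match_score(measured, predicted, width=0.01, weight=math.sqrt):
    """Weighted share of the measured signal explained by the prediction.

    measured:  list of (mz, intensity) peaks
    predicted: dict mapping mz to the predicted probability of a peak there
    weight:    function applied to each intensity (square root by default)
    """
    pred_bins = {}
    for mz, prob in predicted.items():
        b = bin_mz(mz, width)
        pred_bins[b] = max(prob, pred_bins.get(b, 0.0))
    total = sum(weight(intensity) for _, intensity in measured)
    if total == 0:
        return 0.0
    hit = sum(weight(intensity) * pred_bins.get(bin_mz(mz, width), 0.0)
              for mz, intensity in measured)
    return hit / total


# Hypothetical two-peak spectrum and prediction.
measured = [(100.0, 100.0), (150.0, 25.0)]
predicted = {100.0: 1.0, 150.0: 0.5}
score_sqrt = weighted_match_score(measured, predicted)
score_raw = weighted_match_score(measured, predicted, weight=lambda x: x)
```

The square root reduces the dominance of the most intense peak: here the weaker peak carries one third of the total weight instead of one fifth.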

ParticipantID:        kai_iso
Category:             category2
Authors:              Dührkop, Kai (1) and Ludwig, Marcus (1) and Böcker, Sebastian (1)
		      and Bach, Eric (2) and Brouard, Céline (2) and Rousu, Juho (2)
Affiliations:         (1) Chair of Bioinformatics, Friedrich-Schiller University, Jena
                      (2) Department of Computer Science, Aalto University
                      Developmental Biology, Halle, Germany
Automatic pipeline:   yes
Spectral libraries:   no

Abstract:

We processed the peak lists in MGF format using an in-house version of
CSI:FingerID. Fragmentation trees were computed with SIRIUS 3.1.5 using
the Q-TOF instrument settings.

As the spectra were measured in MSe mode, we expect to see isotope peaks
in the MS/MS spectra. We used an experimental feature in SIRIUS that
allows for detecting isotope patterns in MS/MS spectra and incorporating
them into the fragmentation tree scoring.

We used the standard workflow of the SIRIUS+CSI:FingerID (version 3.5)
software: we computed trees for all candidate formulas in the given
structure candidate list from Category 4. For challenges 1-45 we
downloaded all candidate structures from our in-house version of
PubChem.

Only the top-scoring trees were selected for further processing: trees
with a score smaller than 75% of the score of the optimal tree were
discarded. Each of the remaining trees was processed with CSI:FingerID as
described in [1]. For each tree, we predicted a molecular fingerprint
(with Platt probability estimates) and compared it against the
fingerprints of all structure candidates with the same molecular
formula. For the comparison of fingerprints, we used the new maximum
likelihood scoring function, which has been implemented since SIRIUS 3.5.
The resulting hits were merged into one list and sorted by
score. A constant value was added to all scores to make them positive
(as stated in the CASMI rules). Ties between compounds with the same score
were ordered randomly. If a compound could not be processed (e.g. because
of multiple charges), its score was set to zero.

[1] Kai Dührkop, Huibin Shen, Marvin Meusel, Juho Rousu and Sebastian
    Böcker. Searching molecular structure databases with tandem mass
    spectra using CSI:FingerID. Proc Natl Acad Sci U S A,
    112(41):12580-12585, 2015.
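
The post-processing steps described in this abstract (the 75% tree threshold, the constant score shift, and the random ordering of ties) can be sketched as follows. This is a minimal illustration, not code from SIRIUS or CSI:FingerID; the function names, the `margin` parameter, and the assumption that tree scores are positive are all hypothetical.

```python
import random


def select_trees(tree_scores, fraction=0.75):
    """Keep formulas whose tree score is at least `fraction` of the best score.

    Assumes positive tree scores, so that the threshold lies below the optimum.
    """
    best = max(tree_scores.values())
    return {f: s for f, s in tree_scores.items() if s >= fraction * best}


def make_positive(scores, margin=1.0):
    """Add one constant to all scores so that the smallest is at least `margin`."""
    lowest = min(scores.values())
    shift = margin - lowest if lowest < margin else 0.0
    return {key: s + shift for key, s in scores.items()}


def rank_candidates(scores, seed=0):
    """Sort by descending score; candidates with equal scores in random order."""
    rng = random.Random(seed)
    items = list(scores.items())
    rng.shuffle(items)  # randomizes the relative order of tied candidates
    items.sort(key=lambda kv: kv[1], reverse=True)  # stable sort keeps it
    return [key for key, _ in items]


# Hypothetical tree scores for three candidate formulas.
kept = select_trees({"C6H12O6": 40.0, "C7H8N4": 31.0, "C5H4N4O": 29.0})
shifted = make_positive({"a": -3.0, "b": 2.0, "c": -3.0})
order = rank_candidates(shifted)
```

Shifting by a constant leaves the order of the candidates unchanged, so it only serves to satisfy the requirement of positive scores.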

ParticipantID:        kai112
Category:             category2
Authors:              Dührkop, Kai (1) and Ludwig, Marcus (1) and Böcker, Sebastian (1)
		      and Bach, Eric (2) and Brouard, Céline (2) and Rousu, Juho (2)
Affiliations:         (1) Chair of Bioinformatics, Friedrich-Schiller University, Jena
                      (2) Department of Computer Science, Aalto University
                      Developmental Biology, Halle, Germany
Automatic pipeline:   yes
Spectral libraries:   no

Abstract:

We processed the peak lists in MGF format using an in-house version of
CSI:FingerID. Fragmentation trees were computed with SIRIUS 3.1.5 using
the Q-TOF instrument settings.

As the spectra for challenges 46-243 were measured in MSe mode, we expect
to see isotope peaks in the MS/MS spectra. For these challenges we used an
experimental feature in SIRIUS that allows for detecting isotope patterns
in MS/MS spectra and incorporating them into the fragmentation tree scoring.

The preliminary results showed that we missed a lot of compounds
because we were not always able to identify the correct molecular
formula in the top ranks. This might be because no isotope patterns for
the precursor were given. We therefore prepared a second submission,
kai112, which no longer uses a hard threshold, but instead considers all
molecular formulas for the CSI:FingerID search and adds the SIRIUS
score on top of the CSI:FingerID score. To prevent empty trees
(which we would have thrown away with a hard threshold) from getting high
scores at random, we add a penalty of 1000 if a tree does not explain a
single fragment peak. Furthermore, for the kai112 submission we trained
CSI:FingerID on a larger dataset that also contains spectra from NIST.

Besides removing the hard threshold, the kai112 submission follows the
standard SIRIUS+CSI:FingerID protocol: we computed trees for all
candidate formulas in the given structure candidate list from Category
4. For challenges 1-45 we downloaded all candidate structures from our
in-house version of PubChem. Each of these trees was processed with
CSI:FingerID as described in [1]. For each tree, we predicted a
molecular fingerprint (with Platt probability estimates) and compared
it against the fingerprints of all structure candidates with the
same molecular formula. For the comparison of fingerprints, we used the
new maximum likelihood scoring function, which has been implemented since
SIRIUS 3.5. Trees with one node get a penalty of 1000. For all other
trees, the SIRIUS score was added to the CSI:FingerID score. The
resulting hits were merged into one list and sorted by
score. A constant value was added to all scores to make them positive
(as stated in the CASMI rules). Ties between compounds with the same score
were ordered randomly. If a compound could not be processed (e.g. because
of multiple charges), its score was set to zero.

[1] Kai Dührkop, Huibin Shen, Marvin Meusel, Juho Rousu and Sebastian
    Böcker. Searching molecular structure databases with tandem mass
    spectra using CSI:FingerID. Proc Natl Acad Sci U S A,
    112(41):12580-12585, 2015.
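
The scoring rule of the kai112 submission, as described in this abstract, can be sketched as follows: trees with a single node receive a penalty of 1000, and for all other trees the SIRIUS score is added to the CSI:FingerID score. Subtracting the penalty from the CSI:FingerID score, the record layout, and the names used here are assumptions for illustration; this is not code from SIRIUS or CSI:FingerID.

```python
def combined_score(csi_score, sirius_score, n_nodes, penalty=1000.0):
    """Combined score of one candidate structure.

    n_nodes is the number of nodes in the fragmentation tree of the
    candidate's molecular formula.
    """
    if n_nodes <= 1:
        # The tree consists of the root only and explains no fragment peak.
        return csi_score - penalty
    return csi_score + sirius_score


def merge_hits(hits):
    """Merge hits from all formulas into one list sorted by descending score.

    hits: list of dicts with the hypothetical keys
          "inchi", "csi", "sirius" and "n_nodes".
    """
    scored = [(combined_score(h["csi"], h["sirius"], h["n_nodes"]), h["inchi"])
              for h in hits]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical hits; the identifiers are placeholders, not real InChIs.
hits = [
    {"inchi": "candidate-A", "csi": -120.0, "sirius": 35.0, "n_nodes": 6},
    {"inchi": "candidate-B", "csi": -80.0, "sirius": 0.0, "n_nodes": 1},
    {"inchi": "candidate-C", "csi": -150.0, "sirius": 50.0, "n_nodes": 9},
]
merged = merge_hits(hits)
```

In this example candidate-B has the best fingerprint score on its own, but its tree explains no fragment peak, so the penalty moves it to the end of the merged list.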

Details per Challenge and Participant. See legend at bottom for more details

The details table is also available as HTML and as CSV download.