NRPP

Mass Spectral Search Algorithms

Peptides are identified from mass spectra by using search algorithms which compare experimentally obtained mass spectral peaks to the theoretical masses derived from protein sequences.
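As a rough illustration of where the "theoretical masses" come from, the sketch below computes the monoisotopic mass of a peptide and the m/z values of its singly charged b and y fragment ions from standard residue masses. The function names (`peptide_mass`, `fragment_mz`) are hypothetical and chosen only for this example; real search algorithms also handle modifications, higher charge states and other ion types.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O (Da)
PROTON = 1.007276   # mass of a proton (Da)


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def fragment_mz(sequence):
    """Singly charged b- and y-ion m/z values for an unmodified peptide."""
    b_ions, y_ions = [], []
    for i in range(1, len(sequence)):
        # b ion: N-terminal residues plus one proton.
        b_ions.append(sum(RESIDUE_MASS[aa] for aa in sequence[:i]) + PROTON)
        # y ion: C-terminal residues plus water plus one proton.
        y_ions.append(
            sum(RESIDUE_MASS[aa] for aa in sequence[i:]) + WATER + PROTON
        )
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_mz("PEPTIDE")
    print(round(peptide_mass("PEPTIDE"), 4))
    print([round(m, 4) for m in b])
    print([round(m, 4) for m in y])
```

A search algorithm would generate such lists for every candidate peptide in the database and compare them with the peaks observed in the experimental spectrum.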

Many search algorithms are available, either commercially or as open-source software. Sequest was the first such algorithm; it scores matches by the number of spectral peaks that are common to the theoretical and experimental spectra (Eng et al 1994).
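The idea of scoring by shared peaks can be sketched as follows. This is a minimal illustration, not Sequest's actual implementation: the function names, the 0.5 m/z tolerance and the example peak lists are all assumptions made for this example.

```python
def shared_peak_count(experimental_mz, theoretical_mz, tolerance=0.5):
    """Count theoretical peaks that have an experimental peak within tolerance."""
    matched = 0
    for t in theoretical_mz:
        if any(abs(e - t) <= tolerance for e in experimental_mz):
            matched += 1
    return matched


def best_candidate(experimental_mz, candidates, tolerance=0.5):
    """Return the candidate name whose theoretical peaks match best.

    `candidates` maps a peptide name to its list of theoretical m/z values.
    """
    return max(
        candidates,
        key=lambda name: shared_peak_count(
            experimental_mz, candidates[name], tolerance
        ),
    )


if __name__ == "__main__":
    # Hypothetical peak lists, for illustration only.
    observed = [100.0, 200.3, 350.0]
    candidates = {
        "candidate_1": [100.2, 200.0, 500.0],
        "candidate_2": [120.0, 260.0, 350.1],
    }
    print(best_candidate(observed, candidates))
```

A plain count like this favours long peptides with many theoretical peaks, which is one reason production algorithms refine or normalize such scores.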

Another search algorithm, X! Tandem, generates theoretical spectra for peptide sequences using information about intensity associated with amino acids. These spectra are compared with experimental data to generate an expectation value (Craig and Beavis 2003; Craig and Beavis 2004).

Mascot, a probabilistic algorithm, estimates the probability that the observed match between an experimental spectrum and a candidate peptide sequence occurred by chance (Kapp et al 2005). Scaffold is a computer program that integrates search results from the above three algorithms (Sequest, X! Tandem and Mascot) to generate peptide and protein identification probabilities.
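Mascot reports its chance-match probability P as a score of -10 log10(P), so that smaller probabilities give larger scores. The conversion can be sketched as below; the function names are hypothetical and the snippet shows only the arithmetic, not how Mascot arrives at P.

```python
import math


def probability_to_score(p):
    """Convert a chance-match probability into a -10*log10(P) score."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -10.0 * math.log10(p)


def score_to_probability(score):
    """Convert a -10*log10(P) score back into a probability."""
    return 10.0 ** (-score / 10.0)


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.001):
        print(p, round(probability_to_score(p), 2))
```

On this scale, every 10 score units correspond to a tenfold drop in the probability that the match is a random event.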

Phenyx uses a scoring scheme based on signal detection theory and pattern recognition to calculate a likelihood ratio that distinguishes true from false peptide identifications (Colinge et al 2003).

Paragon, a search algorithm that is part of the ProteinPilot software suite, uses Sequence Temperature Values along with feature probabilities to identify peptides from a database (Shilov et al 2007).

Examples of other search algorithms are OMSSA, Sonar and Spectrum Mill.

Each search algorithm evidently uses a slightly different approach to identify peptides. The following links lead to guides for the search algorithms that we use to analyze mass spectral data in our laboratory. They are introductory guidelines that beginners can use to set up mass spectral searches.

Decoy Database Search - PDF

Mascot Search Guide - PDF

Phenyx Search Guide - PDF

ProteinPilot Search Guide - PDF

Scaffold Guide - PDF

Results from multiple search algorithms have been compared in recent manuscripts (Kapp et al 2005; Balgley et al 2007). Results from multiple search algorithms have also been combined to increase the number of identifications (Searle et al 2008).
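A very simple way to picture the gain from combining search algorithms is to pool the peptides each one reports and record how many algorithms support each peptide. The sketch below does only that; it is not the probabilistic model of Searle et al or of Scaffold, and the engine names and peptide sequences are made up for illustration.

```python
from collections import Counter


def combine_identifications(results):
    """Pool peptide identifications from several search algorithms.

    `results` maps an algorithm name to the peptides it identified.
    Returns a dict mapping each peptide to the number of algorithms
    that reported it.
    """
    support = Counter()
    for peptides in results.values():
        # Use a set so that each algorithm counts at most once per peptide.
        support.update(set(peptides))
    return dict(support)


if __name__ == "__main__":
    # Hypothetical search results, for illustration only.
    results = {
        "engine_a": ["PEPTIDER", "SAMPLER"],
        "engine_b": ["PEPTIDER", "ELVISK"],
    }
    combined = combine_identifications(results)
    print(len(combined), "peptides in total")
    print(sorted(p for p, n in combined.items() if n > 1))
```

The pooled list is longer than either individual list, while peptides reported by more than one algorithm can be treated with more confidence.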


References:

Eng, J. K., McCormack, A. L., Yates III, J. R., J. Am. Soc. Mass Spectrom. 1994, 5, 976–989.

Craig, R., Beavis, R. C., Bioinformatics 2004, 20, 1466–1467.

Craig, R., Beavis, R. C., Rapid Commun. Mass Spectrom. 2003, 17, 2310–2316.

Kapp EA, Schütz F, Connolly LM, Chakel JA, Meza JE, Miller CA, Fenyo D, Eng JK, Adkins JN, Omenn GS, Simpson RJ. Proteomics. 2005, 5, 3475-3490.

http://www.proteomesoftware.com/Proteome_software_prod_Scaffold.html

Colinge, J.; Masselot, A.; Giron, M.; Dessingy, T.; Magnin, J. Proteomics 2003, 3, 1454-1463.

Shilov IV, Seymour SL, Patel AA, Loboda A, Tang WH, Keating SP, Hunter CL, Nuwaysir LM, Schaeffer DA. Mol Cell Proteomics. 2007, 6, 1638-1655.

Balgley BM, Laudeman T, Yang L, Song T, Lee CS. Mol Cell Proteomics. 2007, 6, 1599-1608.

Searle BC, Turner M, Nesvizhskii AI. J Proteome Res. 2008, 7, 245-253.

© 2008 The Regents of the University of Michigan