Introduction
In shotgun proteomics, database searching of tandem mass spectra produces a large number of
peptide-spectrum matches (PSMs), many of which are false positives. Quality control of PSMs is a multiple hypothesis
testing problem, and the false discovery rate (FDR) and the posterior error probability (PEP) are the commonly used
statistical confidence measures. PEP, also called local FDR, evaluates the confidence of an individual PSM and is thus
more desirable than FDR, which evaluates the global confidence of a collection of PSMs. PEP can be estimated by
decomposing the PSM score distribution into its null and alternative components, provided that sufficient data are available.
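
The decomposition mentioned above can be written in the standard two-component mixture form. The symbols here are our own notation, introduced only for illustration: s is a PSM score, t a score threshold, \pi_0 the proportion of false matches, f_0 and f_1 the null and alternative score densities, and F_0 and F the cumulative distributions of the null and of the mixture.

```latex
\begin{align}
f(s) &= \pi_0 f_0(s) + (1 - \pi_0) f_1(s) \\
\mathrm{PEP}(s) &= \frac{\pi_0 f_0(s)}{f(s)} \\
\mathrm{FDR}(t) &= \frac{\pi_0 \bigl(1 - F_0(t)\bigr)}{1 - F(t)}
  = \mathbb{E}\bigl[\mathrm{PEP}(S) \mid S \ge t\bigr]
\end{align}
```

In this form, FDR at a threshold is the average PEP of all PSMs scoring at or above that threshold, which is why PEP is the finer-grained measure.
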
However, in many proteomic studies, only a group (subset) of PSMs, e.g. those with specific post-translational modifications,
is of interest. The group can be very small, making direct PEP estimation from the group data inaccurate, especially in
the high-score region where the score threshold is set. Using the whole set of PSMs to estimate the group PEP is also
inappropriate, because the null and/or alternative distributions of the group can be very different from those of the combined scores.
The transfer PEP algorithm is proposed to estimate the PEPs of peptide identifications in small groups more accurately.
Transfer PEP derives the group null distribution through its empirical relationship with the combined null distribution,
and estimates the group alternative distribution, as well as the null proportion, using an iterative semi-parametric method.
Validated on both simulated data and real proteomic data, transfer PEP showed higher accuracy than direct PEP
estimation on either the combined data or the group data alone. We present a novel approach to PEP estimation for small groups and
implement it for the peptide identification problem in proteomics. In principle, the methodology is
applicable to small-group PEP estimation problems in other fields.
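
As a minimal sketch of why a combined model can mislead for a group, the Python snippet below computes the PEP of one score under two hypothetical mixtures. The Gaussian densities, every parameter value, and the function names are illustrative assumptions of ours. This is not the transfer PEP algorithm or its Matlab implementation, only the basic PEP calculation it builds on.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def pep(score, pi0, null_pdf, alt_pdf):
    """Posterior error probability (local FDR) of a single score.

    pi0      -- proportion of false matches (null proportion)
    null_pdf -- score density of false matches
    alt_pdf  -- score density of true matches
    """
    null_part = pi0 * null_pdf(score)
    alt_part = (1.0 - pi0) * alt_pdf(score)
    total = null_part + alt_part
    # If both densities vanish, treat the match as an error.
    return null_part / total if total > 0.0 else 1.0


# Hypothetical densities: the group's null is shifted toward higher scores.
def combined_null(s):
    return normal_pdf(s, 0.0, 1.0)


def group_null(s):
    return normal_pdf(s, 1.0, 1.0)


def alternative(s):
    return normal_pdf(s, 4.0, 1.0)


score = 3.0
# Hypothetical null proportions: 0.5 for all PSMs, 0.8 within the group.
pep_combined = pep(score, 0.5, combined_null, alternative)
pep_group = pep(score, 0.8, group_null, alternative)

print(f"PEP under the combined model: {pep_combined:.3f}")
print(f"PEP under the group model:    {pep_group:.3f}")
```

With these made-up parameters, the same score receives a higher PEP under the group model than under the combined model, so applying the combined model to the group would understate its error.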
Software
The transfer PEP algorithm is implemented in Matlab. The
source code and user guide are available at https://github.com/XinpeiYi/Transfer-PEP.
A test data set is also available for download.
Publication
Xinpei Yi, Fuzhou Gong, Yan Fu. Transfer posterior error probability estimation for peptide identification. 2020. Submitted.
Contact
Address: No. 55 Zhongguancun East Road,
Haidian District, Beijing, China
Postcode: 100190
For any problems with the software or this website, please contact:
Yan Fu's Research Group