Yan Fu, PhD, Professor

Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Address: No.55 Zhongguancun East Road, Haidian District, Beijing, 100190, China

E-mail: yfu(at)amss(dot)ac(dot)cn

Website: http://fugroup.amss.ac.cn/


Research Interests

Bioinformatics, biostatistics, machine learning and data science

Research Experience

2021.04—now Professor at Academy of Mathematics and Systems Science, Chinese Academy of Sciences

2011.12—2021.04 Associate Professor at Academy of Mathematics and Systems Science, Chinese Academy of Sciences

2009.09—2011.12 Associate Professor at Institute of Computing Technology, Chinese Academy of Sciences

2007.03—2009.09 Assistant Professor at Institute of Computing Technology, Chinese Academy of Sciences

2000.09—2007.03 Ph.D. student at Institute of Computing Technology, Chinese Academy of Sciences

Selected Publications

Kun He, Meng-jie Li, Yan Fu*, Fu-zhou Gong, Xiao-ming Sun. Null-free False Discovery Rate Control Using Decoy Permutations. Acta Mathematicae Applicatae Sinica, English Series, 38(2):235-253, 2022.

Jinghan Yang#, Zhiqiang Gao#, Xiuhan Ren, Jie Sheng, Ping Xu, Cheng Chang*, Yan Fu*. DeepDigest: prediction of protein proteolytic digestion with deep learning. Analytical Chemistry, 93(15):6094-6103, 2021.

Xinpei Yi, Fuzhou Gong*, Yan Fu*. Transfer posterior error probability estimation for peptide identification. BMC Bioinformatics, 21:173, 2020.

Feng Xu#*, Li Yu#, Xuehui Peng#, Junling Zhang, Suzhen Li, Shu Liu, Yanan Yi, Zhiwu An, Fuqiang Wang, Yan Fu*, Ping Xu*. Unambiguous Phosphosite Localization through the Combination of Trypsin and LysargiNase Mirror Spectra in a Large-Scale Phosphoproteome Study. Journal of Proteome Research, 19(6):2185-2194, 2020.

Qingbo Shu#, Mengjie Li#, Lian Shu#, Zhiwu An, Jifeng Wang, Hao Lv, Ming Yang, Tanxi Cai, Tony Hu, Yan Fu* and Fuquan Yang*. Large-scale Identification of N-linked Glycopeptides in Human Serum using HILIC Enrichment and Spectral Library Search. Molecular & Cellular Proteomics, 19:672–689, 2020.

Zhiqiang Gao#, Cheng Chang#, Jinghan Yang, Yunping Zhu*, Yan Fu*. AP3: An Advanced Proteotypic Peptide Predictor for Targeted Proteomics by Incorporating Peptide Digestibility. Analytical Chemistry, 91:8705-8711, 2019.

Zhiwu An#, Linhui Zhai#, Wantao Ying, Xiaohong Qian, Fuzhou Gong*, Minjia Tan* and Yan Fu*. PTMiner: Localization and Quality Control of Protein Modifications Detected in an Open Search and Its Application to Comprehensive Post-translational Modification Characterization in Human Proteome. Molecular & Cellular Proteomics, 18(2):391-405, 2019.

Cheng Chang#, Zhiqiang Gao#, Wantao Ying#, Yan Fu*, Yan Zhao, Songfeng Wu, Mengjie Li, Guibin Wang, Xiaohong Qian*, Yunping Zhu*, Fuchu He*. LFAQ: towards unbiased label-free absolute protein quantification by predicting peptide quantitative factors. Analytical Chemistry, 91:1335-1343, 2019.

Yi Shao, Chunyan Chen, Hao Shen, Bin Z He, Daqi Yu, Shuai Jiang, Shilei Zhao, Zhiqiang Gao, Zhenglin Zhu, Xi Chen, Yan Fu, Hua Chen, Ge Gao, Manyuan Long, Yong E Zhang. GenTree, an integrated resource for analyzing the evolution and function of primate-specific coding genes. Genome Research, 29(4):682-696, 2019.

Xinpei Yi#, Bo Wang#, Zhiwu An, Fuzhou Gong*, Jing Li*, Yan Fu*, Quality control of single amino acid variations detected by tandem mass spectrometry, Journal of Proteomics, 187:144–151, 2018.

Zhiwu An#, Qingbo Shu#, Hao Lv, Lian Shu, Jifeng Wang, Fuquan Yang*, Yan Fu*, N-Linked Glycopeptide Identification Based on Open Mass Spectral Library Search, BioMed Research International, doi.org/10.1155/2018/1564136, 2018.

Yan Fu. Data Analysis Strategies for Protein Modification Identification. In Klaus Jung (Ed.): Statistical Analysis in Proteomics, Humana Press, New York, NY, 1362:265-275, 2016.

Kun Zhang#, Yan Fu*, Wen-Feng Zeng, Kun He, Hao Chi, Chao Liu, Yan-Chang Li, Yuan Gao, Ping Xu*, Si-Min He*. A note on the false discovery rate of novel peptides in proteogenomics. Bioinformatics, pp3249-3253, 2015.

Shan Lu, Sheng-Bo Fan, Bing Yang, Yu-Xin Li, Jia-Ming Meng, Long Wu, Pin Li, Kun Zhang, Mei-Jun Zhang, Yan Fu, Jin-Cai Luo, Rui-Xiang Sun, Si-Min He, Meng-Qiu Dong. Mapping native disulfide bonds at a proteome scale. Nature Methods, 12:329-331, 2015.

Yan Fu* and Xiaohong Qian. Transferred Subgroup False Discovery Rate for Rare Post-translational Modifications Detected by Mass Spectrometry. Molecular & Cellular Proteomics, 13(5):1359-1368, 2014.

Yan Fu. Kernel Methods and Applications in Bioinformatics. In Kasabov, Nikola K. (Ed.): Handbook of Bio-/Neuro-Informatics, Springer-Verlag Berlin and Heidelberg GmbH & Co. K, pp275-285, 2013.

Yan Fu. Bayesian false discovery rates for post-translational modification proteomics. Statistics and Its Interface, 5:47-59, 2012.

Zuo-Fei Yuan, Chao Liu, Hai-Peng Wang, Rui-Xiang Sun, Yan Fu, Jing-Fen Zhang, Le-Heng Wang, Hao Chi, You Li, Li-Yun Xiu, Wen-Ping Wang, Si-Min He. pParse: a method for accurate determination of monoisotopic peaks in high-resolution mass spectra. Proteomics, 12(2): 226–235, 2012.

Yan Fu, Liyun Xiu, Wei Jia, Ding Ye, Ruixiang Sun, Xiaohong Qian, Si-min He. DeltAMT: a statistical algorithm for fast detection of protein modifications from LC-MS/MS data. Molecular & Cellular Proteomics, 10(5):M110.000455, 2011.

Yan Fu, Rong Pan, Qiang Yang, Wen Gao. Query-Adaptive Ranking with Support Vector Machines for Protein Homology Prediction. In Proceedings of the 7th International Symposium on Bioinformatics Research and Applications (ISBRA2011). Lecture Notes in Bioinformatics, 6674:320–331, 2011.

Ding Ye, Yan Fu*, Ruixiang Sun*, Haipeng Wang, Zuofei Yuan, Hao Chi and Simin He*. Open MS/MS Spectral Library Search to Identify Unanticipated Post-Translational Modifications and Increase Spectral Identification Rate. In Proceedings of the 18th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB 2010). Bioinformatics, 26(12):i399-i406, 2010.

Yan Fu*, Wei Jia, Zhuang Lu, Haipeng Wang, Zuofei Yuan, Hao Chi, You Li, Liyun Xiu, Wenping Wang, Chao Liu, Leheng Wang, Ruixiang Sun, Wen Gao, Xiaohong Qian, Si-Min He. Efficient discovery of abundant post-translational modifications and spectral pairs using peptide mass and retention time differences. The Seventh Asia-Pacific Bioinformatics Conference (APBC 2009). BMC Bioinformatics. 10(Suppl 1):S50, 2009.

Wei Jia#, Zhuang Lu#, Yan Fu#, Hai-Peng Wang, Le-Heng Wang, Hao Chi, Zuo-Fei Yuan, Zhao-Bin Zheng, Li-Na Song, Huan-Huan Han, Yi-Min Liang, Jing-Lan Wang, Yun Cai, Yu-Kui Zhang, Yu-Lin Deng, Wan-Tao Ying, Si-Min He, and Xiao-Hong Qian. A strategy for precise and large-scale identification of core fucosylated glycoproteins. Molecular & Cellular Proteomics. 8:913-923, 2009.

Yan Fu, Wen Gao, Simin He, Ruixiang Sun, Hu Zhou, Rong Zeng. Mining Tandem Mass Spectral Data to Develop a More Accurate Mass Error Model for Peptide Identification. Pacific Symposium on Biocomputing (PSB) 12:421-432, 2007.

Le-Heng Wang, De-Quan Li, Yan Fu, Hai-Peng Wang, Jing-Fen Zhang, Zuo-Fei Yuan, Rui-Xiang Sun, Rong Zeng, Si-Min He, Wen Gao. pFind 2.0: a software package for peptide and protein identification via tandem mass spectrometry. Rapid Communications in Mass Spectrometry, 21:2985-2991, 2007.

Haipeng Wang, Yan Fu, Ruixiang Sun, Simin He, Rong Zeng, and Wen Gao. An SVM Scorer for More Sensitive and Reliable Peptide Identification via Tandem Mass Spectrometry. Pacific Symposium on Biocomputing (PSB) 11:303-314, 2006.

Dequan Li, Yan Fu, Ruixiang Sun, Charles X. Ling, Yonggang Wei, Hu Zhou, Rong Zeng, Qiang Yang, Simin He and Wen Gao. pFind: a novel database-searching software system for automated peptide and protein identification via tandem mass spectrometry. Bioinformatics, 21(13), pp3049-3050, 2005.

Yan Fu, Ruixiang Sun, Qiang Yang, Simin He, Chunli Wang, Haipeng Wang, Shiguang Shan, Junfa Liu, Wen Gao. A Block-Based Support Vector Machine Approach to the Protein Homology Prediction Task in KDD Cup 2004. ACM SIGKDD Explorations. Vol.6, No.2, pp120-124, 2004.

Yan Fu, Qiang Yang, Ruixiang Sun, Dequan Li, Rong Zeng, Charles X. Ling, Wen Gao. Exploiting the kernel trick to correlate fragment ions for peptide identification via tandem mass spectrometry. Bioinformatics. Vol.20, pp1948-1954, 2004.

Yan Fu, Qiang Yang, Charles X. Ling, Haipeng Wang, Dequan Li, Ruixiang Sun, Hu Zhou, Rong Zeng, Yiqiang Chen, Simin He, Wen Gao. A Kernel-based Case Retrieval Algorithm with Application to Bioinformatics. In Proceedings of the 8th Pacific Rim International Conference on Artificial Intelligence (PRICAI 2004), Auckland, New Zealand, August 9-13, 2004, LNAI 3157, pp. 544–553.

Yan Fu, Simin He, Ruixiang Sun, Leheng Wang. A review of Key computational problems in tandem mass spectrometry-based protein identification. Information Technology Letter, 8(1):16-32, 2010. (in Chinese)

Yan Fu. Machine Learning Based Bioinformation Retrieval. Doctoral dissertation, Chinese Academy of Sciences, 2007. (in Chinese)

Ruixiang Sun, Yan Fu, Dequan Li, Jingfen Zhang, Xiaobiao Wang, Quanhu Sheng, Rong Zeng, Yiqiang Chen, Simin He, Wen Gao. Mass Spectrometry-Based Computational Proteomics Research. SCIENCE IN CHINA Ser. E Information Sciences. 36(2), 222-234, 2006. (in Chinese)

Yiqiang Chen, Wen Gao, Yan Fu, Dequan Li, Xiang Chen. Research on Protein Recognition Based on Information Technology. Chinese Bulletin of Life Sciences, Vol.15, No.2, pp70-78, 2003. (in Chinese)

Yan Fu, Yaowei Wang, Weiqiang Wang, Wen Gao. Content-Based Natural Image Classification and Retrieval Using SVM. Chinese Journal of Computers, Vol.26, No.10, pp.1261-1265, 2003. (in Chinese)

Yan Fu, Tiejun Huang, Ke Yu, Tao Li, Hao Zhang. Overview of Interactive Model of Computing. Chinese Journal of Computer Research and Development, vol.39, no.6, pp701-706, 2002. (in Chinese)

Software Tools

DeepDigest: A software tool for prediction of peptide digestibilities in proteomics using deep learning

PTMiner: Localization and Quality Control of Protein Modifications Detected by Open Search

SAVControl: Quality control of single amino acid variations detected by tandem mass spectrometry

pMatchGlyco: N-Linked Glycopeptide Identification Based on Open Mass Spectral Library Search

LFAQ: Unbiased label-free absolute protein quantification by predicting peptide quantitative factors

AP3: Prediction of proteotypic peptides in proteomics using random forest algorithm

TransferPEP: Transfer posterior error probability (local FDR) estimation for peptide identification

pFind: a database-searching engine for peptide & protein identification via tandem mass spectrometry

pMatch: an open MS/MS library search tool for identification of peptides and their modifications

pCluster: a clustering tool for modification detection using LC, MS or MS/MS information


Honors and Awards

Microsoft Fellowship 2004, Microsoft Research Asia

Champion of ACM KDD Cup 2004 data mining competition