Genes (Basel). 2024 Jun 13;15(6):775. [Epub ahead of print]
High-throughput proteomics data generated by increasingly sensitive mass spectrometers contribute greatly to our understanding of the molecular and cellular mechanisms operating in living organisms. Nevertheless, proteomics analyses depend on accurate genomic and protein annotations, and some information may be lost if these resources are incomplete. Here, we show that most proteomics data may be recovered by interconnecting genomics and proteomics approaches (i.e., following a proteogenomic strategy), which in turn improves gene/protein models. In this study, we generated proteomics data from Leishmania donovani (HU3 strain) promastigotes that allowed us to detect 1908 proteins in this developmental stage on the basis of the annotated proteins currently available in public databases. However, when the proteomics data were searched against all possible open reading frames in the L. donovani genome, 20 new protein-coding genes could be annotated. Additionally, 43 previously annotated proteins were extended at their N-terminal ends to accommodate peptides detected in the proteomics data. Different post-translational modifications (phosphorylation, acetylation, and methylation, among others) were also found in a large number of Leishmania proteins. Furthermore, a detailed comparative analysis of the L. donovani and Leishmania major experimental proteomes illustrated how inaccurate conclusions can be drawn if proteomes are compared solely on the basis of the lists of proteins identified in each proteome. Finally, we have created data entries (based on freely available repositories) to provide and maintain updated gene/protein models. Raw data are available via ProteomeXchange with the identifier PXD051920.
Keywords: Leishmania donovani; experimental proteome; mass spectrometry; post-translational modifications (PTMs); proteogenomics