bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2021‒10‒10
four papers selected by
Thomas Martinez
Salk Institute for Biological Studies


  1. Front Cell Dev Biol. 2021;9:720570
      Bioactive peptides exhibit key roles in a wide variety of complex processes, such as regulation of body weight, learning, aging, and innate immune response. Next to the classical bioactive peptides, emerging from larger precursor proteins by specific proteolytic processing, a new class of peptides originating from small open reading frames (sORFs) have been recognized as important biological regulators. However, their intrinsic properties, specific expression pattern and location on presumed non-coding regions have hindered the full characterization of the repertoire of bioactive peptides, despite their predominant role in various pathways. Although the development of peptidomics has offered the opportunity to study these peptides in vivo, it remains challenging to identify the full peptidome, as the lack of cleavage enzyme specification and the large search space complicate conventional database search approaches. In this study, we introduce a proteogenomics methodology using a new type of mass spectrometry instrument and the implementation of machine learning tools toward improved identification of potential bioactive peptides in the mouse brain. The application of trapped ion mobility spectrometry (TIMS) coupled to a time-of-flight (TOF) mass analyzer offers improved sensitivity, enhanced peptide coverage, reduced chemical noise and a reduced occurrence of chimeric spectra. Subsequently, the machine learning tools MS2PIP, which predicts fragment ion intensities, and DeepLC, which predicts retention times, improve database searching against a large and comprehensive custom database containing both sORFs and alternative ORFs. Finally, the identification of peptides is further enhanced by applying the post-processing semi-supervised learning tool Percolator.
Applying this workflow, the first peptidomics workflow combined with spectral intensity and retention time predictions, we identified a total of 167 predicted sORF-encoded peptides (SEPs), of which 48 originate from presumed non-coding locations, next to 401 peptides from known neuropeptide precursors, linked to 66 annotated bioactive neuropeptides from within 22 different families. Additional PEAKS analysis expanded the pool of SEPs on presumed non-coding locations to 84, while an additional 204 peptides completed the list of peptides from neuropeptide precursors. Altogether, this study provides insights into a new robust pipeline that fuses technological advancements from different fields, ensuring an improved coverage of the neuropeptidome in the mouse brain.
    Keywords:  micropeptide; neuropeptide; non-coding; peptidomics; proteogenomics analysis; sORF-encoded polypeptide (SEP); spectral intensity prediction; timsTOF Pro mass spectrometry
    DOI:  https://doi.org/10.3389/fcell.2021.720570
  2. Neurobiol Aging. 2021 Sep 10. pii: S0197-4580(21)00288-8. [Epub ahead of print]
      Premature termination codon (PTC) mutations in the granulin gene (GRN) lead to loss-of-function (LOF) of the progranulin protein (PGRN), causing frontotemporal lobar degeneration (FTLD) by haploinsufficiency. GRN expression is regulated at multiple levels, including the 5' untranslated region (UTR). The main 5' UTR of GRN and an alternative 5' UTR contain upstream open reading frames (uORFs). These mRNA elements generally act as cis-repressors of translation. Disruption of each uORF of the alternative 5' UTR increases protein expression, with the 2 ATG-initiated uORFs being capable of initiating translation. We performed targeted sequencing of the uORF regions in a Flanders-Belgian cohort of patients with frontotemporal dementia (FTD) and identified 2 genetic variants, one in each 5' UTR. Both variants increase downstream protein levels, with the main 5' UTR variant rs76783532 causing a significant 1.5-fold increase in protein expression. We observed that functional uORFs in the alternative 5' UTR act as potential regulators of PGRN expression and demonstrated that genetic variation within GRN uORFs can alter their function.
    Keywords:  FTD; Frontotemporal lobar degeneration; GRN gene; PGRN protein; Rare genetic variants; Upstream open reading frame
    DOI:  https://doi.org/10.1016/j.neurobiolaging.2021.09.007
  3. Nat Rev Genet. 2021 Oct 05.
      Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
    DOI:  https://doi.org/10.1038/s41576-021-00417-w
  4. EMBO J. 2021 Oct 06. e108542
      Bacterial small RNAs (sRNAs) are well known to modulate gene expression by base pairing with trans-encoded transcripts and are typically non-coding. However, several sRNAs have been reported to also contain an open reading frame and thus are considered dual-function RNAs. In this study, we discovered a dual-function RNA from Vibrio cholerae, called VcdRP, harboring a 29-amino-acid small protein (VcdP), as well as a base-pairing sequence. Using a forward genetic screen, we identified VcdRP as a repressor of cholera toxin production and linked this phenotype to the inhibition of carbon transport by the base-pairing segment of the regulator. By contrast, we demonstrate that the VcdP small protein acts downstream of carbon transport by binding to citrate synthase (GltA), the first enzyme of the citric acid cycle. Interaction of VcdP with GltA results in increased enzyme activity, and together VcdR and VcdP reroute carbon metabolism. We further show that transcription of vcdRP is repressed by CRP, allowing us to provide a model in which VcdRP employs two different molecular mechanisms to synchronize central metabolism in V. cholerae.
    Keywords:  Vibrio cholerae; Hfq; citrate synthase; dual-function RNA; small protein
    DOI:  https://doi.org/10.15252/embj.2021108542