bims-micpro Biomed News
on Discovery and characterization of microproteins
Issue of 2025–11–23
27 papers selected by
Thomas Farid Martínez, University of California, Irvine



  1. Methods Mol Biol. 2026 ;2992 75-89
      Small ORFs (sORFs) may encode signaling peptides or microproteins that regulate diverse developmental and environmental responses. Additionally, their translation may play a crucial role in the biogenesis of noncoding RNAs. Despite their importance, sORFs are typically absent from genome annotations, and their prediction remains a significant challenge. Here, we present a bioinformatics pipeline for the identification and visualization of novel translated sORFs using Ribo-seq data in Arabidopsis. This pipeline integrates: (1) Ribo-seq/RNA-seq preprocessing and mapping, (2) (optional) transcriptome assembly to detect unannotated transcripts, (3) ORF discovery using RiboTaper, and (4) the gene viewer ggRibo for high-resolution visualization of individual sORF expression. By combining these modules, our approach provides a powerful framework for analyzing Ribo-seq data, facilitating the discovery, visualization, and future functional characterization of hidden sORFs. This pipeline can be applied to different organisms for systematic identification and visualization of novel sORFs.
    Keywords:  Ribo-seq; RiboTaper; Ribosome profiling; Small ORFs; Transcriptome assembly; ggRibo
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_7
  2. Methods Mol Biol. 2026 ;2992 91-112
      Microproteins encoded from small open reading frames (smORFs) in the human genome have long been hypothesized to play physiological and regulatory roles, but they have historically been excluded from genome annotations due to the challenges in their identification. The advent of ribosome profiling (Ribo-seq), a deep sequencing technology to capture genome-wide translation, has surfaced thousands of novel ORFs with the potential to encode previously unannotated microproteins. However, due to variability in data quality and sparseness in read coverage, distinguishing truly translated smORFs from background noise remains challenging. While there are many approaches to address these challenges, here we describe the translation signature approach. This approach utilizes large-scale pooled Ribo-seq data to enable visualization of translation in individual ORFs with unprecedented clarity as compared to individual samples. It then quantifies evidence of translation by using translation signature scores, which include three metrics, namely, P-sites in frame, uniformity, and drop-off. Lastly, annotated protein-coding ORFs are used as a reference to learn the expected range of the translation signature scores reflecting active translation and novel ORFs with such scores are prioritized. We summarize here the key data resources and methods, as well as essential considerations for conducting such an analysis. Additionally, we describe a web resource hosting the default set of smORFs generated using the described workflow. With an increasing volume of high-quality Ribo-seq datasets, the translation signature scores provide a robust framework to prioritize ORFs with strong evidence of translation.
    Keywords:  Microproteins; RNA; Small open reading frames; Translation; smORFs
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_8
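
    The abstract above names three translation signature metrics (P-sites in frame, uniformity, and drop-off) but does not give their formulas. The following is a minimal Python sketch under stated assumptions, not the chapter's implementation: in-frame is taken as the fraction of ORF P-sites in frame 0, uniformity as the fraction of codons with at least one P-site, and drop-off as one minus the ratio of P-site density in a window after the stop codon to the density inside the ORF. The function name, the input layout, and the 30-nt window are hypothetical.

```python
def translation_signature(psites, orf_start, orf_end, flank=30):
    """Toy translation-signature scores for one ORF (illustrative definitions only).

    psites    : dict mapping 0-based transcript position -> P-site read count
    orf_start : position of the first nucleotide of the start codon
    orf_end   : position one past the last nucleotide of the stop codon
    flank     : downstream window (nt) used for the drop-off score (assumed value)
    """
    orf_len = orf_end - orf_start
    if orf_len <= 0 or orf_len % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")

    # P-sites that fall inside the ORF
    in_orf = {p: n for p, n in psites.items() if orf_start <= p < orf_end and n > 0}
    total = sum(in_orf.values())
    if total == 0:
        return {"in_frame": 0.0, "uniformity": 0.0, "drop_off": 0.0}

    # Assumption: fraction of P-site reads on the first nucleotide of a codon
    in_frame = sum(n for p, n in in_orf.items() if (p - orf_start) % 3 == 0) / total

    # Assumption: fraction of codons carrying at least one P-site
    covered = {(p - orf_start) // 3 for p in in_orf}
    uniformity = len(covered) / (orf_len // 3)

    # Assumption: 1 - (downstream density / ORF density), floored at 0
    downstream = sum(n for p, n in psites.items() if orf_end <= p < orf_end + flank)
    drop_off = max(0.0, 1.0 - (downstream / flank) / (total / orf_len))

    return {"in_frame": in_frame, "uniformity": uniformity, "drop_off": drop_off}


# Toy example: a 4-codon ORF with one out-of-frame read and one downstream read
scores = translation_signature({0: 5, 3: 5, 6: 4, 7: 1, 15: 1}, orf_start=0, orf_end=12)
```

    In the approach described above, scores like these would be compared against the range observed for annotated protein-coding ORFs; that calibration step is not shown here.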
  3. Methods Mol Biol. 2026 ;2992 3-18
      Once dismissed as nonfunctional transcriptional noise, small open reading frames (sORFs) and their encoded microproteins have rapidly emerged as key players in diverse biological processes. Ranging from just a few to about 150 amino acids in length, microproteins are now recognized for their ability to modulate cellular functions, often acting as dominant-negative regulators, scaffolds, or signaling intermediates. Initially overlooked due to technical and conceptual limitations, they are increasingly being detected thanks to advances in high-resolution proteomics, ribosome profiling, and integrative bioinformatics. In this chapter, we provide a concise overview of the discovery, origins, functions, and biological significance of microproteins. We also introduce the structure of this methods book, which captures the latest experimental and computational tools used to identify, characterize, and functionally dissect microproteins across a wide range of organisms and research disciplines. This emerging field exemplifies the power of cross-disciplinary collaboration, and we hope this volume will support and inspire further research in this exciting and rapidly expanding area.
    Keywords:  Evolution; Microproteins; Protein structure; lncRNAs; sORFs
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_1
  4. Methods Mol Biol. 2026 ;2992 113-125
      Small open reading frames (smORFs, <100 amino acids) are a ubiquitous feature of eukaryotic and prokaryotic genomes. However, only a fraction of these smORFs encode functional proteins, known as microproteins. Functional microproteins have been found in all three domains of life, although due to limitations in their prediction and characterization, only a few have been described in plants. Therefore, the catalogization and functional analysis of plant smORF-encoded microproteins is far from complete. In this chapter, we present a bioinformatic pipeline designed to predict and validate coding smORFs in plant transcriptomes. The workflow comprises several key steps: transcriptome assembly, prediction of potentially coding smORFs, filtration of candidates, and validation of their translation through analysis of mass-spectrometry data.
    Keywords:  Mass spectrometry; Microproteins; Plant genomes; Transcriptomes; smORFs
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_9
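
    The smORF prediction step in the workflow above can be illustrated with a minimal Python sketch. It is not the chapter's pipeline: it scans only the forward strand of a single transcript for ATG-to-stop ORFs and keeps those below the 100-amino-acid cutoff given in the abstract. The function name and the lower length bound (`min_aa`) are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(seq, max_aa=100, min_aa=10):
    """Return (start, end, n_aa) for forward-strand ORFs with min_aa <= n_aa < max_aa.

    start is the 0-based index of the ATG, end is one past the stop codon,
    and n_aa counts codons from the ATG up to (not including) the stop.
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if min_aa <= n_aa < max_aa:
                    hits.append((start, i + 3, n_aa))
                start = None  # look for the next ORF after this stop
    return sorted(hits)


# Toy transcript: 2-nt leader, ATG, 12 alanine codons, stop, 1-nt trailer
toy = "CC" + "ATG" + "GCT" * 12 + "TAA" + "G"
smorfs = find_smorfs(toy)
```

    Candidates found this way would still need the filtering and mass-spectrometry validation steps that the chapter describes.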
  5. Methods Mol Biol. 2026 ;2992 51-60
      Microproteins, encoded by small open reading frames (smORFs or sORFs) of less than 100 amino acids, play critical roles in gene expression, cellular signaling and metabolism, and hold potential as therapeutic targets. Despite recent genome annotation efforts, many smORFs remain unannotated, leaving a significant gap in our understanding of microprotein structure and function. Efficient microprotein identification is essential for functional studies and improving genome annotation, but remains a technical challenge. Here, we describe a proteogenomic approach that combines microprotein enrichment with mass spectrometry to identify both annotated and novel smORFs. Using this approach, we have successfully identified and validated several novel smORFs in the fully annotated yeast genome. While yeast serves as a model system, this protocol is adaptable to samples from different species, providing a robust framework for microprotein discovery.
    Keywords:  Mass spectrometry; Microprotein; Microprotein enrichment; Proteogenomics; smORF
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_5
  6. Methods Mol Biol. 2026 ;2992 61-71
      SEPs are important as co-operators of various biological processes with canonical proteins. In microalgae, they are reported to be involved in photosynthesis, nitrogen metabolism, stress response, and so on. Because of low abundance and small molecular weight, identifying SEPs in proteomics is usually challenging, requiring specially designed technical strategies for sample handling and data analysis. Here, we present two method examples for SEP in microalgae (one cyanobacterium and one green alga), from SEP extraction and enrichment to database search, aiming to achieve an ideal identification of SEPs while reducing the interference of canonical proteins. This chapter will provide a reference procedure for SEP exploration in microalgae.
    Keywords:  Chlamydomonas reinhardtii; Cyanobacteria; Green algae; Mass spectrometry; Microalgae; PCC 6803; SEPs; Small protein enrichment; Synechocystis sp.
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_6
  7. Methods Mol Biol. 2026 ;2992 31-39
      Alternative proteins (AltProts) produced from short open reading frames have been missed when genomes were annotated primarily due to their small sizes. In addition, their limited size and low abundance further challenge their identification using conventional proteomics. Several specific workflows have been developed to identify AltProts by mass spectrometry. However, the different protocols rely on the separation of AltProts from canonical proteins, thus preventing concomitant identification of both types of molecules within a single sample. Here, we present an experimental strategy for simultaneous identification of both canonical proteins and AltProts within a single mass spectrometry analysis. This approach includes a simple protein extraction method and an analysis pipeline dedicated to the identification of AltProts.
    Keywords:  Alternative proteins; Data-dependent acquisition; Mass spectrometry; Microprotein; Peptide; Peptidomics; Proteomics; Short open reading frame; Short open reading frame-encoded polypeptide
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_3
  8. Methods Mol Biol. 2026 ;2992 151-170
      MicroProteins (miPs), small functional proteins derived from short open reading frames (sORFs), play pivotal roles in posttranslational regulation by interacting with multidomain proteins. Acting as dominant-negative regulators, miPs modulate transcriptional activity and protein complex assembly. Their compact structure and functional versatility make them indispensable in cellular regulation, with significant implications in diseases such as cancer and neurodegenerative disorders. Intronless genes (IGs), characterized by the absence of introns in their coding sequences, are linked to key regulatory functions in development, cell proliferation, and disease pathways. While the roles of many IG-encoded proteins are well established, the potential of these genes to encode miPs, as well as their evolutionary history, remains largely unexplored. This chapter introduces a computational framework integrating three bioinformatics tools to investigate the evolution in vertebrates and the functional roles of miPs encoded by IGs. The framework begins with miPFinder2, which identifies potential miPs by annotating small peptides. Next, IGFinder classifies IGs based on genomic and UTR features. Finally, REvolutionH-tl reconstructs evolutionary histories that encompass orthogroups, orthologs, paralogs, gene trees, species trees, and species tree/gene tree reconciliations. This integrative approach provides comprehensive insights into the structural, functional, and evolutionary reconstruction of miPs encoded by IGs, contributing to advancements in functional genomics, evolutionary biology, and disease research.
    Keywords:  Evolutionary biology; Functional genomics; Gene duplication; Intronless genes; Microproteins; Orthology; Phylogenetic analysis; Short open reading frames
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_11
  9. Methods Mol Biol. 2026 ;2992 19-29
      Ribo-seq and proteomics have revealed that mammalian genomes harbor thousands of unannotated small and alternative open reading frames (sm/alt-ORFs, <150 codons) that are translated into microproteins and alternative proteins (micro/alt-proteins). Several dozen micro/alt-proteins have been characterized at the molecular level and demonstrated to play important biological roles. However, the overwhelming majority of micro/alt-proteins remain undefined, possibly because they are excluded from protein databases and cannot be detected by a classical proteomic workflow. Here, we introduce a proteomic pipeline to detect unannotated micro/alt-proteins in cultured cells or tissues, covering total protein extraction, gel-based size selection, digestion, liquid chromatography-tandem mass spectrometry (LC-MS/MS) proteomics, data analysis, and manual validation. This approach is able to identify unannotated micro/alt-proteins from cultured cells and (patho)physiological tissues.
    Keywords:  Alternative open reading frame (alt-ORF); Alternative protein (alt-protein); Microprotein; Proteomics; Small open reading frame (smORF); Unannotated protein
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_2
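
    The digestion step in the pipeline above can be illustrated with a short Python sketch of in silico tryptic cleavage. This is an illustration under stated assumptions, not the chapter's method: cleavage is modeled with the common rule "after K or R, but not before P", missed cleavages are ignored, and the 7 to 30 residue window for peptides considered detectable by LC-MS/MS is an assumed value. The function name is hypothetical.

```python
import re


def tryptic_peptides(protein, min_len=7, max_len=30):
    """Fully cleaved tryptic peptides whose length falls in [min_len, max_len].

    Cleavage rule (simplified): C-terminal to K or R unless the next residue is P.
    """
    # Zero-width split: position preceded by K/R and not followed by P
    pieces = re.split(r"(?<=[KR])(?!P)", protein.upper())
    return [p for p in pieces if min_len <= len(p) <= max_len]


# Toy sequence: one cleavage after the first K; the R-P bond is left intact
peptides = tryptic_peptides("MAAAAAAKGGGGGGGRPLLLLLLK")
```

    A very short sequence may yield no peptide in the accepted window at all (for example, `tryptic_peptides("MKR")` returns an empty list), which is one simple way to see why short proteins can be hard to detect in a bottom-up workflow.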
  10. Methods Mol Biol. 2026 ;2992 41-49
      Microproteins and microOpen Reading Frames (microORFs) are small proteins encoded by short, often overlooked, coding regions in the genome. These peptides are typically under 100 amino acids in length and are believed to play critical roles in cellular processes such as stress response, metabolic regulation, and protein quality control. However, their discovery has been hindered by limitations in traditional proteomics and gene annotation pipelines. In this chapter, we present a novel method for the systematic identification of microproteins and microORFs across a range of eukaryotic species using an integrative approach that combines ribosome profiling, mass spectrometry-based proteomics, and advanced bioinformatics tools. Our method enhances the sensitivity of microprotein detection and allows for the validation of candidate microORFs in vivo. This methodology provides a comprehensive platform for microprotein discovery, offering new insights into gene regulation and cellular function.
    Keywords:  Bioinformatics; Cellular Function; Mass Spectrometry; MicroORFs; Microproteins; Ribosome Profiling; Small Peptides
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_4
  11. Methods Mol Biol. 2026 ;2990 253-281
      Ribosome profiling (or ribo-sequencing) is a powerful technique that enables high-resolution analysis of active translation by sequencing ribosome-protected mRNA fragments. Developed in 2009 by Nicholas Ingolia and Jonathan Weissman, it provides direct insights into which mRNAs are being translated and at what rate, surpassing traditional transcriptomic and proteomic methods. Ribo-seq allows for precise mapping of ribosome positions, enabling detailed characterization of ribosome dynamics and the identification of alternative translation initiation sites and upstream open reading frames. The standard workflow includes key steps such as ribosome stabilization, nuclease digestion, fragment isolation, and deep sequencing. The protocol described in this chapter incorporates a polysome profiling step prior to RNase treatment, allowing, for instance, the isolation of distinct ribosome species (also known as k-somes). Ribo-seq has transformed our understanding of translational regulation and has become an essential tool for omics studies in various fields such as developmental biology, cancer biology, virology, and microbiology.
    Keywords:  Ribo-sequencing; Ribosome profiling; Ribosome-protected fragments; Translatome
    DOI:  https://doi.org/10.1007/978-1-0716-4997-8_19
  12. Methods Mol Biol. 2026 ;2992 267-281
      Micropeptides encoded by small open reading frames (sORFs) represent a newly recognized class of biomolecules involved in important cellular functions such as metabolism, signal transduction and homeostasis. In the past, these peptides have been overlooked due to annotation and technical challenges. Recent advances in high-throughput techniques such as ribosome profiling, liquid chromatography-tandem mass spectrometry (LC-MS/MS) and bioinformatics tools have enabled the systematic identification and characterization of these peptides. In prostate cancer (PCa), urinary micropeptides have been shown to be promising noninvasive biomarkers, as the prostate is located close to the urinary tract and urine can be easily collected. In this chapter, we discuss the methods used to detect micropeptides, incorporating in silico predictions, LC-MS/MS-based proteomic analyses, and metabolic pathway enrichment approaches. We highlight the recent discoveries of micropeptides with differential expression in PCa and their potential as diagnostic and prognostic biomarkers. These findings highlight the importance of micropeptide research and pave the way for new strategies in PCa diagnosis and personalized medicine.
    Keywords:  Micropeptides; Noncoding RNA; Proteomics; Systems biology; Therapeutic targets; sORFs
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_19
  13. Methods Mol Biol. 2026 ;2992 247-259
      Mounting studies suggest that the protein-coding potential of the human genome is underestimated, and previously unannotated open reading frames (ORFs) have been revealed with technological advances. Although the coding potential of many non-canonical ORFs (ncORFs) is now recognized, their functional prevalence remains to be characterized, partly due to the technical difficulty and labour intensity in screening numerous ncORFs at once. Here, we describe a gain-of-function genomic screen to identify functional ncORF-encoded proteins responsible for breast cancer tumorigenesis. This method is thought to improve the efficiency of characterizing previously neglected ncORFs and reveal potential targets for cancer treatment.
    Keywords:  Cancer biology; Functional genomic screen; Gain-of-function screen; Micropeptides; Microproteins; Tumorigenesis; ncORF; smORF
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_17
  14. BMC Genomics. 2025 Nov 21.
       BACKGROUND: The human genome has been the subject of scrutiny for more than two decades, yet new protein coding genes are still being uncovered and recently ribosome profiling experiments have provided evidence for the translation of thousands of novel open reading frames (ORFs). To determine how many of these novel ORFs have peptide support, we carried out an in-depth investigation of an entire mass spectrometry proteomics database.
    RESULTS: We analysed the peptides housed in the human build of the PeptideAtlas database and identified reliable evidence for 35 potential coding genes not annotated in the Ensembl/GENCODE reference gene set. Evidence from complementary sources confirmed that 16 were almost certainly coding genes, but we believe that at least 14 are most likely to be undergoing aberrant translation. These 14 genes had reading frames that were not preserved beyond human and their peptides were restricted to cancers or cell lines. Remarkably, three of the sixteen likely coding genes were derived from endogenous retroviral gag ORFs and were expressed only in placenta. All three had evidence of purifying selection. Retroviral env ORFs (syncytins) with distinct origins are expressed in almost all mammalian placentae and these results suggest that co-opted gag ORFs may also play an important role in placental development.
    CONCLUSIONS: Our analysis shows that proteomics data can be used in conjunction with evolutionary evidence to confirm the existence of new coding genes. The evidence suggests that both testis and placenta are the tissues most likely to express still to be identified coding genes, and that there may be other transposon-derived ORF that have been co-opted as coding genes. The strong evidence for the translation of regions under dysregulated conditions has important implications for the annotation of coding genes and in the analysis of cancer and other degenerative diseases.
    Keywords:  Co-option; Coding genes; Endogenous retrovirus; Proteomics; Pseudogenes
    DOI:  https://doi.org/10.1186/s12864-025-12238-w
  15. Nucleic Acid Ther. 2025 Nov 17.
      Pathogenic variants creating upstream open reading frames (uORFs) in the 5' untranslated region (5'UTR) of the ENG gene can disrupt translation from the main ORF and contribute to hereditary hemorrhagic telangiectasia (HHT). This is the case of the ENG c.-79C>T that introduces a uAUG shown to decrease endoglin expression and associates with HHT. Here, we investigated whether 2'-O-methyl (2'OMe) antisense oligonucleotides (ASOs) could restore protein levels by masking this aberrant uAUG or by targeting predicted secondary structures within the ENG 5'UTR. Several ASOs of varying lengths and backbone chemistries (full phosphodiester or full phosphorothioate) were designed to target the mutant region. Their effects were evaluated in HeLa cells transfected and in HUVECs transduced with wild-type or mutant ENG constructs. Transfection efficiency was verified by MALAT1 knockdown via qPCR, and endoglin protein levels were assessed by Western blot. Despite efficient ASO delivery and optimized experimental conditions, no reproducible increase in endoglin expression was observed upon ASO treatment. These findings highlight the limitations of steric-blocking ASOs targeting 5'UTR variants and underscore the need for deeper mechanistic understanding of uORF-mediated translational regulation.
    Keywords:  5′ untranslated region; ENG gene; antisense oligonucleotides; hereditary hemorrhagic telangiectasia; upstream AUG
    DOI:  https://doi.org/10.1177/21593337251396711
  16. Methods Mol Biol. 2026 ;2992 261-266
      The identification and exploration of microproteins encoded by long noncoding RNA (lncRNA) genes have the potential to revolutionize our understanding of human biology and expedite medical breakthroughs. Proteomics enables the identification and quantification of thousands of proteins present in cells, tissues or organisms, with LC-MS/MS serving as a pivotal technique in current proteomic research. Here, we describe a comprehensive suite of integrated methods to enhance proteomic analysis for the identification of lncRNA-encoded microproteins. This approach combines tricine gel electrophoresis, in-gel digestion, the establishment of a customized microprotein reference database, and LC-MS/MS analysis.
    Keywords:  Customized microprotein reference database; In-gel digestion; LncRNA; Microproteins; Proteomics; Tricine gel electrophoresis
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_18
  17. Methods Mol Biol. 2026 ;2992 229-246
      Epitope tag immunoblotting represents a routine method for targeted surveys of protein abundance and expression. Especially for microproteins, defined by an arbitrary cutoff of 100 amino acids in length, blotting-based approaches are indispensable as the small size of microproteins often renders them elusive to mass spectrometry. Nonetheless, the blotting of microproteins introduces a set of technical challenges, leading to microprotein losses, which significantly affect the sensitivity of blotting-based approaches like epitope tag immunoblotting. We introduce HiBiT blotting, an antibody-free luminescent detection method for HiBiT-tagged proteins, offering an alternative blotting-based protein detection method to improve microprotein analysis. The availability of an anti-HiBiT antibody enabled a comparative analysis of HiBiT versus classical epitope tag immunoblotting, overall demonstrating the superior sensitivity of HiBiT blotting in detecting HiBiT-tagged microproteins. By offering a more direct and sensitive approach for small protein analysis, HiBiT blotting represents a substantial contribution to the field, enabling the effective study of microproteins and addressing the longstanding challenge of their detection.
    Keywords:  HiBiT blotting; Immunoblotting; Luminescence; Microproteins; Protein blotting; SDS-PAGE
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_16
  18. Methods Mol Biol. 2026 ;2992 213-228
      A large number of novel microproteins discovered to date are nuclear-encoded mitochondrial proteins, pointing to their widespread roles in metabolic regulation. In this chapter, we provide a workflow to verify whether a candidate microprotein localizes to mitochondria, to determine its submitochondrial localization (i.e., outer membrane, inner membrane, or matrix), and to map its interactome in order to elucidate its molecular function.
    Keywords:  Microproteins; Mitochondria; OXPHOS
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_15
  19. Methods Mol Biol. 2026 ;2992 203-212
      Advancements in biotechnology, bioinformatics tools, and high-throughput sequencing techniques have greatly enhanced our ability to identify small but significant molecules in plants that were previously thought to be junk. Small molecules, including microproteins (miPs) and microRNA-encoded peptides (miPEPs), play a crucial role in various aspects of plant biology, such as growth, development, stress responses, hormonal signaling, and primary and secondary metabolism. Despite their importance, many of these molecules still need to be characterized due to the challenges associated with validating them in plants. Traditional validation methods, like protein sequencing and western blotting using specific antibodies, are often expensive and labor-intensive, which limits their use in large-scale studies. To address these challenges, alternative methods like promoter-reporter lines linked to a reporter gene such as β-glucuronidase (GUS) can offer a more efficient and cost-effective solution to validate transcriptionally active small peptides/ORFs in planta. The GUS assay allows for the direct visualization of gene activity in plant tissues, simplifying the validation of candidate molecules (small peptides/ORFs) and alternative transcripts. This approach speeds up the functional characterization of novel genes and molecules, ultimately advancing plant research and biotechnology applications focused on their functional characterization.
    Keywords:  Alternative transcript; GUS (β-glucuronidase); Hormonal signalling; Microproteins; ORF; Small peptides; Stress responses; miPEP
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_14
  20. Methods Mol Biol. 2026 ;2992 183-201
      While thousands of putative microproteins have been identified through ribosome profiling, reporter assays, and mass spectrometry-based methods, their functional testing has remained challenging. Advances in genome sequencing and CRISPR/Cas technologies enable the prioritization and testing of candidate microprotein functions for roles in development, for example, in the maternal-to-zygotic transition or in neurodevelopment. Here, we describe the functional testing of microproteins in vivo using a vertebrate model of early development, Danio rerio (zebrafish).
    Keywords:  Behavior; CRISPR/Cas13d; CRISPR/Cas9; Development; Maternal-zygotic; Microproteins; Vertebrate
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_13
  21. EMBO J. 2025 Nov 19.
      Sororin is essential for establishing sister chromatid cohesion concurrently with DNA replication in metazoans. Although acetylation of the cohesin subunit SMC3 by ESCO1/2 is necessary for Sororin recruitment, it is by itself not sufficient. Here, we demonstrate that DNA replication-coupled Poly(ADP-Ribose) Polymerase (PARP) activity is an additional prerequisite in human cells. During normal S-phase, PARP1 PARylates a microprotein encoded by the alternative ORF C11ORF98, which we designate RSMC (28S rRNA/ribosome and Sororin micro-cofactor). This PARylation strengthens the interaction of RSMC with Sororin, enhancing both chromatin recruitment and anti-Wapl activity of Sororin in concert with SMC3 acetylation. Notably, overexpression of RSMC is able to rescue cohesion defects induced by the PARP inhibitor olaparib. These findings highlight understudied microproteins as critical regulators of fundamental cellular processes, such as sister chromatid cohesion.
    Keywords:  Chromatin; Cohesin; DNA Replication; Dark Proteome; Microprotein
    DOI:  https://doi.org/10.1038/s44318-025-00641-8
  22. Heart Rhythm. 2025 Jan;pii: S1547-5271(24)03538-0. [Epub ahead of print]22(1): 3
      
    DOI:  https://doi.org/10.1016/j.hrthm.2024.11.001
  23. Microlife. 2025 ;6 uqaf034
      Bacteria use small regulatory RNAs (sRNAs) and small proteins to change gene expression and modulate cellular processes in response to changing environmental conditions. In addition, several transcripts have been reported to combine base-pairing sRNA activities and coding capacity. These transcripts are known as dual-function RNAs. In some cases, the sRNA and the protein operate within the same pathway, while in other cases, they modulate separate processes in the cell. Thereby, dual-function RNAs enable bacteria to adjust their gene expression and physiology at multiple levels, which can have synergistic regulatory effects or help to synchronize the output of cellular pathways. In this review, we summarized the regulatory and physiological roles of dual-function RNAs in bacteria, including their roles in intercellular communication, virulence, stress response, and metabolism. In addition, we discuss open challenges and possible future applications for harnessing dual regulators for precise gene expression control in bacteria.
    Keywords:  bacterial gene regulation; dual-function RNAs; regulatory RNAs; small proteins
    DOI:  https://doi.org/10.1093/femsml/uqaf034
  24. Methods Mol Biol. 2026 ;2992 129-150
      In the past decade, computational predictions of protein structure and disorder have become widely accessible, achieving accuracy that, in some cases, rivals experimental results. These advances have been instrumental in identifying structural homologies, guiding protein design, and enhancing functional annotation. However, most prediction models are trained on "classical proteins," favoring specific sequence lengths and patterns, leading to biases. Microproteins, small proteins with emerging roles in development and disease, function similarly across the tree of life and stand to benefit significantly from structure and disorder predictions. Yet, their short length and molecular interactions present unique challenges, making homology detection more difficult and requiring careful methodological considerations. Here, I outline workflows for predicting, analyzing, and refining microprotein structures and disorders, emphasizing key precautions to ensure reliable insights.
    Keywords:  Disorder prediction; Microproteins; Molecular dynamics simulations; Protein structure prediction; Sequence feature annotation; Structural annotation
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_10
  25. Methods Mol Biol. 2026 ;2992 283-315
      Mass spectrometry-based proteomics is a versatile technique that facilitates the study of microproteins as biomarkers and potential translational applications by utilizing liquid chromatography-trapped ion mobility-tandem mass spectrometry (LC-TIMS-MS/MS) and matrix assisted laser desorption/ionization-time-of-flight mass spectrometry (MALDI-TOF MS). Importantly, MS has been shown to be compatible with complex biological samples (i.e., mammalian biofluids and patient samples). Here we describe a workflow combining: (1) top-down proteomics for analysis of intact microproteins via MALDI-TOF MS and (2) bottom-up proteomics for analysis of enzymatically digested microproteins via LC-TIMS-MS/MS to annotate putative microproteins of interest from a complex biological sample: patient-derived tampons.
    Keywords:  Bioinformatics; Cheminformatics; LC–MS/MS; MALDI; Microproteins; Mass spectrometry
    DOI:  https://doi.org/10.1007/978-1-0716-5013-4_20
  26. Cell Chem Biol. 2025 Nov 20. pii: S2451-9456(25)00347-2. [Epub ahead of print]32(11): 1305-1307
      In this issue of Cell Chemical Biology, Zhang et al. report the identification of a high-affinity EMBOW-derived inhibitor of WDR5, Ac7, which demonstrates in-cell target engagement and in vivo antileukemic efficacy. The microprotein-inspired inhibitor potently blocks the WDR5-MLL1 interaction, suppressing H3K4 methylation and transcription of target genes in mixed lineage leukemia (MLL)-rearranged leukemia.
    DOI:  https://doi.org/10.1016/j.chembiol.2025.10.010
  27. Dev Cell. 2025 Nov 20. pii: S1534-5807(25)00666-5. [Epub ahead of print]
      Microscopy offers an indispensable technique for visualizing biological processes and for defining cytological abnormalities characteristic of disease. However, combining microscopy with the power of pooled CRISPR screening presents considerable technical challenges, hindering application of systematic genetic analysis to imaging-defined phenotypes. Here, we establish a fluorescence microscopy-based CRISPR screening platform that combines ease of implementation with flexible analysis of live-cell or antibody-based molecular markers, including post-translational modifications. Applying this methodology, we systematically identify regulators of primary cilium structure and function in human cells through targeted and genome-wide screens. We further show that integration of screens focused on distinct ciliary phenotypes yields multi-dimensional profiles that delineate precise gene functions. Among the identified hits, TZMP1 (SMIM27) encodes a microprotein at the ciliary transition zone that is required for ciliogenesis in human cells and for ciliary function in Xenopus embryos. More broadly, our approach provides a technological and conceptual strategy for microscopy-based functional genomics.
    Keywords:  CRISPR screen; axoneme; cilia; ciliopathy; functional genomics; microprotein; optical screening; polyglutamylation; transition zone
    DOI:  https://doi.org/10.1016/j.devcel.2025.10.015