Blood Adv. 2023 01 03. pii: bloodadvances.2022008410. [Epub ahead of print]
As a heterogeneous group of hematological malignancies, leukemia has been widely studied at the transcriptome level. However, a comprehensive transcriptomic landscape and resources for different leukemia subtypes are lacking. In this study, we therefore integrated RNA-Seq datasets of more than 3,000 samples from 14 leukemia subtypes and 53 related cell lines through a unified analysis pipeline. We depicted the corresponding transcriptomic landscape and developed a user-friendly data portal, LeukemiaDB (http://bioinfo.life.hust.edu.cn/LeukemiaDB/). LeukemiaDB was designed with five main modules: Protein-coding gene, LncRNA, CircRNA, Alternative splicing, and Fusion gene. In LeukemiaDB, users can search and browse expression levels, regulatory modules, and molecular information across leukemia subtypes or cell lines. In addition, a comprehensive analysis of the data in LeukemiaDB demonstrates that (1) different leukemia subtypes or cell lines have similar expression distributions of protein-coding genes and lncRNAs; (2) some alternative splicing events are shared among nearly all leukemia subtypes, e.g., MYL6 in A3SS, MYB in A5SS, HMBS in RI, GTPBP10 in MXE, and POLL in SE; (3) some leukemia-specific protein-coding genes, e.g., ABCA6, ARHGAP44, WNT3, and BLACE, and fusion genes, e.g., BCR-ABL1 and KMT2A-AFF1, are involved in leukemogenesis; and (4) some highly correlated regulatory modules are present in different leukemia subtypes, e.g., the HOXA9 module in AML and the NOTCH1 module in T-ALL. In summary, LeukemiaDB provides valuable insights into the oncogenesis and progression of leukemia and, to the best of our knowledge, is the most comprehensive transcriptome resource for human leukemia available to the research community.