Comput Struct Biotechnol J. 2023;21:1014-1021
E3 ubiquitin ligases (E3s) and deubiquitinating enzymes (DUBs) play key roles in protein degradation. However, a large number of E3-substrate interactions (ESIs) and DUB-substrate interactions (DSIs) remain elusive. Here, we present DeepUSI, a deep learning-based framework that identifies ESIs and DSIs from the information present in protein sequences. Using a collected gold standard dataset, we systematically assessed key hyperparameters of model training, including those related to data sampling and the number of epochs. The performance of DeepUSI was thoroughly evaluated with multiple metrics on both internal and external validation. Applying DeepUSI to cancer-associated E3 and DUB genes identified a list of druggable substrates with functional implications, warranting further investigation. Together, DeepUSI presents a new framework for predicting substrates of E3 ubiquitin ligases and deubiquitinases.
Keywords: AUPRC, area under the PR curve; AUROC, area under the ROC curve; CNN, convolutional neural network; DSI, DUB-substrate interaction; DUB, deubiquitinating enzymes; DUB-substrate interactions; Deep learning; E1, ubiquitin-activating enzymes; E2, ubiquitin-conjugating enzymes; E3, ubiquitin ligases; E3-substrate interactions; ESI, E3-substrate interaction; GSP, gold standard positive dataset; PR, precision recall; Pan-cancer analysis; ROC, receiver operating characteristic; TCGA, The Cancer Genome Atlas; UPS, ubiquitin-proteasome system; Ubiquitin proteasome system; Ubiquitination