Accurately Predicting Microbial Phosphorylation Sites Using Evolutionary and Structural Features
Abstract
Post-translational modification (PTM) is the enzymatic modification of a protein after its translation by the ribosome. Phosphorylation, one of the most critical PTMs, occurs when a phosphate group is attached to an amino acid residue in the protein sequence. It contributes to cell communication, DNA repair, and gene regulation. Predicting microbial phosphorylation sites can improve our understanding of host-pathogen interactions and support the development of antimicrobial agents. Experimental methods such as radioactive chemical methods, chromatin immunoprecipitation, and mass spectrometry are time-consuming, laborious, and expensive. Fast and accurate computational approaches for detecting phosphorylation sites are therefore drawing considerable attention. In this thesis, we propose two new computational tools, RotPhoPred and DeepPhoPred, for predicting phospho-serine (pS), phospho-threonine (pT), and phospho-tyrosine (pY) sites in microbial organisms. In RotPhoPred, we integrate the evolutionary bigram profile with structural information and use Rotation Forest as the classifier. To the best of our knowledge, neither these features nor this classifier has previously been used for this task. Comparative results demonstrate that RotPhoPred surpasses its peers in sensitivity (90.0%, 75.4%, and 78.2%), specificity (92.1%, 97.2%, and 94.7%), accuracy (91.0%, 86.3%, and 86.4%), and MCC (0.82, 0.74, and 0.74) for pS, pT, and pY site prediction, respectively. RotPhoPred is publicly available as a standalone predictor, together with its source code, at https://github.com/faisalahm3d/RotPredPho.
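As an illustration of the evolutionary bigram profile mentioned above, the following minimal sketch computes a bigram feature vector from a position-specific scoring matrix (PSSM). It assumes the commonly used definition B[m][n] = sum over i of P[i][m] * P[i+1][n]; the abstract does not state the exact formulation used in the thesis, and the function name `bigram_profile` and the toy matrix are hypothetical.

```python
def bigram_profile(pssm):
    """Flattened bigram matrix of a PSSM.

    pssm: list of rows, one per residue in the peptide window; each row
    holds one score per amino acid type (20 columns in practice).
    Assumed definition: B[m][n] = sum_i pssm[i][m] * pssm[i + 1][n].
    """
    if not pssm:
        return []
    width = len(pssm[0])
    bigram = [[0.0] * width for _ in range(width)]
    # Accumulate products over every pair of consecutive positions.
    for row, next_row in zip(pssm, pssm[1:]):
        for m, p_m in enumerate(row):
            for n, p_n in enumerate(next_row):
                bigram[m][n] += p_m * p_n
    # Flatten row by row into a fixed-length feature vector.
    return [value for line in bigram for value in line]


# Toy example: 3 residues over a hypothetical 2-letter alphabet.
toy_pssm = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
features = bigram_profile(toy_pssm)
```

With a real 20-column PSSM this yields 400 features regardless of window length, which is what makes the representation convenient as classifier input.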
In DeepPhoPred, we design a lightweight two-headed CNN architecture with squeeze-and-excitation blocks that automatically learns discriminative features from the peptide’s structural and evolutionary properties, without manual feature engineering, to detect phosphorylation sites. To the best of our knowledge, it is the first tool that integrates these two information sources and applies a deep learning technique to this problem. DeepPhoPred outperforms RotPhoPred and current predictors with significantly higher sensitivity (100%, 99.2%, and 99.3%), specificity (98.9%, 99%, and 98.7%), accuracy (99.0%, 99.0%, and 98.7%), and MCC (0.93, 0.94, and 0.94) for pS, pT, and pY site prediction, respectively.
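To make the squeeze-and-excitation idea concrete, the sketch below applies one such block to a small 1D feature map in plain Python. It is a generic illustration, not the thesis architecture: the function name, the weight shapes, and the omission of bias terms are assumptions, and in the actual model the weights would be learned inside the CNN.

```python
import math


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Apply a squeeze-and-excitation block to a 1D feature map.

    feature_map: list of channels, each a list of activations along the peptide.
    w_reduce: bottleneck weights, shape (hidden, channels).
    w_expand: expansion weights, shape (channels, hidden).
    Bias terms are omitted for brevity (an assumption of this sketch).
    """
    # Squeeze: global average pooling gives one summary value per channel.
    squeezed = [sum(channel) / len(channel) for channel in feature_map]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    # Excitation, step 2: expand back to one sigmoid gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: reweight each channel by its gate.
    return [[g * x for x in channel] for g, channel in zip(gates, feature_map)]


# Toy example: 2 channels, 2 positions, 1 hidden unit with zero weights,
# so every gate equals sigmoid(0) = 0.5.
rescaled = squeeze_excite(
    [[1.0, 3.0], [2.0, 2.0]],
    w_reduce=[[0.0, 0.0]],
    w_expand=[[1.0], [1.0]],
)
```

The gates let the network emphasize informative channels and suppress others at little extra parameter cost, which fits the lightweight design described above.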