Han Yu, Zi-Ang Shen, Yuan-Ke Zhou and Pu-Feng Du*
Long non-coding RNAs (lncRNAs) are RNA molecules longer than 200 nucleotides that have little or no protein-coding ability. A large number of studies have indicated that lncRNAs play a significant role in various biological processes, including chromatin organization, epigenetic programming, transcriptional regulation, post-transcriptional processing, and circadian mechanisms at the cellular level. Because lncRNAs carry out many of their functions through interactions with proteins, identifying lncRNA-protein interactions is crucial to understanding the molecular functions of lncRNAs. However, experimental methods are costly and time-consuming, so a variety of computational methods have emerged. Recently, many effective and novel machine learning methods have been developed. In general, these methods fall into two categories: semi-supervised learning methods and supervised learning methods. The latter category can be further classified into deep learning-based methods, ensemble learning-based methods, and hybrid methods. In this paper, we focus on supervised learning methods. We summarize the state-of-the-art methods for predicting lncRNA-protein interactions and compare the performance and characteristics of the different methods. Considering the limitations of the existing models, we analyze the remaining problems and discuss directions for future research.
lncRNA-protein interaction prediction; computational model; machine learning; deep learning; lncRNAs; chromatin organization.
College of Intelligence and Computing, Tianjin University, Tianjin 300350