Predicting Gene Expression Levels from Histone Modification Signals with Convolutional Recurrent Neural Networks

Abstract

In this paper, we study how well a Convolutional Recurrent Neural Network (CRNN) predicts gene expression levels from histone modification signals. We also consider two simplified variants of the CRNN: a Convolutional Neural Network and a Recurrent Neural Network. The methods are evaluated on histone modification signal and gene expression data derived from the Roadmap Epigenomics Mapping Consortium database and compared against the state-of-the-art method, DeepChrome. The proposed models give a statistically significant improvement over this baseline.

Architecture

Overview

The proposed CRNN model is composed of six layers: two convolutional layers, followed by one LSTM layer, two dense layers, and one output layer. The two convolutional layers map each gene's input, a 100 × 5 matrix (bins × histone modifications), into 32 feature maps using 3 × 3 filters with a stride of 3. Before the output of the convolutional layers is fed into the recurrent layer, the feature maps are concatenated along the histone axis in order to intensify the effects of the histone modifications while keeping the bin axis unchanged. In the LSTM layer, which has 32 nodes, dropout is applied to the input gates and the recurrent connections to alleviate overfitting. The LSTM layer is followed by two fully connected layers with 100 and 20 nodes, respectively. We apply the Rectified Linear Unit (ReLU) activation function and dropout regularization to each layer, and a sigmoid to the output layer for the final binary prediction.
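To make the layer sizes above easier to follow, here is a minimal shape-tracing sketch in plain Python. It is not the authors' implementation. Several details are assumptions that the text does not state: the convolutions use "same" padding, the stride of 3 applies to both axes, and the input has a single channel. The helper names `conv_out` and `crnn_shapes` are hypothetical.

```python
import math


def conv_out(size, stride, padding="same", kernel=3):
    """Output length of one convolution axis.

    ASSUMPTION: the padding mode is not given in the text; "same" is used
    by default because "valid" would shrink the 5-wide histone axis to
    nothing after two strided 3 x 3 convolutions.
    """
    if padding == "same":
        return math.ceil(size / stride)
    if size < kernel:
        raise ValueError("axis is smaller than the kernel under 'valid' padding")
    return (size - kernel) // stride + 1


def crnn_shapes(bins=100, marks=5, filters=32, stride=3, padding="same"):
    """Trace tensor shapes (without the batch axis) through the described CRNN."""
    shapes = [("input", (bins, marks, 1))]  # ASSUMPTION: one input channel
    b, m = bins, marks
    for i in (1, 2):
        # ASSUMPTION: the stride of 3 is applied along both axes.
        b = conv_out(b, stride, padding)
        m = conv_out(m, stride, padding)
        shapes.append((f"conv{i}", (b, m, filters)))
    # Concatenate feature maps along the histone axis; the bin axis is kept
    # and becomes the sequence (time-step) axis for the LSTM.
    shapes.append(("concat", (b, m * filters)))
    shapes.append(("lstm", (32,)))    # 32 LSTM nodes
    shapes.append(("dense1", (100,)))  # first fully connected layer
    shapes.append(("dense2", (20,)))   # second fully connected layer
    shapes.append(("output", (1,)))    # sigmoid unit for binary prediction
    return shapes


for name, shape in crnn_shapes():
    print(f"{name:8s}{shape}")
```

Under these assumptions, the recurrent layer receives a sequence of 12 steps with 32 features each. A different padding or stride convention would change these intermediate sizes, but not the layer order or the node counts given in the text.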

Citation