During the spring semester, I took a graduate computer science class called Neural Modeling that focused on neural networks. One large component of the class was a research project involving recurrent neural networks (RNNs). A recurrent neural network is a neural network that “remembers” previous inputs. A traditional neural network takes in a pattern and produces an output, takes in a new pattern and produces a new output, and so on; the order of the inputs does not matter, so it cannot handle temporal or sequential data. If you want to learn more about long short-term memory (LSTM), the very popular type of RNN that I used for this project, I encourage you to check out this blog post from Colah.

I chose to focus on predicting the specificity of transcription factor binding to DNA sequences because I wanted a biology-focused project, and this problem is well studied with widely available data.

Once I selected my problem, I needed to implement the RNN. After some googling and consulting with classmates, I decided to use Google’s open-source TensorFlow, a Python package. Installing TensorFlow was easy. Figuring out how to use it to implement even a simple LSTM network was not: the official LSTM tutorial offered very high-level pseudo-code and a several-hundred-line implementation, with nothing in between. I googled for other tutorials and found only one that was remotely useful.

So as I learned how to use TensorFlow, I kept track of the steps I took so that I could turn them into a tutorial to help other people. The code for this first step is in a GitHub gist.

The Input

When I work through something new, I like to break my final goal into a series of smaller steps. My first goal was to build the simplest possible network. The input to the network is a sequence of nucleotides, where each nucleotide is represented by a binary one-hot vector of length 4 (all entries are zero except for a single one). I arbitrarily chose to input DNA sequences of length 10.
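The gist has the actual code; as an illustration, here is a minimal sketch of what such an encoding might look like (the lookup table and the helper one_hot_encode are names I made up for this example, not from the gist):

    import numpy as np

    # Each nucleotide maps to a binary one-hot vector of length 4.
    NUCLEOTIDE_TO_ONE_HOT = {
        "A": [1, 0, 0, 0],
        "C": [0, 1, 0, 0],
        "G": [0, 0, 1, 0],
        "T": [0, 0, 0, 1],
    }

    def one_hot_encode(sequence):
        """Encode a DNA string as a (sequence_length, 4) binary matrix."""
        return np.array([NUCLEOTIDE_TO_ONE_HOT[n] for n in sequence], dtype=np.float32)

    # A length-10 sequence becomes a 10 x 4 matrix.
    print(one_hot_encode("AAAAAAAAAA").shape)  # (10, 4)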

Building The Graph

TensorFlow computations are broken into two distinct steps: building the computation graph and running the computation graph on real data. The first step is to create an input placeholder object, which has the same dimensions as the input but contains no real data. Next, we connect this placeholder to a single tf.contrib.rnn.BasicLSTMCell; unrolling the cell over the input sequence returns two objects, the output of the graph and the final state of the graph.
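To make this concrete, here is a sketch of the graph-building step, written against the TensorFlow 1.x API. The number of hidden units (lstm_size) is an arbitrary choice of mine, and I use tf.nn.dynamic_rnn to unroll the cell; the gist may differ in its details:

    import tensorflow as tf

    sequence_length = 10  # ten nucleotides per input sequence
    num_features = 4      # one one-hot vector of length 4 per nucleotide
    lstm_size = 5         # arbitrary number of hidden units

    # A placeholder with the same dimensions as the input but no real data.
    # The first dimension (None) is the batch size, filled in at run time.
    inputs = tf.placeholder(tf.float32, [None, sequence_length, num_features])

    # A single LSTM cell, unrolled over the sequence; dynamic_rnn returns
    # the per-step outputs and the final state of the cell.
    cell = tf.contrib.rnn.BasicLSTMCell(lstm_size)
    output, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)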

Running The Graph

To run the graph, we first need to create a TensorFlow session. Next, we take one of the tensors the graph produces, either output or state, and pass it to sess.run, which runs all the prior steps necessary to compute a value for that tensor. However, because the input to the LSTM cell is a placeholder, you also need to pass actual values for that placeholder as the second (feed_dict) argument.
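Continuing the sketch above, running the graph might look like this (the all-“A” input sequence is just an example I chose so that every nucleotide is identical, matching the results below):

    import numpy as np

    # One sequence of ten identical "A" nucleotides,
    # shaped (batch_size=1, sequence_length=10, num_features=4).
    data = np.zeros((1, 10, 4), dtype=np.float32)
    data[0, :, 0] = 1.0  # set the "A" column

    with tf.Session() as sess:
        # Initialize the LSTM cell's internal weight variables.
        sess.run(tf.global_variables_initializer())
        # feed_dict supplies real values for the input placeholder.
        output_values, state_values = sess.run([output, state],
                                               feed_dict={inputs: data})

    print(output_values)  # one output vector per nucleotide
    print(state_values)   # LSTMStateTuple(c=..., h=...)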

Results

You can run the code using python basic_lstm_example.py. You should see two sets of printed output: the output value of the LSTM cell for every input, and the state of the LSTM cell after it has run through all the inputs. There are ten output values, one for each nucleotide the cell sees. Even though every nucleotide in the sequence is identical, the output score differs from step to step because the previous nucleotides change the internal state of the LSTM cell, which in turn changes its output. You’ll also see the state of the LSTM cell, a tuple of two components: c, the cell memory, and h, the output. You should see that h is identical to the last output. What the state does not include are the values of the gates; I looked at the source code and struggled to figure out how to access them. If you want to learn more, I’d suggest checking out the source code for BasicLSTMCell.
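If you are following along with my sketches above, you can verify the claim that h matches the last output directly:

    # The final state is a (c, h) tuple; h should equal the last per-step output.
    c, h = state_values
    print(np.allclose(h, output_values[:, -1, :]))  # True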

Next Steps

This blog post covered the most basic implementation of an LSTM neural network. The network contains only a single LSTM cell and doesn’t learn at all. Still, I hope this post is a useful first step toward implementing an LSTM network (working through it sure helped me), and I’ll be back with the next step.