We have created a customized LSTM model (lstm.py) using TensorFlow. Here is the tutorial:

Build Your Own LSTM Model Using TensorFlow: Steps to Create a Customized LSTM

In this tutorial, we will use this customized LSTM model to train on the MNIST dataset and classify handwritten digits. To understand the MNIST dataset, you can view:

Understand and Read TensorFlow MNIST Dataset for Beginners

**Preliminaries**

We should import some libraries.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import os
import numpy as np
import random
import lstm

Here import lstm is used to load our customized LSTM model.
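We do not paste lstm.py in this post; you can build it by following the tutorial above. If you just want to run this example, the sketch below is a minimal stand-in whose interface is only inferred from how the model is called later in this tutorial (the constructor arguments and the gen_o attribute are the assumptions). The real lstm.py may differ, for example it presumably reads the batch size from the batch_size_train placeholder defined below, while this sketch derives it from tf.shape(x).

import tensorflow as tf

class LSTM(object):
    # Hypothetical stand-in for the customized model in lstm.py.
    # x: [batch_size, sequence_length, emb_dim]
    def __init__(self, x, emb_dim, hidden_dim, sequence_length):
        with tf.variable_scope("custom_lstm", reuse=tf.AUTO_REUSE):
            # one weight matrix for the input, forget, output and candidate gates
            self.W = tf.get_variable("W", [emb_dim + hidden_dim, 4 * hidden_dim])
            self.b = tf.get_variable("b", [4 * hidden_dim],
                                     initializer=tf.zeros_initializer())
        batch_size = tf.shape(x)[0]
        h = tf.zeros([batch_size, hidden_dim])  # hidden state
        c = tf.zeros([batch_size, hidden_dim])  # cell state
        outputs = []
        for t in range(sequence_length):
            gates = tf.matmul(tf.concat([x[:, t, :], h], 1), self.W) + self.b
            i, f, o, g = tf.split(gates, 4, axis=1)
            c = tf.sigmoid(f) * c + tf.sigmoid(i) * tf.tanh(g)
            h = tf.sigmoid(o) * tf.tanh(c)
            outputs.append(h)
        # gen_o: [batch_size, sequence_length, hidden_dim]
        self.gen_o = tf.stack(outputs, axis=1)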

**Load mnist data**

mnist = input_data.read_data_sets(os.getcwd() + "/MNIST-data/", one_hot=True)

We have saved the MNIST data in the MNIST-data folder.
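If you want to check what read_data_sets returns, you can print the shapes of the three splits; the sizes below are the standard ones for this loader:

print(mnist.train.images.shape)       # (55000, 784)
print(mnist.validation.images.shape)  # (5000, 784)
print(mnist.test.images.shape)        # (10000, 784)
print(mnist.train.labels.shape)       # (55000, 10), one-hot because one_hot=True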

**Set some hyperparameters**

We should set some hyperparameters first. Since each 28×28 MNIST image will be treated as a sequence of 28 rows with 28 pixels each, both time_step and input_size are 28.

# learning rate
learning_rate = 1e-3
hidden_dim = 50          # dimension of the LSTM hidden state
input_size = 28          # pixels per row, fed at each time step
time_step = 28           # 28 lstm cells
total_steps = 1000
category_num = 10        # ten digit classes, 0-9
steps_per_validate = 15
steps_per_test = 15
batch_size = 64

**Define model inputs**

We should create some TensorFlow placeholder variables.

x = tf.placeholder(tf.float32, [None, 784])
y_label = tf.placeholder(tf.float32, [None, 10])
x_shape = tf.reshape(x, [-1, time_step, input_size])  # batch_size * 28 * 28
batch_size_train = tf.placeholder(tf.int32, [])       # fed with the current batch size
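To see what this reshape does, here is a small numpy analogue (the variable names and values are only for illustration):

import numpy as np

fake_batch = np.zeros((64, 784), dtype=np.float32)  # a batch of flattened images
fake_seq = fake_batch.reshape(-1, 28, 28)           # [batch_size, time_step, input_size]
print(fake_seq.shape)                               # (64, 28, 28)

Each row of pixels becomes one time step, so every image is fed to the LSTM as 28 consecutive 28-dimensional vectors.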

**Use the customized LSTM model to classify handwritten digits**

Get the LSTM output

custom_lstm = lstm.LSTM(x_shape, emb_dim=input_size, hidden_dim=hidden_dim, sequence_length=time_step)
output = custom_lstm.gen_o            # [batch_size, time_step, hidden_dim]
output_y = tf.reduce_mean(output, 1)  # [batch_size, hidden_dim]

In this model, we average the outputs of all LSTM cells along the time axis. output_y is the final output of the LSTM.
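To see what this averaging does, here is a numpy sketch with random values (the shapes match this tutorial; the numbers are illustrative):

import numpy as np

output = np.random.rand(64, 28, 50)  # [batch_size, time_step, hidden_dim]
output_y = output.mean(axis=1)       # average over the 28 time steps
print(output_y.shape)                # (64, 50)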

Then we can use output_y to train the model and get predictions.

# output projection from hidden_dim to the 10 digit classes
w = tf.Variable(tf.truncated_normal([hidden_dim, category_num], -0.01, 0.01), dtype=tf.float32)
b = tf.Variable(tf.random_uniform([category_num], -0.01, 0.01), dtype=tf.float32)
y = tf.matmul(output_y, w) + b
# Loss
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_label, logits=y))
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cross_entropy)
# Prediction
correct_prediction = tf.equal(tf.argmax(y, axis=1), tf.argmax(y_label, axis=1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
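Notice that tf.nn.softmax_cross_entropy_with_logits applies the softmax itself, so y must stay raw logits. If you also want class probabilities or the predicted digit at inference time, you can add two extra ops (they are not part of the original tutorial, just a common pattern):

probs = tf.nn.softmax(y)                # [batch_size, 10] class probabilities
predicted_digit = tf.argmax(y, axis=1)  # [batch_size] predicted class ids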

**Start to train the model**

We will train the LSTM model and print the training process. As the final result, we report the test accuracy measured at the step with the best validation accuracy.

init = tf.global_variables_initializer()
try:
    with tf.Session() as sess:
        sess.run(init)
        test_acc = 0.
        dev_acc = 0.
        better_acc = 0.0
        # set train times
        for step in range(total_steps + 1):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            _, acc = sess.run([train, accuracy],
                              feed_dict={x: batch_x, y_label: batch_y,
                                         batch_size_train: batch_size})
            # Train Accuracy
            print("train step=" + str(step) + " accuracy = " + str(acc))
            # Validation Accuracy
            if step % steps_per_validate == 0:
                dev_x, dev_y = mnist.validation.images, mnist.validation.labels
                dev_acc = sess.run(accuracy,
                                   feed_dict={x: dev_x, y_label: dev_y,
                                              batch_size_train: dev_x.shape[0]})
                print("dev step=" + str(step) + " accuracy = " + str(dev_acc))
                # Test Accuracy, only when the validation accuracy improves
                if better_acc < dev_acc:
                    test_x, test_y = mnist.test.images, mnist.test.labels
                    test_acc = sess.run(accuracy,
                                        feed_dict={x: test_x, y_label: test_y,
                                                   batch_size_train: test_x.shape[0]})
                    print("test step=" + str(step) + " accuracy = " + str(test_acc))
                    better_acc = dev_acc
except Exception as e:
    print(e)

Run this code and you will see a training process like the one below: