In this exercise, we will use TensorFlow for classification.

To begin, install TensorFlow on your machine. One way to do so is through Anaconda. Another is to install pip and then run 'pip install tensorflow' at the command line.
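Once TensorFlow is installed, you can verify it from a Python prompt with a quick check (a minimal sanity test; the version string printed will depend on the release you installed):

import tensorflow as tf
print(tf.__version__)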

Hello World

We will now execute a simple command that creates a constant tensor. We also create a session for running TensorFlow operations.

In [1]:
import tensorflow as tf

hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
b'Hello, TensorFlow!'

Basic Operations

We now execute some basic operations using the TensorFlow library.

In a basic constant operation, the value returned by the constructor represents the output of the Constant op.

In [2]:
a = tf.constant(2)
b = tf.constant(3)

Let us launch the default graph.

In [3]:
with tf.Session() as sess:
    print("a: %i" % sess.run(a), "b: %i" % sess.run(b))
    print("Addition with constants: %i" % sess.run(a+b))
    print("Multiplication with constants: %i" % sess.run(a*b))
a: 2 b: 3
Addition with constants: 5
Multiplication with constants: 6

In basic operations with variable graph input, the inputs are not fixed when the graph is built; instead, their values are supplied as input when the session is run. This is done using placeholders.

In [4]:
a = tf.placeholder(tf.int16)
b = tf.placeholder(tf.int16)

We can define the addition and multiplication operations as follows.

In [5]:
add = tf.add(a, b)
mul = tf.multiply(a, b)

Let us launch the default graph.

In [6]:
with tf.Session() as sess:
    # Run every operation with variable input
    print("Addition with variables: %i" % sess.run(add, feed_dict={a: 2, b: 3}))
    print("Multiplication with variables: %i" % sess.run(mul, feed_dict={a: 2, b: 3}))
Addition with variables: 5
Multiplication with variables: 6

Let us now look at matrix multiplication.

Create a Constant op that produces a 1x2 matrix. The op is added as a node to the default graph.

In [7]:
matrix1 = tf.constant([[3., 3.]])

Create another Constant that produces a 2x1 matrix.

In [8]:
matrix2 = tf.constant([[2.],[2.]])

Create a Matmul op that takes 'matrix1' and 'matrix2' as inputs. The returned value, 'product', represents the result of the matrix multiplication.

In [9]:
product = tf.matmul(matrix1, matrix2)

To run the matmul op we call the session's 'run()' method, passing 'product', which represents the output of the matmul op. This tells the call that we want the output of the matmul op back. All inputs needed by the op are run automatically by the session; they are typically run in parallel. The call 'run(product)' thus causes the execution of three ops in the graph: the two constants and the matmul. The output of the op is returned in 'result' as a NumPy ndarray object (here the product is 3*2 + 3*2 = 12).

In [10]:
with tf.Session() as sess:
    result = sess.run(product)
    print(result)
[[ 12.]]

Linear Regression

Let us now perform linear regression using TensorFlow.

In [11]:
import tensorflow as tf
import numpy
import matplotlib.pyplot as plt
rng = numpy.random

Define the parameters. 'learning_rate' sets the step size of the gradient descent optimizer. 'training_epochs' sets the number of training iterations, where an 'epoch' is one full pass over the training data. 'display_step' sets how often (in epochs) the current status of the algorithm is printed.

In [12]:
learning_rate = 0.01
training_epochs = 1000
display_step = 50

The training data is given below.

In [13]:
train_X = numpy.asarray([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
                         7.042,10.791,5.313,7.997,5.654,9.27,3.1])
train_Y = numpy.asarray([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
                         2.827,3.465,1.65,2.904,2.42,2.94,1.3])
n_samples = train_X.shape[0]

The graph inputs are created and the model weights are initialized with random values.

In [14]:
X = tf.placeholder("float")
Y = tf.placeholder("float")

W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

The linear model is constructed.

In [15]:
pred = tf.add(tf.multiply(X, W), b)

The mean squared error and the optimizer are defined.
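In symbols, with $n$ training samples the cost being minimized is $\frac{1}{2n}\sum_{i=1}^{n}\left(\mathrm{pred}(x_i) - y_i\right)^2$, which is exactly what the expression in the next cell computes.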

In [16]:
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
 
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

Initialize the variables (i.e. assign their default value).

In [17]:
init = tf.global_variables_initializer()

Start training.

In [18]:
with tf.Session() as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        #Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y:train_Y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), \
                "W=", sess.run(W), "b=", sess.run(b))

    print("Optimization Finished.")
    training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    #Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
Epoch: 0050 cost= 0.167519063 W= 0.417855 b= -0.408992
Epoch: 0100 cost= 0.157056987 W= 0.407847 b= -0.337
Epoch: 0150 cost= 0.147803172 W= 0.398435 b= -0.26929
Epoch: 0200 cost= 0.139618054 W= 0.389583 b= -0.205607
Epoch: 0250 cost= 0.132378444 W= 0.381257 b= -0.145711
Epoch: 0300 cost= 0.125975028 W= 0.373426 b= -0.0893783
Epoch: 0350 cost= 0.120311320 W= 0.366062 b= -0.0363956
Epoch: 0400 cost= 0.115301885 W= 0.359135 b= 0.0134361
Epoch: 0450 cost= 0.110871151 W= 0.35262 b= 0.060304
Epoch: 0500 cost= 0.106952339 W= 0.346492 b= 0.104385
Epoch: 0550 cost= 0.103486292 W= 0.340729 b= 0.145843
Epoch: 0600 cost= 0.100420788 W= 0.335309 b= 0.184836
Epoch: 0650 cost= 0.097709529 W= 0.330211 b= 0.22151
Epoch: 0700 cost= 0.095311597 W= 0.325416 b= 0.256003
Epoch: 0750 cost= 0.093190826 W= 0.320907 b= 0.288444
Epoch: 0800 cost= 0.091315188 W= 0.316665 b= 0.318956
Epoch: 0850 cost= 0.089656383 W= 0.312676 b= 0.347653
Epoch: 0900 cost= 0.088189326 W= 0.308925 b= 0.374644
Epoch: 0950 cost= 0.086891890 W= 0.305396 b= 0.40003
Epoch: 1000 cost= 0.085744530 W= 0.302077 b= 0.423905
Optimization Finished.
Training cost= 0.0857445 W= 0.302077 b= 0.423905 

Q1. Observe the dependence of the training cost (error) on the learning rate.

With all other parameters fixed, we now vary the learning rate over the three values in $\{0.0001, 0.01, 0.8\}$. Draw a plot with the y-axis representing the training cost and the x-axis representing the epoch (0 through 1000). The plot must contain three curves, one for each choice of learning rate.
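If you need a starting point, the following is a minimal sketch of one possible approach (not the only one): it assumes the 'train_X', 'train_Y', 'X', 'Y', 'cost', and 'training_epochs' defined above, rebuilds the optimizer for each learning rate, and records the training cost after every epoch. With the largest rate the cost may blow up to inf/nan, which is precisely the behaviour the question asks you to observe.

learning_rates = [0.0001, 0.01, 0.8]
cost_histories = {}

for lr in learning_rates:
    # A fresh optimizer op for each learning rate; W and b are re-initialized per run
    optimizer = tf.train.GradientDescentOptimizer(lr).minimize(cost)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        history = []
        for epoch in range(training_epochs):
            for (x_i, y_i) in zip(train_X, train_Y):
                sess.run(optimizer, feed_dict={X: x_i, Y: y_i})
            # Record the training cost over the full data set after this epoch
            history.append(sess.run(cost, feed_dict={X: train_X, Y: train_Y}))
        cost_histories[lr] = history

for lr in learning_rates:
    plt.plot(cost_histories[lr], label="learning rate = %g" % lr)
plt.xlabel("epoch")
plt.ylabel("training cost")
plt.legend()
plt.show()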

In [19]:
#----------------- Your code here ---------------#







#------------------------------------------------#

As we can observe, neither very small nor very large learning rates are preferable: a very small rate converges slowly, while a very large rate may not converge at all.

Logistic Regression

Let us now perform logistic regression using TensorFlow. We will be using the MNIST database of handwritten digits.

The data can be imported as follows.

In [20]:
import tensorflow as tf
from matplotlib import pyplot as plt
import numpy as np

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz

Let us visualize part of the data.

In [21]:
def gen_image(arr):
    # Reshape the flat 784-pixel vector into a 28x28 image and display it
    two_d = (np.reshape(arr, (28, 28)) * 255).astype(np.uint8)
    plt.imshow(two_d, interpolation='nearest')
    return plt

# Get a batch of 20 random images and show each in a pop-up window.
batch_xs, batch_ys = mnist.test.next_batch(20)
for i in range(20):
    gen_image(batch_xs[i]).show()
    print("True classification:",list(batch_ys[i]).index(1))
True classification: 7
True classification: 6
True classification: 6
True classification: 8
True classification: 9
True classification: 7
True classification: 7
True classification: 6
True classification: 2
True classification: 2
True classification: 0
True classification: 7
True classification: 8
True classification: 2
True classification: 0
True classification: 2
True classification: 0
True classification: 3
True classification: 8
True classification: 7

Then, we set the parameters similarly to the linear regression exercise and construct the softmax model.

In [22]:
# Parameters
learning_rate = 0.01
training_epochs = 100
batch_size = 100
display_step = 1

# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes

# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()
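
In other words, the model computes class probabilities with the softmax function, $\hat{y} = \mathrm{softmax}(xW + b)$, and training minimizes the cross-entropy averaged over each mini-batch of $m$ examples, $-\frac{1}{m}\sum_{i=1}^{m}\sum_{k=0}^{9} y_{ik}\log \hat{y}_{ik}$, where $y_{ik}$ is the one-hot label; this is what the 'cost' expression above encodes.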

Start training.

In [23]:
with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))
            
    prediction=tf.argmax(pred,1)
    print("Optimization Finished.")

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    
    # Calculate accuracy on the first 3000 test examples
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy:", accuracy.eval({x: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))
    
    predicted_labels = list(prediction.eval({x: mnist.test.images[:3000]}))
    images = mnist.test.images[:3000]
    for i in range(10):
        gen_image(images[i]).show()
        print("Predicted classification:",predicted_labels[i])   
Epoch: 0001 cost= 1.183898214
Epoch: 0002 cost= 0.665308342
Epoch: 0003 cost= 0.552783571
Epoch: 0004 cost= 0.498667641
Epoch: 0005 cost= 0.465489739
Epoch: 0006 cost= 0.442589355
Epoch: 0007 cost= 0.425541973
Epoch: 0008 cost= 0.412195165
Epoch: 0009 cost= 0.401432465
Epoch: 0010 cost= 0.392425293
Epoch: 0011 cost= 0.384742346
Epoch: 0012 cost= 0.378192706
Epoch: 0013 cost= 0.372398232
Epoch: 0014 cost= 0.367317174
Epoch: 0015 cost= 0.362698238
Epoch: 0016 cost= 0.358573213
Epoch: 0017 cost= 0.354844159
Epoch: 0018 cost= 0.351481977
Epoch: 0019 cost= 0.348353035
Epoch: 0020 cost= 0.345467231
Epoch: 0021 cost= 0.342773291
Epoch: 0022 cost= 0.340238411
Epoch: 0023 cost= 0.337885419
Epoch: 0024 cost= 0.335737078
Epoch: 0025 cost= 0.333681319
Epoch: 0026 cost= 0.331781960
Epoch: 0027 cost= 0.329938343
Epoch: 0028 cost= 0.328224855
Epoch: 0029 cost= 0.326590512
Epoch: 0030 cost= 0.325041974
Epoch: 0031 cost= 0.323573996
Epoch: 0032 cost= 0.322148951
Epoch: 0033 cost= 0.320782947
Epoch: 0034 cost= 0.319562138
Epoch: 0035 cost= 0.318319913
Epoch: 0036 cost= 0.317137144
Epoch: 0037 cost= 0.316020662
Epoch: 0038 cost= 0.314912067
Epoch: 0039 cost= 0.313856030
Epoch: 0040 cost= 0.312815333
Epoch: 0041 cost= 0.311887442
Epoch: 0042 cost= 0.310994318
Epoch: 0043 cost= 0.310054436
Epoch: 0044 cost= 0.309147701
Epoch: 0045 cost= 0.308342149
Epoch: 0046 cost= 0.307528140
Epoch: 0047 cost= 0.306719379
Epoch: 0048 cost= 0.305973771
Epoch: 0049 cost= 0.305215142
Epoch: 0050 cost= 0.304486037
Epoch: 0051 cost= 0.303802654
Epoch: 0052 cost= 0.303103241
Epoch: 0053 cost= 0.302469346
Epoch: 0054 cost= 0.301819288
Epoch: 0055 cost= 0.301206708
Epoch: 0056 cost= 0.300565897
Epoch: 0057 cost= 0.299928822
Epoch: 0058 cost= 0.299398293
Epoch: 0059 cost= 0.298784484
Epoch: 0060 cost= 0.298247610
Epoch: 0061 cost= 0.297701291
Epoch: 0062 cost= 0.297226269
Epoch: 0063 cost= 0.296695855
Epoch: 0064 cost= 0.296180533
Epoch: 0065 cost= 0.295679112
Epoch: 0066 cost= 0.295198161
Epoch: 0067 cost= 0.294734182
Epoch: 0068 cost= 0.294281722
Epoch: 0069 cost= 0.293796275
Epoch: 0070 cost= 0.293383557
Epoch: 0071 cost= 0.292945816
Epoch: 0072 cost= 0.292467417
Epoch: 0073 cost= 0.292113494
Epoch: 0074 cost= 0.291675508
Epoch: 0075 cost= 0.291273939
Epoch: 0076 cost= 0.290882596
Epoch: 0077 cost= 0.290497047
Epoch: 0078 cost= 0.290144363
Epoch: 0079 cost= 0.289698564
Epoch: 0080 cost= 0.289384448
Epoch: 0081 cost= 0.289043821
Epoch: 0082 cost= 0.288678120
Epoch: 0083 cost= 0.288325593
Epoch: 0084 cost= 0.287998520
Epoch: 0085 cost= 0.287664026
Epoch: 0086 cost= 0.287294150
Epoch: 0087 cost= 0.286994001
Epoch: 0088 cost= 0.286687467
Epoch: 0089 cost= 0.286383921
Epoch: 0090 cost= 0.286066549
Epoch: 0091 cost= 0.285718978
Epoch: 0092 cost= 0.285478832
Epoch: 0093 cost= 0.285162855
Epoch: 0094 cost= 0.284903606
Epoch: 0095 cost= 0.284611574
Epoch: 0096 cost= 0.284313518
Epoch: 0097 cost= 0.284051882
Epoch: 0098 cost= 0.283776877
Epoch: 0099 cost= 0.283469135
Epoch: 0100 cost= 0.283229419
Optimization Finished.
Accuracy: 0.924
Predicted classification: 7
Predicted classification: 6
Predicted classification: 6
Predicted classification: 8
Predicted classification: 7
Predicted classification: 7
Predicted classification: 7
Predicted classification: 6
Predicted classification: 2
Predicted classification: 2

Q2. Observe the dependence of the training cost (error) on the learning rate.

With all other parameters fixed, we now vary the learning rate over the three values in $\{0.0001, 0.01, 1\}$. Draw a plot with the y-axis representing the training cost and the x-axis representing the iteration (0 through 1000). The plot must contain three curves, one for each choice of learning rate.
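The same approach as in Q1 works here; below is a minimal sketch of one possibility, assuming the 'x', 'y', 'cost', 'batch_size', 'training_epochs', and 'mnist' objects defined above. The average mini-batch cost is recorded once per epoch; with a learning rate of 1 the cost may become unstable, which is the behaviour to observe.

learning_rates = [0.0001, 0.01, 1]
cost_histories = {}

for lr in learning_rates:
    # A fresh optimizer op for each learning rate; W and b are re-initialized per run
    optimizer = tf.train.GradientDescentOptimizer(lr).minimize(cost)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        history = []
        total_batch = int(mnist.train.num_examples/batch_size)
        for epoch in range(training_epochs):
            avg_cost = 0.
            for i in range(total_batch):
                batch_xs, batch_ys = mnist.train.next_batch(batch_size)
                _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
                avg_cost += c / total_batch
            history.append(avg_cost)
        cost_histories[lr] = history

for lr in learning_rates:
    plt.plot(cost_histories[lr], label="learning rate = %g" % lr)
plt.xlabel("epoch")
plt.ylabel("average training cost")
plt.legend()
plt.show()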

In [24]:
#----------------- Your code here ---------------#








#------------------------------------------------#
In [ ]: