Introduction
Deep learning has driven breakthroughs in AI research and applications, transforming many aspects of modern life. TensorFlow, Google's open-source deep learning framework, is one of the most popular tools in the field. It is widely used for image classification, audio processing, recommendation systems, and natural language processing. Despite its power, TensorFlow has a relatively gentle learning curve for anyone who knows Python and has a basic understanding of machine learning and neural networks. This guide provides a foundational introduction to TensorFlow.
Getting to Know TensorFlow
Installation
TensorFlow runs on Python 2.7 or 3.x, and supports 64-bit Linux, macOS, and Windows. There are two main package variants: tensorflow (CPU-only) and tensorflow-gpu (GPU-accelerated). For production, the GPU version is recommended for its computational power, but it requires installing CUDA Toolkit and cuDNN. The CPU version is simpler to install, typically via pip install tensorflow. If you encounter issues, search online for solutions based on the error messages.
After installation, verify it by running Python and importing TensorFlow:
>>> import tensorflow as tf
This import statement is a standard convention used throughout this guide.
TensorFlow's Computational Model
In TensorFlow, computation is expressed as a graph of operations. Let's compute c = a + b where a = 3 and b = 2.
>>> a = tf.constant(3)
>>> b = tf.constant(2)
>>> c = a + b
>>> sess = tf.Session()
>>> print(sess.run(c))
5
Unlike Python's immediate print(3+2), TensorFlow requires defining operations and then executing them via a Session. The objects a, b, and c are Tensors—multidimensional arrays that are the core data structure. A scalar is a 0-D tensor, a vector is a 1-D tensor, and a matrix is a 2-D tensor. In deep learning, data like weights, biases, and images are represented as tensors.
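To make the rank terminology concrete, here is a small NumPy sketch (NumPy arrays follow the same shape and rank conventions as TensorFlow tensors; this illustrates the concept, not the TensorFlow API):

```python
import numpy as np

# Tensor ranks illustrated with NumPy arrays.
scalar = np.array(5)                  # 0-D tensor: a single number
vector = np.array([1, 2, 3])          # 1-D tensor: shape (3,)
matrix = np.array([[1, 2], [3, 4]])   # 2-D tensor: shape (2, 2)

print(scalar.ndim, vector.ndim, matrix.ndim)  # 0 1 2
```

A 28x28 grayscale MNIST image, for instance, is naturally a 2-D tensor, and a batch of such images adds one more dimension.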
The name "TensorFlow" reflects tensors flowing through a computational graph. Each node in the graph is an operation (like addition), and tensors are the inputs and outputs. Execution happens in two phases: building the graph and then running it via a session. This design allows efficient execution, especially on GPUs, by minimizing data transfer overhead.
Instead of sess.run(c), you can use c.eval(session=sess). For convenience, you can create an interactive session:
>>> sess = tf.InteractiveSession()
>>> print(c.eval())
5
To make computations parameterizable, use tf.placeholder:
>>> a = tf.placeholder(tf.int32)
>>> b = tf.placeholder(tf.int32)
>>> c = a + b
>>> print(c.eval({a:3, b:2}))
5
For updatable parameters, use tf.Variable. Unlike constants, variables hold state that can change during execution, and they must be explicitly initialized by running an initializer op in a session:
>>> a = tf.Variable(3)
>>> b = tf.Variable(2)
>>> c = a + b
>>> init = tf.global_variables_initializer()
>>> sess.run(init)
>>> print(c.eval())
5
Variables are typically used for model parameters like weights and biases that are optimized during training.
Machine Learning with TensorFlow: MNIST Example
Loading Data
The MNIST dataset contains 70,000 labeled images of handwritten digits (0–9). TensorFlow provides a helper function to download and load it:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
The data is split into 55,000 training images, 5,000 validation images, and 10,000 test images. Each image is 28x28 pixels, flattened into a 784-element vector. Pixel values are normalized between 0 and 1. Labels are one-hot vectors of length 10 (e.g., digit 3 is [0,0,0,1,0,0,0,0,0,0]).
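The one-hot encoding described above is easy to reproduce by hand. A minimal NumPy sketch (a stand-in illustration, not part of the input_data helper):

```python
import numpy as np

def one_hot(digit, num_classes=10):
    """Return the one-hot label vector for a digit, as described above."""
    vec = np.zeros(num_classes)
    vec[digit] = 1.0
    return vec

print(one_hot(3))  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```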
Building a Softmax Regression Model
We'll use a simple Softmax regression model for classification. The model computes:
y = softmax(xW + b)
Where x is the input vector (784 pixels), W is a weight matrix (784x10), b is a bias vector of length 10, and softmax converts the raw scores into a probability distribution over the ten digit classes. (The order xW matches the code below, where each row of x is one input.)
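Softmax itself is just exponentiation followed by normalization. A minimal NumPy sketch of the math (the values are illustrative, and this is not the TensorFlow API):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged
    # because softmax is invariant to shifting all logits equally.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)   # approximately [0.659, 0.242, 0.099]
```

The outputs are non-negative and sum to one, so they can be read as class probabilities; the largest logit always gets the highest probability.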
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
We define a loss function to minimize. For classification, cross-entropy is common:
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
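To make the cross-entropy formula concrete, here is a NumPy sketch on a single hypothetical example (values chosen purely for illustration):

```python
import numpy as np

# Hypothetical single example: the true label is class 2 (one-hot),
# and the model assigns that class probability 0.7.
y_true = np.array([0.0, 0.0, 1.0])
y_pred = np.array([0.1, 0.2, 0.7])

# Only the true class contributes, so this reduces to -log(0.7).
cross_entropy = -np.sum(y_true * np.log(y_pred))
print(round(cross_entropy, 4))  # 0.3567
```

The closer the predicted probability of the true class is to 1, the smaller the loss; a confident wrong prediction is penalized heavily.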
Computing tf.log(y) directly can be numerically unstable (a predicted probability of 0 yields -inf), so TensorFlow provides a combined function that handles this safely:
y = tf.matmul(x, W) + b
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
Training the Model
We use gradient descent to minimize the loss:
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
Training involves running the optimizer in a session:
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
We use mini-batch stochastic gradient descent for efficiency.
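To see what a single training step does under the hood, here is a NumPy sketch of one mini-batch gradient-descent update for softmax regression on random stand-in data. This illustrates the math that the optimizer automates; the gradient formulas are the standard ones for softmax combined with cross-entropy, and the data here is random, not MNIST:

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(100, 784)            # one stand-in mini-batch of flattened images
labels = rng.randint(0, 10, 100)
Y = np.eye(10)[labels]            # one-hot targets

W = np.zeros((784, 10))
b = np.zeros(10)

def forward(X):
    logits = X @ W + b
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def loss(P, Y):
    return -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))

before = loss(forward(X), Y)
# Gradient of the mean cross-entropy w.r.t. W and b for softmax regression.
P = forward(X)
gW = X.T @ (P - Y) / len(X)
gb = (P - Y).mean(axis=0)
W -= 0.01 * gW                    # one gradient-descent step, learning rate 0.01
b -= 0.01 * gb
after = loss(forward(X), Y)
```

With W and b initialized to zero, every class starts with probability 0.1, so the initial loss is ln(10) ≈ 2.303; each step nudges the parameters in the direction that lowers the loss.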
Evaluating the Model
Accuracy is computed by comparing predictions to true labels:
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
This simple model achieves about 91% accuracy.
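The accuracy computation above is easy to verify on toy data. A NumPy sketch with three made-up predictions (argmax turns probability vectors, or one-hot labels, back into class indices):

```python
import numpy as np

predictions = np.array([[0.1, 0.8, 0.1],    # predicts class 1
                        [0.7, 0.2, 0.1],    # predicts class 0
                        [0.2, 0.3, 0.5]])   # predicts class 2
labels = np.array([[0, 1, 0],               # true class 1 -> correct
                   [0, 1, 0],               # true class 1 -> wrong
                   [0, 0, 1]])              # true class 2 -> correct

correct = np.argmax(predictions, axis=1) == np.argmax(labels, axis=1)
accuracy = correct.astype(np.float32).mean()  # 2 of 3 correct
```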
Deep Learning with TensorFlow: Convolutional Neural Network
CNN Basics
Convolutional Neural Networks (CNNs) are more effective than plain fully connected models for image tasks. They use convolutional layers to extract local features and pooling layers to reduce spatial dimensionality. ReLU is the most commonly used activation function in CNNs.
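The core convolution operation is just a small filter sliding over the image, taking a dot product at each position. A minimal NumPy sketch with a toy 3x3 image and a hypothetical edge-detecting filter (valid padding for simplicity; tf.nn.conv2d computes the same cross-correlation, batched and multi-channel):

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Slide the kernel over every position where it fits entirely,
    # taking the elementwise product-and-sum at each location.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.array([[1, 2, 0],
                  [0, 1, 3],
                  [4, 0, 1]], dtype=float)
edge = np.array([[1, -1]], dtype=float)  # toy horizontal-difference filter
result = conv2d_valid(image, edge)       # [[-1, 2], [-1, -2], [4, -1]]
```

Each output value measures how strongly the local patch matches the filter, which is how convolutional layers learn to respond to edges, corners, and textures.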
Building a LeNet-5 Style CNN
We'll construct a CNN with two convolutional-pooling layers, followed by fully connected layers.
First Convolutional Layer:
x_image = tf.reshape(x, [-1, 28, 28, 1])
W_conv1 = tf.Variable(tf.truncated_normal([5,5,1,32], stddev=0.1))
b_conv1 = tf.Variable(tf.constant(0.1, shape=[32]))
h_conv1 = tf.nn.relu(tf.nn.conv2d(x_image, W_conv1, strides=[1,1,1,1], padding='SAME') + b_conv1)
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
Second Convolutional Layer:
W_conv2 = tf.Variable(tf.truncated_normal([5,5,32,64], stddev=0.1))
b_conv2 = tf.Variable(tf.constant(0.1, shape=[64]))
h_conv2 = tf.nn.relu(tf.nn.conv2d(h_pool1, W_conv2, strides=[1,1,1,1], padding='SAME') + b_conv2)
h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
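It is worth tracking the spatial dimensions through these layers. With 'SAME' padding the convolutions preserve the size, while each 2x2 max pool with stride 2 halves it. A quick arithmetic sketch (the ceil-based output-size rule for 'SAME' padding):

```python
import math

def same_conv(size, stride=1):
    # 'SAME' padding: output size = ceil(input / stride)
    return math.ceil(size / stride)

def max_pool_2x2(size):
    # 2x2 pooling with stride 2 halves each spatial dimension
    return math.ceil(size / 2)

size = 28                     # MNIST images are 28x28
size = same_conv(size)        # conv1 keeps 28x28
size = max_pool_2x2(size)     # pool1 -> 14x14
size = same_conv(size)        # conv2 keeps 14x14
size = max_pool_2x2(size)     # pool2 -> 7x7
print(size)  # 7
```

So the second pooling layer outputs 7x7 feature maps with 64 channels, giving the 7*7*64 features that are flattened for the fully connected layer.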
Fully Connected Layer with Dropout:
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
W_fc1 = tf.Variable(tf.truncated_normal([7*7*64, 1024], stddev=0.1))
b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024]))
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
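Dropout randomly zeroes activations during training and scales the survivors by 1/keep_prob, so the expected activation is unchanged; at test time (keep_prob = 1.0) it is the identity. A NumPy sketch of this "inverted dropout" behavior, which matches what tf.nn.dropout does:

```python
import numpy as np

rng = np.random.RandomState(0)

def dropout(x, keep_prob):
    # Keep each activation with probability keep_prob, scaling the
    # kept values by 1/keep_prob so the expected value is preserved.
    mask = rng.rand(*x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

x = np.ones(10)
out = dropout(x, 0.5)   # surviving entries become 2.0, the rest 0.0
```

Randomly dropping units during training prevents the network from relying too heavily on any single feature, which reduces overfitting.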
Output Layer:
W_fc2 = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1))
b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
Training and Evaluation
We use the Adam optimizer and include dropout during training:
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.global_variables_initializer())
for i in range(20000):
    batch_xs, batch_ys = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch_xs, y_: batch_ys, keep_prob: 1.0})
        print('step %d, training accuracy %g' % (i, train_accuracy))
    train_step.run(feed_dict={x: batch_xs, y_: batch_ys, keep_prob: 0.5})
print('test accuracy %g' % accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
This CNN achieves about 99.2% accuracy on the test set.
Conclusion
This guide introduced TensorFlow's core concepts and demonstrated basic machine learning and deep learning workflows using the MNIST dataset. TensorFlow offers extensive tools for building, training, and deploying models. To learn more, explore the official documentation and experiment with your own projects.