The TensorFlow tf.nn.conv2d() function is widely used to build convolutional networks in deep learning. In this tutorial, we will use some examples to show how to use it correctly.

## Syntax

In TensorFlow 1.x, tf.nn.conv2d() is defined as:

tf.nn.conv2d( input, filter, strides, padding, use_cudnn_on_gpu=True, data_format='NHWC', dilations=[1, 1, 1, 1], name=None )

It computes a 2-D convolution given 4-D input and filter tensors.

## Parameters

input: A 4-D tensor. With the default data_format, its shape should be [batch, in_height, in_width, **in_channels**].

For image data, batch is the number of images in the batch, in_height is the image height, in_width is the image width, and in_channels is the number of color channels, for example 3 for r, g, b.

filter: A 4-D tensor with the shape [filter_height, filter_width, **in_channels**, out_channels].

Note that in_channels must have the same value in input and filter.
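The channel rule above can be checked before a graph is built. The following is a minimal sketch in plain Python, assuming the default NHWC layout; `check_conv2d_shapes` is a hypothetical helper written for this tutorial, not a TensorFlow function.

```python
def check_conv2d_shapes(input_shape, filter_shape):
    """Return out_channels if an NHWC input shape and a filter shape are compatible."""
    if len(input_shape) != 4 or len(filter_shape) != 4:
        raise ValueError("input and filter must both be 4-D")
    in_channels = input_shape[3]          # last axis of an NHWC input
    filter_in_channels = filter_shape[2]  # third axis of the filter
    if in_channels != filter_in_channels:
        raise ValueError(
            "in_channels mismatch: input has %d, filter expects %d"
            % (in_channels, filter_in_channels)
        )
    return filter_shape[3]                # out_channels


# A batch of 8 RGB images (32 x 32) and a 5 x 5 filter with 16 output channels.
print(check_conv2d_shapes([8, 32, 32, 3], [5, 5, 3, 16]))
```

If the two in_channels values differ, the helper raises a ValueError instead of returning.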

strides: The stride of the sliding window for each dimension of input. It is usually [1, stride, stride, 1]. The dimension order is determined by the value of data_format.

data_format: It can be NHWC or NCHW; the default is NHWC. It determines the dimension order of input and strides.

NHWC: It means the input = [batch, in_height, in_width, in_channels], strides = [1,stride,stride,1]

NCHW: It means the input = [batch, in_channels, in_height, in_width], strides = [1, 1, stride,stride]
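To make the two layouts concrete, here is a small sketch in plain Python. `make_strides` and `nhwc_to_nchw_shape` are hypothetical helper names used only for illustration; they build the lists you would pass to tf.nn.conv2d() and do not call TensorFlow themselves.

```python
def make_strides(stride, data_format="NHWC"):
    """Build the 4-element strides list for a given spatial stride."""
    if data_format == "NHWC":
        return [1, stride, stride, 1]   # batch, height, width, channels
    if data_format == "NCHW":
        return [1, 1, stride, stride]   # batch, channels, height, width
    raise ValueError("data_format must be 'NHWC' or 'NCHW'")


def nhwc_to_nchw_shape(shape):
    """Reorder an NHWC shape into the equivalent NCHW shape."""
    batch, height, width, channels = shape
    return [batch, channels, height, width]


print(make_strides(2, "NHWC"))
print(make_strides(2, "NCHW"))
print(nhwc_to_nchw_shape([1, 5, 5, 3]))
```

In both layouts the batch and channel strides stay at 1; only the position of the two spatial strides changes.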

padding: The type of padding algorithm to use. It can be 'SAME' or 'VALID'.

To know the difference between SAME and VALID, you can read:

Understand the Difference Between ‘SAME’ and ‘VALID’ Padding in Convolution Networks
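As a rough illustration of what 'SAME' does, the sketch below computes how many zeros are added before and after one spatial dimension. It follows the padding rule described in the TensorFlow documentation and assumes a dilation of 1; `same_padding` is a hypothetical helper name, not part of TensorFlow.

```python
def same_padding(in_size, filter_size, stride):
    """Return (pad_before, pad_after) for one dimension under 'SAME' padding."""
    if in_size % stride == 0:
        total = max(filter_size - stride, 0)
    else:
        total = max(filter_size - in_size % stride, 0)
    before = total // 2      # top or left
    after = total - before   # bottom or right receives the extra zero, if any
    return before, after


# Input size 5, filter size 2, stride 2: one zero is added at the end only.
print(same_padding(5, 2, 2))
# Input size 5, filter size 3, stride 1: one zero on each side.
print(same_padding(5, 3, 1))
```

With 'VALID', no zeros are added at all, so both values would be 0.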

dilations: Defaults to [1, 1, 1, 1]. The dilation factor for each dimension of input. If set to k > 1, there will be k-1 skipped cells between each filter element on that dimension. The dimension order is determined by the value of data_format.
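The effect of a dilation factor can be sketched with a little arithmetic: a filter of size k with dilation d covers (k - 1) * d + 1 input cells along that dimension. The helper names below are hypothetical and the code is plain Python, independent of TensorFlow.

```python
def effective_filter_size(filter_size, dilation):
    """Number of input cells spanned by a dilated filter along one dimension."""
    return (filter_size - 1) * dilation + 1


def tap_offsets(filter_size, dilation):
    """Offsets of the input cells that the filter elements actually read."""
    return [i * dilation for i in range(filter_size)]


# A 3-element filter with dilation 2 skips one cell between its elements.
print(effective_filter_size(3, 2))
print(tap_offsets(3, 2))
```

With the default dilation of 1, the effective size equals the filter size and no cells are skipped.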

## Return

tf.nn.conv2d() returns a tensor. With the default data_format, its shape is [batch, out_height, out_width, out_channels], where out_height and out_width are determined by filter, strides, padding and dilations.

In order to know how to determine out_height and out_width, you can read:

Understand the Shape of Tensor Returned by tf.nn.conv2d()
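For a quick estimate, the output shape can be computed by hand. The sketch below assumes the NHWC layout and the same stride and dilation for height and width; `conv2d_output_shape` is a hypothetical helper that mirrors the rules in the TensorFlow documentation and is not a TensorFlow function.

```python
def conv2d_output_shape(input_shape, filter_shape, stride, padding, dilation=1):
    """Estimate the NHWC output shape of a 2-D convolution."""
    batch, in_height, in_width, _ = input_shape
    filter_height, filter_width, _, out_channels = filter_shape

    def out_size(in_size, filter_size):
        if padding == "SAME":
            # ceil(in_size / stride), written with integer arithmetic
            return -(-in_size // stride)
        if padding == "VALID":
            effective = (filter_size - 1) * dilation + 1
            # ceil((in_size - effective + 1) / stride)
            return -(-(in_size - effective + 1) // stride)
        raise ValueError("padding must be 'SAME' or 'VALID'")

    return [batch,
            out_size(in_height, filter_height),
            out_size(in_width, filter_width),
            out_channels]


print(conv2d_output_shape([1, 5, 5, 1], [2, 2, 1, 1], stride=2, padding="SAME"))
print(conv2d_output_shape([1, 5, 5, 1], [2, 2, 1, 1], stride=2, padding="VALID"))
```

The sketch does not validate its arguments, so it is only meaningful when the effective filter size is no larger than the input.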

Next, we will use some examples to show how to use tf.nn.conv2d().

## How to use tf.nn.conv2d()?

Look at this example code:

    import tensorflow as tf

    # One 5 x 5 image with a single channel; every value is 1.0
    input = tf.Variable(tf.constant(1.0, shape=[1, 5, 5, 1]))
    # One 2 x 2 filter: 1 input channel, 1 output channel
    filter = tf.Variable(tf.constant([-1.0, 0, 0, -1], shape=[2, 2, 1, 1]))

    op = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='SAME')

    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        print("op:\n", sess.run(op))

In this example:

For input:

batch = 1, in_height = 5, in_width = 5, in_channels = 1

For filter:

filter_height = 2, filter_width = 2, in_channels = 1, out_channels = 1

The shape of op will be [1, out_height, out_width, 1]. With padding='SAME' and a stride of 2, out_height = out_width = ceil(5 / 2) = 3.

Run this code, and op will be:

    op:
     [[[[-2.]
       [-2.]
       [-1.]]

      [[-2.]
       [-2.]
       [-1.]]

      [[-1.]
       [-1.]
       [-1.]]]]

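The process is: with padding='SAME', TensorFlow adds one row of zeros at the bottom of input and one column of zeros on the right, so the 2 x 2 filter can slide over a 6 x 6 area with a stride of 2. A window that lies fully inside the image gives (-1) * 1 + (-1) * 1 = -2. A window on the last row or last column has its bottom-right element on a padded zero, so it gives -1.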
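The computation in this example can be reproduced without TensorFlow. The following is a minimal pure-Python sketch for a single image and a single channel. `conv2d_single_channel` is a hypothetical helper, and its padding rule is an assumption based on the documented 'SAME' behavior (any extra zero goes to the bottom and right).

```python
def conv2d_single_channel(image, kernel, stride, padding):
    """Slide a 2-D `kernel` over a 2-D `image` (lists of lists), as conv2d does."""
    in_h, in_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])

    if padding == "SAME":
        out_h = -(-in_h // stride)  # ceil(in_h / stride)
        out_w = -(-in_w // stride)
        pad_h = max((out_h - 1) * stride + k_h - in_h, 0)
        pad_w = max((out_w - 1) * stride + k_w - in_w, 0)
    elif padding == "VALID":
        out_h = (in_h - k_h) // stride + 1
        out_w = (in_w - k_w) // stride + 1
        pad_h = pad_w = 0
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")
    pad_top, pad_left = pad_h // 2, pad_w // 2

    def pixel(r, c):
        # Positions that fall in the zero padding contribute 0.
        if 0 <= r < in_h and 0 <= c < in_w:
            return image[r][c]
        return 0.0

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for a in range(k_h):
                for b in range(k_w):
                    r = i * stride - pad_top + a
                    c = j * stride - pad_left + b
                    total += kernel[a][b] * pixel(r, c)
            row.append(total)
        output.append(row)
    return output


image = [[1.0] * 5 for _ in range(5)]    # the 5 x 5 input of ones
kernel = [[-1.0, 0.0], [0.0, -1.0]]      # the 2 x 2 filter from the example
for row in conv2d_single_channel(image, kernel, stride=2, padding="SAME"):
    print(row)
```

Calling the same function with padding="VALID" skips the zero padding and returns a 2 x 2 result.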

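If we set padding='VALID', op will be:

    op:
     [[[[-2.]
       [-2.]]

      [[-2.]
       [-2.]]]]

No padding is added, so the last row and column of input are dropped: a window starting at row or column 4 would run past the edge of the 5 x 5 input.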

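If we change the shape of filter to [2, 2, 1, 2], the filter has two output channels:

    filter = tf.Variable(tf.constant([-1.0, 0, 0, -1], shape=[2, 2, 1, 2]))
    op = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='VALID')

Only four values are given for the eight filter elements. In TensorFlow 1.x, tf.constant() fills the remaining elements with the last value in the list (-1), so the weights of each output channel sum to -3.

The shape of op will be [1, 2, 2, 2]:

    op:
     [[[[-3. -3.]
       [-3. -3.]]

      [[-3. -3.]
       [-3. -3.]]]]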
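To see where the value -3 comes from, the sketch below imitates in plain Python how a short value list is expanded to the filter shape, assuming the TensorFlow 1.x behavior of repeating the last value. `fill_like_tf1_constant` is a hypothetical helper written for illustration only.

```python
def fill_like_tf1_constant(values, shape):
    """Expand `values` to the flat size of `shape`, repeating the last value."""
    size = 1
    for dim in shape:
        size *= dim
    if len(values) > size:
        raise ValueError("too many values for the requested shape")
    return list(values) + [values[-1]] * (size - len(values))


flat = fill_like_tf1_constant([-1.0, 0.0, 0.0, -1.0], [2, 2, 1, 2])
print(flat)

# out_channels is the last axis, so every second element belongs to one channel.
for out_channel in range(2):
    weights = flat[out_channel::2]
    print(out_channel, weights, sum(weights))
```

Because every input value is 1.0 and each 'VALID' window lies fully inside the image, each output value equals the sum of the weights of its output channel.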