In this tutorial, we will introduce what mixed precision training is, what effect it has on a model, and how to use it.

## What is mixed precision training?

Mixed precision training means using both float32 and float16 precision when training a model. It has two benefits:

- Decrease the required amount of memory.

Half-precision floating point format (FP16) uses 16 bits, compared to 32 bits for single precision (FP32). Lowering the required memory enables training of larger models or training with larger minibatches.

- Shorten the training or inference time.

Execution time can be sensitive to memory or arithmetic bandwidth. Half-precision halves the number of bytes accessed, thus reducing the time spent in memory-limited layers.
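To make these benefits concrete, here is a minimal sketch (assuming PyTorch is installed) that compares the storage of the same tensor in FP32 and FP16. The tensor shape is chosen only for illustration:

```python
import torch

# The same tensor stored in FP32 vs. FP16.
x_fp32 = torch.randn(1024, 1024, dtype=torch.float32)
x_fp16 = x_fp32.half()

print(x_fp32.element_size())  # 4 bytes per element
print(x_fp16.element_size())  # 2 bytes per element

# Total storage: FP16 needs half the memory (and memory traffic) of FP32.
print(x_fp32.element_size() * x_fp32.nelement())  # 4194304 bytes
print(x_fp16.element_size() * x_fp16.nelement())  # 2097152 bytes
```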

## The effect of mixed precision training

According to the paper *Mixed Precision Training*, mixed precision training matches the accuracy of full FP32 training: it does not degrade model quality.

## How to implement mixed precision training?

If you are using PyTorch, you can implement it with torch.cuda.amp.GradScaler together with torch.cuda.amp.autocast. Here is the tutorial:

Implement Mixed Precision Training with GradScaler in PyTorch – PyTorch Tutorial
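As a quick preview of the linked tutorial, here is a minimal sketch of a mixed precision training loop with torch.cuda.amp.autocast and GradScaler. The linear model, SGD optimizer, and dummy data below are placeholders for illustration, and the sketch assumes a CUDA-capable GPU:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

# Dummy data: 64 samples with 128 features, 10 classes (for illustration only).
inputs_all = torch.randn(64, 128)
targets_all = torch.randint(0, 10, (64,))
data_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(inputs_all, targets_all), batch_size=16
)

model = torch.nn.Linear(128, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()
scaler = GradScaler()

for inputs, targets in data_loader:
    inputs, targets = inputs.cuda(), targets.cuda()
    optimizer.zero_grad()

    # Forward pass under autocast: eligible ops run in FP16, others stay in FP32.
    with autocast():
        outputs = model(inputs)
        loss = loss_fn(outputs, targets)

    # Scale the loss to avoid FP16 gradient underflow, then backpropagate.
    scaler.scale(loss).backward()
    # Unscale gradients and call optimizer.step(), skipping it if inf/NaN gradients appear.
    scaler.step(optimizer)
    # Adjust the scale factor for the next iteration.
    scaler.update()
```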