What is a GPU?

GPUs (Graphics Processing Units) are specialized computer hardware originally created to render images at high frame rates (most commonly images in video games). Since graphics texturing and shading require more matrix and vector operations executed in parallel than a CPU (Central Processing Unit) can reasonably handle, GPUs were made to perform these calculations more efficiently.

Why a GPU?

It so happens that Deep Learning also requires super fast matrix computations. So researchers put two and two together and started training models in GPU’s and the rest is history.

Optimized FP Operations Complex Instruction Set
Slow (1-2 Ghz) Fast (3-4 Ghz)
> 1000 Cores < 100 Cores
Fast Dedicated VRAM Large Capacity System RAM

Deep Learning really only cares about the number of Floating Point Operations (FLOPs) per second. GPUs are highly optimized for that.


In the log scale chart above, you can see that GPUs (red/green) can theoretically do 10-15x the operations of CPUs (in blue). This speedup very much applies in practice too. But do not take our word for it!

Try running this inside a Jupyter Notebook:

Cell [1]:
import torch
t_cpu = torch.rand(500,500,500)
%timeit t_cpu @ t_cpu

Cell [2]:
t_gpu = torch.rand(500,500,500).cuda()
%timeit t_gpu @ t_gpu

If you would like to train anything meaningful in deep learning, a GPU is what you need - specifically an NVIDIA GPU.


We recommend you to use an NVIDIA GPU since they are currently the best out there for a few reasons:

  1. Currently the fastest

  2. Native Pytorch support for CUDA

  3. Highly optimized for deep learning with cuDNN

Many thanks to Andrew Shaw for writing this guide.