{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# For tips on running notebooks in Google Colab, see\n# https://pytorch.org/tutorials/beginner/colab\n%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A Gentle Introduction to `torch.autograd`\n=========================================\n\n`torch.autograd` is PyTorch's automatic differentiation engine that\npowers neural network training. In this section, you will get a\nconceptual understanding of how autograd helps a neural network train.\n\nBackground\n----------\n\nNeural networks (NNs) are a collection of nested functions that are\nexecuted on some input data. These functions are defined by *parameters*\n(consisting of weights and biases), which in PyTorch are stored in\ntensors.\n\nTraining a NN happens in two steps:\n\n**Forward Propagation**: In forward prop, the NN makes its best guess\nabout the correct output. It runs the input data through each of its\nfunctions to make this guess.\n\n**Backward Propagation**: In backprop, the NN adjusts its parameters\nproportionate to the error in its guess. It does this by traversing\nbackwards from the output, collecting the derivatives of the error with\nrespect to the parameters of the functions (*gradients*), and optimizing\nthe parameters using gradient descent. For a more detailed walkthrough\nof backprop, check out this [video from\n3Blue1Brown](https://www.youtube.com/watch?v=tIeHLnjs5U8).\n\nUsage in PyTorch\n----------------\n\nLet\\'s take a look at a single training step. For this example, we load\na pretrained resnet18 model from `torchvision`. We create a random data\ntensor to represent a single image with 3 channels, and height & width\nof 64, and its corresponding `label` initialized to some random values.\nLabel in pretrained models has shape (1,1000).\n\n```{=html}\n
<div class=\"alert alert-info\"><h4>Note</h4><p>This tutorial works only on the CPU and will not work on GPU devices (even if tensors are moved to CUDA).</p></div>
\n```\n
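\nThe cell below is a minimal sketch of the single training step described in the Usage in PyTorch section above: it loads a pretrained resnet18 from `torchvision`, builds the random data and label tensors, and runs one forward pass, one backward pass, and one optimizer update. The dummy loss and the SGD hyperparameters are illustrative placeholders, not prescribed values.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\nfrom torchvision.models import resnet18, ResNet18_Weights\n\n# Pretrained resnet18, a random 3x64x64 image, and a random (1, 1000) label\nmodel = resnet18(weights=ResNet18_Weights.DEFAULT)\ndata = torch.rand(1, 3, 64, 64)\nlabels = torch.rand(1, 1000)\n\n# Forward propagation: run the input through the model to get a prediction\nprediction = model(data)\n\n# Use a simple placeholder error and backpropagate it; autograd computes and\n# stores each parameter's gradient in the parameter's .grad attribute\nloss = (prediction - labels).sum()\nloss.backward()\n\n# Gradient descent: an optimizer updates the parameters from the stored gradients\noptim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)\noptim.step()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "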
```{=html}\n<div class=\"alert alert-info\"><h4>Note</h4><p>An important thing to note is that the graph is recreated from scratch; after each <code>.backward()</code> call, autograd starts populating a new graph. This is exactly what allows you to use control flow statements in your model; you can change the shape, size and operations at every iteration if needed.</p></div>