{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# For tips on running notebooks in Google Colab, see\n# https://codelin.vip/beginner/colab\n%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "What is a state\\_dict in PyTorch\n================================\n\nIn PyTorch, the learnable parameters (i.e. weights and biases) of a\n`torch.nn.Module` model are contained in the model's parameters\n(accessed with `model.parameters()`). A `state_dict` is simply a Python\ndictionary object that maps each layer to its parameter tensor.\n\nIntroduction\n------------\n\nA `state_dict` is an integral entity if you are interested in saving or\nloading models from PyTorch. Because `state_dict` objects are Python\ndictionaries, they can be easily saved, updated, altered, and restored,\nadding a great deal of modularity to PyTorch models and optimizers. Note\nthat only layers with learnable parameters (convolutional layers, linear\nlayers, etc.) and registered buffers (batchnorm's running\\_mean) have\nentries in the model's `state_dict`. Optimizer objects (`torch.optim`)\nalso have a `state_dict`, which contains information about the\noptimizer's state, as well as the hyperparameters used. In this recipe,\nwe will see how `state_dict` is used with a simple model.\n\nSetup\n-----\n\nBefore we begin, we need to install `torch` if it isn't already\navailable.\n\n``` {.sh}\npip install torch\n```\n"
      ]
    },
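    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As noted in the Introduction, only layers with learnable parameters or\nregistered buffers contribute entries to a `state_dict`. The following is\na minimal sketch of that behavior, not part of the recipe's steps: `demo`\nand its layer sizes are hypothetical, chosen only for illustration. The\nprinted keys should come from the convolution and batchnorm layers only,\nsince ReLU and max pooling hold neither parameters nor buffers.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import torch.nn as nn\n\n# Hypothetical throwaway model (not the recipe's Net)\ndemo = nn.Sequential(\n    nn.Conv2d(3, 6, 5),   # learnable weight and bias\n    nn.BatchNorm2d(6),    # learnable weight and bias, plus registered buffers\n    nn.ReLU(),            # no parameters, no buffers\n    nn.MaxPool2d(2, 2),   # no parameters, no buffers\n)\n\n# Keys are \"<submodule name>.<parameter or buffer name>\"\nfor name, tensor in demo.state_dict().items():\n    print(name, \"\\t\", tuple(tensor.shape))\n\n# Buffers appear in the state_dict but not in named_parameters()\nparam_names = {name for name, _ in demo.named_parameters()}\nbuffer_names = sorted(set(demo.state_dict()) - param_names)\nprint(\"Buffers only:\", buffer_names)"
      ]
    },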
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Steps\n=====\n\n1.  Import all necessary libraries for loading our data\n2.  Define and initialize the neural network\n3.  Initialize the optimizer\n4.  Access the model and optimizer `state_dict`\n\n1. Import necessary libraries for loading our data\n--------------------------------------------------\n\nFor this recipe, we will use `torch` and its subsidiaries `torch.nn` and\n`torch.optim`.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.optim as optim"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "2. Define and initialize the neural network\n===========================================\n\nFor sake of example, we will create a neural network for training\nimages. To learn more see the Defining a Neural Network recipe.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "class Net(nn.Module):\n    def __init__(self):\n        super(Net, self).__init__()\n        self.conv1 = nn.Conv2d(3, 6, 5)\n        self.pool = nn.MaxPool2d(2, 2)\n        self.conv2 = nn.Conv2d(6, 16, 5)\n        self.fc1 = nn.Linear(16 * 5 * 5, 120)\n        self.fc2 = nn.Linear(120, 84)\n        self.fc3 = nn.Linear(84, 10)\n\n    def forward(self, x):\n        x = self.pool(F.relu(self.conv1(x)))\n        x = self.pool(F.relu(self.conv2(x)))\n        x = x.view(-1, 16 * 5 * 5)\n        x = F.relu(self.fc1(x))\n        x = F.relu(self.fc2(x))\n        x = self.fc3(x)\n        return x\n\nnet = Net()\nprint(net)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "3. Initialize the optimizer\n===========================\n\nWe will use SGD with momentum.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)"
      ]
    },
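    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "An optimizer's `state_dict` has two entries, `state` and `param_groups`.\nAs a minimal sketch, assuming a hypothetical one-layer model and random\ninput rather than the recipe's network, the cell below illustrates that\nthe hyperparameters sit in `param_groups` from the start, while SGD's\nmomentum buffers only show up in `state` after the first call to `step()`.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import torch\nimport torch.nn as nn\nimport torch.optim as optim\n\n# Hypothetical tiny model and optimizer, used only for this illustration\ndemo_model = nn.Linear(4, 2)\ndemo_opt = optim.SGD(demo_model.parameters(), lr=0.001, momentum=0.9)\n\n# Before any update, 'state' is empty; hyperparameters are already stored\nprint(\"state before step:\", demo_opt.state_dict()[\"state\"])\nprint(\"lr:\", demo_opt.state_dict()[\"param_groups\"][0][\"lr\"])\n\n# One optimization step on random data creates the momentum buffers\nloss = demo_model(torch.randn(8, 4)).sum()\nloss.backward()\ndemo_opt.step()\n\n# Each parameter index now maps to its own per-parameter state\nfor index, entry in demo_opt.state_dict()[\"state\"].items():\n    print(index, list(entry.keys()))"
      ]
    },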
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "4. Access the model and optimizer `state_dict`\n==============================================\n\nNow that we have constructed our model and optimizer, we can understand\nwhat is preserved in their respective `state_dict` properties.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Print model's state_dict\nprint(\"Model's state_dict:\")\nfor param_tensor in net.state_dict():\n    print(param_tensor, \"\\t\", net.state_dict()[param_tensor].size())\n\nprint()\n\n# Print optimizer's state_dict\nprint(\"Optimizer's state_dict:\")\nfor var_name in optimizer.state_dict():\n    print(var_name, \"\\t\", optimizer.state_dict()[var_name])"
      ]
    },
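    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Since a `state_dict` is a plain dictionary of tensors, it can be written\nout with `torch.save` and restored with `load_state_dict`. The cell below\nis a minimal sketch of that round trip, assuming a hypothetical\n`nn.Linear` stand-in for the model and an in-memory buffer in place of a\ncheckpoint file; the recipe's `net` should behave the same way.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import io\n\nimport torch\nimport torch.nn as nn\n\n# Hypothetical stand-in model, used only for this illustration\nsource_model = nn.Linear(4, 2)\n\n# Serialize the state_dict to an in-memory buffer instead of a file\nbuffer = io.BytesIO()\ntorch.save(source_model.state_dict(), buffer)\nbuffer.seek(0)\n\n# A freshly constructed model starts with its own random weights;\n# loading the saved state_dict overwrites them\nrestored_model = nn.Linear(4, 2)\nrestored_model.load_state_dict(torch.load(buffer))\n\n# Compare every entry of the two state_dicts\nfor name, tensor in source_model.state_dict().items():\n    print(name, torch.equal(tensor, restored_model.state_dict()[name]))"
      ]
    },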
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "This information is relevant for saving and loading the model and\noptimizers for future use.\n\nCongratulations! You have successfully used `state_dict` in PyTorch.\n\nLearn More\n==========\n\nTake a look at these other recipes to continue your learning:\n\n-   [Saving and loading models for inference in\n    PyTorch](https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_models_for_inference.html)\n-   [Saving and loading a general checkpoint in\n    PyTorch](https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_a_general_checkpoint.html)\n"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.12"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}