{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# For tips on running notebooks in Google Colab, see\n# https://codelin.vip/beginner/colab\n%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What is a state\\_dict in PyTorch\n================================\n\nIn PyTorch, the learnable parameters (i.e. weights and biases) of a\n`torch.nn.Module` model are contained in the model's parameters\n(accessed with `model.parameters()`). A `state_dict` is simply a Python\ndictionary object that maps each layer to its parameter tensor.\n\nIntroduction\n------------\n\nA `state_dict` is essential if you want to save or load PyTorch\nmodels. Because `state_dict` objects are Python\ndictionaries, they can be easily saved, updated, altered, and restored,\nadding a great deal of modularity to PyTorch models and optimizers. Note\nthat only layers with learnable parameters (convolutional layers, linear\nlayers, etc.) and registered buffers (e.g. batchnorm's running\\_mean) have\nentries in the model's `state_dict`. Optimizer objects (`torch.optim`)\nalso have a `state_dict`, which contains information about the\noptimizer's state, as well as the hyperparameters used. In this recipe,\nwe will see how `state_dict` is used with a simple model.\n\nSetup\n-----\n\nBefore we begin, we need to install `torch` if it isn't already\navailable.\n\n``` {.sh}\npip install torch\n```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Steps\n=====\n\n1. Import all necessary libraries for loading our data\n2. Define and initialize the neural network\n3. Initialize the optimizer\n4. Access the model and optimizer `state_dict`\n\n1. 
Import necessary libraries for loading our data\n--------------------------------------------------\n\nFor this recipe, we will use `torch` and its submodules `torch.nn` and\n`torch.optim`.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.optim as optim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Define and initialize the neural network\n-------------------------------------------\n\nFor the sake of example, we will create a neural network that operates on\nimages. To learn more, see the Defining a Neural Network recipe.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "class Net(nn.Module):\n    def __init__(self):\n        super().__init__()\n        # Two convolutional layers, each followed by max pooling in forward()\n        self.conv1 = nn.Conv2d(3, 6, 5)\n        self.pool = nn.MaxPool2d(2, 2)\n        self.conv2 = nn.Conv2d(6, 16, 5)\n        # Three fully connected layers ending in 10 outputs\n        self.fc1 = nn.Linear(16 * 5 * 5, 120)\n        self.fc2 = nn.Linear(120, 84)\n        self.fc3 = nn.Linear(84, 10)\n\n    def forward(self, x):\n        x = self.pool(F.relu(self.conv1(x)))\n        x = self.pool(F.relu(self.conv2(x)))\n        # Flatten the feature maps before the linear layers\n        x = x.view(-1, 16 * 5 * 5)\n        x = F.relu(self.fc1(x))\n        x = F.relu(self.fc2(x))\n        x = self.fc3(x)\n        return x\n\nnet = Net()\nprint(net)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. Initialize the optimizer\n---------------------------\n\nWe will use SGD with momentum.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "4. 
Access the model and optimizer `state_dict`\n----------------------------------------------\n\nNow that we have constructed our model and optimizer, we can inspect\nwhat their respective `state_dict`s contain.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Print model's state_dict: parameter/buffer name and tensor shape\nprint(\"Model's state_dict:\")\nfor name, tensor in net.state_dict().items():\n    print(name, \"\\t\", tensor.size())\n\nprint()\n\n# Print optimizer's state_dict: per-parameter state and param_groups\nprint(\"Optimizer's state_dict:\")\nfor key, value in optimizer.state_dict().items():\n    print(key, \"\\t\", value)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This information is relevant for saving and loading the model and\noptimizer for future use.\n\nCongratulations! You have successfully used `state_dict` in PyTorch.\n\nLearn More\n==========\n\nTake a look at these other recipes to continue your learning:\n\n- [Saving and loading models for inference in\n PyTorch](https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_models_for_inference.html)\n- [Saving and loading a general checkpoint in\n PyTorch](https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_a_general_checkpoint.html)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 0 }