{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# For tips on running notebooks in Google Colab, see\n# https://codelin.vip/beginner/colab\n%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(beta) Running the compiled optimizer with an LR Scheduler\n==========================================================\n\n**Author:** [Michael Lazos](https://github.com/mlazos)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The optimizer is a key algorithm for training any deep learning model.\nIn this example, we will show how to pair the optimizer, which has been\ncompiled using `torch.compile`, with an LR scheduler to accelerate\ntraining convergence.\n\n```{=html}\n
<div class=\"admonition note\"><p class=\"admonition-title\">NOTE:</p><p>This tutorial requires PyTorch 2.3.0 or later.</p></div>\n```\n" ] },
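{ "cell_type": "markdown", "metadata": {}, "source": [ "Before running the rest of the notebook, you may want to verify the installed\nPyTorch version against the requirement above. The next cell is a minimal\nsketch added here for convenience (it is not part of the original tutorial) and\nonly inspects `torch.__version__`.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\n\n# Minimal sketch (not part of the original tutorial): warn if the installed\n# PyTorch build looks older than the 2.3.0 release this tutorial requires.\nmajor_minor = tuple(int(p) for p in torch.__version__.split(\"+\")[0].split(\".\")[:2])\nif major_minor < (2, 3):\n    print(f\"Found PyTorch {torch.__version__}; this tutorial expects 2.3.0 or later.\")" ] },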
{ "cell_type": "markdown", "metadata": {}, "source": [ "Model Setup\n===========\n\nFor this example, we\\'ll use a simple sequence of linear layers.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\n\n# Create a simple model\nmodel = torch.nn.Sequential(\n *[torch.nn.Linear(1024, 1024, bias=False, device=\"cuda\") for _ in range(10)]\n)\ninput = torch.rand(1024, device=\"cuda\")\n\n# run forward pass\noutput = model(input)\n\n# run backward to populate the grads for our optimizer below\noutput.sum().backward()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Setting up and running the compiled optimizer with an LR Scheduler\n==================================================================\n\nIn this section, we\\'ll use the Adam optimizer with the LinearLR scheduler\nand create a helper function to wrap the `step()` call for each of them\nin `torch.compile()`.\n\n```{=html}\n
<div class=\"admonition note\"><p class=\"admonition-title\">NOTE:</p><p>torch.compile is only supported on CUDA devices that have a compute capability of 7.0 or higher.</p></div>
\n```\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# exit cleanly if we are on a device that doesn't support ``torch.compile``\nif torch.cuda.get_device_capability() < (7, 0):\n print(\"Exiting because torch.compile is not supported on this device.\")\n import sys\n sys.exit(0)\n\n# !!! IMPORTANT !!! Wrap the lr in a Tensor if we are pairing\n# the optimizer with an LR Scheduler.\n# Without this, torch.compile will recompile as the value of the LR\n# changes.\nopt = torch.optim.Adam(model.parameters(), lr=torch.tensor(0.01))\nsched = torch.optim.lr_scheduler.LinearLR(opt, total_iters=5)\n\n@torch.compile(fullgraph=False)\ndef fn():\n opt.step()\n sched.step()\n\n\n# Warmup runs to compile the function\nfor _ in range(5):\n fn()\n print(opt.param_groups[0][\"lr\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Extension: What happens with a non-tensor LR?\n=============================================\n\nFor the curious, we will show how to peek into what happens with\n`torch.compile` when we don\\'t wrap the LR in a tensor.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# No longer wrap the LR in a tensor here\nopt = torch.optim.Adam(model.parameters(), lr=0.01)\nsched = torch.optim.lr_scheduler.LinearLR(opt, total_iters=5)\n\n@torch.compile(fullgraph=False)\ndef fn():\n opt.step()\n sched.step()\n\n# Setup logging to view recompiles\ntorch._logging.set_logs(recompiles=True)\n\n# Warmup runs to compile the function\n# We will now recompile on each iteration\n# as the value of the lr is mutated.\nfor _ in range(5):\n fn()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With this example, we can see that we recompile the optimizer a few\ntimes due to the guard failure on the `lr` in `param_groups[0]`.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Conclusion\n==========\n\nIn this tutorial we showed how to pair an optimizer compiled with\n`torch.compile` with an LR scheduler to accelerate training convergence.\nWe used a model consisting of a simple sequence of linear layers with\nthe Adam optimizer paired with a LinearLR scheduler to demonstrate the\nLR changing across iterations.\n\nSee also:\n\n- [Compiled optimizer\n tutorial](https://pytorch.org/tutorials/recipes/compiling_optimizer.html) -\n an introduction to the compiled optimizer.\n- [Compiling the optimizer with\n PT2](https://dev-discuss.pytorch.org/t/compiling-the-optimizer-with-pt2/1669) -\n deeper technical details on the compiled optimizer.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 0 }