{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# For tips on running notebooks in Google Colab, see\n# https://codelin.vip/beginner/colab\n%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Reducing torch.compile cold start compilation time with regional compilation\n============================================================================\n\n**Author:** [Animesh Jain](https://github.com/anijain2305)\n\nAs deep learning models get larger, their compilation time also\nincreases. This extended compilation time can result in a long\nstartup time in inference services or wasted resources in large-scale\ntraining. This recipe shows an example of how to reduce the cold start\ncompilation time by compiling a repeated region of the model\ninstead of the entire model.\n\nPrerequisites\n-------------\n\n- PyTorch 2.5 or later\n\nSetup\n-----\n\nBefore we begin, we need to install `torch` if it is not already\navailable.\n\n``` {.sh}\npip install torch\n```\n\n```{=html}\n
This feature is available starting with the 2.5 release. If you are using version 2.4, you can enable the configuration flag torch._dynamo.config.inline_inbuilt_nn_modules=True
to avoid recompilations during regional compilation. In version 2.5, this flag is enabled by default.