{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# For tips on running notebooks in Google Colab, see\n# https://pytorch.org/tutorials/beginner/colab\n%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Learn the Basics](intro.html) \\|\\|\n[Quickstart](quickstart_tutorial.html) \\|\\|\n[Tensors](tensorqs_tutorial.html) \\|\\| **Datasets & DataLoaders** \\|\\|\n[Transforms](transforms_tutorial.html) \\|\\| [Build\nModel](buildmodel_tutorial.html) \\|\\|\n[Autograd](autogradqs_tutorial.html) \\|\\|\n[Optimization](optimization_tutorial.html) \\|\\| [Save & Load\nModel](saveloadrun_tutorial.html)\n\nDatasets & DataLoaders\n======================\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Code for processing data samples can get messy and hard to maintain; we\nideally want our dataset code to be decoupled from our model training\ncode for better readability and modularity. PyTorch provides two data\nprimitives: `torch.utils.data.DataLoader` and `torch.utils.data.Dataset`\nthat allow you to use pre-loaded datasets as well as your own data.\n`Dataset` stores the samples and their corresponding labels, and\n`DataLoader` wraps an iterable around the `Dataset` to enable easy\naccess to the samples.\n\nPyTorch domain libraries provide a number of pre-loaded datasets (such\nas FashionMNIST) that subclass `torch.utils.data.Dataset` and implement\nfunctions specific to the particular data. They can be used to prototype\nand benchmark your model. You can find them here: [Image\nDatasets](https://pytorch.org/vision/stable/datasets.html), [Text\nDatasets](https://pytorch.org/text/stable/datasets.html), and [Audio\nDatasets](https://pytorch.org/audio/stable/datasets.html)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Loading a Dataset\n=================\n\nHere is an example of how to load the\n[Fashion-MNIST](https://research.zalando.com/project/fashion_mnist/fashion_mnist/)\ndataset from TorchVision. Fashion-MNIST is a dataset of Zalando's\narticle images consisting of 60,000 training examples and 10,000 test\nexamples. Each example comprises a 28\u00d728 grayscale image and an\nassociated label from one of 10 classes.\n\nWe load the [FashionMNIST Dataset](https://pytorch.org/vision/stable/datasets.html#fashion-mnist) with the following parameters:\n\n: - `root` is the path where the train/test data is stored,\n - `train` specifies training or test dataset,\n - `download=True` downloads the data from the internet if it\\'s\n not available at `root`.\n - `transform` and `target_transform` specify the feature and label\n transformations\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\nfrom torch.utils.data import Dataset\nfrom torchvision import datasets\nfrom torchvision.transforms import ToTensor\nimport matplotlib.pyplot as plt\n\n\ntraining_data = datasets.FashionMNIST(\n root=\"data\",\n train=True,\n download=True,\n transform=ToTensor()\n)\n\ntest_data = datasets.FashionMNIST(\n root=\"data\",\n train=False,\n download=True,\n transform=ToTensor()\n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Iterating and Visualizing the Dataset\n=====================================\n\nWe can index `Datasets` manually like a list: `training_data[index]`. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "Iterating and Visualizing the Dataset\n=====================================\n\nWe can index `Datasets` manually like a list: `training_data[index]`. We\nuse `matplotlib` to visualize some samples in our training data.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "labels_map = {\n    0: \"T-Shirt\",\n    1: \"Trouser\",\n    2: \"Pullover\",\n    3: \"Dress\",\n    4: \"Coat\",\n    5: \"Sandal\",\n    6: \"Shirt\",\n    7: \"Sneaker\",\n    8: \"Bag\",\n    9: \"Ankle Boot\",\n}\nfigure = plt.figure(figsize=(8, 8))\ncols, rows = 3, 3\nfor i in range(1, cols * rows + 1):\n    sample_idx = torch.randint(len(training_data), size=(1,)).item()\n    img, label = training_data[sample_idx]\n    figure.add_subplot(rows, cols, i)\n    plt.title(labels_map[label])\n    plt.axis(\"off\")\n    plt.imshow(img.squeeze(), cmap=\"gray\")\nplt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "------------------------------------------------------------------------\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Creating a Custom Dataset for your files\n========================================\n\nA custom Dataset class must implement three functions: `__init__`,\n`__len__`, and `__getitem__`. Take a look at this implementation; the\nFashionMNIST images are stored in a directory `img_dir`, and their\nlabels are stored separately in a CSV file `annotations_file`.\n\nIn the next sections, we'll break down what's happening in each of\nthese functions.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import os\nimport pandas as pd\nfrom torchvision.io import decode_image\n\nclass CustomImageDataset(Dataset):\n    def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):\n        self.img_labels = pd.read_csv(annotations_file)\n        self.img_dir = img_dir\n        self.transform = transform\n        self.target_transform = target_transform\n\n    def __len__(self):\n        return len(self.img_labels)\n\n    def __getitem__(self, idx):\n        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])\n        image = decode_image(img_path)\n        label = self.img_labels.iloc[idx, 1]\n        if self.transform:\n            image = self.transform(image)\n        if self.target_transform:\n            label = self.target_transform(label)\n        return image, label" ] },
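{ "cell_type": "markdown", "metadata": {}, "source": [ "With those three methods in place, the class can be instantiated like any built-in dataset. The sketch below is illustrative only: the annotations CSV and image-directory paths are hypothetical placeholders, and the cell assumes such files exist on disk.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Hypothetical paths -- adjust to wherever your images and label CSV live.\ncustom_data = CustomImageDataset(\n    annotations_file=\"data/custom/annotations.csv\",\n    img_dir=\"data/custom/images\",\n    transform=None,          # e.g. a torchvision transform for the image\n    target_transform=None,   # e.g. a transform for the integer label\n)\n# Samples are then accessible by index, just like the built-in datasets:\nimage, label = custom_data[0]" ] },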
{ "cell_type": "markdown", "metadata": {}, "source": [ "`__init__`\n==========\n\nThe `__init__` function is run once when instantiating the\nDataset object. We initialize the directory containing the images, the\nannotations file, and both transforms (covered in more detail in the\nnext section).\n\nThe labels.csv file looks like:\n\n    tshirt1.jpg, 0\n    tshirt2.jpg, 0\n    ......\n    ankleboot999.jpg, 9\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):\n    self.img_labels = pd.read_csv(annotations_file)\n    self.img_dir = img_dir\n    self.transform = transform\n    self.target_transform = target_transform" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`__len__`\n=========\n\nThe `__len__` function returns the number of samples in our\ndataset.\n\nExample:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def __len__(self):\n    return len(self.img_labels)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`__getitem__`\n=============\n\nThe `__getitem__` function loads and returns a sample from\nthe dataset at the given index `idx`. Based on the index, it identifies\nthe image's location on disk, converts that to a tensor using\n`decode_image`, retrieves the corresponding label from the csv data in\n`self.img_labels`, calls the transform functions on them (if\napplicable), and returns the tensor image and corresponding label in a\ntuple.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def __getitem__(self, idx):\n    img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])\n    image = decode_image(img_path)\n    label = self.img_labels.iloc[idx, 1]\n    if self.transform:\n        image = self.transform(image)\n    if self.target_transform:\n        label = self.target_transform(label)\n    return image, label" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "------------------------------------------------------------------------\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Preparing your data for training with DataLoaders\n=================================================\n\nThe `Dataset` retrieves our dataset's features and labels one sample at\na time. While training a model, we typically want to pass samples in\n\"minibatches\", reshuffle the data at every epoch to reduce model\noverfitting, and use Python's `multiprocessing` to speed up data\nretrieval.\n\n`DataLoader` is an iterable that abstracts this complexity for us in an\neasy API.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from torch.utils.data import DataLoader\n\ntrain_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)\ntest_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)" ] },
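{ "cell_type": "markdown", "metadata": {}, "source": [ "The cell above covers batching and shuffling; for the `multiprocessing` speed-up mentioned earlier, `DataLoader` also accepts a `num_workers` argument that loads batches in parallel worker processes. Below is a minimal sketch: the loader name and the two-epoch loop are illustrative, not part of this tutorial's training code.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# num_workers=2 is an illustrative value; tune it to your machine.\ntrain_dataloader_mp = DataLoader(training_data, batch_size=64, shuffle=True, num_workers=2)\n\nfor epoch in range(2):  # shuffle=True reshuffles the data at the start of each epoch\n    for X, y in train_dataloader_mp:\n        pass  # a real training step (forward/backward) would go here" ] },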
{ "cell_type": "markdown", "metadata": {}, "source": [ "Iterate through the DataLoader\n==============================\n\nWe have loaded the dataset into the `DataLoader` and can iterate\nthrough it as needed. Each iteration below returns a batch of\n`train_features` and `train_labels` (containing `batch_size=64` features\nand labels respectively). Because we specified `shuffle=True`, the data\nis reshuffled after we iterate over all batches (for finer-grained control\nover the data loading order, take a look at\n[Samplers](https://pytorch.org/docs/stable/data.html#data-loading-order-and-sampler)).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Display image and label.\ntrain_features, train_labels = next(iter(train_dataloader))\nprint(f\"Feature batch shape: {train_features.size()}\")\nprint(f\"Labels batch shape: {train_labels.size()}\")\nimg = train_features[0].squeeze()\nlabel = train_labels[0]\nplt.imshow(img, cmap=\"gray\")\nplt.show()\nprint(f\"Label: {label}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "------------------------------------------------------------------------\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Further Reading\n===============\n\n- [torch.utils.data API](https://pytorch.org/docs/stable/data.html)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 0 }