PyTorch Ignite 0.4.2 : Examples : MNIST with Visdom (Translation / Commentary)
Translation : ClassCat Co., Ltd. Sales Information
Created : 02/11/2021 (0.4.2)

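* This page is based on the following example from the PyTorch Ignite documentation, rewritten where appropriate and supplemented with explanations:

examples/mnist/mnist_with_visdom.py

* The sample code has been checked to run, but it has been extended or modified where necessary.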

PyTorch Ignite 0.4.2 : Examples : MNIST with Visdom

This MNIST example uses Visdom to monitor training and validation.

from argparse import ArgumentParser

import numpy as np
import torch
import torch.nn.functional as F
from torch import nn
from torch.optim import SGD
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision.transforms import Compose, Normalize, ToTensor

from ignite.engine import Events, create_supervised_evaluator, create_supervised_trainer
from ignite.metrics import Accuracy, Loss

try:
    import visdom
except ImportError:
    raise RuntimeError("No visdom package is found. Please install it with command: \n pip install visdom")

train_batch_size = 64
val_batch_size = 1000
epochs = 10
lr = 0.01
momentum = 0.5
log_interval = 10
log_file = None

Define the training and validation datasets as torch.utils.data.DataLoader objects :

def get_data_loaders(train_batch_size, val_batch_size):
    data_transform = Compose([ToTensor(), Normalize((0.1307,), (0.3081,))])

    train_loader = DataLoader(
        MNIST(download=True, root=".", transform=data_transform, train=True), batch_size=train_batch_size, shuffle=True
    )

    val_loader = DataLoader(
        MNIST(download=False, root=".", transform=data_transform, train=False), batch_size=val_batch_size, shuffle=False
    )
    return train_loader, val_loader

train_loader, val_loader = get_data_loaders(train_batch_size, val_batch_size)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./MNIST/raw/train-images-idx3-ubyte.gz
100.1%
Extracting ./MNIST/raw/train-images-idx3-ubyte.gz to ./MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./MNIST/raw/train-labels-idx1-ubyte.gz
113.5%
Extracting ./MNIST/raw/train-labels-idx1-ubyte.gz to ./MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./MNIST/raw/t10k-images-idx3-ubyte.gz
100.4%
Extracting ./MNIST/raw/t10k-images-idx3-ubyte.gz to ./MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./MNIST/raw/t10k-labels-idx1-ubyte.gz
180.4%
Extracting ./MNIST/raw/t10k-labels-idx1-ubyte.gz to ./MNIST/raw
Processing...
/home/ubuntu/anaconda3/envs/torch171_ignite.py37/lib/python3.7/site-packages/torchvision/datasets/mnist.py:480: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at  /opt/conda/conda-bld/pytorch_1607370156314/work/torch/csrc/utils/tensor_numpy.cpp:141.)
  return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
Done!

%ls -l MNIST/raw

total 65012
-rw-rw-r-- 1 ubuntu ubuntu  7840016 Feb  7 23:29 t10k-images-idx3-ubyte
-rw-rw-r-- 1 ubuntu ubuntu  1648877 Feb  7 23:29 t10k-images-idx3-ubyte.gz
-rw-rw-r-- 1 ubuntu ubuntu    10008 Feb  7 23:29 t10k-labels-idx1-ubyte
-rw-rw-r-- 1 ubuntu ubuntu     4542 Feb  7 23:29 t10k-labels-idx1-ubyte.gz
-rw-rw-r-- 1 ubuntu ubuntu 47040016 Feb  7 23:29 train-images-idx3-ubyte
-rw-rw-r-- 1 ubuntu ubuntu  9912422 Feb  7 23:29 train-images-idx3-ubyte.gz
-rw-rw-r-- 1 ubuntu ubuntu    60008 Feb  7 23:29 train-labels-idx1-ubyte
-rw-rw-r-- 1 ubuntu ubuntu    28881 Feb  7 23:29 train-labels-idx1-ubyte.gz

%ls -l MNIST/processed

total 54144
-rw-rw-r-- 1 ubuntu ubuntu  7921089 Feb  7 23:29 test.pt
-rw-rw-r-- 1 ubuntu ubuntu 47521089 Feb  7 23:29 training.pt
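
As a rough check of the numbers behind get_data_loaders, the following sketch reproduces in plain Python the per-pixel arithmetic of ToTensor followed by Normalize((0.1307,), (0.3081,)), and counts how many batches each loader yields per epoch. It assumes the standard MNIST split of 60,000 training and 10,000 test images and the DataLoader default of keeping the last partial batch. The helper names normalize_pixel and batches_per_epoch are hypothetical and are not part of torchvision.

```python
import math

MEAN, STD = 0.1307, 0.3081  # the values passed to Normalize in get_data_loaders


def normalize_pixel(raw_byte):
    """Mimic ToTensor (scale 0..255 to 0..1) followed by Normalize."""
    scaled = raw_byte / 255.0
    return (scaled - MEAN) / STD


def batches_per_epoch(num_samples, batch_size):
    """Batches per epoch when the last partial batch is kept (assumed default)."""
    return math.ceil(num_samples / batch_size)


print(round(normalize_pixel(0), 4))    # black background pixel
print(round(normalize_pixel(255), 4))  # fully white pixel
print(batches_per_epoch(60000, 64))    # training loader, assuming 60,000 images
print(batches_per_epoch(10000, 1000))  # validation loader, assuming 10,000 images
```

Under these assumptions the training loader yields 938 batches per epoch, which is the value len(train_loader) would report in the iteration log later on.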

device = "cpu"
if torch.cuda.is_available():
    device = "cuda"

Define the model :

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=-1)

model = Net()
model.to(device)  # Move model before creating optimizer
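
The constant 320 in nn.Linear(320, 50) and x.view(-1, 320) follows from the layer shapes. The sketch below traces the spatial size of a 28x28 MNIST image through the two convolution and pooling stages using the usual output-size formulas (no padding, stride 1 for the convolutions, pooling stride equal to the kernel size). The helper names conv_out and pool_out are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Output side length of max pooling whose stride equals its kernel size."""
    return (size - kernel) // kernel + 1


side = 28                  # MNIST images are 28x28
side = conv_out(side, 5)   # conv1, kernel_size=5: 28 -> 24
side = pool_out(side, 2)   # max_pool2d(..., 2):   24 -> 12
side = conv_out(side, 5)   # conv2, kernel_size=5: 12 -> 8
side = pool_out(side, 2)   # max_pool2d(..., 2):   8 -> 4

flat_features = 20 * side * side  # conv2 has 20 output channels
print(side, flat_features)  # 4 320
```

If the input size or a kernel size changes, the same calculation gives the new value to use in place of 320.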

Define the optimizer :

optimizer = SGD(model.parameters(), lr=lr, momentum=momentum)
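
To make the roles of lr and momentum concrete, here is a minimal scalar sketch of the momentum update rule as commonly described for SGD (velocity = momentum * velocity + gradient, then parameter -= lr * velocity), assuming no dampening, weight decay, or Nesterov correction. It is an illustration on a toy objective f(w) = w^2, not the optimizer's actual code, and sgd_momentum_step is a hypothetical name.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.5):
    """One scalar SGD-with-momentum update; velocity is None on the first step."""
    velocity = grad if velocity is None else momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity


w, v = 1.0, None
for _ in range(2):
    gradient = 2.0 * w  # derivative of f(w) = w**2
    w, v = sgd_momentum_step(w, gradient, v)

print(round(w, 4))  # 0.9504
```

The second step moves further than plain gradient descent would, because half of the previous step's velocity is carried over.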

Next, define the trainer and evaluator engines. This example uses the helper functions create_supervised_trainer() and create_supervised_evaluator() :

trainer = create_supervised_trainer(model, optimizer, F.nll_loss, device=device)

Note that create_supervised_evaluator, the helper function that creates the evaluator, accepts a metrics argument :

evaluator = create_supervised_evaluator(
    model, metrics={"accuracy": Accuracy(), "nll": Loss(F.nll_loss)}, device=device
)

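Two metrics are defined here : accuracy and the average loss (negative log-likelihood), both computed over the dataset the evaluator is run on. More information on metrics can be found in ignite.metrics.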
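
The idea behind these two metrics can be sketched without any library: both are accumulated sample by sample across batches and only divided by the total sample count at the end, so a smaller final batch does not distort the average. The class RunningMetrics below is a hypothetical stand-in written for illustration and is not how ignite.metrics is implemented; log_softmax mirrors the model's final activation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of scores."""
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(v - top) for v in logits))
    return [v - log_sum for v in logits]


class RunningMetrics:
    """Hypothetical accumulator for accuracy and mean negative log-likelihood."""

    def __init__(self):
        self.correct = 0
        self.loss_sum = 0.0
        self.count = 0

    def update(self, batch_log_probs, batch_targets):
        for log_probs, target in zip(batch_log_probs, batch_targets):
            predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
            self.correct += int(predicted == target)
            self.loss_sum += -log_probs[target]  # NLL of the true class
            self.count += 1

    def compute(self):
        return {"accuracy": self.correct / self.count, "nll": self.loss_sum / self.count}


tracker = RunningMetrics()
# Two batches of different sizes; the last sample is misclassified on purpose.
tracker.update([log_softmax([2.0, 0.0, 0.0]), log_softmax([0.0, 3.0, 0.0])], [0, 1])
tracker.update([log_softmax([0.0, 0.0, 1.0])], [0])
result = tracker.compute()
print(result)
```

Two of the three samples are classified correctly, so the accuracy is 2/3, and the reported loss is the mean over all three samples rather than the mean of the two batch averages.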

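The trainer and evaluator objects are instances of Engine, the main component of Ignite. Engine is an abstraction over the training/validation loop.
Note : in general, trainer and evaluator can also be defined by using the Engine class directly with custom training/validation step logic.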
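
To illustrate what "an abstraction over the loop" with a custom step function means, the following is a heavily simplified, hypothetical TinyEngine written in plain Python. It is not Ignite's implementation and omits almost everything a real engine does; it only shows the shape of the idea: a step function that receives (engine, batch), a loop over epochs and batches, and handlers fired at fixed points. The event names are plain strings chosen for this sketch.

```python
from collections import defaultdict


class TinyEngine:
    """Hypothetical miniature engine: runs a step function and fires handlers."""

    def __init__(self, process_function):
        self.process_function = process_function
        self.handlers = defaultdict(list)
        self.epoch = 0
        self.iteration = 0  # counts across epochs and is never reset
        self.output = None

    def on(self, event_name):
        """Decorator that attaches a handler to a named event."""
        def decorator(handler):
            self.handlers[event_name].append(handler)
            return handler
        return decorator

    def _fire(self, event_name):
        for handler in self.handlers[event_name]:
            handler(self)

    def run(self, data, max_epochs=1):
        for _ in range(max_epochs):
            self.epoch += 1
            for batch in data:
                self.iteration += 1
                self.output = self.process_function(self, batch)
                self._fire("iteration_completed")
            self._fire("epoch_completed")


def sum_step(engine, batch):
    """Custom step logic; a real trainer would run forward/backward here."""
    return sum(batch)


engine = TinyEngine(sum_step)
epoch_log = []


@engine.on("epoch_completed")
def record_epoch(eng):
    epoch_log.append((eng.epoch, eng.iteration, eng.output))


engine.run([[1, 2], [3, 4], [5]], max_epochs=2)
print(epoch_log)  # [(1, 3, 5), (2, 6, 5)]
```

In the real example, the step function created by create_supervised_trainer performs the forward pass, loss computation, backward pass, and optimizer step, and its return value is what the logging handler reads from engine.state.output.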

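Next, set up Visdom. A Visdom server must already be running (for example, started with python -m visdom.server) for the plots to appear :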

def create_plot_window(vis, xlabel, ylabel, title):
    return vis.line(X=np.array([1]), Y=np.array([np.nan]), opts=dict(xlabel=xlabel, ylabel=ylabel, title=title))

vis = visdom.Visdom()

# if not vis.check_connection():
#     raise RuntimeError("Visdom server not running. Please run python -m visdom.server")

train_loss_window = create_plot_window(vis, "#Iterations", "Loss", "Training Loss")
train_avg_loss_window = create_plot_window(vis, "#Epochs", "Loss", "Training Average Loss")  # plotted once per epoch
train_avg_accuracy_window = create_plot_window(vis, "#Epochs", "Accuracy", "Training Average Accuracy")  # plotted once per epoch
val_avg_loss_window = create_plot_window(vis, "#Epochs", "Loss", "Validation Average Loss")
val_avg_accuracy_window = create_plot_window(vis, "#Epochs", "Accuracy", "Validation Average Accuracy")

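The most interesting part of the code snippet is adding the event handlers. Engine allows handlers to be attached to the various events that are triggered during a run. When an event is triggered, the attached handlers (functions) are executed. Here, for logging purposes, a function is added that runs at the end of every log_interval-th iteration :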

@trainer.on(Events.ITERATION_COMPLETED(every=log_interval))
def log_training_loss(engine):
    print(
        f"Epoch[{engine.state.epoch}] Iteration[{engine.state.iteration}/{len(train_loader)}] "
        f"Loss: {engine.state.output:.2f}"
    )
    vis.line(
        X=np.array([engine.state.iteration]),
        Y=np.array([engine.state.output]),
        update="append",
        win=train_loss_window,
    )
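
The sketch below works through which iterations the handler above would fire on, under two assumptions: that every=log_interval behaves like a modulo test on the iteration counter, and that the counter keeps increasing across epochs instead of restarting at 1. It also assumes 938 iterations per epoch (60,000 training images at batch size 64). The helpers fires and iteration_in_epoch are hypothetical names introduced for this illustration.

```python
def fires(iteration, every):
    """Assumed behaviour of an every=N filter: fire on multiples of N."""
    return iteration % every == 0


def iteration_in_epoch(global_iteration, per_epoch):
    """Convert a counter that runs across epochs into a 1-based position within the epoch."""
    return (global_iteration - 1) % per_epoch + 1


iterations_per_epoch = 938  # assumption: ceil(60000 / 64)
log_interval = 10

# Iterations on which the logging handler would run during the first two epochs.
logged = [i for i in range(1, 2 * iterations_per_epoch + 1) if fires(i, log_interval)]

print(len(logged))   # 187
print(logged[:3])    # [10, 20, 30]
print(iteration_in_epoch(940, iterations_per_epoch))  # 2
```

If the counter does run across epochs, the printed Iteration[...] value exceeds len(train_loader) from the second epoch onward; a conversion like iteration_in_epoch would give the position within the current epoch instead.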

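When an epoch ends, we want to compute the training and validation metrics. For that purpose, the evaluator defined earlier can be run on train_loader and val_loader. Two additional handlers are therefore attached to the trainer on the epoch-completed event :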

@trainer.on(Events.EPOCH_COMPLETED)
def log_training_results(engine):
    evaluator.run(train_loader)
    metrics = evaluator.state.metrics
    avg_accuracy = metrics["accuracy"]
    avg_nll = metrics["nll"]
    print(
        f"Training Results - Epoch: {engine.state.epoch} Avg accuracy: {avg_accuracy:.2f} Avg loss: {avg_nll:.2f}"
    )
    vis.line(
        X=np.array([engine.state.epoch]), Y=np.array([avg_accuracy]), win=train_avg_accuracy_window, update="append"
    )
    vis.line(X=np.array([engine.state.epoch]), Y=np.array([avg_nll]), win=train_avg_loss_window, update="append")

@trainer.on(Events.EPOCH_COMPLETED)
def log_validation_results(engine):
    evaluator.run(val_loader)
    metrics = evaluator.state.metrics
    avg_accuracy = metrics["accuracy"]
    avg_nll = metrics["nll"]
    print(
        f"Validation Results - Epoch: {engine.state.epoch} Avg accuracy: {avg_accuracy:.2f} Avg loss: {avg_nll:.2f}"
    )
    vis.line(
        X=np.array([engine.state.epoch]), Y=np.array([avg_accuracy]), win=val_avg_accuracy_window, update="append"
    )
    vis.line(X=np.array([engine.state.epoch]), Y=np.array([avg_nll]), win=val_avg_loss_window, update="append")

Finally, start the engine on the training dataset and run it for 10 epochs (the value of epochs set above) :

# kick everything off
trainer.run(train_loader, max_epochs=epochs)

That's all.
