Lightly 1.2 : Tutorials : 2. Training MoCo on CIFAR-10 (translation/commentary)

Translation : ClassCat Co., Ltd. Sales Information
Created : 08/19/2022 (v1.2.25)

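* This page is a translation of the following Lightly documentation, with supplementary explanations added where appropriate:

* The sample code has been tested, but it has been extended or modified where necessary.
* Feel free to link to this page; we would appreciate a note to sales-info@classcat.com if you do.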

 


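In this tutorial, we train a model based on the MoCo paper, Momentum Contrast for Unsupervised Visual Representation Learning.

When training self-supervised models with a contrastive loss, we usually run into one big problem. To get good results, a contrastive loss needs many negative samples, and therefore a large batch size. However, not everyone has access to a cluster full of GPUs or TPUs. Alternative approaches have been developed to solve this problem. Some of them use a memory bank that stores old negative samples, which can be queried to compensate for a smaller batch size. MoCo takes this approach one step further by adding a momentum encoder.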
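To make the role of the negatives concrete, here is a minimal pure-Python sketch of a contrastive (InfoNCE-style) loss for a single query, fed from a FIFO memory bank. It is an illustration under stated assumptions, not lightly's implementation: the names `info_nce`, `bank`, and `bank_size` are hypothetical, and the 2-D keys are toy values.

```python
import math
from collections import deque


def info_nce(query, positive, negatives, temperature=0.1):
    """Contrastive loss for a single query: -log softmax of the positive logit."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Logit 0 is the positive pair; the rest are negatives.
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp.
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
    return log_sum_exp - logits[0]


# FIFO memory bank: once full, appending a new key drops the oldest one.
bank_size = 4  # tiny stand-in for a real memory bank size
bank = deque(maxlen=bank_size)
for step in range(6):
    angle = 0.5 * step
    bank.append((math.cos(angle), math.sin(angle)))  # toy unit-length key

query = (1.0, 0.0)
positive = (1.0, 0.0)  # key computed from the second view of the same image
loss = info_nce(query, positive, list(bank))
print(len(bank), round(loss, 4))
```

Because the bank is a fixed-size queue, the number of negatives is decoupled from the batch size: each step only has to encode one batch, while the loss still sees every key currently stored in the bank.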

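We use the CIFAR-10 dataset for this tutorial.

In this tutorial you will learn:

  • How to use lightly to load a dataset and train a model.

  • How to create a MoCo model with a memory bank.

  • How to use the pre-trained model after self-supervised learning for a transfer learning task.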

 

Imports

Import the Python frameworks needed for this tutorial. Make sure lightly is installed.

pip install lightly

import torch
import torch.nn as nn
import torchvision
import pytorch_lightning as pl
import copy
import lightly

from lightly.models.modules.heads import MoCoProjectionHead
from lightly.models.utils import deactivate_requires_grad
from lightly.models.utils import update_momentum
from lightly.models.utils import batch_shuffle
from lightly.models.utils import batch_unshuffle

 

Configuration

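We set some configuration parameters for our experiment. Feel free to change them and analyze the effect.

The default configuration uses a batch size of 512, which requires about 6.4 GB of GPU memory. Training for 100 epochs should reach around 73% test set accuracy; training for 200 epochs increases accuracy to about 80%.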

num_workers = 8
batch_size = 512
memory_bank_size = 4096
seed = 1
max_epochs = 100

Replace the paths with the location of your CIFAR-10 dataset. We assume a train folder with one subfolder per class, each containing .png images.

You can download CIFAR-10 in this folder format from Kaggle.

# The dataset structure should be like this:
# cifar10/train/
#  L airplane/
#    L 10008_airplane.png
#    L ...
#  L automobile/
#  L bird/
#  L cat/
#  L deer/
#  L dog/
#  L frog/
#  L horse/
#  L ship/
#  L truck/
path_to_train = '/datasets/cifar10/train/'
path_to_test = '/datasets/cifar10/test/'
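Before starting a long training run, it can help to confirm that the folder really has the layout sketched above. The following helper is a hypothetical convenience, not part of lightly: `count_images_per_class` and `CIFAR10_CLASSES` are names introduced here, and the demo uses a temporary directory rather than a real dataset.

```python
import tempfile
from pathlib import Path

# Class folder names taken from the directory listing above.
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def count_images_per_class(root, classes=CIFAR10_CLASSES, suffix=".png"):
    """Count image files in each expected class subfolder (0 if it is missing)."""
    root = Path(root)
    counts = {}
    for name in classes:
        class_dir = root / name
        if class_dir.is_dir():
            counts[name] = sum(
                1 for p in class_dir.iterdir() if p.suffix.lower() == suffix
            )
        else:
            counts[name] = 0
    return counts


# Self-contained demo on a throwaway directory with a single image.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "airplane").mkdir()
    (Path(tmp) / "airplane" / "10008_airplane.png").touch()
    demo_counts = count_images_per_class(tmp)

print(demo_counts)
```

On a real dataset you would call `count_images_per_class(path_to_train)`; a class with a count of 0 usually points to a wrong path or a different folder naming.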

Let's set the seed to ensure the reproducibility of our experiments.

pl.seed_everything(seed)

Out :

1
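As a rough sketch of what seeding does, the hypothetical helper `seed_all` below seeds Python's `random` module and, if they happen to be installed, NumPy and PyTorch. It is only meant to illustrate the idea of restarting the random number generators from a fixed state; it is not the Lightning implementation.

```python
import random


def seed_all(seed):
    """Seed the common random number generators and return the seed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed
    return seed


seed_all(1)
first = [random.random() for _ in range(3)]
seed_all(1)
second = [random.random() for _ in range(3)]
print(first == second)
```

Re-seeding with the same value replays the same random sequence. Note that some GPU operations may still be non-deterministic, so identical seeds do not necessarily give bit-identical training runs.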

 

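Setting up data augmentations and loaders

We start with the data preprocessing pipeline. The augmentations from the MoCo paper can be implemented with the collate functions provided by lightly. For MoCo v2, we can use the same augmentations as SimCLR but override the input size and the blur. Images in the CIFAR-10 dataset are 32×32 pixels, so we use this resolution to train our model.

Note : We could use a higher input resolution to train our model. However, since the original resolution of CIFAR-10 images is low, increasing it is not worthwhile. A higher resolution leads to higher memory consumption, which would have to be offset by a smaller batch size.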

# MoCo v2 uses SimCLR augmentations, additionally, disable blur
collate_fn = lightly.data.SimCLRCollateFunction(
    input_size=32,
    gaussian_blur=0.,
)

We don't want any augmentation for the test dataset, so we create custom torchvision-based data transformations. We make sure the size is correct and normalize the data the same way as the training data.

# Augmentations typically used to train on cifar-10
train_classifier_transforms = torchvision.transforms.Compose([
    torchvision.transforms.RandomCrop(32, padding=4),
    torchvision.transforms.RandomHorizontalFlip(),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(
        mean=lightly.data.collate.imagenet_normalize['mean'],
        std=lightly.data.collate.imagenet_normalize['std'],
    )
])

# No additional augmentations for the test set
test_transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize((32, 32)),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(
        mean=lightly.data.collate.imagenet_normalize['mean'],
        std=lightly.data.collate.imagenet_normalize['std'],
    )
])

# We use the moco augmentations for training moco
dataset_train_moco = lightly.data.LightlyDataset(
    input_dir=path_to_train
)

# Since we also train a linear classifier on the pre-trained moco model, we
# use only light augmentations here (MoCo augmentations are very strong and
# usually reduce accuracy of models which are not used for contrastive learning.
# Our linear layer will be trained using cross entropy loss and labels provided
# by the dataset. Therefore we chose light augmentations.)
dataset_train_classifier = lightly.data.LightlyDataset(
    input_dir=path_to_train,
    transform=train_classifier_transforms
)

dataset_test = lightly.data.LightlyDataset(
    input_dir=path_to_test,
    transform=test_transforms
)
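The `Normalize` step in the transforms above subtracts a per-channel mean and divides by a per-channel standard deviation. The sketch below shows that arithmetic for a single RGB pixel. It assumes that `lightly.data.collate.imagenet_normalize` holds the commonly used ImageNet statistics; the constants and the function name `normalize_pixel` are written out here for illustration only.

```python
# Commonly used ImageNet channel statistics (assumed to match imagenet_normalize).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))


white = normalize_pixel((1.0, 1.0, 1.0))
average = normalize_pixel(IMAGENET_MEAN)
print(white, average)
```

Using the same statistics for the training and test transforms matters more than the exact values: the classifier only sees consistent inputs if both pipelines normalize identically.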

We create data loaders that load and preprocess the data in the background.

dataloader_train_moco = torch.utils.data.DataLoader(
    dataset_train_moco,
    batch_size=batch_size,
    shuffle=True,
    collate_fn=collate_fn,
    drop_last=True,
    num_workers=num_workers
)

dataloader_train_classifier = torch.utils.data.DataLoader(
    dataset_train_classifier,
    batch_size=batch_size,
    shuffle=True,
    drop_last=True,
    num_workers=num_workers
)

dataloader_test = torch.utils.data.DataLoader(
    dataset_test,
    batch_size=batch_size,
    shuffle=False,
    drop_last=False,
    num_workers=num_workers
)
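The `batch_size` and `drop_last` settings determine how many iterations one epoch takes. As a small worked example, assuming the standard CIFAR-10 split of 50,000 training and 10,000 test images, the hypothetical helper `steps_per_epoch` reproduces the 97 iterations per epoch that appear in the training log of this tutorial.

```python
def steps_per_epoch(num_samples, batch_size, drop_last):
    """Number of batches a DataLoader-style iterator yields per epoch."""
    full, remainder = divmod(num_samples, batch_size)
    if drop_last or remainder == 0:
        return full
    return full + 1  # one extra, smaller batch


train_steps = steps_per_epoch(50_000, 512, drop_last=True)
test_steps = steps_per_epoch(10_000, 512, drop_last=False)
dropped = 50_000 - train_steps * 512  # images skipped per epoch by drop_last

print(train_steps, test_steps, dropped)  # 97 20 336
```

With `drop_last=True` every training batch has exactly 512 images, at the cost of skipping 336 images per epoch; because the loader shuffles, a different subset is skipped each time.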

 

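Creating the MoCo Lightning module

Next we create the MoCo model. We use PyTorch Lightning to train it and follow the specification of the lightning module. In this example the number of features for the hidden dimension is set to 512. The momentum of the momentum encoder is set to 0.99 (the default is 0.999), since other reports show that this works better for CIFAR-10.

For the backbone we use lightly's variant of resnet-18. You can use a different model by following the "playground to use custom backbones".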
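The momentum value controls how slowly the momentum encoder follows the trained encoder. The sketch below illustrates the exponential moving average on plain Python lists; `ema_update` and `half_life` are hypothetical names, and the two-element lists are toy stand-ins for network weights.

```python
import math


def ema_update(online, target, m=0.99):
    """Return the momentum (EMA) update of `target` towards `online`."""
    return [m * t + (1.0 - m) * o for o, t in zip(online, target)]


def half_life(m):
    """Number of updates after which an old weight's contribution has halved."""
    return math.log(0.5) / math.log(m)


online_weights = [1.0, -2.0]   # toy stand-in for the query encoder
momentum_weights = [0.0, 0.0]  # toy stand-in for the momentum encoder
momentum_weights = ema_update(online_weights, momentum_weights, m=0.99)

print(momentum_weights)
print(round(half_life(0.99)), round(half_life(0.999)))  # 69 693
```

With a momentum of 0.99 the influence of old weights halves after roughly 69 updates, compared with roughly 693 updates for 0.999, so the momentum encoder tracks the trained encoder about ten times faster.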

class MocoModel(pl.LightningModule):
    def __init__(self):
        super().__init__()

        # create a ResNet backbone and remove the classification head
        resnet = lightly.models.ResNetGenerator('resnet-18', 1, num_splits=8)
        self.backbone = nn.Sequential(
            *list(resnet.children())[:-1],
            nn.AdaptiveAvgPool2d(1),
        )

        # create a moco model based on ResNet
        self.projection_head = MoCoProjectionHead(512, 512, 128)
        self.backbone_momentum = copy.deepcopy(self.backbone)
        self.projection_head_momentum = copy.deepcopy(self.projection_head)
        deactivate_requires_grad(self.backbone_momentum)
        deactivate_requires_grad(self.projection_head_momentum)

        # create our loss with the optional memory bank
        self.criterion = lightly.loss.NTXentLoss(
            temperature=0.1,
            memory_bank_size=memory_bank_size)

    def training_step(self, batch, batch_idx):
        (x_q, x_k), _, _ = batch

        # update momentum
        update_momentum(self.backbone, self.backbone_momentum, 0.99)
        update_momentum(
            self.projection_head, self.projection_head_momentum, 0.99
        )

        # get queries
        q = self.backbone(x_q).flatten(start_dim=1)
        q = self.projection_head(q)

        # get keys
        k, shuffle = batch_shuffle(x_k)
        k = self.backbone_momentum(k).flatten(start_dim=1)
        k = self.projection_head_momentum(k)
        k = batch_unshuffle(k, shuffle)

        loss = self.criterion(q, k)
        self.log("train_loss_ssl", loss)
        return loss

    def training_epoch_end(self, outputs):
        self.custom_histogram_weights()

    # We provide a helper method to log weights in tensorboard
    # which is useful for debugging.
    def custom_histogram_weights(self):
        for name, params in self.named_parameters():
            self.logger.experiment.add_histogram(
                name, params, self.current_epoch)

    def configure_optimizers(self):
        optim = torch.optim.SGD(
            self.parameters(),
            lr=6e-2,
            momentum=0.9,
            weight_decay=5e-4,
        )
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
            optim, max_epochs
        )
        return [optim], [scheduler]
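The cosine annealing schedule used in `configure_optimizers` can be written in closed form. The sketch below evaluates that formula for the settings above (base learning rate 6e-2, 100 epochs), assuming a minimum learning rate of 0 and one scheduler step per epoch; the function name `cosine_annealing_lr` is hypothetical.

```python
import math


def cosine_annealing_lr(epoch, base_lr, t_max, eta_min=0.0):
    """Closed-form cosine annealing: decays base_lr to eta_min over t_max epochs."""
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cosine)


schedule = [cosine_annealing_lr(e, base_lr=6e-2, t_max=100) for e in range(101)]
for epoch in (0, 25, 50, 75, 100):
    print(epoch, round(schedule[epoch], 5))
```

The learning rate starts at 0.06, reaches half of that at epoch 50, and approaches 0 at epoch 100. If you change `max_epochs`, the schedule stretches accordingly because it is passed as the period of the scheduler.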

 

Creating the classifier Lightning module

We create a linear classifier on top of the features extracted with MoCo and train it on the dataset.

class Classifier(pl.LightningModule):
    def __init__(self, backbone):
        super().__init__()
        # use the pretrained ResNet backbone
        self.backbone = backbone

        # freeze the backbone
        deactivate_requires_grad(backbone)

        # create a linear layer for our downstream classification model
        self.fc = nn.Linear(512, 10)

        self.criterion = nn.CrossEntropyLoss()

    def forward(self, x):
        y_hat = self.backbone(x).flatten(start_dim=1)
        y_hat = self.fc(y_hat)
        return y_hat

    def training_step(self, batch, batch_idx):
        x, y, _ = batch
        y_hat = self.forward(x)
        loss = self.criterion(y_hat, y)
        self.log("train_loss_fc", loss)
        return loss

    def training_epoch_end(self, outputs):
        self.custom_histogram_weights()

    # We provide a helper method to log weights in tensorboard
    # which is useful for debugging.
    def custom_histogram_weights(self):
        for name, params in self.named_parameters():
            self.logger.experiment.add_histogram(
                name, params, self.current_epoch
            )

    def validation_step(self, batch, batch_idx):
        x, y, _ = batch
        y_hat = self.forward(x)
        y_hat = torch.nn.functional.softmax(y_hat, dim=1)

        # calculate number of correct predictions
        _, predicted = torch.max(y_hat, 1)
        num = predicted.shape[0]
        correct = (predicted == y).float().sum()
        return num, correct

    def validation_epoch_end(self, outputs):
        # calculate and log top1 accuracy
        if outputs:
            total_num = 0
            total_correct = 0
            for num, correct in outputs:
                total_num += num
                total_correct += correct
            acc = total_correct / total_num
            self.log("val_acc", acc, on_epoch=True, prog_bar=True)

    def configure_optimizers(self):
        optim = torch.optim.SGD(self.fc.parameters(), lr=30.)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optim, max_epochs)
        return [optim], [scheduler]
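The validation logic above returns a `(num, correct)` pair per batch and combines them at the end of the epoch. The following pure-Python sketch mirrors that bookkeeping on toy logits; `count_correct` and `top1_accuracy` are hypothetical names used only for this illustration.

```python
def count_correct(logits, labels):
    """Return (batch size, number of correct top-1 predictions) for one batch."""
    predicted = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predicted, labels))
    return len(labels), correct


def top1_accuracy(batch_results):
    """Combine per-batch (num, correct) pairs into one accuracy value."""
    total_num = sum(num for num, _ in batch_results)
    total_correct = sum(correct for _, correct in batch_results)
    return total_correct / total_num if total_num else 0.0


batch_a = count_correct([[2.0, 1.0], [0.1, 0.9], [3.0, -1.0]], [0, 1, 1])
batch_b = count_correct([[0.2, 0.8], [1.5, 0.5]], [1, 0])
accuracy = top1_accuracy([batch_a, batch_b])
print(batch_a, batch_b, accuracy)  # (3, 2) (2, 2) 0.8
```

Summing counts rather than averaging per-batch accuracies keeps the result correct when the last batch is smaller than the others. Softmax does not change which logit is largest, so the sketch takes the argmax of the raw logits directly.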

 

Training the MoCo model

We instantiate the model and train it with the lightning trainer.

# use a GPU if available
gpus = 1 if torch.cuda.is_available() else 0

model = MocoModel()
trainer = pl.Trainer(max_epochs=max_epochs, gpus=gpus,
                     progress_bar_refresh_rate=100)
trainer.fit(
    model,
    dataloader_train_moco
)
/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:90: LightningDeprecationWarning: Setting `Trainer(progress_bar_refresh_rate=100)` is deprecated in v1.5 and will be removed in v1.7. Please pass `pytorch_lightning.callbacks.progress.TQDMProgressBar` with `refresh_rate` directly to the Trainer's `callbacks` argument instead. Or, to disable the progress bar pass `enable_progress_bar = False` to the Trainer.
  rank_zero_deprecation(

Training: 0it [00:00, ?it/s]
Epoch 0: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=7.13, v_num=2]
Epoch 1: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=7.36, v_num=2]
Epoch 2: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=7.19, v_num=2]
Epoch 3: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=6.83, v_num=2]
Epoch 4: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=6.64, v_num=2]
Epoch 5: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=6.48, v_num=2]
Epoch 6: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=6.41, v_num=2]
Epoch 7: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=6.26, v_num=2]
Epoch 8: 100%|##########| 97/97 [00:20<00:00,  4.67it/s, loss=6.14, v_num=2]
Epoch 9: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=6.01, v_num=2]
Epoch 10: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=5.85, v_num=2]
Epoch 11: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=5.67, v_num=2]
Epoch 12: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=5.54, v_num=2]
Epoch 13: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=5.43, v_num=2]
Epoch 14: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=5.32, v_num=2]
Epoch 15: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=5.25, v_num=2]
Epoch 16: 100%|##########| 97/97 [00:20<00:00,  4.67it/s, loss=5.16, v_num=2]
Epoch 17: 100%|##########| 97/97 [00:20<00:00,  4.67it/s, loss=5.11, v_num=2]
Epoch 18: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=5.03, v_num=2]
Epoch 19: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=5, v_num=2]
Epoch 20: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=4.88, v_num=2]
Epoch 21: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=4.85, v_num=2]
Epoch 22: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=4.74, v_num=2]
Epoch 23: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=4.66, v_num=2]
Epoch 24: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=4.6, v_num=2]
Epoch 25: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=4.51, v_num=2]
Epoch 26: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=4.44, v_num=2]
Epoch 27: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=4.41, v_num=2]
Epoch 28: 100%|##########| 97/97 [00:20<00:00,  4.67it/s, loss=4.38, v_num=2]
Epoch 29: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=4.31, v_num=2]
Epoch 30: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=4.19, v_num=2]
Epoch 31: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=4.21, v_num=2]
Epoch 32: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=4.15, v_num=2]
Epoch 33: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=4.12, v_num=2]
Epoch 34: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=4.06, v_num=2]
Epoch 35: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=4.04, v_num=2]
Epoch 36: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=4, v_num=2]
Epoch 37: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.99, v_num=2]
Epoch 38: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.94, v_num=2]
Epoch 39: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.9, v_num=2]
Epoch 40: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.95, v_num=2]
Epoch 41: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.89, v_num=2]
Epoch 42: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.86, v_num=2]
Epoch 43: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.82, v_num=2]
Epoch 44: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.8, v_num=2]
Epoch 45: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=3.78, v_num=2]
Epoch 46: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.73, v_num=2]
Epoch 47: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=3.73, v_num=2]
Epoch 48: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=3.71, v_num=2]
Epoch 49: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=3.65, v_num=2]
Epoch 50: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.65, v_num=2]
Epoch 51: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.61, v_num=2]
Epoch 52: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.61, v_num=2]
Epoch 53: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.58, v_num=2]
Epoch 54: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.56, v_num=2]
Epoch 55: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.5, v_num=2]
Epoch 56: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.5, v_num=2]
Epoch 57: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.48, v_num=2]
Epoch 58: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.48, v_num=2]
Epoch 59: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=3.44, v_num=2]
Epoch 60: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.4, v_num=2]
Epoch 61: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.43, v_num=2]
Epoch 62: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.38, v_num=2]
Epoch 63: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.38, v_num=2]
Epoch 64: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=3.37, v_num=2]
Epoch 65: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=3.37, v_num=2]
Epoch 66: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.33, v_num=2]
Epoch 67: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.3, v_num=2]
Epoch 68: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.3, v_num=2]
Epoch 69: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.25, v_num=2]
Epoch 70: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.25, v_num=2]
Epoch 71: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.24, v_num=2]
Epoch 72: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.21, v_num=2]
Epoch 73: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=3.18, v_num=2]
Epoch 74: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.16, v_num=2]
Epoch 75: 100%|##########| 97/97 [00:20<00:00,  4.68it/s, loss=3.19, v_num=2]
Epoch 76: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.15, v_num=2]
Epoch 77: 100%|##########| 97/97 [00:20<00:00,  4.67it/s, loss=3.16, v_num=2]
Epoch 78: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.16, v_num=2]
Epoch 78:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.16, v_num=2]
Epoch 79:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.16, v_num=2]
Epoch 79: 100%|##########| 97/97 [00:20 00:00,  4.70it/s, loss=3.16, v_num=2]
Epoch 79: 100%|##########| 97/97 [00:20 00:00,  4.70it/s, loss=3.12, v_num=2]
Epoch 79:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.12, v_num=2]
Epoch 80:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.12, v_num=2]
Epoch 80: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.12, v_num=2]
Epoch 80: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.16, v_num=2]
Epoch 80:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.16, v_num=2]
Epoch 81:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.16, v_num=2]
Epoch 81: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.16, v_num=2]
Epoch 81: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.09, v_num=2]
Epoch 81:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.09, v_num=2]
Epoch 82:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.09, v_num=2]
Epoch 82: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.09, v_num=2]
Epoch 82: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.14, v_num=2]
Epoch 82:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.14, v_num=2]
Epoch 83:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.14, v_num=2]
Epoch 83: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.14, v_num=2]
Epoch 83: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.07, v_num=2]
Epoch 83:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.07, v_num=2]
Epoch 84:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.07, v_num=2]
Epoch 84: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.07, v_num=2]
Epoch 84: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.08, v_num=2]
Epoch 84:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.08, v_num=2]
Epoch 85:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.08, v_num=2]
Epoch 85: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.08, v_num=2]
Epoch 85: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.09, v_num=2]
Epoch 85:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.09, v_num=2]
Epoch 86:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.09, v_num=2]
Epoch 86: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.09, v_num=2]
Epoch 86: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.07, v_num=2]
Epoch 86:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.07, v_num=2]
Epoch 87:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.07, v_num=2]
Epoch 87: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.07, v_num=2]
Epoch 87: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.06, v_num=2]
Epoch 87:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.06, v_num=2]
Epoch 88:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.06, v_num=2]
Epoch 88: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.06, v_num=2]
Epoch 88: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.06, v_num=2]
Epoch 88:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.06, v_num=2]
Epoch 89:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.06, v_num=2]
Epoch 89: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.06, v_num=2]
Epoch 89: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.04, v_num=2]
Epoch 89:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.04, v_num=2]
Epoch 90:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.04, v_num=2]
Epoch 90: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.04, v_num=2]
Epoch 90: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.05, v_num=2]
Epoch 90:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.05, v_num=2]
Epoch 91:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.05, v_num=2]
Epoch 91: 100%|##########| 97/97 [00:20 00:00,  4.70it/s, loss=3.05, v_num=2]
Epoch 91: 100%|##########| 97/97 [00:20 00:00,  4.70it/s, loss=3.06, v_num=2]
Epoch 91:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.06, v_num=2]
Epoch 92:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.06, v_num=2]
Epoch 92: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.06, v_num=2]
Epoch 92: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.04, v_num=2]
Epoch 92:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.04, v_num=2]
Epoch 93:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.04, v_num=2]
Epoch 93: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.04, v_num=2]
Epoch 93: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.04, v_num=2]
Epoch 93:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.04, v_num=2]
Epoch 94:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.04, v_num=2]
Epoch 94: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.04, v_num=2]
Epoch 94: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.02, v_num=2]
Epoch 94:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.02, v_num=2]
Epoch 95:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.02, v_num=2]
Epoch 95: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.02, v_num=2]
Epoch 95: 100%|##########| 97/97 [00:20 00:00,  4.68it/s, loss=3.06, v_num=2]
Epoch 95:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.06, v_num=2]
Epoch 96:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.06, v_num=2]
Epoch 96: 100%|##########| 97/97 [00:20 00:00,  4.70it/s, loss=3.06, v_num=2]
Epoch 96: 100%|##########| 97/97 [00:20 00:00,  4.70it/s, loss=3.04, v_num=2]
Epoch 96:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.04, v_num=2]
Epoch 97:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.04, v_num=2]
Epoch 97: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.04, v_num=2]
Epoch 97: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.03, v_num=2]
Epoch 97:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.03, v_num=2]
Epoch 98:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.03, v_num=2]
Epoch 98: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.03, v_num=2]
Epoch 98: 100%|##########| 97/97 [00:20 00:00,  4.69it/s, loss=3.02, v_num=2]
Epoch 98:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.02, v_num=2]
Epoch 99:   0%|          | 0/97 [00:00 ?, ?it/s, loss=3.02, v_num=2]
Epoch 99: 100%|##########| 97/97 [00:20 00:00,  4.70it/s, loss=3.02, v_num=2]
Epoch 99: 100%|##########| 97/97 [00:20 00:00,  4.70it/s, loss=3.04, v_num=2]
Epoch 99: 100%|##########| 97/97 [00:20 00:00,  4.64it/s, loss=3.04, v_num=2]
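
The loss reported for each epoch can also be extracted from progress-bar output like the above programmatically. The following is a minimal sketch, assuming the console output has been saved as text. The helper name last_metrics_per_epoch and the two regular expressions are illustrative only and are not part of lightly or PyTorch Lightning.

```python
import re

# Matches the epoch number at the start of a progress-bar line.
EPOCH_RE = re.compile(r"^Epoch (\d+):")
# Matches key=value metric pairs such as loss=3.04 or val_acc=0.744.
METRIC_RE = re.compile(r"(\w+)=([0-9]*\.?[0-9]+)")


def last_metrics_per_epoch(log_text):
    """Return {epoch: {metric: value}}, keeping the last line seen per epoch."""
    metrics = {}
    for line in log_text.splitlines():
        line = line.strip()
        epoch_match = EPOCH_RE.match(line)
        if epoch_match is None:
            continue  # skip warnings, blank lines, and validation bars
        found = {key: float(value) for key, value in METRIC_RE.findall(line)}
        if found:
            metrics[int(epoch_match.group(1))] = found
    return metrics


# Sample lines from the end of the run above; the timing part of each
# line is ignored by the parser.
sample_log = """\
Epoch 98: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.03, v_num=2]
Epoch 98: 100%|##########| 97/97 [00:20<00:00,  4.69it/s, loss=3.02, v_num=2]
Epoch 99: 100%|##########| 97/97 [00:20<00:00,  4.70it/s, loss=3.04, v_num=2]
"""

for epoch, values in sorted(last_metrics_per_epoch(sample_log).items()):
    print(epoch, values["loss"])
```

Because later lines overwrite earlier ones for the same epoch, the value kept is the most recently displayed loss for that epoch. The same helper also picks up val_acc when the line contains it.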

 
Train the classifier.

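# Switch the pretrained MoCo model to evaluation mode before reusing its backbone.
model.eval()
# Build the classifier on top of the pretrained backbone.
classifier = Classifier(model.backbone)
# progress_bar_refresh_rate=100 redraws the progress bar every 100 batches.
trainer = pl.Trainer(max_epochs=max_epochs, gpus=gpus,
                     progress_bar_refresh_rate=100)
# Train on the classifier training loader and validate on the test loader.
trainer.fit(
    classifier,
    dataloader_train_classifier,
    dataloader_test
)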
/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:90: LightningDeprecationWarning: Setting `Trainer(progress_bar_refresh_rate=100)` is deprecated in v1.5 and will be removed in v1.7. Please pass `pytorch_lightning.callbacks.progress.TQDMProgressBar` with `refresh_rate` directly to the Trainer's `callbacks` argument instead. Or, to disable the progress bar pass `enable_progress_bar = False` to the Trainer.
  rank_zero_deprecation(

Validation sanity check: 100%|##########| 2/2 [00:00<00:00,  2.54it/s]

Epoch 0: 100%|##########| 117/117 [00:06<00:00, 17.78it/s, loss=6.11, v_num=3, val_acc=0.561]
Epoch 1: 100%|##########| 117/117 [00:06<00:00, 18.39it/s, loss=7.1, v_num=3, val_acc=0.594]
Epoch 2: 100%|##########| 117/117 [00:06<00:00, 18.26it/s, loss=6.49, v_num=3, val_acc=0.567]

...

Epoch 97: 100%|##########| 117/117 [00:06<00:00, 18.34it/s, loss=0.75, v_num=3, val_acc=0.743]
Epoch 98: 100%|##########| 117/117 [00:06<00:00, 18.26it/s, loss=0.763, v_num=3, val_acc=0.744]
Epoch 99: 100%|##########| 117/117 [00:06<00:00, 18.32it/s, loss=0.763, v_num=3, val_acc=0.744]
Epoch 99: 100%|##########| 117/117 [00:06<00:00, 18.04it/s, loss=0.763, v_num=3, val_acc=0.744]

Check out the TensorBoard logs while the model is training.

Run tensorboard --logdir lightning_logs/ to start TensorBoard.

 

That's all.