Classifying tabular data that contains images and text with AutoGluon's MultiModalPredictor

Published: March 10, 2022
Last updated: September 10, 2022

Introduction

This post uses MultiModalPredictor, which ships with AutoGluon, to classify tabular data that contains images and text.

I did the same task in an earlier post, but without MultiModalPredictor.
touch-sp.hatenablog.com

Environment

Ubuntu 20.04 on WSL2 (Windows 11)
CUDA Toolkit 11.3.1
Python 3.8.10

Installation

Installing PyTorch and AutoGluon with pip is all that is needed.
MXNet does not need to be installed.

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchtext --extra-index-url https://download.pytorch.org/whl/cu113
pip install autogluon
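
After installation it can be useful to confirm which versions pip actually resolved. The following is a minimal sketch using only the standard library; the helper name installed_versions is hypothetical and is not part of AutoGluon or PyTorch.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the pip commands above.
    wanted = ["torch", "torchvision", "torchtext", "autogluon"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

To also confirm that the GPU is visible from PyTorch, torch.cuda.is_available() can be called in the same environment.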

Running

Downloading the data

from autogluon.core.utils.loaders import load_zip
download_dir = './ag_petfinder_tutorial'
zip_file = 'https://automl-mm-bench.s3.amazonaws.com/petfinder_for_tutorial.zip'
load_zip.unzip(zip_file, unzip_dir=download_dir)
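
Before training, it may be worth checking that the archive was extracted where the later scripts expect it. This is a small sketch under the assumption that train.csv and test.csv sit directly under the petfinder_for_tutorial directory, as the scripts in this post read them from there; missing_files is a hypothetical helper name.

```python
from pathlib import Path


def missing_files(dataset_dir, expected=("train.csv", "test.csv")):
    """Return the expected file names that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Path used by the training and evaluation scripts in this post.
    dataset_path = "./ag_petfinder_tutorial/petfinder_for_tutorial"
    missing = missing_files(dataset_path)
    print("missing:", missing if missing else "nothing")
```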

Training script

import warnings
warnings.filterwarnings('ignore')

import os
import pandas as pd
from autogluon.multimodal import MultiModalPredictor

download_dir = './ag_petfinder_tutorial'
dataset_path = download_dir + '/petfinder_for_tutorial'

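# Load the training split; the first CSV column is used as the index.
train_data = pd.read_csv(f'{dataset_path}/train.csv', index_col=0)

label_col = 'AdoptionSpeed'
image_col = 'Images'

# Keep only the first image listed in each row and convert it to an absolute path.
train_data[image_col] = train_data[image_col].apply(lambda x: os.path.abspath(os.path.join(dataset_path, x.split(';')[0])))

# Only the label column is specified; the other column types are inferred.
predictor = MultiModalPredictor(label=label_col)
predictor.fit(train_data=train_data)

# Save the trained predictor so the evaluation script can load it.
predictor.save('my_saved_dir')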
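
The lambda applied to the Images column does two things: it keeps only the first of the ';'-separated image paths, then resolves that path against the dataset directory. The sketch below isolates that step with the standard library only; first_image_path is a hypothetical helper name, and the example cell value is made up for illustration.

```python
import os


def first_image_path(cell, base_dir):
    """Take the first entry of a ';'-separated path list and make it absolute."""
    first = cell.split(";")[0]
    return os.path.abspath(os.path.join(base_dir, first))


# Made-up cell value, for illustration only.
example_cell = "images/a-1.jpg;images/a-2.jpg"
resolved = first_image_path(example_cell, "./ag_petfinder_tutorial/petfinder_for_tutorial")
print(resolved)
```

Writing it as a named function makes the same transformation reusable for both the training and the test DataFrame, for example with train_data[image_col].apply(lambda x: first_image_path(x, dataset_path)).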

Training output

Global seed set to 123
Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                | Params
----------------------------------------------------------
0 | model             | MultimodalFusionMLP | 349 M
1 | validation_metric | AUROC               | 0
2 | loss_func         | CrossEntropyLoss    | 0
----------------------------------------------------------
349 M     Trainable params
0         Non-trainable params
349 M     Total params
699.890   Total estimated model params size (MB)
Epoch 0:  50%|████████████████████████████████████████████████████████████▌                                                            | 45/90 [00:09<00:09,  4.97it/s, loss=1.25, v_num=Epoch 0, global step 1: 'val_roc_auc' reached 0.58528 (best 0.58528), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/multimodal/AutogluonModels/ag-20220910_052227/epoch=0-step=1.ckpt' as top 3
Epoch 0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90/90 [00:23<00:00,  3.82it/s, loss=1.74, v_num=Epoch 0, global step 4: 'val_roc_auc' reached 0.70139 (best 0.70139), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/multimodal/AutogluonModels/ag-20220910_052227/epoch=0-step=4.ckpt' as top 3
Epoch 1:  50%|████████████████████████████████████████████████████████████▌                                                            | 45/90 [00:08<00:08,  5.08it/s, loss=1.02, v_num=Epoch 1, global step 5: 'val_roc_auc' reached 0.73111 (best 0.73111), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/multimodal/AutogluonModels/ag-20220910_052227/epoch=1-step=5.ckpt' as top 3
Epoch 1: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90/90 [00:48<00:00,  1.85it/s, loss=0.746, v_num=Epoch 1, global step 8: 'val_roc_auc' reached 0.76333 (best 0.76333), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/multimodal/AutogluonModels/ag-20220910_052227/epoch=1-step=8.ckpt' as top 3
Epoch 2:  50%|████████████████████████████████████████████████████████████▌                                                            | 45/90 [00:13<00:13,  3.22it/s, loss=0.97, v_num=Epoch 2, global step 9: 'val_roc_auc' reached 0.76944 (best 0.76944), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/multimodal/AutogluonModels/ag-20220910_052227/epoch=2-step=9.ckpt' as top 3
Epoch 2: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90/90 [01:04<00:00,  1.40it/s, loss=0.657, v_num=Epoch 2, global step 12: 'val_roc_auc' reached 0.77917 (best 0.77917), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/multimodal/AutogluonModels/ag-20220910_052227/epoch=2-step=12.ckpt' as top 3
(output for epochs 3 to 7 omitted)
Epoch 8:  50%|████████████████████████████████████████████████████████████                                                            | 45/90 [00:08<00:08,  5.21it/s, loss=0.929, v_num=Epoch 8, global step 33: 'val_roc_auc' reached 0.80639 (best 0.80639), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/multimodal/AutogluonModels/ag-20220910_052227/epoch=8-step=33.ckpt' as top 3
Epoch 8: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90/90 [00:57<00:00,  1.55it/s, loss=0.266, v_num=Epoch 8, global step 36: 'val_roc_auc' reached 0.80694 (best 0.80694), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/multimodal/AutogluonModels/ag-20220910_052227/epoch=8-step=36.ckpt' as top 3
Epoch 9:  50%|████████████████████████████████████████████████████████████                                                            | 45/90 [00:11<00:11,  4.06it/s, loss=0.904, v_num=Epoch 9, global step 37: 'val_roc_auc' reached 0.80667 (best 0.80694), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/multimodal/AutogluonModels/ag-20220910_052227/epoch=9-step=37.ckpt' as top 3
Epoch 9: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90/90 [00:51<00:00,  1.76it/s, loss=0.241, v_num=Epoch 9, global step 40: 'val_roc_auc' was not in top 3
Epoch 9: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90/90 [01:02<00:00,  1.44it/s, loss=0.241, v_num=]
Start to fuse 3 checkpoints via the greedy soup algorithm.
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00,  4.04it/s]
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00,  4.26it/s]
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00,  4.44it/s]
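
The log ends by fusing the three retained checkpoints "via the greedy soup algorithm". As a rough illustration of that idea, and not AutoGluon's actual implementation, the sketch below ranks checkpoints by validation score and adds each one to a running weight average only if the averaged model scores at least as well. The toy weight vectors and the evaluate function are assumptions made up for this example.

```python
def average(weight_sets):
    """Element-wise mean of several equally long weight vectors."""
    n = len(weight_sets)
    return [sum(values) / n for values in zip(*weight_sets)]


def greedy_soup(checkpoints, evaluate):
    """Average a greedily chosen subset of checkpoints.

    checkpoints: list of weight vectors (lists of floats).
    evaluate: function mapping weights to a validation score (higher is better).
    """
    # Rank checkpoints from best to worst by their individual validation score.
    ranked = sorted(checkpoints, key=evaluate, reverse=True)
    soup = [ranked[0]]
    best_score = evaluate(ranked[0])
    for candidate in ranked[1:]:
        trial_score = evaluate(average(soup + [candidate]))
        # Keep the candidate only if the averaged model is at least as good.
        if trial_score >= best_score:
            soup.append(candidate)
            best_score = trial_score
    return average(soup)


# Toy setup: the score is the negative squared distance to a made-up optimum.
target = [1.0, 1.0]


def toy_score(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


toy_checkpoints = [[0.5, 1.0], [1.5, 1.0], [3.0, 3.0]]
print(greedy_soup(toy_checkpoints, toy_score))  # → [1.0, 1.0]
```

In the toy run the first two checkpoints average to the optimum and are kept, while the third would pull the average away and is rejected.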

Evaluation script

import warnings
warnings.filterwarnings('ignore')

import os
import pandas as pd
from autogluon.multimodal import MultiModalPredictor

download_dir = './ag_petfinder_tutorial'
dataset_path = download_dir + '/petfinder_for_tutorial'

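# Load the test split; the first CSV column is used as the index.
test_data = pd.read_csv(f'{dataset_path}/test.csv', index_col=0)

image_col = 'Images'

# Apply the same image path preprocessing as in the training script.
test_data[image_col] = test_data[image_col].apply(lambda x: os.path.abspath(os.path.join(dataset_path, x.split(';')[0])))

# Load the predictor saved by the training script and score it on the test data.
predictor = MultiModalPredictor.load('my_saved_dir')
scores = predictor.evaluate(test_data, metrics=["roc_auc"])

print(scores)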

Evaluation results

Load pretrained checkpoint: my_saved_dir/model.ckpt
Predicting DataLoader 0: 100%|████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:05<00:00,  1.41s/it]
{'roc_auc': 0.9084}
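
For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a plain-Python sketch of that pairwise definition, written for illustration only; it is not how AutoGluon computes the value above, and the labels and scores are made up.

```python
def roc_auc(labels, scores):
    """Pairwise ROC AUC for binary labels (1 = positive, 0 = negative).

    Counts the fraction of (positive, negative) pairs in which the positive
    example has the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels and predicted scores.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so the 0.9084 reported above means positives were ranked above negatives in roughly 91% of such pairs.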