Published: January 26, 2021
Last updated: September 13, 2022
AutoGluon recommends MultiModalPredictor for image classification.
I have rewritten this article to use MultiModalPredictor instead of ImagePredictor.
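Introduction
I tested whether deep learning can reach 100% accuracy on an image classification task so simple that a human would never get it wrong.
Data used
I downloaded "Classification > Directions01 (128×128, RGB Color: 24bit)" from miniJSRT_database (日本放射線技術学会 画像部会, the imaging section of the Japanese Society of Radiological Technology). The task is to determine the orientation of a chest X-ray image. It can be solved as a 4-class classification problem: "up", "down", "right", and "left".
Unzipping the downloaded ZIP file gives two folders, train and test, each containing four folders: up, down, right, and left.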
DIRECTIONS01_RGB
├─test
│  ├─down
│  ├─left
│  ├─right
│  └─up
└─train
   ├─down
   ├─left
   ├─right
   └─up
The following two lines create two pandas DataFrames.
from autogluon.vision import ImagePredictor
train_dataset, _, test_dataset = ImagePredictor.Dataset.from_folders('DIRECTIONS01_RGB')
These pandas DataFrames are used for training and testing.
>>> train_dataset
                                      image  label
0      D:\DIRECTIONS01_RGB\train\down\0.png      0
1      D:\DIRECTIONS01_RGB\train\down\1.png      0
2     D:\DIRECTIONS01_RGB\train\down\10.png      0
3    D:\DIRECTIONS01_RGB\train\down\100.png      0
4    D:\DIRECTIONS01_RGB\train\down\101.png      0
..                                      ...    ...
943     D:\DIRECTIONS01_RGB\train\up\95.png      3
944     D:\DIRECTIONS01_RGB\train\up\96.png      3
945     D:\DIRECTIONS01_RGB\train\up\97.png      3
946     D:\DIRECTIONS01_RGB\train\up\98.png      3
947     D:\DIRECTIONS01_RGB\train\up\99.png      3
[948 rows x 2 columns]
>>> test_dataset
                                    image  label
0     D:\DIRECTIONS01_RGB\test\down\1.png      0
1    D:\DIRECTIONS01_RGB\test\down\10.png      0
2     D:\DIRECTIONS01_RGB\test\down\2.png      0
3     D:\DIRECTIONS01_RGB\test\down\3.png      0
4     D:\DIRECTIONS01_RGB\test\down\4.png      0
..                                    ...    ...
35      D:\DIRECTIONS01_RGB\test\up\5.png      3
36      D:\DIRECTIONS01_RGB\test\up\6.png      3
37      D:\DIRECTIONS01_RGB\test\up\7.png      3
38      D:\DIRECTIONS01_RGB\test\up\8.png      3
39      D:\DIRECTIONS01_RGB\test\up\9.png      3
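The mapping from folders to rows can also be written out by hand. The following is a minimal sketch using only the standard library, assuming that class folders sorted by name receive labels 0, 1, 2, ... and that file names are sorted as plain strings (both are consistent with the output above, where down is 0, up is 3, and 10.png comes before 2.png). The helper name `list_labeled_images` is hypothetical and is not part of AutoGluon.

```python
from pathlib import Path


def list_labeled_images(split_dir):
    """Return one {"image", "label"} row per PNG file under split_dir/<class>/."""
    split_dir = Path(split_dir)
    # Assumption: class folders sorted by name map to labels 0, 1, 2, ...
    class_dirs = sorted(
        (p for p in split_dir.iterdir() if p.is_dir()), key=lambda p: p.name
    )
    rows = []
    for label, class_dir in enumerate(class_dirs):
        # Plain string sort, so "10.png" comes before "2.png"
        for image_path in sorted(class_dir.glob("*.png"), key=lambda p: p.name):
            rows.append({"image": str(image_path), "label": label})
    return rows


if __name__ == "__main__":
    root = Path("DIRECTIONS01_RGB")
    if root.is_dir():
        train_rows = list_labeled_images(root / "train")
        test_rows = list_labeled_images(root / "test")
        print(len(train_rows), len(test_rows))
```

Passing the returned list to `pandas.DataFrame(rows)` would give a two-column frame with image and label columns like the ones shown above.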
Training and evaluation
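import warnings
warnings.filterwarnings('ignore')

from autogluon.vision import ImagePredictor
from autogluon.multimodal import MultiModalPredictor

# Build the DataFrames from the folder structure (the second return value is not used)
train_dataset, _, test_dataset = ImagePredictor.Dataset.from_folders('Directions01_RGB')

# Train on the training data; the "label" column holds the class
predictor = MultiModalPredictor(label="label")
predictor.fit(train_data=train_dataset)

# Evaluate accuracy on the test data
score = predictor.evaluate(test_dataset, metrics=["accuracy"])
print(score)

# Save the trained model so it can be loaded later
predictor.save('my_saved_dir')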
Running the code above produces the following output.
Global seed set to 123
Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params
----------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 86.7 M
1 | validation_metric | Accuracy                        | 0
2 | loss_func         | CrossEntropyLoss                | 0
----------------------------------------------------------------------
86.7 M    Trainable params
0         Non-trainable params
86.7 M    Total params
173.495   Total estimated model params size (MB)
Epoch 0: 50%|██████████████████████████▊ | 71/143 [00:04<00:04, 15.71it/s, loss=1.51, v_num=
Epoch 0, global step 2: 'val_accuracy' reached 0.27895 (best 0.27895), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/AutogluonModels/ag-20220913_141522/epoch=0-step=2.ckpt' as top 3
Epoch 0: 99%|███████████████████████████████████████████████████▋| 142/143 [00:10<00:00, 13.93it/s, loss=0.707, v_num=
Epoch 0, global step 5: 'val_accuracy' reached 0.76316 (best 0.76316), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/AutogluonModels/ag-20220913_141522/epoch=0-step=5.ckpt' as top 3
Epoch 1: 50%|██████████████████████████▎ | 71/143 [00:04<00:04, 15.51it/s, loss=0.332, v_num=
Epoch 1, global step 8: 'val_accuracy' reached 0.89474 (best 0.89474), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/AutogluonModels/ag-20220913_141522/epoch=1-step=8.ckpt' as top 3
Epoch 1: 99%|███████████████████████████████████████████████████▋| 142/143 [00:13<00:00, 10.17it/s, loss=0.144, v_num=
Epoch 1, global step 11: 'val_accuracy' reached 1.00000 (best 1.00000), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/AutogluonModels/ag-20220913_141522/epoch=1-step=11.ckpt' as top 3
Epoch 2: 50%|██████████████████████████▎ | 71/143 [00:04<00:04, 15.78it/s, loss=0.125, v_num=
Epoch 2, global step 14: 'val_accuracy' reached 1.00000 (best 1.00000), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/AutogluonModels/ag-20220913_141522/epoch=2-step=14.ckpt' as top 3
Epoch 2: 99%|██████████████████████████████████████████████████▋| 142/143 [00:11<00:00, 12.90it/s, loss=0.0188, v_num=
Epoch 2, global step 17: 'val_accuracy' reached 1.00000 (best 1.00000), saving model to '/mnt/wsl/PHYSICALDRIVE0p1/autogluon_works/AutogluonModels/ag-20220913_141522/epoch=2-step=17.ckpt' as top 3
Epoch 3: 50%|█████████████████████████▊ | 71/143 [00:06<00:06, 11.01it/s, loss=0.0958, v_num=
Epoch 3, global step 20: 'val_accuracy' was not in top 3
Epoch 3: 99%|█████████████████████████████████████████████████▋| 142/143 [00:11<00:00, 11.96it/s, loss=0.00292, v_num=
Epoch 3, global step 23: 'val_accuracy' was not in top 3
Epoch 4: 50%|█████████████████████████▊ | 71/143 [00:04<00:04, 17.23it/s, loss=0.0086, v_num=
Epoch 4, global step 26: 'val_accuracy' was not in top 3
Epoch 4: 99%|██████████████████████████████████████████████████▋| 142/143 [00:09<00:00, 15.00it/s, loss=0.0241, v_num=
Epoch 4, global step 29: 'val_accuracy' was not in top 3
Epoch 5: 50%|██████████████████████████▊ | 71/143 [00:04<00:04, 17.12it/s, loss=0.02, v_num=
Epoch 5, global step 32: 'val_accuracy' was not in top 3
Epoch 5: 99%|█████████████████████████████████████████████████▋| 142/143 [00:09<00:00, 14.77it/s, loss=0.00628, v_num=
Epoch 5, global step 35: 'val_accuracy' was not in top 3
Epoch 6: 50%|█████████████████████████▎ | 71/143 [00:04<00:04, 17.20it/s, loss=0.00142, v_num=
Epoch 6, global step 38: 'val_accuracy' was not in top 3
Epoch 6: 99%|████████████████████████████████████████████████▋| 142/143 [00:09<00:00, 14.75it/s, loss=0.000366, v_num=
Epoch 6, global step 41: 'val_accuracy' was not in top 3
Epoch 6: 99%|████████████████████████████████████████████████▋| 142/143 [00:10<00:00, 13.22it/s, loss=0.000366, v_num=]
Start to fuse 3 checkpoints via the greedy soup algorithm.
Predicting DataLoader 0: 100%|████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 12.04it/s]
Predicting DataLoader 0: 100%|████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 15.46it/s]
Predicting DataLoader 0: 100%|████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 12.22it/s]
Predicting DataLoader 0: 100%|████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 16.62it/s]
{'accuracy': 1.0}
Training finished in a few minutes, and the accuracy on the test data is 100%.
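Accuracy here is the fraction of test images whose predicted label equals the true label, so with the 40 test images listed earlier a score of 1.0 means all 40 were classified correctly. The following is a minimal sketch of that calculation in plain Python. The function and the label lists are hypothetical and for illustration only; they are not the actual predictions and not how AutoGluon computes the metric internally.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels for illustration only (not the actual predictions)
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_all_correct = list(y_true)
y_one_wrong = [0, 0, 1, 2, 2, 2, 3, 3]

print(accuracy(y_true, y_all_correct))  # 1.0
print(accuracy(y_true, y_one_wrong))    # 0.875
```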
A closer look
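For every test image, I checked the probability the model assigned to the correct class.
import warnings
warnings.filterwarnings('ignore')

from autogluon.vision import ImagePredictor
from autogluon.multimodal import MultiModalPredictor

_, _, test_dataset = ImagePredictor.Dataset.from_folders('Directions01_RGB')

# Load the model saved in the previous step
predictor = MultiModalPredictor.load('my_saved_dir')

# Predicted class probabilities; add the true label so rows can be filtered by class
proba = predictor.predict_proba(test_dataset)
proba['label'] = test_dataset['label']

# Labels 0 to 3 correspond to these folders
folder_names = ['down', 'left', 'right', 'up']
for i in range(4):
    print(folder_names[i], 'images:')
    # Probability assigned to the correct class for each image of class i
    print(proba[proba['label'] == i][i])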
Load pretrained checkpoint: my_saved_dir/model.ckpt
Predicting DataLoader 0: 100%|████████████████████████████████████████████████████████████| 2/2 [00:01<00:00, 1.06it/s]
down images:
0    0.999938
1    0.999947
2    0.999744
3    0.999950
4    0.999883
5    0.999847
6    0.999931
7    0.999984
8    0.999946
9    0.999917
Name: 0, dtype: float32
left images:
10    0.999397
11    0.999616
12    0.996307
13    0.984800
14    0.997568
15    0.996720
16    0.999718
17    0.999014
18    0.998500
19    0.999051
Name: 1, dtype: float32
right images:
20    0.999887
21    0.999670
22    0.999889
23    0.999903
24    0.999819
25    0.999073
26    0.999945
27    0.999656
28    0.999897
29    0.999908
Name: 2, dtype: float32
up images:
30    0.999922
31    0.999998
32    0.999994
33    0.999977
34    0.999996
35    0.999992
36    0.999995
37    0.999996
38    0.999994
39    0.999955
Name: 3, dtype: float32
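Across all four classes in the output above, the lowest probability for the correct class belongs to one of the left images. The following is a small sketch of how such a least-confident sample can be picked out, using the "left" values copied from the output above. The helper name `least_confident` is hypothetical.

```python
def least_confident(true_class_proba):
    """Return (row_index, probability) of the lowest true-class probability."""
    return min(true_class_proba.items(), key=lambda item: item[1])


# True-class probabilities for the "left" images, copied from the output above
left_proba = {
    10: 0.999397, 11: 0.999616, 12: 0.996307, 13: 0.984800, 14: 0.997568,
    15: 0.996720, 16: 0.999718, 17: 0.999014, 18: 0.998500, 19: 0.999051,
}

index, probability = least_confident(left_proba)
print(index, probability)  # 13 0.9848
```

Even this least-confident image is assigned to the correct class with a probability above 0.98.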