Depth Prediction with Monodepth2 (GluonCV)

Introduction

This time I only ran the tutorial.

Environment (no GPU)

Windows 10 Pro 64-bit
No GPU
Python 3.8.2

Installing mxnet and gluoncv

pip install mxnet
pip install gluoncv --pre

No other packages need to be installed separately.

Checking versions (pip freeze)

certifi==2020.6.20
chardet==3.0.4
cycler==0.10.0
gluoncv==0.8.0b20200730
graphviz==0.8.4
idna==2.6
kiwisolver==1.2.0
matplotlib==3.3.0
mxnet==1.6.0
numpy==1.16.6
Pillow==7.2.0
portalocker==1.7.1
pyparsing==3.0.0a2
python-dateutil==2.8.1
pywin32==228
requests==2.18.4
scipy==1.5.2
six==1.15.0
tqdm==4.48.0
urllib3==1.22
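
To check only the packages that matter here, rather than reading the whole pip freeze output, something like the following should work. It is a minimal sketch that assumes Python 3.8 or later (for importlib.metadata); installed_versions is a hypothetical helper name, not part of any library.

```python
from importlib import metadata  # in the standard library since Python 3.8


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    wanted = ["mxnet", "gluoncv", "numpy", "matplotlib"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

A missing package is reported as None instead of raising an error, so the check can run before everything is installed.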

Script

import numpy as np
import mxnet as mx
from mxnet.gluon.data.vision import transforms
import gluoncv

# Run on the CPU (this environment has no GPU)
ctx = mx.cpu(0)

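# Resize to the model's 640x192 input (Resize takes (width, height)),
# then convert HWC uint8 to CHW float32 scaled to [0, 1]
transform_fn = transforms.Compose([
    transforms.Resize((640, 192)),
    transforms.ToTensor(),
])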

# Download the sample image
url = 'https://raw.githubusercontent.com/KuangHaofei/GluonCV_Test/master/monodepthv2/tutorials/test_img.png'
filename = 'test_img.png'
gluoncv.utils.download(url, filename)

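# Load the image as an HWC NDArray
img = mx.image.imread(filename)

# Remember the original size so the disparity map can be resized back to it
original_width, original_height = img.shape[1], img.shape[0]

# Apply the transforms and add a batch dimension: (1, 3, 192, 640)
img = transform_fn(img)
img = img.expand_dims(0).as_in_context(ctx)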

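# Load the pretrained Monodepth2 model; the weights are stored under ./model
model = gluoncv.model_zoo.get_model('monodepth2_resnet18_kitti_stereo_640x192', root='./model',
                                    pretrained_base=False, ctx=ctx, pretrained=True)

# Predict disparity and resize the scale-0 output back to the original image size
outputs = model.predict(img)
disp = outputs[("disp", 0)]
disp_resized = mx.nd.contrib.BilinearResize2D(disp, height=original_height, width=original_width)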

import matplotlib as mpl
from matplotlib import cm
from matplotlib import pyplot as plt

# Convert to a 2-D numpy array and color-map it with 'magma',
# using the 95th percentile as the upper bound of the color range
disp_resized_np = disp_resized.squeeze().as_in_context(mx.cpu()).asnumpy()
vmax = np.percentile(disp_resized_np, 95)
normalizer = mpl.colors.Normalize(vmin=disp_resized_np.min(), vmax=vmax)
mapper = cm.ScalarMappable(norm=normalizer, cmap='magma')
colormapped_im = (mapper.to_rgba(disp_resized_np)[:, :, :3] * 255).astype(np.uint8)

# Show the color-mapped disparity without axes
plt.axis('off')
plt.imshow(colormapped_im)
plt.show()
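
Taking vmax from the 95th percentile rather than the maximum keeps a handful of very large disparity values (very close objects) from compressing the rest of the color range. The sketch below reproduces that idea in plain Python, without numpy or matplotlib. The helpers percentile and normalize are hypothetical, the sample values are made up, and the explicit clipping is an assumption standing in for the way the colormap paints out-of-range values with its end colors.

```python
def percentile(values, q):
    """q-th percentile, interpolating linearly between neighbouring ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def normalize(values, vmin, vmax):
    """Scale values so vmin -> 0 and vmax -> 1, clipping anything outside.

    Assumes vmax > vmin.
    """
    span = vmax - vmin
    return [min(max((v - vmin) / span, 0.0), 1.0) for v in values]


if __name__ == "__main__":
    # Made-up disparity values: mostly small, with one large outlier
    disparities = [0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13, 0.90]
    vmax = percentile(disparities, 95)
    print(normalize(disparities, min(disparities), vmax))
```

With the maximum as the upper bound, the single outlier would push every other value into the bottom tenth of the range; with the percentile bound they spread over a wider part of it.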

Result