Image Restoration with DiffBIR (Towards Blind Image Restoration with Generative Diffusion Prior)

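Introduction

DiffBIR is a model for image restoration.

I don't fully understand the difference between image restoration and super resolution.

Both are techniques for turning a low-resolution image into a cleaner one.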

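Results

I'll show the results first.

The image on the right was generated from the image on the left.
It is noticeably clearer.

The image on the left is a single frame from a video made with AnimateDiff.
That video was created in a separate post.

Applying DiffBIR to every frame improved the video quality dramatically.
The resulting video is posted on Google Blogger.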
support-touchsp.blogspot.com

Environment

The official repository uses "torch==1.13.1+cu116", but this post uses "torch==2.0.1".

Windows 11

Windows 11
CUDA 11.7
Python 3.10

I prepared a "requirements.txt", so the Python environment can be set up with a single command.

pip install -r https://raw.githubusercontent.com/dai-ichiro/myEnvironments/main/DiffBIR/requirements_cu117_win.txt

Ubuntu 22.04

Ubuntu 22.04 on WSL2
CUDA 11.8
Python 3.10

I prepared a "requirements.txt", so the Python environment can be set up with a single command.

pip install -r https://raw.githubusercontent.com/dai-ichiro/myEnvironments/main/DiffBIR/requirements_cu118.txt
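After installing, it can be worth confirming that the environment really contains "torch==2.0.1" rather than the officially pinned version. The following is a minimal sketch using only the standard library; the helper names installed_version and version_matches are my own, and the distribution name "pytorch-lightning" is an assumption about how the package is registered.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_matches(installed: Optional[str], expected: str) -> bool:
    """Compare versions while ignoring a local suffix such as "+cu117"."""
    return installed is not None and installed.split("+")[0] == expected


if __name__ == "__main__":
    torch_version = installed_version("torch")
    status = "OK" if version_matches(torch_version, "2.0.1") else "CHECK"
    print("torch:", torch_version, "(this post assumes 2.0.1)", status)
    # Assumed distribution name; prints None if it is registered differently.
    print("pytorch-lightning:", installed_version("pytorch-lightning"))
```

The local suffix is stripped because CUDA builds of torch usually report a version such as "2.0.1+cu117".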

Cloning the repository

git clone https://github.com/XPixelGroup/DiffBIR

Modifying the script

Running the code as-is produces the following error.

File "/home/hoge/documents/diffbir/DiffBIR/ldm/models/diffusion/ddpm.py", line 20, in <module>
from pytorch_lightning.utilities.distributed import rank_zero_only
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

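The pytorch_lightning version installed here no longer has the "pytorch_lightning.utilities.distributed" module, so line 20 of "DiffBIR/ldm/models/diffusion/ddpm.py" has to be changed.

Before

from pytorch_lightning.utilities.distributed import rank_zero_only

After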

from pytorch_lightning.utilities.rank_zero import rank_zero_only
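The same one-line change can also be applied with a small script instead of editing the file by hand. This is only a sketch: the function name patch_import is my own, and it assumes the script is run from the folder that contains the "DiffBIR" clone.

```python
from pathlib import Path

OLD_IMPORT = "from pytorch_lightning.utilities.distributed import rank_zero_only"
NEW_IMPORT = "from pytorch_lightning.utilities.rank_zero import rank_zero_only"


def patch_import(path: Path) -> bool:
    """Replace the old import line with the new one; return True if the file changed."""
    text = path.read_text(encoding="utf-8")
    if OLD_IMPORT not in text:
        # Already patched, or the file differs from what this post assumes.
        return False
    path.write_text(text.replace(OLD_IMPORT, NEW_IMPORT), encoding="utf-8")
    return True


if __name__ == "__main__":
    # Assumed location relative to the current folder.
    target = Path("DiffBIR/ldm/models/diffusion/ddpm.py")
    if target.exists():
        print("patched" if patch_import(target) else "nothing to change")
    else:
        print(f"{target} not found; run this next to the cloned repository")
```

Running it a second time is harmless, because the old import is no longer present and the file is left untouched.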

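Downloading the models

As described in the official instructions, download "general_swinir_v1.ckpt" and "general_full_v1.ckpt" and place them in the "weights" folder.

The "weights" folder does not exist after cloning, so it has to be created first.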
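A short check like the one below can create the folder and report which of the two checkpoints are still missing before inference is started. It is a sketch under the assumption that it is called with the "weights" folder inside the cloned repository; the function name missing_checkpoints is hypothetical, and the download itself is not automated here.

```python
from pathlib import Path

# The two checkpoint files named in this post.
REQUIRED = ("general_swinir_v1.ckpt", "general_full_v1.ckpt")


def missing_checkpoints(weights_dir: Path) -> list[str]:
    """Create the weights folder if needed and list the checkpoints not yet in it."""
    weights_dir.mkdir(parents=True, exist_ok=True)
    return [name for name in REQUIRED if not (weights_dir / name).is_file()]
```

For example, missing_checkpoints(Path("weights")) returns an empty list once both files are in place.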

Running

python inference.py \
--input inputs/demo/general \
--config configs/model/cldm.yaml \
--ckpt weights/general_full_v1.ckpt \
--reload_swinir \
--swinir_ckpt weights/general_swinir_v1.ckpt \
--steps 50 \
--sr_scale 2 \
--color_fix_type wavelet \
--output results \
--device cuda

"sr_scale" specifies the upscaling factor.

The command above restores the images in the "inputs/demo/general" folder and saves the results to the "results" folder.
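To run the same command on another folder, such as the extracted frames of a video, it may be handy to build the argument list in Python. The sketch below uses only the options shown in the command above; the function name build_command and the folder names "frames" and "restored" are hypothetical.

```python
import shlex


def build_command(input_dir: str, output_dir: str,
                  sr_scale: int = 2, steps: int = 50) -> list[str]:
    """Build the inference.py call from this post for a given input and output folder."""
    return [
        "python", "inference.py",
        "--input", input_dir,
        "--config", "configs/model/cldm.yaml",
        "--ckpt", "weights/general_full_v1.ckpt",
        "--reload_swinir",
        "--swinir_ckpt", "weights/general_swinir_v1.ckpt",
        "--steps", str(steps),
        "--sr_scale", str(sr_scale),
        "--color_fix_type", "wavelet",
        "--output", output_dir,
        "--device", "cuda",
    ]


if __name__ == "__main__":
    # Only prints the command; "frames" and "restored" are placeholder folder names.
    print(shlex.join(build_command("frames", "restored")))
```

The returned list can be passed to subprocess.run(cmd, cwd="DiffBIR", check=True), assuming the repository was cloned into a folder named "DiffBIR" under the current directory.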