[IP-Adapter] Can IP-Adapter combined with an inpainting model swap the face of a person in a photo for a face of your choice?

I tested whether the face of the woman in this photo could be changed to this face.

For the run, I prepared a mask image like this one.
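A mask of this kind can also be drawn in code. The following is a minimal sketch using Pillow, assuming the usual Diffusers inpainting convention that white pixels are repainted and black pixels are kept. The helper name make_face_mask and the bounding-box coordinates are hypothetical and would need to be adjusted to the actual position of the face in the photo.

```python
from PIL import Image, ImageDraw


def make_face_mask(size, face_box):
    """Return a grayscale mask: white (255) inside the ellipse bounded by
    face_box, black (0) everywhere else."""
    mask = Image.new("L", size, 0)
    ImageDraw.Draw(mask).ellipse(face_box, fill=255)
    return mask


# Hypothetical face bounding box (left, top, right, bottom) for a 768x768 photo.
mask = make_face_mask((768, 768), (284, 120, 484, 380))
mask.save("girl_mask.png")
```

An ellipse slightly larger than the face gives the model some room to blend the new face into the hair and neck.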

Results

I ran this with Diffusers.
FaceID cannot be used in Diffusers yet, so I tried the IP-Adapter models that predate it.

ip-adapter_sd15

ip-adapter-plus_sd15

ip-adapter-plus-face_sd15

ip-adapter-full-face_sd15

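Unfortunately, the last two (plus-face and full-face) did not work very well.
The first two produce clean images, but it is hard to say that the face came out as intended.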
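One setting that may contribute is strength=0.5 in the script below. As far as can be told from Diffusers' img2img-style pipelines, inpainting included, strength decides how far along the noise schedule the original image is pushed before denoising starts, so at 0.5 only the later half of the scheduled steps run and much of the original face structure can survive. The following is a minimal sketch of that arithmetic. effective_steps is a hypothetical helper written for illustration, not a Diffusers function, and it only mirrors the calculation as understood here.

```python
def effective_steps(num_inference_steps: int, strength: float) -> tuple[int, int]:
    """Return (skipped_steps, executed_steps) for an img2img-style run."""
    # Number of steps' worth of noise applied to the original image.
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    # Steps at the start of the schedule that are skipped entirely.
    t_start = max(num_inference_steps - init_timestep, 0)
    return t_start, num_inference_steps - t_start


print(effective_steps(50, 0.5))  # (25, 25)
print(effective_steps(50, 1.0))  # (0, 50)
```

Raising strength toward 1.0 would let the masked area be regenerated more freely, probably at the cost of consistency with the rest of the photo. Whether that yields the desired face was not tested here.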

Python script

I tried all four IP-Adapters in a single run.

import torch
from diffusers import DDIMScheduler, StableDiffusionInpaintPipeline
from diffusers.utils import load_image
from transformers import CLIPVisionModelWithProjection

adapter_list = [
    "ip-adapter_sd15",
    "ip-adapter-plus_sd15",
    "ip-adapter-plus-face_sd15",
    "ip-adapter-full-face_sd15",
]

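# CLIP image encoder shared by the four adapters.
# "IP-Adapter" appears to be a local directory holding the IP-Adapter weights.
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "IP-Adapter",
    subfolder="models/image_encoder",
    torch_dtype=torch.float16,
)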

noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
    steps_offset=1
)

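# Source photo, inpainting mask (white = area to repaint), and reference face.
image = load_image("girl.jpg")
mask = load_image("girl_mask.png")
ip_image = load_image("face.png")

# Resize the photo and the mask to the generation resolution.
image = image.resize((768, 768))
mask = mask.resize((768, 768))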

for adapter in adapter_list:
    # Reload the inpainting pipeline from the local checkpoint for each adapter.
    pipeline = StableDiffusionInpaintPipeline.from_pretrained(
        "model/yayoiMix_v25",
        image_encoder=image_encoder,
        scheduler=noise_scheduler,
        safety_checker=None,
        torch_dtype=torch.float16,
        variant="fp16",
    ).to("cuda")

    pipeline.load_ip_adapter(
        "IP-Adapter", subfolder="models", weight_name=f"{adapter}.safetensors"
    )

    # Same seed for every adapter so the results are comparable.
    generator = torch.manual_seed(33)
    images = pipeline(
        prompt="japanese woman, best quality, high quality",
        image=image,
        mask_image=mask,
        ip_adapter_image=ip_image,
        negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality",
        num_images_per_prompt=1,
        num_inference_steps=50,
        generator=generator,
        strength=0.5,
        width=768,
        height=768,
    ).images
    images[0].save(f"{adapter}_result.png")
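The script writes one file per adapter, so the four results are easier to judge when tiled into a single image. The following is a minimal sketch using only Pillow. make_grid is a hypothetical helper written for this post, and the demonstration uses small solid-color placeholder tiles so that it runs without the generated files.

```python
from PIL import Image


def make_grid(images, cols):
    """Paste equally sized images into a grid with `cols` columns, row by row."""
    w, h = images[0].size
    rows = (len(images) + cols - 1) // cols  # ceiling division
    grid = Image.new("RGB", (cols * w, rows * h))
    for i, img in enumerate(images):
        grid.paste(img, ((i % cols) * w, (i // cols) * h))
    return grid


# Placeholder tiles standing in for the four result images.
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]
tiles = [Image.new("RGB", (64, 64), c) for c in colors]
grid = make_grid(tiles, cols=2)
grid.save("comparison_grid.png")
```

For the real outputs, the tiles would instead be opened with Image.open(f"{adapter}_result.png") for each name in adapter_list, once the script above has finished.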