Introduction
For background on Perturbed-Attention Guidance (PAG), see this post: touch-sp.hatenablog.com
This time I combine PAG with "SDXL_Controlnet_Tile_Realistic", a ControlNet model that restores blurry photos.
Input photo
Results
Top left: no PAG
Top right: pag_applied_layers=["mid"]
Bottom left: pag_applied_layers=["down.block_2"]
Bottom right: pag_applied_layers=["down.block_2", "up.block_1.attentions_0"]
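A 2x2 comparison like the one captioned above could be assembled with a small helper along these lines. This is only a sketch, not necessarily how the original figure was made: `grid_offsets`, `build_grid`, the 512-pixel tile size, and the output name `comparison.png` are all hypothetical, and Pillow is assumed to be installed.

```python
from typing import List, Tuple


def grid_offsets(n_images: int, cols: int, tile_w: int, tile_h: int) -> List[Tuple[int, int]]:
    """Return the top-left (x, y) paste position for each image, row by row."""
    return [((i % cols) * tile_w, (i // cols) * tile_h) for i in range(n_images)]


def build_grid(paths: List[str], cols: int = 2, tile: int = 512):
    """Paste the images at `paths` into one canvas, left to right, top to bottom."""
    # Pillow is imported here so that grid_offsets stays usable without it.
    from PIL import Image

    rows = -(-len(paths) // cols)  # ceiling division
    canvas = Image.new("RGB", (cols * tile, rows * tile))
    offsets = grid_offsets(len(paths), cols, tile, tile)
    for path, offset in zip(paths, offsets):
        # Every panel is resized to a square tile (an assumption of this sketch).
        panel = Image.open(path).convert("RGB").resize((tile, tile))
        canvas.paste(panel, offset)
    return canvas
```

Calling `build_grid(["no_pag.png", "with_pag_0.jpg", "with_pag_1.jpg", "with_pag_2.jpg"]).save("comparison.png")` would place the four outputs in the same order as the captions. The same helper with two paths produces a single-row, side-by-side comparison.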
Here is the original image side by side with the image generated using pag_applied_layers=["mid"].
Python script
import torch
from diffusers import ControlNetModel, DPMSolverMultistepScheduler, AutoPipelineForText2Image
from diffusers.utils import load_image

# model was downloaded from https://huggingface.co/OzzyGT/SDXL_Controlnet_Tile_Realistic
controlnet = ControlNetModel.from_pretrained(
    "controlnet/SDXL_Controlnet_Tile_Realistic",
    torch_dtype=torch.float16,
    variant="fp16"
)

pipeline = AutoPipelineForText2Image.from_pretrained(
    "fudukiMix_v20",
    torch_dtype=torch.float16,
    variant="fp16",
    controlnet=controlnet,
).to("cuda")

pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
    pipeline.scheduler.config,
    use_karras_sigmas=True
)

control_image = load_image("face1024.jpg")

prompt = "high quality image of a woman"
negative_prompt = "blurry, low quality"

# no PAG
generator = torch.Generator(device="cpu").manual_seed(0)
image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=7.5,
    num_inference_steps=25,
    image=control_image,
    controlnet_conditioning_scale=1.0,
    generator=generator
).images[0]
image.save("no_pag.png")

# with PAG
pipeline = AutoPipelineForText2Image.from_pipe(pipeline, enable_pag=True)

for i, layer in enumerate([["mid"], ["down.block_2"], ["down.block_2", "up.block_1.attentions_0"]]):
    pipeline.set_pag_applied_layers(layer)
    generator = torch.Generator(device="cpu").manual_seed(0)
    image = pipeline(
        prompt=prompt,
        negative_prompt=negative_prompt,
        image=control_image,
        controlnet_conditioning_scale=1.0,
        num_inference_steps=25,
        num_images_per_prompt=1,
        guidance_scale=3.0,
        width=1024,
        height=1024,
        generator=generator,
        pag_scale=5.0
    ).images[0]
    image.save(f"with_pag_{i}.jpg")