[Diffusers] Trying PAG (Perturbed-Attention Guidance) with PixArtSigma

Python script

from diffusers import AutoPipelineForText2Image, PixArtSigmaPAGPipeline
import torch

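# The model path below is assumed to be a local directory; on the
# Hugging Face Hub the repo ID is "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS".
pipeline = AutoPipelineForText2Image.from_pretrained(
    "PixArt-Sigma-XL-2-1024-MS",
    torch_dtype=torch.float16
).to("cuda")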

prompt = "an insect robot preparing a delicious meal, anime style"

# Baseline without PAG (guidance_scale is left at the pipeline default)
generator = torch.Generator(device="cpu").manual_seed(0)
image = pipeline(
    prompt=prompt,
    generator=generator,
).images[0]
image.save("no_pag.jpg")

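# With PAG: reuse the already-loaded components in a PAG-enabled pipeline
pipeline = PixArtSigmaPAGPipeline.from_pipe(pipeline)

# Try several sets of transformer blocks to perturb, reseeding each run so
# that only the PAG layers differ between images.
# Depending on the diffusers version, set_pag_applied_layers may expect
# string names such as "blocks.14" instead of integer indices.
for i, layer in enumerate([[14], [13, 14], [14, 15], [13, 14, 15]]):
    pipeline.set_pag_applied_layers(layer)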
    generator = torch.Generator(device="cpu").manual_seed(0)
    image = pipeline(
        prompt=prompt,
        guidance_scale=3.0,
        generator=generator,
        pag_scale=3.0,
    ).images[0]
    image.save(f"with_pag_{i}.jpg")

Results

Without PAG


With PAG


From left to right:
pag layers: [14]
pag layers: [13, 14]
pag layers: [14, 15]
pag layers: [13, 14, 15]
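
To view the four PAG outputs in a single row as listed above, the saved files can be pasted side by side. The following is a minimal sketch using Pillow. The helper name concat_horizontally and the output filename with_pag_row.jpg are hypothetical and not part of the original script.

```python
from PIL import Image


def concat_horizontally(images):
    """Paste images side by side, left to right, on a white canvas."""
    if not images:
        raise ValueError("images must not be empty")
    width = sum(im.width for im in images)
    height = max(im.height for im in images)
    canvas = Image.new("RGB", (width, height), "white")
    x = 0
    for im in images:
        canvas.paste(im, (x, 0))  # top-aligned; shorter images leave white below
        x += im.width
    return canvas


# Example (assumes the with_pag_{i}.jpg files saved by the script exist):
# paths = [f"with_pag_{i}.jpg" for i in range(4)]
# concat_horizontally([Image.open(p) for p in paths]).save("with_pag_row.jpg")
```

Because every run uses the same seed, a single row like this makes it easier to see what changing the PAG layers alone does to the image.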


