Environment
Windows 11, CUDA 11.7, Python 3.10

    pip install torch==2.0.1+cu117 torchvision==0.15.2+cu117 --index-url https://download.pytorch.org/whl/cu117
    pip install diffusers[torch]
    pip install transformers omegaconf sentencepiece beautifulsoup4 ftfy
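Before running the scripts, it may help to confirm that the installed package versions match the ones listed above. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Return {package name: installed version string, or None if missing}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    packages = ["torch", "torchvision", "diffusers", "transformers"]
    for name, ver in installed_versions(packages).items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

A `None` entry simply means the package was not found, so the corresponding `pip install` line is worth re-running.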
Python script

    import torch
    from diffusers import PixArtAlphaPipeline

    # Load the checkpoint stored under model/PixArt-XL-2-1024-MS in half precision.
    pipe = PixArtAlphaPipeline.from_pretrained(
        "model/PixArt-XL-2-1024-MS",
        torch_dtype=torch.float16
    ).to("cuda")

    prompt = "A small cactus with a happy face in the Sahara desert"
    seed = 110000

    # Generate one image each for 20, 30, and 40 inference steps.
    for steps in range(20, 50, 10):
        # Re-seed on every iteration so that only the step count differs.
        generator = torch.manual_seed(seed)
        image = pipe(
            prompt,
            generator=generator,
            num_inference_steps=steps
        ).images[0]
        image.save(f"pixart_result_{steps}.png")
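As a quick check of what the loop in the script iterates over, the sketch below lists the step counts and output file names without loading the model. `range(20, 50, 10)` excludes its stop value, so 50 is never used.

```python
# Step counts visited by the loop; the stop value 50 is exclusive.
step_counts = list(range(20, 50, 10))

# File names written by image.save() in the script, one per step count.
file_names = [f"pixart_result_{steps}.png" for steps in step_counts]

print(step_counts)  # [20, 30, 40]
print(file_names)
```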
Results
From left to right: num_inference_steps = 20, 30, and 40.

Supplementary note 1
From diffusers==0.23.0 onward, "AutoPipelineForText2Image" can be used in place of "PixArtAlphaPipeline".

    import torch
    from diffusers import AutoPipelineForText2Image

    # AutoPipelineForText2Image picks the matching pipeline class for the checkpoint.
    pipe = AutoPipelineForText2Image.from_pretrained(
        "model/PixArt-XL-2-1024-MS",
        torch_dtype=torch.float16
    ).to("cuda")

    prompt = "A small cactus with a happy face in the Sahara desert"
    seed = 110000

    for steps in range(20, 50, 10):
        # Re-seed on every iteration so that only the step count differs.
        generator = torch.manual_seed(seed)
        image = pipe(
            prompt,
            generator=generator,
            num_inference_steps=steps
        ).images[0]
        image.save(f"pixart_result_{steps}.png")
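If a script has to run on machines with different diffusers versions, one option is to check the version before choosing which class to import. This is a minimal sketch: the 0.23.0 threshold is taken from the note above, and `parse_release` and `auto_pipeline_available` are hypothetical helper names, not diffusers functions.

```python
import re


def parse_release(version_string):
    """Extract the leading numeric release, e.g. '0.23.0.dev0' -> (0, 23, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    # A missing patch component ('0.23') is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def auto_pipeline_available(diffusers_version, minimum=(0, 23, 0)):
    """True if the version is at or above the assumed threshold.

    Pre-release suffixes such as '.dev0' are ignored in this simplified check.
    """
    return parse_release(diffusers_version) >= minimum


print(auto_pipeline_available("0.22.3"))  # False
print(auto_pipeline_available("0.23.0"))  # True
```

In a real script the argument would be `diffusers.__version__`, falling back to `PixArtAlphaPipeline` when the check returns False.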
Supplementary note 2
When generating images with an aspect ratio other than 1 (that is, non-square images), resolution binning (use_resolution_binning) is reportedly effective. The script below generates every prompt with and without it, using the same seed, and saves each pair side by side.

    import torch
    from diffusers import AutoPipelineForText2Image
    from diffusers.utils import make_image_grid

    pipe = AutoPipelineForText2Image.from_pretrained(
        "model/PixArt-XL-2-1024-MS",
        torch_dtype=torch.float16
    ).to("cuda")

    prompts = [
        "A small cactus with a happy face in the Sahara desert.",
        "Pirate ship trapped in a cosmic maelstrom nebula, rendered in cosmic beach whirlpool engine, volumetric lighting, spectacular, ambient lights, light pollution, cinematic atmosphere, art nouveau style, illustration art artwork by SenseiJaye, intricate detail.",
        "stars, water, brilliantly, gorgeous large scale scene, a little girl, in the style of dreamy realism, light gold and amber, blue and pink, brilliantly illuminated in the background."
    ]
    seed = 100000

    # The whole prompt list is passed as one batch, so no per-prompt loop is needed.
    generator = torch.manual_seed(seed)
    no_resolution_binning = pipe(
        prompts,
        height=1024,
        width=768,
        generator=generator,
        use_resolution_binning=False
    ).images

    # Re-seed so both runs start from the same random state.
    generator = torch.manual_seed(seed)
    resolution_binning = pipe(
        prompts,
        height=1024,
        width=768,
        generator=generator,
        use_resolution_binning=True
    ).images

    # Save each pair as one image: left = without binning, right = with binning.
    for i in range(len(prompts)):
        image = make_image_grid([no_resolution_binning[i], resolution_binning[i]], rows=1, cols=2)
        image.save(f"result{i}.png")
Left: without resolution_binning. Right: with resolution_binning.
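As far as I understand it, resolution binning maps the requested height and width to the nearest entry in a table of predefined resolutions, generates at that resolution, and then resizes the result back to the requested size. The sketch below illustrates only the lookup step. The bin table is a small made-up one for illustration, not the table diffusers actually uses, and `closest_bin` is a hypothetical name.

```python
# Hypothetical (height, width) bins, each close to 1024 x 1024 in pixel count.
# This is NOT the table shipped with diffusers; it only illustrates the lookup.
EXAMPLE_BINS = [
    (704, 1408),
    (832, 1216),
    (1024, 1024),
    (1216, 832),
    (1408, 704),
]


def closest_bin(height, width, bins=EXAMPLE_BINS):
    """Return the (height, width) bin whose aspect ratio is nearest to height / width."""
    target = height / width
    return min(bins, key=lambda hw: abs(hw[0] / hw[1] - target))


print(closest_bin(1024, 768))  # (1216, 832)
```

With this made-up table, the 1024 x 768 request from the script would be generated at 1216 x 832 and then scaled back to 1024 x 768. The real bins differ, but the idea of snapping to a nearby predefined aspect ratio is the same.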