Hugging Face and Stable Diffusion: Leading a New Era of Generative AI

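  • Hugging Face is a well-known artificial intelligence and machine learning platform that provides developers and researchers with open-source tools and models. Stable Diffusion is an image generation model developed by Stability AI that creates high-quality images from text descriptions. Combined with the Hugging Face platform, it has further advanced the adoption of generative AI.
  • Stable Diffusion is built on diffusion models: it generates an image by starting from noise and removing that noise step by step. It supports art creation and game design as well as advertising, education, scientific research, and other fields. Hugging Face provides easy-to-use APIs and interfaces, so developers can integrate Stable Diffusion models into their own applications without a deep technical background.
  • The sections below show how to use this model to create fun and interesting images.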
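To make the "removing noise step by step" idea more concrete, here is a toy sketch of the opposite direction: how a diffusion process gradually adds noise to clean data. It assumes a linear noise schedule of the kind used in the original DDPM formulation, and the four numbers standing in for pixel values are purely illustrative. The real Stable Diffusion model works on compressed latents and uses a trained network to run this process in reverse, so this is only an illustration of the principle, not the model's code.

```python
import math
import random


def linear_betas(num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> list[float]:
    """Linearly spaced noise variances, one per diffusion step (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas: list[float]) -> list[float]:
    """alpha_bar_t = product of (1 - beta_s) for s <= t: the share of signal left at step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0: list[float], alpha_bar: float, rng: random.Random) -> list[float]:
    """Sample x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * gaussian_noise."""
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


alpha_bars = cumulative_alphas(linear_betas(1000))
rng = random.Random(0)
pixels = [1.0, -1.0, 0.5, 0.0]  # illustrative stand-in for image values
for t in (0, 250, 999):
    noisy = add_noise(pixels, alpha_bars[t], rng)
    print(t, round(alpha_bars[t], 4), [round(v, 2) for v in noisy])
```

At early steps the values stay close to the originals; by the last step almost no signal remains. Image generation runs the other way, starting from pure noise and denoising for `num_inference_steps` steps.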

Environment setup

pip install diffusers transformers accelerate scipy safetensors
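After installing, a quick check like the following can confirm that the packages are importable before a large model download is attempted. This is a minimal sketch using only the standard library. The `missing_packages` helper is a hypothetical name introduced here, and `torch` is added to the list because the usage script imports it (it is normally pulled in as a dependency of the packages above, but that is worth verifying in your environment).

```python
import importlib.util

# Import names of the packages from the pip command, plus torch,
# which the usage script imports directly.
REQUIRED = ["torch", "diffusers", "transformers", "accelerate", "scipy", "safetensors"]


def missing_packages(names: list[str]) -> list[str]:
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```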

Usage

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model_id = "stabilityai/stable-diffusion-2-1"

# Load the pipeline in half precision, then swap the default scheduler
# for DPMSolverMultistepScheduler (DPM-Solver++)
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")  # requires a CUDA-capable GPU

# Generate one image from the text prompt and save it to disk
prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
image.save("astronaut_rides_horse.png")
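The script above assumes a CUDA GPU. On a machine without one, a fallback along the following lines may help. This is a sketch, not part of the original article: `choose_device_and_precision` and `load_pipeline` are hypothetical helper names, and the choice of full precision on CPU is an assumption based on half precision being poorly supported there. CPU generation is likely to be much slower.

```python
def choose_device_and_precision(cuda_available: bool) -> tuple[str, str]:
    """Half precision on a GPU, full precision on CPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def load_pipeline(model_id: str = "stabilityai/stable-diffusion-2-1"):
    """Load the pipeline on whichever device is available."""
    # Imported lazily so the helper above can be used without these packages.
    import torch
    from diffusers import StableDiffusionPipeline

    device, dtype_name = choose_device_and_precision(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    )
    return pipe.to(device)


# Usage (downloads the weights on the first call):
# pipe = load_pipeline()
# image = pipe("a photo of an astronaut riding a horse on mars").images[0]
# image.save("astronaut_rides_horse.png")
```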
  • Output: the script saves the generated image as astronaut_rides_horse.png
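  • Weight storage: after the script runs, the downloaded weights are kept in the cache folder, so later runs can reuse them
    • ls ~/.cache/huggingface/diffusers/
    • Depending on the installed diffusers version, the weights may instead be under ~/.cache/huggingface/hub/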
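The same check can be done from Python, which is handy when the cache location has been moved. The sketch below assumes the cache root is `$HF_HOME` when that environment variable is set and `~/.cache/huggingface` otherwise. It ignores other possible overrides, and the helper names are hypothetical. It looks in both `diffusers` (the folder listed above) and `hub`, since which one is used can depend on the installed library version.

```python
import os
from pathlib import Path


def hf_cache_home() -> Path:
    """Root of the Hugging Face cache: $HF_HOME if set, else ~/.cache/huggingface."""
    return Path(os.environ.get("HF_HOME", Path.home() / ".cache" / "huggingface"))


def list_cache(subdir: str) -> list[str]:
    """Names of the entries under <cache root>/<subdir>, or [] if it does not exist."""
    folder = hf_cache_home() / subdir
    return sorted(p.name for p in folder.iterdir()) if folder.is_dir() else []


for subdir in ("diffusers", "hub"):
    print(subdir, list_cache(subdir))
```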
  • Parameters
    • prompt (str or List[str], optional) — The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.
    • height (int, optional, defaults to self.unet.config.sample_size * self.vae_scale_factor) — The height in pixels of the generated image.
    • width (int, optional, defaults to self.unet.config.sample_size * self.vae_scale_factor) — The width in pixels of the generated image.
    • num_inference_steps (int, optional, defaults to 50) — The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
    • timesteps (List[int], optional) — Custom timesteps to use for the denoising process with schedulers which support a timesteps argument in their set_timesteps method. If not defined, the default behavior when num_inference_steps is passed will be used. Must be in descending order.
    • sigmas (List[float], optional) — Custom sigmas to use for the denoising process with schedulers which support a sigmas argument in their set_timesteps method. If not defined, the default behavior when num_inference_steps is passed will be used.
    • guidance_scale (float, optional, defaults to 7.5) — A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.
    • negative_prompt (str or List[str], optional) — The prompt or prompts to guide what to not include in image generation. If not defined, you need to pass negative_prompt_embeds instead. Ignored when not using guidance (guidance_scale < 1).
    • num_images_per_prompt (int, optional, defaults to 1) — The number of images to generate per prompt.
    • eta (float, optional, defaults to 0.0) — Corresponds to parameter eta (η) from the DDIM paper. Only applies to the DDIMScheduler, and is ignored in other schedulers.
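As an illustration of how these parameters fit together, the sketch below collects them into a dictionary that can be passed to the pipeline call. `build_call_kwargs` is a hypothetical helper, not part of diffusers, and the prompt and numeric values are only examples. The divisible-by-8 check mirrors the size constraint that Stable Diffusion pipelines place on `height` and `width`.

```python
from typing import Optional


def build_call_kwargs(
    prompt: str,
    negative_prompt: Optional[str] = None,
    height: Optional[int] = None,
    width: Optional[int] = None,
    num_inference_steps: int = 50,
    guidance_scale: float = 7.5,
    num_images_per_prompt: int = 1,
) -> dict:
    """Collect keyword arguments for a pipeline call, using the parameter names listed above."""
    for name, value in (("height", height), ("width", width)):
        # Stable Diffusion pipelines expect image sizes divisible by 8.
        if value is not None and value % 8 != 0:
            raise ValueError(f"{name} must be divisible by 8, got {value}")
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")

    kwargs = {
        "prompt": prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "num_images_per_prompt": num_images_per_prompt,
    }
    # Leave out unset options so the pipeline falls back to its own defaults.
    for name, value in (("negative_prompt", negative_prompt), ("height", height), ("width", width)):
        if value is not None:
            kwargs[name] = value
    return kwargs


kwargs = build_call_kwargs(
    "a photo of an astronaut riding a horse on mars",
    negative_prompt="blurry, low quality",  # illustrative value
    num_inference_steps=25,
    guidance_scale=9.0,
)
print(kwargs)
# With the `pipe` object from the usage script: images = pipe(**kwargs).images
```

Fewer steps trade some quality for speed, and a higher `guidance_scale` pushes the result closer to the prompt, as the descriptions above note.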
  • Text to image arena
    • The figure below shows the current world ranking; stable-diffusion-2-1, the model introduced above, has an ELO score of about 749
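To give a feel for what a rating gap means, the following sketch applies the standard Elo expected-score formula. It assumes the arena uses the conventional 400-point scale, which the article does not state, and the opponent ratings are hypothetical values chosen only for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


sd21_rating = 749  # approximate score quoted above
for opponent in (749, 949, 1149):  # hypothetical opponent ratings
    print(opponent, round(elo_expected_score(sd21_rating, opponent), 3))
```

Under that assumption, equal ratings give an expected score of one half, and the expected share of wins shrinks as the opponent's rating rises.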