こすたろーんエンジニアの試行錯誤部屋

Notes to myself on the things I build.

【Stable Video Diffusion】How to resolve hang-up in StableVideoDiffusionPipeline.from_pretrained


An Image2Video model called Stable Video Diffusion, which generates video from images, is now available.
touch-sp.hatenablog.com
huggingface.co

I tried to run Stable Video Diffusion on Google Colab, but execution stopped at the model loading step.
I looked into how to deal with it.
This article is a memo of how I solved the problem.


abstract

How to resolve hang-up in StableVideoDiffusionPipeline.from_pretrained

1.requirement

Google Colab
Diffusers == 0.25.0
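
To confirm that the Colab runtime matches the version listed above, a small check like the following may help. It is only a sketch: `installed_version` is a hypothetical helper name, and `huggingface_hub` and `torch` are included on the assumption that they are relevant because the sample code in this article imports them.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions of the packages used in this article.
for name in ("diffusers", "huggingface_hub", "torch"):
    print(name, installed_version(name))
```

If `diffusers` shows a version other than 0.25.0, the behavior described in this article may differ.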

2.issue

When I run the following sample code, execution hangs at the "StableVideoDiffusionPipeline.from_pretrained" call and never proceeds past it.

import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt", torch_dtype=torch.float16, variant="fp16"
)
pipe.enable_model_cpu_offload()

# Load the conditioning image
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png")
image = image.resize((1024, 576))

generator = torch.manual_seed(42)
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]

export_to_video(frames, "generated.mp4", fps=7)

3. how to solve

Reportedly, you can work around it by running the following code before loading the model.

from huggingface_hub.utils import _runtime
_runtime._is_google_colab = False
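
Because `_runtime` and `_is_google_colab` are private names, they could be renamed or removed in another version of huggingface_hub. The sketch below wraps the same assignment in a guard so that a missing module or attribute does not raise. The function name `disable_colab_detection` and its optional `runtime` argument are hypothetical and only for illustration; the workaround itself is unchanged.

```python
import importlib
from types import ModuleType
from typing import Optional


def disable_colab_detection(runtime: Optional[ModuleType] = None) -> bool:
    """Set `_is_google_colab = False` on huggingface_hub's private runtime module.

    Returns True if the flag existed and was overridden, False otherwise.
    `runtime` can be passed explicitly (e.g. for testing); by default the
    module `huggingface_hub.utils._runtime` is imported.
    """
    if runtime is None:
        try:
            runtime = importlib.import_module("huggingface_hub.utils._runtime")
        except ImportError:
            return False  # huggingface_hub (or this private module) is unavailable
    if not hasattr(runtime, "_is_google_colab"):
        return False  # the private flag is absent in this version; nothing to patch
    runtime._is_google_colab = False
    return True
```

Call `disable_colab_detection()` before `StableVideoDiffusionPipeline.from_pretrained`. If it returns False, the flag was not found and this workaround did not apply.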

The improved code is as follows

import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video
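from huggingface_hub.utils import _runtime

# Workaround: set huggingface_hub's private _is_google_colab flag to False
# before loading the model
_runtime._is_google_colab = False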

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt", torch_dtype=torch.float16, variant="fp16"
)
pipe.enable_model_cpu_offload()

# Load the conditioning image
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png")
image = image.resize((1024, 576))

generator = torch.manual_seed(42)
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]

export_to_video(frames, "generated.mp4", fps=7)

After this modification, the code ran successfully.


4.reference

github.com