Stable Diffusion Videos 0.6: Notebook (Translation/Commentary)

Translation: ClassCat Co., Ltd. Sales Information
Created: 11/15/2022

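* This page is a translation, with supplementary explanation where appropriate, of the following document from nateraw/stable-diffusion-videos:

stable_diffusion_videos.ipynb

* The sample code has been checked to run, but it has been extended or modified where necessary.
* You are free to link to this page, but we would appreciate a note to sales-info@classcat.com.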

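Stable Diffusion Videos 0.6: Notebook

This notebook lets you generate videos by interpolating through the latent space of Stable Diffusion.

You can dream up different versions of the same prompt, or morph between different text prompts (a seed is set for each prompt for reproducibility).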
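
To make "interpolating through the latent space" concrete, here is a minimal sketch of spherical linear interpolation (slerp) between two seeded noise tensors. It is an illustration under stated assumptions, not the library's implementation: the function name `slerp`, the `dot_threshold` parameter, and the latent shape `(4, 64, 64)` are chosen here for the example, and NumPy arrays stand in for the tensors the pipeline actually uses.

```python
import numpy as np

def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherical linear interpolation between two arrays of the same shape.

    Hypothetical helper for illustration; t=0 returns v0 and t=1 returns v1.
    """
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    # Cosine of the angle between the two (flattened) vectors
    dot = np.sum(v0 * v1) / (np.linalg.norm(v0) * np.linalg.norm(v1))
    if abs(dot) > dot_threshold:
        # Nearly parallel vectors: plain linear interpolation is numerically safer
        return (1.0 - t) * v0 + t * v1
    theta = np.arccos(dot)
    return (np.sin((1.0 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)

# One seeded noise tensor per prompt (the shape is an assumption for a 512x512 image)
latent_a = np.random.default_rng(42).standard_normal((4, 64, 64))
latent_b = np.random.default_rng(1337).standard_normal((4, 64, 64))

# Five evenly spaced points on the path from latent_a to latent_b
frames = [slerp(t, latent_a, latent_b) for t in np.linspace(0.0, 1.0, 5)]
```

Each element of `frames` would correspond to the starting noise for one video frame; fixing the seeds is what makes the endpoints reproducible.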

Setup

%%capture
! pip install stable_diffusion_videos[realesrgan]
! git config --global credential.helper store

Authentication with Hugging Face

You need to be a registered user on the 🤗 Hugging Face Hub, and you also need an access token for the code to work. For details on access tokens, see this section of the documentation.

from huggingface_hub import notebook_login

notebook_login()

Run the App 🚀

Load the Interface

This step takes a few minutes the first time you run it.

import torch

from stable_diffusion_videos import StableDiffusionWalkPipeline, Interface

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    revision="fp16",
).to("cuda")

interface = Interface(pipeline)

Connect to Google Drive to Save Outputs

#@title Connect to Google Drive to Save Outputs

#@markdown If you want to connect Google Drive, click the checkbox below and run this cell. You'll be prompted to authenticate.

#@markdown If you just want to save your outputs in this Colab session, don't worry about this cell

connect_google_drive = True #@param {type:"boolean"}

#@markdown Then, in the interface, use this path as the `output` in the Video tab to save your videos to Google Drive:

#@markdown > /content/gdrive/MyDrive/stable_diffusion_videos


if connect_google_drive:
    from google.colab import drive

    drive.mount('/content/gdrive')

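Launch

This cell launches a Gradio interface. Here is a (suggested) way to use it:

1. Use the "Images" tab to generate images you like.
- Find two images you want to morph between.
- These images should use the same settings (guidance scale, height, width).
- Keep track of the seeds/settings you used so you can reproduce them.
2. Use the "Videos" tab to generate videos.
- Using the images you found in the step above, provide the prompts/seeds you recorded.
- Set num_interpolation_steps. For testing you can use a small number such as 3 or 5, but for great results you should use a larger value (60 to 200 steps).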
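
As a rough guide for choosing num_interpolation_steps, the sketch below estimates the clip length from the step count and the frame rate. It assumes that each interpolation step yields exactly one video frame; `video_duration_seconds` is a hypothetical helper written for this page, not part of the library.

```python
def video_duration_seconds(num_interpolation_steps, fps):
    """Estimate clip length, assuming one frame per interpolation step.

    Accepts a single int or a list of per-segment step counts.
    """
    if isinstance(num_interpolation_steps, int):
        steps = [num_interpolation_steps]
    else:
        steps = list(num_interpolation_steps)
    return sum(steps) / fps

quick_test = video_duration_seconds(5, fps=5)       # small value for testing
longer_clip = video_duration_seconds(200, fps=25)   # upper end of the suggested range
```

Under that assumption, 5 steps at 5 fps gives a one-second clip, while 200 steps at 25 fps gives eight seconds.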

interface.launch(debug=True)

Use walk programmatically

Another option is to skip the interface and call walk programmatically instead. Here is how to do that.

First, define a helper function to display videos in Colab.

from IPython.display import HTML
from base64 import b64encode

def visualize_video_colab(video_path):
    # Read the MP4 and embed it in the page as a base64 data URL
    with open(video_path, 'rb') as f:
        mp4 = f.read()
    data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
    return HTML("""
    <video width=400 controls>
        <source src="%s" type="video/mp4">
    </video>
    """ % data_url)

Walk! 🚶‍♀️

video_path = pipeline.walk(
    ['a cat', 'a dog'],
    [42, 1337],
    fps=5,                      # use 5 for testing, 25 or 30 for better quality
    num_interpolation_steps=5,  # use 3-5 for testing, 30 or more for better results
    height=512,                 # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,                  # use multiples of 64 if > 512. Multiples of 8 if < 512.
)
visualize_video_colab(video_path)
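
The comments on height and width above suggest multiples of 8 below 512 and multiples of 64 above it. A small sketch of a helper that snaps an arbitrary size to the nearest allowed value follows; `snap_dimension` is a hypothetical name introduced here, and the rounding rule is simply one reading of those comments.

```python
def snap_dimension(value):
    """Round a pixel size to the nearest multiple of 64 (above 512) or 8 (otherwise)."""
    step = 64 if value > 512 else 8
    # Integer arithmetic avoids surprises from Python's round-half-to-even
    snapped = (value + step // 2) // step * step
    return max(step, snapped)

height = snap_dimension(600)  # a size above 512 snaps to a multiple of 64
width = snap_dimension(500)   # a size below 512 snaps to a multiple of 8
```

The snapped values can then be passed as the height and width arguments shown above.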

Bonus! Music Videos

Upload an audio file and give it a try.

# Seconds in the song
audio_offsets = [146, 148]
fps = 8

# Convert seconds to frames
num_interpolation_steps = [(b-a) * fps for a, b in zip(audio_offsets, audio_offsets[1:])]


video_path = pipeline.walk(
    prompts=['blueberry spaghetti', 'strawberry spaghetti'],
    seeds=[42, 1337],
    num_interpolation_steps=num_interpolation_steps,
    height=512,                            # use multiples of 64
    width=512,                             # use multiples of 64
    audio_filepath='kanye_west_fade.mp3',  # Use your own file
    audio_start_sec=audio_offsets[0],      # Start second of the provided audio
    fps=fps,                               # important to set yourself based on the num_interpolation_steps you defined
    batch_size=1,                          # increase until you go out of memory
    output_dir='./dreams',                 # Where images will be saved
    name=None,                             # Subdir of output dir. will be timestamp by default
)
visualize_video_colab(video_path)
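
The seconds-to-frames conversion above extends naturally to more than two offsets. The sketch below uses hypothetical offsets and prompts to show the bookkeeping; it assumes that when a list of step counts is given, there is one entry per transition, that is, one fewer than the number of prompts.

```python
def steps_from_offsets(audio_offsets, fps):
    """Convert consecutive audio offsets (in seconds) to per-segment frame counts."""
    return [(b - a) * fps for a, b in zip(audio_offsets, audio_offsets[1:])]

# Hypothetical values for illustration only
audio_offsets = [146, 148, 151, 155]
fps = 8
prompts = [
    'blueberry spaghetti',
    'strawberry spaghetti',
    'banana spaghetti',
    'mango spaghetti',
]

num_interpolation_steps = steps_from_offsets(audio_offsets, fps)  # [16, 24, 32]

# One transition between each pair of neighboring prompts
assert len(num_interpolation_steps) == len(prompts) - 1
```

Keeping fps identical in both the conversion and the walk call is what keeps the transitions aligned with the chosen moments in the audio.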

That's all.
