
Introduction to DreamBooth

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation is a new approach to "personalizing" text-to-image (text2image) diffusion models, i.e. adapting them to a user's specific image-generation needs. Although DreamBooth was developed on top of Imagen, the authors note in the paper that the method also applies to other diffusion models. Given just a few photos of a specific subject (typically 3-5), the corresponding class name (e.g. "dog"), and a unique identifier embedded into various text prompts, DreamBooth can make the specified subject appear faithfully in whatever scene the user wants to generate.

Introduction to LoRA

LoRA: Low-Rank Adaptation of Large Language Models is a technique introduced by Microsoft researchers to address the cost of fine-tuning large models. Highly capable models with billions of parameters (such as GPT-3) are extremely expensive to fine-tune for downstream tasks. LoRA proposes freezing the pretrained weights and injecting trainable layers (rank-decomposition matrices) into each Transformer block. Because gradients do not need to be computed for most of the model's weights, this greatly reduces the number of trainable parameters and lowers GPU memory requirements. The researchers found that by focusing on the Transformer attention blocks of large models, fine-tuning with LoRA matches the quality of full-model fine-tuning while being faster and requiring less compute.

In short, LoRA adapts a pretrained model by adding a pair of rank-decomposition matrices alongside the existing weights and training only those newly added weights (see the sketch after this list). This has several advantages:

  • The pretrained weights are kept frozen, so the model is far less prone to catastrophic forgetting;
  • The rank-decomposition matrices have far fewer parameters than the original model, which makes the trained LoRA weights easy to port;
  • LoRA attention layers allow a scale parameter to control how strongly the model adapts to the new training images.
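
To make the mechanics concrete, here is a minimal sketch of the LoRA idea in Paddle. This is an illustration only, not the ppdiffusers implementation; the dimensions and names are hypothetical. The frozen weight W is left untouched, only the low-rank pair (A, B) is trained, and a scale factor controls how strongly the adaptation is applied:

import paddle

d, r = 768, 4                        # hypothetical hidden size and LoRA rank
W = paddle.randn([d, d])             # pretrained weight, stays frozen
W.stop_gradient = True
A = paddle.randn([r, d]) * 0.01      # trainable down-projection
A.stop_gradient = False
B = paddle.zeros([d, r])             # trainable up-projection, zero-initialized
B.stop_gradient = False
scale = 1.0                          # the "scale" knob mentioned above

def lora_linear(x):
    # Equivalent to x @ (W + scale * B @ A).T, without ever modifying W
    return x @ W.T + scale * ((x @ A.T) @ B.T)

y = lora_linear(paddle.randn([1, d]))  # only A and B receive gradients

Because B starts at zero, the adapted layer initially behaves exactly like the pretrained one, and lowering scale at inference time blends the adaptation back toward the base model.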

1. Install dependencies

  • Run the cell below to install the dependencies. To make sure the installation takes effect, restart the kernel once it finishes! (Note: this only needs to be run once!)
!python -m pip install -U paddlenlp ppdiffusers visualdl --user

2. Prepare the training images

  • Run the cell below to unpack the image resources (Note: this only needs to be run once!)
!unzip data/data190562/国潮2.zip -d data/paints
  • The image resources are now prepared in the data/paints folder.

  • The folder contains more images in the same style.

  • Clean the images: remove the low-quality ones (one way to do this is sketched below).
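
As a starting point, a small script like the following can drop unreadable or undersized images. The folder path matches the unzip step above; the size threshold is an assumption you should tune to your data:

import os
from PIL import Image

data_dir = "data/paints"   # folder produced by the unzip step above
min_side = 512             # assumed threshold; smaller images are dropped

for name in os.listdir(data_dir):
    path = os.path.join(data_dir, name)
    try:
        with Image.open(path) as im:
            im.verify()                    # raises on truncated/corrupt files
        with Image.open(path) as im:
            width, height = im.size
        if min(width, height) < min_side:
            os.remove(path)                # too small to train at 512x512
    except Exception:
        os.remove(path)                    # not a readable image, remove it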

3. Start training

  • Download the training script
!wget https://raw.githubusercontent.com/PaddlePaddle/PaddleNLP/develop/ppdiffusers/examples/dreambooth/train_dreambooth_lora.py

DreamBooth LoRA

Parameter reference:

Parameters you will usually modify

  • --pretrained_model_name_or_path: the name of the Stable Diffusion weights to use, or the path to a locally downloaded model. Eight pretrained weights are currently supported and can be swapped in directly.
  • --instance_data_dir: path to the folder of instance (subject) images.
  • --instance_prompt: a prompt containing the specific instance (subject), e.g. a photo of sks dog, where dog denotes the instance.
  • --class_data_dir: path to the folder of class images, used as prior knowledge.
  • --class_prompt: a class-level prompt; it should describe the same class as the instance, e.g. a photo of dog, used as prior knowledge.
  • --num_class_images: how many images to generate in advance from class_prompt, used as prior knowledge.
  • --prior_loss_weight: weight of the prior-preservation loss.
  • --sample_batch_size: batch size used when generating the class_prompt images; set this to 1 on GPUs with limited memory.
  • --with_prior_preservation: whether to include the generated class images (prior knowledge) in training. Only when this is True do class_prompt, class_data_dir, num_class_images, sample_batch_size, and prior_loss_weight take effect.
  • --num_train_epochs: number of training epochs; defaults to 1.
  • --max_train_steps: maximum number of training steps; when set, the required num_train_epochs is recomputed from it.
  • --checkpointing_steps: interval (in global steps) at which model weights are saved.
  • --gradient_accumulation_steps: number of gradient-accumulation steps. Accumulating gradients reduces inter-GPU gradient communication and the number of optimizer updates, effectively enlarging the training batch size.
  • --train_text_encoder: whether to also train the text encoder; defaults to False.

Parameters you may modify

  • --height: height of the images fed to the model. Because user images come in varying sizes, the code resizes originals to this height; defaults to None.
  • --width: width of the images fed to the model. Because user images come in varying sizes, the code resizes originals to this width; defaults to None.
  • --resolution: resolution of the images fed to the model, used when height and width are None; defaults to 512.
  • --learning_rate: the learning rate.
  • --scale_lr: whether to scale the learning rate by the number of GPUs, gradient-accumulation steps, and batch size. Scaling formula: learning_rate * gradient_accumulation_steps * train_batch_size * num_processes (a worked example follows this list).
  • --lr_scheduler: which learning-rate schedule to use; defaults to constant.
  • --lr_warmup_steps: number of steps for a linear warmup from 0 to learning_rate.
  • --train_batch_size: per-GPU training batch size; lower this value when GPU memory is tight.
  • --center_crop: whether to center-crop images before resizing; defaults to False.
  • --random_flip: whether to flip images horizontally at random; defaults to False.
  • --gradient_checkpointing: whether to enable gradient checkpointing, which saves GPU memory at the cost of slower training.
  • --output_dir: where the trained model is saved; defaults to the dreambooth-model folder. Change the output path for each training run so earlier models are not overwritten.
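
As a quick worked example of the scaling formula and the effective batch size (plain arithmetic with hypothetical values):

# Hypothetical values, purely to illustrate the arithmetic above
learning_rate = 1e-4
train_batch_size = 1
gradient_accumulation_steps = 4
num_processes = 1  # number of GPUs

# Samples contributing to each optimizer update
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_processes
# Learning rate actually used when --scale_lr is enabled
scaled_lr = learning_rate * gradient_accumulation_steps * train_batch_size * num_processes

print(effective_batch_size)  # 4
print(scaled_lr)             # 0.0004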

Parameters you rarely need to modify

  • --seed: random seed, for reproducible training. Tip: with the current Paddle version, setting the seed still does not guarantee perfect reproducibility.
  • --adam_beta1: beta1 for the AdamW optimizer; defaults to 0.9.
  • --adam_beta2: beta2 for the AdamW optimizer; defaults to 0.999.
  • --adam_weight_decay: weight_decay for the AdamW optimizer; defaults to 1e-2.
  • --adam_epsilon: epsilon for the AdamW optimizer; defaults to 1e-8.
  • --max_grad_norm: maximum gradient norm, used for gradient clipping; defaults to 1.0.
  • --logging_dir: directory for TensorBoard or VisualDL logs. Note: it is joined with the output directory, so the final log path is <output_dir>/<logging_dir>.
  • --report_to: logging backend, one of ["tensorboard", "visualdl"]; defaults to visualdl. To use tensorboard, install it first with pip install tensorboardX.
  • --push_to_hub: whether to upload the model to the Hugging Face Hub; defaults to False.
  • --hub_token: token for uploading to the Hugging Face Hub; not needed if you are already logged in.
  • --hub_model_id: name of the Hub repository to upload to; if None, the name of output_dir is used.
!python train_dreambooth_lora.py \
  --pretrained_model_name_or_path="Linaqruf/anything-v3.0"  \
  --instance_data_dir="./data/paints" \
  --output_dir="./dream_booth_lora_outputs" \
  --instance_prompt="<Guochao>" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=1000 \
  --learning_rate=1e-4 \
  --report_to="visualdl" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=5000 \
  --lora_rank=256 \
  --validation_prompt="pretty girl,<Guochao>" \
  --validation_epochs=500 \
  --seed=0
[2023-03-21 11:56:42,999] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/tokenizer_config.json
[2023-03-21 11:56:43,000] [    INFO] - We are using <class 'paddlenlp.transformers.clip.tokenizer.CLIPTokenizer'> to load 'Linaqruf/anything-v3.0/tokenizer'.
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/vocab.json
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/merges.txt
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/added_tokens.json
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/special_tokens_map.json
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/tokenizer_config.json
[2023-03-21 11:56:43,247] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:56:43,248] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:56:43,249] [    INFO] - Model config PretrainedConfig {
  "architectures": [
    "CLIPTextModel"
  ],
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "max_text_length": 77,
  "paddlenlp_version": null,
  "projection_dim": 768,
  "text_embed_dim": 768,
  "text_heads": 12,
  "text_hidden_act": "quick_gelu",
  "text_layers": 12,
  "vocab_size": 49408
}
[2023-03-21 11:56:43,249] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/scheduler/scheduler_config.json
W0321 11:56:43.251735  2942 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0321 11:56:43.255620  2942 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[2023-03-21 11:56:44,836] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:56:44,836] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:56:44,837] [    INFO] - Model config CLIPTextConfig {
  "architectures": [
    "CLIPTextModel"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "dropout": 0.0,
  "eos_token_id": 2,
  "hidden_act": "quick_gelu",
  "hidden_size": 768,
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 77,
  "model_type": "clip_text_model",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "paddlenlp_version": null,
  "projection_dim": 768,
  "return_dict": true,
  "vocab_size": 49408
}
[2023-03-21 11:56:44,908] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_state.pdparams
[2023-03-21 11:56:46,375] [    INFO] - All model checkpoint weights were used when initializing CLIPTextModel.
[2023-03-21 11:56:46,375] [    INFO] - All the weights of CLIPTextModel were initialized from the model checkpoint at Linaqruf/anything-v3.0/text_encoder.
If your task is similar to the task the model of the checkpoint was trained on, you can already use CLIPTextModel for predictions without further training.
[2023-03-21 11:56:46,376] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/vae/model_state.pdparams
[2023-03-21 11:56:46,376] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/vae/config.json
[2023-03-21 11:56:47,056] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/unet/model_state.pdparams
[2023-03-21 11:56:47,057] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/unet/config.json
[2023-03-21 11:56:56,555] [    INFO] - -----------  Configuration Arguments -----------
[2023-03-21 11:56:56,555] [    INFO] - adam_beta1: 0.9
[2023-03-21 11:56:56,555] [    INFO] - adam_beta2: 0.999
[2023-03-21 11:56:56,555] [    INFO] - adam_epsilon: 1e-08
[2023-03-21 11:56:56,555] [    INFO] - adam_weight_decay: 0.01
[2023-03-21 11:56:56,555] [    INFO] - center_crop: False
[2023-03-21 11:56:56,555] [    INFO] - checkpointing_steps: 1000
[2023-03-21 11:56:56,555] [    INFO] - class_data_dir: None
[2023-03-21 11:56:56,555] [    INFO] - class_prompt: None
[2023-03-21 11:56:56,555] [    INFO] - dataloader_num_workers: 0
[2023-03-21 11:56:56,555] [    INFO] - gradient_accumulation_steps: 1
[2023-03-21 11:56:56,555] [    INFO] - gradient_checkpointing: False
[2023-03-21 11:56:56,555] [    INFO] - height: 512
[2023-03-21 11:56:56,556] [    INFO] - hub_model_id: None
[2023-03-21 11:56:56,556] [    INFO] - hub_token: None
[2023-03-21 11:56:56,556] [    INFO] - instance_data_dir: ./paints
[2023-03-21 11:56:56,556] [    INFO] - instance_prompt: <Guochao>
[2023-03-21 11:56:56,556] [    INFO] - learning_rate: 0.0001
[2023-03-21 11:56:56,556] [    INFO] - logging_dir: ./dream_booth_lora_outputs/logs
[2023-03-21 11:56:56,556] [    INFO] - lora_rank: 256
[2023-03-21 11:56:56,556] [    INFO] - lr_num_cycles: 1
[2023-03-21 11:56:56,556] [    INFO] - lr_power: 1.0
[2023-03-21 11:56:56,556] [    INFO] - lr_scheduler: constant
[2023-03-21 11:56:56,556] [    INFO] - lr_warmup_steps: 0
[2023-03-21 11:56:56,556] [    INFO] - max_grad_norm: 1.0
[2023-03-21 11:56:56,556] [    INFO] - max_train_steps: 8000
[2023-03-21 11:56:56,556] [    INFO] - num_class_images: 100
[2023-03-21 11:56:56,556] [    INFO] - num_train_epochs: 534
[2023-03-21 11:56:56,556] [    INFO] - num_validation_images: 4
[2023-03-21 11:56:56,556] [    INFO] - output_dir: ./dream_booth_lora_outputs
[2023-03-21 11:56:56,556] [    INFO] - pretrained_model_name_or_path: Linaqruf/anything-v3.0
[2023-03-21 11:56:56,556] [    INFO] - prior_loss_weight: 1.0
[2023-03-21 11:56:56,557] [    INFO] - push_to_hub: False
[2023-03-21 11:56:56,557] [    INFO] - random_flip: False
[2023-03-21 11:56:56,557] [    INFO] - report_to: visualdl
[2023-03-21 11:56:56,557] [    INFO] - resolution: 512
[2023-03-21 11:56:56,557] [    INFO] - sample_batch_size: 4
[2023-03-21 11:56:56,557] [    INFO] - scale_lr: False
[2023-03-21 11:56:56,557] [    INFO] - seed: 0
[2023-03-21 11:56:56,557] [    INFO] - tokenizer_name: None
[2023-03-21 11:56:56,557] [    INFO] - train_batch_size: 1
[2023-03-21 11:56:56,557] [    INFO] - validation_epochs: 500
[2023-03-21 11:56:56,557] [    INFO] - validation_prompt: pretty girl,<Guochao>
[2023-03-21 11:56:56,557] [    INFO] - width: 512
[2023-03-21 11:56:56,557] [    INFO] - with_prior_preservation: False
[2023-03-21 11:56:56,557] [    INFO] - ------------------------------------------------
[2023-03-21 11:56:56,658] [    INFO] - ***** Running training *****
[2023-03-21 11:56:56,659] [    INFO] -   Num examples = 15
[2023-03-21 11:56:56,659] [    INFO] -   Num batches each epoch = 15
[2023-03-21 11:56:56,659] [    INFO] -   Num Epochs = 534
[2023-03-21 11:56:56,659] [    INFO] -   Instantaneous batch size per device = 1
[2023-03-21 11:56:56,659] [    INFO] -   Total train batch size (w. parallel, distributed & accumulation) = 1
[2023-03-21 11:56:56,659] [    INFO] -   Gradient Accumulation steps = 1
[2023-03-21 11:56:56,659] [    INFO] -   Total optimization steps = 8000
Train Steps:   0%| | 15/8000 [00:06<45:48,  2.91it/s, epoch=0000, lr=0.0001, ste
[2023-03-21 11:57:03,236] [    INFO] - Running validation... 
 Generating 4 images with prompt: pretty girl,<Guochao>.
[2023-03-21 11:57:03,237] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/model_index.json
[2023-03-21 11:57:03,238] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/vae/model_state.pdparams
[2023-03-21 11:57:03,238] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/vae/config.json
[2023-03-21 11:57:04,064] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:57:04,066] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:57:04,067] [    INFO] - Model config CLIPTextConfig {
  "architectures": [
    "CLIPTextModel"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "dropout": 0.0,
  "eos_token_id": 2,
  "hidden_act": "quick_gelu",
  "hidden_size": 768,
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 77,
  "model_type": "clip_text_model",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "paddlenlp_version": null,
  "projection_dim": 768,
  "return_dict": true,
  "vocab_size": 49408
}
[2023-03-21 11:57:04,130] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_state.pdparams
[2023-03-21 11:57:05,460] [    INFO] - All model checkpoint weights were used when initializing CLIPTextModel.
[2023-03-21 11:57:05,460] [    INFO] - All the weights of CLIPTextModel were initialized from the model checkpoint at Linaqruf/anything-v3.0/text_encoder.
If your task is similar to the task the model of the checkpoint was trained on, you can already use CLIPTextModel for predictions without further training.
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/vocab.json
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/merges.txt
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/added_tokens.json
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/special_tokens_map.json
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/tokenizer_config.json
[2023-03-21 11:57:05,524] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/scheduler/scheduler_config.json
[2023-03-21 11:57:05,527] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/feature_extractor/preprocessor_config.json
[2023-03-21 11:57:05,528] [    INFO] - loading configuration file https://bj.bcebos.com/paddlenlp/models/community/Linaqruf/anything-v3.0/feature_extractor/preprocessor_config.json from cache at /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/feature_extractor/preprocessor_config.json
[2023-03-21 11:57:05,529] [    INFO] - size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'shortest_edge': 224}.
[2023-03-21 11:57:05,529] [    INFO] - crop_size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'height': 224, 'width': 224}.
[2023-03-21 11:57:05,529] [    INFO] - Image processor CLIPFeatureExtractor {
  "crop_size": {
    "height": 224,
    "width": 224
  },
  "do_center_crop": true,
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "feature_extractor_type": "CLIPFeatureExtractor",
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "CLIPFeatureExtractor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "shortest_edge": 224
  }
}
You have disabled the safety checker for <class 'ppdiffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. PaddleNLP team, diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
Train Steps:  12%|▏| 1000/8000 [06:16<41:28,  2.81it/s, epoch=0066, lr=0.0001, s
[2023-03-21 12:03:13,837] [    INFO] - Saved lora weights to ./dream_booth_lora_outputs/checkpoint-1000
Train Steps:  25%|▎| 2000/8000 [12:17<35:40,  2.80it/s, epoch=0133, lr=0.0001, s
[2023-03-21 12:09:14,978] [    INFO] - Saved lora weights to ./dream_booth_lora_outputs/checkpoint-2000
Train Steps:  38%|▍| 3000/8000 [18:16<29:04,  2.87it/s, epoch=0199, lr=0.0001, s
[2023-03-21 12:15:13,741] [    INFO] - Saved lora weights to ./dream_booth_lora_outputs/checkpoint-3000
Train Steps:  49%|▍| 3954/8000 [24:09<23:38,  2.85it/s, epoch=0263, lr=0.0001, s

4. Launch VisualDL to inspect the training run
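
Training logs are written to <output_dir>/logs, here ./dream_booth_lora_outputs/logs (see the logging_dir line in the training log above). One way to launch VisualDL against that directory (the port number is an arbitrary choice):

!visualdl --logdir ./dream_booth_lora_outputs/logs --port 8040

Open the address VisualDL prints to browse the loss curves and the validation images generated during training.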

5. Load the trained weights for inference

  • Load the model
from ppdiffusers import StableDiffusionPipeline
from ppdiffusers import DPMSolverMultistepScheduler
import paddle
from IPython.display import clear_output

# Model paths
pretrained_model_name_or_path = "Linaqruf/anything-v3.0"
unet_model_path = "./dream_booth_lora_outputs"

# Load the base model
pipe = StableDiffusionPipeline.from_pretrained(pretrained_model_name_or_path, safety_checker=None)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Add the trained LoRA adapter layers to the UNet
pipe.unet.load_attn_procs(unet_model_path, from_hf_hub=False)

clear_output()
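
Before generating, you can sanity-check that the adapter layers were attached. This assumes ppdiffusers mirrors the diffusers UNet attn_processors attribute (a hypothetical check; verify it against your ppdiffusers version):

# Assumed attribute, mirroring diffusers: one processor per attention layer
print(len(pipe.unet.attn_processors), "attention processors loaded")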
  • Run inference with the model
from IPython.display import display


prompt               = "pretty girl,full body,pretty face,Perfect face,clear face,fine details,<Guochao>"
negative_prompt      = "lowres, error_face, bad_face, bad_anatomy, error_body, error_hair, error_arm, (error_hands, bad_hands, error_fingers, bad_fingers, missing_fingers) error_legs, bad_legs, multiple_legs, missing_legs, error_lighting, error_shadow, error_reflection, text, error, extra_digit, fewer_digits, cropped, worst_quality, low_quality, normal_quality, jpeg_artifacts, signature, watermark, username, blurry"

guidance_scale       = 8
num_inference_steps  = 50
height = 512
width = 512

img = pipe(prompt, negative_prompt=negative_prompt, guidance_scale=guidance_scale, height=height, width=width, num_inference_steps=num_inference_steps).images[0]
display(img)
  0%|          | 0/50 [00:00<?, ?it/s]

[Generated image]
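
Recall from the LoRA introduction that a scale parameter controls how strongly the adaptation is applied. Assuming ppdiffusers mirrors the diffusers cross_attention_kwargs API for UNets loaded via load_attn_procs (an assumption worth verifying for your ppdiffusers version), the LoRA strength can be varied per call:

# Assumed API, mirroring diffusers: scale=0.0 reproduces the base model,
# scale=1.0 applies the full LoRA effect.
img_blend = pipe(
    prompt,
    negative_prompt=negative_prompt,
    guidance_scale=guidance_scale,
    height=height,
    width=width,
    num_inference_steps=num_inference_steps,
    cross_attention_kwargs={"scale": 0.5},
).images[0]
display(img_blend)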

References

  • https://github.com/huggingface/diffusers/tree/main/examples/dreambooth
  • https://github.com/CompVis/stable-diffusion
  • https://github.com/PaddlePaddle/PaddleNLP/tree/develop/ppdiffusers/examples/dreambooth
  • https://aistudio.baidu.com/aistudio/projectdetail/5481677?channelType=0&channel=0
