AdaFreeU: Adaptive FreeU Parameter Prediction for Text-to-Image Diffusion Models

We propose AdaFreeU, an adaptive extension of FreeU for text-to-image diffusion models such as Stable Diffusion (SD). FreeU improves generation by re-weighting U-Net backbone and skip features during inference through two scaling factors, but its fixed parameters may not generalize well across different prompts, seeds, and visual styles. We address this limitation by predicting FreeU parameters adaptively using strategies such as Gaussian policy prediction, REINFORCE, and DPO. Experimental results show that AdaFreeU improves over both standard SD and default FreeU, with DPO achieving the highest mean ImageReward. Compared with default FreeU, DPO enlarges the mean ImageReward gain over SD in both modes: by about 2.1x in constant mode and, computed from the rounded values in the results table, by about 8.0x in spatial mode.

Yi-Hsiang Ho*, Ting-Wei Chou*, Yi-Cheng Lai* — National Yang Ming Chiao Tung University (* Equal contribution)

Project Page · Poster · Proposal

Methodology

Gaussian Policy

Frozen CLIP encoders embed the prompt and a baseline SD image; their concatenated features feed a policy network that outputs a mean and log-variance over the 8 FreeU parameters (b and s for each of the four U-Net upsampling layers). The policy is trained on (prompt, FreeU parameters, reward) samples with a reward-weighted Gaussian negative log-likelihood, so the predicted distribution is pulled toward high-reward parameter choices. At inference, FreeU parameters are sampled directly from the predicted Gaussian.
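The reward-weighted objective can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's training code: the helper names (`softmax`, `gaussian_nll`, `reward_weighted_nll`) are hypothetical, the candidates are assumed to belong to one prompt, and the softmax-over-reward-delta weighting follows the description given for the `train policy` command.

```python
import math


def softmax(values, temperature):
    """Numerically stable softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_nll(params, mean, log_var):
    """Negative log-likelihood of one parameter vector under a diagonal Gaussian."""
    nll = 0.0
    for x, mu, lv in zip(params, mean, log_var):
        nll += 0.5 * (lv + (x - mu) ** 2 / math.exp(lv) + math.log(2 * math.pi))
    return nll


def reward_weighted_nll(candidates, reward_deltas, mean, log_var, temperature=0.5):
    """Weight each candidate's NLL by softmax(reward_delta / temperature)."""
    weights = softmax(reward_deltas, temperature)
    return sum(w * gaussian_nll(c, mean, log_var) for w, c in zip(weights, candidates))


# Two candidate parameter vectors (toy 2-D case; the real policy predicts 8 values).
candidates = [[1.4, 0.9], [1.1, 0.6]]
reward_deltas = [0.30, -0.10]  # ImageReward of each candidate minus the SD baseline
loss = reward_weighted_nll(candidates, reward_deltas, mean=[1.3, 0.8], log_var=[0.0, 0.0])
print(round(loss, 4))
```

Because the first candidate has the larger reward delta, it receives the larger weight, so minimizing this loss moves the predicted mean toward it.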

REINFORCE

The same CLIP + policy network produces a mean, from which an action is sampled and denormalized into FreeU parameters with an associated log-probability. SD generates an image with these parameters; ImageReward scores that image, and the score minus the baseline image's reward forms the advantage. The policy is updated with the REINFORCE objective, reinforcing parameter choices that improve ImageReward over the baseline.
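A single REINFORCE step, as described above, might look like the following sketch. The fixed standard deviation, the tanh-based `denormalize` mapping, and the parameter range are assumptions made for illustration; the repository may normalize actions differently.

```python
import math
import random


def sample_action(mean, std, rng):
    """Sample from an isotropic Gaussian and return the action with its log-probability."""
    action = [rng.gauss(mu, std) for mu in mean]
    log_prob = sum(
        -0.5 * ((a - mu) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)
        for a, mu in zip(action, mean)
    )
    return action, log_prob


def denormalize(action, low, high):
    """Assumed mapping: squash each coordinate with tanh, then rescale into [low, high]."""
    return [low + (math.tanh(a) + 1.0) * 0.5 * (high - low) for a in action]


def reinforce_loss(log_prob, reward, baseline_reward):
    """REINFORCE surrogate: minimizing it raises log_prob when the advantage is positive."""
    advantage = reward - baseline_reward
    return -advantage * log_prob


rng = random.Random(0)
action, log_prob = sample_action(mean=[0.0] * 8, std=0.5, rng=rng)
freeu_params = denormalize(action, low=0.5, high=2.0)  # hypothetical parameter range
# In the real pipeline the two rewards come from ImageReward on the generated images.
loss = reinforce_loss(log_prob, reward=0.12, baseline_reward=-0.02)
```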

DPO

The policy network outputs a base parameter vector, which is Gaussian-perturbed into two candidates. SD renders an image for each candidate, and ImageReward compares them to identify the preferred and non-preferred outputs. The policy is optimized with a DPO-style loss that pulls the base parameters toward the preferred candidate's parameters and away from the non-preferred one.
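The preference step can be sketched as below. This is a simplified, reference-free variant: standard DPO (Rafailov et al., 2023) also subtracts the log-probabilities of a frozen reference policy, which is omitted here. The fixed `std`, the `beta` value, and all function names are assumptions for illustration.

```python
import math
import random


def gaussian_log_prob(x, mean, std):
    """Log-density of x under an isotropic Gaussian centered at mean."""
    return sum(
        -0.5 * ((xi - mu) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)
        for xi, mu in zip(x, mean)
    )


def perturb(base, std, rng):
    """Draw one candidate by adding Gaussian noise to the base parameters."""
    return [rng.gauss(b, std) for b in base]


def dpo_style_loss(base, preferred, rejected, std=0.1, beta=1.0):
    """-log sigmoid(beta * (log p(preferred) - log p(rejected))), computed stably."""
    margin = beta * (
        gaussian_log_prob(preferred, base, std) - gaussian_log_prob(rejected, base, std)
    )
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


rng = random.Random(0)
base = [1.2, 0.9]  # toy 2-D base vector
cand_a, cand_b = perturb(base, 0.1, rng), perturb(base, 0.1, rng)
# Pretend ImageReward preferred cand_a; in the real pipeline SD renders both candidates.
loss = dpo_style_loss(base, preferred=cand_a, rejected=cand_b)
```

Lowering this loss increases the likelihood of the preferred candidate relative to the rejected one, which for a Gaussian centered on the base vector means moving the base toward the preferred parameters.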

Experimental Setup & Dataset

Our experiments are conducted on Stable Diffusion 1.5, based on the latent diffusion framework, with a custom FreeU implementation that enables per-layer control over the four U-Net upsampling layers. We compare standard Stable Diffusion, default FreeU, and adaptive FreeU variants that predict FreeU parameters from the prompt and baseline SD image. Large-scale evaluation is performed on the MJHQ-30K dataset, using prompts and category metadata to assess performance across diverse image types. ImageReward-v1.0 is used as the primary evaluation metric, measuring prompt-image quality according to learned human preference.

Results

We evaluate each method on 5,000 MJHQ-30K prompts sampled uniformly across categories, using ImageReward as the main evaluation metric. Adaptive FreeU methods generally outperform standard SD and default FreeU, with DPO achieving the highest mean ImageReward of 0.076 under constant prediction.

Method	Constant ImageReward	Spatial ImageReward
SD	-0.0216 ± 0.0137	-0.0216 ± 0.0137
Default FreeU	0.0245 ± 0.0136	-0.0155 ± 0.0138
Gaussian	0.0304 ± 0.0137	-0.0088 ± 0.0138
REINFORCE	0.0435 ± 0.0137	0.0132 ± 0.0138
DPO	0.0764 ± 0.0136	0.0271 ± 0.0138
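As a worked check, the gain ratios between DPO and default FreeU can be recomputed from the rounded means in the table above. The short script below only uses values printed in the table; small differences from ratios computed on unrounded data are possible.

```python
# Mean ImageReward values copied from the results table.
rewards = {
    "constant": {"SD": -0.0216, "Default FreeU": 0.0245, "DPO": 0.0764},
    "spatial": {"SD": -0.0216, "Default FreeU": -0.0155, "DPO": 0.0271},
}


def gain_ratio(mode):
    """Ratio of DPO's gain over SD to default FreeU's gain over SD."""
    row = rewards[mode]
    freeu_gain = row["Default FreeU"] - row["SD"]
    dpo_gain = row["DPO"] - row["SD"]
    return dpo_gain / freeu_gain


for mode in rewards:
    print(mode, round(gain_ratio(mode), 2))
# constant 2.13
# spatial 7.98
```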

A user study further indicates that the Gaussian-based adaptive prediction method is preferred over both SD and default FreeU (share of votes, with counts out of 630 per mode):

Mode	SD	Default FreeU	Gaussian
Constant	30.0% (189/630)	23.7% (149/630)	46.3% (292/630)
Spatial	14.8% (93/630)	23.5% (148/630)	61.7% (389/630)

References

Si, C., Huang, Z., Jiang, Y., & Liu, Z. (2024). FreeU: Free lunch in diffusion U-Net. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Xu, J., Liu, X., Wu, Y., Tong, Y., Li, Q., Ding, M., Tang, J., & Dong, Y. (2023). ImageReward: Learning and evaluating human preferences for text-to-image generation. In Advances in Neural Information Processing Systems (NeurIPS).
Playground AI. (2024). MJHQ-30K benchmark [Data set]. Hugging Face.
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023). Direct preference optimization: Your language model is secretly a reward model. In Advances in Neural Information Processing Systems (NeurIPS).

Code

Project Layout

src/
  cli.py                  # `uv run freeu` entrypoint
  adaptive/               # dataset build/upload, b/s predictor training, inference
  configs/adaptive/       # Adaptive FreeU dataset configs
  configs/sd15/base.json  # SD1.5 defaults and task directory
  configs/sd15/tasks/     # one experiment task per JSON file
  core/                   # FreeU algorithm and diffusion pipeline wrapper
  experiments/            # experiment runners: compare, sweep, seed search, ablation
  utils/                  # config loading, figure rendering, metadata IO
outputs/                  # generated images and metadata, ignored by git
docs/freeu/               # official FreeU reference repo copy
docs/                     # proposal/paper PDFs

outputs/ holds runtime output and can be deleted before reruns. src/configs/ stores the experiment configs, which we edit alongside the code.

Setup

uv sync
# Stable Diffusion model access may require: uv run hf auth login

FreeU Reproduction

Run configured tasks (use --dry-run to preview without loading Stable Diffusion, or --task <name> for a single task):

uv run freeu                                          # run all configured tasks
uv run freeu --dry-run
uv run freeu --task teddy_snowstorm_feature_maps --device cuda --dtype float16 --seed 12
uv run freeu --task teddy_snowstorm_feature_maps --freeu-mode spatial

Tasks live in src/configs/sd15/tasks/, with shared defaults in src/configs/sd15/base.json. FreeU parameters are per-layer:

{ "index": 0, "name": "up_block_0_lowest_resolution", "b": 1.5, "s": 0.9, "enabled": true }

This lets us compare single layers and combinations such as L1, L2, L3, L4, L1+L2, L2+L3, and L3+L4.
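A layer combination can be expressed by toggling the `enabled` flag per entry. The sketch below generates such entries programmatically; the field names come from the example above, while the `up_block_{i}` naming, the mapping of L1 to index 0, and the shared b/s values are assumptions for illustration.

```python
import json

LAYER_COUNT = 4  # the four U-Net upsampling layers, labeled L1-L4


def layer_configs(enabled_layers, b=1.5, s=0.9):
    """Build one entry per layer; enabled_layers holds 1-based labels such as {1, 2}."""
    return [
        {
            "index": i,
            "name": f"up_block_{i}",  # hypothetical naming scheme
            "b": b,
            "s": s,
            "enabled": (i + 1) in enabled_layers,
        }
        for i in range(LAYER_COUNT)
    ]


combos = {"L1": {1}, "L1+L2": {1, 2}, "L3+L4": {3, 4}}
for label, layers in combos.items():
    print(label, json.dumps(layer_configs(layers)))
```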

--freeu-mode constant applies fixed backbone scaling plus skip Fourier filtering.
--freeu-mode spatial uses the paper-style normalized feature map for backbone scaling.
Our implementation reimplements FreeU directly instead of using diffusers' enable_freeu().
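The difference between the two backbone-scaling modes can be illustrated on a tiny feature map. This sketch follows the publicly released FreeU reference code as we understand it (scale the first half of the backbone channels, either by a constant b or by a factor derived from the min-max normalized channel-mean map); it is not taken from this repository, it uses nested lists instead of tensors, and it leaves out the Fourier filtering of skip features.

```python
def normalized_mean_map(features):
    """Channel-mean map of a C x H x W nested list, min-max normalized to [0, 1]."""
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    mean_map = [
        [sum(ch[y][x] for ch in features) / channels for x in range(width)]
        for y in range(height)
    ]
    flat = [v for row in mean_map for v in row]
    low, high = min(flat), max(flat)
    span = (high - low) or 1.0  # avoid division by zero on a flat map
    return [[(v - low) / span for v in row] for row in mean_map]


def scale_backbone(features, b, mode="constant"):
    """Scale the first half of the channels; the remaining channels are copied unchanged."""
    half = len(features) // 2
    if mode == "spatial":
        norm = normalized_mean_map(features)
        factor = [[(b - 1.0) * v + 1.0 for v in row] for row in norm]
    else:
        factor = [[b for _ in row] for row in features[0]]
    scaled = []
    for c, channel in enumerate(features):
        if c < half:
            scaled.append(
                [[v * f for v, f in zip(row, frow)] for row, frow in zip(channel, factor)]
            )
        else:
            scaled.append([row[:] for row in channel])
    return scaled


features = [[[1.0, 1.0]], [[0.0, 2.0]]]  # 2 channels, 1 x 2 spatial grid
print(scale_backbone(features, 1.5, mode="constant"))  # [[[1.5, 1.5]], [[0.0, 2.0]]]
print(scale_backbone(features, 1.5, mode="spatial"))   # [[[1.0, 1.5]], [[0.0, 2.0]]]
```

In spatial mode the scaling factor ranges from 1 at the weakest location of the mean map to b at the strongest, so amplification concentrates where the backbone activation is already high.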

Adaptive FreeU Pipeline

1. Build datasets — generates SD1.5 baselines, FreeU candidates, and ImageReward labels:

uv run freeu dataset build --config src/configs/adaptive/freeu_constant_sd15.json
uv run freeu dataset build --config src/configs/adaptive/freeu_spatial_sd15.json --output-dir outputs/datasets/freeu_spatial_sd15

# smoke test
uv run freeu dataset build --dry-run --max-prompts 2 --candidate-count 2

Optionally upload a dataset to Hugging Face (use upload-large-folder to preserve the dataset name in the repo):

uv run hf upload-large-folder gainsborouo/NYCU_DL-Final-Project outputs/datasets --repo-type dataset --include "freeu_constant_sd15/**"

2. Train a policy — three interchangeable objectives, all using a frozen-CLIP + policy network and the same cached-feature pipeline (the first run builds a CLIP feature cache under the dataset directory):

# supervised b/s predictor (per-layer b/s for L1-L4, no enable flags)
uv run freeu train predictor --dataset outputs/datasets/freeu_constant_sd15 \
  --output-dir outputs/checkpoints/predictor_freeu_constant_sd15 --epochs 20 --batch-size 4

# reward-weighted Gaussian policy: p(b/s | prompt, SD baseline) via softmax(reward_delta / tau) weighting
uv run freeu train policy --dataset outputs/datasets/freeu_constant_sd15 \
  --output-dir outputs/checkpoints/freeu_constant_sd15_policy --epochs 20 --batch-size 16 --reward-temperature 0.5

# reward-weighted mixture-density policy (same pipeline, mixture of Gaussians)
uv run freeu train mdn-policy --dataset outputs/datasets/freeu_constant_sd15 \
  --output-dir outputs/checkpoints/freeu_constant_sd15_mdn_policy --epochs 20 --batch-size 16 \
  --mixture-count 4 --reward-temperature 0.5
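For the mixture-density variant, the per-sample likelihood becomes a weighted sum of Gaussians. The sketch below shows one common way to compute that negative log-likelihood stably with log-sum-exp; the function names and the diagonal-covariance parameterization are assumptions, and the reward weighting would be applied on top in the same way as for the single-Gaussian policy.

```python
import math


def log_sum_exp(values):
    """Stable log(sum(exp(v))) over a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def diag_gaussian_log_prob(x, mean, log_var):
    """Log-density of x under a diagonal Gaussian."""
    return sum(
        -0.5 * (lv + (xi - mu) ** 2 / math.exp(lv) + math.log(2 * math.pi))
        for xi, mu, lv in zip(x, mean, log_var)
    )


def mdn_nll(x, logits, means, log_vars):
    """Negative log-likelihood of x under a mixture with softmax(logits) weights."""
    log_norm = log_sum_exp(logits)
    component_terms = [
        (logit - log_norm) + diag_gaussian_log_prob(x, mean, lv)
        for logit, mean, lv in zip(logits, means, log_vars)
    ]
    return -log_sum_exp(component_terms)


# Toy example: two components over a 2-D parameter vector.
nll = mdn_nll(
    x=[1.3, 0.8],
    logits=[0.0, 0.0],
    means=[[1.2, 0.9], [1.6, 0.5]],
    log_vars=[[-2.0, -2.0], [-2.0, -2.0]],
)
print(round(nll, 4))
```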

3. Benchmark on MJHQ-30K: generates SD, SD+default FreeU, and SD+adaptive FreeU images per prompt and scores them with ImageReward. Add --compute-fid to also compute FID against MJHQ real images; FID is a set-level metric, so it needs a realistic sample size, not a smoke run.

# evaluate one or more checkpoints (predicted parameters used directly at inference)
uv run freeu benchmark mjhq --checkpoint outputs/checkpoints/predictor_freeu_constant_sd15 --checkpoint-name v1 \
  --checkpoint outputs/checkpoints/freeu_constant_sd15_policy --checkpoint-name v2 \
  --output-dir outputs/benchmarks/mjhq_constant_sd15_compare --max-samples 100 --split test --compute-fid

# compare fixed FreeU baselines without any checkpoint, with category-balanced prompt sampling
uv run freeu benchmark mjhq --output-dir outputs/benchmarks/mjhq_freeu_modes \
  --freeu-baseline-mode spatial --freeu-baseline-mode backbone_fourier --freeu-baseline-mode wavelet \
  --max-samples 100 --prompt-sample-strategy category_uniform --prompt-sample-seed 42

--freeu-baseline-mode selects the baseline preset and can be repeated to compare several presets (backbone_fourier and wavelet use mode-specific defaults). Results are written to mjhq_results.csv, summary.json, and benchmark_comparison.svg.
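Category-balanced prompt sampling, as requested by `--prompt-sample-strategy category_uniform`, could work roughly as follows. This is a hypothetical sketch of one reasonable scheme (seeded shuffle per category, then round-robin picks); the repository's actual sampler may differ, and the record layout is assumed.

```python
import random
from collections import defaultdict


def sample_category_uniform(records, max_samples, seed):
    """Pick up to max_samples (prompt, category) pairs, balanced across categories."""
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for prompt, category in records:
        by_category[category].append(prompt)

    # Shuffle each category independently, in a fixed category order for reproducibility.
    pools = {}
    for category in sorted(by_category):
        pool = by_category[category][:]
        rng.shuffle(pool)
        pools[category] = pool

    picked = []
    # Round-robin over categories so each contributes roughly equally.
    while len(picked) < max_samples and any(pools.values()):
        for category in pools:
            if pools[category] and len(picked) < max_samples:
                picked.append((pools[category].pop(), category))
    return picked


records = [(f"prompt {c}{i}", c) for c in ("animals", "art", "food") for i in range(10)]
subset = sample_category_uniform(records, max_samples=6, seed=42)
print(sorted(category for _, category in subset))
# ['animals', 'animals', 'art', 'art', 'food', 'food']
```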

4. Predict & compare — predict b/s from a prompt + SD baseline image and render a default-vs-adaptive comparison:

uv run freeu predict --checkpoint outputs/checkpoints/predictor_freeu_constant_sd15 \
  --prompt "A red fox sitting in fresh snow, frosted pine forest background, close-up portrait, soft morning light, realistic wildlife photography, sharp focus" \
  --image outputs/datasets/freeu_constant_sd15/images/sample_00000_seed_42/baseline_sd.png --seed 42
