Major alignment safety

NoiseTilt: Noise-Tilted Reverse Kernels for Diffusion Reward Alignment

Jisung Hwang, Yunhong Min, Jaihoon Kim, I-Chao Shen, Minhyuk Sung

Published
Jun 16, 2026 — 15:38 UTC

Problem
The paper addresses the limitations of existing reward-guided sampling methods in diffusion models, which either degrade sample quality due to gradient-based guidance or lack gradient signals in search-based methods. The authors highlight that no prior method effectively balances the need for maintaining sample quality while incorporating reward signals. This work is a preprint and has not undergone peer review.

Method
The core contribution is the Noise-Tilted Reverse Kernel (NTRK), which modifies the noise term in the diffusion process to incorporate reward gradients while keeping the pretrained reverse kernel unchanged. This is achieved through a novel whitening operator that allows for the safe injection of reward gradients as noise, thus preserving the guiding signal without shifting the reverse mean. The method requires only a single sample per step, significantly improving computational efficiency. The authors detail the architecture and training process, although specific training compute resources are not disclosed.

Results
NTRK demonstrates superior performance across various reward alignment tasks compared to state-of-the-art baselines. Notably, in aesthetic generation tasks, NTRK achieves a reward score that surpasses the best baseline using only 25 Noise Function Evaluations (NFEs), compared to 500 NFEs for the baseline, indicating a 20× reduction in computational cost. The paper provides quantitative results that highlight the effectiveness of NTRK in maintaining sample quality while enhancing reward alignment.

Limitations
The authors acknowledge that while NTRK improves upon existing methods, it may still be sensitive to the choice of reward function and the specific characteristics of the diffusion model used. Additionally, the reliance on a single sample per step may introduce variability in certain contexts, which is not extensively explored in the paper. The authors do not discuss potential scalability issues or the generalizability of the method across different types of diffusion models.

Why it matters
The introduction of NTRK has significant implications for the field of generative modeling, particularly in enhancing the efficiency and quality of reward-guided sampling in diffusion models. By effectively integrating reward signals without compromising sample fidelity, this method opens avenues for more versatile applications in areas such as image synthesis and reinforcement learning. The findings contribute to the ongoing discourse on improving generative models and are relevant for researchers focusing on diffusion processes and reward alignment strategies, as published in arXiv.

Turing Wire

By Callan Zhang · Jun 16, 2026 · Editorial standards →

Summarised from the primary source with AI assistance under human editorial oversight. Turing Wire is not a primary source — read the original for the authoritative account.

Source: arXiv cs.LG