Why compute loss over `torch.randn_like()` rather than Gaussian noise multiplied with beta for each timestep? #405

Foundsheep · 2024-09-27T06:45:34Z

Hi there, my question is simple.

stablediffusion/ldm/models/diffusion/ddpm.py

Line 380 in cf1d67a

def p_losses(self, x_start, t, noise=None):

In the code linked above, why is the MSE loss (or whichever loss is configured) computed against pure Gaussian noise, which matches the noise of x_T, rather than against Gaussian noise multiplied by beta_t at the given timestep?

I think the MSE, L1, or L2 loss should be computed between a target and a prediction that are supposed to be the same quantity.
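
For reference, here is a minimal sketch of the standard DDPM noise-prediction setup that the question is about. It is not the code from ddpm.py: it uses plain Python lists instead of tensors, the linear beta schedule values are assumed, and `make_alphas_cumprod` and `eps_loss` are hypothetical helper names. `q_sample` only mirrors the role of the method of the same name in ddpm.py, and `random.gauss` stands in for `torch.randn_like()`.

```python
import math
import random


def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed values)."""
    alphas_cumprod = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alphas_cumprod.append(running)
    return alphas_cumprod


def q_sample(x_start, t, noise, alphas_cumprod):
    """Forward process: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    a = math.sqrt(alphas_cumprod[t])
    b = math.sqrt(1.0 - alphas_cumprod[t])
    return [a * x + b * e for x, e in zip(x_start, noise)]


def eps_loss(pred_noise, noise):
    """MSE between the predicted noise and the unit Gaussian eps that was drawn."""
    return sum((p - e) ** 2 for p, e in zip(pred_noise, noise)) / len(noise)


random.seed(0)
alphas_cumprod = make_alphas_cumprod()
x_start = [0.5, -1.0, 0.25, 2.0]

# Unit Gaussian noise, playing the role of torch.randn_like(x_start).
noise = [random.gauss(0.0, 1.0) for _ in x_start]

t = 500
x_noisy = q_sample(x_start, t, noise, alphas_cumprod)

# Stand-in for a perfect model: recover eps by inverting q_sample.
a = math.sqrt(alphas_cumprod[t])
b = math.sqrt(1.0 - alphas_cumprod[t])
recovered = [(xn - a * x0) / b for xn, x0 in zip(x_noisy, x_start)]

loss = eps_loss(recovered, noise)
```

In this sketch the timestep-dependent scale enters only through `q_sample`, while the regression target stays the unscaled `noise`. That is the arrangement the question is asking about.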

Am I missing something or should it be modified?

Actually, I can see that the current logic works and that a model can be trained with it, but mathematically I think my reasoning is right.

The same pattern also appears in Hugging Face's official tutorial.
