This section is important enough that I want to give detailed feedback on it.
I would like to propose a major revision to Section 3.2 (“Seeking Low-Dimensional Distributions through Denoising”) to improve its clarity and pedagogical flow.
The current version contains 98 numbered equations (from 3.2.1 to 3.2.98), which makes the core ideas hard to follow, especially for beginners. The numbering is also confusing—there are multiple “3.2” headings at different levels, and the logical thread is obscured.
My suggestion is to restructure the section as follows:
- Move many of the detailed derivations and technical lemmas to an appendix. The sheer number of equations (98) is overwhelming; only the most essential ones should remain in the main text.
- Fix the numbering scheme. Currently, there are several instances of duplicate “3.2” headings (e.g., both the section title and some subsections are numbered 3.2). A clearer hierarchical numbering (e.g., 3.2.1, 3.2.2, …) should be used consistently.
- Make the core logical flow explicit. The essential chain of ideas is:
  - Step 1: Core diffusion equation (e.g., Eq. (3.2.1) / (3.2.85))
  - Step 2: Tweedie’s formula (Eq. (3.2.23) / (3.2.86))
  - Step 3: DDIM / iterative denoising update (Eq. (3.2.82) / (3.2.87))
  - Step 4: Training objective (Eq. (3.2.90))

  This sequence could be highlighted in a box or a diagram, and the surrounding text should explicitly guide the reader along this path.
- Include a complete, worked-out example that runs through the entire process. For instance, a simple Gaussian mixture or a toy image dataset could be used to illustrate the forward diffusion, the learned denoiser, and the reverse sampling steps. This would greatly help readers connect the abstract equations to a concrete implementation.
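To make the last suggestion concrete, here is a minimal sketch of the kind of worked example I have in mind. It is only an illustration under my own assumptions, not the book's code: I assume the noise model x_t = x_0 + t * eps (which may not match the notation of Eq. (3.2.1) exactly), a 1-D two-component Gaussian mixture as the toy data, and hypothetical function names (`log_marginal`, `denoiser`, `sample`). Because the data are a Gaussian mixture, the optimal denoiser E[x_0 | x_t] is available in closed form, so it stands in here for a learned network; a trained model would instead be fit by regressing x_0 from x_t with a squared-error loss.

```python
import numpy as np

# Toy data distribution (assumed for illustration): a 1-D mixture of two
# narrow Gaussians centred at -1 and +1.
MU = np.array([-1.0, 1.0])   # component means
W = np.array([0.5, 0.5])     # component weights
S = 0.05                     # standard deviation of each component


def log_marginal(x_t, t):
    """log p_t(x_t) under the assumed forward process x_t = x_0 + t * eps."""
    var = S**2 + t**2  # each component is widened by the added noise
    logs = (np.log(W) - 0.5 * np.log(2 * np.pi * var)
            - 0.5 * (x_t[:, None] - MU) ** 2 / var)
    m = logs.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return (m + np.log(np.exp(logs - m).sum(axis=1, keepdims=True)))[:, 0]


def denoiser(x_t, t):
    """Closed-form posterior mean E[x_0 | x_t] for the mixture above."""
    var = S**2 + t**2
    logits = np.log(W) - 0.5 * (x_t[:, None] - MU) ** 2 / var
    logits -= logits.max(axis=1, keepdims=True)
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)  # posterior component weights
    comp_mean = MU + (S**2 / var) * (x_t[:, None] - MU)  # per-component mean
    return (resp * comp_mean).sum(axis=1)


def sample(n, t_max=5.0, t_min=1e-3, steps=50, seed=0):
    """Deterministic (DDIM-style) iterative denoising from t_max to t_min."""
    rng = np.random.default_rng(seed)
    ts = np.geomspace(t_max, t_min, steps + 1)  # decreasing noise levels
    # Approximation: at large t_max the marginal is close to N(0, t_max^2).
    x = t_max * rng.standard_normal(n)
    for t, s in zip(ts[:-1], ts[1:]):
        x_hat = denoiser(x, t)                   # estimate of the clean point
        x = (s / t) * x + (1.0 - s / t) * x_hat  # move part-way toward it
    return x


if __name__ == "__main__":
    # Tweedie check: x_t + t^2 * d/dx log p_t(x_t) should match the denoiser.
    x, t, h = np.linspace(-2.0, 2.0, 9), 0.5, 1e-5
    score = (log_marginal(x + h, t) - log_marginal(x - h, t)) / (2 * h)
    print("max Tweedie gap:", np.max(np.abs(x + t**2 * score - denoiser(x, t))))

    xs = sample(2000)
    print("fraction near a mode:", np.mean(np.abs(np.abs(xs) - 1.0) < 0.3))
```

An example along these lines would let the text point at one concrete object for each step of the chain: the forward noising in `log_marginal`, Tweedie's formula in the numerical check, the iterative update in `sample`, and the training objective as the regression that `denoiser` solves exactly in this toy case.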
I believe these changes would make Section 3.2 much more accessible and would better serve the book’s goal of providing a principled yet practical introduction to representation learning.