DiffusionBlocks: Training neural networks in independent blocks
RT by @jeremyphoward: For over a decade, we’ve accepted that end-to-end backprop is the only way to train deep networks. But holding the entire network in memory all at once is why AI training is hitting a resource wall.
We found a new way to break the network into blocks and train them independently. The trick? Treating the network’s forward pass like a diffusion model denoising a signal.
This reinterpretation slashes the memory needed to train deep models. In our #ICLR2026 paper (https://arxiv.org/abs/2506.14202), we matched end-to-end performance across ViTs, DiTs, and LLMs. We did this while training just one isolated block at a time.
- A new method trains a neural network one block at a time, each block independently, instead of backpropagating through the whole network
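
To make the idea in the post concrete, here is a toy sketch of block-wise training framed as denoising. It is not the paper's implementation. It assumes that each block is given its own contiguous range of noise levels and learns to recover a clean target from a noisy version of it, with no gradient flowing between blocks. The names `split_noise_levels`, `ScalarBlock`, and `train_block` are hypothetical. The equal-width split and the one-parameter "block" are simplifications chosen so the example runs with the standard library alone.

```python
import random


def split_noise_levels(sigma_min, sigma_max, num_blocks):
    """Split [sigma_min, sigma_max] into equal-width ranges, one per block.

    Equal-width splitting is an assumption made for this sketch.
    """
    width = (sigma_max - sigma_min) / num_blocks
    return [
        (sigma_min + i * width, sigma_min + (i + 1) * width)
        for i in range(num_blocks)
    ]


class ScalarBlock:
    """Toy stand-in for a network block: one learnable gain w.

    The block predicts clean = w * noisy and is trained with squared error.
    """

    def __init__(self):
        self.w = 0.0
        self.updates = 0

    def step(self, noisy, clean, lr):
        # Gradient of (w * noisy - clean)^2 with respect to w.
        grad = 2.0 * (self.w * noisy - clean) * noisy
        self.w -= lr * grad
        self.updates += 1


def train_block(block, sigma_range, data, steps, lr, rng):
    """Train a single block in isolation on its own noise range."""
    lo, hi = sigma_range
    for _ in range(steps):
        clean = rng.choice(data)
        sigma = rng.uniform(lo, hi)
        noisy = clean + sigma * rng.gauss(0.0, 1.0)
        # Only this block's parameter is read or updated here.
        block.step(noisy, clean, lr)


rng = random.Random(0)
data = [-1.0, 1.0]  # toy "clean" targets
ranges = split_noise_levels(0.1, 2.0, num_blocks=4)
blocks = [ScalarBlock() for _ in ranges]

# Blocks are trained one after another; none needs the others in memory.
for block, sigma_range in zip(blocks, ranges):
    train_block(block, sigma_range, data, steps=2000, lr=0.01, rng=rng)

for (lo, hi), block in zip(ranges, blocks):
    print(f"sigma in [{lo:.3f}, {hi:.3f}]  learned gain w = {block.w:.3f}")
```

In this toy setting the block responsible for the lowest noise levels ends up with a larger gain than the block responsible for the highest ones, since a cleaner input can be trusted more. The point of the sketch is only the training structure: each loop iteration touches a single block's parameters, so memory for gradients and optimizer state scales with one block rather than the full stack.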