Gated DeltaNet-2: Improving Linear Attention Performance with a Decoupled Mechanism
Gated DeltaNet has been one of my favorite "hybrid attention" newcomers in the good old transformer stack.
Excited to see Gated DeltaNet-2. Adding it to my reading stack. In the meantime, I have a primer on Gated DeltaNet here: https://magazine.sebastianraschka.com/i/177848019/26-gated-deltanet
- Gated DeltaNet-2 is a new hybrid attention architecture that separates the erase and write mechanisms instead of sharing a single gate.
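
To make the erase/write split concrete, here is a minimal sketch of one recurrent state update. It is not the Gated DeltaNet-2 formulation, which this summary does not spell out. It assumes the Gated DeltaNet delta rule, where a single strength `beta` both erases the old value under a key and writes the new one, and simply gives each role its own scalar. The names `decoupled_delta_step`, `beta_erase`, and `beta_write` are hypothetical.

```python
import numpy as np

def decoupled_delta_step(S, k, v, alpha, beta_erase, beta_write):
    """One state update with separate erase and write strengths (illustrative only).

    S: (d_v, d_k) memory state; k: (d_k,) unit-norm key; v: (d_v,) value.
    alpha: scalar decay gate in [0, 1].
    beta_erase, beta_write: scalars in [0, 1]; assumed names, not from the source.
    """
    # Erase: remove a fraction of whatever is currently stored under key k.
    # This equals S @ (I - beta_erase * k k^T).
    erased = S - beta_erase * np.outer(S @ k, k)
    # Decay the remaining state, then write the new value under key k.
    return alpha * erased + beta_write * np.outer(v, k)

# Tiny example: a 2x3 state, one unit-norm key, one value.
S = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
k = np.array([1.0, 0.0, 0.0])
v = np.array([10.0, 20.0])

# Shared gate (both strengths equal): the Gated DeltaNet-style update.
shared = decoupled_delta_step(S, k, v, alpha=1.0, beta_erase=1.0, beta_write=1.0)

# Decoupled: fully erase the slot for k without writing anything new.
erase_only = decoupled_delta_step(S, k, v, alpha=1.0, beta_erase=1.0, beta_write=0.0)
```

Setting `beta_erase == beta_write` gives back the shared-gate behaviour. Letting them differ allows forgetting without writing, or writing without fully overwriting, which a single gate cannot express.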