MAP
MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining[1]
The authors are Yunze Liu and Li Yi, from Tsinghua IIIS (Institute for Interdisciplinary Information Sciences), Shanghai AI Lab, and the Qi Zhi Institute. Citation [1]: Liu, Yunze and Li Yi. "MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining." ArXiv abs/2410.00871 (2024).
Time
- 2025.Mar
Key Words
- Masked Autoregressive Pretraining
Summary
- Hybrid Mamba-Transformer networks have recently attracted considerable attention: they combine the scalability of Transformers with Mamba's long-context modeling and efficient computation. However, how to pretrain such hybrid networks effectively remains an open question. Existing methods such as MAE or autoregressive pretraining target single-type architectures, whereas a hybrid Mamba-Transformer backbone needs a pretraining strategy that benefits both kinds of modules. Based on this, the authors propose Masked Autoregressive Pretraining (MAP), a unified paradigm that improves the performance of both the Mamba and Transformer modules; a rough sketch of the idea is given after the figures below.
\(Fig.1^{[1]}\)
\(Fig.2^{[1]}\)
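
To make the pretraining idea concrete, here is a minimal PyTorch sketch of masked-autoregressive-style pretraining for a hybrid backbone: patches are randomly masked, a hybrid encoder processes the full raster-order sequence, and a causal decoder reconstructs patch pixels, with loss computed only on the masked positions. All names (`HybridEncoder`, `MAPPretrainer`, `mask_ratio`, etc.) are illustrative rather than taken from the paper, and the Mamba block is stood in by an `nn.GRU` since a real Mamba/SSM implementation is outside the scope of this note; the paper's actual masking and decoding scheme may differ in detail.

```python
# Minimal sketch only: illustrative names, GRU as a stand-in for Mamba blocks.
import torch
import torch.nn as nn


class HybridEncoder(nn.Module):
    """Alternates a recurrent block (Mamba stand-in) with a self-attention block."""

    def __init__(self, dim=256, depth=4, heads=8):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(depth):
            if i % 2 == 0:
                self.layers.append(nn.GRU(dim, dim, batch_first=True))  # Mamba stand-in
            else:
                self.layers.append(
                    nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
                )

    def forward(self, x):
        for layer in self.layers:
            if isinstance(layer, nn.GRU):
                x, _ = layer(x)
            else:
                x = layer(x)
        return x


class MAPPretrainer(nn.Module):
    """Masked autoregressive pretraining sketch: mask patches, encode the full
    sequence, then reconstruct patch pixels with a causal decoder."""

    def __init__(self, img_size=224, patch=16, dim=256, mask_ratio=0.75):
        super().__init__()
        self.patch = patch
        self.num_patches = (img_size // patch) ** 2
        self.mask_ratio = mask_ratio
        self.patch_embed = nn.Conv2d(3, dim, patch, patch)
        self.pos = nn.Parameter(torch.zeros(1, self.num_patches, dim))
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.encoder = HybridEncoder(dim)
        dec_layer = nn.TransformerDecoderLayer(dim, 8, dim * 4, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        self.head = nn.Linear(dim, 3 * patch * patch)

    def forward(self, imgs):
        B, N = imgs.shape[0], self.num_patches
        # Patchify + embed, keeping raster order so both block types see one sequence.
        x = self.patch_embed(imgs).flatten(2).transpose(1, 2) + self.pos  # (B, N, D)
        # Randomly mask patches by swapping in a learnable mask token.
        mask = torch.rand(B, N, device=imgs.device) < self.mask_ratio
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        memory = self.encoder(x)
        # Causal (autoregressive) decoding over the patch sequence.
        causal = torch.triu(
            torch.full((N, N), float("-inf"), device=imgs.device), diagonal=1
        )
        pred = self.head(
            self.decoder(self.pos.expand(B, -1, -1), memory, tgt_mask=causal)
        )
        # Pixel targets per patch; loss only on masked positions (MAE-style).
        tgt = imgs.unfold(2, self.patch, self.patch).unfold(3, self.patch, self.patch)
        tgt = tgt.permute(0, 2, 3, 1, 4, 5).reshape(B, N, -1)
        loss = ((pred - tgt) ** 2).mean(-1)
        return (loss * mask).sum() / mask.sum().clamp(min=1)


model = MAPPretrainer()
loss = model(torch.randn(2, 3, 224, 224))
loss.backward()
```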