Learning Transferable Visual Models From Natural Language Supervision[1]
The authors are from OpenAI: Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever.
Citation [1]: Radford, Alec et al. "Learning Transferable Visual Models From Natural Language Supervision." International Conference on Machine Learning (2021).
Time
- 2021
Key Words
- image-text pairs
- CLIP: Contrastive Language-Image Pre-training
- Learning from natural language supervision
- perform a wide set of tasks during pre-training, including OCR, geo-localization, action recognition, and more
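The contrastive pre-training named in the keywords can be sketched as a symmetric cross-entropy over cosine-similarity logits between matched image-text pairs, where every other pair in the batch acts as a negative. Below is a minimal NumPy sketch under simplifying assumptions: a fixed temperature (CLIP actually learns its temperature), pre-computed embeddings, and a hypothetical helper name `clip_contrastive_loss`.

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss: row i of image_emb pairs with row i of text_emb."""
    # Normalize embeddings so dot products are cosine similarities.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity logits, scaled by temperature.
    logits = image_emb @ text_emb.T / temperature
    n = logits.shape[0]

    def cross_entropy(l):
        # Log-softmax with max-subtraction for numerical stability.
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        # The correct class for row i is column i (its matched pair).
        return -log_probs[np.arange(n), np.arange(n)].mean()

    # Average over both directions: text-retrieval given an image,
    # and image-retrieval given a text.
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

With perfectly matched embeddings the diagonal dominates and the loss approaches zero; with indistinguishable embeddings it approaches log(batch size), which is why larger batches provide harder negatives.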