MixFormer: End-to-End Tracking with Iterative Mixed Attention[1]

The authors are Yutao Cui, Cheng Jiang, Limin Wang, and Gangshan Wu from Nanjing University. Citation [1]: Cui, Yutao et al. “MixFormer: End-to-End Tracking with Iterative Mixed Attention.” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022): 13598-13608.

Time

  • 2022.Mar

Key Words

  • compact tracking framework
  • unify the feature extraction and target integration solely with a transformer-based architecture

Differences among VOT, MOT, and SOT

  1. In VOT the target is annotated in the first frame; MOT is multi-object tracking; SOT is single-object tracking.

Motivation

  1. Trackers often use a multi-stage pipeline: feature extraction, target information integration, and bounding box estimation. To simplify this pipeline, the authors propose a compact tracking framework named MixFormer.

  2. Here "target information integration" means fusing the target and search region information.
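The unified idea can be sketched as joint attention over concatenated template and search tokens, so extraction and integration happen in one operation. This is an illustrative simplification, not the paper's Mixed Attention Module (which additionally uses asymmetric attention):

```python
import numpy as np

def mixed_attention(target, search):
    """Toy sketch of mixed attention: template and search-region tokens
    are concatenated and attended jointly, so feature extraction and
    target-search integration happen in one attention operation instead
    of separate pipeline stages."""
    x = np.concatenate([target, search], axis=0)   # (Nt + Ns, d)
    scores = x @ x.T / np.sqrt(x.shape[1])         # joint attention scores
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)             # row-wise softmax
    return w @ x                                   # mixed tokens

rng = np.random.default_rng(0)
out = mixed_attention(rng.normal(size=(2, 4)), rng.normal(size=(3, 4)))
```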


Installing the Nvidia RTX 3060 GPU Driver on Ubuntu 20.04

  1. I tried this once before following an online tutorial, doing everything from the command line. It ended in a black screen; after finally fixing that and reaching the desktop, Wi-Fi and Bluetooth no longer worked and a lot seemed to be missing, which was maddening. These days I have been running tracking code, mostly under Ubuntu, so I took the chance to reinstall the system and look for a new tutorial.

  2. I found two approaches:

    • In Ubuntu's built-in Software & Updates, under the Additional Drivers tab, the Nvidia GPU drivers are listed; just tick a suitable one. Very simple, no bugs at all, unlike my long struggle last time.
    • Download the driver from the Nvidia website; the file is usually named Nvidia-Linux-xxx.run. Before running it, disable nouveau by adding `blacklist nouveau` and `options nouveau modeset=0` to a modprobe config file, then reboot and check with lsmod that nouveau is no longer loaded. Then run the .run file; while it runs, it even suggests installing via Ubuntu's Additional Drivers instead.
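The two nouveau lines above go into a modprobe configuration file; a minimal sketch, assuming the conventional (but arbitrary) file name /etc/modprobe.d/blacklist-nouveau.conf:

```shell
# /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
```

After saving the file, run `sudo update-initramfs -u`, reboot, and confirm that `lsmod | grep nouveau` prints nothing before executing the .run installer.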

DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks[1]

The authors are Qiangqiang Wu, Tianyu Yang, Ziquan Liu, Baoyuan Wu, Ying Shan, and Antoni B. Chan from CityU, IDEA, Tencent AI Lab, and CUHK (SZ). Citation [1]: Wu, Qiangqiang et al. “DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks.” 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023): 14561-14571.

Time

  • 2023.Apr

Key Words

  • masked autoencoder
  • temporal matching-based
  • spatial-attention dropout

Motivation

  1. The goal is to apply MAE pre-training to downstream tasks such as visual object tracking (VOT) and video object segmentation (VOS). A straightforward extension of MAE is to mask out frame patches in videos and reconstruct the frame pixels. However, the authors found that this relies heavily on spatial cues while ignoring temporal relations during frame reconstruction, leading to sub-optimal temporal matching representations for VOT and VOS.
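The spatial-attention dropout idea can be sketched as randomly suppressing within-frame attention weights so that queries are pushed toward cross-frame (temporal) tokens. This is an illustrative sketch under assumed shapes, not the paper's implementation; `same_frame_mask` and `p` are hypothetical parameters:

```python
import numpy as np

def spatial_attention_dropout(attn, same_frame_mask, p=0.1, rng=None):
    """Zero out a random fraction p of the within-frame (spatial)
    attention weights and renormalise each row, encouraging queries
    to attend to tokens in the other frame (temporal matching).
    attn: (queries, keys) attention weights; same_frame_mask: boolean
    array marking key positions that lie in the query's own frame."""
    if rng is None:
        rng = np.random.default_rng()
    drop = (rng.random(attn.shape) < p) & same_frame_mask
    out = np.where(drop, 0.0, attn)
    return out / out.sum(axis=-1, keepdims=True)  # rows sum to 1 again

attn = np.full((2, 4), 0.25)                      # uniform attention
mask = np.array([[True, True, False, False]] * 2) # first two keys are spatial
out = spatial_attention_dropout(attn, mask, p=1.0, rng=np.random.default_rng(0))
```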

Fully Convolutional Networks for Semantic Segmentation[1]

The authors are Jonathan Long, Evan Shelhamer, and Trevor Darrell from UC Berkeley. Citation [1]: Shelhamer, Evan et al. “Fully convolutional networks for semantic segmentation.” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015): 3431-3440.

Time

  • 2014.Nov

Key Words

  • fully convolutional network

Motivation

  1. The goal is to build a fully convolutional network that takes input of arbitrary size and produces correspondingly-sized output, with efficient inference and learning.
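The arbitrary-size property comes from using only convolutions, whose output spatial extent follows the input instead of being fixed by a fully connected layer. A naive single-filter sketch (illustration only, not the paper's network):

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive 2D 'valid' convolution: x is (H, W), w is (kH, kW)."""
    kH, kW = w.shape
    H, W = x.shape
    out = np.empty((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kH, j:j + kW] * w)
    return out

# The same 3x3 filter works on any input size; the output is a map
# whose size scales with the input, unlike a fixed-size fc layer.
w = np.random.default_rng(0).normal(size=(3, 3))
small = conv2d_valid(np.ones((8, 8)), w)
large = conv2d_valid(np.ones((16, 20)), w)
```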

AI-related Resources

Open Courses

  1. CS231n, CS25
  2. Hung-yi Lee's courses (NTU)

InternLM LLM Open-Source Community

  1. Link: https://aicarrier.feishu.cn/wiki/RPyhwV7GxiSyv7k1M5Mc9nrRnbd (a Feishu document, quite comprehensive)

CS Self-Study

  1. csdiy
  2. CS major learning roadmap: https://hackway.org/docs/cs/intro

Personal Blogs

  1. Su Jianlin's blog: https://spaces.ac.cn/
  2. https://lilianweng.github.io/

Tool Sites

  1. AI Paper Collector
  2. Papers with Code
  3. HuggingFace docs
  4. AI Conference Deadlines: https://aideadlin.es/?sub=ML,CV,CG,NLP,RO,SP,DM,AP,KR,HCI
  5. wandb for deep-learning experiment management

LaTeX Templates for IEEE Papers

  1. They can be found here:
    • https://journals.ieeeauthorcenter.ieee.org/create-your-ieee-journal-article/authoring-tools-and-templates/tools-for-ieee-authors/ieee-article-templates/

Reference link:

季恩比特's Weibo

FCOS: Fully Convolutional One-Stage Object Detection[1]

The authors are Zhi Tian, Chunhua Shen, Hao Chen, and Tong He from the University of Adelaide, Australia. Citation [1]: Tian, Zhi et al. “FCOS: Fully Convolutional One-Stage Object Detection.” 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019): 9626-9635.

Time

  • 2019.Apr

Key Words

  • one-stage
  • FCN
  • per-pixel prediction fashion

Motivation

  1. Anchor-based detectors have several drawbacks: they are sensitive to hyper-parameters such as aspect ratios; they are computationally heavy; and they struggle with objects that have large shape variations.
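The anchor-free alternative can be sketched with per-pixel regression targets: each location inside a box predicts its distances (l, t, r, b) to the four sides, and a centre-ness score down-weights locations far from the centre. The two definitions below follow the paper, in a minimal scalar form:

```python
def fcos_target(loc, box):
    """Per-pixel regression target: location (x, y) inside box
    (x0, y0, x1, y1) predicts its distances to the four sides."""
    x, y = loc
    x0, y0, x1, y1 = box
    return (x - x0, y - y0, x1 - x, y1 - y)

def centerness(t):
    """centre-ness = sqrt(min(l,r)/max(l,r) * min(t,b)/max(t,b));
    1.0 at the box centre, decaying toward the edges."""
    l, tt, r, b = t
    return ((min(l, r) / max(l, r)) * (min(tt, b) / max(tt, b))) ** 0.5

t = fcos_target((5, 5), (0, 0, 10, 10))  # centre of a 10x10 box
```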

CornerNet: Detecting Objects as Paired Keypoints[1]

The authors are Hei Law and Jia Deng from Princeton. Citation [1]: Law, Hei and Jia Deng. “CornerNet: Detecting Objects as Paired Keypoints.” International Journal of Computer Vision 128 (2018): 642-656.

Time

  • 2018.Aug

Key Words

Motivation

Summary

Objects as Points[1]

The authors are Xingyi Zhou, Dequan Wang, and Philipp Krähenbühl from UT Austin and UC Berkeley. Citation [1]: Zhou, Xingyi et al. “Objects as Points.” ArXiv abs/1904.07850 (2019): n. pag.

Time

  • 2019.Apr

Key Words

  • model object as a single point -- center point of its bounding box
  • keypoint estimation

Motivation

  1. Most object detectors enumerate a near-exhaustive list of potential object locations and classify each one, which is wasteful, inefficient, and requires heavy post-processing.
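Modeling an object as the centre point of its bounding box reduces detection to peak picking on a keypoint heatmap; in the paper a 3x3 max-pooling replaces NMS, which the naive loop below imitates (an illustration, not the CenterNet code):

```python
import numpy as np

def extract_centers(heatmap, thresh=0.3):
    """Keep a location as an object centre if its score is above
    thresh and it is the maximum of its 3x3 neighbourhood
    (local-maximum test standing in for IoU-based NMS)."""
    H, W = heatmap.shape
    centers = []
    for y in range(H):
        for x in range(W):
            s = heatmap[y, x]
            if s < thresh:
                continue
            nb = heatmap[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
            if s >= nb.max():
                centers.append((y, x, s))
    return centers

hm = np.zeros((5, 5))
hm[2, 2] = 0.9   # true peak
hm[2, 3] = 0.5   # suppressed: its neighbourhood contains the 0.9 peak
cs = extract_centers(hm)
```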

Focal Loss for Dense Object Detection[1]

The authors are Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár from FAIR. Citation [1]: Lin, Tsung-Yi et al. “Focal Loss for Dense Object Detection.” IEEE Transactions on Pattern Analysis and Machine Intelligence 42 (2017): 318-327.

Time

  • 2017.Aug

Key Words

  • Focal loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
  • class imbalance between foreground and background classes during training.
  • easy negatives
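The down-weighting of easy negatives follows directly from the paper's formula FL(p_t) = -α_t (1 - p_t)^γ log(p_t); a minimal binary-case sketch (illustration, not the RetinaNet training code):

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction.
    p: predicted probability of the positive class; y: 1 or 0."""
    pt = p if y == 1 else 1.0 - p          # probability of the true class
    a = alpha if y == 1 else 1.0 - alpha   # class-balance weight
    return -a * (1.0 - pt) ** gamma * math.log(pt)

# An easy negative (background scored p=0.1) contributes far less
# than a hard negative (background scored p=0.9):
easy = focal_loss(0.1, 0)
hard = focal_loss(0.9, 0)
```

With gamma=0 and alpha=0.5 the expression reduces to half the ordinary cross-entropy, which is the sense in which focal loss generalises it.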