Introduction to CV - Overview
Introduction
- What is vision? Sensation - Processing - Perception - Cognition; Perception, which covers neural processing and recognition, is the more important part
- Visuomotor coordination
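- Vision is one of the more important modalities; 83% of human perception comes from vision
- The human perception-action (body control) loop is closed: vision guides action, action feeds back into vision, and the action is then corrected. Humans and animals both behave this way (embodied intelligence)
- Vision interacts with language (visual language / written text)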
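The closed perception-action loop above can be illustrated with a toy reaching task. This is only a minimal sketch under stated assumptions: the one-dimensional positions, the proportional gain of 0.5, and the names `perceive`, `act`, and `reach` are all hypothetical and do not come from the lecture.

```python
def perceive(target, hand):
    """Vision step: observe the offset between the target and the hand."""
    return target - hand


def act(hand, error, gain=0.5):
    """Action step: move the hand by a fraction (assumed gain) of the observed error."""
    return hand + gain * error


def reach(target, hand=0.0, steps=20):
    """Alternate perception and action, recording the hand position after each step."""
    history = [hand]
    for _ in range(steps):
        error = perceive(target, hand)  # vision guides the action
        hand = act(hand, error)         # the action changes what is seen next
        history.append(hand)
    return history


trajectory = reach(target=1.0)
errors = [abs(1.0 - position) for position in trajectory]
```

Because each action is chosen from a fresh observation, the remaining error shrinks at every step; an open-loop plan computed once would have no way to correct itself.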
Computer Vision does many of the things humans do, and many things humans cannot do
Computer Vision is not an application of Deep Learning; it is human-centered and starts from neuroscience
Technology & History
What is Computer Vision? It deals with:
- acquiring
- RGB camera, Depth camera and LiDAR (maybe others)
- Stereo, even multiview Images, Panoramic images
- Video
- True 3D: Point Cloud, mesh, and volumetric
- processing and analyzing: Low-level
- Image processing
- Applications...
- Feature extraction
- edge/corner detection
- Image processing
- Mid-level Vision
- grouping
- 3D reconstruction
- Scanner
- SLAM
- High-level Vision: Understanding, semantic interpretations
- Facial Recognition
- Scene Understanding
- Augmented Reality
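As a concrete instance of the low-level edge item in the outline above, edge strength can be estimated from image gradients. The sketch below uses the Sobel operator in pure Python; the choice of Sobel, the function name `sobel_magnitude`, the border replication, and the synthetic 8x8 test image are assumptions made for illustration, not material from the lecture.

```python
import math

# 3x3 Sobel kernels for horizontal and vertical intensity changes
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image (list of rows)."""
    height, width = len(image), len(image[0])

    def pixel(r, c):
        # Replicate border pixels so the output has the same size as the input
        r = min(max(r, 0), height - 1)
        c = min(max(c, 0), width - 1)
        return image[r][c]

    result = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    value = pixel(r + dr - 1, c + dc - 1)
                    gx += SOBEL_X[dr][dc] * value
                    gy += SOBEL_Y[dr][dc] * value
            result[r][c] = math.hypot(gx, gy)
    return result


# Synthetic image: left half dark (0.0), right half bright (1.0)
image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
edges = sobel_magnitude(image)
# Columns where any pixel has a non-zero response
edge_columns = sorted({c for row in edges for c, v in enumerate(row) if v > 0})
```

On this image the response is non-zero only in the two columns that straddle the dark-to-bright boundary, which is the kind of low-level cue that mid-level grouping and high-level understanding build on.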
- Graphics and Vision - the frontier
- Synthetic data from graphics (which differs from real data) versus Vision: are the two at odds?
- NeRF
- Vision also has to handle what humans do not: Generation
What I cannot create, I do not understand. - Feynman
- Animating Faces
- Diffusion Model
- Vision-Language: still to be developed
- Embodied AI
- Autonomous Driving: end-to-end systems in earlier years were all for PR purposes; entering L3
- Generalist AGI
- (He Wang) GraspVLA