Introduction to CV - Overview
#Introduction to CV 5

Introduction

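  1. What is vision? Sensation - Processing - Perception - Cognition. Perception, which covers neural processing and recognition, is the more important part
  2. Visuomotor coordination
  3. Vision is one of the more important modalities: 83% of human perception comes from vision
  4. The human perception-action (body control) loop is closed: vision guides action, action feeds back into vision, and the action is then corrected. Humans and animals both behave this way; this is embodied intelligence
  5. Vision interacts with language (visual language / written text)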

Computer Vision does many of the things humans do, and many things humans cannot do.

Computer Vision is not an application of Deep Learning; it is human-centered and starts from neuroscience.

Technology & History

What is Computer Vision? It deals with:

  • acquiring
    • RGB camera, Depth camera and LiDAR (maybe others)
    • Stereo, even multiview Images, Panoramic images
    • Video
    • True 3D: Point Cloud, mesh, and volumetric
  • processing and analyzing: Low-level
    • Image processing
      • Applications...
    • Feature extraction
      • edge/corner detection
  • Mid-level Vision
    • grouping
    • 3D reconstruction
    • Scanner
    • SLAM
  • High-level Vision: Understanding, semantic interpretations
    • Facial Recognition
    • Scene Understanding
    • Augmented Reality
  • Graphics and Vision - the frontier
    • Synthetic data from graphics (as opposed to real data) pitted against Vision?
    • NeRF
    • Vision also has to handle what humans do not: Generation

    What I cannot create, I do not understand. - Feynman

    • Animating Faces
    • Diffusion Model
    • Vision-Language (still to be developed)
  • Embodied AI
    • Autonomous Driving: in earlier years, end-to-end was mostly for PR purposes; now entering L3
    • Generalist AGI
    • (He Wang) GraspVLA
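
To make the low-level "feature extraction" item above concrete, here is a minimal sketch of edge detection with the Sobel operator in plain Python. The notes do not name a specific detector, so Sobel is an assumption chosen for illustration, and the function names `sobel_magnitude` and `detect_edges` are hypothetical. The image is assumed to be a 2D grayscale grid stored as a list of lists.

```python
import math

# Sobel kernels: estimates of the horizontal and vertical intensity derivative.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude of a 2D grayscale image (list of lists).

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    p = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            out[y][x] = math.hypot(gx, gy)
    return out


def detect_edges(image, threshold):
    """Mark pixels whose gradient magnitude exceeds the threshold (hypothetical helper)."""
    return [[m > threshold for m in row] for row in sobel_magnitude(image)]


if __name__ == "__main__":
    # A dark left half and a bright right half: one vertical edge in the middle.
    image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
    for row in detect_edges(image, threshold=20):
        print("".join("#" if e else "." for e in row))
```

In practice the threshold depends on the image's intensity range and noise level; the value used here only suits this toy image.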
Author
酱紫瑞