ICRA 2023 - Scalable Autonomous Driving Workshop (Day 5)
(Work in progress)
Toyota
Foundation models == Zero-shot generalization
Pre-trained models are great out of the box. We should leverage geometry more. Mathematical formulations in geometry (e.g. epipolar geometry) are essentially the best pre-trained models!
Geometric foundation models? (== Data + Principles)
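To make the "geometry as a pre-trained model" point concrete, here is a minimal sketch of the epipolar constraint x2^T E x1 = 0, which holds for any true correspondence without any training. This is not from the talk. It assumes normalized (calibrated) image coordinates and the convention X2 = R X1 + t. The helper names (skew, essential_matrix, epipolar_residual) are hypothetical.

```python
def skew(t):
    # Cross-product matrix [t]_x, so that [t]_x v equals t x v.
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matmul(A, B):
    # Product of two 3x3 matrices stored as nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    # Product of a 3x3 matrix and a 3-vector.
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def normalize(X):
    # Project a 3D point (in camera coordinates) to normalized
    # homogeneous image coordinates (x/z, y/z, 1).
    return [X[0] / X[2], X[1] / X[2], 1.0]


def essential_matrix(R, t):
    # Assumed convention: X2 = R X1 + t, which gives E = [t]_x R.
    return matmul(skew(t), R)


def epipolar_residual(E, x1, x2):
    # x2^T E x1; close to zero when x1 and x2 view the same 3D point.
    Ex1 = matvec(E, x1)
    return sum(a * b for a, b in zip(x2, Ex1))
```

A residual near zero is consistent with a valid correspondence. A clearly non-zero residual flags a mismatch, which is the kind of built-in prior the notes refer to.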
Fisher Yu - Scalable 4D Scene Understanding with Less Hassle
- Scaling with Less hassle
- Learn and benefit from large-scale data
- Avoid manual designs / manual labeling
Representation Learning in 4D world
- Learning robust 2D tracking
- Learning high-fidelity instance-level representation
- Learning 3D motion
Current tracking algorithms
- Object detection -> Motion estimation -> Initial Association (Location, Appearance) -> Association Optimization
- The pipeline is very complicated and also fragile.
- Do humans go through all these steps?
- Humans only need to do ‘Object detection -> Appearance association’
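The reduced "Object detection -> Appearance association" idea can be sketched as follows. This is an illustrative assumption, not the speaker's method: each detection is assumed to already carry an appearance embedding, and detections in consecutive frames are linked by cosine similarity with simple greedy matching. The function names and the min_similarity threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def associate_by_appearance(prev_embeddings, curr_embeddings,
                            min_similarity=0.5):
    # Score every (previous, current) detection pair by appearance only;
    # no motion model or location cue is used.
    candidates = []
    for i, prev in enumerate(prev_embeddings):
        for j, curr in enumerate(curr_embeddings):
            score = cosine_similarity(prev, curr)
            if score >= min_similarity:
                candidates.append((score, i, j))

    # Greedy matching: accept the most similar pairs first and use
    # each detection at most once.
    candidates.sort(reverse=True)
    matches = {}
    used_curr = set()
    for score, i, j in candidates:
        if i in matches or j in used_curr:
            continue
        matches[i] = j
        used_curr.add(j)
    return matches
```

The result maps each previous-frame detection index to its matched current-frame index. Current detections left unmatched would start new tracks in a full tracker.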