2020년 10~12월 SLAM 논문 소식

Posted on 2020-12-29 Edited on 2023-06-09 In 1. Spatial AI , 1.1 SLAM , 월간 SLAM 뉴스 Views:

10월

DynaSLAM II : Tightly-Coupled Multi-Object Tracking and SLAM
- DynaSLAM 1편의 저자인 Berta Bescos, ORB-SLAM 3의 제 1저자인 Campos, ORB-SLAM 시리즈를 만든 랩실의 교수님인 Tardos 교수님의 새로운 작품.
- Object pose estimation + tracking과 SLAM을 joint optmisation으로 풀어냄.
- CubeSLAM과 비슷하지만 좀 더 발전된 approach인듯

LM:Reloc
- Direct SLAM의 최강자인 Cremers 교수님, 그리고 최근 Deep Direct SLAM 연구를 이끌고 있는 Nan Yang과 Lukas von Stumberg의 연구
- Direct 방식으로도 visual localisation이 된다는 것을 증명.
  - 곧 Direct 방식에서도 feature를 사용하지 않는 loop closure 방식이 나올 수도 있음.
- 이전의 GN-Net의 발전된 버전인듯

Multi-Resolution 3D Mapping with Explicit Free Space Representation for Fast and Accurate Mobile Robot Motion Planning
- Imperial College London 소속의 Stefan Leutenegger와 Pablo Alcantarilla의 작품.
  - Leutenegger는 BRISK, OKVIS, ElasticFusion, SemanticFusion의 저자, Alcantrilla는 KAZE / AKAZE feature의 저자임.
- SLAM을 이용한 Mapping에서 Free space를 정확하고 빠르게 모델링 할 수 있는 기법으로 보임. Map에서 정확한 경계를 표현하기 위해 공간을 Tree 구조로 쪼개서 정밀한 경계는 깊은 tree level로 들어간다. 이 부분을 multi-resolution occupancy mapping이라는 표현을 사용함.
  - 이 외로 novelty가 더 있어보이지만 아직 깊게 읽어보지 못함.
- ICL의 연구처럼 보이지만, 사실 SLAMCore의 연구라고 볼 수 있음.
  - Robot mapping + path planning을 효과적으로 할 수 있는 제품이 곧 SLAMCore의 제품에 실릴 것이라는 이야기.

Method and System for Performing Simultaneous Localization and Mapping using Convolutional Image Transformation
- MagicLeap의 새로운 특허. 당연히 DeTone과 Malisiewicz, Rabinovich의 이름이 들어갔다.
- MagicLeap에서 생각하고있는 SuperMaps의 아이디어를 담고있다.

Learning Monocular Dense Depth from Events
- Event하면 역시 Scaramuzza 교수님 작품이다.
- 매번 새로운 기능이 추가되는 모습이 보기 좋다.

SD-DefSLAM: Semi-Direct Monocular SLAM for Deformable and Intracorporeal Scenes
ORB-SLAM의 Tardos 교수님 + Montiel 교수님 랩실 작품
- Deformable scene에서 SLAM + Dynamic object는 거를 수 있음
- 수술 환경에서 SLAM을 하려는 연구는 관심을 가지는 사람의 수도 적고 어려운 환경이라 굉장히 희귀하다. Peter Mountney 이후로 오랜만에 보는 듯.

Freetures: Localization in Signed Distance Function Maps
- Microsoft와 ETH Zurich의 합작.
  - Delmerico, Nieto 등 참가했고, Roland Seigwart, Marc Pollefeys, Cesar Cadena 교수님도 참가하셨음.
- 3D dense map으로부터 localisation을 수행하는 방법을 제안함.

Elastic and Efficient LiDAR Reconstruction for Large-Scale Exploration Tasks
- ICL의 Stefan Leutenegger 교수님 랩실의 작품.
- 3D LiDAR를 이용해서 Large scale volumetric reconstruction을 수행함.

Tracking from Patterns: Learning Corresponding Patterns in Point Clouds for 3D Object Tracking
- VINS-mono를 만든 HKUST의 Shaojie Shen과 Peiliang Li의 작품.
- SLAM은 아니지만… LiDAR 기반 3D object detection 파이프라인에 temporal tracking 기능을 추가함으로써 안정적인 트랙킹을 만듬.

11월

Relocalization systems and methods
- MagicLeap에서 내놓은 Relocalization 알고리즘 특허.
  - 10월에 내놓았던 특허에서 빈 부분을 채워주는 듯
- 인풋 이미지로부터 global feature를 CNN으로 embedding을 한 후, nearest neighbour로 가장 가까운 이미지를 찾아서 Relocalization하는 듯

Primal-Dual Mesh Convolutional Neural Networks
- MIT의 Luca Carlone 교수님과 Zurich의 Scamaruzza 교수님이 함께 발표.
- Mesh 데이터를 처리하는 새로운 CNN 알고리즘을 개발.
  - 기존의 방식은 GNN 방식을 따라서 mesh를 그래프로 보며 처리했지만, mesh의 특성을 잘 반영하지 못했음.
  - 새로운 방식은 이전 방식이 잘 반영하지 못하던 mesh의 edge나 face도 사용 가능.

Deep Shells: Unsupervised Shape Correspondence with Optimal Transport
- Daniel Cremers 교수님이 참여… 근데 솔직히 왜 하신건지는 아직 잘 이해는 안감.
- 3D representation 중에 shell이라는 방식이 있음.
  - 3D model이 deformable하게 변할 때, optimal transport 기법을 사용하여 shape correspondence를 맞추는 방식을 제안함.
- 이 논문과 직접적인 관계는 없지만, SLAM 쪽에서도 3D representation 연구를 많이 하는 것 같음. 특히 딥러닝 3D representation은 하나의 class에서 다른 instance들로 differentiable하게 변환이 가능하기 때문에, SLAM의 입장에서는 joint optimisation이 가능해지는 것.

Accurate Mapping and Planning for Autonomous Racing
- Roland Siegwart 교수님 및 다수의 연구자가 참가한 연구임.
- 유명한 영상.
- SLAM과 optimal path planning, control이 모두 들어가있음.

Robot Navigation in Crowded Environments Using Deep Reinforcement Learning
- Roland Siegwart 교수님이 또…
- SLAM이 주가 되는 논문은 아니다. LiDAR odometry + scan 정보를 사용하지만, 논문의 메인 주제는 강화학습임.
- 그래도 SLAM과 다른 모듈이 함께 사용될 수 있는걸 보여주니 여기 넣어봤음.

From Points to Planes - Adding Planar Constraints to Monocular SLAM Factor Graphs
- 자라고자 대학의 Javier Civera 교수님 랩실 연구
- Plane enforcement로 만들어낸 plane landmark를 Factor Graph optimisation에 사용함.

A Robust Multi-Stereo Visual-Inertial Odometry Pipeline
- Michael Kaess 교수님 랩실 작품.
- Stereo pair에서 검출되는 모든 landmark 정보와 camera pose를 jointly optimise한다고 함.
- 1-point RANSAC 라는 기술을 쓰는데… 이게 뭘까…

Compositional Scalable Object SLAM
- Kaess 교수님 랩실 작품.
- Object-level SLAM이 새로 나옴.
  - RGB-D 기반, semantic data association을 포함함.
  - Large-scale indoor reconstruction에 사용 가능.

Rearrangement: A Challenge for Embodied AI
- 어벤저스 교수님 논문.
  - Dhruv Batra, Angel Chang, Sonia Chernova, Andrew Davison, Jia Deng, Vladlen Koltun, Sergey Levine, Jitendra Malik, Igor Mordatch, Roozbeh Mottaghi, Manolis Savva, Hao Su.
- ‘물체를 제자리에 돌려놓는 방법’에 대한 여러가지 방법을 함께 탐색하였다.
  - Geometric, language, Image, Experience가 중심 토픽이다.
    - Geometric - SLAM, 3D, Reconstruction, Planning
    - Language - NLP, Image-language, Communication
    - Image - Context undertanding, semantic understanding
    - Experience - Reinforcement learning, Memory

Providing Semantic-Augmented Artificial-Reality Experience
- Richard Newcombe가 있는 Facebook에서 내놓은 연구.
- Semantic AR를 수행하는 인프라 구조.
  - 모바일 -> 서버 -> 제 3의 컴퓨팅 리소스를 거친다.
  - Mesh를 만들어내는 과정이 필수적으로 들어간다. 아무래도 surface 정보를 이용한 interaction을 하려는 의도인 것 같다.

Learning Trajectories for Visual-Inertial System Calibration via Model-based Heuristic Deep Reinforcement Learning
- Roland Siegwart 교수님과 Cadena 교수님 및 다른 연구원들의 작품
- Visual-Inertial 시스템의 캘리브레이션은 굉장히 정확한 캘리브레이션을 요구함. 이 때, 캘리브레이션은 보통 캘리브레이션 타겟을 바라보면서 특정 움직임이 요구됨.
  - Model-based 강화학습 방식으로 캘리브레이션이 잘 되는 움직임을 학습함

Semantic Fusion (특허)
- Facebook의 Richard Newcombe
- Semantic Fusion이라는 이름으로 특허를 냈다.
  - Imperial College의 Davison 교수님 랩실에서 한 그 SemanticFusion과는 다른 것 같다.

AdaLAM: Revisiting Handcrafted Outlier Detection
- ETH Zurich의 Marc Pollefeys, Torsten Sattler, Viktor Larsson의 연구이다.
- Descriptor끼리 매칭할 때 nearest neighbour로 매칭하는 기존의 방식은 outlier가 너무 많이 생긴다.
- 이 연구는 새로운 local affine motion verification과 sample-adaptive threshold를 사용한 hierarchical pipeline을 소개한다.

A Factor-Graph Approach for Optimization Problems with Dynamics Constraints
- Factor graph optmisation 연구의 대가인 Frank Dellaert 교수님 랩실의 연구이다.
- Dynamic factor graph라는 방법론을 제시한다.
  - SLAM에서 쓰일 것은 아니지만, 가능성이 아예 없지는 않다.

SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
- Facebook의 Angela Dai와 Matthias Nießner의 연구이다.
- RGB-D로 스캔한 환경에서 1. object에 CAD model prior를 fit하고, 2. 동시에 layout + object pose를 jointly optmise한다.

Tactile SLAM: Real-time inference of shape and pose from planar pushing
- Michael Kaess 교수님… 왜 비전 안 하세요…
- 비전 방식이 아닌, Force/torque 센서를 통해 얻는 센서 값으로 Shape guessing을 한다는 논문이다. SLAM에서도 전체 map은 모르지만 센서로 취득한 local map을 쌓아가면서 전체 map을 만드는 것 처럼, 똑같은 방식으로 force/torque 방식에 적용한 논문이다.
- SLAM…이라고 하긴 좀 그렇지만, 다른 방향으로도 새로운 방법론이 존재한다는걸 보기에는 좋은 예시이다.

12월

MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
- Nan Yang, Lukas von Stumberg와 Cremers 교수님의 작품.
- Dynamic object는 photometric inconsistency를 이용한 mask module을 사용하여 움직이는 물체임을 인지함.
  - 이를 이용해 Static / moving object에 대한 depth 를 정확하게 추정함.
  - 이 점이 기존의 MVS와 다른 점임.

Deep LiDAR localization using optical flow sensor-map correspondences
- Torsten Sattler의 연구이다.
- 딥러닝 기반 optical flow를 사용해서 LiDAR 기반 localization 방법을 소개한다.

A Review of Point Cloud Registration Algorithms for Mobile Robotics
- Siegwart 교수님이 참여하신 리뷰 논문이다.

Towards automating construction tasks: Large‐scale object mapping, segmentation, and manipulation
- Margarita Chli라는 ETH Zurich의 조교수로 계신 분의 연구이다. 이전에 Roland Siegwart 교수님 랩실에서 Stefan Leutenegger랑 연구도 같이 한듯… 이번 연구로 처음 알게 되었다.
- Application 논문이다. 공사 현장에서 드론 스캔으로 map을 만든 후, 포크레인에 LiDAR를 달아서 map merge 및 localisation을 수행. 그 후 object들에 대한 semantic 정보를 추출한 다음에, path planning 및 제어를 사용하여 물체를 집어올리고 이동한다.

Rotation-only bundle adjustment
- Seong Hun Lee와 Civera 교수님의 작품.
- Rotation만 최적화 하는 방법.
  - Position과 3d structure의 값이 부정확해도 잘 된다.

A Structure-Aware Method for Direct Pose Estimation
- Nathan Jacobs?라는 잘 모르는 분의 연구이다.
- CNN 기반 camera pose estimation 문제를 푸는데, direct방식과 indirect방식의 장점만을 가져와서 섞었다고 한다. Sota를 찍은건 아닌 것 같다.
  - Scene coordinate regression과 monocular depth estimation을 수행한다. 그 후 depth map을 camera intrinsic을 사용하여 3D scene coordinates로 만든 후, 3D-3D correspondence를 계산하여 pose regression을 한다고 한다.
  - 솔직히 좋은 연구 방향은 아닌 것 같다. Monocular depth estimation은 현재 단계에서는 sharp & crisp하게 나오는 것이 아니라, general하게 나타나기 때문이다. Edge 쪽에서도 많이 뭉뜽그려진다. 하지만 논문 초반에 나오는 intro + related works가 읽기 좋다.

DeFMO: Deblurring and Shape Recovery of Fast Moving Objects
- Marc Pollefeys 교수님이 연구에 참여하셨다.
- 보통 빠르게 움직이는 물체에는 이미지에서 blur가 진다. 이 연구에서는 이미지로부터 배경과 물체를 분리해낸 후, Temporal super-resolution을 물체가 움직이는 모습을 초고속 카메라가 깔끔하게 찍은 것 처럼 복원해준다.
  - 이 때, 깔끔한 이미지의 representation도 얻어낼 수 있다.

FMODetect: Robust Detection and Trajectory Estimation of Fast Moving Objects
- Pollefeys 교수님이 연구에 참여하셨다.
- DeFMO와 연관점이 많은 연구이다. 빠르게 움직이는 물체를 deblurring하고 trajectory를 딥러닝으로 복원해내는 연구이다.

Event-based Motion Segmentation with Spatio-Temporal Graph Cuts
- VINS-mono의 Shaojie Shen의 연구이다.
- Event 카메라로 motion segmentation을 수행하는 방법을 소개한다.

Using Image Sequences for Long-Term Visual Localization
- Torsten Sattler의 연구이다.
- 기존의 Visual localization은 1개의 이미지를 맵 데이터와 비교해서 위치를 복원해냈다. 하지만 보통 실제 시나리오에서는 사진 1장이 아니라 이미지 시퀀스를 받는다.
- 이번 연구는 사진 1장 대신 이미지 시퀀스를 사용함으로써 더 안정적이게 위치를 복원하는 방법을 소개한다.

Benchmarking Image Retrieval for Visual Localization
- NAVER LABS의 Martin Humenberger, Gabriela Csurka, Yohann Cabon과 Torsten Sattler의 연구이다.
- 다양한 Visual localization 방법들의 벤치마크 결과를 냈다.
- 그리고 벤치마킹 툴인 Kapture를 소개한다.

Appearance-Invariant 6-DoF Visual Localization using Generative Adversarial Networks
- 저자들이 누구일까…? 찾아도 안나온다. Shiguo Lian이라는 사람은 cryptography 관련 사람이 나오는데…
- Visual localization에서 사용되는 feature들은 invariant하지 않다는 점을 파고든다. 이를 위해 CycleGAN을 사용하여 만들어낸 다른 appearance를 가진 이미지들로 feature extraction 네트워크를 학습한다.

Point Transformer
- 옥스포드의 Philip Torr 교수님과 Intel Labs의 Vladlen Koltun의 연구.
- Transformer에서 사용되는 Self-attention을 포인트 클라우드 프로세싱에 쓸 수 있을까? 에 대한 답
  - Classification, Part segmentation, semantic segmentation 모두 잘 된다는 결과.

Efficient Volumetric Mapping Using Depth Completion With Uncertainty for Robotic Navigation
- Stafan Leutenegger의 연구이다.
- RGB-D 맵핑을 할 때 free-space에 대한 depth는 항상 정확하게 추정하기 어려웠다. 이 연구에서는 딥러닝으로 추정한 uncertainty 값을 이용해 depth completion을 수행하고, free-space에 대한 맵핑의 속도와 퀄리티를 향상시켰다.

Quantifying Aleatoric and Epistemic Uncertainty Using Density Estimation in Latent Space
- Cadena, Siegwart, Van Gool, Tombari 교수님의 연구이다.
- 최근 Deep-SLAM쪽에서 종종 uncertainty를 표현하기 위해 사용되는 aleatoric / epistemic uncertainty를 추정하는 새로운 방식을 제안한다.

Post-hoc Uncertainty Calibration for Domain Drift Scenarios
- Cremers 교수님이 연구에 참여하셨다.
- Deep learning classifer는 보통 training domain에서는 높은 정확도와 confidence score를 가지지만, 다른 domain으로 옮기면 confidence score가 급격하게 떨어진다.
- 이 confidence score를 다시 올려주는 post-hoc uncertainty calibration 방법을 소개한다.

Patch2Pix: Epipolar-Guided Pixel-Level Correspondences
- Torsten Sattler가 참여한 연구이다.
- 최근 성공적이였던 딥러닝 기반 keypoint matching 방식을 계승하는 연구이다.
  - Patch-level match proposal을 추출한 후, epipolar geometry를 weakly-supervised 방식으로 학습한 네트워크를 통해서 refine하여 pixel-wise match 정보를 regress한다.
  - 이 과정에서 outlier는 자동으로 제거된다.

CAD-Deform: Deformable Fitting of CAD Models to 3D Scans
- Matthias Nießner가 연구에 참여하였다.
- CAD-model 데이터를 3D scan 데이터에 deformable fitting을 통해 정확하게 alignment하는 방식을 소개한다.

RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
- 이번 연구 역시 Matthias Nießner가 연구에 참여하였다.
- Raw point cloud로부터 jointly하게 object detection + dense object surface reconstruction을 수행하는 법을 제안한다.

Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences
- Matthias Nießner와 Facebook의 Angela Dai의 연구이다.
- RGB-D로 3d Detection을 한 후 지속적으로 tracking을 수행한다. 이 때, occlusion 등을 가려진 부분을 ‘hallucination’을 통해 채워넣어서 지속적으로 안정적이게 트랙킹한다.
  - 이를 통해 object간의 occlusion이 있을때도 안정적이게 multi-object tracking을 수행하였다.

Real-Time Temporal and Rotational Calibration of Heterogeneous Sensors Using Motion Correlation Analysis
- VINS-mono의 저자인 Tong Qin과 Shaojie Shen의 연구이다.
- 새로운 이종 센서간의 Temporal calibration 방식을 소개한다.

SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud based Place Recognition
- Rui Wang과 Cremer 교수님의 작품이다.
- SOE-Net은 Point cloud 기반으로 place recognition을 할 때 필요한 local descriptor를 뽑는 방법과, large-scale point cloud based retrieval (i.e. long range)에 유용한 데이터를 추출하는 방법을 새롭게 제안한다.

3DSNet: Unsupervised Shape-to-Shape 3D Style Transfer
- Roland Siegwart 교수님께서 연구에 참여하셨다.
- Object class 정보를 가지고 있는 point cloud와, 특정 style을 표현하는 point cloud가 있을 때, 기존의 style-transfer와 비슷하게 적용할 수 있다.
- 이를 통해 동일 class지만 세부 모양이 다른 point cloud를 생성해내거나, 특정 이미징 디바이스의 point cloud 캡처 방식을 가져올 수 있다.
  - Class 내부에서 모양이 다른 point cloud들 간의 differentiable representation도 가지게 되는것이 아닌가…하고 생각이 든다. Shape + camera pose joint optimisation도 가능해지지 않을까?

Stable View Synthesis
- Intel의 Vladlen Koltun의 작품이다.
- SLAM이 이미지들로부터 camera pose와 structure를 구해내는 작업이라면, view synthesis는 camera pose와 structure를 기반으로 photorealistic image를 만들어내는 작업이다.
- 최근 SLAM쪽에서도 view synthesis에 대한 연구를 많이 진행하고 있는데, 이는 structure에 대한 representation을 구하기 아주 유용하기 때문이다.
- (또는, 부족한 정보를 view synthesis로 만들어서 채워넣을수도 있지 않을까?)

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
- Ren Ng? 이라는 교수님이 지도하신 것 같다.
- 내용은 잘 이해가 되지 않았지만 View Synthesis에서 제일 좋은 성능을 뽑아낸다고 들었다. 굉장히 핫하다고… ㄷㄷ

NeuralFusion: Online Depth Fusion in Latent Space
- Pollefeys 교수님이 연구에 참여하셨다.
- Depth map aggregation을 latent feature space에서 학습한다고 한다. 사실 깊게 이해는 잘 안된다.

Reference Pose Generation for Long-term Visual Localization via Learned Features and View Synthesis
- Torsten Sattler와 Davide Scaramuzza 교수님의 작품이다.
- Visual Localization을 위해 reference pose를 만들 때는 SfM을 사용한다. 하지만 조명 변화가 생겼을 때는 local feature 매칭이 안되기 때문에 쓰기 어려워진다.
- 이번 연구는 View synthesis를 통해 새로운 learned feature를 포함하는 reference pose 데이터를 만드는 semi-automated 방법을 소개한다.

SceneFormer: Indoor Scene Generation with Transformers
- Matthias Nießner의 연구이다
- Indoor scene generation이라고 하는데… 아직 목적을 잘 모르겠다. 좀 더 읽어봐야곘다.