OpenScene: 3D Scene Understanding with Open Vocabularies

Pre-trained text and image embedding models → 3D scene understanding

Overview

3.1. Image Feature Fusion

Segmentation model:

LSeg, OpenSeg

Input: RGB image with resolution H × W