GaussianWorld

Our Research Trajectory

Our projects build upon each other over time to expand the system's capabilities.

ShapeSplat

3DV'25 Oral

Object-level encoder.
Dataset of 206K 3DGS objects.
Self-supervised pretraining.

SceneSplat

ICCV'25 Oral

Indoor 3DGS encoder.
Dataset of 7K 3DGS scenes.
Language-aligned pretraining.

GaussianVLM

RA-L'25

3DGS scene VLM.
Compact scene tokens.
Embodied reasoning support.

SceneSplat++

NeurIPS'25

Indoor & outdoor 3DGS.
Dataset of 49K scenes.
Comprehensive 3D benchmark.

Chorus

CVPR'26 Oral

Multi-teacher distillation.
Wide task compatibility.
Works with both splat and point formats.

About Us

We are building a system for robust 3D scene understanding and natural language interaction in complex environments. Our goal is a foundation model that can tokenize 3D scenes, perform instance detection, and reason spatially to answer language queries, ultimately making 3D scene understanding as accessible and powerful as its 2D counterpart.

Demos

Key capabilities of our 3D systems.

3D Feature Extraction

Our system takes a 3D scene (3D Gaussian splats or a point cloud) as input and, in a single neural network forward pass, outputs a feature vector for each 3D primitive.
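
The input/output contract described above can be sketched as follows. This is a toy illustration only: `ToyEncoder`, the 14-value per-Gaussian layout, and the feature width are assumptions made for the example, not the actual model or data format.

```python
import numpy as np

# Hypothetical per-Gaussian layout (an assumption for this sketch):
# xyz (3) + scale (3) + rotation quaternion (4) + opacity (1) + RGB (3).
GAUSSIAN_DIM = 14
FEATURE_DIM = 8  # illustrative feature width


class ToyEncoder:
    """Placeholder for the real network: one linear layer followed by ReLU."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.weight = rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)
        self.bias = np.zeros(out_dim)

    def forward(self, primitives):
        # A single pass over the whole scene: (N, in_dim) -> (N, out_dim).
        return np.maximum(primitives @ self.weight + self.bias, 0.0)


# Random stand-in scene with 1000 primitives.
scene = np.random.default_rng(1).standard_normal((1000, GAUSSIAN_DIM))
encoder = ToyEncoder(GAUSSIAN_DIM, FEATURE_DIM)
features = encoder.forward(scene)
print(features.shape)  # (1000, 8): one feature vector per primitive
```

The point of the sketch is the shape contract: N primitives in, N feature vectors out, with no per-scene optimization loop.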

Language Interactions

Our system provides real-time open-vocabulary search over 3D content, reusing the 3D features extracted in the previous step.
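
One common way to implement this kind of search, assuming the per-primitive features live in the same embedding space as a text encoder's output, is cosine-similarity ranking. The sketch below uses random stand-in features and a synthetic query vector; the `search` function and its parameters are hypothetical names for illustration, not the system's API.

```python
import numpy as np


def search(features, query_embedding, top_k=5):
    """Rank primitives by cosine similarity to the query embedding."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    q = query_embedding / np.linalg.norm(query_embedding)
    scores = f @ q                       # (N,) cosine similarities
    order = np.argsort(-scores)[:top_k]  # highest similarity first
    return order, scores[order]


rng = np.random.default_rng(0)
features = rng.standard_normal((1000, 16))  # stand-in per-primitive features

# Stand-in for a text embedding: a slightly perturbed copy of primitive 42.
query = features[42] + 0.01 * rng.standard_normal(16)

indices, scores = search(features, query, top_k=3)
print(indices[0])  # 42, the primitive the query was built from
```

Because the features are computed once per scene, each new query costs only a matrix-vector product and a sort, which is what makes interactive search plausible under this assumption.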

The Team

Multi-institutional research group.

Yue Li

Qi Ma

Anna-Maria Halacheva

Runyi Yang

Mengjiao Ma

Bin Ren

Nikola Popovic

Theo Gevers

Luc Van Gool

Martin R. Oswald

Danda Pani Paudel