Key-Grid: Unsupervised 3D Keypoints Detection using Grid Heatmap Features

1Shanghai Qi Zhi Institute, 2Tsinghua University, 3Shanghai Artificial Intelligence Laboratory,

4The University of Hong Kong, 5University of Science and Technology of China, 6National University of Singapore


Examples of the keypoints detected by Key-Grid.


Detecting 3D keypoints with semantic consistency is widely used in scenarios such as pose estimation, shape registration, and robotics. Currently, most unsupervised 3D keypoint detection methods focus on rigid-body objects. When faced with deformable objects, however, the keypoints they identify do not preserve semantic consistency well.

In this paper, we introduce Key-Grid, an unsupervised keypoint detector for both rigid-body and deformable objects, built as an autoencoder framework. Unlike previous work, we use the detected keypoints to form a 3D grid feature heatmap, called the grid heatmap, which is used during point cloud reconstruction.

The grid heatmap is a novel concept that assigns a latent variable to each grid point sampled uniformly in a 3D cubic space; each variable is the shortest distance between that grid point and the “skeleton” formed by connecting keypoint pairs. We also incorporate the information from each layer of the encoder into the reconstruction of the point cloud. We conduct an extensive evaluation of Key-Grid on a range of benchmark datasets, where it achieves state-of-the-art performance on the semantic consistency and position accuracy of keypoints. Moreover, we demonstrate the robustness of Key-Grid to noise and downsampling. In addition, we achieve SE(3) invariance of keypoints by generalizing Key-Grid to an SE(3)-invariant backbone.
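To make the grid heatmap idea concrete, here is a minimal NumPy sketch of the distance computation described above. It is an illustration under stated assumptions, not the released implementation: the function names, the cube range [-1, 1]^3, the grid resolution, and the choice to connect every keypoint pair are all hypothetical, and any normalization or weighting applied in the paper is omitted.

```python
import numpy as np


def grid_points(resolution: int) -> np.ndarray:
    """Uniform grid in the cube [-1, 1]^3 (assumed range), shape (resolution**3, 3)."""
    axis = np.linspace(-1.0, 1.0, resolution)
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)


def point_to_segment_distance(points: np.ndarray, a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Euclidean distance from each row of `points` (N, 3) to the segment a-b."""
    ab = b - a
    denom = float(ab @ ab)
    if denom == 0.0:
        # Degenerate segment: both endpoints coincide.
        return np.linalg.norm(points - a, axis=1)
    # Project each point onto the line through a and b, then clamp to the segment.
    t = np.clip((points - a) @ ab / denom, 0.0, 1.0)
    closest = a + t[:, None] * ab
    return np.linalg.norm(points - closest, axis=1)


def skeleton_distance_grid(keypoints: np.ndarray, resolution: int = 16) -> np.ndarray:
    """For every grid point, the shortest distance to any segment joining a keypoint pair.

    Assumption: the skeleton connects all keypoint pairs; a real model may
    select or weight pairs differently.
    """
    pts = grid_points(resolution)
    best = np.full(len(pts), np.inf)
    k = len(keypoints)
    for i in range(k):
        for j in range(i + 1, k):
            d = point_to_segment_distance(pts, keypoints[i], keypoints[j])
            best = np.minimum(best, d)
    return best.reshape(resolution, resolution, resolution)
```

For example, with two keypoints at (-1, 0, 0) and (1, 0, 0) and `resolution=3`, the grid point at the origin lies on the skeleton and gets distance 0, while the corner (1, 1, 1) gets the distance to the nearest segment endpoint. Because the values depend only on keypoint positions, the grid changes smoothly as the keypoints move with a deforming shape.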


In the encoder, given a point cloud, we detect the keypoints using PointNet++. We then use the detected keypoints to form the Key-Grid. In the decoder, we use the features from each PointNet++ layer together with the Key-Grid to reconstruct the input point cloud. For more details, please refer to our paper.

Pipeline of Key-Grid

Visualization Results

Fold Clothes and Pants

Note: we compare Key-Grid with the baselines KeypointDeformer (KD), Skeleton Merger (SM), and SC3K.

Drop Hat

Drop Shirt

Drag Long Pant

Drag Long Dress

Drag Tie


  author    = {Chengkai Hou and Zhengrong Xue and Bingyang Zhou and Jinghan Ke and Shao Lin and Huazhe Xu},
  title     = {Key-Grid: Unsupervised 3D Keypoints Detection using Grid Heatmap Features},
  journal   = {arXiv},
  year      = {2023},