Depth-Guided Monocular Object Pose Estimation for Warehouse Automation

Phan Xuan Tan, Dinh Cuong Hoang, Anh Nhat Nguyen, Eiji Kamioka, Ta Huu Anh Duong, Tuan Minh Huynh, Duc Manh Nguyen, Duc Huy Ngo, Minh Duc Cao, Thu Uyen Nguyen, Van Thiep Nguyen, Duc Thanh Tran, Van Hiep Duong, Anh Truong Mai, Duc Long Pham, Khanh Toan Phan, Minh Quang Do

Research output: Article, peer-reviewed

Abstract

Accurate object pose estimation is essential for warehouse automation, enabling tasks such as robotic picking, sorting, and inventory management. Current state-of-the-art approaches rely on both color (RGB) and depth (D) images, as depth information provides critical geometric cues that enhance object localization and improve robustness against occlusions. However, RGBD-based methods require specialized depth sensors, which can be costly and may not function reliably in warehouse environments with reflective surfaces, varying lighting conditions, or sensor occlusions. To address these limitations, researchers have explored RGB-only approaches, but the absence of depth cues makes it challenging to handle occlusions, estimate object geometry, and differentiate between textureless or highly similar objects, which are common in warehouses. In this paper, we propose a novel end-to-end depth-guided object pose estimation method tailored for warehouse automation. Our approach leverages both depth and color images during training but relies solely on RGB images during inference. Depth images are used to supervise the training of a depth estimation network, which generates initial depth-aware features. These features are then refined using our proposed depth-guided feature enhancement module to improve spatial understanding and robustness. The enhanced features are subsequently utilized for keypoint-based 6D object pose estimation. By integrating depth-guided feature learning, our method significantly enhances pose estimation accuracy, especially in cluttered warehouse environments with severe occlusions and textureless objects. Extensive experiments on warehouse-specific datasets, as well as standard benchmark datasets, demonstrate that our approach outperforms existing RGB-based methods while maintaining real-time inference speeds, making it a highly practical solution for real-world warehouse automation applications.
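The abstract describes the pipeline only at a high level, so the following is a minimal PyTorch-style sketch of the training scheme it outlines: ground-truth depth supervises an auxiliary depth head during training, the resulting depth-aware features are fused with the RGB features (a stand-in for the proposed depth-guided feature enhancement module), and a keypoint head produces heatmaps for keypoint-based 6D pose recovery, with only RGB needed at inference. All module names, layer sizes, fusion choices, and loss weights here are illustrative assumptions, not the paper's actual architecture.

```python
# Hedged sketch: depth-supervised training, RGB-only inference.
# Every design detail below is an assumption for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DepthGuidedPoseNet(nn.Module):
    def __init__(self, num_keypoints: int = 8):
        super().__init__()
        # Shared RGB feature extractor (placeholder for a real backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Auxiliary depth head, supervised with ground-truth depth at training time.
        self.depth_head = nn.Conv2d(64, 1, 1)
        # Stand-in for depth-guided feature enhancement: fuse RGB and depth-aware features.
        self.fuse = nn.Sequential(nn.Conv2d(64 + 1, 64, 3, padding=1), nn.ReLU())
        # Keypoint head: per-keypoint heatmaps for 2D-3D correspondences (e.g., PnP).
        self.kp_head = nn.Conv2d(64, num_keypoints, 1)

    def forward(self, rgb):
        feats = self.backbone(rgb)            # RGB-only features
        depth_pred = self.depth_head(feats)   # depth-aware prediction
        enhanced = self.fuse(torch.cat([feats, depth_pred], dim=1))
        heatmaps = self.kp_head(enhanced)     # keypoint heatmaps for pose recovery
        return heatmaps, depth_pred


def training_step(model, rgb, depth_gt, kp_gt, w_depth=0.5):
    """One hypothetical training step: keypoint loss plus depth supervision."""
    heatmaps, depth_pred = model(rgb)
    depth_gt_small = F.interpolate(depth_gt, size=depth_pred.shape[-2:])
    loss_kp = F.mse_loss(heatmaps, kp_gt)
    loss_depth = F.l1_loss(depth_pred, depth_gt_small)  # depth is used only during training
    return loss_kp + w_depth * loss_depth


# At inference, only an RGB image is required; no depth sensor is involved.
model = DepthGuidedPoseNet()
rgb = torch.randn(1, 3, 128, 128)
with torch.no_grad():
    heatmaps, _ = model(rgb)
```

The key property the sketch is meant to show is that the depth branch appears only in the loss: removing `depth_gt` from `training_step` breaks training but not inference, which mirrors the abstract's claim of RGB-only deployment.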

Original language: English
Pages (from-to): 110166-110184
Number of pages: 19
Journal: IEEE Access
Volume: 13
DOI
Publication status: Published - 2025

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering
