I. Research objective
Human Pose Estimation (HPE) and 3D pose estimation are core technologies in augmented and virtual reality (AR/VR), human-computer interfaces, healthcare, and sports (Fig.1). In settings such as hospitals or elder care facilities, people localization can report where a fall occurred, while 3D pose estimation can support applications such as bed exit alarms or fall prediction, reducing accidents and injuries.
Figure 1. The applications of HPE in different fields.
II. Research content
Traditional methods often rely on motion capture systems based on RGB cameras, which are sensitive to lighting conditions and raise privacy concerns. By contrast, millimeter-wave (mmWave) radar can sense through walls, remains stable across lighting and environmental conditions, and avoids the privacy concerns associated with cameras. These qualities make it suitable for complex environments such as hospitals, which also place higher demands on privacy protection.
Figure 2. The HPE data collection system built by our laboratory.
Our laboratory collaborated with the University of Washington to propose a mmWave-based human localization and 3D pose estimation system (Fig.2), in which mmWave radar serves as the primary sensor. In addition, data collected by dual RGB cameras and LiDAR are processed by the ZeDo model to produce high-precision human skeletons, which serve as ground truth for training the HPE model (Fig.3). The proposed HPE model uses HRNet as its backbone and produces two outputs from the latent space: a confidence map for the body center and a map of joint keypoint offsets (Fig.4).
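To illustrate how a body-center confidence map and a joint offset map can be combined into a pose, the following is a minimal sketch and not the laboratory's implementation. It assumes a 2D confidence grid over the floor plane, one list of 3D offsets per grid cell, and a hypothetical `cell_size_m` parameter; the function name `decode_pose`, the grid layout, and the units are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def decode_pose(
    conf_map: Sequence[Sequence[float]],
    offset_map: Sequence[Sequence[Sequence[Vec3]]],
    cell_size_m: float = 0.1,
) -> Tuple[Vec3, List[Vec3], float]:
    """Decode one person from a center confidence map and an offset map.

    Assumed layout (hypothetical, not taken from the source):
      conf_map[r][c]      -> confidence that a body center lies in cell (r, c)
      offset_map[r][c][j] -> (dx, dy, dz) of joint j relative to that center
    """
    # 1. Find the grid cell with the highest body-center confidence.
    best_score = float("-inf")
    best_rc = (0, 0)
    for r, row in enumerate(conf_map):
        for c, score in enumerate(row):
            if score > best_score:
                best_score, best_rc = score, (r, c)

    # 2. Convert the cell index into metric coordinates on the floor plane.
    r, c = best_rc
    center: Vec3 = (c * cell_size_m, r * cell_size_m, 0.0)

    # 3. Add each joint's offset to the center to obtain its 3D position.
    joints = [
        (center[0] + dx, center[1] + dy, center[2] + dz)
        for dx, dy, dz in offset_map[r][c]
    ]
    return center, joints, best_score
```

Predicting offsets relative to a detected center lets one forward pass answer both questions the system targets: where the person is (the peak of the confidence map) and what pose they are in (the offsets read at that peak).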
Figure 3. Workflow of human localization and 3D pose ground truth annotations.
Figure 4. The architecture of the proposed 3D HPE and localization model.
The recognition results in Fig.5 show that the HPE model proposed by our laboratory localizes people precisely, with an average localization error of approximately 9.91 cm. This indicates that the system is effective for both people localization and human pose estimation.
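The text does not state how the average localization error is computed. As a hedged illustration, the sketch below assumes it is the mean Euclidean distance between predicted and ground-truth body centers, given in meters and reported in centimeters. The function name `mean_localization_error_cm` and the input format are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def mean_localization_error_cm(
    predicted: Sequence[Vec3], ground_truth: Sequence[Vec3]
) -> float:
    """Mean Euclidean distance between paired body centers, in centimeters.

    Assumes both inputs are in meters and aligned frame by frame
    (an assumption for illustration, not the documented protocol).
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("inputs must be non-empty and of equal length")
    # Sum the per-frame distances, then average and convert m -> cm.
    total_m = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return 100.0 * total_m / len(predicted)
```

For example, two frames with center errors of 5 cm and 10 cm give a mean of 7.5 cm under this definition.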
Figure 5. Visualization of pose estimation results from our proposed HPE model.