The Microsoft Kinect sensor is largely used to detect and recognize body gestures and layout with enough reliability, accuracy and precision in a quite simple way. However, the pretty low resolution of the optical sensors does not allow the device to detect gestures of body parts, such as the fingers of a hand, with the same straightforwardness. Given the clear application of this technology to the field of the user interaction within immersive multimedia environments, there is the actual need to have a reliable and effective method to detect the pose of some body parts. In this paper we propose a method based on a neural network to detect in real time the hand pose, to recognize whether it is closed or not. The neural network is used to process information of color, depth and skeleton coming from the Kinect device. This information is preprocessed to extract some significant feature. The output of the neural network is then filtered with a time average, to reduce the noise due to the fluctuation of the input data. We analyze and discuss three possible implementations of the proposed method, obtaining an accuracy of 90% under good conditions of lighting and background, and even reaching the 95% in best cases, in real time

Real-time Hand Pose Recognition Based on a Neural Network Using Microsoft Kinect

SORCE, Salvatore;
2013

Abstract

The Microsoft Kinect sensor is largely used to detect and recognize body gestures and layout with enough reliability, accuracy and precision in a quite simple way. However, the pretty low resolution of the optical sensors does not allow the device to detect gestures of body parts, such as the fingers of a hand, with the same straightforwardness. Given the clear application of this technology to the field of the user interaction within immersive multimedia environments, there is the actual need to have a reliable and effective method to detect the pose of some body parts. In this paper we propose a method based on a neural network to detect in real time the hand pose, to recognize whether it is closed or not. The neural network is used to process information of color, depth and skeleton coming from the Kinect device. This information is preprocessed to extract some significant feature. The output of the neural network is then filtered with a time average, to reduce the noise due to the fluctuation of the input data. We analyze and discuss three possible implementations of the proposed method, obtaining an accuracy of 90% under good conditions of lighting and background, and even reaching the 95% in best cases, in real time
978-0-7695-5093-0
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11387/153385
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 12
  • ???jsp.display-item.citation.isi??? ND
social impact