This is an introduction to HOPE-Net, a machine learning model that can be used with ailia SDK. You can easily use this model to create AI applications with ailia SDK, alongside many other ready-to-use ailia MODELS.
Overview
HOPE-Net is a machine learning model released in October 2017 that computes the three rotation angles (yaw, pitch, and roll) of a face in an input image.
Fine-Grained Head Pose Estimation Without Keypoints
Architecture
Face orientation detection is an important technology used in gaze detection and in recognizing which objects in a scene are being watched.
Face orientation detection usually works by detecting key points on the target face and mapping those points from 2D to 3D using a standard head model. This approach has two drawbacks: the result depends on the accuracy of the detected key points, and fitting them to the head model requires ad-hoc tuning.
HOPE-Net uses a multi-loss convolutional neural network to estimate face orientation in a single shot. Taking the face region found by a face detector as input, ResNet50 extracts features and fully connected (FC) layers compute yaw, pitch, and roll.
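To make the "multi-loss" idea more concrete, here is a minimal sketch in plain Python. It assumes the binned scheme described in the HOPE-Net paper: each angle is classified into 66 bins of 3 degrees covering roughly -99 to +99 degrees, the continuous angle is recovered as the expected value of the softmax output, and the loss adds a cross-entropy term to a weighted squared error. The bin constants, the function names, and the `alpha` weight are illustrative assumptions, not part of ailia SDK or of `hopenet.py`.

```python
import math

# Assumptions for illustration: 66 bins of 3 degrees starting at -99 degrees.
NUM_BINS = 66
BIN_WIDTH_DEG = 3.0
ANGLE_OFFSET_DEG = -99.0


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_angle(logits):
    """Turn per-bin logits for one axis into a continuous angle in degrees."""
    probs = softmax(logits)
    # Expected bin index under the predicted distribution.
    expected_bin = sum(p * i for i, p in enumerate(probs))
    return expected_bin * BIN_WIDTH_DEG + ANGLE_OFFSET_DEG


def multi_loss(logits, true_angle_deg, alpha=1.0):
    """Cross-entropy on the bin label plus alpha times the squared angle error."""
    probs = softmax(logits)
    true_bin = int((true_angle_deg - ANGLE_OFFSET_DEG) // BIN_WIDTH_DEG)
    true_bin = min(max(true_bin, 0), NUM_BINS - 1)  # clamp to a valid bin
    # Floor the probability to avoid log(0) on extreme logits.
    cross_entropy = -math.log(max(probs[true_bin], 1e-12))
    squared_error = (expected_angle(logits) - true_angle_deg) ** 2
    return cross_entropy + alpha * squared_error


if __name__ == "__main__":
    # Hypothetical logits that strongly favor the bin around 0 degrees.
    yaw_logits = [0.0] * NUM_BINS
    yaw_logits[33] = 50.0
    print(expected_angle(yaw_logits))
    print(multi_loss(yaw_logits, 0.0))
```

In this sketch the same two functions would be applied separately to the yaw, pitch, and roll outputs. The classification term pushes the network toward the right angular neighborhood, while the regression term refines the estimate within it.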

HOPE-Net performs best on AFLW2000, a dataset made of the first 2000 images of the Annotated Facial Landmarks in the Wild (AFLW) dataset, which have been re-annotated with 68 3D landmarks.

HOPE-Net Usage
Use the following command to run HOPE-Net and detect face orientation from a web camera.
$ python3 hopenet.py -v 0
You can also use a faster version that uses ShuffleNetV2 instead of ResNet50 with the following command.
$ python3 hopenet.py --lite -v 0
axinc-ai/ailia-models
Here is the kind of result you can expect.
ax Inc. has developed ailia SDK, which enables cross-platform, GPU-based rapid inference.
ax Inc. provides a wide range of services from consulting and model creation, to the development of AI-based applications and SDKs. Feel free to contact us for any inquiry.