{"id":2479,"date":"2021-09-02T09:00:31","date_gmt":"2021-09-02T01:00:31","guid":{"rendered":"https:\/\/blog.ailia.ai\/uncategorized\/movenet-skeleton-detection-model-for-videos-with-intense-motion\/"},"modified":"2025-05-20T16:04:11","modified_gmt":"2025-05-20T08:04:11","slug":"movenet-skeleton-detection-model-for-videos-with-intense-motion","status":"publish","type":"post","link":"https:\/\/blog.ailia.ai\/en\/tips-en\/movenet-skeleton-detection-model-for-videos-with-intense-motion\/","title":{"rendered":"MoveNet : Pose Estimation for Video with Intense Motion"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\" id=\"bd2a\"><strong>Overview<\/strong><\/h3>\n\n\n\n<p id=\"c43a\"><em>MoveNet\u00a0<\/em>is a pose estimation model released by\u00a0<em>Google\u00a0<\/em>on May 17, 2021. Compared to conventional pose estimation models, it improves detection accuracy in videos with intense motion, making it well suited to live fitness and sports applications.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"810\" height=\"480\" src=\"https:\/\/blog.ailia.ai\/wp-content\/uploads\/image-1.gif\" alt=\"\" class=\"wp-image-273\"\/><figcaption class=\"wp-element-caption\">Source: <a href=\"https:\/\/blog.tensorflow.org\/2021\/05\/next-generation-pose-detection-with-movenet-and-tensorflowjs.html\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/blog.tensorflow.org\/2021\/05\/next-generation-pose-detection-with-movenet-and-tensorflowjs.html<\/a><\/figcaption><\/figure>\n\n\n\n<p><a href=\"https:\/\/blog.tensorflow.org\/2021\/05\/next-generation-pose-detection-with-movenet-and-tensorflowjs.html\" target=\"_blank\" rel=\"noreferrer noopener\">Next-Generation Pose Detection with MoveNet and 
TensorFlow.js<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"1650\"><strong>Architecture<\/strong><\/h3>\n\n\n\n<p id=\"4295\"><em>MoveNet<\/em>&nbsp;detects 17 two-dimensional keypoints with high speed and high accuracy. Two models are available:&nbsp;<em>Lightning&nbsp;<\/em>and&nbsp;<em>Thunder.<\/em>&nbsp;The former suits applications that require speed, the latter applications that require accuracy. Both&nbsp;<em>Lightning&nbsp;<\/em>and&nbsp;<em>Thunder&nbsp;<\/em>can run at 30 FPS or higher on desktop PCs, laptops, and smartphones.<\/p>\n\n\n\n<p id=\"304c\">The architecture is similar to&nbsp;<a href=\"https:\/\/medium.com\/axinc-ai\/centernet-a-machine-learning-model-for-anchorless-object-detection-462c48483cfe\"><em>CenterNet<\/em><\/a>. The feature extractor is based on&nbsp;<em>MobileNetV2<\/em>, to which a&nbsp;<em>Feature Pyramid Network<\/em>&nbsp;(FPN) was added. 
An output stride of 4 yields a high-resolution output feature map.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"881\" src=\"https:\/\/blog.ailia.ai\/wp-content\/uploads\/image-42.png\" alt=\"\" class=\"wp-image-272\"\/><figcaption class=\"wp-element-caption\">Source: <a href=\"https:\/\/blog.tensorflow.org\/2021\/05\/next-generation-pose-detection-with-movenet-and-tensorflowjs.html\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/blog.tensorflow.org\/2021\/05\/next-generation-pose-detection-with-movenet-and-tensorflowjs.html<\/a><\/figcaption><\/figure>\n\n\n\n<p id=\"419a\">The model outputs four tensors: a person center heatmap, a keypoint regression field, a person keypoint heatmap, and a 2D per-keypoint offset field.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"741\" src=\"https:\/\/blog.ailia.ai\/wp-content\/uploads\/image-41.png\" alt=\"\" class=\"wp-image-271\"\/><figcaption class=\"wp-element-caption\">Source: <a href=\"https:\/\/blog.tensorflow.org\/2021\/05\/next-generation-pose-detection-with-movenet-and-tensorflowjs.html\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/blog.tensorflow.org\/2021\/05\/next-generation-pose-detection-with-movenet-and-tensorflowjs.html<\/a><\/figcaption><\/figure>\n\n\n\n<p id=\"8bea\">The model was trained on the COCO dataset and an internal Google dataset called\u00a0<em>Active<\/em>. One limitation of the COCO dataset is that it lacks data from harsh conditions where poses change drastically or motion blur is present, which makes it a poor fit for fitness and dance applications. Google\u2019s internal dataset, by contrast, consists of annotated yoga, fitness, and dance videos from YouTube. 
Only three frames are sampled from each video to keep the dataset diverse.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"720\" src=\"https:\/\/blog.ailia.ai\/wp-content\/uploads\/image-1.jpeg\" alt=\"\" class=\"wp-image-270\"\/><figcaption class=\"wp-element-caption\">Source: <a href=\"https:\/\/blog.tensorflow.org\/2021\/05\/next-generation-pose-detection-with-movenet-and-tensorflowjs.html\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/blog.tensorflow.org\/2021\/05\/next-generation-pose-detection-with-movenet-and-tensorflowjs.html<\/a><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"cd4f\"><strong>Usage<\/strong><\/h3>\n\n\n\n<p id=\"565c\">You can run MoveNet with the ailia SDK on a webcam video stream using the following command.<\/p>\n\n\n\n<p><code>$ python3 movenet.py -v 0<\/code><\/p>\n\n\n\n<p id=\"3783\">Here is the result you can expect.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"ailia MODELS : MoveNet\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/hFUMD46Nugc?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p><a href=\"https:\/\/github.com\/axinc-ai\/ailia-models\/tree\/master\/pose_estimation\/movenet?source=post_page-----d26d9e06126c--------------------------------\" target=\"_blank\" rel=\"noreferrer 
noopener\">ailia-models\/pose_estimation\/movenet at master \u00b7 axinc-ai\/ailia-models<\/a><\/p>\n\n\n\n<p id=\"fad5\"><a href=\"https:\/\/axinc.jp\/en\/\" rel=\"noreferrer noopener\" target=\"_blank\">ax Inc.<\/a>&nbsp;has developed&nbsp;<a href=\"https:\/\/ailia.jp\/en\/\" rel=\"noreferrer noopener\" target=\"_blank\">ailia SDK<\/a>, which enables rapid, GPU-based, cross-platform inference.<\/p>\n\n\n\n<p id=\"fad5\">ax Inc. provides a wide range of services, from consulting and model creation to the development of AI-based applications and SDKs. Feel free to&nbsp;<a href=\"https:\/\/axinc.jp\/en\/\" rel=\"noreferrer noopener\" target=\"_blank\">contact us<\/a>&nbsp;with any inquiries.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Overview MoveNet\u00a0is a pose estimation model released by\u00a0Google\u00a0on May 17, 2021. Compared to conventional pose estimation models, it improves the detection accuracy in videos with intense motion. It is ideal for live fitness and sports applications. Next-Generation Pose Detection with MoveNet and TensorFlow.js Architecture MoveNet&nbsp;is able to detect 17 two-dimensional keypoints with high speed and high accuracy. There are two models available,&nbsp;Lightning&nbsp;and&nbsp;Thunder.&nbsp;The former can be used for applications that require speed and the latter for applications that require accuracy. Both&nbsp;Lightning&nbsp;and&nbsp;Thunder&nbsp;can run at 30FPS or higher on desktop PCs, laptops, and smartphones. The architecture is similar to&nbsp;CenterNet. The feature extractor is based on&nbsp;MobileNetV2&nbsp;to which&nbsp;Feature Pyramid Network&nbsp;(FPN) was added. 
By setting output [&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":2415,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[255],"tags":[266],"class_list":["post-2479","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tips-en","tag-ailiamodels-en"],"acf":[],"_links":{"self":[{"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/posts\/2479","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/comments?post=2479"}],"version-history":[{"count":1,"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/posts\/2479\/revisions"}],"predecessor-version":[{"id":2481,"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/posts\/2479\/revisions\/2481"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/media\/2415"}],"wp:attachment":[{"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/media?parent=2479"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/categories?post=2479"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.ailia.ai\/en\/wp-json\/wp\/v2\/tags?post=2479"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}