Spatio-Temporal Difference Descriptor for Skeleton-Based Action Recognition

Chongyang Ding; Kai Liu; Jari Korhonen; Evgeny Belyaev

doi:10.1609/aaai.v35i2.16210

Chongyang Ding^* (Corresponding Author), Kai Liu, Jari Korhonen, Evgeny Belyaev

^*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

8 Citations (Scopus)

Abstract

In skeletal representation, intra-frame differences between body joints, as well as inter-frame dynamics between body skeletons, contain discriminative information for action recognition. Conventional methods for modeling human skeleton sequences generally depend on motion trajectory and body joint dependency information, thus lacking the ability to identify the inherent differences of human skeletons. In this paper, we propose a spatio-temporal difference descriptor based on a directional convolution architecture that enables us to learn the spatio-temporal differences and contextual dependencies between different body joints simultaneously. The overall model is built on a deep symmetric positive definite (SPD) metric learning architecture designed to learn discriminative manifold features with a well-designed non-linear mapping operation. Experiments on several action datasets show that our proposed method achieves up to 3% accuracy improvement over state-of-the-art methods.
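As a rough illustration of the two raw signals the abstract names, the sketch below computes intra-frame differences between joint pairs and inter-frame differences of each joint over time. It is not the authors' descriptor: the directional convolution and the SPD metric learning stages are omitted, and the function names, the all-pairs choice, and the toy 3-joint sequence are assumptions made only for this example.

```python
from itertools import combinations
from typing import List, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) position of one body joint
Frame = List[Joint]                  # one skeleton: a list of joints


def sub(a: Joint, b: Joint) -> Joint:
    """Component-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))


def spatial_differences(frame: Frame) -> List[Joint]:
    """Intra-frame: difference vector for every joint pair (i < j).

    Assumption: all unordered pairs are used; the paper may select pairs differently.
    """
    return [sub(frame[j], frame[i])
            for i, j in combinations(range(len(frame)), 2)]


def temporal_differences(sequence: List[Frame]) -> List[List[Joint]]:
    """Inter-frame: displacement of each joint between consecutive frames."""
    return [[sub(curr_joint, prev_joint)
             for prev_joint, curr_joint in zip(prev, curr)]
            for prev, curr in zip(sequence, sequence[1:])]


# Hypothetical toy sequence: 2 frames, 3 joints each.
sequence = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 2.0, 1.0)],
]

spatial = [spatial_differences(frame) for frame in sequence]
temporal = temporal_differences(sequence)
```

For a sequence of T frames with J joints, this yields T lists of J(J-1)/2 spatial difference vectors and T-1 lists of J temporal difference vectors; a learned model would consume such differences rather than use them directly.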

Original language	English
Title of host publication	The Thirty-Fifth AAAI Conference on Artificial Intelligence, The Thirty-Third Conference on Innovative Applications of Artificial Intelligence and The Eleventh Symposium on Educational Advances in Artificial Intelligence
Subtitle of host publication	Vol. 35 No. 2: AAAI-21 Technical Tracks 2
Place of Publication	Palo Alto, California
Publisher	ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE
Pages	1227-1235
Number of pages	9
Volume	35
ISBN (Print)	978-1-57735-866-4
DOIs	https://doi.org/10.1609/aaai.v35i2.16210
Publication status	Published - 18 May 2021
Event	35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence - Duration: 2 Feb 2021 → 9 Feb 2021

Publication series

Name	AAAI Conference on Artificial Intelligence
Publisher	ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE
Number	2
Volume	35
ISSN (Print)	2159-5399
ISSN (Electronic)	2374-3468

Conference

Conference	35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence
Period	2/02/21 → 9/02/21

Keywords

Video Understanding & Activity Analysis

Access to Document

10.1609/aaai.v35i2.16210
Licence: Unspecified

Cite this

Ding, C., Liu, K., Korhonen, J., & Belyaev, E. (2021). Spatio-Temporal Difference Descriptor for Skeleton-Based Action Recognition. In The Thirty-Fifth AAAI Conference on Artificial Intelligence, The Thirty-Third Conference on Innovative Applications of Artificial Intelligence and The Eleventh Symposium on Educational Advances in Artificial Intelligence: Vol. 35 No. 2: AAAI-21 Technical Tracks 2 (Vol. 35, pp. 1227-1235). (AAAI Conference on Artificial Intelligence; Vol. 35, No. 2). ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. https://doi.org/10.1609/aaai.v35i2.16210

Spatio-Temporal Difference Descriptor for Skeleton-Based Action Recognition. / Ding, Chongyang (Corresponding Author); Liu, Kai; Korhonen, Jari et al.
The Thirty-Fifth AAAI Conference on Artificial Intelligence, The Thirty-Third Conference on Innovative Applications of Artificial Intelligence and The Eleventh Symposium on Educational Advances in Artificial Intelligence: Vol. 35 No. 2: AAAI-21 Technical Tracks 2. Vol. 35 Palo Alto, California: ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE, 2021. p. 1227-1235 (AAAI Conference on Artificial Intelligence; Vol. 35, No. 2).

Ding, C, Liu, K, Korhonen, J & Belyaev, E 2021, Spatio-Temporal Difference Descriptor for Skeleton-Based Action Recognition. in The Thirty-Fifth AAAI Conference on Artificial Intelligence, The Thirty-Third Conference on Innovative Applications of Artificial Intelligence and The Eleventh Symposium on Educational Advances in Artificial Intelligence: Vol. 35 No. 2: AAAI-21 Technical Tracks 2. vol. 35, AAAI Conference on Artificial Intelligence, no. 2, vol. 35, ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE, Palo Alto, California, pp. 1227-1235, 35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence, 2/02/21. https://doi.org/10.1609/aaai.v35i2.16210

Ding C, Liu K, Korhonen J, Belyaev E. Spatio-Temporal Difference Descriptor for Skeleton-Based Action Recognition. In The Thirty-Fifth AAAI Conference on Artificial Intelligence, The Thirty-Third Conference on Innovative Applications of Artificial Intelligence and The Eleventh Symposium on Educational Advances in Artificial Intelligence: Vol. 35 No. 2: AAAI-21 Technical Tracks 2. Vol. 35. Palo Alto, California: ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2021. p. 1227-1235. (AAAI Conference on Artificial Intelligence; 2). Epub 2021 May 18. doi: 10.1609/aaai.v35i2.16210

Ding, Chongyang ; Liu, Kai ; Korhonen, Jari et al. / Spatio-Temporal Difference Descriptor for Skeleton-Based Action Recognition. The Thirty-Fifth AAAI Conference on Artificial Intelligence, The Thirty-Third Conference on Innovative Applications of Artificial Intelligence and The Eleventh Symposium on Educational Advances in Artificial Intelligence: Vol. 35 No. 2: AAAI-21 Technical Tracks 2. Vol. 35 Palo Alto, California : ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE, 2021. pp. 1227-1235 (AAAI Conference on Artificial Intelligence; 2).

@inproceedings{ac80827dc5964edeae1bbc47f219675a,

title = "Spatio-Temporal Difference Descriptor for Skeleton-Based Action Recognition",

abstract = "In skeletal representation, intra-frame differences between body joints, as well as inter-frame dynamics between body skeletons contain discriminative information for action recognition. Conventional methods for modeling human skeleton sequences generally depend on motion trajectory and body joint dependency information, thus lacking the ability to identify the inherent differences of human skeletons. In this paper, we propose a spatio-temporal difference descriptor based on a directional convolution architecture that enables us to learn the spatio-temporal differences and contextual dependencies between different body joints simultaneously. The overall model is built on a deep symmetric positive definite (SPD) metric learning architecture designed to learn discriminative manifold features with the well-designed non-linear mapping operation. Experiments on several action datasets show that our proposed method achieves up to 3% accuracy improvement over state-of-the-art methods.",

keywords = "Video Understanding \& Activity Analysis",

author = "Chongyang Ding and Kai Liu and Jari Korhonen and Evgeny Belyaev",

year = "2021",

month = may,

day = "18",

doi = "10.1609/aaai.v35i2.16210",

language = "English",

isbn = "978-1-57735-866-4",

volume = "35",

series = "AAAI Conference on Artificial Intelligence",

publisher = "ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE",

number = "2",

pages = "1227--1235",

booktitle = "The Thirty-Fifth AAAI Conference on Artificial Intelligence, The Thirty-Third Conference on Innovative Applications of Artificial Intelligence and The Eleventh Symposium on Educational Advances in Artificial Intelligence",

note = "35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence ; Conference date: 02-02-2021 Through 09-02-2021",

}

TY - GEN

T1 - Spatio-Temporal Difference Descriptor for Skeleton-Based Action Recognition

AU - Ding, Chongyang

AU - Liu, Kai

AU - Korhonen, Jari

AU - Belyaev, Evgeny

PY - 2021/5/18

Y1 - 2021/5/18

N2 - In skeletal representation, intra-frame differences between body joints, as well as inter-frame dynamics between body skeletons contain discriminative information for action recognition. Conventional methods for modeling human skeleton sequences generally depend on motion trajectory and body joint dependency information, thus lacking the ability to identify the inherent differences of human skeletons. In this paper, we propose a spatio-temporal difference descriptor based on a directional convolution architecture that enables us to learn the spatio-temporal differences and contextual dependencies between different body joints simultaneously. The overall model is built on a deep symmetric positive definite (SPD) metric learning architecture designed to learn discriminative manifold features with the well-designed non-linear mapping operation. Experiments on several action datasets show that our proposed method achieves up to 3% accuracy improvement over state-of-the-art methods.

AB - In skeletal representation, intra-frame differences between body joints, as well as inter-frame dynamics between body skeletons contain discriminative information for action recognition. Conventional methods for modeling human skeleton sequences generally depend on motion trajectory and body joint dependency information, thus lacking the ability to identify the inherent differences of human skeletons. In this paper, we propose a spatio-temporal difference descriptor based on a directional convolution architecture that enables us to learn the spatio-temporal differences and contextual dependencies between different body joints simultaneously. The overall model is built on a deep symmetric positive definite (SPD) metric learning architecture designed to learn discriminative manifold features with the well-designed non-linear mapping operation. Experiments on several action datasets show that our proposed method achieves up to 3% accuracy improvement over state-of-the-art methods.

KW - Video Understanding & Activity Analysis

UR - https://slideslive.com/38948056/spatiotemporal-difference-descriptor-for-skeletonbased-action-recognition?ref=account-79851-presentations

U2 - 10.1609/aaai.v35i2.16210

DO - 10.1609/aaai.v35i2.16210

M3 - Published conference contribution

SN - 978-1-57735-866-4

VL - 35

T3 - AAAI Conference on Artificial Intelligence

SP - 1227

EP - 1235

BT - The Thirty-Fifth AAAI Conference on Artificial Intelligence, The Thirty-Third Conference on Innovative Applications of Artificial Intelligence and The Eleventh Symposium on Educational Advances in Artificial Intelligence

PB - ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE

CY - Palo Alto, California

T2 - 35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence

Y2 - 2 February 2021 through 9 February 2021

ER -
