TY - UNPB
T1 - S-JEA
T2 - Stacked Joint Embedding Architectures for Self-Supervised Visual Representation Learning
AU - Manova, Alzbeta
AU - Durrant, Aiden
AU - Leontidis, Georgios
PY - 2023/5/19
Y1 - 2023/5/19
N2 - The recent emergence of Self-Supervised Learning (SSL) as a fundamental paradigm for learning image representations has demonstrated, and continues to demonstrate, high empirical success in a variety of tasks. However, most SSL approaches fail to learn embeddings that capture hierarchical semantic concepts that are separable and interpretable. In this work, we aim to learn highly separable semantic hierarchical representations by stacking Joint Embedding Architectures (JEAs), where higher-level JEAs take the representations of lower-level JEAs as input. This results in a representation space that exhibits distinct sub-categories of semantic concepts (e.g., model and colour of vehicles) in higher-level JEAs. We empirically show that representations from stacked JEAs perform at a similar level to traditional JEAs with comparable parameter counts, and we visualise the representation spaces to validate the semantic hierarchies.
AB - The recent emergence of Self-Supervised Learning (SSL) as a fundamental paradigm for learning image representations has demonstrated, and continues to demonstrate, high empirical success in a variety of tasks. However, most SSL approaches fail to learn embeddings that capture hierarchical semantic concepts that are separable and interpretable. In this work, we aim to learn highly separable semantic hierarchical representations by stacking Joint Embedding Architectures (JEAs), where higher-level JEAs take the representations of lower-level JEAs as input. This results in a representation space that exhibits distinct sub-categories of semantic concepts (e.g., model and colour of vehicles) in higher-level JEAs. We empirically show that representations from stacked JEAs perform at a similar level to traditional JEAs with comparable parameter counts, and we visualise the representation spaces to validate the semantic hierarchies.
KW - Deep Learning
KW - Self-Supervised Learning
KW - Computer vision
UR - https://arxiv.org/pdf/2305.11701.pdf
U2 - 10.48550/arXiv.2305.11701
DO - 10.48550/arXiv.2305.11701
M3 - Preprint
SP - 1
EP - 9
BT - S-JEA
PB - ArXiv
ER -