CurvSSL: Local Geometry in Self-Supervised Learning

3D diagram showing a data manifold with regions of high curvature highlighted in warm colors, alongside embeddings projected onto a unit hypersphere with lines connecting nearest neighbors.

CurvSSL is a self-supervised learning approach that explicitly incorporates the local geometry of the data manifold. Whereas standard non-contrastive methods typically do not model this geometry directly, CurvSSL keeps the standard two-view architecture and adds a curvature-based regularizer that captures local geometric properties 🧠.

Curvature Mechanism in Embedding Spaces

Each embedding is treated as a vertex whose discrete curvature is computed from cosine interactions with its k nearest neighbors on the unit hypersphere. A kernel variant works in a reproducing kernel Hilbert space (RKHS) and derives curvature from local Gram matrices. The curvature values of the two views are then aligned through a loss function inspired by Barlow Twins, which reduces redundancy while enforcing both invariance to augmentations and geometric consistency 🔄.
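To make the mechanism concrete, here is a minimal NumPy sketch. The exact curvature formula is not given above, so `knn_curvature` uses a hypothetical stand-in (one minus the mean cosine similarity to the k nearest neighbors) and should not be read as the method's actual definition. The function names are illustrative as well. The kernel helper relies only on the standard identity that the cosine between two points in feature space is K_ij / sqrt(K_ii K_jj).

```python
import numpy as np

def cosine_matrix(z):
    """Pairwise cosines between rows of z after projecting them onto the unit hypersphere."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    return z @ z.T

def kernel_cosine_matrix(gram):
    """Pairwise cosines in feature space, computed from a Gram matrix K."""
    d = np.sqrt(np.diag(gram))
    return gram / np.outer(d, d)

def knn_curvature(cos, k):
    """Hypothetical curvature proxy (an assumption, not the paper's formula):
    1 - mean cosine similarity to the k nearest neighbors of each point."""
    c = cos.copy()
    np.fill_diagonal(c, -np.inf)         # a point is not its own neighbor
    top_k = np.sort(c, axis=1)[:, -k:]   # k largest cosines in each row
    return 1.0 - top_k.mean(axis=1)

# Usage: one curvature value per embedding in a batch.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 4))
curvatures = knn_curvature(cosine_matrix(embeddings), k=3)   # shape (8,)
```

With a linear kernel the two cosine helpers agree, so the kernel path only differs from the plain one when a nonlinear kernel supplies the Gram matrix.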

Main features of the approach:

Discrete curvature calculation through cosine relationships on the unit hypersphere
Kernel variant operating in a reproducing kernel Hilbert space (RKHS)
Loss that aligns and decorrelates curvatures between different views
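The align-and-decorrelate idea in the last item can be sketched with the standard Barlow Twins objective: standardize each feature over the batch, form the cross-correlation matrix between the two views, push its diagonal toward 1 and its off-diagonal entries toward 0. How CurvSSL arranges curvature values into per-sample feature vectors is not specified here, so the inputs below are simply assumed to be (N, D) arrays, and the weight `lam` is a placeholder.

```python
import numpy as np

def barlow_twins_style_loss(a, b, lam=5e-3, eps=1e-8):
    """Barlow Twins style loss between two views.

    a, b: (N, D) arrays of per-sample features from view 1 and view 2
    (assumed layout; lam is an illustrative trade-off weight).
    """
    n = a.shape[0]
    # Standardize every feature dimension over the batch.
    a = (a - a.mean(axis=0)) / (a.std(axis=0) + eps)
    b = (b - b.mean(axis=0)) / (b.std(axis=0) + eps)
    c = a.T @ b / n                                    # (D, D) cross-correlation
    on_diag = np.sum((1.0 - np.diag(c)) ** 2)          # alignment between views
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)  # redundancy between dimensions
    return on_diag + lam * off_diag
```

The loss is near zero when the two views agree and their feature dimensions are uncorrelated, and it grows when the views disagree or the dimensions carry redundant information.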

Curvature has transcended the realm of 3D design to become a fundamental ally in machine learning, demonstrating that well-defined curves have charm even in embedding spaces.

Experimental Evaluation and Comparisons

Experiments on the MNIST and CIFAR-10 benchmarks with a ResNet-18 backbone indicate that CurvSSL matches or surpasses established methods such as Barlow Twins and VICReg. These results suggest that local geometric information effectively complements purely statistical regularizers, offering a richer view of the intrinsic structure of the data 📊.

Advantages demonstrated experimentally:

Competitive or superior performance against established methods
Effective integration between geometric information and statistical regularizers
Enhanced capture of the intrinsic structure of the data

Implications and Future of the Approach

Incorporating explicit local geometry is a notable conceptual step for self-supervised learning. Beyond matching or improving on established baselines in the reported experiments, CurvSSL sheds light on how models can learn more meaningful representations by taking the underlying geometric structure of the data into account. This hybrid of statistics and geometry may open new directions in representation learning research 🚀.