sipg-isr/visualsemantic-subspaces

Gabriel Moreira, Manuel Marques, João Costeira, Alexander G. Hauptmann

Published at AISTATS 2025

PDF

Abstract:

Learning image representations that capture rich semantic relationships remains a significant challenge. Existing approaches either are contrastive, lacking robust theoretical guarantees, or struggle to effectively represent the partial orders inherent to structured visual-semantic data. In this paper, we introduce a nuclear norm-based loss function, grounded in the same information-theoretic principles that have proved effective in self-supervised learning. We present a theoretical characterization of this loss, demonstrating that, in addition to promoting class orthogonality, it encodes the spectral geometry of the data within a subspace lattice. This geometric representation allows us to associate logical propositions with subspaces, ensuring that our learned representations adhere to a predefined symbolic structure.
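The nuclear norm mentioned above is simply the sum of a matrix's singular values. A minimal NumPy sketch (illustrative only, not the paper's exact loss; the toy matrix `Z` and its row normalization are assumptions) shows the property the loss exploits: for rows of fixed norm, the nuclear norm grows as the rows spread over orthogonal directions and shrinks as they collapse onto a single direction.

```python
import numpy as np

def nuclear_norm(Z):
    """Nuclear norm ||Z||_* = sum of the singular values of Z."""
    return np.linalg.svd(Z, compute_uv=False).sum()

# Toy batch: rows are L2-normalized feature vectors.
rng = np.random.default_rng(0)
Z = rng.standard_normal((8, 4))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)

# Rows spread over several directions -> large nuclear norm;
# all rows collapsed onto one direction (rank 1) -> the minimum,
# which for 8 unit-norm rows equals sqrt(8).
spread = nuclear_norm(Z)
collapsed = nuclear_norm(np.tile(Z[0], (8, 1)))
assert collapsed < spread
```

Maximizing such a term thus pushes representations toward orthogonal, well-spread subspaces, which is the geometric behavior the theoretical characterization in the paper analyzes.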

[Figure: retrievals for negated queries, CLIP vs. the proposed subspace representation]

The ability to explicitly represent negations is a key feature of this subspace representation. The figure shows that large systems such as CLIP, which lack structured representations by design, fail to cope with negations, among other queries, whereas imposing a subspace structure lets negation be represented naturally by the orthogonal complement.
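Concretely, if a concept corresponds to a subspace with orthonormal basis `U`, its negation can be scored through the projector onto the orthogonal complement, `I - U @ U.T`. A hypothetical NumPy sketch (the 3-D embedding space, basis, and "glasses" concept are made up for illustration):

```python
import numpy as np

def projector(U):
    """Orthogonal projector onto span(U); columns of U are orthonormal."""
    return U @ U.T

# Hypothetical 3-D embedding space: the concept "glasses" spans the
# first two axes; its negation is the orthogonal complement (third axis).
U_glasses = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [0.0, 0.0]])
P = projector(U_glasses)          # projector for "glasses"
P_not = np.eye(3) - P             # projector for "not glasses"

x_in = np.array([0.6, 0.8, 0.0])  # embedding inside the "glasses" subspace
x_out = np.array([0.0, 0.0, 1.0]) # embedding in the orthogonal complement

# Score a query/embedding pair by the energy of the projected embedding.
score = lambda Proj, x: float(x @ Proj @ x)
assert score(P, x_in) > score(P_not, x_in)
assert score(P_not, x_out) > score(P, x_out)
```

Because the two projectors sum to the identity, every embedding splits exactly into a "glasses" part and a "not glasses" part, which is why negation comes for free once the subspace structure is imposed.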

Code

Run training

python ./train.py --config-name=celeb-a general.name="name_of_experiment"

About

Code for the paper Learning Visual-Semantic Subspaces
