Set-Theoretic Embeddings
Embedding methods whose geometry directly supports set operations — intersection, union, containment — so that semantic relations map onto operations on sets. The motivating intuition: a word’s meaning is better modeled as a set of senses/contexts than as a single point, and relations like hypernymy (“dog” ⊆ “animal”) or conjunction (“red” ∩ “car”) are set operations.
Region representations make this possible because regions, unlike points, can contain and intersect one another:
- Containment (one region inside another) ≈ hypernymy / “is-a”
- Intersection (overlap volume) ≈ conjunction / relatedness
- Volume ≈ breadth / probability of a concept
Box embeddings are the cleanest realization — axis-aligned boxes have exact, cheap intersection — which is why Word2Box is subtitled Capturing Set-Theoretic Semantics of Words using Box Embeddings. Point dense vectors cannot express these set operations directly; cosine similarity only ranks closeness.
Related Concepts
- Box Embedding — exact set operations via box intersection
- Word2Box — learns set-theoretic word semantics unsupervised
- Region-Based Representation — what makes set operations expressible
- Compositional Embeddings — building complex meanings by combining (set) operations
- Embeddings — point baseline that lacks native set operations
Articles
- Express Words in a Box - Understanding Box Embedding from the Basics — Shun Tsukagoshi; Word2Box is built to capture set-theoretic semantics
- Answering Compositional Queries with Set-Theoretic Embeddings — Shib Sankar Dasgupta et al.; intersection/union/complement as query operators for Compositional Queries