Set-Theoretic Embeddings

Embedding methods whose geometry directly supports set operations — intersection, union, containment — so that semantic relations map onto operations on sets. The motivating intuition: a word’s meaning is better modeled as a set of senses/contexts than as a single point, and relations like hypernymy (“dog” ⊆ “animal”) or conjunction (“red” ∩ “car”) are set operations.

Region representations make this possible because regions, unlike points, can contain and intersect one another:

  • Containment (one region inside another) ≈ hypernymy / “is-a”
  • Intersection (overlap volume) ≈ conjunction / relatedness
  • Volume ≈ breadth / probability of a concept

Box embeddings are the cleanest realization — axis-aligned boxes have exact, cheap intersection — which is why Word2Box is subtitled Capturing Set-Theoretic Semantics of Words using Box Embeddings. Point dense vectors cannot express these set operations directly; cosine similarity only ranks closeness.

Articles