Region-Based Representation

A family of embedding methods that represent an object as a region of space — a shape with volume — rather than a single point. Because a region has extent, it can express things a point cannot: the spread of a concept’s meaning, containment (hierarchy / hypernymy), and overlap (relatedness, set-theoretic relations).

Motivation: point embeddings (Word2Vec, BERT) encode similarity as distance or direction. Distance is symmetric and a point has no volume, so a point embedding cannot express that “animal” contains “dog” and “cat”. Region representations make that asymmetric relation expressible.

The Main Variants

| Variant | Shape | Captures | Note |
| --- | --- | --- | --- |
| Gaussian Embedding | Gaussian distribution | Spread via variance; uncertainty | Vilnis & McCallum, ICLR 2015 |
| Poincaré Embedding | Point in hyperbolic space | Tree-like hierarchy with few dims | Nickel & Kiela, NIPS 2017 |
| Box Embedding | Axis-aligned box | Containment, intersection, volume | Cheapest exact overlap calculation |
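
To make the first two variants concrete, here is a minimal sketch of the quantities they are usually scored with: the KL divergence between two diagonal-covariance Gaussians (an asymmetric score) and the geodesic distance in the Poincaré ball. The function names and the toy vectors are illustrative assumptions, not taken from either paper's code.

```python
import math
from typing import Sequence


def kl_diag_gaussians(
    mu0: Sequence[float], var0: Sequence[float],
    mu1: Sequence[float], var1: Sequence[float],
) -> float:
    """KL(N0 || N1) for two Gaussians with diagonal covariance."""
    return 0.5 * sum(
        v0 / v1 + (m1 - m0) ** 2 / v1 - 1.0 + math.log(v1 / v0)
        for m0, v0, m1, v1 in zip(mu0, var0, mu1, var1)
    )


def poincare_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Geodesic distance between two points strictly inside the unit ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# Toy example (made-up numbers): a narrow "dog" and a broad "animal"
# centred on the same mean.
dog_mu, dog_var = (0.0, 0.0), (0.1, 0.1)
animal_mu, animal_var = (0.0, 0.0), (1.0, 1.0)

# KL is asymmetric: the narrow concept fits inside the broad one
# far more cheaply than the reverse.
print(kl_diag_gaussians(dog_mu, dog_var, animal_mu, animal_var))
print(kl_diag_gaussians(animal_mu, animal_var, dog_mu, dog_var))

# In the Poincaré ball, distance grows quickly towards the boundary.
print(poincare_distance((0.0, 0.0), (0.5, 0.0)))
print(poincare_distance((0.0, 0.0), (0.9, 0.0)))
```

The asymmetry of the KL score is what lets variance stand in for generality, and the rapid growth of distance near the boundary is what gives hyperbolic space room for tree-like structure in few dimensions.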

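Boxes are the most widely applied of the three largely because intersection and volume are trivial to compute: the intersection of two boxes is again a box (per-dimension max of the lower corners, min of the upper corners), and its volume is the product of its side lengths. Gaussian overlap and hyperbolic distances are costlier.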
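
The per-dimension min/max computation can be sketched in a few lines. This is an illustrative example only: the `Box` class, the helper names, and the coordinates are hypothetical and do not come from any particular box-embedding library.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its lower and upper corners (hypothetical helper)."""
    lower: Sequence[float]
    upper: Sequence[float]

    def volume(self) -> float:
        # Product of side lengths; a negative side means the box is empty.
        return prod(max(u - l, 0.0) for l, u in zip(self.lower, self.upper))


def intersect(a: Box, b: Box) -> Box:
    # Per-dimension max of the lower corners and min of the upper corners.
    lower = tuple(max(x, y) for x, y in zip(a.lower, b.lower))
    upper = tuple(min(x, y) for x, y in zip(a.upper, b.upper))
    return Box(lower, upper)


def contains(outer: Box, inner: Box) -> bool:
    # inner lies inside outer iff each of its intervals lies inside outer's.
    return all(
        ol <= il and iu <= ou
        for ol, il, iu, ou in zip(outer.lower, inner.lower, inner.upper, outer.upper)
    )


# Made-up 2-D boxes for illustration.
animal = Box((0.0, 0.0), (4.0, 4.0))
dog = Box((1.0, 1.0), (2.0, 3.0))
cat = Box((1.5, 0.5), (3.0, 2.0))

print(contains(animal, dog))                          # True
print(contains(dog, animal))                          # False
print(intersect(dog, cat).volume())                   # 0.5
print(intersect(dog, cat).volume() / dog.volume())    # 0.25, overlap as a fraction of "dog"
```

Containment is asymmetric by construction, and the overlap ratio gives a graded relatedness score. Note that the hard min/max here has zero gradient for disjoint boxes, so trained models typically replace it with a smoothed volume; that step is left out of this sketch.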

Articles