Polysemy Problem with Static Embeddings

Explore how traditional word embeddings struggle with words that have multiple meanings

Visualization of the Polysemy Problem

The Polysemy Problem

Traditional word embeddings (Word2Vec, GloVe) assign a single vector to each word in the vocabulary. This causes problems for words with multiple meanings, as the embedding becomes an "average" of all senses.

For example, the word "bank" can refer to a financial institution or the side of a river. These meanings are quite different, but in static embeddings, they get merged into one representation.
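
As a quick illustration of this single-vector behaviour, the minimal sketch below looks up "bank" with a pretrained static model. It assumes the gensim package and its downloadable "glove-wiki-gigaword-50" vectors, which are illustrative choices rather than anything prescribed here; any Word2Vec or GloVe model behaves the same way.

```python
import gensim.downloader as api

# Assumed pretrained static vectors (an illustrative choice); any Word2Vec/GloVe
# KeyedVectors object would behave identically.
vectors = api.load("glove-wiki-gigaword-50")

sentences = [
    "I deposited money in the bank yesterday.",
    "We had a picnic on the bank of the river.",
]

# The lookup ignores the surrounding words entirely: "bank" maps to the same
# vector no matter which sentence it came from.
for sentence in sentences:
    print(sentence, "->", vectors["bank"][:4])
```

Both sentences print the identical slice, because a static model stores exactly one row per vocabulary word.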

More advanced models covered in Module 2 (such as BERT) use contextual embeddings: each occurrence of a word gets a different vector depending on its surrounding context, which addresses this limitation.

Words with Multiple Meanings

Financial Sense

I deposited money in the bank yesterday.

River Sense

We had a picnic on the bank of the river.
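
To make the contrast concrete, the sketch below extracts the contextual vector for "bank" from each of the two sentences above and compares them with cosine similarity. It assumes the Hugging Face transformers and torch packages and the bert-base-uncased checkpoint, which are illustrative assumptions rather than part of the original material; a contextual model yields noticeably different vectors for the two senses.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence: str) -> torch.Tensor:
    """Return the contextual embedding of the token 'bank' in the sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_size)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index("bank")]

v_financial = bank_vector("I deposited money in the bank yesterday.")
v_river = bank_vector("We had a picnic on the bank of the river.")

# Same word, different contexts -> different vectors; the similarity is noticeably below 1.0.
similarity = torch.cosine_similarity(v_financial, v_river, dim=0)
print(f"cosine similarity between the two 'bank' vectors: {similarity.item():.3f}")
```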

Key Insights

  • Static embeddings create a single vector for each word, regardless of context
  • Words with multiple meanings (polysemous words) are poorly represented by static embeddings
  • The embedding becomes an "average" of all meanings, blurring the individual senses (see the sketch after this list)
  • This limitation is addressed by contextual embeddings in more advanced models
  • Contextual models (covered in Module 2) generate different vectors based on context
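
One way to see the "average of all meanings" effect is sketched below, assuming the same gensim GloVe vectors as in the earlier snippet: the single static "bank" vector typically shows nontrivial similarity to words from both senses rather than aligning cleanly with either, though the financial sense tends to dominate. The exact numbers depend on the vectors used.

```python
import gensim.downloader as api

# Same assumed pretrained static vectors as in the earlier sketch.
vectors = api.load("glove-wiki-gigaword-50")

# Because one vector has to serve every sense of "bank", it ends up partway
# between the financial neighbourhood and the river neighbourhood.
for word in ["money", "deposit", "loan", "river", "shore", "water"]:
    print(f"similarity(bank, {word}) = {vectors.similarity('bank', word):.3f}")
```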

Static vs. Contextual Embeddings Comparison

| Feature | Static Embeddings | Contextual Embeddings |
| --- | --- | --- |
| Word Representation | One vector per word | Different vectors based on context |
| Handles Multiple Meanings | ❌ Poor (blends meanings) | ✅ Good (distinguishes meanings) |
| Computational Cost | Low (simple table lookup) | High (requires a full model forward pass) |
| Storage Requirements | Low (one vector per vocabulary word) | Higher (large model weights; vectors computed per occurrence) |
| Examples | Word2Vec, GloVe, FastText | BERT, GPT, ELMo, T5 |
| Out-of-Vocabulary Words | ❌ Limited (FastText's subword n-grams are the exception) | ✅ Handled via subword tokenization |
| Sentence-Level Understanding | Limited | Strong |
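
The out-of-vocabulary row above can be illustrated with a subword tokenizer. The sketch below assumes the Hugging Face transformers package and the bert-base-uncased checkpoint; the rare word is simply made up for the example.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

rare_word = "embeddingology"  # invented word, so no fixed vocabulary contains it whole

# A static lookup table has no row for an unseen word, whereas a subword tokenizer
# decomposes it into known pieces the model can still embed (the exact pieces
# depend on the vocabulary).
print(tokenizer.tokenize(rare_word))
```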

Bridge to Module 2: Transformer Architectures

  • In Module 2, we'll explore transformer architectures that use contextual embeddings
  • Transformers use attention mechanisms to generate context-dependent representations (a minimal sketch of this computation follows the list)
  • This addresses the polysemy problem by creating different vectors for different contexts
  • The same word can have very different representations based on its usage
  • This enables more sophisticated language understanding, especially for ambiguous text
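
As a preview of Module 2, the sketch below shows the core computation behind those attention mechanisms, scaled dot-product attention, written in plain NumPy. Shapes and names are illustrative assumptions, not tied to any particular library; the point is that each token's output vector is a context-dependent mixture of the other tokens' vectors.

```python
import numpy as np

def scaled_dot_product_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Each output row is a context-dependent mixture of the value vectors in V.

    Q, K, V have shape (sequence_length, d): one row per token.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # how strongly each token attends to every token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the attended positions
    return weights @ V                              # blended values -> context-dependent representation

# Toy run: 5 tokens with 8-dimensional vectors. Because the attention weights differ
# from token to token, two occurrences of the same word receive different outputs.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
print(scaled_dot_product_attention(x, x, x).shape)  # (5, 8)
```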