Key Concepts
- Dynamic Selection: Attention weights change based on content, not position
- Query-Key Matching: Each word issues a query ("what should I attend to?"), and every word offers a key that the query is matched against
- Value Weighting: Relevant words contribute more to the final representation
- Contextual Relevance: Same word can attend differently in different contexts
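The four concepts above can be sketched as scaled dot-product attention for a single query. This is a minimal illustration, not the code behind this page: the helper names (`softmax`, `dot`, `attend`) and the 2-dimensional vectors are hypothetical, chosen only to make the steps visible.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # Query-key matching: score each key against the query.
    scores = [dot(query, k) / scale for k in keys]
    # Dynamic selection: the weights depend on content (the scores).
    weights = softmax(scores)
    # Value weighting: better-matching words contribute more to the output.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Hypothetical keys and values for a three-word context (made up for illustration).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

# Contextual relevance: the same context, queried two different ways.
weights_a, out_a = attend([4.0, 0.0], keys, values)
weights_b, out_b = attend([0.0, 4.0], keys, values)
print(weights_a)
print(weights_b)
```

The two queries see the same keys and values but produce different weights, which is the sense in which relevance depends on context rather than on a fixed position.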
Attention Visualization
How to read: Each row shows what that word attends to. Darker colors = higher attention.
Bengio's Concatenation (Equal Weights)
All words get equal importance regardless of relevance.
Attention Mechanism (Dynamic Weights)
Words get importance based on their relevance to the current focus.
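The contrast between the two panels can be sketched numerically. The example below follows this page's description of the concatenation baseline as giving every word the same weight, and compares it with softmax weights computed from relevance scores. The sentence and the `relevance` scores are hypothetical values made up for illustration, not output from a trained model.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

words = ["the", "cat", "sat", "on", "the", "mat"]

# Hypothetical relevance of each word to the current focus (assumed values).
relevance = [0.1, 2.0, 1.0, 0.1, 0.1, 0.5]

# Equal weights: every word counts the same, whatever its relevance.
equal_weights = [1.0 / len(words)] * len(words)

# Dynamic weights: more relevant words receive a larger share.
attention_weights = softmax(relevance)

for word, eq, att in zip(words, equal_weights, attention_weights):
    print(f"{word:>4}  equal={eq:.2f}  attention={att:.2f}")
```

Both weight vectors sum to 1, so the difference lies only in how that total is distributed across the words.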
Why Attention Works Better
- Attention can focus on relevant words regardless of position
- Long-range dependencies are captured naturally
- Context determines relevance, not just proximity
- Different words can have different importance for different predictions
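The position-independence claim can be checked with a small sketch. It assumes plain dot-product attention with no positional encoding added to the vectors; real models usually do add position information, so this illustrates the principle rather than any particular model. The function name `attention_weights` and all vectors are hypothetical.

```python
import math

def attention_weights(query, keys):
    """Softmax over scaled dot products between one query and every key."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

query = [3.0, 0.0]
relevant = [3.0, 0.0]  # key that matches the query
filler = [0.0, 1.0]    # key unrelated to the query

# The same 50-word context, with the relevant word at opposite ends.
relevant_last = [filler] * 49 + [relevant]
relevant_first = [relevant] + [filler] * 49

w_last = attention_weights(query, relevant_last)[-1]
w_first = attention_weights(query, relevant_first)[0]
print(w_last, w_first)
```

Under these assumptions the relevant word receives essentially the same weight at either end of the context, because the score depends only on how well its key matches the query.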