Training Progress Visualizer

Monitor neural language model training with real-time loss and accuracy metrics

Training Configuration

  • Simple Model: 1 context word, 50 hidden units
  • Bengio Model: 3 context words, 100 hidden units (architecture sketched below)
  • Large Model: 5 context words, 200 hidden units
  • N-gram Baseline: trigram with smoothing
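
All three neural presets share the feed-forward design popularized by Bengio et al. (2003): embed each context word, concatenate the embeddings, pass them through a tanh hidden layer, and project to a softmax over the vocabulary. A minimal PyTorch sketch; vocab_size and embed_dim are illustrative assumptions, since the visualizer does not expose them:

```python
import torch
import torch.nn as nn

class BengioLM(nn.Module):
    """Feed-forward neural LM in the style of Bengio et al. (2003).

    context_size and hidden_dim correspond to the presets above;
    vocab_size and embed_dim are illustrative assumptions.
    """
    def __init__(self, vocab_size=10_000, context_size=3,
                 embed_dim=30, hidden_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.hidden = nn.Linear(context_size * embed_dim, hidden_dim)
        self.output = nn.Linear(hidden_dim, vocab_size)

    def forward(self, context_ids):
        # context_ids: (batch, context_size) integer word indices
        e = self.embed(context_ids)        # (batch, context_size, embed_dim)
        e = e.view(e.size(0), -1)          # concatenate context embeddings
        h = torch.tanh(self.hidden(e))     # tanh hidden layer
        return self.output(h)              # logits over the vocabulary
```
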
Status: Ready to train
Progress: Epoch 0 / 100 (0%)
Current Loss: 0.00 | Accuracy: 0% | Perplexity: - | Training Time: 0s

Training Insights

  • Select a model architecture to begin
  • Neural models typically need many epochs (up to 100 here) to converge
  • Watch loss decrease and accuracy increase over time; a minimal training loop is sketched below

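Behind each curve is an ordinary epoch loop that accumulates loss and accuracy over batches. A minimal sketch, assuming a model like BengioLM above and a DataLoader yielding (context, target) index batches:

```python
import torch
import torch.nn.functional as F

def train(model, loader, epochs=100, lr=1e-3):
    """Log per-epoch loss and accuracy, the two curves plotted below."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(1, epochs + 1):
        total_loss, correct, seen = 0.0, 0, 0
        for contexts, targets in loader:
            logits = model(contexts)
            loss = F.cross_entropy(logits, targets)
            opt.zero_grad()
            loss.backward()
            opt.step()
            total_loss += loss.item() * targets.size(0)
            correct += (logits.argmax(dim=-1) == targets).sum().item()
            seen += targets.size(0)
        print(f"epoch {epoch}: loss={total_loss / seen:.3f} "
              f"accuracy={correct / seen:.1%}")
```
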
Loss & Accuracy Curves

Understanding Training Curves

  • Loss: How wrong the model's predictions are (lower is better)
  • Accuracy: Percentage of correct next-word predictions
  • Perplexity: Exponential of the cross-entropy loss; roughly how many words the model is still choosing among (lower means less confused)
  • Convergence: When the curves flatten, training can stop (a stopping check is sketched after this list)

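These quantities are connected: perplexity is just exp(loss), and convergence can be detected by watching for stalled improvement. A small sketch; the patience and min_delta thresholds are illustrative assumptions, not the visualizer's actual stopping rule:

```python
import math

def perplexity(loss):
    """Perplexity is the exponential of cross-entropy loss (in nats)."""
    return math.exp(loss)

def converged(losses, patience=5, min_delta=1e-3):
    """True once loss has not improved by min_delta for `patience` epochs."""
    if len(losses) <= patience:
        return False
    return min(losses[-patience:]) > min(losses[:-patience]) - min_delta

# Example: a loss of 4.0 nats gives perplexity(4.0) ~= 54.6, i.e. the model
# is about as uncertain as picking uniformly among ~55 words.
```
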
Model Comparison

Model            Context Size  Parameters  Final Loss  Final Accuracy  Training Time  Convergence
N-gram Baseline  2 words       -           -           -               -              -
Simple Neural    1 word        ~5K         -           -               -              -
Bengio Model     3 words       ~15K        -           -               -              -
Large Model      5 words       ~50K        -           -               -              -
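
The parameter estimates follow from simple arithmetic over the three weight matrices (embedding, hidden, output). A sketch; vocab_size=64 and embed_dim=16 are assumptions, so the totals will not reproduce the table's rounded figures exactly:

```python
def param_count(vocab_size, context_size, embed_dim, hidden_dim):
    """Parameter total for the feed-forward LM sketched earlier."""
    embedding = vocab_size * embed_dim                           # lookup table
    hidden = context_size * embed_dim * hidden_dim + hidden_dim  # weights + bias
    output = hidden_dim * vocab_size + vocab_size                # weights + bias
    return embedding + hidden + output

# vocab_size=64 and embed_dim=16 are illustrative assumptions.
for name, ctx, hid in [("Simple Neural", 1, 50),
                       ("Bengio Model", 3, 100),
                       ("Large Model", 5, 200)]:
    print(f"{name}: {param_count(64, ctx, 16, hid):,} parameters")
```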

Key Observations

  • Train different models to see performance comparisons
  • Larger context generally improves performance
  • More parameters can lead to better accuracy but slower training
  • Neural models typically outperform n-gram baselines (a minimal smoothed-trigram baseline is sketched below)
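
For context, the n-gram baseline ("trigram with smoothing") needs no training loop at all; it just counts. A minimal add-k smoothed trigram sketch, where k=1 (Laplace smoothing) is an assumption since the smoothing method is not specified:

```python
from collections import Counter

class TrigramLM:
    """Add-k smoothed trigram model: P(w | u, v) estimated from counts."""
    def __init__(self, tokens, k=1.0):
        self.k = k
        self.vocab = set(tokens)
        self.tri = Counter(zip(tokens, tokens[1:], tokens[2:]))
        self.bi = Counter(zip(tokens, tokens[1:]))

    def prob(self, u, v, w):
        # (count(u,v,w) + k) / (count(u,v) + k * |V|)
        return ((self.tri[(u, v, w)] + self.k)
                / (self.bi[(u, v)] + self.k * len(self.vocab)))

    def predict(self, u, v):
        # Most probable next word given two context words
        # (a linear scan, fine for toy vocabularies).
        return max(self.vocab, key=lambda w: self.prob(u, v, w))

# lm = TrigramLM("the cat sat on the mat".split())
# lm.predict("the", "cat")  # -> "sat"
```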