Configuration
  Sequence Length:  4
  Embedding Dim:    8
  FFN Hidden:       16
  Attention Heads:  1
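For reference, here is the same configuration as a small Python dataclass, with a weight count under the usual projection layout. The field names and the bias/LayerNorm accounting are assumptions of this sketch, not taken from the demo.

```python
from dataclasses import dataclass

@dataclass
class BlockConfig:
    # Values from the table above; field names are illustrative, not the demo's.
    seq_len: int = 4      # Sequence Length
    d_model: int = 8      # Embedding Dim
    d_ffn: int = 16       # FFN Hidden
    n_heads: int = 1      # Attention Heads

    def n_params(self) -> int:
        """Weight count for one block under a standard layout (Q/K/V/O projections,
        two FFN matrices, two LayerNorms), biases included; the demo may differ."""
        d, f = self.d_model, self.d_ffn
        attn = 4 * (d * d + d)            # Wq, Wk, Wv, Wo plus biases
        ffn = (d * f + f) + (f * d + d)   # up- and down-projection plus biases
        ln = 2 * (2 * d)                  # two LayerNorms, gain and bias each
        return attn + ffn + ln

cfg = BlockConfig()
print(cfg.n_params())  # 600 weights for this tiny block
```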
Processing Phases (estimated cycles)
  Input Embed:    0
  LayerNorm 1:    ~320
  Self-Attention: ~2,400
  Residual Add:   ~32
  LayerNorm 2:    ~320
  Feed-Forward:   ~3,200
  Residual Add:   ~32
  Output:         0
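The phase order above (normalize, transform, then add the residual) corresponds to a pre-norm transformer block. Below is a minimal NumPy sketch of that data flow, assuming pre-LayerNorm ordering, single-head attention, and a ReLU feed-forward layer; the weights and their names are illustrative, not the demo's.

```python
import numpy as np

rng = np.random.default_rng(0)
S, D, F = 4, 8, 16            # seq len, embedding dim, FFN hidden (from the table above)

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

# Illustrative random weights; the demo's actual weights are not shown.
Wq, Wk, Wv, Wo = (rng.standard_normal((D, D)) * 0.1 for _ in range(4))
W1, W2 = rng.standard_normal((D, F)) * 0.1, rng.standard_normal((F, D)) * 0.1

def block(x):
    # LayerNorm 1 -> Self-Attention -> Residual Add
    h = layer_norm(x)
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    attn = softmax(q @ k.T / np.sqrt(D))   # (S, S) attention weights; head dim = D with 1 head
    x = x + (attn @ v) @ Wo                # residual add
    # LayerNorm 2 -> Feed-Forward -> Residual Add
    h = layer_norm(x)
    x = x + np.maximum(h @ W1, 0.0) @ W2   # ReLU FFN, residual add
    return x, attn

x = rng.standard_normal((S, D))            # stands in for "Input Embed"
out, attn = block(x)
print(out.shape, attn[0])                  # activations (4, 8); attention weights pos 0 -> all
```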
Transformer Block Architecture
[Interactive simulator: status readout, Layer State panel (current activations at pos 0; attention weights pos 0 → all), and Performance panel (phase, cycles, time @ 150 Hz, % complete). Press Run to execute.]
⚠ Full Transformer Scaling
  This block (tiny):       ~6,300 cycles → ~42 s @ 150 Hz
  GPT-2 block:             ~500M cycles
  GPT-2 full (12 layers):  ~6B cycles
  GPT-2 @ 150 Hz:          ~1.3 years per token
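The times in the scaling table follow directly from cycles divided by clock rate; a quick check of that arithmetic using the cycle counts quoted above:

```python
CLOCK_HZ = 150                             # clock rate assumed throughout this section

def seconds(cycles, hz=CLOCK_HZ):
    return cycles / hz

print(seconds(6_300))                      # tiny block: 42 s
print(seconds(6e9) / (3600 * 24 * 365))    # GPT-2, 12 layers x ~500M cycles: ~1.27 years/token
```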