
🔬 Activation X-Ray Demo

Watch how a single clean frequency becomes a mess of harmonics

🎯 What This Demo Shows

This visualization reveals why activation functions are the fundamental barrier to frequency domain efficiency in neural networks. You'll see how a pure, single-frequency wave (represented by one spike in the frequency domain) explodes into multiple harmonics when passed through common activation functions like ReLU.

Why this matters: In the frequency domain, we want operations that keep sparse representations sparse. But when ReLU "clips" negative values to zero, it creates new frequencies that were not there before, destroying the efficiency we're trying to achieve.
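If you want to poke at this outside the interactive demo, here is a minimal NumPy sketch of the same experiment (the frequency index 5, the signal length, and the 1e-3 magnitude threshold are illustrative choices, not values taken from the demo):

```python
import numpy as np

# A pure cosine occupies exactly one bin of the (one-sided) spectrum.
n = 256
k = 5                                    # illustrative input frequency
t = np.arange(n)
x = np.cos(2 * np.pi * k * t / n)
y = np.maximum(x, 0.0)                   # ReLU clips the negative half-cycles

def significant_bins(signal, threshold=1e-3):
    """Indices of FFT bins whose normalized magnitude exceeds a small threshold."""
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
    return np.flatnonzero(spectrum > threshold)

print("input bins: ", significant_bins(x))   # [5] - a single spike
print("output bins:", significant_bins(y))   # DC, 5, and even harmonics 10, 20, 30, ...
```

The output spectrum picks up a DC offset, keeps the original frequency, and grows a tail of even harmonics - the kind of extra spikes the Output Spectrum panel is illustrating.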

[Interactive demo: four panels]
- Input Signal (Time Domain): pure cosine wave - clean and simple
- After Activation Function: see how the shape changes
- Input Spectrum (Frequency Domain): single clean spike = efficient
- Output Spectrum (The Problem!): multiple spikes = computational mess

📚 What You're Seeing Step-by-Step:

1. Clean Input: We start with a pure cosine wave - just one frequency component. In the frequency domain (bottom left), this shows up as a single spike. This is exactly what we want: sparse and efficient.
2. Activation Applied: When we apply ReLU (or another nonlinear function), the smooth wave gets "clipped" or distorted. This might seem harmless in the time domain (top right).
3. Frequency Explosion: But look at the frequency domain (bottom right)! The single clean spike becomes multiple harmonics; the series written out just below this list makes the explosion explicit. What was once sparse (one coefficient) is now dense (many coefficients).
4. Efficiency Lost: This is why frequency domain neural networks struggle. Every time we apply a nonlinear activation, we destroy the sparsity that makes frequency domain computation fast.
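For the ideal continuous case, step 3 can be written down in closed form: half-wave rectifying a unit cosine (which is all ReLU does to it) gives the classical Fourier series

```latex
\operatorname{ReLU}(\cos\theta)=\max(0,\cos\theta)
  =\frac{1}{\pi}+\frac{1}{2}\cos\theta
  +\frac{2}{\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{4k^{2}-1}\cos(2k\theta)
```

One coefficient goes in; out come a DC term, the fundamental, and an infinite tail of even harmonics whose magnitudes decay only like 1/k².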

💡 The Core Insight

Frequency domain operations are fast when signals are sparse (few non-zero coefficients). Activation functions like ReLU turn sparse signals into dense ones, destroying the computational advantage we're trying to achieve.
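Here is a rough sketch of where that speed actually comes from (the decaying exponential filter and the 1e-3 threshold are assumptions of this sketch, not part of the demo): circular convolution becomes a pointwise product of spectra, so a filter only has to be evaluated at the bins that are actually occupied.

```python
import numpy as np

# Sparse spectra make spectral ops cheap: convolution is a pointwise product,
# so we only need one multiply per occupied bin.
n = 1024
t = np.arange(n)
x = np.cos(2 * np.pi * 5 * t / n)        # clean input: one occupied bin
h = np.exp(-0.01 * t)                     # some filter (illustrative choice)

X, H = np.fft.rfft(x), np.fft.rfft(h)
occupied = np.abs(X) / n > 1e-3           # bins worth touching at all
Y = np.zeros_like(X)
Y[occupied] = X[occupied] * H[occupied]   # work proportional to occupied bins
y = np.fft.irfft(Y, n)

y_full = np.fft.irfft(X * H, n)           # reference: multiply every bin
print("occupied bins:", int(occupied.sum()))
print("max error from skipping empty bins:", float(np.abs(y - y_full).max()))
```

Run the same thing on ReLU(x) instead of x and the occupied-bin count jumps, so the shortcut buys far less.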

🎯 Key Discovery: Try different activation functions above - notice how Tanh creates far fewer harmonics than ReLU! Smooth, bounded activations like Tanh are much more frequency-domain friendly than sharp, unbounded ones like ReLU.

🌟 Activation Function Ranking (Best to Worst for Frequency Domain)

  1. None (Pure): Perfect single spike - ideal but no nonlinearity
  2. Tanh: Smooth, bounded, symmetric - creates minimal harmonics ✨
  3. Softplus: Smooth like Tanh but unbounded
  4. GELU: Better than ReLU but still creates harmonics
  5. ReLU: Sharp cliff creates many harmonics - worst for frequency domain

Research Implication: The "activation barrier" isn't insurmountable - it's about choosing smooth, frequency-friendly activations!
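A quick way to sanity-check the ranking is to count significant spectral bins for each activation. The sketch below does exactly that (the 1e-3 threshold, the unit input amplitude, and the tanh approximation of GELU are assumptions of this sketch, not the demo's code):

```python
import numpy as np

n = 1024
t = np.arange(n)
x = np.cos(2 * np.pi * 5 * t / n)        # unit-amplitude cosine (illustrative)

activations = {
    "none":     lambda z: z,
    "tanh":     np.tanh,
    "softplus": lambda z: np.log1p(np.exp(z)),
    # tanh approximation of GELU
    "gelu":     lambda z: 0.5 * z * (1 + np.tanh(np.sqrt(2 / np.pi) * (z + 0.044715 * z**3))),
    "relu":     lambda z: np.maximum(z, 0.0),
}

def significant_bins(signal, threshold=1e-3):
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
    return int(np.sum(spectrum > threshold))

for name, f in activations.items():
    print(f"{name:8s} -> {significant_bins(f(x))} significant bins")
```

ReLU should come out with the most bins and "none" with a single one; the exact ordering among the smooth activations depends on the input amplitude and the threshold, but they stay far closer to the clean spike than ReLU does.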

📚 Research Background

The challenge of activation functions in the frequency domain has been studied extensively:

References:
[1] J. Bruna and S. Mallat, "Invariant Scattering Convolution Networks," IEEE TPAMI, 2013.
[2] Z. Li et al., "Fourier Neural Operator for Parametric Partial Differential Equations," ICLR, 2021.
[3] C. Trabelsi et al., "Deep Complex Networks," ICLR, 2018.
[4] M. Unser, "A Representer Theorem for Deep Neural Networks," JMLR, 2019.

⚠️ Why This Matters for Neural Networks

In a neural network, we apply activations after most layers. If each activation turns our nice sparse frequency representation into a dense mess, we lose all the computational benefits of working in the frequency domain. This is the fundamental challenge that makes frequency domain neural networks difficult to implement efficiently.
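A crude way to see how this compounds with depth (the random per-bin gains, the mean subtraction standing in for a bias/normalization step, and the depth of four layers are all assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
t = np.arange(n)
x = np.cos(2 * np.pi * 5 * t / n)              # start sparse: one frequency

def significant_bins(signal, threshold=1e-3):
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
    return int(np.sum(spectrum > threshold))

print("layer 0:", significant_bins(x), "active bins")
for layer in range(1, 5):
    # Linear step, done cheaply in the frequency domain: pointwise random gains.
    X = np.fft.rfft(x)
    X *= rng.uniform(0.5, 1.5, size=X.shape)
    x = np.fft.irfft(X, n)
    # Nonlinear step back in the time domain. The mean subtraction is a stand-in
    # for a bias/normalization so that ReLU actually clips something each layer.
    x = np.maximum(x - x.mean(), 0.0)
    print(f"layer {layer}:", significant_bins(x), "active bins")
```

The active-bin count typically jumps after the first nonlinearity and keeps the spectrum dense from then on, which is exactly why the per-layer savings of working in the frequency domain evaporate.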