Kalman & Particle Filters

State-space inference: hidden state evolves over time, observations arrive one step at a time. From the linear-Gaussian closed form (Kalman) through local linearization (EKF, UKF) to fully nonlinear, non-Gaussian particle filters.

Filtering is the time-recursive cousin of posterior inference. The unknown is a hidden state $x_k$ that evolves over time according to some dynamics $x_k = f(x_{k-1}) + w_k$; what you see are noisy observations $y_k = h(x_k) + v_k$. At each step, the goal is the filtering posterior $p(x_k \mid y_{1:k})$ — the belief over the current state given everything observed so far. The four methods below differ in what they assume about $f$, $h$, and the noise, and in how they represent the belief.

Why this is a separate page. The Monte Carlo & MCMC page covers static targets — a posterior $p(\theta \mid y)$ that doesn't change as new data arrives. Filtering is a different problem shape: the target moves with time and you want a recursive update, not a batch sampler. The two pages share the importance-sampling machinery (the particle filter is sequential IS with resampling), but otherwise solve different problems.

1. The state-space model

A discrete-time state-space model is two equations:

$$ x_k = f(x_{k-1}) + w_k,\qquad y_k = h(x_k) + v_k, $$

where $w_k$ and $v_k$ are process and observation noise. The filtering job is to compute $p(x_k \mid y_{1:k})$ sequentially, updating the belief by one observation at a time. Two recursions do the work:

Predict: push the belief forward through the dynamics. $p(x_k \mid y_{1:k-1}) = \int p(x_k \mid x_{k-1})\,p(x_{k-1} \mid y_{1:k-1})\,dx_{k-1}$.
Update: tilt the prediction by the new observation's likelihood. $p(x_k \mid y_{1:k}) \propto p(y_k \mid x_k)\,p(x_k \mid y_{1:k-1})$.
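
The two steps above can be made concrete on a one-dimensional grid, where the predict integral becomes a matrix-vector product and the update becomes a pointwise multiplication. This is a minimal sketch, not a method from this page: the dynamics `f`, the observation map `h`, the noise levels, and the grid itself are hypothetical choices, and both noises are assumed Gaussian.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x (broadcasts)."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def grid_filter_step(grid, belief, y, f, h, q_std, r_std):
    """One predict/update step on a fixed 1-D grid.

    belief holds p(x_{k-1} | y_{1:k-1}) at the grid points and sums to 1.
    """
    # Predict: transition[i, j] approximates p(x_k = grid[i] | x_{k-1} = grid[j]).
    transition = gaussian_pdf(grid[:, None], f(grid)[None, :], q_std)
    transition /= transition.sum(axis=0, keepdims=True)  # each column sums to 1
    predicted = transition @ belief
    # Update: multiply by the likelihood p(y_k | x_k) and renormalize.
    posterior = gaussian_pdf(y, h(grid), r_std) * predicted
    return predicted, posterior / posterior.sum()

grid = np.linspace(-10.0, 10.0, 401)
belief = gaussian_pdf(grid, 0.0, 2.0)
belief /= belief.sum()

f = lambda x: 0.9 * x + np.sin(x)  # hypothetical nonlinear dynamics
h = lambda x: x                    # hypothetical observation map
predicted, posterior = grid_filter_step(
    grid, belief, y=1.5, f=f, h=h, q_std=0.5, r_std=0.3
)
```

A grid like this is only practical in one or two dimensions, which is why the methods below use a Gaussian or a particle cloud to represent the belief.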

Whether these two steps have a closed form depends on the assumptions about $f$, $h$, and the noise: the predict step is an integral over $x_{k-1}$, and the update hides a normalizing integral over $x_k$. The four methods below sit on a tradeoff ladder: start with an exact Gaussian closed form, then relax linearity, then relax the local-linearization requirement, then finally relax the Gaussian-belief restriction entirely.

Method	What changes	Requirement	What can fail
Kalman	Compute the filtering posterior exactly as mean and covariance.	Linear dynamics, linear observations, Gaussian noise.	Nonlinear or non-Gaussian systems violate the model.
EKF	Allow nonlinear dynamics/observations by linearizing locally.	Jacobians and a posterior narrow enough for local linearization.	Curvature over the uncertainty region can bias or destabilize the filter.
UKF	Push sigma points through the nonlinear functions instead of using Jacobians.	A unimodal Gaussian belief is still a good summary.	Multimodal or heavy-tailed posteriors get collapsed into one Gaussian.
Particle filter	Represent the posterior with weighted samples instead of one Gaussian.	Enough particles, likelihood evaluations, and resampling control.	Weight collapse, sample impoverishment, and poor scaling with state dimension.

2. Closed-form filtering: the Kalman family

closed-form Kalman filter (linear-Gaussian)

If $f$ and $h$ are linear ($x_k = Fx_{k-1} + w_k,\ y_k = Hx_k + v_k$), the noises are Gaussian ($w_k \sim \mathcal N(0, Q)$, $v_k \sim \mathcal N(0, R)$), and the initial belief is Gaussian, the filtering distribution stays Gaussian at every step and is given exactly by:

$$ \begin{aligned} \hat x_k^- &= F\hat x_{k-1}, \quad P_k^- = F P_{k-1} F^\top + Q & \text{(predict)} \\ K_k &= P_k^- H^\top (H P_k^- H^\top + R)^{-1} & \text{(gain)} \\ \hat x_k &= \hat x_k^- + K_k(y_k - H\hat x_k^-) \\ P_k &= (I - K_k H)\,P_k^- & \text{(update)} \end{aligned} $$

This is not a Monte Carlo method; it is the closed-form posterior for the linear-Gaussian special case. Use it when you can.
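
The recursion translates almost line for line into NumPy. The sketch below assumes `F`, `H`, `Q`, `R` are 2-D arrays and the state and observation are 1-D arrays; the function name `kalman_step` and the scalar example are illustrative only.

```python
import numpy as np

def kalman_step(x_hat, P, y, F, H, Q, R):
    """One predict/update step of the linear-Gaussian Kalman filter."""
    # Predict: push mean and covariance through the linear dynamics.
    x_pred = F @ x_hat
    P_pred = F @ P @ F.T + Q
    # Gain: K = P_pred H^T S^{-1}, computed with a solve instead of an inverse.
    S = H @ P_pred @ H.T + R
    K = np.linalg.solve(S, H @ P_pred).T  # valid because S and P_pred are symmetric
    # Update: correct the prediction by the innovation y - H x_pred.
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x_hat)) - K @ H) @ P_pred
    return x_new, P_new

# Scalar check: prior N(0, 1), static state (F = 1, Q = 0), observation y = 2 with R = 1.
x_new, P_new = kalman_step(
    x_hat=np.array([0.0]), P=np.array([[1.0]]), y=np.array([2.0]),
    F=np.eye(1), H=np.eye(1), Q=np.zeros((1, 1)), R=np.eye(1),
)
# Gain is 0.5, so the mean moves halfway to the observation: x_new = [1.0], P_new = [[0.5]].
```

In the scalar case the gain is the ratio of prior variance to total variance, so equal prior and observation variances split the difference.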

closed-form Extended Kalman filter (EKF)

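For mildly nonlinear $f,h$, linearize at the current mean: $F_k = \partial f/\partial x\big|_{\hat x_{k-1}}$, $H_k = \partial h/\partial x\big|_{\hat x_k^-}$. Propagate the mean through the nonlinear functions themselves ($\hat x_k^- = f(\hat x_{k-1})$, innovation $y_k - h(\hat x_k^-)$) and use $F_k$, $H_k$ in place of $F$, $H$ in the covariance and gain equations. Cheap and widely deployed, but a poor approximation when the local linearization disagrees with the true function over the spread of the current belief.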
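
A minimal EKF step might look like the sketch below, assuming the caller supplies the Jacobians as functions. The example model (a random-walk position observed through its range to the origin) and all names are hypothetical.

```python
import numpy as np

def ekf_step(x_hat, P, y, f, h, jac_f, jac_h, Q, R):
    """One EKF step: nonlinear mean propagation, linearized covariance."""
    F = jac_f(x_hat)                # Jacobian of f at the previous mean
    x_pred = f(x_hat)               # mean goes through the true dynamics
    P_pred = F @ P @ F.T + Q
    H = jac_h(x_pred)               # Jacobian of h at the predicted mean
    S = H @ P_pred @ H.T + R
    K = np.linalg.solve(S, H @ P_pred).T
    x_new = x_pred + K @ (y - h(x_pred))
    P_new = (np.eye(len(x_hat)) - K @ H) @ P_pred
    return x_new, P_new

# Hypothetical model: 2-D random walk, range-only measurement.
f = lambda x: x
jac_f = lambda x: np.eye(2)
h = lambda x: np.array([np.linalg.norm(x)])
jac_h = lambda x: (x / np.linalg.norm(x))[None, :]  # shape (1, 2)

P = np.eye(2)
x_new, P_new = ekf_step(
    x_hat=np.array([3.0, 4.0]), P=P, y=np.array([5.5]),
    f=f, h=h, jac_f=jac_f, jac_h=jac_h,
    Q=0.01 * np.eye(2), R=np.array([[0.01]]),
)
```

Because the range Jacobian points radially, the update moves the estimate along the line through the origin and leaves the tangential uncertainty untouched.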

closed-form Unscented Kalman filter (UKF)

Instead of linearizing the function, propagate a deterministic set of $2n+1$ sigma points (for state dimension $n$) through the true nonlinear $f,h$ and refit a Gaussian to their image. It is second-order accurate, requires no Jacobians, and costs about the same as the EKF. It is still fundamentally Gaussian, so it fails for multimodal or heavy-tailed posteriors.
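
The core of the UKF is the unscented transform, sketched here in isolation. This version assumes one common sigma-point weighting with a single spread parameter `kappa`; other parameterizations exist, and the function name is illustrative.

```python
import numpy as np

def unscented_transform(mean, cov, g, kappa=1.0):
    """Push N(mean, cov) through g using 2n+1 sigma points; refit a Gaussian."""
    n = mean.size
    L = np.linalg.cholesky((n + kappa) * cov)           # L @ L.T = (n + kappa) * cov
    sigma = np.vstack([mean, mean + L.T, mean - L.T])   # rows are the sigma points
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    images = np.array([np.atleast_1d(g(s)) for s in sigma])
    out_mean = w @ images
    centered = images - out_mean
    out_cov = (w[:, None] * centered).T @ centered
    return out_mean, out_cov

# Scalar example: x ~ N(0, 1) pushed through g(x) = x^2, whose true mean is 1.
ut_mean, ut_cov = unscented_transform(np.array([0.0]), np.array([[1.0]]), lambda x: x ** 2)
# Linearizing g at the mean would instead predict g(0) = 0.
```

For a linear `g` the transform reproduces the exact mean and covariance, which makes a convenient sanity check.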

Figure 1 compares the filtering approximations on one nonlinear update. Kalman-style methods keep only a Gaussian belief; the difference is how they push that Gaussian through nonlinear dynamics and observations.

Figure 1 · Kalman, EKF, and UKF through a nonlinear observation

[Interactive figure. Legend: true posterior shape, EKF Gaussian, UKF Gaussian. Controls: observation $y$ (default 1.2), observation noise (default 0.3), prior spread (default 1.0).]

Figure 2 turns that ladder into one shared nonlinear tracking problem. Switch methods and the estimate is recomputed under the selected relaxation; the algorithm lines below the canvas highlight exactly what changed.

Figure 2 · Relaxing assumptions on one nonlinear state-space model

[Interactive figure. Legend: true state, observations, selected filter estimate, particles / uncertainty. Controls: method (default Kalman), nonlinearity (default 0.65).]

Algorithm lines shown for each method:

Method	Belief	Predict	Update
Kalman	Carry one Gaussian mean and variance $(m, P)$.	Use the fixed linear model $F$ and add $Q$.	Use the linear observation matrix $H$.
EKF	Carry one Gaussian mean and variance $(m, P)$.	Evaluate $f(m)$ and propagate $P$ through the Jacobian $F_k$.	Linearize $h$ at the predicted mean.
UKF	Carry one Gaussian mean and variance $(m, P)$.	Push sigma points through $f$ and refit a Gaussian.	Transform predicted sigma points through $h$.
Particle filter	Carry weighted particles $\{x^{(i)}, w^{(i)}\}$.	Sample each particle through the nonlinear dynamics.	Weight by likelihood, compute ESS, and resample when needed.

3. Sequential Monte Carlo

When the state-space model is genuinely nonlinear or non-Gaussian, no closed form exists and Gaussian approximations break. SMC represents $p(x_{0:k}\mid y_{1:k})$ by a weighted cloud of particles $\{(x^{(i)}_{0:k}, w^{(i)}_k)\}_{i=1}^N$ and updates the cloud as each new observation arrives.

smc Sequential importance sampling (SIS)

Apply importance sampling step by step. With proposal $\pi(x_k\mid x_{0:k-1}, y_{1:k})$, the weights update recursively:

$$ w_k^{(i)} \propto w_{k-1}^{(i)} \cdot \frac{p(y_k\mid x_k^{(i)})\,p(x_k^{(i)}\mid x_{k-1}^{(i)})}{\pi(x_k^{(i)}\mid x_{0:k-1}^{(i)}, y_{1:k})}. $$

The catastrophe: after a few steps almost all weight ends up on one particle. This is weight degeneracy, and it is the reason SIS by itself is essentially never used.
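
Weight degeneracy is easy to reproduce. The sketch below runs SIS with the prior as proposal on a hypothetical scalar random walk observed in Gaussian noise, and tracks the usual effective-sample-size estimate $1/\sum_i (w^{(i)})^2$. The model, particle count, and noise levels are assumptions made for illustration.

```python
import numpy as np

def effective_sample_size(weights):
    """ESS estimate for normalized weights: between 1 and len(weights)."""
    return 1.0 / np.sum(weights ** 2)

def run_sis(n_particles=500, n_steps=50, q_std=1.0, r_std=0.5, seed=0):
    """SIS with no resampling on x_k = x_{k-1} + w_k, y_k = x_k + v_k."""
    rng = np.random.default_rng(seed)
    x_true = 0.0
    particles = rng.normal(0.0, 1.0, n_particles)
    log_w = np.zeros(n_particles)  # log-weights avoid numerical underflow
    ess_trace = []
    for _ in range(n_steps):
        x_true = x_true + rng.normal(0.0, q_std)
        y = x_true + rng.normal(0.0, r_std)
        # Proposal = dynamics, so the weight update multiplies in the likelihood.
        particles = particles + rng.normal(0.0, q_std, n_particles)
        log_w += -0.5 * ((y - particles) / r_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        ess_trace.append(effective_sample_size(w))
    return np.array(ess_trace)

ess_trace = run_sis()
print(ess_trace[[0, 9, 49]])  # ESS after 1, 10, and 50 steps
```

The ESS falls toward 1 as the steps accumulate, meaning a single particle carries nearly all of the weight.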

smc Particle filter / SIR (bootstrap)

Add a resampling step: whenever the effective sample size drops below a threshold, draw $N$ new particles with replacement from the weighted cloud, duplicating heavy particles and killing light ones, and reset all weights to $1/N$. The simplest variant, the bootstrap filter, uses the prior $p(x_k\mid x_{k-1}^{(i)})$ as the proposal, which cancels the transition density in the SIS weight update and leaves only the likelihood factor:

$$ w_k^{(i)} \propto w_{k-1}^{(i)}\,p(y_k\mid x_k^{(i)}). $$

If the cloud was resampled at step $k-1$, the previous weights all equal $1/N$ and the update reduces to $w_k^{(i)} \propto p(y_k\mid x_k^{(i)})$.

Bootstrap particle filter

For each particle $i$: draw $x_k^{(i)} \sim p(x_k \mid x_{k-1}^{(i)})$ from the dynamics.
Reweight: $w_k^{(i)} \propto w_{k-1}^{(i)}\,p(y_k \mid x_k^{(i)})$ (the previous weights are uniform if the last step resampled).
Normalize weights; compute ESS.
If ESS < threshold: resample $N$ particles $\propto w_k^{(i)}$ and reset weights to $1/N$.
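
The four steps above can be sketched for a scalar model as follows. The dynamics `f`, observation map `h`, noise levels, and the $N(0,1)$ initial belief are hypothetical; resampling here is plain multinomial, and the weights are multiplied into the previous ones so that steps without resampling keep their accumulated weights.

```python
import numpy as np

def simulate(f, h, q_std, r_std, n_steps, seed=1):
    """Draw one state trajectory and its observations from the model."""
    rng = np.random.default_rng(seed)
    x, xs, ys = 0.0, [], []
    for _ in range(n_steps):
        x = f(x) + rng.normal(0.0, q_std)
        xs.append(x)
        ys.append(h(x) + rng.normal(0.0, r_std))
    return np.array(xs), np.array(ys)

def bootstrap_pf(ys, f, h, q_std, r_std, n_particles=200, ess_frac=0.5, seed=0):
    """Bootstrap particle filter with ESS-triggered multinomial resampling."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, n_particles)  # assumed initial belief
    log_w = np.zeros(n_particles)
    means, ess_trace = [], []
    for y in ys:
        # 1. Propagate each particle through the dynamics.
        particles = f(particles) + rng.normal(0.0, q_std, n_particles)
        # 2. Reweight by the Gaussian observation likelihood (in log space).
        log_w = log_w - 0.5 * ((y - h(particles)) / r_std) ** 2
        # 3. Normalize and compute the effective sample size.
        weights = np.exp(log_w - log_w.max())
        weights /= weights.sum()
        ess = 1.0 / np.sum(weights ** 2)
        means.append(np.sum(weights * particles))
        ess_trace.append(ess)
        # 4. Resample when the ESS falls below the threshold.
        if ess < ess_frac * n_particles:
            idx = rng.choice(n_particles, size=n_particles, p=weights)
            particles = particles[idx]
            log_w = np.zeros(n_particles)   # uniform weights after resampling
        else:
            log_w = log_w - log_w.max()     # keep log-weights bounded
    return np.array(means), np.array(ess_trace)

f = lambda x: 0.9 * x + np.sin(x)  # hypothetical nonlinear dynamics
h = lambda x: x                    # hypothetical observation map
xs, ys = simulate(f, h, q_std=0.5, r_std=1.0, n_steps=100)
means, ess_trace = bootstrap_pf(ys, f, h, q_std=0.5, r_std=1.0)
rmse = np.sqrt(np.mean((means - xs) ** 2))
```

The posterior mean is recorded before resampling, since resampling only adds Monte Carlo noise to the estimate at that step.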

Variants improve on the bootstrap proposal: the optimal proposal $p(x_k\mid x_{k-1}^{(i)}, y_k)$ minimizes the variance of the incremental weights but is tractable only for special models; local-linearization proposals use an EKF or UKF on each particle to construct a locally near-optimal Gaussian proposal. Resampling itself has variants (multinomial, stratified, systematic, residual) that differ in the variance they add.
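
As one example of a lower-variance scheme, systematic resampling uses a single uniform draw and $N$ evenly spaced positions along the cumulative weights. The sketch below is an illustrative implementation, not code from this page.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Return N ancestor indices using one uniform draw and an even comb."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n   # one offset, n evenly spaced points
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0                            # guard against round-off
    idx = np.searchsorted(cumulative, positions, side="right")
    return np.minimum(idx, n - 1)                   # guard the upper edge

rng = np.random.default_rng(0)
weights = np.array([0.5, 0.25, 0.125, 0.125])
idx = systematic_resample(weights, rng)
counts = np.bincount(idx, minlength=len(weights))
# Each particle is copied either floor(N * w) or ceil(N * w) times.
```

With these weights, particle 0 is always copied twice and particle 1 once, while the last slot goes to particle 2 or 3 depending on the draw.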

Resampling cures weight degeneracy but introduces sample impoverishment: heavy particles get duplicated and the cloud loses diversity. Roughening (jittering particles by a small noise) and MCMC moves between resampling steps are the standard remedies.
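
Roughening can be sketched in a few lines. The jitter scale `jitter_std` is a hypothetical tuning parameter: too small and duplicates stay clustered, too large and the cloud is artificially blurred.

```python
import numpy as np

def roughen(particles, rng, jitter_std):
    """Add small independent Gaussian noise to each resampled particle."""
    return particles + rng.normal(0.0, jitter_std, particles.shape)

rng = np.random.default_rng(0)
particles = np.linspace(-1.0, 1.0, 100)
weights = np.array([0.9, 0.1] + [0.0] * 98)   # two particles carry all the weight

# Multinomial resampling collapses the cloud onto at most two distinct values.
resampled = particles[rng.choice(100, size=100, p=weights)]
jittered = roughen(resampled, rng, jitter_std=0.01)
print(len(np.unique(resampled)), len(np.unique(jittered)))
```

The jitter restores distinct particle locations, but it also perturbs the target distribution slightly, which is the price of this remedy.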

Figure 3 runs SIS and a bootstrap particle filter on the same nonlinear tracking problem, so the weight-collapse failure, the ESS trace, and the resampling fix are visible side by side.

Figure 3 · Bootstrap particle filter on a nonlinear scalar SSM

[Interactive figure. Legend: true state, observation $y_k$, particle cloud, posterior mean. Controls: number of particles (default 200), observation noise $\sigma_v$ (default 1.0), resample threshold ESS/N (default 0.5).]

Figure 4 · Particle genealogy after repeated resampling

[Interactive figure. Legend: surviving lineages, extinct lineages. Control: resampling pressure (default 0.65).]

4. Shared filtering structure

All four algorithms share the same predict-then-update recursion. Figure 2 keeps that skeleton fixed and changes one assumption at a time: fixed linear matrices, local Jacobians, sigma points, then weighted samples. The important reading is not that one method dominates everywhere; it is that each relaxation buys robustness to a specific kind of model mismatch at a specific computational cost.

The decision table below turns the same ladder into a quick method choice.

5. Filter choice by model

The decision is mostly about how nonlinear the dynamics are and whether the posterior stays unimodal. Tags echo the section colors: closed-form means a Gaussian closed-form posterior (no sampling), smc means weighted particles updated sequentially.

Method	Use when	Strengths	Pitfalls
Kalman filter closed-form	Linear-Gaussian state-space model. Filtering / smoothing of $x_{0:T}$ given $y_{1:T}$.	Exact; closed-form $O(d^3)$ per step.	Restricted to linear dynamics and Gaussian noise.
EKF closed-form	Mildly nonlinear state-space; tracking, robotics, navigation.	Drop-in replacement for KF; cheap.	First-order linearization error; can diverge under strong nonlinearity.
UKF closed-form	Nonlinearity too strong for the EKF, but the posterior is still well summarized by one unimodal Gaussian.	Second-order accurate; no Jacobians required.	Still a Gaussian approximation; fails for multimodal / heavy-tailed posteriors.
Particle filter (SIR) smc	Nonlinear, non-Gaussian state-space; tracking, robotics, gene networks, anything multimodal over time.	Asymptotically exact for fully general SSMs; handles multimodal posteriors.	Sample impoverishment after resampling; scales poorly with state dimension.

Quick decision tree.

Linear dynamics and Gaussian noise? → Kalman filter.
Mildly nonlinear, Jacobians cheap and well-behaved? → EKF.
Strongly nonlinear, but a unimodal Gaussian belief is still adequate? → UKF.
Genuinely nonlinear / non-Gaussian / multimodal posterior? → particle filter.
Need to do inference on a static posterior (no time dimension)? → see Monte Carlo & MCMC.

What next

Filtering connects to static-posterior sampling, dependency structure, and the linear systems story.

Static

Monte Carlo & MCMC

Same change-of-measure machinery, applied to a static posterior rather than a time-evolving state.

Discrete state

Hidden Markov Models

The discrete-state analogue of Kalman filtering: forward–backward messages and Viterbi over a finite state alphabet.

Systems

LTI Systems on Random Inputs

The frequency-domain view of state-space dynamics; the Kalman filter is the time-domain inference complement.

Foundations

Measure Theory & Random Variables

Importance sampling — including the particle filter's sequential version — is the Radon–Nikodym identity made computational.