The auxiliary particle filter is a particle filtering algorithm introduced by Pitt and Shephard in 1999 to improve some deficiencies of the sequential importance resampling (SIR) algorithm when dealing with tailed observation densities.
Particle filters approximate continuous random variable by
particles with discrete probability mass
, say
for uniform distribution. The random sampled particles can be used to approximate the probability density function of the continuous random variable if the value
.
The empirical prediction density is produced as the weighted summation of these particles:[1]
, and we can view it as the "prior" density. Note that the particles are assumed to have the same weight
.
Combining the prior density
and the likelihood
, the empirical filtering density can be produced as:
, where
.
On the other hand, the true filtering density which we want to estimate is
.
The prior density
can be used to approximate the true filtering density
:
- The particle filters draw
samples from the prior density
. Each sample are drawn with equal probability.
- Assign each sample with the weights
. The weights represent the likelihood function
.
- If the number
, than the samples converge to the desired true filtering density.
- The
particles are resampled to
particles with the weight
.
The weakness of the particle filters includes:
- If the weight {
} has a large variance, the sample amount
must be large enough for the samples to approximate the empirical filtering density. In other words, while the weight is widely distributed, the SIR method will be imprecise and the adaption is difficult.
Therefore, the auxiliary particle filter is proposed to solve this problem.
Auxiliary particle filter
[edit]
Comparing with the empirical filtering density which has
,
we now define
, where
.
Being aware that
is formed by the summation of
particles, the auxiliary variable
represents one specific particle. With the aid of
, we can form a set of samples which has the distribution
. Then, we draw from these sample set
instead of directly from
. In other words, the samples are drawn from
with different probability. The samples are ultimately utilized to approximate
.
Take the SIR method for example:
- The particle filters draw
samples from
.
- Assign each samples with the weight
.
- By controlling
and
, the weights are adjusted to be even.
- Similarly, the
particles are resampled to
particles with the weight
.
The original particle filters draw samples from the prior density, while the auxiliary filters draw from the joint distribution of the prior density and the likelihood. In other words, the auxiliary particle filters avoid the circumstance which the particles are generated in the regions with low likelihood. As a result, the samples can approximate
more precisely.
Selection of the auxiliary variable
[edit]
The selection of the auxiliary variable affects
and controls the distribution of the samples. A possible selection of
can be:
, where
and
is the mean.
We sample from
to approximate
by the following procedure:
- First, we assign probabilities to the indexes of
. We named these probabilities as the first-stage weights
, which are proportional to
.
- Then, we draw
samples from
with the weighted indexes. By doing so, we are actually drawing the samples from
.
- Moreover, we reassign the second-stage weights
as the probabilities of the
samples, where
. The weights are aim to compensate the effect of
.
- Finally, the
particles are resampled to
particles with the weights
.
Following the procedure, we draw the
samples from
. Since
is closely related to the mean
, it has high conditional likelihood. As a result, the sampling procedure is more efficient and the value
can be reduced.
Other point of view
[edit]
Assume that the filtered posterior is described by the following M weighted samples:

Then, each step in the algorithm consists of first drawing a sample of the particle index
which will be propagated from
into the new step
. These indexes are auxiliary variables only used as an intermediary step, hence the name of the algorithm. The indexes are drawn according to the likelihood of some reference point
which in some way is related to the transition model
(for example, the mean, a sample, etc.):

This is repeated for
, and using these indexes we can now draw the conditional samples:

Finally, the weights are updated to account for the mismatch between the likelihood at the actual sample and the predicted point
:
