On the Informativeness of Measurements in Shiryaev's Bayesian Quickest Change Detection
Jason J. Ford, Jasmin James, Timothy L. Molloy

TL;DR
This paper uncovers a super-martingale phenomenon in Shiryaev's Bayesian quickest change detection, showing it can hinder detection when measurements lack sufficient informativeness, especially for subtle changes.
Contribution
It is the first to describe a weak super-martingale phenomenon in Shiryaev's Bayesian QCD and links it to measurement informativeness and relative entropy.
Findings
Super-martingale phenomenon can occur under certain entropy conditions.
The phenomenon indicates limitations of Shiryaev's test for subtle changes.
Insufficiently informative measurements impair detection performance.
Abstract
This paper provides the first description of a weak practical super-martingale phenomenon that can emerge in the test statistic in Shiryaev's Bayesian quickest change detection (QCD) problem. We establish that this super-martingale phenomenon can emerge under a condition on the relative entropy between pre and post change densities when the measurements are insufficiently informative to overcome the change time's geometric prior. We illustrate this super-martingale phenomenon in a simple Bayesian QCD problem which highlights the unsuitability of Shiryaev's test statistic for detecting subtle change events.
Jason J. Ford
Jasmin James
Timothy L. Molloy
School of Electrical Engineering and Computer Science, Queensland University of Technology, 2 George St, Brisbane QLD, 4000 Australia.
keywords:
Bayesian Quickest Change Detection; Detection Algorithms; Markov Models; Super-martingale; Maximal Inequality
1 Introduction
Quickly detecting a change in the statistics of a process is an important signal processing problem with applications in a diverse range of areas including automatic control [1], quality control [2, 3, 1], statistics [4], target detection [5, 6] and many more [7, Ch. 1.3]. In the classic Bayesian quickest change detection (QCD) problem, it is assumed that a permanent change in the statistics of an observed process occurs at some random time (see [7, Ch. 1.2] for a comparison with non-Bayesian QCD). The classic Bayesian QCD objective is to minimize the average detection delay subject to a constraint on the probability of a false alarm. When the change time has a geometric prior, Shiryaev established the optimal stopping rule as a test of whether the change posterior probability is above a threshold [8]. This paper investigates the properties of Shiryaev’s famous test statistic in weak measurement environments.
The main contribution here is to provide the first report and characterization of a super-martingale phenomenon in Shiryaev’s Bayesian QCD problem (see [3] and [9] for extensive investigations of martingale phenomena in other QCD rules). This paper introduces a new weak practical super-martingale concept and exploits the maximal inequality for non-negative super-martingales to characterise conditions under which the Bayesian QCD measurements are not sufficiently informative and Shiryaev’s test statistic is dominated by the change time’s prior. Interestingly, the identified super-martingale phenomenon appears suddenly once an information-theoretic requirement on the pre and post change densities holds (rather than emerging as a graceful degradation). Practically, in applications with weak measurements, these observations motivate consideration of subtle problem adjustments, such as the quickest intermittent signal detection problem [10], which generalizes Shiryaev’s problem for use in a vision-based aircraft detection application, or the use of non-Bayesian QCD such as the Lorden criterion [11].
The specific contributions are:
- (i) Establishing a condition in terms of the change time’s geometric prior and the relative entropy between pre and post change densities that identifies when measurements are insufficiently informative.
- (ii) Establishing that when measurements are insufficiently informative, Shiryaev’s test statistic can exhibit a super-martingale phenomenon; that is, the log of the no change posterior is a weak practical super-martingale.
- (iii) Providing an example exhibiting this super-martingale phenomenon to illustrate a situation where Shiryaev’s Bayesian QCD approach is potentially unsuitable for detecting subtle change events.
We would expect similar phenomena to emerge in recent Bayesian QCD generalizations involving non-ergodic models [12].
2 Shiryaev’s Bayesian Quickest Change Detection Problem and Optimal Solution
For , let $y_k$ be an independent and identically distributed (i.i.d.) sequence of random variables taking values in the set . Initially, the random variables have a pre change (marginal) probability density $b^{1}(y_{k})$ before, at some random change time , switching to having a post change (marginal) probability density $b^{2}(y_{k})$. We will assume for , that for some finite . For , let random variable denote a change event process in the sense that for and for . Here are indicator vectors with 1 in the th element, and zero elsewhere. Let be shorthand for measurement sequences.
Before we formally state Shiryaev’s Bayesian QCD problem, let us first introduce a probability measure space. Let denote the filtration generated by . We will assume the existence of a probability space where is a sample space of sequences , -algebra with the convention that , and is the probability measure constructed using Kolmogorov’s extension on the joint probability density function of the observations where we define when . We will let denote expectation under and use the probability measure and expectation $E_{\infty}$ to denote the special case when there is no change event. We let $D\left(b^{1}(y_{k}) \,\|\, b^{2}(y_{k})\right) \triangleq E_{\infty}\left[\log\left(\frac{b^{1}(y_{k})}{b^{2}(y_{k})}\right)\right]$ denote the relative entropy between pre and post change densities.
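As an illustration of the relative entropy definition above, the following minimal sketch estimates the expectation of the log density ratio by averaging over samples drawn from the pre change density. The Gaussian densities, their parameters and all function names are hypothetical choices made for this example only.

```python
import math
import random

def gaussian_log_pdf(y, mean, std):
    """Log density of a scalar Gaussian N(mean, std^2) evaluated at y."""
    z = (y - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def estimate_relative_entropy(log_b1, log_b2, sample_b1, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[log(b1(y) / b2(y))] with y drawn from b1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = sample_b1(rng)
        total += log_b1(y) - log_b2(y)
    return total / n_samples

# Hypothetical example: pre change N(0, 1), post change N(0.5, 1).
d_hat = estimate_relative_entropy(
    log_b1=lambda y: gaussian_log_pdf(y, 0.0, 1.0),
    log_b2=lambda y: gaussian_log_pdf(y, 0.5, 1.0),
    sample_b1=lambda rng: rng.gauss(0.0, 1.0),
)
print(d_hat)  # should be close to the closed form 0.5**2 / 2 = 0.125
```

A small mean shift gives a small relative entropy, which is the regime of subtle changes that the paper is concerned with.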
In the Bayesian QCD problem, the change time that transitions from to is considered to be an unknown random variable with prior distribution for . This allows us to construct a new averaged measure for all and we let denote the corresponding expectation operation. In Shiryaev’s problem we consider the special case of the geometric prior for some (and set , ).
Let be a stopping time with respect to filtration . We can now introduce the Shiryaev cost criterion [8] to trade off average detection delay against the probability of false alarm as
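A geometric prior on the change time can be sketched as below. The parameter name `rho` and the parameterisation P(change at k) = (1 - rho)^(k-1) rho for k = 1, 2, ... are assumptions for illustration; the paper's own expression and initial conditions are not recoverable from this copy.

```python
import random

def geometric_prior_pmf(k, rho):
    """P(change time = k) for k = 1, 2, ... under the assumed form (1 - rho)**(k - 1) * rho."""
    if k < 1:
        return 0.0
    return (1.0 - rho) ** (k - 1) * rho

def sample_change_time(rho, rng):
    """Draw a change time by testing a probability-rho event at each step until it occurs."""
    k = 1
    while rng.random() >= rho:
        k += 1
    return k

rng = random.Random(1)
rho = 0.01  # hypothetical value, chosen only for this example
samples = [sample_change_time(rho, rng) for _ in range(20_000)]
print(sum(samples) / len(samples))  # sample mean, expected to be near 1 / rho = 100
```

Under this parameterisation the prior probability of no change surviving one more step is 1 - rho, which is the quantity the measurements have to overcome.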
[TABLE]
where , is the delay penalty and the problem is to minimise .
For , let the no change and change posterior probabilities be denoted , respectively. Noting that we can write , allows us to write Shiryaev’s optimal stopping rule for this cost criterion in terms of the no change posterior probability as
[TABLE]
where is a threshold selected to control the probability of false alarm, as it can be shown that the probability of false alarm satisfies [10].
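The threshold form of the stopping rule can be sketched as follows, assuming the rule stops at the first time the no change posterior is at or below a threshold (equivalently, the change posterior is at or above one minus that threshold). The function name, the symbol `h` and the example trajectory are hypothetical.

```python
def shiryaev_stopping_time(no_change_posteriors, h):
    """Return the first index k at which the no change posterior is at or below h.

    no_change_posteriors[k] is the posterior probability that no change has
    occurred by time k. Returns None if the threshold is never crossed.
    """
    for k, q in enumerate(no_change_posteriors):
        if q <= h:
            return k
    return None

# Hypothetical posterior trajectory that decays after a change.
trajectory = [0.99, 0.98, 0.97, 0.60, 0.20, 0.04, 0.01]
print(shiryaev_stopping_time(trajectory, h=0.05))  # 5
```

A smaller threshold demands more posterior confidence before an alarm is raised, trading a longer delay for fewer false alarms.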
3 The Emergence of the Super-Martingale Phenomenon
To develop conditions under which the test statistic of Shiryaev’s rule exhibits rapid decrease even in the no change regime, we first introduce the following result that establishes how to efficiently calculate it.
Lemma 1.
For , given a sequence of measurements the no change posterior probability is given by the scalar recursion
[TABLE]
with and the normalization factor
[TABLE]
Proof.
As defined above, is a first order time-homogeneous Markov chain whose transition probabilities at each time instant are given by for as
[TABLE]
where , and is observed via the random variables . Hence, the no change posterior can efficiently be calculated by a hidden Markov model filter [13], where ,
[TABLE]
where and with . Noting that , then simple algebra lets us write (2). Then we note that
[TABLE]
giving (3). This completes the proof.
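A scalar recursion of this kind can be sketched in code. The version below is the standard two-state hidden Markov model filter for a geometric prior, written for the no change posterior; it is assumed, not confirmed, to match (2) and (3), and the names `rho`, `b1`, `b2` and the Gaussian densities are illustrative.

```python
import math

def no_change_posterior_update(q_prev, y, b1, b2, rho):
    """One step of the standard Shiryaev recursion, written for the no change posterior.

    q_prev: no change posterior at time k-1
    y:      measurement at time k
    b1, b2: pre and post change densities (callables)
    rho:    geometric prior parameter (assumed notation)
    """
    stay = (1.0 - rho) * q_prev        # predicted probability of still being pre change
    num = stay * b1(y)
    den = num + (1.0 - stay) * b2(y)   # normalisation over both hypotheses
    return num / den

def gaussian_pdf(y, mean, std=1.0):
    """Density of a scalar Gaussian N(mean, std^2) evaluated at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical setup: N(0, 1) before the change, N(1, 1) after it.
q = 1.0
for y in [0.1, -0.3, 0.2, 1.4, 0.9, 1.2]:
    q = no_change_posterior_update(
        q, y, lambda v: gaussian_pdf(v, 0.0), lambda v: gaussian_pdf(v, 1.0), rho=0.05
    )
    print(round(q, 4))
```

Each step first discounts the previous posterior by the prior factor 1 - rho and then reweights by the likelihoods, so with uninformative measurements the posterior can only shrink.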
To facilitate characterization of our test statistic’s behaviour let us introduce , noting that we can write , and establish the following bound on .
Lemma 2.
( dependent bound on ) For any , there is a such that for any we have
[TABLE]
Proof.
We define
[TABLE]
Using (3) lets us write
[TABLE]
It then follows from (5) and the definition of that
[TABLE]
Noting that is continuous (and monotonically increasing) in and that are finite gives that for any , there is a such that for any we have $E_{\pi}\left[\gamma_{k} \,\big|\, \hat{X}_{k-1}^{1}\right] \leq \delta$, and therefore (6) gives that
[TABLE]
Then using and gives
[TABLE]
because , and . Substitution of (8) into (7) gives the lemma result.
Recall that we can write . Hence Lemma 2 provides a bound on the test statistic increment which allows us to investigate conditions under which the measurements are insufficient to overcome the geometric prior information , and becomes a weak practical super-martingale in the following sense:
Definition 3.
(Weak practical super-martingale) If for any arbitrarily small there exists a such that if then
[TABLE]
and the log of the no change posterior probability is called a weak practical super-martingale.
We now establish our theorem which provides conditions under which measurements are insufficiently informative and this super-martingale phenomenon emerges.
Theorem 4.
(Insufficiently informative measurements) If the relative entropy between probability densities and is sufficiently small, namely
[TABLE]
then the measurements are insufficiently informative in the sense that is a weak practical super-martingale (cf. Definition 3).
Proof.
From Lemma 2, the bound (9) gives that there exists a such that for we have and hence that satisfies the super-martingale property
[TABLE]
noting and that conditioning on and are equivalent. It remains to establish if remains trapped in or escapes.
Let us introduce and , with some as bounding parameters to manage our possibly unbounded super-martingale process. We define a new process . We now note that (10) gives that is a non-negative super-martingale and hence by the maximal inequality for non-negative super-martingales (cf. [14, Lemma 1]) we have, for any , that
[TABLE]
Noting that and that if then gives
[TABLE]
Rewriting in terms of the complementary set for the maximal event gives, if that
[TABLE]
where can be written as . We note that the event implies for all that and hence by Lemma 2 that (10) holds for all . The theorem result follows by noting that for any we can find a (or equivalently a ) so that and the Definition 3 property holds.
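For reference, the maximal inequality for non-negative super-martingales used in the proof is commonly stated in the following form. The symbols $Z_k$ and $c$ are generic placeholders, not the paper's notation, and the exact variant given in [14, Lemma 1] may differ.

```latex
% Z_k: a non-negative super-martingale; c > 0 an arbitrary level (generic symbols, assumed)
P\left( \sup_{k \ge 0} Z_k \ge c \right) \le \frac{E[Z_0]}{c}
```

The bound limits the probability that the process ever climbs above a given level, which is what allows the proof to argue that escape from the trap interval is unlikely.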
Theorem 4 establishes that unless the relative entropy between pre and post change densities $D\left(b^{1}(y_{k}) \,\|\, b^{2}(y_{k})\right)$ is sufficiently large, the no change posterior is a weak practical super-martingale under and hence there exists a trap defined by the interval where Shiryaev’s test statistic becomes increasingly confident that the change has occurred even if it has not. Further, we note that on a sufficiently long sequence of measurements there is a non-zero probability of entering the interval . A test statistic that can exhibit such incorrect increasing confidence on non-pathological sequences is problematic in a practical setting and hence we interpret the existence of this interval trap under the condition of Theorem 4 as meaning the measurements are insufficiently informative.
To understand the behaviour of Shiryaev’s rule and the role of the relative entropy $D\left(b^{1}(y_{k}) \,\|\, b^{2}(y_{k})\right)$, first consider the limit case . In this case, $D\left(b^{1}(y_{k}) \,\|\, b^{2}(y_{k})\right)$ is zero, the posterior is given by , Shiryaev’s rule becomes the deterministic rule to stop at the earliest time at or after and . Informally, a similar geometric prior mechanism is driving the super-martingale phenomenon that occurs when $D\left(b^{1}(y_{k}) \,\|\, b^{2}(y_{k})\right)$ is non-zero but less than , with . Finally, we note that as $D\left(b^{1}(y_{k}) \,\|\, b^{2}(y_{k})\right)$ increases towards the critical value of then decreases towards [math], and the probability of entering the trap interval decays.
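The zero relative entropy limit can be worked through numerically. In the standard recursion with identical pre and post change densities, the likelihoods cancel and the no change posterior decays as (1 - rho)^k from an initial value of one, so the stopping time is fixed in advance. The sketch below assumes that form; `rho` and `h` are hypothetical names and values.

```python
import math

def deterministic_stopping_time(rho, h):
    """Earliest k with (1 - rho)**k <= h, assuming identical densities and initial posterior 1."""
    return math.ceil(math.log(h) / math.log(1.0 - rho))

rho, h = 0.01, 0.05  # hypothetical prior parameter and threshold
k_stop = deterministic_stopping_time(rho, h)
# Check that k_stop is the first time the decaying posterior reaches the threshold.
print(k_stop, (1.0 - rho) ** k_stop <= h, (1.0 - rho) ** (k_stop - 1) > h)
```

In this limit the alarm time depends only on the prior and the threshold, which is the mechanism that the super-martingale phenomenon inherits when the densities are nearly identical.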
4 Example: Bayesian Quickest Change Detection With Gaussian Densities
Proposition 5.
Consider Shiryaev’s Bayesian quickest change detection problem with pre and post change (marginal) probability densities and . Consider the set
[TABLE]
is non-empty. Further, there exists a such that has a threshold structure in the sense of , where . Finally, when , then the measurements are insufficiently informative in that is a weak practical super-martingale (cf. Definition 3).
Proof.
To establish that is non-empty we note that for any there exists a such that . As then for any , there must be at least one such that and hence this is an element of the non-empty . The interval result follows by noting that if then for all , and this means the set can be described as the interval with some critical largest element . Algebra and the monotonically increasing nature of give that . The final result follows from Lemma 2 and noting that the relative entropy between these two Gaussians is given by [1, Example 4.1.9] .
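A worked instance of the Gaussian case follows, under loudly stated assumptions: both densities are taken to have the same variance and to differ only in mean, for which the relative entropy has the well-known closed form (mu1 - mu2)^2 / (2 sigma^2). The critical level |log(1 - rho)| used here comes from the small-posterior limit of the standard recursion and is not confirmed against the paper's own condition or parameterisation.

```python
import math

def gaussian_relative_entropy(mu1, mu2, sigma):
    """Relative entropy between N(mu1, sigma^2) and N(mu2, sigma^2)."""
    return (mu1 - mu2) ** 2 / (2.0 * sigma ** 2)

def critical_mean_shift(rho, sigma=1.0):
    """Mean shift at which the relative entropy equals |log(1 - rho)| (assumed critical level)."""
    return sigma * math.sqrt(2.0 * abs(math.log(1.0 - rho)))

rho = 0.01  # hypothetical prior parameter
mu_c = critical_mean_shift(rho)
print(mu_c)
print(gaussian_relative_entropy(0.0, mu_c, 1.0), abs(math.log(1.0 - rho)))
```

Under these assumptions, mean shifts smaller than `mu_c` fall in the regime where the prior dominates the measurements.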
Simulation:
Consider a geometric prior and note from Proposition 5 that the phenomenon emerges below . Figure 1 illustrates two simulated examples of the posterior’s behaviour on a sequence prior to the change time ( and representing examples of informative and non-informative measurements). The significantly different behaviour seen is an illustration of the super-martingale phenomenon discussed in this paper.
To illustrate the transition in the behaviour of Shiryaev’s test statistic, for each value of we conducted a Monte-Carlo study of 1000 trials, each a sequence of 5000 random variables with no change. In Figure 2 the mean value of () illustrates that below the critical value the test statistic exhibits the super-martingale phenomenon and becomes incorrectly convinced that a change has occurred, when it has not.
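A study of this kind can be sketched as below. This is not the authors' simulation code: the unit-variance Gaussian densities, the prior parameter `rho`, the two mean shifts and the reduced trial counts are all assumptions chosen so the example runs quickly. The recursion is the standard one for the no change posterior, evaluated in the log domain.

```python
import math
import random

def mean_final_log_posterior(mu, rho, n_trials=200, n_steps=2000, seed=0):
    """Average log no change posterior after n_steps of pre change N(0, 1) data.

    The post change density is assumed to be N(mu, 1), and no change ever occurs.
    Working with the log posterior avoids underflow on long sequences.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        log_q = 0.0  # initial no change posterior of 1
        for _ in range(n_steps):
            y = rng.gauss(0.0, 1.0)
            # Log-likelihood ratio log(b2(y) / b1(y)) for unit-variance Gaussians.
            llr = mu * y - 0.5 * mu * mu
            log_stay = math.log1p(-rho) + log_q  # log((1 - rho) * q)
            stay = math.exp(log_stay)
            # Posterior update q_new = stay / (stay + (1 - stay) * exp(llr)), in logs.
            log_q = log_stay - math.log(stay + (1.0 - stay) * math.exp(llr))
        total += log_q
    return total / n_trials

rho = 0.01  # hypothetical prior parameter
below = mean_final_log_posterior(mu=0.05, rho=rho, n_trials=100)
above = mean_final_log_posterior(mu=0.5, rho=rho, n_trials=100)
print(below, above)
```

With the small mean shift the average log posterior should drift steadily downwards although no change occurs, while with the larger shift it should stay near zero, mirroring the qualitative transition described above.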
5 Discussion
The super-martingale phenomenon emerges in Bayesian QCD as a consequence of the non-ergodic nature of the underlying signal model. That the class of post-change densities exhibiting the phenomenon can be parameterized by an interval set suggests this is a systemic issue of the problem rather than the result of a pathological noise realisation. Potential remedies in applications with weak measurements include using quickest intermittent signal detection [10] or using non-Bayesian QCD such as the Lorden criterion [11]. Finally, we would expect similar phenomena to arise in more complex Bayesian QCD or filter problems involving non-ergodic models with weak observations.
Acknowledgment.
The authors express many thanks to an anonymous reviewer who helped correct the proof of Lemma 2.
References
- [1] M. Basseville and I. V. Nikiforov. Detection of Abrupt Changes: Theory and Application, volume 104. Prentice Hall, Englewood Cliffs, 1993.
- [2] I. Hwang, S. Kim, Y. Kim, and C. E. Seah. A survey of fault detection, isolation, and reconfiguration methods. IEEE Transactions on Control Systems Technology, 18(3):636–653, May 2010.
- [3] T. L. Lai. Information bounds and quick detection of parameter changes in stochastic systems. IEEE Transactions on Information Theory, 44(7):2917–2929, November 1998.
- [4] A. Tartakovsky. Asymptotic optimality in Bayesian changepoint detection problems under global false alarm probability constraint. Theory of Probability & Its Applications, 53(3):443–466, 2009.
- [5] J. Ru, V. P. Jilkov, X. R. Li, and A. Bashi. Detection of target maneuver onset. IEEE Transactions on Aerospace and Electronic Systems, 45(2):536–554, April 2009.
- [6] J. Lai, J. J. Ford, L. Mejias, and P. O’Shea. Characterization of sky-region morphological-temporal airborne collision detection. Journal of Field Robotics, 30(2):171–193, March 2013.
- [7] A. Tartakovsky, I. Nikiforov, and M. Basseville. Sequential Analysis: Hypothesis Testing and Changepoint Detection. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Taylor & Francis, 2014.
- [8] A. N. Shiryaev. On optimum methods in quickest detection problems. Theory of Probability & Its Applications, 8(1):22–46, 1963.
