Image discrimination models are used to predict the visibility of the difference between two images. Using a four-category rating scale method, Rohaly et al. (SPIE Vol. 2411) and Ahumada & Beard (SPIE Vol. 2657) found that image discrimination models can predict target detectability when the background is kept constant, or `fixed'. In Experiment I, we use this same rating scale method and find no difference between `fixed' and `random' noise (where the white noise changes from trial to trial). In Experiment II, we compare fixed noise and two random noise conditions. In a two-interval forced-choice procedure, the `random' noise was either the same or different in the two intervals. Contrary to image discrimination model predictions, the condition with the same random noise in both intervals produced greater masking than the `fixed' noise. This suggests that observers use less efficient target templates than image discrimination models implicitly assume. Also, performance appeared limited by internal process variability rather than external noise variability, since similar masking was obtained for both random noise types.
There are two general approaches to predicting signal detection in the presence of noise. The standard approach is based on the ideal observer concept, a method for computing observer performance given perfect information about the signal and the masking noise distribution [1]. Figure 1 diagrams the observer model resulting from the ideal observer point of view. The ideal observer calculates the stimulus likelihood ratio for the signal-plus-noise and the noise-alone distributions. For some noise distributions, this calculation reduces to a simple template matching operation. Since actual observer performance is less than ideal, it is necessary to add inefficiencies, or corruptions, to the ideal observer to predict human performance. As human observers respond differently to the same stimulus at different times, internal noise is often included [2]. Potential sources of internal noise include image measurement variability, criterion variability, and template instability. Other candidate inefficiencies are an imperfect template (sampling inefficiency) or multiple templates (signal uncertainty). In many situations, characterization of observer templates can be difficult, even when ideal observer specification is simple. For example, in letter recognition studies using fixed fonts [3], the ideal templates would use those fonts, whereas characterizing the human observer involves specifying templates for all acceptable internal letter representations. Conveniently, another approach used to predict signal detection in noise does not require explicit templates. Image discrimination models [4,5,6] take two stimulus images as input and compute a visual system representation for each image. Differences between these internal representations are then calculated and combined, as diagrammed in Figure 2. The resulting number is used to predict the probability that the observer regards the two images as different.
Observer performance is assumed to be mainly limited by internal noise in the image representations. Image discrimination models are equivalent to the more general detection models when the general model target template is based on the difference between the internal representation of the object and non-object images.

Figure 1. Schematic of a target detection model.

Figure 2. Schematic of an image discrimination model.
Complex object detection experiments typically involve targets presented in different backgrounds [7]. Image discrimination models cannot make useful predictions from such images, since differences in the backgrounds are confounded with object presence. These models can, however, make object detection predictions if identical backgrounds are used for object-present and object-absent stimuli [8].
Image discrimination models can also be used to predict object detection in the presence of noise masks. Ahumada and Beard [9] reported model predictions to be useful for a fixed simulated background and various contrast levels of a fixed noise mask. Here we are concerned with whether these models can explain the results of random noise maskers.
The general detection and discrimination models presented in Figures 1 and 2 will make the same predictions for a fixed background detection situation if the general model uses the noiseless visual system representation difference to construct a template. Detection would be limited only by the internal noise of the visual system. If this internal noise is large compared to the variability introduced by randomizing the external noise, the random noise would be expected to mask only slightly more than a fixed sample masker, and the image discrimination predictions for fixed noise would be adequate for random noise conditions. We find that random noise can produce substantially greater masking than fixed noise. However, our explanation will not be that the external noise variability is important; we will suggest, rather, that the observer does not construct the target template from the noiseless target representation, as the image discrimination models effectively assume.
Experiment I tests whether internal noise can dominate external noise variability in determining the masking of random noise. One subject, BLB, intermixed conditions from our fixed noise experiment [9] with noise that was a different random sample on each trial.
The details of the experimental procedure were the same as in our previous paper [9]. The 128 x 128 pixel images were presented at a resolution of 47.5 pixels per degree in a 640 x 480 background with a luminance of 25.5 cd/m^2 and a duration of 0.5 sec. The background image (Figure 3a) was subtracted from the airplane image (Figure 3b) to form a signal image (Figure 3c) that was attenuated to one of 6 levels and added back to the background. These stimuli were then presented in blocks of 60 trials for ratings on a 4-point scale for the presence of a signal.
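The stimulus construction described above can be sketched in a few lines. This is only an illustration: the function name `make_stimulus`, the tiny 2 x 2 arrays, and the pixel values are hypothetical stand-ins for the 128 x 128 images, and the pixel units are left unspecified.

```python
def make_stimulus(background, target_scene, attenuation):
    """Return background + attenuation * (target_scene - background), pixel by pixel.

    The difference (target_scene - background) plays the role of the signal image.
    """
    return [
        [b + attenuation * (t - b) for b, t in zip(b_row, t_row)]
        for b_row, t_row in zip(background, target_scene)
    ]


# Hypothetical 2 x 2 stand-ins for the background and airplane images.
background = [[0.50, 0.50], [0.40, 0.60]]
airplane = [[0.50, 0.70], [0.40, 0.30]]

absent = make_stimulus(background, airplane, 0.0)   # signal fully attenuated
half = make_stimulus(background, airplane, 0.5)     # signal at half amplitude
present = make_stimulus(background, airplane, 1.0)  # recovers the airplane image
```

With an attenuation of zero the stimulus is the background alone, and with an attenuation of one it is the original airplane image, so a single function covers both the signal-absent and signal-present cases.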

Figure 3. a) The background image alone. b) The target airplane and the background. c) The target alone.
For each block, a noise level (0.1, 0.2, or 0.4 root mean square contrast, as illustrated in Figure 4) and a noise type (fixed or random) were chosen. A random noise was constructed from the fixed noise by selecting a random number from 1 to 128^2 as the starting position for the upper left corner of the noise and then copying the values sequentially with wrap around. Each type of block was replicated 4 times. Thurstone scaling of the ratings was used to estimate the signal contrast energy that resulted in a d' of one for each condition [10].
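One way to read the random noise construction is as a cyclic shift of the fixed noise taken in raster order. The sketch below follows that reading; it assumes the starting position indexes the flattened pixel list (1 to n*n for an n x n image), and the function name `shifted_noise` is hypothetical.

```python
import random


def shifted_noise(fixed_noise, start=None, rng=random):
    """Cyclically shift a square noise image in raster (row-major) order.

    start is a 1-based position in the flattened image. If it is None,
    it is drawn uniformly from 1 to n*n (an assumption about the procedure).
    """
    n = len(fixed_noise)
    flat = [value for row in fixed_noise for value in row]
    if start is None:
        start = rng.randint(1, n * n)  # inclusive at both ends
    k = start - 1
    rolled = flat[k:] + flat[:k]       # copy sequentially with wrap around
    return [rolled[r * n:(r + 1) * n] for r in range(n)]


# A 3 x 3 example in place of the 128 x 128 noise image.
fixed = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sample = shifted_noise(fixed, start=5)
# sample is [[5, 6, 7], [8, 9, 1], [2, 3, 4]]
```

Every shifted sample contains exactly the same pixel values as the fixed noise, so the RMS contrast is identical across samples and only the spatial arrangement changes.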

Figure 4. The three levels (RMS contrast) of masking noise added to the target airplane and the background.
Figure 5 (Observer BLB) shows the contrast energy of the signal giving a d' of one in decibels re 10^-6 deg^2 sec, the best contrast detection reported by Watson, Robson, and Barlow [11]. The filled circles show the results from the experiment reported last year with fixed noise [9]. The open circles show the replication of those results in this experiment, and the squares show the random noise results. A 95% confidence interval for the average difference between the random and the fixed conditions in the current experiment is 1.2 dB + or - 1.5 dB. Because these thresholds are so similar, these results indicate that the thresholds in this situation are dominated by internal noise variability [2].
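For readers who want to reproduce the decibel scale, the sketch below computes contrast energy in decibels re 10^-6 deg^2 sec. It assumes the usual definition of contrast energy as the sum of squared pixel contrasts times pixel area (deg^2) times duration (sec); the function names and the example image are hypothetical, while the 47.5 pixels per degree and 0.5 sec values come from the display description.

```python
import math

PIXELS_PER_DEG = 47.5     # display resolution given in the text
DURATION_SEC = 0.5        # stimulus duration given in the text
REFERENCE_ENERGY = 1e-6   # deg^2 sec, the 0 dB reference


def contrast_energy_db(contrast_image, pixels_per_deg=PIXELS_PER_DEG,
                       duration_sec=DURATION_SEC):
    """Contrast energy of an image of pixel contrasts, in dB re 1e-6 deg^2 sec."""
    pixel_area = (1.0 / pixels_per_deg) ** 2
    squared_sum = sum(c * c for row in contrast_image for c in row)
    energy = squared_sum * pixel_area * duration_sec
    return 10.0 * math.log10(energy / REFERENCE_ENERGY)


def amplitude_change_db(factor):
    """Change in contrast energy (dB) when signal amplitude is scaled by factor."""
    return 20.0 * math.log10(factor)


# Hypothetical 2 x 2 contrast image with two pixels at contrast +/- 0.1.
example_db = contrast_energy_db([[0.1, 0.0], [0.0, -0.1]])
halved_db = amplitude_change_db(0.5)  # halving amplitude lowers energy by about 6 dB
```

Because energy goes with the square of amplitude, a fixed attenuation factor applied to the signal image shifts its contrast energy by a fixed number of decibels regardless of the image content.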

Figure 5. Mean signal contrast energy thresholds for Observer BLB (decibels re 10^-6 deg^2 sec) for the fixed and random noise conditions as a function of the added white noise RMS contrast (Experiment I).
Experiment II was done to confirm the results of Experiment I using a two interval forced choice (2IFC) procedure. This procedure was used by Burgess and Colborne [2] to measure the relative contribution of internal noise and external noise to the detection of simple targets on a uniform background. The 2IFC procedure allows another experimental condition to be evaluated. This condition, which we will call Random-Fixed, uses a new random noise for each trial pair, but uses the same noise sample for the two intervals within a trial. Figure 6 illustrates this new condition, along with the Random and Fixed conditions for the 2IFC procedure.

Figure 6. The different noise conditions in Experiment II.
Burgess and Colborne [2] assumed observer responses are based on an internal measure of target presence whose mean is proportional to the square root of signal contrast energy and whose variance can be partitioned into internal and external noise variances, leading to the formula that the threshold signal contrast energy is proportional to the sum of these variances,
$$E = k \left( \sigma_e^2 + \sigma_i^2 \right). \qquad (1)$$
For the Random-Fixed condition, the internal measure is assumed to have
no external variance because the contribution of the noise from one interval
subtracts exactly from the contribution of the noise in the other interval.
This leads to a prediction relating the threshold ratio for Random over
Random-Fixed to the ratio of the external to internal noise standard
deviations,
$$\frac{E_R}{E_{RF}} = 1 + \frac{\sigma_e^2}{\sigma_i^2}. \qquad (2)$$
Using this equation, Burgess and Colborne [2]
estimated the ratio of internal noise standard deviation
to external noise standard deviation to be 0.75.
This ratio corresponds to an average
threshold signal difference of 4.5 dB for
Random vs. Random-Fixed
when there is no
background-generated internal noise.
To account for the background, we can assume that
the internal noise is proportional to the contrast of
the background and the external noise.
The standard deviation ratio of 0.75 then predicts
the average
Random vs. Random-Fixed
threshold difference
for the three noise conditions of Experiment I
should be 3.5 dB.
Under the additional assumptions that 1) the internal noise standard deviation
is the same in Experiment I and 2IFC (no category variability) and
2) there is no correlation in the internal noise in the two intervals,
3.5 dB is also a prediction for what the average difference should
have been in Experiment I.
If the observers use the same measure in the Fixed and Random conditions,
and the internal noise is uncorrelated in the two samples, the Fixed and
the Random-Fixed results should be the same.
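The step from the standard deviation ratio of 0.75 to a threshold difference in decibels can be checked directly from Equation 2. The sketch below is only that arithmetic, with hypothetical function names; it does not attempt the background-adjusted 3.5 dB prediction, which depends on details not given here.

```python
import math


def random_over_random_fixed_db(internal_to_external_sd):
    """Threshold elevation (dB) of Random over Random-Fixed from Equation 2.

    internal_to_external_sd is sigma_i / sigma_e.
    """
    variance_ratio = 1.0 / internal_to_external_sd ** 2  # sigma_e^2 / sigma_i^2
    return 10.0 * math.log10(1.0 + variance_ratio)


def internal_to_external_sd_from_db(elevation_db):
    """Invert Equation 2: sigma_i / sigma_e implied by a threshold elevation in dB."""
    variance_ratio = 10.0 ** (elevation_db / 10.0) - 1.0
    return 1.0 / math.sqrt(variance_ratio)


predicted_db = random_over_random_fixed_db(0.75)
# 1 + 1/0.75^2 is about 2.78, i.e. roughly 4.4 dB, close to the 4.5 dB quoted.
recovered_ratio = internal_to_external_sd_from_db(predicted_db)
```

The inverse function is convenient for reading a measured Random versus Random-Fixed difference as an implied ratio of internal to external noise, under the same assumptions as Equation 2.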
Experiment II used the 2IFC procedure and the same simulated airport runway stimuli as Experiment I to answer two questions: 1) Are thresholds for Fixed versus Random-Fixed noise similar, as would be predicted by image discrimination models or more general template models that keep the template the same for these two conditions? 2) Are thresholds for Random noise elevated relative to Random-Fixed noise? That is, does the external noise variability raise the thresholds for Random noise significantly above those for Random-Fixed noise?
Stimuli
The stimuli were generated as described above for Experiment I.
Procedure
A 2IFC tracking procedure was used with a stimulus duration of 0.5
sec and an interstimulus interval of 1.0 sec.
The images appeared in the screen center and were
replaced by the 25.5 cd/m^2 background luminance during the
interstimulus and intertrial intervals.
In one interval, selected at random, the
target image was multiplied by zero.
The observer was asked to determine which interval contained
the target aircraft.
A tone sounded if the response was incorrect.
The next trial began 1.0 sec after the response.
After an error, the target amplitude was
increased by 3 dB.
If three consecutive trials were correct, the target amplitude was
decreased by 3 dB.
Conditions were fixed for blocks of 60 trials.
Thresholds were determined for each block by
maximum likelihood probit analysis using a Gaussian
model estimating the target level that led
to 75% correct [12].
There were three noise sample randomization
conditions:
(i) Fixed: a single fixed noise
sample was used (throughout all Fixed blocks).
(ii) Random-Fixed: a new noise was
generated for each two-interval trial, and that
same noise sample was used in both intervals.
(iii) Random: a new noise was
generated for each interval of each trial.
Separate sub-experiments compared
Fixed vs. Random-Fixed (Experiment IIA) and
Random-Fixed vs. Random (Experiment IIB).
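The noise assignment for the three conditions and the tracking rule can be sketched as follows. This is a minimal illustration, not the experiment code: the names `noise_pair`, `update_level`, and the `draw_sample` callback are hypothetical, and resetting the count of consecutive correct responses after each step down is an assumption the text does not state.

```python
import random

STEP_DB = 3.0  # step size given in the text


def noise_pair(condition, fixed_sample, draw_sample):
    """Return the (interval 1, interval 2) noise samples for one 2IFC trial."""
    if condition == "Fixed":
        return fixed_sample, fixed_sample      # same sample on every trial
    if condition == "Random-Fixed":
        sample = draw_sample()                 # new sample per trial...
        return sample, sample                  # ...shared by both intervals
    if condition == "Random":
        return draw_sample(), draw_sample()    # new sample per interval
    raise ValueError("unknown condition: " + condition)


def update_level(level_db, correct, run_length):
    """Apply the tracking rule; return (new level in dB, new run of correct trials)."""
    if not correct:
        return level_db + STEP_DB, 0           # any error: raise the target by 3 dB
    run_length += 1
    if run_length == 3:
        return level_db - STEP_DB, 0           # three in a row: lower by 3 dB (reset assumed)
    return level_db, run_length


# Hypothetical noise generator and a short run of responses.
def draw(): return [random.gauss(0.0, 1.0) for _ in range(4)]
fixed = draw()
level, run, track = 0.0, 0, []
for correct in [True, True, True, False, True]:
    level, run = update_level(level, correct, run)
    track.append(level)
# track is [0.0, 0.0, -3.0, 0.0, 0.0]
```

Keeping the noise assignment in one function makes the only difference between Random-Fixed and Random explicit: whether the second interval reuses the first interval's sample.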
Experiment IIA: Fixed vs. Random-Fixed comparison. Three noise attenuation conditions were used, with attenuation factors of 0.25, 0.5, and 1. For each observer a random 3 by 3 Latin square was generated to balance the order of the attenuation conditions for three replications. For each of these replications, a block of Random-Fixed trials and a block of Fixed trials was run in a random order. Four observers participated: FK, ND, APG and BLB.
Experiment IIB: Random-Fixed vs. Random comparison. Four noise attenuation conditions were used, with attenuation factors of 0.00, 0.25, 0.50, and 1.00. For each observer a random 4 by 4 Latin square was used to balance the order of the attenuation conditions for four replications. For each of these replications, a block of Random-Fixed trials and a block of Random trials was run in a random order. When the noise attenuation factor was zero, two blocks were still run, one with the background airport image RMS contrast at the standard value of 0.081 and one at a higher value of 0.165. Three observers participated: ND, APG, and BLB.
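A randomized Latin square for ordering the attenuation conditions can be generated as sketched below. The text does not say how the squares were drawn, so this is one simple possibility: a cyclic square with shuffled rows, columns, and symbols, which is random but not uniformly sampled from all Latin squares. The function name is hypothetical.

```python
import random


def random_latin_square(items, rng=random):
    """Return an n x n Latin square over items as a list of rows.

    Each row can serve as the condition order for one replication; every
    condition appears once in each row and once in each serial position.
    """
    n = len(items)
    symbols = list(items)
    rng.shuffle(symbols)
    rows = list(range(n))
    cols = list(range(n))
    rng.shuffle(rows)
    rng.shuffle(cols)
    # (r + c) mod n is the cyclic Latin square; shuffling preserves the property.
    return [[symbols[(r + c) % n] for c in cols] for r in rows]


square_iia = random_latin_square([0.25, 0.5, 1.0])        # three replications
square_iib = random_latin_square([0.0, 0.25, 0.5, 1.0])   # four replications
```

Within each replication, the order of the two noise-condition blocks would then be randomized separately, as described in the text.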
The results of Experiments IIA and IIB are summarized in Figure 7, which plots the mean signal contrast energy threshold in decibels against the peak contrast of the added random noise. Zero decibels corresponds to 10^-6 deg^2 sec, a threshold contrast energy for stimuli on a uniform background [11]. To facilitate the evaluation of differences within a comparison condition, the error bars are 95% confidence limits for the means based on the observer vs. condition interaction for the particular comparison condition (2.0 dB for the Random vs. Random-Fixed and 1.6 dB for the Random-Fixed vs. Fixed). The Fixed condition gave significantly lower thresholds than the Random-Fixed noise condition. A 95% confidence interval for the masking increase of the Random-Fixed condition over the Fixed condition, averaged over observers and the non-zero noise levels, was 5.2 + or - 0.6 dB. A 95% confidence interval for the masking increase of the Random condition over the Random-Fixed condition, averaged over observers and noise levels, was 0.7 + or - 0.8 dB, indicating a negligible difference between these two conditions.

Figure 7. Mean signal contrast energy thresholds (decibels re 10^-6 deg^2 sec) for the different noise conditions as a function of the added white noise RMS contrast (Experiment II).
For the Random conditions it is possible to compute the efficiency of the observers relative to an observer limited only by the randomness in the noise. We computed d' for the observer that cross correlates the signal image with the two stimuli and selects the stimulus with the larger value. The d' was based on the cross correlation of the signal with itself and the variance of the cross correlation with the full population of possible noises. The efficiencies based on the average observer thresholds in ascending order of noise level were 0.043, 0.071, and 0.061.
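The efficiency calculation can be sketched as follows. This is an illustration under stated assumptions: the observer uses the signal itself as the template, d' is the mean template response to the signal divided by the standard deviation of the template response over the noise population, and efficiency is taken as the ratio of ideal to human threshold contrast energy at equal d'. All function names and the toy data are hypothetical.

```python
import math


def dot(a, b):
    """Cross correlation (at zero shift) of two flattened images."""
    return sum(x * y for x, y in zip(a, b))


def cross_correlator_dprime(signal, noise_population, amplitude=1.0):
    """d' of an observer that cross correlates the signal with each stimulus.

    signal and each noise sample are flat lists of pixel contrasts. Any 2IFC
    scaling convention cancels when thresholds are compared at equal d'.
    """
    responses = [dot(signal, noise) for noise in noise_population]
    mean = sum(responses) / len(responses)
    variance = sum((r - mean) ** 2 for r in responses) / len(responses)
    return amplitude * dot(signal, signal) / math.sqrt(variance)


def efficiency(human_threshold_energy, ideal_threshold_energy):
    """Ratio of ideal to human threshold contrast energy."""
    return ideal_threshold_energy / human_threshold_energy


def efficiency_from_db(db_above_ideal):
    """Efficiency when the human threshold is db_above_ideal dB above the ideal."""
    return 10.0 ** (-db_above_ideal / 10.0)


# Toy example: a one-pixel signal and the full population of cyclic shifts
# of a four-pixel noise sample.
signal = [1.0, 0.0, 0.0, 0.0]
noise = [1.0, -1.0, 1.0, -1.0]
population = [noise[k:] + noise[:k] for k in range(len(noise))]
d_unit = cross_correlator_dprime(signal, population)             # 1.0
d_double = cross_correlator_dprime(signal, population, 2.0)      # 2.0
```

As a point of reference, `efficiency_from_db(13.0)` is about 0.05, so efficiencies in the range reported here correspond to human thresholds roughly 11 to 14 dB above those of the cross-correlating observer.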
Experiment I found no difference between Random and Fixed noise conditions using the category procedure, while Experiment II found a large difference between these two conditions using the 2IFC procedure. One possible explanation for this apparent discrepancy is that there was a large amount of criterion variability in the category procedure. Criterion variability would be expected to act like additional internal noise and obscure the effects of external noise variability. Figure 8 illustrates that the results are consistent with this hypothesis. If the extent of criterion noise is estimated from the difference between the Fixed conditions for the two procedures, then the difference between the Fixed and Random conditions seen in the 2IFC experiment would be expected to diminish to the much smaller amount indicated in the figure. This analysis shows that whatever the source of the difference in the Fixed and Random conditions in the 2IFC experiment, a more precise experiment would be required to have that same difference be significant in the presence of more internal noise.
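The logic of this prediction can be sketched numerically. The sketch assumes, in the spirit of Equation 1, that criterion noise adds a fixed amount to the threshold contrast energy in linear (not decibel) units, and that this amount is the same for the Fixed and Random conditions. The function names and the threshold values are hypothetical and are not the measured data.

```python
import math


def db_to_energy(db):
    """Convert decibels to linear energy relative to the reference."""
    return 10.0 ** (db / 10.0)


def energy_to_db(energy):
    """Convert linear energy relative to the reference to decibels."""
    return 10.0 * math.log10(energy)


def predict_random_rating_db(fixed_2ifc_db, random_2ifc_db, fixed_rating_db):
    """Predict the rating-procedure Random threshold from the other three.

    The criterion noise contribution is estimated as the linear energy
    increase of the Fixed threshold from the 2IFC to the rating procedure.
    """
    criterion_energy = db_to_energy(fixed_rating_db) - db_to_energy(fixed_2ifc_db)
    return energy_to_db(db_to_energy(random_2ifc_db) + criterion_energy)


# Hypothetical thresholds: a 5 dB Random-over-Fixed difference in 2IFC and a
# Fixed threshold 10 dB higher in the rating procedure.
predicted = predict_random_rating_db(10.0, 15.0, 20.0)
shrunken_difference = predicted - 20.0   # under 1 dB
```

With these made-up numbers, a 5 dB difference in the 2IFC procedure shrinks to less than 1 dB once the extra criterion noise is added, which is the qualitative pattern the argument relies on.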

Figure 8. Mean signal contrast energy thresholds (decibels re 10^-6 deg^2 sec) for Observer BLB for the different noise conditions as a function of the added white noise RMS contrast (Experiments I and II). The dark line is the prediction of the Random condition of Experiment I (a), based on the criterion noise estimate from the increase in the Fixed threshold from Experiment II to Experiment I (b) and the increase in the threshold of the Random over the Fixed condition in Experiment II (c).
The results of Experiment IIB show that the external noise variability is not a major contributor to variability in the observer decision variable in this situation. The Random and Random-Fixed conditions are not significantly different from each other, and the efficiencies suggest that at most about 7% of the variability in the observer decision variable is related to variability in the target level itself caused by the external noise variability.
The results of Experiment IIA show that there is a large difference between the Random-Fixed and the Fixed conditions. This suggests that the observer decision variable is computed differently in these two situations, probably because different templates are being used. Figure 9 illustrates the general detection process model of Figure 1 with additional processes added to indicate feedback mechanisms that might play a role in the generation of observer templates. We think that randomness in the noise somehow affects the learning process that generates the observer templates. For example, suppose that whenever the reinforcement says that a stimulus is a target and it is not similar enough to any current template, a new template is formed. With random noise samples, more templates would be formed, resulting in the poorer performance associated with stimulus uncertainty. Another possibility is that nonlinear feature processing interacts with the noise to exacerbate the above process or to prevent averaging processes from forming representative templates.
A general goal of this project was to see if image discrimination models predict contrast detection thresholds for objects embedded in fixed and random noise. Earlier results showed that these models can predict results of fixed pattern noise [9]. Although there is significant fixed pattern noise in some image systems, random noise is even more common, and here we were mainly interested in whether image discrimination models can predict the target detection in random noise, as one might expect if internal noise dominates detection. In the category rating experiment, the fixed pattern noise results were close to the random noise results, so that discrimination model predictions for the fixed case might be adequate for the random case. In the 2IFC experiment, the fixed noise thresholds were much lower than the random noise thresholds. However, the results are not consistent with the simple idea that only internal noise is reduced. It appears that random noise affects the observer target templates, which the image discrimination models account for in a simplistic manner.

Figure 9. Schematic of a target detection model with the feedback mechanisms required to provide for template learning.
This work was supported in part by NASA Grant 199-06-39, NASA Aeronautics RTOP #505-64-53, and NASA Cooperative Agreement NCC 2-327.
1. D. M. Green and J. A. Swets (1966) Signal Detection Theory,
Wiley, New York.
2. A. E. Burgess and B. Colborne (1988) "Visual signal detection:
IV. Observer inconsistency," J. Opt. Soc. Amer. A, Vol. 5, pp. 617-628.
3. J. A. Solomon and D. G. Pelli (1994) "The visual filter
mediating letter identification," Nature, Vol. 369, pp. 395-397.
4. A. B. Watson (1987) "Efficiency of a model human image code,"
J. Opt. Soc. Amer. A, Vol. 4, pp. 2401-2417.
5. S. Daly (1993) "The visible difference predictor: An
algorithm for the assessment of image fidelity,"
in Digital Images and Human Vision, ed. A. B. Watson,
pp. 179-206, MIT Press, Cambridge, MA.
6. J. Lubin (1993) "The use of psychophysical data and
models in the analysis of display system performance,"
in Digital Images and Human Vision, ed. A. B. Watson,
pp. 162-178, MIT Press, Cambridge, MA.
7. A. P. Ginsburg (1986) "Spatial filtering and visual form perception,"
in Handbook of perception and human performance,
Vol 2. Cognitive processes and performance,
ed. K. R. Boff, L. Kaufman, J. P. Thomas,
John Wiley & Sons, New York, NY, pp. 1-41.
8. A. J. Ahumada, Jr., A. B. Watson, and A. M. Rohaly (1995)
"Models of human image discrimination predict
object detection in natural backgrounds,"
in B. Rogowitz and J. Allebach,
Human Vision, Visual Processing, and Digital Display VI, Proc. Vol. 2411,
SPIE, Bellingham, WA, pp. 355-362.
9. A. J. Ahumada, Jr. and B. L. Beard (1996)
"Object detection in a noisy scene,"
in B. Rogowitz and J. Allebach,
Human Vision, Visual Processing, and Digital Display VI, Proc. Vol. 2657,
SPIE, Bellingham, WA, paper 23.
10. W. S. Torgerson (1958) Theory and Methods of Scaling,
Wiley, New York.
11. A. B. Watson, H. B. Barlow, J. G. Robson (1983)
"What does the eye see best?" Nature, Vol. 302, pp. 419-422.