Next: The Tables Up: Simulations Previous: The Simulations Themselves

Measuring the Image Quality

Two quantitative measures of image quality are currently in general use in radio astronomy: the dynamic range and the fidelity. The dynamic range (DR), defined as the ratio of the peak brightness to the off-source root-mean-square (rms) noise, measures the `contrast' in an image. The DR has several very positive features: it is easy to calculate for any source structure, even when the true brightness distribution is not known; it relates the quality of an image directly (through the rms noise) to the best one could possibly hope to achieve; it is reasonably robust to different noise realizations, since both the off-source noise and the peak are themselves fairly stable quantities; and it provides a single, simple number with which to compare the quality of different images. Moreover, the dynamic range improves when common sense says it should: as more antennas are added to an array, the noise goes down and the DR goes up; deeper deconvolutions lower the off-source noise and hence raise the DR; and stronger sources, which make deconvolution harder, generally lead to lower DRs than purely thermal noise would suggest. However, the dynamic range has two obvious flaws, one minor, the other fundamental. The minor problem lies in the definition: where should one measure the `off-source' noise? Presumably the noise $90^\circ$ from the source will be roughly thermal, but that tells us nothing about the quality of the image near the source. This leads to the fundamental difficulty: the DR measures the quality of the image precisely where we don't care about it, away from regions of interesting emission.
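As a minimal sketch of the DR calculation described above, the following assumes the image is supplied as a flat sequence of pixel values together with a boolean mask marking which pixels count as off-source. The function name, the mask argument, and the choice to take the rms about zero (rather than about the mean) are illustrative assumptions; choosing the off-source region is exactly the ambiguity noted in the text.

```python
import math


def dynamic_range(image, off_source):
    """Peak brightness divided by the rms of the off-source pixels.

    image      : flat sequence of pixel brightnesses
    off_source : flat sequence of booleans, True where a pixel is taken
                 to be free of emission (an assumption made by the caller)
    """
    noise = [v for v, off in zip(image, off_source) if off]
    if not noise:
        raise ValueError("no off-source pixels selected")
    # rms taken about zero, i.e. sqrt(mean(v**2)); an illustrative choice
    rms = math.sqrt(sum(v * v for v in noise) / len(noise))
    return max(image) / rms


if __name__ == "__main__":
    image = [10.0, 0.1, -0.1, 0.1, -0.1]
    mask = [False, True, True, True, True]
    print(dynamic_range(image, mask))
```

Note that nothing in this calculation refers to the true sky, which is both the convenience and the weakness of the DR.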

To address this latter problem, Cornwell, Holdaway, and Uson (1993; hereafter CHU) introduced the fidelity: ``the ratio of the value of a pixel to the error between the true sky distribution $T({\bf x})$ and the reconstructed image $I({\bf x})$.'' The fidelity is an image rather than a single number, and is intended to measure the signal-to-noise ratio (SNR) as a function of position. It measures what we want to know, but is ill-defined, difficult to calculate, and hard to summarize. In practice people form an approximation to the true fidelity, as

\begin{displaymath}F({\bf x})={I({\bf x})\over{I({\bf x})-T({\bf x})}}\end{displaymath}

This is simple enough to calculate (assuming you know the true sky distribution!) but is not very robust, because it divides a reasonably stable quantity by one which can be, more often by luck than by marvelous imaging, arbitrarily small. In principle one should redo each simulation with a large number of noise realizations to derive a better estimate of the on-source noise, but the computational requirements are prohibitive, and even then the fidelity is not guaranteed to be well-behaved (it will for instance have structure on scales much smaller than the synthesized beam). There also remains the problem that an image is more difficult to summarize, tabulate, and compare than a single number like the dynamic range. So what numbers can one use to characterize an image? Obvious possibilities are the mean and the median; the latter is far more robust, especially for a positive-definite quantity (presumably we care about the absolute value of the fidelity), and CHU adopted this as their fidelity index.
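The approximate fidelity image and its median can be sketched as follows, with images passed as flat sequences of pixel values. The function names are hypothetical, and returning infinity where the reconstruction exactly equals the truth is a convention adopted here only to avoid a division by zero; it is not taken from CHU.

```python
import math
import statistics


def fidelity_image(recon, truth):
    """F(x) = I(x) / (I(x) - T(x)), evaluated pixel by pixel."""
    out = []
    for i, t in zip(recon, truth):
        err = i - t
        # Assumed convention: an exactly zero error gives infinite fidelity
        out.append(math.inf if err == 0 else i / err)
    return out


def median_fidelity_index(recon, truth):
    """Median of |F| over all supplied pixels."""
    return statistics.median(abs(f) for f in fidelity_image(recon, truth))


if __name__ == "__main__":
    recon = [10.0, 5.0, 2.0]
    truth = [9.0, 5.5, 1.0]
    print(fidelity_image(recon, truth))
    print(median_fidelity_index(recon, truth))
```

Because the median only depends on the ordering of the values, a handful of infinite or very large pixels does not disturb it, which is the robustness argument made above.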

However, this median fidelity index (FI) still has some problems. Like the DR, it is not terribly well-defined. By definition, $F({\bf x})=1$ where there is no signal (with $T({\bf x})=0$ the ratio reduces to $I/I$); the median of any fidelity image will therefore tend to unity as the size of the region under consideration is increased, and the median fidelity will in general be biased low because of this effect. More seriously, it is not obvious that the fidelity is equally meaningful or important everywhere in the image; if the reconstruction differs from the truth where the true sky is roughly at the level of the noise, we don't really care, but because of the difference in the denominator such a pixel might well give a very high fidelity. We would prefer a less democratic estimator, one which like the DR pays more attention to the high points in the image which the astronomer may wind up trying to over-interpret. Considerations like these led Frazer Owen to suggest using the intensity-weighted mean FI:

\begin{displaymath}\langle FI_S\rangle\equiv { \sum I({\bf x}) abs(F({\bf x}))\over
\sum I({\bf x})}\end{displaymath}
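A short sketch of this intensity-weighted mean, assuming the reconstructed image and a precomputed fidelity image are available as flat sequences of equal length (the function name is illustrative). The weights are $I({\bf x})$ exactly as in the formula, so negative pixels would enter with negative weight.

```python
def intensity_weighted_fi(recon, fidelity):
    """sum(I * |F|) / sum(I) over all supplied pixels."""
    total = sum(recon)
    if total == 0:
        raise ValueError("sum of intensities is zero")
    return sum(i * abs(f) for i, f in zip(recon, fidelity)) / total


if __name__ == "__main__":
    recon = [10.0, 1.0]
    fidelity = [20.0, -2.0]
    # The bright pixel dominates: the result is near 18.4,
    # while the unweighted mean of |F| would be 11.
    print(intensity_weighted_fi(recon, fidelity))
```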

Another approach would be to say that we're equally interested in the fidelity wherever the signal is well above the off-source noise, and construct the high SNR mean FI as

\begin{eqnarray*}
\langle FI_{N\sigma}\rangle\equiv { \sum w({\bf x}) abs(F({\bf x}))\over
\sum w({\bf x})},\\
w({\bf x})\equiv 1-e^{-I^2({\bf x})/(N\sigma)^2}
\end{eqnarray*}



where $\sigma$ is the off-source rms noise, and $N$ (taken hereafter to be 5) determines the low-signal cutoff. Both of these indicators rely on the mean (which is not very robust to outliers), but use the entire fidelity image.
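The weighting scheme can be sketched as below. The weight function follows the definition of $w({\bf x})$ given above; normalizing the weighted sum by the sum of the weights is an assumption made here by analogy with the intensity-weighted mean, and the function names are illustrative.

```python
import math


def high_snr_weight(i, sigma, n=5.0):
    """w = 1 - exp(-I^2 / (N sigma)^2): near 0 at the noise level, near 1 well above it."""
    return 1.0 - math.exp(-(i * i) / (n * sigma) ** 2)


def high_snr_mean_fi(recon, fidelity, sigma, n=5.0):
    """Weighted mean of |F| with the weights above (assumed normalization: sum of weights)."""
    weights = [high_snr_weight(i, sigma, n) for i in recon]
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return sum(w * abs(f) for w, f in zip(weights, fidelity)) / total


if __name__ == "__main__":
    # One pixel far above 5 sigma, one with no signal at all
    print(high_snr_mean_fi([50.0, 0.0], [30.0, 1.0], sigma=1.0))
```

A pixel at exactly $N\sigma$ receives weight $1-e^{-1}\approx0.63$, so the cutoff is gradual rather than sharp.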

Another set of possibilities, which as discussed below turns out to be the most useful, depends on a restricted range of (the brightest) pixels. In this memorandum I define the median peak fidelity index (Med.Pk. FI in the tables) as the median of the absolute value of the fidelity measured at the brightest pixels. The number of pixels is chosen to give enough independent samples that the median is meaningful, while concentrating on the regions of highest SNR (making this measure somewhat analogous to the DR). Similarly, the peak SNR is defined as the mean brightness of the reconstructed image divided by the rms of the difference image, both statistics calculated over these same brightest pixels. In practice I used the brightest 300 pixels ($\sim25\rm\,beams$) for Cas A, the brightest 100 ($\sim8\rm\,beams$) for Cyg A in its 130$^{\prime\prime}$ or 260$^{\prime\prime}$ incarnations, and the brightest 150 ($\sim12\rm\,beams$) for Cyg A at 360$^{\prime\prime}$. Of course, these measures can be calculated for any set of pixels, and for a given simulation one can plot the median fidelity and SNR for equal-size bins as a function of surface brightness.
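The two peak statistics can be sketched as follows. This assumes that "brightest" refers to the reconstructed image (the text does not say whether the reconstruction or the true sky is used for the selection), that the rms of the difference image is taken about zero, and that a pixel with exactly zero error is assigned infinite fidelity. All function names are illustrative.

```python
import math
import statistics


def brightest_indices(recon, n):
    """Indices of the n brightest pixels of the reconstructed image."""
    order = sorted(range(len(recon)), key=lambda k: recon[k], reverse=True)
    return order[:n]


def median_peak_fi(recon, truth, n):
    """Median of |I / (I - T)| over the n brightest pixels."""
    values = []
    for k in brightest_indices(recon, n):
        err = recon[k] - truth[k]
        values.append(math.inf if err == 0 else abs(recon[k] / err))
    return statistics.median(values)


def peak_snr(recon, truth, n):
    """Mean brightness divided by rms of (I - T), both over the n brightest pixels."""
    idx = brightest_indices(recon, n)
    mean_brightness = sum(recon[k] for k in idx) / len(idx)
    rms_diff = math.sqrt(sum((recon[k] - truth[k]) ** 2 for k in idx) / len(idx))
    return mean_brightness / rms_diff


if __name__ == "__main__":
    recon = [10.0, 8.0, 1.0, 0.5]
    truth = [9.0, 9.0, 0.5, 0.0]
    print(median_peak_fi(recon, truth, 2), peak_snr(recon, truth, 2))
```

For the simulations described here, `n` would be 300, 100, or 150 depending on the source.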

Figures [*]-[*] show a few of these plots. The median fidelity is a fair description of the bulk of the data, while the means are pulled significantly upward by a few very high points. As expected, at high SNR all these measures of the fidelity are significantly below the SNR computed from the off-source noise, confirming that the on-source noise level is significantly higher than the off-source rms would indicate. Finally, note that the on-source SNR and the median fidelity both rise rapidly with the source flux density, approaching an asymptotic value. This behavior is characteristic of these curves for all these simulations. In particular, an image which has a higher SNR or FI as calculated from the brightest pixels tends to have a similarly higher SNR or FI in all bins (beyond the low-SNR points, which are dominated by thermal noise). It is not at all obvious that this should be true, but this feature does allow one to compare image qualities meaningfully by characterizing the image quality at the brightest pixels. This is the justification for the use of the median peak FI and the peak SNR described above.
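Curves of median fidelity against surface brightness could be tabulated along the following lines. This is only a sketch: it interprets "equal-size bins" as bins holding equal numbers of pixels, takes the fidelity image as already computed, and silently drops any leftover pixels that do not fill a final bin. The function name and arguments are illustrative.

```python
import statistics


def binned_median_fidelity(recon, fidelity, pixels_per_bin):
    """Return (median brightness, median |F|) for successive equal-count bins.

    Pixels are sorted by reconstructed brightness, faintest first.
    """
    pairs = sorted(zip(recon, (abs(f) for f in fidelity)))
    bins = []
    # Stop before a trailing partial bin (assumed behavior)
    for start in range(0, len(pairs) - pixels_per_bin + 1, pixels_per_bin):
        chunk = pairs[start:start + pixels_per_bin]
        bins.append((
            statistics.median(b for b, _ in chunk),
            statistics.median(f for _, f in chunk),
        ))
    return bins


if __name__ == "__main__":
    recon = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    fidelity = [1.0, -2.0, 3.0, -4.0, 5.0, -6.0]
    for brightness, fid in binned_median_fidelity(recon, fidelity, 3):
        print(brightness, fid)
```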


Stephan Witz 2003-04-15