In this Note, I describe properties of Data Set 23 (DS23), a collection of Mark IV data collected and processed by Tom Droege. The dataset spans the period from Dec 4, 2001, to Apr 18, 2002. It contains magnitudes (V and I) and positions produced by the Mark IV photometric pipeline.
The dataset consists of star lists from 1481 pairs of simultaneous V-band and I-band images. A list of the individual image properties can be found in the ds23_centers.dat file.
The V-band and I-band cameras were (probably) pointed in slightly different directions, offset by some small number of arcminutes. I calculated the central position of each image pair in an approximate fashion as follows:
- find the min, max RA value of stars detected in the image pair
- find the min, max Dec value of stars detected in the image pair
- calculate image center as
central RA = 0.5*(min_RA + max_RA)
central Dec = 0.5*(min_Dec + max_Dec)
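The recipe above can be sketched in a few lines of Python. This is only an illustration: the function name and the input layout (a list of (RA, Dec) pairs in degrees) are assumptions, and the sketch ignores the RA wrap at 0/360 degrees, which does not matter for the fields discussed here.

```python
def approximate_center(stars):
    """Approximate center of an image pair from its detected stars.

    stars: iterable of (ra_deg, dec_deg) tuples (hypothetical layout).
    Returns (central_ra, central_dec) as the midpoint of the extremes.
    """
    ras = [ra for ra, _ in stars]
    decs = [dec for _, dec in stars]
    central_ra = 0.5 * (min(ras) + max(ras))
    central_dec = 0.5 * (min(decs) + max(decs))
    return central_ra, central_dec


if __name__ == "__main__":
    demo = [(108.0, 5.0), (112.0, 9.0), (110.5, 7.2)]
    print(approximate_center(demo))
```

Because only the extreme values enter, a single spurious detection far from the field would shift the center; that is one reason the result is only approximate.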
A plot of the (RA, Dec) central positions shows that the cameras were fixed close to Dec = +7 degrees.
Tom ran the Mark IV in follow mode: the cameras followed a fixed (RA, Dec) on the sky for about two hours, then moved back East to another (RA, Dec), following it for two hours, and so forth. With an exposure time of 100 seconds (?), and readout time of maybe 40 seconds, there were typically 30-60 pairs of images of a particular region of the sky during each night. This closeup of the area between RA = 90 and 120 degrees shows that some areas were covered by many different frames, often on different nights.
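As a quick consistency check on those numbers, here is a small sketch of the cadence arithmetic. The 100-second exposure and 40-second readout are the uncertain values quoted above, not confirmed figures, and the function name is made up for illustration.

```python
def frames_per_follow_run(run_seconds, exposure_seconds, readout_seconds):
    """Number of complete exposure+readout cycles that fit in one follow run."""
    cycle = exposure_seconds + readout_seconds
    return int(run_seconds // cycle)


if __name__ == "__main__":
    # Two-hour follow run with the (uncertain) timings from the text.
    print(frames_per_follow_run(2 * 3600, 100, 40))
```

With these assumed timings, a two-hour run yields about 51 image pairs, which falls inside the quoted range of 30-60 pairs per night.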
Recall that the pipeline produces (RA, Dec) positions for each matched (V,I) pair of detections in each pair of images. The steps involved are
One can ask two different questions about the positions of stars (or about anything, for that matter):
Precision of individual positions
We can address the first question by examining the scatter around the mean Mark IV position for each star. On a single night, the telescope might take 30 or 40 images of a particular field. How well do all those positions agree with each other?
Numerically, we can calculate mean and standard deviation from the mean in the positions. Using only stars which were observed at least 30 times, I find
Residuals in position (arcsec)
              mean        stdev
-----------------------------------
  RA        -0.0003       0.302
  Dec       +0.0007       0.294
-----------------------------------
Recall that the pixel scale on the Mark IV is about 7.5 arcseconds per pixel. Therefore, the residuals from the mean positions have standard deviation of about 1/25 of a pixel. A reasonable precision for raw stellar centroids is 1/30 of a pixel, so the current measurements are fairly good.
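A minimal sketch of this kind of calculation follows. It assumes that each star's positions have already been converted to offsets in arcseconds along one axis (for RA, that would include the cos(Dec) factor); the function names and the input layout are hypothetical, and this is not the script actually used for the table.

```python
import statistics

PIXEL_SCALE_ARCSEC = 7.5  # approximate Mark IV pixel scale quoted in the text


def position_scatter(offsets_arcsec, min_obs=30):
    """Mean and sample standard deviation of one star's offsets on one axis.

    Returns None if the star has fewer than min_obs measurements.
    """
    if len(offsets_arcsec) < min_obs:
        return None
    mean = statistics.fmean(offsets_arcsec)
    stdev = statistics.stdev(offsets_arcsec)
    return mean, stdev


def as_pixel_fraction(stdev_arcsec):
    """Express a scatter in arcseconds as a fraction of a pixel."""
    return stdev_arcsec / PIXEL_SCALE_ARCSEC


if __name__ == "__main__":
    # A scatter of 0.3 arcsec corresponds to 0.04 pixel, i.e. 1/25 pixel.
    print(as_pixel_fraction(0.3))
```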
Now, let's look at some plots of the individual residuals (position in one particular frame minus average position in all frames) as a function of the position in the image. Since the Mark IV has a very wide field of view -- a bit more than 4 degrees on a side -- one might expect to see some systematic trends.
Here are residuals in RA as a function of radial distance from the field center, as a function of distance from the center in the RA direction, and as a function of distance from the center in the Dec direction.
The only big feature in these graphs is the increased residual for stars near the eastern edge of the frame; it has a magnitude of less than two arcseconds.
One can create similar diagnostic graphs for the residuals in Dec measurements.
Nothing big there.
Histograms of the residuals show nicely bell-shaped distributions:
Accuracy of individual positions
Now, how accurate are the DS23 positions? To some extent, this depends on the catalog used as the source of "true" positions. I compared the DS23 positions to those in the Tycho-2 catalog. Since the Tycho-2 positions were used to derive the DS23 astrometric transformations, one might expect the errors to be small.
For the comparisons below, I used a subset of the entire DS23 dataset: stars between RA = 105 and 115 degrees. This includes over a million individual measurements. Using a matching radius of 5 arcseconds, I found 374,000 matches between the Tycho-2 catalog and the DS23 subset. For some calculations, I broke these into "bright" (51,000 measurements with V < 10) and "faint" (323,000 measurements with V > 10) samples.
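The matching step can be illustrated with a brute-force sketch. Everything here is an assumption made for illustration: the function names, the input layout (lists of (RA, Dec) in degrees), and the small-angle separation formula. A real match of a million measurements would need a sorted or indexed catalog rather than a double loop.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation between two positions (degrees in, arcsec out)."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    dra = (ra1 - ra2) * math.cos(mean_dec)  # shrink RA difference by cos(Dec)
    ddec = dec1 - dec2
    return 3600.0 * math.hypot(dra, ddec)


def match(measurements, catalog, radius_arcsec=5.0):
    """Match each measurement to its nearest catalog star within the radius.

    Returns a list of (measurement_index, catalog_index, separation_arcsec).
    """
    matches = []
    for i, (ra, dec) in enumerate(measurements):
        best = None
        for j, (cat_ra, cat_dec) in enumerate(catalog):
            sep = separation_arcsec(ra, dec, cat_ra, cat_dec)
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (j, sep)
        if best is not None:
            matches.append((i, best[0], best[1]))
    return matches
```

Note that the matching radius caps the largest error that can appear in the comparison: a measurement more than 5 arcseconds from its catalog star would simply go unmatched.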
First, the numbers.
Errors in DS23 individual positions (arcseconds)
                           bright (V < 10)      faint (V > 10)
--------------------------------------------------------------------------
Number of matches               51,000              323,000
RA  error, mean                   0.00                -0.05
           stdev                  0.59                 1.06
Dec error, mean                  -0.03                -0.08
           stdev                  0.57                 1.07
tot error, mean                   0.56                 1.18
           stdev                  0.60                 0.94
--------------------------------------------------------------------------
The residuals are distributed in a nicely symmetrical fashion:
Here are histograms of the total error in position for both the bright and faint samples:
Accuracy of mean positions
Since the camera followed each field during a night, it acquired many measurements of each star. Let us examine the accuracy of the mean positions of those stars in the subset from RA = 105 to 115 that were observed at least 10 times. If there are no systematic errors in the measurement process, or in the astrometric calibration of each frame, then we would expect the mean positions to be somewhat more accurate than the individual positions.
There are a total of about 2,900 stars observed at least 10 times in this range of RA. I divide them again into "bright" (V < 10) and "faint" (V > 10) samples.
Errors in DS23 mean positions (arcseconds)
                           bright (V < 10)      faint (V > 10)
--------------------------------------------------------------------------
Number of matches                  404                2,504
RA  error, mean                  -0.02                -0.02
           stdev                  0.56                 0.99
Dec error, mean                  -0.03                -0.12
           stdev                  0.56                 0.99
tot error, mean
           stdev                  0.45                 0.98
--------------------------------------------------------------------------
Aha! These errors are virtually the same as those for individual measurements! That means that systematic effects are dominating the error budget in our astrometry.
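The reasoning can be made concrete with a simple error model. This is an assumed model, not something taken from the pipeline: each measurement carries an independent random error plus a systematic error that is the same in every frame, so averaging N measurements shrinks only the random part.

```python
import math


def expected_mean_error(sigma_random, sigma_systematic, n_obs):
    """Expected error of a mean of n_obs measurements (assumed model).

    The random part averages down as 1/sqrt(n_obs); the systematic part,
    shared by every measurement, does not average down at all.
    """
    return math.sqrt(sigma_systematic ** 2 + sigma_random ** 2 / n_obs)


if __name__ == "__main__":
    # If the 0.59 arcsec scatter of bright stars were purely random,
    # the mean of 10 measurements would be good to roughly 0.19 arcsec.
    print(expected_mean_error(0.59, 0.0, 10))
```

Under this model, purely random errors of 0.59 arcsec would give mean positions good to roughly 0.19 arcsec after 10 observations. The measured value of 0.56 arcsec is nowhere near that, which is what one expects when the systematic term dominates.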
One way to see such systematic errors is to make a "Taffogram" (as Arne Henden put it, named after Lawrence G. Taff, I believe): a plot of the errors in position as a function of position in the field. Using all 374,000 individual measurements in the RA = 105 to 115 subset, I find the following pattern:
Each vector represents the average of roughly 1,000 stellar positions at its position in the field. There are clearly small areas of correlated errors, mostly of small amplitude, less than 0.3 arcsec. I suspect that these residuals -- which I speculate are due to a combination of aberrations in the optics and curvature of the CCD chip -- limit the Mark IV astrometry. The only ways to improve it would be to replace the optics, replace the chip, or use a distortion correction of higher order than the current cubic model. However, since there are typically only 30 to 40 good Tycho-2 stars per field to use in the astrometric calibration, which is barely enough for the cubic fit, it's unlikely one could do much better with the current system.
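For readers who want to build a similar diagram, here is a rough sketch of the binning behind such a plot. The cell size, the function name, and the input layout (offsets from the field center in degrees, errors in arcseconds) are all assumptions; the actual plot may have been made differently.

```python
import math
from collections import defaultdict


def taffogram(measurements, cell_size_deg=0.5):
    """Average the position errors in square cells across the field.

    measurements: iterable of (x_deg, y_deg, dra_arcsec, ddec_arcsec),
    where x and y are offsets from the field center (hypothetical layout).
    Returns {cell: (count, mean_dra, mean_ddec)}; each entry is one vector.
    """
    sums = defaultdict(lambda: [0, 0.0, 0.0])
    for x, y, dra, ddec in measurements:
        cell = (math.floor(x / cell_size_deg), math.floor(y / cell_size_deg))
        entry = sums[cell]
        entry[0] += 1
        entry[1] += dra
        entry[2] += ddec
    return {cell: (n, sx / n, sy / n) for cell, (n, sx, sy) in sums.items()}
```

Random errors average toward zero within a cell, so any vector that remains after averaging about a thousand measurements points to a systematic distortion at that location.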
Photometry -- a quick overview
How good are the magnitudes produced by the Mark IV pipeline? There are several ways to answer the question, and so this portion of the document is divided into several sections.
First, I selected a subset of the entire DS23 dataset to analyze in depth. This subset should be well measured:
Tom's distribution of the DS23 on a CD-Rom included a big file, "collected.big", which grouped together all the measurements of a given star from different images and different nights. Roughly 2 percent of the entries in "collected.big" satisfy the conditions listed above, so there are roughly 2,000 distinct stars in this subset.
I wrote a script to look at the collected measurements of each star in the "collected.big" file and calculate for each star a number of quantities, including
The most obvious metric of photometric precision is a plot of stdev from mean mag vs. magnitude. This subset shows reasonable behavior in V-band (note the logarithmic vertical scale!):
There is a ridge line which falls below 0.01 mag at the bright end and rises in the expected way to fainter stars. There are quite a few stars with much larger stdev values -- too many to be true variables. Something's fishy.
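The quantity plotted in such a diagram is easy to compute once the measurements are grouped by star. The sketch below assumes a dictionary mapping a star identifier to its list of magnitudes in one band; that layout is hypothetical and is not the actual format of "collected.big".

```python
import statistics


def star_scatter(mags_by_star, min_obs=2):
    """Mean magnitude and stdev from the mean for each star in one band.

    mags_by_star: {star_id: [mag, mag, ...]} (hypothetical layout).
    Stars with fewer than min_obs measurements are skipped.
    """
    result = {}
    for star_id, mags in mags_by_star.items():
        if len(mags) < min_obs:
            continue
        result[star_id] = (statistics.fmean(mags), statistics.stdev(mags))
    return result


if __name__ == "__main__":
    demo = {"a": [10.00, 10.01, 9.99], "b": [12.0]}
    print(star_scatter(demo))
```

Plotting the second value against the first, with a logarithmic vertical axis, gives the kind of diagram described above.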
In the I-band, the fishiness explodes:
Here, the points are scattered nearly uniformly in the diagram at all magnitudes. There's a ceiling around 0.07 mag, but that's unacceptable for bright stars.
What's going on?
Look at a plot of the magnitude residuals from the mean as a function of position (in this case, RA) on the frame. In the V-band, there's no big systematic pattern:
But in the I-band, there is an obvious pattern:
It's clear that the I-band magnitudes, especially, suffer from errors as a function of a star's location in the field. So, let's look at these residuals as a function of position in the field. I'll plot the positive residuals as red stars, and negative residuals as black triangles; big symbols mean big residuals. Note that Tom's Mark IV was slightly misaligned during his observations, so that the cameras slowly crept in RA and Dec during a two-hour follow run on any particular field. Successive observations of the same star will show up as diagonal streaks in the diagrams below.
Here are the V-band residuals. It's clear that the edges show a few big errors, but the center of the frame looks okay. Note that the symbols in any diagonal stripe (belonging to the same star in a follow run) are usually about the same small size.
The I-band residuals, by contrast, show a large-scale dichotomy: stars on the right half of the chip are much fainter than average, and those on the left half brighter than average.
Photometry -- ensemble analysis
It appears that one of the problems with the DS23 photometry is a position-dependent error in magnitudes. This might have several different explanations (vignetting, poor flatfields, tree limbs, etc.), but what can one do about it?
One possibility is to do all photometry locally. If the systematic errors occur over a large distance in the field -- a degree or so -- then the relative magnitudes of the stars in a small region -- say, one quarter of a degree on a side -- ought to remain nearly constant. Even if the small region appears on the left side of the chip one night, and the right side of the chip on the next night, we ought to measure the same relative magnitudes within the region. If we call all the stars which appear in some small area an "ensemble", then the analysis of their relative magnitudes might be called ensemble photometry. I'll use the term "regular photometry" to describe that produced by the Mark IV pipeline on a frame-by-frame basis in the usual manner, and placed into the "collected.big" file.
Perhaps ensemble analysis might improve the relative magnitudes of stars. If we are interested in variability, the relative magnitudes are often good enough. I will leave aside for the moment -- though it will eventually come back -- the question of how one might properly tie together a number of ensemble solutions.
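To make the idea concrete, here is a deliberately simplified sketch of an ensemble solution. It models each measurement as (star's mean magnitude) + (zero-point offset of that frame) and solves for both by alternating averages. This is an unweighted toy version written for illustration, with a made-up function name and input layout; a full treatment would weight the measurements and solve the least-squares problem directly.

```python
def ensemble_solve(frames, n_iter=100):
    """Toy ensemble solution: m[frame][star] = mean_mag[star] + zero[frame].

    frames: list of non-empty dicts {star_id: instrumental_mag}.
    Returns (mean_mag, zero), with the zero points constrained to average 0.
    """
    zero = [0.0] * len(frames)
    mean_mag = {}
    for _ in range(n_iter):
        # Step 1: star means, given the current frame zero points.
        sums = {}
        for z, frame in zip(zero, frames):
            for star, mag in frame.items():
                total, count = sums.get(star, (0.0, 0))
                sums[star] = (total + mag - z, count + 1)
        mean_mag = {s: total / count for s, (total, count) in sums.items()}
        # Step 2: frame zero points, given the current star means.
        zero = [
            sum(mag - mean_mag[s] for s, mag in frame.items()) / len(frame)
            for frame in frames
        ]
        # Step 3: remove the arbitrary constant shared by both sets of unknowns.
        shift = sum(zero) / len(zero)
        zero = [z - shift for z in zero]
        mean_mag = {s: m + shift for s, m in mean_mag.items()}
    return mean_mag, zero
```

The solution fixes only relative magnitudes: adding a constant to every star and subtracting it from every frame changes nothing, which is why the sketch forces the zero points to average zero. That arbitrary constant is the zero-point that must be tied down separately for each ensemble.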
So, using the mechanism explained in detail by Honeycutt (PASP 104, 435, 1992), I selected one small region: a square box at RA = 110 degrees, Dec = 7.5 degrees, and width 0.4 degrees. This is about one tenth the width of the Mark IV frame. There were between 20 and 40 stars detected in this box in each image pair. I analyzed all measurements of stars within this box in two collections:
Consider the measurements from a single night. As the Mark IV followed this field for two hours, the telescope drifted a little bit -- only a small fraction of the entire field. We might expect the systematic errors in "regular" photometry to be small: the stars probably didn't move far enough across the field to enter an area of significantly different sensitivity. Below, I plot the standard deviation from mean magnitude vs. magnitude for both the "regular" and "ensemble" methods of analysis.
First, the V-band measurements. I plot the ensemble results for stars in the little box on this night (big green squares), and the "regular" results for ALL stars all over the image (little red dots).
Now, the I-band measurements.
Note that both methods of analysis yield relatively tight ridge lines which extend below stdev = 0.01 mag at the bright end. There doesn't appear to be much of a difference between the two methods -- just as we expected. Fine.
Now, what if we include measurements from many different nights? Stars will appear in different places on the chip on different nights. If position-dependent errors dominate, we would expect the scatter in the measurements on all nights to be much larger than that on a single night. So, below I plot the results from "regular" analysis of ALL stars in the field on all nights (small red dots) and an "ensemble" analysis of the stars in the little box on all nights (big green squares). The "regular" analysis includes only stars brighter than V = 10 and I = 10, but you'll see the difference clearly.
V-band first:
Now, the I-band measurements.
In both bands (especially in I), the scatter is much larger for "regular" analysis than for the "ensemble" solution. This suggests that, indeed, position-dependent errors are the cause of the scatter at bright magnitudes.
But, to be really fair, one should compare a "regular" analysis of stars in the little box on all nights vs. "ensemble" analysis of stars in the little box on all nights. Okay, I can do that.
But in these plots, I have added two additional pieces of information. They now show stdev from mean magnitude vs. mean magnitudes in four ways:
In V-band, note that the ensemble solution improves greatly the performance at the bright end; in fact, it comes very close to matching the theoretical estimates for scatter. Great.
In I-band, note that the ensemble solution is certainly better than results of the "regular" analysis. However, there the actual scatter is still significantly larger than that predicted by the statistics of photon counts.
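For reference, a common textbook estimate of the photon-limited scatter is sketched below. It is an assumption that the theoretical curves are of this general form; the exact noise model used for them is not given here, and this sketch ignores read noise, scintillation, and flatfield errors. Counts are in detected photo-electrons.

```python
import math


def photon_limited_sigma_mag(star_counts, sky_counts=0.0):
    """Approximate magnitude scatter from Poisson statistics alone.

    star_counts: detected photo-electrons from the star in the aperture.
    sky_counts: detected photo-electrons from the sky in the same aperture.
    Uses sigma_mag ~ (2.5 / ln 10) * noise / signal.
    """
    noise = math.sqrt(star_counts + sky_counts)
    return (2.5 / math.log(10.0)) * noise / star_counts


if __name__ == "__main__":
    # 10,000 photo-electrons and no sky give a scatter near 0.011 mag.
    print(photon_limited_sigma_mag(10000.0))
```

Any excess of the measured scatter over such an estimate points to error sources beyond photon noise.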
I conclude that, as far as the pipeline goes, everything is peachy in V-band. In I-band, there are clearly some errors that remain even after an ensemble analysis. I suspect that better flatfields will help, but that's just a guess.
Note that there is still a good deal to do after the (current) pipeline ends. The ensemble analysis is best done after a large amount of data from different nights has been gathered. It should be done either on demand, as users request information from a database, or on a "snapshot" -- a dataset based on some fixed amount of data (such as DS23). Tying the ensemble solutions for stars in different areas together is still not a trivial problem, I claim. Arne has suggested simply using the average calibrated magnitudes to set the zero-point of the ensemble solution, but that works only if each star has been measured at positions spaced uniformly all over the chip. I think this is a tough one ...
Photometry -- compared to accurate catalogs
The best source for accurate magnitudes of stars in the range of the Mark IV (between seventh and thirteenth magnitude) is the set of measurements by Landolt:
Most of these stars are very close to the celestial equator, and so don't match up with stars in DS23: only 38 of the stars appear in the range between Dec = +5 and Dec = +10.
Therefore, I included standards from another source: the photometric sequences observed by Arne Henden. I found several sequences that appear in the DS23 region:
field        RA (deg)   Dec (deg)    #stars    #stars V<13
---------------------------------------------------------------
GZ Cnc         138          9           38          0
SV CMi         113          6          588         12
UY Mon         105          9          343         17
WX CMi         108          7          771         12
V650 Ori        83         10          402          6
---------------------------------------------------------------
I matched up against this compilation of "standard" stars the measurements from DS23:
V-band errors versus V-band magnitude.
I-band errors versus I-band magnitude.
Now, V-band errors versus TASS (V-I) color.
I-band errors versus TASS (V-I) color.
I conclude from the graphs above
But -- alas -- the most important conclusion is that there are just too few good standards in the area covered by DS23 to permit one to make corrections. Note that there is exactly one standard brighter than tenth magnitude in this area! Corrections (or the confident absence thereof) must wait until we have a set of measurements over the celestial equator, where there are many more good, bright standards.
Last modified July 5, 2002 by MWR