Coma etc.

Andrew Bennett

Version 5, 2001 Feb 16:
Ancient and Historical Fits have been relegated to the bottom of the page. Don't bother with them!

All sorts of things including basic fitting theory and comparisons between least squares fitting methods and the aperture fitting method used by "Star" have been moved to a separate page

The Current Model

The current model uses the sum of two elliptical Gaussians.

2 Ellipse model
The small red ellipse represents the core while the larger blue ellipse represents the comatic tail. The centres of the two ellipses are separated by a distance d, with components d1 and d2 in the AXIS1 and AXIS2 directions respectively. In the present version, the two components are assumed to have equal fluxes.

The ellipses could be characterized by their major and minor axes, a and b, and the angle of the major axis. However, this formulation gives serious numerical problems with indeterminate angles for nearly circular sources. Instead, an ellipse is here characterized by a radius - the radius of the circle having the same area - and two components of ellipticity: Ellipse

The two-ellipse model thus has 8 parameters per source; radius and two ellipticities for each of the two ellipses plus the two components of the separation between their centres.

Variation Across the Image

A realistic description of the variation of the radius requires a quadratic.

r^2 = quadratic in X1, X2

This requires six terms; constant, X1, X2, X1^2, X1X2, X2^2. These give a considerable range of behaviours from the plain and boring;

PSFconst

through the mundane:

PSFgradient PSFcentred

to the downright bizarre i.e. just about enough range to cope with Tom's lenses.

PSFweird1 PSFweird2

The minimum number of parameters needed to describe the variation of each component of the displacement and ellipticity components across the image in a moderately realistic manner is three: constant plus X1 and X2 gradients.

PSFtrail PSFradial

Constant ellipticity can represent trail in RA while matched linear gradients can give either a radial pattern ...

PSFtangent PSFweird3

... or a tangential one. Quite complicated behaviour can be modelled.

The 8 parameters per source thus require a minimum of 30 parameters to represent them across the image: 2x6 for the two radii and 6x3 for the 6 displacement and ellipticity components. Put them all together and one can do a fair job of representing not just a centred image but even the real thing!

PSFcomatic Typical

The fit on the right above is for an actual V image. I've labelled it typical, which in this case does not mean the best I could find! The fit is still inadequate especially at the bottom (South) of the image where the central core and comatic tail don't overlap in this representation. The situation isn't disastrously bad - the components don't really have sharp edges and they extend well past the contour that is shown in the diagrams which is at exp(-0.5) or 0.6065 compared to the peak value for the component but one would like to add another component. Really, however, things are not too bad! The components have been magnified by a factor of 200 compared to the size of the image: the red circles in the synthetic fit on the left, supposedly on the same scale, are 2 pixels in diameter.


Fitting the Parameters

I originally used the non-linear Simplex method (Amoeba in "Numerical Methods") which may be charitably described as thrashing around until a fit is found. For N parameters, this required of order 6N^2 evaluations. Early attempts had horrible problems with false optima and the like. It is now realised that these problems arose from attempting to fit ill-chosen parameters or from fitting a single source to data that was really two bright sources. Careful choice of parameters, a few tricks like replacing x^2 by x^2 - 1/3 so as to have a zero mean over -1..1 and not using stars with close companions allows the use of simpler methods which do not work at all in the general case. If all the parameters were orthogonal over the image area, the optimum could be found in around 3N evaluations; with the current choice of parameters, a reasonable fit can be reached in around 9N to 20N evaluations starting from a good initial estimate - a large time saving for N = 30 over the 6N^2 of the Simplex method.

The current method consists of adjusting each parameter in turn by the standard naval gunner's method of one short, one over and one down the funnel. This takes just 3N+1 evaluations to do every parameter, if the starting guess is good enough. If the parameters were all orthogonal, this would arrive at the solution. In the real world, one iterates.

In detail, fitting proceeds as follows:

1) Sources are selected from the cleaned, Dark/Flat processed image that are below saturation (arbitrarily defined after examining values for obviously saturated sources) and above some arbitrarily chosen value for their brightest pixel. Mean X1 and X2 and amplitude A defined as the sum of all connected pixel values above some threshold are computed.

2) These sources are matched with the Tycho2 catalog. This gives preliminary astrometry and photometry.

3) The brightest 30 or so sources that do not have close bright neighbours according to the Tycho2 catalog are selected.

4) A trial PSF is set up.

5) For each source, a goodness-of-fit parameter is computed, optimizing X1, X2 and A using the current PSF.

6) An overall goodness-of-fit parameter is computed. The individual goodness-of-fit parameters are examined; if the worst source is much worse than the typical sources, that source is eliminated and the process restarted at step 4). This gets rid of, for example, close pairs that the catalog matching process missed.

7) PSF parameters are systematically adjusted on the basis of the overall goodness-of-fit parameters.

8) Steps 5), 6) and 7) are repeated until the change resulting from a complete pass through all the parameters is sufficiently small. Or until the owner of the computer gets tired of waiting.

9) The optimal positions for all the sources, obtained using the optimum PSF, are rematched with the Tycho2 catalog to give final astrometry and improved (one hopes) photometry. The probable errors in RA, DEC and magnitude then provide assurance that the whole process has been successful. Or do not, as the case may be.

Some results

The latest fits have overall goodness-of-fit parameters around 1%. That is, if I have the scalings right, the parameters that I am fitting account for 99% of the variance. This is actually not very good. 1% residual variance implies residuals around 10% which is nowhere near as good as is required for things like image subtraction, I think.

Those with long memories will remember that I was claiming 1 to 2% goodness-of-fit a long time ago ... a lot of bugs have been found since then, including quite a large factor in scaling residuals. These results are a lot better than the old ones. Honest!

Here is a key to the solid mass of figures. Or you may just look at the pretty pictures!

 R1     R1_1   R1_2   R1_11  R1_12  R1_22  e11    e11_1  e11_2  e12    e12_1  e12_2
 R2     R2_1   R2_2   R2_11  R2_12  R2_22  e21    e21_1  e21_2  e22    e22_1  e22_2
 D1     D1_1   D1_2   D2     D2_1   D2_2

R1 is the radius of the core of the source. Well, actually it is the square of the radius, in pixels. R2 is the square of the radius of the halo. Both components are, as discussed above, taken to be elliptical Gaussians: the radius is the radius of the circular Gaussian of the same area.

The suffices _1, _2 etc denote the coefficients of the X1, X2 terms. X1 and X2 have been scaled to the range -1 to +1 so the numerical values of all terms are the peak contribution of that term, in pixels. e11, e12 are the two components of ellipticity of the core; e21, e22 the two components for the halo. D1, D2 are the two components of the displacement between core and halo.

An ideal system might give parameters:

 R1     R1_1   R1_2   R1_11  R1_12  R1_22  e11    e11_1  e11_2  e12    e12_1  e12_2
 R2     R2_1   R2_2   R2_11  R2_12  R2_22  e21    e21_1  e21_2  e22    e22_1  e22_2
 D1     D1_1   D1_2   D2     D2_1   D2_2

 0.6    0      0      0      0      0      0      0      0      0      0      0
 1.2    0      0      0      0      0      0      0      0      0      0      0
 0      0      0      0      0      0  

The best lens so far:

 R1     R1_1   R1_2   R1_11  R1_12  R1_22  e11    e11_1  e11_2  e12    e12_1  e12_2
 R2     R2_1   R2_2   R2_11  R2_12  R2_22  e21    e21_1  e21_2  e22    e22_1  e22_2
 D1     D1_1   D1_2   D2     D2_1   D2_2
CD15: H3R1659.828
 0.621 -0.067  0.101  0.026  0.101  0.085 -0.023 -0.042  0.007 -0.339 -0.194  0.059  
 1.232 -0.161  0.264  0.174  0.309  0.172 -0.026 -0.050 -0.008 -0.082 -0.084  0.037 
 1.080 -0.686  0.132 -0.143 -0.058 -0.841
CD15 V

The corresponding I-band lens.

H4R1659.828
 0.880  0.066  0.209 -0.032  0.089  0.051 -0.020  0.006  0.062 -0.181  0.032  0.108 
 2.176  0.388  0.804 -0.109  0.109  0.040 -0.050  0.030 -0.027  0.047 -0.034  0.030 
-0.788 -0.482 -0.028 -1.068  0.022 -0.767
CD15 I

For the old comatic lenses:

CD10: H3R1591.592
 0.752 -0.019  0.158  0.064  0.067  0.190  0.006 -0.061  0.014 -0.093 -0.008 -0.062  
 3.014 -0.343  0.639  0.829  0.465  1.233 -0.021  0.092  0.076 -0.060  0.002  0.064 
 0.073 -2.495  0.217 -1.193 -0.118 -1.899
CD10 V
H4R1591.592
 0.704  0.029  0.051  0.155 -0.025  0.067 -0.006 -0.015 -0.008 -0.067  0.023 -0.005  
 4.130 -0.076  0.193  0.637  0.024  1.070 -0.086 -0.073 -0.084 -0.034  0.062 -0.106 
 0.471 -2.073 -0.030 -0.344 -0.074 -1.910
CD10 I

Ancient and Historical Fits

The early MK IV images had a lot of coma. The stars in the corners were approximately circular with a bright core sitting on the edge of the circular comatic patch furthest from the centre of the image. The edge of the comatic image was quite well defined and the centre of the comatic patch was slightly less bright than the edge. My first model attempting to capture this behaviour consisted of superposed doughnuts.

Model 1

All right: that's not a very clear diagram. Here is a cross section:

Model 1X

The big difference from the Mk III is that the MK IV PSF varies a lot with position in the image. My first model had just 2 radius dependent parameters; radial and tangential length scales for the coma. Now there are 8 externally adjustable parameters ... and lots of internal fiddles. My 8 parameters are:

Sigma. A width parameter for a gaussian core near the centre of the image.

SigmaFac Width = Sigma*Sqrt(1 + SigmaFac*R^2). This gives a good approximation to the variation of core width across the image. R is the radius, defined below.

The PSF is actually built up by summing "pancakes".

Model 2

Herbert Johnson suggests:

Actually, a closer metaphor would be the "towers of Hanoi" stack of disks. It is also a child's toy, a simple stack of disks of increasing size on a peg.
O.K. I'll go along with that. Thanks, Herbert.

All the disks have the same gaussian blurring parameter. The current model uses 4 disks with fluxes forming a geometric progression. The central "disk" is the gaussian core; it does not have a flat central region as the others do. The positions and sizes of the intermediate disks are tied to those of the outer disk and do not introduce additional parameters. Adding a fifth disk gave only an insignificant further improvement.

RScale, DScale

The radii of the disks are RScale*Sqrt(R^2 + ROff^2)*Internal parameter
where, for convenience (my convenience) I have scaled things so that RScale is the maximum halo radius at R = 1, measured in pixels; R is scaled to be 1 at the middles of the edges or root(2) in the corners and the internal parameters range from zero for the core to 1 for the largest disk.

In previous versions of the model, the radii were taken as RScale*R*Internal parameter
The factor ROff is required for two reasons. Firstly, the original formula is not admissible - any physically admissible form must have d/dR = 0 at R = 0. Secondly, the new form gave a significantly improved fit for small radius in a preliminary fit that gave ROff = 0.1. This value appeared non-critical so it is currently fixed at this value

DScale controls the displacement of the doughnut centre from the position of the core:
Displacement = DScale*R*Internal parameter.
It turns out that DScale is numerically about the same size as RScale. As the coma is towards the centre of the image, DScale is actually negative.

(L1[1],L2[1]), (L1[2],L2[2]). These two points specify a 3-point convolution used to model trailing. In previous models, uniform linear trailing was assumed. This method is more flexible. The third point (L1[3],L2[3]) is obtained by setting
L1[3] = -(L1[1] + L1[2]); L2[3] = -(L2[1] + L2[2])
[2000 Feb 18: cut down to a 2-point convolution with only two adjustable parameters. For the excellent images used, the improvement with the extra point was insignificant. This would definitely not be true for some of the other images.]

Two additional parameters have also been fitted but are currently held constant:

C1, C2. Actual optical centre of the image from which P is measured. Especially in the case of the V-band images, the four corners look somewhat different. Even with the V-band images, where the difference between the centre of the image and the optical centre is quite small, the fit is improved significantly. [These parameters were allowed to vary for the fits of 2000 Feb 18. This gave a significant improvement especially for Lens #2 (see below). This may or may not mean that the camera went together differently each time!]

Results of the fitting

Initial individual fits to bright (but not saturated) sources gave correlation coefficients of over 0.98 [2000 Feb 18: that is now up to 0.99 over the whole image.] between the fitted a_i and the directly measured response. I was astonished since I had a pretty steep learning curve to climb with the Mk III fits starting with correlation coefficients around 0.7. Better yet, the fit parameters didn't vary much between the different bright sources. Wow!! Coma can be modelled!

I have cleaned up my Simplex optimization code so that it can now do a global fit of the 8 PSF parameters at the same time as optimizing X1, X2 and amplitude for over a hundred bright sources. That's optimizing over 300 parameters. Over 400 if you count the implicit background estimates. The time is still around a minute per bright source for the global fit. As I have remarked before - "there are no slow computers; only slow algorithms". I don't intend to let it worry me that Tom can produce 30 images per camera in the time it takes me to analyse one!

Tom's CD7 focussing runs with various spacers

Values of Neff computed for the fits to the V-images on CD7 are as follows

NOTE: 2000 Feb 18. There was a bug in the code that detected convergence; some of these fits are not as well converged as they should have been ...

Lens #1; no extra spacing; 0.010" focussing steps

File        centre R=0.5   R=1 corner
H3R1551.506    8.8  21.3  42.0   66.5
H3R1551.508    6.7  17.2  34.6   54.6
H3R1551.510    7.2  18.2  36.6   58.0
H3R1551.512   12.4  28.0  52.3   80.1
H3R1551.513   15.4  30.4  49.5   67.9
H3R1551.515   24.0  42.9  66.2   87.5
H3R1551.517   53.9  81.3 116.0  146.4
H3R1551.518  153.5 193.4 249.1  292.1
H3R1551.520  177.6 218.3 278.6  327.2

Lens #1; 0.016" extra spacing; 0.010" focussing steps

File        centre R=0.5   R=1 corner
H3R1551.463   13.3  29.2  55.4   86.5
H3R1551.465   10.4  24.1  46.8   73.6
H3R1551.466   13.7  30.1  55.2   83.1
H3R1551.468    8.0  19.8  39.6   62.9
H3R1551.470    8.0  19.8  39.3   62.3
H3R1551.471    9.3  22.2  43.3   68.0
H3R1551.473   90.3 129.4 167.6  185.2
H3R1551.475   83.3 133.5 178.9  197.5
H3R1551.477  135.4 206.3 272.4  300.2

Lens #1; 0.032" extra spacing; 0.010" focussing steps

File        centre R=0.5   R=1 corner
H3R1551.533  117.6 167.9 228.8  271.0
H3R1551.534   31.5  51.1  69.1   77.2
H3R1551.536   17.8  32.3  46.0   52.9
H3R1551.538   14.4  27.3  39.7   46.1
H3R1551.540    8.2  17.2  26.4   32.1
H3R1551.541    8.0  16.8  26.2   32.3
H3R1551.543   21.6  38.1  55.5   68.0
H3R1551.545   51.9  76.8 102.1  116.7
H3R1551.547  134.8 171.9 214.5  235.7

Lens #2; No extra spacing; 0.005" focussing steps

File        centre R=0.5   R=1 corner
H3R1551.707   22.8  40.8  58.8   69.8
H3R1551.709   14.7  28.6  42.6   50.8
H3R1551.711   14.6  28.3  42.3   51.0
H3R1551.713    9.5  20.1  30.6   36.5
H3R1551.715    8.8  18.8  28.7   34.3
H3R1551.717   10.2  21.1  32.3   39.4
H3R1551.719   13.1  25.9  39.4   48.9
H3R1551.721   21.3  38.2  56.4   69.5
H3R1551.723   36.7  59.3  82.8   98.3

More fully converged fits for the best focus cases: 2000 Feb 18

Lens          File        centre R=0.5   R=1 corner
#1 no extra  H3R1551.508     6.4  14.5  28.7  45.2
0.016" extra H3R1551.470     7.7  18.8  37.0  58.1
0.032" extra H3R1551.540     6.4  14.1  28.2  44.9
#2           H3R1551.715     5.5  13.8  29.6  48.6

1) The position of best focus (smallest Neff) is the same all the way from the centre to the corners in every case. This is good news: the field of the lens is flat.

2) As can be seen by comparing the original fits with the latest ones, there is a good deal of uncertainty. In the original fits, Lens #1 without any extra spacer came out sharper than #2 in the middle and rather worse in the corners. In the final fits, this is reversed, and #2 is a little sharper in the middle. If there is any real difference, it is small. The final fits appear to confirm the odd result for Lens #1 that the response is worse with the 0.016" extra spacer than it is either with no extra spacer or with 0.032" extra. I think this is a real effect but I am not sure. It would help if my fitting program provided some estimate of the accuracy of its fit. The residuals are converged to one part in ten thousand. This corresponds to no better than 1% at best for the most significant fitted parameter and probably 10% or worse for most. And even that convergence is overkill compared to the statistical uncertainty even with 40,000 degrees of freedom ...

Trailing

The least squares fitting program uses a 3-point convolution to take out the effect of trailing. Two points are fitted independently and the third is calculated from them so as to keep the centre of gravity of the central core of the source fixed. If all three points were allowed to roam, the fitted source position wouldn't mean much! The fitting program does not care how the points are numbered so, for easier comparison, I have shuffled them into X2 order. The estimated trail parameters for the four sharpest images are

File          L1[1]  L2[1]   L1[2]  L2[2]   L1[3] L2[3]
H3R1551.508  -0.259 -0.717   0.052  0.088   0.207 0.630  
H3R1551.470  -0.279 -0.525  -0.130 -0.140   0.409 0.664  
H3R1551.540  -0.273 -0.561   0.004 -0.033   0.270 0.594
H3R1551.715  -0.199 -0.829   0.248  0.387  -0.049 0.443

The total trail is a bit over one pixel (7.5 arc seconds) which is impressively good and very much better than for earlier images. The trail is also very consistent, which implies that it will get even better when Tom finally has nothing better to do than tweaking up the alignment and drive rates! Some of the other CD6 and CD7 images, not selected here, show larger trail (up to 3 or 4 pixels) so not everything is perfect. Yet.

Spreading and Coma

The other four fitted parameters for the same four sharpest images are

File         Sigma   SigFac  RScale  DScale
H3R1551.508  0.549   1.113    5.9    -6.4 NEff 6.4,14.8,30.2,48.6
H3R1551.470  0.594   0.926    5.9    -6.5 NEff 7.7,18.8,37.0,58.1
H3R1551.540  0.566   0.962    5.3    -5.4 NEff 6.3,13.7,27.6,44.2
H3R1551.715  0.573   0.699    5.3    -5.4 NEff 6.9,14.8,28.1,43.2

11 parameters, with C1, C2
H3R1551.508  0.556   0.976    5.1    -6.4 NEff 6.4,14.5,28.7,45.2
H3R1551.540  0.563   0.955    5.3    -5.4 NEff 6.4,14.1,28.2,44.9
H3R1551.715  0.514   1.222    5.3    -5.4 NEff 5.5,13.8,29.6,48.6

RScale and DScale vary very little and are both numerically about 6 pixels; the comatic images are about 12 pixels in diameter at the edges of the images. The negative values of DScale reflect the fact that the coma points towards the centre of the image. The similar numerical values of RScale and DScale expresses the fact that the bright core is on the edge of the approximately circular comatic image. The broad comatic disk/pancake contains about one third of the total flux.

Remaining Problems 2000 Feb 7

Computation time is still awful; an hour or two per image on a Pentium. Ah well - there are no slow computers, only slow algorithms. My Simplex code got pretty creaky with 8 parameters and now I am fitting hundreds. There are faster methods. Computers are getting faster. Not a long term problem.

The non-linear Simplex method is quite robust (after one removes various bugs such as the one which occasionally came up with a negative sum of squares ...) but it can (and does) still home in on a local minimum which is quite different from the desired optimum. Every "solution" therefore has to be checked - the standard technique is to run the whole thing twice with different starting points but even that is not foolproof, let alone PhD-proof.

Bennett TASS home page

The Bennetts' Home Page