TN 0011: Object Detection in TASS

Author: Mike Gutzwiller
Date: 960412
Revision: #1 960412
Key Words: CCD, techniques, PSF, computation

This technical note describes the methodology used to generate object candidates by the preliminary software supplied to Tom.

Contents include:

Software Basis

Source File

Recognition Algorithm

Output File Format

Software Basis

The software is derived from a version of the DeepSky image processing software. DeepSky is an MS Windows based system developed for amateur astronomers' image processing needs.

Source File

Each source file consists of several thousand lines of data generated during a single night. At 0.92 seconds per line, a full 12 hour run would produce over 45,000 lines of data, each containing 823 words of information. A full 12 hour file would be over 75 MB per singlet, or over 225 MB per triplet, per night. Not something easily sent over a modem!
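The file size figures can be checked with a few lines of arithmetic. The sketch below assumes that one word is 2 bytes (16 bits); the note does not state the word size, but this value is consistent with the quoted totals. All variable names are illustrative.

```python
# Rough check of the nightly data volume.
# Assumption (not stated in the note): one word = 2 bytes (16 bits).
SECONDS_PER_LINE = 0.92
WORDS_PER_LINE = 823
BYTES_PER_WORD = 2  # assumed

run_seconds = 12 * 3600
lines_per_run = int(run_seconds / SECONDS_PER_LINE)
bytes_per_singlet = lines_per_run * WORDS_PER_LINE * BYTES_PER_WORD
bytes_per_triplet = 3 * bytes_per_singlet

print("lines per 12 hour run:", lines_per_run)
print("megabytes per singlet:", bytes_per_singlet / 1e6)
print("megabytes per triplet:", bytes_per_triplet / 1e6)
```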

Each data line in the file has the following format:

Word Range    Contents

0 - 9         Date and Time Stamp
10 - 19       10 Empty Registers
20 - 23       4 Shielded Pixels
24 - 791      768 Image Pixels
792 - 802     11 Shielded Pixels
803 - 804     2 Empty Registers
805 - 816     6 Spare Floats
817 - 822     6 Spare Integers
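As an illustration of the layout in the table above, the following sketch splits one data line into named fields. It assumes the line has already been read into a sequence of 823 words; the field names and the split_line helper are hypothetical and are not part of the TASS software. The spare floats are left as raw words here.

```python
# Word ranges from the table, as (first_word, last_word), both inclusive.
# The field names are made up for this sketch.
FIELDS = {
    "timestamp": (0, 9),
    "empty_a": (10, 19),
    "shielded_a": (20, 23),
    "image": (24, 791),
    "shielded_b": (792, 802),
    "empty_b": (803, 804),
    "spare_floats": (805, 816),
    "spare_ints": (817, 822),
}
WORDS_PER_LINE = 823


def split_line(words):
    """Split one 823-word data line into a dict of named word lists."""
    if len(words) != WORDS_PER_LINE:
        raise ValueError("expected 823 words, got %d" % len(words))
    return {name: list(words[lo:hi + 1]) for name, (lo, hi) in FIELDS.items()}


# Usage: a dummy line whose word values equal their own positions.
fields = split_line(list(range(WORDS_PER_LINE)))
print(len(fields["image"]), "image pixels")
```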

Recognition Algorithm

The recognition algorithm proceeds in a series of four steps:

1. Preliminary Image Processing

Since TASS files are too long to read into memory, the file is segmented into more manageable pieces. These segments are overlapped to avoid losing any candidates in the gaps. If a candidate is right on the edge of a segment, this sometimes results in two entries in the candidate list. Each segment is then dark subtracted and flat fielded.

2. Image Characterization

Each segment is then characterized for its noise level to track local conditions. A PSF is then generated to model a typical star in that segment. If no valid stars were found to generate the PSF, the segment is skipped.

3. Candidate Detection

Candidates are then detected using the PSF from the previous step. In general a candidate must be N sigma over the background and look like the PSF to be added to the output file. N is chosen by the user but defaults to 3.

4. Candidate Calculations

Each candidate which passes the criteria then has its position and magnitude calculated and is added to the output file.

File Segmentation

To reduce the size of the data being processed by the recognition algorithm, the source file is read in segments. Each segment overlaps the previous segment by the height of the PSF to ensure that all candidates are processed. The segment size is adjustable by the user to optimize performance based on the RAM, RAM cache, and processor cache sizes and the I/O performance of the analysis computer.
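The overlapping scheme can be sketched as a generator of line ranges. This is a minimal illustration, not the actual implementation: the function name and parameters are hypothetical, and the overlap is passed in directly rather than derived from a PSF.

```python
def segment_ranges(total_lines, segment_lines, overlap_lines):
    """Yield (start, stop) line ranges, with stop exclusive.

    Each range begins overlap_lines before the previous range ended.
    """
    if not 0 <= overlap_lines < segment_lines:
        raise ValueError("overlap must be smaller than the segment size")
    start = 0
    while start < total_lines:
        stop = min(start + segment_lines, total_lines)
        yield start, stop
        if stop == total_lines:
            break
        start = stop - overlap_lines


ranges = list(segment_ranges(total_lines=1000, segment_lines=400, overlap_lines=10))
print(ranges)  # [(0, 400), (390, 790), (780, 1000)]
```

A candidate lying inside an overlap region is seen in two segments, which is why duplicate entries can appear in the candidate list.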

Dark Value

For each segment a dark value is calculated by finding the median value of the second shielded pixel. This value is then subtracted from all the image pixels. If this would result in integer overflow for any pixel then the results for all pixels are compressed by a factor of 2 to avoid the overflow.
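A minimal sketch of the dark subtraction follows. It assumes that the "second shielded pixel" is word 21 (the second of the four shielded pixels at words 20 - 23) and that a segment is held as a list of 823-word lines; both are assumptions made for illustration. The overflow compression step is not shown.

```python
from statistics import median

SECOND_SHIELDED_WORD = 21        # assumed: second of words 20 - 23
IMAGE_FIRST, IMAGE_LAST = 24, 791


def dark_subtract(segment):
    """Return (dark, image) for a segment given as a list of 823-word lines."""
    dark = median(line[SECOND_SHIELDED_WORD] for line in segment)
    image = [[line[i] - dark for i in range(IMAGE_FIRST, IMAGE_LAST + 1)]
             for line in segment]
    return dark, image


# Usage: three dummy lines with shielded values 100, 102, 101 and flat pixels.
segment = []
for shielded in (100, 102, 101):
    line = [0] * 823
    line[SECOND_SHIELDED_WORD] = shielded
    line[IMAGE_FIRST:IMAGE_LAST + 1] = [150] * 768
    segment.append(line)

dark, image = dark_subtract(segment)
print(dark, image[0][0])
```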

Flat Vector

A flat vector for each segment is generated by first determining the 1% and 99% pixel value levels for the entire segment. A flat value for each column is then calculated by taking the median of all pixel values within the 1% and 99% limits in the column. Each row in the segment is then divided by the flat vector and multiplied by the average flat value resulting in a flattened image.
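The flat fielding described above might look like the following sketch. The nearest-rank percentile helper, the fallback for a column whose pixels are all clipped, and the function names are choices made for this example and are not taken from the TASS software. Flat values are assumed to be nonzero.

```python
from statistics import mean, median


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = round(pct / 100.0 * (len(ordered) - 1))
    return ordered[rank]


def flatten(image):
    """Flat-field an image given as a list of equal-length rows."""
    all_pixels = [p for row in image for p in row]
    lo, hi = percentile(all_pixels, 1), percentile(all_pixels, 99)
    flat = []
    for col in range(len(image[0])):
        clipped = [row[col] for row in image if lo <= row[col] <= hi]
        # Fall back to the whole column if clipping removed every pixel
        # (a choice made for this sketch).
        flat.append(median(clipped or [row[col] for row in image]))
    avg = mean(flat)
    return [[p / f * avg for p, f in zip(row, flat)] for row in image]


# Usage: the second column is twice as sensitive as the first.
flat_image = flatten([[100, 200], [110, 220], [90, 180]])
print(flat_image)
```

After flattening, both columns of each row carry the same value, so the column-to-column sensitivity difference has been removed while the overall signal level is preserved.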

Noise Characterization

The base noise sigma is estimated in two steps. First, a sigma is calculated for each row as the difference between the median (50% level) pixel value and the pixel value at the 15.87% level; for a Gaussian distribution this difference represents one sigma. The base sigma is then taken as the median of the row sigmas over all rows in the segment.
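The noise estimate can be sketched as below. The nearest-rank percentile helper and the function names are assumptions made for this example; the real software may interpolate between levels differently.

```python
from statistics import median


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = round(pct / 100.0 * (len(ordered) - 1))
    return ordered[rank]


def base_sigma(image):
    """Median over all rows of (50% level - 15.87% level)."""
    row_sigmas = [percentile(row, 50.0) - percentile(row, 15.87)
                  for row in image]
    return median(row_sigmas)


# Usage: a tiny image with two identical rows.
print(base_sigma([[10, 11, 12, 13, 14, 15, 16], [10, 11, 12, 13, 14, 15, 16]]))
```

Using the lower side of the distribution keeps stars, which only add signal above the background, from inflating the estimate.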

PSF Generation

A Point Spread Function is calculated in two steps.

First, local maxima greater than 30x the base sigma are examined. The FWHM in x and y for each peak is stored in a temporary array. Up to 100 candidate peaks are examined in this way. A target FWHM in x and y is then set to the median of all the FWHM values. The width and height of the PSF are set to approximately 2.548*FWHM in x and y respectively. For a Gaussian distribution this corresponds to the 3 sigma level on either side of the peak, a full width of 6 sigma.
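The factor 2.548 follows from the Gaussian relation FWHM = 2*sqrt(2*ln 2)*sigma. The short sketch below reproduces it; rounding the box up to whole pixels is an assumption of this example, and psf_box is a hypothetical helper.

```python
import math

# For a Gaussian, FWHM = 2*sqrt(2*ln 2)*sigma, about 2.3548*sigma.
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))

# A box spanning -3 sigma to +3 sigma is 6 sigma wide.
BOX_PER_FWHM = 6.0 / FWHM_PER_SIGMA


def psf_box(fwhm_x, fwhm_y):
    """Return (width, height) in pixels, rounded up (assumed rounding)."""
    return (math.ceil(BOX_PER_FWHM * fwhm_x),
            math.ceil(BOX_PER_FWHM * fwhm_y))


print(round(BOX_PER_FWHM, 3))
print(psf_box(2.0, 3.0))
```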

Next, the segment is examined again for local peaks greater than 30x the base sigma, but candidates are rejected if their calculated FWHM is greater than 2 times or less than 1/2 times the target FWHM from the first step. The PSF is then set to the average of the profiles of the remaining candidates and normalized to sum to 1.
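The second pass can be sketched as follows. This example assumes that each star has already been cut out into an equally sized 2-D array ("stamp") with a measured FWHM, and that the raw stamps are averaged before normalization; the note does not say whether stamps are scaled individually first. The function and argument names are hypothetical.

```python
def build_psf(stamps, fwhms, target_fwhm):
    """Average accepted star cut-outs and normalize the result to sum to 1.

    stamps: list of 2-D lists of pixel values, all the same shape.
    fwhms: list of (fwhm_x, fwhm_y) pairs, one per stamp.
    target_fwhm: (fwhm_x, fwhm_y) from the first pass.
    Returns None when no stamp passes, so the caller can skip the segment.
    """
    def in_range(value, target):
        return 0.5 * target <= value <= 2.0 * target

    kept = [s for s, (fx, fy) in zip(stamps, fwhms)
            if in_range(fx, target_fwhm[0]) and in_range(fy, target_fwhm[1])]
    if not kept:
        return None
    rows, cols = len(kept[0]), len(kept[0][0])
    avg = [[sum(s[r][c] for s in kept) / len(kept) for c in range(cols)]
           for r in range(rows)]
    total = sum(sum(row) for row in avg)
    return [[v / total for v in row] for row in avg]


# Usage: two star-like stamps and one blob that is far too wide.
stamps = [
    [[0, 1, 0], [1, 4, 1], [0, 1, 0]],
    [[0, 2, 0], [2, 8, 2], [0, 2, 0]],
    [[9, 9, 9], [9, 9, 9], [9, 9, 9]],
]
fwhms = [(2.0, 2.0), (2.2, 2.1), (9.0, 9.0)]
psf = build_psf(stamps, fwhms, target_fwhm=(2.0, 2.0))
print(psf[1][1])
```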

Candidate Detection

Candidates must pass the following criteria:

A > N sigma * sqrt(PSF width * PSF Height)
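The threshold test can be written as a one-line check. The note does not define A, so the sketch below simply takes it as a measured signal value supplied by the caller, and it reads "N sigma" as N times the base noise sigma; both readings are assumptions, and the function name is hypothetical.

```python
import math


def passes_threshold(a, n, sigma, psf_width, psf_height):
    """True when a exceeds n * sigma * sqrt(psf_width * psf_height)."""
    return a > n * sigma * math.sqrt(psf_width * psf_height)


# Usage with the default N of 3, a base sigma of 5, and a 6 x 6 PSF.
print(passes_threshold(200.0, 3, 5.0, 6, 6))
print(passes_threshold(80.0, 3, 5.0, 6, 6))
```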

Candidate Position Calculation

The x and y position of each candidate is calculated using a weighted sum of all the pixels under the PSF. For most cases the centers should be accurate to about 1/3 pixel in x and 1/2 pixel in y.
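One common form of a weighted-sum position is the intensity-weighted centroid, sketched below. The note does not say which weights are used, so weighting each pixel by its background-subtracted value is an assumption of this example, as are the function name and arguments.

```python
def centroid(window, x0=0, y0=0):
    """Intensity-weighted center of a background-subtracted window.

    window: list of rows (y) of pixel values (x).
    x0, y0: image coordinates of window[0][0].
    """
    total = sum(sum(row) for row in window)
    if total <= 0:
        raise ValueError("window has no positive signal")
    x = sum(v * c for row in window for c, v in enumerate(row)) / total
    y = sum(v * r for r, row in enumerate(window) for v in row) / total
    return x0 + x, y0 + y


# Usage: a symmetric star centered in a 3 x 3 window at image offset (10, 100).
print(centroid([[0, 1, 0], [1, 4, 1], [0, 1, 0]], x0=10, y0=100))
```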

The RA and Dec positions are calculated based on a least squares fit of 4 or more stars near the beginning of the file.

Candidate Magnitude Calculation

The magnitude of each candidate is determined by first summing the values of all the pixels over the background within the bounds of the PSF. The magnitude is then calculated by the following formula:

M = 21.0 - 2.5*log10(sum)

The base magnitude, 21.0, was chosen to provide the best match to published visual magnitudes when using the V filter. Even with the V filter, though, the match between TASS magnitudes and visual magnitudes has a sigma of 0.5 magnitude.

The error in magnitude is calculated as follows:

E = 2.5*log10((sum+sumErr)/sum)

where:

sumErr = sqrt(N*sigma^2 + sum/scaleFactor)

where:

sigma is the base noise level and
scaleFactor is the number of electrons/ADU
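The magnitude and error formulas above translate directly into code. In this sketch, N in the sumErr formula is read as the number of pixels summed under the PSF, which the note does not state explicitly and which is distinct from the detection threshold N. The function names and the sample values are illustrative only.

```python
import math

BASE_MAGNITUDE = 21.0


def magnitude(total):
    """M = 21.0 - 2.5*log10(sum)."""
    return BASE_MAGNITUDE - 2.5 * math.log10(total)


def magnitude_error(total, n_pixels, sigma, scale_factor):
    """E = 2.5*log10((sum + sumErr)/sum).

    n_pixels is assumed to be the number of pixels summed.
    scale_factor is the number of electrons per ADU.
    """
    sum_err = math.sqrt(n_pixels * sigma ** 2 + total / scale_factor)
    return 2.5 * math.log10((total + sum_err) / total)


# Usage: a sum of 10000 over a 6 x 6 PSF with sigma = 5 and 10 electrons/ADU.
print(magnitude(10000.0))  # 11.0
print(magnitude_error(10000.0, n_pixels=36, sigma=5.0, scale_factor=10.0))
```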

The calculated error does not take into account any error due to pixel saturation and should be considered unreliable for objects brighter than magnitude 8 (that is, numerically below 8).

Output File Format

A file header is written containing the following:

Each segment is preceded by a segment header with the following data:

Each segment header is followed by zero or more candidates in the following format:

where xxx.xx is the x center of the object, yyyyy.yy is the y center of the object, mm.mmm is the relative magnitude of the object, and e.eee is the calculated 1 sigma error of the magnitude (statistical only). The y center is calculated from the first line read in.