
Work to Do (was: Lots of Data)



We've been down this road before with the Mark III, and Tom is right: there
is a strong tendency for us programmer types to want to do it all ourselves.
That's the way it started with the Mark III, with at least three analysis
programs (Star, IRAF and Sextractor) and two databases (one for Oracle and
one for PostgreSQL).  In the end only one combination resulted in published
data: the Star - PostgreSQL chain that produced the tenxcat.

So will the same thing happen with the Mark IV?  Is this the right way to do
it?  Offhand I don't know.  One thing is obvious to me: only those with
cameras will process any significant amount of raw data, simply because of
the sheer volume involved.  With each camera producing 3 to 4 GB of data per
night there isn't much of an alternative.  Shipping out a night or two is
certainly possible, but keeping up with the demand would be quite
difficult.  Will we all end up using the same tool?  I doubt it.  Arne,
Michael and I have different reasons for choosing the tools we want to use.

Will all the processed data end up in a single database?  This is still
doubtful.  The reason is the same: the tremendous amount of data.  Data
reduction may "compress" the data by about a factor of 10, but that still
gives us about 300 to 400 MB per night per camera.  Even with high speed
internet connections it would take hours to transfer that amount of data.
Chris has already said the current database design couldn't import that
much data in a reasonable amount of time.
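
As a rough check on the numbers above, the sketch below works through the
arithmetic.  The 3 to 4 GB per night and the factor of 10 come from this
message; the link speeds are illustrative assumptions only, not measurements
of anyone's actual connection.

```python
# Rough transfer-time estimate for one night of reduced data from one camera.
RAW_GB_PER_NIGHT = (3.0, 4.0)   # raw volume per camera per night (from the text)
REDUCTION_FACTOR = 10           # approximate "compression" from data reduction

# Assumed sustained link speeds in kilobits per second (illustrative only).
LINKS_KBPS = {"56k modem": 56, "256k DSL": 256, "T1 (1544k)": 1544}

def reduced_mb(raw_gb, factor=REDUCTION_FACTOR):
    """Reduced data volume in MB, taking 1 GB = 1000 MB."""
    return raw_gb * 1000.0 / factor

def transfer_hours(size_mb, link_kbps):
    """Hours to move size_mb megabytes over a link of link_kbps kilobits/s."""
    bits = size_mb * 1e6 * 8
    return bits / (link_kbps * 1e3) / 3600.0

if __name__ == "__main__":
    for raw_gb in RAW_GB_PER_NIGHT:
        size = reduced_mb(raw_gb)
        for name, kbps in LINKS_KBPS.items():
            print(f"{size:.0f} MB over {name}: {transfer_hours(size, kbps):.1f} h")
```

Under these assumptions a 400 MB night takes about three and a half hours at
a sustained 256 kbit/s, and roughly 16 hours over a 56k modem, per camera.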

So where does that leave us?  I for one certainly don't want to lose the
group effort part of TASS.  I certainly don't want us to Balkanize into
single-camera efforts.  This suggests an area to direct new effort: solving
the problem of distributed data.  If the raw or even processed data never
leaves a single place, how do we combine and analyze the data from several
sites simultaneously?  This is the area I think will need the most work.
It's also an area where there isn't a readily available solution.
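
To make the question concrete, here is a minimal sketch of one possible
shape for it: each site keeps its own reduced data and answers only small
queries for a patch of sky, and a coordinator merges the answers.  Everything
here is hypothetical (the Measurement fields, the Site class, query_box and
query_all); it is not a description of any existing TASS software, and the
network layer is replaced by in-memory lists.

```python
from dataclasses import dataclass

@dataclass
class Measurement:
    site: str       # which camera site produced the row (hypothetical field)
    ra_deg: float   # right ascension in degrees
    dec_deg: float  # declination in degrees
    mag: float      # reduced magnitude
    jd: float       # Julian date of the observation

class Site:
    """Stand-in for one camera site that keeps its reduced data locally."""

    def __init__(self, name, measurements):
        self.name = name
        self._rows = list(measurements)

    def query_box(self, ra_min, ra_max, dec_min, dec_max):
        """Return only the rows inside the requested sky box.

        RA wrap-around at 0/360 degrees is ignored in this sketch.
        """
        return [m for m in self._rows
                if ra_min <= m.ra_deg <= ra_max
                and dec_min <= m.dec_deg <= dec_max]

def query_all(sites, ra_min, ra_max, dec_min, dec_max):
    """Ask every site for the same box and merge the answers in time order."""
    merged = []
    for site in sites:
        merged.extend(site.query_box(ra_min, ra_max, dec_min, dec_max))
    return sorted(merged, key=lambda m: m.jd)
```

The point of the design is that only the rows for the requested patch of sky
ever leave a site, so the bulk of each night's data stays where it was taken.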


Thanks,

Mike G.

-----Original Message-----
From: Tom Droege [mailto:tdroege@veriomail.com]
Sent: Friday, October 20, 2000 2:57 PM
To: Creager, Robert S; tass@listserv.wwa.com
Subject: RE: Lots of Data


I get a modest amount of mail like the message from Rob below.  I have lots 
of data and would like to start reducing it.  Most of the programs exist, 
many are on our web page.  We just need a software leader who can point 
some of the Robs on the list at this work.

I know that most of you like to do everything yourself, and don't think 
about leading an effort.  I would have a go at it except that I have to 
spend full time building the hardware.

So to answer Rob's questions,

Yes, there is an immense prior art.  IRAF for example.  There are lots of 
references on the home page.  So it is not necessary, I think, to write 
much code at this time.  Later there will be big data base work to do.  But 
for now I would be happy to just reduce the data to star lists in an agreed 
format.

As I see it, the main job is to write scripts.  Go to this directory and look
for .fts files.  When you find one, look at the header for where it was
looking in the sky.  Match some of the stars in the image to a standard
catalog by processing it through IRAF with the proper commands.  Now tell
IRAF to find all the stars in the image.  Now correct the measured
magnitudes by comparing them to the fraction of stars with well established
magnitudes, then compute errors and such.  Michael Gutzwiller's Star goes
through this process for the Mark III images and he is working on it for
the Mark IV.
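
The first two steps of such a script (find the .fts files, read the header
for the pointing) might look roughly like the sketch below.  It uses only the
Python standard library and a deliberately simple reading of the FITS header
layout (80-character cards in 2880-byte blocks).  The keyword names RA and
DEC are assumptions; the actual keywords written by the cameras may differ.
The IRAF steps are not shown.

```python
import os

CARD = 80      # bytes per FITS header card
BLOCK = 2880   # FITS headers are padded to 2880-byte blocks

def find_fts_files(root):
    """Yield the paths of all .fts files found under root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if name.lower().endswith(".fts"):
                yield os.path.join(dirpath, name)

def read_primary_header(path):
    """Return the primary FITS header as a dict of keyword -> string value."""
    header = {}
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK)
            if len(block) < BLOCK:
                raise ValueError("no END card found in " + path)
            for i in range(0, BLOCK, CARD):
                card = block[i:i + CARD].decode("ascii", "replace")
                key = card[:8].strip()
                if key == "END":
                    return header
                if card[8:10] != "= ":
                    continue  # COMMENT, HISTORY, or blank card
                value = card[10:]
                if value.lstrip().startswith("'"):
                    # Quoted string: keep the text up to the next quote.
                    # (Escaped '' inside strings is not handled here.)
                    value = value.lstrip()[1:].split("'", 1)[0]
                else:
                    value = value.split("/", 1)[0]  # drop trailing comment
                header[key] = value.strip()

def pointing(header, ra_key="RA", dec_key="DEC"):
    """Return the (ra, dec) header strings; the keyword names are assumed."""
    return header.get(ra_key), header.get(dec_key)

if __name__ == "__main__":
    for path in find_fts_files("."):
        print(path, pointing(read_primary_header(path)))
```

A real script would more likely lean on an existing FITS library than on a
hand-rolled parser; the sketch is only meant to show how little is involved
in the bookkeeping part of the job.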

I understand the problem faced by an interested person attempting to attack 
this problem from scratch.  It is like me wanting to get IRAF up on my 
Linux machine.  I just don't know what to do.  I watch Linux programmers 
sitting in front of a machine doing things.  They are just frantically 
typing and huge amounts of error messages are scrolling up the 
screen.  Every so often they pay attention to one.  I have no idea why.

The interesting task is really quite simple.  Collect many measurements of 
a star and see if it is an interesting star, i.e. one that is changing.
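
One simple way to express that test, as a sketch only: flag a star when the
scatter in its measured magnitudes is much larger than the typical
measurement error.  The function name, the factor of 3, and the minimum
number of observations are arbitrary assumptions for illustration; the 0.05
mag default is just a placeholder for whatever the real error turns out to be.

```python
from statistics import stdev

def is_candidate_variable(mags, typical_error=0.05, threshold=3.0, min_obs=5):
    """Return True if the magnitudes scatter far more than the expected error.

    mags          -- list of magnitudes for one star on different nights
    typical_error -- assumed one-sigma measurement error in magnitudes
    threshold     -- how many times the error the scatter must exceed
    min_obs       -- refuse to judge a star with fewer measurements than this
    """
    if len(mags) < min_obs:
        return False
    return stdev(mags) > threshold * typical_error
```

For example, a star measured at 10.0 and 10.5 on alternating nights would be
flagged, while one that wanders by a few hundredths of a magnitude would not.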

OK, here is the problem.  We have lots of sample data now.  We have (I 
think) lots of talent willing to work on it.  The raw data has interesting 
content if we would look.  So how can we organize the talent to look at the 
data?

I know part of the problem.  There are a number on the list that know 
exactly what to do.  They want to do it all by themselves with their 
telescope and their data.  They want to process the data the way they think 
is best.  This intimidates others from trying.  Scratch the surface and you 
will find some really hard opinions about how to do dark fields and flat 
fields and the like.  Color corrections, etc.  When someone tries to 
process data they will find that they have pulled a hornet's nest out of a 
tree.  But these differences are at the 0.02 mag level for our 
data.  Better to reduce the data to 0.05 mag and have data than to never 
process it at all.  OK, we need to be honest and really understand the 
error and publish what it really is.  Then try to make it as good as we can.
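
A minimal sketch of what "understand the error" could mean in practice,
assuming a list of reference stars with both a measured (instrumental)
magnitude and a well established catalog magnitude: take the median offset as
the zero point, then report the RMS of what is left over.  The function names
and the choice of median and RMS are assumptions for illustration, not a
description of what Star or IRAF actually do.

```python
from math import sqrt
from statistics import median

def zero_point(pairs):
    """Median offset (catalog - instrumental) over the reference stars.

    pairs -- list of (instrumental_mag, catalog_mag) tuples
    """
    return median([cat - inst for inst, cat in pairs])

def residual_rms(pairs, zp):
    """RMS of (calibrated - catalog) magnitudes after applying zero point zp."""
    resid = [inst + zp - cat for inst, cat in pairs]
    return sqrt(sum(r * r for r in resid) / len(resid))

def calibrate(instrumental_mag, zp):
    """Apply the zero point to one instrumental magnitude."""
    return instrumental_mag + zp
```

The RMS from residual_rms is then the number to publish alongside the data,
whether it turns out to be 0.02 mag or 0.05 mag.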

OK, the project can work with the solo workers.  That was in my thinking 
when I designed the project.  I did not expect to lead an assortment of 
people to put together a big project.  I just hoped that someone would 
write code that I could use.  This worked for the Mark IIIs, and I hope 
that it will work for the Mark IVs too.

But it would be even better if we could put a group effort together.  It 
would be a lot of fun.  But it will require a leader.  I just don't have 
the time to be that leader now.  But later I will have a go at it if no one 
else goes for it first.  You don't have to know anything about astronomy to 
lead such a project (it does help, and you will quickly learn, as we have 
lots of experts here); you just have to support and encourage and beg and 
plead and make things fun.

I have lots of good data now, and will send some to anyone that wants to 
look at it.

Tom Droege


At 11:17 AM 10/20/00 -0600, you wrote:

>Tom,
>
>Where would I even start looking for information on how to do this?  Is
>there any prior art?  I'm quite a good programmer, but only took Astronomy
>101 in college (10 years ago).  While it might take me a couple of months,
>I'd be happy to start trying now if this task is not currently being worked
>on...
>
>Thanks for any help,
>Rob
>
>Robert S. Creager
>Senior Embedded Software Development Engineer
>Multi Platform Tape Library Development
>Ph:      303-673-2365
>Fax:    303-661-5379
>Pager: 888-912-4458
>StorageTek
>INFORMATION made POWERFUL
>
>
> > -----Original Message-----
> > From: Tom Droege [mailto:tdroege@veriomail.com]
> > Sent: Friday, October 20, 2000 10:54 AM
> > To: tass@listserv.wwa.com
> > Subject: Lots of Data
> >
> >
> > Last night I ran both TOM and MICHAEL almost all night.  The
> > result is 10
> > CD ROMs full of data.  6.4 GBytes.
> >
> > OK, I know this data will probably never be analyzed.  But I
> > am stacking up
> > a lot of data at 0 degrees with TOM.  It is all in FTS
> > format.  All I need
> > is a program that will process it through to star lists and I
> > could start
> > looking for new variables.  I probably have 30 GBytes of
> > pretty good images
> > taken with TOM at 0 degrees.  I think I will just keep going.
> >
> > There would be nothing that would make me happier than having
> > to buy a row
> > of 1GHz machines to process all this nice (getting nicer) data.
> >
> > Tom Droege
> >
> >