[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: New Data Handling Paradigms?



Peter Mount wrote:
>

> Looking at these figures, do you think I should look at stress testing
> PostgreSQL to make sure it can handle this amount of data?

Peter,

It would be nice to know what to expect.  The test should
be realistic.  Here is an idea of what (IMO) would make for a
realistic test.

Make two tables.
1) Observations: 200 million rows about 100 bytes per row.
2) Catalog:       20 million rows about 100 bytes per row
Each table has a primary integer key "TASS_ID"
Build index as required

I'd be interested to know the number of transactions/minute
you can do.  A transaction is
1) Append two or more rows to "Observations"
2) update on row of "Catalog"

With a _good_ off-line (non DBMS based) pre-matching program we
can do all appends to "Observations" for a given TASS_ID then only
one update to "Catalog".  I have found COPY to be much faster
then INSERT.

So the _basic_ algorithm I see using using is this:
1) Performed at each camera site.
   a) Match all observations for a given period (one or more nights)
      to a local copy to the TASS_Catalog.  Assign a TASS_ID to
      each observation.
   b) Sort all observations (I and V together) by TASS_ID.
   c) Throw away any star that was not seen at least twice.
   d) Build the list of UPDATE commands to be applied to the
      central database's copy of "tass_catalog".
   e) Transmit to Michael's site.
2) Performed at central database site.
   a) START TRANSACTION
   b) Copy to Observations table two or more rows
   c) update tass_catalog
   d) COMMIT TRANSACTION

There are more special cases not covered by the above but to purposes of
performance testing  just doing steps 2a --> 2d above with random data
on a large database is enough.

If you can script the test then others can try it out on their hardware.
I'd be very interested to see how well this works on a range of hardware.


--
  Chris Albertson

  calbertson@logicon.com                  Voice: 626-351-0089  X127
  Logicon, Pasadena California            Fax:   626-351-0699