
RE: A Data Reduction Proposal




--- Mark Pitts <PITTS@shands.ufl.edu> wrote:
> Chris et al,
> 
> I think database selection is going to take some experimentation
> (assuming we use a database, which I believe is a good idea if it's
> feasible).  Database performance can be finnicky, but all my
> experience is with large commercial databases like DB2 and Oracle. 
> I've never used the open-source databases like Postgresql, so I can't
> speak (yet) as to how they might be optimized.

PostgreSQL is much like Oracle, though it is a bit easier to administer.
For the most part your experience will apply.  We do need to
experiment, as sometimes reworking the SQL can make a big
difference.  We do have some experience running a TASS
database.  The Mark III data is in one.

One "gotcha" I found is that you should _not_ experiment
with small numbers of records.  There is a big "knee" in the
performance graph when the data gets too big to cache in RAM.
I found I needed a random data generator to populate tables
with a few million rows.
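
A minimal sketch of such a random data generator, in Python.  The
column layout (star_id, ra_deg, dec_deg, mag), the magnitude range
and the function names are assumptions made up for illustration; they
are not the real TASS schema.  The idea is only to produce enough
rows that the table no longer fits in RAM.

```python
import csv
import io
import math
import random


def generate_rows(n, seed=0):
    """Yield n synthetic (star_id, ra_deg, dec_deg, mag) tuples.

    The column layout is hypothetical, not the real TASS schema.
    """
    rng = random.Random(seed)  # seeded so a benchmark run can be repeated
    for star_id in range(1, n + 1):
        ra_deg = rng.random() * 360.0
        # asin of a uniform value spreads the points evenly over the sky
        dec_deg = math.degrees(math.asin(rng.uniform(-1.0, 1.0)))
        mag = rng.uniform(7.0, 13.0)  # arbitrary magnitude range
        yield (star_id, round(ra_deg, 6), round(dec_deg, 6), round(mag, 3))


def write_csv(out, n, seed=0):
    """Write n synthetic rows to an open text file; return the row count."""
    writer = csv.writer(out)
    count = 0
    for row in generate_rows(n, seed):
        writer.writerow(row)
        count += 1
    return count
```

Calling write_csv() on a file opened with newline="" and n set to a
few million gives a CSV file that could then be bulk loaded, for
example with PostgreSQL's COPY command.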

I think we should just bite the bullet and go with PostgreSQL.
It runs on every platform and OS under the sun, is free, and
is fully featured.  Doing this allows us to use PostgreSQL-specific
extensions such as user-defined data types and
functions.  For example, I would like to define an RA and DEC
pair as a type.
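
To make the idea concrete, here is a rough Python model of what an
RA/DEC pair type might hold and check.  In PostgreSQL itself the type
would be declared with CREATE TYPE; the class below is only a sketch,
and the name SkyPosition, the choice of degrees as the unit and the
separation method are all assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SkyPosition:
    """Hypothetical RA/Dec pair, both stored in degrees."""
    ra_deg: float
    dec_deg: float

    def __post_init__(self):
        # Reject values outside the usual coordinate ranges.
        if not 0.0 <= self.ra_deg < 360.0:
            raise ValueError("RA must be in [0, 360) degrees")
        if not -90.0 <= self.dec_deg <= 90.0:
            raise ValueError("Dec must be in [-90, 90] degrees")

    def separation_deg(self, other):
        """Angular separation in degrees, using the haversine formula."""
        ra1, dec1 = math.radians(self.ra_deg), math.radians(self.dec_deg)
        ra2, dec2 = math.radians(other.ra_deg), math.radians(other.dec_deg)
        h = (math.sin((dec2 - dec1) / 2) ** 2
             + math.cos(dec1) * math.cos(dec2)
             * math.sin((ra2 - ra1) / 2) ** 2)
        # Clamp against rounding error before taking the arcsine.
        return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))
```

Keeping the pair together as one value means range checks and
operations such as angular separation live with the type instead of
being repeated in every query.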

I've benchmarked Oracle, PostgreSQL and MySQL.  MySQL runs
circles around the others by __large amounts__; however, it
dies when you hit it with many clients.  It uses a very
unsophisticated lock where only one client is allowed access
at a time.  So in single-user environments it wins.
PostgreSQL currently has the best locking mechanism, but Oracle
has a slight performance edge and its timing is "even".  Postgres has
some "unevenness" where slight changes to the SQL which should
do nothing really do affect the timing.  PostgreSQL's best
features are 1) price, 2) user extensibility.  We can add functions
and data types at run time as "normal users".  It can also
handle a billion (1E9) rows or more without reliability
problems.  It has _about_ the feature set of DB2/Oracle 8
with triggers, PL, esql compiler and so on.

We also don't need to port a database around.  We will likely
only run a few copies on a few machines.

What we need is something like a UML design.  We have time
to do this "right".  

A few of us on this list already have projects on SourceForge.
It looks like the place to go.  But I'd keep a backup up to date
"just in case".

=====
Chris Albertson 
  Home:   310-376-1029  chrisalbertson90278@yahoo.com
  Cell:   310-990-7550
  Office: 310-336-5189  Christopher.J.Albertson@aero.org
