Re: TASS Database, split catalog and numeric roundoff



Michael Richmond, over a week ago, discussed Chris' query regarding numeric
roundoff.  While this is old news, I wanted to make a couple of comments.

>      - when we keep the square of the value, we must require at most
>           28 bits per value
>       - suppose we add together N such values.  Then the number of bits
>           required to keep N 28-bit numbers is (at most) 
>                   log2(N) + 28 bits
>      
>       - thus, given 32-bit floats, it looks like we can add together
>           2^4 = 16 values, each a 28-bit number, before we overflow

  Not quite true.  A 32-bit float has an 8-bit exponent and a 24-bit
mantissa, so you only have 24 bits of precision.  A single 28-bit square
is already rounded when it is stored, and you will get roundoff almost
immediately in sums.  The problem is that to calculate deviations you
have to subtract terms of nearly equal size, and that is where roundoff
is important.  The means will be OK, since we don't really have 1 mmag
precision in the magnitudes.  To perform the calculation correctly will
take either a 64-bit integer or a 64-bit float for the sum-of-squares.
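
As a rough illustration of that roundoff, here is a minimal Python sketch
(not TASS code).  The helper names to_f32 and variance_from_sums and the
sample magnitudes are made up for the example, and single precision is
emulated by round-tripping each result through the struct module.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def variance_from_sums(values, rnd=lambda x: x):
    """Sample variance from a running sum and sum-of-squares.

    rnd is applied after every accumulator update; the default keeps
    full double precision, while rnd=to_f32 mimics 32-bit float fields.
    """
    s = 0.0
    ss = 0.0
    for v in values:
        s = rnd(s + v)
        ss = rnd(ss + rnd(float(v) * v))
    n = len(values)
    # Subtracting two nearly equal large numbers: roundoff shows up here.
    return (ss - s * s / n) / (n - 1)

# Hypothetical magnitudes in mmag (about 12 mag, scatter of a few mmag).
mags = [12000, 12010, 11990, 12005, 11995]

var_f64 = variance_from_sums(mags)           # 64-bit float accumulators
var_f32 = variance_from_sums(mags, to_f32)   # 32-bit float accumulators
print("64-bit accumulators:", var_f64)
print("32-bit accumulators:", var_f32)
```

Even with only five values the two results disagree, because each square
is near 1.44e8 and a 32-bit float can no longer hold it to the last unit.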
  For coordinates, our 1 mas resolution means you need 31 bits to
represent all values.  This means you need 62 bits for the squares, and
more for any sum-of-squares.  64-bit floats are better in this case: the
exponent handles the numeric overflows that would occur with 64-bit
integers in sum-of-squares fields, and the 53 bits of precision (for an
IEEE double) is more than the 31 bits of a single coordinate, so roundoff
is less important (plus, we don't have 1 mas precision in coordinates
from TASS anyway).
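
The bit counts for coordinates can be checked with a few lines of Python.
This is only a sketch: it assumes coordinates run from 0 to 360 degrees
in whole milliarcseconds, and the names FULL_CIRCLE_MAS and INT64_MAX are
mine, not anything from the TASS database.

```python
import sys

# Assumed coordinate range: a full circle expressed in milliarcseconds.
FULL_CIRCLE_MAS = 360 * 3600 * 1000
INT64_MAX = 2**63 - 1

# Bits needed for the largest coordinate and for its square.
coord_bits = FULL_CIRCLE_MAS.bit_length()
square_bits = (FULL_CIRCLE_MAS ** 2).bit_length()

# How many worst-case squares fit in a signed 64-bit integer sum.
max_terms = INT64_MAX // FULL_CIRCLE_MAS ** 2

# Significand width of the platform's double-precision float.
double_bits = sys.float_info.mant_dig

print("bits per coordinate:        ", coord_bits)
print("bits per squared coordinate:", square_bits)
print("worst-case squares in int64:", max_terms)
print("double significand bits:    ", double_bits)
```

Under these assumptions a coordinate needs 31 bits, the largest square
needs 61 bits (within the 62-bit bound for two 31-bit factors), and a
signed 64-bit integer sum-of-squares can overflow after only 5 worst-case
values.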
  Calculations when the numbers are close to the precision of the numeric
field always require care.
Arne