This is a discussion on rounding problems within the Pgsql General forums, part of the PostgreSQL category; --> I tried casting them to numeric and it was still wrong OK i just added decimal point after the ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I tried casting them to numeric and it was still wrong OK i just added decimal point after the 9 and 1 it work at that point. Thats an odd result i would not have expected it to do that. This prompts another question how does postgres figure out the data types passed in an SQL string??? Andy Anderson wrote: > I would guess the issue is that 9/10 is an integer calculation, with > result 0. Use instead 9./10 or 9/10. or 9./10. with result 0.9. > > -- Andy > > On May 12, 2008, at 5:09 PM, Justin wrote: > >> As i'm playing around with rounding and the numeric field precision >> ran into a odd set of results i don't understand >> >> here is the sql i wrote the first four inserts are calculations we >> run everyday and they make sense but if division is used the results >> are not right or am i missing something >> >> create table test_num ( >> num1 numeric(20,1), >> num2 numeric(20,2), >> num3 numeric(20,3), >> num4 numeric(20,4), >> num5 numeric(20,5), >> num6 numeric(20,6), >> num7 numeric(20,7), >> num8 numeric(20,8), >> num9 numeric(20,9)); >> >> delete from test_num; >> >> insert into test_num values ( (0.70 *1.05), (0.70 *1.05), (0.70 *1.05), >> (0.70 *1.05), (0.70 *1.05), (0.70 *1.05), >> (0.70 *1.05), (0.70 *1.05), (0.70 *1.05)); >> >> insert into test_num values ( (0.709 *1.05), (0.709 *1.05), (0.709 >> *1.05), >> (0.709 *1.05), (0.709 *1.05), (0.709 *1.05), >> (0.709 *1.05), (0.709 *1.05), (0.709 *1.05)); >> insert into test_num values( (.5/.03), (.5/.3), (.5/3), >> (.5/30), (.5/300), (.5/3000), >> (.5/30000), (.5/30000), (.5/30000)); >> >> >> insert into test_num values( (.5/.03)*.9975, (.5/.3)*.9975, >> (.5/3)*.9975, >> (.5/30)*.9975, (.5/300)*.9975, (.5/3000)*.9975, >> (.5/30000)*.9975, (.5/30000)*.9975, (.5/30000)*.9975); >> insert into test_num values( (9*.1), >> (9*.01), >> (9*.001), >> (9*.0001), >> (9*.00001), >> (9*.000001), >> (9*.0000001), >> (9*.00000001), >> (9*.000000001)); >> >> insert into test_num values ( (9/10), >> (9/100), >> (9/1000), >> (9/10000), >> (9/100000), >> (9/1000000), >> (9/10000000), >> (9/100000000), >> (9/1000000000)); >> insert into test_num values( (1*.1), >> (1*.01), >> (1*.001), >> (1*.0001), >> (1*.00001), >> (1*.000001), >> (1*.0000001), >> (1*.00000001), >> (1*.000000001)); >> insert into test_num values ( (1/10), >> (1/100), >> (1/1000), >> (1/10000), >> (1/100000), >> (1/1000000), >> (1/10000000), >> (1/100000000), >> (1/1000000000)); >> >> select * from test_num ; >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general |
| |||
| Justin wrote: > Craig Ringer wrote: >> `double' in C++ refers to double precision floating point. `double' is >> subject to all the usual fun with rational decimals being irrational >> binary floats (and vice versa). > Not according to MS specific if i'm reading it correctly > > *Microsoft Specific >* > > The double type contains 64 bits: 1 for sign, 11 for the exponent, and > 52 for the mantissa. Its range is +/--1.7E308 with at least 15 digits of > precision I take it you're referring to: http://msdn.microsoft.com/en-us/libr...98(VS.80).aspx ? See how it says "The format is similar to the float format...." ? As you can see from: http://msdn.microsoft.com/en-us/libr...ke(VS.80).aspx the `double' type is a binary floating point representation, just like float. It just has a bigger exponent and a bigger mantissa, so it can represent more extreme values and do so with more precision. Being a binary floating point representation it's subject to all the usual problems with comparison for equality, rounding oddities, etc. Here's one of the many explanations out there on the 'net. I haven't read this particular one, it's just a viable looking Google hit: http://www.cprogramming.com/tutorial...ing_point.html By the way, there was at least a proposal for a numeric/decimal type for C++0x . It doesn't seem to have made the cut. http://209.85.173.104/search?q=cache...ient=firefox-a http://en.wikipedia.org/wiki/C%2B%2B0x http://www.open-std.org/jtc1/sc22/wg21/docs/papers/ It looks like ISO C might adopt a decimal type or library though: http://www2.hursley.ibm.com/decimal/ Note in particular the support in gcc 4.2 or newer. There's also a library: http://www2.hursley.ibm.com/decimal/dfpal/ that might be useful. -- Craig Ringe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general |
| |||
| Justin wrote: > I tried casting them to numeric and it was still wrong How do the results differ from what you expect? You've posted a bunch of code, but haven't explained what you think is wrong with the results. Can you post a couple of SMALL examples and explain how the results are different from what you expect them to be? Try the example using the following formats for the literals in your test: 2.0 '2.0'::numeric (this is a BCD decimal) '2.0'::float4 (this is a C++/IEEE "float") '2.0'::float8 (this is a C++/IEEE "double") and see how the results differ. -- Craig Riniger -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general |
| |||
| I guess i have not been very clear. lets take this select (9/10), (9/10)::numeric, (9::numeric/10::numeric), (9./10), (9*.1) With the given select statement i expected the results all to be same, especially sense it cast 4 of the 5 to numeric either with explicit cast or by containing a decimal. Instead postgresql cast the first 2 calculations to integer, it then uses integer math so the result is 0. To Add further conversion to my small brain there is a specific type cast to the second calculation but it still returned 0. Not what i would have expected. After thinking about it for say 10 seconds, i see that Postgresql is following the order of operation in the 2nd calculation where it does integer math then cast the results to numeric. I made the incorrect assumption Postgresql would have casted all the arguments to numeric then done the math. After thinking this through for a short bit i see why postgresql is casting the arguments to integer type as numeric/floating point math can be a pretty heavy hit performance wise. So this prompts the question how does postgresql decide what types to cast arguments to. It seems thus far if a decimal is found in the argument its numeric and everything else is assumed to be integer if it does not contain a decimal point. Craig Ringer wrote: > Justin wrote: >> I tried casting them to numeric and it was still wrong > > How do the results differ from what you expect? You've posted a bunch > of code, but haven't explained what you think is wrong with the results. > > Can you post a couple of SMALL examples and explain how the results > are different from what you expect them to be? > > Try the example using the following formats for the literals in your > test: > > 2.0 > '2.0'::numeric (this is a BCD decimal) > '2.0'::float4 (this is a C++/IEEE "float") > '2.0'::float8 (this is a C++/IEEE "double") > > and see how the results differ. > > -- > Craig Riniger > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general |
| |||
| Yet another option, of course, is to simply not do any calculations in PostgreSQL, and accept the results from Excel as definitive... which seems to be what is desired, anyway. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general |
| |||
| On May 12, 2008, at 6:37 PM, Justin wrote: > lets take this > select (9/10), (9/10)::numeric, (9::numeric/10::numeric), (9./10), > (9*.1) > > With the given select statement i expected the results all to be > same, > especially sense it cast 4 of the 5 to numeric either with explicit > cast > or by containing a decimal. Instead postgresql cast the first 2 > calculations to integer, it then uses integer math so the result is 0. Putting a decimal on a string of digits is the standard way to specify that it's numeric rather than integer; see 4.1.2.4. Numeric Constants: http://www.postgresql.org/docs/8.3/i...ve/sql-syntax- lexical.html#AEN1276> In other words, 9. is equivalent to 9::numeric, though the latter involves an operation on an integer. If a calculation contains a numeric value, any integers involved will be cast to a numeric value first, and then the calculation will proceed numerically. 9/10 => 0 (a purely integer calculation, division truncates the fractional part) (9/10)::numeric => 0::numeric => 0. (using parentheses forces the integer calculation to occur *before* the cast) 9::numeric/10::numeric => 9./10. => 0.9 (using one or two casts forces a numeric calculation) 9./10 => 9./10. => 0.9 (specifying a numeric value forces the integer to be cast to numeric) > To Add further conversion to my small brain there is a specific type > cast to the second calculation but it still returned 0. Not what i > would have expected. After thinking about it for say 10 seconds, i > see > that Postgresql is following the order of operation in the 2nd > calculation where it does integer math then cast the results to > numeric. > > I made the incorrect assumption Postgresql would have casted all the > arguments to numeric then done the math. Not when you change the order of evaluation by using parentheses. See the precedence table in 4.1.6. Lexical Precedence: http://www.postgresql.org/docs/8.3/i...ve/sql-syntax- lexical.html#SQL-PRECEDENCE > After thinking this through > for a short bit i see why postgresql is casting the arguments to > integer > type as numeric/floating point math can be a pretty heavy hit > performance wise. > > So this prompts the question how does postgresql decide what types to > cast arguments to. It starts with operator precedence to determine the order of operation, and then for each operator it decides how it will cast arguments for the "best" results. -- Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general |
| |||
| On Mon, May 12, 2008 at 06:37:02PM -0400, Justin wrote: > I guess i have not been very clear. > > lets take this > select (9/10), (9/10)::numeric, (9::numeric/10::numeric), (9./10), > (9*.1) > > With the given select statement i expected the results all to be same, > especially sense it cast 4 of the 5 to numeric either with explicit cast > or by containing a decimal. Instead postgresql cast the first 2 > calculations to integer, it then uses integer math so the result is 0. > > To Add further conversion to my small brain there is a specific type > cast to the second calculation but it still returned 0. Not what i > would have expected. After thinking about it for say 10 seconds, i see > that Postgresql is following the order of operation in the 2nd > calculation where it does integer math then cast the results to numeric. PG does very similar things to what C does. '9' is an integer literal, and so is '10', there is a '/' operator that takes two integers and returns an integer and this gets used, resulting in an integer. If you happen to cast the thing to a value of numeric type this will happen after the division (i.e. it follows the syntax, like C does). Casting the integers to values of numeric type is similar, just the numeric version of the division operator gets used. The last example exercises a different code path, in that '9.' is a value of numeric type and '10' is still of integer type. There's some magic somewhere in PG that says that values of numeric type are more expressive than values of integer type causing the parser (I'm guessing here) to insert a cast to numeric type. The types now unify and one value can be divided by the other. The magic seems somewhat arbitrary; what if I wanted to go to the less precise type or generally be told when things didn't unify. > I made the incorrect assumption Postgresql would have casted all the > arguments to numeric then done the math. After thinking this through > for a short bit i see why postgresql is casting the arguments to integer > type as numeric/floating point math can be a pretty heavy hit > performance wise. I don't think it's accurate to say the behaviour is there because of performance reasons, it's just evaluating your code as you've written it. The behaviour you describe is closer to an untyped (i.e. dynamically checked, or as they seem to be popularly known "weakly typed") scripting language. Either that or something like Haskell which treats types much more rigorously than PG, where the expression (9.0 / (10::Int)) would fail to type check, and 9.0/10 (or even 9/10) would do what you wanted and parse 10 as any value that implements the fractional type class (probably a floating point number). The easiest way to understand what's going on is generally playing with a single expression, then changing the literals to represent values of different types and seeing how the result changes. You may get some mileage out of using EXPLAIN VERBOSE (you can see the cast being inserted in the 9./10 case, when compared to 9/10---function OID 1740 takes an int4 and returns a numeric) but it's somewhat difficult to read. Sam -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general |
| |||
| thats what i'm trying to get a grasp on, what postgres is doing with calculation as it truncates or rounds the number when committing the records to the physical table. I just start digging into this as we are having problems where some fields in the database are precision of 2 and other go all the way to 10 decimal places. The table layout we have is not consistent and the result are hundred to thousandths of pennies off but those pennies start become dollars every 100 to 1000 transactions. It seems the pg rounding is favoring the lower side of the number when being committed to the table. I've been going over transactions in WIP and compared to values in the Generial Ledger i'm off 6 cents and thats only on 36 transactions that i have handcheck. GL has a lower value compared to the records in WIP tables which have 4 and 6 decimals precision versues GL 2 decimal precision in the tables I going through the tables and making all the numeric fields all the same. I have run into problems as some of columns are referenced by views and other constraints and its not letting me change them. :'( WE have several columns in table defined with numeric (20,10) thats is just insanity. Unless your doing scientific calculations which we do, do. Having that many decimal points for an accounting package is just nonsense and then its rounded to 4 or 6 in Inventory and Wip tables then 2 when the numbers finally hit the GL tables. Who ever laid these tables out has never had to try and get numbers to balance and agree across tables :-( . Every time i dig a little deeper i keep finding stupid things like this. Some people may think i'm crazy trying to track this down but when you're only consume 0.003186 lbs of a metal per part that cost 22.7868 per lb and the work order calls fro 1148 parts. how the machine rounds becomes a big problem (.00318611*1148) = 3.65765 lbs consumed * 22.7868 = $83.3462 . Sense GL tables are 2 decimal the value is stored as $83.35 But the problem is far worse than that. BOM allows for greater precision of 8 wip Inventory Movements shows only 6, Wip tables has 6 and 4. The question quickly becomes what number is the correct number. Wip truncates the material consumed to .003186*1148 = 3.6575 * 22.7868 = 83.3434 which is rounded = 83.34 Multiply this by 1000 transactions a day and we start having major problems. Sam Mason wrote: > On Mon, May 12, 2008 at 06:37:02PM -0400, Justin wrote: > >> I guess i have not been very clear. >> >> lets take this >> select (9/10), (9/10)::numeric, (9::numeric/10::numeric), (9./10), >> (9*.1) >> >> With the given select statement i expected the results all to be same, >> especially sense it cast 4 of the 5 to numeric either with explicit cast >> or by containing a decimal. Instead postgresql cast the first 2 >> calculations to integer, it then uses integer math so the result is 0. >> >> To Add further conversion to my small brain there is a specific type >> cast to the second calculation but it still returned 0. Not what i >> would have expected. After thinking about it for say 10 seconds, i see >> that Postgresql is following the order of operation in the 2nd >> calculation where it does integer math then cast the results to numeric. >> > > PG does very similar things to what C does. '9' is an integer literal, > and so is '10', there is a '/' operator that takes two integers and > returns an integer and this gets used, resulting in an integer. If you > happen to cast the thing to a value of numeric type this will happen > after the division (i.e. it follows the syntax, like C does). Casting > the integers to values of numeric type is similar, just the numeric > version of the division operator gets used. The last example exercises > a different code path, in that '9.' is a value of numeric type and '10' > is still of integer type. There's some magic somewhere in PG that says > that values of numeric type are more expressive than values of integer > type causing the parser (I'm guessing here) to insert a cast to numeric > type. The types now unify and one value can be divided by the other. > > The magic seems somewhat arbitrary; what if I wanted to go to the less > precise type or generally be told when things didn't unify. > > >> I made the incorrect assumption Postgresql would have casted all the >> arguments to numeric then done the math. After thinking this through >> for a short bit i see why postgresql is casting the arguments to integer >> type as numeric/floating point math can be a pretty heavy hit >> performance wise. >> > > I don't think it's accurate to say the behaviour is there because > of performance reasons, it's just evaluating your code as you've > written it. The behaviour you describe is closer to an untyped (i.e. > dynamically checked, or as they seem to be popularly known "weakly > typed") scripting language. Either that or something like Haskell which > treats types much more rigorously than PG, where the expression (9.0 / > (10::Int)) would fail to type check, and 9.0/10 (or even 9/10) would do > what you wanted and parse 10 as any value that implements the fractional > type class (probably a floating point number). > > The easiest way to understand what's going on is generally playing with > a single expression, then changing the literals to represent values > of different types and seeing how the result changes. You may get > some mileage out of using EXPLAIN VERBOSE (you can see the cast being > inserted in the 9./10 case, when compared to 9/10---function OID 1740 > takes an int4 and returns a numeric) but it's somewhat difficult to > read. > > > Sam > > |
| |||
| Justin wrote: > WE have several columns in table defined with numeric (20,10) thats is > just insanity. Not necessarily. I have a few places where a monetary value is mulitiplied by a ratio quantity. For some of the historical data imported from another system the ratio can be irrational or at least not representable in any small precision value. I ended up needing a precision of 8 numeric digits to acceptably represent these ratios, resulting in a numeric(16,8) type to permit ratio values up to 99999999.99999999 . I probably could've got away with numeric(13,8) or even numeric(12,8) but as space and performance aren't utterly critical it didn't seem to be worth the risk of hitting limits and overflows later. As it is I'm tempted to go to 10 digits of precision, as there's still a 3 cent difference between the totals from the old system and the same data imported into the new system. You'll encounter similar situations in your materials consumption tracking (as you detailed below) and other places. So don't discount the use of high precision numeric values just yet. Personally I'd be tempted to use a `double precision' (float8) for things like materials consumption. Materials consumed in huge quantities will have lower price rates, and materials consumed in tiny quantities will often be priced higher. With wally-numbers: You're not going to care about the 0.0003 kg of steel consumed at a price of $0.00001 , but the same amount of something valuable might have a detectable (if still sub-cent) value. Floating point numbers are IMO better for that than BCD numeric. However, since the float will just get converted to numeric during multiplication with a numeric price-per-mass ratio it may well not be worth worrying about it. There's a use for that numeric(20,10). > Unless your doing scientific calculations which we do, > do. Having that many decimal points for an accounting package is just > nonsense and then its rounded to 4 or 6 in Inventory and Wip tables > then 2 when the numbers finally hit the GL tables. Who ever laid > these tables out has never had to try and get numbers to balance and > agree across tables :-( . Every time i dig a little deeper i keep > finding stupid things like this. It sounds like you might have quite a bit of compounded rounding error from the successive stages of rounding as data moves through the system. Maybe you're rounding too aggressively? I like to store a bit more precision than I have to, unless there's a business rule that requires rounding to a particular precision. For example, if your invoice items are rounded to whole cents you'd probably round the calculated invoice item price when inserting into an invoice item table. Of course, that means that sum(calculation of invoice item price) <> sum(rounded price of invoice items) because of rounding. That's fine; you can't balance the two things exactly because they're actually subtly different things. If you're using an appropriate rounding method for financial data, like round-to-even, you'll only ever get a couple of cents difference and that should be expected and ignored. > Some people may think i'm crazy trying to track this down but when > you're only consume 0.003186 lbs of a metal per part that cost 22.7868 > per lb and the work order calls fro 1148 parts. how the machine rounds > becomes a big problem (.00318611*1148) = 3.65765 lbs consumed * > 22.7868 = $83.3462 . Sense GL tables are 2 decimal the value is stored > as $83.35 Thinking about correct rounding and precision is very important, and far from crazy. > The question quickly becomes what number is the correct number. Sometimes the answer is "both of them" - even though they are different. See the example above with rounded invoice items. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general |
| ||||
| Craig Ringer wrote: > Justin wrote: > > >> WE have several columns in table defined with numeric (20,10) thats is >> just insanity. >> > > Not necessarily. I have a few places where a monetary value is > mulitiplied by a ratio quantity. For some of the historical data > imported from another system the ratio can be irrational or at least not > representable in any small precision value. > > I ended up needing a precision of 8 numeric digits to acceptably > represent these ratios, resulting in a numeric(16,8) type to permit > ratio values up to 99999999.99999999 . I probably could've got away with > numeric(13,8) or even numeric(12,8) but as space and performance aren't > utterly critical it didn't seem to be worth the risk of hitting limits > and overflows later. As it is I'm tempted to go to 10 digits of > precision, as there's still a 3 cent difference between the totals from > the old system and the same data imported into the new system. > That 3 cent difference is over how many transactions ??? The differences i'm seeing are getting into the hundreds of dollars in 1 quarter within this stupid application. The person/persons who laid this database out do not or did not understand the compound rounding errors. I'm just trying to figure out how best to fix it. > You'll encounter similar situations in your materials consumption > tracking (as you detailed below) and other places. So don't discount the > use of high precision numeric values just yet. > > Personally I'd be tempted to use a `double precision' (float8) for > things like materials consumption. Materials consumed in huge quantities > will have lower price rates, and materials consumed in tiny quantities > will often be priced higher. With wally-numbers: You're not going to > care about the 0.0003 kg of steel consumed at a price of $0.00001 , but > the same amount of something valuable might have a detectable (if still > sub-cent) value. Floating point numbers are IMO better for that than BCD > numeric. However, since the float will just get converted to numeric > during multiplication with a numeric price-per-mass ratio it may well > not be worth worrying about it. > > There's a use for that numeric(20,10). > I'm moving all the numeric fields to numeric(20,8) . I feel its pretty safe with that scale setting. I agree data storage and performance aren't critical concerns as they once were > > >> Unless your doing scientific calculations which we do, >> do. Having that many decimal points for an accounting package is just >> nonsense and then its rounded to 4 or 6 in Inventory and Wip tables >> then 2 when the numbers finally hit the GL tables. Who ever laid >> these tables out has never had to try and get numbers to balance and >> agree across tables :-( . Every time i dig a little deeper i keep >> finding stupid things like this. >> > > It sounds like you might have quite a bit of compounded rounding error > from the successive stages of rounding as data moves through the system. > Maybe you're rounding too aggressively? > Thats the problem the database layout is crap. > I like to store a bit more precision than I have to, unless there's a > business rule that requires rounding to a particular precision. For > example, if your invoice items are rounded to whole cents you'd probably > round the calculated invoice item price when inserting into an invoice > item table. > > Of course, that means that > sum(calculation of invoice item price) > <> > sum(rounded price of invoice items) > > because of rounding. That's fine; you can't balance the two things > exactly because they're actually subtly different things. If you're > using an appropriate rounding method for financial data, like > round-to-even, you'll only ever get a couple of cents difference and > that should be expected and ignored. > I normally would but given all the tables are showing different values when summed over a Accounting period its adding up to significant differences between all the tables. > >> Some people may think i'm crazy trying to track this down but when >> you're only consume 0.003186 lbs of a metal per part that cost 22.7868 >> per lb and the work order calls fro 1148 parts. how the machine rounds >> becomes a big problem (.00318611*1148) = 3.65765 lbs consumed * >> 22.7868 = $83.3462 . Sense GL tables are 2 decimal the value is stored >> as $83.35 >> > > Thinking about correct rounding and precision is very important, and far > from crazy. > > >> The question quickly becomes what number is the correct number. >> > > Sometimes the answer is "both of them" - even though they are different. > See the example above with rounded invoice items. > > -- > Craig Ringer > |