This is a discussion on Why does numeric data type use storage inefficiently? within the SQL Server forums, part of the Microsoft SQL Server category; --> I have found several sources stating that the numeric data type in SQL Server (without vardecimal options set) stores ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I have found several sources stating that the numeric data type in SQL Server (without vardecimal options set) stores numbers of precision 10-19 in 9 bytes. Since 19 * ln(10) / ln(2) ~= 63.11 => 65 bits with sign => 9 bytes, I understand 9 bytes at this end of range. However, 10 * ln(10) / ln(2) ~= 33.22 => 35 bits with sign => 5 bytes. Why does SQL Server use the 'extra' 4 bytes? Is there an alignment requirement? If this is the reason, why not break storage requirements at the base-10 precisions that require 4 bytes and 8 bytes, to be synchronized with 16-, 32-, 64-bit registers? Just curious... Thanks in advance, John |
| |||
| On Wed, 6 Feb 2008 09:10:31 -0800 (PST), residual.interest@gmail.com wrote: >I have found several sources stating that the numeric data type in SQL >Server (without vardecimal options set) stores numbers of precision >10-19 in 9 bytes. Since 19 * ln(10) / ln(2) ~= 63.11 => 65 bits with >sign => 9 bytes, I understand 9 bytes at this end of range. However, >10 * ln(10) / ln(2) ~= 33.22 => 35 bits with sign => 5 bytes. Why >does SQL Server use the 'extra' 4 bytes? Is there an alignment >requirement? If this is the reason, why not break storage >requirements at the base-10 precisions that require 4 bytes and 8 >bytes, to be synchronized with 16-, 32-, 64-bit registers? Hi John, Your calculations are not really relevant to the issue. Numeric data is not stored in base-2 stoarage format, but in a variation of BCD (see http://en.wikipedia.org/wiki/Binary-coded_decimal). I have no idea *why* 10-19 numbers all take 9 bytes. BCD is able to store 10 or 11 numbers plus sign in 6 bytes, 12-13 in 7, etc. But since the 9 byte requirement is documented in Books Online, I'm sure that the engineers who created SQL Server had some reason for this. -- Hugo Kornelis, SQL Server MVP My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis |
| |||
| >> Is there an alignment requirement? If this is the reason, why not break storage requirements at the base-10 precisions that require 4 bytes and 8 bytes, to be synchronized with 16-, 32-, 64-bit registers? << Look at the Intel instruction set and that should provide answers. It is best to use native hardware data types. Or if you have hardware with micro code, to invent your own assembly language for your high level languages -- some old Burroughs machines could that and it was impressive! A Phillips cassette would make them into a COBOL machine, an ALGOL machine, Fortran machine or LISP machine. But more to the point, at the SQL level, we don't care about implementation details. When you start worrying about things at that level, you start writing SQL that is not portable, tuned for one release of one product only and miss global optimizations because of bit fiddling at the low levels. |
| ||||
| Thanks for quick answers to this! As I mention, I ask somewhat out of curiosity. Also, I believe it's healthy to stay mindful enough to the implementation to know how to use high-level tools efficiently. In this case, my curiosity arose because I have a large quantity of data that is calculated as double in C++, and I can store as float, or round and store as fixed precision. In looking at the size of the mantissa supported in floating point versus the precision supported by the fixed-precision, I was surprised by how little the fixed-precision numeric type holds in some cases (e.g., 9 bytes for a 10-digit #) relative to the 8-byte float. Some of this must relate to BCD, but the fact that 10 and 19 digits require the same storage is still surprising (to me). I trust that there are reasons for rounding of space usage. However, one reason I would be motivated to specify a lower precision for a given field is to save storage and fit more records onto a page... The vardecimal is interesting, but the 2 bytes of overhead associated, presumably along with the extra work involved in reading a datum-level size before reading each value, are negatives, plus, if records are frequently updated so as to enlarge vardecimal data, then they'll have to be written out to a different page, resulting in fragmentation, right? Anyway, thanks for the helpful thoughts, John On Feb 6, 7:25*pm, --CELKO-- <jcelko...@earthlink.net> wrote: > >> *Is there an alignment requirement? *If this is the reason, why notbreak storage requirements at the base-10 precisions that require 4 bytes and 8 bytes, to be synchronized with 16-, 32-, 64-bit registers? << > > Look at the Intel instruction set and that should provide answers. It > is best to use native hardware data types. *Or if you have hardware > with micro code, to invent your own assembly language for your high > level languages -- some old Burroughs machines could that and it was > impressive! *A Phillips cassette would make them into a COBOL machine, > an ALGOL machine, Fortran machine or LISP machine. > > But more to the point, at the SQL level, we don't care about > implementation details. *When you start worrying about things at that > level, you start writing SQL that is not portable, tuned for one > release of one product only and miss global optimizations because of > bit fiddling at the low levels. |