Decimal Encoding Specification, version 1.01
Copyright (c) IBM Corporation, 2009. All rights reserved. 7 Apr 2009
This document describes decimal encodings for decimal numbers. These encodings allow for a range of positive and negative values (including both normal and subnormal numbers), together with values of ±0, ±Infinity, and Not-a-Number (NaN).
Three formats of decimal numbers are described: decimal32, decimal64, and decimal128, encoded in 32, 64, and 128 bits respectively.
The finite numbers are defined by a sign, an exponent (which is a power of ten), and a decimal integer coefficient. The value of a finite number is given by $(-1)^{\mathit{sign}} \times \mathit{coefficient} \times 10^{\mathit{exponent}}$. For example, if the sign had the value 1, the exponent had the value –1, and the coefficient had the value 25, then the value of the number is –2.5.
This dual integer description of numbers permits redundant encodings of some values. For example, if the sign had the value 1, the exponent had the value –2, and the coefficient had the value 250, then the numerical value of this number is also –2.5.
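The two examples above can be checked with a short sketch. It uses Python's standard `decimal` module, which happens to use the same sign/coefficient/exponent model; the helper name `finite_value` is hypothetical and is not part of this specification.

```python
from decimal import Decimal

def finite_value(sign, coefficient, exponent):
    """Build the number (-1)**sign * coefficient * 10**exponent exactly.

    Hypothetical helper: the coefficient is a non-negative integer and the
    exponent is a power of ten, as in the dual-integer description.
    """
    digits = tuple(int(d) for d in str(coefficient))
    return Decimal((sign, digits, exponent))

a = finite_value(1, 25, -1)    # sign 1, coefficient 25, exponent -1
b = finite_value(1, 250, -2)   # sign 1, coefficient 250, exponent -2

# Same numerical value, but two distinct (redundant) representations.
print(a, b, a == b, a.as_tuple() == b.as_tuple())
```

The two values compare equal numerically, yet their sign/coefficient/exponent triples differ, which is the redundancy the text refers to.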
The advantage of this representation is that it exactly matches the definition of decimal numbers used in almost all databases, programming languages, and applications. This in turn allows a decimal arithmetic unit to support not only the floating-point arithmetic used in languages such as Java and C# but also the strongly-typed fixed-point and integer arithmetic required in databases and other languages.
The cost of the redundant encodings is approximately a 17% reduction in possible exponent values – however, because the base is 10, the exponent range is always greater than that of the IEEE 754 binary format of the same size.
In the encoding of these dual-integer numbers, the sign is a single bit, as for IEEE 754 binary numbers. The exponent is encoded as an unsigned binary integer from which a bias is subtracted to allow both negative and positive exponents. The coefficient is an unsigned decimal integer, with each complete group of three digits encoded in 10 bits (this increases the precision available by about 15%, compared to simple binary coded decimal).
Given the length of the coefficient and possible values for the encoded exponent, the maximum positive exponent (Emax) and bias can be derived, as described in Appendix A.
Calculating the values leads to the following results for the three formats:
|
|
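As a rough illustration of this kind of derivation, the sketch below computes the exponent parameters from the coefficient length and the number of exponent continuation bits. It is not a reproduction of Appendix A. The function names, the per-format inputs (7, 16, and 34 digits; 6, 8, and 12 exponent continuation bits), and the convention Emin = 1 – Emax are assumptions taken from the IEEE 754-2008 decimal interchange formats rather than from the text here.

```python
def derive_parameters(precision, exponent_continuation_bits):
    """Derive exponent parameters for one format (illustrative sketch).

    Assumptions: the two leading exponent bits can only be 00, 01, or 10,
    and Emin = 1 - Emax.
    """
    # Largest encoded exponent: three leading states times the continuation.
    elimit = 3 * 2 ** exponent_continuation_bits - 1
    # Encoded exponents 0..Elimit must cover Etiny..Emax-(precision-1),
    # which is 2 * Emax distinct values when Emin = 1 - Emax.
    emax = (elimit + 1) // 2
    emin = 1 - emax
    # Exponent of the smallest subnormal; it is encoded as 0.
    etiny = emin - (precision - 1)
    bias = -etiny
    return {"elimit": elimit, "emax": emax, "emin": emin, "bias": bias}

def encode_exponent(exponent, bias):
    """Encoded exponent is the unsigned value from which the bias is subtracted."""
    return exponent + bias

# (coefficient length in digits, exponent continuation bits): assumed inputs.
FORMATS = {"decimal32": (7, 6), "decimal64": (16, 8), "decimal128": (34, 12)}

for name, (precision, ecbits) in FORMATS.items():
    print(name, derive_parameters(precision, ecbits))
```

Under these assumptions the sketch gives a bias of 101, 398, and 6176 and an Emax of 96, 384, and 6144 for the three formats.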
Note that numbers with the same ‘scale’ (such as 1.23 and 123.45) have the same encoded exponent.
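A small sketch of this point, using Python's `decimal` module to read off the exponent of each number. The bias of 398 is an assumption (the decimal64 value from IEEE 754-2008), and `encoded_exponent` is a hypothetical helper name.

```python
from decimal import Decimal

BIAS_DECIMAL64 = 398  # assumed bias for the 64-bit format

def encoded_exponent(text, bias=BIAS_DECIMAL64):
    """Return the biased exponent for the decimal number written as `text`."""
    exponent = Decimal(text).as_tuple().exponent  # e.g. -2 for "1.23"
    return exponent + bias

# Both numbers have two digits after the decimal point (exponent -2),
# so they share one encoded exponent; "1.230" has exponent -3 and differs.
print(encoded_exponent("1.23"), encoded_exponent("123.45"), encoded_exponent("1.230"))
```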
As shown in the first table, each format has a coefficient whose length is a multiple of three, plus one. One digit of the coefficient (the most significant) cannot be included in a 10-bit group and instead is combined with the two most significant bits of the exponent into a 5-bit combination field. This scheme is more efficient than keeping the exponent and coefficient separated, and increases the exponent range available by about 50%.
The combination field requires 30 states out of a possible 32 for the finite numbers; the other two states are used to identify the special values. This localization of the special values means that only the first few bits of a number have to be inspected in order to determine whether it is finite or is a special value. Further, bulk initialization of storage to values of ±0, NaN, or ±Infinity can be achieved by simple byte replication.
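The following sketch shows one way the combination field could be decoded. The exact bit assignment is an assumption here (it follows the layout used by the IEEE 754-2008 decimal interchange formats and is not spelled out in the passage above), and the function names are hypothetical.

```python
def decode_combination(field):
    """Decode a 5-bit combination field into (kind, exponent_msbs, digit).

    Assumed layout, bits written 'abcde':
      ab != 11          -> exponent bits ab, leading digit 0cde (0-7)
      ab == 11, cd != 11 -> exponent bits cd, leading digit 100e (8-9)
      11110             -> Infinity
      11111             -> NaN
    """
    if not 0 <= field < 32:
        raise ValueError("combination field must fit in 5 bits")
    if field == 0b11110:
        return ("infinity", None, None)
    if field == 0b11111:
        return ("nan", None, None)
    if field >> 3 != 0b11:
        return ("finite", field >> 3, field & 0b111)
    return ("finite", (field >> 1) & 0b11, 8 + (field & 1))

def kind_of(encoding):
    """Classify an encoded number by inspecting only its first byte."""
    first = encoding[0]           # sign bit, 5 combination bits, 2 more bits
    return decode_combination((first >> 2) & 0b11111)[0]

# 3 exponent-bit states x 10 leading digits = 30 finite states.
finite_states = {decode_combination(f)[1:] for f in range(32)
                 if decode_combination(f)[0] == "finite"}
print(len(finite_states))

# Byte replication: every byte identical, yet the value class is well defined.
print(kind_of(bytes([0x00]) * 8), kind_of(bytes([0x78]) * 8), kind_of(bytes([0x7C]) * 8))
```

Under this layout the leading exponent bits take three values rather than the two that a single dedicated bit would give, which is one way to read the "about 50%" gain in exponent range. A buffer filled with the byte 0x00, 0x78, or 0x7C decodes as a zero, an Infinity, or a NaN respectively.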