Decimal Encoding Specification, version 1.01
Copyright (c) IBM Corporation, 2009. All rights reserved.
7 Apr 2009

Specification

This section defines decimal encodings for decimal numbers. These encodings allow for a range of positive and negative values (including both normal and subnormal numbers), together with values of ±0, ±Infinity, and Not-a-Number (NaN).


Fields in the encodings

Each encoding comprises four fields, as follows:

sign
A single bit indicating the polarity of the number.

In numbers for which the sign has meaning (for finite numbers and Infinity) a 1 indicates the number is negative (or is negative zero) and a 0 indicates it is positive or is non-negative zero.

combination field
A 5-bit field which encodes the two most significant bits (MSBs) of the exponent (which may take only the values 0 through 2) and the most significant digit (MSD) of the coefficient (4 bits, which may take only the values 0 through 9).

When any of the first four bits of the field is 0, the whole encoding describes a finite number. When all of the first four bits of the field are 1, the whole encoding describes a special value (an Infinity or NaN).

The following table defines the encoding of the combination field. The leftmost of the bits in the combination field is placed first.
Combination field   Type       Exponent MSBs   Coefficient MSD
(5 bits)                       (2 bits)        (4 bits)
a b c d e           Finite     a b             0 c d e
1 1 c d e           Finite     c d             1 0 0 e
1 1 1 1 0           Infinity   - -             - - - -
1 1 1 1 1           NaN        - -             - - - -
Note that either one or both of the exponent MSBs will always be 0, so in the first line of the table, either a or b (or both) will be 0, and in the second line of the table, either c or d (or both) will be 0.
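
The table can be read as a small decoding routine. The following Python sketch is illustrative only; the function name decode_combination and the tuple it returns are assumptions made for this example and are not part of the specification.

```python
def decode_combination(field):
    """Split a 5-bit combination field into (type, exponent MSBs, coefficient MSD).

    For special values the last two elements are None.
    """
    if not 0 <= field <= 0b11111:
        raise ValueError("combination field must fit in 5 bits")
    if field >> 1 == 0b1111:
        # First four bits are all 1: 11110 is Infinity, 11111 is NaN.
        return ("NaN" if field & 1 else "Infinity", None, None)
    if field >> 3 == 0b11:
        # Form 1 1 c d e: exponent MSBs are c d, the MSD is 1 0 0 e (8 or 9).
        return ("Finite", (field >> 1) & 0b11, 0b1000 | (field & 1))
    # Form a b c d e: exponent MSBs are a b, the MSD is 0 c d e (0 through 7).
    return ("Finite", field >> 3, field & 0b111)
```

For example, decode_combination(0b01000) gives exponent MSBs 01 and a most significant digit of 0.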

exponent continuation
The remaining, less significant, bits of the exponent. The most significant of these bits is on the left (is placed first).

The encoded exponent is formed by appending these continuation bits as a suffix to the two exponent bits derived from the combination field. The whole encoded exponent forms an unsigned binary integer whose largest unsigned value, Elimit, is given by (3 × 2^ecbits) – 1, where ecbits is the number of bits in the exponent continuation. ecbits varies with the format, as detailed below.

The value of the exponent is calculated by subtracting a bias from the value of the encoded exponent, in order to allow both negative and positive exponents. The value of the bias varies with the format, and is also detailed below. In each format, all values of encoded exponent (0 through Elimit) can be used.
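
As an illustration, the calculation can be sketched in Python as follows. The function name exponent_value is an assumption for this example; ecbits and bias are passed in as parameters because they vary with the format (the decimal64 values used in the comment, 8 and 398, are those given in the table of field lengths in this specification).

```python
def exponent_value(msbs, continuation, ecbits, bias):
    """Return the exponent given the two MSBs and the exponent continuation."""
    if not 0 <= msbs <= 2:
        raise ValueError("exponent MSBs may only take the values 0 through 2")
    if not 0 <= continuation < (1 << ecbits):
        raise ValueError("continuation does not fit in ecbits bits")
    elimit = 3 * 2**ecbits - 1
    # Append the continuation bits as a suffix to the two MSBs.
    encoded = (msbs << ecbits) | continuation
    assert encoded <= elimit  # always true when msbs <= 2
    return encoded - bias

# decimal64: MSBs 01 followed by 10001100 is 396, and 396 - 398 = -2.
exponent = exponent_value(0b01, 0b10001100, ecbits=8, bias=398)
```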

When the number is a NaN or an Infinity, the first two bits of the exponent continuation field are used as follows:
Combination field   Exponent continuation field   Value
                    most significant bits
1 1 1 1 0           - -                           Infinity
1 1 1 1 1           0 -                           quiet NaN
1 1 1 1 1           1 -                           signaling NaN
where '-' means undefined (an implementation may use these undefined bits for its own purposes, such as to indicate the origin of a NaN value); however, a future standard might require that the results of arithmetic set these bits to 1 for a NaN or 0 for an Infinity.

These assignments allow the bulk initialization of consecutive numbers in storage through byte replication (for initial values of NaNs, ±Infinity, or ±0).
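
Because the fields are laid out in the order sign, combination field, exponent continuation, the bits that distinguish the special values all fall within the first byte of an encoding. The Python sketch below illustrates this; the function name classify_first_byte and the strings it returns are assumptions made for this example.

```python
def classify_first_byte(byte):
    """Classify an encoding from its first byte: returns (sign, kind)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    sign = byte >> 7
    combination = (byte >> 2) & 0b11111
    if combination >> 1 != 0b1111:
        return (sign, "finite")
    if combination == 0b11110:
        return (sign, "Infinity")
    # For a NaN, the first exponent continuation bit selects quiet or signaling.
    signaling = (byte >> 1) & 1
    return (sign, "signaling NaN" if signaling else "quiet NaN")

# Byte replication: eight copies of 0x78 form a decimal64 +Infinity.
plus_infinity = bytes([0x78]) * 8
```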

coefficient continuation
The remaining, less significant, digits of the coefficient. The coefficient continuation is a multiple of 10 bits (the multiple depending on the format), and the most significant group is on the left (is placed first).

Each 10-bit group represents three decimal digits, using Densely Packed Decimal encoding.[1]  Note that certain 10-bit groups encode the same value (all 8 possibilities where all three digits in the value are either 8 or 9 have four possible encodings). For these numbers, all four encodings are accepted as operands, but only the encoding with the first two bits being 0 will be generated on output.
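
The decoding of a 10-bit group is not defined in this section; the Python sketch below follows the Densely Packed Decimal rules as published in the cited reference, and should be treated as an illustration rather than as normative text. The function name dpd_decode and the bit names p through y are assumptions for this example.

```python
def dpd_decode(group):
    """Decode one 10-bit DPD group into three digits, most significant first."""
    if not 0 <= group < 1024:
        raise ValueError("a DPD group is exactly 10 bits")
    # Name the bits p q r s t u v w x y, from most to least significant.
    p, q, r, s, t, u, v, w, x, y = ((group >> n) & 1 for n in range(9, -1, -1))

    def small(b2, b1, b0):  # a digit 0 through 7, from three bits
        return 4 * b2 + 2 * b1 + b0

    def large(b0):          # a digit 8 or 9, from one bit
        return 8 + b0

    if v == 0:
        return (small(p, q, r), small(s, t, u), small(w, x, y))
    if (w, x) == (0, 0):
        return (small(p, q, r), small(s, t, u), large(y))
    if (w, x) == (0, 1):
        return (small(p, q, r), large(u), small(s, t, y))
    if (w, x) == (1, 0):
        return (large(r), small(s, t, u), small(p, q, y))
    # w x = 1 1: bits s t select which digits are large.
    if (s, t) == (0, 0):
        return (large(r), large(u), small(p, q, y))
    if (s, t) == (0, 1):
        return (large(r), small(p, q, u), large(y))
    if (s, t) == (1, 0):
        return (small(p, q, r), large(u), large(y))
    # All three digits are 8 or 9; p and q are ignored (the redundant encodings).
    return (large(r), large(u), large(y))
```

In this sketch the four groups 0x0FF, 0x1FF, 0x2FF, and 0x3FF all decode to 999; only 0x0FF, whose first two bits are 0, would be generated on output.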

The coefficient is formed by appending the decoded continuation digits as a suffix to the digit derived from the combination field. The value of the coefficient is an unsigned integer which is the sum of the values of its digits, each multiplied by the appropriate power of ten. That is, if there are n digits in the coefficient which are labeled d_(n–1) d_(n–2) ... d_1 d_0, where d_(n–1) is the most significant, the value is SUM(d_i × 10^i), where i takes the values 0 through n–1.

The maximum value of the coefficient, Cmax, is therefore 10^n – 1.
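
The positional sum can be checked with a short Python sketch. The function name coefficient_value is an assumption for this example; digits are supplied most significant first, as they appear in the encoding.

```python
def coefficient_value(digits):
    """Sum each digit times its power of ten; digits are most significant first."""
    if any(not 0 <= d <= 9 for d in digits):
        raise ValueError("each digit must be 0 through 9")
    # The last digit in the list has weight 10**0, the one before it 10**1, ...
    return sum(d * 10**i for i, d in enumerate(reversed(digits)))

# A 7-digit coefficient of all nines reaches the maximum for n = 7.
cmax_7 = coefficient_value([9] * 7)
```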

The coefficient continuation field is undefined when the combination field indicates that the number is an Infinity or NaN. In this case, an implementation may use the bits in the field for its own purposes (for example, to indicate the origin of a NaN value); however, a future standard might require that the results of arithmetic set these bits to 1 for a NaN or 0 for an Infinity.

The fields of encodings are laid out in the order they are described above. Within each field, the bits are laid out as described for each field (that is, the combination field has its bits in the order abcde, the exponent continuation field has its most significant bit first, and the coefficient continuation field has its most significant 10-bit group first).

The network byte order (the order in which the bytes of an encoding are transmitted in a network protocol such as TCP/IP) of an encoding is such that the byte which includes the sign is transmitted first.
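
The layout can be illustrated by splitting an encoding, supplied in network byte order, into its four fields. This Python sketch is not part of the specification; the function name split_fields and the parameter name ccbits (the length of the coefficient continuation in bits) are assumptions for this example.

```python
def split_fields(encoding, ecbits, ccbits):
    """Return (sign, combination, exponent continuation, coefficient continuation)."""
    total = 1 + 5 + ecbits + ccbits
    if len(encoding) * 8 != total:
        raise ValueError("encoding length does not match the field lengths")
    # The byte holding the sign comes first, so the bytes read as a big-endian integer.
    word = int.from_bytes(encoding, "big")
    sign = word >> (total - 1)
    combination = (word >> (ecbits + ccbits)) & 0b11111
    exponent_continuation = (word >> ccbits) & ((1 << ecbits) - 1)
    coefficient_continuation = word & ((1 << ccbits) - 1)
    return (sign, combination, exponent_continuation, coefficient_continuation)

# A 64-bit encoding with an 8-bit exponent continuation and 50 coefficient bits.
fields = split_fields(bytes.fromhex("A2 30 00 00 00 00 03 D0"), ecbits=8, ccbits=50)
```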


Lengths of the fields

This specification defines three formats for decimal numbers: decimal32, decimal64, and decimal128, which are 32, 64, and 128 bits long respectively. Of these, the decimal32 format is required if the decimal64 format is provided, and the decimal64 format is required if the decimal128 format is provided.

In all three formats, the sign is always one bit and the combination field is always 5 bits. The lengths of the other two fields vary with the format, and from these lengths the maximum exponent (Emax) and bias can be derived, as described in Appendix A. The following table defines the field lengths (in bits, unless specified) and details the corresponding derived values.
Format                                   decimal32   decimal64   decimal128
Format length                            32          64          128
Exponent continuation length (ecbits)    6           8           12
Coefficient continuation length          20          50          110
Total Exponent length                    8           10          14
Total Coefficient length in digits       7           16          34
Elimit                                   191         767         12287
Emax                                     96          384         6144
Emin                                     –95         –383        –6143
bias                                     101         398         6176
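
The derived rows of the table can be reproduced from the two variable field lengths. Appendix A is not reproduced in this section, so the relations used in the Python sketch below (Emax = (Elimit + 1) / 2, Emin = 1 – Emax, and bias = Emax + p – 2, where p is the coefficient length in digits) are assumptions chosen because they match every value in the table; the function and key names are likewise hypothetical.

```python
def derived_parameters(ecbits, ccbits):
    """Derive the table values from the two variable field lengths (in bits)."""
    digits = 1 + 3 * (ccbits // 10)   # one MSD plus three digits per 10-bit group
    elimit = 3 * 2**ecbits - 1        # largest encoded exponent
    emax = (elimit + 1) // 2          # assumed relation, see the text above
    emin = 1 - emax
    bias = emax + digits - 2          # assumed relation, see the text above
    return {
        "length": 1 + 5 + ecbits + ccbits,
        "exponent_bits": 2 + ecbits,
        "digits": digits,
        "Elimit": elimit,
        "Emax": emax,
        "Emin": emin,
        "bias": bias,
    }

decimal32 = derived_parameters(6, 20)
decimal64 = derived_parameters(8, 50)
decimal128 = derived_parameters(12, 110)
```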


The value of an encoded number

The value of an encoding is either a special value (a NaN or an Infinity) or it is a finite number whose numerical value is given exactly by: (–1)^sign × coefficient × 10^exponent.

For example, if the sign had the value 1, the exponent had the value –1, and the coefficient had the value 25, then the numerical value of the number is exactly –2.5.

Notes

  1. More than one encoding may have the same numerical value; if the sign again had the value 1, but the exponent had the value –2 and the coefficient had the value 250, then the numerical value of the number would also be exactly –2.5.
  2. The largest value of the exponent in a format is less than Emax because the coefficient is an integer. For a format with precision p digits and maximum exponent Emax, IEEE 854 requires that the maximum absolute value of a number be exactly (10^p – 1) × 10^–(p–1) × 10^Emax. (For example, if p=7 and Emax=96 then the largest value allowed is 9.999999E+96.)
    The maximum value of the exponent for a given format is therefore Emax–(p–1). For example, if p=7 and Emax=96 then the number whose coefficient=9999999 and exponent=+90 has the value 9.999999E+96, which is the maximum normal number.
  3. The method for deriving the values of Emax and bias ensures that all combinations of exponent (0 through Elimit) and coefficient (0 through 10^p – 1) are allowed. For example, the exponent encoded as zero is always allowed.
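
The value formula and the first two notes can be checked with Python's standard decimal module, which constructs a number exactly from a (sign, digits, exponent) tuple. This is an illustrative sketch; the function name number_value is an assumption for this example.

```python
from decimal import Decimal

def number_value(sign, coefficient, exponent):
    """Return (-1)**sign * coefficient * 10**exponent as an exact Decimal."""
    if sign not in (0, 1) or coefficient < 0:
        raise ValueError("sign must be 0 or 1 and the coefficient non-negative")
    digits = tuple(int(ch) for ch in str(coefficient))
    return Decimal((sign, digits, exponent))

a = number_value(1, 25, -1)         # the example in the text: -2.5
b = number_value(1, 250, -2)        # note 1: a different encoding, same value
largest = number_value(0, 9999999, 90)  # note 2: p=7, exponent = Emax-(p-1)
```

Here a and b compare equal even though their coefficients and exponents differ, and largest equals 9.999999E+96.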


Examples

In the decimal64 format, the length and content of the fields are:
Length (bits)   1      5                   8                       50
Content         Sign   Combination field   Exponent continuation   Coefficient continuation
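In this format, the finite number –7.50 would be encoded as follows:

  1. The sign is 1, because the number is negative.
  2. The coefficient is 750, which is padded with 13 leading zeros to give 16 digits. The most significant digit (0) goes into the combination field, and the remaining 15 digits are encoded as five 10-bit groups in the coefficient continuation: four groups of all-zero bits followed by the Densely Packed Decimal encoding of 750 (hexadecimal 3D0).
  3. The exponent is –2. With a bias of 398, the encoded exponent is 396, which is 0110001100 in binary. The two most significant bits (01) go into the combination field, and the remaining eight bits (10001100) form the exponent continuation.

The bits of the combination field are therefore 01000 (the last three bits are 0 because the most significant digit of the coefficient is 0). The full encoding is therefore (in hexadecimal, shown in network byte order):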
  A2 30 00 00 00 00 03 D0
Similarly, the value +Infinity is encoded as:
  78 xx xx xx xx xx xx xx
(Where the bytes xx are undefined and could be repetitions of the 78.)

Note that only the first byte has to be inspected to determine whether the number is finite or is a special value. Also, if the number is a special value, its specific value is fully defined in that first byte.
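
The finite example can be reproduced end to end with the following Python sketch, which assembles a decimal64 encoding from a sign, coefficient, and exponent. It is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, and the digit-packing rules in dpd_encode follow the Densely Packed Decimal scheme of the cited reference, which this section does not itself define.

```python
def dpd_encode(d2, d1, d0):
    """Pack three decimal digits into one canonical 10-bit DPD group."""
    if any(not 0 <= d <= 9 for d in (d2, d1, d0)):
        raise ValueError("each digit must be 0 through 9")
    # h: the middle two bits of a small digit (0-7); the low bit is always kept.
    h2, h1, h0 = (d2 >> 1) & 3, (d1 >> 1) & 3, (d0 >> 1) & 3
    # Keyed by which digits are large (8 or 9); values are (pq, st, v, wx).
    layout = {
        (0, 0, 0): (h2, h1, 0, h0),
        (0, 0, 1): (h2, h1, 1, 0b00),
        (0, 1, 0): (h2, h0, 1, 0b01),
        (1, 0, 0): (h0, h1, 1, 0b10),
        (1, 1, 0): (h0, 0b00, 1, 0b11),
        (1, 0, 1): (h1, 0b01, 1, 0b11),
        (0, 1, 1): (h2, 0b10, 1, 0b11),
        (1, 1, 1): (0b00, 0b11, 1, 0b11),
    }
    pq, st, v, wx = layout[(d2 >> 3, d1 >> 3, d0 >> 3)]
    return (pq << 8) | ((d2 & 1) << 7) | (st << 5) | ((d1 & 1) << 4) | (v << 3) | (wx << 1) | (d0 & 1)

def encode_decimal64(sign, coefficient, exponent):
    """Encode a finite number as 8 bytes in network byte order."""
    if sign not in (0, 1) or not 0 <= coefficient < 10**16:
        raise ValueError("sign or coefficient out of range")
    encoded_exponent = exponent + 398          # decimal64 bias
    if not 0 <= encoded_exponent <= 767:       # decimal64 Elimit
        raise ValueError("exponent out of range")
    digits = [int(ch) for ch in f"{coefficient:016d}"]
    msd = digits[0]
    exp_msbs, exp_cont = encoded_exponent >> 8, encoded_exponent & 0xFF
    if msd < 8:
        combination = (exp_msbs << 3) | msd                      # a b c d e form
    else:
        combination = 0b11000 | (exp_msbs << 1) | (msd & 1)      # 1 1 c d e form
    word = (sign << 5) | combination
    word = (word << 8) | exp_cont
    for i in range(1, 16, 3):                  # five groups of three digits
        word = (word << 10) | dpd_encode(*digits[i:i + 3])
    return word.to_bytes(8, "big")

encoded = encode_decimal64(1, 750, -2)         # the number -7.50
```

With these inputs, encoded.hex() is "a2300000000003d0", matching the bytes shown above.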


Footnotes:
[1] See Densely Packed Decimal Encoding, Mike Cowlishaw, in IEE Proceedings – Computers and Digital Techniques, ISSN 1350-2387, Vol. 149, No. 3, pp. 102–104, IEE, May 2002.
Abstract: Chen-Ho encoding is a lossless compression of three Binary Coded Decimal digits into 10 bits using an algorithm which can be applied or reversed using only simple Boolean operations. An improvement to the encoding which has the same advantages but is not limited to multiples of three digits is described. The new encoding allows arbitrary-length decimal numbers to be coded efficiently while keeping decimal digit boundaries accessible. This in turn permits efficient decimal arithmetic and makes the best use of available resources such as storage or hardware registers.
A summary is available at: http://speleotrove.com/decimal/DPDecimal.html
