Bibliography of material on Decimal Arithmetic

Decimal Arithmetic: Floating-point

Precise Computation Using Range Arithmetic, via C++, Oliver Aberth and Mark J Schaefer, ACM Transactions on Mathematical Software, Vol. 18 #4, pp481–491, ACM Press, December 1992.
Abstract: An arithmetic is described that can replace floating-point arithmetic for programming tasks requiring assured accuracy. A general explanation is given of how the arithmetic is constructed with C++, and a programming example in this language is supplied. Times for solving representative problems are presented.
Solving Constraints on the Intermediate Result of Decimal Floating-Point Operations, Merav Aharoni, Ron Maharik, and Abraham Ziv, Proceedings of the 18th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2854-6, ISBN 978-0-7695-2854-0, pp38–45, IEEE, June 2007.
Abstract: The draft revision of the IEEE Standard for Floating-Point Arithmetic (IEEE P754) includes a definition for decimal floating-point (FP) in addition to the widely used binary FP specification. The decimal standard raises new concerns with regard to the verification of hardware- and software-based designs. The verification process normally emphasizes intricate corner cases and uncommon events. The decimal format introduces several new classes of such events in addition to those characteristic of binary FP. Our work addresses the following problem: Given a decimal floating-point operation, a constraint on the intermediate result, and a constraint on the representation selected for the result, find random inputs for the operation that yield an intermediate result compatible with these specifications. The paper supplies efficient analytic solutions for addition and for some cases of multiplication and division. We provide probabilistic algorithms for the remaining cases. These algorithms prove to be efficient in the actual implementation.
An APL interpreter and system for a small computer, M. Alfonseca, M. L. Tavera, and R. Casajuana, IBM Systems Journal, Vol. 16 #1, pp18–40, IBM, 1977.
Abstract: The design and implementation of an experimental APL system on the small, sensor-based System/7 is described. Emphasis is placed on the solution to the problem of fitting a full APL system into a small computer.
   The system has been extended through an I/O auxiliary processor to make it possible to use APL in the management and control of the System/7 sensor-based I/O operations.
ANSI X3.274-1996: American National Standard for Information Technology – Programming Language REXX, Brian Marks and Neil Milsted, 167pp, ANSI, February 1996.
Abstract: This standard provides an unambiguous definition of the programming language REXX. Its purpose is to facilitate portability of REXX programs for use on a wide variety of computer systems.
Note: Errata also available, as ANSI X3.274-1996/AM 1-2000.
Unnormalized Floating Point Arithmetic, R. L. Ashenhurst and N. Metropolis, Journal of the ACM, Vol. 6 #3, pp415–428, ACM Press, July 1959.
Abstract: Algorithms for floating point computer arithmetic are described, in which fractional parts are not subject to the usual normalization convention. These algorithms give results in a form which furnishes some indication of their degree of precision. An analysis of one-stage error propagation is developed for each operation; a suggested statistical model for long run error propagation is also set forth.
Extending TeX and METAFONT with floating-point arithmetic, Nelson H.F. Beebe, Proceedings of TUG 2007, TUGboat Vol. 28 #3, ISSN 0896-3207, pp319–328, TeX Users Group, July 2007.
Abstract: The article surveys the state of arithmetic in TeX and METAFONT, suggests that they could usefully be extended to support floating-point arithmetic, and shows how this could be done with a relatively small effort, without loss of the important feature of platform-independent results from those programs, and without invalidating any existing documents, or software written for those programs, including output drivers.
A Decimal Floating-Point Processor for Optimal Arithmetic, G. Bohlender and T. Teufel, Computer arithmetic: Scientific Computation and Programming Languages, ISBN 3-519-02448-9, pp31–58, B. G. Teubner Stuttgart, 1987.
Abstract: A floating-point processor for optimal arithmetic should perform scalar products with maximum accuracy in addition to the usual operations +, -, *, /. This means that scalar products have to be computed with an error of at most one bit of the least significant digit, even if cancellation of leading digits occurs. In order to avoid conversion errors during input and output of numerical data, the decimal number system should be chosen.
    The arithmetic processor BAP-SC performs these operations in a 64 bit floating-point format with 13 decimal digits in the mantissa. The prototype is built in bit-slice technology on wire-wrap boards. Interfaces have been developed [sic] for several busses and computers.
    The arithmetic processor is fully integrated in the programming language PASCAL-SC. It supports operations in higher numerical spaces and new numerical algorithms that compute verified results with error bounds.
Decimal Floating-Point Arithmetic in Binary Representation, Gerd Bohlender, Computer arithmetic: Scientific Computation and Mathematical Modelling (Proceedings of the Second International Conference, Albena, Bulgaria, 24-28 September 1990), pp13–27, J. C. Baltzer AG, 1991.
Abstract: The binary representation of decimal floating-point numbers permits an efficient implementation of the proposed radix independent IEEE standard for floating-point arithmetic, as far as storage space is concerned. Unfortunately the left and right shifts occurring in the arithmetic operations are very complicated and slow in this representation. In the present paper therefore methods are proposed which speed up these shifts; in particular a kind of carry look-ahead technique is used for division. These methods can be combined to construct a decimal shifter which is needed in an ALU for decimal arithmetic.
Printing Floating-Point Numbers Quickly and Accurately, Robert G. Burger and R. Kent Dybvig, Proceedings of the ACM SIGPLAN '96 conference on Programming language design and implementation, pp108–116, ACM Press, 1996.
Abstract: This paper presents a fast and accurate algorithm for printing floating-point numbers in both free- and fixed-format modes. In free-format mode, the algorithm generates the shortest, correctly rounded output string that converts to the same number when read back in, accommodating whatever rounding mode the reader uses. In fixed-format mode, the algorithm generates a correctly rounded output string using special # marks to denote insignificant trailing digits. For both modes, the algorithm employs a fast estimator to scale floating-point numbers efficiently.
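The round-trip property of the free-format mode described above can be illustrated with a brute-force search. This sketch is not the Burger and Dybvig algorithm: it assumes IEEE 754 binary64 floats (as used by CPython), uses a hypothetical helper name, `shortest_roundtrip`, and simply tries increasing precisions until the printed string reads back as the same number.

```python
def shortest_roundtrip(x: float) -> str:
    """Return the lowest-precision '%g' rendering of x that reads back as x.

    Brute force for illustration only. For a binary64 float,
    17 significant digits are always enough to round-trip.
    """
    for digits in range(1, 18):
        text = format(x, ".{}g".format(digits))
        if float(text) == x:
            return text
    # NaN never compares equal to itself, so it ends up here.
    raise ValueError("value does not round-trip (NaN?)")
```

For example, `shortest_roundtrip(0.1)` stops at one digit, while `shortest_roundtrip(1/3)` needs many more before the text converts back to the same float.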
Burroughs B5500 Information Processing Systems Reference Manual, Burroughs Corporation, 224pp, Burroughs Corporation, Detroit, Michigan, 1964.
Abstract: This reference manual describes the hardware characteristics of the Burroughs B 5500 Information Processing System by presenting detailed information concerning the functional operation of the entire system. The B 5500 is a large-scale, high-speed, solid-state computer which represents a departure from the conventional computer system concept. It is a problem language oriented system rather than the conventional hardware oriented system. Because of the design concept of the B 5500, there exists a strong interdependence between the hardware and the Master Control Program which directs the system. The material presented herein pertains only to the hardware considerations, whereas the Master Control Program is discussed under separate cover.
Specification of the IEEE-854 Floating-Point Standard in HOL and PVS, Victor A. Carreño and Paul S. Miner, HOL95: Eighth International Workshop on Higher-Order Logic Theorem Proving and Its Applications, 16pp, Brigham Young University, September 1995.
Abstract: The IEEE-854 Standard for radix-independent floating-point arithmetic has been partially defined within two mechanical verification systems. We present the specification of key parts of the standard in both HOL and PVS. This effort to formalize IEEE-854 has given the opportunity to compare the styles imposed by the two verification systems on the specification.
A Proposed Radix- and Word-length-independent Standard for Floating-point Arithmetic, W. J. Cody, J. T. Coonen, D. M. Gay, K. Hanson, D. Hough, W. Kahan, R. Karpinski, J. Palmer, F. N. Ris, and D. Stevenson, IEEE Micro magazine, Vol. 4 #4, pp86–100, IEEE, August 1984.
Abstract: This article places [Draft 1.0 of IEEE 854] before the public for the first time. ... This article also includes material that describes how decisions were reached in preparing the P854 draft and explains how to overcome some of the implementation problems.
Note: Reprinted in ACM SIGNUM, Vol. 20, #1, pp35-51, 1985.
CADAC: A Controlled-Precision Decimal Arithmetic Unit, Marty S. Cohen, T. E. Hull, and V. Carl Hamacher, IEEE Transactions on Computers, Vol. 32 #4, pp370–377, IEEE, April 1983.
Abstract: This paper describes the design of an arithmetic unit called CADAC (clean arithmetic with decimal base and controlled precision). Programming language specifications for carrying out “ideal” floating-point arithmetic are described first. These specifications include detailed requirements for dynamic precision control and exception handling, along with both complex and interval arithmetic at the level of a programming language such as Fortran or PL/I.
    CADAC is an arithmetic unit which performs the four floating-point operations add/subtract/multiply/divide on decimal numbers in such a way as to support all the language requirements efficiently. A three-level pipeline is used to overlap two-digit-at-a-time serial processing of the partial products/remainders. Although the logic design is relatively complex, the performance is efficient, and the advantages gained by implementing programmer-controlled precision directly in the hardware are significant.
A Decimal Floating-Point Specification, Michael F. Cowlishaw, Eric M. Schwarz, Ronald M. Smith, and Charles F. Webb, Proceedings of the 15th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-1150-3, pp147–154, IEEE, June 2001.
Abstract: Even though decimal arithmetic is pervasive in financial and commercial transactions, computers are still implementing almost all arithmetic calculations using binary arithmetic. As chip real estate becomes cheaper it is becoming likely that more computer manufacturers will provide processors with decimal arithmetic engines. Programming languages and databases are expanding the decimal data types available while there has been little change in the base hardware. As a result, each language and application is defining a different arithmetic and few have considered the efficiency of hardware implementations when setting requirements.
    In this paper, we propose a decimal format which meets the requirements of existing standards for decimal arithmetic and is efficient for hardware implementation. We propose this specification in the hope that designers will consider providing decimal arithmetic in future microprocessors and that future decimal software specifications will consider hardware efficiencies.

Note: Eric Schwarz’s Presentation foils are available here.
Decimal Floating-Point: Algorism for Computers, Michael F. Cowlishaw, Proceedings of the 16th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-1894-X, pp104–111, IEEE, June 2003.
Abstract: Decimal arithmetic is the norm in human calculations, and human-centric applications must use a decimal floating-point arithmetic to achieve the same results.
    Initial benchmarks indicate that some applications spend 50% to 90% of their time in decimal processing, because software decimal arithmetic suffers a 100× to 1000× performance penalty over hardware. The need for decimal floating-point in hardware is urgent.
    Existing designs, however, either fail to conform to modern standards or are incompatible with the established rules of decimal arithmetic. This paper introduces a new approach to decimal floating-point which not only provides the strict results which are necessary for commercial applications but also meets the constraints and requirements of the IEEE 854 standard.
    A hardware implementation of this arithmetic is in development, and it is expected that this will significantly accelerate a wide variety of applications.

Note: Softcopy is available in PDF.
Fixed, floating, and exact computation with Java's BigDecimal, M. Cowlishaw, J. Bloch, and J.D. Darcy, Dr. Dobb's Journal Vol. 29 #7, ISSN 1044-789X, pp22–27, CMP Media, July 2004.
Abstract: Decimal data types are widely used in commercial, financial, and Web applications, and many general-purpose programming languages have either native decimal types or readily available decimal arithmetic packages. Since the 1.1 release, the libraries of the Java programming language supported decimal arithmetic via the java.math.BigDecimal class. With the inclusion of JSR13 into J2SE 1.5, BigDecimal now has true floating-point operations consistent with those in the IEEE 754 revision. In this article, we first explain why decimal arithmetic is important and the differences between the BigDecimal class and binary float and double types.
Decimal Floating Point Processor, K. A. Duke, IBM Technical Disclosure Bulletin, 11-69, pp862–862, IBM, November 1969.
Abstract: A numerical processor can be built which operates on floating-point numbers where the mantissa is an integer and the characteristic represents a power of 10 by which that integer must be multiplied. Thus, decimal numbers can be represented exactly without conversion errors. Such floating point numbers are expressed as N = (-1)^S × 10^X × I where S = sign bit, X = exponent, and I = integer.
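The representation in this abstract (a sign, an integer, and a power-of-ten exponent) can be tried out with Python's standard `decimal` module, whose `Decimal.as_tuple()` exposes the same three parts. This is a minimal sketch only: the helper name `from_parts` is hypothetical, and nothing here models the processor described in the disclosure.

```python
from decimal import Decimal
from fractions import Fraction

def from_parts(sign: int, integer: int, exponent: int) -> Fraction:
    """Evaluate (-1)**sign * integer * 10**exponent exactly."""
    return (-1) ** sign * integer * Fraction(10) ** exponent

# 0.1 has no exact binary floating-point form, but it is exactly
# the integer 1 scaled by 10**-1.
parts = Decimal("0.1").as_tuple()   # sign=0, digits=(1,), exponent=-1
integer = int("".join(map(str, parts.digits)))
value = from_parts(parts.sign, integer, parts.exponent)
```

Here `value` equals `Fraction(1, 10)` exactly, whereas `Fraction(0.1)`, built from the binary float, does not.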
IBM POWER6 accelerators: VMX and DFU, L. Eisen, J. W. Ward III, H.-W. Tast, N. Mäding, J. Leenstra, S. M. Mueller, C. Jacobi, J. Preiss, E. M. Schwarz, and S. R. Carlough, IBM Journal of Research and Development Vol. 51 #6, ISSN 0018-8646, pp663–683, IBM, November 2007.
Abstract: The IBM POWER6 microprocessor core includes two accelerators for increasing performance of specific workloads. The vector multimedia extension (VMX) provides a vector acceleration of graphic and scientific workloads. It provides single instructions that work on multiple data elements. The instructions separate a 128-bit vector into different components that are operated on concurrently. The decimal floating-point unit (DFU) provides acceleration of commercial workloads, more specifically, financial transactions. It provides a new number system that performs implicit rounding to decimal radix points, a feature essential to monetary transactions. The IBM POWER processor instruction set is substantially expanded with the addition of these two accelerators. The VMX architecture contains 176 instructions, while the DFU architecture adds 54 instructions to the base architecture. The IEEE 754R Binary Floating-Point Arithmetic Standard defines decimal floating-point formats, and the POWER6 processor—on which a substantial amount of area has been devoted to increasing performance of both scientific and commercial workloads—is the first commercial hardware implementation of this format.
Decimal Floating-Point Multiplication Via Carry-Save Addition, Mark A. Erle, Michael J. Schulte, and Brian J. Hickmann, Proceedings of the 18th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2854-6, ISBN 978-0-7695-2854-0, pp46–55, IEEE, June 2007.
Abstract: Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents the design of a decimal floating-point multiplier that complies with specifications for decimal multiplication given in the draft revision of the IEEE 754 Standard for Floating-point Arithmetic (IEEE 754R). This multiplier extends a previously published decimal fixed-point multiplier design by adding several features including exponent generation, sticky bit generation, shifting of the intermediate product, rounding, and exception detection and handling. The core of the decimal multiplication algorithm is an iterative scheme of partial product accumulation employing decimal carry-save addition to reduce the critical path delay. Novel features of the proposed multiplier include support for decimal floating-point numbers, on-the-fly generation of the sticky bit, early estimation of the shift amount, and efficient decimal rounding. Area and delay estimates are provided for a verified Verilog register transfer level model of the multiplier.
The Design of Floating-Point Data Types, David Goldberg, ACM Letters on Programming Languages and Systems, Vol. 1 #2, pp138–151, ACM Press, June 1992.
Abstract: The issues involved in designing the floating-point part of a programming language are discussed. Looking at the language specifications for most existing languages might suggest that this design involves only trivial issues, such as whether to have one or two types of REALs or how to name the functions that convert from INTEGER to REAL. It is shown that there are more significant semantic issues involved. After discussing the trade-offs for the major design decisions, they are illustrated by presenting the design of the floating-point part of the Modula-3 language.
On a Floating-Point Number Representation For Use with Algorithmic Languages, A. A. Grau, Communications of the ACM, Vol. 5 #3, pp160–161, ACM Press, March 1962.
Abstract: Algorithmic languages, such as ALGOL, make provision for two types of numbers, real and integer, which are usually implemented on the computer by means of floating-point and fixed-point numbers respectively. The concepts real and integer, however, are taken from mathematics, where the set of integers forms a proper subset of the set of real numbers. In implementation a real problem is posed by the fact that the set of fixed-point numbers is not a proper subset of the set of floating-point numbers; this problem becomes very apparent in attempts to implement ALGOL 60. Furthermore, the one mathematical operation of addition is implemented in the machine by one of two machine operations, fixed-point addition or floating-point addition. ...
Decimal to Binary Floating Point Number Conversion Mechanism, J. W. Havender, IBM Technical Disclosure Bulletin, 07-80, pp706–708, IBM, July 1980.
Abstract: Floating point numbers may be converted from decimal to binary using a high speed natural logarithm and exponential function calculation mechanism and a fixed point divide/multiply unit.
    The problem solved is to convert numbers expressed in a radix 10 floating point form to numbers expressed in a radix 2 floating point form.
Math Reference, Hewlett Packard Company, HP-71 Reference Manual, Mfg. # 0071-90110, Reorder # 0071-90010, pp317–318, Hewlett Packard Company, October 1987.
Note: First edition October 1983. Subsections describe the numeric precisions available and the range of representable numbers. Manual available from The Museum of HP Calculators.
The IEEE Proposal for Handling Math Exceptions, Hewlett Packard Company, HP-71 Reference Manual, Mfg. # 0071-90110, Reorder # 0071-90010, pp338–345, Hewlett Packard Company, October 1987.
Abstract: The IEEE Radix Independent Floating-Point Proposal divides all of the floating-point “exceptional events” encountered in calculations into five classes of math exceptions: invalid operation, division by zero, overflow, underflow, and inexact result. Associated with each math exception is a flag that is set by the HP-71 whenever an exception is encountered. These flags remain set until you clear them. Each of these flags can be accessed by its number or its name.
Note: First edition October 1983. Manual available from The Museum of HP Calculators.
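The five exception classes named in this abstract have counterparts among the signals of Python's standard `decimal` module (`InvalidOperation`, `DivisionByZero`, `Overflow`, `Underflow`, `Inexact`), and that module's flags are likewise sticky until cleared. The sketch below uses that module as a stand-in and says nothing about the HP-71's own flag numbers or names.

```python
from decimal import Context, Decimal, DivisionByZero, Inexact

# Disable all traps so exceptional events set flags instead of raising.
ctx = Context(prec=5, traps=[])

quotient = ctx.divide(Decimal(1), Decimal(3))   # rounded to 5 digits
infinity = ctx.divide(Decimal(1), Decimal(0))   # division by zero

inexact_seen = ctx.flags[Inexact]
div_by_zero_seen = ctx.flags[DivisionByZero]

# Flags are sticky: a later exact operation does not reset them.
ctx.add(Decimal(1), Decimal(1))
still_set = ctx.flags[Inexact]

# Only an explicit clear resets them.
ctx.clear_flags()
after_clear = ctx.flags[Inexact]
```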
Chapter 13 – Internal Data Representations, Hewlett Packard Company, Software Internal Design Specification for the HP-71, Vol. 1 Part #00071-90068, pp13.1–13.17, Hewlett Packard Company, December 1983.
Abstract: This chapter discusses the format in which the HP-71 represents numeric or string data in memory or in the CPU registers.
Note: Manual available from The Museum of HP Calculators.
Desirable Floating-Point Arithmetic and Elementary Functions for Numerical Computation, T. E. Hull, ACM Signum Newsletter, Vol. 14 #1 (Proceedings of the SIGNUM Conference on the Programming Environment for Development of Numerical Software), pp96–99, ACM Press, 1978.
Abstract: The purpose of this talk is to summarize proposed specifications for floating-point arithmetic and elementary functions. The topics considered are: the base of the number system, precision control, number representation, arithmetic operations, other basic operations, elementary functions, and exception handling. The possibility of doing without fixed-point arithmetic is also mentioned. The specifications are intended to be entirely at the level of a programming language such as Fortran. The emphasis is on convenience and simplicity from the user’s point of view. Conforming to such specifications would have obvious beneficial implications for the portability of numerical software, and for proving programs correct, as well as attempting to provide facilities which are most suitable for the user. The specifications are not complete in every detail, but it is intended that they be complete “in spirit” – some further details, especially syntactic details, would have to be provided, but the proposals are otherwise relatively complete.
Note: Also in Proceedings of the IEEE 4th Symposium on Computer Arithmetic pp63-69.
Principles, Preferences and Ideals for Computer Arithmetic, Thomas E. Hull, Christian H. Reinsch, and John R. Rice, CSD-TR-339, 13pp, Dept. of Computer Science, Purdue University, June 1980.
Abstract: This paper presents principles and preferences for the implementation of computer arithmetic and ideals for the arithmetic facilities in future programming languages. The implementation principles and preferences are for the current approaches to the design of arithmetic units. The ideals are for the long term development of programming languages, with the hope that arithmetic units will be built to support the requirements of programming languages.
Toward an Ideal Computer Arithmetic, T. E. Hull and M. S. Cohen, Proceedings of the 8th Symposium on Computer Arithmetic, pp131–138, IEEE, May 1987.
Abstract: A new computer arithmetic is described. Closely related built-in functions are included. A user’s point of view is taken, so that the emphasis is on what language features are available to a user. The main new feature is flexible precision control of decimal floating-point arithmetic. It is intended that the language facilities be sufficient for describing numerical processes one might want to implement, while at the same time being simple to use, and implementable in a reasonably efficient manner. Illustrative examples are based on experience with an existing software implementation.
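Programmer-controlled precision of decimal floating-point arithmetic, the main feature named in this abstract, can be sketched with the `localcontext` facility of Python's standard `decimal` module. This only illustrates the general idea, not the language facilities proposed by Hull and Cohen; the helper name `sqrt_with_precision` is hypothetical.

```python
from decimal import Decimal, localcontext

def sqrt_with_precision(value: str, digits: int) -> Decimal:
    """Compute a square root with `digits` significant decimal digits."""
    with localcontext() as ctx:
        ctx.prec = digits            # precision applies only inside this block
        return Decimal(value).sqrt()

low = sqrt_with_precision("2", 5)    # five significant digits
high = sqrt_with_precision("2", 30)  # thirty significant digits
```

The same computation is carried out at two precisions without changing any surrounding code, and the precision setting does not leak out of the `with` block.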
Decimal Shifting for an Exact Floating Point Representation, J. D. Johannes, C. Dennis Pegden, and F. E. Petry, Computers and Electrical Engineering, Vol. 7 #3, pp149–155, Elsevier, September 1980.
Abstract: A floating point representation which permits exact conversion of decimal numbers is discussed. This requires the exponent to represent a power of ten, and thus decimal shifts of the mantissa are needed. A specialized design is analyzed for the problem of division by ten, which is needed for decimal shifting.
Higher Radix Floating Point Representations, P. Johnstone and F. Petry, Proceedings of the 9th Symposium on Computer Arithmetic, ISBN 0-8186-8963-3, pp128–135, IEEE Computer Society Press, September 1989.
Abstract: This paper examines the feasibility of higher radix floating point representations, and in particular, decimal based representations. Traditional analyses of such representations have assumed the format of a floating point datum to be roughly identical to that of traditional binary floating point encodings such as the IEEE P754 task group standard representations. We relax this restriction and propose a method of encoding higher radix floating point data with range, precision, and storage requirements comparable to those exhibited by traditional binary representations. Results from McKeeman’s Maximum and Average Relative Representational Error (MRRE and ARRE) analyses, Brent’s RMS error evaluation, Matula’s ratio of significance space and gap functions, and Brown and Richman’s exponent range estimates are extended to accommodate the proposed representation. A decimal alternative to traditional binary representations is proposed, and the behavior of such a system is contrasted with that of a comparable binary system.
Architecture and Algorithms for Processing Non-binary Floating Point Radices, Paul Johnstone and Frederick E. Petry, unpublished paper, 39pp, pers. comm., July 2001.
Abstract: Recent studies have proposed several non-binary floating point representations which possess most of the storage and algorithmic efficiencies of traditional binary systems with no sacrifice of precision and only modest reductions in range. Such systems possess inherent advantages in that they employ less complicated conversion algorithms and are less prone to errors in representation. Additionally, non-binary systems tend to produce more precise arithmetic results in that the common problem of truncation of an infinitely repeating quotient occurs with a lesser frequency.
    However, as has been previously observed, traditional binary floating representations are most efficiently adapted to the prevailing choices of technology and system architecture. Previous research has left undone the quantification and evaluation of the algorithms and componentry necessary to effect the proposed representations in a fully realized system. We consider in this study the expected impact of adding the capacity to process one of the proposed non-binary radix representations within a conventional computer system. Since decimal representations are clearly the overwhelming impetus for these studies, discussion will focus solely on base 10 systems. Examination of implementation issues is directed toward the following areas: the implementation of floating point representations in contemporary computer architectures, the design of any extensions to such systems, the effects on system complexity and cost, and, finally, resulting algorithmic revisions.
Floating Point Feature On The IBM Type 1620, F. B. Jones and A. W. Wymore, IBM Technical Disclosure Bulletin, 05-62, pp43–46, IBM, May 1962.
Abstract: In the type 1620 automatic floating point operations, a floating point number is a field consisting of a variable length mantissa and a two digit exponent. The exponent is in the two low order positions of the field, and the mantissa is in the remaining high order positions, |M.....M|EE.
    The most significant digit positions are marked by flags and the algebraic signs are marked by flags over the least significant digit positions. The exponent is established on the premise that the mantissa is less than 1.0 and equal to or greater than 0.1, and has a range of -99 to +99. The smallest positive quantity that can be represented is thus 00.... 099. The mantissa may have from two to one hundred digits. ...
The Art of Computer Programming, Vol 2, Donald E. Knuth, ISBN 0-201-89684-2, 762pp, Addison Wesley Longman, 1998.
Abstract: The chief purpose of this chapter [4] is to make a careful study of the four basic processes of arithmetic: addition, subtraction, multiplication, and division. Many people see arithmetic as a trivial thing that children learn and computers do, but we will see that arithmetic is a fascinating topic with many interesting facets. ...
Note: Third edition. See especially sections 4.1 through 4.4.
Fixed-Point Math in C, Joe Lemieux, Embedded Systems Programming, Vol. 14 #4, EDTN, April 2001.
Abstract: Floating-point arithmetic can be expensive if you’re using an integer-only processor. But floating-point values can be manipulated as integers, as a less expensive alternative.
Fairchild decimal arithmetic unit, Stan Mazor, 9pp, pers. comm., July–September 2002.
Abstract: We embarked on the design of Symbol II [circa 1966], a large scale HIGH LEVEL language, virtual memory, time sharing machine. This machine used large printed circuit boards, approx. 16″ x 20″ with slots for over 210 DIP’s. We had 100 connector pins on each side and we defined the system using a number of parallel busses with multiple autonomous functional units and inter-processor communication. The completed system had over 110 printed circuit boards and consumed mega-watts of power...
MSDN Library Visual Basic 6.0 Reference, Microsoft Corporation, Microsoft Corporation, 2002.
Abstract: The contents of the Visual Basic Language Reference and Controls Reference includes topics on the controls, objects, properties, methods, events, statements, functions, and constants available.
    Additionally, this Reference contains topics on wizards, trappable errors, data types, keyboard shortcuts, and bi-directional programming.
On conventions for systems of numerical representation, Peter M. Neely, Proceedings of the ACM annual conference, Boston, Massachusetts, pp644–651, ACM Press, 1972.
Abstract: Present conventions for numeric representation are considered inadequate to serve the needs of applied computing. Thus an augmented digital number system is proposed for use in programming languages and in digital computers. Special symbols are proposed for numbers too large, too small or too close to zero to be represented in the normal digital number system, or which are undefined. Properties of mappings among and between digital number systems are used to justify the augments chosen. Finally a suggestion is made for a new floating point word format that will serve all the above needs and will greatly extend the exponent range of floating point numbers.
ERMETH: The First Swiss Computer, Hans Neukom, IEEE Annals of the History of Computing, pp5–22, IEEE, October 2005.
Abstract: Eduard Stiefel, in 1948 the first director of the Federal Institute of Technology’s newly established Institute of Applied Mathematics, recognized that computers would be essential to this new field of mathematics. Unable to find exactly what he wanted in existing computers, Stiefel developed the ERMETH. This article examines the rationale of, and objectives for, the first Swiss computer.
EASIAC, A Pseudo-Computer, Robert Perkins, Journal of the ACM, Vol. 3 #2, pp65–72, ACM Press, April 1956.
Abstract: One of the primary functions of the MIDAC installation at the University of Michigan is the instruction of beginners in the various aspects of digital machine use including programming and coding. ... In conducting these courses it was soon found to be extremely difficult, in five or six instruction periods, to bring a complete newcomer up to the point where he can code and check out on MIDAC anything more than a rather trivial routine. As might be expected the difficulty centers around problems of scaling, instruction modification and binary representation. ... To alleviate these problems it was decided that a new computer was needed: one designed to make programming easier. At the cost of some of MIDAC’s speed and capacity plus two or three man-months of programming time EASIAC, the EASy Instruction Automatic Computer, was realized as a translation-interpretation program in MIDAC.
Principles and Preferences for Computer Arithmetic, Christian H. Reinsch, ACM SIGNUM Vol. 14 #1, pp12–27, ACM Press, March 1979.
Abstract: This working paper arose out of discussions on desirable hardware features for numerical calculation in the IFIP Working Group 2.5 on Numerical Software. It reflects the views of all members of the group, although no formal vote of approval has been taken; it is not an official IFIP document. Many people contributed ideas to this paper, especially T. J. Dekker, C. W. Gear, T. E. Hull, J. R. Rice, and J. L. Schonfelder.
A Unified Decimal Floating-Point Architecture for the Support of High-Level Languages, Frederic N. Ris, ACM SIGNUM Newsletter, Vol. 11 #3, pp18–23, ACM Press, October 1976.
Abstract: This paper summarizes a proposal for a decimal floating-point arithmetic interface for the support of high-level languages, consisting both of the arithmetic operations observed by application programs and facilities to produce subroutine libraries accessible from these programs. What is not included here are the detailed motivations, examinations of alternatives, and implementation considerations which will appear in the full work.
Note: Also in ACM SIGARCH Computer Architecture News, Vol. 5 #4, pp21–31, October 1976. Also in ACM SIGPLAN Notices, Vol. 12 #9, pp60–70, September 1977. Also in IBM RC 6203 (#26651) 11pp, September 1976.
Applications of Redundant Number Representations to Decimal Arithmetic, R. Sacks-Davis, The Computer Journal, Vol. 25 #4, pp471–477, November 1982.
Abstract: A decimal arithmetic unit is proposed for both integer and floating-point computations. To achieve comparable speed to a binary arithmetic unit, the decimal unit is based on a redundant number representation. With this representation no loss of compactness is made relative to binary coded decimal (BCD) form. In this paper the hardware required for the implementation of the basic operations of addition, subtraction, multiplication and division are described and the properties of floating-point arithmetic based on a redundant number representation are investigated.
Mathematics and computer science at odds over real numbers, Thomas J. Scott, ACM SIGCSE Bulletin, Vol. 23 #1 (Technical Symposium on Computer Science Education 1991), pp130–139, ACM Press, 1991.
Abstract: This paper discusses the “real number” data type as implemented by “floating point” numbers. Floating point implementations and a theorem that characterizes their truncations are presented. A teachable floating point system is presented, chosen so that most problems can be worked out with paper and pencil. Then major differences between floating point number systems and the continuous real number system are presented. Important floating point formats are next discussed. Two examples derived from actual computing practice on mainframes, minicomputers, and PCs are presented. The paper concludes with a discussion of where floating point arithmetic should be taught in standard courses in the ACM curriculum.
BigDecimal (Java 2 Platform SE v1.4.0), Sun Microsystems, URL:, 17pp, Sun Microsystems Inc., 2002.
Abstract: Immutable, arbitrary-precision signed decimal numbers. A BigDecimal consists of an arbitrary precision integer unscaled value and a non-negative 32-bit integer scale, which represents the number of digits to the right of the decimal point. The number represented by the BigDecimal is (unscaledValue/10^scale). BigDecimal provides operations for basic arithmetic, scale manipulation, comparison, hashing, and format conversion.
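Example: The unscaled-value and scale representation described in the abstract above can be seen in a minimal sketch like the following. The class name BigDecimalScaleDemo and the helper fromParts are hypothetical names chosen for illustration; only the standard java.math calls (unscaledValue, scale, the BigDecimal(BigInteger, int) constructor, equals, compareTo) come from the library itself.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Illustrative sketch only; the class and helper names are hypothetical.
final class BigDecimalScaleDemo {

    // Rebuilds a value from its two components: unscaledValue / 10^scale.
    static BigDecimal fromParts(BigInteger unscaled, int scale) {
        return new BigDecimal(unscaled, scale);
    }

    public static void main(String[] args) {
        BigDecimal price = new BigDecimal("123.45");

        // The two components that together define the number.
        BigInteger unscaled = price.unscaledValue();
        int scale = price.scale();
        System.out.println(unscaled + " / 10^" + scale);

        // Reassembling the components gives back an equal value.
        System.out.println(fromParts(unscaled, scale).equals(price));

        // The scale is part of the representation: 1.5 and 1.50 compare as
        // numerically equal, but equals() also compares the scale.
        BigDecimal a = new BigDecimal("1.5");
        BigDecimal b = new BigDecimal("1.50");
        System.out.println(a.compareTo(b) == 0);
        System.out.println(a.equals(b));
    }
}
```

Because equals() takes the scale into account while compareTo() does not, code that needs purely numeric comparison would normally use compareTo().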
Floating Point Number Format with Number System with Base of 1000, Y. Takashi, IBM Technical Disclosure Bulletin, 01-98, pp609–610, IBM, January 1998.
Abstract: Disclosed is a use number system with a base of 1000 instead of 2 at the mantissa part of a floating point number. The unit is 10 bit. Each 10 bit keeps the value between 0 and 1000. This format is superior to Binary Coded Decimal (BCD) because it can keep more decimal numbers in the same size. This format is superior to binary because 1000 is 100 times of 10, and it makes no difference when converted to/from human’s decimal format.
Experimental Computer for Schools, D. M. Taub, C. E. Owen, and B. P. Day, Proceedings of the IEE, Vol. 117 #2, pp303–312, IEE, February 1970.
Abstract: The computer is a small desk-top machine designed for teaching schoolchildren how computers work. It works in decimal notation and has a powerful instruction set which includes 3-address floating-point instructions implemented as ‘extracode’ subroutines. Addressing can be absolute, relative or indirect. For input it uses a capacitive touch keyboard, and for output and display a perfectly normal TV receiver is used. Another input/output device is an ordinary domestic tape recorder, used mainly for long term storage of programs. To make the operation of the machine easy to follow, it can be made to stop at certain stages in the processing of an instruction and automatically display the contents of all registers and storage locations relevant at that time. The paper gives a description of the machine and a discussion of the factors that have influenced its design.
An evaluation of the design of the Gamma 60, T. J. Tumlin and M. Smotherman, Actes du 3e colloque de l'Histoire de l'Informatique, 11pp, Sophia-Antipolis, INRIA, 1993.
Abstract: The Bull Gamma 60 remains a major innovation in computer design. Its use of explicit FORK-JOIN parallelism is shown by a simulation model to wisely exploit a large difference in speeds between logic components and memory elements, as found on some machines of the 1950’s. Recently the reappearance of a large speed ratio makes the same type of explicit FORK-JOIN parallelism attractive in advanced designs and validates the latency-tolerant design philosophy of the Gamma 60. The major difficulty of the design is the programming effort required to fully express the parallelism available in programs.
A New Family of High–Performance Parallel Decimal Multipliers, Alvaro Vázquez, Elisardo Antelo, and Paolo Montuschi, Proceedings of the 18th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2854-6, ISBN 978-0-7695-2854-0, pp195–204, IEEE, June 2007.
Abstract: This paper introduces two novel architectures for parallel decimal multipliers. Our multipliers are based on a new algorithm for decimal carry–save multioperand addition that uses a novel BCD–4221 recoding for decimal digits. It significantly improves the area and latency of the partial product reduction tree with respect to previous proposals. We also present three schemes for fast and efficient generation of partial products in parallel. The recoding of the BCD–8421 multiplier operand into minimally redundant signed–digit radix–10, radix–4 and radix–5 representations using new recoders reduces the complexity of partial product generation. In addition, SD radix–4 and radix–5 recodings allow the reuse of a conventional parallel binary radix–4 multiplier to perform combined binary/decimal multiplications. Evaluation results show that the proposed architectures have interesting area–delay figures compared to conventional Booth radix–4 and radix–8 parallel binary multipliers and other representative alternatives for decimal multiplication.
Floating-Point Arithmetics, W. G. Wadey, Journal of the ACM, Vol. 7 #2, pp129–139, ACM Press, April 1960.
Abstract: Three types of floating-point arithmetics with error control are discussed and compared with conventional floating-point arithmetic. General multiplication and division shift criteria are derived (for any base) for Metropolis-style arithmetics. The limitations and most suitable range of application for each arithmetic are discussed.
Decimal Floating-Point Adder and Multifunction Unit with Injection-Based Rounding, Liang-Kai Wang and Michael J. Schulte, Proceedings of the 18th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2854-6, ISBN 978-0-7695-2854-0, pp56–65, IEEE, June 2007.
Abstract: Shrinking feature sizes gives more headroom for designers to extend the functionality of microprocessors. The IEEE 754R working group has revised the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic to include specifications for decimal floating-point arithmetic and IBM recently announced incorporating a decimal floating-point unit into their POWER6 processor. As processor support for decimal floating-point arithmetic emerges, it is important to investigate efficient algorithms and hardware designs for common decimal floating-point arithmetic algorithms. This paper presents novel designs for a decimal floating-point adder and a decimal floating-point multifunction unit. To reduce their delay, both the adder and the multifunction unit use decimal injection-based rounding, a new form of decimal operand alignment, and a fast flag-based method for rounding and overflow detection. Synthesis results indicate that the proposed adder is roughly 21% faster and 1.6% smaller than a previous decimal floating-point adder design, when implemented in the same technology. Compared to the decimal floating-point adder, the decimal floating-point multifunction unit provides six additional operations, yet only has 2.8% more delay and 9.7% more area.
Benchmarks and Performance Analysis of Decimal Floating-Point Applications, Liang-Kai Wang, Charles Tsen, Michael J. Schulte, and Divya Jhalani, Proceedings of the IEEE International Conference on Computer Design 2007, pp164–170, IEEE, October 2007.
Abstract: The IEEE P754 Draft Standard for Floating-point Arithmetic provides specifications for Decimal Floating-Point (DFP) formats and operations. Based on this standard, many developers will provide support for DFP calculations. We present a benchmark suite for DFP applications and use this suite to evaluate the performance of hardware and software DFP solutions. Our benchmarks include banking, commerce, risk-management, tax, and telephone billing applications organized into a suite of five macro benchmarks. In addition to developing our own applications, we leverage open-source projects and academic financial analysis applications. The benchmarks are modular, making them easy to adapt for different DFP solutions. We use the benchmarks to evaluate the performance of the decNumber DFP library and an extended version of the SimpleScalar PISA architecture with hardware and instruction set support for DFP operations. Our analysis shows that providing processor support for high-speed DFP operations significantly improves the performance of DFP applications.
IBM z10: The Next-Generation Mainframe Microprocessor, Charles Webb, IEEE Micro Vol. 28 #2, ISSN 0272-1732, pp19–29, IEEE, March/April 2008.
Abstract: The IBM system z10 includes four microprocessor cores — each with a private 3-Mbyte cache — and integrated accelerators for decimal floating-point computation, cryptography, and data compression. A separate SMP hub chip provides a shared third-level cache and interconnect fabric for multiprocessor scaling. This article focuses on the high-frequency design techniques used to achieve a 4.4-GHz system, and on the pipeline design that optimizes z10’s CPU performance.
A Complete Floating-Decimal Interpretive System for the IBM 650 Magnetic Drum Calculator, V. M. Wolontis, IBM Reference Manual, Floating-Decimal Interpretive System for the IBM 650, 87pp, IBM, 1959.
Abstract: This report describes an interpretive system which transforms the 650 into a three-address, floating-decimal, general-purpose computer, primarily suited for scientific and engineering calculations. The system is complete in the sense that all mathematical, logical, and input-output operations normally called for in such calculations can be performed within the system, i.e., without reference to the basic operation codes of the 650. The guiding principles in designing the system have been ease of use, as defined in the introduction, high speed of arithmetic and frequently used logical operations and full accuracy and range for the elementary transcendental functions...
Note: This document and the earlier Bell Telephone Laboratories report are available at

The 53 references listed on this page are selected from the bibliography on Decimal Arithmetic collected by Mike Cowlishaw. Please see the index page for more details and other categories.

Last updated: 10 Mar 2011
Some elements Copyright © IBM Corporation, 2002, 2009. All rights reserved.