Bibliography of material on Decimal Arithmetic – by year

This collection of references forms a bibliography of Decimal Arithmetic, with the emphasis on computer implementations of arithmetic.
This list is sorted by year of publication. (A categorized collection and an alphabetic list by first author are also available.)

This bibliography lists the papers I collected when researching decimal arithmetic through 2008. For a more extensive floating-point bibliography, including papers since that date, I commend Norbert Juffa and Nelson Beebe’s fparith collection at the University of Utah (search for ‘decimal’).  fparith is available in a variety of formats.

For general background on why decimal arithmetic is important, a decimal FAQ, decimal arithmetic specifications and testcases, and other World Wide Web links, please see the General Decimal Arithmetic pages.

For books, and papers with no formal abstract, the Abstract material is quoted from an introductory section or (occasionally, for books only) back cover matter. Omitted material is indicated by ellipses (...).

Please send any comments, corrections, or additions to Mike Cowlishaw.

1925 (1) 1946 (2) 1952 (1) 1954 (3) 1955 (1) 1956 (1) 1958 (5) 1959 (5) 1960 (1) 1961 (1) 1962 (4) 1963 (1) 1964 (1) 1965 (2) 1966 (2) 1967 (2) 1968 (4) 1969 (4) 1970 (3) 1971 (3) 1972 (3) 1973 (3) 1974 (5) 1975 (8) 1976 (3) 1977 (3) 1978 (3) 1979 (2) 1980 (5) 1981 (4) 1982 (2) 1983 (3) 1984 (4) 1985 (5) 1986 (2) 1987 (8) 1988 (1) 1989 (4) 1990 (4) 1991 (7) 1992 (8) 1993 (3) 1994 (6) 1995 (3) 1996 (3) 1997 (5) 1998 (9) 1999 (3) 2000 (3) 2001 (8) 2002 (6) 2003 (5) 2004 (10) 2005 (8) 2006 (14) 2007 (23) 2008 (8) 2009 (1)

The History of Arithmetic, Louis Charles Karpinski, 200pp, Rand McNally & Company, 1925.
Abstract: The purpose of this book is to present the development of arithmetic as a vital and integral part of the history of civilization. Particular attention is paid to the material of arithmetic which continues to be taught in our elementary schools and to the historical phases of that work with which the teacher of arithmetic should be familiar...
Note: Reprint: Russell & Russell, New York, 1965.

Preliminary discussion of the logical design of an electronic computing instrument, Arthur W. Burks, Herman H. Goldstine, and John von Neumann, 42pp, Inst. for Advanced Study, Princeton, N. J., June 28, 1946.
Abstract: Inasmuch as the completed device will be a general-purpose computing machine it should contain certain main organs relating to arithmetic, memory-storage, control and connection with the human operator. It is intended that the machine be fully automatic in character, i.e. independent of the human operator after the computation starts...
Note: Reprinted in von Neumann’s Collected Works, Vol. 5, A. H. Taub, Ed. (Pergamon, London, 1963), pp 34-79, and also in Computer Structures: Readings and Examples, Bell & Newell, McGraw-Hill Inc., 1971. Now widely available on the Internet.
    Contract W-36-034-ORD-H81. R&D Service, Ordnance Department, US Army and Institute for Advanced Study, Princeton
The Electronic Numerical Integrator and Computer (ENIAC), H. H. Goldstine and Adele Goldstine, IEEE Annals of the History of Computing, Vol. 18 #1, pp10–16, IEEE, 1996.
Abstract: It is our purpose in the succeeding pages to give a brief description of the ENIAC and an indication of the kinds of problems for which it can be used. This general purpose electronic computing machine was recently made public by the Army Ordnance Department for which it was developed by the Moore School of Electrical Engineering. The machine was developed primarily for the purpose of calculating firing tables for the armed forces. Its design is, however, sufficiently general to permit the solution of a large class of numerical problems which could hardly be attempted by more conventional computing tools.
    In order easily to obtain sufficient accuracy for scientific computations, the ENIAC was designed as a digital device. The equipment normally handles signed 10-digit numbers expressed in the decimal system. It is, however, so constructed that operations with as many as 20 digits are possible.
    The machine is automatically sequenced in the sense that all instructions needed to carry out a computation are given to it before the computation commences. It will be seen below how these instructions are given to the machine.

Note: Reprinted from Mathematical Tables and Other Aids to Computation, 1946.

Automatic Recognition of Spoken Digits, K. Davis, R. Biddulph, and S. Balashek, Journal of the Acoustical Society of America, Vol. 24 (Possibly: American Journal of Otolaryngology, Vol. 24.), pp637–642, ASA, November 1952.
Abstract: The recognizer discussed will automatically recognize telephone-quality digits spoken at normal speech rates by a single individual, with an accuracy varying between 97 and 99 percent. After some preliminary analysis of the speech of any individual, the circuit can be adjusted to deliver a similar accuracy on the speech of that individual. The circuit is not, however, in its present configuration, capable of performing equally well on the speech of a series of talkers without recourse to such adjustment.
    Circuitry involves division of the speech spectrum into two frequency bands, one below and the other above 900 cps. Axis-crossing counts are then individually made of both band energies to determine the frequency of the maximum syllabic rate energy within each band. Simultaneous two-dimensional frequency portrayal is found to possess recognition significance. Standards are then determined, one for each digit of the ten-digit series, and are built into the recognizer as a form of elemental memory. By means of a series of calculations performed automatically on the spoken input digit, a best match type comparison is made with each of the ten standard digit patterns and the digit of best match selected.

The IBM Type 702, An Electronic Data Processing Machine for Business, C. J. Bashe, W. Buchholz, and N. Rochester, Journal of the ACM (JACM), Vol. 1 #4, pp149–169, ACM Press, October 1954.
Abstract: The main features of the IBM Electronic Data Processing Machine, Type 702, are discussed from the programmer’s point of view to illustrate how it was designed specifically to solve large accounting and statistical problems in business, industry, and government. The 702 exploits in one integrated system the high speed and storage capacity of magnetic tape, the accessibility of electrostatic memory supplemented by large auxiliary storage on magnetic drums, the flexibility of punched-card document input, the page printing output of modern accounting machines, and the technology of general-purpose, stored-program, electronic computers. The 702 is a serial machine with decimal arithmetic. Its serial nature provides several unusual logical features of great aid in programming accounting problems.
The IBM Magnetic Drum Calculator Type 650, F. E. Hamilton and E. C. Kubie, Journal of the ACM, Vol. 1 #1, pp13–20, ACM Press, January 1954.
Abstract: The IBM Magnetic Drum Calculator Type 650 is an electronic calculator intermediate in speed, capacity and cost. It takes a logical position between the IBM Card Programmed Electronic Calculator and the IBM Electronic Data Processing Machines Type 701. It is a more powerful computing tool as required by those who have “outgrown” the Card Programmed Electronic Calculator. It is also a machine which may be used economically by those who are not as yet ready for a large scale computer such as the 701. It will serve not only to perform their required computing tasks, but it will also result in gaining valuable experience for later use of large scale equipment. The Magnetic Drum Calculator, through its stored program control, comprehensive order list, punched card input-output, self-checking and moderate memory capacity, gains the flexibility required of a computer which is to serve in both the commercial and scientific computing fields...
The Generation of Pseudo-Random Numbers on a Decimal Calculator, Jack Moshman, Journal of the ACM Vol. 1 #2, pp88–91, ACM Press, April 1954.
Abstract: (None.) Describes the generation of pseudo-random numbers on the decimal UNIVAC machine.

Arithmetic Operations in Digital Computers, R. K. Richards, ISBN (none), 397pp, D. Van Nostrand Co., NY, 1955.
Abstract: Among the first things that are learned in a study of mathematics are rules and procedures for performing basic arithmetic operations, notably addition, subtraction, multiplication, and division. The rules and procedures taught in school are, for the most part, aimed at making the operations as simple and speedy as possible when a pencil and a piece of paper are the only tools. In the design of more elaborate arithmetical tools, it is usually found necessary or at least highly desirable to devise new methods for executing the various arithmetic operations. ...
Note: Library of Congress No. 55-6234. Bibliography 9pp.

EASIAC, A Pseudo-Computer, Robert Perkins, Journal of the ACM, Vol. 3 #2, pp65–72, ACM Press, April 1956.
Abstract: One of the primary functions of the MIDAC installation at the University of Michigan is the instruction of beginners in the various aspects of digital machine use including programming and coding. ... In conducting these courses it was soon found to be extremely difficult, in five or six instruction periods, to bring a complete newcomer up to the point where he can code and check out on MIDAC anything more than a rather trivial routine. As might be expected the difficulty centers around problems of scaling, instruction modification and binary representation. ... To alleviate these problems it was decided that a new computer was needed: one designed to make programming easier. At the cost of some of MIDAC’s speed and capacity plus two or three man-months of programming time EASIAC, the EASy Instruction Automatic Computer, was realized as a translation-interpretation program in MIDAC.

BIDEC – A Binary-to-Decimal or Decimal-to-Binary Converter, J. F. Couleur, IRE Transactions on Electronic Computers, Vol. EC-7, pp313–316, IRE, 1958.
Abstract: Simple, high-speed devices to convert binary, binary coded octal, or Gray code numbers to binary coded decimal numbers or vice versa are described. Circuitry required is four shift register stages per decimal digit plus one 30-diode network per decimal digit. In simple form the conversion requires two operations per binary bit but is theoretically capable of working at one operation per bit.
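The "two operations per binary bit" in the abstract above can be pictured with the widely known shift-and-add-3 method of binary-to-BCD conversion. The Python sketch below is a software illustration only, on the assumption that this is the kind of shift-plus-correction step the abstract describes; the function name binary_to_bcd and its parameters are hypothetical and are not taken from the paper.

```python
def binary_to_bcd(n, digits):
    """Convert a non-negative integer to packed BCD by shift-and-add-3.

    Illustrative sketch: 'digits' is the number of 4-bit decimal digit
    positions available (an assumed parameter, not from the paper).
    """
    if n < 0 or n >= 10 ** digits:
        raise ValueError("value does not fit in the requested digits")
    bcd = 0
    nbits = max(n.bit_length(), 1)
    for i in range(nbits - 1, -1, -1):
        # Operation 1: correct every BCD digit that is 5 or more by
        # adding 3, so that the doubling below carries in decimal.
        for d in range(digits):
            if ((bcd >> (4 * d)) & 0xF) >= 5:
                bcd += 3 << (4 * d)
        # Operation 2: shift left one place, bringing in the next
        # binary bit (most significant bit first).
        bcd = (bcd << 1) | ((n >> i) & 1)
    return bcd


print(hex(binary_to_bcd(255, 3)))
```

Each hexadecimal digit of the result is one decimal digit of the input, so 255 comes out as 0x255.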
Computation with Approximate Numbers, Daniel B. Delury, The Mathematics Teacher 51, pp521–530, November 1958.
Abstract: There is room, I think, for the view that it is improper to speak at all of “approximate numbers”...
Note: Reprinted with permission of the Canadian School.
Binary and truth-function operations on a decimal computer with an extract command, William H. Kautz, Communications of the ACM, Vol. 1 #5, pp12–13, ACM Press, May 1958.
Abstract: It occasionally becomes desirable to solve, on automatic digital computing machines which are capable of handling only decimal numbers, problems in logic, class structure, coding, binary relations or binary arithmetic. This note describes how the major logical and binary operations can be carried out on one such machine, the DATATRON 205, without any circuit modifications to the computer. These procedures would be applicable with little modification to any decimal computer with an extract command, however.
Number Words and Number Symbols: A Cultural History of Numbers, Karl Menninger, ISBN 0-486-27096-3, 480pp, Dover Publications, Inc., 1992.
Abstract: This book is ... a multifaceted linguistic and historical analysis of how numbers have developed and evolved in many different cultures. “... especially good on early counting and calculating devices ...”.
Note: First published in English by the MIT Press, 1969. Translated from the German by Paul Broneer.
An Improved Decimal Redundancy Check, Roger L. Sisson, Communications of the ACM, Vol. 1 #5, pp10–12, ACM Press, May 1958.
Abstract: As more emphasis is placed on improving the accuracy of data fed into automatic computing systems, more emphasis will be placed on redundancy checking of predictable fields within the input. Two systems (at least) of checking a field of decimal digits have been proposed. In both of these it is assumed that the field to be checked is all numeric and that the redundancy must be of only one digit.

Unnormalized Floating Point Arithmetic, R. L. Ashenhurst and N. Metropolis, Journal of the ACM, Vol. 6 #3, pp415–428, ACM Press, July 1959.
Abstract: Algorithms for floating point computer arithmetic are described, in which fractional parts are not subject to the usual normalization convention. These algorithms give results in a form which furnishes some indication of their degree of precision. An analysis of one-stage error propagation is developed for each operation; a suggested statistical model for long run error propagation is also set forth.
Fingers or Fists? (The Choice of Decimal or Binary representation), Werner Buchholz, Communications of the ACM, Vol. 2 #12, pp3–11, ACM Press, December 1959.
Abstract: The binary number system offers many advantages over a decimal representation for a high-performance, general-purpose computer. The greater simplicity of a binary arithmetic unit and the greater compactness of binary numbers both contribute directly to arithmetic speed. Less obvious and perhaps more important is the way binary addressing and instruction formats can increase the overall performance. Binary addresses are also essential to certain powerful operations which are not practical with decimal instruction formats.
    On the other hand, decimal numbers are essential for communicating between man and the computer. In applications requiring the processing of a large volume of inherently decimal input and output data, the time for decimal-binary conversion needed by a purely binary computer may be significant. A slower decimal adder may take less time than a fast binary adder doing an addition and two conversions.
    A careful review of the significance of decimal and binary number systems led to the adoption in the IBM STRETCH computer of binary addressing and both binary and decimal data arithmetic, supplemented by efficient conversion instructions.

Note: Letters to the editor in response to this paper were published in CACM, Vol. 3, #3, March 1960.
Decimal-Binary conversions in CORDIC, D. H. Daggett, IRE Transactions on Electronic Computers, Vol. EC-8 #5, pp335–339, IRE, September 1959.
Abstract: A special-purpose, binary computer called CORDIC (COordinate Rotation DIgital Computer) contains a unique arithmetic unit composed of three shift registers, three adder-subtractors, and suitable interconnections for efficiently performing calculations involving trigonometric functions. A technique is formulated for using the CORDIC arithmetic unit to convert between angles expressed in degrees and minutes in the 8, 4, 2, 1 code and angles expressed in binary fractions of a half revolution. Decimal-to-binary conversion is accomplished through the generation of an intermediate binary code in which the variable values are +1 and -1. Each of these intermediate code variables controls the addition or subtraction of a particular binary constant in the formation of an accumulated sum which represents the angle. Examples are presented to illustrate the technique. Binary-to-decimal conversion is accomplished by applying essentially the same conversion steps in reverse order, but this feature is not discussed fully. Fundamental principles of the conversion technique, rather than details of implementation, are emphasized. The CORDIC conversion technique is sufficiently general to be applied to decimal-binary conversion problems involving other mixed radix systems and other decimal codes.
Binary conversion, with fixed decimal precision, of a decimal fraction, Donald Taranto, Communications of the ACM, Vol. 2 #7, pp27–27, ACM Press, July 1959.
Abstract: Given a decimal fraction f, find a binary approximation f_b to f, with a given decimal precision h.
A Complete Floating-Decimal Interpretive System for the IBM 650 Magnetic Drum Calculator, V. M. Wolontis, IBM Reference Manual, Floating-Decimal Interpretive System for the IBM 650, 87pp, IBM, 1959.
Abstract: This report describes an interpretive system which transforms the 650 into a three-address, floating-decimal, general-purpose computer, primarily suited for scientific and engineering calculations. The system is complete in the sense that all mathematical, logical, and input-output operations normally called for in such calculations can be performed within the system, i.e., without reference to the basic operation codes of the 650. The guiding principles in designing the system have been ease of use, as defined in the introduction, high speed of arithmetic and frequently used logical operations and full accuracy and range for the elementary transcendental functions...
Note: This document and the earlier Bell Telephone Laboratories report are available at

Floating-Point Arithmetics, W. G. Wadey, Journal of the ACM, Vol. 7 #2, pp129–139, ACM Press, April 1960.
Abstract: Three types of floating-point arithmetics with error control are discussed and compared with conventional floating-point arithmetic. General multiplication and division shift criteria are derived (for any base) for Metropolis-style arithmetics. The limitations and most suitable range of application for each arithmetic are discussed.

A Third Survey of Domestic Electronic Digital Computing Systems, Report No. 1115, Martin H. Weik, 1131pp, Ballistic Research Laboratories, Aberdeen Proving Ground, Maryland, March 1961.
Abstract: Based on the results of a third survey, the engineering and programming characteristics of two hundred twenty-two different electronic digital computing systems are given. The data are presented from the point of view of application, numerical and arithmetic characteristics, input, output and storage systems, construction and checking features, power, space, weight, and site preparation and personnel requirements, production records, cost and rental rates, sale and lease policy, reliability, operating experience, and time availability, engineering modifications and improvements and other related topics. An analysis of the survey data, fifteen comparative tables, a discussion of trends, a revised bibliography, and a complete glossary of computer engineering and programming terminology are included.

On a Floating-Point Number Representation For Use with Algorithmic Languages, A. A. Grau, Communications of the ACM, Vol. 5 #3, pp160–161, ACM Press, March 1962.
Abstract: Algorithmic languages, such as ALGOL, make provision for two types of numbers, real and integer, which are usually implemented on the computer by means of floating-point and fixed-point numbers respectively. The concepts real and integer, however, are taken from mathematics, where the set of integers forms a proper subset of the set of real numbers. In implementation a real problem is posed by the fact that the set of fixed-point numbers is not a proper subset of the set of floating-point numbers; this problem becomes very apparent in attempts to implement ALGOL 60. Furthermore, the one mathematical operation of addition is implemented in the machine by one of two machine operations, fixed-point addition or floating-point addition. ...
Floating Point Feature On The IBM Type 1620, F. B. Jones and A. W. Wymore, IBM Technical Disclosure Bulletin, 05-62, pp43–46, IBM, May 1962.
Abstract: In the type 1620 automatic floating point operations, a floating point number is a field consisting of a variable length mantissa and a two digit exponent. The exponent is in the two low order positions of the field, and the mantissa is in the remaining high order positions, |M.....M|EE.
    The most significant digit positions are marked by flags and the algebraic signs are marked by flags over the least significant digit positions. The exponent is established on the premise that the mantissa is less than 1.0 and equal to or greater than 0.1, and has a range of -99 to +99. The smallest positive quantity that can be represented is thus 00.... 099. The mantissa may have from two to one hundred digits. ...
Hardware Conversion of Decimal and Binary Numbers, G. T. Lake, Communications of the ACM, Vol. 5 #9, pp468–469, ACM Press, September 1962.
On a Wired-In Binary-to-Decimal Conversion Scheme, W. C. Lynch, Communications of the ACM, Vol. 5 #3, pp159–159, ACM Press, March 1962.

Mixed Congruential Random Number Generators for Decimal Machines, J. L. Allard, A. R. Dobell, and T. E. Hull, Journal of the ACM, Vol. 10 #2, pp131–141, ACM Press, April 1963.
Abstract: Random number generators of the mixed congruential type have recently been proposed. They appear to have some advantages over those of the multiplicative congruential type, but they have not been thoroughly tested. This paper summarizes the results of extensive testing of these generators which has been carried out on a decimal machine. Most results are for word length 10, and special attention is given to simple multipliers which give fast generators. But other word lengths and many other multipliers are considered. A variety of additive constants is also used. It turns out that these mixed generators, in contrast to the multiplicative ones, are not consistently good from a statistical point of view. The cases which are bad seem to belong to a well-defined class which, unfortunately, includes most of the generators associated with the simple multipliers. However, a surprise result is that all generators associated with one of the simplest and fastest multipliers, namely 101, turn out to be consistently good for word lengths greater than seven digits. A final section of the paper suggests a simple theoretical explanation of these experimental results.

Burroughs B5500 Information Processing Systems Reference Manual, Burroughs Corporation, 224pp, Burroughs Corporation, Detroit, Michigan, 1964.
Abstract: This reference manual describes the hardware characteristics of the Burroughs B 5500 Information Processing System by presenting detailed information concerning the functional operation of the entire system. The B 5500 is a large-scale, high-speed, solid-state computer which represents a departure from the conventional computer system concept. It is a problem language oriented system rather than the conventional hardware oriented system. Because of the design concept of the B 5500, there exists a strong interdependence between the hardware and the Master Control Program which directs the system. The material presented herein pertains only to the hardware considerations, whereas the Master Control Program is discussed under separate cover.

Number Base Conversion in a Significant Digit Arithmetic, Herbert Kanner, Journal of the ACM, Vol. 12 #2, ISSN 0004-5411, pp242–246, ACM Press, April 1965.
Abstract: An algorithm is presented for the conversion in either direction between binary and decimal floating-point representations, retaining proper significance through the conversion in an unnormalized significant digit arithmetic.
Pracniques: simulation of Boolean functions in a decimal computer, M. Morris Mano, Communications of the ACM, Vol. 8 #1, ISSN 0001-0782, pp39–40, ACM Press, January 1965.
Abstract: A method is presented here for simulating logical functions in a digital computer by means of simple arithmetic and control instructions. This method is of practical value when the computer used does not have built-in logical instructions.

Automatic Controlled Precision Calculations, Bruce A. Chartres, Journal of the ACM, Vol. 13 #3, pp386–403, ACM Press, July 1966.
Abstract: Recent developments in computer design and error analysis have made feasible the use of variable precision arithmetic and the preparation of programs that automatically determine their own precision requirements. Such programs enable the user to specify the accuracy he wants, and yield answers guaranteed to lie within the bounds prescribed. A class of such programs, called “contracting error programs”, is defined in which the precision is determined by prescribing error bounds on the data. A variant of interval arithmetic is defined which enables a limited class of algorithms to be programmed as contracting error programs. A contracting error program for the solution of simultaneous linear equations is described, demonstrating the application of the idea to a wider class of problems.
Multiple precision floating-point conversion from decimal-to-binary and vice versa, O. G. Mancino, Communications of the ACM, Vol. 9 #5, pp347–348, ACM Press, May 1966.
Abstract: Decimal-to-binary and binary-to-decimal floating-point conversion is often performed by using a table of the powers 10^i (i a positive integer) for converting from base 10 to base 2, and by using a table of the coefficients of a polynomial approximation of 10^x (0 ≤ x < 1) for converting from base 2 to base 10. These tables occupy a large storage region in the case of a nonsingle precision conversion. This paper shows that a single small table suffices for a floating-point conversion from decimal to binary, and vice versa, in any useful precision.

27 Bits Are Not Enough for 8-Digit Accuracy, I. Bennett Goldberg, Communications of the ACM, Vol. 10 #2, pp105–106, ACM Press, February 1967.
Abstract: From the inequality 10^8 < 2^27, we are likely to conclude that we can represent 8-digit decimal floating-point numbers accurately by 27-bit [binary] floating-point numbers. However, we need 28 significant bits to represent some 8-digit numbers accurately. In general, we can show that if 10^p < 2^(q-1), then q significant bits are always enough for p-digit decimal accuracy. Finally, we can define a compact 27-bit floating-point representation that will give 28 significant bits, for numbers of practical importance.
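As a rough illustration of the claim in the abstract above, the following Python sketch sends one 8-digit decimal through a q-bit binary significand and back, using exact rational arithmetic. The function names round_sig and round_trip and the sample value 8.0000003 are illustrative choices, not taken from the paper, and round-to-nearest is assumed for both conversions.

```python
from fractions import Fraction


def round_sig(x, base, n):
    """Round a positive Fraction x to n significant digits in 'base'."""
    e = 0
    # Find e such that base**e <= x < base**(e + 1).
    while Fraction(base) ** (e + 1) <= x:
        e += 1
    while Fraction(base) ** e > x:
        e -= 1
    ulp = Fraction(base) ** (e - (n - 1))  # spacing of representable values
    return round(x / ulp) * ulp


def round_trip(x, p, q):
    """Decimal (p digits) -> binary (q bits) -> decimal (p digits)."""
    return round_sig(round_sig(x, 2, q), 10, p)


x = Fraction(80000003, 10 ** 7)      # the 8-digit decimal 8.0000003
print(10 ** 8 < 2 ** 27)             # the tempting inequality holds
print(round_trip(x, 8, 27) == x)     # yet 27 bits lose this value
print(round_trip(x, 8, 28) == x)     # 28 bits recover it
```

Under these assumptions 8.0000003 comes back as 8.0000004 with 27 bits. In the interval from 8 to 10 the 27-bit binary spacing is 2^-23 (about 1.19e-7), which is wider than the 8-digit decimal spacing of 1e-7, so some neighbouring decimals must collide.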
Chapt. 1.4 Computer Characteristics Table, Melvin Klerer et al, Digital Computer User's Handbook, 67pp, McGraw-Hill, NY, 1967.
Abstract: Section I: General-purpose Solid-state Computers Manufactured in the United States and Designed for a Wide Variety of Business and Scientific Applications
   Section II: Systems Manufactured in the United States with General-purpose Capabilities but Used Principally in Process Control, Message Switching, and Other Specialized Applications
   Section III: General-purpose Computers Manufactured in Countries Other Than the United States
   Section IV: Vacuum-tube Computers No Longer Manufactured but Still in Use
   Section V: Chronological Listing of Vacuum-tube and Solid-state Computers Manufactured in the United States and Installed between 1951 and 1965

Generating prime implicants via ternary encoding and decimal arithmetic, D. L. Dietmeyer and J. R. Duley, Communications of the ACM, Vol. 11 #7, ISSN 0001-0782, pp520–523, ACM Press, July 1968.
Abstract: Decimal arithmetic, ternary encoding of cubes, and topological considerations are used in an algorithm to obtain the extremals and prime implicants of Boolean functions. The algorithm, which has been programmed in the FORTRAN language, generally requires less memory than other minimization procedures, and treats DON’T CARE terms in an efficient manner.
In-and-out conversions, David Matula, Communications of the ACM, Vol. 11 #1, pp47–60, ACM Press, January 1968.
Abstract: By an in-and-out conversion we mean that a floating-point number in one base is converted into a floating-point number in another base and then converted back to a floating-point number in the original base. For all combinations of rounding and truncation conversions the question is considered of how many significant digits are needed in the intermediate base to allow such in-and-out conversions to return the original number (when possible), or at least to cause a difference of no more than a unit in the least significant digit.
An electronic digital slide rule, Hermann Schmid and David Busch, The Electronic Engineer, pp54–64, July 1968.
Abstract: The Electronic Digital Slide Rule (EDSR) of the future not only will be smaller and easier to operate than the conventional slide rule, but it will also be more accurate.
High Speed Binary to Decimal Conversion, M. S. Schmookler, IEEE Transactions on Computers, Vol. C-17, pp506–508, IEEE, 1968.
Abstract: This note describes several methods of performing fast, efficient, binary-to-decimal conversion. With a modest amount of circuitry, an order of magnitude speed improvement can be obtained. This achievement offers a unique advantage to general-purpose computers requiring special hardware to translate between binary and decimal numbering systems.

The Choice of Base, W. S. Brown and P. L. Richman, Communications of the ACM, Vol. 12 #10, pp560–561, ACM Press, October 1969.
Abstract: A digital computer is considered, whose memory words are composed of N r-state devices plus two sign bits (two state devices). The choice of base b for the internal representation of floating-point numbers on such a computer is discussed. It is shown that in a certain sense b = r is best.
Decimal Floating Point Processor, K. A. Duke, IBM Technical Disclosure Bulletin, 11-69, pp862–862, IBM, November 1969.
Abstract: A numerical processor can be built which operates on floating-point numbers where the mantissa is an integer and the characteristic represents a power of 10 by which that integer must be multiplied. Thus, decimal numbers can be represented exactly without conversion errors. Such floating point numbers are expressed as N = (-1)^S x 10^X x I where S = sign bit, X = exponent, and I = integer.
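A minimal Python sketch of the representation in the abstract above, N = (-1)^S x 10^X x I, using exact rationals to show why a decimal fraction such as 0.1 is held without conversion error. The function name decimal_fp_value is hypothetical and chosen only for illustration; nothing here reflects the hardware described in the bulletin.

```python
from fractions import Fraction


def decimal_fp_value(sign, exponent, integer):
    """Exact value of (-1)**sign * 10**exponent * integer."""
    return (-1) ** sign * Fraction(10) ** exponent * integer


# 0.1 as sign 0, exponent -1, integer 1: exactly one tenth.
tenth = decimal_fp_value(0, -1, 1)
print(tenth == Fraction(1, 10))

# The nearest binary double to 0.1 is not exactly one tenth.
print(Fraction(0.1) == Fraction(1, 10))
```

The first comparison prints True and the second prints False, which is the conversion error that an integer significand scaled by a power of ten avoids.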
Electronic Computers: A Historical Survey, Saul Rosen, ACM Computing Surveys (CSUR), Vol. 1 #1, ISSN 0360-0300, pp7–36, ACM Press, March 1969.
Abstract: The first large scale electronic computers were built in connection with university projects sponsored by government military and research organizations. Many established companies, as well as new companies, entered the computer field during the first generation, 1947-1959, in which the vacuum tube was almost universally used as the active component in the implementation of computer logic. The second generation was characterized by the transistorized computers that began to appear in 1959. Some of the computers built then and since are considered super computers; they attempt to go to the limit of current technology in terms of size, speed, and logical complexity. From 1965 onward, most new computers belong to a third generation, which features integrated circuit technology and multiprocessor multiprogramming systems.
Decimal Adder with Signed Digit Arithmetic, Antonin Svoboda, IEEE Transactions on Computers, Vol. 18 #3, pp212–215, IEEE, March 1969.
Abstract: The decimal adder with signed digit arithmetic presented here was designed to establish the following facts: the redundant representation of a decimal digit x_i by a 5-bit binary number X_i = 3x_i leads to a logical design of extreme simplicity; it is possible to form an additional algorithm for the adder so that it can be used to transform numbers written in a conventional decimal form into a signed digit form, and vice versa.

Another method of converting from hexadecimal to decimal, M. V. Kailas, Communications of the ACM, Vol. 13 #3, pp193–193, ACM Press, March 1970.
Abstract: There is a simple paper-and-pencil method of converting a hexadecimal number N to decimal.
A Formalization of Floating-Point Numeric Base Conversion, David W. Matula, IEEE Transactions on Computers, Vol. C-19 #8, pp681–692, IEEE, August 1970.
Abstract: The process of converting arbitrary real numbers into a floating-point format is formalized as a mapping of the reals into a specified subset of real numbers. The structure of this subset, the set of n significant digit base b floating-point numbers, is analyzed and properties of conversion mappings are determined. For a restricted conversion mapping of the n significant digit base b numbers to the m significant-digit base d numbers, the one-to-one, onto, and order-preserving properties of the mapping are summarized. Multiple conversions consisting of a composition of individual conversion mappings are investigated and some results of the invariant points of such compound conversions are presented. The hardware and software implications of these results with regard to establishing goals and standards for floating-point formats and conversion procedures are considered.
Experimental Computer for Schools, D. M. Taub, C. E. Owen, and B. P. Day, Proceedings of the IEE, Vol. 117 #2, pp303–312, IEE, February 1970.
Abstract: The computer is a small desk-top machine designed for teaching schoolchildren how computers work. It works in decimal notation and has a powerful instruction set which includes 3-address floating-point instructions implemented as ‘extracode’ subroutines. Addressing can be absolute, relative or indirect. For input it uses a capacitive touch keyboard, and for output and display a perfectly normal TV receiver is used. Another input/output device is an ordinary domestic tape recorder, used mainly for long term storage of programs. To make the operation of the machine easy to follow, it can be made to stop at certain stages in the processing of an instruction and automatically display the contents of all registers and storage locations relevant at that time. The paper gives a description of the machine and a discussion of the factors that have influenced its design.

The Gamma 60: The computer that was ahead of its time, M. Bataille, Honeywell Computer Journal Vol. 5 #3, pp99–105, Honeywell, 1971.
Abstract: Prior to 1960 the Compagnie des Machines Bull (now Honeywell Bull) delivered the first large computer system with an architecture designed for multiprogramming. Many unique features of the Gamma 60 were forerunners of present system architecture concepts. This article revisits these concepts.
Decimal Number Compression, Tien Chi Chen, Internal IBM memo to Dr. Irving T. Ho, 4pp, IBM, 29 March 1971.
Abstract: The fact that four bits can represent 16 different states, but a decimal digit exploits only 10 of them (0-9) has been a valid criticism against decimal arithmetic.

On the other hand, it is well known that a number with several decimal digits can be reexpressed into binary, leading to a 20% gain in the number of bits used. Examples are, two decimal digits (8 bits) reexpressed as a seven-bit number and three decimal digits (twelve bits) reexpressed as a ten-bit number. ...

High speed decimal addition, Martin S. Schmookler and Arnold Weinberger, IEEE Transactions on Computers, Vol. C-20 #8, pp862–867, IEEE, August 1971.
Abstract: Parallel decimal arithmetic capability is becoming increasingly attractive with new applications of computers in a multiprogramming environment. The direct production of decimal sums offers a significant improvement in addition over methods requiring decimal correction. These techniques are illustrated in the eight-digit adder which appears in the System/360 Model 195.

A Binary Representation for Decimal Numbers, Peter M. Fenwick, Australian Computer Journal, Vol. 4 #4 (now Journal of Research and Practice in Information Technology), pp146–149, Australian Computer Society Inc., November 1972.
Abstract: A number system is described which combines the programming convenience of decimal numbers with the hardware advantages of binary arithmetic. The number format resembles some integer floating-point formats, except that the exponent is associated with a base of 10, rather than some power of 2. It is shown that arithmetic in the new representation is little more difficult than for ordinary floating-point numbers, and methods are given for implementing the “decimal” shifts which are a consequence of the exponent base.
Zoned Decimal Arithmetic, J. W. Franklin, IBM Technical Disclosure Bulletin, 12-72, pp2123–2124, IBM, December 1972.
Abstract: A means is described for performing arithmetic on zoned decimal data that does not require additional storage space for the intermediate result, and which preserves both operands until it is determined that the operation has been performed correctly and successfully.
On conventions for systems of numerical representation, Peter M. Neely, Proceedings of the ACM annual conference, Boston, Massachusetts, pp644–651, ACM Press, 1972.
Abstract: Present conventions for numeric representation are considered inadequate to serve the needs of applied computing. Thus an augmented digital number system is proposed for use in programming languages and in digital computers. Special symbols are proposed for numbers too large, too small or too close to zero to be represented in the normal digital number system, or which are undefined. Properties of mappings among and between digital number systems are used to justify the augments chosen. Finally a suggestion is made for a new floating point word format that will serve all the above needs and will greatly extend the exponent range of floating point numbers.

On the Precision Attainable with Various Floating-point Number Systems, Richard P. Brent, IEEE Transactions on Computers, Vol. C-22 #6, pp601–607, IEEE, June 1973.
Abstract: For scientific computations on a digital computer the set of real numbers is usually approximated by a finite set F of “floating-point” numbers. We compare the numerical accuracy possible with different choices of F having approximately the same range and requiring the same word length. In particular, we compare different choices of base (or radix) in the usual floating-point systems. The emphasis is on the choice of F, not on the details of the number representation or the arithmetic, but both rounded and truncated arithmetic are considered. Theoretical results are given, and some simulations of typical floating-point computations (forming sums, solving systems of linear equations, finding eigenvalues) are described. If the leading fraction bit of a normalized base 2 number is not stored explicitly (saving a bit), and the criterion is to minimize the mean square roundoff error, then base 2 is best. If unnormalized numbers are allowed, so the first bit must be stored explicitly, then base 4 (or sometimes base 8) is the best of the usual systems.
A Combinatoric Division Algorithm for Fixed-Integer Divisors, David H. Jacobsohn, IEEE Transactions on Computers, Vol. C-22 #6, pp608–610, IEEE, June 1973.
Abstract: A procedure is presented for performing a combinatoric fixed-integer division that satisfies the division algorithm in regard to both quotient and remainder. In this procedure, division is performed by multiplying the dividend by the reciprocal of the divisor. The reciprocal is, in all nontrivial cases, of necessity a repeating binary fraction, and two treatments for finding the product of an integer and repeating binary fraction are developed. Two examples of the application of the procedure are given.
Variable-Precision Exponentiation, P. L. Richman, Communications of the ACM, Vol. 16 #1, pp38–40, ACM Press, January 1973.
Abstract: A previous paper presented an efficient algorithm, called the Recomputation Algorithm, for evaluating a rational expression to within any desired tolerance on a computer which performs variable-precision arithmetic operations. The Recomputation Algorithm can be applied to expressions involving any variable-precision operations having O(10^-p + Σ|e_i|) error bounds, where p denotes the operation’s precision and e_i denotes the error in the operation’s ith argument. This paper presents an efficient variable-precision exponential operation with an error bound of the above order. Other operations, such as log, sin, and cos, which have simple series expansions, can be handled similarly.

Fast B. C. D. Multiplier, Dharma P. Agrawal, Electronics Letters, Vol. 10 #12, pp237–238, IEE, 13 June 1974.
Abstract: A fast b.c.d multiplier is proposed, based on obtaining the product of a 1-digit multiplicand and a 1-digit multiplier in a single row of adders. For high-speed operation, the carry-save technique, universally adopted for binary multipliers, is used.
Some error correcting codes for certain transposition and transcription errors in decimal integers, D. A. H. Brown, The Computer Journal, Vol. 17 #1, pp9–12, OUP, February 1974.
Abstract: The standard theory of modulus 11 cyclic block error-correcting codes is applied to numbers expressed in the decimal system. An algorithm for error correction is given.
Biquinary decimal error detection codes with one, two and three check digits, D. A. H. Brown, The Computer Journal, Vol. 17 #3, pp201–204, OUP, August 1974.
Abstract: The biquinary system of representing the decimal integers 0 to 9 is combined with polynomial coding to produce true decimal codes having any required number of check digits added to an integer of any length.
Decimal Computation, Hermann Schmid, ISBN 047176180X, 266pp, Wiley, 1974.
Abstract: This book is thus a collection, a catalog, and a review of BCD computation techniques. The book describes how each of the most common arithmetic and transcendental operations can be implemented in a variety of ways. ... covers ... A review of number systems, BCD codes, of early calculating instruments and electronic calculating machines ... An outline of BCD computing circuit applications in the automotive, consumer, education, and entertainment fields, illustrated with some specific examples ... Mathematical developments of the algorithms ... Discussions and comparisons of circuit complexity and performance (accuracy, resolution, and speed of operation) for the different algorithms ...
Note: Reprinted 1983, ISBN 0-89874-318-4, Robert E. Krieger Publishing Co.
Serial Binary Division by Ten, R. L. Sites, IEEE Transactions on Computers, Vol. 23 #12, ISSN 0018-9340, pp1299–1301, IEEE, December 1974.
Abstract: A technique is presented for dividing a positive binary integer by ten, in which the bits of the input are presented serially, low-order bit first. A complete division by ten is performed in two word times (comparable to the time needed for two serial additions). The technique can be useful in serial conversions from binary to decimal, or in scaling binary numbers by powers of 10.

Storage-Efficient Representation of Decimal Data, Tien Chi Chen and Irving T. Ho, CACM Vol. 18 #2, pp49–52, ACM Press, January 1975.
Abstract: Usually n decimal digits are represented by 4n bits in computers. Actually, two BCD digits can be compressed optimally and reversibly into 7 bits, and three digits into 10 bits, by a very simple algorithm based on the fixed-length combination of two variable field-length encodings. In over half of the cases the compressed code results from the conventional BCD code by simple removal of redundant 0 bits. A long decimal message can be subdivided into three-digit blocks, and separately compressed; the result differs from the asymptotic minimum length by only 0.34 percent. The hardware requirement is small, and the mappings can be done manually.
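The density figures quoted in the abstract above can be checked with a little arithmetic. The following is a minimal sketch, not the Chen-Ho encoding itself: it only compares the bits used per block against the information-theoretic minimum of log2(10) bits per digit. The function name `overhead_percent` is a hypothetical helper introduced for illustration.

```python
import math


def overhead_percent(digits: int, bits: int) -> float:
    """Percentage by which `bits` exceeds the minimum needed for `digits` decimal digits.

    The minimum is digits * log2(10) bits (about 3.32 bits per digit).
    """
    ideal_bits = digits * math.log2(10)
    return (bits / ideal_bits - 1.0) * 100.0


# (digits, bits) pairs: plain BCD, two digits in 7 bits, three digits in 10 bits
for digits, bits in [(1, 4), (2, 7), (3, 10)]:
    # The packing is only possible if every digit combination has its own code.
    assert 10 ** digits <= 2 ** bits
    print(f"{digits} digit(s) in {bits} bits: "
          f"{overhead_percent(digits, bits):.2f}% above the minimum")
```

Under these assumptions the three-digit, ten-bit block comes out at about 0.34 percent above the minimum, matching the abstract, while plain BCD is roughly 20 percent above it.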
A quantitative measure of precision, G. Hunter, The Computer Journal, Volume 18, Issue 3, pp231–233, OUP, August 1975.
Abstract: The precision zb of a real number is defined quantitatively in terms of the fractional error in the number, and the base of the arithmetic in which it is represented. The definition is an extension of the traditional rough measure of precision as the number of significant digits in the number. In binary arithmetic the integral part of zb is the number of binary digits required to store the number. Conversion of the precision from one base to another (such as binary/decimal) is discussed, and applied to consideration of the intrinsic precision of input/output routines and floating point arithmetic.
Programmer-controlled roundoff and the selection of a stable roundoff rule, R. A. Keir, Conf. Rec. 3rd Symp. Comp. Arithmetic CH1017-3C, pp73–76, IEEE Computer Society, 1975.
Abstract: The author suggests that every computer with floating-point addition and subtraction should have PSW controllable roundoff facilities. Yohe’s catalog should be included. There should also be a stable roundoff mode using the round-to-off [-odd] or round-to-even rule based on whether the radix is divisible by four or only by two.
Compatible number representations, R. A. Keir, Conf. Rec. 3rd Symp. Comp. Arithmetic CH1017-3C, pp82–87, IEEE Computer Society, 1975.
Abstract: A compatible number system for mixed fixed-point and floating-point arithmetic is described in terms of number formats and opcode sequences (for hardwired or microcoded control). This inexpensive system can be as fast as fixed-point arithmetic on integers, is faster than normalized arithmetic in floating point, gets answers identical to those of normalized arithmetic, and automatically satisfies the Algol-60 mixed-mode rules. The central concept is the avoidance of meaningless “normalization” following arithmetic operations. Adoption of this system could lead to simpler compilers.
Should the stable rounding rule be radix-dependent?, Roy A. Keir, Information Processing Letters, Vol. 3 #6, pp188–189, Elsevier, July 1975.
Abstract: (None.)
Calculator Algorithms, Don Senzig, IEEE Compcon Reader Digest, IEEE Catalog No. 75 CH 0920-9C, pp139–141, IEEE, Spring 1975.
Abstract: This paper discusses algorithms for generating the trigonometric, exponential, and hyperbolic functions and their inverses. No invention is claimed here. The algorithm for logarithm was used by Briggs in compiling his table of logarithms in the 1600’s. Other earlier references are (cited). The development presented here is, perhaps, more direct than those given in the above references but leads to the same result.
Comments on a Paper by T. C. Chen and I. T. Ho, Alan Jay Smith, CACM Vol. 18 #8, pp463–463, ACM Press, August 1975.
Abstract: (None.)
Addition in an Arbitrary Base Without Radix Conversion, Stephen Soule, Communications of the ACM Vol. 18 #6, pp344–346, ACM Press, June 1975.
Abstract: This paper presents a generalization of an old programming technique; using it, one may add and subtract numbers represented in any radix, including a mixed radix, and stored one digit per byte in bytes of sufficient size. Radix conversion is unnecessary, no looping is required, and numbers may even be stored in a display (I/O) format. Applications to Cobol, MIX, and hexadecimal sums are discussed.

ANSI X3.53-1976: American National Standard – Programming Language PL/I, J. F. Auwaerter, 421pp, ANSI, 1976.
Abstract: This document defines American National Standard Programming Language PL/I and specifies both the form and interpretation of computer programs written in PL/I. The standard is intended to provide a high degree of machine independence and thereby facilitate program exchange among a variety of computing systems. The document serves as an authoritative reference rather than as a tutorial exposition. The language is defined by specifying a conceptual PL/I machine which translates and interprets intended PL/I programs. The relationship between an actual implementation of PL/I and the conceptual machine presented in this document is also given. This reference document was developed jointly under the auspices of the American National Standards Institute and the European Computer Manufacturers Association.
Note: Reaffirmed 1998.
Fast multiple-precision evaluation of elementary functions, Richard P. Brent, Journal of the ACM, Vol. 23 #2, pp242–251, ACM Press, April 1976.
Abstract: Let f(x) be one of the usual elementary functions (exp, log, artan, sin, cosh, etc.), and let M(n) be the number of single-precision operations required to multiply n-bit integers. It is shown that f(x) can be evaluated, with relative error O(2^-n), in O(M(n)log(n)) operations, for any floating-point number x (with an n-bit fraction) in a suitable finite interval. From the Schönhage-Strassen bound on M(n), it follows that an n-bit approximation to f(x) may be evaluated in O(n(log(n))^2 loglog(n)) operations. Special cases include the evaluation of constants such as pi, e, and e^pi. The algorithms depend on the theory of elliptic integrals, using the arithmetic-geometric
A Unified Decimal Floating-Point Architecture for the Support of High-Level Languages, Frederic N. Ris, ACM SIGNUM Newsletter, Vol. 11 #3, pp18–23, ACM Press, October 1976.
Abstract: This paper summarizes a proposal for a decimal floating-point arithmetic interface for the support of high-level languages, consisting both of the arithmetic operations observed by application programs and facilities to produce subroutine libraries accessible from these programs. What is not included here are the detailed motivations, examinations of alternatives, and implementation considerations which will appear in the full work.
Note: Also in ACM SIGARCH Computer Architecture News, Vol 5 #4, pp21-31, October 1976. Also in ACM SIGPLAN Notices, Vol 12 #9, pp60-70, September 1977. Also in IBM RC 6203 (#26651) 11pp, September 1976.

An APL interpreter and system for a small computer, M. Alfonseca, M. L. Tavera, and R. Casajuana, IBM Systems Journal, Vol. 16 #1, pp18–40, IBM, 1977.
Abstract: The design and implementation of an experimental APL system on the small, sensor-based System/7 is described. Emphasis is placed on the solution to the problem of fitting a full APL system into a small computer.
   The system has been extended through an I/O auxiliary processor to make it possible to use APL in the management and control of the System/7 sensor-based I/O operations.
An instruction timing model of CPU performance, Bernard L. Peuto and Leonard J. Shustek, Proceedings of the 4th annual symposium on Computer architecture, pp165–178, ACM Press, 1977.
Abstract: A model of high-performance computers is derived from instruction timing formulas, with compensation for pipeline and cache memory effects. The model is used to predict the performance of the IBM 370/168 and the Amdahl 470 V/6 on specific programs, and the results are verified by comparison with actual performance. Data collected about program behavior is combined with the performance analysis to highlight some of the problems with high-performance implementations of such architectures.
A New Representation for Decimal Numbers, C. K. Yuen, IEEE Transactions on Computers, Vol. 26 #12, pp1286–1288, IEEE, December 1977.
Abstract: A new representation for decimal numbers is proposed. It uses a mixture of positive and negative radixes to ensure that the maximum value of a four bit decimal digit is 9. This eliminates the more complex carry generation process required in BCD addition.

Desirable Floating-Point Arithmetic and Elementary Functions for Numerical Computation, T. E. Hull, ACM Signum Newsletter, Vol. 14 #1 (Proceedings of the SIGNUM Conference on the Programming Environment for Development of Numerical Software), pp96–99, ACM Press, 1978.
Abstract: The purpose of this talk is to summarize proposed specifications for floating-point arithmetic and elementary functions. The topics considered are: the base of the number system, precision control, number representation, arithmetic operations, other basic operations, elementary functions, and exception handling. The possibility of doing without fixed-point arithmetic is also mentioned. The specifications are intended to be entirely at the level of a programming language such as Fortran. The emphasis is on convenience and simplicity from the user’s point of view. Conforming to such specifications would have obvious beneficial implications for the portability of numerical software, and for proving programs correct, as well as attempting to provide facilities which are most suitable for the user. The specifications are not complete in every detail, but it is intended that they be complete “in spirit” – some further details, especially syntactic details, would have to be provided, but the proposals are otherwise relatively complete.
Note: Also in Proceedings of the IEEE 4th Symposium on Computer Arithmetic pp63-69.
Error-Correcting Codes in Binary-Coded-Decimal Arithmetic, Chao-Kai Liu and Tse Lin Wang, IEEE Transactions on Computers, Vol. 27 #11, pp977–984, IEEE, November 1978.
Abstract: Error-correcting coding schemes devised for binary arithmetic are not in general applicable to BCD arithmetic. In this paper, we investigate the new problem of using such coding schemes in BCD systems. We first discuss the general characteristics of arithmetic errors and define the arithmetic weight and distance in BCD systems. We show that the distance is a metric function. Number theory is used to construct a class of single-error-correcting codes for BCD arithmetic. It is shown that the generator of these codes possesses a very simple form and the structure of these codes can be analytically determined.
Two Methods for Fast Integer Binary-BCD Conversion, F. A. Schreiber and R. Stefanelli, Proc. 4th Symposium on Computer Arithmetic, pp200–207, IEEE Press, October 1978.
Abstract: Two methods for performing binary-BCD conversion of positive integers are discussed. The principle which underlies both methods is the repeated division by five and then by two, obtained the first by means of subtractions performed from left to right, the second by shifting bits before next subtraction.
    It is shown that these methods work in a time which is linear with the length in bits of the number to be converted.
    A ROM solution is proposed and its complexity is compared with that of other methods.

FOCUS Microcomputer Number System, Albert D. Edgar and Samuel C. Lee, Communications of the ACM Vol. 22 #3, pp166–177, ACM Press, March 1979.
Abstract: FOCUS is a number system and supporting computational algorithms especially useful for microcomputer control and other signal processing applications. FOCUS has the wide-ranging character of floating-point numbers with a uniformity of state distributions that give FOCUS better than a twofold accuracy advantage over an equal word length floating-point system. FOCUS computations are typically five times faster than single precision fixed-point or integer arithmetic for a mixture of operations, comparable in speed with hardware arithmetic for many applications. Algorithms for 8-bit and 16-bit implementations of FOCUS are included.
Principles and Preferences for Computer Arithmetic, Christian H. Reinsch, ACM SIGNUM Vol. 14 #1, pp12–27, ACM Press, March 1979.
Abstract: This working paper arose out of discussions on desirable hardware features for numerical calculation in the IFIP Working Group 2.5 on Numerical Software. It reflects the views of all members of the group, although no formal vote of approval has been taken; it is not an official IFIP document. Many people contributed ideas to this paper, especially T. J. Dekker, C. W. Gear, T. E. Hull, J. R. Rice, and J. L. Schonfelder.

Software Manual for the Elementary Functions, W. J. Cody and W. Waite, ISBN 0-13-822064-6, 269pp, Prentice-Hall, 1980.
Decimal to Binary Floating Point Number Conversion Mechanism, J. W. Havender, IBM Technical Disclosure Bulletin, 07-80, pp706–708, IBM, July 1980.
Abstract: Floating point numbers may be converted from decimal to binary using a high speed natural logarithm and exponential function calculation mechanism and a fixed point divide/multiply unit.
    The problem solved is to convert numbers expressed in a radix 10 floating point form to numbers expressed in a radix 2 floating point form.
Principles, Preferences and Ideals for Computer Arithmetic, Thomas E. Hull, Christian H. Reinsch, and John R. Rice, CSD-TR-339, 13pp, Dept. of Computer Science, Purdue University, June 1980.
Abstract: This paper presents principles and preferences for the implementation of computer arithmetic and ideals for the arithmetic facilities in future programming languages. The implementation principles and preferences are for the current approaches to the design of arithmetic units. The ideals are for the long term development of programming languages, with the hope that arithmetic units will be built to support the requirements of programming languages.
Decimal Shifting for an Exact Floating Point Representation, J. D. Johannes, C. Dennis Pegden, and F. E. Petry, Computers and Electrical Engineering, Vol. 7 #3, pp149–155, Elsevier, September 1980.
Abstract: A floating point representation which permits exact conversion of decimal numbers is discussed. This requires the exponent to represent a power of ten, and thus decimal shifts of the mantissa are needed. A specialized design is analyzed for the problem of division by ten, which is needed for decimal shifting.
IBM 4341 hardware/microcode trade-off decisions, James R. Kleinsteiber, MICRO 13: Proceedings of the 13th annual workshop on Microprogramming, pp190–192, ACM Press, December 1980.
Abstract: The design of IBM’s 4341 Processor, as with other processors, involved many cost/performance tradeoffs. The designer is continually under pressure to increase processor speed without increasing cost or to decrease processor cost without decreasing performance. This paper will examine some of the engineering decisions that were made in the attempt to make the 4341 a high-performing yet low cost processor. These decisions include searching for, or developing, algorithms that make the best use of hardware properties, such as data path width, arithmetic/logical operations and special functions. Functions were sought such that a small amount of added hardware would go a long way towards improving system performance. Hardware designers, microcoders and performance analysis people worked together to implement instructions, functions and algorithms with the proper mixture of hardware functions and microcode in order to build a viable processor. Some specific functions will be covered to examine a few of the decisions. The TEST UNDER MASK performance problem will be discussed with its resulting implementation decision. The method of using EXCLUSIVE OR to clear storage and the resulting algorithm design will be shown. Other topics to be discussed include multiple hardware functions and the resulting effect on floating point, fixed point and decimal multiply; the divide function and its effect on floating point and fixed point divide; and the effect of an 8-byte data path for decimal arithmetic.
Note: Also published in December 1980 SIGMICRO Newsletter Volume 11 Issue 3-4

MP User's Guide (Fourth Edition), Richard P. Brent, 73pp, Dept. Computer Science, Australian National University, Canberra, TR-CS-81-08, June 1981.
Abstract: MP is a multiple-precision floating-point arithmetic package. It is almost completely machine-independent, and should run on any machine with an ANSI Standard Fortran (ANS X3.9-1966) compiler, sufficient memory, and a wordlength (for integer arithmetic) of at least 16 bits. A precompiler (Augment) which facilitates the use of the MP package is available. ...
    MP works with normalized floating-point numbers. The base (B) and number of digits (T) are arbitrary, subject to some restrictions given below, and may be varied dynamically. ...
Method of Adding Decimal Numbers by Means of Binary Arithmetic, G. Chroust, IBM Technical Disclosure Bulletin, 03-81, pp4525–4526, IBM, March 1981.
Abstract: The simulation of decimal arithmetic on a machine without packed arithmetic necessitates a method for simulating decimal addition by binary arithmetic.
    Decimal addition simulation is effected by simultaneously applying the following steps to as many digits (d1, d2, .., dn) of the decimal number as fit into the (binary = bin) word length of the object machine. 1. (Binary) addition of the two operands, 2. adding a `6’ in each digit position (this generates the correct carry), and 3. subtracting a `6’ in those places from which no carry resulted.
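The three steps in the abstract above can be sketched on packed BCD values (one decimal digit per 4-bit nibble). This is an illustrative sketch under stated assumptions, not the code from the disclosure: the names `bcd_add`, `to_bcd` and `from_bcd` are hypothetical, carries are detected with an exclusive-or trick chosen here for convenience, and Python's unbounded integers stand in for the machine word so that a carry out of the top digit is simply kept.

```python
def to_bcd(n: int) -> int:
    """Pack a non-negative integer as BCD: decimal 1234 becomes 0x1234."""
    return int(str(n), 16)


def from_bcd(x: int) -> int:
    """Unpack a BCD value: 0x1234 becomes decimal 1234."""
    return int(format(x, "x"))


def bcd_add(a: int, b: int, ndigits: int = 8) -> int:
    """Add two packed-BCD operands of up to `ndigits` digits each."""
    sixes = int("6" * ndigits, 16)   # a 6 in every digit position
    ones = int("1" * ndigits, 16)    # a 1 in every digit position
    biased = a + sixes               # step 2: no nibble overflows, since 9 + 6 = 15
    total = biased + b               # step 1: ordinary binary addition
    # Bit i of this value is the carry that entered bit i during the addition.
    carries_in = biased ^ b ^ total
    # A carry out of digit k is the carry into bit 4 * (k + 1).
    carry_out = (carries_in >> 4) & ones
    no_carry = ~carry_out & ones
    # Step 3: take the 6 back out of every digit that produced no carry.
    return total - no_carry * 6


print(from_bcd(bcd_add(to_bcd(1234), to_bcd(5678))))   # 1234 + 5678
```

Digits that produced a carry keep the added 6, which is exactly the adjustment needed to wrap a sum of 10 to 19 back into a single decimal digit.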
Binary to Decimal Conversion, L. K. Griffiths, IBM Technical Disclosure Bulletin, 06-81, pp237–238, IBM, June 1981.
Abstract: Binary to decimal conversion can be achieved by multiplying by 1/10 expressed as 51/512 x 256/255 and using the fact that 256/255 = 1 + 1/256 + 1/256^2 + ..., i.e., 256/255 = 257/256 rounded up.
    This method can be performed efficiently on short word computers with only adding and shifting operations, i.e., first multiplying by 51/512 and then correcting by multiplying by 256/255.
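A minimal sketch of the idea in the abstract above, using only shifts and adds. The function name `divmod10`, the restriction to 16-bit inputs, and the final one-step correction (used here in place of the rounding up that the abstract mentions) are assumptions made for this illustration rather than details taken from the disclosure.

```python
def divmod10(n: int) -> tuple:
    """Divide a 16-bit unsigned integer by ten using shifts and adds only."""
    if not 0 <= n < (1 << 16):
        raise ValueError("this sketch is only checked for 16-bit inputs")
    t = (n << 5) + (n << 4) + (n << 1) + n   # n * 51
    t += t >> 8                              # multiply by (1 + 1/256)
    t += t >> 16                             # multiply by (1 + 1/256**2); together about 256/255
    q = t >> 9                               # divide by 512
    r = n - ((q << 3) + (q << 1))            # remainder: n - 10 * q
    if r >= 10:                              # truncation can leave q one too small
        q += 1
        r -= 10
    return q, r


def to_decimal_string(n: int) -> str:
    """Binary to decimal conversion by repeated division by ten."""
    digits = []
    while True:
        n, r = divmod10(n)
        digits.append(chr(ord("0") + r))
        if n == 0:
            return "".join(reversed(digits))


print(to_decimal_string(65535))
```

The product 51/512 x 256/255 equals exactly 1/10, so the only error comes from the truncating shifts; for 16-bit inputs that error is small enough for a single correction step to recover the exact quotient.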
The Universal History of Numbers, Georges Ifrah, ISBN 1-86046-324-X, 633pp, The Harvill Press Ltd., 1994.
Abstract: More than a history of counting and calculating from the caveman to the late twentieth century, this is the story of how the human race has learnt to think logically. The reader is taken through the whole art and science of numeration as it has developed all over the world, from Europe to China, via the Classical World, Mesopotamia, South America, and, above all, India and the Arab lands. ...
Note: Translated from the French by David Bellos, E. F. Harding, Sophie Wood, and Ian Monk.
    (Also published is a translation of an earlier edition – From One to Zero: A Universal History of Numbers. Translated by Lowell Bair. Viking, New York, 1985.)

Representational error in binary and decimal numbering systems, Paul Johnstone, Proceedings of the 20th annual ACM Southeast Regional Conference, pp85–88, ACM Press, 1982.
Abstract: The representation of a general rational number of the form A/B as a floating point number requires a conversion from the general form to a base specific form. This conversion often results in the generation of infinitely repeating non-zero strings of digits which are truncated to the size of the mantissa resulting in a loss of precision. It is shown that the proportion of repeating versus finite rational numbers specific to a base is exponentially related to the number of unique prime factors of the base. Simulation results are presented which show the relative proportions of finite representations for binary and decimal cases over a range of mantissa sizes.
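The distinction between finite and repeating representations in the abstract above rests on a standard number-theory fact: a fraction in lowest terms has a finite expansion in a base exactly when every prime factor of its denominator divides that base. The sketch below illustrates that fact only; it is not the paper's simulation, it ignores mantissa size, and the function name `terminates` is a hypothetical helper. Denominators are assumed to be positive.

```python
from math import gcd


def terminates(numerator: int, denominator: int, base: int) -> bool:
    """True if numerator/denominator has a finite expansion in `base`."""
    d = denominator // gcd(numerator, denominator)   # reduce to lowest terms
    g = gcd(d, base)
    while g > 1:          # strip out every factor shared with the base
        d //= g
        g = gcd(d, base)
    return d == 1         # nothing left means only the base's primes were present


# Unit fractions 1/2 .. 1/100 that are exactly representable in each base.
binary_count = sum(terminates(1, b, 2) for b in range(2, 101))
decimal_count = sum(terminates(1, b, 10) for b in range(2, 101))
print(binary_count, decimal_count)
```

For example, 1/10 is finite in base 10 but repeats in base 2, because the factor 5 in the denominator does not divide 2.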
Applications of Redundant Number Representations to Decimal Arithmetic, R. Sacks-Davis, The Computer Journal, Vol. 25 #4, pp471–477, November 1982.
Abstract: A decimal arithmetic unit is proposed for both integer and floating-point computations. To achieve comparable speed to a binary arithmetic unit, the decimal unit is based on a redundant number representation. With this representation no loss of compactness is made relative to binary coded decimal (BCD) form. In this paper the hardware required for the implementation of the basic operations of addition, subtraction, multiplication and division are described and the properties of floating-point arithmetic based on a redundant number representation are investigated.

CADAC: A Controlled-Precision Decimal Arithmetic Unit, Marty S. Cohen, T. E. Hull, and V. Carl Hamacher, IEEE Transactions on Computers, Vol. 32 #4, pp370–377, IEEE, April 1983.
Abstract: This paper describes the design of an arithmetic unit called CADAC (clean arithmetic with decimal base and controlled precision). Programming language specifications for carrying out “ideal” floating-point arithmetic are described first. These specifications include detailed requirements for dynamic precision control and exception handling, along with both complex and interval arithmetic at the level of a programming language such as Fortran or PL/I.
    CADAC is an arithmetic unit which performs the four floating-point operations add/subtract/multiply/divide on decimal numbers in such a way as to support all the language requirements efficiently. A three-level pipeline is used to overlap two-digit-at-a-time serial processing of the partial products/remainders. Although the logic design is relatively complex, the performance is efficient, and the advantages gained by implementing programmer-controlled precision directly in the hardware are significant.
Chapter 13 – Internal Data Representations, Hewlett Packard Company, Software Internal Design Specification for the HP-71, Vol. 1 Part #00071-90068, pp13.1–13.17, Hewlett Packard Company, December 1983.
Abstract: This chapter discusses the format in which the HP-71 represents numeric or string data in memory or in the CPU registers.
Note: Manual available from The Museum of HP Calculators.
Mathematics Written in Sand, W. Kahan, Proc. Joint Statistical Mtg. of the American Statistical Association, pp12–26, American Statistical Association, 1983.
Abstract: Simplicity is a Virtue; yet we continue to cram ever more complicated circuits ever more densely into silicon chips, hoping all the while that their internal complexity will promote simplicity of use. This paper exhibits how well that hope has been fulfilled by several inexpensive devices widely used nowadays for numerical computation. One of them is the Hewlett-Packard hp-15C programmable shirtpocket calculator, on which only a few keys need be pressed to perform tasks like these:
    Real and Complex arithmetic, including the elementary transcendental functions and their inverses; Matrix arithmetic including inverse, transpose, determinant, residual, norms, prompted input/output and complex-real conversion; Solve an equation and evaluate an Integral numerically; simple statistics; Γ and combinatorial functions; ...
    For instance, a stroke of its [1/X] key inverts an 8x8 matrix of 10-sig.-dec. numbers in 90 sec.
    This calculator costs under $100 by mail-order. Mathematically dense circuitry is also found in Intel’s 8087 coprocessor chip, currently priced below $200, which has for two years augmented the instruction repertoire of the 8086 and 8088 microcomputer chips to cope with ...
    Three binary floating-point formats 32, 64 and 80 bits wide; three binary integer formats 16, 32 and 64 bits wide; 18-digit BCDecimal integers; rational arithmetic, square root, format conversion and exception handling all in conformity with p754, the proposed IEEE arithmetic standard (see “Computer” Mar. 1, 1981); the kernels of transcendental functions exp, log, tan and arctan; and an internal stack of eight registers each 80 bits wide.
    For instance, the 8087 has been used to invert a 100x100 matrix of 64-bit floating-point numbers in 90 sec. Among the machines that can use this chip are the widely distributed IBM Personal Computers, each containing a socket already wired for an 8087. Several other manufacturers now produce arithmetic engines that, like the 8087, conform to the proposed IEEE arithmetic standard, so software that exploits its refined arithmetic properties should be widespread soon.
    As sophisticated mathematical operations come into use ever more widely, mathematical proficiency appears to rise; in a sense it actually declines. Computations formerly reserved for experts lie now within reach of whoever might benefit from them regardless of how little mathematics he understands; and that little is more likely to have been gleaned from handbooks for calculators and personal computers than from professors. This trend is pronounced among users of financial calculators like the hp-12C. Such trends ought to affect what and how we teach, as well as how we use mathematics, regardless of whether large fast computers, hitherto dedicated mostly to speed, ever catch up with some smaller machines’ progress towards mathematical robustness and convenience.

Beyond Floating Point, C. W. Clenshaw and F. W. J. Olver, Journal of the ACM, Vol. 31 #2, pp319–328, ACM Press, April 1984.
Abstract: A new number system is proposed for computer arithmetic based on iterated exponential functions. The main advantage is to eradicate overflow and underflow, but there are several other advantages and these are described and discussed.
A Proposed Radix- and Word-length-independent Standard for Floating-point Arithmetic, W. J. Cody, J. T. Coonen, D. M. Gay, K. Hanson, D. Hough, W. Kahan, R. Karpinski, J. Palmer, F. N. Ris, and D. Stevenson, IEEE Micro magazine, Vol. 4 #4, pp86–100, IEEE, August 1984.
Abstract: This article places [Draft 1.0 of IEEE 854] before the public for the first time. ... This article also includes material that describes how decisions were reached in preparing the P854 draft and explains how to overcome some of the implementation problems.
Note: Reprinted in ACM SIGNUM, Vol. 20, #1, pp35-51, 1985.
The Design of the REXX Language, M. F. Cowlishaw, IBM Systems Journal, Vol. 23 #4, pp326–335, IBM (Offprint # G321-5228), 1984.
Abstract: One way of classifying computer languages is by two classes: languages needing skilled programmers, and personal languages used by an expanding population of general users. REstructured eXtended eXecutor (REXX) is a flexible personal language designed with particular attention to feedback from its users. It has proved to be effective and easy to use, yet it is sufficiently general and powerful to fulfill the needs of many demanding professional applications. REXX is system and hardware independent, so that it has been possible to integrate it experimentally into several operating systems. Here REXX is used for such purposes as command and macro programming, prototyping, education, and personal programming. This study introduces REXX and describes the basic design principles that were followed in developing it.
Note: First published as IBM Hursley Technical Report TR12.223, October 1983.
A Significance Rule for Multiple-Precision Arithmetic, Christopher B. Jones, ACM Transactions on Mathematical Software (TOMS), Vol. 10 #1, pp97–107, ACM Press, March 1984.
Abstract: Multiple-precision arithmetic overcomes the round-off error incurred in conventional floating-point arithmetic, at the cost of increased processing overhead. Significance arithmetic takes into account the inexactness of the operands of a calculation, but can lead to loss of significant digits after a long series of operations. A new technique is described which alleviates the overhead of multiple-precision arithmetic by allowing nonsignificant digits to be discarded, while limiting the significance loss per operation to a controllable and acceptable rate. The technique is based on storing an inexact number as an interval, using a criterion of significance to determine the precision with which the limits of the interval should be stored. A procedure referred to as a significance rule uses this criterion to remove some of the nonsignificant digits from the limits of an interval prior to storage. A certain number of nonsignificant digits are retained as guard digits. Calculations are performed using exact interval arithmetic and the significance-rule procedure is invoked after each operation to remove superfluous digits. Round-off in the procedure causes a slight increase in the interval width on each operation. This results in a cumulative loss of significance at a rate related to the number of guard digits.

Accurate Arithmetic Results for Decimal Data on Non-Decimal Computers, Winfried Auzinger and H. J. Stetter, Computing, 35, pp141–151, 1985.
Abstract: Recently, techniques have been devised and implemented which permit the computation of smallest enclosing machine number interval for the exact results of a good number of highly composite operations. These exact results refer, however, to the data as they are represented in the computer. This note shows how the conversion of decimal data into non-decimal representations may be joined with the mathematical operation on the data into one high-accuracy algorithm. Such an algorithm is explicitly presented for the solution of systems of linear equations.
Turbo Pascal Version 3.0 Reference Manual, Borland International, ISBN 0-87524-003-8, 386pp, Borland International, April 1985.
Abstract: Turbo Pascal 3 was the first Turbo Pascal version to support the Intel 8087 math co-processor (16-bit PC version). It also included support for Binary Coded Decimal (BCD) math to eliminate round off errors in business applications. Turbo Pascal 3 also allowed you to build larger programs (> 64k bytes) using overlays. The PC version also supported Turtle Graphics, Color, Sound, Window Routines, and more.
Numerical Turing, T. E. Hull, A. Abrham, M. S. Cohen, A. F. X. Curley, C. B. Hall, D. A. Penny, and J. T. M. Sawchuk, SIGNUM Newsletter, Vol. 20 #3, pp26–34, ACM Press, July 1985.
Abstract: Numerical Turing is an extension of the Turing programming language. Turing is a Pascal-like language (with convenient string handling, dynamic arrays, modules, and more general parameter lists) developed at the University of Toronto. Turing has been in use since May, 1983, and is now available on several machines.
    The Numerical Turing extension is especially designed for numerical calculations. The important new features are: (a) clean decimal arithmetic, along with convenient functions for directed roundings and exponent manipulation; (b) complete precision control of variables and operations. ...
Properly Rounded Variable Precision Square Root, T. E. Hull and A. Abrham, ACM Transactions on Mathematical Software, Vol. 11 #3, pp229–237, ACM Press, September 1985.
Abstract: The square root function presented here returns a properly rounded approximation to the square root of its argument, or it raises an error condition if the argument is negative. Properly rounded means rounded to nearest, or to nearest even in case of a tie. It is variable precision in that it is designed to return a p-digit approximation to a p-digit argument, for any p > 0. (Precision p means p decimal digits.) The program and the analysis are valid for all p > 0, but current implementations place some restrictions on p.
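Illustration: the behaviour described in this abstract (a p-digit result, rounded to nearest with ties to even, and an error for a negative argument) can be sketched with Python's standard decimal module, whose sqrt is documented as correctly rounded. This is not the authors' Numerical Turing program; the function name sqrt_p is hypothetical.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN


def sqrt_p(x: Decimal, p: int) -> Decimal:
    """Return the square root of x rounded to p significant decimal digits."""
    if p <= 0:
        raise ValueError("precision p must be positive")
    if x < 0:
        # The abstract specifies an error condition for negative arguments.
        raise ValueError("square root of a negative number")
    # A private context, so the caller's global precision is left untouched.
    ctx = Context(prec=p, rounding=ROUND_HALF_EVEN)
    return ctx.sqrt(x)


if __name__ == "__main__":
    for p in (5, 10, 20):
        print(p, sqrt_p(Decimal(2), p))
```

The precision is a run-time argument rather than a property of a type, which is the sense of "variable precision" in the abstract.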
IEEE 754-1985 IEEE Standard for Binary Floating-Point Arithmetic, David Stevenson et al, 20pp, IEEE, July 1985.
Abstract: This standard defines a family of commercially feasible ways for new systems to perform binary floating-point arithmetic. The issues of retrofitting were not considered.
    It is intended that an implementation of a floating-point system conforming to this standard can be realized entirely in software, entirely in hardware, or in any combination of software and hardware. It is the environment the programmer or user of the system sees that conforms or fails to conform to this standard. Hardware components that require software support to conform shall not be said to conform apart from such software.

Note: Reaffirmed 1991.

Variable Precision Exponential Function, T. E. Hull and A. Abrham, ACM Transactions on Mathematical Software, Vol. 12 #2, pp79–91, ACM Press, June 1986.
Abstract: The exponential function presented here returns a result which differs from e^x by less than one unit in the last place, for any representable value of x which is not too close to values for which e^x would overflow or underflow. (For values of x which are not within this range, an error condition is raised.) It is a “variable precision” function in that it returns a p-digit approximation for a p-digit argument, for any p > 0 (p-digit means p-decimal-digit). The program and analysis are valid for all p > 0, but current implementations place a restriction on p. The program is presented in a Pascal-like programming language called Numerical Turing which has special facilities for scientific computing, including precision control, directed roundings, and built-in functions for getting and setting exponents.
The IBM 650: An Appreciation from the Field, Donald E. Knuth, IEEE Annals of the History of Computing, Vol. 8 #1, pp50–55, IEEE, January-March 1986.
Abstract: I suppose it was natural for a person like me to fall in love with his first computer. But there was something special about the IBM 650, something that has provided the inspiration for much of my life’s work. Somehow this machine was powerful in spite of its severe limitations. Somehow it was friendly in spite of its primitive man-machine interface...

Implementable Decimal Arithmetic Algorithms for Micro/Minicomputers, M. Ahmad, Microprocessing and Microprogramming, Vol. 19 #2, pp119–128, February 1987.
Abstract: The need for efficient decimal arithmetic and its ever increasing applications in micro/minicomputers and microprocessor based equipment and appliances has been emphasised. Some algorithms suitable for implementation for decimal arithmetic operations of BCD packed decimal numbers have been suggested. These algorithms employ comparatively faster instructions available on most of the microprocessors and provide efficient and faster decimal arithmetic.
A Decimal Floating-Point Processor for Optimal Arithmetic, G. Bohlender and T. Teufel, Computer arithmetic: Scientific Computation and Programming Languages, ISBN 3-519-02448-9, pp31–58, B. G. Teubner Stuttgart, 1987.
Abstract: A floating-point processor for optimal arithmetic should perform scalar products with maximum accuracy in addition to the usual operations +, -, *, /. This means that scalar products have to be computed with an error of at most one bit of the least significant digit, even if cancellation of leading digits occurs. In order to avoid conversion errors during input and output of numerical data, the decimal number system should be chosen.
    The arithmetic processor BAP-SC performs these operations in a 64 bit floating-point format with 13 decimal digits in the mantissa. The prototype is built in bit-slice technology on wire-wrap boards. Interfaces have been developed [sic] for several busses and computers.
    The arithmetic processor is fully integrated in the programming language PASCAL-SC. It supports operations in higher numerical spaces and new numerical algorithms that compute verified results with error bounds.
Atari System Reference Manual, section 11, Bob DuHamel, Atari, 1987.
Abstract: The routines which do floating point arithmetic are a part of the operating system ROM. The Atari computer uses the 6502’s decimal math mode. This mode uses numbers represented in packed Binary Coded Decimal (BCD). This means that each byte of a floating point number holds two decimal digits. The actual method of representing a full number is complicated and probably not very important to a programmer. However, for those with the knowledge to use it, the format is given below...
Note: 6 bytes: 10-digit BCD, 7-bit excess-64 exponent.
Math Reference, Hewlett Packard Company, HP-71 Reference Manual, Mfg. # 0071-90110, Reorder # 0071-90010, pp317–318, Hewlett Packard Company, October 1987.
Note: First edition October 1983. Subsections describe the numeric precisions available and the range of representable numbers. Manual available from The Museum of HP Calculators (
The IEEE Proposal for Handling Math Exceptions, Hewlett Packard Company, HP-71 Reference Manual, Mfg. # 0071-90110, Reorder # 0071-90010, pp338–345, Hewlett Packard Company, October 1987.
Abstract: The IEEE Radix Independent Floating-Point Proposal divides all of the floating-point “exceptional events” encountered in calculations into five classes of math exceptions: invalid operation, division by zero, overflow, underflow, and inexact result. Associated with each math exception is a flag that is set by the HP-71 whenever an exception is encountered. These flags remain set until you clear them. Each of these flags can be accessed by its number or its name.
Note: First edition October 1983. Manual available from The Museum of HP Calculators (
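Illustration: the five exception classes and their "sticky" flags, as described in the abstract above, have a close analogue in Python's standard decimal module. The sketch below uses that module, not HP-71 BASIC. The 12-digit precision and the function name sticky_flags_demo are arbitrary choices made for the example.

```python
from decimal import (Context, Decimal, DivisionByZero, Inexact,
                     InvalidOperation, Overflow, Underflow)

FIVE_CLASSES = (InvalidOperation, DivisionByZero, Overflow, Underflow, Inexact)


def snapshot(ctx: Context) -> dict:
    """Return the current state of the five flags, keyed by class name."""
    return {sig.__name__: bool(ctx.flags[sig]) for sig in FIVE_CLASSES}


def sticky_flags_demo():
    # traps=[] means an exception only sets its flag instead of raising.
    ctx = Context(prec=12, traps=[])
    ctx.divide(Decimal(1), Decimal(3))   # result must be rounded: Inexact
    ctx.divide(Decimal(1), Decimal(0))   # returns Infinity: DivisionByZero
    after_ops = snapshot(ctx)
    ctx.add(Decimal(1), Decimal(1))      # an exact operation clears nothing
    after_exact = snapshot(ctx)
    ctx.clear_flags()                    # flags stay set until cleared
    after_clear = snapshot(ctx)
    return after_ops, after_exact, after_clear


if __name__ == "__main__":
    for state in sticky_flags_demo():
        print(state)
```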
Toward an Ideal Computer Arithmetic, T. E. Hull and M. S. Cohen, Proceedings of the 8th Symposium on Computer Arithmetic, pp131–138, IEEE, May 1987.
Abstract: A new computer arithmetic is described. Closely related built-in functions are included. A user’s point of view is taken, so that the emphasis is on what language features are available to a user. The main new feature is flexible precision control of decimal floating-point arithmetic. It is intended that the language facilities be sufficient for describing numerical processes one might want to implement, while at the same time being simple to use, and implementable in a reasonably efficient manner. Illustrative examples are based on experience with an existing software implementation.
IEEE 854-1987 IEEE Standard for Radix-Independent Floating-Point Arithmetic, W. J. Cody et al, 14pp, IEEE, March 1987.
Abstract: It is intended that an implementation of a floating-point system conforming to this standard can be realized entirely in software, entirely in hardware, or in any combination of software and hardware. It is the environment the programmer or user of the system sees that conforms or fails to conform to this standard. Hardware components that require software support to conform shall not be said to conform apart from such software.
Note: Reaffirmed 1994.
Superoptimizer: A Look at the Smallest Program, Henry Massalin, ACM Sigplan Notices, Vol. 22 #10 (Proceedings of the Second International Conference on Architectural support for Programming Languages and Operating Systems), pp122–126, ACM, also IEEE Computer Society Press #87CH2440-6, October 1987.
Abstract: Given an instruction set, the superoptimizer finds the shortest program to compute a function. Startling programs have been generated, many of them engaging in convoluted bit-fiddling bearing little resemblance to the source programs which defined the functions. The key idea in the superoptimizer is a probabilistic test that makes exhaustive searches practical for programs of useful size. The search space is defined by the processor’s instruction set, which may include the whole set, but it is typically restricted to a subset. By constraining the instructions and observing the effect on the output program, one can gain insight into the design of instruction sets. In addition, superoptimized programs may be used by peephole optimizers to improve the quality of generated code, or by assembly language programmers to improve manually written code.
Note: Also in: ACM SIGOPS, Operating Systems Review, Vol. 21 #4.

VLSI designs for redundant binary-coded decimal addition, Behrooz Shirazi, David Y. Y. Yun, and Chang N. Zhang, IEEE Seventh Annual International Phoenix Conference on Computers and Communications, 1988, pp52–56, IEEE, March 1988.
Abstract: Binary-coded decimal (BCD) system provides rapid binary-decimal conversion. However, BCD arithmetic operations are often slow and require complex hardware. One can eliminate the need for carry propagation and thus improve performance of BCD operations by using a redundant binary-coded decimal (RBCD) system. This paper introduces the VLSI design of an RBCD adder. The design consists of two small PLA’s and two four-bit binary adders for one digit of the RBCD adder. The addition delay is constant for n-digit RBCD addition (no carry propagation delay). The VLSI time and space complexities of the design as well as its layout are presented, showing the regularity of the structures. In addition, two simple algorithms and the corresponding hardware designs for conversion between RBCD and BCD are presented.

Higher Radix Floating Point Representations, P. Johnstone and F. Petry, Proceedings of the 9th Symposium on Computer Arithmetic, ISBN 0-8186-8963-3, pp128–135, IEEE Computer Society Press, September 1989.
Abstract: This paper examines the feasibility of higher radix floating point representations, and in particular, decimal based representations. Traditional analyses of such representations have assumed the format of a floating point datum to be roughly identical to that of traditional binary floating point encodings such as the IEEE P754 task group standard representations. We relax this restriction and propose a method of encoding higher radix floating point data with range, precision, and storage requirements comparable to those exhibited by traditional binary representations. Results from McKeeman’s Maximum and Average Relative Representational Error (MRRE and ARRE) analyses, Brent’s RMS error evaluation, Matula’s ratio of significance space and gap functions, and Brown and Richman’s exponent range estimates are extended to accommodate the proposed representation. A decimal alternative to traditional binary representations is proposed, and the behavior of such a system is contrasted with that of a comparable binary system.
Multistep Gradual Rounding, Corinna Lee, IEEE Transactions on Computers, Vol. 38 #4, pp595–600, IEEE, April 1989.
Abstract: A value V is to be rounded to an arbitrary precision resulting in the value V″. Conventional rounding technique uses one step to accomplish this. Alternatively, multistep rounding uses several steps to round the value V to successively shorter precisions with the final rounding step producing the desired value V″. This alternate rounding method is one way to implement, with the minimum of hardware, the denormalization process that the IEEE Floating-Point Standard 754 requires when underflow occurs. There are certain cases for which multistep rounding produces a different result than single-step rounding. To prevent such a step error, the author introduces a rounding procedure called gradual rounding that is very similar to conventional rounding with the addition of two tag bits associated with each floating-point register.
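Illustration: the "step error" mentioned in this abstract can be reproduced in decimal with Python's standard decimal module. The paper itself concerns binary IEEE 754 denormalization, and the sketch below does not implement the tag-bit procedure; it only shows that rounding in two steps can differ from rounding once. The helper name round_to is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_to(value: Decimal, places: int) -> Decimal:
    """Round value to the given number of fractional digits, ties to even."""
    quantum = Decimal(1).scaleb(-places)   # e.g. places=1 gives 0.1
    return value.quantize(quantum, rounding=ROUND_HALF_EVEN)


def single_step(value: Decimal) -> Decimal:
    # Conventional rounding: one step straight to an integer.
    return round_to(value, 0)


def multistep(value: Decimal) -> Decimal:
    # Round to one fractional digit first, then to an integer.
    return round_to(round_to(value, 1), 0)


if __name__ == "__main__":
    v = Decimal("1.49")
    print(single_step(v))   # 1.49 is nearer to 1 than to 2
    print(multistep(v))     # 1.49 becomes 1.5, and the tie then rounds to 2
```

The intermediate value 1.5 no longer records that the original lay below the halfway point, which is the information that extra tag bits would have to preserve.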
Methods and Programs for Mathematical Functions, Stephen L. Moshier, 415pp, Prentice-Hall, Inc., Englewood Cliffs, New Jersey 07632, USA, 1989.
Abstract: This book provides a working collection of mathematical software for computing various elementary and higher functions. It also supplies tutorial information of a practical nature; the purpose of this is to assist in constructing numerical programs for the reader’s special applications.
    Though some of the main analytical techniques for deriving functional expansions are described, the emphasis is on computing; so there has been no attempt to incorporate or supplant the many books on functional and numerical analysis that are available. ...

Note: Program source codes are available at
Error detecting decimal digits, Neal R. Wagner and Paul S. Putter, Communications of the ACM Vol. 32 #1, pp106–110, ACM Press, January 1989.
Abstract: We were recently engaged by a large mail-order house to act as consultants on their use of check digits for detecting errors in account numbers. Since we were not experts in coding theory, we looked in reference books such as Error Correcting Codes [7] and asked colleagues who were familiar with coding theory. Uniformly, the answer was: There is no field of order 10; the theory only works over a field. This article relates our experiences and presents several of the simple decimal-oriented error detection schemes that are available, but not widely known.
Note: ACM abstract: Decimal-oriented error detection schemes are explored in the context of one particular company project.
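Illustration: as an example of what a decimal check-digit scheme looks like, the sketch below implements the well-known Luhn (mod 10) algorithm. It is offered only as a familiar instance of the genre and is not necessarily one of the schemes the article presents. The function names are hypothetical.

```python
def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk from the rightmost payload digit, doubling every other digit.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9          # same as adding the two digits of the product
        total += d
    return (10 - total % 10) % 10


def luhn_valid(number: str) -> bool:
    """Check a number whose last digit is its Luhn check digit."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


if __name__ == "__main__":
    account = "7992739871"
    full = account + str(luhn_check_digit(account))
    print(full, luhn_valid(full))
    print(luhn_valid("79927398710"))   # single-digit error is detected
```

Luhn catches every single-digit error but not every transposition of adjacent digits (swapping 0 and 9 goes unnoticed).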

How to read floating point numbers accurately, William D. Clinger, Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, pp92–101, ACM Press, June 1990.
Abstract: Consider the problem of converting decimal scientific notation for a number into the best binary floating point approximation to that number, for some fixed precision. This problem cannot be solved using arithmetic of any fixed precision. Hence the IEEE Standard for Binary Floating-Point Arithmetic does not require the result of such a conversion to be the best approximation.
    This paper presents an efficient algorithm that always finds the best approximation. The algorithm uses a few extra bits of precision to compute an IEEE-conforming approximation while testing an intermediate result to determine whether the approximation could be other than the best. If the approximation might not be the best, then the best approximation is determined by a few simple operations on multiple-precision integers, where the precision is determined by the input. When using 64 bits of precision to compute IEEE double precision results, the algorithm avoids higher-precision arithmetic over 99% of the time.
    The input problem considered by this paper is the inverse of an output problem considered by Steele and White: Given a binary floating point number, print a correctly rounded decimal representation of it using the smallest number of digits that will allow the number to be read without loss of accuracy. The Steele and White algorithm assumes that the input problem is solved; an imperfect solution to the input problem, as allowed by the IEEE standard and ubiquitous in current practice, defeats the purpose of their algorithm.
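Illustration: both problems discussed in this abstract can be observed from Python, assuming CPython on an IEEE 754 platform, where float() is correctly rounded and repr() produces the shortest digit string that reads back to the same value. The sketch checks these properties with exact rational arithmetic. It does not reproduce Clinger's or Steele and White's algorithms, and the helper names are hypothetical.

```python
import struct
from fractions import Fraction


def neighbours(x: float):
    """Return the doubles just below and just above a positive finite x."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    below = struct.unpack("<d", struct.pack("<q", bits - 1))[0]
    above = struct.unpack("<d", struct.pack("<q", bits + 1))[0]
    return below, above


def is_best_approximation(text: str) -> bool:
    """Input problem: is float(text) at least as close as both neighbours?"""
    exact = Fraction(text)              # exact value of the decimal string
    x = float(text)
    below, above = neighbours(x)
    err = abs(Fraction(x) - exact)
    return (err <= abs(Fraction(below) - exact)
            and err <= abs(Fraction(above) - exact))


def round_trips(x: float) -> bool:
    """Output problem: does the printed form read back as the same value?"""
    return float(repr(x)) == x


if __name__ == "__main__":
    print(is_best_approximation("0.1"), is_best_approximation("1e23"))
    print(repr(0.1), repr(0.1 + 0.2), round_trips(0.1 + 0.2))
```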
Software Product Description: COBOL-81/RSTS/E, Version 3.1, DEC, 3pp, Digital Equipment Corporation, December 1990.
Abstract: COBOL-81/RSTS/E is a high-level language for business data processing that operates under control of the RSTS/E Operating System. It is based on the 1985 ANSI COBOL Standard X3.23-1985 and includes all of the features necessary to achieve the intermediate level of that standard. COBOL-81/RSTS/E is a subset of VAX COBOL and includes various Digital Equipment Corporation extensions to COBOL, including screen handling at the source language level. COBOL-81/RSTS/E also supports the ANSI-1974 standard, and both standards are switch selectable using the /STA:V2 or /STA:85 switches.
Correctly Rounded Binary-Decimal and Decimal-Binary Conversions, David M. Gay, Numerical Analysis Manuscript 90-10, 16pp, AT&T Bell Laboratories, November 1990.
Abstract: This note discusses the main issues in performing correctly rounded decimal-to-binary and binary-to-decimal conversions. It reviews recent work by Clinger and by Steele and White on these conversions and describes some efficiency enhancements. Computational experience with several kinds of arithmetic suggests that the average computational cost for correct rounding can be small for typical conversions. Source for conversion routines that support this claim is available from netlib.
How to Print Floating-Point Numbers Accurately, Guy L. Steele Jr. and Jon L. White, Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, pp112–126, ACM Press, June 1990.
Abstract: We present algorithms for accurately converting [binary] floating-point numbers to decimal representation. The key idea is to carry along with the computation an explicit representation of the required rounding accuracy.
    We begin with the simpler problem of converting fixed-point fractions. A modification of the well-known algorithm for radix-conversion of fixed-point fractions by multiplication explicitly determines when to terminate the conversion process; a variable number of digits are produced. ...

Decimal Floating-Point Arithmetic in Binary Representation, Gerd Bohlender, Computer arithmetic: Scientific Computation and Mathematical Modelling (Proceedings of the Second International Conference, Albena, Bulgaria, 24-28 September 1990), pp13–27, J. C. Baltzer AG, 1991.
Abstract: The binary representation of decimal floating-point numbers permits an efficient implementation of the proposed radix independent IEEE standard for floating-point arithmetic, as far as storage space is concerned. Unfortunately the left and right shifts occurring in the arithmetic operations are very complicated and slow in this representation. In the present paper therefore methods are proposed which speed up these shifts; in particular a kind of carry look-ahead technique is used for division. These methods can be combined to construct a decimal shifter which is needed in an ALU for decimal arithmetic.
A method of designing a decimal arithmetic processor, M. A. Gladshtein, Automatic Control and Computer Sciences, Vol. 25 #6, pp51–56, 1991.
Abstract: The advantages and drawbacks of binary numeric coding in digital computers have been considered. This type of coding has been shown ineffective in processing large data arrays especially when represented in the floating-point form. Also, the low efficiency of conventionally employed decimal computational procedures using the so-called corrections has been noted. It has been proposed, in designing digital computers, to renounce the principle of binary computations in favor of decimal operations on the basis of stored addition and multiplication tables using binary-decimal numeric coding. A version of circuit design for a decimal processor, algorithms and microprograms for addition and multiplication operations have been described. Advantages inherent in the method proposed have been analyzed.
Note: Translated from Avtomatika i Vychislitel’naya Tekhnika UDC 681.3.48.
Specifications for a Variable-Precision Arithmetic Coprocessor, T. E. Hull, M. S. Cohen, and C. B. Hall, Proceedings. 10th Symposium on Computer Arithmetic, ISBN 0-8186-9151-4, pp127–131, IEEE, 1991.
Abstract: The authors have been developing a programming system intended to be especially convenient for scientific computing. Its main features are variable precision (decimal) floating-point arithmetic and convenient exception handling. The software implementation of the system has evolved over a number of years, and a partial hardware implementation of the arithmetic itself was constructed and used during the early stages of the project. Based on this experience, the authors have developed a set of specifications for an arithmetic coprocessor to support such a system. The main purpose of this paper is to describe these specifications. An outline of the language features and how they can be used is also provided, to help justify our particular choice of coprocessor specifications.
Numeric types, representations, and other fictions, T. Ochs, Computer Language, Vol. 8 #8, pp93–101, August 1991.
Abstract: Only rational numbers are explicitly representable in computers. Any explicit representation has a zero measure. Both rational and BCD arbitrary precision meet [author’s] initial requirements [of precision and range]. Floating-point numbers have a strange distribution.
Supporting packed decimal in Ada, David A. Rosenfeld, Proceedings of the conference on TRI-Ada '91, ISBN 0-89791-445-7, pp187–190, ACM Press, 1991.
Abstract: One of the principal barriers to Ada in the Information Systems (IS) marketplace is that Ada compilers do not support decimal arithmetic and a packed decimal representation for numbers. An Ada apologist could argue that Ada as a language does support these features, but such arguments do little to help a COBOL programmer, accustomed to manipulating decimal quantities in a straightforward way. Our project, under contract to the Army, is addressing the problem directly, by implementing packed decimal numbers in its MVS Ada Compiler. This paper will discuss the possible approaches to the problem, and explain the approach selected, comparing it briefly with other solutions...
Mathematics and computer science at odds over real numbers, Thomas J. Scott, ACM SIGCSE Bulletin, Vol. 23 #1 (Technical Symposium on Computer Science Education 1991), pp130–139, ACM Press, 1991.
Abstract: This paper discusses the “real number” data type as implemented by “floating point” numbers. Floating point implementations and a theorem that characterizes their truncations are presented. A teachable floating point system is presented, chosen so that most problems can be worked out with paper and pencil. Then major differences between floating point number systems and the continuous real number system are presented. Important floating point formats are next discussed. Two examples derived from actual computing practice on mainframes, minicomputers, and PCs are presented. The paper concludes with a discussion of where floating point arithmetic should be taught in standard courses in the ACM curriculum.
A Study of DataBase 2 Customer Queries, Annie Tsang and Manfred Olschanowsky, IBM Technical Report TR 03.413, 25pp, IBM Santa Teresa Laboratory, San Jose, CA, April 1991.
Abstract: Over 200 Database 2 read-only and update queries were collected from 30 major DB2 customers during 1989 and 1990. These queries were considered representative of customers using DB2. Analysis of these queries were made in order to determine their characteristics and also to determine which SQL functions were commonly used and how frequently they were used by these customers.
    Results of this study can be used in various ways, including:
  • provide input to the planning and development organizations to enable them to implement new functions, enhance existing functions, and improve performance of functions that are most commonly used by customers.
  • allow developers to develop realistic workloads for benchmarking.

Precise Computation Using Range Arithmetic, via C++, Oliver Aberth and Mark J Schaefer, ACM Transactions on Mathematical Software, Vol. 18 #4, pp481–491, ACM Press, December 1992.
Abstract: An arithmetic is described that can replace floating-point arithmetic for programming tasks requiring assured accuracy. A general explanation is given of how the arithmetic is constructed with C++, and a programming example in this language is supplied. Times for solving representative problems are presented.
Binary-to-Decimal Conversion Based on the Divisibility of 2^8-1 [255] by 5, B. Arazi and D. Naccache, Electronic Letters, Vol. 28 #3, pp2151–2152, IEE, November 1992.
Abstract: The Letter treats the case of converting a binary value, represented in the form of n bytes, into a decimal value, represented in the form of m BCD characters. The conversion, which is suitable for one-byte and two-byte processors, is based on the following observations: (a) 5 is a divisor of 2^8-1 and 2^16-1. (b) Modular binary arithmetic over 2^r-1 is easily performed. (c) Binary division by 2^r-1, in the case where the remainder is known to be zero, is easily performed. (d) All the prime factors of 2^8-1 and 2^16-1 are of the form 2^r+1.
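Illustration (not taken from the Letter): observations (a) and (b) can be sketched in a few lines of Python. Because 256 leaves remainder 1 when divided by 2^8-1 = 255, a multi-byte value can be reduced modulo 255 simply by summing its bytes, and because 5 divides 255 the residue modulo 5 follows from that one-byte result. The function names mod_255 and mod_5 are hypothetical, and this is only a sketch of the underlying arithmetic, not the authors' conversion algorithm.

```python
def mod_255(n: int) -> int:
    """Return n mod (2^8 - 1) using only byte extraction and addition."""
    if n < 0:
        raise ValueError("n must be non-negative")
    while n > 0xFF:
        total = 0
        while n:
            total += n & 0xFF  # 256 leaves remainder 1 mod 255, so bytes just add
            n >>= 8
        n = total
    # A one-byte result of 0xFF is itself a multiple of 255.
    return 0 if n == 0xFF else n


def mod_5(n: int) -> int:
    """Return n mod 5 via the residue mod 255 (valid because 5 divides 255)."""
    # The final reduction acts on a single byte; on a small processor it
    # could be a table lookup (an assumption, not a claim about the Letter).
    return mod_255(n) % 5
```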
An Ada Decimal Arithmetic Capability, Benjamin M. Brosgol, Robert I. Eachus, and David E. Emery, CrossTalk, The Journal of Defense Software Engineering, Number 36, 8 (approx)pp, US Air Force Software Technology Support Center, September 1992.
Abstract: (None.)
    Support for financial processing requires suitable arithmetic facilities, representation control, and formatted output. This paper ... describes the possible approaches to the problem, the solution that the authors have developed, and the rationale for the choice. The name chosen for the solution, ADAR, stands for “Ada Decimal Arithmetic and Representations”

Note: Probably the same as or very similar to “Decimal arithmetic in Ada” by the same authors in the same year.
Software Product Description: VAX 9000 Series Diagnostic Set, DEC, 3pp, Digital Equipment Corporation, April 1992.
Abstract: VAX 9000 Series Diagnostic Set is a package of programs that allows users to maintain a VAX 9000 system. These diagnostics test all subsystems of the VAX 9000 system including the Power Control System, Service Processor System, CPU, Memory, I/O Adapters, and peripheral devices. The package includes firmware-based tests, service-processor-based tests, and macrodiagnostics.
The Design of Floating-Point Data Types, David Goldberg, ACM Letters on Programming Languages and Systems, Vol. 1 #2, pp138–151, ACM Press, June 1992.
Abstract: The issues involved in designing the floating-point part of a programming language are discussed. Looking at the language specifications for most existing languages might suggest that this design involves only trivial issues, such as whether to have one or two types of REALs or how to name the functions that convert from INTEGER to REAL. It is shown that there are more significant semantic issues involved. After discussing the trade-offs for the major design decisions, they are illustrated by presenting the design of the floating-point part of the Modula-3 language.
ISO/IEC 9075:1992: Information Technology – Database Languages – SQL, Jim Melton et al, 626pp, ISO, 1992.
Abstract: This International Standard was developed from ISO/IEC 9075:1989, Information Systems, Database Language SQL with Integrity Enhancements, and replaces that International Standard. It adds significant new features and capabilities to the specifications. It is generally compatible with ISO/IEC 9075:1989, in the sense that, with very few exceptions, SQL language that conforms to ISO/IEC 9075:1989 also conforms to this International Standard, and will be treated in the same way by an implementation of this International Standard as it would by an implementation of ISO/IEC 9075:1989...
Note: Also available as ANSI INCITS 135-1992 (R1998).
A Decimal Multiplication Algorithm for Microcomputers, Mohammad S. Obaidat and Saleh A. Bleha, Computers and Electrical Engineering, Vol. 18 #5, pp357–363, Elsevier, September 1992.
Abstract: A decimal multiplication algorithm is developed and its implementation for microcomputers is illustrated. The algorithm can provide an average multiplication speedup equal to 1.34 compared to the traditional algorithm that is based on repeated additions if both are implemented in pure hardware. The average speedup of the developed algorithm is 1.20 if implemented on an 8-bit microcomputer system. The algorithm is significant especially for simple real-time applications that require cost-effective designs.
Division by 10, R. A. Vowels, Australian Computer Journal, Vol. 24 #3, pp81–85, ACS, August 1992.
Abstract: Division of a binary integer and a binary floating-point mantissa by 10 can be performed with shifts and adds, yielding a significant improvement in hardware execution time, and in software execution time if no hardware divide instruction is available. Several algorithms are given, appropriate to specific machine word sizes, hardware and hardware instructions available, and depending on whether a remainder is required.
    The integer division algorithms presented here contain a new strategy that produces the correct quotient directly, without the need for the supplementary correction required of previously published algorithms. The algorithms are competitive in time with binary coded decimal (BCD) divide by 10.
    Both the integer and floating-point algorithms are an order of magnitude faster than conventional division.
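Illustration (not taken from the paper): the general idea of dividing by 10 with shifts and adds can be sketched as below. This is the widely known variant that approximates n × 0.8 with a short series of shifted additions, shifts right by 3, and then applies one correction step. It is therefore an example of the older "with correction" style that the abstract contrasts with, not Vowels' correction-free algorithm. The name div10_u32 and the 32-bit operand limit are assumptions made for the sketch.

```python
def div10_u32(n: int) -> int:
    """Divide an unsigned 32-bit integer by 10 using shifts and adds only."""
    if not 0 <= n < 2**32:
        raise ValueError("n must fit in an unsigned 32-bit word")
    q = (n >> 1) + (n >> 2)   # q is about 0.75 * n
    q += q >> 4               # successive terms bring q close to 0.8 * n
    q += q >> 8
    q += q >> 16
    q >>= 3                   # q is now n/10, possibly one too small
    r = n - (((q << 2) + q) << 1)   # remainder estimate: n - q*10
    return q + (1 if r > 9 else 0)  # single correction step
```

The remainder, where needed, is n minus ten times the returned quotient, which again needs only shifts and adds.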

Rational Number Approximation in Higher Radix Floating Point Systems, P. Johnstone and F. Petry, Computers and Mathematics with Applications, Vol. 25 #6, pp103–108, Pergamon Press, 1993.
Abstract: Recent proposals have suggested that suitably encoded non-binary floating point representations might offer range and precision comparable to binary systems of equal word size. This is of obvious importance in that it allows computation to be performed on decimal operands without the overhead or error of base conversion while maintaining the error performance and representational characteristics of more traditional encodings. There remains, however, a more general question on the effect of the choice of radix on the ability of floating point systems to represent arbitrary rational numbers. Mathematical researchers have long recognized that some bases offer some representational advantages in that they generate fewer nonterminate values when representing arbitrary rational numbers. Base twelve, for example, has long been considered preferred over base ten because of its inclusion of three in its primary factorization allowing finite representation of a greater number of rational numbers.
    While such results are true for abstract number systems, little attention has been paid to machine based computation and its finite resources. In this study, such results are considered in an environment more typical of computer based models of number systems. Specifically, we consider the effect of the choice of floating point base on rational number approximation in systems which exhibit the typical characteristics of floating point representations – normalized encodings, limited exponent range and storage allocated in a fixed number of ‘bits’ per datum. The frequency with which terminate and representable results can be expected is considered for binary, decimal, and other potentially interesting bases.
Efficient Multiprecision Floating Point Multiplication with Optimal Directional Rounding, Werner Krandick and Jeremy R. Johnson, Proceedings of the 11th IEEE Symposium on Computer Arithmetic, 6pp, IEEE, 1993.
Abstract: An algorithm is described for multiplying multiprecision floating-point numbers. The algorithm produces either the smallest floating-point number greater than or equal to the true product or the greatest floating-point number smaller than or equal to the true product. Software implementations of multiprecision precision floating-point multiplication can reduce the computing time by a factor of two if they do not compute the low order digits of the product of the two mantissas. However, these algorithms do not necessarily provide optimally rounded results. The algorithm described in this paper is guaranteed to produce optimally rounded results and typically obtains the same savings.
An evaluation of the design of the Gamma 60, T. J. Tumlin and M. Smothermann, Actes du 3e colloque de l'Histoire de l'Informatique, 11pp, Sophia-Antipolis, INRIA, 1993.
Abstract: The Bull Gamma 60 remains a major innovation in computer design. Its use of explicit FORK-JOIN parallelism is shown by a simulation model to wisely exploit a large difference in speeds between logic components and memory elements, as found on some machines of the 1950’s. Recently the reappearance of a large speed ratio makes the same type of explicit FORK-JOIN parallelism attractive in advanced designs and validates the latency-tolerant design philosophy of the Gamma 60. The major difficulty of the design is the programming effort required to fully express the parallelism available in programs.

Information Systems Development in Ada, Benjamin M. Brosgol, Robert I. Eachus, and David E. Emery, Eleventh Annual Washington Ada Symposium, pp2–16, ACM Press, June 1994.
Abstract: (None.)
    In this paper we survey how to use Ada (both Ada 83 and Ada 9X) for IS applications, with a focus on two principal issues: Specification of the information architecture of an IS application, and Programming techniques relevant to financial and related applications.
    We cover both the language features and the supplemental packages for IS development. Special attention will be paid to the Ada Decimal-Associated Reusabilia (“ADAR”) components for Ada 83 and transitioning to Ada 9X.
Dynamics of Arithmetic: A Connectionist View of Arithmetic Skills, Richard Z. Dallaway, ISSN 1350-3162, 159pp, CSRP 306, University of Sussex, February 1994.
Abstract: Arithmetic takes time. Children need five or six years to master the one hundred multiplication facts (0×0 to 9×9), and it takes adults approximately one second to recall an answer to a problem like 7×8. Multicolumn arithmetic (e.g., 45×67) requires a sequence of actions, and children produce a host of systematic mistakes when solving such problems. This thesis models the time course and mistakes of adults and children solving arithmetic problems. Two models are presented, both of which are built from connectionist components.
Multiple-length Division Revisited: a Tour of the Minefield, Per Brinch Hansen, Software – Practice and Experience, Vol. 24 #6, pp579–601, John Wiley & Sons, June 1994.
Abstract: Long division of natural numbers plays a crucial role in Cobol arithmetic, cryptography, and primality testing. Only a handful of textbooks discuss the theory and practice of long division, and none of them do it satisfactorily. This tutorial attempts to fill this surprising gap in the literature on computer algorithms. We illustrate the subtleties of long division by examples, define the problem concisely, summarize the theory, and develop a complete Pascal algorithm using a consistent terminology.
Precision Control and Exception Handling in Scientific Computing, K. R. Jackson and N. S. Nedialkov, Technical report, pp1–8, Computer Science Dept., University of Toronto, 1994.
Abstract: This paper describes convenient language facilities for precision control and exception handling. Nedialkov has developed a variable-precision and exception handling library, SciLib, implemented as a numerical class library in C++. A new scalar data type, real, is introduced, consisting of variable-precision floating-point numbers. Arithmetic, relational, and input and output operators of the language are overloaded for reals, so that mathematical expressions can be written without explicit function calls. Precision of computations can be changed during program execution. The exception handling mechanism treats only numerical exceptions and does not distinguish between different types of exceptions.
    The proposed precision control and exception handling facilities are illustrated by sample SciLib programs.
Design and Analysis of Non-binary Radix Floating Point Representations., P. Johnstone and F. Petry, Computers and Electrical Engineering, Vol. 20 #1, pp39–50, Elsevier, January 1994.
Abstract: This paper examines the feasibility of higher radix floating point representations and in particular decimal based representations. Traditional analyses of such representations have assumed the format of a floating point datum to be roughly identical to that of traditional binary floating point encodings such as the IEEE P754 task group standard representations. We relax this restriction and propose a method of encoding higher radix floating point data with range, precision, and storage requirements comparable to those exhibited by traditional binary representations. Results from McKeeman’s Maximum and Average Relative Representational Error (MRRE and ARRE) analyses, Brent’s RMS error evaluation, Matula’s ratio of significance space and gap functions, and Brown and Richman’s exponent range estimates are extended to accommodate the proposed representation. A decimal alternative to traditional binary representations is proposed, and the behavior of such a system is contrasted with that of a comparable binary system.
Note: Almost identical to 1989 Higher Radix Floating Point Representations by the same authors.
A Complete Term Rewriting System for Decimal Integer Arithmetic, H. R. Walters, Technical Report CS-9435, 9pp, Centrum voor Wiskunde en Informatica (CWI), August 1994.
Abstract: We present a term rewriting system for decimal integers with addition and subtraction. We prove that the system is confluent and terminating.

Specification of the IEEE-854 Floating-Point Standard in HOL and PVS, Victor A. Carreño and Paul S. Miner, HOL95: Eighth International Workshop on Higher-Order Logic Theorem Proving and Its Applications, 16pp, Brigham Young University, September 1995.
Abstract: The IEEE-854 Standard for radix-independent floating-point arithmetic has been partially defined within two mechanical verification systems. We present the specification of key parts of the standard in both HOL and PVS. This effort to formalize IEEE-854 has given the opportunity to compare the styles imposed by the two verification systems on the specification.
ISO/IEC 8652:1995: Information Technology – Programming Languages – Ada (Ada 95 Reference Manual: Language and Standard Libraries), S. Tucker Taft and Robert A. Duff, ISBN 3-540-63144-5, 552pp, Springer-Verlag, July 1997.
Abstract: This International Standard specifies the form and meaning of programs written in Ada. Its purpose is to promote the portability of Ada programs to a variety of data processing systems.
A new calculator and why it is necessary, Harold Thimbleby, The Computer Journal, Vol. 38 #6, pp418–433, OUP, 1995.
Abstract: Conventional calculators are badly designed: they suffer from bad computer science – they are unnecessarily difficult to use and buggy. I describe a solution, avoiding the problems caused by conventional calculators, one that is more powerful and arguably much easier to use. The solution has been implemented, and design issues are discussed.
    This paper shows an interactive system that is declarative, with the advantages of clarity and power that entails. It frees people from working out how a calculation should be expressed to concentrating on what they want solved. An important contribution is to demonstrate the very serious problems users face when using conventional calculators, and hence what a freedom a declarative design brings.

ANSI X3.274-1996: American National Standard for Information Technology – Programming Language REXX, Brian Marks and Neil Milsted, 167pp, ANSI, February 1996.
Abstract: This standard provides an unambiguous definition of the programming language REXX. Its purpose is to facilitate portability of REXX programs for use on a wide variety of computer systems.
Note: Errata also available, as ANSI X3.274-1996/AM 1-2000.
Printing Floating-Point Numbers Quickly and Accurately, Robert G. Burger and R. Kent Dybvig, Proceedings of the ACM SIGPLAN '96 conference on Programming language design and implementation, pp108–116, ACM Press, 1996.
Abstract: This paper presents a fast and accurate algorithm for printing floating-point numbers in both free- and fixed-format modes. In free-format mode, the algorithm generates the shortest, correctly rounded output string that converts to the same number when read back in, accommodating whatever rounding mode the reader uses. In fixed-format mode, the algorithm generates a correctly rounded output string using special # marks to denote insignificant trailing digits. For both modes, the algorithm employs a fast estimator to scale floating-point numbers efficiently.
Numbers: The Universal Language, Denis Guedj, ISBN 0-8109-2845-0, 176pp, Harry N. Abrams, Inc, 1997.
Abstract: Numbers, like letter forms, have a rich and complex history. Who first invented them? How old are they, and how were they developed? ...
    With Chronology and Glossary. Many referenced illustrations.

Note: Translated from the French (Empire des nombres) by Lory Frankel.

Decimal Adjustment of Long Numbers in Constant Time, Andreas Döring and Wolfgang J. Paul, Information Processing Letters, Vol. 62 #3, pp161–163, Elsevier Science B.V., June 1997.
Abstract: We propose a very simple method for adding and subtracting n-digit binary coded decimal (BCD) numbers with a small constant number of ordinary operations of a 4n-bit binary ALU. With this method addition/subtraction of 8-digit decimal numbers on an intel 486 processor is faster than programs that use the special built-in operations for decimal adjustment.
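Illustration (not taken from the paper): a well-known technique in the same spirit adds two packed BCD words with a handful of ordinary binary operations. Every digit is first biased by 6 so that a decimal carry becomes a binary carry out of the nibble, and the bias is then removed from those digits that did not carry. Whether this matches the authors' exact sequence of operations is not confirmed by the abstract. The name bcd_add and the 8-digit width are assumptions, and Python's unbounded integers stand in for the ALU, so the carry out of the top digit simply appears as a ninth digit.

```python
def bcd_add(a: int, b: int) -> int:
    """Add two 8-digit packed BCD values using plain binary operations."""
    t1 = a + 0x66666666          # bias every digit by 6
    t2 = t1 + b                  # ordinary binary sum
    t3 = t1 ^ b                  # the same sum with all carries suppressed
    t4 = t2 ^ t3                 # bits that received a carry-in
    t5 = ~t4 & 0x111111110       # digit boundaries that saw no carry
    t6 = (t5 >> 2) | (t5 >> 3)   # a 6 in every digit that did not carry
    return t2 - t6               # remove the bias from those digits
```

For example, bcd_add(0x12, 0x34) yields 0x46, the packed form of 12 + 34 = 46.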
The Introduction of the Euro and the Rounding of Currency Amounts, European Commission, 29pp, European Commission Directorate General II Economic and Financial Affairs, 1997.
Abstract: The rounding rules laid down in the legal framework of the euro are an integral part of the monetary law of the euro area. The legal equality of the euro unit and the national currency units is based on their application and the application of the conversion rates. The basic rules laid down in the Council Regulation (EC) No 1103/97 are...
Economical Correctly Rounded Binary Decimal Conversions, Kenton Hanson, URL:, 5pp, 1997.
Abstract: Economical correctly rounded binary to decimal and decimal to binary conversions simplifies computing environments. Undue confusion and inaccuracies can occur with less precise conversions. Correct conversions can easily be guaranteed with very large precision arithmetic, but may cause performance and space penalties. Mostly correct conversions can be achieved with machine arithmetic. We demonstrate that correctly rounded conversions can be guaranteed with a minimum amount of extra precision arithmetic.
    An efficient algorithm for finding the most difficult conversions is described in detail. We then use these results to show how correct conversions can be guaranteed with a minimum of extra precision. Most normal conversions only require native machine arithmetic. Determining when extra precision is needed is straightforward.

Note: Only available as a web page.
Composite Arithmetic: Proposal for a New Standard, W. Neville Holmes, IEEE Computer, pp65–73, IEEE, March 1997.
Abstract: A general-purpose arithmetic standard could give general computation the kind of reliability and stability that the floating-point standard brought to scientific computing. The author describes composite arithmetic as a possible starting point.
TI-86 Graphing Calculator Guidebook, Texas Instruments, 419pp, Texas Instruments, September 1997.
Abstract: User’s Guide for the TI-86 Graphing Calculator.
Note: Revised February 2001.

Integer Square Roots, Jack W. Crenshaw, Embedded Systems Programming, Vol. 11 #2, EDTN, February 1998.
Borneo 1.0.2 – Adding IEEE 754 floating-point support to Java., Joseph D. Darcy, 129pp, University of California, Berkeley, May 1998.
Abstract: The design of Java relies heavily on experiences with programming languages past. Major Java features, including garbage collection, object-oriented programming, and strong static type checking, have all proven their worth over many years. However, Java breaks with tradition in its floating-point support; instead of accepting whatever floating point formats a machine might provide, Java mandates use of the nearly ubiquitous IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754-1985). Unfortunately, Java’s specification creates two problems for numerical computation: only a strict subset of IEEE 754’s required features are supported by Java and Java’s bit-for-bit reproducibility goals for floating-point computation cause significant performance penalties on popular architectures.
   Java forbids using some distinguishing features of IEEE 754, features designed to make building robust numerical software by numerical experts and novices alike easier than in the past. Only simple floating-point features common to IEEE 754 and obsolete floating-point formats are allowed.
   Legitimate differences exist among various standard-conforming realizations of IEEE 754. For example, the x86 processor family supports the IEEE 754 recommended 80 bit double extended format in addition to the float and double formats found on other architectures. In many instances, using the double extended format for intermediate results leads to more robust programs. To support its “write once, run anywhere” goals, Java specifies that only the float and double formats be used for intermediate results in numeric expressions. For recent x86 processors to emulate exactly a machine that only uses float and double entails a significant performance penalty; over an order of magnitude degradation has been reported. An analogous situation arises on architectures such as the PowerPC that support a fused multiply accumulate instruction; Java semantics preclude using a hardware feature that would usually give more accurate answers faster. However, even numerical analysts do not need or desire exact reproducibility in all cases. The disallowed x86 features were designed to allow numerically unsophisticated programs to have a better likelihood of getting reasonable results.
   To address these concerns, the Java dialect Borneo is able to express all required features of IEEE 754. Borneo also aims to run efficiently on multiple hardware implementations of IEEE 754 and to allow convenient construction of new numeric types.
The Introduction of the Euro and the Rounding of Currency Amounts, European Commission Directorate General II, II/28/99-EN Euro Papers No. 22., 32pp, DGII/C-4-SP(99) European Commission, March 1998, February 1999.
Abstract: The purpose of the present document is to respond in a systematic manner to the various questions on rounding which the Commission services have received since the adoption of the Council regulation on certain provisions relating to the introduction of the euro in June 1997. To this end it tries to clarify the interpretation of the rounding provisions in the legal framework of the euro and to give guidance on technical aspects of rounding.
A Calculated Look at Fixed-Point Arithmetic, Robert Gordon, Embedded Systems Programming, Vol. 11 #4, pp72–78, Miller Freeman, Inc, April 1998.
Abstract: This article explores the subject of fixed-point numbers and presents techniques you can use to implement efficient, fixed-precision number applications.
Decimal Arithmetic Instructions, IBM, ESA/390 Principles of Operation, Chapter 8, IBM, 1998.
Abstract: The decimal instructions of this chapter perform arithmetic and editing operations on decimal data. Additional operations on decimal data are provided by several of the instructions in Chapter 7, “General Instructions”. Decimal operands always reside in storage, and all decimal instructions use the SS instruction format. Decimal operands occupy storage fields that can start on any byte boundary.
The Art of Computer Programming, Vol 2, Donald E. Knuth, ISBN 0-201-89684-2, 762pp, Addison Wesley Longman, 1998.
Abstract: The chief purpose of this chapter [4] is to make a careful study of the four basic processes of arithmetic: addition, subtraction, multiplication, and division. Many people see arithmetic as a trivial thing that children learn and computers do, but we will see that arithmetic is a fascinating topic with many interesting facets. ...
Note: Third edition. See especially sections 4.1 through 4.4.
MSDN Library Visual Basic 6.0 Reference, Microsoft Corporation, URL:, Microsoft Corporation, 2002.
Abstract: The contents of the Visual Basic Language Reference and Controls Reference includes topics on the controls, objects, properties, methods, events, statements, functions, and constants available.
    Additionally, this Reference contains topics on wizards, trappable errors, data types, keyboard shortcuts, and bi-directional programming.
An Instruction Timing Model of CPU Performance, Bernard L. Peuto and Leonard J. Shustek, International Conference on Computer Architecture: 25 years of the International Symposia on Computer architecture, pp152–165, ACM Press, 1998.
Abstract: A model of high-performance computers is derived from instruction timing formulas, with compensation for pipeline and cache memory effects. The model is used to predict the performance of the IBM 370/168 and the Amdahl 470 V/6 on specific programs, and the results are verified by comparison with actual performance. Data collected about program behavior is combined with the performance analysis to highlight some of the problems with high-performance implementations of such architectures.
Note: Original reference: ISCA 1977: pp165-178.
Floating Point Number Format with Number System with Base of 1000, Y. Takashi, IBM Technical Disclosure Bulletin, 01-98, pp609–610, IBM, January 1998.
Abstract: Disclosed is a use number system with a base of 1000 instead of 2 at the mantissa part of a floating point number. The unit is 10 bit. Each 10 bit keeps the value between 0 and 1000. This format is superior to Binary Coded Decimal (BCD) because it can keep more decimal numbers in the same size. This format is superior to binary because 1000 is 100 times of 10, and it makes no difference when converted to/from human’s decimal format.
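Illustration (not taken from the disclosure): the storage argument can be sketched by packing three decimal digits into each 10-bit unit, so that nine digits need 30 bits rather than the 36 bits required by BCD. Each unit here holds a value from 0 to 999. The function names and the most-significant-group-first layout are assumptions made for the sketch and are not claimed to match the disclosed format.

```python
def pack_base1000(digits: str) -> int:
    """Pack a decimal digit string into 10-bit units, three digits per unit."""
    if digits and not digits.isdigit():
        raise ValueError("digits must contain only 0-9")
    width = -(-len(digits) // 3) * 3      # round length up to a multiple of 3
    digits = digits.zfill(width)
    packed = 0
    for i in range(0, len(digits), 3):
        packed = (packed << 10) | int(digits[i:i + 3])  # one unit, 0..999
    return packed


def unpack_base1000(packed: int, units: int) -> str:
    """Recover the digit string from the given number of 10-bit units."""
    groups = []
    for _ in range(units):
        groups.append(f"{packed & 0x3FF:03d}")
        packed >>= 10
    return "".join(reversed(groups))
```

Converting each unit to or from its three decimal digits is a local operation, which is the ease of conversion to human-readable decimal that the abstract refers to.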

Architecture and software support in IBM S/390 Parallel Enterprise Servers for IEEE Floating-Point arithmetic, Paul H. Abbott et al, IBM Journal of Research and Development, Vol. 43 #5/6, pp723–760, IBM, September/November 1999.
Abstract: IEEE Binary Floating-Point is an industry-standard architecture. The IBM System/360 hexadecimal floating-point architecture predates the IEEE standard and has been carried forward through the System/370 to current System/390 processors. The growing importance of industry standards and floating-point combined to produce a need for IEEE Floating-Point on System/390. At the same time, customer investment in IBM floating-point had to be preserved. This paper describes the architecture, hardware, and software efforts that combined to produce a conforming implementation of IEEE Floating-Point on System/390 while retaining compatibility with the original IBM architecture.
The Nothing That Is – A Natural History of Zero, Robert Kaplan, ISBN 0-19-512842-7, 225pp, Oxford University Press, 1999.
Abstract: If you look at zero you see nothing; but look through it and you will see the world. For zero brings into focus the great, organic sprawl of mathematics, and mathematics in turn the complex nature of things. ...
Note: Also available in paperback: ISBN 0-19-514237-3.
TI-89 TI-92 Plus Guidebook, Texas Instruments, 606pp, Texas Instruments, November 1999.
Abstract: User’s Guide for the TI-89 and TI-92 Plus Graphing Calculators.
Note: Revised February 2001.

COBOL Script: A Business-Oriented Scripting Language, T. Imajo, T. Miyake, S. Sato, T. Ito, D. Yokotsuka, Y. Tsujihata, and S. Uemura, Proceedings of the Fourth International Conference on Enterprise Distributed Object Computing (EDOC'00), pp231–240, IEEE, September 2000.
Abstract: This paper describes COBOL Script, a Web-oriented script language developed by Hitachi. COBOL Script includes the following features: (1) The language specifications, which consist of functions required for Web computing, are a subset of COBOL85, the most frequently used programming language in business information systems. (2) COBOL Script supports decimal arithmetic functions that have the same precision as in standard COBOL85 on mainframe computers. (3) Efficient implementation was based on analysis of the pros and cons of the COBOL processing system. Using COBOL Script, users can: (1) Process applications requiring high precision, such as account-related applications, over the Web. (2) Use a test debugger and a Coverage Function with COBOL Script for large-scale development projects. (3) Use Japanese in programs. (4) Achieve good run-time performance.
ZERO – The Biography of a Dangerous Idea, Charles Seife, ISBN 0-670-88457-X, 248pp, Penguin Books Ltd., 2000.
Abstract: The Babylonians invented it, the Greeks banned it, the Hindus worshipped it, and the Church used it to fend off heretics. For centuries, the power of zero savored of the demonic; once harnessed, it became the most important tool in mathematics...
Note: Also available in paperback: ISBN 0-14-02-9647-6.
Decimal arithmetic in applications and hardware, Akira Shibamiya, 2pp, pers. comm., 14 June 2000.
Abstract: (None)

The IBM z900 Decimal Arithmetic Unit, Fadi Y. Busaba, Christopher A. Krygowski, Wen H. Li, Eric M. Schwarz, and Steven R. Carlough, Conference Record of the 35th Asilomar Conference on Signals, Systems and Computers, Vol. 2, ISBN 0 7803 7147 X, pp1335–1339, IEEE, Nov. 2001.
Abstract: As the cost for adding function to a processor continues to decline, processor designs are including many additional features. An example of this trend is the appearance of graphics engines and compression engines on midrange and even low end microprocessors. One area that has the potential to capture chip real estate is the decimal arithmetic engine because of its importance in financial and business applications. Studies show that 55% of the numeric data stored on commercial databases are in decimal format. Although decimal arithmetic is supported in many software languages it is not yet available on many microprocessors. This paper details the decimal arithmetic engine in the recently announced z900 microprocessor.
Note: IEEE cat #01ch37256.
A Decimal Floating-Point Specification, Michael F. Cowlishaw, Eric M. Schwarz, Ronald M. Smith, and Charles F. Webb, Proceedings of the 15th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-1150-3, pp147–154, IEEE, June 2001.
Abstract: Even though decimal arithmetic is pervasive in financial and commercial transactions, computers are still implementing almost all arithmetic calculations using binary arithmetic. As chip real estate becomes cheaper it is becoming likely that more computer manufacturers will provide processors with decimal arithmetic engines. Programming languages and databases are expanding the decimal data types available while there has been little change in the base hardware. As a result, each language and application is defining a different arithmetic and few have considered the efficiency of hardware implementations when setting requirements.
    In this paper, we propose a decimal format which meets the requirements of existing standards for decimal arithmetic and is efficient for hardware implementation. We propose this specification in the hope that designers will consider providing decimal arithmetic in future microprocessors and that future decimal software specifications will consider hardware efficiencies.

Note: Eric Schwarz’s Presentation foils are available here.
C# Language Specification, Rex Jaeschke, ECMA-TC39-TG2-2001, 520pp, ECMA, September 2001.
Abstract: This International Standard specifies the form and establishes the interpretation of programs written in the C# programming language. It specifies:
    The representation of C# programs;
    The syntax and constraints of the C# language;
    The semantic rules for interpreting C# programs;
    The restrictions and limits imposed by a conforming implementation of C#.

Note: Final draft submitted for ECMA GA approval December 2001.
Proposed Revision of ISO 1989:1985 Information technology – Programming languages, their environments and system software interfaces – Programming language COBOL, JTC-1/SC22/WG4, 905pp, INCITS, December 2001.
Abstract: COBOL began as a business programming language, but its present use has spread well beyond that to a general-purpose programming language. COBOL is well known for its file handling capabilities, which are extended in this revision by the addition of file sharing and record locking capabilities. Other major enhancements add object-oriented capabilities, handling of national characters, and enhanced interoperability with other programming languages.
    This is the proposed ISO/IEC 1989:2002 final draft.
Architecture and Algorithms for Processing Non-binary Floating Point Radices, Paul Johnstone and Frederick E. Petry, unpublished paper, 39pp, pers. comm., July 2001.
Abstract: Recent studies have proposed several non-binary floating point representations which possess most of the storage and algorithmic efficiencies of traditional binary systems with no sacrifice of precision and only modest reductions in range. Such systems possess inherent advantages in that they employ less complicated conversion algorithms and are less prone to errors in representation. Additionally, non-binary systems tend to produce more precise arithmetic results in that common problem of truncation of an infinitely repeating quotient occurs with a lesser frequency.
    However, as has been previously observed, traditional binary floating representations are most efficiently adapted to the prevailing choices of technology and system architecture. Previous research has left undone the quantification and evaluation of the algorithms and componentry necessary to effect the proposed representations in a fully realized system. We consider in this study the expected impact of adding the capacity to process one of the proposed non-binary radix representations within a conventional computer system. Since decimal representations are clearly the overwhelming impetus for these studies, discussion will focus solely on base 10 systems. Examination of implementation issues are directed toward the following areas: the implementation of floating point representations in contemporary computer architectures, the design of any extensions to such systems, the effects on system complexity and cost, and, finally, resulting algorithmic revisions.
Fixed-Point Math in C, Joe Lemieux, Embedded Systems Programming, Vol. 14 #4, EDTN, April 2001.
Abstract: Floating-point arithmetic can be expensive if you’re using an integer-only processor. But floating-point values can be manipulated as integers, as a less expensive alternative.
TI-89/TI-92 Plus Developers Guide, Beta Version .02, Texas Instruments, 1356pp, Texas Instruments, 2001.
Note: Available from web site.
TI-89/TI-92 Plus Sierra C Assembler Reference Manual, Beta Version .02, Texas Instruments, 322pp, Texas Instruments, 2001.
Note: Available from web site.

Densely Packed Decimal Encoding, Michael F. Cowlishaw, IEE Proceedings – Computers and Digital Techniques, Vol. 149 #3, ISSN 1350-2387, pp102–104, IEE, London, May 2002.
Abstract: Chen-Ho encoding is a lossless compression of three Binary Coded Decimal digits into 10 bits using an algorithm which can be applied or reversed using only simple Boolean operations. An improvement to the encoding which has the same advantages but is not limited to multiples of three digits is described. The new encoding allows arbitrary-length decimal numbers to be coded efficiently while keeping decimal digit boundaries accessible. This in turn permits efficient decimal arithmetic and makes the best use of available resources such as storage or hardware registers.
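As an illustration of the encoding this abstract describes, the following is a minimal sketch of packing three decimal digits into one 10-bit group using only bit selection. The function name dpd_encode is hypothetical, and the bit table is the commonly published Densely Packed Decimal table rather than anything quoted from the abstract, so treat it as an assumption to check against the paper.

```python
def dpd_encode(d2, d1, d0):
    """Pack three decimal digits (d2 most significant) into a 10-bit value.

    Sketch only: the table below is the commonly published DPD mapping,
    assumed here rather than taken from the abstract.
    """
    for digit in (d2, d1, d0):
        if not 0 <= digit <= 9:
            raise ValueError("each digit must be in 0..9")

    # BCD bits of the three digits: abcd, efgh, ijkm (a, e, i are the 8s bits).
    a, b, c, d = (d2 >> 3) & 1, (d2 >> 2) & 1, (d2 >> 1) & 1, d2 & 1
    e, f, g, h = (d1 >> 3) & 1, (d1 >> 2) & 1, (d1 >> 1) & 1, d1 & 1
    i, j, k, m = (d0 >> 3) & 1, (d0 >> 2) & 1, (d0 >> 1) & 1, d0 & 1

    # Select the output layout from which digits are "large" (8 or 9).
    table = {
        (0, 0, 0): (b, c, d, f, g, h, 0, j, k, m),
        (0, 0, 1): (b, c, d, f, g, h, 1, 0, 0, m),
        (0, 1, 0): (b, c, d, j, k, h, 1, 0, 1, m),
        (1, 0, 0): (j, k, d, f, g, h, 1, 1, 0, m),
        (1, 1, 0): (j, k, d, 0, 0, h, 1, 1, 1, m),
        (1, 0, 1): (f, g, d, 0, 1, h, 1, 1, 1, m),
        (0, 1, 1): (b, c, d, 1, 0, h, 1, 1, 1, m),
        (1, 1, 1): (0, 0, d, 1, 1, h, 1, 1, 1, m),
    }
    bits = table[(a, e, i)]

    # Assemble the ten bits, most significant first.
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```

With this table, one- and two-digit values up to 79 keep their BCD bit pattern, and all 1000 three-digit combinations map to distinct 10-bit codes.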
The ‘telco’ benchmark, M. F. Cowlishaw, 3pp, IBM Hursley Laboratory, May 2002.
Abstract: This benchmark was devised in order to investigate the balance between Input and Output (I/O) time and calculation time in a simple program which realistically captures the essence of a telephone company billing application.
    In summary, the application reads a large input file containing a suitably distributed list of telephone call durations (each in seconds). For each call, a charging rate is chosen and the price calculated and rounded to hundredths. One or two taxes are applied (depending on the type of call) and the total cost is converted to a character string and written to an output file. Running totals of the total cost and taxes are kept; these are displayed at the end of the benchmark for verification.
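The per-call calculation summarized above can be sketched with Python's decimal module. The rates, tax percentages, rounding modes, and the function name bill_call are illustrative assumptions, not values taken from the abstract; only the overall shape (price rounded to hundredths, one or two taxes, total converted to a string) follows the description.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

CENT = Decimal("0.01")

# Hypothetical figures chosen for illustration only.
LOCAL_RATE = Decimal("0.0013")      # price per second, local call
DISTANCE_RATE = Decimal("0.00894")  # price per second, long-distance call
BASIC_TAX = Decimal("0.0675")       # applied to every call
DISTANCE_TAX = Decimal("0.0341")    # applied to long-distance calls only


def bill_call(seconds, long_distance):
    """Return (price, tax, total_as_string) for one call."""
    rate = DISTANCE_RATE if long_distance else LOCAL_RATE
    # Price is rounded to hundredths (half-even is an assumption).
    price = (Decimal(seconds) * rate).quantize(CENT, rounding=ROUND_HALF_EVEN)
    # Each tax is truncated to hundredths (also an assumption).
    tax = (price * BASIC_TAX).quantize(CENT, rounding=ROUND_DOWN)
    if long_distance:
        tax += (price * DISTANCE_TAX).quantize(CENT, rounding=ROUND_DOWN)
    return price, tax, str(price + tax)
```

Running totals would simply accumulate the returned price and tax values as Decimal objects, so no binary rounding error enters the sums.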
Potential Speedup with Decimal Floating-Point Hardware, Mark A Erle, Michael J Schulte, and J G Linebarger, Proceedings of the Thirty Sixth Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, California, pp1073–1077, IEEE Press, November 2002.
Abstract: This paper addresses the potential speedup achieved by using decimal floating-point hardware, instead of software routines, on a high-performance super-scalar architecture. Software routines were written to perform decimal addition, subtraction, multiplication, and division. Cycle counts were then measured for each instruction using the Simplescalar simulator. After this, new hardware algorithms were developed, existing algorithms were analyzed, and cycle counts were estimated for the same set of instructions using specialized decimal floating-point hardware. This data was then used to show the potential speedup obtained for programs with different instruction mixes and a recently developed benchmark.
Fairchild decimal arithmetic unit, Stan Mazor, 9pp, pers. comm., July–September 2002.
Abstract: We embarked on the design of Symbol II [circa 1966], a large scale HIGH LEVEL language, virtual memory, time sharing machine. This machine used large printed circuit boards, approx. 16″ x 20″ with slots for over 210 DIP’s. We had 100 connector pins on each side and we defined the system using a number of parallel busses with multiple autonomous functional units and inter-processor communication. The completed system had over 110 printed circuit boards and consumed mega-watts of power...
The microarchitecture of the IBM eServer z900 processor, Eric M. Schwarz et al, IBM Journal of Research and Development, Vol. 46 #4/5, pp381–395, IBM, July/September 2002.
Abstract: The recent IBM ESA/390 CMOS line of processors, from 1997 to 1999, consisted of the G4, G5, and G6 processors. The architecture they implemented lacked 64-bit addressability and had only a limited set of 64-bit arithmetic instructions. The processors also lacked data and instruction bandwidth, since they utilized a unified cache. The branch performance was good, but there were delays due to conflicts in searching and writing the branch target buffer. Also, the hardware data compression and decimal arithmetic performance, though good, was in demand by database and COBOL programmers. Most of the performance concerns regarding prior processors were due to area constraints. Recent technology advances have increased the circuit density by 50 percent over that of the G6 processor. This has allowed the design of several performance-critical areas to be revisited. The end result of these efforts is the IBM eServer z900 processor, which is the first high-end processor based on the new 64-bit z/Architecture™.
BigDecimal (Java 2 Platform SE v1.4.0), Sun Microsystems, 17pp, Sun Microsystems Inc., 2002.
Abstract: Immutable, arbitrary-precision signed decimal numbers. A BigDecimal consists of an arbitrary precision integer unscaled value and a non-negative 32-bit integer scale, which represents the number of digits to the right of the decimal point. The number represented by the BigDecimal is (unscaledValue/10^scale). BigDecimal provides operations for basic arithmetic, scale manipulation, comparison, hashing, and format conversion.
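The unscaled-value-and-scale representation described in this abstract can be illustrated with a short sketch. It uses Python's decimal module as a stand-in for the Java class, and the helper name from_unscaled is hypothetical.

```python
from decimal import Decimal


def from_unscaled(unscaled, scale):
    """Build the exact decimal value unscaled / 10**scale.

    Mirrors the (unscaled value, non-negative scale) pair from the
    abstract; this is a Python illustration, not the Java API.
    """
    if scale < 0:
        raise ValueError("scale must be non-negative")
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(ch) for ch in str(abs(unscaled)))
    # A negative exponent places 'scale' digits after the decimal point.
    return Decimal((sign, digits, -scale))
```

For example, from_unscaled(12345, 2) is 123.45, and from_unscaled(1500, 2) is 15.00: numerically equal to 15 but retaining two digits after the point, which is what the scale records.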

Decimal Floating-Point: Algorism for Computers, Michael F. Cowlishaw, Proceedings of the 16th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-1894-X, pp104–111, IEEE, June 2003.
Abstract: Decimal arithmetic is the norm in human calculations, and human-centric applications must use a decimal floating-point arithmetic to achieve the same results.
    Initial benchmarks indicate that some applications spend 50% to 90% of their time in decimal processing, because software decimal arithmetic suffers a 100× to 1000× performance penalty over hardware. The need for decimal floating-point in hardware is urgent.
    Existing designs, however, either fail to conform to modern standards or are incompatible with the established rules of decimal arithmetic. This paper introduces a new approach to decimal floating-point which not only provides the strict results which are necessary for commercial applications but also meets the constraints and requirements of the IEEE 854 standard.
    A hardware implementation of this arithmetic is in development, and it is expected that this will significantly accelerate a wide variety of applications.

Note: Softcopy is available in PDF.
Decimal Multiplication Via Carry-Save Addition, Mark A Erle and Michael J Schulte, Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors, The Hague, Netherlands, pp348–358, IEEE Computer Society Press, June 2003.
Abstract: Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents two novel designs for fixed-point decimal multiplication that utilize decimal carry-save addition to reduce the critical path delay. First, a multiplier that stores a reduced number of multiplicand multiples and uses decimal carry-save addition in the iterative portion of the design is presented. Then, a second multiplier design is proposed with several notable improvements including fast generation of multiplicand multiples that do not need to be stored, the use of decimal (4:2) compressors, and a simplified decimal carry-propagate addition to produce the final product. When multiplying two n-digit operands to produce a 2n-digit product, the improved multiplier design has a worst-case latency of n + 4 cycles and an initiation interval of n + 1 cycles. Three data-dependent optimizations, which help reduce the multipliers’ average latency, are also described. The multipliers presented can be extended to support decimal floating-point multiplication.
Before the B5000: Burroughs Computers, 1951-1963, George T. Gray and Ronald Q. Smith, IEEE Annals of the History of Computing, Vol. 25 #2, pp50–61, IEEE, April-June 2003.
Abstract: Like many companies entering the computer industry, Burroughs began by working on US government contracts. Once sufficient expertise had been gained, the company entered the general purpose computer market. The Datatron computer, obtained through the ElectroData Corporation acquisition, was a modest success in the late 1950s; however, pioneering work on transistor computers for military contracts was not immediately transferred to the commercial marketplace.
Using multiple-precision arithmetic, David M Smith, Computing in Science and Engineering, Vol. 5 #4, pp88–93, IEEE Computer Society, July 2003.
Abstract: High-precision arithmetic is useful in many different computational problems. The most common is a numerically unstable algorithm, for which, say, 53-bit (ANSI/IEEE 754-1985 Standard) double precision would not yield a sufficiently accurate result.
Note: Related papers by same author at:
How to Print Floating-Point Numbers Accurately (Retrospective), Guy L. Steele Jr. and Jon L. White, 20 Years of the ACM/SIGPLAN Conference on Programming Language Design and Implementation (1979-1999): A Selection, 2003, 3pp, ACM Press, 2003.
Abstract: Our PLDI paper was almost 20 years in the making. How should the result of dividing 1.0 by 10.0 be printed? In 1970, one usually got “0.0999999” or “0.099999994”; why not “0.1”? ...
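The question posed in this abstract can be tried directly. The sketch below assumes IEEE 754 binary64 floats and uses CPython's repr, which prints the shortest string that converts back to the same float; it is a stand-in for illustration, not the authors' own algorithm.

```python
from decimal import Decimal

x = 1.0 / 10.0

# Shortest decimal string that round-trips to the same binary64 value.
shortest = repr(x)

# Exact decimal expansion of the binary value actually stored.
exact = str(Decimal(x))

# The short form is safe to print because reading it back recovers x.
round_trips = float(shortest) == x
```

Here shortest is "0.1", while exact shows that the stored value is slightly larger than one tenth; printing more digits than needed exposes that difference without adding information.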

The Design of the Fixed Point Unit for the z990 Microprocessor, Fadi Y. Busaba, Timothy Slegel, Steven R. Carlough, Christopher A. Krygowski, and John G Rell, Proceedings of the 14th ACM Great Lakes symposium on VLSI, ISBN 1-58113-853-9, pp364–367, ACM Press, 2004.
Abstract: The paper presents the design of the Fixed Point Unit (FXU) for the IBM eServer z990 microprocessor (announced in 2Q ’03) that runs at 1.2 GHz. The FXU is capable of executing two Register-Memory instructions including arithmetic instructions and a branch instruction in a single cycle. The FXU executes a total of 369 instructions that operate on variable size operands (1 to 256 bytes). The instruction set include decimal arithmetic with multiplies and divides, binary arithmetic, shifts and rotates, loads/stores, branches, long moves, logical operations, convert instructions, and other special instructions. The FXU consists of 64-bit dataflow stack that is custom designed and a control stack that is synthesized. The current FXU is the first superscalar design for the CMOS z-series machines, has a new improved decimal unit, and has for the first time a 16x64 bit binary multiplier.
Fixed, floating, and exact computation with Java's BigDecimal, M. Cowlishaw, J. Bloch, and J.D. Darcy, Dr. Dobb's Journal Vol. 29 #7, ISSN 1044-789X, pp22–27, CMP Media, July 2004.
Abstract: Decimal data types are widely used in commercial, financial, and Web applications, and many general-purpose programming languages have either native decimal types or readily available decimal arithmetic packages. Since the 1.1 release, the libraries of the Java programming language supported decimal arithmetic via the java.math.BigDecimal class. With the inclusion of JSR13 into J2SE 1.5, BigDecimal now has true floating-point operations consistent with those in the IEEE 754 revision. In this article, we first explain why decimal arithmetic is important and the differences between the BigDecimal class and binary float and double types.
On Intermediate Precision Required for Correctly-Rounding Decimal-to-Binary Floating-Point Conversion, Michel Hack, Proceedings of RNC6 (6th conference on Real Numbers and Computers), 22pp, University of Trier, November 2004.
Abstract: The algorithms developed ten years ago in preparation for IBM’s support of IEEE Floating-Point on its mainframe S/390 processors use an overly conservative intermediate precision to guarantee correctly-rounded results across the entire exponent range. Here we study the minimal requirement for both bounded and unbounded precision on the decimal side (converting to machine precision on the binary side). An interesting new theorem on Continued Fraction expansions is offered, as well as an open problem on the growth of partial quotients for ratios of powers of two and five.
The child-engineering of arithmetic in ToonTalk, Ken Kahn, Proceedings of the 2004 conference on Interaction Design and Children, ISBN 1-58113-791-5, pp141–142, ACM Press, 2004.
Abstract: Providing a child-appropriate interface to an arithmetic package with large numbers and exact fractions is surprisingly challenging. We discuss solutions to problems ranging from how to present fractions such as 1/3 to how to deal with numbers with tens of thousands of digits. As with other objects in ToonTalk®, we strive to make the enhanced numbers work in a concrete and playful manner.
Multioperand Decimal Addition (extended version), Robert D Kenney and Michael J Schulte, Proceedings of the IEEE Computer Society Annual Symposium on VLSI, Lafayette, LA, 10pp, IEEE, February 2004.
Abstract: This paper introduces and analyzes four techniques for performing fast decimal addition on multiple binary coded decimal (BCD) operands. Three of the techniques speculate BCD correction values and use chaining to correct intermediate results. The first speculates over one addition. The second speculates over two additions. The third employs multiple instances of the second technique in parallel and then merges the results. The fourth technique uses a binary carry-save adder tree and produces a binary sum. Combinational logic is then used to correct the sum and determine the carry into the next digit. Multioperand adder designs are constructed and synthesized for four to sixteen input operands. Analyses are performed on the synthesis results and the merits of each technique are discussed. Finally, these techniques are compared to previous attempts made at speeding up decimal addition.
High-Frequency Decimal Multiplier, Robert D Kenney, Michael J Schulte, and Mark A. Erle, Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers and Processors, ISBN 0-7695-2231-9, pp26–29, IEEE, October 2004.
Abstract: Decimal arithmetic is regaining popularity in the computing community due to the growing importance of commercial, financial, and Internet-based applications, which process decimal data. This paper presents an iterative decimal multiplier, which operates at high clock frequencies and scales well to large operand sizes. The multiplier uses a new decimal representation for intermediate products, which allows for a very fast two-stage iterative multiplier design. Decimal multipliers, which are synthesized using a 0.11 micron CMOS standard cell library, operate at clock frequencies close to 2 GHz. The latency of the proposed design to multiply two n-digit BCD operands is (n + 8) cycles with a new multiplication able to begin every (n + 1) cycles.
A decimal carry-free adder, Hooman Nikmehr, Braden Phillips, and Cheng-Chew Lim, SPIE Symposium Smart Materials, Nano-, and Micro-Smart Systems, Proceedings of SPIE Vol. 5649, 12pp, SPIE International Society for Optical Engineering, December 2004.
Abstract: Recently, decimal arithmetic has become attractive in the financial and commercial world including banking, tax calculation, currency conversion, insurance and accounting. Although computers are still carrying out decimal calculation using software libraries and binary floating-point numbers, it is likely that in the near future, all processors will be equipped with units performing decimal operations directly on decimal operands. One critical building block for some complex decimal operations is the decimal carry-free adder. This paper discusses the mathematical framework of the addition, introduces a new signed-digit format for representing decimal numbers and presents an efficient architectural implementation. Delay estimation analysis shows that the adder offers improved performance over earlier designs.
Design Exploration for Decimal Floating-Point Arithmetic IBM University Partnership Program Proposal, Michael J. Schulte and Eric Schwarz, 4pp, IBM, 11 March 2004.
Abstract: Commercial applications and databases typically store numerical data in decimal format. Currently, however, microprocessors do not provide instructions or hardware support for decimal floating-point arithmetic. Consequently, decimal numbers are often read into computers, converted to binary numbers, and then processed using binary floating-point arithmetic. Results are then converted back to decimal before being stored. Besides being time-consuming, this process is error-prone, since most decimal numbers cannot be exactly represented as binary numbers. Thus, if binary floating-point arithmetic is used to process decimal data, unexpected results may occur after a few computations...
A 64-bit Decimal Floating-Point Adder (extended version), John Thompson, Nandini Karra, and Michael J Schulte, Proceedings of the IEEE Computer Society Annual Symposium on VLSI, Lafayette, LA, pp297–298, IEEE, February 2004.
Abstract: Due to the rapid growth in financial, commercial, and Internet-based applications, there is an increasing desire to allow computers to operate on both binary and decimal floating-point numbers. Consequently, specifications for decimal floating-point arithmetic are being added to the IEEE-754 Standard for Floating-Point Arithmetic. In this paper, we present the design and implementation of a decimal floating-point adder that is compliant with the current draft revision of the IEEE-754 Standard. The adder supports operations on 64-bit (16-digit) decimal floating-point operands. We provide synthesis results indicating the estimated area and delay for our design when it is pipelined to various depths.
Decimal Floating-Point Division Using Newton-Raphson Iteration, Liang-Kai Wang and Michael J Schulte, Proceedings of the 15th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP’04), pp84–95, IEEE Computer Society Press, September 2004.
Abstract: Decreasing feature sizes allow additional functionality to be added to future microprocessors to improve the performance of important application domains. As a result of rapid growth in financial, commercial, and Internet-based applications, hardware support for decimal floating-point arithmetic is now being considered by various computer manufacturers and specifications for decimal floating-point arithmetic have been added to the draft revision of the IEEE-754 Standard for Floating-Point Arithmetic (IEEE-754R). This paper presents an efficient arithmetic algorithm and hardware design for decimal floating-point division. The design uses an optimized piecewise linear approximation, a modified Newton-Raphson iteration, a specialized rounding technique, and a simplified combined decimal incrementer/decrementer. Synthesis results show that a 64-bit (16-digit) implementation of the decimal divider, which is compliant with IEEE-754R, has an estimated critical path delay of 0.69 ns when implemented using LSI Logic’s 0.11 micron gflx-p standard cell library.
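The general idea of division by Newton-Raphson iteration named in this abstract can be sketched in software. The version below is the textbook reciprocal iteration x ← x(2 − m·x) with a single crude linear starting estimate; the function name nr_divide, the starting line 0.55 − 0.05m, the guard-digit count, and the fixed iteration count are all assumptions for illustration. It does not reproduce the paper's optimized approximation, its modified iteration, or its rounding technique, and the last digit is not guaranteed to be correctly rounded.

```python
from decimal import Decimal, localcontext


def nr_divide(n, d, digits=16, iterations=8):
    """Approximate n / d for finite operands via a Newton-Raphson reciprocal."""
    n, d = Decimal(n), Decimal(d)
    if d == 0:
        raise ZeroDivisionError("division by zero")
    with localcontext() as ctx:
        ctx.prec = digits + 8                 # working precision with guard digits
        shift = d.adjusted()                  # exponent of the leading digit
        m = abs(d).scaleb(-shift)             # magnitude scaled into [1, 10)
        # Crude linear estimate of 1/m; its relative error stays below 0.52,
        # so the quadratically convergent iteration below converges.
        x = Decimal("0.55") - Decimal("0.05") * m
        for _ in range(iterations):
            x = x * (2 - m * x)               # error is squared on each pass
        recip = x.scaleb(-shift).copy_sign(d) # undo scaling, restore the sign
        q = n * recip
        ctx.prec = digits
        return +q                             # round to the requested digits
```

Because each pass roughly doubles the number of correct digits, eight passes from this starting estimate are more than enough for a 16-digit result.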

Design of a Reversible Binary Coded Decimal Adder by Using Reversible 4-bit Parallel Adder, Hafiz Md. Hasan Babu and Ahsan Raja Chowdhury, Proceedings of the 18th International Conference on VLSI Design (VLSID 2005), ISBN 0-7695-2264-5, pp255–260, IEEE, 2005.
Abstract: In this paper, we have proposed a design technique for the reversible circuit of binary coded decimal (BCD) adder. The proposed circuit has the ability to add two 4-bits binary variables and it transforms the addition into the appropriate BCD number with efficient error correcting modules where the operations are reversible. We also show that the proposed design technique generates the reversible BCD adder circuit with minimum number of gates as well as the minimum number of garbage outputs.
Decimal Multiplication With Efficient Partial Product Generation, Mark A Erle, Eric Schwarz, and Michael J Schulte, Proceedings of the 17th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2366-8, pp21–28, IEEE, June 2005.
Abstract: Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents a novel design for fixed-point decimal multiplication that utilizes a simple recoding scheme to produce signed-magnitude representations of the operands thereby greatly simplifying the process of generating partial products for each multiplier digit. The partial products are generated using a digit-by-digit multiplier on a word-by-digit basis, first in a signed-digit form with two digits per position, and then combined via a combinational circuit. As the signed-digit partial products are developed one at a time while traversing the recoded multiplier operand from the least significant digit to the most significant digit, each partial product is added along with the accumulated sum of previous partial products via a signed-digit adder. This work is significantly different from other work employing digit-by-digit multipliers due to the efficiency gained by restricting the range of digits throughout the multiplication process.
High-speed multioperand decimal adders, R.D. Kenney and M. J. Schulte, IEEE Transactions on Computers, Vol. 54 #8, ISSN 0018-9340, pp953–963, IEEE, August 2005.
Abstract: There is increasing interest in hardware support for decimal arithmetic as a result of recent growth in commercial, financial, and Internet-based applications. Consequently, new specifications for decimal floating-point arithmetic have been added to the draft revision of the IEEE-754 Standard for Floating-Point Arithmetic. This paper introduces and analyzes three techniques for performing fast decimal addition on multiple binary coded decimal (BCD) operands. Two of the techniques speculate BCD correction values and correct intermediate results while adding the input operands. The first speculates over one addition. The second speculates over two additions. The third technique uses a binary carry-save adder tree and produces a binary sum. Combinational logic is then used to correct the sum and determine the carry into the next more significant digit. Multioperand adder designs are constructed and synthesized for four to 16 input operands. Analyses are performed on the synthesis results and the merits of each technique are discussed. Finally, these techniques are compared to several previous techniques for high-speed decimal addition.
On the Randomness of Pi and Other Decimal Expansions, George Marsaglia, Interstat October 2005 #5, 17pp, Interstat, October 2005.
Abstract: Tests of randomness much more rigorous than the usual frequency-of-digit counts are applied to the decimal expansions of π, e and √2, using the Diehard Battery of Tests adapted to base 10 rather than the original base 2. The first 10^9 digits of π, e and √2 seem to pass the Diehard tests very well. But so do the decimal expansions of most rationals k/p with large primes p. Over the entire set of tests, only the digits of √2 give a questionable result: the monkey test on 5-letter words. Its significance is discussed in the text. Three specific k/p are used for comparison. The cycles in their decimal expansions are developed in reverse order by the multiply-with-carry (MWC) method. They do well in the Diehard tests, as do many fast and simple MWC RNGs that produce base-b ‘digits’ of the expansions of k/p for b = 2^32 or b = 2^32 − 1. Choices of primes p for such MWC RNGs are discussed, along with comments on their implementation.
ERMETH: The First Swiss Computer, Hans Neukom, IEEE Annals of the History of Computing, pp5–22, IEEE, October 2005.
Abstract: Eduard Stiefel, in 1948 the first director of the Federal Institute of Technology’s newly established Institute of Applied Mathematics, recognized that computers would be essential to this new field of mathematics. Unable to find exactly what he wanted in existing computers, Stiefel developed the ERMETH. This article examines the rationale of, and objectives for, the first Swiss computer.
Radix Converters: Complexity and Implementation by LUT Cascades, Tsutomu Sasao, 35th International Symposium on Multiple-Valued Logic (ISMVL'05), pp256–263, IEEE, May 2005.
Abstract: In digital signal processing, we often use higher radix system to achieve high-speed computation. In such cases, we require radix converters. This paper considers the design of LUT cascades that convert p-nary numbers to q-nary numbers. In particular, we derive several upper bounds on the column multiplicities of decomposition charts that represent radix converters. From these, we can estimate the size of LUT cascades to realize radix converters. These results are useful to design compact radix converters, since these bounds show strategies to partition the outputs into groups.
Performance Evaluation of Decimal Floating-Point Arithmetic, Michael J. Schulte, Nick Lindberg, and Anitha Laxminarain, Proceedings of the 6th IBM Austin Center for Advanced Studies Conference, Austin, TX, 8pp, IBM, February 2005.
Abstract: The prominence of decimal data in commercial and financial applications has led researchers to pursue efficient techniques for performing decimal floating-point arithmetic. While several software implementations of decimal floating-point arithmetic have been implemented, there is a growing need to provide hardware support for decimal floating-point arithmetic to keep up with the processing demands of emerging commercial and financial applications. This paper evaluates and compares the performance of decimal floating-point arithmetic operations when implemented on superscalar processors using either software libraries or specialized hardware designs. Our comparisons show that hardware implementations of decimal floating-point arithmetic operations are one to two orders of magnitude faster than software implementations.
Decimal Floating-Point Square Root Using Newton-Raphson Iteration, Liang-Kai Wang and Michael J Schulte, Proceedings of the 16th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP’05), pp309–315, IEEE Computer Society Press, July 2005.
Abstract: With continued reductions in feature size, additional functionality may be added to future microprocessors to boost the performance of important application domains. Due to growth in commercial, financial, and Internet-based applications, decimal floating point arithmetic is now attracting more attention, and hardware support for decimal operations is being considered by various computer manufacturers. In order to standardize decimal number formats and operations, specifications for decimal floating-point arithmetic have been added to the draft revision of the IEEE-754 Standard for Floating-Point Arithmetic (IEEE-754R). This paper presents an efficient arithmetic algorithm and hardware design for decimal floating-point square root. This design uses an optimized piecewise linear approximation, a modified Newton-Raphson iteration, a specialized rounding technique, and a modified decimal multiplier. Synthesis results show that a 64-bit (16-digit) implementation of the decimal square root, which is compliant with the IEEE-754R, has an estimated critical path delay of 0.95 ns and maximum latency of 210 clock cycles when implemented using LSI Logic’s 0.11 micron Gflx-P Standard Cell library.

Where did all my decimals go?, Chuck Allison, Computing Sciences in Colleges, Vol. 21 #3, pp47–59, Consortium for Computing Sciences in Colleges, February 2006.
Abstract: It is tremendously ironic that computers were invented with number crunching in mind, yet nowadays most CS graduates leave school with little or no experience with the intricacies of numeric computation. This paper surveys what every CS graduate should know about floating-point arithmetic, based on experience teaching a recently-created course on modern numerical software development.
Integer Representation of Decimal Numbers for Exact Computations, Javier Bernal and Christoph Witzgall, Journal of Research of the National Institute of Standards and Technology, Vol. 111 #2, pp79–88, National Institute of Standards and Technology, March-April 2006.
Abstract: A scheme is presented and software is documented for representing as integers input decimal numbers that have been stored in a computer as double precision floating point numbers and for carrying out multiplications, additions and subtractions based on these numbers in an exact manner. The input decimal numbers must not have more than nine digits to the left of the decimal point. The decimal fractions of their floating point representations are all first rounded off at a prespecified location, a location no more than nine digits away from the decimal point. The number of digits to the left of the decimal point for each input number besides not being allowed to exceed nine must then be such that the total number of digits from the leftmost digit of the number to the location where round-off is to occur does not exceed fourteen.
A 64-bit Decimal Floating-Point Comparator, Ivan D. Castellanos and James E. Stine, IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06), pp138–144, IEEE, 2006.
Abstract: Decimal arithmetic is growing in importance as scientific studies reveal that current financial and commercial applications spend a high percentage overhead in this type of calculations. Typically, software is utilized to emulate decimal floating point arithmetic in these applications. On the other hand, functional units that employ decimal floating point hardware can improve performance by two or three orders of magnitude. This paper presents the design and implementation of a novel decimal floating-point comparator compliant with the current draft revision of the IEEE-754 Standard for floating-point arithmetic. It utilizes a novel BCD magnitude comparator with logarithmic delay and it supports 64-bit decimal floating-point numbers. Area and delay results are examined for an implementation in TSMC SCN6M SCMOS technology.
The Official "Do Not Use" List, The Joint Commission, URL:, 1p, 2006.
Abstract: In May 2005, The Joint Commission affirmed its “do not use” list of abbreviations. The list was originally created in 2004 by the Joint Commission as part of the requirements for meeting National Patient Safety Goal (NPSG) requirement 2B (Standardize a list of abbreviations, acronyms and symbols that are not to be used throughout the organization). Summit conclusions were posted on the Joint Commission website for public comment. During the four-week comment period, the Joint Commission received 5,227 responses, including 15,485 comments. More than 80 percent of the respondents supported the creation and adoption of a “do not use” list.
Reversible Implementation of Densely-Packed-Decimal Converter to and from Binary-Coded-Decimal Format Using in IEEE-754R, A. Kaivani, A. Zaker Alhosseini, S. Gorgin, and M. Fazlali, 9th International Conference on Information Technology (ICIT'06), pp273–276, IEEE, December 2006.
Abstract: The Binary Coded Decimal (BCD) encoding has always dominated the decimal arithmetic algorithms and their hardware implementation. Due to importance of decimal arithmetic, the decimal format defined in IEEE 754 floating point standard has been revisited. It uses Densely Packed Decimal (DPD) encoding to store significand part of a decimal floating point number. Furthermore in recent years reversible logic has attracted the attention of engineers for designing low power CMOS circuits, as it is not possible to realize quantum computing without reversible logic implementation. This paper derives the reversible implementation of DPD converter to and from conventional BCD format using in IEEE 754R.
On the Conversion Between Number Systems, Houssain Kettani, IEEE Transactions on Circuits and Systems, Vol. 53 #11, ISSN 1057-7130, pp1255–1258, IEEE, November 2006.
Abstract: This brief revisits the problem of conversion between number systems and asks the following question: given a nonnegative decimal number d, what is the value of the digit at position j in the corresponding base b number? Thus, we do not require the knowledge of other digits except the one we are interested in. Accordingly, we present a conversion function that relates each digit in a base b system to the decimal value that is equal to the base b number in question. We also show some applications of this new algorithm in the areas of parallel computing and cryptography.
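As a rough illustration of the question this abstract poses, the digit at position j of a nonnegative integer d written in base b can be read off with the textbook identity floor(d / b^j) mod b, without computing any other digit. The sketch below is only that identity; it is not claimed to be the conversion function presented in the paper, and the name digit_at is hypothetical.

```python
def digit_at(d: int, j: int, b: int) -> int:
    """Return the base-b digit of d at position j (0 = least significant).

    Illustrative only: uses the textbook identity floor(d / b**j) mod b,
    which is an assumption here, not necessarily the paper's function.
    """
    if d < 0 or j < 0 or b < 2:
        raise ValueError("need d >= 0, j >= 0 and b >= 2")
    # Shift the wanted digit into the units place, then mask it off.
    return (d // b**j) % b


# Each digit is obtained independently of the others.
hex_digit = digit_at(255, 1, 16)      # 255 is FF in base 16
decimal_digit = digit_at(2006, 3, 10)  # thousands digit of 2006
```

Because each call depends only on d, j, and b, the digits could in principle be computed in any order or in parallel, which is the property the abstract highlights.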
A Hybrid Decimal Division Algorithm Reducing Computational Iterations, Yong-Dae Kim, Soon-Youl Kwon, Seon-Kyoung Han, Kyoung-Rok Cho, and Younggap You, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences Vol. E89-A #6, pp1807–1812, The Institute of Electronics, Information and Communication Engineers, 2006.
Abstract: This paper presents a hybrid decimal division algorithm to improve division speed. The proposed hybrid algorithm employs either non-restoring or restoring algorithm on each digit to reduce iterative computations. The selection of the algorithm is based on the relative remainder values with respect to the half of its divisor. The proposed algorithm requires maximum 7n+4 add/subtract operations for an n-digit quotient, whereas other restoring or non-restoring schemes comprise more than 10n+1 operations.
A Radix-10 Combinational Multiplier, Tomás Lang and Alberto Nannarelli, Proceedings of 40th Asilomar Conference on Signals, Systems, and Computers, pp313–317, IEEE, October 2006.
Abstract: In this work, we present a combinational decimal multiply unit which can be pipelined to reach the desired throughput. With respect to previous implementations of decimal multiplication, the proposed unit is combinational (parallel) and not sequential, has a simpler recoding of the operands which reduces the number of partial product precomputations and uses counters to eliminate the need of the decimal equivalent of a 4:2 adder. The results of the implementation show that the combinational decimal multiplier offers a good compromise between latency and area when compared to other decimal multiply units and to binary double-precision multipliers.
Fast Decimal Floating-Point Division, Hooman Nikmehr, Braden Phillips, and Cheng-Chew Lim, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 14 #9, ISSN 1063-8210, pp951–961, IEEE, September 2006.
Abstract: A new implementation for decimal floating-point (DFP) division is introduced. The algorithm is based on high-radix SRT division (the SRT division algorithm is named after D. Sweeney, J. E. Robertson, and T. D. Tocher), with the recurrence in a new decimal signed-digit format. Quotient digits are selected using comparison multiples, where the magnitude of the quotient digit is calculated by comparing the truncated partial remainder with limited precision multiples of the divisor. The sign is determined concurrently by investigating the polarity of the truncated partial remainder. A timing evaluation using a logic synthesis shows a significant decrease in the division execution time in contrast with one of the fastest DFP dividers reported in the open literature.
Novel BCD Adders and Their Reversible Logic Implementation for IEEE 754r Format, Himanshu Thapliyal, Saurabh Kotiyal, and M. B. Srinivas, Proceeding of the 19th International Conference on VLSI Design (VLSID’06), pp387–392, IEEE, 2006.
Abstract: IEEE 754r is the ongoing revision to the IEEE 754 floating point standard and a major enhancement to the standard is the addition of decimal format. This paper proposes two novel BCD adders called carry skip and carry look-ahead BCD adders respectively. Furthermore, in the recent years, reversible logic has emerged as a promising technology having its applications in low power CMOS, quantum computing, nanotechnology, and optical computing. It is not possible to realize quantum computing without reversible logic. Thus, this paper also provides the reversible logic implementation of the conventional BCD adder as well as the proposed Carry Skip BCD adder using a recently proposed TSG gate. Furthermore, a new reversible gate called TS-3 is also being proposed and it has been shown that the proposed reversible logic implementation of the BCD Adders is much better compared to recently proposed one, in terms of number of reversible gates used and garbage outputs produced. The reversible BCD circuits designed and proposed here form the basis of the decimal ALU of a primitive quantum CPU.
Modified Carry Look Ahead BCD Adder With CMOS and Reversible Logic Implementation, Himanshu Thapliyal and Hamid R. Arabnia, Proceedings of the 2006 International Conference on Computer Design (CDES'06), ISBN 1-60132-009-4, pp64–69, CSREA Press, November 2006.
Abstract: IEEE 754r is the ongoing revision to the IEEE 754 floating point standard and a major enhancement to the standard is the addition of decimal format. Firstly, this paper proposes novel two transistor AND & OR gates. The proposed AND gate has no power supply, thus it can be referred as the Powerless AND gate. Similarly, the proposed two transistor OR gate has no ground and can be referred as Groundless OR. Two designs of AND & OR gate without VDD or GND are also shown. Secondly for IEEE 754r format, one novel BCD adder called carry look-ahead BCD adder is also proposed. In order to design the carry look-ahead BCD adder, a novel 4 bit carry look-ahead adder called NCLA is proposed which forms the basic building block of the proposed carry look-ahead BCD adder. The proposed two transistors AND & OR gates are used to provide the optimized small area, low power, high throughput circuitries of the proposed BCD adder. Nowadays, reversible logic is also emerging as a promising computing paradigm having its applications in quantum computing, optical computing and nanotechnology. Thus, reversible logic implementation of the proposed BCD Adder is also shown in this paper.
Design of Novel Reversible Carry Look-Ahead BCD Subtractor, Himanshu Thapliyal and Sumedha K. Gupta, Proceedings of the 9th International Conference on Information Technology (ICIT'06), ISBN 0-7695-2635-7, pp253–258, IEEE, December 2006.
Abstract: IEEE 754r is the ongoing revision to the IEEE 754 floating point standard. A major enhancement to the standard is the addition of decimal format, thus the design of BCD arithmetic units is likely to get significant attention. Firstly, this paper introduces a novel carry look-ahead BCD adder and then builds a novel carry look-ahead BCD subtractor based on it. Secondly, it introduces the reversible logic implementation of the proposed carry look-ahead BCD subtractor. We have tried to design the reversible logic implementation of the BCD Subtractor optimal in terms of number of reversible gates used and garbage outputs produced. Thus, the proposed work will be of significant value as the technologies mature.
Formal Design of Decimal Arithmetic Circuits Using Arithmetic Description Language, Yuki Watanabe, Naofumi Homma, Takafumi Aoki, and Tatsuo Higuchi, IEEE International Symposium on Intelligent Signal Processing and Communications, 2006 (ISPACS '06), ISBN 0-7803-9733-9, pp419–422, IEEE, December 2006.
Abstract: This paper presents a formal design of decimal arithmetic circuits using an arithmetic description language called ARITH. The use of ARITH makes possible (i) formal description of arithmetic algorithms including those using unconventional number systems, (ii) formal verification of described arithmetic algorithms, and (iii) translation of arithmetic algorithms to the equivalent HDL descriptions. In this paper, we demonstrate the potential of ARITH through an experimental design of binary coded decimal (BCD) arithmetic circuits.
Dynamic decimal adder circuit design by using the carry look ahead, Younggap You, Yong Dae Kim, and Jong Hwa Choi, IEEE Workshop on Design and Diagnostics of Electronic Circuits and Systems, 3pp, IEEE Computer Society, April 2006.
Abstract: This paper presents a carry look ahead (CLA) circuitry design based on dynamic circuit aiming at delay reduction in addition of BCD coded decimal numbers. The performance of the proposed dynamic decimal adder is analyzed demonstrating its speed improvement. Timing simulation on the proposed decimal addition circuit employing 0.25µm CMOS technology yields the worst case delay of 622 ns.

Solving Constraints on the Intermediate Result of Decimal Floating-Point Operations, Merav Aharoni, Ron Maharik, and Abraham Ziv, Proceedings of the 18th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2854-6, ISBN 978-0-7695-2854-0, pp38–45, IEEE, June 2007.
Abstract: The draft revision of the IEEE Standard for Floating-Point Arithmetic (IEEE P754) includes a definition for decimal floating-point (FP) in addition to the widely used binary FP specification. The decimal standard raises new concerns with regard to the verification of hardware- and software-based designs. The verification process normally emphasizes intricate corner cases and uncommon events. The decimal format introduces several new classes of such events in addition to those characteristic of binary FP. Our work addresses the following problem: Given a decimal floating-point operation, a constraint on the intermediate result, and a constraint on the representation selected for the result, find random inputs for the operation that yield an intermediate result compatible with these specifications. The paper supplies efficient analytic solutions for addition and for some cases of multiplication and division. We provide probabilistic algorithms for the remaining cases. These algorithms prove to be efficient in the actual implementation.
Extending TeX and METAFONT with floating-point arithmetic, Nelson H.F. Beebe, Proceedings of TUG 2007, TUGboat Vol. 28 #3, ISSN 0896-3207, pp319–328, TeX User's Group, July 2007.
Abstract: The article surveys the state of arithmetic in TeX and METAFONT, suggests that they could usefully be extended to support floating-point arithmetic, and shows how this could be done with a relatively small effort, without loss of the important feature of platform-independent results from those programs, and without invalidating any existing documents, or software written for those programs, including output drivers.
Performance Characterization of Decimal Arithmetic in Commercial Java Workloads, M. Bhat, J. Crawford, R. Morin, and K. Shiv, IEEE International Symposium on Performance Analysis of Systems & Software, 2007 (ISPASS 2007) IEEE, pp54–61, April 2007.
Abstract: Binary floating-point numbers with finite precision cannot represent all decimal numbers with complete accuracy. This can often lead to errors while performing calculations involving floating point numbers. For this reason many commercial applications use special decimal representations for performing these calculations, but their use carries performance costs such as bi-directional conversion. The purpose of this study was to understand the total application performance impact of using these decimal representations in commercial workloads, and provide a foundation of data to justify pursuing optimized hardware support for decimal math. In Java, a popular development environment for commercial applications, the BigDecimal class is used for performing accurate decimal computations. BigDecimal provides operations for arithmetic, scale manipulation, rounding, comparison, hashing, and format conversion. We studied the impact of BigDecimal usage on the performance of server-side Java applications by analyzing its usage on two standard enterprise benchmarks, SPECjbb2005 and SPECjAppServer2004 as well as a real-life mission-critical financial workload, Morgan Stanley’s Trade Completion. In this paper, we present detailed performance characteristics and we conclude that, relative to total application performance, the overhead of using software decimal implementations is low, and at least from the point of view of these workloads, there is insufficient performance justification to pursue hardware solutions.
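The representation problem described in the first sentences of this abstract can be seen in a few lines. The sketch below uses Python's standard decimal module as a stand-in for a software decimal type such as Java's BigDecimal; that substitution is an assumption made for illustration and is not the setup measured in the paper.

```python
from decimal import Decimal

# Binary floating point: 0.1, 0.2 and 0.3 are each stored as the nearest
# binary fraction, so the sum does not compare equal to 0.3.
binary_sum = 0.1 + 0.2
binary_exact = (binary_sum == 0.3)

# Software decimal arithmetic: the operands are held as decimal digits,
# so the same sum is exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_exact = (decimal_sum == Decimal("0.3"))

print(binary_exact, decimal_exact)
```

Note that the decimal values are constructed from strings; building them from binary floats would carry the binary representation error into the decimal value.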
A Software Implementation of the IEEE 754R Decimal Floating-Point Arithmetic Using the Binary Encoding Format, Marius Cornea, Cristina Anderson, John Harrison, Ping Tak Peter Tang, Eric Schneider, and Charles Tsen, Proceedings of the 18th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2854-6, ISBN 978-0-7695-2854-0, pp29–37, IEEE, June 2007.
Abstract: The IEEE Standard 754-1985 for Binary Floating-Point Arithmetic was revised, and an important addition is the definition of decimal floating-point arithmetic. This is intended mainly to provide a robust, reliable framework for financial applications that are often subject to legal requirements concerning rounding and precision of the results, because the binary floating-point arithmetic may introduce small but unacceptable errors. Using binary floating-point calculations to emulate decimal calculations in order to correct this issue has led to the existence of numerous proprietary software packages, each with its own characteristics and capabilities. IEEE 754R decimal arithmetic should unify the ways decimal floating-point calculations are carried out on various platforms. New algorithms and properties are presented in this paper which are used in a software implementation of the IEEE 754R decimal floating-point arithmetic, with emphasis on using binary operations efficiently. The focus is on rounding techniques for decimal values stored in binary format, but algorithms for the more important or interesting operations of addition, multiplication, division, and conversions between binary and decimal floating-point formats are also outlined. Performance results are included for a wider range of operations, showing promise that our approach is viable for applications that require decimal floating-point calculations.
Multioperand Parallel Decimal Adder: A Mixed Binary and BCD Approach, Luigi Dadda, IEEE Transactions on Computers, Vol. 56 #10, ISSN 0018-9340, pp1320–1328, IEEE, October 2007.
Abstract: Decimal arithmetic has been in recent years revived due to the large amount of data in commercial applications. We consider the problem of Multi Operand Parallel Decimal Addition with an approach that uses binary arithmetic, suggested by the adoption of BCD numbers. This involves corrections in order to obtain the BCD result, or a binary to decimal conversion. We adopt the latter approach, particularly efficient for a large number of addends. Conversion requires a relatively small area and can afford fast operation. The BD conversion, moreover, allows an easy alignment of the sums of adjacent columns. We treat the design of BCD digit adders using fast carry free adders and the conversion problem through a known parallel scheme using elementary conversion cells. Spreadsheets have been developed for adding several BCD digits and for simulating the binary to decimal conversion as design tool.
Decimal floating-point in z9: An implementation and testing perspective, A. Y. Duale, M. H. Decker, H.-G. Zipperer, M Aharoni, and T. J. Bohizic, IBM Journal of Research and Development, Vol. 51 #1/2, ISSN 0018-8646, pp217–227, IBM, January 2007.
Abstract: Although decimal arithmetic is widely used in commercial and financial applications, the related computations are handled in software. As a result, applications that use decimal data may experience performance degradations. Use of the newly defined decimal floating-point (DFP) format instead of binary floating-point is expected to significantly improve the performance of such applications. System z9™ is the first IBM machine to support the DFP instructions. We present an overview of this implementation and provide some measurement of the performance gained using hardware assists. Various tools and techniques employed for the DFP verification on unit, element, and system levels are presented in detail. Several groups within IBM collaborated on the verification of the new DFP facility, using a common reference model to predict DFP results.
IBM POWER6 accelerators: VMX and DFU, L. Eisen, J. W. Ward III, H.-W. Tast, N. Mäding, J. Leenstra, S. M. Mueller, C. Jacobi, J. Preiss, E. M. Schwarz, and S. R. Carlough, IBM Journal of Research and Development Vol. 51 #6, ISSN 0018-8646, pp663–683, IBM, November 2007.
Abstract: The IBM POWER6 microprocessor core includes two accelerators for increasing performance of specific workloads. The vector multimedia extension (VMX) provides a vector acceleration of graphic and scientific workloads. It provides single instructions that work on multiple data elements. The instructions separate a 128-bit vector into different components that are operated on concurrently. The decimal floating-point unit (DFU) provides acceleration of commercial workloads, more specifically, financial transactions. It provides a new number system that performs implicit rounding to decimal radix points, a feature essential to monetary transactions. The IBM POWER processor instruction set is substantially expanded with the addition of these two accelerators. The VMX architecture contains 176 instructions, while the DFU architecture adds 54 instructions to the base architecture. The IEEE 754R Binary Floating-Point Arithmetic Standard defines decimal floating-point formats, and the POWER6 processor—on which a substantial amount of area has been devoted to increasing performance of both scientific and commercial workloads—is the first commercial hardware implementation of this format.
Decimal Floating-Point Multiplication Via Carry-Save Addition, Mark A. Erle, Michael J. Schulte, and Brian J. Hickmann, Proceedings of the 18th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2854-6, ISBN 978-0-7695-2854-0, pp46–55, IEEE, June 2007.
Abstract: Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents the design of a decimal floating-point multiplier that complies with specifications for decimal multiplication given in the draft revision of the IEEE 754 Standard for Floating-point Arithmetic (IEEE 754R). This multiplier extends a previously published decimal fixed-point multiplier design by adding several features including exponent generation, sticky bit generation, shifting of the intermediate product, rounding, and exception detection and handling. The core of the decimal multiplication algorithm is an iterative scheme of partial product accumulation employing decimal carry-save addition to reduce the critical path delay. Novel features of the proposed multiplier include support for decimal floating-point numbers, on-the-fly generation of the sticky bit, early estimation of the shift amount, and efficient decimal rounding. Area and delay estimates are provided for a verified Verilog register transfer level model of the multiplier.
A Parallel IEEE P754 Decimal Floating-Point Multiplier, Brian J. Hickmann, Andrew Krioukov, Michael J. Schulte, and Mark A. Erle, Proceedings of the IEEE International Conference on Computer Design 2007, pp296–303, IEEE, October 2007.
Abstract: Decimal floating-point multiplication is important in many commercial applications including banking, tax calculation, currency conversion, and other financial areas. This paper presents a fully parallel decimal floating-point multiplier compliant with the recent draft of the IEEE P754 Standard for Floating-point Arithmetic (IEEE P754). The novelty of the design is that it is the first parallel decimal floating-point multiplier offering low latency and high throughput. This design is based on a previously published parallel fixed-point decimal multiplier which uses alternate decimal digit encodings to reduce area and delay. The fixed-point design is extended to support floating-point multiplication by adding several components including exponent generation, rounding, shifting, and exception handling. Area and delay estimates are presented that show a significant latency and throughput improvement with a substantial increase in area as compared to the only published IEEE P754 compliant sequential floating-point multiplier. To the best of our knowledge, this is the first publication to present a fully parallel decimal floating-point multiplier that complies with IEEE P754.
On Designs of Radix Converters using Arithmetic Decompositions, Yukihiro Iguchi, Tsutomu Sasao, and Munehiro Matsuura, Proceedings of ISMVL-2007, Oslo, Norway (CD-ROM), 8pp, IEEE, May 2007.
Abstract: In digital signal processing, radixes other than two are often used for high-speed computation. In the computation for finance, decimal numbers are used instead of binary numbers. In such cases, radix converters are necessary. This paper considers design methods for binary to q-nary converters. It introduces a new design technique based on weighted-sum (WS) functions. The method computes a WS function for each digit by an LUT cascade and a binary adder, then adds adjacent digits with q-nary adders. A 16-bit binary to decimal converter is designed to show the method.
Design Methods of Radix Converters using Arithmetic Decompositions, Yukihiro Iguchi, Tsutomu Sasao, and Munehiro Matsuura, Institute of Electronics, Information and Communication Engineers, Transactions on Information and Systems, Vol. E90-D #6, pp905–914, IEICE, June 2007.
Abstract: In arithmetic circuits for digital signal processing, radixes other than two are often used to make circuits faster. In such cases, radix converters are necessary. However, in general, radix converters tend to be complex. This paper considers design methods for p-nary to binary converters. First, it considers Look-Up Table (LUT) cascade realizations. Then, it introduces a new design technique called arithmetic decomposition by using LUTs and adders. Finally, it compares the amount of hardware and performance of radix converters implemented by FPGAs. 12-digit ternary to binary converters on Cyclone II FPGAs designed by the proposed method are faster than ones by conventional methods.
Quick Addition of Decimals Using Reversible Conservative Logic, Rekha K. James, Shahana T. K., K. Poulose Jacob, and Sreela Sasi, 15th International Conference on Advanced Computing and Communications (ADCOM 2007), ISBN 0-7695-3059-1, pp191–196, IEEE Computer Society, December 2007.
Abstract: In recent years, reversible logic has emerged as one of the most important approaches for power optimization with its application in low power CMOS, nanotechnology and quantum computing. This research proposes quick addition of decimals (QAD) suitable for multi-digit BCD addition, using reversible conservative logic. The design makes use of reversible fault tolerant Fredkin gates only. The implementation strategy is to reduce the number of levels of delay thereby increasing the speed, which is the most important factor for high speed circuits.
A Radix-10 Digit-Recurrence Division Unit: Algorithm and Architecture, Tomás Lang and Alberto Nannarelli, IEEE Transactions on Computers, Vol. 56 #6, pp727–739, IEEE, June 2007.
Abstract: In this work, we present a radix-10 division unit that is based on the digit-recurrence algorithm. The previous decimal division designs do not include recent developments in the theory and practice of this type of algorithm, which were developed for radix-2^k dividers. In addition to the adaptation of these features, the radix-10 quotient digit is decomposed into a radix-2 digit and a radix-5 digit in such a way that only five and two times the divisor are required in the recurrence. Moreover, the most significant slice of the recurrence, which includes the selection function, is implemented in radix-2, avoiding the additional delay introduced by the radix-10 carry-save additions and allowing the balancing of the paths to reduce the cycle delay. The results of the implementation of the proposed radix-10 division unit show that its latency is close to that of radix-16 division units (comparable dynamic range of significands) and it has a shorter latency than a radix-10 unit based on the Newton-Raphson approximation.
Design and Synthesis of a Carry-Free Signed-Digit Decimal Adder, John Moskal, Erdal Oruklu, and Jafar Saniie, IEEE International Symposium on Circuits and Systems (ISCAS 2007), pp1089–1092, IEEE, May 2007.
Abstract: The decimal arithmetic has been receiving an increased attention because of the growth of financial and scientific applications requiring high precision and increased computing power. This paper presents an efficient architecture for multi-digit decimal addition based on carry-free signed-digit numbers. In this study, the decimal adder architecture has been designed and synthesized using the TSMC 0.18µm technology. The synthesis results were compared to the existing decimal adders with respect to design area, delay and power consumption. These results show that proposed adder architecture improves the area-delay factor by 3 for a 32 digit adder.
Hardware Design of a Binary Integer Decimal-based IEEE P754 Rounding Unit, Charles Tsen, Michael J. Schulte, and Sonia Gonzalez-Navarro, Proceedings of the IEEE 18th International Conference on Application-specific Systems, Architectures and Processors (ASAP), 7pp, IEEE, July 2007.
Abstract: Because of the growing importance of decimal floating-point (DFP) arithmetic, specifications for it were recently added to the draft revision of the IEEE 754 Standard (IEEE P754). In this paper, we present a hardware design for a rounding unit for 64-bit DFP numbers (decimal64) that use the IEEE P754 binary encoding of DFP numbers, which is widely known as the Binary Integer Decimal (BID) encoding. We summarize the technique used for rounding, present the theory and design of the BID rounding unit, and evaluate its critical path delay, latency, and area for combinational and pipelined designs. Over 86% of the rounding unit’s area is due to a 55-bit by 54-bit binary multiplier, which can be shared with a double-precision binary floating-point multiplier. To our knowledge, this is the first hardware design for rounding IEEE P754 BID-encoded DFP numbers.
Hardware Design of a Binary Integer Decimal-based Floating-point Adder, Charles Tsen, Sonia Gonzalez-Navarro, and Michael J. Schulte, Proceedings of the IEEE 25th International Conference on Computer Design, 9pp, IEEE, October 2007.
Abstract: Because of the growing importance of decimal floating-point (DFP) arithmetic, specifications for it are included in the IEEE Draft Standard for Floating-point Arithmetic (IEEE P754). In this paper, we present a novel algorithm and hardware design for a DFP adder. The adder performs addition and subtraction on 64-bit operands that use the IEEE P754 binary encoding of DFP numbers, widely known as the Binary Integer Decimal (BID) encoding. The BID adder uses a novel hardware component for decimal digit counting and an enhanced version of a previously published BID rounding unit. By adding more sophisticated control, operations are performed with variable latency to optimize for common cases. We show that a BID-based DFP adder design can be achieved with a modest area increase compared to a single 2-stage pipelined 64-bit fixed-point multiplier. Over 70% of the BID adder’s area is due to the 64-bit fixed-point multiplier, which can be shared with a binary floating-point multiplier and hardware for other DFP operations. To our knowledge, this is the first hardware design for adding and subtracting IEEE P754 BID-encoded DFP numbers.
Functions to Support Input and Output of Intervals, M. H. van Emden, B. Moa, and S. C. Somosan, Report DCS-311-IR, 16pp, University of Victoria, Canada, February 2007.
Abstract: Interval arithmetic is hardly feasible without directed rounding as provided, for example, by the IEEE floating-point standard. Equally essential for interval methods is directed rounding for conversion between the external decimal and internal binary numerals. This is not provided by the standard I/O libraries. Conversion algorithms exist that guarantee identity upon conversion followed by its inverse. Although it may be possible to adapt these algorithms for use in decimal interval I/O, we argue that outward rounding in radix conversion is computationally a simpler problem than guaranteeing identity. Hence it is preferable to develop decimal interval I/O ab initio, which is what we do in this paper.
A New Family of High–Performance Parallel Decimal Multipliers, Alvaro Vázquez, Elisardo Antelo, and Paolo Montuschi, Proceedings of the 18th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2854-6, ISBN 978-0-7695-2854-0, pp195–204, IEEE, June 2007.
Abstract: This paper introduces two novel architectures for parallel decimal multipliers. Our multipliers are based on a new algorithm for decimal carry–save multioperand addition that uses a novel BCD–4221 recoding for decimal digits. It significantly improves the area and latency of the partial product reduction tree with respect to previous proposals. We also present three schemes for fast and efficient generation of partial products in parallel. The recoding of the BCD–8421 multiplier operand into minimally redundant signed–digit radix–10, radix–4 and radix–5 representations using new recoders reduces the complexity of partial product generation. In addition, SD radix–4 and radix–5 recodings allow the reuse of a conventional parallel binary radix–4 multiplier to perform combined binary/decimal multiplications. Evaluation results show that the proposed architectures have interesting area–delay figures compared to conventional Booth radix–4 and radix–8 parallel binary multipliers and other representative alternatives for decimal multiplication.
Novel, High-Speed 16-Digit BCD Adders Conforming to IEEE 754r Format, Sreehari Veeramachaneni, M.Kirthi Krishna, Lingamneni Avinash, Sreekanth Reddy P, and M.B. Srinivas, IEEE Computer Society Annual Symposium on VLSI (ISVLSI '07), pp343–350, IEEE, May 2007.
Abstract: In view of increasing prominence of commercial, financial and internet-based applications that process data in decimal format, there is a renewed interest in providing hardware support to handle decimal data. In this paper, a new architecture for efficient 1-digit decimal addition of binary coded decimal (BCD) operands, which is the core of high speed multi-operand adders and floating decimal-point arithmetic, is proposed. Based on this 1-digit BCD adder, novel architectures for higher order (n-digit) BCD adders such as ripple carry adder and carry look-ahead adder are derived. The proposed circuits are compared (both qualitatively as well as quantitatively) with the existing circuits in literature and are shown to perform better. Simulation results show that the proposed 1-digit BCD adder achieves an improvement of 40% in delay. The 16-digit BCD lookahead adder using prefix logic is shown to perform at least 80% faster than the existing ripple carry one.
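Illustrative note (not from the paper): for readers unfamiliar with 1-digit BCD addition, the textbook scheme can be sketched as follows. This is the conventional add-then-correct-by-six method in a ripple-carry arrangement, not the new architecture the paper proposes. The function names are hypothetical.

```python
def bcd_digit_add(a, b, carry_in=0):
    """Add two BCD digits (0-9) and a carry; return (sum_digit, carry_out)."""
    if not (0 <= a <= 9 and 0 <= b <= 9 and carry_in in (0, 1)):
        raise ValueError("operands must be BCD digits and carry must be 0 or 1")
    s = a + b + carry_in      # plain binary sum, in the range 0..19
    if s > 9:
        s += 6                # skip the six unused 4-bit codes (1010..1111)
    return s & 0xF, s >> 4    # low nibble is the digit, bit 4 is the carry


def bcd_ripple_add(x, y):
    """Add two equal-length digit lists, least-significant digit first."""
    if len(x) != len(y):
        raise ValueError("operands must have the same number of digits")
    carry = 0
    digits = []
    for a, b in zip(x, y):
        d, carry = bcd_digit_add(a, b, carry)
        digits.append(d)
    return digits, carry


# 999 + 001, digits stored least-significant first
print(bcd_ripple_add([9, 9, 9], [1, 0, 0]))   # → ([0, 0, 0], 1)
```

The carry ripples through every digit position in this sketch, which is the delay that carry look-ahead and prefix designs aim to reduce.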
Decimal Floating-Point Adder and Multifunction Unit with Injection-Based Rounding, Liang-Kai Wang and Michael J. Schulte, Proceedings of the 18th IEEE Symposium on Computer Arithmetic, ISBN 0-7695-2854-6, ISBN 978-0-7695-2854-0, pp56–65, IEEE, June 2007.
Abstract: Shrinking feature sizes gives more headroom for designers to extend the functionality of microprocessors. The IEEE 754R working group has revised the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic to include specifications for decimal floating-point arithmetic and IBM recently announced incorporating a decimal floating-point unit into their POWER6 processor. As processor support for decimal floating-point arithmetic emerges, it is important to investigate efficient algorithms and hardware designs for common decimal floating-point arithmetic algorithms. This paper presents novel designs for a decimal floating-point adder and a decimal floating-point multifunction unit. To reduce their delay, both the adder and the multifunction unit use decimal injection-based rounding, a new form of decimal operand alignment, and a fast flag-based method for rounding and overflow detection. Synthesis results indicate that the proposed adder is roughly 21% faster and 1.6% smaller than a previous decimal floating-point adder design, when implemented in the same technology. Compared to the decimal floating-point adder, the decimal floating-point multifunction unit provides six additional operations, yet only has 2.8% more delay and 9.7% more area.
Benchmarks and Performance Analysis of Decimal Floating-Point Applications, Liang-Kai Wang, Charles Tsen, Michael J. Schulte, and Divya Jhalani, Proceedings of the IEEE International Conference on Computer Design 2007, pp164–170, IEEE, October 2007.
Abstract: The IEEE P754 Draft Standard for Floating-point Arithmetic provides specifications for Decimal Floating-Point (DFP) formats and operations. Based on this standard, many developers will provide support for DFP calculations. We present a benchmark suite for DFP applications and use this suite to evaluate the performance of hardware and software DFP solutions. Our benchmarks include banking, commerce, risk-management, tax, and telephone billing applications organized into a suite of five macro benchmarks. In addition to developing our own applications, we leverage open-source projects and academic financial analysis applications. The benchmarks are modular, making them easy to adapt for different DFP solutions. We use the benchmarks to evaluate the performance of the decNumber DFP library and an extended version of the SimpleScalar PISA architecture with hardware and instruction set support for DFP operations. Our analysis shows that providing processor support for high-speed DFP operations significantly improves the performance of DFP applications.
A Decimal Floating-Point Divider using Newton-Raphson Iteration, Liang-Kai Wang and Michael J. Schulte, Journal of VLSI Signal Processing Systems, Vol. 49 #1, ISSN 0922-5773, pp3–18, Kluwer Academic Publishers, October 2007.
Abstract: Increasing chip densities and transistor counts provide more room for designers to add functionality for important application domains into future microprocessors. As a result of rapid growth in financial, commercial, and Internet-based applications, hardware support for decimal floating-point arithmetic is now being considered by various computer manufacturers and specifications for decimal floating-point arithmetic have been added to the draft revision of the IEEE-754 Standard for Floating-Point Arithmetic (IEEE P754). In this paper, we present an efficient arithmetic algorithm and hardware design for decimal floating-point division. The design uses an efficient piecewise linear approximation, a modified Newton-Raphson iteration, a specialized rounding technique, and a simplified decimal incrementer and decrementer. Synthesis results show that a 64-bit (16-digit) implementation of the decimal divider, which is compliant with the current version of IEEE P754, has an estimated critical path delay of 0.69 ns (around 13 FO4 inverter delays) when implemented using LSI Logic’s 0.11 micron Gflx-P standard cell library.
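Illustrative note (not from the paper): the Newton-Raphson reciprocal iteration underlying such dividers is x_next = x * (2 - d * x), which roughly doubles the number of correct digits per step. The sketch below is a software model using Python's decimal module and makes several assumptions that differ from the paper. It uses a crude constant seed of 0.1 instead of a piecewise linear approximation, a fixed iteration count, and a plain final rounding that is not guaranteed to be correctly rounded. The function name and parameters are hypothetical.

```python
from decimal import Decimal, localcontext


def reciprocal_newton(d, digits=16, iterations=20):
    """Approximate 1/d for 1 <= d < 10 by Newton-Raphson iteration."""
    d = Decimal(d)
    if not (Decimal(1) <= d < Decimal(10)):
        raise ValueError("d must be normalized to the range [1, 10)")
    with localcontext() as ctx:
        ctx.prec = digits + 4          # a few guard digits while iterating
        x = Decimal("0.1")             # seed: 0 < x < 2/d holds for all d in [1, 10)
        two = Decimal(2)
        for _ in range(iterations):
            x = x * (two - d * x)      # error e = 1 - d*x is squared each step
        ctx.prec = digits
        return +x                      # unary plus rounds to the target precision


print(reciprocal_newton("3"))
print(Decimal(7) * reciprocal_newton("3"))   # a quotient as numerator * reciprocal
```

Because the seed satisfies 0 < x < 2/d, the initial error 1 - d*x is below 1 in magnitude and the iteration converges. A better seed, as in the paper, would cut the number of iterations needed.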
Processor support for decimal floating-point arithmetic, Liang-Kai Wang, ISBN 978-0-549-19463-7, 157pp, University of Wisconsin at Madison, 2007.
Abstract: Decimal data permeates society, as humans most commonly use base-ten numbers. Although microprocessors normally use base-two binary arithmetic to obtain faster execution times and simpler circuitry, binary numbers cannot represent decimal fractions exactly. This leads to large errors being accumulated after several decimal operations. Furthermore, binary floating-point arithmetic operations perform binary rounding instead of decimal rounding. Consequently, applications, such as financial, commercial, tax, and Internet-based applications, which are sensitive to representation and rounding errors, often require decimal arithmetic. Due to the increasing importance of and demand for decimal arithmetic, its formats and operations have been specified in the IEEE Draft Standard for Floating-point Arithmetic (IEEE P754).
   Most decimal applications use software routines and binary arithmetic to emulate decimal operations. Although this approach eliminates errors due to converting between binary and decimal numbers and provides decimal rounding to mirror manual calculations, it results in long latencies for numerically intensive commercial applications. This is because software emulation of decimal floating-point (DFP) arithmetic has significant overhead due to function calls, dealing with decimal formats, operand alignment, decimal rounding, and special case and exception handling.
   This dissertation investigates processor support for decimal floating-point arithmetic. It first reviews recent progress in decimal arithmetic, including decimal encodings, the IEEE P754 Draft Standard, and software packages, hardware designs, and benchmark suites for decimal arithmetic. Next, this dissertation presents novel arithmetic algorithms and hardware designs for basic DFP operations, including DFP addition, subtraction, division, square root, and others. Most of the hardware designs presented in this dissertation are the first published designs compliant with the IEEE P754 Draft Standard. Finally, to study the performance impact of DFP instructions and hardware, this dissertation presents the first publicly available benchmark suite for DFP arithmetic. This benchmark suite, along with instruction set extensions and a decimal-enhanced processor simulator, are used to demonstrate that providing fast hardware support for DFP operations leads to significant performance benefits to DFP-intensive applications.
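Illustrative note (not from the dissertation): the representation and rounding errors mentioned in the abstract are easy to reproduce in software. The sketch below uses Python's built-in binary floats and its decimal module. It shows that ten additions of 0.1 do not total exactly 1 in binary, and that rounding 2.675 to two places differs between binary floating-point and decimal half-up rounding.

```python
from decimal import Decimal, ROUND_HALF_UP

# Representation error: 0.1 has no exact binary floating-point encoding.
binary_total = sum([0.1] * 10)
decimal_total = sum([Decimal("0.1")] * 10)
print(binary_total == 1.0)      # → False
print(decimal_total == 1)       # → True

# Rounding error: the double nearest to 2.675 lies slightly below it,
# so binary rounding goes down where manual decimal rounding goes up.
binary_rounded = round(2.675, 2)
decimal_rounded = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(binary_rounded)           # → 2.67
print(decimal_rounded)          # → 2.68
```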

A Novel Approach to Design BCD Adder and Carry Skip BCD Adder, Ashis Kumer Biswas, Md. Mahmudul Hasan, Moshaddek Hasan, Ahsan Raja Chowdhury, and Hafiz Md. Hasan Babu, Proceedings of the 21st International Conference on VLSI Design (VLSID '08), ISBN 0-7695-3083-4, pp566–571, IEEE Computer Society, January 2008.
Abstract: Reversible logic has become one of the most promising research areas in the past few decades and has found its applications in several technologies; such as low power CMOS, nanocomputing and optical computing. This paper presents improved and efficient reversible logic implementations for Binary Coded Decimal (BCD) adder as well as Carry Skip BCD adder. It has been shown that the modified designs outperform the existing ones in terms of number of gates, number of garbage output and delay.
Efficient approaches for designing reversible Binary Coded Decimal adders, Ashis Kumer Biswas, Md. Mahmudul Hasan, Ahsan Raja Chowdhury, and Hafiz Md. Hasan Babu, Microelectronics Journal, Vol. 39 #12, ISSN 0026-2692, pp1693–1703, Elsevier, December 2008.
Abstract: Reversible logic has become one of the most promising research areas in the past few decades and has found its applications in several technologies; such as low-power CMOS, nanocomputing and optical computing. This paper presents improved and efficient reversible logic implementations for Binary Coded Decimal (BCD) adder as well as Carry Skip BCD adder. It has been shown that the modified designs outperform the existing ones in terms of number of gates, number of garbage outputs, delay, and quantum cost. In order to show the efficiency of the proposed designs, lower bounds of the reversible BCD adders in terms of gates and garbage outputs are proposed as well.
Compressor trees for decimal partial product reduction, Ivan D. Castellanos and James E. Stine, Proceedings of the 18th ACM Great Lakes symposium on VLSI, ISBN 978-1-59593-999-9, pp107–110, ACM Press, 2008.
Abstract: Decimal multiplication has grown in interest due to the recent announcement of new IEEE 754R standards and the availability of high-speed decimal computation hardware. Prior research enabled partial products to be coded more efficiently for their use in radix 10 architectures. This paper clarifies previous techniques for partial product reduction using carry-save adders and presents a new 4:2 compressor structure. This new structure improves performance at the expense of more gates, however, regularity is introduced into the circuit to promote implementations in Very Large Scale Integration (VLSI) Designs. Results are presented and compared for several designs using a TSMC SCN6M 0.18 µm feature size.
Algorithms and Hardware Designs for Decimal Multiplication, Mark A. Erle, 217pp, Lehigh University, November 2008.
Abstract: Although a preponderance of business data is in decimal form, virtually all floating-point arithmetic units on today’s general-purpose microprocessors are based on the binary number system. Higher performance, less circuitry, and better overall error characteristics are the main reasons why binary floating-point hardware (BFP) is chosen over decimal floating-point (DFP) hardware. However, the binary number system cannot precisely represent many common decimal values. Further, although BFP arithmetic is well-suited for the scientific community, it is quite different from manual calculation norms and does not meet many legal requirements.
   Due to the shortcomings of BFP arithmetic, many applications involving fractional decimal data are forced to perform their arithmetic either entirely in software or with a combination of software and decimal fixed-point hardware. Providing DFP hardware has the potential to dramatically improve the performance of such applications. Only recently has a large microprocessor manufacturer begun providing systems with DFP hardware. With available die area continually increasing, dedicated DFP hardware implementations are likely to be offered by other microprocessor manufacturers.
   This dissertation discusses the motivation for decimal computer arithmetic, a brief history of this arithmetic, and relevant software and processor support for a variety of decimal arithmetic functions. As the context of the research is the IEEE Standard for Floating-point Arithmetic (IEEE 754-2008) and two-state transistor technology, descriptions of the standard and various decimal digit encodings are described.
   The research presented investigates algorithms and hardware support for decimal multiplication, with particular emphasis on DFP multiplication. Both iterative and parallel implementations are presented and discussed. Novel ideas are advanced such as the use of decimal counters and compressors and the support of IEEE 754-2008 floating-point, including early estimation of the shift amount, in-line exception handling, on-the-fly sticky bit generation, and efficient decimal rounding. The iterative and parallel, decimal multiplier designs are compared and contrasted in terms of their latency, throughput, area, delay, and usage.
   The culmination of this research is the design and comparison of an iterative DFP multiplier with a parallel DFP multiplier. The iterative DFP multiplier is significantly smaller and may achieve a higher practical frequency of operation than the parallel DFP multiplier. Thus, in situations where the area available for DFP is an important design constraint, the iterative DFP multiplier may be an attractive implementation. However, the parallel DFP multiplier has less latency for a single multiply operation and is able to produce a new result every cycle. As for power considerations, the fewer overall devices in the iterative multiplier, and more importantly the fewer storage elements, should result in less leakage. This benefit is mitigated by its higher latency and lower throughput.
   The proposed implementations are suitable for general-purpose, server, and mainframe microprocessor designs. Depending on the demand for DFP in human-centric applications, this research may be employed in the application-specific integrated circuits (ASICs) market.

Note: Available at
A BCD-based architecture for fast coordinate rotation, Antonio Jimeno, Higinio Mora, Jose L. Sanchez, and Francisco Pujol, Journal of Systems Architecture: the EUROMICRO Journal, Vol. 54 #8, ISSN 1383-7621, pp829–840, Elsevier, August 2008.
Abstract: Although radix 10 based arithmetic has been gaining renewed importance over the last few years, decimal systems are not efficient enough and techniques are still under development. In this paper, an improvement of the CORDIC (coordinate rotation digital computer) method for decimal representation is proposed and applied to produce fast rotations. The algorithm uses BCD operands as inputs, combining the advantages of both decimal and binary systems. The result is a reduction of 50% in the number of iterations if compared with the original Decimal CORDIC method. Finally, we present a hardware architecture useful to produce BCD coordinates rotations accurately and fast, and different experiments demonstrating the advantages of the new method are shown. A reduction of 75% in a single stage delay is obtained, whereas the circuit area just increases in about 5%.
Optimized reversible binary-coded decimal adders, Michael Kirkedal Thomsen and Robert Glück, Journal of Systems Architecture: the EUROMICRO Journal, Vol. 54 #7, ISSN 1383-7621, pp697–706, Elsevier, July 2008.
Abstract: Babu and Chowdhury recently proposed, in this journal, a reversible adder for binary-coded decimals. This paper corrects and optimizes their design. The optimized 1-decimal BCD full-adder, a 13x13 reversible logic circuit, is faster, and has lower circuit cost and less garbage bits. It can be used to build a fast reversible m-decimal BCD full-adder that has a delay of only m+17 low-power reversible CMOS gates. For a 32-decimal (128-bit) BCD addition, the circuit delay of 49 gates is significantly lower than is the number of bits used for the BCD representation. A complete set of reversible half- and full-adders for n-bit binary numbers and m-decimal BCD numbers is presented. The results show that special-purpose design pays off in reversible logic design by drastically reducing the number of garbage bits. Specialized designs benefit from support by reversible logic synthesis. All circuit components required for optimizing the original design could also be synthesized successfully by an implementation of an existing synthesis algorithm.
A Novel Carry-Look Ahead Approach to a Unified BCD and Binary Adder/Subtractor, Sreehari Veeramachaneni, M. Kirthi Krishna, G. V. Prateek, S. Subroto, S. Bharat, and M. B. Srinivas, Proceedings of the 21st International Conference on VLSI Design (VLSID '08), ISBN 0-7695-3083-4, pp547–552, IEEE Computer Society, January 2008.
Abstract: Increasing prominence of commercial, financial and internet-based applications, which process decimal data, there is an increasing interest in providing hardware support for such data. In this paper, new architecture for efficient binary and Binary Coded Decimal (BCD) adder/subtractor is presented. This employs a new method of subtraction unlike the existing designs which mostly use 10’s complements, to obtain a much lower latency. Though there is a necessity of correction in some cases, the delay overhead is minimal. A complete discussion about such cases and the required logic to process is presented. The architecture is run-time reconfigurable to facilitate both BCD and binary operations, including signed and unsigned numbers. The proposed circuits are compared (both qualitatively as well as quantitatively) with the existing circuits in literature and are shown to perform better. Simulation results show that the proposed architecture is at least 11% faster than the existing designs.
IBM z10: The Next-Generation Mainframe Microprocessor, Charles Webb, IEEE Micro Vol. 28 #2, ISSN 0272-1732, pp19–29, IEEE, March/April 2008.
Abstract: The IBM system z10 includes four microprocessor cores — each with a private 3-Mbyte cache — and integrated accelerators for decimal floating-point computation, cryptography, and data compression. A separate SMP hub chip provides a shared third-level cache and interconnect fabric for multiprocessor scaling. This article focuses on the high-frequency design techniques used to achieve a 4.4-GHz system, and on the pipeline design that optimizes z10’s CPU performance.

Decimal floating-point support on the IBM System z10 processor, Eric M. Schwarz, John S. Kapernick, and Mike F. Cowlishaw, IBM Journal of Research and Development, Vol. 53 #1, pp4:1–4:10, IBM, January 2009.
Abstract: The latest IBM zSeries processor, the IBM System z10 processor, provides hardware support for the decimal floating-point (DFP) facility that was introduced on the IBM System z9 processor. The z9 processor implements the facility with a mixture of low-level software and hardware assists. Recently, the IBM POWER6 processor-based System p 570 server introduced a hardware implementation of the DFP facility. The latest zSeries processor includes a decimal floating-point unit based on the POWER6 processor DFP unit that has been enhanced to also support the traditional zSeries decimal fixed-point instruction set. This paper explains the hardware implementation to support both decimal fixed point and DFP and the new software support for the DFP facility, including IBM z/OS, Java JIT, and C/C++ compilers, as well as support in IBM DB2 and middleware.

252 references listed. Last updated: 10 Mar 2011
Some elements Copyright © IBM Corporation, 2002, 2009. All rights reserved.