Bibliography of material on Decimal Arithmetic – by year 
This collection of references forms a bibliography of Decimal
Arithmetic, with the emphasis on computer implementations of arithmetic.
This list is sorted by year of publication.
(a categorized collection
and an alphabetic list by first author
are also available).
This bibliography lists the papers I collected when researching decimal arithmetic through 2008. For a more extensive floating-point bibliography, including papers since that date, I commend Norbert Juffa and Nelson Beebe’s fparith collection at the University of Utah (search for ‘decimal’). fparith is available in a variety of formats.
For general background on why decimal arithmetic is important, a decimal FAQ, decimal arithmetic specifications and testcases, and other World Wide Web links, please see the General Decimal Arithmetic pages.
For books, and papers with no formal abstract, the Abstract material is quoted from an introductory section or (occasionally, for books only) back cover matter. Omitted material is indicated by ellipses (...).
Please send any comments, corrections, or additions to Mike Cowlishaw, mfc@speleotrove.com.
karpin1925
The History of Arithmetic,
Louis Charles Karpinski,
200pp,
Rand McNally & Company,
1925.
Abstract: The purpose of this book is to present the development of arithmetic as a vital and integral part of the history of civilization. Particular attention is paid to the material of arithmetic which continues to be taught in our elementary schools and to the historical phases of that work with which the teacher of arithmetic should be familiar... Note: Reprint: Russell & Russell, New York, 1965. 
burks1946
Preliminary discussion of the logical design of an electronic computing instrument,
Arthur W. Burks, Herman H. Goldstine, and John von Neumann,
42pp,
Inst. for Advanced Study, Princeton, N. J.,
June 28, 1946.
Abstract: Inasmuch as the completed device will be a general-purpose computing machine it should contain certain main organs relating to arithmetic, memory-storage, control and connection with the human operator. It is intended that the machine be fully automatic in character, i.e. independent of the human operator after the computation starts... Note: Reprinted in von Neumann’s Collected Works, Vol. 5, A. H. Taub, Ed. (Pergamon, London, 1963), pp 34–79, and also in Computer Structures: Readings and Examples, Bell & Newell, McGraw-Hill Inc., 1971. Now widely available on the Internet. Contract W36034ORDH81. R&D Service, Ordnance Department, US Army and Institute for Advanced Study, Princeton
golds1946
The Electronic Numerical Integrator and Computer (ENIAC),
H. H. Goldstine and Adele Goldstine,
IEEE Annals of the History of Computing, Vol. 18 #1,
pp10–16,
IEEE,
1996.
Abstract: It is our purpose in the succeeding pages to give a brief description of the ENIAC and an indication of the kinds of problems for which it can be used. This general purpose electronic computing machine was recently made public by the Army Ordnance Department for which it was developed by the Moore School of Electrical Engineering. The machine was developed primarily for the purpose of calculating firing tables for the armed forces. Its design is, however, sufficiently general to permit the solution of a large class of numerical problems which could hardly be attempted by more conventional computing tools. In order easily to obtain sufficient accuracy for scientific computations, the ENIAC was designed as a digital device. The equipment normally handles signed 10-digit numbers expressed in the decimal system. It is, however, so constructed that operations with as many as 20 digits are possible. The machine is automatically sequenced in the sense that all instructions needed to carry out a computation are given to it before the computation commences. It will be seen below how these instructions are given to the machine. Note: Reprinted from Mathematical Tables and Other Aids to Computation, 1946.
davis1952
Automatic Recognition of Spoken Digits,
K. Davis, R. Biddulph, and S. Balashek,
Journal of the Acoustical Society of America, Vol. 24 (Possibly: American Journal of Otolaryngology, Vol. 24.),
pp637–642,
ASA,
November 1952.
Abstract: The recognizer discussed will automatically recognize telephone-quality digits spoken at normal speech rates by a single individual, with an accuracy varying between 97 and 99 percent. After some preliminary analysis of the speech of any individual, the circuit can be adjusted to deliver a similar accuracy on the speech of that individual. The circuit is not, however, in its present configuration, capable of performing equally well on the speech of a series of talkers without recourse to such adjustment. Circuitry involves division of the speech spectrum into two frequency bands, one below and the other above 900 cps. Axis-crossing counts are then individually made of both band energies to determine the frequency of the maximum syllabic rate energy within each band. Simultaneous two-dimensional frequency portrayal is found to possess recognition significance. Standards are then determined, one for each digit of the ten-digit series, and are built into the recognizer as a form of elemental memory. By means of a series of calculations performed automatically on the spoken input digit, a best match type comparison is made with each of the ten standard digit patterns and the digit of best match selected.
bashe1954
The IBM Type 702, An Electronic Data Processing Machine for Business,
C. J. Bashe, W. Buchholz, and N. Rochester,
Journal of the ACM (JACM), Vol. 1 #4,
pp149–169,
ACM Press,
October 1954.
Abstract: The main features of the IBM Electronic Data Processing Machine, Type 702, are discussed from the programmer’s point of view to illustrate how it was designed specifically to solve large accounting and statistical problems in business, industry, and government. The 702 exploits in one integrated system the high speed and storage capacity of magnetic tape, the accessibility of electrostatic memory supplemented by large auxiliary storage on magnetic drums, the flexibility of punched-card document input, the page printing output of modern accounting machines, and the technology of general-purpose, stored-program, electronic computers. The 702 is a serial machine with decimal arithmetic. Its serial nature provides several unusual logical features of great aid in programming accounting problems.
hamilton1954
The IBM Magnetic Drum Calculator Type 650,
F. E. Hamilton and E. C. Kubie,
Journal of the ACM, Vol. 1 #1,
pp13–20,
ACM Press,
January 1954.
Abstract: The IBM Magnetic Drum Calculator Type 650 is an electronic calculator intermediate in speed, capacity and cost. It takes a logical position between the IBM Card Programmed Electronic Calculator and the IBM Electronic Data Processing Machines Type 701. It is a more powerful computing tool as required by those who have “outgrown” the Card Programmed Electronic Calculator. It is also a machine which may be used economically by those who are not as yet ready for a large scale computer such as the 701. It will serve not only to perform their required computing tasks, but it will also result in gaining valuable experience for later use of large scale equipment. The Magnetic Drum Calculator, through its stored program control, comprehensive order list, punched card input-output, self-checking and moderate memory capacity, gains the flexibility required of a computer which is to serve in both the commercial and scientific computing fields...
mosh1954
The Generation of Pseudo-Random Numbers on a Decimal Calculator,
Jack Moshman,
Journal of the ACM Vol. 1 #2,
pp88–91,
ACM Press,
April 1954.
Abstract: (None.) Describes the generation of pseudorandom numbers on the decimal UNIVAC machine. 
rich1955
Arithmetic Operations in Digital Computers,
R. K. Richards,
ISBN (none),
397pp,
D. Van Nostrand Co., NY,
1955.
Abstract: Among the first things that are learned in a study of mathematics are rules and procedures for performing basic arithmetic operations, notably addition, subtraction, multiplication, and division. The rules and procedures taught in school are, for the most part, aimed at making the operations as simple and speedy as possible when a pencil and a piece of paper are the only tools. In the design of more elaborate arithmetical tools, it is usually found necessary or at least highly desirable to devise new methods for executing the various arithmetic operations. ... Note: Library of Congress No. 55-6234. Bibliography 9pp.
perk1956
EASIAC, A Pseudo-Computer,
Robert Perkins,
Journal of the ACM, Vol. 3 #2,
pp65–72,
ACM Press,
April 1956.
Abstract: One of the primary functions of the MIDAC installation at the University of Michigan is the instruction of beginners in the various aspects of digital machine use including programming and coding. ... In conducting these courses it was soon found to be extremely difficult, in five or six instruction periods, to bring a complete newcomer up to the point where he can code and check out on MIDAC anything more than a rather trivial routine. As might be expected the difficulty centers around problems of scaling, instruction modification and binary representation. ... To alleviate these problems it was decided that a new computer was needed: one designed to make programming easier. At the cost of some of MIDAC’s speed and capacity plus two or three man-months of programming time EASIAC, the EASy Instruction Automatic Computer, was realized as a translation interpretation program in MIDAC.
couleur1958
BIDEC – A Binary-to-Decimal or Decimal-to-Binary Converter,
J. F. Couleur,
IRE Transactions on Electronic Computers, Vol. EC-7,
pp313–316,
IRE,
1958.
Abstract: Simple, high-speed devices to convert binary, binary coded octal, or Gray code numbers to binary coded decimal numbers or vice versa is described. Circuitry required is four shift register stages per decimal digit plus one 30-diode network per decimal digit. In simple form the conversion requires two operations per binary bit but is theoretically capable of working at one operation per bit.
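Editorial illustration (not part of the paper): a minimal software sketch of shift-and-correct binary-to-BCD conversion, the technique now commonly called shift-and-add-3. It keeps a four-bit register per decimal digit and performs a correction step followed by a shift for each binary bit, which loosely mirrors the two operations per bit mentioned in the abstract. The function name `binary_to_bcd` and its interface are assumptions made for this sketch; the paper's actual circuitry may differ.

```python
def binary_to_bcd(n: int, digits: int) -> list[int]:
    """Convert a non-negative integer to BCD digits (most significant first).

    Hypothetical helper for illustration: one 4-bit register per decimal
    digit, one correction step plus one shift step per binary bit.
    """
    if n < 0 or n >= 10 ** digits:
        raise ValueError("n does not fit in the requested number of digits")
    bcd = [0] * digits
    for bit_pos in range(n.bit_length() - 1, -1, -1):
        # Correction: a digit of 5 or more would exceed 9 when doubled,
        # so add 3 first; the doubling then carries into the next digit.
        bcd = [d + 3 if d >= 5 else d for d in bcd]
        # Shift: move the whole digit register left one bit, feeding in
        # the next binary bit at the least significant end.
        carry = (n >> bit_pos) & 1
        for i in range(digits - 1, -1, -1):
            shifted = (bcd[i] << 1) | carry
            carry = shifted >> 4
            bcd[i] = shifted & 0xF
    return bcd


if __name__ == "__main__":
    print(binary_to_bcd(255, 3))
```

Running the decimal-to-binary direction would apply the same steps in reverse (shift right, then subtract 3 from any digit register holding 8 or more).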
delury1958
Computation with Approximate Numbers,
Daniel B. Delury,
The Mathematics Teacher 51,
pp521–530,
November 1958.
Abstract: There is room, I think, for the view that it is improper to speak at all of “approximate numbers”... Note: Reprinted with permission of the Canadian School. 
kautz1958
Binary and truth-function operations on a decimal computer with an extract command,
William H. Kautz,
Communications of the ACM, Vol. 1 #5,
pp12–13,
ACM Press,
May 1958.
Abstract: It occasionally becomes desirable to solve, on automatic digital computing machines which are capable of handling only decimal numbers, problems in logic, class structure, coding, binary relations or binary arithmetic. This note describes how the major logical and binary operations can be carried out on one such machine, the DATATRON 205, without any circuit modifications to the computer. These procedures would be applicable with little modification to any decimal computer with an extract command, however. 
mennin1958
Number Words and Number Symbols: A Cultural History of Numbers,
Karl Menninger,
ISBN 0486270963,
480pp,
Dover Publications, Inc.,
1992.
Abstract: This book is ... a multifaceted linguistic and historical analysis of how numbers have developed and evolved in many different cultures. “... especially good on early counting and calculating devices ...”. Note: First published in English by the MIT Press, 1969. Translated from the German by Paul Broneer.
siss1958
An Improved Decimal Redundancy Check,
Roger L. Sisson,
Communications of the ACM, Vol. 1 #5,
pp10–12,
ACM Press,
May 1958.
Abstract: As more emphasis is placed on improving the accuracy of data fed into automatic computing systems, more emphasis will be placed on redundancy checking of predictable fields within the input. Two systems (at least) of checking a field of decimal digits have been proposed. In both of these it is assumed that the field to be checked is all numeric and that the redundancy must be of only one digit.
ashen1959
Unnormalized Floating Point Arithmetic,
R. L. Ashenhurst and N. Metropolis,
Journal of the ACM, Vol. 6 #3,
pp415–428,
ACM Press,
July 1959.
Abstract: Algorithms for floating point computer arithmetic are described, in which fractional parts are not subject to the usual normalization convention. These algorithms give results in a form which furnishes some indication of their degree of precision. An analysis of one-stage error propagation is developed for each operation; a suggested statistical model for long run error propagation is also set forth.
buch1959
Fingers or Fists? (The Choice of Decimal or Binary representation),
Werner Buchholz,
Communications of the ACM, Vol. 2 #12,
pp3–11,
ACM Press,
December 1959.
Abstract: The binary number system offers many advantages over a decimal representation for a high-performance, general-purpose computer. The greater simplicity of a binary arithmetic unit and the greater compactness of binary numbers both contribute directly to arithmetic speed. Less obvious and perhaps more important is the way binary addressing and instruction formats can increase the overall performance. Binary addresses are also essential to certain powerful operations which are not practical with decimal instruction formats. On the other hand, decimal numbers are essential for communicating between man and the computer. In applications requiring the processing of a large volume of inherently decimal input and output data, the time for decimal-binary conversion needed by a purely binary computer may be significant. A slower decimal adder may take less time than a fast binary adder doing an addition and two conversions. A careful review of the significance of decimal and binary number systems led to the adoption in the IBM STRETCH computer of binary addressing and both binary and decimal data arithmetic, supplemented by efficient conversion instructions. Note: Letters to the editor in response to this paper were published in CACM, Vol. 3, #3, March 1960.
dagg1959
Decimal-Binary conversions in CORDIC,
D. H. Daggett,
IRE Transactions on Electronic Computers, Vol. EC-8 #5,
pp335–339,
IRE,
September 1959.
Abstract: A special-purpose, binary computer called CORDIC (COordinate Rotation DIgital Computer) contains a unique arithmetic unit composed of three shift registers, three adder-subtractors, and suitable interconnections for efficiently performing calculations involving trigonometric functions. A technique is formulated for using the CORDIC arithmetic unit to convert between angles expressed in degrees and minutes in the 8, 4, 2, 1 code and angles expressed in binary fractions of a half revolution. Decimal-to-binary conversion is accomplished through the generation of an intermediate binary code in which the variable values are +1 and -1. Each of these intermediate code variables controls the addition or subtraction of a particular binary constant in the formation of an accumulated sum which represents the angle. Examples are presented to illustrate the technique. Binary-to-decimal conversion is accomplished by applying essentially the same conversion steps in reverse order, but this feature is not discussed fully. Fundamental principles of the conversion technique, rather than details of implementation, are emphasized. The CORDIC conversion technique is sufficiently general to be applied to decimal-binary conversion problems involving other mixed radix systems and other decimal codes.
tarant1959
Binary conversion, with fixed decimal precision, of a decimal fraction,
Donald Taranto,
Communications of the ACM, Vol. 2 #7,
pp27–27,
ACM Press,
July 1959.
Abstract: Given a decimal fraction f find a binary approximation f_{b} to f, with a given decimal precision h. 
wolontis1959
A Complete Floating-Decimal Interpretive System for the IBM 650 Magnetic Drum Calculator,
V. M. Wolontis,
IBM Reference Manual, Floating-Decimal Interpretive System for the IBM 650,
87pp,
IBM,
1959.
Abstract: This report describes an interpretive system which transforms the 650 into a three-address, floating-decimal, general-purpose computer, primarily suited for scientific and engineering calculations. The system is complete in the sense that all mathematical, logical, and input-output operations normally called for in such calculations can be performed within the system, i.e., without reference to the basic operation codes of the 650. The guiding principles in designing the system have been ease of use, as defined in the introduction, high speed of arithmetic and frequently used logical operations and full accuracy and range for the elementary transcendental functions... Note: This document and the earlier Bell Telephone Laboratories report are available at http://www.bitsavers.org/pdf/ibm/650
wadey1960
Floating-Point Arithmetics,
W. G. Wadey,
Journal of the ACM, Vol. 7 #2,
pp129–139,
ACM Press,
April 1960.
Abstract: Three types of floating-point arithmetics with error control are discussed and compared with conventional floating-point arithmetic. General multiplication and division shift criteria are derived (for any base) for Metropolis-style arithmetics. The limitations and most suitable range of application for each arithmetic are discussed.
weik1961
A Third Survey of Domestic Electronic Digital Computing Systems, Report No. 1115,
Martin H. Weik,
1131pp,
Ballistic Research Laboratories, Aberdeen Proving Ground, Maryland,
March 1961.
Abstract: Based on the results of a third survey, the engineering and programming characteristics of two hundred twenty-two different electronic digital computing systems are given. The data are presented from the point of view of application, numerical and arithmetic characteristics, input, output and storage systems, construction and checking features, power, space, weight, and site preparation and personnel requirements, production records, cost and rental rates, sale and lease policy, reliability, operating experience, and time availability, engineering modifications and improvements and other related topics. An analysis of the survey data, fifteen comparative tables, a discussion of trends, a revised bibliography, and a complete glossary of computer engineering and programming terminology are included.
grau1962
On a Floating-Point Number Representation For Use with Algorithmic Languages,
A. A. Grau,
Communications of the ACM, Vol. 5 #3,
pp160–161,
ACM Press,
March 1962.
Abstract: Algorithmic languages, such as ALGOL, make provision for two types of numbers, real and integer, which are usually implemented on the computer by means of floating-point and fixed-point numbers respectively. The concepts real and integer, however, are taken from mathematics, where the set of integers forms a proper subset of the set of real numbers. In implementation a real problem is posed by the fact that the set of fixed-point numbers is not a proper subset of the set of floating-point numbers; this problem becomes very apparent in attempts to implement ALGOL 60. Furthermore, the one mathematical operation of addition is implemented in the machine by one of two machine operations, fixed-point addition or floating-point addition. ...
jones1962
Floating Point Feature On The IBM Type 1620,
F. B. Jones and A. W. Wymore,
IBM Technical Disclosure Bulletin, 05-62,
pp43–46,
IBM,
May 1962.
Abstract: In the type 1620 automatic floating point operations, a floating point number is a field consisting of a variable length mantissa and a two digit exponent. The exponent is in the two low order positions of the field, and the mantissa is in the remaining high order positions, M.....MEE. The most significant digit positions are marked by flags and the algebraic signs are marked by flags over the least significant digit positions. The exponent is established on the premise that the mantissa is less than 1.0 and equal to or greater than 0.1, and has a range of -99 to +99. The smallest positive quantity that can be represented is thus 00.... 099. The mantissa may have from two to one hundred digits. ...
lake1962
Hardware Conversion of Decimal and Binary Numbers,
G. T. Lake,
Communications of the ACM, Vol. 5 #9,
pp468–469,
ACM Press,
September 1962.
lynch1962
On a Wired-In Binary-to-Decimal Conversion Scheme,
W. C. Lynch,
Communications of the ACM, Vol. 5 #3,
pp159–159,
ACM Press,
March 1962.
allard1963
Mixed Congruential Random Number Generators for Decimal Machines,
J. L. Allard, A. R. Dobell, and T. E. Hull,
Journal of the ACM, Vol. 10 #2,
pp131–141,
ACM Press,
April 1963.
Abstract: Random number generators of the mixed congruential type have recently been proposed. They appear to have some advantages over those of the multiplicative congruential type, but they have not been thoroughly tested. This paper summarizes the results of extensive testing of these generators which has been carried out on a decimal machine. Most results are for word length 10, and special attention is given to simple multipliers which give fast generators. But other word lengths and many other multipliers are considered. A variety of additive constants is also used. It turns out that these mixed generators, in contrast to the multiplicative ones, are not consistently good from a statistical point of view. The cases which are bad seem to belong to a well-defined class which, unfortunately, includes most of the generators associated with the simple multipliers. However, a surprise result is that all generators associated with one of the simplest and fastest multipliers, namely 101, turn out to be consistently good for word lengths greater than seven digits. A final section of the paper suggests a simple theoretical explanation of these experimental results.
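Editorial illustration (not part of the paper): a minimal sketch of a mixed congruential generator on a decimal word, x' = (a·x + c) mod 10^d, using the multiplier 101 named in the abstract. The additive constant 1, the seed, and the function names are assumptions chosen for the example. The sketch shows only the recurrence and its period; it says nothing about the statistical quality that the paper actually tests.

```python
from itertools import islice


def mixed_congruential(seed, multiplier=101, increment=1, word_length=10):
    """Yield x' = (multiplier * x + increment) mod 10**word_length forever.

    The increment of 1 is an assumed value, not one taken from the paper.
    """
    modulus = 10 ** word_length
    x = seed % modulus
    while True:
        x = (multiplier * x + increment) % modulus
        yield x


def period(seed, multiplier, increment, word_length):
    """Count steps until the first generated value recurs.

    Valid when the multiplier is coprime to 10, so the map is a bijection
    and the sequence is purely periodic.
    """
    gen = mixed_congruential(seed, multiplier, increment, word_length)
    first = next(gen)
    count = 1
    for x in gen:
        if x == first:
            return count
        count += 1


if __name__ == "__main__":
    print(list(islice(mixed_congruential(0), 3)))
    print(period(0, 101, 1, 3))
```

With a three-digit word the generator visits every value once before repeating, because 101 − 1 is divisible by 4 and 5 and the increment is coprime to 10 (the usual full-period condition for a power-of-ten modulus).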
burro1964
Burroughs B5500 Information Processing Systems Reference Manual,
Burroughs Corporation,
224pp,
Burroughs Corporation, Detroit, Michigan,
1964.
Abstract: This reference manual describes the hardware characteristics of the Burroughs B 5500 Information Processing System by presenting detailed information concerning the functional operation of the entire system. The B 5500 is a large-scale, high-speed, solid-state computer which represents a departure from the conventional computer system concept. It is a problem language oriented system rather than the conventional hardware oriented system. Because of the design concept of the B 5500, there exists a strong interdependence between the hardware and the Master Control Program which directs the system. The material presented herein pertains only to the hardware considerations, whereas the Master Control Program is discussed under separate cover.
kanner1965
Number Base Conversion in a Significant Digit Arithmetic,
Herbert Kanner,
Journal of the ACM, Vol. 12 #2,
ISSN 0004-5411,
pp242–246,
ACM Press,
April 1965.
Abstract: An algorithm is presented for the conversion in either direction between binary and decimal floating-point representations, retaining proper significance through the conversion in an unnormalized significant digit arithmetic.
mano1965
Pracniques: simulation of Boolean functions in a decimal computer,
M. Morris Mano,
Communications of the ACM, Vol. 8 #1,
ISSN 0001-0782,
pp39–40,
ACM Press,
January 1965.
Abstract: A method is presented here for simulating logical functions in a digital computer by means of simple arithmetic and control instructions. This method is of practical value when the computer used does not have built-in logical instructions.
chart1966
Automatic Controlled Precision Calculations,
Bruce A. Chartres,
Journal of the ACM, Vol. 13 #3,
pp386–403,
ACM Press,
July 1966.
Abstract: Recent developments in computer design and error analysis have made feasible the use of variable precision arithmetic and the preparation of programs that automatically determine their own precision requirements. Such programs enable the user to specify the accuracy he wants, and yield answers guaranteed to lie within the bounds prescribed. A class of such programs, called “contracting error programs”, is defined in which the precision is determined by prescribing error bounds on the data. A variant of interval arithmetic is defined which enables a limited class of algorithms to be programmed as contracting error programs. A contracting error program for the solution of simultaneous linear equations is described, demonstrating the application of the idea to a wider class of problems. 
mancino1966
Multiple precision floating-point conversion from decimal-to-binary and vice versa,
O. G. Mancino,
Communications of the ACM, Vol. 9 #5,
pp347–348,
ACM Press,
May 1966.
Abstract: Decimal-to-binary and binary-to-decimal floating-point conversion is often performed by using a table of the powers 10^{i} (i a positive integer) for converting from base 10 to base 2, and by using a table of the coefficients of a polynomial approximation of 10^{x} (0 ≤ x < 1) for converting from base 2 to base 10. These tables occupy a large storage region in the case of a non-single precision conversion. This paper shows that a single small table suffices for a floating-point conversion from decimal to binary, and vice versa, in any useful precision.
gold1967
27 Bits Are Not Enough for 8-Digit Accuracy,
I. Bennett Goldberg,
Communications of the ACM, Vol. 10 #2,
pp105–106,
ACM Press,
February 1967.
Abstract: From the inequality 10^{8} < 2^{27}, we are likely to conclude that we can represent 8-digit decimal floating-point numbers accurately by 27-bit [binary] floating-point numbers. However, we need 28 significant bits to represent some 8-digit numbers accurately. In general, we can show that if 10^{p} < 2^{q-1}, then q significant bits are always enough for p-digit decimal accuracy. Finally, we can define a compact 27-bit floating-point representation that will give 28 significant bits, for numbers of practical importance.
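Editorial illustration (not part of the paper): a short sketch showing one concrete case of the claim. Just above 2^33, adjacent 8-digit decimal values are 100 apart while 27-bit binary values are 128 apart, so two distinct 8-digit decimals can round to the same 27-bit value; with 28 bits the spacing drops to 64 and they stay distinct. The helper `round_to_bits` and the chosen pair of values are assumptions made for this example.

```python
def round_to_bits(n: int, bits: int) -> int:
    """Round a positive integer to the nearest value with `bits` significant bits.

    Hypothetical helper for illustration; ties round upward, which does not
    matter for the values used here.
    """
    shift = n.bit_length() - bits
    if shift <= 0:
        return n  # already representable exactly
    half = 1 << (shift - 1)
    return ((n + half) >> shift) << shift


# Two adjacent 8-significant-digit decimal numbers (85899348e2, 85899349e2).
a = 8589934800
b = 8589934900

if __name__ == "__main__":
    print(round_to_bits(a, 27), round_to_bits(b, 27))  # same 27-bit value
    print(round_to_bits(a, 28), round_to_bits(b, 28))  # distinct with 28 bits
```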
klerer1967
Chapt. 1.4 Computer Characteristics Table,
Melvin Klerer et al,
Digital Computer User's Handbook,
67pp,
McGraw-Hill, NY,
1967.
Abstract: Section I: General-purpose Solid-state Computers Manufactured in the United States and Designed for a Wide Variety of Business and Scientific Applications; Section II: Systems Manufactured in the United States with General-purpose Capabilities but Used Principally in Process Control, Message Switching, and Other Specialized Applications; Section III: General-purpose Computers Manufactured in Countries Other Than the United States; Section IV: Vacuum-tube Computers No Longer Manufactured but Still in Use; Section V: Chronological Listing of Vacuum-tube and Solid-state Computers Manufactured in the United States and Installed between 1951 and 1965
dietmeyer1968
Generating prime implicants via ternary encoding and decimal arithmetic,
D. L. Dietmeyer and J. R. Duley,
Communications of the ACM, Vol. 11 #7,
ISSN 0001-0782,
pp520–523,
ACM Press,
July 1968.
Abstract: Decimal arithmetic, ternary encoding of cubes, and topological considerations are used in an algorithm to obtain the extremals and prime implicants of Boolean functions. The algorithm, which has been programmed in the FORTRAN language, generally requires less memory than other minimization procedures, and treats DON’T CARE terms in an efficient manner. 
matula1968
In-and-out conversions,
David Matula,
Communications of the ACM, Vol. 11 #1,
pp47–60,
ACM Press,
January 1968.
Abstract: By an in-and-out conversion we mean that a floating-point number in one base is converted into a floating-point number in another base and then converted back to a floating-point number in the original base. For all combinations of rounding and truncation conversions the question is considered of how many significant digits are needed in the intermediate base to allow such in-and-out conversions to return the original number (when possible), or at least to cause a difference of no more than a unit in the least significant digit.
schmid1968
An electronic digital slide rule,
Hermann Schmid and David Busch,
The Electronic Engineer,
pp54–64,
July 1968.
Abstract: The Electronic Digital Slide Rule (EDSR) of the future not only will be smaller and easier to operate than the conventional slide rule, but it will also be more accurate. 
schmoo1968
High Speed Binary to Decimal Conversion,
M. S. Schmookler,
IEEE Transactions on Computers, Vol. C-17,
pp506–508,
IEEE,
1968.
Abstract: This note describes several methods of performing fast, efficient, binary-to-decimal conversion. With a modest amount of circuitry, an order of magnitude speed improvement can be obtained. This achievement offers a unique advantage to general-purpose computers requiring special hardware to translate between binary and decimal numbering systems.
brown1969
The Choice of Base,
W. S. Brown and P. L. Richman,
Communications of the ACM, Vol. 12 #10,
pp560–561,
ACM Press,
October 1969.
Abstract: A digital computer is considered, whose memory words are composed of N r-state devices plus two sign bits (two state devices). The choice of base b for the internal representation of floating-point numbers on such a computer is discussed. It is shown that in a certain sense b = r is best.
duke1969
Decimal Floating Point Processor,
K. A. Duke,
IBM Technical Disclosure Bulletin, 11-69,
pp862–862,
IBM,
November 1969.
Abstract: A numerical processor can be built which operates on floating-point numbers where the mantissa is an integer and the characteristic represents a power of 10 by which that integer must be multiplied. Thus, decimal numbers can be represented exactly without conversion errors. Such floating point numbers are expressed as N = (-1)^{S} × 10^{X} × I where S = sign bit, X = exponent, and I = integer.
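Editorial illustration (not part of the disclosure): a minimal sketch of the sign, integer, and power-of-ten representation described in the abstract, using Python's standard `decimal` module. The helper name `decode` is an assumption made for this example. It shows why such a format holds decimal fractions exactly, whereas binary floating point does not.

```python
from decimal import Decimal


def decode(sign: int, integer: int, exponent: int) -> Decimal:
    """Return (-1)**sign * integer * 10**exponent as an exact decimal value.

    Hypothetical helper mirroring the S, I, X fields named in the abstract.
    """
    return Decimal(-1) ** sign * Decimal(integer) * Decimal(10) ** exponent


if __name__ == "__main__":
    tenth = decode(0, 1, -1)        # 0.1 held exactly as 1 x 10^-1
    fifth = decode(0, 2, -1)        # 0.2 held exactly as 2 x 10^-1
    print(tenth + fifth == decode(0, 3, -1))   # decimal sum is exact
    print(0.1 + 0.2 == 0.3)                    # binary floats are not
    print(Decimal("-123.45").as_tuple())       # the same three fields
```

The standard library's `Decimal.as_tuple()` exposes the same three components (a sign, the digits of an integer coefficient, and a base-ten exponent).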
rosen1969
Electronic Computers: A Historical Survey,
Saul Rosen,
ACM Computing Surveys (CSUR), Vol. 1 #1,
ISSN 0360-0300,
pp7–36,
ACM Press,
March 1969.
Abstract: The first large scale electronic computers were built in connection with university projects sponsored by government military and research organizations. Many established companies, as well as new companies, entered the computer field during the first generation, 1947–1959, in which the vacuum tube was almost universally used as the active component in the implementation of computer logic. The second generation was characterized by the transistorized computers that began to appear in 1959. Some of the computers built then and since are considered super computers; they attempt to go to the limit of current technology in terms of size, speed, and logical complexity. From 1965 onward, most new computers belong to a third generation, which features integrated circuit technology and multiprocessor multiprogramming systems.
svoboda1969
Decimal Adder with Signed Digit Arithmetic,
Antonin Svoboda,
IEEE Transactions on Computers, Vol. 18 #3,
pp212–215,
IEEE,
March 1969.
Abstract: The decimal adder with signed digit arithmetic presented here was designed to establish the following facts: the redundant representation of a decimal digit x_{i} by a 5-bit binary number X_{i}=3x_{i} leads to a logical design of extreme simplicity; it is possible to form an additional algorithm for the adder so that it can be used to transform numbers written in a conventional decimal form into a signed digit form, and vice versa.
kailas1970
Another method of converting from hexadecimal to decimal,
M. V. Kailas,
Communications of the ACM, Vol. 13 #3,
pp193–193,
ACM Press,
March 1970.
Abstract: There is a simple paper-and-pencil method of converting a hexadecimal number N to decimal.
matu1970
A Formalization of Floating-Point Numeric Base Conversion,
David W. Matula,
IEEE Transactions on Computers, Vol. C-19 #8,
pp681–692,
IEEE,
August 1970.
Abstract: The process of converting arbitrary real numbers into a floating-point format is formalized as a mapping of the reals into a specified subset of real numbers. The structure of this subset, the set of n significant digit base b floating-point numbers, is analyzed and properties of conversion mappings are determined. For a restricted conversion mapping of the n significant digit base b numbers to the m significant-digit base d numbers, the one-to-one, onto, and order-preserving properties of the mapping are summarized. Multiple conversions consisting of a composition of individual conversion mappings are investigated and some results of the invariant points of such compound conversions are presented. The hardware and software implications of these results with regard to establishing goals and standards for floating-point formats and conversion procedures are considered. 
taub1970
¿Web? 
Experimental Computer for Schools,
D. M. Taub, C. E. Owen, and B. P. Day,
Proceedings of the IEE, Vol. 117 #2,
pp303–312,
IEE,
February 1970.
Abstract: The computer is a small desktop machine designed for teaching schoolchildren how computers work. It works in decimal notation and has a powerful instruction set which includes 3-address floating-point instructions implemented as ‘extracode’ subroutines. Addressing can be absolute, relative or indirect. For input it uses a capacitive touch keyboard, and for output and display a perfectly normal TV receiver is used. Another input/output device is an ordinary domestic tape recorder, used mainly for long term storage of programs. To make the operation of the machine easy to follow, it can be made to stop at certain stages in the processing of an instruction and automatically display the contents of all registers and storage locations relevant at that time. The paper gives a description of the machine and a discussion of the factors that have influenced its design. 
bata1971
¿Web? 
The Gamma 60: The computer that was ahead of its time,
M. Bataille,
Honeywell Computer Journal Vol. 5 #3,
pp99–105,
Honeywell,
1971.
Abstract: Prior to 1960 the Compagnie des Machines Bull (now Honeywell Bull) delivered the first large computer system with an architecture designed for multiprogramming. Many unique features of the Gamma 60 were forerunners of present system architecture concepts. This article revisits these concepts. 
chen1971
URL ¿Web? 
Decimal Number Compression,
Tien Chi Chen,
Internal IBM memo to Dr. Irving T. Ho,
4pp,
IBM,
29 March 1971.
Abstract: The fact that four bits can represent 16 different states, but a decimal digit exploits only 10 of them (0–9) has been a valid criticism against decimal arithmetic. On the other hand, it is well known that a number with several decimal digits can be re-expressed into binary, leading to a 20% gain in the number of bits used. Examples are, two decimal digits (8 bits) re-expressed as a seven-bit number and three decimal digits (twelve bits) re-expressed as a ten-bit number. ...

schmoo1971
¿Web? 
High speed decimal addition,
Martin S. Schmookler and Arnold Weinberger,
IEEE Transactions on Computers, Vol. C-20 #8,
pp862–867,
IEEE,
August 1971.
Abstract: Parallel decimal arithmetic capability is becoming increasingly attractive with new applications of computers in a multiprogramming environment. The direct production of decimal sums offers a significant improvement in addition over methods requiring decimal correction. These techniques are illustrated in the eight-digit adder which appears in the System/360 Model 195. 
fenw1972
¿Web? 
A Binary Representation for Decimal Numbers,
Peter M. Fenwick,
Australian Computer Journal, Vol. 4 #4 (now Journal of Research and Practice in Information Technology),
pp146–149,
Australian Computer Society Inc.,
November 1972.
Abstract: A number system is described which combines the programming convenience of decimal numbers with the hardware advantages of binary arithmetic. The number format resembles some integer floating-point formats, except that the exponent is associated with a base of 10, rather than some power of 2. It is shown that arithmetic in the new representation is little more difficult than for ordinary floating-point numbers, and methods are given for implementing the “decimal” shifts which are a consequence of the exponent base. 
frankl1972
¿Web? 
Zoned Decimal Arithmetic,
J. W. Franklin,
IBM Technical Disclosure Bulletin, 12-72,
pp2123–2124,
IBM,
December 1972.
Abstract: A means is described for performing arithmetic on zoned decimal data that does not require additional storage space for the intermediate result, and which preserves both operands until it is determined that the operation has been performed correctly and successfully. 
neely1972
¿Web? 
On conventions for systems of numerical representation,
Peter M. Neely,
Proceedings of the ACM annual conference, Boston, Massachusetts,
pp644–651,
ACM Press,
1972.
Abstract: Present conventions for numeric representation are considered inadequate to serve the needs of applied computing. Thus an augmented digital number system is proposed for use in programming languages and in digital computers. Special symbols are proposed for numbers too large, too small or too close to zero to be represented in the normal digital number system, or which are undefined. Properties of mappings among and between digital number systems are used to justify the augments chosen. Finally a suggestion is made for a new floating point word format that will serve all the above needs and will greatly extend the exponent range of floating point numbers. 
brent1973
¿Web? 
On the Precision Attainable with Various Floating-point Number Systems,
Richard P. Brent,
IEEE Transactions on Computers, Vol. C-22 #6,
pp601–607,
IEEE,
June 1973.
Abstract: For scientific computations on a digital computer the set of real numbers is usually approximated by a finite set F of “floating-point” numbers. We compare the numerical accuracy possible with different choices of F having approximately the same range and requiring the same word length. In particular, we compare different choices of base (or radix) in the usual floating-point systems. The emphasis is on the choice of F, not on the details of the number representation or the arithmetic, but both rounded and truncated arithmetic are considered. Theoretical results are given, and some simulations of typical floating-point computations (forming sums, solving systems of linear equations, finding eigenvalues) are described. If the leading fraction bit of a normalized base 2 number is not stored explicitly (saving a bit), and the criterion is to minimize the mean square roundoff error, then base 2 is best. If unnormalized numbers are allowed, so the first bit must be stored explicitly, then base 4 (or sometimes base 8) is the best of the usual systems. 
jacob1973
¿Web? 
A Combinatoric Division Algorithm for Fixed-Integer Divisors,
David H. Jacobsohn,
IEEE Transactions on Computers, Vol. C-22 #6,
pp608–610,
IEEE,
June 1973.
Abstract: A procedure is presented for performing a combinatoric fixed-integer division that satisfies the division algorithm in regard to both quotient and remainder. In this procedure, division is performed by multiplying the dividend by the reciprocal of the divisor. The reciprocal is, in all nontrivial cases, of necessity a repeating binary fraction, and two treatments for finding the product of an integer and repeating binary fraction are developed. Two examples of the application of the procedure are given. 
rich1973
¿Web? 
Variable-Precision Exponentiation,
P. L. Richman,
Communications of the ACM, Vol. 16 #1,
pp38–40,
ACM Press,
January 1973.
Abstract: A previous paper presented an efficient algorithm, called the Recomputation Algorithm, for evaluating a rational expression to within any desired tolerance on a computer which performs variable-precision arithmetic operations. The Recomputation Algorithm can be applied to expressions involving any variable-precision operations having O(10^{-p} + Σ|e_{i}|) error bounds, where p denotes the operation’s precision and e_{i} denotes the error in the operation’s ith argument. This paper presents an efficient variable-precision exponential operation with an error bound of the above order. Other operations, such as log, sin, and cos, which have simple series expansions, can be handled similarly. 
agrawal1974
¿Web? 
Fast B. C. D. Multiplier,
Dharma P. Agrawal,
Electronics Letters, Vol. 10 #12,
pp237–238,
IEE,
13 June 1974.
Abstract: A fast b.c.d multiplier is proposed, based on obtaining the product of a 1-digit multiplicand and a 1-digit multiplier in a single row of adders. For high-speed operation, the carry-save technique, universally adopted for binary multipliers, is used. 
brown1974a
¿Web? 
Some error correcting codes for certain transposition and transcription errors in decimal integers,
D. A. H. Brown,
The Computer Journal, Vol. 17 #1,
pp9–12,
OUP,
February 1974.
Abstract: The standard theory of modulus 11 cyclic block error-correcting codes is applied to numbers expressed in the decimal system. An algorithm for error correction is given. 
brown1974b
¿Web? 
Biquinary decimal error detection codes with one, two and three check digits,
D. A. H. Brown,
The Computer Journal, Vol. 17 #3,
pp201–204,
OUP,
August 1974.
Abstract: The biquinary system of representing the decimal integers 0 to 9 is combined with polynomial coding to produce true decimal codes having any required number of check digits added to an integer of any length. 
schmid1974
¿Web? 
Decimal Computation,
Hermann Schmid,
ISBN 047176180X,
266pp,
Wiley,
1974.
Abstract: This book is thus a collection, a catalog, and a review of BCD computation techniques. The book describes how each of the most common arithmetic and transcendental operations can be implemented in a variety of ways. ... covers ... A review of number systems, BCD codes, of early calculating instruments and electronic calculating machines ... An outline of BCD computing circuit applications in the automotive, consumer, education, and entertainment fields, illustrated with some specific examples ... Mathematical developments of the algorithms ... Discussions and comparisons of circuit complexity and performance (accuracy, resolution, and speed of operation) for the different algorithms ... Note: Reprinted 1983, ISBN 0898743184, Robert E. Krieger Publishing Co. 
sites1974
¿Web? 
Serial Binary Division by Ten,
R. L. Sites,
IEEE Transactions on Computers, Vol. 23 #12,
ISSN 0018-9340,
pp1299–1301,
IEEE,
December 1974.
Abstract: A technique is presented for dividing a positive binary integer by ten, in which the bits of the input are presented serially, low-order bit first. A complete division by ten is performed in two word times (comparable to the time needed for two serial additions). The technique can be useful in serial conversions from binary to decimal, or in scaling binary numbers by powers of 10. 
chen1975
¿Web? 
Storage-Efficient Representation of Decimal Data,
Tien Chi Chen and Irving T. Ho,
CACM Vol. 18 #2,
pp49–52,
ACM Press,
January 1975.
Abstract: Usually n decimal digits are represented by 4n bits in computers. Actually, two BCD digits can be compressed optimally and reversibly into 7 bits, and three digits into 10 bits, by a very simple algorithm based on the fixed-length combination of two variable field-length encodings. In over half of the cases the compressed code results from the conventional BCD code by simple removal of redundant 0 bits. A long decimal message can be subdivided into three-digit blocks, and separately compressed; the result differs from the asymptotic minimum length by only 0.34 percent. The hardware requirement is small, and the mappings can be done manually. 
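The figures quoted in this abstract can be checked with a few lines of Python. This is only a sketch of the counting argument, not the Chen-Ho encoding itself, and the function name `bits_per_digit` is hypothetical.

```python
import math

def bits_per_digit(block_digits: int) -> float:
    """Bits spent per decimal digit when block_digits digits share one binary field."""
    # A block of k decimal digits has 10**k states, so it needs ceil(k * log2(10)) bits.
    block_bits = math.ceil(block_digits * math.log2(10))
    return block_bits / block_digits

# Asymptotic minimum: log2(10), roughly 3.32 bits per decimal digit.
minimum = math.log2(10)

for k in (1, 2, 3):
    cost = bits_per_digit(k)
    overhead_percent = (cost / minimum - 1) * 100
    print(f"{k}-digit blocks: {cost:.4f} bits/digit, {overhead_percent:.2f}% above minimum")
```

One digit per field costs 4 bits (plain BCD), two digits cost 7 bits, and three digits cost 10 bits, which is where the 0.34 percent figure comes from.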
hunter1975
¿Web? 
A quantitative measure of precision,
G. Hunter,
The Computer Journal, Volume 18, Issue 3,
pp231–233,
OUP,
August 1975.
Abstract: The precision z_{b} of a real number is defined quantitatively in terms of the fractional error in the number, and the base of the arithmetic in which it is represented. The definition is an extension of the traditional rough measure of precision as the number of significant digits in the number. In binary arithmetic the integral part of z_{b} is the number of binary digits required to store the number. Conversion of the precision from one base to another (such as binary/decimal) is discussed, and applied to consideration of the intrinsic precision of input/output routines and floating point arithmetic. 
keir1975a
¿Web? 
Programmer-controlled roundoff and the selection of a stable roundoff rule,
R. A. Keir,
Conf. Rec. 3rd Symp. Comp. Arithmetic CH10173C,
pp73–76,
IEEE Computer Society,
1975.
Abstract: The author suggests that every computer with floating-point addition and subtraction should have PSW-controllable roundoff facilities. Yohe’s catalog should be included. There should also be a stable roundoff mode using the round-to-off [odd] or round-to-even rule based on whether the radix is divisible by four or only by two. 
keir1975b
¿Web? 
Compatible number representations,
R. A. Keir,
Conf. Rec. 3rd Symp. Comp. Arithmetic CH10173C,
pp82–87,
IEEE Computer Society,
1975.
Abstract: A compatible number system for mixed fixed-point and floating-point arithmetic is described in terms of number formats and opcode sequences (for hardwired or microcoded control). This inexpensive system can be as fast as fixed-point arithmetic on integers, is faster than normalized arithmetic in floating point, gets answers identical to those of normalized arithmetic, and automatically satisfies the Algol-60 mixed-mode rules. The central concept is the avoidance of meaningless “normalization” following arithmetic operations. Adoption of this system could lead to simpler compilers. 
keir1975c
¿Web? 
Should the stable rounding rule be radix-dependent?,
Roy A. Keir,
Information Processing Letters, Vol. 3 #6,
pp188–189,
Elsevier,
July 1975.
Abstract: (None.) 
senzig1975
¿Web? 
Calculator Algorithms,
Don Senzig,
IEEE Compcon Reader Digest, IEEE Catalog No. 75 CH 09209C,
pp139–141,
IEEE,
Spring 1975.
Abstract: This paper discusses algorithms for generating the trigonometric, exponential, and hyperbolic functions and their inverses. No invention is claimed here. The algorithm for logarithm was used by Briggs in compiling his table of logarithms in the 1600’s. Other earlier references are (cited). The development presented here is, perhaps, more direct than those given in the above references but leads to the same result. 
smith1975
¿Web? 
Comments on a Paper by T. C. Chen and I. T. Ho,
Alan Jay Smith,
CACM Vol. 18 #8,
pp463–463,
ACM Press,
August 1975.
Abstract: (None.) 
soule1975
¿Web? 
Addition in an Arbitrary Base Without Radix Conversion,
Stephen Soule,
Communications of the ACM Vol. 18 #6,
pp344–346,
ACM Press,
June 1975.
Abstract: This paper presents a generalization of an old programming technique; using it, one may add and subtract numbers represented in any radix, including a mixed radix, and stored one digit per byte in bytes of sufficient size. Radix conversion is unnecessary, no looping is required, and numbers may even be stored in a display (I/O) format. Applications to Cobol, MIX, and hexadecimal sums are discussed. 
ansi1976
¿Web? 
ANSI X3.53-1976: American National Standard – Programming Language PL/I,
J. F. Auwaerter,
421pp,
ANSI,
1976.
Abstract: This document defines American National Standard Programming Language PL/I and specifies both the form and interpretation of computer programs written in PL/I. The standard is intended to provide a high degree of machine independence and thereby facilitate program exchange among a variety of computing systems. The document serves as an authoritative reference rather than as a tutorial exposition. The language is defined by specifying a conceptual PL/I machine which translates and interprets intended PL/I programs. The relationship between an actual implementation of PL/I and the conceptual machine presented in this document is also given. This reference document was developed jointly under the auspices of the American National Standards Institute and the European Computer Manufacturers Association. Note: Reaffirmed 1998. 
brent1976
URL ¿Web? 
Fast multiple-precision evaluation of elementary functions,
Richard P. Brent,
Journal of the ACM, Vol. 23 #2,
pp242–251,
ACM Press,
April 1976.
Abstract: Let f(x) be one of the usual elementary functions (exp, log, arctan, sin, cosh, etc.), and let M(n) be the number of single-precision operations required to multiply n-bit integers. It is shown that f(x) can be evaluated, with relative error O(2^{-n}), in O(M(n)log(n)) operations, for any floating-point number x (with an n-bit fraction) in a suitable finite interval. From the Schönhage-Strassen bound on M(n), it follows that an n-bit approximation to f(x) may be evaluated in O(n(log(n))^{2}loglog(n)) operations. Special cases include the evaluation of constants such as pi, e, and e^{pi}. The algorithms depend on the theory of elliptic integrals, using the arithmetic-geometric 
ris1976
¿Web? 
A Unified Decimal Floating-Point Architecture for the Support of High-Level Languages,
Frederic N. Ris,
ACM SIGNUM Newsletter, Vol. 11 #3,
pp18–23,
ACM Press,
October 1976.
Abstract: This paper summarizes a proposal for a decimal floating-point arithmetic interface for the support of high-level languages, consisting both of the arithmetic operations observed by application programs and facilities to produce subroutine libraries accessible from these programs. What is not included here are the detailed motivations, examinations of alternatives, and implementation considerations which will appear in the full work. Note: Also in ACM SIGARCH Computer Architecture News, Vol 5 #4, pp21–31, October 1976. Also in ACM SIGPLAN Notices, Vol 12 #9, pp60–70, September 1977. Also in IBM RC 6203 (#26651) 11pp, September 1976. 
alfonseca1977
¿Web? 
An APL interpreter and system for a small computer,
M. Alfonseca, M. L. Tavera, and R. Casajuana,
IBM Systems Journal, Vol. 16 #1,
pp18–40,
IBM,
1977.
Abstract: The design and implementation of an experimental APL system on the small, sensor-based System/7 is described. Emphasis is placed on the solution to the problem of fitting a full APL system into a small computer. The system has been extended through an I/O auxiliary processor to make it possible to use APL in the management and control of the System/7 sensor-based I/O operations. 
peuto1977
¿Web? 
An instruction timing model of CPU performance,
Bernard L. Peuto and Leonard J. Shustek,
Proceedings of the 4th annual symposium on Computer architecture,
pp165–178,
ACM Press,
1977.
Abstract: A model of high-performance computers is derived from instruction timing formulas, with compensation for pipeline and cache memory effects. The model is used to predict the performance of the IBM 370/168 and the Amdahl 470 V/6 on specific programs, and the results are verified by comparison with actual performance. Data collected about program behavior is combined with the performance analysis to highlight some of the problems with high-performance implementations of such architectures. 
yuen1977
¿Web? 
A New Representation for Decimal Numbers,
C. K. Yuen,
IEEE Transactions on Computers, Vol. 26 #12,
pp1286–1288,
IEEE,
December 1977.
Abstract: A new representation for decimal numbers is proposed. It uses a mixture of positive and negative radixes to ensure that the maximum value of a four bit decimal digit is 9. This eliminates the more complex carry generation process required in BCD addition. 
hull1978
¿Web? 
Desirable Floating-Point Arithmetic and Elementary Functions for Numerical Computation,
T. E. Hull,
ACM Signum Newsletter, Vol. 14 #1 (Proceedings of the SIGNUM Conference on the Programming Environment for Development of Numerical Software),
pp96–99,
ACM Press,
1978.
Abstract: The purpose of this talk is to summarize proposed specifications for floating-point arithmetic and elementary functions. The topics considered are: the base of the number system, precision control, number representation, arithmetic operations, other basic operations, elementary functions, and exception handling. The possibility of doing without fixed-point arithmetic is also mentioned. The specifications are intended to be entirely at the level of a programming language such as Fortran. The emphasis is on convenience and simplicity from the user’s point of view. Conforming to such specifications would have obvious beneficial implications for the portability of numerical software, and for proving programs correct, as well as attempting to provide facilities which are most suitable for the user. The specifications are not complete in every detail, but it is intended that they be complete “in spirit” – some further details, especially syntactic details, would have to be provided, but the proposals are otherwise relatively complete. Note: Also in Proceedings of the IEEE 4th Symposium on Computer Arithmetic pp63–69. 
liu1978
¿Web? 
Error-Correcting Codes in Binary-Coded-Decimal Arithmetic,
Chao-Kai Liu and Tse Lin Wang,
IEEE Transactions on Computers, Vol. 27 #11,
pp977–984,
IEEE,
November 1978.
Abstract: Error-correcting coding schemes devised for binary arithmetic are not in general applicable to BCD arithmetic. In this paper, we investigate the new problem of using such coding schemes in BCD systems. We first discuss the general characteristics of arithmetic errors and define the arithmetic weight and distance in BCD systems. We show that the distance is a metric function. Number theory is used to construct a class of single-error-correcting codes for BCD arithmetic. It is shown that the generator of these codes possesses a very simple form and the structure of these codes can be analytically determined. 
schrei1978
URL ¿Web? 
Two Methods for Fast Integer Binary-BCD Conversion,
F. A. Schreiber and R. Stefanelli,
Proc. 4th Symposium on Computer Arithmetic,
pp200–207,
IEEE Press,
October 1978.
Abstract: Two methods for performing binary-BCD conversion of positive integers are discussed. The principle which underlies both methods is the repeated division by five and then by two, obtained the first by means of subtractions performed from left to right, the second by shifting bits before next subtraction. It is shown that these methods work in a time which is linear with the length in bit of the number to be converted. A ROM solution is proposed and its complexity is compared with that of other methods. 
edgar1979
¿Web? 
FOCUS Microcomputer Number System,
Albert D. Edgar and Samuel C. Lee,
Communications of the ACM Vol. 22 #3,
pp166–177,
ACM Press,
March 1979.
Abstract: FOCUS is a number system and supporting computational algorithms especially useful for microcomputer control and other signal processing applications. FOCUS has the wide-ranging character of floating-point numbers with a uniformity of state distributions that give FOCUS better than a twofold accuracy advantage over an equal word length floating-point system. FOCUS computations are typically five times faster than single precision fixed-point or integer arithmetic for a mixture of operations, comparable in speed with hardware arithmetic for many applications. Algorithms for 8-bit and 16-bit implementations of FOCUS are included. 
rein1979
¿Web? 
Principles and Preferences for Computer Arithmetic,
Christian H. Reinsch,
ACM SIGNUM Vol. 14 #1,
pp12–27,
ACM Press,
March 1979.
Abstract: This working paper arose out of discussions on desirable hardware features for numerical calculation in the IFIP Working Group 2.5 on Numerical Software. It reflects the views of all members of the group, although no formal vote of approval has been taken; it is not an official IFIP document. Many people contributed ideas to this paper, especially T. J. Dekker, C. W. Gear, T. E. Hull, J. R. Rice, and J. L. Schonfelder. 
cody1980
¿Web? 
Software Manual for the Elementary Functions,
W. J. Cody and W. Waite,
ISBN 0138220646,
269pp,
Prentice-Hall,
1980.
haven1980
¿Web? 
Decimal to Binary Floating Point Number Conversion Mechanism,
J. W. Havender,
IBM Technical Disclosure Bulletin, 07-80,
pp706–708,
IBM,
July 1980.
Abstract: Floating point numbers may be converted from decimal to binary using a high speed natural logarithm and exponential function calculation mechanism and a fixed point divide/multiply unit. The problem solved is to convert numbers expressed in a radix 10 floating point form to numbers expressed in a radix 2 floating point form. 
hull1980
¿Web? 
Principles, Preferences and Ideals for Computer Arithmetic,
Thomas E. Hull, Christian H. Reinsch, and John R. Rice,
CSD-TR-339,
13pp,
Dept. of Computer Science, Purdue University,
June 1980.
Abstract: This paper presents principles and preferences for the implementation of computer arithmetic and ideals for the arithmetic facilities in future programming languages. The implementation principles and preferences are for the current approaches to the design of arithmetic units. The ideals are for the long term development of programming languages, with the hope that arithmetic units will be built to support the requirements of programming languages. 
johan1980
¿Web? 
Decimal Shifting for an Exact Floating Point Representation,
J. D. Johannes, C. Dennis Pegden, and F. E. Petry,
Computers and Electrical Engineering, Vol. 7 #3,
pp149–155,
Elsevier,
September 1980.
Abstract: A floating point representation which permits exact conversion of decimal numbers is discussed. This requires the exponent to represent a power of ten, and thus decimal shifts of the mantissa are needed. A specialized design is analyzed for the problem of division by ten, which is needed for decimal shifting. 
kleinsteiber1980
¿Web? 
IBM 4341 hardware/microcode tradeoff decisions,
James R. Kleinsteiber,
MICRO 13: Proceedings of the 13th annual workshop on Microprogramming,
pp190–192,
ACM Press,
December 1980.
Abstract: The design of IBM’s 4341 Processor, as with other processors, involved many cost/performance tradeoffs. The designer is continually under pressure to increase processor speed without increasing cost or to decrease processor cost without decreasing performance. This paper will examine some of the engineering decisions that were made in the attempt to make the 4341 a high-performing yet low cost processor. These decisions include searching for, or developing, algorithms that make the best use of hardware properties, such as data path width, arithmetic/logical operations and special functions. Functions were sought such that a small amount of added hardware would go a long way towards improving system performance. Hardware designers, microcoders and performance analysis people worked together to implement instructions, functions and algorithms with the proper mixture of hardware functions and microcode in order to build a viable processor. Some specific functions will be covered to examine a few of the decisions. The TEST UNDER MASK performance problem will be discussed with its resulting implementation decision. The method of using EXCLUSIVE OR to clear storage and the resulting algorithm design will be shown. Other topics to be discussed include multiple hardware functions and the resulting effect on floating point, fixed point and decimal multiply; the divide function and its effect on floating point and fixed point divide; and the effect of an 8-byte data path for decimal arithmetic. Note: Also published in December 1980 SIGMICRO Newsletter Volume 11 Issue 3–4. 
brent1981
URL ¿Web? 
MP User's Guide (Fourth Edition),
Richard P. Brent,
73pp,
Dept. Computer Science, Australian National University, Canberra, TR-CS-81-08,
June 1981.
Abstract: MP is a multiple-precision floating-point arithmetic package. It is almost completely machine-independent, and should run on any machine with an ANSI Standard Fortran (ANS X3.9-1966) compiler, sufficient memory, and a wordlength (for integer arithmetic) of at least 16 bits. A precompiler (Augment) which facilitates the use of the MP package is available. ... MP works with normalized floating-point numbers. The base (B) and number of digits (T) are arbitrary, subject to some restrictions given below, and may be varied dynamically. ... 
chroust1981
¿Web? 
Method of Adding Decimal Numbers by Means of Binary Arithmetic,
G. Chroust,
IBM Technical Disclosure Bulletin, 03-81,
pp4525–4526,
IBM,
March 1981.
Abstract: The simulation of decimal arithmetic on a machine without packed arithmetic necessitates a method for simulating decimal addition by binary arithmetic. Decimal addition simulation is effected by simultaneously applying the following steps to as many digits (d1, d2, .., dn) of the decimal number as fit into the (binary = bin) word length of the object machine. 1. (Binary) addition of the two operands, 2. adding a `6’ in each digit position (this generates the correct carry), and 3. subtracting a `6’ in those places from which no carry resulted. 
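The three steps in this abstract can be sketched in Python on packed BCD words (one decimal digit per 4-bit nibble). This is a minimal illustration under stated assumptions, not the bulletin's own code: the names `to_bcd`, `from_bcd`, and `bcd_add` are hypothetical, and the sixes are added to one operand before the binary addition (rather than after it) so that digit carries can be read off a single addition.

```python
def to_bcd(n: int) -> int:
    """Pack a non-negative integer as BCD, one decimal digit per nibble."""
    return int(str(n), 16)

def from_bcd(x: int) -> int:
    """Unpack a BCD word whose nibbles are all in the range 0-9."""
    return int(format(x, "x"))

def bcd_add(a: int, b: int, digits: int = 8) -> int:
    """Add two packed BCD words of `digits` digits using only binary operations."""
    sixes = int("6" * digits, 16)             # a 6 in every digit position
    carry_bits = int("1" * digits, 16) << 4   # the bit just above each digit
    biased = a + sixes                        # add 6 to each digit (never carries)
    total = biased + b                        # binary addition of the operands
    # XOR recovers the carry into every bit; keep only carries out of a digit.
    carried = (biased ^ b ^ total) & carry_bits
    not_carried = ~carried & carry_bits
    # Build a 6 (binary 0110) in each digit that produced no carry, and remove it.
    correction = (not_carried >> 2) | (not_carried >> 3)
    return total - correction

print(from_bcd(bcd_add(to_bcd(1234), to_bcd(5678), digits=4)))  # 6912
```

A carry out of the top digit is left in the bit above it, so adding 99999999 and 1 with eight digits yields the nine-digit BCD value 100000000.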
griff1981
¿Web? 
Binary to Decimal Conversion,
L. K. Griffiths,
IBM Technical Disclosure Bulletin, 06-81,
pp237–238,
IBM,
June 1981.
Abstract: Binary to decimal conversion can be achieved by multiplying 1/10 as 51/512 x 256/255 and using the fact that 256/255 = 1 + 1/256 + 1/256^{2} ..., i.e., 256/255 = 257/256 rounded up. This method can be performed efficiently on short word computers with only adding and shifting operations, i.e., first multiplying by 51/512 and then correcting by multiplying by 256/255. 
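The identity 1/10 = 51/512 x 256/255 leads to a well-known shift-and-add division by ten. The Python sketch below is written in that spirit for 32-bit unsigned values; it is not taken from the bulletin. The function names, the extra 1/65536 term, and the final remainder correction are assumptions added here so that the quotient is exact.

```python
def div10(n: int) -> int:
    """Divide an unsigned 32-bit integer by ten using only shifts and adds."""
    q = (n >> 1) + (n >> 2)   # n * 3/4
    q = q + (q >> 4)          # n * 51/64  (3/4 * 17/16)
    q = q + (q >> 8)          # times 257/256, the first two terms of 256/255
    q = q + (q >> 16)         # times (1 + 1/65536), covering the later terms
    q = q >> 3                # now about n * 51/512 * 256/255 = n / 10
    r = n - q * 10            # truncation can leave q one too small
    return q + 1 if r > 9 else q

def to_decimal_digits(n: int) -> str:
    """Convert binary to decimal by repeated division by ten."""
    digits = []
    while True:
        q = div10(n)
        digits.append(str(n - q * 10))
        n = q
        if n == 0:
            return "".join(reversed(digits))

print(to_decimal_digits(4294967295))  # 4294967295
```

The multiplication by ten in the remainder step can itself be done with shifts and adds, as (q << 3) + (q << 1).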
ifrah1981
¿Web? 
The Universal History of Numbers,
Georges Ifrah,
ISBN 186046324X,
633pp,
The Harvill Press Ltd.,
1994.
Abstract: More than a history of counting and calculating from the caveman to the late twentieth century, this is the story of how the human race has learnt to think logically. The reader is taken through the whole art and science of numeration as it has developed all over the world, from Europe to China, via the Classical World, Mesopotamia, South America, and, above all, India and the Arab lands. ... Note: Translated from the French by David Bellos, E. F. Harding, Sophie Wood, and Ian Monk. (Also published is a translation of an earlier edition – From One to Zero: A Universal History of Numbers. Translated by Lowell Bair. Viking, New York, 1985.) 
johnst1982
¿Web? 
Representational error in binary and decimal numbering systems,
Paul Johnstone,
Proceedings of the 20th annual ACM Southeast Regional Conference,
pp85–88,
ACM Press,
1982.
Abstract: The representation of a general rational number of the form A/B as a floating point number requires a conversion from the general form to a base specific form. This conversion often results in the generation of infinitely repeating nonzero strings of digits which are truncated to the size of the mantissa resulting in a loss of precision. It is shown that the proportion of repeating versus finite rational numbers specific to a base is exponentially related to the number of unique prime factors of the base. Simulation results are presented which show the relative proportions of finite representations for binary and decimal cases over a range of mantissa sizes. 
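The distinction between repeating and finite representations rests on a standard fact: a fraction A/B in lowest terms has a finite expansion in a given base exactly when every prime factor of B divides the base. The Python sketch below illustrates that test. It is not the paper's simulation, and the function name `has_finite_expansion` is hypothetical.

```python
from math import gcd

def has_finite_expansion(numerator: int, denominator: int, base: int) -> bool:
    """True if numerator/denominator terminates when written in the given base."""
    d = denominator // gcd(numerator, denominator)   # reduce to lowest terms
    g = gcd(d, base)
    while g > 1:
        while d % g == 0:     # strip every factor shared with the base
            d //= g
        g = gcd(d, base)
    return d == 1             # anything left is a prime the base lacks

print(has_finite_expansion(1, 10, 10))  # True:  0.1 is exact in decimal
print(has_finite_expansion(1, 10, 2))   # False: 0.1 repeats in binary

# Count the unit fractions 1/d, d = 1..100, that terminate in each base.
binary_count = sum(has_finite_expansion(1, d, 2) for d in range(1, 101))
decimal_count = sum(has_finite_expansion(1, d, 10) for d in range(1, 101))
print(binary_count, decimal_count)      # 7 15
```

Because ten has two distinct prime factors and two has only one, more denominators terminate in decimal than in binary.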
sacks1982
¿Web? 
Applications of Redundant Number Representations to Decimal Arithmetic,
R. Sacks-Davis,
The Computer Journal, Vol. 25 #4,
pp471–477,
November 1982.
Abstract: A decimal arithmetic unit is proposed for both integer and floating-point computations. To achieve comparable speed to a binary arithmetic unit, the decimal unit is based on a redundant number representation. With this representation no loss of compactness is made relative to binary coded decimal (BCD) form. In this paper the hardware required for the implementation of the basic operations of addition, subtraction, multiplication and division are described and the properties of floating-point arithmetic based on a redundant number representation are investigated. 
cohen1983
¿Web? 
CADAC: A Controlled-Precision Decimal Arithmetic Unit,
Marty S. Cohen, T. E. Hull, and V. Carl Hamacher,
IEEE Transactions on Computers, Vol. 32 #4,
pp370–377,
IEEE,
April 1983.
Abstract: This paper describes the design of an arithmetic unit called CADAC (clean arithmetic with decimal base and controlled precision). Programming language specifications for carrying out “ideal” floating-point arithmetic are described first. These specifications include detailed requirements for dynamic precision control and exception handling, along with both complex and interval arithmetic at the level of a programming language such as Fortran or PL/I. CADAC is an arithmetic unit which performs the four floating-point operations add/subtract/multiply/divide on decimal numbers in such a way as to support all the language requirements efficiently. A three-level pipeline is used to overlap two-digit-at-a-time serial processing of the partial products/remainders. Although the logic design is relatively complex, the performance is efficient, and the advantages gained by implementing programmer-controlled precision directly in the hardware are significant.
hp71spec1983
Chapter 13 – Internal Data Representations,
Hewlett Packard Company,
Software Internal Design Specification for the HP-71, Vol. 1 Part #00071-90068,
pp13.1–13.17,
Hewlett Packard Company,
December 1983.
Abstract: This chapter discusses the format in which the HP-71 represents numeric or string data in memory or in the CPU registers. Note: Manual available from The Museum of HP Calculators (www.hpmuseum.org).
kahan1983
Mathematics Written in Sand,
W. Kahan,
Proc. Joint Statistical Mtg. of the American Statistical Association,
pp12–26,
American Statistical Association,
1983.
Abstract: Simplicity is a Virtue; yet we continue to cram ever more complicated circuits ever more densely into silicon chips, hoping all the while that their internal complexity will promote simplicity of use. This paper exhibits how well that hope has been fulfilled by several inexpensive devices widely used nowadays for numerical computation. One of them is the Hewlett-Packard hp-15C programmable shirt-pocket calculator, on which only a few keys need be pressed to perform tasks like these: Real and Complex arithmetic, including the elementary transcendental functions and their inverses; Matrix arithmetic including inverse, transpose, determinant, residual, norms, prompted input/output and complex-real conversion; Solve an equation and evaluate an Integral numerically; simple statistics; Γ and combinatorial functions; ... For instance, a stroke of its [1/X] key inverts an 8x8 matrix of 10-sig.-dec. numbers in 90 sec. This calculator costs under $100 by mail-order. Mathematically dense circuitry is also found in Intel’s 8087 coprocessor chip, currently priced below $200, which has for two years augmented the instruction repertoire of the 8086 and 8088 microcomputer chips to cope with ... Three binary floating-point formats 32, 64 and 80 bits wide; three binary integer formats 16, 32 and 64 bits wide; 18-digit BCDecimal integers; rational arithmetic, square root, format conversion and exception handling all in conformity with p754, the proposed IEEE arithmetic standard (see “Computer” Mar. 1, 1981); the kernels of transcendental functions exp, log, tan and arctan; and an internal stack of eight registers each 80 bits wide. For instance, the 8087 has been used to invert a 100x100 matrix of 64-bit floating-point numbers in 90 sec. Among the machines that can use this chip are the widely distributed IBM Personal Computers, each containing a socket already wired for an 8087.
Several other manufacturers now produce arithmetic engines that, like the 8087, conform to the proposed IEEE arithmetic standard, so software that exploits its refined arithmetic properties should be widespread soon. As sophisticated mathematical operations come into use ever more widely, mathematical proficiency appears to rise; in a sense it actually declines. Computations formerly reserved for experts lie now within reach of whoever might benefit from them regardless of how little mathematics he understands; and that little is more likely to have been gleaned from handbooks for calculators and personal computers than from professors. This trend is pronounced among users of financial calculators like the hp-12C. Such trends ought to affect what and how we teach, as well as how we use mathematics, regardless of whether large fast computers, hitherto dedicated mostly to speed, ever catch up with some smaller machines’ progress towards mathematical robustness and convenience.
clen1984
Beyond Floating Point,
C. W. Clenshaw and F. W. J. Olver,
Journal of the ACM, Vol. 31 #2,
pp319–328,
ACM Press,
April 1984.
Abstract: A new number system is proposed for computer arithmetic based on iterated exponential functions. The main advantage is to eradicate overflow and underflow, but there are several other advantages and these are described and discussed. 
cody1984
A Proposed Radix- and Word-length-independent Standard for Floating-point Arithmetic,
W. J. Cody, J. T. Coonen, D. M. Gay, K. Hanson, D. Hough, W. Kahan, R. Karpinski, J. Palmer, F. N. Ris, and D. Stevenson,
IEEE Micro magazine, Vol. 4 #4,
pp86–100,
IEEE,
August 1984.
Abstract: This article places [Draft 1.0 of IEEE 854] before the public for the first time. ... This article also includes material that describes how decisions were reached in preparing the P854 draft and explains how to overcome some of the implementation problems. Note: Reprinted in ACM SIGNUM, Vol. 20, #1, pp35–51, 1985.
cowlis1984
The Design of the REXX Language,
M. F. Cowlishaw,
IBM Systems Journal, Vol. 23 #4,
pp326–335,
IBM (Offprint # G321-5228),
1984.
Abstract: One way of classifying computer languages is by two classes: languages needing skilled programmers, and personal languages used by an expanding population of general users. REstructured eXtended eXecutor (REXX) is a flexible personal language designed with particular attention to feedback from its users. It has proved to be effective and easy to use, yet it is sufficiently general and powerful to fulfill the needs of many demanding professional applications. REXX is system and hardware independent, so that it has been possible to integrate it experimentally into several operating systems. Here REXX is used for such purposes as command and macro programming, prototyping, education, and personal programming. This study introduces REXX and describes the basic design principles that were followed in developing it. Note: First published as IBM Hursley Technical Report TR12.223, October 1983. 
jones1984
A Significance Rule for Multiple-Precision Arithmetic,
Christopher B. Jones,
ACM Transactions on Mathematical Software (TOMS), Vol. 10 #1,
pp97–107,
ACM Press,
March 1984.
Abstract: Multiple-precision arithmetic overcomes the roundoff error incurred in conventional floating-point arithmetic, at the cost of increased processing overhead. Significance arithmetic takes into account the inexactness of the operands of a calculation, but can lead to loss of significant digits after a long series of operations. A new technique is described which alleviates the overhead of multiple-precision arithmetic by allowing nonsignificant digits to be discarded, while limiting the significance loss per operation to a controllable and acceptable rate. The technique is based on storing an inexact number as an interval, using a criterion of significance to determine the precision with which the limits of the interval should be stored. A procedure referred to as a significance rule uses this criterion to remove some of the nonsignificant digits from the limits of an interval prior to storage. A certain number of nonsignificant digits are retained as guard digits. Calculations are performed using exact interval arithmetic and the significance-rule procedure is invoked after each operation to remove superfluous digits. Roundoff in the procedure causes a slight increase in the interval width on each operation. This results in a cumulative loss of significance at a rate related to the number of guard digits.
auzing1985
Accurate Arithmetic Results for Decimal Data on Non-Decimal Computers,
Winfried Auzinger and H. J. Stetter,
Computing, 35,
pp141–151,
1985.
Abstract: Recently, techniques have been devised and implemented which permit the computation of smallest enclosing machine number interval for the exact results of a good number of highly composite operations. These exact results refer, however, to the data as they are represented in the computer. This note shows how the conversion of decimal data into nondecimal representations may be joined with the mathematical operation on the data into one high-accuracy algorithm. Such an algorithm is explicitly presented for the solution of systems of linear equations.
borl1985
Turbo Pascal Version 3.0 Reference Manual,
Borland International,
ISBN 0875240038,
386pp,
Borland International,
April 1985.
Abstract: Turbo Pascal 3 was the first Turbo Pascal version to support the Intel 8087 math coprocessor (16-bit PC version). It also included support for Binary Coded Decimal (BCD) math to eliminate round off errors in business applications. Turbo Pascal 3 also allowed you to build larger programs (> 64k bytes) using overlays. The PC version also supported Turtle Graphics, Color, Sound, Window Routines, and more.
hull1985a
Numerical Turing,
T. E. Hull, A. Abrham, M. S. Cohen, A. F. X. Curley, C. B. Hall, D. A. Penny, and J. T. M. Sawchuk,
SIGNUM Newsletter, Vol. 20 #3,
pp26–34,
ACM Press,
July 1985.
Abstract: Numerical Turing is an extension of the Turing programming language. Turing is a Pascal-like language (with convenient string handling, dynamic arrays, modules, and more general parameter lists) developed at the University of Toronto. Turing has been in use since May, 1983, and is now available on several machines. The Numerical Turing extension is especially designed for numerical calculations. The important new features are: (a) clean decimal arithmetic, along with convenient functions for directed roundings and exponent manipulation; (b) complete precision control of variables and operations. ...
hull1985b
Properly Rounded Variable Precision Square Root,
T. E. Hull and A. Abrham,
ACM Transactions on Mathematical Software, Vol. 11 #3,
pp229–237,
ACM Press,
September 1985.
Abstract: The square root function presented here returns a properly rounded approximation to the square root of its argument, or it raises an error condition if the argument is negative. Properly rounded means rounded to nearest, or to nearest even in case of a tie. It is variable precision in that it is designed to return a p-digit approximation to a p-digit argument, for any p > 0. (Precision p means p decimal digits.) The program and the analysis are valid for all p > 0, but current implementations place some restrictions on p.
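Illustration (not the authors' program): the behaviour described in this abstract can be tried out with Python's standard decimal module, whose square root is computed at a chosen precision with round-half-even rounding. The wrapper name sqrt_p_digits is hypothetical, and the sketch says nothing about how Hull and Abrham's own algorithm works internally.

```python
from decimal import Context, Decimal, InvalidOperation, ROUND_HALF_EVEN


def sqrt_p_digits(x: Decimal, p: int) -> Decimal:
    """Square root of x rounded to p significant decimal digits.

    A negative argument raises decimal.InvalidOperation, which plays the
    role of the "error condition" mentioned in the abstract.
    """
    ctx = Context(prec=p, rounding=ROUND_HALF_EVEN, traps=[InvalidOperation])
    return ctx.sqrt(x)


if __name__ == "__main__":
    print(sqrt_p_digits(Decimal(2), 5))
    print(sqrt_p_digits(Decimal("0.0625"), 3))
```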
ieee1985
IEEE 754-1985 IEEE Standard for Binary Floating-Point Arithmetic,
David Stevenson et al,
20pp,
IEEE,
July 1985.
Abstract: This standard defines a family of commercially feasible ways for new systems to perform binary floating-point arithmetic. The issues of retrofitting were not considered. It is intended that an implementation of a floating-point system conforming to this standard can be realized entirely in software, entirely in hardware, or in any combination of software and hardware. It is the environment the programmer or user of the system sees that conforms or fails to conform to this standard. Hardware components that require software support to conform shall not be said to conform apart from such software. Note: Reaffirmed 1991.
hull1986
Variable Precision Exponential Function,
T. E. Hull and A. Abrham,
ACM Transactions on Mathematical Software, Vol. 12 #2,
pp79–91,
ACM Press,
June 1986.
Abstract: The exponential function presented here returns a result which differs from e^{x} by less than one unit in the last place, for any representable value of x which is not too close to values for which e^{x} would overflow or underflow. (For values of x which are not within this range, an error condition is raised.) It is a “variable precision” function in that it returns a p-digit approximation for a p-digit argument, for any p > 0 (p-digit means p-decimal-digit). The program and analysis are valid for all p > 0, but current implementations place a restriction on p. The program is presented in a Pascal-like programming language called Numerical Turing which has special facilities for scientific computing, including precision control, directed roundings, and built-in functions for getting and setting exponents.
knuth1986
The IBM 650: An Appreciation from the Field,
Donald E. Knuth,
IEEE Annals of the History of Computing, Vol. 8 #1,
pp50–55,
IEEE,
January–March 1986.
Abstract: I suppose it was natural for a person like me to fall in love with his first computer. But there was something special about the IBM 650, something that has provided the inspiration for much of my life’s work. Somehow this machine was powerful in spite of its severe limitations. Somehow it was friendly in spite of its primitive man-machine interface...
ahmad1987
Implementable Decimal Arithmetic Algorithms for Micro/Minicomputers,
M. Ahmad,
Microprocessing and Microprogramming, Vol. 19 #2,
pp119–128,
February 1987.
Abstract: The need for efficient decimal arithmetic and its ever increasing applications in micro/minicomputers and microprocessor based equipment and appliances has been emphasised. Some algorithms suitable for implementation for decimal arithmetic operations of BCD packed decimal numbers have been suggested. These algorithms employ comparatively faster instructions available on most of the microprocessors and provide efficient and faster decimal arithmetic. 
bohl1987
A Decimal Floating-Point Processor for Optimal Arithmetic,
G. Bohlender and T. Teufel,
Computer arithmetic: Scientific Computation and Programming Languages,
ISBN 3519024489,
pp31–58,
B. G. Teubner Stuttgart,
1987.
Abstract: A floating-point processor for optimal arithmetic should perform scalar products with maximum accuracy in addition to the usual operations +, -, *, /. This means that scalar products have to be computed with an error of at most one bit of the least significant digit, even if cancellation of leading digits occurs. In order to avoid conversion errors during input and output of numerical data, the decimal number system should be chosen. The arithmetic processor BAP-SC performs these operations in a 64 bit floating-point format with 13 decimal digits in the mantissa. The prototype is built in bit-slice technology on wire-wrap boards. Interfaces have been developed [sic] for several busses and computers. The arithmetic processor is fully integrated in the programming language PASCAL-SC. It supports operations in higher numerical spaces and new numerical algorithms that compute verified results with error bounds.
duham1987
Atari System Reference Manual, section 11,
Bob DuHamel,
Atari,
1987.
Abstract: The routines which do floating point arithmetic are a part of the operating system ROM. The Atari computer uses the 6502’s decimal math mode. This mode uses numbers represented in packed Binary Coded Decimal (BCD). This means that each byte of a floating point number holds two decimal digits. The actual method of representing a full number is complicated and probably not very important to a programmer. However, for those with the knowledge to use it, the format is given below... Note: 6 bytes: 10-digit BCD, 7-bit excess-64 exponent.
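Illustration (digit packing only): the sketch below shows what "each byte holds two decimal digits" means for packed BCD. It is a generic example with hypothetical function names; it does not reproduce the Atari 6-byte floating-point layout, whose exponent and sign handling are not given in this entry.

```python
def pack_bcd(digits: str) -> bytes:
    """Pack a string of decimal digits into bytes, two digits per byte."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a non-empty string of ASCII decimal digits")
    if len(digits) % 2:
        digits = "0" + digits  # pad to a whole number of bytes
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])  # high nibble, low nibble
        for i in range(0, len(digits), 2)
    )


def unpack_bcd(packed: bytes) -> str:
    """Recover the digit string from packed BCD bytes."""
    return "".join(f"{byte >> 4}{byte & 0x0F}" for byte in packed)


if __name__ == "__main__":
    packed = pack_bcd("1234567890")  # ten digits fit in five bytes
    print(packed.hex(), unpack_bcd(packed))
```

A convenient property of the encoding is that the hexadecimal dump of the bytes reads as the decimal digits themselves.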
hp71ref1987a
Math Reference,
Hewlett Packard Company,
HP-71 Reference Manual, Mfg. # 00071-90110, Reorder # 00071-90010,
pp317–318,
Hewlett Packard Company,
October 1987.
Note: First edition October 1983. Subsections describe the numeric precisions available and the range of representable numbers. Manual available from The Museum of HP Calculators (www.hpmuseum.org). 
hp71ref1987b
The IEEE Proposal for Handling Math Exceptions,
Hewlett Packard Company,
HP-71 Reference Manual, Mfg. # 00071-90110, Reorder # 00071-90010,
pp338–345,
Hewlett Packard Company,
October 1987.
Abstract: The IEEE Radix Independent Floating-Point Proposal divides all of the floating-point “exceptional events” encountered in calculations into five classes of math exceptions: invalid operation, division by zero, overflow, underflow, and inexact result. Associated with each math exception is a flag that is set by the HP-71 whenever an exception is encountered. These flags remain set until you clear them. Each of these flags can be accessed by its number or its name. Note: First edition October 1983. Manual available from The Museum of HP Calculators (www.hpmuseum.org).
hull1987
Toward an Ideal Computer Arithmetic,
T. E. Hull and M. S. Cohen,
Proceedings of the 8th Symposium on Computer Arithmetic,
pp131–138,
IEEE,
May 1987.
Abstract: A new computer arithmetic is described. Closely related built-in functions are included. A user’s point of view is taken, so that the emphasis is on what language features are available to a user. The main new feature is flexible precision control of decimal floating-point arithmetic. It is intended that the language facilities be sufficient for describing numerical processes one might want to implement, while at the same time being simple to use, and implementable in a reasonably efficient manner. Illustrative examples are based on experience with an existing software implementation.
ieee1987
IEEE 854-1987 IEEE Standard for Radix-Independent Floating-Point Arithmetic,
W. J. Cody et al,
14pp,
IEEE,
March 1987.
Abstract: It is intended that an implementation of a floating-point system conforming to this standard can be realized entirely in software, entirely in hardware, or in any combination of software and hardware. It is the environment the programmer or user of the system sees that conforms or fails to conform to this standard. Hardware components that require software support to conform shall not be said to conform apart from such software. Note: Reaffirmed 1994.
mass1987
Superoptimizer: A Look at the Smallest Program,
Henry Massalin,
ACM Sigplan Notices, Vol. 22 #10 (Proceedings of the Second International Conference on Architectural support for Programming Languages and Operating Systems),
pp122–126,
ACM, also IEEE Computer Society Press #87CH2440-6,
October 1987.
Abstract: Given an instruction set, the superoptimizer finds the shortest program to compute a function. Startling programs have been generated, many of them engaging in convoluted bit-fiddling bearing little resemblance to the source programs which defined the functions. The key idea in the superoptimizer is a probabilistic test that makes exhaustive searches practical for programs of useful size. The search space is defined by the processor’s instruction set, which may include the whole set, but it is typically restricted to a subset. By constraining the instructions and observing the effect on the output program, one can gain insight into the design of instruction sets. In addition, superoptimized programs may be used by peephole optimizers to improve the quality of generated code, or by assembly language programmers to improve manually written code. Note: Also in: ACM SIGOPS, Operating Systems Review, Vol. 21 #4.
shirazi1988
VLSI designs for redundant binary-coded decimal addition,
Behrooz Shirazi, David Y. Y. Yun, and Chang N. Zhang,
IEEE Seventh Annual International Phoenix Conference on Computers and Communications, 1988,
pp52–56,
IEEE,
March 1988.
Abstract: Binary-coded decimal (BCD) system provides rapid binary-decimal conversion. However, BCD arithmetic operations are often slow and require complex hardware. One can eliminate the need for carry propagation and thus improve performance of BCD operations by using a redundant binary-coded decimal (RBCD) system. This paper introduces the VLSI design of an RBCD adder. The design consists of two small PLA’s and two four-bit binary adders for one digit of the RBCD adder. The addition delay is constant for n-digit RBCD addition (no carry propagation delay). The VLSI time and space complexities of the design as well as its layout are presented, showing the regularity of the structures. In addition, two simple algorithms and the corresponding hardware designs for conversion between RBCD and BCD are presented.
johnst1989
Higher Radix Floating Point Representations,
P. Johnstone and F. Petry,
Proceedings of the 9th Symposium on Computer Arithmetic,
ISBN 0818689633,
pp128–135,
IEEE Computer Society Press,
September 1989.
Abstract: This paper examines the feasibility of higher radix floating point representations, and in particular, decimal based representations. Traditional analyses of such representations have assumed the format of a floating point datum to be roughly identical to that of traditional binary floating point encodings such as the IEEE P754 task group standard representations. We relax this restriction and propose a method of encoding higher radix floating point data with range, precision, and storage requirements comparable to those exhibited by traditional binary representations. Results from McKeeman’s Maximum and Average Relative Representational Error (MRRE and ARRE) analyses, Brent’s RMS error evaluation, Matula’s ratio of significance space and gap functions, and Brown and Richman’s exponent range estimates are extended to accommodate the proposed representation. A decimal alternative to traditional binary representations is proposed, and the behavior of such a system is contrasted with that of a comparable binary system.
lee1989
Multistep Gradual Rounding,
Corinna Lee,
IEEE Transactions on Computers, Vol. 38 #4,
pp595–600,
IEEE,
April 1989.
Abstract: A value V is to be rounded to an arbitrary precision resulting in the value V". Conventional rounding technique uses one step to accomplish this. Alternatively, multistep rounding uses several steps to round the value V to successively shorter precisions with the final rounding step producing the desired value V". This alternate rounding method is one way to implement, with the minimum of hardware, the denormalization process that the IEEE Floating-Point Standard 754 requires when underflow occurs. There are certain cases for which multistep rounding produces a different result than single-step rounding. To prevent such a step error, the author introduces a rounding procedure called gradual rounding that is very similar to conventional rounding with the addition of two tag bits associated with each floating-point register.
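Illustration (decimal analogue, not the paper's method): the "step error" mentioned in this abstract is easy to reproduce with round-half-even in decimal. Rounding 2.4501 directly to one decimal place gives 2.5, while rounding first to two places and then to one gives 2.4. The helper name round_in_steps is hypothetical, and the sketch does not implement the tag-bit scheme the paper proposes.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_in_steps(value: Decimal, quanta: list[str]) -> Decimal:
    """Round `value` to each quantum in turn, using round-half-even."""
    for quantum in quanta:
        value = value.quantize(Decimal(quantum), rounding=ROUND_HALF_EVEN)
    return value


v = Decimal("2.4501")
single_step = round_in_steps(v, ["0.1"])         # 2.4501 is above the midpoint 2.45
multi_step = round_in_steps(v, ["0.01", "0.1"])  # 2.45 is an exact tie, goes to even

if __name__ == "__main__":
    print(single_step, multi_step)
```

The intermediate rounding discards the trailing 01 that would have broken the tie, which is the information a scheme such as the paper's tag bits would need to carry forward.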
moshier1989
Methods and Programs for Mathematical Functions,
Stephen L. Moshier,
415pp,
Prentice-Hall, Inc., Englewood Cliffs, New Jersey 07632, USA,
1989.
Abstract: This book provides a working collection of mathematical software for computing various elementary and higher functions. It also supplies tutorial information of a practical nature; the purpose of this is to assist in constructing numerical programs for the reader’s special applications. Though some of the main analytical techniques for deriving functional expansions are described, the emphasis is on computing; so there has been no attempt to incorporate or supplant the many books on functional and numerical analysis that are available. ... Note: Program source codes are available at http://www.netlib.org/cephes. 
wagner1989
Error detecting decimal digits,
Neal R. Wagner and Paul S. Putter,
Communications of the ACM, Vol. 32 #1,
pp106–110,
ACM Press,
January 1989.
Abstract: We were recently engaged by a large mail-order house to act as consultants on their use of check digits for detecting errors in account numbers. Since we were not experts in coding theory, we looked in reference books such as Error Correcting Codes [7] and asked colleagues who were familiar with coding theory. Uniformly, the answer was: There is no field of order 10; the theory only works over a field. This article relates our experiences and presents several of the simple decimal-oriented error detection schemes that are available, but not widely known. Note: ACM abstract: Decimal-oriented error detection schemes are explored in the context of one particular company project.
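Illustration (an assumed stand-in): the abstract does not name the schemes it covers, so the sketch below uses the widely known Luhn check as a familiar example of a decimal check-digit scheme. It should not be read as one of the paper's own recommendations, and the function name is hypothetical.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_valid("79927398713"))  # a commonly quoted valid example
    print(luhn_valid("79927398710"))  # single-digit error
```

The Luhn check catches every single-digit error but misses the adjacent transposition of 0 and 9, which is one reason stronger decimal schemes are of interest.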
clinger1990
How to read floating point numbers accurately,
William D. Clinger,
Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation,
pp92–101,
ACM Press,
June 1990.
Abstract: Consider the problem of converting decimal scientific notation for a number into the best binary floating point approximation to that number, for some fixed precision. This problem cannot be solved using arithmetic of any fixed precision. Hence the IEEE Standard for Binary Floating-Point Arithmetic does not require the result of such a conversion to be the best approximation. This paper presents an efficient algorithm that always finds the best approximation. The algorithm uses a few extra bits of precision to compute an IEEE-conforming approximation while testing an intermediate result to determine whether the approximation could be other than the best. If the approximation might not be the best, then the best approximation is determined by a few simple operations on multiple-precision integers, where the precision is determined by the input. When using 64 bits of precision to compute IEEE double precision results, the algorithm avoids higher-precision arithmetic over 99% of the time. The input problem considered by this paper is the inverse of an output problem considered by Steele and White: Given a binary floating point number, print a correctly rounded decimal representation of it using the smallest number of digits that will allow the number to be read without loss of accuracy. The Steele and White algorithm assumes that the input problem is solved; an imperfect solution to the input problem, as allowed by the IEEE standard and ubiquitous in current practice, defeats the purpose of their algorithm.
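Illustration (a check, not Clinger's algorithm): the sketch below tests whether a language's decimal-to-binary conversion returns the best approximation, by comparing the converted double and its two neighbours against the exact rational value of the input text. It assumes Python 3.9 or later for math.nextafter, handles only finite in-range inputs, and the function name is hypothetical.

```python
import math
from fractions import Fraction


def is_best_approximation(text: str) -> bool:
    """True if float(text) is at least as close to the exact value as either neighbour."""
    exact = Fraction(text)   # exact rational value of the decimal text
    converted = float(text)  # the conversion under test
    error = abs(Fraction(converted) - exact)
    neighbours = (
        math.nextafter(converted, -math.inf),
        math.nextafter(converted, math.inf),
    )
    # Finite, in-range inputs only: Fraction() cannot represent an infinity.
    return all(error <= abs(Fraction(n) - exact) for n in neighbours)


if __name__ == "__main__":
    for sample in ("0.1", "1e23", "3.141592653589793238"):
        print(sample, is_best_approximation(sample))
```

The inverse (output) direction discussed at the end of the abstract can be observed in the same environment: repr of a float yields a short string that reads back as the identical value.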
dec1990
Software Product Description: COBOL-81/RSTS/E, Version 3.1,
DEC,
3pp,
Digital Equipment Corporation,
December 1990.
Abstract: COBOL-81/RSTS/E is a high-level language for business data processing that operates under control of the RSTS/E Operating System. It is based on the 1985 ANSI COBOL Standard X3.23-1985 and includes all of the features necessary to achieve the intermediate level of that standard. COBOL-81/RSTS/E is a subset of VAX COBOL and includes various Digital Equipment Corporation extensions to COBOL, including screen handling at the source language level. COBOL-81/RSTS/E also supports the ANSI-1974 standard, and both standards are switch selectable using the /STA:V2 or /STA:85 switches.
gay1990
Correctly Rounded Binary-Decimal and Decimal-Binary Conversions,
David M. Gay,
Numerical Analysis Manuscript 90-10,
16pp,
AT&T Bell Laboratories,
November 1990.
Abstract: This note discusses the main issues in performing correctly rounded decimal-to-binary and binary-to-decimal conversions. It reviews recent work by Clinger and by Steele and White on these conversions and describes some efficiency enhancements. Computational experience with several kinds of arithmetic suggests that the average computational cost for correct rounding can be small for typical conversions. Source for conversion routines that support this claim is available from netlib.
steele1990
How to Print Floating-Point Numbers Accurately,
Guy L. Steele Jr. and Jon L. White,
Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation,
pp112–126,
ACM Press,
June 1990.
Abstract: We present algorithms for accurately converting [binary] floating-point numbers to decimal representation. The key idea is to carry along with the computation an explicit representation of the required rounding accuracy. We begin with the simpler problem of converting fixed-point fractions. A modification of the well-known algorithm for radix-conversion of fixed-point fractions by multiplication explicitly determines when to terminate the conversion process; a variable number of digits are produced. ...
bohl1991
Decimal Floating-Point Arithmetic in Binary Representation,
Gerd Bohlender,
Computer arithmetic: Scientific Computation and Mathematical Modelling (Proceedings of the Second International Conference, Albena, Bulgaria, 24–28 September 1990),
pp13–27,
J. C. Baltzer AG,
1991.
Abstract: The binary representation of decimal floating-point numbers permits an efficient implementation of the proposed radix independent IEEE standard for floating-point arithmetic, as far as storage space is concerned. Unfortunately the left and right shifts occurring in the arithmetic operations are very complicated and slow in this representation. In the present paper therefore methods are proposed which speed up these shifts; in particular a kind of carry lookahead technique is used for division. These methods can be combined to construct a decimal shifter which is needed in an ALU for decimal arithmetic.
glads1991
A method of designing a decimal arithmetic processor,
M. A. Gladshtein,
Automatic Control and Computer Sciences, Vol. 25 #6,
pp51–56,
1991.
Abstract: The advantages and drawbacks of binary numeric coding in digital computers have been considered. This type of coding has been shown ineffective in processing large data arrays especially when represented in the floating-point form. Also, the low efficiency of conventionally employed decimal computational procedures using the so-called corrections has been noted. It has been proposed, in designing digital computers, to renounce the principle of binary computations in favor of decimal operations on the basis of stored addition and multiplication tables using binary-decimal numeric coding. A version of circuit design for a decimal processor, algorithms and microprograms for addition and multiplication operations have been described. Advantages inherent in the method proposed have been analyzed. Note: Translated from Avtomatika i Vychislitel’naya Tekhnika UDC 681.3.48.
hull1991
Specifications for a Variable-Precision Arithmetic Coprocessor,
T. E. Hull, M. S. Cohen, and C. B. Hall,
Proceedings. 10th Symposium on Computer Arithmetic,
ISBN 0818691514,
pp127–131,
IEEE,
1991.
Abstract: The authors have been developing a programming system intended to be especially convenient for scientific computing. Its main features are variable precision (decimal) floating-point arithmetic and convenient exception handling. The software implementation of the system has evolved over a number of years, and a partial hardware implementation of the arithmetic itself was constructed and used during the early stages of the project. Based on this experience, the authors have developed a set of specifications for an arithmetic coprocessor to support such a system. The main purpose of this paper is to describe these specifications. An outline of the language features and how they can be used is also provided, to help justify our particular choice of coprocessor specifications.
ochs1991
Numeric types, representations, and other fictions,
T. Ochs,
Computer Language, Vol. 8 #8,
pp93–101,
August 1991.
Abstract: Only rational numbers are explicitly representable in computers. Any explicit representation has a zero measure. Both rational and BCD arbitrary precision meet [author’s] initial requirements [of precision and range]. Floating-point numbers have a strange distribution.
rosen1991
Supporting packed decimal in Ada,
David A. Rosenfeld,
Proceedings of the conference on TRI-Ada '91,
ISBN 0897914457,
pp187–190,
ACM Press,
1991.
Abstract: One of the principal barriers to Ada in the Information Systems (IS) marketplace is that Ada compilers do not support decimal arithmetic and a packed decimal representation for numbers. An Ada apologist could argue that Ada as a language does support these features, but such arguments do little to help a COBOL programmer, accustomed to manipulating decimal quantities in a straightforward way. Our project, under contract to the Army, is addressing the problem directly, by implementing packed decimal numbers in its MVS Ada Compiler. This paper will discuss the possible approaches to the problem, and explain the approach selected, comparing it briefly with other solutions...
scott1991
¿Web? 
Mathematics and computer science at odds over real numbers,
Thomas J. Scott,
ACM SIGCSE Bulletin, Vol. 23 #1 (Technical Symposium on Computer Science Education 1991),
pp130–139,
ACM Press,
1991.
Abstract: This paper discusses the “real number” data type as implemented by “floating point” numbers. Floating point implementations and a theorem that characterizes their truncations are presented. A teachable floating point system is presented, chosen so that most problems can be worked out with paper and pencil. Then major differences between floating point number systems and the continuous real number system are presented. Important floating point formats are next discussed. Two examples derived from actual computing practice on mainframes, minicomputers, and PCs are presented. The paper concludes with a discussion of where floating point arithmetic should be taught in standard courses in the ACM curriculum. 
tsang1991
¿Web? 
A Study of DataBase 2 Customer Queries,
Annie Tsang and Manfred Olschanowsky,
IBM Technical Report TR 03.413,
25pp,
IBM Santa Teresa Laboratory, San Jose, CA,
April 1991.
Abstract: Over 200 Database 2 read-only and update queries were collected from 30 major DB2 customers during 1989 and 1990. These queries were considered representative of customers using DB2. Analysis of these queries were made in order to determine their characteristics and also to determine which SQL functions were commonly used and how frequently they were used by these customers. Results of this study can be used in various ways, including:

aber1992
¿Web? 
Precise Computation Using Range Arithmetic, via C++,
Oliver Aberth and Mark J Schaefer,
ACM Transactions on Mathematical Software, Vol. 18 #4,
pp481–491,
ACM Press,
December 1992.
Abstract: An arithmetic is described that can replace floating-point arithmetic for programming tasks requiring assured accuracy. A general explanation is given of how the arithmetic is constructed with C++, and a programming example in this language is supplied. Times for solving representative problems are presented. 
arazi1992
¿Web? 
Binary-to-Decimal Conversion Based on the Divisibility of 2^{8}-1 [255] by 5,
B. Arazi and D. Naccache,
Electronics Letters, Vol. 28 #3,
pp2151–2152,
IEE,
November 1992.
Abstract: The Letter treats the case of converting a binary value, represented in the form of n bytes, into a decimal value, represented in the form of m BCD characters. The conversion, which is suitable for one-byte and two-byte processors, is based on the following observations: (a) 5 is a divisor of 2^{8}-1 and 2^{16}-1. (b) Modular binary arithmetic over 2^{r}-1 is easily performed. (c) Binary division by 2^{r}-1, in the case where the remainder is known to be zero, is easily performed. (d) All the prime factors of 2^{8}-1 and 2^{16}-1 are of the form 2^{r}+1. 
brosgol1992
¿Web? 
An Ada Decimal Arithmetic Capability,
Benjamin M. Brosgol, Robert I. Eachus, and David E. Emery,
CrossTalk, The Journal of Defense Software Engineering, Number 36,
8 (approx)pp,
US Air Force Software Technology Support Center,
September 1992.
Abstract: (None.) Support for financial processing requires suitable arithmetic facilities, representation control, and formatted output. This paper ... describes the possible approaches to the problem, the solution that the authors have developed, and the rationale for the choice. The name chosen for the solution, ADAR, stands for “Ada Decimal Arithmetic and Representations”. Note: Probably the same as or very similar to “Decimal arithmetic in Ada” by the same authors in the same year. 
dec1992
¿Web? 
Software Product Description: VAX 9000 Series Diagnostic Set,
DEC,
3pp,
Digital Equipment Corporation,
April 1992.
Abstract: VAX 9000 Series Diagnostic Set is a package of programs that allows users to maintain a VAX 9000 system. These diagnostics test all subsystems of the VAX 9000 system including the Power Control System, Service Processor System, CPU, Memory, I/O Adapters, and peripheral devices. The package includes firmware-based tests, service-processor-based tests, and macrodiagnostics. 
gold1992
¿Web? 
The Design of Floating-Point Data Types,
David Goldberg,
ACM Letters on Programming Languages and Systems, Vol. 1 #2,
pp138–151,
ACM Press,
June 1992.
Abstract: The issues involved in designing the floating-point part of a programming language are discussed. Looking at the language specifications for most existing languages might suggest that this design involves only trivial issues, such as whether to have one or two types of REALs or how to name the functions that convert from INTEGER to REAL. It is shown that there are more significant semantic issues involved. After discussing the tradeoffs for the major design decisions, they are illustrated by presenting the design of the floating-point part of the Modula-3 language. 
iso1992
¿Web? 
ISO/IEC 9075:1992: Information Technology – Database Languages – SQL,
Jim Melton et al,
626pp,
ISO,
1992.
Abstract: This International Standard was developed from ISO/IEC 9075:1989, Information Systems, Database Language SQL with Integrity Enhancements, and replaces that International Standard. It adds significant new features and capabilities to the specifications. It is generally compatible with ISO/IEC 9075:1989, in the sense that, with very few exceptions, SQL language that conforms to ISO/IEC 9075:1989 also conforms to this International Standard, and will be treated in the same way by an implementation of this International Standard as it would by an implementation of ISO/IEC 9075:1989... Note: Also available as ANSI INCITS 135-1992 (R1998). 
obai1992
¿Web? 
A Decimal Multiplication Algorithm for Microcomputers,
Mohammad S. Obaidat and Saleh A. Bleha,
Computers and Electrical Engineering, Vol. 18 #5,
pp357–363,
Elsevier,
September 1992.
Abstract: A decimal multiplication algorithm is developed and its implementation for microcomputers is illustrated. The algorithm can provide an average multiplication speedup equal to 1.34 compared to the traditional algorithm that is based on repeated additions if both are implemented in pure hardware. The average speedup of the developed algorithm is 1.20 if implemented on an 8-bit microcomputer system. The algorithm is significant especially for simple real-time applications that require cost-effective designs. 
vowels1992
¿Web? 
Division by 10,
R. A. Vowels,
Australian Computer Journal, Vol. 24 #3,
pp81–85,
ACS,
August 1992.
Abstract: Division of a binary integer and a binary floating-point mantissa by 10 can be performed with shifts and adds, yielding a significant improvement in hardware execution time, and in software execution time if no hardware divide instruction is available. Several algorithms are given, appropriate to specific machine word sizes, hardware and hardware instructions available, and depending on whether a remainder is required. The integer division algorithms presented here contain a new strategy that produces the correct quotient directly, without the need for the supplementary correction required of previously published algorithms. The algorithms are competitive in time with binary coded decimal (BCD) divide by 10. Both the integer and floating-point algorithms are an order of magnitude faster than conventional division. 
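The shift-and-add idea in this abstract can be illustrated with a short Python sketch for 32-bit unsigned integers. It is not the algorithm from the paper: it is the widely published estimate-then-correct form, which still needs the kind of supplementary correction the abstract says the paper avoids. The name div10 is a hypothetical one chosen for this illustration.

```python
def div10(n: int) -> int:
    """Return n // 10 for 0 <= n < 2**32 using shifts, adds and one compare."""
    if not 0 <= n < 2**32:
        raise ValueError("n must fit in an unsigned 32-bit word")
    # Approximate n * 0.8 by summing shifted copies of n
    # (0.8 is 0.110011001100... in binary).
    q = (n >> 1) + (n >> 2)
    q += q >> 4
    q += q >> 8
    q += q >> 16
    q >>= 3                          # estimate of n // 10, possibly one too small
    r = n - (((q << 2) + q) << 1)    # n - 10*q, with 10*q formed by shifts and adds
    return q + (1 if r > 9 else 0)   # supplementary correction of the estimate


if __name__ == "__main__":
    for value in (0, 9, 10, 99, 100, 2**32 - 1):
        print(value, div10(value))
```

The final comparison is the correction step; an algorithm that yields the quotient directly, as the abstract describes, would not need it.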
johnst1993
¿Web? 
Rational Number Approximation in Higher Radix Floating Point Systems,
P. Johnstone and F. Petry,
Computers and Mathematics with Applications, Vol. 25 #6,
pp103–108,
Pergamon Press,
1993.
Abstract: Recent proposals have suggested that suitably encoded nonbinary floating point representations might offer range and precision comparable to binary systems of equal word size. This is of obvious importance in that it allows computation to be performed on decimal operands without the overhead or error of base conversion while maintaining the error performance and representational characteristics of more traditional encodings. There remains, however, a more general question on the effect of the choice of radix on the ability of floating point systems to represent arbitrary rational numbers. Mathematical researchers have long recognized that some bases offer some representational advantages in that they generate fewer nonterminate values when representing arbitrary rational numbers. Base twelve, for example, has long been considered preferred over base ten because of its inclusion of three in its primary factorization allowing finite representation of a greater number of rational numbers. While such results are true for abstract number systems, little attention has been paid to machine based computation and its finite resources. In this study, such results are considered in an environment more typical of computer based models of number systems. Specifically, we consider the effect of the choice of floating point base on rational number approximation in systems which exhibit the typical characteristics of floating point representations – normalized encodings, limited exponent range and storage allocated in a fixed number of ‘bits’ per datum. The frequency with which terminate and representable results can be expected is considered for binary, decimal, and other potentially interesting bases. 
krand1993
¿Web? 
Efficient Multiprecision Floating Point Multiplication with Optimal Directional Rounding,
Werner Krandick and Jeremy R. Johnson,
Proceedings of the 11th IEEE Symposium on Computer Arithmetic,
6pp,
IEEE,
1993.
Abstract: An algorithm is described for multiplying multiprecision floating-point numbers. The algorithm produces either the smallest floating-point number greater than or equal to the true product or the greatest floating-point number smaller than or equal to the true product. Software implementations of multiprecision floating-point multiplication can reduce the computing time by a factor of two if they do not compute the low order digits of the product of the two mantissas. However, these algorithms do not necessarily provide optimally rounded results. The algorithm described in this paper is guaranteed to produce optimally rounded results and typically obtains the same savings. 
tumlin1993
¿Web? 
An evaluation of the design of the Gamma 60,
T. J. Tumlin and M. Smothermann,
Actes du 3e colloque de l'Histoire de l'Informatique,
11pp,
Sophia-Antipolis, INRIA,
1993.
Abstract: The Bull Gamma 60 remains a major innovation in computer design. Its use of explicit FORK-JOIN parallelism is shown by a simulation model to wisely exploit a large difference in speeds between logic components and memory elements, as found on some machines of the 1950’s. Recently the reappearance of a large speed ratio makes the same type of explicit FORK-JOIN parallelism attractive in advanced designs and validates the latency-tolerant design philosophy of the Gamma 60. The major difficulty of the design is the programming effort required to fully express the parallelism available in programs. 
brosgol1994
¿Web? 
Information Systems Development in Ada,
Benjamin M. Brosgol, Robert I. Eachus, and David E. Emery,
Eleventh Annual Washington Ada Symposium,
pp2–16,
ACM Press,
June 1994.
Abstract: (None.) In this paper we survey how to use Ada (both Ada 83 and Ada 9X) for IS applications, with a focus on two principal issues: Specification of the information architecture of an IS application, and Programming techniques relevant to financial and related applications. We cover both the language features and the supplemental packages for IS development. Special attention will be paid to the Ada Decimal-Associated Reusabilia (“ADAR”) components for Ada 83 and transitioning to Ada 9X. 
dall1994
¿Web? 
Dynamics of Arithmetic: A Connectionist View of Arithmetic Skills,
Richard Z. Dallaway,
ISSN 1350-3162,
159pp,
CSRP 306, University of Sussex,
February 1994.
Abstract: Arithmetic takes time. Children need five or six years to master the one hundred multiplication facts (0×0 to 9×9), and it takes adults approximately one second to recall an answer to a problem like 7×8. Multicolumn arithmetic (e.g., 45×67) requires a sequence of actions, and children produce a host of systematic mistakes when solving such problems. This thesis models the time course and mistakes of adults and children solving arithmetic problems. Two models are presented, both of which are built from connectionist components. 
hansen1994
¿Web? 
Multiplelength Division Revisited: a Tour of the Minefield,
Per Brinch Hansen,
Software – Practice and Experience, Vol. 24 #6,
pp579–601,
John Wiley & Sons,
June 1994.
Abstract: Long division of natural numbers plays a crucial role in Cobol arithmetic, cryptography, and primality testing. Only a handful of textbooks discuss the theory and practice of long division, and none of them do it satisfactorily. This tutorial attempts to fill this surprising gap in the literature on computer algorithms. We illustrate the subtleties of long division by examples, define the problem concisely, summarize the theory, and develop a complete Pascal algorithm using a consistent terminology. 
jackson1994
URL ¿Web? 
Precision Control and Exception Handling in Scientific Computing,
K. R. Jackson and N. S. Nedialkov,
Technical report,
pp1–8,
Computer Science Dept., University of Toronto,
1994.
Abstract: This paper describes convenient language facilities for precision control and exception handling. Nedialkov has developed a variable-precision and exception handling library, SciLib, implemented as a numerical class library in C++. A new scalar data type, real, is introduced, consisting of variable-precision floating-point numbers. Arithmetic, relational, and input and output operators of the language are overloaded for reals, so that mathematical expressions can be written without explicit function calls. Precision of computations can be changed during program execution. The exception handling mechanism treats only numerical exceptions and does not distinguish between different types of exceptions. The proposed precision control and exception handling facilities are illustrated by sample SciLib programs. 
johnst1994
¿Web? 
Design and Analysis of Nonbinary Radix Floating Point Representations,
P. Johnstone and F. Petry,
Computers and Electrical Engineering, Vol. 20 #1,
pp39–50,
Elsevier,
January 1994.
Abstract: This paper examines the feasibility of higher radix floating point representations and in particular decimal based representations. Traditional analyses of such representations have assumed the format of a floating point datum to be roughly identical to that of traditional binary floating point encodings such as the IEEE P754 task group standard representations. We relax this restriction and propose a method of encoding higher radix floating point data with range, precision, and storage requirements comparable to those exhibited by traditional binary representations. Results from McKeeman’s Maximum and Average Relative Representational Error (MRRE and ARRE) analyses, Brent’s RMS error evaluation, Matula’s ratio of significance space and gap functions, and Brown and Richman’s exponent range estimates are extended to accommodate the proposed representation. A decimal alternative to traditional binary representations is proposed, and the behavior of such a system is contrasted with that of a comparable binary system. Note: Almost identical to 1989 Higher Radix Floating Point Representations by the same authors. 
walters1994
¿Web? 
A Complete Term Rewriting System for Decimal Integer Arithmetic,
H. R. Walters,
Technical Report CS9435,
9pp,
Centrum voor Wiskunde en Informatica (CWI),
August 1994.
Abstract: We present a term rewriting system for decimal integers with addition and subtraction. We prove that the system is confluent and terminating. 
carre1995
¿Web? 
Specification of the IEEE-854 Floating-Point Standard in HOL and PVS,
Victor A. Carreño and Paul S. Miner,
HOL95: Eighth International Workshop on Higher-Order Logic Theorem Proving and Its Applications,
16pp,
Brigham Young University,
September 1995.
Abstract: The IEEE-854 Standard for radix-independent floating-point arithmetic has been partially defined within two mechanical verification systems. We present the specification of key parts of the standard in both HOL and PVS. This effort to formalize IEEE-854 has given the opportunity to compare the styles imposed by the two verification systems on the specification. 
iso1995
URL ¿Web? 
ISO/IEC 8652:1995: Information Technology – Programming Languages – Ada (Ada 95 Reference Manual: Language and Standard Libraries),
S. Tucker Taft and Robert A. Duff,
ISBN 3540631445,
552pp,
Springer-Verlag,
July 1997.
Abstract: This International Standard specifies the form and meaning of programs written in Ada. Its purpose is to promote the portability of Ada programs to a variety of data processing systems. 
thimbleby1995
URL ¿Web? 
A new calculator and why it is necessary,
Harold Thimbleby,
The Computer Journal, Vol. 38 #6,
pp418–433,
OUP,
1995.
Abstract: Conventional calculators are badly designed: they suffer from bad computer science – they are unnecessarily difficult to use and buggy. I describe a solution, avoiding the problems caused by conventional calculators, one that is more powerful and arguably much easier to use. The solution has been implemented, and design issues are discussed. This paper shows an interactive system that is declarative, with the advantages of clarity and power that entails. It frees people from working out how a calculation should be expressed to concentrating on what they want solved. An important contribution is to demonstrate the very serious problems users face when using conventional calculators, and hence what a freedom a declarative design brings. 
ansi1996
¿Web? 
ANSI X3.274-1996: American National Standard for Information Technology – Programming Language REXX,
Brian Marks and Neil Milsted,
167pp,
ANSI,
February 1996.
Abstract: This standard provides an unambiguous definition of the programming language REXX. Its purpose is to facilitate portability of REXX programs for use on a wide variety of computer systems. Note: Errata also available, as ANSI X3.274-1996/AM 1-2000. 
burg1996
¿Web? 
Printing Floating-Point Numbers Quickly and Accurately,
Robert G. Burger and R. Kent Dybvig,
Proceedings of the ACM SIGPLAN '96 conference on Programming language design and implementation,
pp108–116,
ACM Press,
1996.
Abstract: This paper presents a fast and accurate algorithm for printing floating-point numbers in both free- and fixed-format modes. In free-format mode, the algorithm generates the shortest, correctly rounded output string that converts to the same number when read back in, accommodating whatever rounding mode the reader uses. In fixed-format mode, the algorithm generates a correctly rounded output string using special # marks to denote insignificant trailing digits. For both modes, the algorithm employs a fast estimator to scale floating-point numbers efficiently. 
guedj1996
¿Web? 
Numbers: The Universal Language,
Denis Guedj,
ISBN 0810928450,
176pp,
Harry N. Abrams, Inc,
1997.
Abstract: Numbers, like letter forms, have a rich and complex history. Who first invented them? How old are they, and how were they developed? ... With Chronology and Glossary. Many referenced illustrations. Note: Translated from the French (Empire des nombres) by Lory Frankel. 
doring1997
¿Web? 
Decimal Adjustment of Long Numbers in Constant Time,
Andreas Döring and Wolfgang J. Paul,
Information Processing Letters, Vol. 62 #3,
pp161–163,
Elsevier Science B.V.,
June 1997.
Abstract: We propose a very simple method for adding and subtracting n-digit binary coded decimal (BCD) numbers with a small constant number of ordinary operations of a 4n-bit binary ALU. With this method addition/subtraction of 8-digit decimal numbers on an Intel 486 processor is faster than programs that use the special built-in operations for decimal adjustment. 
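Adding packed BCD numbers with a constant number of ordinary binary operations can be sketched in Python as below. This is a commonly cited formulation of the technique: bias every digit by 6, add in binary, then remove the bias from the digits that produced no carry. It may differ from the exact sequence of operations given in the paper. The names bcd_add8, to_bcd and from_bcd are hypothetical, and Python's unbounded integers stand in for a word one bit wider than 32 bits.

```python
MASK32 = 0xFFFFFFFF


def to_bcd(n: int) -> int:
    """Pack a non-negative integer into BCD, e.g. 1234 -> 0x1234."""
    return int(str(n), 16)


def from_bcd(x: int) -> int:
    """Unpack a valid BCD value, e.g. 0x1234 -> 1234."""
    return int(format(x, "x"))


def bcd_add8(a: int, b: int):
    """Add two 8-digit packed BCD values; return (8-digit BCD sum, carry out)."""
    t1 = a + 0x66666666            # bias each digit by 6 so a decimal carry becomes a nibble carry
    t2 = t1 + b                    # ordinary binary addition
    t3 = t1 ^ b                    # the same sum without any carry propagation
    t4 = t2 ^ t3                   # bits that differ show where a carry arrived
    t5 = ~t4 & 0x111111110         # marks digits that did NOT carry out
    t6 = (t5 >> 2) | (t5 >> 3)     # a 6 in every such digit
    total = t2 - t6                # remove the bias where no carry occurred
    return total & MASK32, total >> 32


if __name__ == "__main__":
    s, carry = bcd_add8(to_bcd(19), to_bcd(3))
    print(format(s, "08x"), carry)
```

Subtraction can be built on the same routine by adding the ten's complement of the subtrahend.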
euro1997
URL ¿Web? 
The Introduction of the Euro and the Rounding of Currency Amounts,
European Commission,
29pp,
European Commission Directorate General II Economic and Financial Affairs,
1997.
Abstract: The rounding rules laid down in the legal framework of the euro are an integral part of the monetary law of the euro area. The legal equality of the euro unit and the national currency units is based on their application and the application of the conversion rates. The basic rules laid down in the Council Regulation (EC) No 1103/97 are... 
hanson1997
URL ¿Web? 
Economical Correctly Rounded Binary Decimal Conversions,
Kenton Hanson,
URL: http://www.dnai.com/~khanson/ECRBDC.html,
5pp,
1997.
Abstract: Economical correctly rounded binary to decimal and decimal to binary conversions simplifies computing environments. Undue confusion and inaccuracies can occur with less precise conversions. Correct conversions can easily be guaranteed with very large precision arithmetic, but may cause performance and space penalties. Mostly correct conversions can be achieved with machine arithmetic. We demonstrate that correctly rounded conversions can be guaranteed with a minimum amount of extra precision arithmetic. An efficient algorithm for finding the most difficult conversions is described in detail. We then use these results to show how correct conversions can be guaranteed with a minimum of extra precision. Most normal conversions only require native machine arithmetic. Determining when extra precision is needed is straightforward. Note: Only available as a web page. 
holm1997
¿Web? 
Composite Arithmetic: Proposal for a New Standard,
W. Neville Holmes,
IEEE Computer,
pp65–73,
IEEE,
March 1997.
Abstract: A general-purpose arithmetic standard could give general computation the kind of reliability and stability that the floating-point standard brought to scientific computing. The author describes composite arithmetic as a possible starting point. 
texas1997
¿Web? 
TI-86 Graphing Calculator Guidebook,
Texas Instruments,
419pp,
Texas Instruments,
September 1997.
Abstract: User’s Guide for the TI-86 Graphing Calculator. Note: Revised February 2001. 
crensh1998
URL ¿Web? 
Integer Square Roots,
Jack W. Crenshaw,
Embedded Systems Programming, Vol. 11 #2,
EDTN,
February 1998.
darcy1998
¿Web? 
Borneo 1.0.2 – Adding IEEE 754 floating-point support to Java,
Joseph D. Darcy,
129pp,
University of California, Berkeley,
May 1998.
Abstract: The design of Java relies heavily on experiences with programming languages past. Major Java features, including garbage collection, object-oriented programming, and strong static type checking, have all proven their worth over many years. However, Java breaks with tradition in its floating-point support; instead of accepting whatever floating point formats a machine might provide, Java mandates use of the nearly ubiquitous IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754-1985). Unfortunately, Java’s specification creates two problems for numerical computation: only a strict subset of IEEE 754’s required features are supported by Java and Java’s bit-for-bit reproducibility goals for floating-point computation cause significant performance penalties on popular architectures. Java forbids using some distinguishing features of IEEE 754, features designed to make building robust numerical software by numerical experts and novices alike easier than in the past. Only simple floating-point features common to IEEE 754 and obsolete floating-point formats are allowed. Legitimate differences exist among various standard-conforming realizations of IEEE 754. For example, the x86 processor family supports the IEEE 754 recommended 80 bit double extended format in addition to the float and double formats found on other architectures. In many instances, using the double extended format for intermediate results leads to more robust programs. To support its “write once, run anywhere” goals, Java specifies that only the float and double formats be used for intermediate results in numeric expressions. For recent x86 processors to emulate exactly a machine that only uses float and double entails a significant performance penalty; over an order of magnitude degradation has been reported. 
An analogous situation arises on architectures such as the PowerPC that support a fused multiply accumulate instruction; Java semantics preclude using a hardware feature that would usually give more accurate answers faster. However, even numerical analysts do not need or desire exact reproducibility in all cases. The disallowed x86 features were designed to allow numerically unsophisticated programs to have a better likelihood of getting reasonable results. To address these concerns, the Java dialect Borneo is able to express all required features of IEEE 754. Borneo also aims to run efficiently on multiple hardware implementations of IEEE 754 and to allow convenient construction of new numeric types. 
euro1998
URL ¿Web? 
The Introduction of the Euro and the Rounding of Currency Amounts,
European Commission Directorate General II,
II/28/99EN Euro Papers No. 22.,
32pp,
DGII/C4SP(99) European Commission,
March 1998, February 1999.
Abstract: The purpose of the present document is to respond in a systematic manner to the various questions on rounding which the Commission services have received since the adoption of the Council regulation on certain provisions relating to the introduction of the euro in June 1997. To this end it tries to clarify the interpretation of the rounding provisions in the legal framework of the euro and to give guidance on technical aspects of rounding. 
gord1998
URL ¿Web? 
A Calculated Look at Fixed-Point Arithmetic,
Robert Gordon,
Embedded Systems Programming, Vol. 11 #4,
pp72–78,
Miller Freeman, Inc,
April 1998.
Abstract: This article explores the subject of fixed-point numbers and presents techniques you can use to implement efficient, fixed-precision number applications. 
ibm1998
URL ¿Web? 
Decimal Arithmetic Instructions,
IBM,
ESA/390 Principles of Operation, Chapter 8,
IBM,
1998.
Abstract: The decimal instructions of this chapter perform arithmetic and editing operations on decimal data. Additional operations on decimal data are provided by several of the instructions in Chapter 7, “General Instructions”. Decimal operands always reside in storage, and all decimal instructions use the SS instruction format. Decimal operands occupy storage fields that can start on any byte boundary. 
knuth1998
URL ¿Web? 
The Art of Computer Programming, Vol 2,
Donald E. Knuth,
ISBN 0201896842,
762pp,
Addison Wesley Longman,
1998.
Abstract: The chief purpose of this chapter [4] is to make a careful study of the four basic processes of arithmetic: addition, subtraction, multiplication, and division. Many people see arithmetic as a trivial thing that children learn and computers do, but we will see that arithmetic is a fascinating topic with many interesting facets. ... Note: Third edition. See especially sections 4.1 through 4.4. 
msdn1998
URL ¿Web? 
MSDN Library Visual Basic 6.0 Reference,
Microsoft Corporation,
URL: http://msdn.microsoft.com/library,
Microsoft Corporation,
2002.
Abstract: The contents of the Visual Basic Language Reference and Controls Reference includes topics on the controls, objects, properties, methods, events, statements, functions, and constants available. Additionally, this Reference contains topics on wizards, trappable errors, data types, keyboard shortcuts, and bidirectional programming. 
peuto1998
¿Web? 
An Instruction Timing Model of CPU Performance,
Bernard L. Peuto and Leonard J. Shustek,
International Conference on Computer Architecture: 25 years of the International Symposia on Computer architecture,
pp152–165,
ACM Press,
1998.
Abstract: A model of high-performance computers is derived from instruction timing formulas, with compensation for pipeline and cache memory effects. The model is used to predict the performance of the IBM 370/168 and the Amdahl 470 V/6 on specific programs, and the results are verified by comparison with actual performance. Data collected about program behavior is combined with the performance analysis to highlight some of the problems with high-performance implementations of such architectures. Note: Original reference: ISCA 1977: pp165–178. 
takashi1998
¿Web? 
Floating Point Number Format with Number System with Base of 1000,
Y. Takashi,
IBM Technical Disclosure Bulletin, 01-98,
pp609–610,
IBM,
January 1998.
Abstract: Disclosed is a use number system with a base of 1000 instead of 2 at the mantissa part of a floating point number. The unit is 10 bit. Each 10 bit keeps the value between 0 and 1000. This format is superior to Binary Coded Decimal (BCD) because it can keep more decimal numbers in the same size. This format is superior to binary because 1000 is 100 times of 10, and it makes no difference when converted to/from human’s decimal format. 
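The storage idea in this abstract can be illustrated with a minimal Python sketch, assuming each 10-bit unit holds one base-1000 digit in the range 0 to 999 (three decimal digits), where BCD would need 12 bits for the same three digits. The function names and the least-significant-group-first layout are assumptions made for this illustration; the abstract does not specify them.

```python
GROUP_BITS = 10          # one base-1000 digit per 10-bit unit
GROUP_MASK = 0x3FF


def pack_base1000(n: int) -> int:
    """Pack a non-negative integer into 10-bit groups of three decimal digits."""
    if n < 0:
        raise ValueError("n must be non-negative")
    packed, shift = 0, 0
    while True:
        n, group = divmod(n, 1000)   # peel off three decimal digits
        packed |= group << shift
        shift += GROUP_BITS
        if n == 0:
            return packed


def unpack_base1000(packed: int) -> int:
    """Inverse of pack_base1000."""
    n, scale = 0, 1
    while packed:
        n += (packed & GROUP_MASK) * scale
        packed >>= GROUP_BITS
        scale *= 1000
    return n


if __name__ == "__main__":
    p = pack_base1000(123456)
    print(bin(p), unpack_base1000(p))
```

With this layout nine decimal digits occupy 30 bits, against 36 bits in BCD, and each group converts to and from printed decimal digits independently of the others.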
abbott1999
URL ¿Web? 
Architecture and software support in IBM S/390 Parallel Enterprise Servers for IEEE Floating-Point arithmetic,
Paul H. Abbott et al,
IBM Journal of Research and Development, Vol. 43 #5/6,
pp723–760,
IBM,
September/November 1999.
Abstract: IEEE Binary Floating-Point is an industry-standard architecture. The IBM System/360 hexadecimal floating-point architecture predates the IEEE standard and has been carried forward through the System/370 to current System/390 processors. The growing importance of industry standards and floating-point combined to produce a need for IEEE Floating-Point on System/390. At the same time, customer investment in IBM floating-point had to be preserved. This paper describes the architecture, hardware, and software efforts that combined to produce a conforming implementation of IEEE Floating-Point on System/390 while retaining compatibility with the original IBM architecture. 
kaplan1999
¿Web? 
The Nothing That Is – A Natural History of Zero,
Robert Kaplan,
ISBN 0195128427,
225pp,
Oxford University Press,
1999.
Abstract: If you look at zero you see nothing; but look through it and you will see the world. For zero brings into focus the great, organic sprawl of mathematics, and mathematics in turn the complex nature of things. ... Note: Also available in paperback: ISBN 0195142373. 
texas1999
¿Web? 
TI-89 TI-92 Plus Guidebook,
Texas Instruments,
606pp,
Texas Instruments,
November 1999.
Abstract: User’s Guide for the TI-89 and TI-92 Plus Graphing Calculators. Note: Revised February 2001. 
imajo2000
¿Web? 
COBOL Script: A Business-Oriented Scripting Language,
T. Imajo, T. Miyake, S. Sato, T. Ito, D. Yokotsuka, Y. Tsujihata, and S. Uemura,
Proceedings of the Fourth International Conference on Enterprise Distributed Object Computing (EDOC'00),
pp231–240,
IEEE,
September 2000.
Abstract: This paper describes COBOL Script, a Web-oriented script language developed by Hitachi. COBOL Script includes the following features: (1) The language specifications, which consist of functions required for Web computing, are a subset of COBOL85, the most frequently used programming language in business information systems. (2) COBOL Script supports decimal arithmetic functions that have the same precision as in standard COBOL85 on mainframe computers. (3) Efficient implementation was based on analysis of the pros and cons of the COBOL processing system. Using COBOL Script, users can: (1) Process applications requiring high precision, such as account-related applications, over the Web. (2) Use a test debugger and a Coverage Function with COBOL Script for large-scale development projects. (3) Use Japanese in programs. (4) Achieve good runtime performance. 
seife2000
¿Web? 
ZERO – The Biography of a Dangerous Idea,
Charles Seife,
ISBN 067088457X,
248pp,
Penguin Books Ltd.,
2000.
Abstract: The Babylonians invented it, the Greeks banned it, the Hindus worshipped it, and the Church used it to fend off heretics. For centuries, the power of zero savored of the demonic; once harnessed, it became the most important tool in mathematics... Note: Also available in paperback: ISBN 0140296476. 
shiba2000
Decimal arithmetic in applications and hardware,
Akira Shibamiya,
2pp,
pers. comm.,
14 June 2000.
Abstract: (None) 
busa2001
The IBM z900 Decimal Arithmetic Unit,
Fadi Y. Busaba, Christopher A. Krygowski, Wen H. Li, Eric M. Schwarz, and Steven R. Carlough,
Conference Record of the 35th Asilomar Conference on Signals, Systems and Computers, Vol. 2,
ISBN 0 7803 7147 X,
pp1335–1339,
IEEE,
Nov. 2001.
Abstract: As the cost for adding function to a processor continues to decline, processor designs are including many additional features. An example of this trend is the appearance of graphics engines and compression engines on midrange and even low end microprocessors. One area that has the potential to capture chip real estate is the decimal arithmetic engine because of its importance in financial and business applications. Studies show that 55% of the numeric data stored on commercial databases are in decimal format. Although decimal arithmetic is supported in many software languages it is not yet available on many microprocessors. This paper details the decimal arithmetic engine in the recently announced z900 microprocessor. Note: IEEE cat #01ch37256. 
cowlis2001
A Decimal FloatingPoint Specification,
Michael F. Cowlishaw, Eric M. Schwarz, Ronald M. Smith, and Charles F. Webb,
Proceedings of the 15th IEEE Symposium on Computer Arithmetic,
ISBN 0769511503,
pp147–154,
IEEE,
June 2001.
Abstract: Even though decimal arithmetic is pervasive in financial and commercial transactions, computers are still implementing almost all arithmetic calculations using binary arithmetic. As chip real estate becomes cheaper it is becoming likely that more computer manufacturers will provide processors with decimal arithmetic engines. Programming languages and databases are expanding the decimal data types available while there has been little change in the base hardware. As a result, each language and application is defining a different arithmetic and few have considered the efficiency of hardware implementations when setting requirements. In this paper, we propose a decimal format which meets the requirements of existing standards for decimal arithmetic and is efficient for hardware implementation. We propose this specification in the hope that designers will consider providing decimal arithmetic in future microprocessors and that future decimal software specifications will consider hardware efficiencies. Note: Eric Schwarz’s presentation foils are also available. 
ecma2001
C# Language Specification,
Rex Jaeschke,
ECMATC39TG22001,
520pp,
ECMA,
September 2001.
Abstract: This International Standard specifies the form and establishes the interpretation of programs written in the C# programming language. It specifies: The representation of C# programs; The syntax and constraints of the C# language; The semantic rules for interpreting C# programs; The restrictions and limits imposed by a conforming implementation of C#. Note: Final draft submitted for ECMA GA approval December 2001. 
iso2001
Proposed Revision of ISO 1989:1985 Information technology – Programming languages, their environments and system software interfaces – Programming language COBOL,
JTC1/SC22/WG4,
905pp,
INCITS,
December 2001.
Abstract: COBOL began as a business programming language, but its present use has spread well beyond that to a generalpurpose programming language. COBOL is well known for its file handling capabilities, which are extended in this revision by the addition of file sharing and record locking capabilities. Other major enhancements add objectoriented capabilities, handling of national characters, and enhanced interoperability with other programming languages. This is the proposed ISO/IEC 1989:2002 final draft. 
johnst2001
Architecture and Algorithms for Processing Nonbinary Floating Point Radices,
Paul Johnstone and Frederick E. Petry,
unpublished paper,
39pp,
pers. comm.,
July 2001.
Abstract: Recent studies have proposed several nonbinary floating point representations which possess most of the storage and algorithmic efficiencies of traditional binary systems with no sacrifice of precision and only modest reductions in range. Such systems possess inherent advantages in that they employ less complicated conversion algorithms and are less prone to errors in representation. Additionally, nonbinary systems tend to produce more precise arithmetic results in that common problem of truncation of an infinitely repeating quotient occurs with a lesser frequency. However, as has been previously observed, traditional binary floating representations are most efficiently adapted to the prevailing choices of technology and system architecture. Previous research has left undone the quantification and evaluation of the algorithms and componentry necessary to effect the proposed representations in a fully realized system. We consider in this study the expected impact of adding the capacity to process one of the proposed nonbinary radix representations within a conventional computer system. Since decimal representations are clearly the overwhelming impetus for these studies, discussion will focus solely on base 10 systems. Examination of implementation issues are directed toward the following areas: the implementation of floating point representations in contemporary computer architectures, the design of any extensions to such systems, the effects on system complexity and cost, and, finally, resulting algorithmic revisions. 
lemieux2001
FixedPoint Math in C,
Joe Lemieux,
Embedded Systems Programming, Vol. 14 #4,
EDTN,
April 2001.
Abstract: Floatingpoint arithmetic can be expensive if you’re using an integeronly processor. But floatingpoint values can be manipulated as integers, as a less expensive alternative. 
texas2001a
TI89/TI92 Plus Developers Guide, Beta Version .02,
Texas Instruments,
1356pp,
Texas Instruments,
2001.
Note: Available from education.ti.com web site. 
texas2001b
TI89/TI92 Plus Sierra C Assembler Reference Manual, Beta Version .02,
Texas Instruments,
322pp,
Texas Instruments,
2001.
Note: Available from education.ti.com web site. 
cowlis2002
Densely Packed Decimal Encoding,
Michael F. Cowlishaw,
IEE Proceedings – Computers and Digital Techniques, Vol. 149 #3,
ISSN 13502387,
pp102–104,
IEE, London,
May 2002.
Abstract: ChenHo encoding is a lossless compression of three Binary Coded Decimal digits into 10 bits using an algorithm which can be applied or reversed using only simple Boolean operations. An improvement to the encoding which has the same advantages but is not limited to multiples of three digits is described. The new encoding allows arbitrarylength decimal numbers to be coded efficiently while keeping decimal digit boundaries accessible. This in turn permits efficient decimal arithmetic and makes the best use of available resources such as storage or hardware registers. 
cowlis2002b
The ‘telco’ benchmark,
M. F. Cowlishaw,
URL: http://speleotrove.com/decimal,
3pp,
IBM Hursley Laboratory,
May 2002.
Abstract: This benchmark was devised in order to investigate the balance between Input and Output (I/O) time and calculation time in a simple program which realistically captures the essence of a telephone company billing application. In summary, the application reads a large input file containing a suitably distributed list of telephone call durations (each in seconds). For each call, a charging rate is chosen and the price calculated and rounded to hundredths. One or two taxes are applied (depending on the type of call) and the total cost is converted to a character string and written to an output file. Running totals of the total cost and taxes are kept; these are displayed at the end of the benchmark for verification. 
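Illustration (not part of the cited report): the billing steps summarized in this abstract can be sketched in Python with the standard `decimal` module. The rates, tax percentages, rounding modes, and the names `bill_call` and `run_benchmark` are assumptions chosen only for illustration; the benchmark definition itself should be consulted for the real parameters and file formats.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

CENT = Decimal("0.01")

# Hypothetical per-second rates and tax percentages (assumptions for this sketch).
LOCAL_RATE = Decimal("0.0013")
DISTANCE_RATE = Decimal("0.00894")
BASIC_TAX = Decimal("0.0675")
DISTANCE_TAX = Decimal("0.0341")


def bill_call(seconds, long_distance):
    """Price one call; return (price, basic_tax, distance_tax, total)."""
    rate = DISTANCE_RATE if long_distance else LOCAL_RATE
    # Price is rounded to hundredths (half-even is an assumed rounding mode).
    price = (Decimal(seconds) * rate).quantize(CENT, rounding=ROUND_HALF_EVEN)
    # One tax applies to every call; truncation is an assumed rounding mode.
    basic_tax = (price * BASIC_TAX).quantize(CENT, rounding=ROUND_DOWN)
    # A second tax applies only to one type of call (long distance here).
    distance_tax = Decimal("0.00")
    if long_distance:
        distance_tax = (price * DISTANCE_TAX).quantize(CENT, rounding=ROUND_DOWN)
    total = price + basic_tax + distance_tax
    return price, basic_tax, distance_tax, total


def run_benchmark(calls):
    """Process (seconds, long_distance) pairs; return output lines and running totals."""
    sum_total = sum_basic = sum_distance = Decimal("0.00")
    output_lines = []
    for seconds, long_distance in calls:
        _, basic_tax, distance_tax, total = bill_call(seconds, long_distance)
        sum_total += total
        sum_basic += basic_tax
        sum_distance += distance_tax
        # The total is converted to a character string, as it would be for the output file.
        output_lines.append(str(total))
    return output_lines, sum_total, sum_basic, sum_distance
```

In this sketch the input is an in-memory list rather than a file, so it shows only the calculation side of the I/O versus calculation balance that the benchmark is meant to measure.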
erle2002
Potential Speedup with Decimal FloatingPoint Hardware,
Mark A Erle, Michael J Schulte, and J G Linebarger,
Proceedings of the Thirty Sixth Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, California,
pp1073–1077,
IEEE Press,
November 2002.
Abstract: This paper addresses the potential speedup achieved by using decimal floating-point hardware, instead of software routines, on a high-performance superscalar architecture. Software routines were written to perform decimal addition, subtraction, multiplication, and division. Cycle counts were then measured for each instruction using the SimpleScalar simulator. After this, new hardware algorithms were developed, existing algorithms were analyzed, and cycle counts were estimated for the same set of instructions using specialized decimal floating-point hardware. This data was then used to show the potential speedup obtained for programs with different instruction mixes and a recently developed benchmark. 
mazor2002
Fairchild decimal arithmetic unit,
Stan Mazor,
9pp,
pers. comm.,
July–September 2002.
Abstract: We embarked on the design of Symbol II [circa 1966], a large scale HIGH LEVEL language, virtual memory, time sharing machine. This machine used large printed circuit boards, approx. 16″ x 20″ with slots for over 210 DIP’s. We had 100 connector pins on each side and we defined the system using a number of parallel busses with multiple autonomous functional units and interprocessor communication. The completed system had over 110 printed circuit boards and consumed megawatts of power... 
schwarz2002
The microarchitecture of the IBM eServer z900 processor,
Eric M. Schwarz et al,
IBM Journal of Research and Development, Vol. 46 #4/5,
pp381–395,
IBM,
July/September 2002.
Abstract: The recent IBM ESA/390 CMOS line of processors, from 1997 to 1999, consisted of the G4, G5, and G6 processors. The architecture they implemented lacked 64-bit addressability and had only a limited set of 64-bit arithmetic instructions. The processors also lacked data and instruction bandwidth, since they utilized a unified cache. The branch performance was good, but there were delays due to conflicts in searching and writing the branch target buffer. Also, the hardware data compression and decimal arithmetic performance, though good, was in demand by database and COBOL programmers. Most of the performance concerns regarding prior processors were due to area constraints. Recent technology advances have increased the circuit density by 50 percent over that of the G6 processor. This has allowed the design of several performance-critical areas to be revisited. The end result of these efforts is the IBM eServer z900 processor, which is the first high-end processor based on the new 64-bit z/Architecture™. 
sun2002
BigDecimal (Java 2 Platform SE v1.4.0),
Sun Microsystems,
URL: http://java.sun.com/products,
17pp,
Sun Microsystems Inc.,
2002.
Abstract: Immutable, arbitraryprecision signed decimal numbers. A BigDecimal consists of an arbitrary precision integer unscaled value and a nonnegative 32bit integer scale, which represents the number of digits to the right of the decimal point. The number represented by the BigDecimal is (unscaledValue/10^{scale}). BigDecimal provides operations for basic arithmetic, scale manipulation, comparison, hashing, and format conversion. 
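Illustration (a Python analogue, not the Java API): the representation described in this abstract, an integer unscaled value paired with a non-negative scale so that the number is unscaledValue/10^{scale}, can be sketched with Python's `decimal` module. The helper names `from_unscaled` and `to_unscaled` are hypothetical and exist only for this sketch.

```python
from decimal import Decimal


def from_unscaled(unscaled_value: int, scale: int) -> Decimal:
    """Build the number unscaled_value / 10**scale (scale must be non-negative)."""
    if scale < 0:
        raise ValueError("this sketch only models non-negative scales")
    return Decimal(unscaled_value).scaleb(-scale)


def to_unscaled(value: Decimal):
    """Split a finite Decimal into an (unscaled_value, scale) pair."""
    sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int) or exponent > 0:
        raise ValueError("this sketch only handles finite values with exponent <= 0")
    unscaled = int("".join(str(d) for d in digits))
    return (-unscaled if sign else unscaled), -exponent


price = from_unscaled(12345, 2)      # represents 123.45
pair = to_unscaled(Decimal("-0.50"))  # unscaled value -50 with scale 2
```

Note that 0.50 (unscaled 50, scale 2) and 0.5 (unscaled 5, scale 1) are numerically equal but carry different scales, which is why scale manipulation appears as a separate group of operations.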
cowlis2003
Decimal FloatingPoint: Algorism for Computers,
Michael F. Cowlishaw,
Proceedings of the 16th IEEE Symposium on Computer Arithmetic,
ISBN 076951894X,
pp104–111,
IEEE,
June 2003.
Abstract: Decimal arithmetic is the norm in human calculations, and humancentric applications must use a decimal floatingpoint arithmetic to achieve the same results. Initial benchmarks indicate that some applications spend 50% to 90% of their time in decimal processing, because software decimal arithmetic suffers a 100× to 1000× performance penalty over hardware. The need for decimal floatingpoint in hardware is urgent. Existing designs, however, either fail to conform to modern standards or are incompatible with the established rules of decimal arithmetic. This paper introduces a new approach to decimal floatingpoint which not only provides the strict results which are necessary for commercial applications but also meets the constraints and requirements of the IEEE 854 standard. A hardware implementation of this arithmetic is in development, and it is expected that this will significantly accelerate a wide variety of applications. Note: Softcopy is available in PDF. 
erle2003
Decimal Multiplication Via CarrySave Addition,
Mark A Erle and Michael J Schulte,
Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors, The Hague, Netherlands,
pp348–358,
IEEE Computer Society Press,
June 2003.
Abstract: Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents two novel designs for fixedpoint decimal multiplication that utilize decimal carrysave addition to reduce the critical path delay. First, a multiplier that stores a reduced number of multiplicand multiples and uses decimal carrysave addition in the iterative portion of the design is presented. Then, a second multiplier design is proposed with several notable improvements including fast generation of multiplicand multiples that do not need to be stored, the use of decimal (4:2) compressors, and a simplified decimal carrypropagate addition to produce the final product. When multiplying two ndigit operands to produce a 2ndigit product, the improved multiplier design has a worstcase latency of n + 4 cycles and an initiation interval of n + 1 cycles. Three datadependent optimizations, which help reduce the multipliers’ average latency, are also described. The multipliers presented can be extended to support decimal floatingpoint multiplication. 
gray2003
Before the B5000: Burroughs Computers, 1951–1963,
George T. Gray and Ronald Q. Smith,
IEEE Annals of the History of Computing, Vol. 25 #2,
pp50–61,
IEEE,
April–June 2003.
Abstract: Like many companies entering the computer industry, Burroughs began by working on US government contracts. Once sufficient expertise had been gained, the company entered the general purpose computer market. The Datatron computer, obtained through the ElectroData Corporation acquisition, was a modest success in the late 1950s; however, pioneering work on transistor computers for military contracts was not immediately transferred to the commercial marketplace. 
smith2003
Using multipleprecision arithmetic,
David M Smith,
Computing in Science and Engineering, Vol. 5 #4,
pp88–93,
IEEE Computer Society,
July 2003.
Abstract: High-precision arithmetic is useful in many different computational problems. The most common is a numerically unstable algorithm, for which, say, 53-bit (ANSI/IEEE 754-1985 Standard) double precision would not yield a sufficiently accurate result. Note: Related papers by same author at: http://myweb.lmu.edu/dmsmith/FMLIB.html 
steele2003
How to Print FloatingPoint Numbers Accurately (Retrospective),
Guy L. Steele Jr. and Jon L. White,
20 Years of the ACM/SIGPLAN Conference on Programming Language Design and Implementation (1979–1999): A Selection, 2003,
3pp,
ACM Press,
2003.
Abstract: Our PLDI paper was almost 20 years in the making. How should the result of dividing 1.0 by 10.0 be printed? In 1970, one usually got “0.0999999” or “0.099999994”; why not “0.1”? ... 
busa2004
The Design of the Fixed Point Unit for the z990 Microprocessor,
Fadi Y. Busaba, Timothy Slegel, Steven R. Carlough, Christopher A. Krygowski, and John G Rell,
Proceedings of the 14th ACM Great Lakes symposium on VLSI,
ISBN 1581138539,
pp364–367,
ACM Press,
2004.
Abstract: The paper presents the design of the Fixed Point Unit (FXU) for the IBM eServer z990 microprocessor (announced in 2Q ’03) that runs at 1.2 GHz. The FXU is capable of executing two Register-Memory instructions including arithmetic instructions and a branch instruction in a single cycle. The FXU executes a total of 369 instructions that operate on variable size operands (1 to 256 bytes). The instruction set includes decimal arithmetic with multiplies and divides, binary arithmetic, shifts and rotates, loads/stores, branches, long moves, logical operations, convert instructions, and other special instructions. The FXU consists of a 64-bit dataflow stack that is custom designed and a control stack that is synthesized. The current FXU is the first superscalar design for the CMOS zSeries machines, has a new improved decimal unit, and has for the first time a 16x64 bit binary multiplier. 
cowlis2004
Fixed, floating, and exact computation with Java's BigDecimal,
M. Cowlishaw, J. Bloch, and J.D. Darcy,
Dr. Dobb's Journal, Vol. 29 #7,
ISSN 1044789X,
pp22–27,
CMP Media,
July 2004.
Abstract: Decimal data types are widely used in commercial, financial, and Web applications, and many general-purpose programming languages have either native decimal types or readily available decimal arithmetic packages. Since the 1.1 release, the libraries of the Java programming language have supported decimal arithmetic via the java.math.BigDecimal class. With the inclusion of JSR 13 into J2SE 1.5, BigDecimal now has true floating-point operations consistent with those in the IEEE 754 revision. In this article, we first explain why decimal arithmetic is important and the differences between the BigDecimal class and binary float and double types. 
hack2004
On Intermediate Precision Required for Correctly-Rounding Decimal-to-Binary Floating-Point Conversion,
Michel Hack,
Proceedings of RNC6 (6th conference on Real Numbers and Computers),
URL: http://www.informatik.unitrier.de/Reports/TR082004/rnc6_10_hack.pdf,
22pp,
University of Trier,
November 2004.
Abstract: The algorithms developed ten years ago in preparation for IBM’s support of IEEE FloatingPoint on its mainframe S/390 processors use an overly conservative intermediate precision to guarantee correctlyrounded results across the entire exponent range. Here we study the minimal requirement for both bounded and unbounded precision on the decimal side (converting to machine precision on the binary side). An interesting new theorem on Continued Fraction expansions is offered, as well as an open problem on the growth of partial quotients for ratios of powers of two and five. 
kahn2004
The childengineering of arithmetic in ToonTalk,
Ken Kahn,
Proceedings of the 2004 conference on Interaction Design and Children,
ISBN 1581137915,
pp141–142,
ACM Press,
2004.
Abstract: Providing a childappropriate interface to an arithmetic package with large numbers and exact fractions is surprisingly challenging. We discuss solutions to problems ranging from how to present fractions such as 1/3 to how to deal with numbers with tens of thousands of digits. As with other objects in ToonTalk®, we strive to make the enhanced numbers work in a concrete and playful manner. 
kenney2004a
Multioperand Decimal Addition (extended version),
Robert D Kenney and Michael J Schulte,
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, Lafayette, LA, February 2004,
10pp,
IEEE,
February 2004.
Abstract: This paper introduces and analyzes four techniques for performing fast decimal addition on multiple binary coded decimal (BCD) operands. Three of the techniques speculate BCD correction values and use chaining to correct intermediate results. The first speculates over one addition. The second speculates over two additions. The third employs multiple instances of the second technique in parallel and then merges the results. The fourth technique uses a binary carrysave adder tree and produces a binary sum. Combinational logic is then used to correct the sum and determine the carry into the next digit. Multioperand adder designs are constructed and synthesized for four to sixteen input operands. Analyses are performed on the synthesis results and the merits of each technique are discussed. Finally, these techniques are compared to previous attempts made at speeding up decimal addition. 
kenney2004b
HighFrequency Decimal Multiplier,
Robert D Kenney, Michael J Schulte, and Mark A. Erle,
Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers and Processors,
ISBN 0 7695 2231 9,
pp26–29,
IEEE,
October 2004.
Abstract: Decimal arithmetic is regaining popularity in the computing community due to the growing importance of commercial, financial, and Internetbased applications, which process decimal data. This paper presents an iterative decimal multiplier, which operates at high clock frequencies and scales well to large operand sizes. The multiplier uses a new decimal representation for intermediate products, which allows for a very fast two stage iterative multiplier design. Decimal multipliers, which are synthesized using a 0.11 micron CMOS standard cell library, operate at clock frequencies close to 2 GHz. The latency of the proposed design to multiply two ndigit BCD operands is (n + 8) cycles with a new multiplication able to begin every (n + 1) cycles. 
nikmehr2004
A decimal carryfree adder,
Hooman Nikmehr, Braden Phillips, and ChengChew Lim,
SPIE Symposium Smart Materials, Nano, and MicroSmart Systems, Proceedings of SPIE Vol. 5649,
12pp,
SPIE International Society for Optical Engineering,
December 2004.
Abstract: Recently, decimal arithmetic has become attractive in the financial and commercial world including banking, tax calculation, currency conversion, insurance and accounting. Although computers are still carrying out decimal calculation using software libraries and binary floatingpoint numbers, it is likely that in the near future, all processors will be equipped with units performing decimal operations directly on decimal operands. One critical building block for some complex decimal operations is the decimal carryfree adder. This paper discusses the mathematical framework of the addition, introduces a new signeddigit format for representing decimal numbers and presents an efficient architectural implementation. Delay estimation analysis shows that the adder offers improved performance over earlier designs. 
schulte2004
Design Exploration for Decimal Floating-Point Arithmetic, IBM University Partnership Program Proposal,
Michael J. Schulte and Eric Schwarz,
4pp,
IBM,
11 March 2004.
Abstract: Commercial applications and databases typically store numerical data in decimal format. Currently, however, microprocessors do not provide instructions or hardware support for decimal floatingpoint arithmetic. Consequently, decimal numbers are often read into computers, converted to binary numbers, and then processed using binary floatingpoint arithmetic. Results are then converted back to decimal before being stored. Besides being timeconsuming, this process is errorprone, since most decimal numbers cannot be exactly represented as binary numbers. Thus, if binary floatingpoint arithmetic is used to process decimal data, unexpected results may occur after a few computations... 
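Illustration (not from the cited proposal): the representation problem described in this abstract, that most decimal fractions have no exact binary floating-point form, can be shown with a minimal Python sketch. The variable names are chosen only for this example.

```python
from decimal import Decimal

# Binary floating-point: 0.1 and 0.2 are stored as nearby binary fractions,
# so their sum is not the double nearest to 0.3.
binary_sum = 0.1 + 0.2

# Decimal arithmetic: the same operands are held exactly, so the sum is exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")

# Converting the double 0.1 to Decimal exposes the value actually stored.
exact_binary_tenth = Decimal(0.1)
```

Here `binary_sum` differs from `0.3`, while `decimal_sum` equals `Decimal("0.3")`; `exact_binary_tenth` is slightly larger than one tenth.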
thomp2004
A 64bit Decimal FloatingPoint Adder (extended version),
John Thompson, Nandini Karra, and Michael J Schulte,
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, Lafayette, LA, February 2004,
pp297–298,
IEEE,
February 2004.
Abstract: Due to the rapid growth in financial, commercial, and Internetbased applications, there is an increasing desire to allow computers to operate on both binary and decimal floatingpoint numbers. Consequently, specifications for decimal floatingpoint arithmetic are being added to the IEEE754 Standard for FloatingPoint Arithmetic. In this paper, we present the design and implementation of a decimal floatingpoint adder that is compliant with the current draft revision of the IEEE754 Standard. The adder supports operations on 64bit (16digit) decimal floatingpoint operands. We provide synthesis results indicating the estimated area and delay for our design when it is pipelined to various depths. 
wang2004
Decimal FloatingPoint Division Using NewtonRaphson Iteration,
LiangKai Wang and Michael J Schulte,
Proceedings of the 15th IEEE International Conference on ApplicationSpecific Systems, Architectures and Processors (ASAP’04),
pp84–95,
IEEE Computer Society Press,
September 2004.
Abstract: Decreasing feature sizes allow additional functionality to be added to future microprocessors to improve the performance of important application domains. As a result of rapid growth in financial, commercial, and Internetbased applications, hardware support for decimal floatingpoint arithmetic is now being considered by various computer manufacturers and specifications for decimal floatingpoint arithmetic have been added to the draft revision of the IEEE754 Standard for FloatingPoint Arithmetic (IEEE754R). This paper presents an efficient arithmetic algorithm and hardware design for decimal floatingpoint division. The design uses an optimized piecewise linear approximation, a modified Newton Raphson iteration, a specialized rounding technique, and a simplified combined decimal incrementer/decrementer. Synthesis results show that a 64bit (16digit) implementation of the decimal divider, which is compliant with IEEE754R, has an estimated critical path delay of 0.69 ns when implemented using LSI Logic’s 0.11 micron gflxp standard cell library. 
babu2005
Design of a Reversible Binary Coded Decimal Adder by Using Reversible 4bit Parallel Adder,
Hafiz Md. Hasan Babu and Ahsan Raja Chowdhury,
Proceedings of the 18th International Conference on VLSI Design (VLSID 2005),
ISBN 0769522645,
pp255–260,
IEEE,
2005.
Abstract: In this paper, we have proposed a design technique for the reversible circuit of binary coded decimal (BCD) adder. The proposed circuit has the ability to add two 4bits binary variables and it transforms the addition into the appropriate BCD number with efficient error correcting modules where the operations are reversible. We also show that the proposed design technique generates the reversible BCD adder circuit with minimum number of gates as well as the minimum number of garbage outputs. 
erle2005
Decimal Multiplication With Efficient Partial Product Generation,
Mark A Erle, Eric Schwarz, and Michael J Schulte,
Proceedings of the 17th IEEE Symposium on Computer Arithmetic,
ISBN 0769523668,
pp21–28,
IEEE,
June 2005.
Abstract: Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents a novel design for fixedpoint decimal multiplication that utilizes a simple recoding scheme to produce signedmagnitude representations of the operands thereby greatly simplifying the process of generating partial products for each multiplier digit. The partial products are generated using a digitbydigit multiplier on a wordbydigit basis, first in a signeddigit form with two digits per position, and then combined via a combinational circuit. As the signeddigit partial products are developed one at a time while traversing the recoded multiplier operand from the least significant digit to the most significant digit, each partial product is added along with the accumulated sum of previous partial products via a signeddigit adder. This work is significantly different from other work employing digitbydigit multipliers due to the efficiency gained by restricting the range of digits throughout the multiplication process. 
kenney2005
Highspeed multioperand decimal adders,
R.D. Kenney and M. J. Schulte,
IEEE Transactions on Computers, Vol. 54 #8,
ISSN 00189340,
pp953–963,
IEEE,
August 2005.
Abstract: There is increasing interest in hardware support for decimal arithmetic as a result of recent growth in commercial, financial, and Internetbased applications. Consequently, new specifications for decimal floatingpoint arithmetic have been added to the draft revision of the IEEE754 Standard for FloatingPoint Arithmetic. This paper introduces and analyzes three techniques for performing fast decimal addition on multiple binary coded decimal (BCD) operands. Two of the techniques speculate BCD correction values and correct intermediate results while adding the input operands. The first speculates over one addition. The second speculates over two additions. The third technique uses a binary carrysave adder tree and produces a binary sum. Combinational logic is then used to correct the sum and determine the carry into the next more significant digit. Multioperand adder designs are constructed and synthesized for four to 16 input operands. Analyses are performed on the synthesis results and the merits of each technique are discussed. Finally, these techniques are compared to several previous techniques for highspeed decimal addition. 
marsaglia2005
On the Randomness of Pi and Other Decimal Expansions,
George Marsaglia,
Interstat October 2005 #5,
17pp,
Interstat (interstat.statjournals.net),
October 2005.
Abstract: Tests of randomness much more rigorous than the usual frequency-of-digit counts are applied to the decimal expansions of π, e and √2, using the Diehard Battery of Tests adapted to base 10 rather than the original base 2. The first 10^{9} digits of π, e and √2 seem to pass the Diehard tests very well. But so do the decimal expansions of most rationals k/p with large primes p. Over the entire set of tests, only the digits of √2 give a questionable result: the monkey test on 5-letter words. Its significance is discussed in the text. Three specific k/p are used for comparison. The cycles in their decimal expansions are developed in reverse order by the multiply-with-carry (MWC) method. They do well in the Diehard tests, as do many fast and simple MWC RNGs that produce base-b ‘digits’ of the expansions of k/p for b = 2^{32} or b = 2^{32} − 1. Choices of primes p for such MWC RNGs are discussed, along with comments on their implementation. 
neukom2005
ERMETH: The First Swiss Computer,
Hans Neukom,
IEEE Annals of the History of Computing,
pp5–22,
IEEE,
October 2005.
Abstract: Eduard Stiefel, in 1948 the first director of the Federal Institute of Technology’s newly established Institute of Applied Mathematics, recognized that computers would be essential to this new field of mathematics. Unable to find exactly what he wanted in existing computers, Stiefel developed the ERMETH. This article examines the rationale of, and objectives for, the first Swiss computer. 
sasao2005
Radix Converters: Complexity and Implementation by LUT Cascades,
Tsutomu Sasao,
35th International Symposium on MultipleValued Logic (ISMVL'05),
pp256–263,
IEEE,
May 2005.
Abstract: In digital signal processing, we often use higher radix system to achieve high-speed computation. In such cases, we require radix converters. This paper considers the design of LUT cascades that convert p-nary numbers to q-nary numbers. In particular, we derive several upper bounds on the column multiplicities of decomposition charts that represent radix converters. From these, we can estimate the size of LUT cascades to realize radix converters. These results are useful to design compact radix converters, since these bounds show strategies to partition the outputs into groups. 
schulte2005
Performance Evaluation of Decimal FloatingPoint Arithmetic,
Michael J. Schulte, Nick Lindberg, and Anitha Laxminarain,
Proceedings of the 6th IBM Austin Center for Advanced Studies Conference, Austin, TX,
8pp,
IBM,
February 2005.
Abstract: The prominence of decimal data in commercial and financial applications has led researchers to pursue efficient techniques for performing decimal floatingpoint arithmetic. While several software implementations of decimal floatingpoint arithmetic have been implemented, there is a growing need to provide hardware support for decimal floatingpoint arithmetic to keep up with the processing demands of emerging commercial and financial applications. This paper evaluates and compares the performance of decimal floatingpoint arithmetic operations when implemented on superscalar processors using either software libraries or specialized hardware designs. Our comparisons show that hardware implementations of decimal floatingpoint arithmetic operations are one to two orders of magnitude faster than software implementations. 
wang2005
¿Web? 
Decimal Floating-Point Square Root Using Newton-Raphson Iteration,
Liang-Kai Wang and Michael J. Schulte,
Proceedings of the 16th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP’05),
pp309–315,
IEEE Computer Society Press,
July 2005.
Abstract: With continued reductions in feature size, additional functionality may be added to future microprocessors to boost the performance of important application domains. Due to growth in commercial, financial, and Internet-based applications, decimal floating-point arithmetic is now attracting more attention, and hardware support for decimal operations is being considered by various computer manufacturers. In order to standardize decimal number formats and operations, specifications for decimal floating-point arithmetic have been added to the draft revision of the IEEE-754 Standard for Floating-Point Arithmetic (IEEE-754R). This paper presents an efficient arithmetic algorithm and hardware design for decimal floating-point square root. This design uses an optimized piecewise linear approximation, a modified Newton-Raphson iteration, a specialized rounding technique, and a modified decimal multiplier. Synthesis results show that a 64-bit (16-digit) implementation of the decimal square root, which is compliant with the IEEE-754R, has an estimated critical path delay of 0.95 ns and maximum latency of 210 clock cycles when implemented using LSI Logic’s 0.11 micron Gflx-P Standard Cell library.
allison2006
¿Web? 
Where did all my decimals go?,
Chuck Allison,
Computing Sciences in Colleges, Vol. 21 #3,
pp47–59,
Consortium for Computing Sciences in Colleges,
February 2006.
Abstract: It is tremendously ironic that computers were invented with number crunching in mind, yet nowadays most CS graduates leave school with little or no experience with the intricacies of numeric computation. This paper surveys what every CS graduate should know about floating-point arithmetic, based on experience teaching a recently-created course on modern numerical software development.
bernal2006
¿Web? 
Integer Representation of Decimal Numbers for Exact Computations,
Javier Bernal and Christoph Witzgall,
Journal of Research of the National Institute of Standards and Technology, Vol. 111 #2,
pp79–88,
National Institute of Standards and Technology,
March–April 2006.
Abstract: A scheme is presented and software is documented for representing as integers input decimal numbers that have been stored in a computer as double precision floating point numbers and for carrying out multiplications, additions and subtractions based on these numbers in an exact manner. The input decimal numbers must not have more than nine digits to the left of the decimal point. The decimal fractions of their floating point representations are all first rounded off at a prespecified location, a location no more than nine digits away from the decimal point. The number of digits to the left of the decimal point for each input number besides not being allowed to exceed nine must then be such that the total number of digits from the leftmost digit of the number to the location where roundoff is to occur does not exceed fourteen. 
castell2006
¿Web? 
A 64-bit Decimal Floating-Point Comparator,
Ivan D. Castellanos and James E. Stine,
IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06),
pp138–144,
IEEE,
2006.
Abstract: Decimal arithmetic is growing in importance as scientific studies reveal that current financial and commercial applications spend a high percentage overhead in this type of calculations. Typically, software is utilized to emulate decimal floating point arithmetic in these applications. On the other hand, functional units that employ decimal floating point hardware can improve performance by two or three orders of magnitude. This paper presents the design and implementation of a novel decimal floating-point comparator compliant with the current draft revision of the IEEE-754 Standard for floating-point arithmetic. It utilizes a novel BCD magnitude comparator with logarithmic delay and it supports 64-bit decimal floating-point numbers. Area and delay results are examined for an implementation in TSMC SCN6M SCMOS technology.
jointcomm2006
URL ¿Web? 
The Official "Do Not Use" List,
The Joint Commission,
URL: http://www.jointcommission.org/PatientSafety/DoNotUseList/,
1p,
2006.
Abstract: In May 2005, The Joint Commission affirmed its “do not use” list of abbreviations. The list was originally created in 2004 by the Joint Commission as part of the requirements for meeting National Patient Safety Goal (NPSG) requirement 2B (Standardize a list of abbreviations, acronyms and symbols that are not to be used throughout the organization). Summit conclusions were posted on the Joint Commission website for public comment. During the four-week comment period, the Joint Commission received 5,227 responses, including 15,485 comments. More than 80 percent of the respondents supported the creation and adoption of a “do not use” list.
kaivani2006
¿Web? 
Reversible Implementation of Densely-Packed-Decimal Converter to and from Binary-Coded-Decimal Format Using in IEEE-754R,
A. Kaivani, A. Zaker Alhosseini, S. Gorgin, and M. Fazlali,
9th International Conference on Information Technology (ICIT'06),
pp273–276,
IEEE,
December 2006.
Abstract: The Binary Coded Decimal (BCD) encoding has always dominated the decimal arithmetic algorithms and their hardware implementation. Due to importance of decimal arithmetic, the decimal format defined in IEEE 754 floating point standard has been revisited. It uses Densely Packed Decimal (DPD) encoding to store significand part of a decimal floating point number. Furthermore in recent years reversible logic has attracted the attention of engineers for designing low power CMOS circuits, as it is not possible to realize quantum computing without reversible logic implementation. This paper derives the reversible implementation of DPD converter to and from conventional BCD format using in IEEE 754R.
kettani2006
¿Web? 
On the Conversion Between Number Systems,
Houssain Kettani,
IEEE Transactions on Circuits and Systems, Vol. 53 #11,
ISSN 1057-7130,
pp1255–1258,
IEEE,
November 2006.
Abstract: This brief revisits the problem of conversion between number systems and asks the following question: given a nonnegative decimal number d, what is the value of the digit at position j in the corresponding base b number? Thus, we do not require the knowledge of other digits except the one we are interested in. Accordingly, we present a conversion function that relates each digit in a base b system to the decimal value that is equal to the base b number in question. We also show some applications of this new algorithm in the areas of parallel computing and cryptography. 
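Note: The question posed in this abstract (which digit sits at position j of the base-b form of a nonnegative decimal number d?) can be illustrated with the textbook floor-and-modulus identity, digit = floor(d / b^j) mod b. The Python sketch below is an illustration under that assumption only; it is not the conversion function derived in the paper, and the function name digit_at is hypothetical.

```python
def digit_at(d: int, b: int, j: int) -> int:
    """Return the digit at position j (0 = least significant) of the
    nonnegative integer d when written in base b.

    Uses the textbook identity floor(d / b**j) mod b, so no other digit
    of the base-b representation has to be computed first.
    """
    if d < 0 or b < 2 or j < 0:
        raise ValueError("need d >= 0, b >= 2 and j >= 0")
    return (d // b**j) % b


# 13 is 1101 in base 2, so position 1 holds a 0 and position 2 holds a 1.
binary_digits = [digit_at(13, 2, j) for j in range(4)]  # least significant first
```

Because each call depends only on d, b and j, the digits can be computed independently of one another, which is the property the abstract connects to parallel computing.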
kim2006
¿Web? 
A Hybrid Decimal Division Algorithm Reducing Computational Iterations,
Yong-Dae Kim, Soon-Youl Kwon, Seon-Kyoung Han, Kyoung-Rok Cho, and Younggap You,
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences Vol. E89-A #6,
pp1807–1812,
The Institute of Electronics, Information and Communication Engineers,
2006.
Abstract: This paper presents a hybrid decimal division algorithm to improve division speed. The proposed hybrid algorithm employs either non-restoring or restoring algorithm on each digit to reduce iterative computations. The selection of the algorithm is based on the relative remainder values with respect to the half of its divisor. The proposed algorithm requires maximum 7n+4 add/subtract operations for an n-digit quotient, whereas other restoring or non-restoring schemes comprise more than 10n+1 operations.
lang2006
¿Web? 
A Radix-10 Combinational Multiplier,
Tomás Lang and Alberto Nannarelli,
Proceedings of 40th Asilomar Conference on Signals, Systems, and Computers,
pp313–317,
IEEE,
October 2006.
Abstract: In this work, we present a combinational decimal multiply unit which can be pipelined to reach the desired throughput. With respect to previous implementations of decimal multiplication, the proposed unit is combinational (parallel) and not sequential, has a simpler recoding of the operands which reduces the number of partial product precomputations and uses counters to eliminate the need of the decimal equivalent of a 4:2 adder. The results of the implementation show that the combinational decimal multiplier offers a good compromise between latency and area when compared to other decimal multiply units and to binary double-precision multipliers.
nikmehr2006
¿Web? 
Fast Decimal Floating-Point Division,
Hooman Nikmehr, Braden Phillips, and Cheng-Chew Lim,
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 14 #9,
ISSN 1063-8210,
pp951–961,
IEEE,
September 2006.
Abstract: A new implementation for decimal floating-point (DFP) division is introduced. The algorithm is based on high-radix SRT division (the SRT division algorithm is named after D. Sweeney, J. E. Robertson, and T. D. Tocher), with the recurrence in a new decimal signed-digit format. Quotient digits are selected using comparison multiples, where the magnitude of the quotient digit is calculated by comparing the truncated partial remainder with limited precision multiples of the divisor. The sign is determined concurrently by investigating the polarity of the truncated partial remainder. A timing evaluation using a logic synthesis shows a significant decrease in the division execution time in contrast with one of the fastest DFP dividers reported in the open literature.
thapliyal2006
URL ¿Web? 
Novel BCD Adders and Their Reversible Logic Implementation for IEEE 754r Format,
Himanshu Thapliyal, Saurabh Kotiyal, and M. B. Srinivas,
Proceedings of the 19th International Conference on VLSI Design (VLSID’06),
pp387–392,
IEEE,
2006.
Abstract: IEEE 754r is the ongoing revision to the IEEE 754 floating point standard and a major enhancement to the standard is the addition of decimal format. This paper proposes two novel BCD adders called carry skip and carry lookahead BCD adders respectively. Furthermore, in the recent years, reversible logic has emerged as a promising technology having its applications in low power CMOS, quantum computing, nanotechnology, and optical computing. It is not possible to realize quantum computing without reversible logic. Thus, this paper also provides the reversible logic implementation of the conventional BCD adder as well as the proposed Carry Skip BCD adder using a recently proposed TSG gate. Furthermore, a new reversible gate called TS3 is also being proposed and it has been shown that the proposed reversible logic implementation of the BCD Adders is much better compared to recently proposed one, in terms of number of reversible gates used and garbage outputs produced. The reversible BCD circuits designed and proposed here form the basis of the decimal ALU of a primitive quantum CPU.
thapliyal2006b
¿Web? 
Modified Carry Look Ahead BCD Adder With CMOS and Reversible Logic Implementation,
Himanshu Thapliyal and Hamid R. Arabnia,
Proceedings of the 2006 International Conference on Computer Design (CDES'06),
ISBN 1601320094,
pp64–69,
CSREA Press,
November 2006.
Abstract: IEEE 754r is the ongoing revision to the IEEE 754 floating point standard and a major enhancement to the standard is the addition of decimal format. Firstly, this paper proposes novel two transistor AND & OR gates. The proposed AND gate has no power supply, thus it can be referred as the Powerless AND gate. Similarly, the proposed two transistor OR gate has no ground and can be referred as Groundless OR. Two designs of AND & OR gate without VDD or GND are also shown. Secondly for IEEE 754r format, one novel BCD adder called carry lookahead BCD adder is also proposed. In order to design the carry lookahead BCD adder, a novel 4 bit carry lookahead adder called NCLA is proposed which forms the basic building block of the proposed carry lookahead BCD adder. The proposed two transistors AND & OR gates are used to provide the optimized small area, low power, high throughput circuitries of the proposed BCD adder. Nowadays, reversible logic is also emerging as a promising computing paradigm having its applications in quantum computing, optical computing and nanotechnology. Thus, reversible logic implementation of the proposed BCD Adder is also shown in this paper. 
thapliyal2006c
¿Web? 
Design of Novel Reversible Carry Look-Ahead BCD Subtractor,
Himanshu Thapliyal and Sumedha K. Gupta,
Proceedings of the 9th International Conference on Information Technology (ICIT'06),
ISBN 0769526357,
pp253–258,
IEEE,
December 2006.
Abstract: IEEE 754r is the ongoing revision to the IEEE 754 floating point standard. A major enhancement to the standard is the addition of decimal format, thus the design of BCD arithmetic units is likely to get significant attention. Firstly, this paper introduces a novel carry lookahead BCD adder and then builds a novel carry lookahead BCD subtractor based on it. Secondly, it introduces the reversible logic implementation of the proposed carry lookahead BCD subtractor. We have tried to design the reversible logic implementation of the BCD Subtractor optimal in terms of number of reversible gates used and garbage outputs produced. Thus, the proposed work will be of significant value as the technologies mature. 
watanabe2006
¿Web? 
Formal Design of Decimal Arithmetic Circuits Using Arithmetic Description Language,
Yuki Watanabe, Naofumi Homma, Takafumi Aoki, and Tatsuo Higuchi,
IEEE International Symposium on Intelligent Signal Processing and Communications, 2006 (ISPACS '06),
ISBN 0780397339,
pp419–422,
IEEE,
December 2006.
Abstract: This paper presents a formal design of decimal arithmetic circuits using an arithmetic description language called ARITH. The use of ARITH makes possible (i) formal description of arithmetic algorithms including those using unconventional number systems, (ii) formal verification of described arithmetic algorithms, and (iii) translation of arithmetic algorithms to the equivalent HDL descriptions. In this paper, we demonstrate the potential of ARITH through an experimental design of binary coded decimal (BCD) arithmetic circuits. 
you2006
¿Web? 
Dynamic decimal adder circuit design by using the carry look ahead,
Younggap You, Yong Dae Kim, and Jong Hwa Choi,
IEEE Workshop on Design and Diagnostics of Electronic Circuits and Systems,
3pp,
IEEE Computer Society,
April 2006.
Abstract: This paper presents a carry look ahead (CLA) circuitry design based on dynamic circuit aiming at delay reduction in addition of BCD coded decimal numbers. The performance of the proposed dynamic decimal adder is analyzed demonstrating its speed improvement. Timing simulation on the proposed decimal addition circuit employing 0.25µm CMOS technology yields the worst case delay of 622 ns. 
aharoni2007
URL ¿Web? 
Solving Constraints on the Intermediate Result of Decimal Floating-Point Operations,
Merav Aharoni, Ron Maharik, and Abraham Ziv,
Proceedings of the 18th IEEE Symposium on Computer Arithmetic,
ISBN 0769528546,
ISBN 9780769528540,
pp38–45,
IEEE,
June 2007.
Abstract: The draft revision of the IEEE Standard for Floating Point Arithmetic (IEEE P754) includes a definition for decimal floating-point (FP) in addition to the widely used binary FP specification. The decimal standard raises new concerns with regard to the verification of hardware and software-based designs. The verification process normally emphasizes intricate corner cases and uncommon events. The decimal format introduces several new classes of such events in addition to those characteristic of binary FP. Our work addresses the following problem: Given a decimal floating-point operation, a constraint on the intermediate result, and a constraint on the representation selected for the result, find random inputs for the operation that yield an intermediate result compatible with these specifications. The paper supplies efficient analytic solutions for addition and for some cases of multiplication and division. We provide probabilistic algorithms for the remaining cases. These algorithms prove to be efficient in the actual implementation.
beebe2007a
¿Web? 
Extending TeX and METAFONT with floating-point arithmetic,
Nelson H.F. Beebe,
Proceedings of TUG 2007, TUGboat Vol. 28 #3,
ISSN 0896-3207,
pp319–328,
TeX User's Group,
July 2007.
Abstract: The article surveys the state of arithmetic in TeX and METAFONT, suggests that they could usefully be extended to support floating-point arithmetic, and shows how this could be done with a relatively small effort, without loss of the important feature of platform-independent results from those programs, and without invalidating any existing documents, or software written for those programs, including output drivers.
bhat2007
¿Web? 
Performance Characterization of Decimal Arithmetic in Commercial Java Workloads,
M. Bhat, J. Crawford, R. Morin, and K. Shiv,
IEEE International Symposium on Performance Analysis of Systems & Software, 2007 (ISPASS 2007),
Abstract: Binary floating-point numbers with finite precision cannot represent all decimal numbers with complete accuracy. This can often lead to errors while performing calculations involving floating point numbers. For this reason many commercial applications use special decimal representations for performing these calculations, but their use carries performance costs such as bidirectional conversion. The purpose of this study was to understand the total application performance impact of using these decimal representations in commercial workloads, and provide a foundation of data to justify pursuing optimized hardware support for decimal math. In Java, a popular development environment for commercial applications, the BigDecimal class is used for performing accurate decimal computations. BigDecimal provides operations for arithmetic, scale manipulation, rounding, comparison, hashing, and format conversion. We studied the impact of BigDecimal usage on the performance of server-side Java applications by analyzing its usage on two standard enterprise benchmarks, SPECjbb2005 and SPECjAppServer2004 as well as a real-life mission-critical financial workload, Morgan Stanley’s Trade Completion. In this paper, we present detailed performance characteristics and we conclude that, relative to total application performance, the overhead of using software decimal implementations is low, and at least from the point of view of these workloads, there is insufficient performance justification to pursue hardware solutions.
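Note: The representation problem described at the start of this abstract can be seen in a few lines. The sketch below uses Python's standard decimal module as a stand-in for Java's BigDecimal; that substitution is an assumption made for brevity (the paper itself studies Java workloads), and the variable names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Binary floating-point: 0.1 and 0.2 have no exact binary representation,
# so their sum is not exactly the double nearest to 0.3.
binary_sum = 0.1 + 0.2
binary_exact = (binary_sum == 0.3)

# Decimal arithmetic: the same values built from strings are held exactly.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_exact = (decimal_sum == Decimal("0.3"))

# Rounding to two places with an explicit rounding mode, as is often
# required in financial calculations (round-half-even shown here).
rounded = Decimal("19.995").quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)

print(binary_exact, decimal_exact, rounded)
```

The exactness comes at the cost of software emulation, which is the overhead the study above sets out to measure.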
cornea2007
URL ¿Web? 
A Software Implementation of the IEEE 754R Decimal Floating-Point Arithmetic Using the Binary Encoding Format,
Marius Cornea, Cristina Anderson, John Harrison, Ping Tak Peter Tang, Eric Schneider, and Charles Tsen,
Proceedings of the 18th IEEE Symposium on Computer Arithmetic,
ISBN 0769528546,
ISBN 9780769528540,
pp29–37,
IEEE,
June 2007.
Abstract: The IEEE Standard 754-1985 for Binary Floating-Point Arithmetic was revised, and an important addition is the definition of decimal floating-point arithmetic. This is intended mainly to provide a robust, reliable framework for financial applications that are often subject to legal requirements concerning rounding and precision of the results, because the binary floating-point arithmetic may introduce small but unacceptable errors. Using binary floating-point calculations to emulate decimal calculations in order to correct this issue has led to the existence of numerous proprietary software packages, each with its own characteristics and capabilities. IEEE 754R decimal arithmetic should unify the ways decimal floating-point calculations are carried out on various platforms. New algorithms and properties are presented in this paper which are used in a software implementation of the IEEE 754R decimal floating-point arithmetic, with emphasis on using binary operations efficiently. The focus is on rounding techniques for decimal values stored in binary format, but algorithms for the more important or interesting operations of addition, multiplication, division, and conversions between binary and decimal floating-point formats are also outlined. Performance results are included for a wider range of operations, showing promise that our approach is viable for applications that require decimal floating-point calculations.
dadda2007
¿Web? 
Multioperand Parallel Decimal Adder: A Mixed Binary and BCD Approach,
Luigi Dadda,
IEEE Transactions on Computers, Vol. 56 #10,
ISSN 0018-9340,
pp1320–1328,
IEEE,
October 2007.
Abstract: Decimal arithmetic has been in recent years revived due to the large amount of data in commercial applications. We consider the problem of Multi Operand Parallel Decimal Addition with an approach that uses binary arithmetic, suggested by the adoption of BCD numbers. This involves corrections in order to obtain the BCD result, or a binary to decimal conversion. We adopt the latter approach, particularly efficient for a large number of addends. Conversion requires a relatively small area and can afford fast operation. The BD conversion, moreover, allows an easy alignment of the sums of adjacent columns. We treat the design of BCD digit adders using fast carry free adders and the conversion problem through a known parallel scheme using elementary conversion cells. Spreadsheets have been developed for adding several BCD digits and for simulating the binary to decimal conversion as design tool. 
duale2007
URL ¿Web? 
Decimal floating-point in z9: An implementation and testing perspective,
A. Y. Duale, M. H. Decker, H.-G. Zipperer, M. Aharoni, and T. J. Bohizic,
IBM Journal of Research and Development, Vol. 51 #1/2,
ISSN 0018-8646,
pp217–227,
IBM,
January 2007.
Abstract: Although decimal arithmetic is widely used in commercial and financial applications, the related computations are handled in software. As a result, applications that use decimal data may experience performance degradations. Use of the newly defined decimal floating-point (DFP) format instead of binary floating-point is expected to significantly improve the performance of such applications. System z9™ is the first IBM machine to support the DFP instructions. We present an overview of this implementation and provide some measurement of the performance gained using hardware assists. Various tools and techniques employed for the DFP verification on unit, element, and system levels are presented in detail. Several groups within IBM collaborated on the verification of the new DFP facility, using a common reference model to predict DFP results.
eisen2007
¿Web? 
IBM POWER6 accelerators: VMX and DFU,
L. Eisen, J. W. Ward III, H.-W. Tast, N. Mäding, J. Leenstra, S. M. Mueller, C. Jacobi, J. Preiss, E. M. Schwarz, and S. R. Carlough,
IBM Journal of Research and Development, Vol. 51 #6,
ISSN 0018-8646,
pp663–683,
IBM,
November 2007.
Abstract: The IBM POWER6 microprocessor core includes two accelerators for increasing performance of specific workloads. The vector multimedia extension (VMX) provides a vector acceleration of graphic and scientific workloads. It provides single instructions that work on multiple data elements. The instructions separate a 128-bit vector into different components that are operated on concurrently. The decimal floating-point unit (DFU) provides acceleration of commercial workloads, more specifically, financial transactions. It provides a new number system that performs implicit rounding to decimal radix points, a feature essential to monetary transactions. The IBM POWER processor instruction set is substantially expanded with the addition of these two accelerators. The VMX architecture contains 176 instructions, while the DFU architecture adds 54 instructions to the base architecture. The IEEE 754R Binary Floating-Point Arithmetic Standard defines decimal floating-point formats, and the POWER6 processor, on which a substantial amount of area has been devoted to increasing performance of both scientific and commercial workloads, is the first commercial hardware implementation of this format.
erle2007
URL ¿Web? 
Decimal Floating-Point Multiplication Via Carry-Save Addition,
Mark A. Erle, Michael J. Schulte, and Brian J. Hickmann,
Proceedings of the 18th IEEE Symposium on Computer Arithmetic,
ISBN 0769528546,
ISBN 9780769528540,
pp46–55,
IEEE,
June 2007.
Abstract: Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents the design of a decimal floating-point multiplier that complies with specifications for decimal multiplication given in the draft revision of the IEEE 754 Standard for Floating-point Arithmetic (IEEE 754R). This multiplier extends a previously published decimal fixed-point multiplier design by adding several features including exponent generation, sticky bit generation, shifting of the intermediate product, rounding, and exception detection and handling. The core of the decimal multiplication algorithm is an iterative scheme of partial product accumulation employing decimal carry-save addition to reduce the critical path delay. Novel features of the proposed multiplier include support for decimal floating-point numbers, on-the-fly generation of the sticky bit, early estimation of the shift amount, and efficient decimal rounding. Area and delay estimates are provided for a verified Verilog register transfer level model of the multiplier.
hickmann2007
¿Web? 
A Parallel IEEE P754 Decimal Floating-Point Multiplier,
Brian J. Hickmann, Andrew Krioukov, Michael J. Schulte, and Mark A. Erle,
Proceedings of the IEEE International Conference on Computer Design 2007,
pp296–303,
IEEE,
October 2007.
Abstract: Decimal floating-point multiplication is important in many commercial applications including banking, tax calculation, currency conversion, and other financial areas. This paper presents a fully parallel decimal floating-point multiplier compliant with the recent draft of the IEEE P754 Standard for Floating-point Arithmetic (IEEE P754). The novelty of the design is that it is the first parallel decimal floating-point multiplier offering low latency and high throughput. This design is based on a previously published parallel fixed-point decimal multiplier which uses alternate decimal digit encodings to reduce area and delay. The fixed-point design is extended to support floating-point multiplication by adding several components including exponent generation, rounding, shifting, and exception handling. Area and delay estimates are presented that show a significant latency and throughput improvement with a substantial increase in area as compared to the only published IEEE P754 compliant sequential floating-point multiplier. To the best of our knowledge, this is the first publication to present a fully parallel decimal floating-point multiplier that complies with IEEE P754.
iguchi2007a
URL ¿Web? 
On Designs of Radix Converters using Arithmetic Decompositions,
Yukihiro Iguchi, Tsutomu Sasao, and Munehiro Matsuura,
Proceedings of ISMVL-2007, Oslo, Norway (CD-ROM),
8pp,
IEEE,
May 2007.
Abstract: In digital signal processing, radixes other than two are often used for high-speed computation. In the computation for finance, decimal numbers are used instead of binary numbers. In such cases, radix converters are necessary. This paper considers design methods for binary to q-nary converters. It introduces a new design technique based on weighted-sum (WS) functions. The method computes a WS function for each digit by an LUT cascade and a binary adder, then adds adjacent digits with q-nary adders. A 16-bit binary to decimal converter is designed to show the method.
iguchi2007b
URL ¿Web? 
Design Methods of Radix Converters using Arithmetic Decompositions,
Yukihiro Iguchi, Tsutomu Sasao, and Munehiro Matsuura,
Institute of Electronics, Information and Communication Engineers, Transactions on Information and Systems, Vol. E90-D #6,
pp905–914,
IEICE,
June 2007.
Abstract: In arithmetic circuits for digital signal processing, radixes other than two are often used to make circuits faster. In such cases, radix converters are necessary. However, in general, radix converters tend to be complex. This paper considers design methods for p-nary to binary converters. First, it considers Look-Up Table (LUT) cascade realizations. Then, it introduces a new design technique called arithmetic decomposition by using LUTs and adders. Finally, it compares the amount of hardware and performance of radix converters implemented by FPGAs. 12-digit ternary to binary converters on Cyclone II FPGAs designed by the proposed method are faster than ones by conventional methods.
james2007
¿Web? 
Quick Addition of Decimals Using Reversible Conservative Logic,
Rekha K. James, Shahana T. K., K. Poulose Jacob, and Sreela Sasi,
15th International Conference on Advanced Computing and Communications (ADCOM 2007),
ISBN 0769530591,
pp191–196,
IEEE Computer Society,
December 2007.
Abstract: In recent years, reversible logic has emerged as one of the most important approaches for power optimization with its application in low power CMOS, nanotechnology and quantum computing. This research proposes quick addition of decimals (QAD) suitable for multi-digit BCD addition, using reversible conservative logic. The design makes use of reversible fault tolerant Fredkin gates only. The implementation strategy is to reduce the number of levels of delay thereby increasing the speed, which is the most important factor for high speed circuits.
lang2007
URL ¿Web? 
A Radix-10 Digit-Recurrence Division Unit: Algorithm and Architecture,
Tomás Lang and Alberto Nannarelli,
IEEE Transactions on Computers, Vol. 56 #6,
pp727–739,
IEEE,
June 2007.
Abstract: In this work, we present a radix-10 division unit that is based on the digit-recurrence algorithm. The previous decimal division designs do not include recent developments in the theory and practice of this type of algorithm, which were developed for radix-2^k dividers. In addition to the adaptation of these features, the radix-10 quotient digit is decomposed into a radix-2 digit and a radix-5 digit in such a way that only five and two times the divisor are required in the recurrence. Moreover, the most significant slice of the recurrence, which includes the selection function, is implemented in radix-2, avoiding the additional delay introduced by the radix-10 carry-save additions and allowing the balancing of the paths to reduce the cycle delay. The results of the implementation of the proposed radix-10 division unit show that its latency is close to that of radix-16 division units (comparable dynamic range of significands) and it has a shorter latency than a radix-10 unit based on the Newton-Raphson approximation.
moskal2007
¿Web? 
Design and Synthesis of a Carry-Free Signed-Digit Decimal Adder,
John Moskal, Erdal Oruklu, and Jafar Saniie,
IEEE International Symposium on Circuits and Systems (ISCAS 2007),
pp1089–1092,
IEEE,
May 2007.
Abstract: The decimal arithmetic has been receiving an increased attention because of the growth of financial and scientific applications requiring high precision and increased computing power. This paper presents an efficient architecture for multi-digit decimal addition based on carry-free signed-digit numbers. In this study, the decimal adder architecture has been designed and synthesized using the TSMC 0.18µm technology. The synthesis results were compared to the existing decimal adders with respect to design area, delay and power consumption. These results show that proposed adder architecture improves the area-delay factor by 3 for a 32 digit adder.
tsen2007a
¿Web? 
Hardware Design of a Binary Integer Decimal-based IEEE P754 Rounding Unit,
Charles Tsen, Michael J. Schulte, and Sonia Gonzalez-Navarro,
Proceedings of the IEEE 18th International Conference on Application-specific Systems, Architectures and Processors (ASAP),
7pp,
IEEE,
July 2007.
Abstract: Because of the growing importance of decimal floating-point (DFP) arithmetic, specifications for it were recently added to the draft revision of the IEEE 754 Standard (IEEE P754). In this paper, we present a hardware design for a rounding unit for 64-bit DFP numbers (decimal64) that use the IEEE P754 binary encoding of DFP numbers, which is widely known as the Binary Integer Decimal (BID) encoding. We summarize the technique used for rounding, present the theory and design of the BID rounding unit, and evaluate its critical path delay, latency, and area for combinational and pipelined designs. Over 86% of the rounding unit’s area is due to a 55-bit by 54-bit binary multiplier, which can be shared with a double-precision binary floating-point multiplier. To our knowledge, this is the first hardware design for rounding IEEE P754 BID-encoded DFP numbers.
tsen2007b
¿Web? 
Hardware Design of a Binary Integer Decimal-based Floating-point Adder,
Charles Tsen, Sonia Gonzalez-Navarro, and Michael J. Schulte,
Proceedings of the IEEE 25th International Conference on Computer Design,
9pp,
IEEE,
October 2007.
Abstract: Because of the growing importance of decimal floating-point (DFP) arithmetic, specifications for it are included in the IEEE Draft Standard for Floating-point Arithmetic (IEEE P754). In this paper, we present a novel algorithm and hardware design for a DFP adder. The adder performs addition and subtraction on 64-bit operands that use the IEEE P754 binary encoding of DFP numbers, widely known as the Binary Integer Decimal (BID) encoding. The BID adder uses a novel hardware component for decimal digit counting and an enhanced version of a previously published BID rounding unit. By adding more sophisticated control, operations are performed with variable latency to optimize for common cases. We show that a BID-based DFP adder design can be achieved with a modest area increase compared to a single 2-stage pipelined 64-bit fixed-point multiplier. Over 70% of the BID adder’s area is due to the 64-bit fixed-point multiplier, which can be shared with a binary floating-point multiplier and hardware for other DFP operations. To our knowledge, this is the first hardware design for adding and subtracting IEEE P754 BID-encoded DFP numbers.
vanemden2007
¿Web? 
Functions to Support Input and Output of Intervals,
M. H. van Emden, B. Moa, and S. C. Somosan,
Report DCS-311-IR,
16pp,
University of Victoria, Canada,
February 2007.
Abstract: Interval arithmetic is hardly feasible without directed rounding as provided, for example, by the IEEE floating-point standard. Equally essential for interval methods is directed rounding for conversion between the external decimal and internal binary numerals. This is not provided by the standard I/O libraries. Conversion algorithms exist that guarantee identity upon conversion followed by its inverse. Although it may be possible to adapt these algorithms for use in decimal interval I/O, we argue that outward rounding in radix conversion is computationally a simpler problem than guaranteeing identity. Hence it is preferable to develop decimal interval I/O ab initio, which is what we do in this paper.
vazquez2007
URL ¿Web? 
A New Family of High–Performance Parallel Decimal Multipliers,
Alvaro Vázquez, Elisardo Antelo, and Paolo Montuschi,
Proceedings of the 18th IEEE Symposium on Computer Arithmetic,
ISBN 0769528546,
ISBN 9780769528540,
pp195–204,
IEEE,
June 2007.
Abstract: This paper introduces two novel architectures for parallel decimal multipliers. Our multipliers are based on a new algorithm for decimal carry–save multioperand addition that uses a novel BCD–4221 recoding for decimal digits. It significantly improves the area and latency of the partial product reduction tree with respect to previous proposals. We also present three schemes for fast and efficient generation of partial products in parallel. The recoding of the BCD–8421 multiplier operand into minimally redundant signed–digit radix–10, radix–4 and radix–5 representations using new recoders reduces the complexity of partial product generation. In addition, SD radix–4 and radix–5 recodings allow the reuse of a conventional parallel binary radix–4 multiplier to perform combined binary/decimal multiplications. Evaluation results show that the proposed architectures have interesting area–delay figures compared to conventional Booth radix–4 and radix–8 parallel binary multipliers and other representative alternatives for decimal multiplication.
veerama2007
¿Web? 
Novel, High-Speed 16-Digit BCD Adders Conforming to IEEE 754r Format,
Sreehari Veeramachaneni, M. Kirthi Krishna, Lingamneni Avinash, Sreekanth Reddy P, and M. B. Srinivas,
IEEE Computer Society Annual Symposium on VLSI (ISVLSI '07),
pp343–350,
IEEE,
May 2007.
Abstract: In view of increasing prominence of commercial, financial and internet-based applications that process data in decimal format, there is a renewed interest in providing hardware support to handle decimal data. In this paper, a new architecture for efficient 1-digit decimal addition of binary coded decimal (BCD) operands, which is the core of high speed multi-operand adders and floating decimal-point arithmetic, is proposed. Based on this 1-digit BCD adder, novel architectures for higher order (n-digit) BCD adders such as ripple carry adder and carry lookahead adder are derived. The proposed circuits are compared (both qualitatively as well as quantitatively) with the existing circuits in literature and are shown to perform better. Simulation results show that the proposed 1-digit BCD adder achieves an improvement of 40% in delay. The 16-digit BCD lookahead adder using prefix logic is shown to perform at least 80% faster than the existing ripple carry one.
wang2007
URL ¿Web? 
Decimal Floating-Point Adder and Multifunction Unit with Injection-Based Rounding,
Liang-Kai Wang and Michael J. Schulte,
Proceedings of the 18th IEEE Symposium on Computer Arithmetic,
ISBN 0769528546,
ISBN 9780769528540,
pp56–65,
IEEE,
June 2007.
Abstract: Shrinking feature sizes gives more headroom for designers to extend the functionality of microprocessors. The IEEE 754R working group has revised the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic to include specifications for decimal floating-point arithmetic and IBM recently announced incorporating a decimal floating-point unit into their POWER6 processor. As processor support for decimal floating-point arithmetic emerges, it is important to investigate efficient algorithms and hardware designs for common decimal floating-point arithmetic algorithms. This paper presents novel designs for a decimal floating-point adder and a decimal floating-point multifunction unit. To reduce their delay, both the adder and the multifunction unit use decimal injection-based rounding, a new form of decimal operand alignment, and a fast flag-based method for rounding and overflow detection. Synthesis results indicate that the proposed adder is roughly 21% faster and 1.6% smaller than a previous decimal floating-point adder design, when implemented in the same technology. Compared to the decimal floating-point adder, the decimal floating-point multifunction unit provides six additional operations, yet only has 2.8% more delay and 9.7% more area.
wang2007b
URL ¿Web? 
Benchmarks and Performance Analysis of Decimal Floating-Point Applications,
Liang-Kai Wang, Charles Tsen, Michael J. Schulte, and Divya Jhalani,
Proceedings of the IEEE International Conference on Computer Design 2007,
pp164–170,
IEEE,
October 2007.
Abstract: The IEEE P754 Draft Standard for Floating-point Arithmetic provides specifications for Decimal Floating-Point (DFP) formats and operations. Based on this standard, many developers will provide support for DFP calculations. We present a benchmark suite for DFP applications and use this suite to evaluate the performance of hardware and software DFP solutions. Our benchmarks include banking, commerce, risk-management, tax, and telephone billing applications organized into a suite of five macro benchmarks. In addition to developing our own applications, we leverage open-source projects and academic financial analysis applications. The benchmarks are modular, making them easy to adapt for different DFP solutions. We use the benchmarks to evaluate the performance of the decNumber DFP library and an extended version of the SimpleScalar PISA architecture with hardware and instruction set support for DFP operations. Our analysis shows that providing processor support for high-speed DFP operations significantly improves the performance of DFP applications.
wang2007c
¿Web? 
A Decimal Floating-Point Divider using Newton-Raphson Iteration,
Liang-Kai Wang and Michael J. Schulte,
Journal of VLSI Signal Processing Systems, Vol. 49 #1,
ISSN 09225773,
pp3–18,
Kluwer Academic Publishers,
October 2007.
Abstract: Increasing chip densities and transistor counts provide more room for designers to add functionality for important application domains into future microprocessors. As a result of rapid growth in financial, commercial, and Internet-based applications, hardware support for decimal floating-point arithmetic is now being considered by various computer manufacturers and specifications for decimal floating-point arithmetic have been added to the draft revision of the IEEE-754 Standard for Floating-Point Arithmetic (IEEE P754). In this paper, we present an efficient arithmetic algorithm and hardware design for decimal floating-point division. The design uses an efficient piecewise linear approximation, a modified Newton-Raphson iteration, a specialized rounding technique, and a simplified decimal incrementer and decrementer. Synthesis results show that a 64-bit (16-digit) implementation of the decimal divider, which is compliant with the current version of IEEE P754, has an estimated critical path delay of 0.69 ns (around 13 FO4 inverter delays) when implemented using LSI Logic’s 0.11 micron GflxP standard cell library.
wang2007d
¿Web? 
Processor support for decimal floating-point arithmetic,
Liang-Kai Wang,
ISBN 9780549194637,
157pp,
University of Wisconsin at Madison,
2007.
Abstract: Decimal data permeates society, as humans most commonly use base-ten numbers. Although microprocessors normally use base-two binary arithmetic to obtain faster execution times and simpler circuitry, binary numbers cannot represent decimal fractions exactly. This leads to large errors being accumulated after several decimal operations. Furthermore, binary floating-point arithmetic operations perform binary rounding instead of decimal rounding. Consequently, applications, such as financial, commercial, tax, and Internet-based applications, which are sensitive to representation and rounding errors, often require decimal arithmetic. Due to the increasing importance of and demand for decimal arithmetic, its formats and operations have been specified in the IEEE Draft Standard for Floating-point Arithmetic (IEEE P754). Most decimal applications use software routines and binary arithmetic to emulate decimal operations. Although this approach eliminates errors due to converting between binary and decimal numbers and provides decimal rounding to mirror manual calculations, it results in long latencies for numerically intensive commercial applications. This is because software emulation of decimal floating-point (DFP) arithmetic has significant overhead due to function calls, dealing with decimal formats, operand alignment, decimal rounding, and special case and exception handling. This dissertation investigates processor support for decimal floating-point arithmetic. It first reviews recent progress in decimal arithmetic, including decimal encodings, the IEEE P754 Draft Standard, and software packages, hardware designs, and benchmark suites for decimal arithmetic. Next, this dissertation presents novel arithmetic algorithms and hardware designs for basic DFP operations, including DFP addition, subtraction, division, square root, and others.
Most of the hardware designs presented in this dissertation are the first published designs compliant with the IEEE P754 Draft Standard. Finally, to study the performance impact of DFP instructions and hardware, this dissertation presents the first publicly available benchmark suite for DFP arithmetic. This benchmark suite, along with instruction set extensions and a decimal-enhanced processor simulator, are used to demonstrate that providing fast hardware support for DFP operations leads to significant performance benefits to DFP-intensive applications.
biswas2008
¿Web? 
A Novel Approach to Design BCD Adder and Carry Skip BCD Adder,
Ashis Kumer Biswas, Md. Mahmudul Hasan, Moshaddek Hasan, Ahsan Raja Chowdhury, and Hafiz Md. Hasan Babu,
Proceedings of the 21st International Conference on VLSI Design (VLSID '08),
ISBN 0769530834,
pp566–571,
IEEE Computer Society,
January 2008.
Abstract: Reversible logic has become one of the most promising research areas in the past few decades and has found its applications in several technologies; such as low power CMOS, nanocomputing and optical computing. This paper presents improved and efficient reversible logic implementations for Binary Coded Decimal (BCD) adder as well as Carry Skip BCD adder. It has been shown that the modified designs outperform the existing ones in terms of number of gates, number of garbage output and delay. 
biswas2008b
¿Web? 
Efficient approaches for designing reversible Binary Coded Decimal adders,
Ashis Kumer Biswas, Md. Mahmudul Hasan, Ahsan Raja Chowdhury, and Hafiz Md. Hasan Babu,
Microelectronics Journal, Vol. 39 #12,
ISSN 00262692,
pp1693–1703,
Elsevier,
December 2008.
Abstract: Reversible logic has become one of the most promising research areas in the past few decades and has found its applications in several technologies; such as low-power CMOS, nanocomputing and optical computing. This paper presents improved and efficient reversible logic implementations for Binary Coded Decimal (BCD) adder as well as Carry Skip BCD adder. It has been shown that the modified designs outperform the existing ones in terms of number of gates, number of garbage outputs, delay, and quantum cost. In order to show the efficiency of the proposed designs, lower bounds of the reversible BCD adders in terms of gates and garbage outputs are proposed as well.
castell2008
¿Web? 
Compressor trees for decimal partial product reduction,
Ivan D. Castellanos and James E. Stine,
Proceedings of the 18th ACM Great Lakes symposium on VLSI,
ISBN 9781595939999,
pp107–110,
ACM Press,
2008.
Abstract: Decimal multiplication has grown in interest due to the recent announcement of new IEEE 754R standards and the availability of high-speed decimal computation hardware. Prior research enabled partial products to be coded more efficiently for their use in radix 10 architectures. This paper clarifies previous techniques for partial product reduction using carry-save adders and presents a new 4:2 compressor structure. This new structure improves performance at the expense of more gates, however, regularity is introduced into the circuit to promote implementations in Very Large Scale Integration (VLSI) Designs. Results are presented and compared for several designs using a TSMC SCN6M 0.18 µm feature size.
erle2008
URL ¿Web? 
Algorithms and Hardware Designs for Decimal Multiplication,
Mark A. Erle,
217pp,
Lehigh University,
November 2008.
Abstract: Although a preponderance of business data is in decimal form, virtually all floating-point arithmetic units on today’s general-purpose microprocessors are based on the binary number system. Higher performance, less circuitry, and better overall error characteristics are the main reasons why binary floating-point hardware (BFP) is chosen over decimal floating-point (DFP) hardware. However, the binary number system cannot precisely represent many common decimal values. Further, although BFP arithmetic is well-suited for the scientific community, it is quite different from manual calculation norms and does not meet many legal requirements. Due to the shortcomings of BFP arithmetic, many applications involving fractional decimal data are forced to perform their arithmetic either entirely in software or with a combination of software and decimal fixed-point hardware. Providing DFP hardware has the potential to dramatically improve the performance of such applications. Only recently has a large microprocessor manufacturer begun providing systems with DFP hardware. With available die area continually increasing, dedicated DFP hardware implementations are likely to be offered by other microprocessor manufacturers. This dissertation discusses the motivation for decimal computer arithmetic, a brief history of this arithmetic, and relevant software and processor support for a variety of decimal arithmetic functions. As the context of the research is the IEEE Standard for Floating-point Arithmetic (IEEE 754-2008) and two-state transistor technology, descriptions of the standard and various decimal digit encodings are described. The research presented investigates algorithms and hardware support for decimal multiplication, with particular emphasis on DFP multiplication. Both iterative and parallel implementations are presented and discussed.
Novel ideas are advanced such as the use of decimal counters and compressors and the support of IEEE 754-2008 floating-point, including early estimation of the shift amount, inline exception handling, on-the-fly sticky bit generation, and efficient decimal rounding. The iterative and parallel, decimal multiplier designs are compared and contrasted in terms of their latency, throughput, area, delay, and usage. The culmination of this research is the design and comparison of an iterative DFP multiplier with a parallel DFP multiplier. The iterative DFP multiplier is significantly smaller and may achieve a higher practical frequency of operation than the parallel DFP multiplier. Thus, in situations where the area available for DFP is an important design constraint, the iterative DFP multiplier may be an attractive implementation. However, the parallel DFP multiplier has less latency for a single multiply operation and is able to produce a new result every cycle. As for power considerations, the fewer overall devices in the iterative multiplier, and more importantly the fewer storage elements, should result in less leakage. This benefit is mitigated by its higher latency and lower throughput. The proposed implementations are suitable for general-purpose, server, and mainframe microprocessor designs. Depending on the demand for DFP in human-centric applications, this research may be employed in the application-specific integrated circuits (ASICs) market. Note: Available at speleotrove.com.
jimeno2008
¿Web? 
A BCD-based architecture for fast coordinate rotation,
Antonio Jimeno, Higinio Mora, Jose L. Sanchez, and Francisco Pujol,
Journal of Systems Architecture: the EUROMICRO Journal, Vol. 54 #8,
ISSN 13837621,
pp829–840,
Elsevier,
August 2008.
Abstract: Although radix 10 based arithmetic has been gaining renewed importance over the last few years, decimal systems are not efficient enough and techniques are still under development. In this paper, an improvement of the CORDIC (coordinate rotation digital computer) method for decimal representation is proposed and applied to produce fast rotations. The algorithm uses BCD operands as inputs, combining the advantages of both decimal and binary systems. The result is a reduction of 50% in the number of iterations if compared with the original Decimal CORDIC method. Finally, we present a hardware architecture useful to produce BCD coordinates rotations accurately and fast, and different experiments demonstrating the advantages of the new method are shown. A reduction of 75% in a single stage delay is obtained, whereas the circuit area just increases in about 5%. 
thomsen2008
¿Web? 
Optimized reversible binary-coded decimal adders,
Michael Kirkedal Thomsen and Robert Glück,
Journal of Systems Architecture: the EUROMICRO Journal, Vol. 54 #7,
ISSN 13837621,
pp697–706,
Elsevier,
July 2008.
Abstract: Babu and Chowdhury recently proposed, in this journal, a reversible adder for binary-coded decimals. This paper corrects and optimizes their design. The optimized 1-decimal BCD full-adder, a 13x13 reversible logic circuit, is faster, and has lower circuit cost and less garbage bits. It can be used to build a fast reversible m-decimal BCD full-adder that has a delay of only m+17 low-power reversible CMOS gates. For a 32-decimal (128-bit) BCD addition, the circuit delay of 49 gates is significantly lower than is the number of bits used for the BCD representation. A complete set of reversible half- and full-adders for n-bit binary numbers and m-decimal BCD numbers is presented. The results show that special-purpose design pays off in reversible logic design by drastically reducing the number of garbage bits. Specialized designs benefit from support by reversible logic synthesis. All circuit components required for optimizing the original design could also be synthesized successfully by an implementation of an existing synthesis algorithm.
veerama2008
¿Web? 
A Novel Carry-Look Ahead Approach to a Unified BCD and Binary Adder/Subtractor,
Sreehari Veeramachaneni, M. Kirthi Krishna, G. V. Prateek, S. Subroto, S. Bharat, and M. B. Srinivas,
Proceedings of the 21st International Conference on VLSI Design (VLSID '08),
ISBN 0769530834,
pp547–552,
IEEE Computer Society,
January 2008.
Abstract: Increasing prominence of commercial, financial and internet-based applications, which process decimal data, there is an increasing interest in providing hardware support for such data. In this paper, new architecture for efficient binary and Binary Coded Decimal (BCD) adder/subtractor is presented. This employs a new method of subtraction unlike the existing designs which mostly use 10’s complements, to obtain a much lower latency. Though there is a necessity of correction in some cases, the delay overhead is minimal. A complete discussion about such cases and the required logic to process is presented. The architecture is runtime reconfigurable to facilitate both BCD and binary operations, including signed and unsigned numbers. The proposed circuits are compared (both qualitatively as well as quantitatively) with the existing circuits in literature and are shown to perform better. Simulation results show that the proposed architecture is at least 11% faster than the existing designs.
webb2008
URL ¿Web? 
IBM z10: The Next-Generation Mainframe Microprocessor,
Charles Webb,
IEEE Micro Vol. 28 #2,
ISSN 02721732,
pp19–29,
IEEE,
March/April 2008.
Abstract: The IBM system z10 includes four microprocessor cores (each with a private 3-Mbyte cache) and integrated accelerators for decimal floating-point computation, cryptography, and data compression. A separate SMP hub chip provides a shared third-level cache and interconnect fabric for multiprocessor scaling. This article focuses on the high-frequency design techniques used to achieve a 4.4-GHz system, and on the pipeline design that optimizes z10’s CPU performance.
schwarz2009
URL ¿Web? 
Decimal floating-point support on the IBM System z10 processor,
Eric M. Schwarz, John S. Kapernick, and Mike F. Cowlishaw,
IBM Journal of Research and Development, Vol. 53 #1,
pp4:1–4:10,
IBM,
January 2009.
Abstract: The latest IBM zSeries processor, the IBM System z10 processor, provides hardware support for the decimal floating-point (DFP) facility that was introduced on the IBM System z9 processor. The z9 processor implements the facility with a mixture of low-level software and hardware assists. Recently, the IBM POWER6 processor-based System p 570 server introduced a hardware implementation of the DFP facility. The latest zSeries processor includes a decimal floating-point unit based on the POWER6 processor DFP unit that has been enhanced to also support the traditional zSeries decimal fixed-point instruction set. This paper explains the hardware implementation to support both decimal fixed point and DFP and the new software support for the DFP facility, including IBM z/OS, Java JIT, and C/C++ compilers, as well as support in IBM DB2 and middleware.
252 references listed.
Last updated: 10 Mar 2011
Some elements Copyright © IBM Corporation, 2002, 2009. All rights reserved.