The 'telco' benchmark specification

The ‘telco’ benchmark specification
A simplified billing application [draft v0.53]
Copyright © IBM Corporation, 2002, 2005. All rights reserved.

This benchmark specifies a small program which is intended to capture the essence of a telephone company billing application, with a realistic balance between Input/Output activity and application calculations.

This is expected to allow

the comparison of Input/Output time with calculation time
the investigation of different implementation approaches.

Note that:

Call durations and rates vary significantly from country to country.
Taxes are calculated differently in different countries.
Billing calculations can be significantly more complicated than the ones included in this benchmark.

Examples of these and other differences, and their effects on the results, are discussed below.

This benchmark, therefore, does not exactly match any specific company’s billing system. Instead, the distributions, rates, etc., are intended to give a representative result (involving calculations of appropriate precision and reasonable complexity) rather than match a specific model. In general, these results will underestimate the costs of calculation.

For some more background information and some initial results, see http://speleotrove.com/decimal/telco.html.

Acknowledgements

Thanks are due to the application engineers who suggested and verified the benchmark, and to Robert Berry (IBM) and Pamela Taylor (SHARE) for suggestions and improvements to the benchmark.

Specification

In this description, the notation [name=value] represents a value or setting that might be a constant or a parameter, depending on the implementation of the benchmark.

Inputs and Outputs

Two files are used:

An input file, i. This is an existing binary file containing a series of eight-byte pseudo-random integers with [distribution=exponential] and [mean=180]. The numbers are all either [type=packed-decimal-15-digits], [type=binary-64-bit], or [type=IEEE-754r-decimal64], the type being chosen as appropriate for the platform or language. The whole file is read during the benchmark.
The file should be sufficiently large that the timings are clearly proportional to the size of the file (for example, if the size of the file is doubled, the measured times would also double).
An output file, f. This is a new text file which will be written and closed during the benchmark; it will contain one number string per line. The representation of the file is platform dependent (for example, characters might be encoded in ASCII, UTF8, or EBCDIC, and the file might be record-oriented or stream, etc.).

For both files, buffered input and output routines must be used, using the default buffer size for the platform.

Calculations

All calculations (except for the one case where rounding is explicit) must be exact. If a suitable decimal type or decimal package is available for the platform, this must be used for the calculations.

Benchmark steps

The benchmark comprises the following steps:

If a file with the same name as the output file, f, exists, then it is erased or renamed.
At this point, timing of the benchmark begins.

Three variables are initialized to zero:

  sumT = 0   -- sum of total prices
  sumB = 0   -- sum of basic tax
  sumD = 0   -- sum of 'distance' tax

The output text file, f, is created (opened).
A sequence of numbers, representing call duration times in seconds, are read from the input file, i. Reading the number from the file, and any initial conversion into the numerical form used by the following calculations, is considered to be part of the Input/Output time of the benchmark.
After each number, n, is read the following calculations are carried out:
1. A call type indicator, c, is set from the bottom (least significant) bit of the duration (hence c is 0 or 1).
2. A rate, r, is determined from the call type. Those calls with c=0 have a low rate: 0.0013; the remainder (‘distance calls’) have a ‘premium’ rate: 0.00894. (The rates are, very roughly, in Euros or dollars per second.)
3. A price, p, for the call is then calculated (p=r*n). This is rounded to exactly 2 fractional digits using round-half-even (Banker’s round to nearest).
4. A basic tax, b, is calculated: b=p*0.0675 (6.75%). This is truncated to exactly 2 fractional digits (round-down), and the total basic tax variable is then incremented (sumB=sumB+b).
5. For distance calls: a distance tax, d, is calculated: d=p*0.0341 (3.41%). This is truncated to exactly 2 fractional digits (round-down), and then the total distance tax variable is incremented (sumD=sumD+d).
6. The total price, t, is calculated (t=p+b, and, if a distance call, t=t+d).
7. The total prices variable is incremented (sumT=sumT+t).
8. The total price, t, is converted to a string, s.
The string s is then written to file f as a line (there is therefore one value per line in the completed file).
Once all the numbers in the input file i have been processed, any pending output data are flushed to disk, if possible, and then the input file i and the output file f are closed.
At this point, timing of the benchmark ends.
The values sumT, sumB, sumD are displayed for verification. For the short 10-number test file [infile=telco.test or infile=telco.testb] the values should be:
```
  sumT = 8.91
  sumB = 0.50
  sumD = 0.22
  
```
Details of the individual calculations (and the contents of the output file, f) for this short test case can be found on the introduction page at http://speleotrove.com/decimal/telco.html.
For the 1,000,000-number file, exponential distribution with a mean of 180 [infile=expon180.1e6 or infile=expon180.1e6b], the sum values should be:
```
  sumT = 1004737.58
  sumB = 57628.30
  sumD = 25042.17
  
```
(In either case, if sumD is unexpectedly 0 then probably the BCD data file is being used when the testcase expects the binary data file.)

Time comparisons

Comparison of the relative timing of Input/Output and calculations can be made by having the inner steps 1-8 above be controlled by a compile-time or run-time switch which either carries out the eight steps of the calculation as described or prepares a four-character string to be written to the output file as step 9. (A runtime switch is more convenient, and its runtime cost will be neglible.)

Discussion

The model used in the benchmark makes a number of simplifications in order to increase the generality of the test. Here are some answers to specific questions about the simplifications.

In the benchmark, only two rates are used, and these are equally likely. In real systems many different rates are used. These are dependent on a number of different variables, such as the timing of the call (prime, off-peak, etc.), the geography of the call (interstate, international, etc.), the type of telephone in use (land line, mobile, etc.), and so on. Why does the benchmark use only two, equally likely, rates?

The choice of two rates means that a realistic spread of rate precisions (about 3 bits) is used in the calculations, giving a reasonable estimate of the computational effect of multiple rates. The 50% choice is roughly the proportion between low-rate and high-rate calls in countries with a high penetration of mobile phone usage.
The 50% ratio also means that the split can be determined fairly and cheaply from the bottom (least significant) bit of the number n, regardless of whether the number is expressed in a decimal or binary base. This allows the data file to be a simple ‘generic random number distribution’ instead of being a data file specific to this benchmark.

Only two tax rates are used. Billing schemes can be far more complex, quite often with more than two tax authorities involved. Also there are sometimes “special plans” which allow free calls up to a fixed monthly fee, and so on. Why not add more calculations into the benchmark?

Billing calculations do vary considerably. In some countries, there is only one ‘telephone tax’. Two is common. In the USA there can be as many as four on a single call.
The ‘two-tier’ tax system used in the benchmark was chosen to be simpler than most billing schemes, while not being significantly more complex than the simplest of schemes. (The 50-50 choice was made for the same reasons as just discussed.)
If a more complex model were used, it might overestimate the amount of calculation required for many situations. The effect of the more complex model on the results would be an increased amount of calculation and related logic, compared to Input/Output activity.

The mean call duration is 180 seconds; is this realistic?

This number (three minutes) was chosen as a reasonable mean, based on statistics from a number of countries in North America, Europe and the Pacific Rim. Quite large variations in this mean would not significantly alter the precisions, and therefore the computational costs, of the calculations.

Why is an exponential distribution used for the call times?

This distribution is commonly used for modeling call times. Although there are better models (e.g., log-normal), this one is extremely easy to generate and gives an appropriate spread of precisions in the input numbers.

Only about 16 bits (4 digits) are needed for the input numbers; why use 64 bits (15 digits)?

Partly to make sure the benchmark doesn’t underestimate the amount of data to be read from disk, and partly to allow the use of a ‘generic’ random number distribution data file.

Why does the package require the use of decimal arithmetic (if available)?

Decimal arithmetic can be used to assure the exactness of the results (see the Decimal FAQ). Also, using a decimal data type reflects actual applications of this type (the data are often held in decimal form, and using decimals improves maintainability compared to manual scaling of integers).
Note that the format of the input file (binary or packed decimal) may be chosen to suit the representation used by the decimal arithmetic package used.

In a ‘real’ application, the input data are likely to be derived from a database. Won’t reading the data from a file underestimate the Input/Output costs?

Reading from a database should be close in speed to reading from a disk. However, to verify this it is hoped to couple the benchmark to a database at some point. In the meantime, measurements are being made with the files being network files (Ethernet connected) as well as using ‘local’ files. The network file read time should be significantly larger than database read times and will therefore place an upper bound on results.

Mike Cowlishaw, mfc@speleotrove.com