0001 C README
0002 ========
0003 This "shar" file contains the documentation for the
0004 electronic mail distribution of the Dhrystone benchmark (C version 2.1);
0005 a companion "shar" file contains the source code.
0006 (Because of mail length restrictions for some mailers, I have
0007 split the distribution in two parts.)
0008
0009 For versions in other languages, see the other "shar" files.
0010
0011 Files containing the C version (*.h: Header File, *.c: C Modules)
0012
0013 dhry.h
0014 dhry_1.c
0015 dhry_2.c
0016
0017 The file RATIONALE contains the article
0018
0019 "Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules"
0020
0021 which has been published, together with the C source code (Version 2.0),
0022 in SIGPLAN Notices vol. 23, no. 8 (Aug. 1988), pp. 49-62.
0023 This article explains all changes that have been made for Version 2,
0024 compared with the version of the original publication
0025 in Communications of the ACM vol. 27, no. 10 (Oct. 1984), pp. 1013-1030.
0026 It also contains "ground rules" for benchmarking with Dhrystone
0027 which should be followed by everyone who uses the program and publishes
0028 Dhrystone results.
0029
0030 Compared with the Version 2.0 published in SIGPLAN Notices, Version 2.1
0031 contains a few corrections that have been made after Version 2.0 was
0032 distributed over the UNIX network Usenet. These small differences between
0033 Version 2.0 and 2.1 should not affect execution time measurements.
0034 For those who want to compare the exact contents of both versions,
0035 the file "dhry_c.dif" contains the differences between the two versions,
0036 as generated by a file comparison of the corresponding files with the
0037 UNIX utility "diff".
0038
0039 The file VARIATIONS contains the article
0040
0041 "Understanding Variations in Dhrystone Performance"
0042
0043 which has been published in Microprocessor Report, May 1989
0044 (Editor: M. Slater), pp. 16-17. It describes the points that users
0045 should know if C Dhrystone results are compared.
0046
0047 Recipients of this shar file who perform measurements are asked
0048 to send measurement results to the author and/or to Rick Richardson.
0049 Rick Richardson regularly publishes Dhrystone results on the UNIX network
0050 Usenet. For submissions of results to him (preferably by electronic mail,
0051 see address in the program header), he has provided a form which is contained
0052 in the file "submit.frm".
0053
0054
0055 The following files are contained in other "shar" files:
0056
0057 Files containing the Ada version (*.s: Specifications, *.b: Bodies):
0058
0059 d_global.s
0060 d_main.b
0061 d_pack_1.b
0062 d_pack_1.s
0063 d_pack_2.b
0064 d_pack_2.s
0065
0066 File containing the Pascal version:
0067
0068 dhry.p
0069
0070
0071 February 22, 1990
0072
0073 Reinhold P. Weicker
0074 Siemens AG, AUT E 51
0075 Postfach 3220
0076 D-8520 Erlangen
0077 Germany (West)
0078
0079 Phone: [xxx-49]-9131-7-20330 (8-17 Central European Time)
0080 UUCP: ..!mcsun!unido!estevax!weicker
0081
0082
0083 Rationale
0084 =========
0085
0086
0087
0088 Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules
0089
0090 [published in SIGPLAN Notices 23,8 (Aug. 1988), 49-62]
0091
0092
0093 Reinhold P. Weicker
0094 Siemens AG, E STE 35
0095 [now: Siemens AG, AUT E 51]
0096 Postfach 3220
0097 D-8520 Erlangen
0098 Germany (West)
0099
0100
0101
0102
0103 1. Why a Version 2 of Dhrystone?
0104
0105 The Dhrystone benchmark program [1] has become a popular benchmark for
0106 CPU/compiler performance measurement, in particular in the area of
0107 minicomputers, workstations, PCs and microprocessors. It apparently satisfies
0108 a need for an easy-to-use integer benchmark; it gives a first performance
0109 indication which is more meaningful than MIPS numbers which, in their literal
0110 meaning (million instructions per second), cannot be used across different
0111 instruction sets (e.g. RISC vs. CISC). With the increasing use of the
0112 benchmark, it seems necessary to reconsider the benchmark and to check whether
0113 it can still fulfill this function. Version 2 of Dhrystone is the result of
0114 such a re-evaluation; it has been made for two reasons:
0115
0116 o Dhrystone has been published in Ada [1], and versions in Ada, Pascal and C
0117 have been distributed by Reinhold Weicker via floppy disk. However, the
0118 version that was used most often for benchmarking has been the version made
0119 by Rick Richardson by another translation from the Ada version into the C
0120 programming language; this has been the version distributed via the UNIX
0121 network Usenet [2].
0122
0123 There is an obvious need for a common C version of Dhrystone, since C is at
0124 present the most popular system programming language for the class of
0125 systems (microcomputers, minicomputers, workstations) where Dhrystone is
0126 used most. There should be, as far as possible, only one C version of
0127 Dhrystone such that results can be compared without restrictions. In the
0128 past, the C versions distributed by Rick Richardson (Version 1.1) and by
0129 Reinhold Weicker had small (though not significant) differences.
0130
0131 Together with the new C version, the Ada and Pascal versions have been
0132 updated as well.
0133
0134 o As far as it is possible without changes to the Dhrystone statistics,
0135 optimizing compilers should be prevented from removing significant
0136 statements. It has turned out in the past that optimizing compilers
0137 suppressed code generation for too many statements (by "dead code removal"
0138 or "dead variable elimination"). This has led to the danger that
0139 benchmarking results obtained by a naive application of Dhrystone - without
0140 inspection of the code that was generated - could become meaningless.
0141
0142 The overall policy for version 2 has been that the distribution of
0143 statements, operand types and operand locality described in [1] should remain
0144 unchanged as much as possible. (Very few changes were necessary; their impact
0145 should be negligible.) Also, the order of statements should remain unchanged.
0146 Although I am aware of some critical remarks on the benchmark - I agree with
0147 several of them - and know some suggestions for improvement, I didn't want to
0148 change the benchmark into something different from what has become known as
0149 "Dhrystone"; the confusion generated by such a change would probably outweigh
0150 the benefits. If I were to write a new benchmark program, I wouldn't give it
0151 the name "Dhrystone" since this denotes the program published in [1].
0152 However, I do recognize the need for a larger number of representative
0153 programs that can be used as benchmarks; users should always be encouraged to
0154 use more than just one benchmark.
0155
0156 The new versions (version 2.1 for C, Pascal and Ada) will be distributed as
0157 widely as possible. (Version 2.1 differs from version 2.0 distributed via the
0158 UNIX Network Usenet in March 1988 only in a few corrections for minor
0159 deficiencies found by users of version 2.0.) Readers who want to use the
0160 benchmark for their own measurements can obtain a copy in machine-readable
0161 form on floppy disk (MS-DOS or XENIX format) from the author.
0162
0163
0164 2. Overall Characteristics of Version 2
0165
0166 In general, version 2 follows - in the parts that are significant for
0167 performance measurement, i.e. within the measurement loop - the published
0168 (Ada) version and the C versions previously distributed. Where the versions
0169 distributed by Rick Richardson [2] and Reinhold Weicker have been different,
0170 it follows the version distributed by Reinhold Weicker. (However, the
0171 differences have been so small that their impact on execution time in all
0172 likelihood has been negligible.) The initialization and UNIX instrumentation
0173 part - which had been omitted in [1] - follows mostly the ideas of Rick
0174 Richardson [2]. However, any changes in the initialization part and in the
0175 printing of the result have no impact on performance measurement since they
0176 are outside the measurement loop. As a concession to older compilers, names
0177 have been made unique within the first 8 characters for the C version.
0178
0179 The original publication of Dhrystone did not contain any statements for time
0180 measurement since they are necessarily system-dependent. However, it turned
0181 out that it is not enough just to enclose the main procedure of Dhrystone in a
0182 loop and to measure the execution time. If the variables that are computed
0183 are not used somehow, there is the danger that the compiler considers them as
0184 "dead variables" and suppresses code generation for a part of the statements.
0185 Therefore in version 2 all variables of "main" are printed at the end of the
0186 program. This also permits some plausibility control for correct execution of
0187 the benchmark.
0188
0189 At several places in the benchmark, code has been added, but only in branches
0190 that are not executed. The intention is that optimizing compilers should be
0191 prevented from moving code out of the measurement loop, or from removing code
0192 altogether. Statements that are executed have been changed in very few places
0193 only. In these cases, only the role of some operands has been changed, and it
0194 was made sure that the numbers defining the "Dhrystone distribution"
0195 (distribution of statements, operand types and locality) still hold as much as
0196 possible. Except for sophisticated optimizing compilers, execution times for
0197 version 2.1 should be the same as for previous versions.
0198
0199 Because of the self-imposed limitation that the order and distribution of the
0200 executed statements should not be changed, there are still cases where
0201 optimizing compilers may not generate code for some statements. To a certain
0202 degree, this is unavoidable for small synthetic benchmarks. Users of the
0203 benchmark are advised to check code listings to verify that code is generated
0204 for all statements of Dhrystone.
0205
0206 Contrary to the suggestion in the published paper and its realization in the
0207 versions previously distributed, no attempt has been made to subtract the time
0208 for the measurement loop overhead. (This calculation has proven difficult to
0209 implement in a correct way, and its omission makes the program simpler.)
0210 However, since the loop check is now part of the benchmark, this does have an
0211 impact - though a very minor one - on the distribution statistics which have
0212 been updated for this version.
0213
0214
0215 3. Discussion of Individual Changes
0216
0217 In this section, all changes are described that affect the measurement loop
0218 and that are not just renamings of variables. All remarks refer to the C
0219 version; the other language versions have been updated similarly.
0220
0221 In addition to adding the measurement loop and the printout statements,
0222 changes have been made at the following places:
0223
0224 o In procedure "main", three statements have been added in the non-executed
0225 "then" part of the statement
0226
0227 if (Enum_Loc == Func_1 (Ch_Index, 'C'))
0228
0229 they are
0230
0231 strcpy (Str_2_Loc, "DHRYSTONE PROGRAM, 3'RD STRING");
0232 Int_2_Loc = Run_Index;
0233 Int_Glob = Run_Index;
0234
0235 The string assignment prevents movement of the preceding assignment to
0236 Str_2_Loc (5'th statement of "main") out of the measurement loop. (This
0237 probably will not happen for the C version, but it did happen with another
0238 language and compiler.) The assignment to Int_2_Loc prevents value
0239 propagation for Int_2_Loc, and the assignment to Int_Glob makes the value of
0240 Int_Glob possibly dependent on the value of Run_Index.
0241
0242 o In the three arithmetic computations at the end of the measurement loop in
0243 "main", the role of some variables has been exchanged, to prevent the
0244 division from just cancelling out the multiplication as it was in [1]. A
0245 very smart compiler might have recognized this and suppressed code
0246 generation for the division.
0247
0248 o For Proc_2, no code has been changed, but the values of the actual parameter
0249 have changed due to changes in "main".
0250
0251 o In Proc_4, the second assignment has been changed from
0252
0253 Bool_Loc = Bool_Loc | Bool_Glob;
0254
0255 to
0256
0257 Bool_Glob = Bool_Loc | Bool_Glob;
0258
0259 It now assigns a value to a global variable instead of a local variable
0260 (Bool_Loc); Bool_Loc would be a "dead variable" which is not used
0261 afterwards.
0262
0263 o In Func_1, the statement
0264
0265 Ch_1_Glob = Ch_1_Loc;
0266
0267 was added in the non-executed "else" part of the "if" statement, to prevent
0268 the suppression of code generation for the assignment to Ch_1_Loc.
0269
0270 o In Func_2, the second character comparison statement has been changed to
0271
0272 if (Ch_Loc == 'R')
0273
0274 ('R' instead of 'X') because a comparison with 'X' is implied in the
0275 preceding "if" statement.
0276
0277 Also in Func_2, the statement
0278
0279 Int_Glob = Int_Loc;
0280
0281 has been added in the non-executed part of the last "if" statement, in order
0282 to prevent Int_Loc from becoming a dead variable.
0283
0284 o In Func_3, a non-executed "else" part has been added to the "if" statement.
0285 While the program would not be incorrect without this "else" part, it is
0286 considered bad programming practice if a function can be left without a
0287 return value.
0288
0289 To compensate for this change, the (non-executed) "else" part in the "if"
0290 statement of Proc_3 was removed.
0291
0292 The distribution statistics have been changed only by the addition of the
0293 measurement loop iteration (1 additional statement, 4 additional local integer
0294 operands) and by the change in Proc_4 (one operand changed from local to
0295 global). The distribution statistics in the comment headers have been updated
0296 accordingly.
0297
0298
0299 4. String Operations
0300
0301 The string operations (string assignment and string comparison) have not been
0302 changed, to keep the program consistent with the original version.
0303
0304 There has been some concern that the string operations are over-represented in
0305 the program, and that execution time is dominated by these operations. This
0306 was true in particular when optimizing compilers removed too much code in the
0307 main part of the program; this should have been mitigated in version 2.
0308
0309 It should be noted that this is a language-dependent issue: Dhrystone was
0310 first published in Ada, and with Ada or Pascal semantics, the time spent in
0311 the string operations is, at least in all implementations known to me,
0312 considerably smaller. In Ada and Pascal, assignment and comparison of strings
0313 are operators defined in the language, and the upper bounds of the strings
0314 occurring in Dhrystone are part of the type information known at compilation
0315 time. The compilers can therefore generate efficient inline code. In C,
0316 string assignment and comparison are not part of the language, so the string
0317 operations must be expressed in terms of the C library functions "strcpy" and
0318 "strcmp". (ANSI C allows an implementation to use inline code for these
0319 functions.) In addition to the overhead caused by additional function calls,
0320 these functions are defined for null-terminated strings where the length of
0321 the strings is not known at compilation time; the function has to check every
0322 byte for the termination condition (the null byte).
0323
0324 Obviously, a C library which includes efficiently coded "strcpy" and "strcmp"
0325 functions helps to obtain good Dhrystone results. However, I don't think that
0326 this is unfair since string functions do occur quite frequently in real
0327 programs (editors, command interpreters, etc.). If the string functions are
0328 implemented efficiently, this helps real programs as well as benchmark
0329 programs.
0330
0331 I admit that the string comparison in Dhrystone terminates later (after
0332 scanning 20 characters) than most string comparisons in real programs. For
0333 consistency with the original benchmark, I didn't change the program despite
0334 this weakness.
0335
0336
0337 5. Intended Use of Dhrystone
0338
0339 When Dhrystone is used, the following "ground rules" apply:
0340
0341 o Separate compilation (Ada and C versions)
0342
0343 As mentioned in [1], Dhrystone was written to reflect actual programming
0344 practice in systems programming. The division into several compilation
0345 units (5 in the Ada version, 2 in the C version) is intended, as is the
0346 distribution of inter-module and intra-module subprogram calls. Although on
0347 many systems there will be no difference in execution time to a Dhrystone
0348 version where all compilation units are merged into one file, the rule is
0349 that separate compilation should be used. The intention is that real
0350 programming practice, where programs consist of several independently
0351 compiled units, should be reflected. This also implies that the
0352 compiler, while compiling one unit, has no information about the use of
0353 variables, register allocation etc. occurring in other compilation units.
0354 Although in real life compilation units will probably be larger, the
0355 intention is that these effects of separate compilation are modeled in
0356 Dhrystone.
0357
0358 A few language systems have post-linkage optimization available (e.g., final
0359 register allocation is performed after linkage). This is a borderline case:
0360 Post-linkage optimization involves additional program preparation time
0361 (although not as much as compilation in one unit) which may prevent its
0362 general use in practical programming. I think that since it defeats the
0363 intentions given above, it should not be used for Dhrystone.
0364
0365 Unfortunately, ISO/ANSI Pascal does not contain language features for
0366 separate compilation. Although most commercial Pascal compilers provide
0367 separate compilation in some way, we cannot use it for Dhrystone since such
0368 a version would not be portable. Therefore, no attempt has been made to
0369 provide a Pascal version with several compilation units.
0370
0371 o No procedure merging
0372
0373 Although Dhrystone contains some very short procedures where execution would
0374 benefit from procedure merging (inlining, macro expansion of procedures),
0375 procedure merging is not to be used. The reason is that the percentage of
0376 procedure and function calls is part of the "Dhrystone distribution" of
0377 statements contained in [1]. This restriction does not hold for the string
0378 functions of the C version since ANSI C allows an implementation to use
0379 inline code for these functions.
0380
0381 o Other optimizations are allowed, but they should be indicated
0382
0383 It is often hard to draw an exact line between "normal code generation" and
0384 "optimization" in compilers: Some compilers perform operations by default
0385 that are invoked in other compilers only when optimization is explicitly
0386 requested. Also, we cannot avoid that in benchmarking people try to achieve
0387 results that look as good as possible. Therefore, optimizations performed
0388 by compilers - other than those listed above - are not forbidden when
0389 Dhrystone execution times are measured. Dhrystone is not intended to be
0390 non-optimizable but is intended to be similarly optimizable as normal
0391 programs. For example, there are several places in Dhrystone where
0392 performance benefits from optimizations like common subexpression
0393 elimination, value propagation etc., but normal programs usually also
0394 benefit from these optimizations. Therefore, no effort was made to
0395 artificially prevent such optimizations. However, measurement reports
0396 should indicate which compiler optimization levels have been used, and
0397 reporting results with different levels of compiler optimization for the
0398 same hardware is encouraged.
0399
0400 o Default results are those without "register" declarations (C version)
0401
0402 When Dhrystone results are quoted without additional qualification, they
0403 should be understood as results obtained without use of the "register"
0404 attribute. Good compilers should be able to make good use of registers even
0405 without explicit register declarations ([3], p. 193).
0406
0407 Of course, for experimental purposes, post-linkage optimization, procedure
0408 merging and/or compilation in one unit can be done to determine their effects.
0409 However, Dhrystone numbers obtained under these conditions should be
0410 explicitly marked as such; "normal" Dhrystone results should be understood as
0411 results obtained following the ground rules listed above.
0412
0413 In any case, for serious performance evaluation, users are advised to ask for
0414 code listings and to check them carefully. In this way, when results for
0415 different systems are compared, the reader can get a feeling for how much
0416 performance difference is due to compiler optimization and how much is due to
0417 hardware speed.
0418
0419
0420 6. Acknowledgements
0421
0422 The C version 2.1 of Dhrystone has been developed in cooperation with Rick
0423 Richardson (Tinton Falls, NJ); it incorporates many ideas from the "Version
0424 1.1" distributed previously by him over the UNIX network Usenet. Through his
0425 activity with Usenet, Rick Richardson has made a very valuable contribution to
0426 the dissemination of the benchmark. I also thank Chaim Benedelac (National
0427 Semiconductor), David Ditzel (SUN), Earl Killian and John Mashey (MIPS), Alan
0428 Smith and Rafael Saavedra-Barrera (UC at Berkeley) for their help with
0429 comments on earlier versions of the benchmark.
0430
0431
0432 7. Bibliography
0433
0434 [1]
0435 Reinhold P. Weicker: Dhrystone: A Synthetic Systems Programming Benchmark.
0436 Communications of the ACM 27, 10 (Oct. 1984), 1013-1030
0437
0438 [2]
0439 Rick Richardson: Dhrystone 1.1 Benchmark Summary (and Program Text)
0440 Informal Distribution via "Usenet", Last Version Known to me: Sept. 21,
0441 1987
0442
0443 [3]
0444 Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language.
0445 Prentice-Hall, Englewood Cliffs (NJ) 1978
0446
0447
0448 Variations
0449 ==========
0450 Understanding Variations in Dhrystone Performance
0451
0452
0453
0454 By Reinhold P. Weicker, Siemens AG, AUT E 51, Erlangen
0455
0456
0457
0458 April 1989
0459
0460
0461 This article has appeared in:
0462
0463
0464 Microprocessor Report, May 1989 (Editor: M. Slater), pp. 16-17
0465
0466
0467
0468
0469 Microprocessor manufacturers tend to credit all the performance measured by
0470 benchmarks to the speed of their processors; they often don't even mention the
0471 programming language and compiler used. In their detailed documents, usually
0472 called "performance brief" or "performance report," they usually do give more
0473 details. However, these details are often lost in the press releases and other
0474 marketing statements. For serious performance evaluation, it is necessary to
0475 study the code generated by the various compilers.
0476
0477 Dhrystone was originally published in Ada (Communications of the ACM, Oct.
0478 1984). However, since good Ada compilers were rare at this time and, together
0479 with UNIX, C became more and more popular, the C version of Dhrystone is the
0480 one now mainly used in industry. There are "official" versions 2.1 for Ada,
0481 Pascal, and C, which are as close together as the languages' semantic
0482 differences permit.
0483
0484 Dhrystone contains two statements where the programming language and its
0485 translation play a major part in the execution time measured by the benchmark:
0486
0487 o String assignment (in procedure Proc_0 / main)
0488 o String comparison (in function Func_2)
0489
0490 In Ada and Pascal, strings are arrays of characters where the length of the
0491 string is part of the type information known at compile time. In C, strings
0492 are also arrays of characters, but there are no operators defined in the
0493 language for assignment and comparison of strings. Instead, functions
0494 "strcpy" and "strcmp" are used. These functions are defined for strings of
0495 arbitrary length, and make use of the fact that strings in C have to end with
0496 a terminating null byte. For general-purpose calls to these functions, the
0497 implementor can assume nothing about the length and the alignment of the
0498 strings involved.
0499
0500 The C version of Dhrystone spends a relatively large amount of time in these
0501 two functions. Some time ago, I made measurements on a VAX 11/785 with the
0502 Berkeley UNIX (4.2) compilers (often-used compilers, but certainly not the
0503 most advanced). In the C version, 23% of the time was spent in the string
0504 functions; in the Pascal version, only 10%. On good RISC machines (where less
0505 time is spent in the procedure calling sequence than on a VAX) and with better
0506 optimizing compilers, the percentage is higher; MIPS has reported 34% for an
0507 R3000. Because of this effect, Pascal and Ada Dhrystone results are usually
0508 better than C results (except when the optimization quality of the C compiler
0509 is considerably better than that of the other compilers).
0510
0511 Several people have noted that the string operations are over-represented in
0512 Dhrystone, mainly because the strings occurring in Dhrystone are longer than
0513 average strings. I admit that this is true, and have said so in my SIGPLAN
0514 Notices paper (Aug. 1988); however, I didn't want to generate confusion by
0515 changing the string lengths from version 1 to version 2.
0516
0517 Even if they are somewhat over-represented in Dhrystone, string operations are
0518 frequent enough that it makes sense to implement them in the most efficient
0519 way possible, not only for benchmarking purposes. This means that they can
0520 and should be written in assembly language code. ANSI C also explicitly allows
0521 the string functions to be implemented as macros, i.e. by inline code.
0522
0523 There is also a third way to speed up the "strcpy" statement in Dhrystone: For
0524 this particular "strcpy" statement, the source of the assignment is a string
0525 constant. Therefore, in contrast to calls to "strcpy" in the general case, the
0526 compiler knows the length and alignment of the strings involved at compile
0527 time and can generate code in the same efficient way as a Pascal compiler
0528 (word instructions instead of byte instructions).
0529
0530 This is not allowed in the case of the "strcmp" call: Here, the addresses are
0531 formal procedure parameters, and no assumptions can be made about the length
0532 or alignment of the strings. Any such assumptions would indicate an incorrect
0533 implementation. They might work for Dhrystone, where the strings are in fact
0534 word-aligned with typical compilers, but other programs would deliver
0535 incorrect results.
0536
0537 So, for an apples-to-apples comparison between processors, and not between
0538 several possible (legal or illegal) degrees of compiler optimization, one
0539 should check that the systems are comparable with respect to the following
0540 three points:
0541
0542 (1) String functions in assembly language vs. in C
0543
0544 Frequently used functions such as the string functions can and should be
0545 written in assembly language, and all serious C language systems known
0546 to me do this. (I list this point for completeness only.) Note that
0547 processors with an instruction that checks a word for a null byte (such
0548 as AMD's 29000 and Intel's 80960) have an advantage here. (This
0549 advantage decreases relatively if optimization (3) is applied.) Due to
0550 the length of the strings involved in Dhrystone, this advantage may be
0551 considered too high in perspective, but it is certainly legal to use
0552 such instructions - after all, these situations are what they were
0553 invented for.
0554
0555 (2) String function code inline vs. as library functions.
0556
0557 ANSI C has created a new situation, compared with the older
0558 Kernighan/Ritchie C. In the original C, the definition of the string
0559 functions was not part of the language. Now it is, and inlining is
0560 explicitly allowed. I probably should have stated more clearly in my
0561 SIGPLAN Notices paper that the rule "No procedure inlining for
0562 Dhrystone" referred to the user level procedures only and not to the
0563 library routines.
0564
0565 (3) Fixed-length and alignment assumptions for the strings
0566
0567 Compilers should be allowed to optimize in these cases if (and only if)
0568 it is safe to do so. For Dhrystone, this is the "strcpy" statement, but
0569 not the "strcmp" statement (unless, of course, the "strcmp" code
0570 explicitly checks the alignment at execution time and branches
0571 accordingly). A "Dhrystone switch" for the compiler that causes the
0572 generation of code that may not work under certain circumstances is
0573 certainly inappropriate for comparisons. It has been reported in Usenet
0574 that some C compilers provide such a compiler option; since I don't have
0575 access to all C compilers involved, I cannot verify this.
0576
0577 If the fixed-length and word-alignment assumption can be used, a wide
0578 bus that permits fast multi-word load instructions certainly does help;
0579 however, this fact by itself should not make a really big difference.
0580
0581 A check of these points - something that is necessary for a thorough
0582 evaluation and comparison of the Dhrystone performance claims - requires
0583 object code listings as well as listings for the string functions (strcpy,
0584 strcmp) that are possibly called by the program.
0585
0586 I don't pretend that Dhrystone is a perfect tool to measure the integer
0587 performance of microprocessors. The more it is used and discussed, the more I
0588 myself learn about aspects that I hadn't noticed yet when I wrote the program.
0589 And of course, the very success of a benchmark program is a danger in that
0590 people may tune their compilers and/or hardware to it, and with this action
0591 make it less useful.
0592
0593 Whetstone and Linpack have their critical points also: The Whetstone rating
0594 depends heavily on the speed of the mathematical functions (sine, sqrt, ...),
0595 and Linpack is sensitive to data alignment for some cache configurations.
0596
0597 Introduction of a standard set of public domain benchmark software (something
0598 the SPEC effort attempts) is certainly a worthwhile thing. In the meantime,
0599 people will continue to use whatever is available and widely distributed, and
0600 Dhrystone ratings are probably still better than MIPS ratings if these are -
0601 as often in industry - based on no reproducible derivation. However, any
0602 serious performance evaluation requires more than just a comparison of raw
0603 numbers; one has to make sure that the numbers have been obtained in a
0604 comparable way.