0001 C README
0002 ========
0003 This "shar" file contains the documentation for the
0004 electronic mail distribution of the Dhrystone benchmark (C version 2.1);
0005 a companion "shar" file contains the source code.
0006 (Because of mail length restrictions for some mailers, I have
0007 split the distribution in two parts.)
0008 
0009 For versions in other languages, see the other "shar" files.
0010 
0011 Files containing the C version (*.h: Header File, *.c: C Modules)
0012 
0013   dhry.h
0014   dhry_1.c
0015   dhry_2.c
0016   
0017 The file RATIONALE contains the article 
0018 
0019   "Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules"
0020 
0021 which has been published, together with the C source code (Version 2.0),
0022 in SIGPLAN Notices vol. 23, no. 8 (Aug. 1988), pp. 49-62.
0023 This article explains all changes that have been made for Version 2,
0024 compared with the version of the original publication
0025 in Communications of the ACM vol. 27, no. 10 (Oct. 1984), pp. 1013-1030.
0026 It also contains "ground rules" for benchmarking with Dhrystone
0027 which should be followed by everyone who uses the program and publishes
0028 Dhrystone results.
0029 
0030 Compared with the Version 2.0 published in SIGPLAN Notices, Version 2.1
0031 contains a few corrections that have been made after Version 2.0 was
0032 distributed over the UNIX network Usenet. These small differences between
0033 Version 2.0 and 2.1 should not affect execution time measurements.
0034 For those who want to compare the exact contents of both versions,
0035 the file "dhry_c.dif" contains the differences between the two versions,
0036 as generated by a file comparison of the corresponding files with the
0037 UNIX utility "diff".
0038 
0039 The file VARIATIONS contains the article
0040 
0041   "Understanding Variations in Dhrystone Performance"
0042 
0043 which has been published in Microprocessor Report, May 1989
0044 (Editor: M. Slater), pp. 16-17. It describes the points that users
0045 should know if C Dhrystone results are compared.
0046 
0047 Recipients of this shar file who perform measurements are asked
0048 to send measurement results to the author and/or to Rick Richardson.
0049 Rick Richardson regularly publishes Dhrystone results on the UNIX network
0050 Usenet. For submissions of results to him (preferably by electronic mail,
0051 see address in the program header), he has provided a form which is contained
0052 in the file "submit.frm".
0053 
0054 
0055 The following files are contained in other "shar" files:
0056 
0057 Files containing the Ada version (*.s: Specifications, *.b: Bodies):
0058 
0059   d_global.s
0060   d_main.b
0061   d_pack_1.b
0062   d_pack_1.s
0063   d_pack_2.b
0064   d_pack_2.s
0065 
0066 File containing the Pascal version:
0067 
0068   dhry.p
0069 
0070 
0071 February 22, 1990
0072 
0073                  Reinhold P. Weicker
0074                  Siemens AG, AUT E 51
0075                  Postfach 3220
0076                  D-8520 Erlangen
0077                  Germany (West)
0078 
0079                  Phone:  [xxx-49]-9131-7-20330  (8-17 Central European Time)
0080                  UUCP:   ..!mcsun!unido!estevax!weicker
0081 
0082 
0083 Rationale
0084 =========
0085 
0086 
0087 
0088     Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules
0089 
0090         [published in SIGPLAN Notices 23,8 (Aug. 1988), 49-62]
0091 
0092 
0093                  Reinhold P. Weicker
0094                  Siemens AG, E STE 35
0095                  [now: Siemens AG, AUT E 51]
0096                  Postfach 3220
0097                  D-8520 Erlangen
0098                  Germany (West)
0099 
0100 
0101 
0102 
0103 1.  Why a Version 2 of Dhrystone?
0104 
0105 The Dhrystone benchmark  program  [1]  has  become  a  popular  benchmark  for
0106 CPU/compiler   performance   measurement,   in   particular  in  the  area  of
0107 minicomputers, workstations, PC's and microprocessors.  It apparently satisfies
0108 a  need  for  an  easy-to-use  integer benchmark; it gives a first performance
0109 indication which is more meaningful than MIPS numbers which, in their  literal
0110 meaning  (million  instructions  per  second), cannot be used across different
0111 instruction sets (e.g. RISC  vs.  CISC).   With  the  increasing  use  of  the
0112 benchmark, it seems necessary to reconsider the benchmark and to check whether
0113 it can still fulfill this function.  Version 2 of Dhrystone is the  result  of
0114 such a re-evaluation; it has been made for two reasons:
0115 
0116 o Dhrystone has been published in Ada [1], and Versions in Ada, Pascal  and  C
0117   have  been  distributed  by  Reinhold Weicker via floppy disk.  However, the
0118   version that was used most often for benchmarking has been the version  made
0119   by  Rick  Richardson  by another translation from the Ada version into the C
0120   programming language; this has been the version  distributed  via  the  UNIX
0121   network Usenet [2].
0122 
0123   There is an obvious need for a common C version of Dhrystone, since C is  at
0124   present  the  most  popular  system  programming  language  for the class of
0125   systems (microcomputers, minicomputers,  workstations)  where  Dhrystone  is
0126   used  most.   There  should  be,  as  far as possible, only one C version of
0127   Dhrystone such that results can be compared  without  restrictions.  In  the
0128   past,  the  C  versions  distributed by Rick Richardson (Version 1.1) and by
0129   Reinhold Weicker had small (though not significant) differences.
0130 
0131   Together with the new C version, the  Ada  and  Pascal  versions  have  been
0132   updated as well.
0133 
0134 o As far as it is  possible  without  changes  to  the  Dhrystone  statistics,
0135   optimizing   compilers   should   be  prevented  from  removing  significant
0136   statements.  It has  turned  out  in  the  past  that  optimizing  compilers
0137   suppressed  code  generation for too many statements (by "dead code removal"
0138   or  "dead  variable  elimination").   This  has  led  to  the  danger   that
0139   benchmarking  results obtained by a naive application of Dhrystone - without
0140   inspection of the code that was generated - could become meaningless.
0141 
0142 The  overall  policy  for  version  2  has  been  that  the  distribution   of
0143 statements,  operand types and operand locality described in [1] should remain
0144 unchanged as much as possible.  (Very few changes were necessary; their impact
0145 should be negligible.)  Also, the order of statements should remain unchanged.
0146 Although I am aware of some critical remarks on the benchmark - I  agree  with
0147 several  of them - and know some suggestions for improvement, I didn't want to
0148 change the benchmark into something different from what has  become  known  as
0149 "Dhrystone"; the confusion generated by such a change would probably  outweigh
0150 the benefits. If I were to write a new benchmark program, I wouldn't  give  it
0151 the  name  "Dhrystone"  since  this  denotes  the  program  published  in [1].
0152 However, I do recognize  the  need  for  a  larger  number  of  representative
0153 programs  that can be used as benchmarks; users should always be encouraged to
0154 use more than just one benchmark.
0155 
0156 The new versions (version 2.1 for C, Pascal and Ada) will  be  distributed  as
0157 widely as possible.  (Version 2.1 differs from version 2.0 distributed via the
0158 UNIX Network Usenet in  March  1988  only  in  a  few  corrections  for  minor
0159 deficiencies  found  by  users  of  version 2.0.)  Readers who want to use the
0160 benchmark for their own measurements can obtain  a  copy  in  machine-readable
0161 form on floppy disk (MS-DOS or XENIX format) from the author.
0162 
0163 
0164 2.  Overall Characteristics of Version 2
0165 
0166 In general, version 2  follows  -  in  the  parts  that  are  significant  for
0167 performance  measurement,  i.e.   within  the measurement loop - the published
0168 (Ada) version and the C versions previously distributed.  Where  the  versions
0169 distributed  by  Rick Richardson [2] and Reinhold Weicker have been different,
0170 it  follows  the  version  distributed  by  Reinhold  Weicker.  (However,  the
0171 differences  have  been  so  small  that their impact on execution time in all
0172 likelihood has been negligible.)  The initialization and UNIX  instrumentation
0173 part  -  which  had  been  omitted  in  [1] - follows mostly the ideas of Rick
0174 Richardson [2].  However, any changes in the initialization part  and  in  the
0175 printing  of  the  result have no impact on performance measurement since they
0176 are outside the measurement loop.  As a concession to older compilers,   names
0177 have been made unique within the first 8 characters for the C version.
0178 
0179 The original publication of Dhrystone did not contain any statements for  time
0180 measurement  since  they  are necessarily system-dependent. However, it turned
0181 out that it is not enough just to enclose the main procedure of Dhrystone in a
0182 loop  and  to  measure the execution time.  If the variables that are computed
0183 are not used somehow, there is the danger that the compiler considers them  as
0184 "dead  variables" and suppresses code generation for a part of the statements.
0185 Therefore in version 2 all variables of "main" are printed at the end  of  the
0186 program.  This also permits some plausibility control for correct execution of
0187 the benchmark.
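
The effect of printing the results can be illustrated with a minimal sketch. This is not the Dhrystone source: the function names and the loop body are hypothetical stand-ins. The value computed in the loop is returned and printed, so the compiler cannot classify it as a dead variable, and the printed value doubles as a plausibility check.

```c
#include <stdio.h>

/* Hypothetical stand-in for a benchmark body: accumulates a value
   that depends on every iteration of the measurement loop. */
long run_measurement_loop(long number_of_runs)
{
    long int_glob = 0;
    long run_index;

    for (run_index = 1; run_index <= number_of_runs; ++run_index) {
        int_glob += run_index % 7;   /* work whose result is kept */
    }
    return int_glob;
}

/* Printing the final value makes it "live"; the reader of the output
   can also compare it against the value expected for a correct run. */
void report(long number_of_runs)
{
    long result = run_measurement_loop(number_of_runs);
    printf("Int_Glob after %ld runs: %ld\n", number_of_runs, result);
}
```

Without the call to `printf`, a compiler would be free to discard the whole loop. Even with it, an aggressive optimizer may still simplify such a small loop, which is why inspecting the generated code remains advisable.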
0188 
0189 At several places in the benchmark, code has been added, but only in  branches
0190 that  are  not  executed. The intention is that optimizing compilers should be
0191 prevented from moving code out of the measurement loop, or from removing  code
0192 altogether.  Statements that are executed have been changed in very few places
0193 only.  In these cases, only the role of some operands has been changed, and it
0194 was   made  sure  that  the  numbers  defining  the  "Dhrystone  distribution"
0195 (distribution of statements, operand types and locality) still hold as much as
0196 possible.   Except for sophisticated optimizing compilers, execution times for
0197 version 2.1 should be the same as for previous versions.
0198 
0199 Because of the self-imposed limitation that the order and distribution of  the
0200 executed  statements  should  not  be  changed,  there  are  still cases where
0201 optimizing compilers may not generate code for some statements. To  a  certain
0202 degree,  this  is  unavoidable  for  small synthetic benchmarks.  Users of the
0203 benchmark are advised to check code listings to see whether code is generated for all
0204 statements of Dhrystone.
0205 
0206 Contrary to the suggestion in the published paper and its realization  in  the
0207 versions previously distributed, no attempt has been made to subtract the time
0208 for the measurement loop overhead. (This calculation has proven  difficult  to
0209 implement  in  a  correct  way,  and  its omission makes the program simpler.)
0210 However, since the loop check is now part of the benchmark, this does have  an
0211 impact  -  though a very minor one - on the distribution statistics which have
0212 been updated for this version.
0213 
0214 
0215 3.  Discussion of Individual Changes
0216 
0217 In this section, all changes are described that affect  the  measurement  loop
0218 and  that  are  not  just  renamings  of variables. All remarks refer to the C
0219 version; the other language versions have been updated similarly.
0220 
0221 In addition to adding  the  measurement  loop  and  the  printout  statements,
0222 changes have been made at the following places:
0223 
0224 o In procedure "main", three statements have been added  in  the  non-executed
0225   "then" part of the statement
0226 
0227         if (Enum_Loc == Func_1 (Ch_Index, 'C'))
0228 
0229   they are
0230 
0231         strcpy (Str_2_Loc, "DHRYSTONE PROGRAM, 3'RD STRING");
0232         Int_2_Loc = Run_Index;
0233         Int_Glob = Run_Index;
0234 
0235   The string assignment prevents  movement  of  the  preceding  assignment  to
0236   Str_2_Loc  (5'th  statement  of  "main")  out  of the measurement loop.  (This
0237   probably will not happen for the C version, but it did happen  with  another
0238   language   and  compiler.)   The  assignment  to  Int_2_Loc  prevents  value
0239   propagation for Int_2_Loc, and the assignment to Int_Glob makes the value of
0240   Int_Glob possibly dependent on the value of Run_Index.
0241 
0242 o In the three arithmetic computations at the end of the measurement  loop  in
0243   "main",  the  role  of  some  variables  has been exchanged, to prevent the
0244   division from just cancelling out the multiplication as it was  in  [1].   A
0245   very   smart  compiler  might  have  recognized  this  and  suppressed  code
0246   generation for the division.
0247 
0248 o For Proc_2, no code has been changed, but the values of the actual parameter
0249   have changed due to changes in "main".
0250 
0251 o In Proc_4, the second assignment has been changed from
0252 
0253         Bool_Loc = Bool_Loc | Bool_Glob;
0254 
0255   to
0256 
0257         Bool_Glob = Bool_Loc | Bool_Glob;
0258 
0259   It now assigns a value to a global variable  instead  of  a  local  variable
0260   (Bool_Loc);   Bool_Loc  would  be  a  "dead  variable"  which  is  not  used
0261   afterwards.
0262 
0263 o In Func_1, the statement
0264 
0265         Ch_1_Glob = Ch_1_Loc;
0266 
0267   was added in the non-executed "else" part of the "if" statement, to  prevent
0268   the suppression of code generation for the assignment to Ch_1_Loc.
0269 
0270 o In Func_2, the second character comparison statement has been changed to
0271 
0272         if (Ch_Loc == 'R')
0273 
0274   ('R' instead of 'X') because  a  comparison  with  'X'  is  implied  in  the
0275   preceding "if" statement.
0276 
0277   Also in Func_2, the statement
0278 
0279         Int_Glob = Int_Loc;
0280 
0281   has been added in the non-executed part of the last "if" statement, in order
0282   to prevent Int_Loc from becoming a dead variable.
0283 
0284 o In Func_3, a non-executed "else" part has been added to the "if"  statement.
0285   While  the  program  would  not be incorrect without this "else" part, it is
0286   considered bad programming practice if a function  can  be  left  without  a
0287   return value.
0288 
0289   To compensate for this change, the (non-executed) "else" part  in  the  "if"
0290   statement of Proc_3 was removed.
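
The recurring technique in the changes above, adding an assignment in a branch that is never executed, can be sketched as follows. The function `same_char` is a hypothetical illustration patterned on the description of the Func_1 change; it is not the Dhrystone source.

```c
/* Global that receives the local value in the rarely taken branch. */
char Ch_1_Glob = 'A';

/* Hypothetical function patterned on the Func_1 change described above. */
int same_char(char ch_1_par, char ch_2_par)
{
    char ch_1_loc = ch_1_par;      /* assignment we want code generated for */

    if (ch_1_loc != ch_2_par) {
        return 0;                  /* branch taken when the characters differ */
    } else {
        /* If the benchmark only ever passes different characters, this
           branch is never executed. The compiler cannot know that, so
           ch_1_loc is used on some path and is not a dead variable. */
        Ch_1_Glob = ch_1_loc;
        return 1;
    }
}
```

Because the added statement sits in a branch that the benchmark never takes, it does not change the distribution of executed statements.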
0291 
0292 The distribution statistics have been changed only  by  the  addition  of  the
0293 measurement loop iteration (1 additional statement, 4 additional local integer
0294 operands) and by the change in Proc_4  (one  operand  changed  from  local  to
0295 global).  The distribution statistics in the comment headers have been updated
0296 accordingly.
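
The concern behind the change to the arithmetic computations in "main" can be shown with a small sketch. Both functions and their parameter names are hypothetical and do not reproduce the benchmark's statements; they only contrast a division that cancels a preceding multiplication with one that does not.

```c
/* Multiplication followed by division by the same operand: for
   non-overflowing values with a != 0 the result is always b, so an
   optimizer may drop the division entirely. */
int cancelling(int a, int b)
{
    int product = a * b;
    return product / a;
}

/* Dividing by a third, independent operand: the division can no longer
   be folded away in general. */
int non_cancelling(int a, int b, int c)
{
    int product = a * b;
    return product / c;
}
```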
0297 
0298 
0299 4.  String Operations
0300 
0301 The string operations (string assignment and string comparison) have not  been
0302 changed, to keep the program consistent with the original version.
0303 
0304 There has been some concern that the string operations are over-represented in
0305 the  program,  and that execution time is dominated by these operations.  This
0306 was true in particular when optimizing compilers removed too much code in  the
0307 main part of the program; this should have been mitigated in version 2.
0308 
0309 It should be noted that this is a  language-dependent  issue:   Dhrystone  was
0310 first  published  in  Ada, and with Ada or Pascal semantics, the time spent in
0311 the string operations is,  at  least  in  all  implementations  known  to  me,
0312 considerably smaller.  In Ada and Pascal, assignment and comparison of strings
0313 are operators defined in the language, and the upper  bounds  of  the  strings
0314 occurring in  Dhrystone  are part of the type information known at compilation
0315 time.  The compilers can therefore generate  efficient  inline  code.   In  C,
0316 string  assignment and comparisons are not part of the language, so the string
0317 operations must be expressed in terms of the C library functions "strcpy"  and
0318 "strcmp".   (ANSI  C  allows  an  implementation  to use inline code for these
0319 functions.)  In addition to the overhead caused by additional function  calls,
0320 these  functions  are  defined for null-terminated strings where the length of
0321 the strings is not known at compilation time; the function has to check  every
0322 byte for the termination condition (the null byte).
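
The difference can be sketched in C itself. The type name `Str_30` and the length of 30 characters are assumptions made for this illustration. `compare_terminated` mimics what a general "strcmp" has to do; `compare_fixed` mimics what a compiler can do when the length is part of the type, as in Ada or Pascal.

```c
#include <string.h>

#define STR_LEN 30                      /* assumed fixed string length */
typedef char Str_30[STR_LEN + 1];       /* hypothetical type: 30 chars + '\0' */

/* C style: the length is unknown, so every byte is tested for the
   terminating null byte in addition to being compared. */
int compare_terminated(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        ++a;
        ++b;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Ada/Pascal style: the length is known at compilation time, so a
   fixed-size block comparison without terminator checks suffices. */
int compare_fixed(const Str_30 a, const Str_30 b)
{
    return memcmp(a, b, STR_LEN);
}
```

Both functions return a negative, zero or positive value with the same meaning as "strcmp"; only the amount of work per character differs.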
0323 
0324 Obviously, a C library which includes efficiently coded "strcpy" and  "strcmp"
0325 functions  helps to obtain good Dhrystone results. However, I don't think that
0326 this is unfair since string  functions  do  occur  quite  frequently  in  real
0327 programs  (editors, command interpreters, etc.).  If the string functions  are
0328 implemented efficiently,  this  helps  real  programs  as  well  as  benchmark
0329 programs.
0330 
0331 I admit that the  string  comparison  in  Dhrystone  terminates  later  (after
0332 scanning  20  characters)  than most string comparisons in real programs.  For
0333 consistency with the original benchmark, I didn't change the  program  despite
0334 this weakness.
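
How late the comparison terminates can be checked with a short helper. The helper `chars_scanned` is hypothetical, and the two strings in the usage note are assumed to follow the pattern of the "3'RD STRING" quoted in section 3.

```c
#include <stddef.h>

/* Returns the number of characters a byte-wise comparison has to inspect
   before it can decide: the 1-based position of the first mismatch, or
   of the terminator if the strings are equal. */
size_t chars_scanned(const char *a, const char *b)
{
    size_t i = 0;

    while (a[i] != '\0' && a[i] == b[i]) {
        ++i;
    }
    return i + 1;
}
```

For "DHRYSTONE PROGRAM, 1'ST STRING" and "DHRYSTONE PROGRAM, 2'ND STRING" the first 19 characters are identical, so the helper returns 20, which agrees with the figure given above.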
0335 
0336 
0337 5.  Intended Use of Dhrystone
0338 
0339 When Dhrystone is used, the following "ground rules" apply:
0340 
0341 o Separate compilation (Ada and C versions)
0342 
0343   As mentioned in [1], Dhrystone was written  to  reflect  actual  programming
0344   practice  in  systems  programming.   The  division into several compilation
0345   units (5 in the Ada version, 2 in the C version)  is  intended,  as  is  the
0346   distribution of inter-module and intra-module subprogram calls.  Although on
0347   many systems there will be no difference in execution time  to  a  Dhrystone
0348   version  where  all  compilation units are merged into one file, the rule is
0349   that separate compilation should  be  used.   The  intention  is  that  real
0350   programming  practice,  where  programs  consist  of  several  independently
0351   compiled units, should  be  reflected.   This  also  implies  that   the
0352   compiler,  while  compiling  one  unit,  has no information about the use of
0353   variables, register allocation etc.  occurring in  other  compilation  units.
0354   Although  in  real  life  compilation  units  will  probably  be larger, the
0355   intention is that these effects  of  separate  compilation  are  modeled  in
0356   Dhrystone.
0357 
0358   A few language systems have post-linkage optimization available (e.g., final
0359   register allocation is performed after linkage).  This is a borderline case:
0360   Post-linkage  optimization  involves  additional  program  preparation  time
0361   (although  not  as  much  as  compilation in one unit) which may prevent its
0362   general use in practical programming.  I think that  since  it  defeats  the
0363   intentions given above, it should not be used for Dhrystone.
0364 
0365   Unfortunately, ISO/ANSI  Pascal  does  not  contain  language  features  for
0366   separate  compilation.   Although  most  commercial Pascal compilers provide
0367   separate compilation in some way, we cannot use it for Dhrystone since  such
0368   a  version  would  not  be portable.  Therefore, no attempt has been made to
0369   provide a Pascal version with several compilation units.
0370 
0371 o No procedure merging
0372 
0373   Although Dhrystone contains some very short procedures where execution would
0374   benefit  from  procedure  merging (inlining, macro expansion of procedures),
0375   procedure merging is not to be used.  The reason is that the  percentage  of
0376   procedure  and  function  calls  is  part of the "Dhrystone distribution" of
0377   statements contained in [1].  This restriction does not hold for the  string
0378   functions  of  the  C  version  since ANSI C allows an implementation to use
0379   inline code for these functions.
0380 
0381 o Other optimizations are allowed, but they should be indicated
0382 
0383   It is often hard to draw an exact line between "normal code generation"  and
0384   "optimization"  in  compilers:  Some compilers perform operations by default
0385   that are invoked in other compilers only  when  optimization  is  explicitly
0386   requested.  Also, we cannot avoid that in benchmarking people try to achieve
0387   results that look as good as possible.  Therefore,  optimizations  performed
0388   by  compilers  -  other  than  those  listed  above - are not forbidden when
0389   Dhrystone execution times are measured.  Dhrystone is  not  intended  to  be
0390   non-optimizable  but  is  intended  to  be  about as optimizable  as  normal
0391   programs.   For  example,  there  are  several  places  in  Dhrystone  where
0392   performance   benefits   from   optimizations   like   common  subexpression
0393   elimination, value  propagation  etc.,  but  normal  programs  usually  also
0394   benefit  from  these  optimizations.   Therefore,  no  effort  was  made  to
0395   artificially  prevent  such  optimizations.   However,  measurement  reports
0396   should  indicate  which  compiler  optimization  levels  have been used, and
0397   reporting results with different levels of  compiler  optimization  for  the
0398   same hardware is encouraged.
0399 
0400 o Default results are those without "register" declarations (C version)
0401 
0402   When Dhrystone results are quoted  without  additional  qualification,  they
0403   should  be  understood  as  results  obtained  without use of the "register"
0404   attribute. Good compilers should be able to make good use of registers  even
0405   without explicit register declarations ([3], p. 193).
0406 
0407 Of course, for experimental  purposes,  post-linkage  optimization,  procedure
0408 merging and/or compilation in one unit can be done to determine their effects.
0409 However,  Dhrystone  numbers  obtained  under  these  conditions   should   be
0410 explicitly  marked as such; "normal" Dhrystone results should be understood as
0411 results obtained following the ground rules listed above.
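
One way to support the rule that measurement reports should indicate the optimization level is to let the program print how it was built. The sketch below is an assumption-laden illustration, not part of Dhrystone: it relies on the predefined macros `__VERSION__` and `__OPTIMIZE__`, which GCC and compatible compilers provide; other compilers may need different checks, and the function names are hypothetical.

```c
#include <stdio.h>

/* Returns 1 if the compiler defines __OPTIMIZE__ (GCC and compatible
   compilers do so when optimization is enabled), 0 otherwise. */
int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Prints the build information that should accompany a result. */
void print_build_report(void)
{
#ifdef __VERSION__
    printf("Compiler version: %s\n", __VERSION__);
#else
    printf("Compiler version: unknown\n");
#endif
    printf("Optimization:     %s\n",
           built_with_optimization() ? "enabled" : "not detected");
}
```

Such a report does not replace a statement of the exact compiler options used; it only reduces the risk that a result is quoted without them.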
0412 
0413 In any case, for serious performance evaluation, users are advised to ask  for
0414 code  listings  and  to  check  them carefully.  In this way, when results for
0415 different systems are  compared,  the  reader  can  get  a  feeling for how much
0416 performance  difference is due to compiler optimization and how much is due to
0417 hardware speed.
0418 
0419 
0420 6.  Acknowledgements
0421 
0422 The C version 2.1 of Dhrystone has been developed  in  cooperation  with  Rick
0423 Richardson  (Tinton  Falls,  NJ); it incorporates many ideas from the "Version
0424 1.1" distributed previously by him over the UNIX network Usenet.  Through  his
0425 activity with Usenet, Rick Richardson has made a very valuable contribution to
0426 the dissemination of the benchmark.  I also thank  Chaim  Benedelac  (National
0427 Semiconductor),  David Ditzel (SUN), Earl Killian and John Mashey (MIPS), Alan
0428 Smith and Rafael  Saavedra-Barrera  (UC  at  Berkeley)  for  their  help  with
0429 comments on earlier versions of the benchmark.
0430 
0431 
0432 7.  Bibliography
0433 
0434 [1]
0435    Reinhold P. Weicker: Dhrystone: A Synthetic Systems Programming Benchmark.
0436    Communications of the ACM 27, 10 (Oct. 1984), 1013-1030
0437 
0438 [2]
0439    Rick Richardson: Dhrystone 1.1 Benchmark Summary (and Program Text)
0440    Informal Distribution via "Usenet", Last Version Known  to  me:  Sept.  21,
0441    1987
0442 
0443 [3]
0444    Brian W. Kernighan and Dennis M. Ritchie:  The C Programming Language.
0445    Prentice-Hall, Englewood Cliffs (NJ) 1978
0446 
0447 
0448 Variations
0449 ==========
0450             Understanding Variations in Dhrystone Performance
0451 
0452 
0453 
0454           By Reinhold P. Weicker, Siemens AG, AUT E 51, Erlangen
0455 
0456 
0457 
0458                                 April 1989
0459 
0460 
0461                       This article has appeared in:
0462 
0463 
0464         Microprocessor Report, May 1989 (Editor: M. Slater), pp. 16-17
0465 
0466 
0467 
0468 
0469 Microprocessor manufacturers tend to credit all the  performance  measured  by
0470 benchmarks to the speed of their processors; they often don't even mention the
0471 programming language and compiler used. In their detailed  documents,  usually
0472 called  "performance brief" or "performance report," they do give more
0473 details. However, these details are often lost in the press releases and other
0474 marketing  statements.  For serious performance evaluation, it is necessary to
0475 study the code generated by the various compilers.
0476 
0477 Dhrystone was originally published in Ada (Communications  of  the  ACM,  Oct.
0478 1984).  However, since good Ada compilers were rare at this time and, together
0479 with UNIX, C became more and more popular, the C version of Dhrystone  is  the
0480 one  now  mainly  used in industry. There are "official" versions 2.1 for Ada,
0481 Pascal, and C,  which  are  as  close  together  as  the  languages'  semantic
0482 differences permit.
0483 
0484 Dhrystone contains two statements  where  the  programming  language  and  its
0485 translation play a major part in the execution time measured by the benchmark:
0486 
0487   o   String assignment (in procedure Proc_0 / main)
0488   o   String comparison (in function Func_2)
0489 
0490 In Ada and Pascal, strings are arrays of characters where the  length  of  the
0491 string  is  part  of the type information known at compile time. In C, strings
0492 are also arrays of characters, but there  are  no  operators  defined  in  the
0493 language  for  assignment  and  comparison  of  strings.   Instead,  functions
0494 "strcpy" and "strcmp" are used. These functions are  defined  for  strings  of
0495 arbitrary  length, and make use of the fact that strings in C have to end with
0496 a terminating null byte. For general-purpose calls  to  these  functions,  the
0497 implementor  can  assume  nothing  about  the  length and the alignment of the
0498 strings involved.
0499 
0500 The C version of Dhrystone spends a relatively large amount of time  in  these
0501 two  functions.  Some  time  ago, I made measurements on a VAX 11/785 with the
0502 Berkeley UNIX (4.2) compilers (often-used compilers,  but  certainly  not  the
0503 most  advanced).  In  the  C  version, 23% of the time was spent in the string
0504 functions; in the Pascal version, only 10%. On good RISC machines (where  less
0505 time is spent in the procedure calling sequence than on a VAX) and with better
0506 optimizing compilers, the percentage is higher; MIPS has reported 34%  for  an
0507 R3000.   Because  of this effect, Pascal and Ada Dhrystone results are usually
0508 better than C results (except when the optimization quality of the C  compiler
0509 is considerably better than that of the other compilers).
0510 
0511 Several people have noted that the string operations are  over-represented  in
0512 Dhrystone,  mainly  because the strings occurring in Dhrystone are longer than
0513 average strings. I admit that this is true, and have said  so  in  my  SIGPLAN
0514 Notices  paper  (Aug.  1988);  however, I didn't want to generate confusion by
0515 changing the string lengths from version 1 to version 2.
0516 
0517 Even if they are somewhat over-represented in Dhrystone, string operations are
0518 frequent  enough  that  it makes sense to implement them in the most efficient
0519 way possible, not only for benchmarking purposes.  This means  that  they  can
0520 and should be written in assembly language code. ANSI C also explicitly allows
0521 the string functions to be implemented as macros, i.e. by inline code.
0522 
0523 There is also a third way to speed up the "strcpy" statement in Dhrystone: For
0524 this  particular  "strcpy" statement, the source of the assignment is a string
0525 constant. Therefore, in contrast to calls to "strcpy" in the general case, the
0526 compiler  knows  the  length  and alignment of the strings involved at compile
0527 time and can generate code in the same efficient  way  as  a  Pascal  compiler
0528 (word instructions instead of byte instructions).
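
A minimal sketch of this third way, written by hand rather than generated by a compiler: the type name `Str_30` and both function names are hypothetical. When the source is a string constant, `sizeof` yields its length at compile time, so a fixed-size copy can replace the search for the terminator.

```c
#include <string.h>

typedef char Str_30[31];    /* hypothetical type: 30 characters + '\0' */

/* General case: strcpy has to look for the terminator at run time. */
void assign_general(char *dst, const char *src)
{
    strcpy(dst, src);
}

/* Constant source: the length (including the terminator) is known at
   compile time, so a fixed-size block copy can be used instead. */
void assign_constant(Str_30 dst)
{
    static const char source[] = "DHRYSTONE PROGRAM, 2'ND STRING";

    memcpy(dst, source, sizeof source);
}
```

A compiler that recognizes the constant source can perform the same transformation on the "strcpy" call itself and then copy whole words.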
0529 
0530 This is not allowed in the case of the "strcmp" call: Here, the addresses  are
0531 formal  procedure  parameters, and no assumptions can be made about the length
0532 or alignment of the strings.  Any such assumptions would indicate an incorrect
0533 implementation.  They  might work for Dhrystone, where the strings are in fact
0534 word-aligned  with  typical  compilers,  but  other  programs  would   deliver
0535 incorrect results.
0536 
0537 So, for an apples-to-apples comparison between  processors,  and  not  between
0538 several  possible  (legal  or  illegal)  degrees of compiler optimization, one
0539 should check that the systems are comparable with  respect  to  the  following
0540 three points:
0541 
0542   (1) String functions in assembly language vs. in C
0543 
0544       Frequently used functions such as the string functions can and should be
0545       written  in  assembly language, and all serious C language systems known
0546       to me do this. (I list this point  for  completeness  only.)  Note  that
0547       processors  with an instruction that checks a word for a null byte (such
0548       as AMD's  29000  and  Intel's  80960)  have  an  advantage  here.  (This
0549       advantage  decreases  relatively if optimization (3) is applied.) Due to
0550       the length of the strings involved in Dhrystone, this advantage  may  be
0551       considered  too  high  in  perspective, but it is certainly legal to use
0552       such instructions - after all,  these  situations  are  what  they  were
0553       invented for.
0554 
0555   (2) String function code inline vs. as library functions.
0556 
0557       ANSI  C  has  created  a  new  situation,  compared   with   the   older
0558       Kernighan/Ritchie  C.  In  the  original C, the definition of the string
0559       functions was not part of the  language.  Now  it  is,  and  inlining  is
0560       explicitly  allowed.  I  probably  should have stated more clearly in my
0561       SIGPLAN  Notices  paper  that  the  rule  "No  procedure  inlining   for
0562       Dhrystone"  referred  to  the  user level procedures only and not to the
0563       library routines.
0564 
0565   (3) Fixed-length and alignment assumptions for the strings
0566 
0567       Compilers should be allowed to optimize in these cases if (and only  if)
0568       it  is safe to do so. For Dhrystone, this is the "strcpy" statement, but
0569       not the  "strcmp"  statement  (unless,  of  course,  the  "strcmp"  code
0570       explicitly   checks   the  alignment  at  execution  time  and  branches
0571       accordingly).  A "Dhrystone switch" for the  compiler  that  causes  the
0572       generation  of  code  that  may  not work under certain circumstances is
0573       certainly inappropriate for comparisons. It has been reported in  Usenet
0574       that some C compilers provide such a compiler option; since I don't have
0575       access to all C compilers involved, I cannot verify this.
0576 
0577       If the fixed-length and word-alignment assumption can be  used,  a  wide
0578       bus  that permits fast multi-word load instructions certainly does help;
0579       however, this fact by itself should not make a really big difference.
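
The benefit of an instruction that checks a word for a null byte, mentioned under point (1), can be approximated in portable C with a well-known bit trick. This is a software analogue for illustration only; the function name is hypothetical, and the sketch assumes 32-bit words.

```c
#include <stdint.h>

/* Returns 1 if any of the four bytes of w is zero, 0 otherwise.
   Subtracting 1 from each byte sets its top bit only if the byte was
   zero or already had its top bit set; "& ~w" removes the latter case. */
int has_null_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}
```

A string routine built on such a test examines four characters per step instead of one, but it may only be applied where reading whole words is known to be safe, which is the alignment question discussed under point (3).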
0580 
0581 A check of  these  points  -  something  that  is  necessary  for  a  thorough
0582 evaluation  and  comparison  of  the  Dhrystone  performance claims - requires
0583 object code listings as well as listings for  the  string  functions  (strcpy,
0584 strcmp) that are possibly called by the program.
0585 
0586 I don't pretend that Dhrystone is  a  perfect  tool  to  measure  the  integer
0587 performance  of microprocessors. The more it is used and discussed, the more I
0588 myself learn about aspects that I hadn't noticed yet when I wrote the program.
0589 And  of  course,  the  very success of a benchmark program is a danger in that
0590 people may tune their compilers and/or hardware to it, and  with  this  action
0591 make it less useful.
0592 
0593 Whetstone and Linpack have their critical points also:  The  Whetstone  rating
0594 depends  heavily on the speed of the mathematical functions (sine, sqrt, ...),
0595 and Linpack is sensitive to data alignment for some cache configurations.
0596 
0597 Introduction of a standard set of public domain benchmark software  (something
0598 the  SPEC  effort attempts) is certainly a worthwhile thing.  In the meantime,
0599 people will continue to use whatever is available and widely distributed,  and
0600 Dhrystone  ratings  are probably still better than MIPS ratings if these are -
0601 as often in industry - based on  no  reproducible  derivation.   However,  any
0602 serious  performance  evaluation  requires  more than just a comparison of raw
0603 numbers; one has to make sure  that  the  numbers  have  been  obtained  in  a
0604 comparable way.