Computer Processing Capability


A Report on Computer Processing Capability for the Magnetohydrodynamic Simulation Model
Tatsuki Ogino
Solar-Terrestrial Environment Laboratory, Nagoya University
Honohara 3-13, Toyokawa, Aichi 442, Japan

1.  Introduction

   In computer simulations of space plasma phenomena, it is quite important
to maintain high numerical accuracy and fine spatial and temporal resolution.
To do so, we must extract the maximum performance from the computers that
execute the simulation codes, as well as use numerical algorithms of higher
accuracy.
 
  For example, in the global simulation of the earth's magnetosphere, we want
to keep the outer boundaries far from the earth to avoid troublesome boundary
effects, and also to resolve what is happening in the narrow regions of the
bow shock, magnetopause and plasma sheet. Therefore, we need to increase the
number of grid points as much as possible, which automatically increases the
computer memory and computation time. If the grid intervals are halved in a
3-dimensional simulation box of the same total length, the simulation code
usually needs 8 times the computer memory and 16 times the computation time.
This is the essential reason why we need a supercomputer with higher speed
and greater memory.
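The scaling above can be checked with a short calculation (a sketch in
Python; the extra factor of 2 in computation time comes from the time step,
which must also be halved when the grid interval is halved):

```python
# Halving the grid interval in a 3-dimensional box of fixed total length
# doubles the number of grid points in each direction, so memory grows by
# 2**3 = 8.  The stable time step also shrinks by the same factor of 2,
# so reaching the same physical time costs 2**3 * 2 = 16 times the work.
def refinement_cost(factor):
    """Memory and computation-time multipliers for a 3-D grid refinement."""
    memory = factor ** 3             # grid points per dimension, cubed
    compute = factor ** 3 * factor   # grid points times extra time steps
    return memory, compute

memory, compute = refinement_cost(2)
print(memory, compute)  # -> 8 16
```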

  It is not easy in general to evaluate computer performance, because the
results depend strongly on the nature of the program itself and on the
conditions of execution. On the other hand, it is very difficult to estimate
how long our particular simulation codes will take to run from catalog
specifications of computer performance alone. Thus we often want a rough
evaluation, or an example of the practical computation performance, on
different kinds of computers. For this purpose, we have executed some test
programs on many kinds of computers and have used the results as a guide in
developing the magnetohydrodynamic (MHD) simulation codes.
        
        
2.  Comparison of Computer Processing Capability

  We have had good opportunities to use several kinds of computers. In these
trials we executed test program runs to evaluate computer performance on the
fundamental arithmetic calculations and on 2- and 3-dimensional MHD
simulation codes. Tables 1 and 2 show the resulting comparisons of computer
processing capability. Table 1 shows the simple average over the four
fundamental arithmetic calculations, addition, subtraction, multiplication
and division, as a measure of basic processing capability. The unit is
millions of floating-point operations per second (MFLOPS), and the compiler
option giving the maximum performance was adopted unless a particular
compiler option is noted.
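A minimal sketch of such a test program (hypothetical, written in Python
rather than the Fortran used for the measurements in Table 1) times each of
the four operations over an array and averages the resulting MFLOPS figures:

```python
import time

N = 10_000    # array length, as in the tests described for Table 1
REPS = 100    # repetitions so the elapsed time is measurable

a = [1.000001] * N
b = [0.999999] * N

def mflops(op):
    """Time REPS elementwise passes of op over a and b; return MFLOPS."""
    start = time.perf_counter()
    for _ in range(REPS):
        for x, y in zip(a, b):
            op(x, y)
    elapsed = time.perf_counter() - start
    return N * REPS / elapsed / 1e6

results = {
    "add": mflops(lambda x, y: x + y),
    "sub": mflops(lambda x, y: x - y),
    "mul": mflops(lambda x, y: x * y),
    "div": mflops(lambda x, y: x / y),
}
# The single figure quoted in Table 1 is the simple average of the four.
average = sum(results.values()) / 4
```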

  Table 2 shows the execution times for a single time step advance of 2- and
3-dimensional MHD simulation codes. The 3-dimensional global MHD simulation
codes of the interaction between the solar wind and the earth's magnetosphere
were used for (a) the dawn-dusk asymmetry model with 50x50x26 grid points
[Ogino et al., 1985, 1986a] and (b) the dawn-dusk symmetry model with
62x32x32 grid points [Ogino, 1986], and a 2-dimensional MHD simulation code
of the interaction between the solar wind and the cometary plasma was used
for (c) with 402x102 grid points [Ogino et al., 1986b], including boundary
points. Since the three MHD codes were originally developed to execute
efficiently on the CRAY-1 supercomputer, the program size is not large,
requiring less than 1 MW of memory. We then applied the MHD codes to other
computers, after modifying the original codes to get good performance while
keeping the number of grid points fixed. One essential difference between the
two 3-dimensional MHD models (a) and (b) is the length of the "do loops" in
the programs. In model (a), the long "do loop" was separated into several
small "do loops" so that the CRAY-1 compiler could vectorize all of them,
because the length of a vectorizable "do loop" is limited on CRAY computers.
On the other hand, a minimum number of long "do loops" usually gives better
processing performance, and model (b) corresponds to that case.
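The structural difference between models (a) and (b) can be sketched
schematically (in Python here; the original codes are Fortran, and the arrays
and coefficients below are hypothetical stand-ins for the MHD variables):

```python
n = 1000
rho = [1.0] * n; vx = [0.5] * n; p = [0.2] * n  # hypothetical MHD variables
dt = 0.1

# Model (b) style: one long loop body updating every variable.  This
# minimizes loop overhead, but a body this long may exceed what an early
# vectorizing compiler such as the CRAY-1's can handle.
new_rho = [0.0] * n; new_vx = [0.0] * n; new_p = [0.0] * n
for i in range(n):
    new_rho[i] = rho[i] - dt * vx[i]
    new_vx[i]  = vx[i]  - dt * p[i]
    new_p[i]   = p[i]   - dt * vx[i]

# Model (a) style: the same work separated into several short loops, each
# of which the compiler can vectorize fully.
rho_a = [0.0] * n; vx_a = [0.0] * n; p_a = [0.0] * n
for i in range(n):
    rho_a[i] = rho[i] - dt * vx[i]
for i in range(n):
    vx_a[i] = vx[i] - dt * p[i]
for i in range(n):
    p_a[i] = p[i] - dt * vx[i]
```

Both forms compute identical results; only the loop structure, and hence what
the compiler can vectorize, differs.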

  Table 1 demonstrates the average values for the four fundamental arithmetic
calculations and gives a rough evaluation of computer processing capability;
arrays of length 10,000 were used in the calculations. For the supercomputers
we obtained average values for both the vector and the scalar compiler
options, where all the vector "do loops" were confirmed to be fully
vectorized. The ratio of the vector result to the scalar result can be
understood as the maximum capability of practical vectorization on the
supercomputers. This vectorization or acceleration ratio of the
supercomputers is in the range of 10 to 100 times, and it may serve as a
guide in developing and executing the simulation codes.
         
  Table 1 shows only the average values for the four arithmetic calculations;
the individual values for addition, subtraction, multiplication and division
are not equal. In most cases the processing capabilities for addition,
subtraction and multiplication are almost the same, whereas that for division
is considerably lower, about a quarter of the other values. This is true for
the vector compilers of the supercomputers, so it should be noted that
division has the worst efficiency of the four arithmetic calculations.
Therefore, we should decrease the number of divisions in each "do loop" of
the simulation codes if we want higher efficiency.
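This recommendation can be illustrated with a hypothetical loop: repeated
divisions by the same quantity, here a density-like array, are replaced by a
single division and multiplications by the reciprocal:

```python
n = 1000
rho = [float(i + 1) for i in range(n)]          # hypothetical density
mx = [2.0] * n; my = [3.0] * n; mz = [4.0] * n  # hypothetical momenta

# Straightforward form: three divisions per iteration.
slow = [(mx[i] / rho[i], my[i] / rho[i], mz[i] / rho[i]) for i in range(n)]

# Division-reduced form: one division, then three multiplications.
fast = []
for i in range(n):
    inv = 1.0 / rho[i]          # the only division in the loop body
    fast.append((mx[i] * inv, my[i] * inv, mz[i] * inv))
```

The two forms can differ in the last bit of rounding, but on the vector
machines discussed here the division-reduced loop runs substantially faster.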

  It is surprising that new-generation supercomputers such as the NEC SX-3
and Fujitsu VP-2600 show quite high performance, exceeding 1 GFLOPS. Although
the processing capability of workstations has become much higher recently,
their practical computation speed is still about 1 to 10 MFLOPS, less than a
hundredth of the fastest supercomputer speed. Therefore, we must depend on
supercomputers when we carry out a large simulation code. Table 1 also shows
the performance of a massively parallel processor, the Matsusita ADENART,
which is almost equivalent to that of a vector-type supercomputer like the
Fujitsu VP-200.

  Strictly speaking, the processing capability on the four fundamental
arithmetic calculations does not directly reflect that on complete simulation
codes, because a complete program is composed of many kinds of calculations
and processing. The processing capability therefore depends strongly on the
character of each complete program. Table 2 shows an example of computer
processing capability for the three types of global MHD simulation codes; the
computation times corresponding to a single time step advance of the MHD
codes are given in seconds. The new-generation supercomputers such as the NEC
SX-3 and Fujitsu VP-2600 again give excellent results on the global MHD
simulation codes. In our tests with the MHD simulation codes, three kinds of
supercomputers, the Fujitsu VP-200, NEC SX-2 and CRAY-YMP-864, and the
massively parallel processor, the Matsusita ADENART, give almost comparable
performance. It is noted that the CRAY-2 did not give good values, and that
the CRAY-XMP and CRAY-YMP did not show good performance for model (b). In
those cases full vectorization was not achieved by the compiler, either
because we could not work out how to vectorize the code, or because some "do
loops" were too long for vectorization on the CRAY computers.
  
  Moreover, it can be noted that the recent supercomputers give about 10 to
20 times the computation performance of the first supercomputer, the CRAY-1.
At the same time, we can also use a large amount of computer memory, from 300
MB to 1 GB, in the simulations, which permits us to handle large numbers of
grid points, much greater than 100x100x100 in 3-dimensional and 1000x1000 in
2-dimensional MHD simulation codes. As a result, we can confidently expect to
obtain many physically meaningful results from computer simulations in the
STEP interval.
        
        
3.  Summary
  
  We have demonstrated comparisons of computer processing capability for the
fundamental arithmetic calculations and for three kinds of complete MHD
simulation codes. Almost all of the computations behind these results were in
fact executed by ourselves. The tables demonstrating computer performance are
of course particular examples and do not represent the general performance of
the computers. However, they may give useful guidance when we develop new
simulation codes and execute them on particular computers.

  In the 3-dimensional global MHD simulation of the interaction between the
solar wind and the earth's magnetosphere, we particularly need higher
calculation speed and larger computer memory. Since such high computer
performance has been achieved quickly, we will be able to study the dynamics
of the earth's magnetosphere in more detail, for comparison with theories and
observations, in the STEP interval.

  I would like to express my gratitude to the many computer centers where I
had opportunities to execute the test programs, and also to their staff.


References

Ogino, T., A three dimensional MHD simulation of the interaction of the  solar 
wind with the earth's magnetosphere: The generation of field-aligned currents, 
J. Geophys. Res., 91, A6, 6791, 1986.

Ogino, T., R.J. Walker, M. Ashour-Abdalla, and J.M. Dawson, An MHD simulation
of By-dependent magnetospheric convection and field-aligned currents during
northward IMF, J. Geophys. Res., 90, 10,835, 1985.

Ogino, T., R.J. Walker, M. Ashour-Abdalla, and J.M. Dawson, An MHD simulation
of the effects of the interplanetary magnetic field By component on the
interaction of the solar wind with the earth's magnetosphere during southward
interplanetary magnetic field, J. Geophys. Res., 91, 10,029, 1986a.

Ogino, T., R.J. Walker, and M. Ashour-Abdalla, An MHD simulation of the inter-
action  of  the solar wind with the outflowing plasma from a  comet,  Geophys. 
Res. Lett., 13, 929, 1986b.
 

Table 1.  A comparison of the processing capability of computers. A test
program executing the four fundamental arithmetic calculations, addition,
subtraction, multiplication and division, was used to evaluate the computer
processing capability. The unit is millions of floating-point operations per
second (MFLOPS), and the compiler option giving the maximum performance was
adopted unless a compiler option is given. In the table, IAP means that the
inner array processor was used and NIAP (or NOIAP) means that it was not.

   ---------------------------------------------------------------------
   computer             compiler option       processing capability (MFLOPS)
   ---------------------------------------------------------------------
   NEC ACOS-650            Fortran                    0.41
   NEC ACOS-850            NIAP                       1.09
   NEC ACOS-850            IAP                        2.47
   NEC ACOS-930            NIAP, OPT=1                1.04
   NEC ACOS-930            IAP, opt=3                 6.87
   NEC S-2000              NIAP                       7.37
   NEC S-2000              IAP                       13.58
   NEC SX-2A               vector                   196.3
   NEC SX-2                scalar                     7.74
   NEC SX-2                vector                   247.4
   NEC SX-3/14             scalar                    10.78
   NEC SX-3/14             vector                   583.1
   NEC SX-3                vector                 1,406.9
   Fujitsu M-200                                      1.54
   Fujitsu M-380                                      3.64
   Fujitsu M-780/20                                   8.31
   Fujitsu M-780/30        FORT77 O2                 14.16
   Fujitsu M-780/30        FORT77EX O3               18.07
   Fujitsu M-1800          FORT77                    18.46
   Fujitsu VP-100          scalar                     4.17
   Fujitsu VP-100          vector                    94.91
   Fujitsu VP-200          scalar                     3.20
   Fujitsu VP-200          vector                   225.0
   Fujitsu VP-400          vector                   262.8
   Fujitsu VP-2600         FORT77EX O3            1,238.4
   Fujitsu VPP-500 (1PE)   frtpx, -sc                19.78
   Fujitsu VPP-500 (1PE)   frtpx                    730.3
   Fujitsu VPP-5000 (1PE)  frt, -sc                 189.78    (1999.12.27)
   Fujitsu VPP-5000 (1PE)  frt                    3,073.7     (1999.12.27)   
   Hitachi M-680                                      8.76
   Hitachi M-680D          NOIAP, OPT=0               1.23
   Hitachi M-680D          NOIAP, OPT=3               4.37
   Hitachi M-680D          IAP, OPT=3                49.12
   Hitachi M-680H          NOIAP                      6.65
   Hitachi M-680H          IAP                       44.54
   Hitachi S810/10         scalar                     3.96
   Hitachi S810/10         vector                    51.94
   Hitachi S820            vector                   358.9
   Hitachi S820/80         vector                   497.4
   Hitachi S3800/480       vector                   820.7
   VAX 8600                                           0.507
   IBM-3090                Level(0)                   4.03
   IBM-3090                Level(1)                   8.52
   IBM-3090                Level(2)                   8.28
   CRAY-XMP-48             CFT114i off=v              3.39
   CRAY-XMP-48             CFT114i                   36.00
   CRAY-2                  CIVIC                     29.51
   CRAY-YMP-864            -o off                     1.45
   CRAY-YMP-864            -o novector               11.09
   CRAY-YMP-864            -o full                  116.36
   SCS40                   SCSFT o=h                  0.695
   SCS40                   SCSFT vector               8.30
   Asahi Stellar GS-1000   O1 (scalar)                0.538
     (version 1.6)         O2 (vector)                4.67
                           O3 (parallel)             10.77
   NEC EWS-4800/20                                    0.188
   NEC EWS-4800/50                                    0.112
   NEC EWS-4800/210        f77 -O                     1.665
   NEC EWS-4800/220        f77 -O                     1.812
   NEC EWS-4800/260        f77 -O                     2.009
   NEC EWS-4800/350        f77 -O                     4.929
   NEC EWS-4800/360        f77 -O                     5.016
   MicroVAX-3400                                      0.419
   Sun SPARC Station 1     f77 -O                     0.961
   Sun SPARC Station 2     f77 -O                     2.188
   Sun SPARC IPX           f77 -O                     1.646
   Sun SPARC 2 (AS4075)    f77 -O                     1.400
   Sun SPARC Station 10    f77 -O                     2.999
   Sun SPARC S-4/5         f77 -O                     4.217
   Sun SPARC S-4/CL4       f77 -O                     4.419
   Sun SPARC S-4/20H       f77 -O                    18.67
   Sun SPARC S-4/20H(stcpu1) f77 -O                  18.78    (1998.04.17)    
   Sun SPARC S-4/20H(stcpu1) f90 -O                  30.75    (1998.04.17)
   Sun S-7/300U            f77 -O                    20.18
   Sun Ultra 2   (162MHz)  f77 -O                    24.04    (1998.04.09)
   Sun Ultra 2   (162MHz)  f90 -O                    13.56    (1998.04.09)
   Sun Ultra 2   (162MHz)  frt -O (Fujitsu f90)      23.46    (1998.04.18)
   Sun S-7/7000U (296MHz)  f77 -O                    38.19    (1998.04.07)
   Sun S-7/7000U (296MHz)  f90 -O                    23.12    (1998.04.07)
   Sun S-7/7000U 350       f77 -O                    42.01    (1999.08.02)
   Sun S-4/420U            f77 -O                    44.59    (1999.08.02)
   Sun GP7000F             frt -O                    92.38    (2001.01.31)
   Sun (stcpu1)            f90 -O                    97.20    (2002.08.28)
   Sun (c046)              f77 -O                    33.15    (2002.09.04)
   Sun (e039)              f90 -O                    18.45    (2002.08.28)
   Sun (sv010)             f90 -O                    22.88    (2002.08.28)
   Sun (sv011)             f90 -O                    27.83    (2002.08.28)
   SunFireV800 (sv080)     f90 -O                   138.13    (2002.08.28)
   Fujitsu (gpcs)          frt -O3                   95.82    (2003.04.08)
   Fujitsu (ngrd1)         frt -O3                  273.30    (2003.04.08)
   Sun PanaStation         f77 -O                    18.06
   DELL OptiPlex GXi       f77 -O                    11.36    (1997.11.12)
   DELL PowerEdge 2400     frt -O                   213.1     (2001.11.20)
   DEC Alpha (500MHz)      f77                       63.0     (1998.04.17)
   DEC Alpha (500MHz)      f90                       64.1     (1998.04.17)
   DEC Alpha DS-20(833MHz) f90 -fast                246.86    (2002.11.08)
   SONY VAIO Pentium4      lf95                     547.56    (2002.09.03)
   HPC P4Linux             ifc                      478.72    (2003.02.06)
   SGI Indy                f77 -O                     3.96
   SGI Indigo2             f77 -O                     9.46
   SGI Origin2000(1CPU)    Fortran77                 27.00
   SGI Octane              f77 -O                    17.30    (1999.08.02)
   SGI O2                  f77 -O                     4.87    (1999.08.02)
   DEC alpha 3000AXP/500   f77 -O3                   13.52
   Solbourne               f77 -O3                    1.824
   TITAN                   O1 (scalar)                0.904
   TITAN                   O2 (vector)                3.756
   TITAN III               O1 (scalar)                1.176
   TITAN III               O2 (vector)                6.228
   TITAN III               O3 (parallel)              6.543
   IBM 6091-19             f77 -O                     8.125
   Matsusita ADENART       ADETRAN (parallel)       218.0
   Convex C3810            Fortran -O1 (scalar)       4.089
   Convex C3810            Fortran -O2 (vector)      94.37
   nCUBE2                  HPF                        0.758
   nCUBE2                  HPF SSS32                  1.788
   DECmpp 12000 MP-2 1K pe HPF Ver.3.1               70.95
   DECmpp 12000 MP-2 2K pe HPF Ver.3.1              128.69
   DECmpp 12000 MP-2 4K pe HPF Ver.3.1              189.34

   ---------------------------------------------------------------------
        
        
Table 2.  Comparison of the computer processing capability for the
2-dimensional and 3-dimensional global magnetohydrodynamic (MHD) simulations,
where the numerical values stand for the computation times (in seconds)
corresponding to one time step advance of the MHD simulation codes. In the
test, the compiler options giving maximum performance were adopted unless a
particular compiler option is given. The grid numbers used in the MHD
simulation codes are (a) 50 x 50 x 26 and (b) 62 x 32 x 32 for the
3-dimensional simulations, and (c) 402 x 102 for the 2-dimensional
simulation, with boundary grid points included.

   ---------------------------------------------------------------------
  computer           compiler        (a)3D-MHD     (b)3D-MHD     (c)2D-MHD
                                      50x50x26      62x32x32      402x102
                                     sec (MFLOPS)  sec (MFLOPS)  sec (MFLOPS)
   ---------------------------------------------------------------------
  NEC ACOS-650        Fortran 77      187.1  (  0.6)159.5  (  0.7) 29.3  (  1.0)
  NEC ACOS-930        NIAP, OPT=1      14.07 (    9) 13.76 (    8)  4.31 (  6.5)
  NEC ACOS-930        IAP,OPT=3         9.97 (   12) 11.34 (   10)  2.44 ( 11.4)
  NEC SX-2            opt=scalar        3.66 (   33)  5.02 (   23)  0.90 ( 31.0)
  NEC SX-2            Fortran 77        0.34 (  356)  0.28 (  412)  0.042(  664)
  NEC SX-3/14         opt=scalar        2.11 (   57)  1.81 (   64)  0.48 ( 58.1)
  NEC SX-3/14         Fortran 77        0.097(1,248)  0.116(  994)  0.0149(1,871)
  NEC SX-3            Fortran 77                                    0.014(1,991)
  Fujitsu M-200       Fortran 77       34.4  (  3.5) 34.2  (  3.4)  7.84 (  3.6)
  Fujitsu M-380       Fortran 77       11.60 (   10)  9.37 (   12)  3.31 (  8.4)
  Fujitsu M-780/20    Fortran 77        4.87 (   25)  3.94 (   30)  1.14 ( 24.5)
  Fujitsu M-780/30    FORT77 O2         3.95 (   31)  5.06 (   23)  0.84 ( 33.2)
  Fujitsu M-780/30    FORT77EX O3       2.63 (   46)  2.21 (   53)  0.66 ( 42.2)
  Fujitsu VP-100      opt=scalar       11.44 (   11)  9.61 (   12)  2.39 ( 11.7)
  Fujitsu VP-100      Fortran 77        0.80 (  151)  0.75 (  154)  0.13 (  214)
  Fujitsu VP-200      opt=scalar       12.20 (   10) 10.23 (   11)  2.56 ( 10.9)
  Fujitsu VP-200      Fortran 77        0.50 (  242)  0.41 (  281)  0.080(  348)
  Fujitsu VP-400      Fortran 77        0.49 (  247)  0.39 (  296)  0.042(  664)
  Fujitsu VP-2600     FORTCLG           0.099(1,223)  0.082(1,405)  0.014(1,991)
  Fujitsu VPP-500( 1PE) frtpx, -sc      2.65 (   46)  3.26 (   36)  0.704( 39.6)
  Fujitsu VPP-500( 1PE) frtpx           0.150(  807)  0.132(  881)  0.029(  961)
  Fujitsu VPP-500( 1PE) frtpx, -sc     3.1043(   39) 5.359 (   22)
  Fujitsu VPP-500( 1PE) frtpx          0.1396(  867) 0.1194(  974)
  Fujitsu VPP-500( 2PE) frtpx, -Wx     0.0749(1,616) 0.0632(1,840)
  Fujitsu VPP-500( 4PE) frtpx, -Wx     0.0440(2,751) 0.0372(3,126)
  Fujitsu VPP-500( 8PE) frtpx, -Wx     0.0277(4,370) 0.0244(4,766)
  Fujitsu VPP-500(16PE) frtpx, -Wx     0.0189(6,405) 0.0155(7,503)
  Fujitsu VPP-5000( 1PE) frt,  -sc     0.717 (  169) 0.694 (  168)0.1238 (  225)
  Fujitsu VPP-5000( 1PE) frt           0.0301( 4026) 0.0264( 4416)0.00441( 6316)
  Fujitsu VPP-5000( 1PE) frt,  -sc     0.7209(  168) 0.8529(  136)
  Fujitsu VPP-5000( 1PE) frt          0.02330( 5201)0.02270( 5110)
  Fujitsu VPP-5000( 2PE) frt,  -Wx    0.01279( 9475)0.01050(11047)
  Fujitsu VPP-5000( 4PE) frt,  -Wx    0.00751(16136)0.00594(19528)
  Fujitsu VPP-5000( 8PE) frt,  -Wx    0.00451(26870)0.00356(32583)
  Fujitsu VPP-5000(16PE) frt,  -Wx    0.00306(39602)0.00225(51554)  
  Hitachi M-680       Fortran 77                                    1.49 ( 18.7)
  Hitachi M-680D      NOIAP, OPT=3      8.42 (   14)  9.31 (   12)  2.11 ( 13.2)
  Hitachi M-680D      IAP, OPT=3        3.54 (   34)  2.75 (   42)  0.57 ( 48.9)
  Hitachi M-680D      IAP, SOPT         3.25 (   37)  2.44 (   48)  0.53 ( 52.6)
  Hitachi S810/10     opt=scalar                                    3.17 (  8.8)
  Hitachi S810/10     Fortran 77                                    0.167(  167)
  Hitachi S820/20     Fortran 77        0.23 (  526)  0.16 (  727)  0.020(1,394)
  Hitachi S3800/480   Fortran 77        0.125(  968)  0.103(1,129)  0.0093(2,998)
  IBM-3033            VS               33.3  (  3.6) 27.8  (  4.2)  7.90 (  3.5)
  IBM-3090            VS                9.17 (   13)  8.76 (   13)  2.27 ( 12.3)
  IBM-3090            Fortvclg L0       9.11 (   13)  8.78 (   13)  2.28 ( 12.2)
  IBM-3090            Fortvclg L1       5.19 (   23)  4.01 (   29)  1.11 ( 25.1)
  VAX-11/750          Fortran         449.5  (  0.3)432.9  (  0.3) 96.64 (  0.3)
  CRAY-1              CFT               1.88 (   64)  1.76 (   65)  0.372( 74.9)
  CRAY-XMP            CFT 1.13          1.67 (   73)  3.85 (   39)  0.282( 98.9)
  CRAY-2              CIVIC 131        10.3  (   12)  7.29 (   16)
  CRAY-XMP-48         off=v             5.68 (   21)  6.15 (   19)  1.436( 19.4)
  CRAY-XMP-48         CFT114i           1.29 (   94)  1.13 (  101)  0.252(  111)
  CRAY-YMP-864        -o off            9.36 (   13)  9.26 (   13)  2.74 ( 10.2)
  CRAY-YMP-864        -o novector       3.62 (   33)  3.81 (   31)  0.999( 27.9)
  CRAY-YMP-864        -o full           0.430(  282)  1.921(   61)  0.0982( 284)
  SCS40               SCSFT o=h        18.86 (  6.4) 20.25 (  5.7)  5.71 (  4.9)
  SCS40               SCSFT             3.94 (   31)  3.81 (   31)  0.964( 28.9)
  TITAN               O1 (scalar)      49.44 (  2.5) 56.70 (  2.1) 12.59 (  2.2)
  TITAN               O2 (vector)      22.02 (  5.5) 23.97 (  4.8)  5.47 (  5.1)
  TITAN III           O1 (scalar)      15.66 (  7.7) 18.57 (  6.3)  3.68 (  7.6)
  TITAN III           O2 (vector)       7.96 (   15)  7.54 (   15)  1.62 ( 17.2)
  TITAN III           O3 (parallel)     7.73 (   16)  7.31 (   16)  1.59 ( 17.5)
  Sun SPARC Station 1 f77 -O           47.50 (  2.5) 47.25 (  2.4) 13.00 (  2.1)
  Sun SPARC Station 2 f77 -O           20.50 (  5.9) 19.88 (  5.8)  4.81 (  5.8)
  Sun SPARC IPX       f77 -O           16.30 (  7.2) 17.60 (  6.6)  5.23 (  5.3)
  Sun SPARC 2(AS4075) f77 -O           15.18 (  8.0) 16.63 (  7.0)  4.92 (  5.7)
  Sun SPARC Station 10 f77 -O           8.22 ( 14.7) 11.46 ( 10.2)  2.04 ( 13.7)
  Sun SPARC S-4/5     f77 -O            5.57 ( 21.7)  6.81 ( 17.1)  1.79 ( 15.6)
  Sun SPARC S-4/CL4   f77 -O            5.68 ( 21.3)  6.79 ( 17.1)  1.79 ( 15.6)
  Sun SPARC S-4/20H   f77 -O            1.90 ( 63.7)  1.92 ( 60.6)  0.344( 81.0)
  Sun SPARC S-4/20H(stcpu1) f77 -O      1.90 ( 63.7)  1.92 ( 60.6)  0.344( 81.0)
  Sun SPARC S-4/20H(stcpu1) f90 -O      1.97 ( 61.4)  1.79 ( 65.0)  0.450( 61.9)
  Sun S-7/300U        f77 -O            1.76 ( 68.8)  1.97 ( 59.1)  0.426( 65.4)
  Sun Ultra 2   (162MHz) f77 -O         1.49 (   81)  3.39 (   34)  0.344(   81)
  Sun Ultra 2   (162MHz) f90 -O         3.91 (   31)  5.14 (   23)  0.680(   41)
  Sun Ultra 2   (162MHz) frt -O (f90)   1.53 (   79)  2.74 (   42)  0.380(   74)
  Sun S-7/7000U (296MHz) f77 -O         0.836(  145)  1.09 (  107)  0.195(  143)
  Sun S-7/7000U (296MHz) f90 -O         3.04 (   40)  2.72 (   43)  0.375(   74)
  Sun S-7/7000U 350   f77 -O            1.055(  115)  1.039(  113)  0.172(  161)
  Sun S-4/420U        f77 -O            0.969(  125)  0.953(  123)  0.156(  178)
  Sun GP7000F         frt -O            0.781(  155)  0.629(  186)  0.119(  233)
  Sun (shcpu1)        f90 -O            0.546(  222)  0.539(  217)  0.117(  247)
  Sun (c046)          f77 -O            0.879(  138)  0.930(  126)  0.211(  172)
  Sun (sv010)         f90 -O            3.039(   40)  2.734(   43)  0.367(   76)
  Sun (sv011)         f90 -O            2.461(   49)  2.219(   53)  0.313(   89)
  SunFireV800 (sv080) f90 -O            0.305(  397)  0.336(  348)  0.0781( 355)
  Fujitsu (gpcs)      frt -O3          0.2515(  481) 0.2905(  403)  0.0562( 493)
  Fujitsu (ngrd1)     frt -O3          0.0943( 1284) 0.0952( 1228)  0.0249(1113)
  Fujitsu (ngrd1)     frt -O3          0.0938( 1291) 0.0938( 1246)  0.0234(1184)
  Sun PanaStation     f77 -O            2.08 ( 58.2)  1.87 ( 62.2)  0.348( 80.1)
  DELL OptiPlex GXi   f77 -O            2.68 ( 45.2)  3.14 ( 37.0)  0.641( 43.5)
  DELL PowerEdge 2400 frt -O            0.621(195.1)  0.594(195.6)  0.109(255.8)
  DEC Alpha (500MHz)  f77               0.359(  337)  0.383(  304)  0.0781( 346)
  DEC Alpha (500MHz)  f90               0.359(  337)  0.383(  304)  0.0781( 346)
  DEC Alpha (833MHz)  f90 -fast        0.1016( 1186) 0.0859( 1355)  0.0234(1155)
  SONY VAIO Pentium4  lf95              0.203(  596)  0.312(  375)  0.0430( 645)
  HPC P4Linux         ifc              0.0906( 1335) 0.0943( 1241)  0.0240(1172)
  SGI Indy            f77 -O            8.20 ( 14.8) 10.35 ( 11.2)  2.21 ( 12.6)
  SGI Indigo2         f77 -O            2.68 ( 45.2)  2.85 ( 40.8)  0.775( 36.0)
  SGI Octane          f77 -O            0.475(  255)  0.550(  211)  0.133(  210)
  SGI O2              f77 -O            1.875( 64.6)  2.125( 54.7)  0.325( 85.8)
  SGI Origin2000(1CPU) Fortran77        0.531(  228)  0.797(  146)  0.129(  216)
  SGI Origin2000(2CPU) Fortran77        0.324(  374)  0.464(  251)
  SGI Origin2000(4CPU) Fortran77        0.202(  599)  0.275(  423)
  SGI Origin2000(8CPU) Fortran77        0.155(  781)  0.191(  609)
  DEC alpha 3000AXP/500 f77 -O3         2.59 ( 46.8)  4.41 ( 26.4)  0.56 ( 49.8)
  Solbourne           f77 -O3          23.68 (  5.1) 25.70 (  4.5)  6.06 (  4.6)
  NEC EWS-4800/210    f77 -O3          16.65 (  7.3) 21.15 (  5.5)  4.17 (  6.7)
  NEC EWS-4800/220    f77 -O3          18.50 (  6.5) 17.00 (  6.7)  3.50 (  8.0)
  NEC EWS-4800/260    f77 -O           12.53 (  9.7) 14.87 (  7.8)  3.04 (  9.2)
  NEC EWS-4800/350    f77 -O            6.72 ( 18.0)  7.54 ( 15.4)  1.46 ( 19.1)
  NEC EWS-4800/360    f77 -O            4.69 ( 25.8)  5.40 ( 21.5)  1.06 ( 26.3)
  IBM 6091-19         f77 -O            4.437( 27.3)  4.500( 25.9)  1.125( 24.8)
  Matsusita ADENART   ADETRAN(parallel) 0.431(  281)  0.307(  375)  0.110(  253)
  Convex C3810        f77 -O1 (scalar)  5.704(   21)  6.286(   19)  1.378( 20.2)
  Convex C3810        f77 -O2 (vector)  0.948(  128)  0.895(  130)  0.213(  131)
  nCUBE2E  16 pe      f90 -O (parallel) 2.78 ( 43.5)                0.544( 51.2)
  nCUBE2E  32 pe      f90 -O (parallel)                             0.293( 95.1)
  nCUBE2S 128 pe      f90 -O (parallel)                             0.083(  336)
  nCUBE2  256 pe      f90 -O (parallel)                             0.072(  378)

   ---------------------------------------------------------------------
  CRAY Y-MP4E         1 processor       0.460(  263)  0.431(  263)  0.106(  263)
                      2 processors      0.246(  492)  0.233(  495)  0.062(  450)
                      4 processors      0.136(  890)  0.129(  893)  0.040(  697)
  CRAY Y-MP C90       1 processor       0.289(  419)  0.265(  439)
                      2 processors      0.159(  762)  0.144(  800)
                      4 processors      0.087(1,392)  0.079(1,459)
   ----------------------------------------------------------------------
   ----------------------------------------------------------------------
  computer             compiler       (a)3D-MHD     (b)3D-MHD     (c)2D-MHD
                                      sec (MFLOPS)  sec (MFLOPS)  sec (MFLOPS)
                       Grid points    192x192x96     240x120x120   1600x400
  Convex C3810 (1cpu)  240 MFLOPS      52.8  (  143) 77.7  (   95)  3.4  (  131)
  Convex C3820 (2cpu)  480 MFLOPS      29.5  (  256) 41.6  (  178)  1.9  (  235)
  Convex C3840 (4cpu)  960 MFLOPS      18.9  (  400) 16.1  (  283)  1.2  (  372)

                       Grid points    240x120x120    240x120x120   1600x400
  SGI Origin2000(1CPU) Fortran77       37.59 (  201) 40.58 (  183)  2.24 (  199)
  SGI Origin2000(2CPU) Fortran77       19.86 (  381) 21.21 (  351)  1.76 (  253)
  SGI Origin2000(4CPU) Fortran77       10.45 (  724) 11.08 (  672)  1.40 (  319)
  SGI Origin2000(8CPU) Fortran77        5.94 (1,274)  6.21 (1,199)
                       Grid points    320x 80x160    320x 80x160
  SGI Origin2000(1CPU) Fortran77       77.49 (  116) 82.97 (  100)
  SGI Origin2000(2CPU) Fortran77       40.39 (  222) 43.01 (  194)
  SGI Origin2000(4CPU) Fortran77       21.12 (  424) 22.85 (  364)
  SGI Origin2000(8CPU) Fortran77       11.56 (  774) 12.25 (  680)


Table 3. Comparison of the computer processing capability for the
3-dimensional global magnetohydrodynamic (MHD) simulation, where the
numerical values stand for computation times (in seconds) corresponding to
one time step advance of the MHD simulation codes. In the test, the compiler
options giving maximum performance were adopted unless a particular compiler
option is given.

---------------------------------------------------------------------------
 Computer Processing Capability                 2000.6.25  by Tatsuki OGINO
---------------------------------------------------------------------------
  computer                           grid number       sec   (MFLOPS) GF/PE
---------------------------------------------------------------------------
  Matsushita ADENART (256CPU)        180x 60x 60      3.46   (   400)
  Matsushita ADENART (256CPU)        150x100x 50      5.81   (   276)
  CRAY Y-MP C90  (8CPU)              400x200x200      7.00   ( 4,883) 0.61
  SGI Origin2000 (1CPU, earthb)      240x120x120     40.58   (   183) 0.18
  SGI Origin2000 (2CPU)              240x120x120     21.21   (   351) 0.18
  SGI Origin2000 (4CPU)              240x120x120     11.08   (   672) 0.17
  SGI Origin2000 (8CPU)              240x120x120      6.21   ( 1,199) 0.15
  Fujitsu VP-200                     240x 80x 80     10.38   (   316) 0.32
  Fujitsu VP-2600                    240x 80x 80      1.50   ( 2,188) 2.19
  Fujitsu VP-2600                    320x 80x 80      1.76   ( 2,486) 2.49
  Fujitsu VP-2600                    300x100x100      2.57   ( 2,494) 2.49
  Fujitsu VP-2600                    320x 80x160      3.63   ( 2,417) 2.42
  Fujitsu VPP-500 (1PE, earthb)      320x 80x 80      3.556  ( 1,230) 1.23
  Fujitsu VPP-500 (2PE)              320x 80x 80      1.846  ( 2,370) 1.19
  Fujitsu VPP-5000 (2PE,earthb,MPI)  320x 80x 80      0.2913 (15,017) 7.51 2002.04.06
  Fujitsu VPP-500 (4PE)              320x 80x 80      1.012  ( 4,323) 1.08
  Fujitsu VPP-500 (8PE)              320x 80x 80      0.591  ( 7,403) 0.93
  Fujitsu VPP-500 (16PE)             320x 80x 80      0.368  (11,889) 0.74
  Fujitsu VPP-500 (16PE)             400x100x100      0.666  (12,831) 0.80
  Fujitsu VPP-500 (16PE)             640x160x160      2.308  (15,165) 0.95
  Fujitsu VPP-500 (16PE)             800x200x200      4.119  (16,597) 1.04
  Fujitsu VPP-500 (1PE, eartha2)     320x 80x160      7.088  ( 1,234) 1.23
  Fujitsu VPP-500 (2PE)              320x 80x160      3.620  ( 2,417) 1.21
  Fujitsu VPP-500 (4PE)              320x 80x160      1.899  ( 4,608) 1.15
  Fujitsu VPP-500 (8PE)              320x 80x160      1.035  ( 8,454) 1.06
  Fujitsu VPP-500 (16PE)             320x 80x160      0.592  (14,781) 0.92
  Fujitsu VPP-500 (16PE)             400x100x200      1.088  (15,708) 0.98
  Fujitsu VPP-500 (16PE)             640x160x320      4.064  (17,225) 1.08
  Fujitsu VPP-500 (16PE)             800x200x400      7.632  (17,914) 1.12
  Fujitsu VPP-500 (16PE, Venus)      400x100x100      0.667  (12,811) 0.80
  Fujitsu VPP-500 (16PE, Jupiter)    300x200x100      0.975  (13,146) 0.82
  Fujitsu VPP-5000 (1PE, earthb)     400x100x100      1.154 (  7,405) 7.40
  Fujitsu VPP-5000 (2PE, earthb)     400x100x100      0.5762( 14,831) 7.41
  Fujitsu VPP-5000 (4PE, earthb)     400x100x100      0.3039( 28,119) 7.03
  Fujitsu VPP-5000 (8PE, earthb)     400x100x100      0.1613( 52,979) 6.62
  Fujitsu VPP-5000 (16PE, earthb)    400x100x100     0.09355( 91,346) 5.71
  Fujitsu VPP-5000 (2PE,earthb,MPI)  320x 80x 80     0.29130( 15,017) 7.51 2002.04.06
  Fujitsu VPP-5000 (4PE,earthb,MPI)  320x 80x 80     0.15419( 28,371) 7.09 2002.04.06
  Fujitsu VPP-5000 (8PE,earthb,MPI)  320x 80x 80     0.08348( 52,404) 6.55 2002.04.06
  Fujitsu VPP-5000 (16PE,earthb,MPI) 320x 80x 80     0.04672( 93,629) 5.85 2002.04.06
  Fujitsu VPP-5000 (32PE,earthb,MPI) 320x 80x 80     0.02557(171,092) 5.35 2002.04.06
  Fujitsu VPP-5000 (2PE,earthb,MPI)  400x100x100     0.57079( 14,972) 7.49 2002.04.06
  Fujitsu VPP-5000 (4PE,earthb,MPI)  400x100x100     0.29966( 28,518) 7.13 2002.04.06
  Fujitsu VPP-5000 (8PE,earthb,MPI)  400x100x100     0.15224( 56,135) 7.02 2002.04.06
  Fujitsu VPP-5000 (16PE,earthb,MPI) 400x100x100     0.08435(101,310) 6.33 2002.04.06
  Fujitsu VPP-5000 (32PE,earthb,MPI) 400x100x100     0.05093(167,792) 5.24 2002.04.06
  Fujitsu VPP-5000 (2PE,earthb,MPI)  504x126x126     1.05899( 16,143) 8.07 2002.04.06
  Fujitsu VPP-5000 (4PE,earthb,MPI)  504x126x126     0.53828( 31,759) 7.94 2002.04.06
  Fujitsu VPP-5000 (8PE,earthb,MPI)  504x126x126     0.27369( 62,461) 7.81 2002.04.06
  Fujitsu VPP-5000 (16PE,earthb,MPI) 504x126x126     0.14173(120,618) 7.54 2002.04.06
  Fujitsu VPP-5000 (32PE,earthb,MPI) 504x126x126     0.07776(219,854) 6.87 2002.04.06
  Fujitsu VPP-5000 (16PE,earthb,MPI) 600x200x200     0.43824(117,005) 7.31 2002.04.06
  Fujitsu VPP-5000 (16PE, earthb)    800x200x200     0.62417(109,526) 6.85
  Fujitsu VPP-5000 (16PE, eartha2)   500x100x200     0.19975(106,948) 6.68
  Fujitsu VPP-5000 (16PE, eartha2)   800x200x400     1.20162(113,779) 7.11
  Fujitsu VPP-5000 ( 2PE, eartha2)   800x200x478    10.65936( 15,327) 7.66
  Fujitsu VPP-5000 ( 4PE, eartha2)   800x200x478     5.35061( 30,534) 7.63
  Fujitsu VPP-5000 ( 8PE, eartha2)   800x200x478     2.73815( 59,666) 7.46
  Fujitsu VPP-5000 (12PE, eartha2)   800x200x478     1.86540( 87,581) 7.30
  Fujitsu VPP-5000 (16PE, eartha2)   800x200x478     1.41918(115,119) 7.19
  Fujitsu VPP-5000 (32PE, eartha2)   800x200x478     0.72187(226,328) 7.07
  Fujitsu VPP-5000 (48PE, eartha2)   800x200x478     0.53445(305,698) 6.36
  Fujitsu VPP-5000 (56PE, eartha2)   800x200x478     0.49367(330,950) 5.91
  
  Fujitsu VPP-5000 (32PE, eartha2)  1000x200x478     0.91633(222,872) 6.96
  Fujitsu VPP-5000 (32PE, eartha2)   800x400x478     1.44683(225,845) 7.06
  Fujitsu VPP-5000 ( 2PE, eartha2)   800x200x670     -.-----(---,---)
  Fujitsu VPP-5000 ( 4PE, eartha2)   800x200x670     7.61763( 30,063) 7.52
  Fujitsu VPP-5000 ( 8PE, eartha2)   800x200x670     3.79406( 60,359) 7.54
  Fujitsu VPP-5000 (12PE, eartha2)   800x200x670     2.80623( 81,606) 6.80
  Fujitsu VPP-5000 (16PE, eartha2)   800x200x670     1.92435(119,004) 7.44
  Fujitsu VPP-5000 (24PE, eartha2)   800x200x670     1.30786(175,099) 7.30
  Fujitsu VPP-5000 (32PE, eartha2)   800x200x670     0.97929(233,848) 7.31
  Fujitsu VPP-5000 (48PE, eartha2)   800x200x670     0.68234(335,618) 6.99
  Fujitsu VPP-5000 (56PE, eartha2)   800x200x670     0.59542(384,611) 6.87
  Fujitsu VPP-5000 (16PE, eartha2)   1000x500x1118   9.66792(123,518) 7.72 (2000.07.21)
  Fujitsu VPP-5000 (32PE, eartha2)   1000x500x1118   5.04442(236,729) 7.40 (2000.07.21)
  Fujitsu VPP-5000 (48PE, eartha2)   1000x500x1118   3.54985(336,397) 7.01 (2000.08.07)
  Fujitsu VPP-5000 (56PE, eartha2)   1000x500x1118   3.00623(397,228) 7.09 (2000.08.07)
  Fujitsu VPP-5000 (56PE, eartha2)   1000x500x1118   2.98512(400,038) 7.14
  Fujitsu VPP-5000 (32PE, eartha2)  1000x1000x1118   9.97933(239,327) 7.48 (2000.07.19)
  Fujitsu VPP-5000 (48PE, eartha2)  1000x1000x1118   7.17658(332,794) 6.93 (2000.08.07)
  Fujitsu VPP-5000 (56PE, eartha2)  1000x1000x1118   5.81743(410,546) 7.33
  Fujitsu VPP-5000 (56PE, eartha2)  1000x1000x1118   5.97927(399,433) 7.13 (2000.08.07)
  Fujitsu VPP-5000 (32PE, eartha2)   2238x558x1118  12.96936(229,926) 7.19 (2000.07.28)
  Fujitsu VPP-5000 (48PE, eartha2)   2238x558x1118   9.49812(313,956) 6.54 (2000.08.07)
  Fujitsu VPP-5000 (56PE, eartha2)   2238x558x1118   8.04309(370,752) 6.62 (2000.08.07)
  
  Fujitsu VPP-5000(1PE,eartha2,scalar) 200x100x478 119.60663(    171) 0.171  
  Fujitsu VPP-5000 ( 1PE, eartha2)   200x100x478     2.96691(  6,883) 6.88
  Fujitsu VPP-5000 ( 2PE, eartha2)   200x100x478     1.45819( 14,005) 7.00
  Fujitsu VPP-5000 ( 4PE, eartha2)   200x100x478     0.72109( 28,320) 7.08
  Fujitsu VPP-5000 ( 8PE, eartha2)   200x100x478     0.36541( 55,886) 6.99
  Fujitsu VPP-5000 (16PE, eartha2)   200x100x478     0.20548( 99,383) 6.21  
  Fujitsu VPP-5000 (32PE, eartha2)   200x100x478     0.10678(191,226) 5.98
  Fujitsu VPP-5000 (48PE, eartha2)   200x100x478     0.06853(297,959) 6.21
  Fujitsu VPP-5000 (56PE, eartha2)   200x100x478     0.06391(319,531) 5.71

  /vpp/home/usr6/a41456a/heartha2/prog9032.f  revised boundary
  Fujitsu VPP-5000 ( 1PE, eartha2, frt)   500x100x200     2.69078(  7,939) 7.94
  Fujitsu VPP-5000 ( 2PE, eartha2, frt)   500x100x200     1.38118( 15,467) 7.73
  Fujitsu VPP-5000 ( 4PE, eartha2, frt)   500x100x200     0.71535( 29,965) 7.47
  Fujitsu VPP-5000 ( 8PE, eartha2, frt)   500x100x200     0.39820( 53,648) 6.71
  Fujitsu VPP-5000 (16PE, eartha2, frt)   500x100x200     0.20970(101,873) 6.37
  Fujitsu VPP-5000 (32PE, eartha2, frt)   500x100x200     0.13062(163,548) 5.11
  Fujitsu VPP-5000 (48PE, eartha2, frt)   500x100x200     0.09960(214,479) 4.46
  Fujitsu VPP-5000 (56PE, eartha2, frt)   500x100x200     0.08921(239,478) 4.28
  -----------------------------------------------------------------------  
   HPF/JA (High Performance Fortran)
   MPI    (Message Passing Interface)
  /vpp/home/usr6/a41456a/heartha2/proghpf53.f
  Fujitsu VPP-5000 ( 1PE, eartha2, HPF)   500x100x200     2.69089(  7,938) 7.94    
  Fujitsu VPP-5000 ( 2PE, eartha2, HPF)   500x100x200     1.39017( 15,366) 7.68
  Fujitsu VPP-5000 ( 4PE, eartha2, HPF)   500x100x200     0.71228( 29,993) 7.50
  Fujitsu VPP-5000 ( 8PE, eartha2, HPF)   500x100x200     0.39285( 54,381) 6.80
  Fujitsu VPP-5000 (16PE, eartha2, HPF)   500x100x200     0.20202(105,742) 6.61
  Fujitsu VPP-5000 (32PE, eartha2, HPF)   500x100x200     0.12034(175,496) 5.48
  Fujitsu VPP-5000 (48PE, eartha2, HPF)   500x100x200     0.09115(231,688) 4.82
  Fujitsu VPP-5000 (56PE, eartha2, HPF)   500x100x200     0.08625(244,846) 4.37

  Fujitsu VPP-5000 ( 1PE, eartha2, HPF)   500x100x200     2.69089(  7,938) 7.94
  Fujitsu VPP-5000 ( 2PE, eartha1, MPI)   500x100x200     1.35724( 15,739) 7.87 (2002.6.19)
  Fujitsu VPP-5000 ( 4PE, eartha1, MPI)   500x100x200     0.68837( 31,035) 7.76 (2002.6.19)
  Fujitsu VPP-5000 ( 8PE, eartha1, MPI)   500x100x200     0.37153( 57,502) 7.19 (2002.6.19)
  Fujitsu VPP-5000 (16PE, eartha1, MPI)   500x100x200     0.19298(110,695) 6.92 (2002.6.19)
  Fujitsu VPP-5000 (32PE, eartha1, MPI)   500x100x200     0.12034(175,496) 5.48
  
  HPF/JA (High Performance Fortran) 
  Fujitsu VPP-5000 ( 1PE, eartha2, HPF)   200x100x478     3.00248(  6,801) 6.80 OK
  Fujitsu VPP-5000 ( 2PE, eartha2, HPF)   200x100x478     1.53509( 13,303) 6.65 OK
  Fujitsu VPP-5000 ( 4PE, eartha2, HPF)   200x100x478     0.76061( 26,849) 6.71 OK
  Fujitsu VPP-5000 ( 8PE, eartha2, HPF)   200x100x478     0.38589( 52,921) 6.62 OK
  Fujitsu VPP-5000 (16PE, eartha2, HPF)   200x100x478     0.21867( 93,390) 5.84 OK  
  Fujitsu VPP-5000 (32PE, eartha2, HPF)   200x100x478     0.10972(186,129) 5.82 OK
  Fujitsu VPP-5000 (48PE, eartha2, HPF)   200x100x478     0.07374(276,956) 5.77 OK
  Fujitsu VPP-5000 (56PE, eartha2, HPF)   200x100x478     0.06823(299,269) 5.34 OK

  Fujitsu VPP-5000 ( 1PE, eartha2, HPF)   200x100x478     3.00248(  6,801) 6.80 OK
  Fujitsu VPP-5000 ( 2PE, eartha1, MPI)   200x100x478     1.44399( 14,142) 7.07 (2002.6.19)
  Fujitsu VPP-5000 ( 4PE, eartha1, MPI)   200x100x478     0.71408( 28,599) 7.15 (2002.6.19)
  Fujitsu VPP-5000 ( 8PE, eartha1, MPI)   200x100x478     0.36115( 56,546) 7.07 (2002.6.19)
  Fujitsu VPP-5000 (16PE, eartha1, MPI)   200x100x478     0.19052(107,189) 6.70 (2002.6.19)
  Fujitsu VPP-5000 (32PE, eartha2, HPF)   200x100x478     0.10972(186,129) 5.82 OK
  
  Fujitsu VPP-5000 ( 2PE, proghpf63.f)    800x200x478    10.74172( 15,210) 7.60
  Fujitsu VPP-5000 ( 4PE, proghpf63.f)    800x200x478     5.35382( 30,516) 7.63
  Fujitsu VPP-5000 ( 8PE, proghpf63.f)    800x200x478     2.72973( 59,851) 7.48
  Fujitsu VPP-5000 (12PE, proghpf63.f)    800x200x478     1.91098( 85,493) 7.12
  Fujitsu VPP-5000 (16PE, proghpf63.f)    800x200x478     1.38854(117,660) 7.35 
  Fujitsu VPP-5000 (32PE, proghpf63.f)    800x200x478     0.71746(227,715) 7.12
  Fujitsu VPP-5000 (48PE, proghpf63.f)    800x200x478     0.51497(317,257) 6.61
  Fujitsu VPP-5000 (56PE, proghpf63.f)    800x200x478     0.46350(352,488) 6.29

  Fujitsu VPP-5000 ( 2PE, eartha1, MPI)   800x200x478    10.42753( 15,668) 7.83 (2002.6.19)
  Fujitsu VPP-5000 ( 4PE, eartha1, MPI)   800x200x478     5.22334( 31,278) 7.82 (2002.6.19)
  Fujitsu VPP-5000 ( 8PE, eartha1, MPI)   800x200x478     4.62705( 35,309) 4.41 (2002.6.19)
  Fujitsu VPP-5000 (12PE, eartha1, MPI)   800x200x478     1.91098( 85,493) 7.12
  Fujitsu VPP-5000 (16PE, eartha1, MPI)   800x200x478     2.66656( 61,268) 3.83 (2002.6.19)
  Fujitsu VPP-5000 (32PE, eartha1, MPI)   800x200x478     0.71746(227,715) 7.12

  Fujitsu VPP-5000 ( 2PE, proghpf63.f)    800x200x670     -.-----(---,---)
  Fujitsu VPP-5000 ( 4PE, proghpf63.f)    800x200x670     8.00096( 28,622) 7.16 OK
  Fujitsu VPP-5000 ( 8PE, proghpf63.f)    800x200x670     3.96162( 57,806) 7.23 OK
  Fujitsu VPP-5000 (12PE, proghpf63.f)    800x200x670     3.00484( 76,212) 6.35 OK
  Fujitsu VPP-5000 (16PE, proghpf63.f)    800x200x670     2.01151(113,848) 7.12 OK
  Fujitsu VPP-5000 (24PE, proghpf63.f)    800x200x670     1.35955(168,442) 7.02 OK
  Fujitsu VPP-5000 (32PE, proghpf63.f)    800x200x670     1.03211(221,880) 6.93 OK
  Fujitsu VPP-5000 (48PE, proghpf63.f)    800x200x670     0.72060(317,798) 6.62 OK
  Fujitsu VPP-5000 (56PE, proghpf63.f)    800x200x670     0.62764(364,866) 6.52 OK

  Fujitsu VPP-5000 ( 2PE, eartha1, MPI)   800x200x670    10.42753( 15,668) 7.83
  Fujitsu VPP-5000 ( 4PE, eartha1, MPI)   800x200x670     5.35382( 30,516) 7.63
  Fujitsu VPP-5000 ( 8PE, eartha1, MPI)   800x200x670     7.33518( 22,273) 2.78 (2002.6.19)
  Fujitsu VPP-5000 (12PE, eartha1, MPI)   800x200x670     1.91098( 85,493) 7.12
  Fujitsu VPP-5000 (16PE, eartha1, MPI)   800x200x670     3.68912( 44,286) 2.77 (2002.6.19)
  Fujitsu VPP-5000 (32PE, eartha1, MPI)   800x200x670     0.71746(227,715) 7.12

  Fujitsu VPP-5000 (16PE, eartha2)   1000x500x1118   9.84601(122,615) 7.66 (2000.07.21)
  Fujitsu VPP-5000 (16PE, eartha2a)  1000x500x1118   9.61939(125,504) 7.84 (2000.07.21) 
  Fujitsu VPP-5000 (32PE, eartha2)   1000x500x1118   5.19470(232,403) 7.26 (2000.07.21)
  Fujitsu VPP-5000 (32PE, eartha2a)  1000x500x1118   4.99224(241,828) 7.56 (2000.07.21)
  Fujitsu VPP-5000 (48PE, eartha2a)  1000x500x1118   3.47943(346,972) 7.23 (2000.08.07)
  Fujitsu VPP-5000 (56PE, eartha2a)  1000x500x1118   2.93481(411,361) 7.35 (2000.08.07)
  Fujitsu VPP-5000 (32PE, eartha2)  1000x1000x1118  10.22563(233,562) 7.30 (2000.07.19)
  Fujitsu VPP-5000 (32PE, eartha2a) 1000x1000x1118   9.81345(243,372) 7.61 (2000.07.21)
  Fujitsu VPP-5000 (48PE, eartha2a) 1000x1000x1118   7.02753(339,852) 7.08 (2000.08.07)
  Fujitsu VPP-5000 (56PE, eartha2a) 1000x1000x1118   5.79368(412,228) 7.36 (2000.08.07)
  Fujitsu VPP-5000 (63PE, eartha2a) 1000x1000x1258   6.17179(435,431) 6.91 (2002.01.03)
  
  Fujitsu VPP-5000 (48PE, eartha2)   1678x558x1118   6.52886(342,453) 7.13 (2000.08.07)
  Fujitsu VPP-5000 (56PE, eartha2)   1678x558x1118   5.54894(402,929) 7.20 (2000.08.07)

  Fujitsu VPP-5000 (32PE, eartha2)   2238x558x1118  13.25331(225,000) 7.03 (2000.07.21) 
  Fujitsu VPP-5000 (32PE, eartha2a)  2238x558x1118  12.71245(234,573) 7.33 (2000.07.21) 
  Fujitsu VPP-5000 (48PE, eartha2a)  2238x558x1118   9.22722(323,174) 6.73 (2000.08.07)
  Fujitsu VPP-5000 (56PE, eartha2a)  2238x558x1118   7.80778(381,926) 6.82 (2000.08.07)
  ----------------------------------------------------------------------- 
  A Quarter Model of the Earth's Magnetosphere (nx,ny,nz)=(180,60,60)
  Fujitsu GR720    (1PE, earthb) Fortran 90          7.72998(    136) 0.14 (2002.08.01)
  Fujitsu VPP-5000 (1PE, earthb) VP Fortran          0.19342(  5,428) 5.43 (2002.08.01)
  Fujitsu VPP-5000 (2PE, earthb) VPP Fortran         0.10509(  9,990) 5.00 (2002.08.01)
  Fujitsu VPP-5000 (2PE, earthb) HPF/JA              0.11064(  9,489) 4.74 (2002.08.01)
  Fujitsu VPP-5000 (2PE, earthb) MPI                 0.09899( 10,606) 5.30 (2002.08.01)
---------------------------------------------------------------------------------
  computer                           grid number       sec   (GFLOPS) GF/PE (Date)
---------------------------------------------------------------------------------    
  Earth Simulator (mearthd, hearthd)
  ES 128node x 8PE=1024PE (MPI)      2046x1022x1022  0.62047 (7357.1) 7.18 (2003.04.04)
                                                             (7273.1) 7.10
  ES  64node x 8PE= 512PE (HPF/JA)   1022x1022x1022  0.68877 (3310.5) 6.47 (2003.04.04)
                                                             (3275.9) 6.40
---------------------------------------------------------------------------------    
   frt: Fujitsu VPP Fortran 90    HPF: High Performance Fortran
   Note: MFLOPS is an estimated value obtained by comparison with the
         computation by one processor of the CRAY Y-MP C90.
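The MFLOPS figures in these tables follow directly from this calibration: the total floating-point operation count of one time step (fixed by the grid size and by the reference CRAY Y-MP C90 measurement) is divided by the measured time per step. Back-calculating from several table rows gives a nearly constant count of roughly 2,140 floating-point operations per grid point per step; note that this constant is inferred from the table entries, not taken from the simulation code, so the sketch below is an illustration of the estimation method only:

```python
def estimated_mflops(nx, ny, nz, seconds, flop_per_point=2136.0):
    """Estimate MFLOPS for one MHD time step.

    flop_per_point (~2136) is inferred by back-calculating from the
    table entries (e.g. the CRAY Y-MP C90 8CPU row); it is an
    assumed calibration constant, not a value from the code itself.
    """
    total_mflop = nx * ny * nz * flop_per_point / 1.0e6
    return total_mflop / seconds

# CRAY Y-MP C90 (8CPU), 400x200x200 grid, 7.00 s/step -> roughly 4,880 MFLOPS,
# matching the table entry of 4,883 MFLOPS.
print(round(estimated_mflops(400, 200, 200, 7.00)))
```

The same function reproduces the other rows to within rounding, which is consistent with all MFLOPS values being derived from a single per-grid-point operation count.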


Table 4. Comparison of computer processing capability among VPP Fortran, HPF/JA,
and MPI for a 3-dimensional global MHD code of the solar wind-magnetosphere
interaction, using the Fujitsu VPP5000/64.

------------------------------------------------------------------------------------
Number    Number of     VPP Fortran             HPF/JA                  MPI
of PE     grids        cpu time Gflops Gf/PE   cpu time Gflops Gf/PE   cpu time Gflops Gf/PE

 1PE     200x100x478   119.607 (  0.17) 0.17 (scalar)
 1PE     200x100x478     2.967 (  6.88) 6.88     3.002 (  6.80) 6.80
 2PE     200x100x478     1.458 ( 14.01) 7.00     1.535 ( 13.30) 6.65     1.444 ( 14.14) 7.07
 4PE     200x100x478     0.721 ( 28.32) 7.08     0.761 ( 26.85) 6.71     0.714 ( 28.60) 7.15
 8PE     200x100x478     0.365 ( 55.89) 6.99     0.386 ( 52.92) 6.62     0.361 ( 56.55) 7.07
16PE     200x100x478     0.205 ( 99.38) 6.21     0.219 ( 93.39) 5.84     0.191 (107.19) 6.70
24PE     200x100x478     0.141 (144.49) 6.02     0.143 (143.02) 5.96     0.1302(157.24) 6.55
32PE     200x100x478     0.107 (191.23) 5.98     0.110 (186.13) 5.82     0.1011(202.50) 6.33
48PE     200x100x478     0.069 (297.96) 6.21     0.074 (276.96) 5.77     0.0679(301.51) 6.28
56PE     200x100x478     0.064 (319.53) 5.71     0.068 (299.27) 5.34     0.0639(320.39) 5.72
64PE     200x100x478     0.0662(308.91) 4.83     0.0627(324.57) 5.07     0.0569(359.80) 5.62

 1PE     500x100x200     2.691 (  7.94) 7.94     2.691 (  7.94) 7.94
 2PE     500x100x200     1.381 ( 15.47) 7.73     1.390 ( 15.37) 7.68     1.357 ( 15.74) 7.87
 2PE     500x100x200     1.381 ( 15.47) 7.73     1.390 ( 15.37) 7.68     1.355 ( 15.77) 7.89 (isend)
 4PE     500x100x200     0.715 ( 29.97) 7.47     0.712 ( 29.99) 7.50     0.688 ( 31.03) 7.76
 8PE     500x100x200     0.398 ( 53.65) 6.71     0.393 ( 54.38) 6.80     0.372 ( 57.50) 7.19
16PE     500x100x200     0.210 (101.87) 6.37     0.202 (105.74) 6.61     0.193 (110.70) 6.92
24PE     500x100x200     0.160 (133.70) 5.57     0.150 (142.40) 5.93     0.135 (158.26) 6.59
32PE     500x100x200     0.131 (163.55) 5.11     0.120 (175.50) 5.48     0.1084(197.10) 6.15
48PE     500x100x200     0.100 (214.48) 4.46     0.091 (231.69) 4.82     0.0811(263.44) 5.49
56PE     500x100x200     0.089 (239.48) 4.28     0.086 (244.85) 4.37     0.0688(310.54) 5.55
64PE     500x100x200     0.0956(222.95) 3.48     0.0844(249.49) 3.90     0.0687(310.99) 4.86

 2PE     800x200x478    10.659 ( 15.33) 7.66    10.742 ( 15.21) 7.60    10.428 ( 15.67) 7.83
 2PE     800x200x478    10.659 ( 15.33) 7.66    10.742 ( 15.21) 7.60    21.768 (  7.51) 3.75
 2PE     800x200x478    10.659 ( 15.33) 7.66    10.742 ( 15.21) 7.60    16.001 ( 10.21) 5.11
 2PE     800x200x478    10.659 ( 15.33) 7.66    10.742 ( 15.21) 7.60    18.183 (  8.99) 4.49
 4PE     800x200x478     5.351 ( 30.53) 7.63     5.354 ( 30.52) 7.63     5.223 ( 31.28) 7.82
 4PE     800x200x478     5.351 ( 30.53) 7.63     5.354 ( 30.52) 7.63     5.394 ( 30.35) 7.59
 8PE     800x200x478     2.738 ( 59.67) 7.46     2.730 ( 59.85) 7.48     4.627 ( 35.31) 4.41
 8PE     800x200x478     2.738 ( 59.67) 7.46     2.730 ( 59.85) 7.48     2.696 ( 60.61) 7.58
12PE     800x200x478     1.865 ( 87.58) 7.30     1.911 ( 85.49) 7.12     3.527 ( 46.33) 3.86
12PE     800x200x478     1.865 ( 87.58) 7.30     1.911 ( 85.49) 7.12     1.771 ( 92.25) 7.68
16PE     800x200x478     1.419 (115.12) 7.19     1.389 (117.66) 7.35     2.667 ( 61.27) 3.83
16PE     800x200x478     1.419 (115.12) 7.19     1.389 (117.66) 7.35     1.342 (121.81) 7.61
24PE     800x200x478     0.975 (167.54) 6.98     0.976 (167.45) 6.98     0.905 (180.59) 7.52
32PE     800x200x478     0.722 (226.33) 7.07     0.717 (227.72) 7.12     0.690 (236.63) 7.39
48PE     800x200x478     0.534 (305.70) 6.36     0.515 (317.26) 6.61     0.469 (348.38) 7.25
56PE     800x200x478     0.494 (330.95) 5.91     0.464 (352.49) 6.29     0.433 (377.73) 7.74
64PE     800x200x478     0.465 (351.59) 5.49     0.438 (373.41) 5.83     0.389 (420.45) 6.57

 4PE     800x200x670     7.618 ( 30.06) 7.52     8.001 ( 28.62) 7.16     7.723 ( 29.65) 7.41
 4PE     800x200x670     7.618 ( 30.06) 7.52     8.001 ( 28.62) 7.16     7.433 ( 30.81) 7.70
 8PE     800x200x670     3.794 ( 60.36) 7.54     3.962 ( 57.81) 7.23     7.335 ( 22.27) 2.78
 8PE     800x200x670     3.794 ( 60.36) 7.54     3.962 ( 57.81) 7.23     6.677 ( 31.22) 3.90
 8PE     800x200x670     3.794 ( 60.36) 7.54     3.962 ( 57.81) 7.23     3.683 ( 62.17) 7.77
12PE     800x200x670     2.806 ( 81.61) 6.80     3.005 ( 76.21) 6.35     5.360 ( 42.72) 3.56
12PE     800x200x670     2.806 ( 81.61) 6.80     3.005 ( 76.21) 6.35     2.696 ( 84.95) 7.08
16PE     800x200x670     1.924 (119.00) 7.44     2.012 (113.85) 7.12     3.689 ( 44.29) 2.77
16PE     800x200x670     1.924 (119.00) 7.44     2.012 (113.85) 7.12     3.387 ( 48.36) 3.02
16PE     800x200x670     1.924 (119.00) 7.44     2.012 (113.85) 7.12     1.854 (123.53) 7.72
24PE     800x200x670     1.308 (175.10) 7.30     1.360 (168.44) 7.02     2.518 ( 90.98) 3.39
24PE     800x200x670     1.308 (175.10) 7.30     1.360 (168.44) 7.02     1.254 (182.61) 7.60
32PE     800x200x670     0.979 (233.85) 7.31     1.032 (221.88) 6.93     1.931 ( 84.60) 2.64
32PE     800x200x670     0.979 (233.85) 7.31     1.032 (221.88) 6.93     0.955 (239.77) 7.49
48PE     800x200x670     0.682 (335.62) 6.99     0.721 (317.80) 6.62     0.662 (346.21) 7.21
56PE     800x200x670     0.595 (384.61) 6.87     0.628 (364.87) 6.52     0.572 (400.59) 7.15
64PE     800x200x670     0.979 (233.85) 7.31     1.032 (221.88) 6.93     0.519 (441.50) 6.90

16PE   1000x500x1118     9.668 (123.52) 7.72     9.619 (125.50) 7.84
32PE   1000x500x1118     5.044 (236.73) 7.40     4.992 (241.83) 7.56
48PE   1000x500x1118     3.550 (336.40) 7.01     3.479 (346.97) 7.23
56PE   1000x500x1118     2.985 (400.04) 7.14     2.935 (411.36) 7.35
32PE  1000x1000x1118     9.979 (239.33) 7.48     9.813 (243.37) 7.61
48PE  1000x1000x1118     7.177 (332.79) 6.93     7.028 (339.85) 7.08
56PE  1000x1000x1118     5.817 (410.55) 7.33     5.794 (412.23) 7.36
------------------------------------------------------------------------------------
      Note: MFLOPS is an estimated value obtained by comparison with the
            computation by one processor of the CRAY Y-MP C90.
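The Gf/PE column in Table 4 is simply Gflops divided by the number of PEs; the parallel efficiency relative to the single-PE run can be computed the same way from the cpu times. A minimal sketch, using the VPP Fortran times for the 200x100x478 grid copied from the table:

```python
def speedup_and_efficiency(t1, tn, n_pe):
    """Speedup and parallel efficiency of an n_pe run versus a 1PE run."""
    speedup = t1 / tn
    return speedup, speedup / n_pe

# VPP Fortran, 200x100x478 grid: 1PE takes 2.967 s, 56PE takes 0.064 s
s, e = speedup_and_efficiency(2.967, 0.064, 56)
print(f"speedup {s:.1f}, efficiency {e:.2f}")
```

This gives a speedup of about 46 on 56 PEs, i.e. an efficiency slightly above 80%, in line with the gradual decline of the Gf/PE column as the PE count grows for a fixed grid size.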


Table 5. Comparison of computer processing capability between HPF/JA and MPI in a
3-dimensional global MHD code of the solar wind-magnetosphere interaction, using
the Earth Simulator (ES) and the Fujitsu VPP5000/64. The 3D MHD code used, earthd,
solves the full 3-dimensional Earth's magnetosphere with (nx,ny,nz)=(500,318,318).

Kind of    Number of        HPF/JA                     MPI
computer   PEs              cpu time  Gflops  Gf/PE    cpu time  Gflops  Gf/PE  Date
                            (sec)                      (sec)
------------------------------------------------------------------------------------
ES     1node x 1PE= 1PE     17.8244 (   6.06) 6.06      0.0000 (   0.00) 0.00  2003.03.11

SX-7                1PE     17.0600 (   6.33) 6.33     16.8875 (   6.40) 6.40  2003.04
SX-7                8PE      2.2675 (  47.64) 5.95      2.2500 (  48.01) 6.00  2003.04
SX-7               16PE      1.2350 (  87.46) 5.47      1.3475 (  80.16) 5.01  2003.04
PrimePower          1PE                               156.6250 (   0.69) 0.69  2003.04
PrimePower         32PE                                 6.1750 (  17.49) 0.55  2003.04
PrimePower         64PE                                 4.6750 (  23.10) 0.36  2003.04
PrimePower         80PE                                 4.0000 (  27.00) 0.34  2003.04
------------------------------------------------------------------------------------    
                                                        use mpi_send and mpi_recv
ES     1node x 1PE= 1PE     17.8244 (   6.06) 6.06      0.0000 (   0.00) 0.00  2003.01.21
ES     1node x 2PE= 2PE      9.3535 (  11.55) 5.77      8.6676 (  12.46) 6.23  2003.01.21
ES     1node x 4PE= 4PE      4.7028 (  22.97) 5.74      4.3607 (  24.77) 6.19  2003.01.21
ES     1node x 8PE= 8PE      2.3609 (  45.76) 5.72      2.2075 (  48.94) 6.12  2003.01.21
ES     2node x 8PE=16PE      1.1998 (  90.03) 5.63      1.1229 (  96.20) 6.01  2003.01.21
ES     4node x 8PE=32PE      0.6056 ( 178.38) 5.57      0.6023 ( 179.36) 5.61  2003.01.21
ES     8node x 8PE=64PE      0.3095 ( 349.00) 5.45      0.3803 ( 284.08) 4.44  2003.01.21
ES    10node x 8PE=80PE      0.2517 ( 429.17) 5.36      0.3529 ( 306.12) 3.83  2003.01.21

                                                        use mpi_sendrecv
ES     1node x 1PE= 1PE     17.8244 (   6.06) 6.06      0.0000 (   0.00) 0.00  2003.03.11
ES     1node x 2PE= 2PE      9.3535 (  11.55) 5.77      8.6499 (  12.49) 6.24  2003.03.11
ES     1node x 4PE= 4PE      4.7028 (  22.97) 5.74      4.3484 (  24.84) 6.21  2003.03.11
ES     1node x 8PE= 8PE      2.3609 (  45.76) 5.72      2.1976 (  49.15) 6.14  2003.03.11
ES     2node x 8PE=16PE      1.1998 (  90.03) 5.63      1.1020 (  98.02) 6.13  2003.03.11
ES     4node x 8PE=32PE      0.6056 ( 178.38) 5.57      0.5552 ( 194.56) 6.08  2003.03.11
ES     8node x 8PE=64PE      0.3095 ( 349.00) 5.45      0.2812 ( 384.15) 6.00  2003.03.11
ES    10node x 8PE=80PE      0.2517 ( 429.17) 5.36      0.2265 ( 476.92) 5.96  2003.03.11

                                                        use mpi_isend and mpi_wait
ES     1node x 8PE= 8PE                                 3.9438 (  27.39) 3.42  2003.02.20
ES     8node x 8PE=64PE                                 0.7188 ( 150.28) 2.35  2003.02.20
ES     8node x 8PE=64PE                                 0.7031 ( 153.64) 2.40  2003.04.04
ES    10node x 8PE=80PE                                 0.5631 ( 191.83) 2.40  2003.04.04
                                                        without all communication
ES    10node x 8PE=80PE                                 0.22535( 479.35) 5.99  2003.02.20
ES     1node x 2PE= 2PE      8.70285(  12.41) 6.21      8.65524(  12.48) 6.24  2003.04.01
ES     2node x 1PE= 2PE      8.70214(  12.41) 6.21      8.65108(  12.49) 6.24  2003.04.01
ES     8node x 8PE=64PE      0.25132( 429.82) 6.72      0.2546 ( 424.29) 6.63  2003.04.04
ES    10node x 8PE=80PE      0.19642( 549.95) 6.87      0.21875( 493.81) 6.17  2003.04.01 

                            reflect                     use mpi_sendrecv
ES     1node x 1PE= 1PE     17.8244 (   6.06) 6.06      0.0000 (   0.00) 0.00  2003.03.11
ES     1node x 2PE= 2PE      8.7212 (  12.39) 6.19      8.6499 (  12.49) 6.24  2003.04.04
ES     1node x 4PE= 4PE      4.3891 (  24.61) 6.15      4.3484 (  24.84) 6.21  2003.04.04
ES     1node x 8PE= 8PE      2.2221 (  48.61) 6.08      2.1976 (  49.15) 6.14  2003.04.04
ES     2node x 8PE=16PE      1.1168 (  96.73) 6.05      1.1020 (  98.02) 6.13  2003.04.04
ES     4node x 8PE=32PE      0.5683 ( 190.10) 5.94      0.5552 ( 194.56) 6.08  2003.04.04
ES     8node x 8PE=64PE      0.2943 ( 367.07) 5.74      0.2812 ( 384.15) 6.00  2003.04.04
ES    10node x 8PE=80PE      0.2410 ( 448.29) 5.60      0.2224 ( 485.65) 6.07  2003.04.04 

                            without all communication   without all communication
ES     1node x 2PE= 2PE      8.1007 (  13.33) 6.67      8.6499 (  12.49) 6.24  2003.04.04
ES     1node x 4PE= 4PE      4.0669 (  26.56) 6.64      4.3484 (  24.84) 6.21  2003.04.04
ES     1node x 8PE= 8PE      2.0183 (  53.52) 6.69      2.1976 (  49.15) 6.14  2003.04.04
ES     2node x 8PE=16PE      1.0001 ( 108.01) 6.75      1.1020 (  98.02) 6.13  2003.04.04
ES     4node x 8PE=32PE      0.4884 ( 221.16) 6.91      0.5552 ( 194.56) 6.08  2003.04.04
ES     8node x 8PE=64PE      0.2324 ( 464.86) 7.26      0.2546 ( 424.29) 6.63  2003.04.04
ES    10node x 8PE=80PE      0.1812 ( 596.11) 7.45      0.2034 ( 531.12) 6.64  2003.04.04       

                                                        use mpi_send and mpi_recv
VPP5000       1PE           16.7886 (   6.44) 6.44      0.0000 (   0.00) 0.00  2003.01.26
VPP5000       2PE            8.3927 (  12.87) 6.44      7.8305 (  13.80) 6.90  2003.01.26
VPP5000       4PE            4.4073 (  24.52) 6.13      4.0589 (  26.61) 6.65  2003.01.26
VPP5000       8PE            2.1753 (  49.67) 6.21      1.9973 (  54.10) 6.76  2003.01.26
VPP5000      16PE            1.1446 (  94.40) 5.90      1.0116 ( 106.82) 6.68  2003.03.21
VPP5000      16PE            1.1446 (  94.40) 5.90      1.0103 ( 106.95) 6.68  2003.03.21
VPP5000      32PE            0.6139 ( 176.01) 5.50      0.5116 ( 211.21) 6.60  2003.03.21
                                                        use mpi_send and mpi_recv + right
VPP5000       2PE            8.3927 (  12.87) 6.44      7.8406 (  13.78) 6.89  2003.03.23
VPP5000      16PE            1.1446 (  94.40) 5.90      1.0104 ( 106.94) 6.68  2003.03.22
VPP5000      32PE            0.6139 ( 176.01) 5.50      0.5145 ( 210.01) 6.56  2003.03.23

                                                        use mpi_isend and mpi_irecv
VPP5000       2PE            7.8874 (  13.70) 6.85      7.8227 (  13.81) 6.90  2003.03.24
VPP5000      32PE            0.6139 ( 176.01) 5.50      0.5089 ( 212.31) 6.63  2003.03.18
                                                        use mpi_isend and mpi_irecv + right
VPP5000       2PE            8.3927 (  12.87) 6.44      7.8238 (  13.81) 6.90  2003.03.24
VPP5000      32PE            0.6139 ( 176.01) 5.50      0.5092 ( 212.18) 6.63  2003.03.19
                                                        without all communication
VPP5000       2PE            7.8898 (  13.69) 6.85      7.3934 (  14.61) 7.31  2003.03.24
VPP5000      32PE            0.5016 ( 215.40) 6.73      0.4639 ( 232.90) 7.28  2003.03.19

                                                        use mpi_sendrecv
VPP5000       2PE            8.3927 (  12.87) 6.44      7.8334 (  13.79) 6.90  2003.03.14
VPP5000      16PE            1.1446 (  94.40) 5.90      1.0109 ( 106.88) 6.68  2003.03.21
VPP5000      32PE            0.6139 ( 176.01) 5.50      0.5127 ( 210.74) 6.59  2003.03.18
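The "without all communication" rows give a direct handle on communication overhead: the difference between a run with halo exchange and the same run with communication removed is the time spent in, or waiting on, message passing. A short sketch using the VPP5000 32PE MPI values copied from the table (0.5127 s with mpi_sendrecv, 0.4639 s without communication):

```python
def comm_fraction(t_with_comm, t_no_comm):
    """Fraction of one time step attributable to communication."""
    return (t_with_comm - t_no_comm) / t_with_comm

# VPP5000, 32PE, MPI: 0.5127 s with mpi_sendrecv, 0.4639 s without
print(f"{comm_fraction(0.5127, 0.4639):.1%}")
```

This yields a communication share just under 10% on 32 PEs, which is consistent with the modest drop in Gf/PE at high PE counts in these tables.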


Fujitsu (ngrd1)     2PE (250,158,158) sendrecv -KV9     6.7873 (  1.999) 0.999 2003.04.23
Fujitsu (ngrd1)     4PE (250,158,158) sendrecv -KV9     3.5883 (  3.780) 0.945 2003.04.23
Fujitsu (ngrd1)     8PE (250,158,158) sendrecv -KV9     1.8780 (  7.224) 0.903 2003.04.23
Fujitsu (ngrd1)    16PE (250,158,158) sendrecv -Kfast   0.8991 ( 15.091) 0.943 2003.04.23
Fujitsu (ngrd1)    24PE (250,158,158) sendrecv -Kfast   0.6378 ( 21.273) 0.886 2003.04.23
Fujitsu (ngrd1)    32PE (250,158,158) sendrecv -Kfast   0.5036 ( 26.940) 0.841 2003.04.23
Fujitsu (ngrd1)    48PE (250,158,158) sendrecv -Kfast   4.8827 (  2.779) 0.058 2003.04.25

Fujitsu (ngrd1)     2PE (250,158,158) no comm  -KV9     6.7865 (  1.999) 0.999 2003.04.23
Fujitsu (ngrd1)     4PE (250,158,158) no comm  -KV9     3.3942 (  3.997) 0.999 2003.04.23
Fujitsu (ngrd1)     8PE (250,158,158) no comm  -KV9     1.7190 (  7.892) 0.986 2003.04.23
Fujitsu (ngrd1)    16PE (250,158,158) no comm  -Kfast   0.8289 ( 16.366) 1.023 2003.04.23
Fujitsu (ngrd1)    24PE (250,158,158) no comm  -Kfast   0.5898 ( 23.001) 0.958 2003.04.23
Fujitsu (ngrd1)    32PE (250,158,158) no comm  -Kfast   0.4464 ( 30.387) 0.950 2003.04.23
Fujitsu (ngrd1)    48PE (250,158,158) no comm  -Kfast   0.3755 ( 36.125) 0.753 2003.04.25

Fujitsu (ngrd1)     2PE (500,318,318) sendrecv -Cpp    48.4191 (  2.231) 1.116 2003.05.01
Fujitsu (ngrd1)     4PE (500,318,318) sendrecv -Cpp    24.7140 (  4.372) 1.093 2003.05.01
Fujitsu (ngrd1)     8PE (500,318,318) sendrecv -Cpp    12.9242 (  8.360) 1.045 2003.05.01
Fujitsu (ngrd1)    16PE (500,318,318) sendrecv -Cpp     6.8664 ( 15.736) 0.983 2003.05.01
Fujitsu (ngrd1)    20PE (500,318,318) sendrecv -Cpp     5.6745 ( 19.041) 0.952 2003.05.01
Fujitsu (ngrd1)    24PE (500,318,318) sendrecv -Cpp     5.4276 ( 19.907) 0.829 2003.05.01
Fujitsu (ngrd1)    32PE (500,318,318) sendrecv -Cpp     4.1945 ( 25.760) 0.805 2003.05.01
Fujitsu (ngrd1)    33PE (500,318,318) sendrecv -Cpp    15.2646 (  7.078) 0.214 2003.05.01
Fujitsu (ngrd1)    40PE (500,318,318) sendrecv -Cpp    13.9701 (  7.733) 0.193 2003.05.01
Fujitsu (ngrd1)    48PE (500,318,318) sendrecv -Cpp    12.5962 (  8.577) 0.179 2003.05.01

Fujitsu (ngrd1)     2PE (500,318,318) no comm -Cpp     47.6551 (  2.268) 1.134 2003.05.01
Fujitsu (ngrd1)     4PE (500,318,318) no comm -Cpp     22.6370 (  4.774) 1.194 2003.05.01
Fujitsu (ngrd1)     8PE (500,318,318) no comm -Cpp     11.6999 (  9.237) 1.155 2003.05.01
Fujitsu (ngrd1)    16PE (500,318,318) no comm -Cpp      5.8821 ( 18.376) 1.148 2003.05.01
Fujitsu (ngrd1)    24PE (500,318,318) no comm -Cpp      4.3032 ( 25.118) 1.047 2003.05.01
Fujitsu (ngrd1)    32PE (500,318,318) no comm -Cpp      3.2285 ( 33.479) 1.046 2003.05.01
Fujitsu (ngrd1)    48PE (500,318,318) no comm -Cpp      2.3294 ( 46.401) 0.967 2003.05.01

HPC P4Linux   2PE                                       4.5374 (  2.99 ) 1.495 2003.02.06
HPC P4Linux 256PE  (LINPACK)                                   (266.00 ) 1.039 2003.??.??
------------------------------------------------------------------------------------  


