altivec
=======

AltiVec support was developed and maintained as a user-extension
outside of RTEMS. This extension is still available (unbundled)
from Till Straumann <strauman@slac.stanford.edu>; it is useful
if an application desires 'lazy switching' of the AltiVec context.

Modes
-----

AltiVec support -- the unbundled extension, that is -- can be used
in two ways:

a. All tasks are implicitly AltiVec-enabled.

b. Only designated tasks are AltiVec-enabled. 'Lazy-context switching'
   is implemented to switch the AltiVec context.

Note that the code implemented in this directory supports mode 'a'
and mode 'a' ONLY. For mode 'b' you need the unbundled extension
(which is completely independent of this code).

Mode 'a' (All tasks are AltiVec-enabled)
----------------------------------------

The major disadvantage of this mode is that additional overhead is
involved: tasks that never use the vector unit still save/restore
the volatile vector registers (20 registers * 16 bytes each) across
every interrupt and all non-volatile registers (12 registers * 16 bytes
each) during every context switch.

However, saving/restoring, e.g., the volatile registers is quite
fast -- on my 1GHz 7457 saving or restoring 20 vector registers
takes only about 1us or even less (if there are cache hits).

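To put numbers on the overhead described above, here is a small host-side
sketch in C that computes the sizes of the two save areas. The register
counts and the 16-byte register size come from the text; the helper name
`vec_save_area_bytes` and the enum names are made up for this illustration,
and the total of 32 vector registers is a general AltiVec fact that is not
stated above.

```c
#include <assert.h>
#include <stddef.h>

enum {
    VEC_REG_BYTES        = 16, /* one vector register is 128 bits wide      */
    VEC_TOTAL_REGS       = 32, /* assumption: AltiVec register file size    */
    VEC_VOLATILE_REGS    = 20, /* saved/restored across every interrupt     */
    VEC_NONVOLATILE_REGS = 12  /* saved/restored on every context switch    */
};

/* Hypothetical helper: bytes needed to hold 'nregs' vector registers. */
size_t vec_save_area_bytes(size_t nregs)
{
    assert(nregs <= (size_t)VEC_TOTAL_REGS);
    return nregs * (size_t)VEC_REG_BYTES;
}
```

Called with `VEC_VOLATILE_REGS` this yields 320 bytes per interrupt, and
with `VEC_NONVOLATILE_REGS` it yields 192 bytes per context switch.
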
The advantage is complete transparency to the user and full ABI
compatibility (except for ISRs and exception handlers), see below.

Mode 'b' (Only dedicated tasks are AltiVec-enabled)
---------------------------------------------------

The advantage of this mode of operation is that the vector-registers
are only saved/restored when a different, altivec-enabled task becomes
ready to run. In particular, if there is only a single altivec-enabled
task then the altivec-context is *never* switched.

Note that this mode of operation is not supported by the code
in this directory -- you need the unbundled altivec extension
mentioned above.

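The bookkeeping behind mode 'b' can be illustrated with a small host-side
model in C. This is only a sketch of the idea described above (switch the
vector context only when a *different* AltiVec-enabled task gets to run);
it is not the unbundled extension's code, and all names in it
(`lazy_vec_state`, `lazy_vec_dispatch`, `NO_OWNER`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NO_OWNER (-1)

/* Hypothetical model state; the real extension keeps its own bookkeeping. */
struct lazy_vec_state {
    int      owner;    /* task whose vector context is live, or NO_OWNER */
    unsigned switches; /* how often the vector context had to be swapped */
};

void lazy_vec_init(struct lazy_vec_state *s)
{
    assert(s != NULL);
    s->owner    = NO_OWNER;
    s->switches = 0;
}

/* Model of what happens when the scheduler dispatches a task. */
void lazy_vec_dispatch(struct lazy_vec_state *s, int task_id, int altivec_enabled)
{
    assert(s != NULL);
    if (!altivec_enabled)
        return;            /* task never touches the vector unit */
    if (s->owner == task_id)
        return;            /* registers still belong to this task */
    if (s->owner != NO_OWNER)
        s->switches++;     /* save old owner's registers, restore new owner's */
    s->owner = task_id;
}
```

With a single AltiVec-enabled task interleaved with any number of ordinary
tasks, `switches` stays at zero in this model, which is the "never
switched" case mentioned above.
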
Compiler Options
----------------
```
Three compiler options affect AltiVec: -maltivec, -mabi=altivec and
-mvrsave=yes/no.

-maltivec: This lets the cpp define the symbol __ALTIVEC__ and enables
           gcc to emit vector instructions. Note that gcc may use the
           AltiVec engine implicitly, i.e., **without you writing any
           vectorized code**.

-mabi=altivec: This option has two effects:
           i) It ensures 16-byte stack alignment required by AltiVec
              (even in combination with eabi which is RTEMS' default).
           ii) It allows vector arguments to be passed in vector registers.

-mvrsave=yes/no: Instructs gcc to emit code which sets the VRSAVE register
           indicating which vector registers are 'currently in use'.
           Because the altivec support does not use this information *) the
           option has no direct effect but it is desirable to compile with
           -mvrsave=no so that no unnecessary code is generated.

          *) The file vec_sup_asm.S conditionally disables usage of
             the VRSAVE information if the preprocessor symbol
             'IGNORE_VRSAVE' is defined, which is the default.

             If 'IGNORE_VRSAVE' is undefined then the code *does*
             use the VRSAVE information but I found that this does
             not execute noticeably faster.
```

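Since -maltivec makes the preprocessor define `__ALTIVEC__`, C code can
test for that symbol to find out how it was built. The following is a
minimal sketch; the function names `built_with_altivec` and
`require_altivec_build` are invented for this example and are not part of
RTEMS.

```c
#include <assert.h>

/* Returns 1 if this translation unit was compiled with -maltivec
 * (i.e. the preprocessor defined __ALTIVEC__), and 0 otherwise. */
int built_with_altivec(void)
{
#ifdef __ALTIVEC__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical guard: trips an assertion when called from code that
 * was built without AltiVec enabled. */
void require_altivec_build(void)
{
    assert(built_with_altivec());
}
```
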
IMPORTANT NOTES
---------------

AFAIK, RTEMS uses the EABI which requires a stack alignment of only 8 bytes
which is NOT enough for AltiVec (which requires 16-byte alignment).

There are two ways to obtain 16-byte alignment:

I)  Compile with -mno-eabi (ordinary SYSV ABI has 16-byte alignment)
II) Compile with -mabi=altivec (extension to EABI; maintains 16-byte alignment
    but also allows for passing vector arguments in vector registers)

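The difference between 8-byte and 16-byte alignment is easy to check on
an address. Below is a short sketch in C; `VEC_ALIGN`, `is_vec_aligned`
and `vec_align_down` are illustrative names chosen here, not RTEMS or gcc
identifiers.

```c
#include <assert.h>
#include <stdint.h>

enum { VEC_ALIGN = 16 }; /* alignment AltiVec requires, per the text above */

/* Returns nonzero if 'addr' is a multiple of 16. */
int is_vec_aligned(uintptr_t addr)
{
    assert((VEC_ALIGN & (VEC_ALIGN - 1)) == 0); /* mask trick needs a power of two */
    return (addr & (uintptr_t)(VEC_ALIGN - 1)) == 0;
}

/* Rounds 'addr' down to the nearest 16-byte boundary. */
uintptr_t vec_align_down(uintptr_t addr)
{
    return addr & ~(uintptr_t)(VEC_ALIGN - 1);
}
```

An address such as 0x1008 satisfies the 8-byte EABI requirement but fails
`is_vec_aligned`; `vec_align_down` would move it to 0x1000.
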
Note that it is crucial to compile ***absolutely everything*** with the same
ABI options (or a linker error may occur). In particular, this includes

 - the newlib multilib variant
 - RTEMS proper
 - application + third-party code

IMO the proper compiler options for Mode 'a' would be

    -maltivec -mabi=altivec -mvrsave=no

Note that the -mcpu=7400 option also enables -maltivec and -mabi=altivec
but leaves -mvrsave at some 'default' which is probably 'no'.
Compiling with -mvrsave=yes does not produce incompatible code but
may have a performance impact (since extra code is produced to maintain
VRSAVE).

Multilib Variants
-----------------

The default GCC configuration for RTEMS contains a -mcpu=7400 multilib
variant which is the correct one to choose.

BSP 'custom' file
-----------------

Now that you have the necessary newlib and libgcc etc. variants
you also need to build RTEMS accordingly.

In your BSP's make/custom/<bsp>.cfg file make sure the CPU_CFLAGS
select the desired variant:

for mode 'a':

```shell
   CPU_CFLAGS = ... -mcpu=7400
```

Note that since -maltivec globally defines __ALTIVEC__ RTEMS automatically
enables code that takes care of switching the AltiVec context as necessary.
This is transparent to application code.

BSP support
-----------

It is the BSP's responsibility to initialize MSR_VE, VSCR and VRSAVE
during early boot, ideally before any C-code is executed (because it
may, theoretically, use vector instructions).

The BSP must

 - set MSR_VE
 - clear VRSAVE; note that the probing algorithm for detecting
   whether -mvrsave=yes or 'no' was used relies on the BSP
   clearing VRSAVE during early start. Since no interrupts or
   context switches happen before the AltiVec support is initialized
   clearing VRSAVE is no problem even if it turns out that -mvrsave=no
   was in effect (eventually a value of all-ones will be stored
   in VRSAVE in this case).
 - clear VSCR

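The three steps listed above can be modelled on a host machine as follows.
This is only a sketch of the intended register state, not BSP start code:
on real hardware the same effect would be achieved in the assembly start
file with the corresponding PowerPC register-move instructions, before any
C code runs. The struct, the function name and the `MSR_VE_MASK` value are
assumptions made for this illustration; check the mask against your CPU
manual or the RTEMS headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of the MSR vector-enable bit (not taken from this README). */
#define MSR_VE_MASK UINT32_C(0x02000000)

/* Host-side stand-in for the three registers the BSP must initialize. */
struct vec_boot_regs {
    uint32_t msr;
    uint32_t vrsave;
    uint32_t vscr;
};

/* Hypothetical model of the early-boot AltiVec setup. */
void vec_early_init_model(struct vec_boot_regs *r)
{
    assert(r != NULL);
    r->msr   |= MSR_VE_MASK; /* set MSR_VE, leave all other MSR bits alone */
    r->vrsave = 0;           /* clear VRSAVE so the -mvrsave probing works */
    r->vscr   = 0;           /* clear VSCR */
}
```
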
PSIM note
---------

PSIM supports the AltiVec instruction set with the exception of
the 'data stream' instructions for cache prefetching. The RTEMS
altivec support includes run-time checks to skip these instructions
when executing on PSIM.

Note that AltiVec support within PSIM must be enabled at 'configure'
time by passing the 'configure' option

```shell
--enable-sim-float=altivec
```

Note also that PSIM's AltiVec support has many bugs. It is recommended
to apply the patches filed as an attachment with gdb bug report #2461
prior to building PSIM.

The CPU type and corresponding multilib must be changed when
building RTEMS/psim:

Edit make/custom/psim.cfg and change

```shell
    CPU_CFLAGS = ... -mcpu=603e
```

to

```shell
    CPU_CFLAGS = ... -mcpu=7400
```

This change must be performed *before* configuring RTEMS/psim.