altivec
=======

AltiVec support was developed and maintained as a user extension
outside of RTEMS. This extension is still available (unbundled)
from Till Straumann <strauman@slac.stanford.edu>; it is useful
if an application desires 'lazy switching' of the AltiVec context.

Modes
-----

AltiVec support -- the unbundled extension, that is -- can be used
in two ways:

a. All tasks are implicitly AltiVec-enabled.

b. Only designated tasks are AltiVec-enabled. 'Lazy context switching'
   is implemented to switch the AltiVec context.

Note that the code implemented in this directory supports mode 'a'
and mode 'a' ONLY. For mode 'b' you need the unbundled extension
(which is completely independent of this code).

Mode 'a' (All tasks are AltiVec-enabled)
----------------------------------------

The major disadvantage of this mode is the additional overhead:
tasks that never use the vector unit still save/restore the volatile
vector registers (20 registers * 16 bytes each) across every interrupt
and all non-volatile registers (12 registers * 16 bytes each) during
every context switch.

However, saving/restoring, e.g., the volatile registers is quite
fast -- on my 1GHz 7457 saving or restoring 20 vector registers
takes only about 1us or even less (if there are cache hits).

The advantage is complete transparency to the user and full ABI
compatibility (except for ISRs and exception handlers), see below.

Mode 'b' (Only dedicated tasks are AltiVec-enabled)
---------------------------------------------------

The advantage of this mode of operation is that the vector registers
are only saved/restored when a different, AltiVec-enabled task becomes
ready to run. In particular, if there is only a single AltiVec-enabled
task then the AltiVec context is *never* switched.

Note that this mode of operation is not supported by the code
in this directory -- you need the unbundled AltiVec extension
mentioned above.

Compiler Options
----------------

```
Three compiler options affect AltiVec: -maltivec, -mabi=altivec and
-mvrsave=yes/no.

-maltivec: This lets the cpp define the symbol __ALTIVEC__ and enables
           gcc to emit vector instructions. Note that gcc may use the
           AltiVec engine implicitly, i.e., **without you writing any
           vectorized code**.

-mabi=altivec: This option has two effects:
           i)  It ensures 16-byte stack alignment required by AltiVec
               (even in combination with eabi which is RTEMS' default).
           ii) It allows vector arguments to be passed in vector registers.

-mvrsave=yes/no: Instructs gcc to emit code which sets the VRSAVE register
           indicating which vector registers are 'currently in use'.
           Because the altivec support does not use this information *) the
           option has no direct effect but it is desirable to compile with
           -mvrsave=no so that no unnecessary code is generated.

      *) The file vec_sup_asm.S conditionally disables usage of
         the VRSAVE information if the preprocessor symbol
         'IGNORE_VRSAVE' is defined, which is the default.

         If 'IGNORE_VRSAVE' is undefined then the code *does*
         use the VRSAVE information but I found that this does
         not execute noticeably faster.
```

IMPORTANT NOTES
---------------

AFAIK, RTEMS uses the EABI which requires a stack alignment of only 8 bytes
which is NOT enough for AltiVec (which requires 16-byte alignment).

There are two ways of obtaining 16-byte alignment:

I)  Compile with -mno-eabi (ordinary SYSV ABI has 16-byte alignment)
II) Compile with -mabi=altivec (extension to EABI; maintains 16-byte alignment
    but also allows for passing vector arguments in vector registers)

Note that it is crucial to compile ***absolutely everything*** with the same
ABI options (or a linker error may occur). In particular, this includes

- newlib multilib variant
- RTEMS proper
- application + third-party code

IMO the proper compiler options for Mode 'a' would be

    -maltivec -mabi=altivec -mvrsave=no

Note that the -mcpu=7400 option also enables -maltivec and -mabi=altivec
but leaves -mvrsave at some 'default' which is probably 'no'.
Compiling with -mvrsave=yes does not produce incompatible code but
may have a performance impact (since extra code is produced to maintain
VRSAVE).


Multilib Variants
-----------------

The default GCC configuration for RTEMS contains a -mcpu=7400 multilib
variant which is the correct one to choose.


BSP 'custom' file
-----------------

Now that you have the necessary newlib and libgcc etc. variants
you also need to build RTEMS accordingly.

In your BSP's make/custom/<bsp>.cfg file make sure the CPU_CFLAGS
select the desired variant:

for mode 'a':

```shell
CPU_CFLAGS = ... -mcpu=7400
```

Note that since -maltivec globally defines __ALTIVEC__ RTEMS automatically
enables code that takes care of switching the AltiVec context as necessary.
This is transparent to application code.

BSP support
-----------

It is the BSP's responsibility to initialize MSR_VE, VSCR and VRSAVE
during early boot, ideally before any C code is executed (because it
may, theoretically, use vector instructions).

The BSP must

- set MSR_VE
- clear VRSAVE; note that the probing algorithm for detecting
  whether -mvrsave=yes or 'no' was used relies on the BSP
  clearing VRSAVE during early start. Since no interrupts or
  context switches happen before the AltiVec support is initialized
  clearing VRSAVE is no problem even if it turns out that -mvrsave=no
  was in effect (eventually a value of all-ones will be stored
  in VRSAVE in this case).
- clear VSCR

PSIM note
---------

PSIM supports the AltiVec instruction set with the exception of
the 'data stream' instructions for cache prefetching. The RTEMS
AltiVec support includes run-time checks to skip these instructions
when executing on PSIM.

Note that AltiVec support within PSIM must be enabled at 'configure'
time by passing the 'configure' option

```shell
--enable-sim-float=altivec
```

Note also that PSIM's AltiVec support has many bugs. It is recommended
to apply the patches filed as an attachment with gdb bug report #2461
prior to building PSIM.

The CPU type and corresponding multilib must be changed when
building RTEMS/psim:

edit make/custom/psim.cfg and change

```shell
CPU_CFLAGS = ... -mcpu=603e
```

to

```shell
CPU_CFLAGS = ... -mcpu=7400
```

This change must be performed *before* configuring RTEMS/psim.
0188 This change must be performed *before* configuring RTEMS/psim.