3.18.3 ARC Options
The following options control the architecture variant for which code
is being compiled:
-mbarrel-shifter
- Generate instructions supported by the barrel shifter. This is the default
unless -mcpu=ARC601 or -mcpu=ARCEM is in effect.
-mjli-always
- Force function calls to use the jli_s instruction. This option is
valid only for the ARCv2 architecture.
-mcpu=cpu
- Set architecture type, register usage, and instruction scheduling
parameters for cpu. There are also shortcut alias options
available for backward compatibility and convenience. Supported
values for cpu are:
- ‘arc600’
- Compile for ARC600. Aliases: -mA6, -mARC600.
- ‘arc601’
- Compile for ARC601. Alias: -mARC601.
- ‘arc700’
- Compile for ARC700. Aliases: -mA7, -mARC700.
This is the default when configured with --with-cpu=arc700.
- ‘arcem’
- Compile for ARC EM.
- ‘archs’
- Compile for ARC HS.
- ‘em’
- Compile for ARC EM CPU with no hardware extensions.
- ‘em4’
- Compile for ARC EM4 CPU.
- ‘em4_dmips’
- Compile for ARC EM4 DMIPS CPU.
- ‘em4_fpus’
- Compile for ARC EM4 DMIPS CPU with the single-precision floating-point
extension.
- ‘em4_fpuda’
- Compile for ARC EM4 DMIPS CPU with single-precision floating-point and
double assist instructions.
- ‘hs’
- Compile for ARC HS CPU with no hardware extensions except the atomic
instructions.
- ‘hs34’
- Compile for ARC HS34 CPU.
- ‘hs38’
- Compile for ARC HS38 CPU.
- ‘hs38_linux’
- Compile for ARC HS38 CPU with all hardware extensions on.
- ‘arc600_norm’
- Compile for ARC 600 CPU with norm instructions enabled.
- ‘arc600_mul32x16’
- Compile for ARC 600 CPU with norm and 32x16-bit multiply
instructions enabled.
- ‘arc600_mul64’
- Compile for ARC 600 CPU with norm and mul64-family
instructions enabled.
- ‘arc601_norm’
- Compile for ARC 601 CPU with norm instructions enabled.
- ‘arc601_mul32x16’
- Compile for ARC 601 CPU with norm and 32x16-bit multiply
instructions enabled.
- ‘arc601_mul64’
- Compile for ARC 601 CPU with norm and mul64-family
instructions enabled.
- ‘nps400’
- Compile for ARC 700 on NPS400 chip.
- ‘em_mini’
- Compile for ARC EM minimalist configuration featuring reduced register
set.
-mdpfp
-mdpfp-compact
- Generate double-precision FPX instructions, tuned for the compact
implementation.
-mdpfp-fast
- Generate double-precision FPX instructions, tuned for the fast
implementation.
-mno-dpfp-lrsr
- Disable lr and sr instructions from using FPX extension
aux registers.
-mea
- Generate extended arithmetic instructions. Currently only
divaw, adds, subs, and sat16 are supported. This is always enabled
for -mcpu=ARC700.
-mno-mpy
- Do not generate mpy-family instructions for ARC700. This option is
deprecated.
-mmul32x16
- Generate 32x16-bit multiply and multiply-accumulate instructions.
-mmul64
- Generate mul64 and mulu64 instructions.
Only valid for -mcpu=ARC600.
-mnorm
- Generate norm instructions. This is the default if -mcpu=ARC700
is in effect.
-mspfp
-mspfp-compact
- Generate single-precision FPX instructions, tuned for the compact
implementation.
-mspfp-fast
- Generate single-precision FPX instructions, tuned for the fast
implementation.
-msimd
- Enable generation of ARC SIMD instructions via target-specific
builtins. Only valid for -mcpu=ARC700.
-msoft-float
- This option is ignored; it is provided for compatibility purposes only.
Software floating-point code is emitted by default, and this default
can be overridden by FPX options: -mspfp, -mspfp-compact, or
-mspfp-fast for single precision, and -mdpfp,
-mdpfp-compact, or -mdpfp-fast for double precision.
-mswap
- Generate swap instructions.
-matomic
- This enables use of the locked load/store conditional extension to implement
atomic memory built-in functions. Not available for ARC 6xx or ARC
EM cores.
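As a rough illustration of the kind of code this option affects, the sketch below uses GCC's generic `__atomic` built-ins on a shared counter. The counter and function names are hypothetical and chosen only for this example. The text above says only that -matomic lets the compiler use the locked load/store conditional extension for such built-ins; how they are expanded on a given core is not shown here.

```c
#include <stdint.h>

/* Hypothetical shared counter; the name is illustrative only. */
static uint32_t event_count;

/* Atomically add 1 to the counter and return the value it held before.
   __atomic_fetch_add is a generic GCC built-in, not ARC-specific. */
uint32_t count_event(void)
{
    return __atomic_fetch_add(&event_count, 1u, __ATOMIC_SEQ_CST);
}

/* Atomically read the current counter value. */
uint32_t read_event_count(void)
{
    return __atomic_load_n(&event_count, __ATOMIC_SEQ_CST);
}
```

The same source compiles for any GCC target; only the generated instruction sequence is expected to differ between cores.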
-mdiv-rem
- Enable div and rem instructions for ARCv2 cores.
-mcode-density
- Enable code density instructions for ARC EM.
This option is on by default for ARC HS.
-mll64
- Enable double load/store operations for ARC HS cores.
-mtp-regno=regno
- Specify thread pointer register number.
-mmpy-option=multo
- Compile ARCv2 code with a multiplier design option. You can specify
the option using either a string or numeric value for multo.
‘wlh1’ is the default value. The recognized values are:
- ‘0’
- ‘none’
- No multiplier available.
- ‘1’
- ‘w’
- 16x16 multiplier, fully pipelined.
The following instructions are enabled: mpyw and mpyuw.
- ‘2’
- ‘wlh1’
- 32x32 multiplier, fully pipelined (1 stage). The following
instructions are additionally enabled: mpy, mpyu, mpym, mpymu,
and mpy_s.
- ‘3’
- ‘wlh2’
- 32x32 multiplier, fully pipelined (2 stages). The following
instructions are additionally enabled: mpy, mpyu, mpym, mpymu,
and mpy_s.
- ‘4’
- ‘wlh3’
- Two 16x16 multipliers, blocking, sequential. The following
instructions are additionally enabled: mpy, mpyu, mpym, mpymu,
and mpy_s.
- ‘5’
- ‘wlh4’
- One 16x16 multiplier, blocking, sequential. The following
instructions are additionally enabled: mpy, mpyu, mpym, mpymu,
and mpy_s.
- ‘6’
- ‘wlh5’
- One 32x4 multiplier, blocking, sequential. The following
instructions are additionally enabled: mpy, mpyu, mpym, mpymu,
and mpy_s.
- ‘7’
- ‘plus_dmpy’
- ARC HS SIMD support.
- ‘8’
- ‘plus_macd’
- ARC HS SIMD support.
- ‘9’
- ‘plus_qmacw’
- ARC HS SIMD support.
This option is only available for ARCv2 cores.
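Because each multiplier design can be named either by number or by string, a build helper may want to translate between the two forms. The following is a minimal sketch of such a lookup, using only the pairs listed above. The function names are hypothetical and not part of GCC.

```c
#include <stddef.h>
#include <string.h>

/* Index = numeric value of multo, entry = equivalent string value,
   taken from the list of recognized values above. */
static const char *const mpy_option_names[] = {
    "none", "w", "wlh1", "wlh2", "wlh3", "wlh4", "wlh5",
    "plus_dmpy", "plus_macd", "plus_qmacw"
};

enum { MPY_OPTION_COUNT = sizeof mpy_option_names / sizeof mpy_option_names[0] };

/* Return the string form for a numeric value, or NULL if out of range. */
const char *mpy_option_name(int number)
{
    if (number < 0 || number >= MPY_OPTION_COUNT)
        return NULL;
    return mpy_option_names[number];
}

/* Return the numeric form for a string value, or -1 if unrecognized. */
int mpy_option_number(const char *name)
{
    if (name == NULL)
        return -1;
    for (int i = 0; i < MPY_OPTION_COUNT; i++) {
        if (strcmp(name, mpy_option_names[i]) == 0)
            return i;
    }
    return -1;
}
```

For example, `mpy_option_number("wlh1")` yields 2, matching the default value named in the text.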
-mfpu=fpu
- Enables support for specific floating-point hardware extensions for ARCv2
cores. Supported values for fpu are:
- ‘fpus’
- Enables support for single-precision floating-point hardware
extensions.
- ‘fpud’
- Enables support for double-precision floating-point hardware
extensions. The single-precision floating-point extension is also
enabled. Not available for ARC EM.
- ‘fpuda’
- Enables support for double-precision floating-point hardware
extensions using double-precision assist instructions. The single-precision
floating-point extension is also enabled. This option is
only available for ARC EM.
- ‘fpuda_div’
- Enables support for double-precision floating-point hardware
extensions using double-precision assist instructions.
The single-precision floating-point, square-root, and divide
extensions are also enabled. This option is
only available for ARC EM.
- ‘fpuda_fma’
- Enables support for double-precision floating-point hardware
extensions using double-precision assist instructions.
The single-precision floating-point and fused multiply and add
hardware extensions are also enabled. This option is
only available for ARC EM.
- ‘fpuda_all’
- Enables support for double-precision floating-point hardware
extensions using double-precision assist instructions.
All single-precision floating-point hardware extensions are also
enabled. This option is only available for ARC EM.
- ‘fpus_div’
- Enables support for single-precision floating-point, square-root and divide
hardware extensions.
- ‘fpud_div’
- Enables support for double-precision floating-point, square-root and divide
hardware extensions. This option
includes option ‘fpus_div’. Not available for ARC EM.
- ‘fpus_fma’
- Enables support for single-precision floating-point and
fused multiply and add hardware extensions.
- ‘fpud_fma’
- Enables support for double-precision floating-point and
fused multiply and add hardware extensions. This option
includes option ‘fpus_fma’. Not available for ARC EM.
- ‘fpus_all’
- Enables support for all single-precision floating-point hardware
extensions.
- ‘fpud_all’
- Enables support for all single- and double-precision floating-point
hardware extensions. Not available for ARC EM.
-mirq-ctrl-saved=register-range, blink, lp_count
- Specifies general-purpose registers that the processor automatically
saves/restores on interrupt entry and exit. register-range is
specified as two registers separated by a dash. The register range
always starts with r0; the upper limit is the fp register.
blink and lp_count are optional. This option is only
valid for ARC EM and ARC HS cores.
-mrgf-banked-regs=number
- Specifies the number of registers replicated in the second register bank
on entry to a fast interrupt. Fast interrupts are interrupts with the
highest priority level P0. These interrupts save only PC and STATUS32
registers to avoid memory transactions during interrupt entry and exit
sequences. Use this option when you are using fast interrupts in an
ARCv2 family processor. Permitted values are 4, 8, 16, and 32.
-mlpc-width=width
- Specify the width of the lp_count register. Valid values for
width are 8, 16, 20, 24, 28 and 32 bits. The default width is
fixed to 32 bits. If the width is less than 32, the compiler does not
attempt to transform loops in your program to use the zero-delay loop
mechanism unless it is known that the lp_count register can
hold the required loop-counter value. Depending on the width
specified, the compiler and run-time library might continue to use the
loop mechanism for various needs. This option defines macro
__ARC_LPC_WIDTH__ with the value of width.
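Source code can consult the macro to reason about which iteration counts fit in the loop counter. The sketch below is illustrative only: it assumes the counter is treated as an unsigned quantity of width bits, and it falls back to the documented default of 32 when the macro is not defined (for instance on a non-ARC host). `LPC_WIDTH` and both function names are hypothetical.

```c
#include <stdint.h>

/* Use the compiler-provided width when present; otherwise assume the
   documented default of 32 bits (an assumption made for this sketch). */
#ifdef __ARC_LPC_WIDTH__
#define LPC_WIDTH __ARC_LPC_WIDTH__
#else
#define LPC_WIDTH 32
#endif

/* Largest value representable in an unsigned field of LPC_WIDTH bits.
   The shift is done in 64 bits so that a width of 32 is well defined. */
uint32_t lpc_max_count(void)
{
    return (uint32_t)((UINT64_C(1) << LPC_WIDTH) - 1u);
}

/* Nonzero if an iteration count fits in a register of that width. */
int fits_in_lp_count(uint32_t iterations)
{
    return iterations <= lpc_max_count();
}
```

With a width of 8, for instance, `lpc_max_count()` evaluates to 255, so a loop known to run 1000 times would not qualify under this simplified model.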
-mrf16
- This option instructs the compiler to generate code for a 16-entry
register file. This option defines the __ARC_RF16__
preprocessor macro.
-mbranch-index
- Enable use of bi or bih instructions to implement jump
tables.
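Jump tables typically arise from dense switch statements such as the one sketched below. Whether the compiler builds a jump table for this function, and whether it then uses bi or bih, depends on the optimization level and target and is not guaranteed by this example. The function name and operation codes are hypothetical.

```c
/* Dispatch on a small, dense range of operation codes. A switch of this
   shape is a common candidate for a jump table. */
int apply_op(unsigned op, int a, int b)
{
    switch (op) {
    case 0: return a + b;
    case 1: return a - b;
    case 2: return a * b;
    case 3: return a & b;
    case 4: return a | b;
    case 5: return a ^ b;
    default: return 0;   /* unknown operation code */
    }
}
```

Inspecting the assembly output (for example with -S) with and without -mbranch-index is one way to see what the option changes for a given function.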
The following options are passed through to the assembler, and also
define preprocessor macro symbols.
-mdsp-packa
- Passed down to the assembler to enable the DSP Pack A extensions.
Also sets the preprocessor symbol __Xdsp_packa. This option is
deprecated.
-mdvbf
- Passed down to the assembler to enable the dual Viterbi butterfly
extension. Also sets the preprocessor symbol __Xdvbf. This
option is deprecated.
-mlock
- Passed down to the assembler to enable the locked load/store
conditional extension. Also sets the preprocessor symbol __Xlock.
-mmac-d16
- Passed down to the assembler. Also sets the preprocessor symbol
__Xxmac_d16. This option is deprecated.
-mmac-24
- Passed down to the assembler. Also sets the preprocessor symbol
__Xxmac_24. This option is deprecated.
-mrtsc
- Passed down to the assembler to enable the 64-bit time-stamp counter
extension instruction. Also sets the preprocessor symbol
__Xrtsc. This option is deprecated.
-mswape
- Passed down to the assembler to enable the swap byte ordering
extension instruction. Also sets the preprocessor symbol
__Xswape.
-mtelephony
- Passed down to the assembler to enable dual- and single-operand
instructions for telephony. Also sets the preprocessor symbol
__Xtelephony. This option is deprecated.
-mxy
- Passed down to the assembler to enable the XY memory extension. Also
sets the preprocessor symbol __Xxy.
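The preprocessor symbols set by these options can be tested in source code to choose between code paths. The sketch below checks `__Xswape` and otherwise falls back to a portable byte swap. It is an illustration only: the text does not say that the compiler selects the swape instruction for `__builtin_bswap32`, which is a generic GCC built-in, and the function name is hypothetical.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap_bytes32(uint32_t x)
{
#ifdef __Xswape
    /* Built with -mswape: hand the operation to the generic built-in and
       let the compiler pick the instruction sequence. */
    return __builtin_bswap32(x);
#else
    /* Portable fallback using shifts and masks. */
    return (x >> 24)
         | ((x >> 8) & UINT32_C(0x0000FF00))
         | ((x << 8) & UINT32_C(0x00FF0000))
         | (x << 24);
#endif
}
```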
The following options control how the assembly code is annotated:
-misize
- Annotate assembler instructions with estimated addresses.
-mannotate-align
- Explain what alignment considerations lead to the decision to make an
instruction short or long.
The following options are passed through to the linker:
-marclinux
- Passed through to the linker, to specify use of the arclinux
emulation. This option is enabled by default in tool chains built for
arc-linux-uclibc and arceb-linux-uclibc targets when profiling is not
requested.
-marclinux_prof
- Passed through to the linker, to specify use of the arclinux_prof
emulation. This option is enabled by default in tool chains built for
arc-linux-uclibc and arceb-linux-uclibc targets when profiling is
requested.
The following options control the semantics of generated code:
-mlong-calls
- Generate calls as register indirect calls, thus providing access
to the full 32-bit address range.
-mmedium-calls
- Don't use less than a 25-bit addressing range for calls, which is the
offset available for an unconditional branch-and-link
instruction. Conditional execution of function calls is suppressed, to
allow use of the 25-bit range, rather than the 21-bit range with
conditional branch-and-link. This is the default for tool chains built
for arc-linux-uclibc and arceb-linux-uclibc targets.
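To make the difference between the two ranges concrete, the sketch below computes the reach of each offset width. It assumes the offset is a signed byte displacement relative to the call site, which the text above does not state explicitly; under that assumption a 25-bit offset reaches 16 MiB in either direction and a 21-bit offset reaches 1 MiB. The function names are hypothetical.

```c
#include <stdint.h>

/* One-directional reach, in bytes, of a signed displacement that is
   `bits` wide (valid for 1 <= bits <= 63). */
int64_t signed_reach_bytes(unsigned bits)
{
    return (int64_t)1 << (bits - 1);
}

/* Nonzero if `displacement` is representable in a signed field of
   `bits` bits, i.e. lies in [-reach, reach). */
int call_in_range(int64_t displacement, unsigned bits)
{
    int64_t reach = signed_reach_bytes(bits);
    return displacement >= -reach && displacement < reach;
}
```

Under this model a callee 2 MiB away is reachable with the 25-bit form but not with the 21-bit conditional form, which is the situation -mmedium-calls is meant to avoid.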
-G num
- Put definitions of externally-visible data in a small data section if
that data is no bigger than num bytes. The default value of
num is 4 for any ARC configuration, or 8 when double
load/store operations are available.
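The size rule can be pictured with two externally-visible objects of different sizes. This is a simplified sketch: the variable and function names are hypothetical, and the predicate models only the "no bigger than num bytes" test from the text, not the compiler's full placement logic (for example, -mno-sdata disables small-data references altogether).

```c
#include <stddef.h>
#include <stdint.h>

/* Two externally-visible definitions of different sizes. */
int32_t tick_count;       /* 4 bytes */
int64_t last_timestamp;   /* 8 bytes */

/* Models the size test only: data qualifies when it is no bigger than
   the -G threshold. */
int fits_small_data(size_t object_size, size_t g_threshold)
{
    return object_size <= g_threshold;
}
```

With the default threshold of 4, only `tick_count` passes this test; with a threshold of 8, both objects do.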
-mno-sdata
- Do not generate sdata references. This is the default for tool chains
built for arc-linux-uclibc and arceb-linux-uclibc targets.
-mvolatile-cache
- Use ordinarily cached memory accesses for volatile references. This is the
default.
-mno-volatile-cache
- Enable cache bypass for volatile references.
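These two options matter for code that reads device state through volatile objects, as in the sketch below. The register layout, names, and polling loop are hypothetical. The example shows only which accesses count as volatile references; whether they bypass the cache is decided by the option in effect.

```c
#include <stdint.h>

/* Hypothetical device register block; the layout is illustrative only. */
struct dev_regs {
    volatile uint32_t status;
    volatile uint32_t data;
};

/* Poll the status register up to max_polls times and return the data
   register once any bit in ready_mask is set; return 0 on timeout.
   Every read of dev->status and dev->data is a volatile reference. */
uint32_t read_when_ready(struct dev_regs *dev, uint32_t ready_mask,
                         unsigned max_polls)
{
    while (max_polls-- > 0) {
        if (dev->status & ready_mask)
            return dev->data;
    }
    return 0;
}
```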
The following options fine tune code generation:
-malign-call
- Do alignment optimizations for call instructions.
-mauto-modify-reg
- Enable the use of pre/post modify with register displacement.
-mbbit-peephole
- Enable bbit peephole2.
-mno-brcc
- This option disables a target-specific pass in arc_reorg to
generate compare-and-branch (brcc) instructions.
It has no effect on
generation of these instructions driven by the combiner pass.
-mcase-vector-pcrel
- Use PC-relative switch case tables to enable case table shortening.
This is the default for -Os.
-mcompact-casesi
- Enable compact casesi pattern. This is the default for -Os,
and only available for ARCv1 cores. This option is deprecated.
-mno-cond-exec
- Disable the ARCompact-specific pass to generate conditional
execution instructions.
Due to delay slot scheduling and interactions between operand numbers,
literal sizes, instruction lengths, and the support for conditional execution,
the target-independent pass to generate conditional execution is often lacking,
so the ARC port has kept a special pass around that tries to find more
conditional execution generation opportunities after register allocation,
branch shortening, and delay slot scheduling have been done. This pass
generally, but not always, improves performance and code size, at the cost of
extra compilation time, which is why there is an option to switch it off.
If you have a problem with call instructions exceeding their allowable
offset range because they are conditionalized, you should consider using
-mmedium-calls instead.
-mearly-cbranchsi
- Enable pre-reload use of the cbranchsi pattern.
-mexpand-adddi
- Expand adddi3 and subdi3 at RTL generation time into
add.f, adc etc. This option is deprecated.
-mindexed-loads
- Enable the use of indexed loads. This can be problematic because some
optimizers then assume that indexed stores exist, which is not
the case.
-mlra
- Enable Local Register Allocation. This is still experimental for ARC,
so by default the compiler uses standard reload
(i.e. -mno-lra).
-mlra-priority-none
- Don't indicate any priority for target registers.
-mlra-priority-compact
- Indicate target register priority for r0..r3 / r12..r15.
-mlra-priority-noncompact
- Reduce target register priority for r0..r3 / r12..r15.
-mmillicode
- When optimizing for size (using -Os), prologues and epilogues
that have to save or restore a large number of registers are often
shortened by using a call to a special function in libgcc; this is
referred to as a millicode call. As these calls can pose
performance issues, and/or cause linking issues when linking in a
nonstandard way, this option is provided to turn on or off millicode
call generation.
-mcode-density-frame
- This option enables the compiler to emit enter and leave
instructions. These instructions are only valid for CPUs with the
code-density feature.
-mmixed-code
- Tweak register allocation to help 16-bit instruction generation.
This generally has the effect of decreasing the average instruction size
while increasing the instruction count.
-mq-class
- Enable ‘q’ instruction alternatives.
This is the default for -Os.
-mRcq
- Enable ‘Rcq’ constraint handling.
Most short code generation depends on this.
This is the default.
-mRcw
- Enable ‘Rcw’ constraint handling.
Most ccfsm condexec depends on this.
This is the default.
-msize-level=level
- Fine-tune size optimization with regard to instruction lengths and alignment.
The recognized values for level are:
- ‘0’
- No size optimization. This level is deprecated and treated like ‘1’.
- ‘1’
- Short instructions are used opportunistically.
- ‘2’
- In addition, alignment of loops and of code after barriers is dropped.
- ‘3’
- In addition, optional data alignment is dropped, and the option -Os is enabled.
This defaults to ‘3’ when -Os is in effect. Otherwise,
the behavior when this is not set is equivalent to level ‘1’.
-mtune=cpu
- Set instruction scheduling parameters for cpu, overriding any implied
by -mcpu=.
Supported values for cpu are:
- ‘ARC600’
- Tune for ARC600 CPU.
- ‘ARC601’
- Tune for ARC601 CPU.
- ‘ARC700’
- Tune for ARC700 CPU with standard multiplier block.
- ‘ARC700-xmac’
- Tune for ARC700 CPU with XMAC block.
- ‘ARC725D’
- Tune for ARC725D CPU.
- ‘ARC750D’
- Tune for ARC750D CPU.
-mmultcost=num
- Cost to assume for a multiply instruction, with ‘4’ being equal to a
normal instruction.
-munalign-prob-threshold=probability
- Set probability threshold for unaligning branches.
When tuning for ‘ARC700’ and optimizing for speed, branches without
a filled delay slot are preferably emitted unaligned and long, unless
profiling indicates that the probability for the branch to be taken
is below probability. See Cross-profiling.
The default is (REG_BR_PROB_BASE/2), i.e. 5000.
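Since half of REG_BR_PROB_BASE is given as 5000, the base itself is 10000, so a threshold can be derived from a percentage as sketched below. `PROB_BASE` is a local constant defined for this example rather than taken from a GCC header, and the function name is hypothetical.

```c
/* Local stand-in for REG_BR_PROB_BASE, inferred from the text:
   REG_BR_PROB_BASE/2 is 5000, so the base is 10000. */
enum { PROB_BASE = 10000 };

/* Convert a taken-probability in percent (clamped to 0..100) into the
   scale used by -munalign-prob-threshold. */
int threshold_from_percent(int percent)
{
    if (percent < 0)
        percent = 0;
    if (percent > 100)
        percent = 100;
    return percent * PROB_BASE / 100;
}
```

A 50 percent threshold therefore corresponds to the default value of 5000, and 25 percent to -munalign-prob-threshold=2500.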
The following options are maintained for backward compatibility, but
are now deprecated and will be removed in a future release:
-margonaut
- Obsolete FPX.
-mbig-endian
-EB
- Compile code for big-endian targets. Use of these options is now
deprecated. Big-endian code is supported by configuring GCC to build
arceb-elf32 and arceb-linux-uclibc targets,
for which big endian is the default.
-mlittle-endian
-EL
- Compile code for little-endian targets. Use of these options is now
deprecated. Little-endian code is supported by configuring GCC to build
arc-elf32 and arc-linux-uclibc targets,
for which little endian is the default.
-mbarrel_shifter
- Replaced by -mbarrel-shifter.
-mdpfp_compact
- Replaced by -mdpfp-compact.
-mdpfp_fast
- Replaced by -mdpfp-fast.
-mdsp_packa
- Replaced by -mdsp-packa.
-mEA
- Replaced by -mea.
-mmac_24
- Replaced by -mmac-24.
-mmac_d16
- Replaced by -mmac-d16.
-mspfp_compact
- Replaced by -mspfp-compact.
-mspfp_fast
- Replaced by -mspfp-fast.
-mtune=cpu
- Values ‘arc600’, ‘arc601’, ‘arc700’ and
‘arc700-xmac’ for cpu are replaced by ‘ARC600’,
‘ARC601’, ‘ARC700’ and ‘ARC700-xmac’ respectively.
-multcost=num
- Replaced by -mmultcost.