Custom instructions: what and how
The ARC architecture allows users to specify extension instructions. These extension instructions are not macros; the assembler generates encodings for them according to the user's specification.
To create a custom instruction, one needs to use the .extInstruction pseudo-op, which also lets the user choose a particular instruction syntax, one of:
Three operand instruction;
Two operand instruction;
One operand instruction;
No operand instruction.
But what is the difference between them? To answer this question, we need to look at how an extension instruction is encoded:
Major Opcode = 0x07
    Sub Opcode1 = 0x00-0x2E, 0x30-0x3F: Used by three-operand instructions;
    Sub Opcode1 = 0x2F:
        Sub Opcode2 = 0x00-0x3E: Used by two-operand instructions;
        Sub Opcode2 = 0x3F:
            Sub Opcode3 = 0x00-0x3F: Used by one-operand instructions.
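The opcode ranges above amount to a small decision tree, which can be sketched in Python. This is only an illustration: the function name operand_class and the bit positions (major opcode in bits 31:27, Sub Opcode1 in bits 21:16, Sub Opcode2 in bits 5:0 of the 32-bit instruction word) are assumptions that are not stated in this text, so check them against the ARC programmer's reference before relying on them.

```python
MAJOR_EXT = 0x07  # major opcode used for extension instructions in this text


def operand_class(word):
    """Classify a 32-bit extension instruction word by its sub-opcodes.

    Assumed field layout (not given in the text above):
      bits 31:27  major opcode
      bits 21:16  Sub Opcode1
      bits  5:0   Sub Opcode2 (only meaningful when Sub Opcode1 == 0x2F)
    """
    major = (word >> 27) & 0x1F
    if major != MAJOR_EXT:
        raise ValueError("not a major-opcode 0x07 instruction")

    sub_opcode1 = (word >> 16) & 0x3F
    if sub_opcode1 != 0x2F:
        # 0x00-0x2E and 0x30-0x3F
        return "three-operand"

    sub_opcode2 = word & 0x3F
    if sub_opcode2 != 0x3F:
        # 0x00-0x3E
        return "two-operand"

    # Sub Opcode2 == 0x3F: Sub Opcode3 selects the instruction
    return "one-operand"


if __name__ == "__main__":
    print(operand_class(0x39210080))
```

Note that a no-operand instruction would be reported as "one-operand" by this sketch, since it shares the one-operand encoding.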
Three-operand instructions use the op<.cc><.f> a,b,c syntax format, which is the most general form of an ARC instruction:
op<.f> a,b,c
op<.f> a,b,u6
op<.f> b,b,s12
op<.cc><.f> b,b,c
op<.cc><.f> b,b,u6
op<.f> a,limm,c
op<.f> a,limm,u6
op<.f> 0,limm,s12
op<.cc><.f> 0,limm,c
op<.cc><.f> 0,limm,u6
op<.f> a,b,limm
op<.cc><.f> b,b,limm
op<.f> a,limm,limm
op<.cc><.f> 0,limm,limm
op<.f> 0,b,c
op<.f> 0,b,u6
op<.f> 0,limm,c
op<.f> 0,limm,u6
op<.f> 0,b,limm
op<.f> 0,limm,limm
Two-operand instructions use the following syntax format:
op<.f> b,c
op<.f> b,u6
op<.f> b,limm
op<.f> 0,c
op<.f> 0,u6
op<.f> 0,limm
One-operand instructions use the following syntax format:
op<.f> c
op<.f> u6
op<.f> limm
No-operand instructions actually use the one-operand op<.f> u6 syntax, with u6 set to zero.
In addition to these syntax classes, there are also syntax class modifiers:
OP1_MUST_BE_IMM applies to the SYNTAX_3OP syntax class and specifies that the first operand of a three-operand instruction must be an immediate (i.e., the result is discarded). It is typically used for instructions that only set the flags and do not retain a result.
OP1_IMM_IMPLIED modifies the SYNTAX_2OP syntax class and specifies that there is an implied immediate destination operand which does not appear in the syntax. Such an instruction is in fact encoded as a three-operand instruction.
Examples
Example 1
.extInstruction insn1, 0x07, 0x2d, SUFFIX_NONE, SYNTAX_3OP|OP1_MUST_BE_IMM
allows the following syntax:
insn1 0,b,c
insn1 0,b,u6
insn1 0,limm,c
insn1 0,b,limm
Example 2
.extInstruction insn2, 0x07, 0x2d, SUFFIX_NONE, SYNTAX_2OP|OP1_IMM_IMPLIED
allows the following syntax:
insn2 b,c
insn2 b,u6
insn2 limm,c
insn2 b,limm
Note
The encoding of insn2 uses the SYNTAX_3OP format (i.e., Major Opcode 0x07 and Sub Opcode1 in 0x00-0x2E or 0x30-0x3F).
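The relationship between Example 1 and Example 2 can be made explicit with a short Python sketch: each form accepted under SYNTAX_2OP|OP1_IMM_IMPLIED is the corresponding SYNTAX_3OP|OP1_MUST_BE_IMM form with the leading 0 left out of the source text. The list and the helper name imply_first_operand are hypothetical and exist only for this illustration; the operand forms themselves are copied from the two examples.

```python
# Operand forms listed in Example 1 (SYNTAX_3OP|OP1_MUST_BE_IMM).
OP1_MUST_BE_IMM_FORMS = ["0,b,c", "0,b,u6", "0,limm,c", "0,b,limm"]


def imply_first_operand(form):
    """Drop the explicit immediate destination from a three-operand form.

    Hypothetical helper: it models how OP1_IMM_IMPLIED hides the first
    operand in the syntax while the encoding stays three-operand.
    """
    first, rest = form.split(",", 1)
    if first != "0":
        raise ValueError("first operand must be the immediate 0")
    return rest


# Forms written by the user under SYNTAX_2OP|OP1_IMM_IMPLIED.
OP1_IMM_IMPLIED_FORMS = [imply_first_operand(f) for f in OP1_MUST_BE_IMM_FORMS]

if __name__ == "__main__":
    for three_op, two_op in zip(OP1_MUST_BE_IMM_FORMS, OP1_IMM_IMPLIED_FORMS):
        print("insn1 %-10s <-> insn2 %s" % (three_op, two_op))
```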
Example 3
.extInstruction insn1, 7, 0x21, SUFFIX_NONE, SYNTAX_3OP
.extInstruction insn2, 7, 0x21, SUFFIX_NONE, SYNTAX_2OP
.extInstruction insn3, 7, 0x21, SUFFIX_NONE, SYNTAX_1OP
.extInstruction insn4, 7, 0x21, SUFFIX_NONE, SYNTAX_NOP
start:
insn1 r0,r1,r2
insn2 r0,r1
insn3 r1
insn4
will result in the following encodings:
Disassembly of section .text:
0x0000 <start>:
0: 3921 0080 insn1 r0,r1,r2
4: 382f 0061 insn2 r0,r1
8: 392f 407f insn3 r1
c: 396f 403f insn4
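To see where these four words come from, the following Python sketch rebuilds them from the sub-opcode 0x21 and the register numbers. It is a rough model under stated assumptions, not the assembler's implementation: the field layout (major opcode in bits 31:27, low three bits of the B field in 26:24, operand format in 23:22, Sub Opcode1 in 21:16, flag bit 15, high three bits of the B field in 14:12, C field in 11:6, A field in 5:0) is assumed rather than taken from this text, and all function names are hypothetical. The sketch places the user's sub-opcode in the A field for two-operand instructions and in the B field for one-operand and no-operand instructions, which is what reproduces the listing above.

```python
MAJOR_EXT = 0x07   # major opcode used in the examples
SUB1_ESCAPE = 0x2F  # Sub Opcode1 value that selects Sub Opcode2
SUB2_ESCAPE = 0x3F  # Sub Opcode2 value that selects Sub Opcode3
FMT_REG = 0         # assumed format value: C field holds a register
FMT_U6 = 1          # assumed format value: C field holds a u6 immediate


def _word(b_field, fmt, sub1, c_field, a_field, flag=0):
    """Pack the assumed fields into one 32-bit instruction word."""
    return (
        (MAJOR_EXT << 27)
        | ((b_field & 0x7) << 24)          # low three bits of B
        | ((fmt & 0x3) << 22)
        | ((sub1 & 0x3F) << 16)
        | ((flag & 0x1) << 15)
        | (((b_field >> 3) & 0x7) << 12)   # high three bits of B
        | ((c_field & 0x3F) << 6)
        | (a_field & 0x3F)
    )


def encode_3op(subop, a, b, c):
    # Sub-opcode goes directly into Sub Opcode1.
    return _word(b, FMT_REG, subop, c, a)


def encode_2op(subop, b, c):
    # Sub Opcode1 is the escape value; the A field carries Sub Opcode2.
    return _word(b, FMT_REG, SUB1_ESCAPE, c, subop)


def encode_1op(subop, c):
    # Both escapes are used; the B field carries Sub Opcode3.
    return _word(subop, FMT_REG, SUB1_ESCAPE, c, SUB2_ESCAPE)


def encode_nop(subop):
    # Same as one-operand, but with a u6 operand fixed at zero.
    return _word(subop, FMT_U6, SUB1_ESCAPE, 0, SUB2_ESCAPE)


def halfwords(word):
    """Format a word as two 16-bit halves, as in the listing above."""
    return "%04x %04x" % (word >> 16, word & 0xFFFF)


if __name__ == "__main__":
    print(halfwords(encode_3op(0x21, a=0, b=1, c=2)))  # insn1 r0,r1,r2
    print(halfwords(encode_2op(0x21, b=0, c=1)))       # insn2 r0,r1
    print(halfwords(encode_1op(0x21, c=1)))            # insn3 r1
    print(halfwords(encode_nop(0x21)))                 # insn4
```

The insn4 word differs from the insn3 word only in the format bits and the C field, which matches the statement that a no-operand instruction is a one-operand u6 instruction with u6 set to zero.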