Permute Prototype and Function List¶
Description¶
The kernel permutes dimensions of input tensor according to provided order. In other words, it transposes input tensors.
Functions¶
The functions which implement Permute have the following prototype:
mli_status mli_krn_permute_<data_format>(
const mli_tensor *in,
const mli_permute_cfg *cfg,
mli_tensor *out);
where data_format
is one of the data formats listed in Table MLI Data Formats and the function parameters
are shown in the following table:
Parameter |
Type |
Description |
---|---|---|
|
|
[IN] Pointer to constant input tensor |
|
|
[IN] Pointer to Permute parameters structure |
|
|
[OUT] Pointer to output tensor. Result is stored here |
mli_permute_cfg
structure is defined as:
typedef struct {
uint8_t perm_dim[MLI_MAX_RANK];
} mli_permute_cfg;
Field name |
Type |
Description |
---|---|---|
|
|
A permutation array. Dimensions order for output tensor. |
The new order of dimensions is given by perm_dim
array of kernel configuration structure. The
out
tensor’s dimension idx
corresponds to the dimension of the in
tensor with perm_dim[idx]
.
The tensor’s data is reordered according to the new shape.
For example, if input tensors have the shape (2, 4, 8) and perm_dim
order is (2, 0, 1) then output
tensor is of the shape (8, 2, 4). This transpose reflects changing the feature map layout from HWC to CHW.
Here is a list of all available permute functions:
Function Name |
Details |
---|---|
|
All tensors data format: sa8 |
|
All tensors data format: fx8 |
|
All tensors data format: fx16 |
Conditions¶
Ensure that you satisfy the following general conditions before calling the function:
in
andout
tensors must be valid (see mli_tensor Structure Field Descriptions) and satisfy data requirements of the selected version of the kernel.Shape of
out
tensor must reflect already permuted shape ofin
tensor.
in
andout
tensors must not point to overlapped memory regions.Only first N (equal to
rank
ofin
tensor) values in permutation order array are considered by kernel. All of them must be unique, non-negative and less than therank
of thein
tensor.
Ensure that you satisfy the platform-specific conditions in addition to those listed above (see the Platform Specific Details chapter).
Result¶
These functions modify:
Memory pointed by
out.data.mem
field.
el_params
field ofout
tensor.
It is assumed that all the other fields and structures are properly populated to be used in calculations and are not modified by the kernel.
For fx8 and fx16 el_params
field of in
tensor is copyed to the out
tensor. The same is applied for
sa8 versions of kernel in case of per-tensor quantization (in.el_params.sa.dim
< 0)
For sa8 versions of kernel, and in case of per-axis quantization, the el_params
field of the
out
tensor is filled by the kernel using the quantization parameters of the in
tensor.
The following fields are affected:
out.el_params.sa.zero_point.mem.pi16
and the related capacity field
out.el_params.sa.scale.mem.pi16
and the related capacity field
out.el_params.sa.scale_frac_bits.mem.pi8
and the related capacity field
Depending on the state of the preceding pointer fields, ensure that you choose only one of the following options to initialize all the fields in a consistent way:
If you initialize the pointers with a
nullptr
, then the corresponding fields from thein
tensor are copied to theout
tensor. No copy of quantization parameters itself is performed.If you initialize the pointers with the corresponding fields from the
in
tensor, then no action is applied.If you initialize the pointers and capacity fields with pre-allocated memory and its capacity, then a copy of quantization parameters itself is performed. Capacity of allocated memory must be big enough to keep related data from input tensor.
Depending on the debug level (see section Error Codes) this function performs a parameter
check and returns the result as an mli_status
code as described in section Kernel Specific Configuration Structures.