RNN Dense Prototype and Function List ------------------------------------- Description ^^^^^^^^^^^ This kernel implements a single basic fully connected (or dense) calculation typically used in the majority of RNN architectures: .. math:: y_{i} = b_{i} + \sum_{j}^{}{xa}_{j}*{Wa}_{i,j} + \ \sum_{j}^{}{xb}_{j}*{Wb}_{i,j} + \ldots\ \sum_{j}^{}{xn}_{j}*{Wn}_{i,j} .. Where: :math:`{xa}_{j}`, :math:`{xb}_{j}`, :math:`{xn}_{j}` *-* :math:`j_{\text{th}}` *value in one of the input tensors. These input tensors might be current input, previous output, cell state or any other tensor depending on RNN Cell architecture* :math:`{Wa}_{i,j}`, :math:`{Wb}_{i,j}`, :math:`{Wc}_{i,j}` *- weight of* :math:`j_{th}\ `\ *input element for* :math:`i_{th}` *neuron in one of input weights tensors. These weights tensors might be input-to-a-gate weights, output-to-a-gate weights or any other tensor depending on RNN Cell architecture* :math:`y_{i}` *- output of* :math:`i_{th}` neuron ( :math:`i_{th}` *value in output tensor).* :math:`b_{i}` *- bias for* :math:`i_{th}` *neuron* This is a MAC-based kernel which implies accumulation. See :ref:`quant_accum_infl` for more information on related quantization aspects. The number of accumulation series is equal to total number of values in all inputs. Functions ^^^^^^^^^ Kernels which implement an RNN Dense functionality have the following prototype: .. code:: c mli_status mli_krn_rnn_dense_( const mli_tensor **inputs, const mli_tensor **weights, const mli_tensor *bias, const mli_rnn_dense_cfg *cfg, mli_tensor *out); .. where ``data_format`` is one of the data formats listed in Table :ref:`mli_data_fmts` and the function parameters are shown in the following table: .. table:: RNN Dense Function Parameters :align: center :widths: auto +------------------+---------------------------+-------------------------------------------------------------------+ | **Parameter** | **Type** | **Description** | +==================+===========================+===================================================================+ | ``inputs`` | ``mli_tensor **`` | [IN] Pointer to the array of pointers to constant input tensors | +------------------+---------------------------+-------------------------------------------------------------------+ | ``weights`` | ``mli_tensor **`` | [IN] Pointer to the array of pointers to constant weights tensors | +------------------+---------------------------+-------------------------------------------------------------------+ | ``bias`` | ``mli_tensor *`` | [IN] Pointer to constant bias tensor | +------------------+---------------------------+-------------------------------------------------------------------+ | ``cfg`` | ``mli_rnn_dense_cfg *`` | [IN] Pointer to RNN dense parameters structure | +------------------+---------------------------+-------------------------------------------------------------------+ | ``out`` | ``mli_tensor *`` | [IN | OUT] Pointer to output tensor. Result is stored here. | +------------------+---------------------------+-------------------------------------------------------------------+ .. :code:`mli_rnn_dense_cfg` is defined as: .. code:: c typedef struct { uint8_t inputs_num; } mli_rnn_dense_cfg; .. .. tabularcolumns:: |\Y{0.2}|\Y{0.2}|\Y{0.4}| .. _t_mli_rnn_dense_cfg_desc: .. table:: mli_rnn_dense_cfg Structure Field Description :align: center +-----------------+--------------+------------------------------------------------------------+ | **Field Name** | **Type** | **Description** | +=================+==============+============================================================+ | | | Number of input tensors (number of pointers in inputs | | | | array). Also, the number of weights tensors (number of | | ``inputs_num`` | ``uint8_t`` | pointers in weights array), as each input is specified | | | | with its own weights tensor. Maximum number of tensors | | | | in the array is specified by MLI_RNN_MAX_INPUTS define. | +-----------------+--------------+------------------------------------------------------------+ .. Here is a list of all available RNN Dense functions: .. tabularcolumns:: |\Y{0.4}|\Y{0.4}| .. table:: List of Available RNN Dense Functions :align: center :widths: auto +------------------------------------+-------------------------------------+ | **Function Name** | **Details** | +====================================+=====================================+ | ``mli_krn_rnn_dense_sa8_sa8_sa32`` | In/out/weights data format: **sa8** | | | | | | Bias data format: **sa32** | +------------------------------------+-------------------------------------+ | ``mli_krn_rnn_dense_fx16`` | All tensors data format: **fx16** | +------------------------------------+-------------------------------------+ | ``mli_krn_rnn_dense_fx16_fx8_fx8`` | In/out data format: **fx16** | | | | | | Weights/Bias data format: **fx8** | +------------------------------------+-------------------------------------+ .. Conditions ^^^^^^^^^^ Ensure that you satisfy the following general conditions before calling the listed functions: - ``bias``, ``out``, all tensors in ``inputs`` array and all tensors in ``weights`` array must be valid (see :ref:`mli_tnsr_struc`). - The number of tensors in ``inputs`` and ``weights`` arrays must be the same and must not exceed ``MLI_RNN_MAX_INPUTS`` value. - Shapes of ``bias``, ``out``, all tensors in ``inputs`` array and all tensors in ``weights`` array must be compatible, which implies the following requirements: - Each tensor in ``inputs`` array might be of any shape and rank. Only total number of elements is considered. - The :math:`i_{th}` tensor in ``weights`` array corresponds to the :math:`i_{th}` tensor in ``inputs`` array, which means that ``weights[i]`` must be a two-dimensional tensor (rank==2) of shape :math:`(N_i, M)`, where :math:`N_i` is the total number of elements in the ``inputs[i]`` tensor and :math:`M` is the total number of neurons and is equal to output length. - ``bias`` must be a one-dimensional tensor (rank==1). Its length must be equal to :math:`M` (number of filters and is equal to output length) of any weights tensor. - ``out`` must be a one-dimensional tensor (rank==1). Its length must be equal to :math:`M` (number of filters and is equal to output length) of any weights tensor. - Any tensor from ``inputs`` array and ``out`` tensor must not point to overlapped memory regions. - ``mem_stride`` must satisfy the following statements: - For ``out`` tensor and all tensors in ``inputs`` array memstride must reflect the shape, e.g memory of these tensors must be contiguous. - For all tensors in ``weights`` and ``bias`` arrays - memstride of the innermost dimension must be equal to 1. For **fx16** and **fx16_fx8_fx8** versions of kernel, in addition to the general conditions, ensure that you satisfy the following quantization conditions before calling the function: - The number of ``frac_bits`` in the ``bias`` tensor must not exceed the sum of ``frac_bits`` in the ``inputs[0]`` and ``weights[0]`` tensors. - The number of ``frac_bits`` in the ``out`` tensor must not exceed the sum of ``frac_bits`` in the any pair of related tensors in ``inputs`` and ``weights`` arrays. For **sa8_sa8_sa32** versions of kernel, in addition to the general conditions, ensure that you satisfy the following quantization conditions before calling the function: - ``bias``, ``out``, all the tensors in ``inputs`` array, and all tensors in ``weights`` array must be quantized on the tensor level. This implies that each tensor contains a single scale factor and a single zero offset. - Zero offset of each tensor in inputs and out tensor must be within [-128, 127] range. - ``bias`` and all tensors in weights array must be symmetric. This implies that both tensors contain single zero offset equal to 0. - The scale factor of ``bias`` tensor must be equal to the multiplication of the scale factor of the **first** input and the **first** weights tensors in corresponding arrays (that is, :math:`bias.scale = inputs[0].scale * weights[0].scale`). See the example for the similar condition in the :ref:`conv_2d`. Ensure that you satisfy the platform-specific conditions in addition to those listed above (see the :ref:`platform_spec_chptr` chapter). Result ^^^^^^ These functions only modify the memory pointed by ``out.data.mem`` field. It is assumed that all the other fields of ``out`` tensor are properly populated to be used in calculations and are not modified by the kernel. Depending on the debug level (see section :ref:`err_codes`), this function performs a parameter check and returns the result as an ``mli_status`` code as described in section :ref:`kernl_sp_conf`.