[NPU:rknn] Add initial rknn support for Rockchip NPUs, supporting traditional models.#4524

Open
huangzhengxiang wants to merge 3 commits into alibaba:master from Embedded-AI-Systems:rknn

@huangzhengxiang

Contributor

Description

Current design:

  • Converter side generates two artifacts from the same ONNX model:
    • a wrapper .mnn model containing Plugin(type="RKNN")
    • a sidecar .rknn model plus bundle manifest
  • Runtime side executes Plugin("RKNN") through the MNN CPU Plugin framework.
  • Application-side session backend remains MNN_FORWARD_CPU.

1. Host build for MNNConvert --rknn

Build a host MNNConvert with plugin support and RKNN converter support enabled:

cmake -S /path/to/MNN-Agent -B /path/to/MNN-Agent/build-linux \
  -DMNN_BUILD_CONVERTER=ON \
  -DMNN_WITH_PLUGIN=ON \
  -DMNN_RKNN=ON \
  -DMNN_RKNN_CONVERT_MODE=ON \
  -DRKNN_API_INCLUDE_DIR=/path/to/rknn-toolkit2/rknpu2/runtime/Linux/librknn_api/include

cmake --build /path/to/MNN-Agent/build-linux --target MNN MNNConvert -j8

2. Generate wrapper .mnn + sidecar .rknn

Before running MNNConvert --rknn, export these environment variables:

export MNN_RKNN_TARGET=rv1126b
export MNN_RKNN_PYTHON=/path/to/python
export MNN_RKNN_SCRIPT=/path/to/to_rknn.py
export MNN_RKNN_OUTPUT_DIR=/path/to/output/sidecar

Example:

/path/to/MNN-Agent/build-linux/MNNConvert \
  -f ONNX \
  --modelFile /path/to/model.onnx \
  --MNNModel /path/to/model.mnn \
  --rknn

Expected outputs:

  • /path/to/model.mnn
  • ${MNN_RKNN_OUTPUT_DIR}/model_<target>.rknn
  • ${MNN_RKNN_OUTPUT_DIR}/model.rknn.bundle.json
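
As a rough pre-flight aid, the environment variables and output names listed above can be checked from a small script. This is only a sketch: the helper names `missing_vars` and `expected_outputs` are hypothetical, and it assumes the sidecar and manifest take their base name from the `--MNNModel` file name, which the description implies but does not state outright.

```python
from pathlib import Path

# Variables the description says must be exported before MNNConvert --rknn.
REQUIRED_VARS = (
    "MNN_RKNN_TARGET",
    "MNN_RKNN_PYTHON",
    "MNN_RKNN_SCRIPT",
    "MNN_RKNN_OUTPUT_DIR",
)


def missing_vars(env):
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def expected_outputs(mnn_model, env):
    """List the three files the conversion is expected to produce.

    Assumption: the base name comes from the wrapper .mnn file name.
    """
    stem = Path(mnn_model).stem
    out_dir = Path(env["MNN_RKNN_OUTPUT_DIR"])
    return [
        Path(mnn_model),
        out_dir / f"{stem}_{env['MNN_RKNN_TARGET']}.rknn",
        out_dir / f"{stem}.rknn.bundle.json",
    ]
```

Passing `os.environ` as `env` checks the current shell; an empty result from `missing_vars` means all four variables are set.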

The generated wrapper .mnn contains:

  • Input ops for original inputs
  • one Plugin(type="RKNN") op
  • plugin attrs including:
    • model_path
    • bundle_manifest
    • target
    • inputs
    • outputs
    • o_0, o_1, ... for output shape metadata

Important:

  • model_path and bundle_manifest are emitted as relative file names.
  • The validated deployment layout is: wrapper .mnn, sidecar .rknn, and bundle .json in the same target directory.
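
Because the wrapper refers to its sidecar files by relative name, it can help to verify the deployment directory before copying it to the board. The following is a minimal sketch under that assumption; `check_bundle_layout` is a hypothetical helper, and it only checks that the files exist by extension, not that the manifest contents match.

```python
from pathlib import Path


def check_bundle_layout(wrapper_mnn):
    """Return a list of problems; an empty list means the layout looks complete."""
    wrapper = Path(wrapper_mnn)
    directory = wrapper.parent
    problems = []
    if not wrapper.is_file():
        problems.append(f"missing wrapper model: {wrapper.name}")
    # The sidecar and manifest must sit next to the wrapper .mnn.
    if not any(directory.glob("*.rknn")):
        problems.append("no sidecar .rknn next to the wrapper")
    if not any(directory.glob("*.rknn.bundle.json")):
        problems.append("no bundle manifest next to the wrapper")
    return problems
```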

3. Cross compile runtime for Linux aarch64 / ARMv8

Example cross build using the system aarch64-linux-gnu toolchain.
This builds the target-side runtime libraries; MNNConvert itself is usually only needed on the host.

cmake -S /path/to/MNN-Agent -B /path/to/MNN-Agent/build-linux-aarch64-gnu \
  -DCMAKE_SYSTEM_NAME=Linux \
  -DCMAKE_SYSTEM_PROCESSOR=aarch64 \
  -DCMAKE_C_COMPILER=/usr/bin/aarch64-linux-gnu-gcc \
  -DCMAKE_CXX_COMPILER=/usr/bin/aarch64-linux-gnu-g++ \
  -DCMAKE_C_FLAGS='-march=armv8-a' \
  -DCMAKE_CXX_FLAGS='-march=armv8-a' \
  -DMNN_WITH_PLUGIN=ON \
  -DMNN_RKNN=ON \
  -DMNN_BUILD_CONVERTER=OFF \
  -DMNN_BUILD_DEMO=OFF \
  -DMNN_BUILD_TOOLS=ON \
  -DRKNN_API_INCLUDE_DIR=/path/to/rknn-toolkit2/rknpu2/runtime/Linux/librknn_api/include

cmake --build /path/to/MNN-Agent/build-linux-aarch64-gnu --target MNN MNN_Express -j8

Notes:

  • MNN_WITH_PLUGIN=ON is required because RKNN is implemented as a Plugin op.
  • MNN_RKNN=ON pulls in the RKNN Plugin kernels.
  • RKNN_API_INCLUDE_DIR must point to the directory containing rknn_api.h.
  • The RKNN runtime library is loaded at runtime via dlopen, not linked as a hard dependency.

4. Target runtime usage

On the target board, export the RKNN runtime library path:

export MNN_RKNN_RUNTIME_LIB=/path/to/librknnrt.so
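
A quick way to confirm that the exported path is loadable on the board is to open it with Python's `ctypes`, which uses the same dynamic-loading mechanism as dlopen. This is an illustrative check only, not how the plugin itself loads the library; `load_rknn_runtime` is a hypothetical helper name.

```python
import ctypes
import os


def load_rknn_runtime(env=os.environ):
    """Try to load the library named by MNN_RKNN_RUNTIME_LIB.

    Returns the loaded library handle, or None when the variable is
    unset or the file cannot be loaded.
    """
    path = env.get("MNN_RKNN_RUNTIME_LIB")
    if not path:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        # Missing file, wrong architecture, or unresolved dependencies.
        return None
```

A `None` result on the target usually points to a wrong path or a library built for a different architecture.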

The wrapper .mnn should be deployed together with its sidecar .rknn and bundle manifest in the same directory on target.

Important:

  • On RK boards, commands that actually execute NPU code should be run with sudo.

Runtime behavior:

  • MNN loads the wrapper .mnn
  • Plugin(type="RKNN") is created by the CPU Plugin framework
  • the plugin loads the .rknn sidecar using RKNN C API
  • application-side MNN backend is still MNN_FORWARD_CPU
  • if the RKNN model expects NHWC but the incoming MNN tensor is NCHW, the plugin converts layout automatically
  • if the incoming tensor is already NHWC, no extra layout conversion is done
  • backend-side RKNN profiling can be enabled through the public hint path:
    • Interpreter::setSessionHint(Interpreter::RKNN_PROFILE, 1) or RuntimeManager::setHint(Interpreter::RKNN_PROFILE, 1)
    • retrieve the exported profile text through getSessionInfo(..., Interpreter::BACKEND_PROFILE, &ptr) or RuntimeManager::getInfo(Interpreter::BACKEND_PROFILE, &ptr)
    • because the profile is exposed as plain text, applications can print it or write it directly to a file
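
To make the layout bullet above concrete, the NCHW to NHWC conversion amounts to an index remapping on a flat buffer. The sketch below shows that remapping in plain Python for a 4D tensor; it is an illustration of the technique, not the plugin's actual implementation, and `nchw_to_nhwc` is a hypothetical name.

```python
def nchw_to_nhwc(src, n, c, h, w):
    """Reorder a flat NCHW buffer into a flat NHWC buffer."""
    if len(src) != n * c * h * w:
        raise ValueError("buffer length does not match the given shape")
    dst = [0] * len(src)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src_index = ((ni * c + ci) * h + hi) * w + wi  # NCHW offset
                    dst_index = ((ni * h + hi) * w + wi) * c + ci  # NHWC offset
                    dst[dst_index] = src[src_index]
    return dst
```

For a 1x2x1x2 tensor stored as `[0, 1, 2, 3]`, the channels of each pixel become adjacent and the result is `[0, 2, 1, 3]`.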

5. Current limitations

  • This is a sidecar-subgraph path, not a per-op RKNN backend.
  • Current implementation uses host buffer copies; zero-copy is not implemented.
  • Current output copy path assumes float32 outputs from RKNN runtime.
  • Input layout auto-conversion currently handles the common NCHW -> NHWC case for 4D tensors only, and only when the RKNN model explicitly expects NHWC.
  • Host-side PC simulation through MNN runtime requires an x86 RKNN runtime library; usually this path is meant for target boards.

Module

Backend. Add rknn backend.

Type

  • Feature

Checklist

  • Commit message follows [Module:Type] Description format
  • Code compiles without errors
  • Tested on relevant platform(s)

@wangzhaode left a comment

Collaborator

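Review Summary

This PR adds Rockchip RKNN NPU backend support to MNN, using a Converter + Plugin architecture.

Architecture assessment

The Converter + Plugin design is sound:

  • On the host, MNNConvert --rknn generates the wrapper .mnn and the sidecar .rknn
  • On the device, the CPU Plugin framework loads librknnrt.so via dlopen and executes the model
  • The application-side session backend remains MNN_FORWARD_CPU
  • Deployment is simple; no online per-op graph construction is needed

⚠️ Core code changes need particular attention

This PR modifies MNN's core public headers and base classes:

  1. include/MNN/Interpreter.hpp: adds RKNN_PROFILE (HintMode=19) and BACKEND_PROFILE (SessionInfoCode=5 / getInfo code=7). Please confirm that these enum values do not conflict with existing ones; RKNN_PROFILE=19 skips quite a few numbers.

  2. source/core/Backend.hpp: adds the virtual function onGetRuntimeInfo(). This changes the vtable layout of Backend and is therefore an ABI change. MNN does not guarantee ABI compatibility across versions, but we should confirm that the impact is acceptable.

  3. source/core/Pipeline.hpp: adds the mModelExternalDir member; the impact is minor.

  4. source/core/Session.cpp / express/Executor.cpp: adds the BACKEND_PROFILE query logic, which iterates over all runtimes and tries to fetch the profile text. Please confirm that there are no side effects when other backends return false.

RKNN backend implementation

  • RKNNBackend.cpp loads librknnrt.so dynamically via dlopen, and the symbol binding is clear
  • Configuration is driven by environment variables (MNN_RKNN_RUNTIME_LIB, MNN_RKNN_TARGET, etc.), which avoids hard-coding
  • The converter calls an external Python script through system() to produce the RKNN model: in CI and automation scenarios, PATH and the Python environment need to be set up carefully
  • CMakeLists.txt enforces the dependency on MNN_WITH_PLUGIN=ON

Documentation

  • docs/inference/npu.md and source/backend/rknn/README.md are thorough ✅

Suggestions

  1. Submit the core code changes (the Backend.hpp virtual function and the Interpreter.hpp enums) as a separate PR so they can be reviewed and tested more carefully
  2. Confirm that the enum values assigned to HintMode and SessionInfoCode do not conflict
  3. Verify on an actual RK3588 board before merging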
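
One lightweight way to check the enum concern is to scan an enum body for repeated explicit values. The sketch below is a hypothetical helper, not part of MNN; it only understands entries of the form `NAME = <decimal>`, and apart from RKNN_PROFILE = 19 the entry names in the usage example are placeholders rather than MNN's real enumerators.

```python
import re
from collections import defaultdict

# Matches entries such as "RKNN_PROFILE = 19," with an explicit decimal value.
ENUM_ENTRY = re.compile(r"^\s*([A-Za-z_]\w*)\s*=\s*(\d+)\s*,?", re.MULTILINE)


def duplicate_enum_values(enum_body):
    """Map each value that is assigned more than once to the names using it."""
    by_value = defaultdict(list)
    for name, value in ENUM_ENTRY.findall(enum_body):
        by_value[int(value)].append(name)
    return {value: names for value, names in by_value.items() if len(names) > 1}


# Usage with placeholder entries (only RKNN_PROFILE = 19 comes from the PR).
example_body = """
    EXAMPLE_HINT_A = 0,
    EXAMPLE_HINT_B = 1,
    RKNN_PROFILE = 19,
"""
print(duplicate_enum_values(example_body))
```

Entries that rely on implicit numbering are not covered, so a clean result from this scan is a hint rather than proof that no conflict exists.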

@wangzhaode self-assigned this Jun 11, 2026