feat: add ROCm/HIP backend support for AMD GPUs #7820

Closed
dev-miro26 wants to merge 13 commits into janhq:main from dev-miro26:feat/rocm-hip-backend-support

Conversation

@dev-miro26 dev-miro26 commented Mar 25, 2026

Contributor

Describe Your Changes

Add full ROCm/HIP backend support for AMD GPUs in Jan's llama.cpp engine. This enables users with AMD GPUs and ROCm installed to use the HIP backend, which offers significantly better inference performance compared to Vulkan.

What this PR does:

  1. Runtime detection: the new is_hip_runtime_available() function checks for ROCm libraries (libamdhip64.so on Linux, amdhip64.dll on Windows) in standard paths and in the locations named by the HIP_PATH and ROCM_PATH environment variables.
  2. Library path injection: add_hip_paths() injects ROCm library directories into LD_LIBRARY_PATH (Linux) or PATH (Windows) before launching llama-server, mirroring the existing CUDA path injection pattern.
  3. Binary dependency check: binary_requires_hip() uses ldd (Linux) or PE scanning (Windows) to detect whether a llama-server binary is linked against HIP/ROCm libraries, so the user gets a clear warning when ROCm is missing.
  4. Accurate HIP feature detection: previously, get_supported_features() set features.hip = true based solely on the GPU vendor being AMD. Now it requires both an AMD GPU and a working ROCm runtime, preventing false positives on systems with AMD iGPUs but no ROCm installed.
  5. Upstream naming compatibility: map_old_backend_to_new() and get_backend_category() now map upstream ggml-org/llama.cpp naming (ubuntu-rocm-7.2-x64) to Jan's convention (linux-hip-x64), so manually installed upstream ROCm builds are discovered and prioritized correctly.
  6. Backend prioritization: HIP is ranked above Vulkan (but below CUDA) when sufficient GPU memory is available, with automatic fallback.
  7. OOM error detection: added hiperroroutofmemory to the stderr parser so HIP out-of-memory errors produce the same user-friendly message as CUDA/Vulkan/Metal.
  8. Documentation: added ROCm/HIP sections to the hardware backends guide and the Linux install page, and fixed a stale link in the Windows install page (ggerganov → ggml-org).
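
As a rough illustration of item 1, the Linux side of the runtime check could look something like the sketch below. This is not the PR's actual code: hip_candidate_dirs and hip_runtime_in are hypothetical helper names, /opt/rocm/lib is an assumed default location, and the versioned-symlink handling the PR describes is left out. The environment values are passed in as arguments so the logic can be exercised without touching the real environment.

```rust
use std::path::{Path, PathBuf};

/// Builds the list of directories to probe for the HIP runtime on Linux.
/// `hip_path` and `rocm_path` stand in for the HIP_PATH and ROCM_PATH
/// environment variables (hypothetical helper, not the PR's function).
fn hip_candidate_dirs(hip_path: Option<&str>, rocm_path: Option<&str>) -> Vec<PathBuf> {
    let mut dirs = Vec::new();
    for root in [hip_path, rocm_path].into_iter().flatten() {
        if !root.is_empty() {
            dirs.push(Path::new(root).join("lib"));
        }
    }
    // Assumed default install location; real installs may differ.
    dirs.push(PathBuf::from("/opt/rocm/lib"));
    dirs
}

/// Returns true if any candidate directory contains the unversioned
/// HIP runtime library file.
fn hip_runtime_in(dirs: &[PathBuf]) -> bool {
    dirs.iter().any(|d| d.join("libamdhip64.so").is_file())
}
```

A caller would read HIP_PATH and ROCM_PATH with std::env::var, pass them in, and treat a true result as "runtime present".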

Changes across layers

| Layer | Files | What changed |
| --- | --- | --- |
| Rust utils | system.rs | +278 lines: is_hip_runtime_available(), add_hip_paths(), binary_requires_hip() |
| Rust plugin | backend.rs | HIP detection fix, upstream naming mapping, category/priority support, 9 new tests |
| Rust plugin | commands.rs, device.rs | HIP path injection + warning on model load and device detection |
| Rust plugin | error.rs | HIP OOM error string |
| TS extension | index.ts | Comment: upstream ROCm archive example |
| Docs | llama-cpp.mdx, linux.mdx, windows.mdx | ROCm/HIP setup instructions |

Fixes Issues

Self Checklist

  • Added relevant comments, esp in complex areas

  • Updated docs (for bug fixes / features)

  • Created issues for follow-up changes or refactoring needed

  • Follow-up: janhq/llama.cpp CI pipeline to produce HIP release assets

@dev-miro26

Contributor Author

@louis-jan @Vanalite
Could you please check this PR?
I appreciate you.

Vanalite
Vanalite previously approved these changes Mar 26, 2026

@Vanalite Vanalite left a comment

Contributor

LGTM

@louis-jan

Contributor

I see 2 major issues with this:

  1. The HIP backend has not been added yet, so how could it be selected? (https://github.com/janhq/llama.cpp/releases/tag/b8149)
  2. There is no HIP runtime check yet. The code only checks the GPU, but what if the backend does not work with the runtime?

@dev-miro26

Contributor Author

> I see 2 major issues with this:
>
>   1. The HIP backend has not been added yet, so how could it be selected? (https://github.com/janhq/llama.cpp/releases/tag/b8149)
>   2. There is no HIP runtime check yet. The code only checks the GPU, but what if the backend does not work with the runtime?

You are correct. 👍

  1. The code in backend.rs already has HIP fully wired in: determine_supported_backends() adds "linux-hip-x64" at lines 237–239 and "win-hip-x64" at lines 216–218 whenever features.hip == true. What is missing is the HIP binary itself.
  2. I will add a runtime check.

I will update soon.

@dev-miro26

Contributor Author

@louis-jan @Vanalite
The HIP backend binaries don't exist in Jan's llama.cpp fork releases, so there's nothing for users to download.

The intended flow: if the user has an AMD GPU and the ROCm runtime is installed, features.hip = true. The code then builds the download URL and picks .tar.gz or .zip (Windows HIP uses .zip). When launching llama-server, the code adds the ROCm library paths so the HIP binary can find libamdhip64.

This feature is almost the same as the Vulkan feature.
I do not have an AMD GPU available right now, but I am confident this feature works as expected.
Please share your opinion. I need your help to get this implemented successfully.
I appreciate it.

@dev-miro26 dev-miro26 force-pushed the feat/rocm-hip-backend-support branch from e9bb4ac to 588839b Compare March 26, 2026 07:13
@louis-jan

louis-jan commented Mar 26, 2026

Contributor

@dev-miro26 We appreciate contributors who dive deep into the code base and test how the app works before making changes, since a change could break the app.
As mentioned in CONTRIBUTION.md, each PR that relates to a UX enhancement should include a video recording, so that we can filter for PRs that really work before merging.

Regarding this feature, you can find the source of the displayed backend list in the backend repository (linked above).

If users cannot download the backend, the feature will not work: even if detection succeeds, the app still cannot select the backend automatically (as the issue describes). And if users do download the backend, they still have to select it manually, so does the change make sense?

So it would be great to:

  1. Open a PR against https://github.com/janhq/llama.cpp to add the HIP backend first
  2. Come back to this PR (merge it after the PR above) and test with the new llama.cpp release

@dev-miro26

Contributor Author

@louis-jan
Thanks for your advice.
I will update the PR further, test it on AMD hardware, and record a video.
Please wait.

@louis-jan

Contributor

Thank you so much!

@louis-jan

Contributor

@Vanalite can help test on an AMD device if there is a hardware constraint.

@dev-miro26

dev-miro26 commented Mar 26, 2026

Contributor Author

@louis-jan
Wow, thank you.
Once I update the code, I will ask @Vanalite to test it.
Thank you, thank you.

@dev-miro26

Contributor Author

@louis-jan @Vanalite
Could you please test this PR?
janhq/llama.cpp#467

// Normalize upstream naming (e.g. "ubuntu-rocm-7.2-x64") to Jan
// conventions (e.g. "linux-hip-x64") so the folder name matches what
// the backend discovery and dropdown logic expects.
const backendIdentifier = await mapOldBackendToNew(rawBackend)

Contributor

I think it would be better to use a different function. mapOldBackendToNew is used for migrating the avx* backends to the CPU-agnostic common_cpus.
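
To make the suggestion concrete, a dedicated normalization function might be sketched roughly as follows. The name normalize_upstream_backend matches what a later commit in this PR introduces, but the body here is only an assumption about its behavior: it treats upstream names as "ubuntu-rocm-<version>-<arch>" and passes everything else through unchanged.

```rust
/// Maps an upstream-style ROCm backend name to Jan's naming convention.
/// Names that do not match the assumed upstream pattern pass through.
fn normalize_upstream_backend(name: &str) -> String {
    // Assumed upstream shape: "ubuntu-rocm-<version>-<arch>".
    if let Some(rest) = name.strip_prefix("ubuntu-rocm-") {
        // The architecture is taken to be the last dash-separated segment.
        if let Some((_version, arch)) = rest.rsplit_once('-') {
            if !arch.is_empty() {
                return format!("linux-hip-{}", arch);
            }
        }
    }
    name.to_string()
}
```

Keeping this separate from mapOldBackendToNew leaves the avx* migration logic untouched and makes the pass-through cases easy to test on their own.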

@tokamak-pm

tokamak-pm Bot commented Mar 31, 2026

Code Review

Summary

Large PR adding end-to-end ROCm/HIP backend support for AMD GPUs. New detection functions (is_hip_runtime_available), backend prioritization, upstream naming normalization, HIP library path injection, and corresponding TypeScript types. Includes unit tests.

Key Findings

  • vendor field population gap: GpuInfo.vendor is read on the backend but the PR does not show a frontend change to populate it. Without this, HIP detection may never trigger even on AMD hardware.
  • Windows archive format: archiveExt is hardcoded to 'tar.gz' for all platforms. If upstream HIP releases ship as .zip on Windows, downloads will 404.
  • HIP runtime detection is thorough (scans known paths, versioned symlinks on Linux; HIP_PATH and common paths on Windows).
  • Backend prioritization correctly places HIP above Vulkan but below CUDA.
  • Test coverage is good, with appropriate CI-environment guards.

Recommendation: fix needed

Verify/implement the vendor field population from frontend to backend, and resolve the Windows .zip vs .tar.gz archive format question.

@qnixsynapse added the labels needs: rework, need: infra support, and needs: comms (Major issue - we should inform users) on Apr 8, 2026
@dev-miro26 dev-miro26 requested a review from qnixsynapse April 9, 2026 17:48
@tokamak-pm

tokamak-pm Bot commented Apr 10, 2026

Code Review — follow-up (3 new commits since last review)

What changed

  • faf73fc0 — aligns HIP backend types, prioritization, and test coverage
  • d8f346c7 — extracts upstream ROCm naming normalization from mapOldBackendToNew
  • 0fd123b3 — merge from main

Previous blockers — resolved

1. vendor field population gap — fixed. GpuInfo in both Rust and TypeScript now carries vendor?: string. get_supported_features() reads it to gate HIP detection (if vendor == "AMD" { features.hip = is_hip_runtime_available() }). TS-side normalizeFeatures() propagates hip correctly.

2. Windows archive format — addressed. archiveExt is now a named constant with a clear comment. The decompress loop also handles .zip as insurance.

New work in this round

  • SystemFeatures and SupportedFeatures both gain hip: bool in Rust and TypeScript
  • HIP inserted into priority list between CUDA cu11.7 and Vulkan
  • hiperroroutofmemory added to OOM stderr pattern list
  • add_hip_paths() and binary_requires_hip() called symmetrically with CUDA in process launch
  • normalize_upstream_backend() converts upstream names like ubuntu-rocm-7.2-x64 → linux-hip-x64
  • 8 new Rust tests covering AMD HIP detection, backend prioritization, and normalization
  • TypeScript types updated: BackendFeatures.hip, GpuInfo.vendor

Remaining concerns

1. No HIP binary in janhq/llama.cpp releases yet (blocking, not a code issue)
The code is architecturally complete, but shipping it now adds UI and detection logic for a backend users cannot download. janhq/llama.cpp#467 needs to be merged and a tagged release with linux-hip-x64/win-hip-x64 assets published first.

2. AMD hardware testing still outstanding
@louis-jan requested on-device testing by @Vanalite on AMD hardware. This is still pending.

3. Minor: add_hip_paths_linux returns true on most Linux systems
/usr/lib exists on virtually all Debian/Ubuntu systems, so the function returns true even without ROCm. Not harmful (the binary_requires_hip() guard prevents spurious warnings) but the return value is misleading.

4. Minor: test_get_supported_features_amd_hip does not assert hip value
The test does let _ = result.hip — a no-op. Acceptable for CI without ROCm, but means the positive path has no automated coverage.
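
One possible way to give the positive path in concern 4 automated coverage is to pass the runtime probe in as a parameter, so a test can substitute a stub for the real filesystem check. The sketch below assumes a cut-down GpuInfo with only the vendor field, and hip_supported is a hypothetical name, not a function from the PR.

```rust
/// Minimal stand-in for the GPU record; the real GpuInfo has more fields.
struct GpuInfo {
    vendor: Option<String>,
}

/// HIP counts as supported only when an AMD GPU is present and the
/// runtime probe succeeds. The probe is injected so tests can replace
/// the real ROCm library check with a fixed answer.
fn hip_supported(gpus: &[GpuInfo], runtime_available: impl Fn() -> bool) -> bool {
    let has_amd = gpus.iter().any(|g| g.vendor.as_deref() == Some("AMD"));
    has_amd && runtime_available()
}
```

In production the closure would wrap is_hip_runtime_available(); in CI a test can pass `|| true` and assert that the result is true for an AMD GPU.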

Recommendation: fix needed (external dependency)

The code quality is now solid. The block is the missing HIP binary release in janhq/llama.cpp. Once that ships and AMD hardware testing is done, this is ready to merge.

@tokamak-pm

tokamak-pm Bot commented Apr 23, 2026

Code Review: feat: add ROCm/HIP backend support for AMD GPUs

Summary

This PR adds comprehensive ROCm/HIP backend support for AMD GPUs, including runtime detection, library path injection, binary dependency checking, backend naming normalization, prioritization (HIP > Vulkan, CUDA > HIP), OOM error detection, and documentation updates.

Strengths

  • Well-structured approach mirroring the existing CUDA pattern for consistency
  • Proper runtime detection (is_hip_runtime_available) avoids false positives on systems with AMD iGPUs but no ROCm
  • Good test coverage (9+ new tests covering normalization, feature detection, backend prioritization, category mapping)
  • Documentation is thorough with clear ROCm/HIP vs Vulkan guidance
  • Upstream ggml-org naming compatibility (ubuntu-rocm-7.2-x64 -> linux-hip-x64) is a smart forward-looking addition

Issues and Concerns

Minor issues:

  1. binary_requires_hip_linux fallback reads entire binary into memory -- The fallback in binary_requires_hip_linux() calls std::fs::read(bin_path) to byte-search for library names in the ELF string table. For large llama-server binaries (100+ MB), this loads the entire file into memory. Consider reading only the first N KB or using mmap instead.

  2. add_hip_paths_linux adds directories that merely exist, not that contain HIP libs -- The function adds /usr/lib, /usr/lib/x86_64-linux-gnu, etc. to LD_LIBRARY_PATH simply because they exist, not because they contain HIP libraries. This is harmless but noisy; the CUDA counterpart is more targeted.

  3. Versioned ROCm directory walk under /opt -- The pattern name_str.starts_with("rocm") will match directories like rocm-6.0 but also any unrelated directory starting with "rocm". Minor edge case.

  4. normalizeUpstreamBackend is an async IPC call for a pure string transformation -- This function has no I/O and could be done client-side in TypeScript. The Tauri command overhead per call is small but unnecessary.

  5. Test test_get_supported_features_amd_hip -- The assertion let _ = result.hip; does not actually test the value. The comment explains this is intentional (no ROCm in CI), but it means HIP detection is effectively untested in CI. Consider a mock or cfg-based test.

  6. archiveExt variable in backend.ts -- Setting archiveExt = 'tar.gz' as a constant and then string-interpolating it everywhere adds complexity without value since it is never changed. The comment says "If upstream HIP releases switch to .zip, this will need updating" but the decompression code already handles both formats.
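
For issue 1, an alternative to both std::fs::read and mmap is a chunked scan that keeps a small overlap between reads. The sketch below is one way that could look, using only the standard library. file_contains is a hypothetical helper and the 64 KiB chunk size is an arbitrary choice. Like the fallback it would replace, it only does a raw byte search and does not parse the ELF structure.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Returns true if `needle` occurs anywhere in the file, reading it in
/// fixed-size chunks instead of loading the whole file into memory.
fn file_contains(path: &Path, needle: &[u8]) -> io::Result<bool> {
    if needle.is_empty() {
        return Ok(true);
    }
    const CHUNK: usize = 64 * 1024;
    let mut file = File::open(path)?;
    let mut chunk = vec![0u8; CHUNK];
    // Holds the unmatched tail of earlier data plus the newest chunk.
    let mut window: Vec<u8> = Vec::with_capacity(CHUNK + needle.len());
    loop {
        let n = match file.read(&mut chunk) {
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if n == 0 {
            return Ok(false); // end of file without a match
        }
        window.extend_from_slice(&chunk[..n]);
        if window.windows(needle.len()).any(|w| w == needle) {
            return Ok(true);
        }
        // Keep only the last needle.len() - 1 bytes so a match that
        // straddles two reads is still found on the next iteration.
        let keep = (needle.len() - 1).min(window.len());
        let start = window.len() - keep;
        window.drain(..start);
    }
}
```

Peak memory stays near one chunk plus the needle length regardless of binary size, at the cost of a few more read calls.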

Architecture

The layered approach (utils -> plugin -> extension -> docs) is clean. The new normalize_upstream_backend Tauri command properly separates naming concerns from the extension layer.

Recommendation

can merge -- This is a solid, well-tested feature addition. The issues above are all minor and can be addressed in follow-ups.

@tokamak-pm

tokamak-pm Bot commented Apr 24, 2026

PR Review: feat: add ROCm/HIP backend support for AMD GPUs

Summary

Adds full ROCm/HIP backend support for AMD GPUs across the stack: runtime detection of ROCm libraries, HIP backend prioritization (above Vulkan, below CUDA), upstream naming normalization (e.g., ubuntu-rocm-7.2-x64 to linux-hip-x64), library path injection for HIP binaries, OOM error detection for HIP, and documentation updates for both Linux and Windows.

Key Findings

Positives:

  • Comprehensive implementation touching Rust backend, TypeScript guest bindings, and documentation.
  • Excellent test coverage: 10+ new tests covering normalization, feature detection, backend prioritization (HIP over Vulkan, CUDA over HIP), and category resolution.
  • Clean separation between runtime detection (is_hip_runtime_available) and path injection (add_hip_paths), following the existing CUDA pattern.
  • binary_requires_hip uses both ldd and fallback byte-scanning, matching the existing binary_requires_cuda approach.
  • Proper vendor field addition to GpuInfo enables vendor-specific feature detection without breaking existing code.
  • Documentation is thorough with both Linux and Windows instructions.

Concerns:

  1. read_dir scanning in hot paths: is_hip_runtime_available_linux calls read_dir on /opt/rocm/lib, /usr/lib, and /usr/lib/x86_64-linux-gnu looking for versioned libamdhip64.so.* files. On systems with large /usr/lib directories, this scan could be slow. On a second look, the current code checks the static paths first and the early return on line 849 skips the read_dir scan when one of them matches, so this is correct.
  2. add_hip_paths_linux adds broad directories: Directories like /usr/lib and /usr/local/lib are added to LD_LIBRARY_PATH. While this works, it could unintentionally affect library resolution for other shared objects loaded by the process. Consider only adding ROCm-specific directories (/opt/rocm/lib, HIP_PATH/lib).
  3. binary_requires_hip_linux reads entire binary into memory: The fallback in binary_requires_hip_linux does std::fs::read(bin_path) which loads the entire binary (potentially hundreds of MB for a GPU-enabled llama-server). Consider using memory-mapped I/O or searching only the first N bytes / string table section.
  4. Windows add_hip_paths_windows prepends to PATH: The ROCm bin paths are prepended to the system PATH. If the ROCm installation has DLLs that shadow system DLLs, this could cause unexpected behavior. Consider appending instead of prepending, or being more selective.
  5. normalizeUpstreamBackend as Tauri command: This is a pure string transformation that could be a local TypeScript function instead of an IPC round-trip. Making it a Tauri command adds latency for no benefit. The Rust implementation is fine for the backend tests, but the TS side could call a local function.
  6. Documentation link inconsistency: Some docs reference /docs/desktop/local-engines/llama-cpp (old) and others /docs/desktop/local-engine/llama-cpp (new). The PR fixes some but verify all links are consistent.
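
Concern 2 could be addressed by filtering the candidate directories down to those that actually contain a HIP library before touching LD_LIBRARY_PATH. The following is a minimal sketch under that assumption. dir_has_hip_lib, hip_library_dirs, and prepend_paths are hypothetical helpers, and the ":" separator is Linux-specific.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// True if `dir` holds a file named libamdhip64.so or a versioned
/// variant such as libamdhip64.so.6. Unreadable directories count as "no".
fn dir_has_hip_lib(dir: &Path) -> bool {
    match fs::read_dir(dir) {
        Ok(entries) => entries
            .flatten()
            .any(|e| e.file_name().to_string_lossy().starts_with("libamdhip64.so")),
        Err(_) => false,
    }
}

/// Keeps only the candidate directories that contain a HIP library.
fn hip_library_dirs(candidates: &[PathBuf]) -> Vec<PathBuf> {
    candidates
        .iter()
        .filter(|d| dir_has_hip_lib(d.as_path()))
        .cloned()
        .collect()
}

/// Prepends `dirs` to an existing LD_LIBRARY_PATH-style value.
fn prepend_paths(dirs: &[PathBuf], existing: &str) -> String {
    let mut parts: Vec<String> = dirs.iter().map(|d| d.display().to_string()).collect();
    if !existing.is_empty() {
        parts.push(existing.to_string());
    }
    parts.join(":")
}
```

With this shape, an empty result from hip_library_dirs is also a meaningful "nothing injected" signal, which would make the return value of the path-injection function less misleading.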

Recommendation

improve needed -- Strong implementation with good test coverage. Address the broad LD_LIBRARY_PATH additions (point 2), the full binary read in the fallback path (point 3), and consider making the upstream backend normalization a local TS function (point 5). The rest are minor polish items.

@tokamak-pm

tokamak-pm Bot commented Apr 26, 2026

Code Review

Summary: Large PR adding end-to-end ROCm/HIP backend support for AMD GPUs, including runtime detection, library path injection, backend naming normalization, prioritization logic, documentation updates, and extensive tests.

Findings:

  • Runtime detection (is_hip_runtime_available) is thorough: checks standard paths, versioned symlinks, and environment variables on both Linux and Windows.
  • Naming normalization (normalize_upstream_backend) correctly maps upstream ggml-org naming (e.g., ubuntu-rocm-7.2-x64) to Jan conventions (linux-hip-x64). Good approach for forward compatibility.
  • Backend prioritization correctly places HIP between CUDA and Vulkan in the priority list, which matches performance expectations.
  • Library path injection (add_hip_paths) is well-implemented, scanning ROCm install directories and environment variables, then prepending to LD_LIBRARY_PATH/PATH.
  • OOM detection: Adding hiperroroutofmemory to the error detection in error.rs is a nice touch for user-facing error messages.
  • Potential concern: The binary_requires_hip_linux function reads the entire binary file into memory as a fallback if ldd fails. For large binaries (llama-server can be 100+ MB), this could cause memory pressure. Consider reading only a portion or using grep -c on the binary.
  • Test coverage is good: tests for normalization, feature detection, backend categorization, and prioritization ordering.
  • Documentation updates in the .mdx files are clear and well-structured. Minor nit: the link /docs/desktop/local-engine/llama-cpp (singular) vs /docs/desktop/local-engines/llama-cpp (plural) -- verify the correct path.
  • The PR is quite large and touches many layers. Maintainer-side AMD hardware testing (as discussed in comments) is important before merging.
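
The OOM detection mentioned in the findings can be pictured as a case-insensitive substring match over stderr. The sketch below is an assumption about the general shape, not the contents of error.rs: only the hiperroroutofmemory pattern comes from this PR, and the second pattern is an illustrative placeholder.

```rust
/// Lowercase substrings treated as out-of-memory indicators.
/// Only "hiperroroutofmemory" is taken from the PR; "out of memory"
/// is an illustrative placeholder for the existing patterns.
const OOM_PATTERNS: [&str; 2] = ["hiperroroutofmemory", "out of memory"];

/// Returns true if the captured stderr looks like an out-of-memory failure.
fn is_oom_error(stderr: &str) -> bool {
    // Lowercasing first lets "hipErrorOutOfMemory" match the pattern.
    let lower = stderr.to_lowercase();
    OOM_PATTERNS.iter().any(|p| lower.contains(*p))
}
```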

Recommendation: Improve needed -- the implementation is solid, but the binary file read fallback in binary_requires_hip_linux should be optimized, and AMD hardware testing should be confirmed before merge.

@tokamak-pm

tokamak-pm Bot commented May 1, 2026

Code Review

Summary: Adds end-to-end ROCm/HIP backend support for AMD GPUs: runtime detection (is_hip_runtime_available()), library path injection, binary dependency checking, upstream naming normalization (e.g. ubuntu-rocm-7.2-x64 to linux-hip-x64), backend prioritization (HIP > Vulkan, below CUDA), and documentation updates.

Key Findings:

  • Detection logic is sound: get_supported_features now requires both an AMD GPU vendor AND a working ROCm runtime (is_hip_runtime_available()), preventing false positives on systems with AMD iGPUs but no ROCm installed. This is a good improvement over the previous vendor-only check.
  • Naming normalization (normalize_upstream_backend): Clean implementation with good unit tests covering Jan-native pass-through, upstream ubuntu-rocm-* mapping, and non-HIP pass-through.
  • Prioritization: HIP is correctly placed between CUDA and Vulkan in the priority list, which is the right call for performance.
  • Documentation: Linux and Windows install docs and the llama-cpp backend guide are all updated with clear instructions for both ROCm/HIP and Vulkan paths.
  • Merge state is dirty -- conflicts need resolving before this can proceed.
  • No HIP backend binaries exist yet in Jan's llama.cpp fork releases, as discussed in comments. @louis-jan noted this and the author has a companion PR (ci: add Linux HIP (ROCm) backend build to release workflow llama.cpp#467) to add them. The feature code is ready but the backend itself is not yet available for download -- this should be coordinated.
  • Missing video demo -- @louis-jan asked for a video recording of the feature working on AMD hardware and offered @Vanalite's help with testing.
  • Type additions (vendor field on GpuInfo): Clean addition, correctly optional.
  • Tests: Good coverage with test_normalize_upstream_backend_hip and updated test_get_supported_features_* tests.
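
The ordering described under Prioritization (CUDA, then HIP, then Vulkan) can be expressed as a simple rank function. This is only a sketch of the ordering: it ignores the GPU memory check the PR applies, and both helper names and every backend string other than linux-hip-x64 are hypothetical.

```rust
/// Rank of a backend by category; lower is preferred. The category is
/// assumed to appear as a substring of the backend name.
fn backend_rank(name: &str) -> u8 {
    if name.contains("cuda") {
        0
    } else if name.contains("hip") {
        1
    } else if name.contains("vulkan") {
        2
    } else {
        3 // everything else, for example CPU-only builds
    }
}

/// Picks the most preferred backend from the available list,
/// or None when the list is empty.
fn pick_backend<'a>(available: &[&'a str]) -> Option<&'a str> {
    available.iter().copied().min_by_key(|n| backend_rank(n))
}
```

If the preferred backend later fails to load, the caller can remove it from the list and call pick_backend again, which gives the automatic fallback behavior in its simplest form.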

Recommendation: fix needed

Resolve merge conflicts, coordinate with the companion llama.cpp PR to ensure HIP binaries are available, and provide the requested video demo of the feature working on AMD hardware.

@dev-miro26 dev-miro26 closed this May 6, 2026
@github-project-automation github-project-automation Bot moved this to Done in Jan May 6, 2026

Labels

need: infra support needs: comms Major issue - we should inform users needs: rework

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

idea: rocm support on Linux

4 participants