feat: add ROCm/HIP backend support for AMD GPUs#7820
Conversation
|
@louis-jan @Vanalite |
|
I see 2 major issues with this:
|
You are the correct. 👍
I will update soon. |
…ts and normalize backend naming conventions
… addition to .tar.gz
|
@louis-jan @Vanalite User has AMD GPU-> (yes) -> ROCm runtime installed? -> (yes) -> features.hip = true -> builds download URL, picks .tar.gz or .zip (Windows HIP uses .zip) -> When launching llama-server, the code adds ROCm library paths so the HIP binary can find libamdhip64. This feature almost similar with Vulkan feature. |
e9bb4ac to
588839b
Compare
|
@dev-miro26 We appreciate contributors who really deep dive into the code base and test how the app works before making changes since it could break the app. Regarding this feature, you can find the sources where it displays the backend list (attached the link above for the backend repository) As if users don't download, the feature would not work because even it can detect it still could not select the backend automatically (as what the issue mentioned). If users downloaded the backend users will still have to select it manually so the change does not make sense? So it would be great to:
|
|
@louis-jan |
|
Thank you so much! |
|
@Vanalite can help test on AMD device if there is a hardware constraint |
|
@louis-jan |
|
@louis-jan @Vanalite |
| // Normalize upstream naming (e.g. "ubuntu-rocm-7.2-x64") to Jan | ||
| // conventions (e.g. "linux-hip-x64") so the folder name matches what | ||
| // the backend discovery and dropdown logic expects. | ||
| const backendIdentifier = await mapOldBackendToNew(rawBackend) |
There was a problem hiding this comment.
I think it's better to use a different function? mapOldBackendToNew is used for migrating avx* backend to common_cpus for cpu agnostic.
Code ReviewSummaryLarge PR adding end-to-end ROCm/HIP backend support for AMD GPUs. New detection functions ( Key Findings
Recommendation: fix neededVerify/implement the |
Code Review — follow-up (3 new commits since last review)What changed
Previous blockers — resolved1. 2. Windows archive format — addressed. New work in this round
Remaining concerns1. No HIP binary in janhq/llama.cpp releases yet (blocking, not a code issue) 2. AMD hardware testing still outstanding 3. Minor: 4. Minor: Recommendation: fix needed (external dependency)The code quality is now solid. The block is the missing HIP binary release in janhq/llama.cpp. Once that ships and AMD hardware testing is done, this is ready to merge. |
Code Review: feat: add ROCm/HIP backend support for AMD GPUsSummaryThis PR adds comprehensive ROCm/HIP backend support for AMD GPUs, including runtime detection, library path injection, binary dependency checking, backend naming normalization, prioritization (HIP > Vulkan, CUDA > HIP), OOM error detection, and documentation updates. Strengths
Issues and ConcernsMinor issues:
ArchitectureThe layered approach (utils -> plugin -> extension -> docs) is clean. The new Recommendationcan merge -- This is a solid, well-tested feature addition. The issues above are all minor and can be addressed in follow-ups. |
PR Review: feat: add ROCm/HIP backend support for AMD GPUsSummaryAdds full ROCm/HIP backend support for AMD GPUs across the stack: runtime detection of ROCm libraries, HIP backend prioritization (above Vulkan, below CUDA), upstream naming normalization (e.g., Key FindingsPositives:
Concerns:
Recommendationimprove needed -- Strong implementation with good test coverage. Address the broad |
Code ReviewSummary: Large PR adding end-to-end ROCm/HIP backend support for AMD GPUs, including runtime detection, library path injection, backend naming normalization, prioritization logic, documentation updates, and extensive tests. Findings:
Recommendation: Improve needed -- the implementation is solid, but the binary file read fallback in |
Code ReviewSummary: Adds end-to-end ROCm/HIP backend support for AMD GPUs: runtime detection ( Key Findings:
Recommendation: fix needed Resolve merge conflicts, coordinate with the companion llama.cpp PR to ensure HIP binaries are available, and provide the requested video demo of the feature working on AMD hardware. |
Describe Your Changes
Add full ROCm/HIP backend support for AMD GPUs in Jan's llama.cpp engine. This enables users with AMD GPUs and ROCm installed to use the HIP backend, which offers significantly better inference performance compared to Vulkan.
What this PR does:
is_hip_runtime_available()function checks for ROCm libraries (libamdhip64.soon Linux,amdhip64.dllon Windows) in standard paths and environment variables (HIP_PATH,ROCM_PATH).add_hip_paths()injects ROCm library directories intoLD_LIBRARY_PATH(Linux) orPATH(Windows) before launchingllama-server, mirroring the existing CUDA path injection pattern.binary_requires_hip()usesldd(Linux) or PE scanning (Windows) to detect if allama-serverbinary is linked against HIP/ROCm libraries, enabling clear user-facing warnings when ROCm is missing.get_supported_features()setfeatures.hip = truebased solely on GPU vendor being AMD. Now it requires both an AMD GPU and a working ROCm runtime, preventing false positives on systems with AMD iGPUs but no ROCm installed.map_old_backend_to_new()andget_backend_category()now map upstreamggml-org/llama.cppnaming (ubuntu-rocm-7.2-x64) to Jan's convention (linux-hip-x64), so manually installed upstream ROCm builds are discovered and prioritized correctly.hiperroroutofmemoryto the stderr parser so HIP out-of-memory errors produce the same user-friendly message as CUDA/Vulkan/Metal.ggerganov→ggml-org).Changes across layers
system.rsis_hip_runtime_available(),add_hip_paths(),binary_requires_hip()backend.rscommands.rs,device.rserror.rsindex.tsllama-cpp.mdx,linux.mdx,windows.mdxFixes Issues
Self Checklist
Added relevant comments, esp in complex areas
Updated docs (for bug fixes / features)
Created issues for follow-up changes or refactoring needed
Follow-up:
janhq/llama.cppCI pipeline to produce HIP release assets