Bugfix: Build fails to enable ARM CPU crypto extensions #309

Open
pdath wants to merge 1 commit into bitcoinknots:29.x-knots from pdath:fix-arm-crypto-feature

pdath commented Jun 2, 2026

When building on an ARM64 CPU with the crypto extensions, the build process fails to enable the custom assembler version. It instead uses the software version.

This patch relates to this issue:
#308

Example build command:
cmake -B build \
  -DENABLE_WALLET=OFF \
  -DWITH_ZMQ=ON \
  -DENABLE_TOR_SUBPROCESS=OFF \
  -D RDTS_CONSENT=IMPLICIT \
  -DCMAKE_CXX_FLAGS_RELWITHDEBINFO="-O2 -g0 -march=native"

When you run bitcoind, this appears near the top of the log:
2026-05-31T23:36:14.290939Z Using the 'standard' SHA256 implementation

After MANY hours, I have finally tracked this down. In this file: ./cmake/introspection.cmake
Line 282 has:
#pragma GCC target ("armv8-a+crypto")
The GCC target has to be a feature, not an architecture. For example, this is valid:
#pragma GCC target ("+crypto")
You can match an architecture using this syntax:
#pragma GCC target ("arch=armv8-a+crypto")

I think matching on the feature is better, because there may be other ARM CPUs with the crypto feature set (in the future) that are not this exact architecture.
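
The three pragma forms discussed above can be tried outside the build system. The following is a minimal sketch, not the actual check in cmake/introspection.cmake: the function names `make_probe` and `try_pragma` are hypothetical, the probe source is an illustration that uses one SHA-256 intrinsic from `arm_neon.h`, and it assumes an AArch64 host with GCC available as `g++` (or set via `CXX`).

```shell
#!/bin/sh
# Print a small C++ probe that uses an ARMv8 SHA-256 intrinsic under the
# given "#pragma GCC target" string. Illustrative only: this is NOT the
# source embedded in cmake/introspection.cmake.
make_probe() {
  cat <<EOF
#include <arm_neon.h>
#pragma GCC target ("$1")
int main()
{
  uint32x4_t a = vdupq_n_u32(0), b = vdupq_n_u32(0), c = vdupq_n_u32(0);
  a = vsha256hq_u32(a, b, c);
  return (int)vgetq_lane_u32(a, 0);
}
EOF
}

# Compile the probe without linking. Exit status 0 means the compiler
# accepted both the pragma and the intrinsic.
try_pragma() {
  make_probe "$1" | "${CXX:-g++}" -x c++ -c -o /dev/null -
}

# Example (run on the ARM64 build host):
#   for t in "+crypto" "arch=armv8-a+crypto" "armv8-a+crypto"; do
#     if try_pragma "$t"; then echo "$t: accepted"; else echo "$t: rejected"; fi
#   done
```

Going by the description above, the first two strings should be accepted and the third should not, but the exact diagnostics depend on the GCC version.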

Also note that if you apply this fix, it still works even without -march=native.

With the fix applied, you now get this in the log:
2026-06-02T09:42:45Z Using the 'arm_shani(1way;2way)' SHA256 implementation
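
To check which implementation a node selected without scrolling through the log, the relevant line can be extracted with `sed`. This is a small sketch: the function name `sha256_impl_from_log` is hypothetical, and the log path in the usage comment is an assumption about a default data directory, not something stated in this PR.

```shell
#!/bin/sh
# Read log text on stdin and print the SHA256 implementation name(s)
# reported by lines of the form:
#   ... Using the '<name>' SHA256 implementation
sha256_impl_from_log() {
  sed -n "s/.*Using the '\([^']*\)' SHA256 implementation.*/\1/p"
}

# Usage (path is an assumption; adjust to your datadir):
#   sha256_impl_from_log < ~/.bitcoin/debug.log
```

A result of `standard` indicates the software fallback; a name starting with `arm_shani` indicates the hardware path shown in the log line above.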

pdath commented Jun 3, 2026

I have tested this code change by mining a block on testnet4.
https://mempool.guide/testnet4/block/137635

Copilot AI left a comment

Pull request overview

Fixes ARM64 builds failing to detect/enable the optimized ARM SHA-NI (“crypto extensions”) SHA256 implementation by correcting the GCC #pragma target syntax used during CMake feature introspection.

Changes:

  • Update the GCC target pragma in the ARM SHA-NI compile check to use a feature specifier ("+crypto") rather than an architecture string ("armv8-a+crypto").

@mweinberg

I can confirm this bug and the proposed fix on AWS Graviton4 (c8g.medium, ARM Neoverse V2).

Environment

  • Bitcoin Knots v29.3.knots20260507
  • Amazon Linux 2023, GCC 11.5.0 (aarch64)
  • AWS c8g.medium (1 vCPU Graviton4, 2 GB RAM)
  • Build flags: -O2 -march=armv8.2-a+crypto

Fix

cmake/introspection.cmake line 282:

-#pragma GCC target ("armv8-a+crypto")
+#pragma GCC target ("+crypto")

Confirmation

Before fix (startup log):

2026-06-03T00:06:51Z Using the 'standard' SHA256 implementation

After fix (startup log):

2026-06-03T04:27:39Z Using the 'arm_shani(1way;2way)' SHA256 implementation

Binary Verification

$ objdump -d bitcoind | grep -c 'sha256'

| Binary | SHA256 hw instructions |
|---|---|
| Before fix | 119 |
| After fix | 497 |
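
A plain `grep -c 'sha256'` counts every disassembly line containing that string, which can include symbol names and branch targets as well as instructions. A stricter count, sketched below, matches only the four AArch64 SHA-256 mnemonics (`sha256h`, `sha256h2`, `sha256su0`, `sha256su1`) as whole words. The function name `count_sha256_insns` is hypothetical, and this was not the command used to produce the numbers above.

```shell
#!/bin/sh
# Read `objdump -d` output on stdin and print the number of lines that
# contain one of the AArch64 SHA-256 instruction mnemonics as a whole word.
# Labels such as <sha256::Transform> are not counted.
count_sha256_insns() {
  grep -cEw 'sha256(h|h2|su0|su1)'
}

# Usage:
#   objdump -d bitcoind | count_sha256_insns
```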

Benchmarks

Same machine, same chain data (block 952,280), warm EBS volume. Only the binary was swapped between runs.

| Test | standard | arm_shani | Improvement |
|---|---|---|---|
| verifychain 4 100 | 5m 12s | 4m 18s | 17% faster |
| gettxoutsetinfo (165M UTXOs) | 1m 58s | 1m 38s | 20% faster |

Both tests are I/O-mixed workloads (LevelDB reads dominate). The actual SHA256 throughput improvement is much larger — isolated getblocktemplate testing (pure merkle root computation) showed a 2.9× speedup (301ms → 105ms per call).
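
For anyone re-deriving the figures, the arithmetic can be sketched with two small helpers (the names `time_saved_pct` and `speedup` are hypothetical). Time saved and speedup ratio are different measures, and the two table rows above appear to use one each: 17% matches the time saved for verifychain, while 20% matches the speedup ratio for gettxoutsetinfo.

```shell
#!/bin/sh
# Percentage of wall-clock time saved going from $1 (before) to $2 (after).
# Both arguments must be in the same unit (here: seconds or milliseconds).
time_saved_pct() {
  awk -v b="$1" -v a="$2" 'BEGIN { printf "%.1f\n", (b - a) / b * 100 }'
}

# Speedup ratio: how many times faster the "after" run is.
speedup() {
  awk -v b="$1" -v a="$2" 'BEGIN { printf "%.2f\n", b / a }'
}

# verifychain:      5m 12s = 312 s, 4m 18s = 258 s
#   time_saved_pct 312 258   -> 17.3
#   speedup 312 258          -> 1.21
# gettxoutsetinfo:  1m 58s = 118 s, 1m 38s = 98 s
#   time_saved_pct 118 98    -> 16.9
#   speedup 118 98           -> 1.20
# getblocktemplate: 301 ms -> 105 ms
#   speedup 301 105          -> 2.87
```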

Impact

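This affects all aarch64/ARM64 builds using GCC. Anyone running Knots on a Raspberry Pi 5, AWS Graviton, Ampere, Apple Silicon (Linux), or any other ARM64 platform with the crypto extensions is silently falling back to software SHA256. (The Raspberry Pi 4's BCM2711 does not implement the crypto extensions, so it uses the software path with or without this fix.)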
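
Whether a given Linux ARM64 machine is affected depends on whether its CPU advertises the SHA-256 instructions. A minimal check, assuming the usual `Features` line that the Linux kernel prints in `/proc/cpuinfo` on arm64, might look like this (the function name `has_sha2_feature` is hypothetical):

```shell
#!/bin/sh
# Exit 0 if the cpuinfo text lists the "sha2" feature flag, non-zero
# otherwise. Reads the file given as $1 (default /proc/cpuinfo); pass "-"
# to read from stdin.
has_sha2_feature() {
  grep -Eq '^Features[[:space:]]*:.*[[:space:]]sha2([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Usage:
#   if has_sha2_feature; then
#     echo "CPU advertises SHA-256 instructions"
#   else
#     echo "no sha2 flag found"
#   fi
```

If the flag is present but the node still logs the 'standard' implementation, the build-time check is the likely culprit.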

The #pragma GCC target on AArch64 expects a feature string (e.g., "+crypto") or an arch= prefix (e.g., "arch=armv8-a+crypto"). Passing "armv8-a+crypto" without the prefix is silently ignored by GCC, causing the compile-time feature test to rely solely on the -march CXXFLAGS rather than the pragma — which then fails to propagate correctly to the runtime SHA256 autodetection.
