
fix: Upgrade cyclors 0.2.7 -> 0.3.10 to fix DDS handle refcount overflow #695

Open
cmeng-gao wants to merge 1 commit into eclipse-zenoh:main from opkit:fix/upgrade-cyclors-handle-overflow

@cmeng-gao
Contributor

cyclors 0.2.7 embeds a CycloneDDS snapshot from August 2023, whose handle system limits the child refcount to 14 bits (max 16,383). When the DDS participant accumulates more children than this limit (e.g. topic entities created by cdds_create_blob_topic), the refcount overflows into the HDL_FLAG_NO_USER_ACCESS bit (0x04000000), and all subsequent user-facing DDS API calls return DDS_RETCODE_BAD_PARAMETER (-3).

This manifests as "Error creating DDS Reader/Writer (retcode: -3)" failures after hours of operation in production environments.
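
The overflow described above can be illustrated with a small, self-contained sketch. It does not reproduce the CycloneDDS source: only the 14-bit width and the flag value 0x04000000 come from this description. The shift of 12 is inferred from them (0x04000000 is bit 26, and 26 - 14 = 12), and the helper names `add_children`, `child_refcount` and `user_access_blocked` are hypothetical.

```rust
// Illustrative model of a 32-bit "count + flags" word.
// Assumption: the child refcount occupies 14 bits directly below the
// HDL_FLAG_NO_USER_ACCESS bit (0x04000000), i.e. bits 12..=25.
const REFCOUNT_SHIFT: u32 = 12; // inferred, not taken from the source
const REFCOUNT_BITS: u32 = 14;
const REFCOUNT_UNIT: u32 = 1 << REFCOUNT_SHIFT;
const REFCOUNT_MASK: u32 = ((1 << REFCOUNT_BITS) - 1) << REFCOUNT_SHIFT;
const HDL_FLAG_NO_USER_ACCESS: u32 = 0x0400_0000;

/// Hypothetical helper: register `n` children by adding `n` refcount units.
fn add_children(cnt_flags: u32, n: u32) -> u32 {
    cnt_flags.wrapping_add(n.wrapping_mul(REFCOUNT_UNIT))
}

/// Hypothetical helper: read the child refcount back out of the word.
fn child_refcount(cnt_flags: u32) -> u32 {
    (cnt_flags & REFCOUNT_MASK) >> REFCOUNT_SHIFT
}

/// Hypothetical helper: true once the NO_USER_ACCESS bit is set.
fn user_access_blocked(cnt_flags: u32) -> bool {
    cnt_flags & HDL_FLAG_NO_USER_ACCESS != 0
}

fn main() {
    // 16,383 children still fit in the 14-bit field.
    let at_limit = add_children(0, 16_383);
    println!(
        "at limit: refcount={} blocked={}",
        child_refcount(at_limit),
        user_access_blocked(at_limit)
    );

    // One more child carries out of the field and sets the flag bit.
    let overflowed = add_children(at_limit, 1);
    println!(
        "overflow: refcount={} blocked={} word={:#010x}",
        child_refcount(overflowed),
        user_access_blocked(overflowed),
        overflowed
    );
}
```

In this model the 16,384th child wraps the counter to zero and sets the flag bit, which matches the symptom above: the handle looks inaccessible to user-facing calls even though nothing was explicitly closed.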

cyclors 0.3.10 embeds CycloneDDS with the fix from PR eclipse-cyclonedds/cyclonedds#2119 which expands cnt_flags to 64-bit on 64-bit platforms, effectively eliminating the overflow.

Additionally:

  • Remove the old iceoryx SHM chunk handling code from dds_types.rs, as cyclors 0.3.x uses PSMX, where shared memory is handled transparently by CycloneDDS through ddsi_serdata_to_ser_ref.
  • Skip the dds_shm CI steps on Windows, as cyclors 0.3.10 explicitly rejects iceoryx on Windows (not supported by the PSMX plugin).

Fixes: eclipse-cyclonedds/cyclonedds#1679
Fixes: eclipse-cyclonedds/cyclonedds#2022

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cmeng-gao
Contributor Author

@JEnoch

Development

Successfully merging this pull request may close these issues.

Errors occur during high-load communication.
Crash when initialising a big amount of Action CLients, Services and Publishers (ROS2 Humble)