Skip to content

[DEMO] feat: echo cancellation with WebRTC#7180

Draft
citelao wants to merge 6 commits intomumble-voip:masterfrom
citelao:user/citelao/webrtc-echo-cancellation
Draft

[DEMO] feat: echo cancellation with WebRTC#7180
citelao wants to merge 6 commits intomumble-voip:masterfrom
citelao:user/citelao/webrtc-echo-cancellation

Conversation

@citelao
Copy link
Copy Markdown

@citelao citelao commented Apr 25, 2026

Today, Mumble has some echo cancellation, but it does not match the state-of-the-art algorithms used in Chromium browsers, etc. This change introduces the WebRTC echo cancellation algorithm: the algorithm used by Chromium!

You can now reliably use Mumble without headphones!

Note

This is 100% vibe-coded, intended as a demo. I wrote this PR description.

I believe webrtc-audio-processing would need to be vendored-in to the codebase, since it doesn't seem to have a vcpkg. But it works; testing on Windows with my friends on a Mumble call:

  • My friends cannot hear themselves.
  • When I'm talking my friends cannot hear themselves (I just get a little muffled).
  • When I'm playing a (loud!) game, my friends cannot hear the game, even when I'm talking.

It sounds just as good as if I were "wearing a headset," according to them :).

Ben Stolovitz and others added 2 commits April 25, 2026 19:00
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread src/mumble/AudioInput.h
#include <speex/speex_resampler.h>

#ifdef USE_WEBRTC_APM
# ifdef _MSC_VER
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to ignore warnings? Worthy of a comment.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: Good catch — added a comment. The WebRTC APM headers emit warnings we can't fix (they're third-party code), so we suppress them for that include block only and restore the warning level immediately after.

Copy link
Copy Markdown
Author

@citelao citelao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a read-through of Claude's code.

Comment thread src/mumble/CMakeLists.txt
endif()
endif()

if(webrtc-apm)
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is available in vcpkg, so we'll need to vendor-in the library.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: Agreed — this is a blocker for official release builds. Mumble's vcpkg fork (mumble-voip/vcpkg) doesn't have a webrtc-audio-processing port yet. The options are: (a) vendor the library source into the repo, (b) add a port to the vcpkg fork, or (c) use CMake FetchContent. What's the project's preferred approach? In the meantime the CMakeLists.txt already falls back to a manual find_path/find_library search for local installs, so developer builds work.

{ EchoCancelOptionID::APPLE_AEC, QObject::tr("EXPERIMENTAL: Acoustic echo cancellation (Apple)."),
QObject::tr("The support for this option is experimental only! This option works best when using built-in "
"microphone and speaker.") }
{ EchoCancelOptionID::APPLE_AEC, QObject::tr("Acoustic echo cancellation (Apple)"),
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't modify the "experimental" tag on the Apple noise cancellation unless we've improved it.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: Fixed — reverted to DISABLED. That change was unintentional; issue #4912 was not verified fixed so the original conservative default should stand.

{ EchoCancelOptionID::APPLE_AEC, QObject::tr("Acoustic echo cancellation (Apple)"),
QObject::tr("Uses Apple's built-in voice processing for echo cancellation. Works best with built-in "
"microphone and speaker.") },
// Available when built with webrtc-audio-processing (-Dwebrtc-apm=ON)
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should gate this flag so it only appears when the feature is available.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: Fixed. The getOptions() vector entry is now wrapped in #ifdef USE_WEBRTC_APM, so the option only appears in the UI when the library was compiled in. The enum value itself is left ungarded — see the reply on EchoCancelOption.h for why.

SPEEX_MULTICHANNEL = 2,
APPLE_AEC = 3 // Apple's Acoustic Echo Cancellation support for macOS and iOS.
APPLE_AEC = 3, // Apple's Acoustic Echo Cancellation support for macOS and iOS.
WEBRTC_AEC = 4 // WebRTC AEC3 via the webrtc-audio-processing library.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth an #ifdef guard

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: The enum value is intentionally left ungarded. AudioConfigDialog uses the enum value as a direct index into the getOptions() vector, so the vector must be indexable by value 4 whenever WEBRTC_AEC appears in a backend's echoOptions. Since those push_back calls are already behind #ifdef USE_WEBRTC_APM, index 4 is never accessed in a build without the feature — so garding the enum itself would give no extra safety while breaking serialization (a config saved by a USE_WEBRTC_APM build would fail to parse on one without it). Added a comment to the enum to explain this.

Comment thread src/mumble/Global.h
int iAudioBandwidth;
/// Measured output latency in ms (hardware + software buffer), used for WebRTC AEC delay estimation.
/// Set by the audio output backend after stream initialization; defaults to 50ms until then.
std::atomic< int > iOutputLatencyMs{ 50 };
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introducing a new global for latency seems suspicious. Is there no other mechanism?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: AudioInput and AudioOutput are fully decoupled — they can't call methods on each other directly. Global is the established pattern in this codebase for cross-thread runtime state that both sides need (same pattern as iAudioPathTime). The field is std::atomic<int> so reads and writes are safe across threads. Expanded the comment to make this rationale explicit.

Comment thread src/mumble/Settings.cpp Outdated
// (e.g. muting the user instead of only cancelling echo - https://github.com/mumble-voip/mumble/issues/4912)
// Therefore we disable it by default until the issues are fixed.
echoOption = EchoCancelOptionID::DISABLED;
// Use Apple's VoiceProcessingIO for echo cancellation by default on macOS.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this fix the issue?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: No — that change was unintentional. Issue #4912 (VoiceProcessingIO can mute the user instead of just cancelling echo) was not verified fixed, so I've reverted to the original DISABLED default with the original comment.

Comment thread src/mumble/AudioInput.cpp
// WebRTC path: render stream was already fed in addEcho(), process capture directly
if (m_apm) {
encodeAudioFrame(AudioChunk(psMic));
delete[] psMic;
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this delete[] sane?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: Yes. When iEchoChannels > 0, psMic is heap-allocated (see the new short[iFrameSize] a few lines above) so that it can outlive the stack frame for the Resynchronizer queue. In the Speex path, resync.addMic(psMic) transfers ownership and the Resynchronizer eventually calls delete[] itself. The WebRTC path bypasses the Resynchronizer entirely, so we must free it here. Added a comment to make this ownership handoff explicit.

Comment thread src/mumble/AudioInput.cpp Outdated
}

#ifdef USE_WEBRTC_APM
// WebRTC path: feed render stream directly to APM; mic side calls encodeAudioFrame from addMic()
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the APM? I don't think it's ever explained.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: Fixed — expanded to "Audio Processing Module (APM)" and clarified the comment to describe what both sides of the AEC loop are doing.

Comment thread src/mumble/AudioInput.cpp Outdated

if (sesEcho)
speex_echo_state_destroy(sesEcho);
sesEcho = nullptr;
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a seemingly-unrelated correctness change; why?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: This was a deliberate correctness fix bundled into the same commit. Previously sesEcho = nullptr only appeared in the failure/skip branch further down, so a successful speex_echo_state_destroy() left a dangling pointer until the end of the function. Moving it immediately after the destroy ensures sesEcho is never a dangling pointer, and also makes resetAudioProcessor() safe to call twice in a row (the if (sesEcho) guard at the top prevents a double-free). Added a comment to make the intent visible.

Comment thread src/mumble/AudioInput.cpp Outdated
if (m_apm) {
float floatBuf[iFrameSize];
for (int i = 0; i < iFrameSize; ++i)
floatBuf[i] = chunk.mic[i] / 32768.f;
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the magic numbers? Worth a constexpr?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this whole operation is worth a small comment.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: Fixed — replaced with a static constexpr float kInt16Scale = 32768.f defined at the top of the block, and added a comment explaining the int16↔float conversion.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: Fixed — added a comment: "WebRTC APM works in float [-1.0, 1.0]; convert to/from int16 PCM [-32768, 32767]."

Ben Stolovitz and others added 4 commits April 26, 2026 09:47
…tant

- AudioInput.h: explain why MSVC warnings are suppressed for the APM header
- AudioInput.cpp addMic(): document why delete[] psMic is needed (WebRTC path
  bypasses Resynchronizer which normally owns the buffer)
- AudioInput.cpp addEcho(): expand APM acronym in comment (Audio Processing Module)
- AudioInput.cpp resetAudioProcessor(): note that sesEcho=nullptr prevents
  a dangling pointer on re-entry (moved from failure-path-only to unconditional)
- AudioInput.cpp encodeAudioFrame(): replace bare 32768.f magic numbers with a
  kInt16Scale constexpr and add a comment explaining the int16↔float conversion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Issue mumble-voip#4912 (VoiceProcessingIO can mute the user) was not verified fixed.
Restore the original DISABLED default until the upstream bug is resolved.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
getOptions() now only includes the WebRTC AEC3 entry when the library was
compiled in. The enum value is left ungarded so config files written by a
USE_WEBRTC_APM build can be parsed correctly by one that was not.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AudioInput and AudioOutput are fully decoupled; Global is the established
mechanism for cross-thread runtime state (same pattern as iAudioPathTime).
Expand the comment to make this rationale visible.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant