fix(http): preserve raw URL bytes via opt-in url_raw flag#1533
fix(http): preserve raw URL bytes via opt-in url_raw flag#1533juandiego-bmu wants to merge 2 commits intoOWASP:masterfrom
Conversation
aiohttp passes string URLs through yarl.URL(str), which decodes percent- encoded dots and collapses dot-segments before the bytes hit the wire. Modules that rely on traversal in the URL path therefore send a flattened path to the target and never trigger the bypass condition they describe. Adds an opt-in step-level flag url_raw: true that wraps the URL with yarl.URL(url, encoded=True). Default behavior is unchanged. Updates apache_cve_2021_41773.yaml with the flag as a regression case. See linked issue for the full reproduction and the list of affected modules.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
Summary by CodeRabbit
Walkthroughsend_request now consumes an optional Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Adds an opt-in url_raw: true flag for HTTP module steps to preserve percent-encoded and dot-segment URL bytes on the wire (avoiding aiohttp/yarl normalization), and updates the Apache CVE-2021-41773 module to use it as a regression case.
Changes:
- Import
yarl.URLand wrap request URLs withURL(..., encoded=True)whenurl_rawis enabled. - Add
url_raw: truetoapache_cve_2021_41773.yamlstep configuration.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
nettacker/core/lib/http.py |
Adds url_raw handling to optionally preserve raw URL bytes by constructing an encoded yarl.URL before calling aiohttp. |
nettacker/modules/vuln/apache_cve_2021_41773.yaml |
Opts the module into the new raw-URL behavior to prevent traversal payload normalization. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if request_options.pop("url_raw", False): | ||
| request_options["url"] = URL(request_options["url"], encoded=True) | ||
| async with aiohttp.ClientSession() as session: |
There was a problem hiding this comment.
url_raw is popped from request_options before the URL(..., encoded=True) conversion runs. If the conversion raises (e.g., due to invalid percent-escapes or other characters that are only accepted in the non-encoded constructor), the caller’s retry loop will run again with url_raw already removed and will silently fall back to the default URL normalization path, defeating the opt-in behavior. Consider reading the flag with get() and only pop()-ing it after a successful conversion (or build a shallow copy of request_options for the aiohttp call and leave the original dict untouched).
The schema validator in test_yaml_schema_and_regex.py rejects unknown keys. Add url_raw as Optional(bool) so modules that opt in to raw URL preservation pass schema validation.
What this changes
Adds an opt-in
url_raw: truestep-level flag in HTTP modules. When set, the URL is wrapped withyarl.URL(url, encoded=True)before reachingaiohttp, so percent-encoded and literal dot-segments survive intact onto the wire.Closes #1532 (URL normalization breaks path-traversal modules).
Why
aiohttp passes string URLs through
yarl.URL(str), which decodes%2eto.and then collapses..segments. Modules that put traversal in the URL path therefore send a flattened path to the target, and the bypass condition the module is checking for never triggers. Full reproduction and affected-module table in the linked issue.The change
Two-line patch in
nettacker/core/lib/http.py::send_request:Default behavior is unchanged: a module without
url_rawfollows the existing normalization path. The flag is the opt-in.apache_cve_2021_41773.yamlis updated to seturl_raw: trueas a regression case. With the flag in place, a request like:leaves the client as
GET /cgi-bin/.%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd HTTP/1.1instead ofGET /etc/passwd HTTP/1.1.Verification
End-to-end test against a local TCP echo server, on Nettacker's pinned
aiohttp 3.13.5/yarl 1.23.0:url_rawnot set (default)GET /etc/passwd HTTP/1.1url_raw: trueGET /cgi-bin/.%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd HTTP/1.1GET /cgi-bin/.%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd HTTP/1.1What this does not do
yarltopyproject.tomlas a direct dep. It is already pulled transitively byaiohttp, so the import works today; happy to add it explicitly if reviewers prefer.