Skip to content

Fix a few subtle URI unescaping RFC compliant issues#693

Merged
jviotti merged 2 commits intomainfrom
core-uri-fixes
Mar 25, 2026
Merged

Fix a few subtle URI unescaping RFC compliant issues#693
jviotti merged 2 commits intomainfrom
core-uri-fixes

Conversation

@jviotti
Copy link
Copy Markdown
Member

@jviotti jviotti commented Mar 25, 2026

Signed-off-by: Juan Cruz Viotti jv@jviotti.com

Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 7 files

@augmentcode
Copy link
Copy Markdown

augmentcode bot commented Mar 25, 2026

🤖 Augment PR Summary

Summary: Updates the vendored sourcemeta/core dependency and tightens URI percent-encoding handling to be more RFC 3986–compliant, plus hardens JSON Schema base-dialect resolution.

Changes:

  • Unescape only unreserved characters during URI parsing; keep reserved characters percent-encoded.
  • Preserve existing percent-encoded triplets during recomposition to avoid double-encoding.
  • Add URIEscapeMode::Path and use it when recomposing the path component.
  • Emit percent-escapes as two uppercase hex digits (e.g., %0A), and normalize hex case during canonicalization.
  • Fully unescape file:// paths when converting to filesystem paths.
  • Detect metaschema-chain cycles when computing JSON Schema base_dialect.

Technical Notes: These changes mainly affect edge cases around reserved-character escaping/unescaping and metaschema self-references/cycles.

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 3 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

@@ -132,9 +153,9 @@ auto URI::recompose_without_fragment() const -> std::optional<std::string> {
if (result_scheme.has_value() && !has_authority &&
path_value.starts_with("/") && !path_value.starts_with("//")) {
escape_component_to_string(result, path_value.substr(1),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path_value.substr(1) drops the leading / when a scheme is present but there’s no authority, which can change scheme:/path into scheme:path (different URI semantics and potentially breaks round-tripping). Could we confirm this is intended for e.g. file:/... / http:/... forms?

Severity: high

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

}

// Path is already fully decoded, just return it
uri_unescape_all_inplace(path);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uri_unescape_all_inplace(path) happens after the Windows-absolute detection (/C:/), so inputs like file:///C%3A/Windows won’t be recognized as Windows paths before separator conversion/leading-slash handling. Consider whether unescaping needs to happen earlier to keep %3A-encoded drive letters working.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>
@jviotti jviotti merged commit bead3e6 into main Mar 25, 2026
14 checks passed
@jviotti jviotti deleted the core-uri-fixes branch March 25, 2026 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant