Skip to content

fix(files): don't reject external URLs containing '..' in file parse validation#4821

Merged
waleedlatif1 merged 4 commits into
stagingfrom
waleedlatif1/file-block-path-traversal-error
May 31, 2026
Merged

fix(files): don't reject external URLs containing '..' in file parse validation#4821
waleedlatif1 merged 4 commits into
stagingfrom
waleedlatif1/file-block-path-traversal-error

Conversation

@waleedlatif1
Copy link
Copy Markdown
Collaborator

Summary

  • The file block's file_fetch operation rejected any external URL whose path contained .. (e.g. Slack files-pri slugs with a literal ...) with Access denied: path traversal detected
  • Filesystem traversal checks (.., ~, leading /, drive letters) now only apply to local/internal paths — external http(s):// URLs short-circuit as valid since they're fetched with SSRF protection downstream (validateUrlWithDNS + secureFetchWithPinnedIP) and never touch the filesystem
  • Internal /api/files/serve/... URLs keep full traversal protection (no regression — /api/files/serve/../../../etc/passwd is still rejected)
  • Added a test confirming external URLs with .. are allowed through

Type of Change

  • Bug fix

Testing

Tested manually. Added unit test covering a Slack-style external URL containing ..; existing path-traversal security suite still guards local/internal paths.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

…validation

The file block's file_fetch operation rejected any external URL whose path
contained '..' (e.g. Slack files-pri slugs with a literal '...') with
'Access denied: path traversal detected'. Traversal checks only apply to
local paths — external http(s) URLs are fetched with SSRF protection
downstream and are never resolved against the filesystem, so they now
short-circuit as valid. Internal /api/files/serve/ URLs keep full traversal
protection.
@vercel
Copy link
Copy Markdown

vercel Bot commented May 31, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped May 31, 2026 2:29am

Request Review

@cursor
Copy link
Copy Markdown

cursor Bot commented May 31, 2026

PR Summary

Medium Risk
Touches security-sensitive path validation but narrows traversal rules to paths that hit storage; regression coverage keeps internal serve URLs protected.

Overview
Fixes false path-traversal rejections when the file parse API validates http/https inputs whose URL path contains .. (e.g. Slack files-pri slugs with a literal ...), which previously failed with Access denied: path traversal detected and blocked file_fetch.

validateFilePath now treats non-internal external URLs as valid after null-byte checks only; filesystem-oriented rules (.., ~, absolute paths) still apply to local keys and /api/files/serve/... references, including https://…/api/files/serve/… URLs that match isInternalFileUrl. External fetches remain gated by downstream SSRF handling.

Adds unit tests for allowed Slack-style external URLs and for blocked traversal on HTTPS URLs that look like internal serve paths.

Reviewed by Cursor Bugbot for commit 499e834. Configure here.

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 31, 2026

Greptile Summary

This PR fixes a false-positive path traversal rejection that blocked external URLs whose paths legitimately contain .. (e.g., Slack files-pri slugs). The fix short-circuits validateFilePath for http/https URLs that are not internal serve paths, relying on the existing downstream SSRF controls (validateUrlWithDNS + secureFetchWithPinnedIP) instead of filesystem-oriented checks.

  • validateFilePath now early-returns valid for any http(s):// URL that does not match isInternalFileUrl, preserving full ../~ traversal protection for /api/files/serve/… references (including absolute https://host/api/files/serve/… variants).
  • Two new tests cover the regression: one confirming Slack-style external URLs with .. are accepted, another confirming https://attacker.com/api/files/serve/../../../etc/passwd is still rejected.

Confidence Score: 5/5

Safe to merge — the change is narrowly scoped, the routing interaction with isInternalFileUrl is correct, and downstream SSRF controls remain the effective gate for external URLs.

The early-return applies only to http/https URLs that do not match isInternalFileUrl, which is the same predicate used by parseFileSingle to route to handleCloudFile. Any URL containing /api/files/serve/ — whether relative or absolute — still runs the full traversal checks. External URLs never touch the filesystem and are protected by validateUrlWithDNS + secureFetchWithPinnedIP. Both the positive and negative test cases confirm the boundary is enforced correctly.

No files require special attention.

Important Files Changed

Filename Overview
apps/sim/app/api/files/parse/route.ts Adds an early-return in validateFilePath for external HTTP(S) URLs that are not internal serve paths; logic and routing interaction with isInternalFileUrl are correct.
apps/sim/app/api/files/parse/route.test.ts Adds two well-scoped tests: one confirming the Slack-URL bug fix and one confirming the isInternalFileUrl carve-out still enforces traversal protection.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[filePath input] --> B{null byte?}
    B -- yes --> REJECT1[❌ null byte detected]
    B -- no --> C{http/https prefix\nAND NOT isInternalFileUrl?}
    C -- yes --> ALLOW1[✅ return valid\nexternal URL — SSRF\nprotection downstream]
    C -- no --> D{contains '..'?}
    D -- yes --> REJECT2[❌ path traversal detected]
    D -- no --> E{contains '~'?}
    E -- yes --> REJECT3[❌ tilde not allowed]
    E -- no --> F{starts with '/' AND\nNOT isInternalFileUrl?}
    F -- yes --> REJECT4[❌ outside allowed directory]
    F -- no --> G{drive letter e.g. C:\\?}
    G -- yes --> REJECT5[❌ outside allowed directory]
    G -- no --> ALLOW2[✅ return valid]
Loading

Reviews (3): Last reviewed commit: "fix(files): keep traversal protection fo..." | Re-trigger Greptile

Comment thread apps/sim/app/api/files/parse/route.test.ts Outdated
Comment thread apps/sim/app/api/files/parse/route.ts
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 31, 2026

Greptile Summary

This PR fixes a false-positive in validateFilePath where any URL whose path component contained .. was rejected with "Access denied: path traversal detected", breaking Slack file URLs whose slugs use a literal ... ellipsis. The fix adds an early-return for http:///https:// URLs, since those are never resolved against the filesystem and are guarded downstream by validateUrlWithDNS + secureFetchWithPinnedIP.

  • Early-return for external URLs: http:// and https:// URLs now skip all path-traversal checks in validateFilePath and route to handleExternalUrl, which applies full SSRF protection.
  • Internal URL traversal protection maintained: Plain /api/files/serve/... paths (no http:// prefix) still hit the .. check and are blocked as before.
  • New regression test: Exercises a Slack-style URL with ... in the slug, asserting it passes validation and reaches secureFetchWithPinnedIP.

Confidence Score: 4/5

Safe to merge; the external-URL bypass is correct and downstream SSRF guards remain intact for all reachable code paths.

The early-return for http/https URLs is the right fix for the Slack slug false-positive, and the existing security test suite still blocks plain /api/files/serve/ traversal attempts. The docstring claim about internal-URL traversal protection is imprecise for https-prefixed URLs containing /api/files/serve/, but downstream controls (URL normalization + access control) remain solid with no exploit path.

apps/sim/app/api/files/parse/route.ts — specifically the interaction between the new https early-return in validateFilePath and the isInternalFileUrl substring check in parseFileSingle.

Important Files Changed

Filename Overview
apps/sim/app/api/files/parse/route.ts Added early-return for http/https URLs in validateFilePath; docstring slightly overstates protection for https://-prefixed internal-looking URLs (see inline comment).
apps/sim/app/api/files/parse/route.test.ts Added a well-structured test that verifies a Slack-style ..-containing URL passes validation and reaches the SSRF-protected pinned fetch, without breaking existing traversal tests.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["filePath input"] --> B{"null byte?"}
    B -- yes --> REJECT1["reject: null byte"]
    B -- no --> C{"startsWith http/https?"}
    C -- yes --> D["isValid: true - NEW early return"]
    D --> E{"isInternalFileUrl? contains /api/files/serve/"}
    E -- yes --> F["handleCloudFile - extractStorageKey + verifyFileAccess"]
    E -- no --> G["handleExternalUrl - validateUrlWithDNS + secureFetchWithPinnedIP"]
    C -- no --> H{"contains ..?"}
    H -- yes --> REJECT2["reject: path traversal"]
    H -- no --> I{"contains ~ or absolute path?"}
    I -- yes --> REJECT3["reject: invalid path"]
    I -- no --> K["handleLocalFile or handleCloudFile"]
Loading

Reviews (2): Last reviewed commit: "test(files): assert success explicitly i..." | Re-trigger Greptile

Comment thread apps/sim/app/api/files/parse/route.ts
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 499e834. Configure here.

@waleedlatif1 waleedlatif1 merged commit 6c476cf into staging May 31, 2026
10 checks passed
@waleedlatif1 waleedlatif1 deleted the waleedlatif1/file-block-path-traversal-error branch May 31, 2026 02:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant