fix(files): don't reject external URLs containing '..' in file parse validation#4821
Conversation
…validation The file block's file_fetch operation rejected any external URL whose path contained '..' (e.g. Slack files-pri slugs with a literal '...') with 'Access denied: path traversal detected'. Traversal checks only apply to local paths — external http(s) URLs are fetched with SSRF protection downstream and are never resolved against the filesystem, so they now short-circuit as valid. Internal /api/files/serve/ URLs keep full traversal protection.
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
PR SummaryMedium Risk Overview
Adds unit tests for allowed Slack-style external URLs and for blocked traversal on HTTPS URLs that look like internal serve paths. Reviewed by Cursor Bugbot for commit 499e834. Configure here. |
|
@greptile |
|
@cursor review |
Greptile SummaryThis PR fixes a false-positive path traversal rejection that blocked external URLs whose paths legitimately contain
Confidence Score: 5/5Safe to merge — the change is narrowly scoped, the routing interaction with The early-return applies only to No files require special attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[filePath input] --> B{null byte?}
B -- yes --> REJECT1[❌ null byte detected]
B -- no --> C{http/https prefix\nAND NOT isInternalFileUrl?}
C -- yes --> ALLOW1[✅ return valid\nexternal URL — SSRF\nprotection downstream]
C -- no --> D{contains '..'?}
D -- yes --> REJECT2[❌ path traversal detected]
D -- no --> E{contains '~'?}
E -- yes --> REJECT3[❌ tilde not allowed]
E -- no --> F{starts with '/' AND\nNOT isInternalFileUrl?}
F -- yes --> REJECT4[❌ outside allowed directory]
F -- no --> G{drive letter e.g. C:\\?}
G -- yes --> REJECT5[❌ outside allowed directory]
G -- no --> ALLOW2[✅ return valid]
Reviews (3): Last reviewed commit: "fix(files): keep traversal protection fo..." | Re-trigger Greptile |
Greptile SummaryThis PR fixes a false-positive in
Confidence Score: 4/5Safe to merge; the external-URL bypass is correct and downstream SSRF guards remain intact for all reachable code paths. The early-return for http/https URLs is the right fix for the Slack slug false-positive, and the existing security test suite still blocks plain /api/files/serve/ traversal attempts. The docstring claim about internal-URL traversal protection is imprecise for https-prefixed URLs containing /api/files/serve/, but downstream controls (URL normalization + access control) remain solid with no exploit path. apps/sim/app/api/files/parse/route.ts — specifically the interaction between the new https early-return in validateFilePath and the isInternalFileUrl substring check in parseFileSingle. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["filePath input"] --> B{"null byte?"}
B -- yes --> REJECT1["reject: null byte"]
B -- no --> C{"startsWith http/https?"}
C -- yes --> D["isValid: true - NEW early return"]
D --> E{"isInternalFileUrl? contains /api/files/serve/"}
E -- yes --> F["handleCloudFile - extractStorageKey + verifyFileAccess"]
E -- no --> G["handleExternalUrl - validateUrlWithDNS + secureFetchWithPinnedIP"]
C -- no --> H{"contains ..?"}
H -- yes --> REJECT2["reject: path traversal"]
H -- no --> I{"contains ~ or absolute path?"}
I -- yes --> REJECT3["reject: invalid path"]
I -- no --> K["handleLocalFile or handleCloudFile"]
Reviews (2): Last reviewed commit: "test(files): assert success explicitly i..." | Re-trigger Greptile |
|
@greptile |
|
@cursor review |
There was a problem hiding this comment.
✅ Bugbot reviewed your changes and found no new issues!
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit 499e834. Configure here.
Summary
file_fetchoperation rejected any external URL whose path contained..(e.g. Slackfiles-prislugs with a literal...) withAccess denied: path traversal detected..,~, leading/, drive letters) now only apply to local/internal paths — externalhttp(s)://URLs short-circuit as valid since they're fetched with SSRF protection downstream (validateUrlWithDNS+secureFetchWithPinnedIP) and never touch the filesystem/api/files/serve/...URLs keep full traversal protection (no regression —/api/files/serve/../../../etc/passwdis still rejected)..are allowed throughType of Change
Testing
Tested manually. Added unit test covering a Slack-style external URL containing
..; existing path-traversal security suite still guards local/internal paths.Checklist