Skip to content

[DRAFT] Fix diff scan timeout propagation + API failure handling#195

Closed
lelia wants to merge 4 commits into
mainfrom
lelia/fix-diff-scan-timeout
Closed

[DRAFT] Fix diff scan timeout propagation + API failure handling#195
lelia wants to merge 4 commits into
mainfrom
lelia/fix-diff-scan-timeout

Conversation

@lelia
Copy link
Copy Markdown
Contributor

@lelia lelia commented May 4, 2026

Summary

Fixes two issues causing slow or failing full-scan diff comparisons:

  1. --timeout was only applied to the CLI-local CliClient, but the full-scan diff comparison uses the Socket SDK instance, and was constructed without the CLI timeout, so diff requests kept using the SDK default of 1200 seconds
  2. --exclude-license-details was applied to full scan params/report URLs, but not passed to the /full-scans/diff SDK request
  3. The diff comparison APIFailures exited directly from core diff logic with sys.exit(1), which bypassed top-level CLI handling for the --disable-blocking flag

Changes

  • Passes the effective CLI API timeout into the Socket SDK constructor
  • Passes include_license_details=false to fullscans.stream_diff when license details are excluded
  • Re-raises diff comparison APIFailures so the top-level CLI wrapper handles exit codes consistently, including the --disable-blocking flag
  • Added regression test coverage for timeout propagation, license-detail exclusion, and API failure exit behavior

lelia added 4 commits May 4, 2026 13:52
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
@lelia lelia requested a review from a team as a code owner May 4, 2026 21:08
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 4, 2026

🚀 Preview package published!

Install with:

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple socketsecurity==2.2.87.dev1

Docker image: socketdev/cli:pr-195

@lelia lelia marked this pull request as draft May 11, 2026 20:41
@lelia lelia changed the title Fix diff scan timeout propagation + API failure handling [DRIFT] Fix diff scan timeout propagation + API failure handling May 11, 2026
@lelia lelia changed the title [DRIFT] Fix diff scan timeout propagation + API failure handling [DRAFT] Fix diff scan timeout propagation + API failure handling May 11, 2026
@lelia lelia closed this May 18, 2026
lelia added a commit that referenced this pull request May 19, 2026
v2.2.88 was tagged from PR #202 (bun.lock / bun.lockb / vlt-lock.json
manifest support) while this branch was being prepared. The earlier
in-flight 2.2.87 from PR #195 was never released; its three substantive
fixes (timeout SDK propagation, --exclude-license-details propagation,
APIFailure exit-handling) ship for the first time as part of 2.3.0.

CHANGELOG.md changes:
- Drop the never-released `## 2.2.87` section
- Add a `## 2.2.88` section noting the bun/vlt lockfile addition
- Fold the three PR #195 bullets into the 2.3.0 "Fixed" subsection so
  the substantive fixes are credited in the release notes that ship

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
lelia added a commit that referenced this pull request May 29, 2026
The full-scan diff comparison ignored --exclude-license-details: the flag was
applied to full-scan params and report URLs but never forwarded to the
fullscans.stream_diff request, so diff comparisons always fetched license
details regardless of the flag.

Thread it through get_added_and_removed_packages -> stream_diff via a new
include_license_details param (defaulting True to preserve current behavior).

Non-breaking: the APIFailure handling at this call site is deliberately left
as-is (exit 1, --disable-blocking -> 0). Re-routing diff APIFailures through
the top-level exit-3 path is part of the 3.0 exit-code change, not this one.

Originally from the unreleased PR #195 branch; the timeout-propagation half
already landed in the preceding commit.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
lelia added a commit that referenced this pull request May 29, 2026
* feat: add --exit-code-on-api-error flag + Buildkite-aware infra error logging

Adds a configurable exit code for API/infrastructure failures so CI pipelines
can distinguish them from blocking security findings (exit 1), without changing
any default behavior.

- New CliConfig field exit_code_on_api_error (default 3) + --exit-code-on-api-error
  flag. The CLI already exited 3 on unexpected errors; this just makes that code
  configurable (e.g. remap to a Buildkite soft_fail code, or 0 to swallow).
- New _emit_infrastructure_error helper + IS_BUILDKITE gate: emits Buildkite log
  section markers (^^^ +++ / --- ⚠️) and a soft_fail hint when running in
  Buildkite; plain log.error elsewhere so markers don't leak as literal text.
- Wire the top-level generic-exception handler in cli() through the helper and
  the configurable code.

Deliberately NON-breaking for 2.3.x:
- --disable-blocking STILL forces exit 0 for all outcomes and takes precedence
  over --exit-code-on-api-error (documented in the flag help so the two aren't
  combined by mistake).
- Default exit codes are unchanged; the exit code only changes when the user
  explicitly passes the flag.

The breaking variant (infra errors bypassing --disable-blocking, distinct
RequestTimeoutExceeded handling, exit 1 -> 3 for diff API failures) is
intentionally deferred to a future 3.0 release.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* feat(config): auto-truncate commit messages over 200 chars

The --commit-message flag passes its value directly into the API request URL
as a query parameter with no length limit. AI-generated commit messages and
the common CI pattern of concatenating $BUILDKITE_BUILD_NUMBER + $BUILDKITE_MESSAGE
can easily exceed URL length limits, producing HTTP 413 errors.

The 413 originates from an infrastructure-layer URL length limit (nginx/Cloudflare),
not application-level validation -- confirmed via inspection of the Socket API route
handler, which has no constraint on commit_message (unlike committers, which enforces
<= 200 chars and returns a clean 400).

200 chars chosen as a conservative defensive ceiling given URL encoding can 2-3x
raw character count. No customer should ever want a 2000-character commit message
in their scan metadata.

A backend-side validation (returning 400 instead of 413) is filed as a follow-on
for the depscan API team.

Motivated by customer incidents (Plaid).

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* pass timeout through SDK diff requests

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* fix: propagate --exclude-license-details to the full-scan diff request

The full-scan diff comparison ignored --exclude-license-details: the flag was
applied to full-scan params and report URLs but never forwarded to the
fullscans.stream_diff request, so diff comparisons always fetched license
details regardless of the flag.

Thread it through get_added_and_removed_packages -> stream_diff via a new
include_license_details param (defaulting True to preserve current behavior).

Non-breaking: the APIFailure handling at this call site is deliberately left
as-is (exit 1, --disable-blocking -> 0). Re-routing diff APIFailures through
the top-level exit-3 path is part of the 3.0 exit-code change, not this one.

Originally from the unreleased PR #195 branch; the timeout-propagation half
already landed in the preceding commit.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* test: cover --exit-code-on-api-error, truncation, and Buildkite formatting

tests/unit/test_cli_config.py
- exit_code_on_api_error default 3 / custom / zero
- commit-message truncation: passthrough under 200, truncate over 200,
  quote-strip-before-truncate

tests/unit/test_socketcli.py
- unexpected error exits 3 by default
- --exit-code-on-api-error 100 remaps the failure exit code
- --disable-blocking OVERRIDES --exit-code-on-api-error (-> 0): locks in the
  documented precedence so the soft_fail guidance can't silently regress
- KeyboardInterrupt still exits 2
- _emit_infrastructure_error: BK markers + soft_fail hint only when
  IS_BUILDKITE; traceback gated on include_traceback

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* chore(release): 2.3.0 -- configurable API-error exit code

Minor bump for the new --exit-code-on-api-error flag and the supporting
non-breaking improvements (commit-message truncation, Buildkite-aware infra
error logging, --timeout / --exclude-license-details fixes).

This release is intentionally NON-breaking: default exit codes are unchanged,
the exit code only shifts when --exit-code-on-api-error is explicitly passed,
and --disable-blocking keeps its existing precedence. The breaking exit-code
behavior change (infra errors exiting non-zero even under --disable-blocking)
is deferred to a future 3.0.

CHANGELOG + README document the flag AND its interaction with --disable-blocking
(which overrides it) to reduce user error in the Buildkite soft_fail setup.

Version refs synced across pyproject.toml, socketsecurity/__init__.py, and
uv.lock (per the version-incrementation CI check).

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

---------

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant