
improvement(enrichments): limit company-info to fields both providers return #4817

Merged
TheodoreSpeaks merged 1 commit into staging from fix/enrichment-error
May 31, 2026
Conversation

Collaborator

@TheodoreSpeaks TheodoreSpeaks commented May 31, 2026

Summary

  • Company Info enrichment showed industry and founded_year inconsistently across rows — Hunter's company dataset returns null for those fields on many large companies (verified against the live API for Microsoft, Amazon, Google), and the first-non-empty-wins cascade meant Hunter usually won before PDL could fill them.
  • Limited the outputs to the two fields both Hunter and PDL reliably return: employee count and description. Every row is now consistent.
  • employeeCount is a string so Hunter's range bucket (e.g. "11-50") and PDL's exact count share the same column. Hunter (free) runs first, PDL is the paid fallback.
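A minimal sketch of how a shared string column can absorb both shapes. The `str()` body here is a hypothetical stand-in for the repo's helper, and the PDL `description` field name is an assumption for illustration; only `size`, `employee_count`, and `description` on the Hunter side appear in this PR.

```typescript
// Hypothetical stand-in for the repo's str() helper; the real one may differ.
function str(value: unknown): string {
  if (typeof value === 'string') return value.trim()
  if (typeof value === 'number' && Number.isFinite(value)) return String(value)
  return ''
}

interface CompanyInfo {
  employeeCount?: string
  description?: string
}

// Hunter reports a size bucket such as "11-50".
function mapHunter(output: { size?: unknown; description?: unknown }): CompanyInfo {
  const out: CompanyInfo = {}
  const employeeCount = str(output.size)
  const description = str(output.description)
  if (employeeCount) out.employeeCount = employeeCount
  if (description) out.description = description
  return out
}

// PDL reports an exact integer, which is coerced to a string.
// The `description` field name on the PDL side is assumed.
function mapPdl(output: { employee_count?: unknown; description?: unknown }): CompanyInfo {
  const out: CompanyInfo = {}
  const employeeCount = str(output.employee_count)
  const description = str(output.description)
  if (employeeCount) out.employeeCount = employeeCount
  if (description) out.description = description
  return out
}
```

Both mappers omit a key entirely when the provider returns null or an empty value, so an empty field never counts as a result.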

Type of Change

  • Bug fix

Testing

Tested manually against the live Hunter API to confirm the field gaps are real (not a mapping bug). `bun run lint` is clean and `bun run check:api-validation:strict` passed.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)


vercel Bot commented May 31, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Skipped | Skipped | May 31, 2026 2:18am |



cursor Bot commented May 31, 2026

PR Summary

Medium Risk
Existing tables or workflows that relied on industry/founded year or numeric employee count may see missing columns or type mismatches until columns are updated.

Overview
The Company Info enrichment now only exposes employee count and description, dropping industry and founded year so rows stay filled consistently when Hunter wins the provider cascade.

Hunter runs first (free), with People Data Labs as fallback. employeeCount is now a string so Hunter size buckets (e.g. "11-50") and PDL counts share one column; provider mappings were updated accordingly and the numeric num() helper was removed.

Reviewed by Cursor Bugbot for commit a1ff849.

Contributor

greptile-apps Bot commented May 31, 2026

Greptile Summary

This PR narrows the Company Info enrichment to only the two fields both Hunter and PDL reliably return (employeeCount as a string and description), removing industry and foundedYear that were inconsistently null across providers. It also swaps the cascade order so Hunter (free tier) runs first and PDL is the paid fallback.

  • employeeCount output type changed from number to string so Hunter's range buckets (e.g. "11-50") and PDL's exact integer counts share one column.
  • Provider order inverted: Hunter is now index 0, PDL is index 1; the cascade runner in run.ts stops at the first provider whose mapOutput returns any non-empty field.
  • The num() helper is removed; PDL's numeric employee_count is now coerced to a string via str().
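The first-non-empty-wins behavior described above can be sketched as follows. This is not the code in run.ts: the `Provider` shape, `runCascade`, and the synchronous `lookup` functions are hypothetical simplifications (the real providers are tool calls), and the sample data is invented.

```typescript
type Fields = Record<string, string | undefined>

// Hypothetical provider shape; lookup returns null when no record exists.
interface Provider {
  id: string
  lookup: (domain: string) => Fields | null
}

// True when at least one mapped field is non-empty.
function hasResult(fields: Fields): boolean {
  return Object.keys(fields).some((key) => {
    const value = fields[key]
    return value !== undefined && value !== ''
  })
}

// Stops at the first provider that yields any non-empty field.
function runCascade(
  providers: Provider[],
  domain: string
): { winner: string | null; fields: Fields } {
  for (const provider of providers) {
    const fields = provider.lookup(domain)
    if (fields !== null && hasResult(fields)) {
      return { winner: provider.id, fields }
    }
  }
  return { winner: null, fields: {} }
}

// Hunter knows the description but not the size; PDL knows both.
const hunter: Provider = {
  id: 'hunter',
  lookup: () => ({ description: 'Example Corp builds widgets' }),
}
const pdl: Provider = {
  id: 'pdl',
  lookup: () => ({ employeeCount: '1200', description: 'Widget maker' }),
}

const result = runCascade([hunter, pdl], 'example.com')
```

In this sketch `result.winner` is `'hunter'` and `result.fields.employeeCount` is undefined, because the loop returns before PDL is consulted.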

Confidence Score: 3/5

The cascade runner stops at the first provider with any non-empty field, so a Hunter hit that has description but no size will win and leave employeeCount permanently blank — the same gap the PR aims to close. Additionally, changing employeeCount from number to string breaks any saved workflow that passes the output to numeric inputs or arithmetic. Both issues are in open review threads and remain unaddressed.

Two distinct defects on the changed path remain unresolved: the partial-hit cascade silently drops employeeCount for companies Hunter partially knows, and the number-to-string type change breaks existing workflow connections wired to numeric operations.
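For workflows that previously did arithmetic on a numeric employee count, one possible migration is to parse the string back into a numeric range. The sketch below is an assumption-laden illustration: `parseEmployeeCount` and `EmployeeRange` are hypothetical names, and the open-ended `"10001+"` bucket format is assumed rather than confirmed by this PR.

```typescript
interface EmployeeRange {
  min: number
  max: number | null // null means open-ended, e.g. "10001+"
}

// Accepts "11-50" (range bucket), "221000" (exact count), or "10001+" (assumed format).
function parseEmployeeCount(value: string): EmployeeRange | null {
  const text = value.replace(/,/g, '').trim()

  const range = /^(\d+)\s*-\s*(\d+)$/.exec(text)
  if (range) return { min: Number(range[1]), max: Number(range[2]) }

  const openEnded = /^(\d+)\+$/.exec(text)
  if (openEnded) return { min: Number(openEnded[1]), max: null }

  const exact = /^(\d+)$/.exec(text)
  if (exact) {
    const n = Number(exact[1])
    return { min: n, max: n }
  }

  return null // unrecognized shape; the caller decides how to handle it
}
```

Returning a range rather than a single number keeps the Hunter bucket and the PDL exact count comparable without inventing precision the bucket does not have.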

apps/sim/enrichments/company-info/company-info.ts — the cascade ordering, mapOutput logic, and output type declaration all warrant another look before merging.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/enrichments/company-info/company-info.ts | Narrows outputs to employeeCount (string) and description, swaps to Hunter-first cascade, removes num() helper; pre-existing cascade partial-hit and type-breaking-change concerns are open in review threads |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Input: company domain] --> B[normalizeDomain]
    B -->|empty string| Z[skip provider — return null params]
    B -->|valid domain| C[Hunter: hunter_companies_find]
    C -->|404| D[PDL: pdl_company_enrich]
    C -->|error| E[errorCount++, try PDL]
    C -->|success| F{mapOutput has any non-empty field?}
    F -->|yes — Hunter wins| G[Return result: employeeCount, description]
    F -->|no — both fields empty| D
    E --> D
    D -->|404 or skipped| H[Return empty result]
    D -->|error| I[Return error if all providers errored]
    D -->|success| J{mapOutput has any non-empty field?}
    J -->|yes — PDL wins| G
    J -->|no| H

Reviews (2): Last reviewed commit: "improvement(enrichments): limit company-..."

Comment on lines 31 to +49
providers: [
  toolProvider({
    id: 'hunter',
    label: 'Hunter',
    toolId: 'hunter_companies_find',
    buildParams: (inputs) => {
      const domain = normalizeDomain(inputs.domain)
      if (!domain) return null
      return { domain }
    },
    mapOutput: (output) => {
      return filterUndefined({
        industry: str(output.industry) || undefined,
        employeeCount: str(output.size) || undefined,
        foundedYear: num(output.founded_year),
        description: str(output.description) || undefined,
      })
    },
  }),

P1 Hunter partial-hit silently blocks PDL employee count

The cascade runner (run.ts:80) stops at the first provider whose mapOutput returns any non-empty field. If Hunter finds a company record but its response has no size field, Hunter still wins (because industry, foundedYear, or description satisfies hasResult), PDL is never attempted, and employeeCount stays blank — the same symptom the PR set out to fix, just narrowed to Hunter-known companies where size is absent. In that scenario the reorder actively regresses coverage relative to the previous PDL-first order.
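One way the partial-hit gap could be closed is to merge per field and only stop once every output field is filled. This is a sketch of an alternative, not what the PR or run.ts does; `runUntilComplete`, `CompanyFields`, and the synchronous lookups are hypothetical.

```typescript
interface CompanyFields {
  employeeCount?: string
  description?: string
}

const FIELD_NAMES = ['employeeCount', 'description'] as const

// Hypothetical lookup: returns null when the provider has no record.
type Lookup = () => CompanyFields | null

// Consults providers in order, filling only the fields still missing,
// and stops as soon as every field has a value.
function runUntilComplete(lookups: Lookup[]): CompanyFields {
  const merged: CompanyFields = {}
  for (const lookup of lookups) {
    if (FIELD_NAMES.every((name) => Boolean(merged[name]))) break
    const result = lookup()
    if (result === null) continue
    for (const name of FIELD_NAMES) {
      const value = result[name]
      if (!merged[name] && value) merged[name] = value
    }
  }
  return merged
}

// Hunter has a description but no size; PDL supplies the count.
const merged = runUntilComplete([
  () => ({ description: 'Example Corp builds widgets' }),
  () => ({ employeeCount: '1200', description: 'Widget maker' }),
])
```

Here `merged` keeps Hunter's description and takes `employeeCount` from PDL. The trade-off is that the paid fallback runs whenever the free provider is incomplete, which may or may not be acceptable for this enrichment.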

Comment thread apps/sim/enrichments/company-info/company-info.ts
… return

Hunter's company dataset returns null industry/foundedYear for many large companies (verified against the live API for Microsoft, Amazon, Google), so under the first-non-empty-wins cascade those columns appeared inconsistently across rows. Limit company-info outputs to employee count and description — the fields Hunter and PDL both reliably return — so every row is consistent. employeeCount is a string so Hunter's range bucket and PDL's exact count share the column.
@TheodoreSpeaks TheodoreSpeaks force-pushed the fix/enrichment-error branch from af4a677 to a1ff849 on May 31, 2026 02:18
@TheodoreSpeaks TheodoreSpeaks changed the title from "fix(enrichments): unify company-info employee count across providers" to "improvement(enrichments): limit company-info to fields both providers return" on May 31, 2026

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit a1ff849.

Comment thread apps/sim/enrichments/company-info/company-info.ts
@TheodoreSpeaks
Collaborator Author

@greptile review

@TheodoreSpeaks TheodoreSpeaks merged commit 97f7fe9 into staging May 31, 2026
14 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/enrichment-error branch May 31, 2026 03:50