Skip to content

Do not make 'describe table' query when schema is known#1845

Merged
zvonand merged 1 commit into
antalya-26.3from
feature/antalya-26.3/no_useless_describe
Jun 1, 2026
Merged

Do not make 'describe table' query when schema is known#1845
zvonand merged 1 commit into
antalya-26.3from
feature/antalya-26.3/no_useless_describe

Conversation

@ianton-ru
Copy link
Copy Markdown

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Do not make 'describe table' query when schema is known

Documentation entry for user-facing changes

With 'object_storage_remote_initiator' setting query to Iceberg table converted to remote(.., iceberg(...))
remote table function calls describe table by default to resolve remote table schema.
But when table is from DataLake catalog, schema already known, 'describe table' query can be avoided.

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@github-actions
Copy link
Copy Markdown

Workflow [PR], commit [a1edeca]

@ianton-ru
Copy link
Copy Markdown
Author

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown
Collaborator

@arthurpassos arthurpassos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


auto remote_function = TableFunctionFactory::instance().get(remote_query, new_context);

std::shared_ptr<TableFunctionRemote> remote_table_function = std::dynamic_pointer_cast<TableFunctionRemote>(remote_function);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I would add a comment stating why we can set the structure for remote table functions. And possible scope the remote_table_function to an if statement, something like:

if (auto remote_table_function = cast(remote_function))
{
    ...
}

But this is a minor, no need to do it

@alsugiliazova
Copy link
Copy Markdown
Member

Audit — PR #1845: Do not make describe table query when schema is known

PR: #1845
Branch: feature/antalya-26.3/no_useless_describeantalya-26.3
Commit: a1edeca

Confirmed defects

Low — Structure substitution is unconditional for every IStorageCluster subclass

  • Impact: The authoritative remote DESCRIBE TABLE is now always skipped and the
    initiator's local getInMemoryMetadata().columns is trusted. If the initiator's
    resolved schema diverges from what the remote node would resolve (mixed-version
    cluster, different schema-inference settings such as nullability, partition/virtual
    column handling, or schema evolution between catalog refreshes), the
    StorageDistributed header is built from the wrong structure. Reads then rely on
    StorageDistributed's name-based tolerance/conversion; a missing/renamed/retyped
    column surfaces as a conversion error or wrong header instead of the previously
    correct remote-derived schema.
  • Anchor: src/Storages/IStorageCluster.cpp / IStorageCluster::convertToRemote
    (the setActualTableStructure(getInMemoryMetadata().columns) call) and
    src/TableFunctions/TableFunctionRemote.cpp / getActualTableStructure (early
    return on non-empty remote_table_columns).
  • Trigger: object_storage_remote_initiator query against an IStorageCluster
    whose local in-memory schema differs from the remote-resolved one (e.g. differing
    inference settings between initiator and replica).
  • Why defect: The PR motivation ("schema already known for DataLake catalog") is
    narrower than the code, which applies to all IStorageCluster subclasses regardless
    of whether the local schema is truly authoritative; the previous DESCRIBE guarded
    this.
  • Fix direction: Gate the substitution on the catalog/known-schema case, or keep
    it but document/accept the tolerance contract explicitly.
  • Regression test direction: Add a case where initiator and remote infer different
    nullability and assert the result/error matches pre-PR behavior.

Note (non-defect)

The reviewer's nit (scope remote_table_function inside an if and add an
explanatory comment) is style-only and not a defect.

@alsugiliazova
Copy link
Copy Markdown
Member

Verification Report: PR #1845 — Skip DESCRIBE TABLE When Schema Is Known

PR: Altinity/ClickHouse#1845
Commit: a1edeca23c94c0c7fe068e2fb36c32d419609b77antalya-26.3
Reviewed: 2026-06-01
CI run: 26518000304 · report


Verdict

CI status RED (15 fail/error jobs)
Block merge on CI? No — most red is flake, infra, or combinatorial regression noise
Server change risk Low–medium; builds green, iceberg regression green; no clear upstream regression signal

What matters in the red CI

Not PR regressions

Failure Why ignore for #1845
1124 regression "new fails" (swarms, parquet, s3_export) Fail rates identical base vs PR (swarms 37.5%, parquet 5.0%, etc.) — combinatorial diff, not rate increase
Stateless targeted (3 tests) + stress s3 Flaky on master; targeted job runner crashed
Integration 4/6, 1/5 (interrupted) Infra/cascade (test_sync_replica, mass iceberg FAILs at one timestamp)
test_namespace_filter Fails on CREATE TABLE in setup, before the filter logic under test; root cause unknown. ~15% fail rate on master (90d) — not #1845-specific

Tracked separately

Failure Note
test_remote_initiator_after_non_remote[s3] Test/harness issue, not a server regression from #1845. Fix in PR #1856 (Fix test_remote_initiator_after_non_remote — updates test_remote_initiator.py + cluster config). Also flaky on master (~9% fail rate).

What passed

All builds · fast test · SQLLogic · unit tests · most stateless/integration · iceberg regression (both suites, both arches) · parquet minio/aws_s3 · docker/grype.


Recommendation

  1. Merge decision should not hinge on CI red alone for this PR.
  2. test_namespace_filter — investigate setup CREATE TABLE failure independently of Do not make 'describe table' query when schema is known #1845.
  3. test_remote_initiator_after_non_remote — follow #1856, not Do not make 'describe table' query when schema is known #1845.
  4. Optional: confirm skipped-DESCRIBE schema matches remote DESCRIBE for remote/s3Cluster paths (swarms/parquet failure patterns match the change, but rates don't worsen vs base).

@alsugiliazova alsugiliazova added the verified Approved for release label Jun 1, 2026
@zvonand zvonand merged commit 289ca73 into antalya-26.3 Jun 1, 2026
284 of 313 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants