Skip to content

feat(sea): metadata methods — catalogs/schemas/tables/columns/functions/keys/typeInfo#404

Open
msrathore-db wants to merge 18 commits into
msrathore/sea-integration-foundationfrom
msrathore/sea-execution-metadata
Open

feat(sea): metadata methods — catalogs/schemas/tables/columns/functions/keys/typeInfo#404
msrathore-db wants to merge 18 commits into
msrathore/sea-integration-foundationfrom
msrathore/sea-execution-metadata

Conversation

@msrathore-db
Copy link
Copy Markdown
Contributor

What

Wires the SEA metadata surface through the napi binding: the JDBC DatabaseMetaData-style methods on IDBSQLSession, backed by the kernel's flat Metadata API (the kernel runs the SHOW / information_schema queries and returns JDBC-shaped Arrow batches via a Statement handle identical to executeStatement).

Methods wired:

  • getCatalogs, getSchemas, getTables, getColumns, getFunctions, getTableTypes, getTypeInfo (the 7 listing methods)
  • getPrimaryKeys, getCrossReference (with upfront required-arg validation)
  • Client-side tableTypes filter on getTables (SeaTableTypeFilter) to match the Thrift driver's filtering semantics
  • f8a86d5 refactor: catalog / schema / sessionConf moved from per-statement forwarding to openSession (matches the kernel session-level surface)

getProcedures and getInfo are intentionally not in this PR — getProcedures isn't on the Thrift IDBSQLSession surface either (deferral note in SeaNativeLoader), and getInfo lands separately (it's a Thrift-protocol synthesis — see the stacked parity PR #403).

Base

Stacked on msrathore/sea-integration-foundation — an integration anchor equal to the SEA backend foundation (the decomposed 8-PR SEA stack: #380/#377/#379/#382/#381/#384/#383). Basing here keeps this PR's diff to just the metadata layer (~18 commits, lib/sea/SeaSessionBackend.ts + SeaTableTypeFilter.ts + SeaNativeLoader.ts + metadata.test.ts) rather than re-showing the foundation.

Validation

Exercised live against a real warehouse via the node comparator (thrift-vs-SEA): metadata column names match Thrift's JDBC names (TABLE_CAT/TABLE_SCHEM, not raw SEA catalog); residual metadata-listing diffs are server/kernel-side row ordering + system-schema scope (confirmed by the Python use_kernel oracle). getTypeInfo returns the kernel's ODBC-shaped type catalog (documented divergence from Thrift's JDBC shape). 600+ lines of metadata.test.ts unit coverage.

Stack

sea-integration-foundationthis (metadata)#403 (interval/getInfo/error-class parity).

This pull request and its description were written by Isaac.

Implement getCatalogs, getSchemas, getTables, getTableTypes, getColumns,
getFunctions, and getTypeInfo on SeaSessionBackend by delegating to the
corresponding napi Connection methods (listCatalogs, listSchemas,
listTables, listTableTypes, listColumns, listFunctions, listTypeInfo).
Each method calls failIfClosed(), invokes the napi method, wraps the
returned Statement in SeaOperationBackend, and translates napi errors
through decodeNapiKernelError. Extends SeaNativeConnection in
SeaNativeLoader with the 7 napi method signatures plus the shared
infrastructure for getPrimaryKeys/getCrossReference/getInfo (added in
the next two commits).
…ding

Implement getPrimaryKeys and getCrossReference on SeaSessionBackend by
delegating to the napi Connection methods of the same name. Arg routing:
getPrimaryKeys forwards catalogName/schemaName/tableName directly;
getCrossReference forwards all 6 fields (parent + foreign side) in the
order the kernel expects. Both follow the standard failIfClosed +
decodeNapiKernelError + SeaOperationBackend wrapping pattern.
Implement getInfo on SeaSessionBackend: calls connection.getInfo(infoType),
wraps the returned SeaNativeInfoValue (structurally compatible with
TGetInfoValue) in InfoValue via an any-cast, and translates napi errors
through decodeNapiKernelError. Unlike the 9 other metadata methods this
one returns InfoValue directly rather than IOperationBackend, matching
the ISessionBackend contract.
Reconcile SeaNativeConnection against fd5e866 (msrathore/krn-napi-metadata-exposure):

- Drop SeaNativeInfoValue and getInfo — kernel Metadata has no getInfo method.
  The revert of the earlier getInfo commit (b644856) left SeaSessionBackend.getInfo
  as a stub-throw; this commit cleans up the interface and test fakes to match.
- Update optional-param types from `string | undefined` to `string | undefined | null`
  (napi-rs emits this for Rust Option<String> parameters).
- Fix getPrimaryKeys signature: napi emits `catalog: string` (required, no `?`)
  because the kernel takes `Identifier` not `Option<Identifier>`. Adapter now
  passes `catalogName ?? ''` — kernel returns InvalidArgument for empty identifiers,
  surfaced as HiveDriverError.
- Update getCrossReference parent-side params to `string | undefined | null`.
- Update FakeNativeConnection / FakeMetadataConnection in tests to match.
Extends IDBSQLSession / ISessionBackend / DBSQLSession with a new
getProcedures(request?) method backed by connection.listProcedures() on
the SEA path. ThriftSessionBackend stubs the method with a clear
not-supported error (no Thrift wire analogue). Adds 5 unit tests
covering argument routing, absent-field pass-through, SeaOperationBackend
return type, closed-session rejection, and kernel-error wrapping.
Suite: 57/57 passing.

Co-authored-by: Isaac
…ublic surface

The Thrift NodeJS driver has no getProcedures today; exposing it on
IDBSQLSession requires a parity decision. Removes ProceduresRequest type
and getProcedures() from IDBSQLSession, ISessionBackend, DBSQLSession,
and the ThriftSessionBackend stub-throw added in the previous commit.

SeaSessionBackend.getProcedures remains on the class (not interface)
with a comment pointing to Connection.listProcedures on the napi binding,
making the future-add path discoverable. SeaNativeLoader.listProcedures
and the FakeMetadataConnection stub are kept so the napi surface is fully
typed and mockable when IDBSQLSession is extended. Suite: 52/52 passing.

Co-authored-by: Isaac
…face definition

Places the canonical explanation for why getProcedures is not on the JS
public surface where a future implementer will look first — the typed
SeaNativeConnection interface — and reduces the per-method comment in
SeaSessionBackend to a single pointer line. No behavioural change.

Co-authored-by: Isaac
…review

Replaces the JSDoc block on listProcedures with the canonical NOTE-style
inline comment requested in review. Wording matches the agreed text exactly.
No behavioural change. Suite: 52/52.

Co-authored-by: Isaac
…PrimaryKeys / getCrossReference

The kernel rejects empty identifier strings with a different, less
diagnostic error than missing-identifier. Validate JS-optional fields
before calling the napi binding so callers get a clear HiveDriverError
with an explicit message rather than an opaque kernel InvalidArgument.

getPrimaryKeys: guards catalogName (optional in PrimaryKeysRequest) before
passing all three required napi strings. schemaName/tableName are
compile-time required on PrimaryKeysRequest so no JS guard needed there.

getCrossReference: guards all three foreign-side identifiers
(foreignCatalogName, foreignSchemaName, foreignTableName). The parent
side remains optional as the napi binding accepts undefined/null there.

Both methods diverge from Thrift, which silently resolves absent catalog
to the session default. The upfront error is strictly better for SEA
callers — failing fast with a clear message is preferable to an opaque
kernel round-trip failure. Tracked as divergence D-NN.

Tests: 55/55 (was 52). Adds 4 new guard tests replacing the prior
"passes empty string" test that documented the now-fixed ?? '' workaround.

Co-authored-by: Isaac
The Databricks server does not honour the table_types argument to
listTables server-side (advisory only — documented in pyo3 metadata.rs
and matrix-audit-python.md).  Add SeaTableTypeFilter, a thin
IOperationBackend wrapper that filters fetchChunk rows by exact
case-sensitive match on the TABLE_TYPE column.

getTables wraps the SeaOperationBackend in SeaTableTypeFilter when
request.tableTypes is non-null; passing undefined (no filter) returns the
plain backend unchanged.  Empty array tableTypes means "keep nothing" —
same semantics as pyo3 Some([]).
Add a 'wraps kernel error via decodeNapiKernelError' test case to the
getSchemas, getTables, getTableTypes, getTypeInfo, getColumns, getFunctions,
and getCrossReference describe blocks.  These were the 7 methods flagged by
MEDIUM-2 in the DA review as lacking error-wrapping coverage.  Pattern
mirrors the existing getCatalogs and getPrimaryKeys error tests.
The class doc said "all 10 metadata methods" but M1 implements 9:
getInfo is a stub-throw (kernel Metadata API doesn't expose it yet) and
getProcedures is not on IDBSQLSession.  Enumerate the 9 implemented methods
by name and explain both omissions so reviewers don't expect a missing
tenth method.
executeStatement used `nativeStatement!` when constructing SeaOperationBackend
while every other method in the same file uses `nativeStatement` without the
assertion.  The variable is assigned inside a try block and the catch rethrows,
so by the time we reach the return the value is always defined.  The `!` was
misleading — remove it for consistency.
…riverError

All 9 'wraps kernel error via decodeNapiKernelError' tests previously
asserted instanceOf(Error), which passes trivially even if decodeNapiKernelError
never ran (a plain rethrow would still satisfy instanceOf(Error) since
HiveDriverError extends Error).

Fix: throw a string from FakeMetadataConnection.throwNextCall — decodeNapiKernelError
converts non-Error throws to new HiveDriverError(msg) — and assert
instanceOf(HiveDriverError).  Changed throwNextCall type to unknown so a string
value is type-safe; the null-check in record() switches to !== null to match.
The previous filter tests only checked instanceof SeaTableTypeFilter on the
returned operation; a misbehaving filter (passes all rows, mis-reads the
column name) would not have been caught.

Add two tests using a minimal IOperationBackend inline fake whose fetchChunk
returns a 4-row mixed-type fixture:
- allowedTypes = Set(['TABLE']) → 2 TABLE rows returned, VIEW and SYSTEM TABLE
  are excluded
- allowedTypes = Set() (empty) → 0 rows returned (keep-nothing semantics)
…rwarding to openSession

Inherits the same shape change as auth-u2m: the napi binding moved
initialCatalog/initialSchema/sessionConfig from ExecuteOptions onto
ConnectionOptions because the kernel does not read
StatementSpec::statement_conf — the per-statement values were
silently no-op'd.

- SeaAuth.ts: extend SeaNativeConnectionOptions with optional
  catalog / schema / sessionConf (intersection with each auth-mode
  variant). New SeaSessionDefaults interface for the shared shape.
- SeaBackend.ts::openSession: fold OpenSessionRequest.initialCatalog /
  initialSchema / configuration into the napi options before the
  openSession call. Drop the SeaSessionBackend.defaults forwarding.
- SeaNativeLoader.ts: drop SeaExecuteOptions; executeStatement now
  takes only the SQL.
- SeaSessionBackend.ts: drop SeaSessionDefaults and defaults field;
  drop per-statement overlay logic. useCloudFetch no-op'd (see
  auth-u2m commit for rationale).
- tests: replace per-statement-forwarding assertions with
  openSession-arg assertions in both execution.test.ts and
  metadata.test.ts. 185/185 SEA tests pass.

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant