Support typed Data.define members via RBS comment #865
Draft
julianojulio wants to merge 1 commit into
Add support for typing Data.define members by propagating types from
a sig on initialize to the member reader methods. This works with
both RBS #: comments and traditional sig { } blocks.
For the RBS case, a virtual initialize is synthesized from the #:
comment — zero runtime cost, no method defined:
    TypedPoint = Data.define(:x, :y) do
      #: (x: Integer, y: String) -> void
    end
For the Sorbet sig case, an explicit def initialize is required:
    TypedPoint = Data.define(:x, :y) do
      extend T::Sig
      sig { params(x: Integer, y: String).void }
      def initialize(x:, y:) = super
    end
Typed readers are only created when the initialize body is exactly
bare super, ensuring the sig types reliably match the stored values.
When the user transforms values (e.g. super(x: x.to_i)), readers
conservatively fall back to T.untyped.
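The reason for the bare-super restriction can be seen at runtime. The sketch below is plain Ruby (assuming Ruby 3.2+ for Data.define); BarePoint and CoercedPoint are hypothetical names, and the #: comments are inert at runtime. With a bare super the stored values are exactly the arguments, so the parameter types also describe the readers; with a transforming super they may not.

```ruby
# Assumes Ruby 3.2+ (Data.define). No Sorbet runtime is needed to run this.

# Bare super: arguments reach Data's own initialize unchanged, so the
# declared parameter types also describe what the readers return.
BarePoint = Data.define(:x, :y) do
  #: (x: Integer, y: String) -> void
  def initialize(x:, y:) = super
end

# Transforming initialize: the caller passes a String for x, but an
# Integer is stored. The parameter type no longer matches the reader.
CoercedPoint = Data.define(:x, :y) do
  #: (x: String, y: String) -> void
  def initialize(x:, y:)
    super(x: x.to_i, y: y)
  end
end

bare = BarePoint.new(x: 1, y: "a")
coerced = CoercedPoint.new(x: "42", y: "a")

puts bare.x.class     # Integer, same as the declared parameter type
puts coerced.x.class  # Integer, although the parameter was declared String
```

In the second case a reader typed from the signature would claim String while the stored value is an Integer, which is why the fallback to T.untyped is the conservative choice.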
Three pipeline components are changed:
- CommentsAssociator: extracts orphan #: comments in Data.define
  blocks and associates them with the Block node.
- SigsRewriter: synthesizes a virtual def initialize with keyword
  args matching the define members from the orphan RBS signature.
- Data rewriter: when the block contains a sig preceding a bare-super
  def initialize, extracts parameter types and creates typed reader
  stubs with sig { returns(Type) }.
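As a rough illustration of the orphan-comment rule applied by the CommentsAssociator step, the sketch below models the gap heuristic from the PR description on plain source text. It is not the C++ implementation: virtual_init_signature is a hypothetical helper, and it assumes the rule is "leading #: or #| comments count as a virtual initialize signature only if the block has no explicit def initialize and the comments are followed by a blank line or by nothing".

```ruby
# Hypothetical model of the gap heuristic; the real logic lives in Sorbet's
# C++ sources and works on Prism nodes, not on strings.
def virtual_init_signature(block_body)
  lines = block_body.lines.map(&:strip).drop_while(&:empty?)

  # Collect the leading run of RBS signature comments.
  comments = lines.take_while { |line| line.start_with?("#:", "#|") }
  return nil if comments.empty?

  rest = lines.drop(comments.size)

  # An explicit initialize means no virtual one is synthesized.
  return nil if rest.any? { |line| line.match?(/\Adef\s+initialize\b/) }

  # Orphan if the body ends here, or a blank line separates the comments
  # from whatever comes next.
  orphan = rest.all?(&:empty?) || rest.first.empty?
  orphan ? comments.join("\n") : nil
end

with_gap = <<~'RUBY'
  #: (x: Integer, y: String) -> void

  def label = y * x
RUBY

attached = <<~'RUBY'
  #: () -> String
  def label = y * x
RUBY

p virtual_init_signature(with_gap)  # the comment text: treated as virtual initialize
p virtual_init_signature(attached)  # nil: the comment belongs to def label
```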
Based on the typed Data.define approach originally designed by
Cameron Bothner in sorbet#8079.
Co-authored-by: Cameron Bothner <cbothner@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Summary
Add support for typing Data.define members by propagating types from a sig on initialize to the member reader methods. This works with both RBS #: comments and traditional sig { } blocks.
Based on the typed Data.define approach originally designed by @cbothner in sorbet/sorbet#8079. This PR adds the RBS virtual initialize (zero runtime cost) and incorporates the conservative bare-super check from @jez's review feedback on that PR.
Addresses sorbet#10055
Two ways to type Data.define members
RBS comment (zero runtime cost, recommended)
The #: comment annotates a "virtual initialize"; no method is defined at runtime.
Traditional Sorbet sig (requires explicit initialize)
An explicit def initialize(...) = super is needed, which adds a small runtime cost (one extra method dispatch per construction).
Both produce identical type checking. The RBS form is preferred because it has zero runtime overhead.
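The runtime difference between the two forms can be checked directly. The following is a minimal sketch, assuming Ruby 3.2+; RbsPoint, SigPoint, and defines_initialize? are hypothetical names, and the T::Sig lines are commented out so the snippet runs without sorbet-runtime.

```ruby
# Assumes Ruby 3.2+ (Data.define).

# RBS form: the #: line is only a comment, so the block defines nothing.
RbsPoint = Data.define(:x, :y) do
  #: (x: Integer, y: String) -> void
end

# Explicit form: the sig lines are left as comments here; the def itself
# is what exists at runtime and adds one dispatch per construction.
SigPoint = Data.define(:x, :y) do
  # extend T::Sig
  # sig { params(x: Integer, y: String).void }
  def initialize(x:, y:) = super
end

# Hypothetical helper: does the class itself define initialize?
def defines_initialize?(klass)
  klass.private_instance_methods(false).include?(:initialize)
end

puts defines_initialize?(RbsPoint)
puts defines_initialize?(SigPoint)

# Both forms store the same values.
p RbsPoint.new(x: 1, y: "a").to_h == SigPoint.new(x: 1, y: "a").to_h
```

Only the explicit form adds a method to the class, which is the source of the extra dispatch mentioned above.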
Ambiguity with methods in the block
When the block also contains methods, a #: comment could be ambiguous: is it a virtual initialize signature, or a signature for the method below it?
The implementation resolves this using a gap heuristic: a #: comment is treated as a virtual initialize only if there is at least one blank line between it and the first method definition. If the comment is immediately above a def, it's treated as that method's signature.
However, for clarity, when a block contains both typed members and additional methods, we recommend using an explicit def initialize to remove all ambiguity. This way each #: unambiguously attaches to the def below it. The virtual initialize (no explicit def) is best suited for simple Data classes with no additional methods.
Design decisions (informed by #8079 review)
Following @jez's principle that "the default assumption should be 'Sorbet doesn't do anything' and only if a certain set of very precise constraints are met should Sorbet do something":
- Bare super required: Typed readers are only created when the initialize body is exactly bare super (i.e., def initialize(x:, y:) = super). When the user transforms values (e.g., super(x: x.to_i)), readers conservatively fall back to T.untyped. This keeps our options open for smarter analysis in the future.
- self.[] left untyped: Sorbet cannot overload a method to accept both positional and keyword arguments, so only new/initialize is typed.
- No partial typing: If the sig's param names don't match the Data.define members, standard RBS/sig parameter mismatch errors fire and readers fall back to T.untyped.
Approach
- rbs/prism/CommentsAssociatorPrism.cc (maybeExtractDataDefineOrphanComments): for Data.define blocks without an explicit def initialize, extracts leading orphan #: comments and associates them with the Prism block node. Uses a gap heuristic: a comment is "orphan" (for virtual init) only if there's at least one blank line before the first def in the body, or the body is empty.
- rbs/prism/SigsRewriterPrism.cc (maybeSynthesizeDataDefineVirtualInit): for Data.define blocks with an associated orphan signature, synthesizes a Prism pm_def_node_t for initialize (with keyword args + bare super body) and translates the RBS comment into a sig node.
- rewriter/Data.cc (isBareSuper + findSigBeforeInitialize + extractTypesFromParams + extractInitializeTypes): when the block contains a sig preceding a bare-super def initialize, extracts the parameter types and creates typed reader stubs with sig { returns(Type) }.
Test coverage
RBS tests (test/testdata/rbs/signatures_data_define.rb):
- ::Data.define, untyped fallback, mismatched param names
- T.nilable, T.any, T::Array, T::Hash, T::Boolean
- T.assert_type!
Sorbet sig tests (test/testdata/rewriter/data.rb):
- T::Array, T.nilable
Limitations
- #| continuation: Supported in code (the orphan extraction collects both #: and #| prefixed comments) but not tested, because Sorbet's test assertion parser interprets #| lines containing colons as test annotations (e.g., #| name: String is parsed as assertion type name).
Co-authored-by: Cameron Bothner <cbothner@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>