"""Building block classifier — a recursive Pydantic type tree walker.
================================================================================
WHAT THIS SCRIPT TEACHES
================================================================================
This script demonstrates "programming in Pydantic" — an architecture where
the types ARE the program. Instead of writing functions that transform data
step by step, you declare models (types) and wire them together. One call to
model_validate at the root constructs the entire result. Pydantic descends the
type tree, constructing each inner model as it goes.
If you come from procedural Python, here's the mental shift:
PROCEDURAL: Write functions. Call them in order. Pass data between them.
CONSTRUCTION: Declare models. Wire them as fields. One model_validate cascades.
The script takes ANY Pydantic BaseModel class and recursively classifies its
entire construction graph — every field, and every field of every record-typed
field, all the way down. It does this with zero domain knowledge. It works on
any BaseModel, anywhere.
================================================================================
THE KEY IDEAS
================================================================================
1. NESTED CONSTRUCTION
One model_validate at the root fires the entire tree of construction.
Each model declares its children as field types. Pydantic constructs them
automatically during the parent's validation. You never manually build
inner models — the type annotations ARE the construction instructions.
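A minimal sketch of nested construction, assuming Pydantic v2. Address and Person are hypothetical models used only for illustration; they are not part of this script:

```python
from pydantic import BaseModel


class Address(BaseModel, frozen=True):
    city: str


class Person(BaseModel, frozen=True):
    name: str
    address: Address  # the annotation tells Pydantic what to build here


# One call at the root. Pydantic constructs the inner Address itself.
person = Person.model_validate({"name": "Ada", "address": {"city": "London"}})
```

Nothing in the calling code ever names Address; the field annotation on Person is the only instruction.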
2. DISCRIMINATED UNIONS
Instead of if/elif chains to handle different cases, you declare a variant
model for each case. Each variant has a Literal tag field. Pydantic reads
the tag and routes to the correct variant automatically. The variant's
fields ARE the answer — no computation needed.
3. from_attributes=True
This is how models read from other models. When Model B has
from_attributes=True, you can pass it Model A and Pydantic reads A's
attributes by name to populate B's fields. The field names ARE the wiring.
Properties count — Pydantic reads them via getattr.
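A small sketch of one model reading another by attribute name, assuming Pydantic v2. Source and Badge are hypothetical models; full_name is a plain property, not a stored field:

```python
from pydantic import BaseModel


class Source(BaseModel, frozen=True):
    first: str
    last: str

    @property
    def full_name(self) -> str:
        return f"{self.first} {self.last}"


class Badge(BaseModel, frozen=True, from_attributes=True):
    # Same name as the property on Source; the name is the wiring.
    full_name: str


badge = Badge.model_validate(Source(first="Ada", last="Lovelace"))
```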
4. SHAPE OVER PROCEDURE
A well-shaped model with the right field names, types, and aliases does
most of the work. Pydantic's default coercion handles the rest. If you're
reaching for a validator or helper function, ask first: can I add an
intermediate model that makes the shape fit?
5. SELF-CLASSIFYING WRAPPERS
A RootModel[object] wrapping a raw value + @property projections = a model
that knows what it is. TypeAnnotation wraps an annotation and exposes .kind.
ResolvedType wraps a type and exposes .block_kind. Downstream DUs read the
tag via from_attributes and route automatically. The object classifies
itself. No external classifier needed.
This pattern appears TWICE in this script. See it once on TypeAnnotation,
see it again on ResolvedType, and you own the pattern.
6. DEMAND-DRIVEN RECURSION
RecordBlock has a children field. LeafBlock doesn't. When Pydantic
constructs RecordBlock from a ResolvedType via from_attributes, it reads
.children — which fires ModelTree.model_validate on the inner type,
producing more ClassifiedNodes. When Pydantic constructs LeafBlock, it
never reads .children because the field doesn't exist on the variant.
Nobody decides whether to recurse. The variant's shape IS the decision.
7. IRREDUCIBLE MINIMUM
Some boundaries can't be crossed with pure shape — like Python's
dict.items() producing positional tuples instead of named objects.
At those boundaries, one small validator bridges the gap. Everything
else is construction. This script has exactly ONE wrap validator that
does real work: ModelTree._reshape, which iterates dict.items(), pairs
keys with values, and guards against cycles via a ContextVar. That's the
irreducible minimum of procedure.
8. THE CONSTRUCTION LIFECYCLE
Every active computation on a frozen model is one of six phases:
Phase 1: Boundary Transform — Field(alias=...) or mode="before" validator
Phase 2: Sealed Boundary — mode="wrap" validator (controls IF construction happens)
Phase 3: Field Construction — Pydantic parses types, from_attributes reads, DUs route
Phase 4: Proof Obligation — mode="after" validator (cross-field invariants)
Phase 5: Construction Effect — model_post_init (registration, side effects)
Phase 6: Derived Projection — @computed_field / @cached_property / @property
This script uses Phases 1, 2, 3, and 6. Phases 4 and 5 aren't needed here.
Each class below is annotated with which phase(s) it demonstrates.
================================================================================
THE CASCADE
================================================================================
One call does everything:
tree = ModelTree.model_validate(Team)
Here is every step that fires, annotated with WHAT does the work and WHY:
    ModelTree.model_validate(Team)  # YOU call this. One call.
    │
    │  ┌─ WRAP VALIDATOR (_reshape) ─────────────────────────────────────┐
    │  │ WHY: model_fields is a dict. Dict keys aren't attributes.       │
    │  │ This is the ONE irreducible boundary — someone must iterate     │
    │  │ dict.items() and pair keys with values. Also guards against     │
    │  │ cycles: if Team was already seen, return empty fields.          │
    │  │                                                                 │
    │  │ HOW: FieldSlot.model_validate(item) for each (name, FieldInfo)  │
    │  │ CYCLE: ContextVar[frozenset[type]] tracks visited types.        │
    │  │ frozenset | {data} creates a new set per level — no mutation.   │
    │  └─────────────────────────────────────────────────────────────────┘
    │
    ├─► FieldSlot.model_validate( ("budget_authority", FieldInfo) )
    │   │
    │   │  ┌─ BEFORE VALIDATOR (_from_tuple) ────────────────────────────────┐
    │   │  │ WHY: Tuples have positions, not names. Models need names.       │
    │   │  │ HOW: {"field_name": data[0], "annotation": data[1].annotation}  │
    │   │  └─────────────────────────────────────────────────────────────────┘
    │   │
    │   ├─ field_name = "budget_authority"         # from dict key
    │   ├─ annotation = TypeAnnotation(RoleName)   # Pydantic coerces raw → TypeAnnotation
    │   └─ resolved_type → ResolvedType(RoleName)  # @property wraps inner type (Phase 6)
    │
    ├─► Pydantic coerces FieldSlot → ClassifiedNode (from_attributes=True)
    │   │
    │   │  ClassifiedNode inherits FieldEntry. FieldEntry declares:
    │   │    field_name: str  ← reads FieldSlot.field_name ✓
    │   │    shape: AnnotationShape = Field(alias="annotation")
    │   │      ← reads FieldSlot.annotation ✓
    │   │
    │   │  ┌─ ANNOTATION SHAPE DU (#1: annotation form) ──────────────────┐
    │   │  │ FieldSlot.annotation is a TypeAnnotation (self-classifying   │
    │   │  │ wrapper #1). Pydantic reads .kind (@property):               │
    │   │  │   get_origin(RoleName) → None → AnnotationKind.DIRECT        │
    │   │  │                                                              │
    │   │  │ DU sees kind="direct" → selects DirectAnnotation:            │
    │   │  │   nullable = Literal[False] (constant, not computed)         │
    │   │  │   collection = Literal[False] (constant, not computed)       │
    │   │  │   resolved_type = RoleName (unwrapped inner type)            │
    │   │  └──────────────────────────────────────────────────────────────┘
    │   │
    │   │  ClassifiedNode also declares:
    │   │    block_shape: BlockShape = Field(alias="resolved_type")
    │   │      ← reads FieldSlot.resolved_type ✓
    │   │
    │   │  ┌─ BLOCK SHAPE DU (#2: type classification) ───────────────────┐
    │   │  │ FieldSlot.resolved_type is a ResolvedType (self-classifying  │
    │   │  │ wrapper #2). Pydantic reads .block_kind (@property):         │
    │   │  │   issubclass(RoleName, StrEnum)? No.                         │
    │   │  │   issubclass(RoleName, RootModel)? Yes → Block.NEWTYPE       │
    │   │  │                                                              │
    │   │  │ DU sees block_kind="newtype" → selects LeafBlock:            │
    │   │  │   No children field → ResolvedType.children NEVER FIRES      │
    │   │  │   Recursion doesn't happen. The shape decided.               │
    │   │  └──────────────────────────────────────────────────────────────┘
    │   │
    │   │  For a RECORD field like team: Team (a BaseModel):
    │   │    ResolvedType.block_kind → Block.RECORD → selects RecordBlock
    │   │    RecordBlock HAS children: tuple[ClassifiedNode, ...]
    │   │    Pydantic reads ResolvedType.children → fires:
    │   │      ModelTree.model_validate(Team) → RECURSE (entire cascade again)
    │   │    The children ARE the recursive result. Construction IS descent.
    │   │
    │   └─ Result: ClassifiedNode(
    │        field_name="budget_authority",
    │        shape=DirectAnnotation(nullable=False, collection=False, ...),
    │        block_shape=LeafBlock(block_kind=Block.NEWTYPE)
    │      )
    │
    └─► tree.fields = (ClassifiedNode(...), ClassifiedNode(...), ...)
For an Optional field like reports_to: RoleName | None:
TypeAnnotation.kind → OPTIONAL (structural form)
TypeAnnotation.resolved_type → RoleName (unwrapped — strips the | None)
DU selects OptionalAnnotation → nullable=Literal[True] ← THE SHAPE IS THE ANSWER
    ┌─ PROJECTION: RENDERING IS RECURSIVE CONSTRUCTION ─────────────────────┐
    │                                                                       │
    │ report = TreeReport.model_validate(tree)  # Another model_validate.   │
    │   └─ reports: tuple[FieldReport, ...] = Field(alias="fields")         │
    │      └─ FieldReport reads ClassifiedNode via from_attributes:         │
    │           field_name ← ClassifiedNode.field_name                      │
    │           block ← ClassifiedNode.block (delegates to block_shape)     │
    │           nullable ← ClassifiedNode.nullable (@property)              │
    │           collection ← ClassifiedNode.collection (@property)          │
    │           children ← ClassifiedNode.children → FieldReport(...)       │
    │      └─ line: str ← @computed_field — one field's display string      │
    │   └─ lines: tuple[str, ...] ← recursive flatten of all lines          │
    │   └─ text: str ← @computed_field — indented tree rendering            │
    │                                                                       │
    │ print(report) → report.__str__() → self.text (cached)                 │
    │ TreeReport owns indentation. FieldReport has no knowledge of depth.   │
    └───────────────────────────────────────────────────────────────────────┘
No helper functions. No if-chains. No manual dict building between steps.
The models wire to each other through field declarations and from_attributes.
================================================================================
THE BUILDING BLOCK HIERARCHY
================================================================================
Every type in a Pydantic program is one of these building blocks:
| Block      | What it is                            | Pydantic construct       |
|------------|---------------------------------------|--------------------------|
| ENUM       | Closed vocabulary                     | StrEnum                  |
| NEWTYPE    | Semantic scalar wrapper               | RootModel[scalar]        |
| COLLECTION | Immutable sequence wrapper            | RootModel[tuple[T,...]]  |
| RECORD     | Frozen product with named fields      | BaseModel, frozen=True   |
| ALGEBRA    | Record + derived fields               | Record + @computed_field |
| EFFECT     | Record + side effects on construction | Record + model_post_init |
| SCALAR     | Primitive (str, int, bool, etc.)      | bare Python types        |
| UNION      | Sum type / discriminated union        | Annotated[A|B, Field()]  |
Use the simplest construct that does the job. Enum before Newtype before Record.
================================================================================
USAGE
================================================================================
uv run python .claude/scripts/building_block.py module:ClassName
uv run python .claude/scripts/building_block.py module:ClassName --json
Examples:
uv run python .claude/scripts/building_block.py arm_ont.team:Team
uv run python .claude/scripts/building_block.py arm_ont.team:Team --json
uv run python .claude/scripts/building_block.py arm_ont.role:Role
uv run python .claude/scripts/building_block.py pydantic:BaseModel
--json outputs the full recursive tree as JSON (for bots and tooling).
Without --json, outputs indented human-readable text.
TWO AUDIENCES, ZERO EXTRA CODE:
The same TreeReport serves both. For humans, __str__ delegates to .text
(the indented rendering). For bots, model_dump_json() serializes the
entire recursive FieldReport tree — including .line and .lines, because
they're @computed_field. No separate "JSON formatter." The projection
model IS the API.
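A reduced sketch of one model serving both audiences, assuming Pydantic v2. Line is a hypothetical model, much smaller than the script's TreeReport:

```python
from pydantic import BaseModel, computed_field


class Line(BaseModel, frozen=True):
    name: str
    kind: str

    @computed_field
    @property
    def line(self) -> str:
        # Derived display string; serialized because it is a computed field.
        return f"{self.name}: {self.kind}"

    def __str__(self) -> str:
        # Human-readable output delegates to the same projection.
        return self.line


row = Line(name="budget", kind="newtype")
as_text = str(row)               # for humans
as_json = row.model_dump_json()  # for tooling; includes "line"
```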
================================================================================
SEMANTIC INDEX TYPES AND FIELD DESCRIPTIONS
================================================================================
A semantic index type is a type declaration where natural-language tokens —
field names, descriptions, and enum member names — function as computational
instructions for a consumer that reads them. In a conventional type system, a
field name is just an address: it tells the system WHICH slot. In a semantic
index type, the field name also tells the consumer WHAT to put in that slot.
This matters because Pydantic models are consumed by language models. When an
LLM sees a schema (via model_json_schema(), tool definitions, or structured
output), it reads field names and Field(description=...) values as natural-
language instructions. Changing a field name or description changes what the
model computes. Renaming is refactoring. The description is part of the program.
A type annotation and a field description form a two-channel system:
STRUCTURAL CHANNEL: The type annotation bounds the space of valid values.
bool gives 2 values. A 4-member enum gives 4. str gives unbounded.
This channel is enforced mechanically — the consumer physically
cannot produce a value outside the type's constraints.
SEMANTIC CHANNEL: The Field(description=...) guides the consumer's
selection WITHIN the structurally valid space. It determines which
valid value the consumer produces. The tighter the structural
constraint, the less the description needs to do. A bool has 1 bit
of freedom — the description resolves which bit. A bare str has
unbounded freedom — the description bears the full burden.
Field descriptions in this file are NOT documentation. Docstrings teach how the
code works. Comments teach the meta-principles. Field descriptions are semantic
indices — they instruct the consumer (human or LLM) about what each field's
value MEANS for the model being analyzed.
OUR STRATEGY IN THIS FILE:
This tool classifies Pydantic models for developers and agents building
construction-first programs. Its consumers hold a classification result
and need to know: what does this value tell me about the model I analyzed?
Where is modeling incomplete? What should I build next?
Each Field(description=...) answers that question for its field. It uses
the vocabulary of the domain (type analysis, field classification, model
structure) — not the vocabulary of the implementation (validators,
construction phases, internal wiring). It says only what the type
annotation doesn't already say. It grounds in what the value tells the
consumer, not how the program computed it.
Literal constant fields (like nullable: Literal[False] on DirectAnnotation)
have ZERO degrees of freedom — the type says everything. No description
needed. Fields with real degrees of freedom get descriptions calibrated
to their structural compression: a bool (1 bit) gets a short precise
description. An 8-member enum gets a description that maps each value
to its meaning. An unbounded str gets a description that fully specifies
the expected content.
THE GENERIC PROMPT FOR WRITING FIELD DESCRIPTIONS:
Gather what the program does, who uses its output, and what they do with
it specifically. Then walk the construction graph from leaves to roots.
For each field, write a Field(description=...) that:
1. Uses the fewest tokens that leave zero ambiguity about what the value
means in this program's domain
2. Says only what the type annotation doesn't already say — the type
handled its part, the description handles the rest
3. Grounds in what the value tells the consumer about the thing being
analyzed, not how the program computed it
Use the vocabulary of the domain, not the vocabulary of the
implementation.
Do not explain the codebase. Do not teach concepts. Do not reference
frameworks or architectural patterns. Each description is an instruction
that resolves what this field's value means for the consumer holding
the result.
"""
from __future__ import annotations
import importlib
import types
from collections.abc import Callable
from contextvars import ContextVar
from enum import StrEnum
from typing import (
    Annotated,
    ClassVar,
    Literal,
    TypeAliasType,
    get_args,
    get_origin,
    override,
)
from functools import cached_property
from pydantic import BaseModel, Field, RootModel, computed_field, model_validator
from pydantic.fields import FieldInfo
# =============================================================================
# ENUMS — closed vocabularies that drive dispatch
# =============================================================================
# Enums are the simplest building block. They're exhaustive — every consumer
# must handle every variant. Add a variant, the type checker finds every
# incomplete match. That's why we use them for classification tags.
class Block(StrEnum):
    """The eight structural building blocks of a Pydantic program.

    Every type you encounter in a Pydantic model is one of these. The
    classifier inspects a type and tells you which block it is. This enum
    is CLOSED — no other blocks exist. That's what makes dispatch total.
    """

    ENUM = "enum"  # StrEnum — closed set of string values
    NEWTYPE = "newtype"  # RootModel[scalar] — semantic wrapper (e.g. UserName(str))
    COLLECTION = "collection"  # RootModel[tuple[T, ...]] — immutable sequence
    RECORD = "record"  # BaseModel, frozen=True — product type with named fields
    ALGEBRA = "algebra"  # Record + @computed_field — stored fields in, derived out
    EFFECT = "effect"  # Record + model_post_init — construction triggers action
    SCALAR = "scalar"  # str, int, bool, etc. — Python primitives
    UNION = "union"  # Annotated[A | B, Field(discriminator=...)] — sum type

    @override
    def __repr__(self) -> str:
        return f"{type(self).__name__}.{self.name}"


class AnnotationKind(StrEnum):
    """The four structural forms a Python type annotation can take.

    Python annotations aren't always plain types. They can be wrapped in
    structural constructs: Optional (X | None), tuple (tuple[X, ...]),
    or TypeAliasType (type X = ...). This enum captures WHICH wrapping
    form an annotation has. The AnnotationShape DU below uses it as its
    discriminator — the kind tag routes to the correct variant.
    """

    DIRECT = "direct"  # Plain type: str, int, MyModel
    OPTIONAL = "optional"  # X | None — nullable
    TUPLE = "tuple"  # tuple[X, ...] — homogeneous collection
    ALIAS = "alias"  # type X = ... — TypeAliasType

    @override
    def __repr__(self) -> str:
        return f"{type(self).__name__}.{self.name}"


# =============================================================================
# ANNOTATION SHAPE — discriminated union (Phase 3: Field Construction)
# =============================================================================
# This is the core teaching example of a discriminated union (DU) and of
# Phase 3: Field Construction. When Pydantic constructs an AnnotationShape
# field, it reads the "kind" tag and selects the matching variant. This
# happens automatically during field construction — no code triggers it.
#
# In procedural code you'd write:
# if is_optional(ann): nullable = True
# elif is_tuple(ann): collection = True
# ...
#
# Instead, we declare a VARIANT MODEL for each case. Each variant has:
# - A Literal tag (kind) that identifies it
# - Fields whose VALUES are baked into the type (Literal[True], Literal[False])
#
# Pydantic reads the tag, selects the variant, and the variant's fields
# ARE the answer. No computation. The shape IS the program.
#
# Notice: nullable and collection are Literal[True] or Literal[False] on
# three of the four variants. They're not computed — they're CONSTANTS
# determined by which variant was selected. That's the power of DUs:
# dispatch replaces computation.
class DirectAnnotation(BaseModel, frozen=True, from_attributes=True):
    """A plain type annotation — not wrapped in Optional, tuple, or alias.

    Examples: str, int, TeamName, BaseModel subclasses.
    Always nullable=False, collection=False — a direct type is neither.
    """

    kind: Literal[AnnotationKind.DIRECT] = AnnotationKind.DIRECT
    resolved_type: object = Field(
        exclude=True, description="The Python type the field holds"
    )
    nullable: Literal[False] = False  # Direct types aren't nullable
    collection: Literal[False] = False  # Direct types aren't collections


class OptionalAnnotation(BaseModel, frozen=True, from_attributes=True):
    """An X | None annotation — the field's value may be None.

    Examples: RoleName | None, str | None.
    Always nullable=True — that's what Optional MEANS.
    """

    kind: Literal[AnnotationKind.OPTIONAL] = AnnotationKind.OPTIONAL
    resolved_type: object = Field(
        exclude=True, description="The Python type the field holds"
    )
    nullable: Literal[True] = True  # Optional is ALWAYS nullable
    collection: Literal[False] = False


class TupleAnnotation(BaseModel, frozen=True, from_attributes=True):
    """A tuple[X, ...] annotation — a homogeneous immutable sequence.

    Examples: tuple[str, ...], tuple[TeamName, ...].
    Always collection=True — that's what tuple[X, ...] MEANS.
    """

    kind: Literal[AnnotationKind.TUPLE] = AnnotationKind.TUPLE
    resolved_type: object = Field(
        exclude=True, description="The Python type the field holds"
    )
    nullable: Literal[False] = False
    collection: Literal[True] = True  # Tuple IS a collection


class AliasAnnotation(BaseModel, frozen=True, from_attributes=True):
    """A TypeAliasType annotation — created by 'type X = ...' syntax.

    Examples: type Authority = LimitedAuthority | UnlimitedAuthority | NoAuthority
    nullable and collection depend on what the alias resolves to, so they're
    plain bool here, not Literal constants.
    """

    kind: Literal[AnnotationKind.ALIAS] = AnnotationKind.ALIAS
    resolved_type: object = Field(
        exclude=True, description="The Python type the field holds"
    )
    nullable: bool = Field(
        default=False, description="True when the alias target is an Optional type"
    )
    collection: bool = Field(
        default=False, description="True when the alias target is a tuple type"
    )


# The discriminated union itself. Pydantic reads the "kind" field from the
# input data and routes to the matching variant. This is the type-level
# equivalent of a match/case statement — but it fires DURING CONSTRUCTION,
# not in calling code. The consumer never switches on kind. They just
# read .nullable and .collection from whatever variant was selected.
AnnotationShape = Annotated[
    DirectAnnotation | OptionalAnnotation | TupleAnnotation | AliasAnnotation,
    Field(discriminator="kind"),
]
# =============================================================================
# TYPE ANNOTATION — self-classifying wrapper #1 (Phase 6: Derived Projection)
# =============================================================================
# This is self-classifying wrapper #1 of two. (ResolvedType is #2.)
#
# THE PATTERN: RootModel[object] wraps a raw value. Properties expose
# derived attributes. Downstream DUs read those properties via
# from_attributes=True and auto-route to the correct variant. The object
# classifies itself — no external classifier needed.
#
# Here, TypeAnnotation wraps a raw Python annotation and exposes:
# .kind → AnnotationKind tag (drives AnnotationShape DU routing)
# .resolved_type → the inner type after unwrapping structural wrappers
#
# The key insight: from_attributes reads PROPERTIES, not just stored fields.
# So a RootModel with the right properties IS a self-classifying object.
# Pydantic reads .kind, routes the DU, and the variant's Literal fields
# ARE the answer. No one manually determines the kind. The type does it.
#
# .kind and .resolved_type are Phase 6: Derived Projection. Pure functions
# of the frozen root value. Bare @property (not @computed_field) because
# they don't need caching or serialization — trivial derivations read
# once during Phase 3 field construction on downstream models.
class TypeAnnotation(RootModel[object], frozen=True):
    """Self-classifying wrapper #1: raw annotation → structural form.

    Wraps any annotation (type, generic alias, TypeAliasType, etc.) and
    exposes .kind and .resolved_type as properties. Downstream models read
    these via from_attributes=True — enabling automatic DU routing.

    .kind classifies the STRUCTURAL FORM: is this a plain type, an Optional,
    a tuple, or a type alias? The AnnotationShape DU reads .kind and routes.

    .resolved_type UNWRAPS the structure: RoleName | None → RoleName.
    tuple[Team, ...] → Team. type X = Y → Y. The wrapper is captured by
    .kind — resolved_type is what's inside. This unwrapping is what makes
    recursive descent work: a field typed tuple[Team, ...] resolves to Team,
    which classifies as RECORD, which triggers descent into Team's fields.
    """

    @property
    def kind(self) -> AnnotationKind:
        """Detect which structural form this annotation has.

        Uses get_origin() for generic aliases (tuple[X,...], X | None)
        and isinstance() for TypeAliasType. Falls back to DIRECT.
        """
        # get_origin() returns the "base" of a generic alias:
        #   get_origin(str | None)       → types.UnionType
        #   get_origin(tuple[str, ...])  → tuple
        #   get_origin(str)              → None (not generic)
        o = get_origin(self.root)
        # Caveat: any PEP 604 union matches here, including A | B without
        # None. A union spelled typing.Optional[X] or typing.Union[...] has
        # origin typing.Union on Python 3.12 and 3.13, so it is not detected
        # and falls through to DIRECT.
        if o is types.UnionType:
            return AnnotationKind.OPTIONAL
        if o is tuple:
            return AnnotationKind.TUPLE
        # TypeAliasType isn't a generic alias — it's a special class
        # created by 'type X = ...' syntax. isinstance detects it.
        if isinstance(self.root, TypeAliasType):
            return AnnotationKind.ALIAS
        return AnnotationKind.DIRECT

    @property
    def resolved_type(self) -> object:
        """The inner type after structural unwrapping.

        The structural wrapper is already captured by .kind — resolved_type
        is what's inside:
            X | None       → X (the non-None member)
            tuple[X, ...]  → X (the element type)
            type X = Y     → Y (the alias target)
            plain type     → itself
        """
        o = get_origin(self.root)
        if o is types.UnionType:
            # First member that is not NoneType.
            return next(a for a in get_args(self.root) if a is not type(None))  # pyright: ignore[reportAny]
        if o is tuple:
            return get_args(self.root)[0]  # pyright: ignore[reportAny]
        if isinstance(self.root, TypeAliasType):
            return self.root.__value__  # pyright: ignore[reportAny]
        return self.root


# =============================================================================
# FIELD SLOT — bridging the dict boundary (Phases 1, 3, and 6)
# =============================================================================
# Python's model_fields is a dict[str, FieldInfo]. The field NAME is the
# dict KEY, not an attribute on the FieldInfo value. This is the one place
# where pure from_attributes construction can't work — dict keys aren't
# attributes.
#
# FieldSlot is the bridge model. Its Phase 1 before validator takes a
# (key, value) tuple from dict.items() and produces a dict with named
# fields. Together with ModelTree._reshape, which iterates dict.items(),
# this is the irreducible minimum of procedure in the cascade: one small
# validator that pairs positional data into named data. Everything else
# is pure construction.
#
# Phase 3: The annotation field is typed as TypeAnnotation — Pydantic
# automatically coerces the raw annotation into a TypeAnnotation wrapper.
# The type declaration IS the instruction.
#
# Phase 6: The .resolved_type property wraps the unwrapped inner type in a
# ResolvedType (self-classifying wrapper #2). ClassifiedNode reads this
# during construction via from_attributes, triggering BlockShape DU routing.
# This is the bridge between annotation-level classification (AnnotationShape)
# and type-level classification (BlockShape).
class FieldSlot(BaseModel, frozen=True, from_attributes=True, populate_by_name=True):
    """A single entry from model_fields: field name + its type annotation.

    Constructed from dict.items() tuples via the before validator.
    populate_by_name=True allows both "name" (alias) and "field_name" to work.

    Two attributes feed downstream construction:
        .annotation (stored field) — TypeAnnotation, coerced during Phase 3.
            ClassifiedNode reads this via alias to construct AnnotationShape DU.
        .resolved_type (property) — ResolvedType, derived during Phase 6.
            ClassifiedNode reads this via alias to construct BlockShape DU.

    The two self-classifying wrappers (TypeAnnotation, ResolvedType) both
    originate here. FieldSlot is the source of truth for everything
    ClassifiedNode needs to classify a field.
    """

    field_name: str = Field(
        alias="name", description="Name of the field on the model being classified"
    )
    annotation: TypeAnnotation = Field(
        description="The field's type annotation as declared in source"
    )

    @model_validator(mode="before")
    @classmethod
    def _from_tuple(cls, data: tuple[str, FieldInfo]) -> dict[str, object]:
        """The irreducible minimum: pair a dict key with its value's annotation.

        dict.items() produces (str, FieldInfo) tuples. Tuples have positions,
        not names. This validator bridges position → name so Pydantic can
        construct the model's named fields. This is the ONE place in the
        entire cascade where we write procedural mapping code.
        """
        return {"field_name": data[0], "annotation": data[1].annotation}

    @property
    def resolved_type(self) -> ResolvedType:
        return ResolvedType(self.annotation.resolved_type)
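# Illustration only (not part of the cascade): a minimal sketch of the
# tuple-to-named-fields bridge described above, assuming Pydantic v2.
# Pair and its fields are hypothetical stand-ins for FieldSlot. The
# pass-through guard for non-tuple input is an addition for this sketch.

```python
from pydantic import BaseModel, Field, model_validator


class Pair(BaseModel, frozen=True, populate_by_name=True):
    """Hypothetical stand-in for FieldSlot: one (key, value) entry."""

    key: str = Field(alias="name")
    value: int

    @model_validator(mode="before")
    @classmethod
    def _from_tuple(cls, data: object) -> object:
        # Pair positional tuple data into named fields. Anything that is
        # not a 2-tuple is passed through unchanged (sketch-only guard).
        if isinstance(data, tuple) and len(data) == 2:
            return {"key": data[0], "value": data[1]}
        return data


source = {"width": 3, "height": 4}
# dict.items() yields (key, value) tuples; the validator names the positions.
pairs = tuple(Pair.model_validate(item) for item in source.items())
```

# populate_by_name=True is what lets the validator emit "key" (the field
# name) even though the field also carries the alias "name".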
# =============================================================================
# FIELD ENTRY — alias + DU + properties (Phases 1, 3, and 6 together)
# =============================================================================
# This model demonstrates three phases firing in sequence on one class:
#
# Phase 1 (Boundary Transform): Field(alias="annotation") — Pydantic reads
# .annotation from the FieldSlot. Pure declaration, no validator needed.
# This is the cleanest form of Phase 1: an alias, not a before validator.
#
# Phase 3 (Field Construction): Pydantic constructs AnnotationShape (the DU)
# from the TypeAnnotation it just read. TypeAnnotation.kind drives the DU
# routing. The DU selects the variant. The variant's Literal fields set
# nullable/collection. All automatic — no code triggers it.
#
# Phase 6 (Derived Projection): Properties delegate to self.shape, exposing
# resolved_type, nullable, collection as flat attributes. from_attributes
# on ClassifiedNode reads these properties. The flattening IS the projection.
class FieldEntry(BaseModel, frozen=True, from_attributes=True):
    """A field with its annotation resolved into an AnnotationShape.

    The shape field is aliased from "annotation" — when Pydantic constructs
    a FieldEntry from a FieldSlot (via from_attributes), it reads
    FieldSlot.annotation (a TypeAnnotation), and constructs the AnnotationShape
    DU from it. TypeAnnotation.kind drives the DU routing. TypeAnnotation.resolved_type
    provides the inner type.

    Properties flatten the nested shape for ClassifiedNode to read:
        ClassifiedNode.from_attributes → reads .resolved_type, .nullable, .collection
        These are properties on FieldEntry, delegating to self.shape.
    """

    field_name: str = Field(
        description="Name of the field on the model being classified"
    )
    shape: AnnotationShape = Field(
        alias="annotation",
        description="Structural form of the annotation — direct, optional, tuple, or alias — with nullable and collection flags",
    )

    @property
    def resolved_type(self) -> object:
        """Delegate to shape — ClassifiedNode reads this via from_attributes."""
        return self.shape.resolved_type

    @property
    def nullable(self) -> bool:
        """Delegate to shape — True if the annotation was X | None."""
        return self.shape.nullable

    @property
    def collection(self) -> bool:
        """Delegate to shape — True if the annotation was tuple[X, ...]."""
        return self.shape.collection
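# Illustration only: a minimal sketch of the alias + flattening pattern
# described above, assuming Pydantic v2. Every name here (RawAnnotation,
# Slot, Shape, Entry, Flat) is hypothetical. It shows an alias naming the
# attribute to read from the source object, and properties exposing the
# nested shape as flat attributes for a downstream model.

```python
from pydantic import BaseModel, Field


class RawAnnotation:
    """Hypothetical upstream value exposing derived attributes."""

    def __init__(self, nullable: bool, collection: bool) -> None:
        self.nullable = nullable
        self.collection = collection


class Slot:
    """Hypothetical upstream object: a field name plus its annotation."""

    def __init__(self, field_name: str, annotation: RawAnnotation) -> None:
        self.field_name = field_name
        self.annotation = annotation


class Shape(BaseModel, frozen=True, from_attributes=True):
    nullable: bool
    collection: bool


class Entry(BaseModel, frozen=True, from_attributes=True):
    field_name: str
    # The alias names the attribute to read on the source object.
    shape: Shape = Field(alias="annotation")

    @property
    def nullable(self) -> bool:
        return self.shape.nullable

    @property
    def collection(self) -> bool:
        return self.shape.collection


class Flat(BaseModel, frozen=True, from_attributes=True):
    """Downstream model that only sees the flattened attributes."""

    field_name: str
    nullable: bool
    collection: bool


entry = Entry.model_validate(Slot("tags", RawAnnotation(False, True)))
# Flat reads Entry's properties exactly as it would read stored fields.
flat = Flat.model_validate(entry)
```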
# =============================================================================
# CLASSIFIED NODE — two DUs, full classification (Phase 3 + Phase 6)
# =============================================================================
# ClassifiedNode inherits from FieldEntry. Inheritance in Pydantic means
# "I am a FieldEntry, plus more." Phase 3 fires the parent's fields first
# (field_name, shape via AnnotationShape DU), then ClassifiedNode's own
# field: block_shape via the BlockShape DU.
#
# Two DUs fire during one ClassifiedNode construction:
# 1. AnnotationShape — classifies the annotation form (nullable? collection?)
# Fed by TypeAnnotation (self-classifying wrapper #1)
# 2. BlockShape — classifies the type itself (record? enum? scalar?)
# Fed by ResolvedType (self-classifying wrapper #2)
#
# The BlockShape DU is where recursion happens — or doesn't. RecordBlock
# has a children field, so Pydantic reads ResolvedType.children, which
# fires ModelTree.model_validate on the inner type. LeafBlock has no
# children field, so the property never fires. The variant's shape IS
# the recursion decision. Nobody writes "if record: descend."
#
# .block and .children are Phase 6 delegation properties — they forward
# to block_shape so downstream models (FieldReport) see a flat interface.
class ClassifiedNode(FieldEntry, frozen=True, from_attributes=True):
    """One field fully classified: its building block type, structural flags, and children.

    Inherits from FieldEntry — gets field_name, shape, and the flattening
    properties for free. Adds block classification via BlockShape DU routed
    from ResolvedType.block_kind. For recursive variants (RecordBlock,
    AlgebraBlock, EffectBlock), ResolvedType.children fires → recursion.
    """

    block_shape: BlockShape = Field(
        alias="resolved_type",
        description="Building block classification with recursive children for record-like types",
    )

    @property
    def block(self) -> Block:
        return self.block_shape.block_kind

    @property
    def children(self) -> tuple[ClassifiedNode, ...]:
        return self.block_shape.children
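# Illustration only: a minimal sketch of "I am a FieldEntry, plus more",
# assuming Pydantic v2. Measured, Named, and Sized are hypothetical names.
# The subclass keeps the parent's field and adds one of its own, and both
# are read from the same source object in a single model_validate call.

```python
from pydantic import BaseModel, Field


class Measured:
    """Hypothetical source object; both models below read from it."""

    def __init__(self, name: str, length: int) -> None:
        self.name = name
        self.length = length


class Named(BaseModel, frozen=True, from_attributes=True):
    name: str


class Sized(Named, frozen=True, from_attributes=True):
    # The inherited field `name` is still populated from the source;
    # `size` is read from the source's `.length` attribute via the alias.
    size: int = Field(alias="length")


sized = Sized.model_validate(Measured("row", 7))
```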
# =============================================================================
# RESOLVED TYPE — self-classifying wrapper #2 (Phase 6: Derived Projection)
# =============================================================================
# This is self-classifying wrapper #2 of two. (TypeAnnotation is #1.)
#
# Same pattern, different domain:
# TypeAnnotation wraps an annotation → .kind classifies the FORM
# ResolvedType wraps a type → .block_kind classifies the TYPE
#
# Two properties:
# .block_kind — walks a predicate table (_BLOCK_MAP) in priority order.
# The table evolved from the original (type, Block) pairs to
# (Callable[[type], bool], Block) predicates. The walk is the same
# one-liner — next() with a default. The table got wider (predicates
# instead of bare types) so it can express ALGEBRA and EFFECT as
# refinements of RECORD. Data-driven dispatch scales by widening rows,
# not adding branches.
#
# .children — unconditionally calls ModelTree.model_validate(self.root).
# No guard. No isinstance check. This property is ONLY ever read when
# a BlockShape DU variant with a children field constructs via
# from_attributes (RecordBlock, AlgebraBlock, EffectBlock). Leaf
# variants have no children field, so this property never fires for
# them. The DU variant's shape IS the guard.
class ResolvedType(RootModel[object], frozen=True):
    """Self-classifying wrapper #2: raw type → building block classification.

    Parallels TypeAnnotation: RootModel[object] wrapping a value, exposing
    derived properties downstream DUs read via from_attributes.

    TypeAnnotation classifies the ANNOTATION FORM (Optional, tuple, alias, direct).
    ResolvedType classifies the TYPE ITSELF (enum, newtype, record, scalar, ...).

    .block_kind drives BlockShape DU variant routing.
    .children provides recursive descent — only ever read when a variant
    with a children field constructs. Leaf variants never request it.
    """

    _BLOCK_MAP: ClassVar[tuple[tuple[Callable[[type], bool], Block], ...]] = (
        (lambda t: issubclass(t, StrEnum), Block.ENUM),
        (lambda t: issubclass(t, RootModel), Block.NEWTYPE),
        (
            lambda t: issubclass(t, BaseModel) and bool(t.model_computed_fields),
            Block.ALGEBRA,
        ),
        (
            lambda t: issubclass(t, BaseModel) and "model_post_init" in vars(t),
            Block.EFFECT,
        ),
        (lambda t: issubclass(t, BaseModel), Block.RECORD),
    )

    @property
    def block_kind(self) -> Block:
        return next(
            (
                b
                for pred, b in self._BLOCK_MAP
                if isinstance(self.root, type) and pred(self.root)
            ),
            Block.SCALAR,
        )

    @property
    def children(self) -> tuple[ClassifiedNode, ...]:
        return ModelTree.model_validate(self.root).fields
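# Illustration only: a minimal sketch of the predicate-table walk described
# above, assuming Pydantic v2. Kind, Wrapped, and the sample types are
# hypothetical. Rows are checked in order, so a refinement (ALGEBRA) must
# sit above the general case (RECORD) it narrows.

```python
from enum import Enum
from typing import Callable, ClassVar

from pydantic import BaseModel, RootModel, computed_field


class Kind(str, Enum):
    ENUM = "enum"
    ALGEBRA = "algebra"
    RECORD = "record"
    SCALAR = "scalar"


class Wrapped(RootModel[object], frozen=True):
    """Hypothetical wrapper that classifies whatever it holds."""

    _TABLE: ClassVar[tuple[tuple[Callable[[type], bool], Kind], ...]] = (
        (lambda t: issubclass(t, Enum), Kind.ENUM),
        (
            lambda t: issubclass(t, BaseModel) and bool(t.model_computed_fields),
            Kind.ALGEBRA,
        ),
        (lambda t: issubclass(t, BaseModel), Kind.RECORD),
    )

    @property
    def kind(self) -> Kind:
        # Non-class values (e.g. 42) can never satisfy issubclass.
        if not isinstance(self.root, type):
            return Kind.SCALAR
        # First matching row wins; the default covers "no row matched".
        return next((k for pred, k in self._TABLE if pred(self.root)), Kind.SCALAR)


class Color(Enum):
    RED = "red"


class Point(BaseModel):
    x: int


class Circle(BaseModel):
    radius: float

    @computed_field
    @property
    def diameter(self) -> float:
        return self.radius * 2


kinds = {t: Wrapped(t).kind for t in (Color, Point, Circle, int)}
```

# Adding a new classification means adding a row to the table, not a
# branch to the property.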
# =============================================================================
# BLOCK SHAPE — discriminated union #2 (Phase 3 + demand-driven recursion)
# =============================================================================
# This is the second DU in the cascade. AnnotationShape (#1) classifies the
# annotation FORM (nullable? collection?). BlockShape (#2) classifies the
# TYPE ITSELF (record? enum? scalar?) and controls whether recursion happens.
#
# THE KEY INSIGHT: Recursion is driven by VARIANT SHAPE, not by branching.
# - RecordBlock, AlgebraBlock, EffectBlock have a children field.
# When Pydantic constructs them from a ResolvedType (via from_attributes),
# it reads .children — which fires ModelTree.model_validate(inner_type).
# That's how recursion starts. Nobody writes "if record: recurse."
# - LeafBlock has NO children field. It has a .children @property that
# returns (). Pydantic never reads it during construction (no field
# demands it). It exists so downstream delegation (ClassifiedNode.children)
# works uniformly on any variant.
#
# MULTI-VALUE LITERAL: LeafBlock uses Literal[Block.ENUM, Block.NEWTYPE, ...].
# Pydantic supports multiple values in a single Literal for DU discriminators.
# One variant catches all five leaf cases. Without this, you'd need five
# identical leaf variants — shape duplication for no semantic gain.
class RecordBlock(BaseModel, frozen=True, from_attributes=True):
    """A plain BaseModel — frozen product type with named fields.

    Has children: Pydantic reads ResolvedType.children during construction,
    triggering recursive descent into the record's own fields.
    """

    block_kind: Literal[Block.RECORD] = Block.RECORD
    children: tuple[ClassifiedNode, ...] = Field(
        description="Classified fields of the inner model, one per field"
    )


class AlgebraBlock(BaseModel, frozen=True, from_attributes=True):
    """A BaseModel with @computed_field — stored fields in, derived knowledge out.

    Has children: same recursive descent as RecordBlock. The distinction from
    RecordBlock is semantic — the type has derived projections, not just stored
    fields. Detected by checking model_computed_fields.
    """

    block_kind: Literal[Block.ALGEBRA] = Block.ALGEBRA
    children: tuple[ClassifiedNode, ...] = Field(
        description="Classified fields of the inner model, one per field"
    )


class EffectBlock(BaseModel, frozen=True, from_attributes=True):
    """A BaseModel with model_post_init — construction triggers a side effect.

    Has children: same recursive descent. The distinction is semantic — this
    type does something when constructed (registers itself, emits events).
    Detected by checking for model_post_init in the class's own vars().
    """

    block_kind: Literal[Block.EFFECT] = Block.EFFECT
    children: tuple[ClassifiedNode, ...] = Field(
        description="Classified fields of the inner model, one per field"
    )


class LeafBlock(BaseModel, frozen=True, from_attributes=True):
    """A non-record type — enum, newtype, collection, scalar, or union.

    No children field: Pydantic never reads ResolvedType.children when
    constructing a LeafBlock, so recursive descent never fires. The variant's
    shape IS the decision not to recurse.

    The .children property returns () for uniform delegation — ClassifiedNode
    can call self.block_shape.children on any variant without checking type.

    Multi-value Literal: one variant catches all five leaf block kinds.
    Pydantic matches any of them during DU routing.
    """

    block_kind: Literal[
        Block.ENUM, Block.NEWTYPE, Block.COLLECTION, Block.SCALAR, Block.UNION
    ] = Field(
        description="Which leaf building block — enum, newtype, collection, scalar, or union"
    )

    @property
    def children(self) -> tuple[ClassifiedNode, ...]:
        return ()


# The discriminated union itself. Pydantic reads block_kind from the
# ResolvedType (via from_attributes) and routes to the matching variant.
# RecordBlock/AlgebraBlock/EffectBlock → children field fires recursion.
# LeafBlock → no children field, recursion stops.
BlockShape = Annotated[
    RecordBlock | AlgebraBlock | EffectBlock | LeafBlock,
    Field(discriminator="block_kind"),
]
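# Illustration only: a minimal sketch of "the variant's shape IS the guard",
# assuming Pydantic v2. Source, Branch, Leaf, and Shape are hypothetical.
# The source counts reads of its .children property so the demand-driven
# behaviour can be observed: only a variant that declares a children field
# causes the property to be read.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


class Source:
    """Hypothetical source that counts reads of its .children property."""

    def __init__(self, kind: str, items: tuple[int, ...] = ()) -> None:
        self.kind = kind
        self._items = items
        self.children_reads = 0

    @property
    def children(self) -> tuple[int, ...]:
        self.children_reads += 1
        return self._items


class Branch(BaseModel, frozen=True, from_attributes=True):
    kind: Literal["branch"] = "branch"
    # Declaring this field is what makes Pydantic read source.children.
    children: tuple[int, ...]


class Leaf(BaseModel, frozen=True, from_attributes=True):
    # Multi-value Literal: one variant matches several discriminator values.
    kind: Literal["enum", "scalar"]


Shape = Annotated[Union[Branch, Leaf], Field(discriminator="kind")]
adapter = TypeAdapter(Shape)

branch_src = Source("branch", (1, 2))
leaf_src = Source("scalar")
branch = adapter.validate_python(branch_src, from_attributes=True)
leaf = adapter.validate_python(leaf_src, from_attributes=True)
```

# After validation, leaf_src.children_reads is still 0: no field on Leaf
# asked for it, so the (potentially recursive) property never ran.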
# =============================================================================
# FIELD REPORT + TREE REPORT — recursive rendering (Phase 3 + Phase 6)
# =============================================================================
# In procedural code, rendering means writing a format_output() function that
# loops over results and builds strings. In construction-first code, rendering
# is ANOTHER model_validate (Phase 3: Field Construction) followed by
# @computed_field derivation (Phase 6: Derived Projection).
#
# RECURSIVE STRUCTURE: FieldReport has a children field typed as
# tuple[FieldReport, ...]. When Pydantic constructs a FieldReport from a
# ClassifiedNode, it reads .children (which are ClassifiedNodes) and coerces
# each one into a FieldReport — recursively. The SAME from_attributes
# construction that works at the top level works at every depth. No special
# recursion code. The type annotation IS the recursion instruction.
#
# Phase 3: TreeReport.model_validate(tree) reads .fields from ModelTree via
# from_attributes. Pydantic coerces each ClassifiedNode → FieldReport by
# reading ClassifiedNode's properties (field_name, block, nullable, collection,
# children). Children coerce recursively into more FieldReports.
#
# Phase 6: Three @computed_field projections derive the display:
# .line — one field's display string, derived from stored fields
# .lines — recursive flatten of this report + all descendant lines
# .text (on TreeReport) — indented tree rendering with depth tracking
# All cached on first access, included in model_dump()/JSON.
#
# OWNERSHIP OF DEPTH: TreeReport owns indentation. FieldReport has no
# knowledge of its position in the tree — it just stores its data and its
# children. TreeReport._indent walks the recursive structure and derives
# the visual depth. This separation means FieldReport is reusable in any
# context (flat list, JSON, table), while TreeReport specializes for
# indented text output.
#
# __str__ delegates to .text — Python's display protocol uses the cached
# Phase 6 projection. print(report) fires the entire rendering pipeline.
#
# The cascade: ClassifiedNode → FieldReport → TreeReport → print().
# Each step is model_validate + from_attributes. No string formatting functions.
class FieldReport(BaseModel, frozen=True, from_attributes=True):
    """One classified field projected for display.

    Phase 3: Constructs from ClassifiedNode via from_attributes — reads
    field_name, block, nullable, collection, children.

    Phase 6: line derives the display string, lines recursively flattens
    the tree. Derived once, cached, serializable.
    """

    field_name: str = Field(description="Name of the field on the analyzed model")
    block: Block = Field(
        description="Structural role: record (named fields), enum (closed vocabulary), newtype (typed wrapper), scalar (primitive), algebra (record + derived), effect (record + side effects), collection (sequence wrapper), union (sum type)"
    )
    nullable: bool = Field(
        description="True when the field's type was X | None — the field accepts absence"
    )
    collection: bool = Field(
        description="True when the field's type was tuple[X, ...] — the field holds a sequence"
    )
    children: tuple[FieldReport, ...] = Field(
        default=(),
        description="Classified children of the inner type, present for record/algebra/effect, empty for leaves",
    )

    @computed_field(
        description="Single-line summary: name, block type, nullable, and collection"
    )
    @cached_property
    def line(self) -> str:
        return f"{self.field_name}: {self.block.value} (nullable={self.nullable}, collection={self.collection})"

    @computed_field(
        description="This field's line plus all descendant lines, flattened in tree order"
    )
    @cached_property
    def lines(self) -> tuple[str, ...]:
        return (self.line, *(line for child in self.children for line in child.lines))
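# Illustration only: a minimal sketch of recursive rendering by construction,
# assuming Pydantic v2. Node and Report are hypothetical stand-ins for
# ClassifiedNode and FieldReport. The self-referential children field makes
# Pydantic coerce every descendant, and the cached computed field flattens
# the tree in order.

```python
from functools import cached_property

from pydantic import BaseModel, computed_field


class Node:
    """Hypothetical classified node: a name plus child nodes."""

    def __init__(self, name: str, children: tuple["Node", ...] = ()) -> None:
        self.name = name
        self.children = children


class Report(BaseModel, frozen=True, from_attributes=True):
    name: str
    # Self-referential field: each child is coerced into a Report as well.
    children: tuple["Report", ...] = ()

    @computed_field
    @cached_property
    def lines(self) -> tuple[str, ...]:
        # This node's name first, then every descendant in tree order.
        return (self.name, *(line for child in self.children for line in child.lines))


tree = Node("order", (Node("customer", (Node("email"),)), Node("total")))
# One model_validate call converts the whole tree, at every depth.
report = Report.model_validate(tree)
dumped = report.model_dump()
```

# Because lines is a computed field, it also appears in model_dump() output
# alongside the stored fields.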
# =============================================================================
# MODEL TREE — the root (Phase 2: Sealed Boundary + cycle detection)
# =============================================================================
# The entry point. ModelTree.model_validate(SomeBaseModelClass) fires the