perf(shared/codegen-core): few small performance improvements#3917
perf(shared/codegen-core): few small performance improvements#3917SukkaW wants to merge 11 commits into
Conversation
|
|
|
@SukkaW is attempting to deploy a commit to the Hey API Team on Vercel. A member of the Team first needs to authorize it. |
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3917 +/- ##
==========================================
+ Coverage 37.70% 37.72% +0.01%
==========================================
Files 582 582
Lines 20844 20880 +36
Branches 6063 6075 +12
==========================================
+ Hits 7860 7876 +16
- Misses 10570 10591 +21
+ Partials 2414 2413 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Important
Most of the optimizations look clean and the swap from spread-copy to Object.assign / in-place mutation is safe where the target is freshly constructed. However, replacing required = [...required, ...schema.required] with required.push(...schema.required) is not equivalent in this code path: irCompositionSchema.required (and inlineSchema.required in the discriminator branch) may already alias schema.required from the source spec via parseType's irSchema.required = schema.required, so the push mutates the user's input OpenAPI document. The previous spread allocated a fresh array and avoided this. Five inline comments below covering all three parser versions.
TL;DR — A batch of micro-optimizations across the shared parser (hoisted regexps, in-place mutation, fast paths, iterator-free loops, delete → = undefined) reportedly shaves ~9% off Stripe spec generation. The wins look solid, but a few of the spread-to-push rewrites cross from "harmless in-place mutation of fresh IR objects" into "mutation of arrays aliased from the input spec".
Key changes
- Hoist top-level IR pattern regexps — move the
Record<IrTopLevelKind, RegExp>table out ofmatchIrPointerToGroupso the regex literals aren't re-created on each call. - Switch object-emptiness checks to a running flag —
parseObjectnow tracksisSchemaPropertiesEmpty/isPatternPropertiesEmptyduring iteration instead of callingObject.keys(...).lengthafterward, and skips the inner allocation whenschema.properties/schema.patternPropertiesare absent. getSchemaType/getSchemaTypesaccept the schema directly — drop the unused destructuring wrapper at every call site across 2.0.x / 3.0.x / 3.1.x parsers.parseEnumuses an index loop and hoistsx-enum*lookups — avoids the iterator-protocol overhead ofentries()and the per-iteration property reads.- In-place merges over spread —
Object.assign(target, source)replacestarget = {...target, ...source}indeduplicateSchema,addItemsToSchema, and theparseArraycomposition-lift branches. delete→= undefined— forstate.inAllOfreset and theresult.items/result.logicalOperatorcleanup indeduplicateSchema, to keep V8 hidden-class shapes stable.buildIndexKeySpaceaccumulates into a sharedentriesarray — recursion passes the array down instead of spreading sub-results back up.- Fast path in
jsonPointerToPath— skip the.map(replaceAll)decode pass entirely when the pointer contains no~. - Fast path in
isTopLevelComponentfor string input — segment-count viaindexOfinstead of parsing the pointer into an array and discarding it. - Lazy
Setallocation incollectPointerDependencies— switchsubtreeDependencies/transitiveDependenciestoSet<string> | null, allocate only when there's something to add, and useMap.hasinstead ofMap.gettruthy-checks sonullresults cache correctly. - Move
patternPropertiesparsing beforeadditionalPropertiesbranch — lets theisPatternPropertiesEmptyflag participate in theisEmptyObjectInAllOfcheck.
Summary | 12 files | 11 commits | base: main ← more-performance-improvement
In-place push onto aliased required arrays
Before: spread allocated a fresh array, leaving the input spec untouched.
After:required.push(...)mutates whatever arrayirCompositionSchema.required(orinlineSchema.required) currently points at — which may be the source spec's array.
In parseType, irSchema.required = schema.required aliases the source schema's required array directly into the IR (no copy). When a $ref'd composition schema is later processed in parseAllOf and the parent schema.required is non-empty, the new irCompositionSchema.required.push(...schema.required) writes into the referenced spec node's required array. The same hazard applies to the discriminator inlineSchema.required.push(discriminator.propertyName) rewrites. The original spread ([...a, ...b]) avoided this by always producing a fresh array; the safe fix here is the same — either keep the spread on this hot path, or guard with a "clone if aliased" check.
packages/shared/src/openApi/3.1.x/parser/schema.ts · packages/shared/src/openApi/3.0.x/parser/schema.ts · packages/shared/src/openApi/2.0.x/parser/schema.ts
isTopLevelComponent string fast path and pointer encoding
Before: decoded the pointer through
jsonPointerToPathand checked segment count.
After: counts/characters directly in the raw string.
The fast path treats encoded slashes (~1) inside a component name as real segment separators, so e.g. a hypothetical #/components/schemas/foo~1bar (name containing a /) now returns false where the array path returned true. The OpenAPI spec restricts component-key names to ^[a-zA-Z0-9.\-_]+$, so this is unreachable for conformant specs — flagging only because the previous behavior was documented as "decode then count" and this is a quiet semantic narrowing. Not worth fixing unless there's an appetite for an explicit guard.
packages/shared/src/utils/ref.ts
Claude Opus | 𝕏
| if (schema.required) { | ||
| if (irCompositionSchema.required) { | ||
| irCompositionSchema.required = [...irCompositionSchema.required, ...schema.required]; | ||
| irCompositionSchema.required.push(...schema.required); |
There was a problem hiding this comment.
irCompositionSchema.required may alias the source spec's array — parseType does irSchema.required = schema.required (no copy) at line 476, so for a $ref'd compositionSchema the IR's required points into the input document. The old [...irCompositionSchema.required, ...schema.required] returned a fresh array; this push mutates the source spec's required array, which can corrupt the input on re-parse or affect other references to the same component. Either keep the spread, or clone before pushing: irCompositionSchema.required = irCompositionSchema.required.concat(schema.required).
There was a problem hiding this comment.
Technically, this is OK, we don't use original required anywhere else but to turn them into irSchema.required. Also, the unit test says passed, so I expected this to be fine.
There was a problem hiding this comment.
Fair point on tests passing. The residual concern is the multi-reference case: resolveRef returns the live spec node, so when one #/components/schemas/Foo is $ref'd from multiple parent allOfs, the first parent's push(...schema.required) writes into Foo.required in the spec, and the second parent then reads (and pushes onto) a polluted array. If your benchmark spec doesn't hit that shape it won't show up in tests. Happy to defer — flagging only so you know what to look for if you see surprising IR shapes later.
Claude Opus | 𝕏
| if (schema.required) { | ||
| if (irCompositionSchema.required) { | ||
| irCompositionSchema.required = [...irCompositionSchema.required, ...schema.required]; | ||
| irCompositionSchema.required.push(...schema.required); |
There was a problem hiding this comment.
Same aliasing concern as the 3.1.x version: irCompositionSchema.required can be the same array reference as the underlying spec's schema.required (assigned without copy in parseType). The previous spread allocated fresh; this push mutates the input spec. Suggest irCompositionSchema.required = irCompositionSchema.required.concat(schema.required) instead.
There was a problem hiding this comment.
Technically, this is OK, we don't use original required anywhere else but to turn them into irSchema.required. Also, the unit test says passed, so I expected this to be fine.
There was a problem hiding this comment.
Same context as the 3.1.x thread — the residual concern is a single $ref'd component used from multiple allOf parents, where the first push pollutes the spec node read by the second. Not blocking; deferring to you since tests pass.
Claude Opus | 𝕏
| if (schema.required) { | ||
| if (irCompositionSchema.required) { | ||
| irCompositionSchema.required = [...irCompositionSchema.required, ...schema.required]; | ||
| irCompositionSchema.required.push(...schema.required); |
There was a problem hiding this comment.
Same aliasing concern: 2.0.x's parseType also does irSchema.required = schema.required without copying (see line 251 of this file), so irCompositionSchema.required can alias the source spec. This push will mutate the input document. .concat(...) would preserve the perf intent while keeping the array fresh.
There was a problem hiding this comment.
Technically, this is OK, we don't use original required anywhere else but to turn them into irSchema.required. Also, the unit test says passed, so I expected this to be fine.
There was a problem hiding this comment.
Same context as the 3.1.x thread — the residual concern is a single $ref'd component used from multiple allOf parents, where the first push pollutes the spec node read by the second. Not blocking; deferring to you since tests pass.
Claude Opus | 𝕏
| } | ||
| if (!inlineSchema.required.includes(discriminator.propertyName)) { | ||
| inlineSchema.required = [...inlineSchema.required, discriminator.propertyName]; | ||
| inlineSchema.required.push(discriminator.propertyName); |
There was a problem hiding this comment.
inlineSchema is the last object/properties-bearing entry in schemaItems, which can carry a required aliased to the source spec via parseType. The previous inlineSchema.required = [...inlineSchema.required, discriminator.propertyName] created a fresh array; this push mutates whatever inlineSchema.required currently points at, including potentially the input document. Use .concat([discriminator.propertyName]) if you want to keep allocation low without aliasing.
There was a problem hiding this comment.
The same, we don't use the original input document anywhere else but to turn it into inlineSchema including inlineSchema.required. Also, the unit test says passed, so I expected this to be fine.
Also concat copies array so can't skip the allocation here.
There was a problem hiding this comment.
You're right that concat allocates — my suggestion didn't save anything over the original spread. Withdrawing the concat fix. The residual concern (inline schema's required may alias the spec and get polluted when the same composition is referenced again) stands, but it's narrow enough that deferring to your judgement here makes sense.
Claude Opus | 𝕏
| } | ||
| if (!inlineSchema.required.includes(discriminator.propertyName)) { | ||
| inlineSchema.required = [...inlineSchema.required, discriminator.propertyName]; | ||
| inlineSchema.required.push(discriminator.propertyName); |
There was a problem hiding this comment.
Same aliasing concern as the 3.1.x discriminator branch — inlineSchema.required may reference the source spec's array. Suggest .concat([discriminator.propertyName]) to keep the perf gain without risking mutation of the input document.
There was a problem hiding this comment.
The same, we don't use the original input document anywhere else but to turn it into inlineSchema including inlineSchema.required. Also, the unit test says passed, so I expected this to be fine.
Also concat copies array so can't skip the allocation here.
There was a problem hiding this comment.
You're right that concat allocates — that suggestion didn't actually save anything over the original spread. Withdrawing it. Same residual aliasing concern as the 3.1.x discriminator thread, but narrow enough to defer.
Claude Opus | 𝕏
@hey-api/codegen-core
@hey-api/json-schema-ref-parser
@hey-api/nuxt
@hey-api/openapi-ts
@hey-api/shared
@hey-api/spec-types
@hey-api/types
@hey-api/vite-plugin
commit: |
|
@pullfrog respond to the comments.. |
|
Replied to all 5 review threads. Conceded
|

With the changes, the Stripe OpenAPI 3.1 spec generation time is reduced from 12 ~ 13s down to 10 ~ 11s, about 9% (more or less, more less than more more though) faster for Stripe spec.
Here are the optimizations:
Object.keys(obj).lengthto detect if an object is empty or not, especially when we have already iterated the object with aforloop, we already know if the object is empty or not.parseEnumscurrently usesfor...of..over an array's iterator protocol; however, the iterator protocol introduces a lot of overhead whileparseEnumsis on the hot path. Use a normal for loop on the array instead.isTopLevelComponentcheck will normalize the json pointer, then simply discard the parsed result, wasting result array object allocation. This is now avoided with string comparison (indexOf)obj = { ...obj, new }->Object.assign(obj, new).buildIndexKeySpacenow prefers recursive over top-level mergingproperties: avoid unused object allocations~, skip iterating and mappingdeletewhen possible:deletecan easily cause v8 to deopt an object. Use= undefinedwhen the type is allowed.getSchemaTypeandgetSchemaTypesno longer accept nested objects. It was never extended for the last 2 years. By passingschemadirectly, we also avoid one object allocation per invocation.