perf(shared/codegen-core): few small performance improvements#3917
perf(shared/codegen-core): few small performance improvements#3917SukkaW wants to merge 13 commits into
Conversation
|
|
|
@SukkaW is attempting to deploy a commit to the Hey API Team on Vercel. A member of the Team first needs to authorize it. |
🦋 Changeset detectedLatest commit: ec2bdba The changes in this PR will be included in the next version bump. This PR includes changesets to release 4 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3917 +/- ##
==========================================
+ Coverage 37.75% 37.76% +0.01%
==========================================
Files 582 582
Lines 20817 20853 +36
Branches 6064 6076 +12
==========================================
+ Hits 7859 7875 +16
- Misses 10543 10564 +21
+ Partials 2415 2414 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Important
Most of the optimizations look clean and the swap from spread-copy to Object.assign / in-place mutation is safe where the target is freshly constructed. However, replacing required = [...required, ...schema.required] with required.push(...schema.required) is not equivalent in this code path: irCompositionSchema.required (and inlineSchema.required in the discriminator branch) may already alias schema.required from the source spec via parseType's irSchema.required = schema.required, so the push mutates the user's input OpenAPI document. The previous spread allocated a fresh array and avoided this. Five inline comments below covering all three parser versions.
TL;DR — A batch of micro-optimizations across the shared parser (hoisted regexps, in-place mutation, fast paths, iterator-free loops, delete → = undefined) reportedly shaves ~9% off Stripe spec generation. The wins look solid, but a few of the spread-to-push rewrites cross from "harmless in-place mutation of fresh IR objects" into "mutation of arrays aliased from the input spec".
Key changes
- Hoist top-level IR pattern regexps — move the
Record<IrTopLevelKind, RegExp>table out ofmatchIrPointerToGroupso the regex literals aren't re-created on each call. - Switch object-emptiness checks to a running flag —
parseObjectnow tracksisSchemaPropertiesEmpty/isPatternPropertiesEmptyduring iteration instead of callingObject.keys(...).lengthafterward, and skips the inner allocation whenschema.properties/schema.patternPropertiesare absent. getSchemaType/getSchemaTypesaccept the schema directly — drop the unused destructuring wrapper at every call site across 2.0.x / 3.0.x / 3.1.x parsers.parseEnumuses an index loop and hoistsx-enum*lookups — avoids the iterator-protocol overhead ofentries()and the per-iteration property reads.- In-place merges over spread —
Object.assign(target, source)replacestarget = {...target, ...source}indeduplicateSchema,addItemsToSchema, and theparseArraycomposition-lift branches. delete→= undefined— forstate.inAllOfreset and theresult.items/result.logicalOperatorcleanup indeduplicateSchema, to keep V8 hidden-class shapes stable.buildIndexKeySpaceaccumulates into a sharedentriesarray — recursion passes the array down instead of spreading sub-results back up.- Fast path in
jsonPointerToPath— skip the.map(replaceAll)decode pass entirely when the pointer contains no~. - Fast path in
isTopLevelComponentfor string input — segment-count viaindexOfinstead of parsing the pointer into an array and discarding it. - Lazy
Setallocation incollectPointerDependencies— switchsubtreeDependencies/transitiveDependenciestoSet<string> | null, allocate only when there's something to add, and useMap.hasinstead ofMap.gettruthy-checks sonullresults cache correctly. - Move
patternPropertiesparsing beforeadditionalPropertiesbranch — lets theisPatternPropertiesEmptyflag participate in theisEmptyObjectInAllOfcheck.
Summary | 12 files | 11 commits | base: main ← more-performance-improvement
In-place push onto aliased required arrays
Before: spread allocated a fresh array, leaving the input spec untouched.
After:required.push(...)mutates whatever arrayirCompositionSchema.required(orinlineSchema.required) currently points at — which may be the source spec's array.
In parseType, irSchema.required = schema.required aliases the source schema's required array directly into the IR (no copy). When a $ref'd composition schema is later processed in parseAllOf and the parent schema.required is non-empty, the new irCompositionSchema.required.push(...schema.required) writes into the referenced spec node's required array. The same hazard applies to the discriminator inlineSchema.required.push(discriminator.propertyName) rewrites. The original spread ([...a, ...b]) avoided this by always producing a fresh array; the safe fix here is the same — either keep the spread on this hot path, or guard with a "clone if aliased" check.
packages/shared/src/openApi/3.1.x/parser/schema.ts · packages/shared/src/openApi/3.0.x/parser/schema.ts · packages/shared/src/openApi/2.0.x/parser/schema.ts
isTopLevelComponent string fast path and pointer encoding
Before: decoded the pointer through
jsonPointerToPathand checked segment count.
After: counts/characters directly in the raw string.
The fast path treats encoded slashes (~1) inside a component name as real segment separators, so e.g. a hypothetical #/components/schemas/foo~1bar (name containing a /) now returns false where the array path returned true. The OpenAPI spec restricts component-key names to ^[a-zA-Z0-9.\-_]+$, so this is unreachable for conformant specs — flagging only because the previous behavior was documented as "decode then count" and this is a quiet semantic narrowing. Not worth fixing unless there's an appetite for an explicit guard.
packages/shared/src/utils/ref.ts
Claude Opus | 𝕏
| if (schema.required) { | ||
| if (irCompositionSchema.required) { | ||
| irCompositionSchema.required = [...irCompositionSchema.required, ...schema.required]; | ||
| irCompositionSchema.required.push(...schema.required); |
There was a problem hiding this comment.
Same aliasing concern: 2.0.x's parseType also does irSchema.required = schema.required without copying (see line 251 of this file), so irCompositionSchema.required can alias the source spec. This push will mutate the input document. .concat(...) would preserve the perf intent while keeping the array fresh.
There was a problem hiding this comment.
Technically, this is OK, we don't use original required anywhere else but to turn them into irSchema.required. Also, the unit test says passed, so I expected this to be fine.
There was a problem hiding this comment.
Same context as the 3.1.x thread — the residual concern is a single $ref'd component used from multiple allOf parents, where the first push pollutes the spec node read by the second. Not blocking; deferring to you since tests pass.
Claude Opus | 𝕏
@hey-api/codegen-core
@hey-api/json-schema-ref-parser
@hey-api/nuxt
@hey-api/openapi-ts
@hey-api/shared
@hey-api/spec-types
@hey-api/types
@hey-api/vite-plugin
commit: |
|
@pullfrog respond to the comments.. |
|
Replied to all 5 review threads. Conceded
|

With the changes, the Stripe OpenAPI 3.1 spec generation time is reduced from 12 ~ 13s down to 10 ~ 11s, about 9% (more or less, more less than more more though) faster for Stripe spec.
Here are the optimizations:
Object.keys(obj).lengthto detect if an object is empty or not, especially when we have already iterated the object with aforloop, we already know if the object is empty or not.parseEnumscurrently usesfor...of..over an array's iterator protocol; however, the iterator protocol introduces a lot of overhead whileparseEnumsis on the hot path. Use a normal for loop on the array instead.isTopLevelComponentcheck will normalize the json pointer, then simply discard the parsed result, wasting result array object allocation. This is now avoided with string comparison (indexOf)obj = { ...obj, new }->Object.assign(obj, new).buildIndexKeySpacenow prefers recursive over top-level mergingproperties: avoid unused object allocations~, skip iterating and mappingdeletewhen possible:deletecan easily cause v8 to deopt an object. Use= undefinedwhen the type is allowed.getSchemaTypeandgetSchemaTypesno longer accept nested objects. It was never extended for the last 2 years. By passingschemadirectly, we also avoid one object allocation per invocation.