fix(SEC-2): replace input-sized C VLAs with heap allocation at three sites#226
fix(SEC-2): replace input-sized C VLAs with heap allocation at three sites#226kostub wants to merge 3 commits into
Conversation
…sites Three C variable-length arrays sized by attacker/render-controlled input could overflow the stack (UB / EXC_BAD_ACCESS) when given large inputs. Replace each with malloc/calloc + matching free, preserving all behavior: - +[MTMathAtomFactory mathListForCharacters:] — unichar buff[len] → malloc - changeFont() in MTTypesetter.m — unichar charBuffer[str.length] → malloc - -[MTTypesetter makeTable:] — CGFloat columnWidths[numColumns] → calloc (calloc zero-inits, replacing the memset; no helper signature changes needed) Add three SEC-2 regression tests in MTTypesetterTest.m verifying correctness for larger-than-trivial inputs at each of the three sites. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request replaces Variable Length Arrays (VLAs) with heap allocations (malloc and calloc) in MTMathAtomFactory.m and MTTypesetter.m to prevent stack overflows on large inputs, and adds corresponding regression tests. The reviewer feedback suggests optimizing these changes by implementing a hybrid stack/heap allocation pattern. Since these methods are frequently called with small inputs, unconditionally allocating on the heap introduces performance overhead. Using a small stack buffer for typical sizes and falling back to heap allocation only for exceptionally large inputs avoids this overhead while maintaining safety.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| for (int i = 0; i < str.length; ++i) { | ||
| NSUInteger length = str.length; | ||
| NSMutableString* retval = [NSMutableString stringWithCapacity:length]; | ||
| unichar *charBuffer = malloc(sizeof(unichar) * (size_t)length); |
There was a problem hiding this comment.
changeFont is a very hot path called frequently during typesetting (often for single-character nuclei like variables or numbers). Unconditionally allocating memory on the heap via malloc for every single call introduces a significant performance regression.
Implementing a hybrid stack/heap allocation pattern avoids this overhead for typical short strings (length <= 256) while still safely preventing stack overflows for exceptionally large inputs.
unichar stackBuffer[256];
unichar *charBuffer = stackBuffer;
if (length > 256) {
charBuffer = malloc(sizeof(unichar) * (size_t)length);
if (!charBuffer) {
return @"";
}
}| NSString* charStr = [[NSString alloc] initWithBytes:&unicode length:sizeof(unicode) encoding:NSUTF32LittleEndianStringEncoding]; | ||
| [retval appendString:charStr]; | ||
| } | ||
| free(charBuffer); |
There was a problem hiding this comment.
Addressed in db50120. Wrapped the alloc..use region in @try { ... } @finally { free(charBuffer); } so the buffer is released on all exit paths, including the IllegalCharacter / Invalid style exceptions that styleCharacter() can @throw from inside the loop. (Kept unconditional heap alloc rather than the stack/heap hybrid — the blocking concern here was the unwind-path leak.)
| NSParameterAssert(chars); | ||
| NSInteger len = chars.length; | ||
| unichar buff[len]; | ||
| unichar *buff = malloc(sizeof(unichar) * (size_t)len); |
There was a problem hiding this comment.
Unconditional heap allocation via malloc for every call to mathListForCharacters: introduces unnecessary performance overhead, especially for typical short math strings (which are usually well under 100 characters). Additionally, if malloc fails (returning NULL), passing NULL to getCharacters:range: will cause a crash.
Using a hybrid stack/heap allocation pattern (using a small stack buffer for common short lengths and only falling back to malloc for exceptionally large inputs) avoids heap allocation overhead in the vast majority of cases while safely handling large inputs and potential allocation failures.
unichar stackBuff[256];
unichar *buff = stackBuff;
if (len > 256) {
buff = malloc(sizeof(unichar) * (size_t)len);
if (!buff) {
return [[MTMathList alloc] init];
}
}| [list addAtom:atom]; | ||
| } | ||
| } | ||
| free(buff); |
| // Xcode Analyze "Assigned value is garbage or undefined". | ||
| // https://stackoverflow.com/questions/21191194/analyzer-warning-assigned-value-is-garbage-or-undefined | ||
| memset(columnWidths, 0, sizeof(columnWidths)); | ||
| CGFloat *columnWidths = calloc(numColumns, sizeof(CGFloat)); |
There was a problem hiding this comment.
Most math tables have a very small number of columns (typically under 10). Unconditionally allocating the columnWidths array on the heap via calloc introduces unnecessary overhead.
Using a hybrid stack/heap allocation pattern avoids heap allocation for typical tables while safely falling back to calloc for tables with a very large number of columns.
CGFloat stackColumnWidths[32];
CGFloat *columnWidths = stackColumnWidths;
if (numColumns > 32) {
columnWidths = calloc(numColumns, sizeof(CGFloat));
if (!columnWidths) {
return [[MTMathListDisplay alloc] initWithDisplays:[NSArray array] range:table.indexRange];
}
} else {
memset(stackColumnWidths, 0, sizeof(stackColumnWidths));
}| [self positionRows:rowDisplays forTable:table]; | ||
| MTMathListDisplay* tableDisplay = [[MTMathListDisplay alloc] initWithDisplays:rowDisplays range:table.indexRange]; | ||
| tableDisplay.position = _currentPosition; | ||
| free(columnWidths); |
There was a problem hiding this comment.
kostub
left a comment
There was a problem hiding this comment.
EM-REVIEW v1
SEC-2: VLA -> heap allocation review
Scope reviewed: the three converted sites and the three regression tests. The core change is sound -- heap allocation removes the unbounded-stack-growth risk from input-controlled sizes, and calloc zero-init correctly replaces the prior memset (all-zero bytes == 0.0 for CGFloat, so behavior is preserved). The tests are meaningful (they exercise 10k-char / 500-column inputs that would have stressed the old VLAs and assert correct output counts).
However, two of the three sites introduce a leak regression on exception unwind paths that did not exist with the VLA (VLA storage is auto-reclaimed when the stack unwinds; malloc/calloc is not).
Important -- free is skipped on @throw paths (Sites 2 and 3)
Site 2 changeFont() and Site 3 makeTable: only free on the normal return path. Both have reachable @throw paths between allocation and free:
changeFont->styleCharacter->getDefaultStylecan@throwIllegalCharacter(MTTypesetter.m:226) and the style switch can@throwInvalid style(MTTypesetter.m:419).makeTable:callstypesetCells:(andmakeRowWithColumns:) which callcreateLineForMathList:->changeFontfor cell contents, so the same exceptions propagate up throughmakeTable:, skippingfree(columnWidths).
With the VLA this was harmless; now an IllegalCharacter / Invalid style exception leaks charBuffer / columnWidths. Recommend wrapping the body in @try { ... } @finally { free(ptr); } (or freeing immediately before each rethrow). Site 1 (mathListForCharacters:) has no throw between malloc and free, so it is fine.
Minor -- no NULL check on malloc/calloc
None of the three sites check the allocation result. On failure, the subsequent getCharacters: (Sites 1/2) writes through a NULL pointer -> crash rather than graceful handling. Low practical risk at these sizes, but worth a guard given this is a hardening PR.
Minor / note -- unchecked size multiplication in the malloc sites
malloc(sizeof(unichar) * (size_t)len) (Sites 1/2) is not overflow-safe in principle, though not exploitable on 64-bit since NSString length is bounded well below the overflow threshold. Site 3 correctly uses calloc(numColumns, sizeof(CGFloat)), whose two-argument form is overflow-checked by spec -- the malloc sites do not get that protection. Acceptable as-is, noted for completeness.
Not approving
Holding approval pending the exception-path leak fix (Important).
…makeTable:) After replacing input-sized VLAs with malloc/calloc, the heap buffers were freed only on the normal return path. styleCharacter() can @throw IllegalCharacter / Invalid style from inside the loop, and those exceptions propagate up through changeFont() and makeTable: (via typesetCells: -> createLineForMathList: -> changeFont), skipping free() and leaking the buffer. VLAs were auto-reclaimed on unwind; malloc/calloc are not. Wrap the alloc..use region in @Try { ... } @finally { free(buf); } at: - changeFont() (charBuffer) - makeTable: (columnWidths) Site 1 (mathListForCharacters:) has no @throw between malloc and free, so it keeps the straight-line free. Add NULL-check assertions at all three sites. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Addressed the blocking exception-path leak in db50120. Important (blocking) —
Site 1 ( Minor — NULL check: Added Minor — unchecked size multiplication: The two Re: the gemini-code-assist suggestion of a stack/heap hybrid — that's a perf optimization orthogonal to the correctness/leak fix; not adopting it here to keep this change focused on the security hardening + leak.
|
…timization changeFont() is called hundreds of times per equation render with length==1 in the common case. The previous SEC-2 fix allocated a heap buffer unconditionally (malloc/free per call). Replace with a 256-element fixed-size stack buffer for the common small case, falling back to heap only when length > 256. The SEC-2 security fix is preserved: the stack buffer is a FIXED 256 unichars (not input-sized), so attacker-controlled input above the threshold still goes to the heap. The original unbounded VLA hole remains closed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Fixes SEC-2 from issues.md: three C variable-length arrays (VLAs) sized by
attacker/render-controlled input could silently overflow the thread stack —
undefined behavior that manifests as a hard
EXC_BAD_ACCESS/SIGSEGV crashwith no recoverable
NSException.Changed sites (2 files):
iosMath/lib/MTMathAtomFactory.m—+[MTMathAtomFactory mathListForCharacters:]:unichar buff[len]→unichar *buff = malloc(sizeof(unichar) * len)+free(buff).lenis public-API-caller-controlled; a multi-megabyte string put 2×N bytes on the stack.iosMath/render/internal/MTTypesetter.m—changeFont():unichar charBuffer[str.length]→malloc+free.strisatom.nucleus; fused atom runs or a single large-nucleus atom make it input-sized.iosMath/render/internal/MTTypesetter.m—-[MTTypesetter makeTable:]:CGFloat columnWidths[numColumns]→CGFloat *columnWidths = calloc(numColumns, sizeof(CGFloat))+free(columnWidths).calloczero-inits, replacing the previousmemset. No helper method signature changes.Behavior: Identical. No API, header, or method signature changes.
Tests
Three new regression tests added to
MTTypesetterTest:testMathListForCharactersLargeInput_SEC2— 10k-character digit string returns 10k atomstestChangeFontLargeNucleus_SEC2— 10k-char variable atom renders to a non-nil displaytestMathTableManyColumns_SEC2— 500-column table typesets without crashAll existing tests (
MTFontManagerTest,MTMathAtomTest,MTMathListBuilderTest,MTMathListRangeTest,MTMathListTest,MTTypesetterTest) continue to pass.Test plan
swift build— clean build, no warningsxcodebuild testiOS scheme — all tests passedmalloc/callochas a matchingfreeon all return paths🤖 Generated with Claude Code