Skip to content

fix(SEC-3): make global symbol/lookup tables thread-safe#228

Open
kostub wants to merge 2 commits into
masterfrom
em/2026-06-11-issues/t3
Open

fix(SEC-3): make global symbol/lookup tables thread-safe#228
kostub wants to merge 2 commits into
masterfrom
em/2026-06-11-issues/t3

Conversation

@kostub

@kostub kostub commented Jun 12, 2026

Copy link
Copy Markdown
Owner

Summary

Fixes SEC-3: global symbol/lookup tables in iosMath are not thread-safe, causing crashes when LaTeX is parsed concurrently (common pattern: pre-rendering math off the main thread).

Three independent, additive changes — no public API or behavior change:

  • 17 lazy-init conversions (if (!table) { … }dispatch_once): 13 sites in MTMathAtomFactory.m (all read-only factory tables including stackCommands, stackCommandReverseTable), 4 sites in MTMathListBuilder.m (singleCharCommands, fractionCommands, spaceToCommands, styleToCommands). Brings stragglers in line with the file's own existing dispatch_once convention.

  • Symbol table locking (os_unfair_lock): guards the two genuinely-mutable tables (commands / textToCommands) at all four touch-points — writer (+addLatexSymbol:value:) and three readers (+atomForLatexSymbolName:, +latexSymbolNameForAtom:, +supportedLatexSymbolNames). Read-only factory tables receive no lock (none needed after dispatch_once).

  • Copy-on-write in +addLatexSymbol:value:: stores [atom copy] instead of the caller's atom directly — symmetric with the existing [atom copy] on the read side (already present with comment "atoms are mutable"). Caller mutations after insertion no longer silently corrupt the global table.

  • Font cache lock (os_unfair_lock ivar on MTFontManager): guards the check-then-act read-modify-write on nameToFontMap in -fontWithName:size:.

Test plan

  • New MTConcurrencyTest.m with 6 tests: concurrent first-touch of all tables, concurrent read + addLatexSymbol writes, post-add resolvability, copy-on-write regression, concurrent font cache access, concurrent parsing — all pass
  • All 298 existing tests pass unchanged (swift test)
  • swift build clean

🤖 Generated with Claude Code

Convert all 17 lazy `if (!table)` init sites to `dispatch_once` (13 in
MTMathAtomFactory.m, 4 in MTMathListBuilder.m).

Guard the two genuinely-mutable symbol tables (`commands` and
`textToCommands`) with `os_unfair_lock` in all four touch-points:
+addLatexSymbol:value: (write), +atomForLatexSymbolName: (read),
+latexSymbolNameForAtom: (read), +supportedLatexSymbolNames (read).

Copy the atom on write in +addLatexSymbol:value: — symmetric with the
existing `[atom copy]` on the read side.

Guard MTFontManager -fontWithName:size:'s check-then-act on
nameToFontMap with a per-instance `os_unfair_lock`.

Add MTConcurrencyTest with 6 tests covering: concurrent first-touch,
concurrent read+write, post-add resolvability, copy-on-write regression,
concurrent font cache access, and concurrent parsing. All 298 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces thread-safety improvements across the iosMath library by replacing unsafe lazy-initialization checks with dispatch_once blocks and adding os_unfair_lock guards to mutable symbol tables and the font cache. It also adds a comprehensive concurrency test suite to verify these changes under stress. The review feedback highlights two important concurrency issues: first, a potential data race in MTMathAtomFactory.m where a mutable dictionary is read outside of a lock, which can be resolved by copying the dictionary under the lock; second, a performance bottleneck in MTFontManager.m caused by holding a lock during disk I/O, which can be optimized using a double-checked locking pattern.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread iosMath/lib/MTMathAtomFactory.m Outdated
Comment on lines +191 to +196
// textToLatexSymbolNames is a genuinely-mutable dict (addLatexSymbol: writes it), so guard.
NSMutableDictionary* dict = [MTMathAtomFactory textToLatexSymbolNames];
NSDictionary<NSNumber*, NSString*>* inner = nil;
os_unfair_lock_lock(&gSymbolTableLock);
inner = dict[atom.nucleus];
os_unfair_lock_unlock(&gSymbolTableLock);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Although the outer dictionary dict is accessed under the lock, the retrieved inner dictionary is a mutable NSMutableDictionary that can be concurrently mutated in +addLatexSymbol:value:. Reading from inner outside the lock (on lines 200 and 208) creates a data race, which can lead to crashes or undefined behavior when LaTeX is parsed concurrently.

To resolve this safely without holding the lock during the entire lookup, you can copy the inner dictionary under the lock to obtain an immutable snapshot. Additionally, ensure you guard against nil keys (such as atom.nucleus) before accessing the dictionary to prevent NSInvalidArgumentException crashes.

    NSMutableDictionary* dict = [MTMathAtomFactory textToLatexSymbolNames];
    NSDictionary<NSNumber*, NSString*>* inner = nil;
    if (atom.nucleus != nil) {
        os_unfair_lock_lock(&gSymbolTableLock);
        inner = [dict[atom.nucleus] copy];
        os_unfair_lock_unlock(&gSymbolTableLock);
    }
References
  1. Always guard against nil keys before accessing or modifying an NSMutableDictionary (or NSDictionary) in Objective-C to prevent NSInvalidArgumentException crashes.

Comment on lines +62 to +68
os_unfair_lock_lock(&_cacheLock);
MTFont* f = self.nameToFontMap[name];
if (!f) {
f = [[MTFont alloc] initFontWithName:name size:size];
if (!f) { return nil; } // unknown/unloadable font — do not cache
self.nameToFontMap[name] = f;
if (f) { self.nameToFontMap[name] = f; }
}
os_unfair_lock_unlock(&_cacheLock);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Calling [[MTFont alloc] initFontWithName:name size:size] while holding _cacheLock is problematic because it performs disk I/O (loading .otf and .plist files). os_unfair_lock is a low-level, spin-like lock designed for extremely short critical sections. Holding it during disk I/O can cause thread starvation, priority inversion, and severe performance bottlenecks under high concurrency.

To optimize this, use a double-checked locking pattern where the expensive font initialization is performed outside the lock, and the lock is re-acquired only to insert the newly created font if it is still missing from the cache. Also, ensure you guard against nil keys (such as name) before accessing or modifying self.nameToFontMap to prevent NSInvalidArgumentException crashes.

Suggested change
os_unfair_lock_lock(&_cacheLock);
MTFont* f = self.nameToFontMap[name];
if (!f) {
f = [[MTFont alloc] initFontWithName:name size:size];
if (!f) { return nil; } // unknown/unloadable font — do not cache
self.nameToFontMap[name] = f;
if (f) { self.nameToFontMap[name] = f; }
}
os_unfair_lock_unlock(&_cacheLock);
if (name == nil) {
return nil;
}
os_unfair_lock_lock(&_cacheLock);
MTFont* f = self.nameToFontMap[name];
os_unfair_lock_unlock(&_cacheLock);
if (!f) {
MTFont* newFont = [[MTFont alloc] initFontWithName:name size:size];
if (newFont) {
os_unfair_lock_lock(&_cacheLock);
f = self.nameToFontMap[name];
if (!f) {
self.nameToFontMap[name] = newFont;
f = newFont;
}
os_unfair_lock_unlock(&_cacheLock);
}
}
References
  1. Always guard against nil keys before accessing or modifying an NSMutableDictionary (or NSDictionary) in Objective-C to prevent NSInvalidArgumentException crashes.

Comment thread iosMath/lib/MTMathAtomFactory.m Outdated
NSMutableDictionary* dict = [MTMathAtomFactory textToLatexSymbolNames];
NSDictionary<NSNumber*, NSString*>* inner = nil;
os_unfair_lock_lock(&gSymbolTableLock);
inner = dict[atom.nucleus];

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Important — incomplete critical section / residual data race]

latexSymbolNameForAtom: releases gSymbolTableLock immediately after fetching the inner reference (inner = dict[atom.nucleus]), but then reads inner[@(atom.type)] (and the Bin-fallback inner[@(kMTMathAtomBinaryOperator)]) outside the lock.

inner is an NSMutableDictionary that the writer in +addLatexSymbol:value: mutates in place while holding the lock:

NSMutableDictionary* inner = dict[atom.nucleus];  // existing inner reused
...
inner[typeKey] = name;   // in-place mutation, under lock

So when an existing nucleus is updated by a concurrent writer, a reader holding the same inner reference reads it after unlocking — a NSMutableDictionary read concurrent with a write, i.e. the exact UB this PR is meant to close. The lock only protects the outer dict[...] lookup, not the inner dict[...] read.

Fix: keep the lock held until after name = inner[@(atom.type)] (and the fallback) are read, e.g. resolve name to a local inside the locked region and unlock just before returning. The other two readers (atomForLatexSymbolName:, supportedLatexSymbolNames) read directly under the lock and are fine; only this one unlocks too early.

Note: testConcurrentAddAndLookupSymbol’s writers insert nucleus "X" while the reader queries nucleus "→", so they touch different inner dicts and this race is never exercised. A regression test that has writers and the reverse-lookup reader share the same nucleus would expose it under TSan.

- (nullable MTFont *)fontWithName:(NSString *)name size:(CGFloat)size
{
if (!name) { return nil; } // nil name cannot key the cache dictionary
os_unfair_lock_lock(&_cacheLock);

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Minor — held lock during font I/O]

The [[MTFont alloc] initFontWithName:size:] (which loads a font file from disk and parses the OpenType MATH table plist) now runs while holding _cacheLock. This serializes concurrent first-loads of different font names behind one global lock. It is correct (no deadlock — MTFont init does not call back into MTFontManager) and only happens once per name, so it is acceptable. Flagging only in case you want a per-name or double-checked-locking scheme later; not blocking. The lock/unlock balance and the moved if(!f) return nil (now after unlock) are correct and behavior-preserving.

…ForAtom:

+latexSymbolNameForAtom: unlocked immediately after fetching the per-nucleus
`inner` reference, then read inner[@(atom.type)] (and the Bin-type fallback)
outside the lock. The writer +addLatexSymbol:value: mutates that same inner
NSMutableDictionary in place (inner[typeKey] = name) under the lock, so the
post-unlock reads were a read/write data race on the same dictionary.

Extend the critical section to cover the full read: resolve the name to a
local under the lock, copy it out, then unlock and return.
atomForLatexSymbolName: and supportedLatexSymbolNames already read fully under
the lock and are unchanged.

Strengthen testConcurrentAddAndLookupSymbol so writers and the reverse-lookup
reader share one nucleus + type, hammering the same inner-dict cell. Verified
under ThreadSanitizer: the pre-fix code flags a race on NSMutableDictionary
between addLatexSymbol: and latexSymbolNameForAtom:; the fix is clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@kostub

kostub commented Jun 12, 2026

Copy link
Copy Markdown
Owner Author

Addressed the blocking review on the symbol-table thread-safety fix.

Issue: +latexSymbolNameForAtom: released gSymbolTableLock immediately after fetching the per-nucleus inner reference, then read inner[@(atom.type)] (and the Bin-type fallback) outside the lock. +addLatexSymbol:value: mutates that same inner NSMutableDictionary in place under the lock, so the post-unlock reads were a read/write data race on the same dictionary.

Fix (d86d1bf):

  • Extended the critical section in +latexSymbolNameForAtom: to cover the full read — both the outer dict[atom.nucleus] lookup and the inner[...] lookups including the Bin fallback. The resolved name is [name copy]-d to a local before unlocking so no reference into the mutable container survives past the unlock.
  • Confirmed +atomForLatexSymbolName: and +supportedLatexSymbolNames already read their shared mutable state fully inside the lock (unchanged).
  • Strengthened testConcurrentAddAndLookupSymbol so writers and the reverse-lookup reader share one nucleus + type (kMTMathAtomRelation), hammering the same inner-dict cell. The original test queried a different nucleus () than the writers touched, so it could never collide and couldn't catch the race. Verified under ThreadSanitizer: the pre-fix code flags a race between addLatexSymbol: and latexSymbolNameForAtom:; the fix is clean.

swift build and swift test are green (298 tests, 0 failures).

@kostub

kostub commented Jun 12, 2026

Copy link
Copy Markdown
Owner Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves thread safety across the library by replacing unsafe lazy-initialization checks with dispatch_once and introducing os_unfair_lock to protect mutable symbol tables and the font cache. It also adds a comprehensive concurrency test suite to verify these changes. The review feedback highlights two key improvements: optimizing the font cache locking mechanism to perform expensive font initialization outside the critical section to avoid thread starvation, and adding runtime nil guards in addLatexSymbol:value: to prevent crashes in release builds where assertions are disabled.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +62 to +69
os_unfair_lock_lock(&_cacheLock);
MTFont* f = self.nameToFontMap[name];
if (!f) {
f = [[MTFont alloc] initFontWithName:name size:size];
if (!f) { return nil; } // unknown/unloadable font — do not cache
self.nameToFontMap[name] = f;
if (f) { self.nameToFontMap[name] = f; }
}
os_unfair_lock_unlock(&_cacheLock);
if (!f) { return nil; } // unknown/unloadable font — do not cache

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Holding os_unfair_lock while performing expensive disk I/O and plist parsing inside [[MTFont alloc] initFontWithName:name size:size] can cause thread starvation and priority inversion under high concurrency. Since os_unfair_lock is designed for extremely short critical sections, we should perform the font initialization outside the lock, and use a double-checked locking pattern to safely insert the newly created font into the cache. Additionally, we must guard against nil keys before accessing or modifying NSMutableDictionary to prevent crashes.

Suggested change
os_unfair_lock_lock(&_cacheLock);
MTFont* f = self.nameToFontMap[name];
if (!f) {
f = [[MTFont alloc] initFontWithName:name size:size];
if (!f) { return nil; } // unknown/unloadable font — do not cache
self.nameToFontMap[name] = f;
if (f) { self.nameToFontMap[name] = f; }
}
os_unfair_lock_unlock(&_cacheLock);
if (!f) { return nil; } // unknown/unloadable font — do not cache
if (!name) {
return nil;
}
os_unfair_lock_lock(&_cacheLock);
MTFont* f = self.nameToFontMap[name];
os_unfair_lock_unlock(&_cacheLock);
if (!f) {
f = [[MTFont alloc] initFontWithName:name size:size];
if (!f) { return nil; }
os_unfair_lock_lock(&_cacheLock);
MTFont* existing = self.nameToFontMap[name];
if (!existing) {
self.nameToFontMap[name] = f;
} else {
f = existing;
}
os_unfair_lock_unlock(&_cacheLock);
}
References
  1. Always guard against nil keys before accessing or modifying an NSMutableDictionary (or NSDictionary) in Objective-C to prevent NSInvalidArgumentException crashes.

Comment on lines 217 to +226
NSParameterAssert(name);
NSParameterAssert(atom);
// Ensure both tables are initialized before taking the lock (dispatch_once
// inside each accessor guarantees a single init; no deadlock since the tokens
// are method-local and the lock is not held during the once-block).
NSMutableDictionary<NSString*, MTMathAtom*>* commands = [self supportedLatexSymbols];
commands[name] = atom;
NSMutableDictionary<NSString*, NSMutableDictionary<NSNumber*, NSString*>*>* dict = [self textToLatexSymbolNames];

os_unfair_lock_lock(&gSymbolTableLock);
commands[name] = [atom copy]; // copy on write — symmetric with the read side

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In Objective-C, assertions like NSParameterAssert are disabled in release builds. If name or atom is nil at runtime in a release build, attempting to set a value in commands or copying atom will cause an NSInvalidArgumentException crash. Always guard against nil keys and values before modifying an NSMutableDictionary.

    if (!name || !atom) {
        return;
    }
    NSMutableDictionary<NSString*, MTMathAtom*>* commands = [self supportedLatexSymbols];
    NSMutableDictionary<NSString*, NSMutableDictionary<NSNumber*, NSString*>*>* dict = [self textToLatexSymbolNames];

    os_unfair_lock_lock(&gSymbolTableLock);
    commands[name] = [atom copy];
References
  1. Always guard against nil keys before accessing or modifying an NSMutableDictionary (or NSDictionary) in Objective-C to prevent NSInvalidArgumentException crashes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant