fix(SEC-3): make global symbol/lookup tables thread-safe#228
Conversation
Convert all 17 lazy `if (!table)` init sites to `dispatch_once` (13 in MTMathAtomFactory.m, 4 in MTMathListBuilder.m). Guard the two genuinely-mutable symbol tables (`commands` and `textToCommands`) with `os_unfair_lock` in all four touch-points: +addLatexSymbol:value: (write), +atomForLatexSymbolName: (read), +latexSymbolNameForAtom: (read), +supportedLatexSymbolNames (read). Copy the atom on write in +addLatexSymbol:value: — symmetric with the existing `[atom copy]` on the read side. Guard MTFontManager -fontWithName:size:'s check-then-act on nameToFontMap with a per-instance `os_unfair_lock`. Add MTConcurrencyTest with 6 tests covering: concurrent first-touch, concurrent read+write, post-add resolvability, copy-on-write regression, concurrent font cache access, and concurrent parsing. All 298 tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces thread-safety improvements across the iosMath library by replacing unsafe lazy-initialization checks with dispatch_once blocks and adding os_unfair_lock guards to mutable symbol tables and the font cache. It also adds a comprehensive concurrency test suite to verify these changes under stress. The review feedback highlights two important concurrency issues: first, a potential data race in MTMathAtomFactory.m where a mutable dictionary is read outside of a lock, which can be resolved by copying the dictionary under the lock; second, a performance bottleneck in MTFontManager.m caused by holding a lock during disk I/O, which can be optimized using a double-checked locking pattern.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| // textToLatexSymbolNames is a genuinely-mutable dict (addLatexSymbol: writes it), so guard. | ||
| NSMutableDictionary* dict = [MTMathAtomFactory textToLatexSymbolNames]; | ||
| NSDictionary<NSNumber*, NSString*>* inner = nil; | ||
| os_unfair_lock_lock(&gSymbolTableLock); | ||
| inner = dict[atom.nucleus]; | ||
| os_unfair_lock_unlock(&gSymbolTableLock); |
There was a problem hiding this comment.
Although the outer dictionary dict is accessed under the lock, the retrieved inner dictionary is a mutable NSMutableDictionary that can be concurrently mutated in +addLatexSymbol:value:. Reading from inner outside the lock (on lines 200 and 208) creates a data race, which can lead to crashes or undefined behavior when LaTeX is parsed concurrently.
To resolve this safely without holding the lock during the entire lookup, you can copy the inner dictionary under the lock to obtain an immutable snapshot. Additionally, ensure you guard against nil keys (such as atom.nucleus) before accessing the dictionary to prevent NSInvalidArgumentException crashes.
NSMutableDictionary* dict = [MTMathAtomFactory textToLatexSymbolNames];
NSDictionary<NSNumber*, NSString*>* inner = nil;
if (atom.nucleus != nil) {
os_unfair_lock_lock(&gSymbolTableLock);
inner = [dict[atom.nucleus] copy];
os_unfair_lock_unlock(&gSymbolTableLock);
}References
- Always guard against
nilkeys before accessing or modifying anNSMutableDictionary(orNSDictionary) in Objective-C to preventNSInvalidArgumentExceptioncrashes.
| os_unfair_lock_lock(&_cacheLock); | ||
| MTFont* f = self.nameToFontMap[name]; | ||
| if (!f) { | ||
| f = [[MTFont alloc] initFontWithName:name size:size]; | ||
| if (!f) { return nil; } // unknown/unloadable font — do not cache | ||
| self.nameToFontMap[name] = f; | ||
| if (f) { self.nameToFontMap[name] = f; } | ||
| } | ||
| os_unfair_lock_unlock(&_cacheLock); |
There was a problem hiding this comment.
Calling [[MTFont alloc] initFontWithName:name size:size] while holding _cacheLock is problematic because it performs disk I/O (loading .otf and .plist files). os_unfair_lock is a low-level, spin-like lock designed for extremely short critical sections. Holding it during disk I/O can cause thread starvation, priority inversion, and severe performance bottlenecks under high concurrency.
To optimize this, use a double-checked locking pattern where the expensive font initialization is performed outside the lock, and the lock is re-acquired only to insert the newly created font if it is still missing from the cache. Also, ensure you guard against nil keys (such as name) before accessing or modifying self.nameToFontMap to prevent NSInvalidArgumentException crashes.
| os_unfair_lock_lock(&_cacheLock); | |
| MTFont* f = self.nameToFontMap[name]; | |
| if (!f) { | |
| f = [[MTFont alloc] initFontWithName:name size:size]; | |
| if (!f) { return nil; } // unknown/unloadable font — do not cache | |
| self.nameToFontMap[name] = f; | |
| if (f) { self.nameToFontMap[name] = f; } | |
| } | |
| os_unfair_lock_unlock(&_cacheLock); | |
| if (name == nil) { | |
| return nil; | |
| } | |
| os_unfair_lock_lock(&_cacheLock); | |
| MTFont* f = self.nameToFontMap[name]; | |
| os_unfair_lock_unlock(&_cacheLock); | |
| if (!f) { | |
| MTFont* newFont = [[MTFont alloc] initFontWithName:name size:size]; | |
| if (newFont) { | |
| os_unfair_lock_lock(&_cacheLock); | |
| f = self.nameToFontMap[name]; | |
| if (!f) { | |
| self.nameToFontMap[name] = newFont; | |
| f = newFont; | |
| } | |
| os_unfair_lock_unlock(&_cacheLock); | |
| } | |
| } |
References
- Always guard against
nilkeys before accessing or modifying anNSMutableDictionary(orNSDictionary) in Objective-C to preventNSInvalidArgumentExceptioncrashes.
| NSMutableDictionary* dict = [MTMathAtomFactory textToLatexSymbolNames]; | ||
| NSDictionary<NSNumber*, NSString*>* inner = nil; | ||
| os_unfair_lock_lock(&gSymbolTableLock); | ||
| inner = dict[atom.nucleus]; |
There was a problem hiding this comment.
[Important — incomplete critical section / residual data race]
latexSymbolNameForAtom: releases gSymbolTableLock immediately after fetching the inner reference (inner = dict[atom.nucleus]), but then reads inner[@(atom.type)] (and the Bin-fallback inner[@(kMTMathAtomBinaryOperator)]) outside the lock.
inner is an NSMutableDictionary that the writer in +addLatexSymbol:value: mutates in place while holding the lock:
NSMutableDictionary* inner = dict[atom.nucleus]; // existing inner reused
...
inner[typeKey] = name; // in-place mutation, under lockSo when an existing nucleus is updated by a concurrent writer, a reader holding the same inner reference reads it after unlocking — a NSMutableDictionary read concurrent with a write, i.e. the exact UB this PR is meant to close. The lock only protects the outer dict[...] lookup, not the inner dict[...] read.
Fix: keep the lock held until after name = inner[@(atom.type)] (and the fallback) are read, e.g. resolve name to a local inside the locked region and unlock just before returning. The other two readers (atomForLatexSymbolName:, supportedLatexSymbolNames) read directly under the lock and are fine; only this one unlocks too early.
Note: testConcurrentAddAndLookupSymbol’s writers insert nucleus "X" while the reader queries nucleus "→", so they touch different inner dicts and this race is never exercised. A regression test that has writers and the reverse-lookup reader share the same nucleus would expose it under TSan.
| - (nullable MTFont *)fontWithName:(NSString *)name size:(CGFloat)size | ||
| { | ||
| if (!name) { return nil; } // nil name cannot key the cache dictionary | ||
| os_unfair_lock_lock(&_cacheLock); |
There was a problem hiding this comment.
[Minor — held lock during font I/O]
The [[MTFont alloc] initFontWithName:size:] (which loads a font file from disk and parses the OpenType MATH table plist) now runs while holding _cacheLock. This serializes concurrent first-loads of different font names behind one global lock. It is correct (no deadlock — MTFont init does not call back into MTFontManager) and only happens once per name, so it is acceptable. Flagging only in case you want a per-name or double-checked-locking scheme later; not blocking. The lock/unlock balance and the moved if(!f) return nil (now after unlock) are correct and behavior-preserving.
…ForAtom: +latexSymbolNameForAtom: unlocked immediately after fetching the per-nucleus `inner` reference, then read inner[@(atom.type)] (and the Bin-type fallback) outside the lock. The writer +addLatexSymbol:value: mutates that same inner NSMutableDictionary in place (inner[typeKey] = name) under the lock, so the post-unlock reads were a read/write data race on the same dictionary. Extend the critical section to cover the full read: resolve the name to a local under the lock, copy it out, then unlock and return. atomForLatexSymbolName: and supportedLatexSymbolNames already read fully under the lock and are unchanged. Strengthen testConcurrentAddAndLookupSymbol so writers and the reverse-lookup reader share one nucleus + type, hammering the same inner-dict cell. Verified under ThreadSanitizer: the pre-fix code flags a race on NSMutableDictionary between addLatexSymbol: and latexSymbolNameForAtom:; the fix is clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Addressed the blocking review on the symbol-table thread-safety fix. Issue: Fix (d86d1bf):
|
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request improves thread safety across the library by replacing unsafe lazy-initialization checks with dispatch_once and introducing os_unfair_lock to protect mutable symbol tables and the font cache. It also adds a comprehensive concurrency test suite to verify these changes. The review feedback highlights two key improvements: optimizing the font cache locking mechanism to perform expensive font initialization outside the critical section to avoid thread starvation, and adding runtime nil guards in addLatexSymbol:value: to prevent crashes in release builds where assertions are disabled.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| os_unfair_lock_lock(&_cacheLock); | ||
| MTFont* f = self.nameToFontMap[name]; | ||
| if (!f) { | ||
| f = [[MTFont alloc] initFontWithName:name size:size]; | ||
| if (!f) { return nil; } // unknown/unloadable font — do not cache | ||
| self.nameToFontMap[name] = f; | ||
| if (f) { self.nameToFontMap[name] = f; } | ||
| } | ||
| os_unfair_lock_unlock(&_cacheLock); | ||
| if (!f) { return nil; } // unknown/unloadable font — do not cache |
There was a problem hiding this comment.
Holding os_unfair_lock while performing expensive disk I/O and plist parsing inside [[MTFont alloc] initFontWithName:name size:size] can cause thread starvation and priority inversion under high concurrency. Since os_unfair_lock is designed for extremely short critical sections, we should perform the font initialization outside the lock, and use a double-checked locking pattern to safely insert the newly created font into the cache. Additionally, we must guard against nil keys before accessing or modifying NSMutableDictionary to prevent crashes.
| os_unfair_lock_lock(&_cacheLock); | |
| MTFont* f = self.nameToFontMap[name]; | |
| if (!f) { | |
| f = [[MTFont alloc] initFontWithName:name size:size]; | |
| if (!f) { return nil; } // unknown/unloadable font — do not cache | |
| self.nameToFontMap[name] = f; | |
| if (f) { self.nameToFontMap[name] = f; } | |
| } | |
| os_unfair_lock_unlock(&_cacheLock); | |
| if (!f) { return nil; } // unknown/unloadable font — do not cache | |
| if (!name) { | |
| return nil; | |
| } | |
| os_unfair_lock_lock(&_cacheLock); | |
| MTFont* f = self.nameToFontMap[name]; | |
| os_unfair_lock_unlock(&_cacheLock); | |
| if (!f) { | |
| f = [[MTFont alloc] initFontWithName:name size:size]; | |
| if (!f) { return nil; } | |
| os_unfair_lock_lock(&_cacheLock); | |
| MTFont* existing = self.nameToFontMap[name]; | |
| if (!existing) { | |
| self.nameToFontMap[name] = f; | |
| } else { | |
| f = existing; | |
| } | |
| os_unfair_lock_unlock(&_cacheLock); | |
| } |
References
- Always guard against nil keys before accessing or modifying an NSMutableDictionary (or NSDictionary) in Objective-C to prevent NSInvalidArgumentException crashes.
| NSParameterAssert(name); | ||
| NSParameterAssert(atom); | ||
| // Ensure both tables are initialized before taking the lock (dispatch_once | ||
| // inside each accessor guarantees a single init; no deadlock since the tokens | ||
| // are method-local and the lock is not held during the once-block). | ||
| NSMutableDictionary<NSString*, MTMathAtom*>* commands = [self supportedLatexSymbols]; | ||
| commands[name] = atom; | ||
| NSMutableDictionary<NSString*, NSMutableDictionary<NSNumber*, NSString*>*>* dict = [self textToLatexSymbolNames]; | ||
|
|
||
| os_unfair_lock_lock(&gSymbolTableLock); | ||
| commands[name] = [atom copy]; // copy on write — symmetric with the read side |
There was a problem hiding this comment.
In Objective-C, assertions like NSParameterAssert are disabled in release builds. If name or atom is nil at runtime in a release build, attempting to set a value in commands or copying atom will cause an NSInvalidArgumentException crash. Always guard against nil keys and values before modifying an NSMutableDictionary.
if (!name || !atom) {
return;
}
NSMutableDictionary<NSString*, MTMathAtom*>* commands = [self supportedLatexSymbols];
NSMutableDictionary<NSString*, NSMutableDictionary<NSNumber*, NSString*>*>* dict = [self textToLatexSymbolNames];
os_unfair_lock_lock(&gSymbolTableLock);
commands[name] = [atom copy];References
- Always guard against nil keys before accessing or modifying an NSMutableDictionary (or NSDictionary) in Objective-C to prevent NSInvalidArgumentException crashes.
Summary
Fixes SEC-3: global symbol/lookup tables in iosMath are not thread-safe, causing crashes when LaTeX is parsed concurrently (common pattern: pre-rendering math off the main thread).
Three independent, additive changes — no public API or behavior change:
17 lazy-init conversions (
if (!table) { … }→dispatch_once): 13 sites inMTMathAtomFactory.m(all read-only factory tables includingstackCommands,stackCommandReverseTable), 4 sites inMTMathListBuilder.m(singleCharCommands,fractionCommands,spaceToCommands,styleToCommands). Brings stragglers in line with the file's own existingdispatch_onceconvention.Symbol table locking (
os_unfair_lock): guards the two genuinely-mutable tables (commands/textToCommands) at all four touch-points — writer (+addLatexSymbol:value:) and three readers (+atomForLatexSymbolName:,+latexSymbolNameForAtom:,+supportedLatexSymbolNames). Read-only factory tables receive no lock (none needed afterdispatch_once).Copy-on-write in
+addLatexSymbol:value:: stores[atom copy]instead of the caller's atom directly — symmetric with the existing[atom copy]on the read side (already present with comment "atoms are mutable"). Caller mutations after insertion no longer silently corrupt the global table.Font cache lock (
os_unfair_lockivar onMTFontManager): guards the check-then-act read-modify-write onnameToFontMapin-fontWithName:size:.Test plan
MTConcurrencyTest.mwith 6 tests: concurrent first-touch of all tables, concurrent read +addLatexSymbolwrites, post-add resolvability, copy-on-write regression, concurrent font cache access, concurrent parsing — all passswift test)swift buildclean🤖 Generated with Claude Code