[Torch] Lora kernels compilation by daniil-lyakhov · Pull Request #4100 · openvinotoolkit/nncf

daniil-lyakhov · 2026-06-15T13:07:27Z

asymmetric_quantize_lora and symmetric_quantize_lora are wrapped by the CompilationWrapper
ReferenceQuantize is updated to be torch.compile friendly: the previous code caused graph breaks in the compiled graph
CompilationWrapper is updated to skip compilation for nested compiled functions, as nested compilation is not supported by PyTorch
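
For illustration, the "skip compilation when nested" behaviour could look roughly like the sketch below. This is not the NNCF CompilationWrapper: the class name LazyCompileWrapper and its constructor arguments are hypothetical, and the compiler and the "am I being traced?" check are passed in as plain callables so the snippet runs without PyTorch installed. The stand-ins fake_compile and state exist only for the demo.

```python
from typing import Any, Callable, Optional


class LazyCompileWrapper:
    """Hypothetical sketch: compile on first call, never inside another compilation."""

    def __init__(
        self,
        func: Callable[..., Any],
        compile_fn: Callable[[Callable[..., Any]], Callable[..., Any]],
        is_compiling: Callable[[], bool],
    ) -> None:
        self._func = func
        self._compile_fn = compile_fn      # assumption: e.g. torch.compile
        self._is_compiling = is_compiling  # assumption: e.g. torch.compiler.is_compiling
        self._compiled: Optional[Callable[..., Any]] = None

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # An outer compiled function is already tracing us: call the plain
        # function so the outer compiler handles it, instead of nesting.
        if self._is_compiling():
            return self._func(*args, **kwargs)
        if self._compiled is None:
            try:
                candidate = self._compile_fn(self._func)
                result = candidate(*args, **kwargs)
            except Exception:
                # Compilation failed: fall back to the eager function for good.
                self._compiled = self._func
                return self._func(*args, **kwargs)
            self._compiled = candidate
            return result
        return self._compiled(*args, **kwargs)


# Demo with stand-ins (no PyTorch required).
calls = []


def fake_compile(fn):
    def compiled(*args, **kwargs):
        calls.append("compiled")  # record every use of the "compiled" path
        return fn(*args, **kwargs)
    return compiled


state = {"tracing": False}
scale = LazyCompileWrapper(lambda x: 2 * x, fake_compile, lambda: state["tracing"])

first = scale(3)          # top-level call: goes through the compiled path
state["tracing"] = True   # pretend an outer compiled function is tracing
second = scale(3)         # nested call: plain function, no nested compile
```

With PyTorch available, the wrapper would presumably be constructed with torch.compile as compile_fn and, in recent releases, torch.compiler.is_compiling as is_compiling. Whether the PR detects nesting this way is an assumption.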

Before:

After:

Approximately 25% speedup
With unlimited cache
(torch._dynamo.config.cache_size_limit = 100
torch._dynamo.config.accumulated_cache_size_limit = 100):

HW:
Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz
3x RTX 3090

daniil-lyakhov force-pushed the dl/lora_compile branch from 15914d2 to 709a45c on June 16, 2026 15:59

daniil-lyakhov added 2 commits June 16, 2026 18:13

[Torch] Lora kernels compilation

709a45c

Fix precommit

ad6e3cf
