Bump transformers to 5.0.0 #4060
Open
AlexanderDokuchaev wants to merge 9 commits into
Pull request overview
This PR updates NNCF’s test and example environments to be compatible with transformers==5.0.0, adjusting call sites for API changes (notably apply_chat_template return type) and updating generation static-cache configuration accordingly.
Changes:
- Bump transformers to 5.0.0 across test/example requirements and update related dependencies (e.g., sentence-transformers, tensorflow-io, whowhatbench, lm_eval[hf]).
- Update example code to handle tokenizer.apply_chat_template(...) returning a BatchFeature (extracting input_ids explicitly).
- Remove StaticCacheConfig usage and switch generation cache_config to a dict-based configuration; remove the dummy_llama sparsify-activations unit test helper/testcase.
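The apply_chat_template change above can be illustrated with a minimal sketch. It is not the code from this PR: the helper name `extract_input_ids` is hypothetical, and plain Python containers stand in for the tokenizer output so the snippet runs without transformers installed. It assumes the new return value behaves like a mapping with an `input_ids` key, as the overview describes.

```python
from collections.abc import Mapping
from typing import Any


def extract_input_ids(chat_template_output: Any) -> Any:
    """Return token ids from either a dict-like batch or a bare ids object.

    Hypothetical helper for illustration only; assumes the dict-like
    return value exposes the ids under the "input_ids" key.
    """
    if isinstance(chat_template_output, Mapping):
        # Newer behaviour described in this PR: a dict-like batch object.
        return chat_template_output["input_ids"]
    # Older behaviour: the ids are returned directly.
    return chat_template_output


# Stand-ins for tokenizer output (assumed shapes, not real tokenizer data).
new_style = {"input_ids": [[1, 2, 3]], "attention_mask": [[1, 1, 1]]}
old_style = [[1, 2, 3]]

assert extract_input_ids(new_style) == extract_input_ids(old_style)
```

A helper like this would let the same example script run against both the old and the new return type, although pinning transformers==5.0.0 as this PR does makes direct indexing of `input_ids` sufficient.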
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/torch/requirements.txt | Pins transformers==5.0.0 and updates sentence-transformers for torch test environment. |
| tests/torch/function_hook/sparsify_activations/test_algo.py | Removes dummy_llama sparsify-activations algorithm test case. |
| tests/torch/function_hook/sparsify_activations/helpers.py | Removes dummy_llama_model helper and related transformers import. |
| tests/post_training/requirements.txt | Pins transformers==5.0.0, bumps tensorflow-io, updates WWB fork commit. |
| tests/post_training/pipelines/fx_modelling.py | Drops StaticCacheConfig and updates static cache setup for Transformers 5. |
| tests/openvino/requirements.txt | Pins transformers==5.0.0 for OpenVINO tests. |
| examples/llm_compression/torch/downstream_qat_with_nls/requirements.txt | Pins transformers==5.0.0 and switches to lm_eval[hf]. |
| examples/llm_compression/torch/distillation_qat_with_lora/requirements.txt | Pins transformers==5.0.0 and switches to lm_eval[hf]. |
| examples/llm_compression/torch_fx/tiny_llama/requirements.txt | Pins transformers==5.0.0 for the torch.fx tiny-llama example. |
| examples/llm_compression/torch_fx/tiny_llama/modelling.py | Updates static cache configuration to dict and ensures use_cache is enabled. |
| examples/llm_compression/torch_fx/tiny_llama/main.py | Adjusts chat template tokenization to handle BatchFeature output. |
| examples/llm_compression/openvino/tiny_llama/requirements.txt | Pins transformers==5.0.0 for OpenVINO tiny-llama example. |
| examples/llm_compression/openvino/tiny_llama/main.py | Adjusts chat template tokenization to handle BatchFeature output. |
| examples/llm_compression/openvino/tiny_llama_synthetic_data/requirements.txt | Pins transformers==5.0.0 for synthetic-data example. |
| examples/llm_compression/openvino/tiny_llama_find_hyperparams/requirements.txt | Pins transformers==5.0.0 and updates WWB fork commit. |
| examples/llm_compression/openvino/smollm2_360m_fp8/requirements.txt | Pins transformers==5.0.0 for FP8 example. |
| examples/llm_compression/openvino/smollm2_360m_fp8/main.py | Adjusts chat template tokenization to handle BatchFeature output. |
| examples/llm_compression/openvino/smollm2_360m_codebook/requirements.txt | Pins transformers==5.0.0 for codebook example. |
| examples/llm_compression/openvino/smollm2_360m_codebook/main.py | Adjusts chat template tokenization to handle BatchFeature output. |
| examples/llm_compression/openvino/smollm2_360m_adaptive_codebook/requirements.txt | Pins transformers==5.0.0 for adaptive codebook example. |
| examples/llm_compression/openvino/smollm2_360m_adaptive_codebook/main.py | Adjusts chat template tokenization to handle BatchFeature output. |
| examples/llm_compression/onnx/tiny_llama/requirements.txt | Pins transformers==5.0.0 for ONNX tiny-llama example. |
| examples/llm_compression/onnx/tiny_llama/main.py | Adjusts chat template tokenization to handle BatchFeature output. |
| examples/llm_compression/onnx/tiny_llama_scale_estimation/requirements.txt | Pins transformers==5.0.0 for scale-estimation example. |
| examples/llm_compression/onnx/tiny_llama_scale_estimation/main.py | Adjusts chat template tokenization to handle BatchFeature output. |
Changes
Bump transformers to 5.0.0
Reason for changes
Related tickets
Tests
Test examples - success
Weight compression - success
PTQ-873