8 changes: 4 additions & 4 deletions constraints-dev.txt
@@ -48,11 +48,11 @@ gitdb==4.0.12 # via gitpython
gitpython==3.1.45 # via wandb
grpcio==1.74.0 # via tensorboard
h11==0.16.0 # via httpcore
-hf-xet==1.1.9 # via huggingface-hub
+hf-xet==1.2.0 # via huggingface-hub
hjson==3.1.0 # via deepspeed
httpcore==1.0.9 # via httpx
httpx==0.28.1 # via jupyterlab
-huggingface-hub==0.34.4 # via accelerate, datasets, peft, tokenizers, transformers, -r requirements-dev.txt
+huggingface-hub==1.3.4 # via accelerate, datasets, peft, tokenizers, transformers, -r requirements-dev.txt
identify==2.6.13 # via pre-commit
idna==3.10 # via anyio, httpx, jsonschema, requests, yarl
iniconfig==2.1.0 # via pytest
@@ -181,15 +181,15 @@ tensorboard==2.20.0 # via -r requirements-dev.txt
tensorboard-data-server==0.7.2 # via tensorboard
terminado==0.18.1 # via jupyter-server, jupyter-server-terminals
tinycss2==1.4.0 # via bleach
-tokenizers==0.22.0 # via transformers
+tokenizers==0.22.2 # via transformers
tomlkit==0.13.3 # via pylint
torch==2.6.0 # via accelerate, bitsandbytes, deepspeed, flash-attn, liger-kernel, peft, -c constraints-dev.txt.in, -r requirements.txt
tornado==6.5.2 # via ipykernel, jupyter-client, jupyter-server, jupyterlab, notebook, terminado
tox==4.29.0 # via tox-uv, -r requirements-dev.txt
tox-uv==1.25.0 # via -r requirements-dev.txt
tqdm==4.67.1 # via datasets, deepspeed, huggingface-hub, peft, transformers
traitlets==5.14.3 # via ipykernel, ipython, ipywidgets, jupyter-client, jupyter-console, jupyter-core, jupyter-events, jupyter-server, jupyterlab, matplotlib-inline, nbclient, nbconvert, nbformat
-transformers==4.56.0 # via peft, trl, -r requirements.txt
+transformers==5.0.0 # via peft, trl, -r requirements.txt
triton==3.2.0 # via liger-kernel, torch
trl==0.22.1 # via -r requirements.txt
types-python-dateutil==2.9.0.20250822 # via arrow
2 changes: 1 addition & 1 deletion requirements.txt
@@ -3,7 +3,7 @@ wheel>=0.43
pyyaml
py-cpuinfo
torch>=2.6.0
-transformers>=4.55.0
+transformers>=5.0.0
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

transformers 5.0.0 minimum Python version requirements dependencies

💡 Result:

transformers 5.0.0 minimum Python version

  • Python >= 3.10.0 (this is the package’s Requires-Python on PyPI). [1]

Note: the project description on PyPI also says “Python 3.9+”, but the install metadata for 5.0.0 is stricter (>=3.10.0). [1]

Runtime (core) dependencies (required)

Listed dependencies for transformers on piwheels are: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm. [2]

Framework requirement (practical)

  • The PyPI description for 5.0.0 states it works with PyTorch 2.1+ (installed separately; commonly via pip install "transformers[torch]"). [1]

Sources: [1], [2]


🌐 Web query:

transformers 5.0.0 breaking changes release notes

💡 Result:

Here are the Transformers 5.0.0 breaking / backward-incompatible changes, straight from the official v5.0.0 release notes (released Jan 26, 2026). [1]

Library-wide

  • from_pretrained(... ) default dtype is now auto (instead of forcing FP32). [2]
  • Default Hub shard size is now 50GB (up from 5GB). [2]
  • use_auth_token is deprecated in favor of token everywhere. [2]

Attention / Torch / Quantization

  • Removed: head masking, relative positional biases in BERT-like models, and head pruning. [3]
  • Dropped support for torchscript and torch.fx (focus is now on dynamo / export). [3]
  • Removed quantization args load_in_4bit and load_in_8bit → use quantization_config instead. [4]

Configuration

  • Removed nested-config init helpers like from_xxx_config. [4]
  • Can’t load a config class from a URL anymore (must be local path or Hub repo). [4]
  • RoPE config changes: parameters moved under config.rope_parameters; e.g. accessing config.rope_theta can now error. [4]
  • Non-generative models no longer have generation_config; model.config.generation_config may now error. [5]

Tokenization / Processing

  • Tokenizers: consolidation away from slow/fast split; single tokenization_<model>.py with backend selection. [6]
  • Tokenization breaking changes include (highlights): legacy tokenizer files removal, encode_plus -> __call__, batch_decode -> decode, and apply_chat_template now returns a BatchEncoding. [5]
  • Processing: processor serialization changes; XXXFeatureExtractors removed in favor of XXXImageProcessor for vision models. [5]

Modeling / Generate / Trainer / Hub / CLI

  • All models’ forward() now accept **kwargs. [7]
  • Generate: removed deprecated output aliases and some deprecated decoding-related classes. [7]
  • Trainer/TrainingArguments: many deprecated args removed/renamed (e.g. no_cuda -> use_cpu, per_gpu_* -> per_device_*, etc.). [8]
  • PushToHubMixin: removed/changed args; push_to_hub is now keyword-only (except repo_id), no **kwargs, removed use_temp_dir, etc. [8]
  • CLI: transformers-cli removed; transformers ... is the only entry point; transformers chat/serve split, and transformers run removed. [8]
  • Env vars: removed legacy cache env vars like TRANSFORMERS_CACHE in favor of HF_HOME; plus huggingface_hub-related changes (e.g. requests→httpx error type). [9]



🌐 Web query:

transformers 5.0.0 PyPI package metadata tokenizers huggingface-hub compatibility

💡 Result:

For transformers==5.0.0 (released Jan 26, 2026), the published package metadata indicates:

  • Python requirement: Python >=3.10 (PyPI) [1]
  • Core runtime deps include: huggingface-hub and tokenizers (among others like numpy, safetensors, tqdm, etc.) [2]
  • Compatibility change in v5: transformers now requires huggingface_hub >= 1.0.0 [3]

What this means for tokenizers / huggingface-hub compatibility

Because transformers depends on both huggingface-hub and tokenizers [2] and also enforces huggingface_hub >= 1.0.0 [3], you must use a tokenizers version that does not pin huggingface_hub to <1.0.0—otherwise pip’s resolver will fail with a version conflict (this kind of conflict is documented historically when tokenizers pinned huggingface_hub to an upper bound) [4].

Practical rule: with transformers==5.0.0, keep huggingface_hub on 1.x and ensure tokenizers is new enough to be compatible with that.

Sources:
[1] https://pypi.org/project/transformers/
[2] https://www.piwheels.org/project/transformers/
[3] https://newreleases.io/project/pypi/transformers/release/5.0.0
[4] huggingface/transformers#26276
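
The two prerequisites named above (Python >= 3.10 and huggingface_hub >= 1.0.0) can be checked before upgrading. The following is a minimal sketch using only the standard library; `version_tuple` and `check_v5_prerequisites` are hypothetical helper names, not part of transformers or huggingface_hub, and the parser deliberately ignores pre-release suffixes.

```python
import sys
from importlib import metadata


def version_tuple(version):
    """Parse leading numeric components, e.g. '1.3.4' -> (1, 3, 4).

    Simplification: pre-release suffixes such as 'rc1' are ignored and
    short versions are padded with zeros to three components.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_v5_prerequisites(python_version=None, hub_version=None):
    """Return a list of human-readable problems; an empty list means OK.

    Both arguments are optional overrides so the check can be exercised
    without touching the real environment.
    """
    problems = []
    if python_version is None:
        python_version = sys.version_info[:2]
    if hub_version is None:
        try:
            hub_version = metadata.version("huggingface-hub")
        except metadata.PackageNotFoundError:
            hub_version = None

    if tuple(python_version) < (3, 10):
        problems.append("Python >= 3.10 is required")
    if hub_version is None:
        problems.append("huggingface-hub is not installed")
    elif version_tuple(hub_version) < (1, 0, 0):
        problems.append("huggingface-hub >= 1.0.0 is required, found " + hub_version)
    return problems


# The pins on either side of this diff, checked against Python 3.10.
print(check_v5_prerequisites((3, 10), "0.34.4"))
print(check_v5_prerequisites((3, 10), "1.3.4"))
```

Calling `check_v5_prerequisites()` with no arguments inspects the running interpreter and the installed huggingface-hub distribution instead.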


Verify Transformers v5.0.0 compatibility—breaking changes affect multiple APIs.

The bump to >=5.0.0 introduces major breaking changes beyond Python/dependency updates:

  • Python: Minimum is now 3.10 (not 3.9).
  • Dependencies: Requires huggingface-hub >= 1.0.0 and tokenizers (ensure no version conflicts; older tokenizers may pin huggingface-hub to <1.0.0).
  • Breaking changes: Model loading dtype now defaults to auto (not FP32); quantization args (load_in_4bit, load_in_8bit) are removed in favor of quantization_config; the slow/fast tokenizer split is consolidated; several Trainer args are renamed or removed (e.g. no_cuda -> use_cpu); transformers-cli is removed; use_auth_token is deprecated in favor of token. The release notes quoted above list further changes.

Confirm your code does not use deprecated/removed APIs (quantization, trainer args, CLI, tokenizers functions, model config patterns). Update all affected code, CI, and docs accordingly.
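
One way to carry out that confirmation is a plain text scan for the names mentioned in the release notes above. This is a rough sketch, not an official migration tool: the pattern table is a hand-picked subset of those notes, and `find_removed_api_uses` is a hypothetical helper name.

```python
import re

# A few names the quoted release notes list as removed or deprecated,
# mapped to the replacement they suggest. Not exhaustive.
REMOVED_API_PATTERNS = {
    "load_in_4bit": "use quantization_config instead",
    "load_in_8bit": "use quantization_config instead",
    "use_auth_token": "use token instead",
    "transformers-cli": "use the transformers entry point instead",
    "TRANSFORMERS_CACHE": "use HF_HOME instead",
    "no_cuda": "use use_cpu instead",
}


def find_removed_api_uses(source):
    """Return (line_number, name, hint) for each match in the given text.

    Matches whole identifiers only, so 'my_load_in_4bit_flag' is not reported.
    """
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, hint in REMOVED_API_PATTERNS.items():
            pattern = rf"(?<![\w-]){re.escape(name)}(?![\w-])"
            if re.search(pattern, line):
                hits.append((lineno, name, hint))
    return hits


example = (
    "model = AutoModel.from_pretrained(path, load_in_4bit=True, use_auth_token=tok)\n"
    "args = TrainingArguments(use_cpu=True)\n"
)
for lineno, name, hint in find_removed_api_uses(example):
    print(f"line {lineno}: {name} ({hint})")
```

To scan a repository, the same function can be applied to the text of each `*.py`, CI, and docs file. A text scan cannot see dynamically built keyword arguments, so it complements a test run rather than replacing one.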

🤖 Prompt for AI Agents
In `@requirements.txt` at line 6, the transformers>=5.0.0 bump may be incompatible
with our code; verify and either pin to a compatible version or update
code/CI/docs: ensure Python minimum is 3.10, add/update dependency constraints
for huggingface-hub>=1.0.0 and tokenizers to avoid conflicts, and search for and
replace removed/deprecated APIs—e.g., replace load_in_4bit/load_in_8bit usage
with quantization_config, change model loading dtype handling (auto vs fp32),
replace use_auth_token with token, remove references to transformers-cli, update
Trainer usages (renamed/removed args), and adapt tokenizers API calls—update
tests, CI images, and docs accordingly so all occurrences (load_in_4bit,
load_in_8bit, quantization_config, use_auth_token, Trainer, transformers-cli,
tokenizers.*) are fixed or dependency pinned.


datasets>=2.15.0
numba>=0.62.0
4 changes: 2 additions & 2 deletions src/instructlab/training/data_process.py
@@ -393,7 +393,7 @@ def process_messages_into_input_ids_with_chat_template(args: DataProcessArgs):

# Adding after tokenizer setup as these are temp tokens, not to be saved
tokenizer.add_special_tokens(
-{"additional_special_tokens": ["<|pretrain|>", "<|/pretrain|>", "<|MASK|>"]}
+{"extra_special_tokens": ["<|pretrain|>", "<|/pretrain|>", "<|MASK|>"]}
)

try:
@@ -1300,7 +1300,7 @@ def configure_tokenizer(model_path: str) -> PreTrainedTokenizer:
# Add special tokens for masking
tokenizer.add_special_tokens(
{
-"additional_special_tokens": [
+"extra_special_tokens": [
UNMASK_BEGIN_TOKEN,
UNMASK_END_TOKEN,
UNMASK_REASONING_BEGIN_TOKEN,
3 changes: 3 additions & 0 deletions src/instructlab/training/main_ds.py
@@ -9,6 +9,9 @@
import time
import warnings

+# Suppress verbose HTTP request logs from httpx (used by huggingface_hub in transformers v5+)
+logging.getLogger("httpx").setLevel(logging.WARNING)
+
try:
# Third Party
from deepspeed.ops.adam import DeepSpeedCPUAdam
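
The logger change in this hunk can be tried in isolation. The sketch below assumes httpx emits its per-request records below WARNING, which is what the added comment implies; `quiet_loggers` is a hypothetical helper and not part of this PR.

```python
import logging


def quiet_loggers(names=("httpx",), level=logging.WARNING):
    """Raise the threshold of the named loggers so records below `level` are dropped."""
    for name in names:
        logging.getLogger(name).setLevel(level)


logging.basicConfig(level=logging.INFO)
quiet_loggers()

httpx_logger = logging.getLogger("httpx")
httpx_logger.info("dropped: below the WARNING threshold")
httpx_logger.warning("still emitted")
```

Setting the level on the named logger leaves the root logger and the training code's own loggers at whatever level they were configured with.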
14 changes: 6 additions & 8 deletions src/instructlab/training/tokenizer_utils.py
@@ -18,19 +18,17 @@ def setup_tokenizer_with_existing_chat_template(
# we need to set the padding token
tokenizer.add_special_tokens({"pad_token": tokenizer.eos_token})

-# ensure the pad token is in the additional special tokens without duplicating anything else
+# ensure the pad token is in the extra special tokens without duplicating anything else
new_tokens = []
-if tokenizer.pad_token not in tokenizer.additional_special_tokens:
+if tokenizer.pad_token not in tokenizer.extra_special_tokens:
new_tokens.append(tokenizer.pad_token)
-if tokenizer.eos_token not in tokenizer.additional_special_tokens:
+if tokenizer.eos_token not in tokenizer.extra_special_tokens:
new_tokens.append(tokenizer.eos_token)

# ensure the tokens are being sorted to prevent any issues
new_tokens = sorted(new_tokens)
-additional_special_tokens = tokenizer.additional_special_tokens + new_tokens
-tokenizer.add_special_tokens(
-{"additional_special_tokens": additional_special_tokens}
-)
+extra_special_tokens = tokenizer.extra_special_tokens + new_tokens
Member

Lol these tokens are extra special

+tokenizer.add_special_tokens({"extra_special_tokens": extra_special_tokens})

# ensure the necessary tokens exist
assert len(get_sp_token(tokenizer, tokenizer.pad_token)) == 1, (
@@ -57,7 +55,7 @@ def setup_tokenizer_from_new_chat_template(
}
)
tokenizer.add_special_tokens(
-{"additional_special_tokens": SPECIAL_TOKENS.get_tokens_to_add()}
+{"extra_special_tokens": SPECIAL_TOKENS.get_tokens_to_add()}
)
if getattr(tokenizer, "add_bos_token", False) or getattr(
tokenizer, "add_eos_token", False
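
If this module ever needs to run against both the old and new attribute names, the lookup can be made tolerant of either. The sketch below is illustrative only: it uses a stand-in object rather than a real tokenizer, assumes the attribute holds a list of strings, and `get_extra_special_tokens` / `tokens_to_register` are hypothetical names. Unlike the hunk above, it uses a set so that a pad token equal to the eos token is added once.

```python
from types import SimpleNamespace


def get_extra_special_tokens(tokenizer):
    """Return the extra special tokens from whichever attribute the object exposes."""
    for attr in ("extra_special_tokens", "additional_special_tokens"):
        tokens = getattr(tokenizer, attr, None)
        if tokens is not None:
            return list(tokens)
    return []


def tokens_to_register(tokenizer):
    """Existing extra tokens plus pad/eos if missing, new ones in sorted order."""
    existing = get_extra_special_tokens(tokenizer)
    candidates = (tokenizer.pad_token, tokenizer.eos_token)
    new_tokens = sorted({t for t in candidates if t is not None and t not in existing})
    return existing + new_tokens


# Stand-in for a tokenizer whose pad token was set to its eos token.
fake_tokenizer = SimpleNamespace(
    extra_special_tokens=["<|MASK|>"],
    pad_token="</s>",
    eos_token="</s>",
)
print(tokens_to_register(fake_tokenizer))
```

The resulting list would then be passed to `tokenizer.add_special_tokens` under whichever key the installed transformers version expects.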