File python-tokenizers.changes of Package python-tokenizers
------------------------------------------------------------------- Wed Dec 3 23:45:24 UTC 2025 - Guang Yee <gyee@suse.com> - Update to 0.22.0 * Bump on-headers and compression in /tokenizers/examples/unstable_wasm/www in #1827 * Implement from_bytes and read_bytes Methods in WordPiece Tokenizer for WebAssembly Compatibility in #1758 * fix: use AHashMap to fix compile error in #1840 * New stream in #1856 * [docs] Add more decoders in #1849 * Fix missing parenthesis in EncodingVisualizer.calculate_label_colors in #1853 * Update quicktour.mdx re: Issue #1625 in #1846 * remove stray comment in #1831 * Fix typo in README in #1808 * RUSTSEC-2024-0436 - replace paste with pastey in #1834 * Tokenizer: Add native async bindings, via py03-async-runtimes. in #1843 ------------------------------------------------------------------- Mon Oct 27 14:43:48 UTC 2025 - Aline Werner <aline.werner@suse.com> - Update suse_version condition to support 15.7. ------------------------------------------------------------------- Wed Mar 19 18:26:11 UTC 2025 - Lucas Mulling <lucas.mulling@suse.com> - Update to 0.21.1: * Update dev version and pyproject.toml * Add feature flag hint to README.md * Upgrade to PyO3 0.23 * Fixing the README.md * Fix typo in Split docstrings * Fix typos * Update documentation of Rust feature * Fix panic in DecodeStream::step due to incorrect index usage * Fixing the stream by removing the read_index altogether * Fixing NormalizedString append when normalized is empty * Update metadata as Python3.7 and Python3.8 support was dropped * Add rustls-tls feature - Remove define skip_python313 1 ------------------------------------------------------------------- Wed Mar 5 10:32:02 UTC 2025 - Christian Goll <cgoll@suse.com> - disable python3.13 ------------------------------------------------------------------- Thu Jan 9 15:52:37 UTC 2025 - Andreas Schwab <schwab@suse.de> - Enable build on riscv64 ------------------------------------------------------------------- Wed Dec 18 14:20:07 UTC 2024 - Soc Virnyl Estela <uncomfyhalomacro@opensuse.org> - Update to version 0.21.0: * More cache options. * Disable caching for long strings. * Testing ABI3 wheels to reduce number of wheels * Adding an API for decode streaming. * Decode stream python * Fix encode_batch and encode_batch_fast to accept ndarrays again ------------------------------------------------------------------- Thu Nov 7 11:30:50 UTC 2024 - Soc Virnyl Estela <uncomfyhalomacro@opensuse.org> - Select only rust tier 1 arches. - Update registry.tar.zst dependencies - Update version to 0.20.3: * fix pylist * [MINOR:TYP] Fix docstrings - Updates from 0.20.2: * Bump cookie and express in /tokenizers/examples/unstable_wasm/www * Fix off-by-one error in tokenizer::normalizer::Range::len * Arg name correction: auth_token -> token * Unsound call of set_var * Add safety comments * PyO3 0.22 - Updates from 0.20.1: * Update README.md * fix benchmark file link * [ignore_merges] Fix offsets * Bump body-parser and express in /tokenizers/examples/unstable_wasm/www * Bump serve-static and express in /tokenizers/examples/unstable_wasm/www * Bump send and express in /tokenizers/examples/unstable_wasm/www * Bump webpack from 5.76.0 to 5.95.0 in /tokenizers/examples/unstable_wasm/www * Fix documentation build * style: simplify string formatting for readability ------------------------------------------------------------------- Sun Nov 3 12:22:13 UTC 2024 - Soc Virnyl Estela <uncomfyhalomacro@opensuse.org> - Experiment with cargo vendor home registry. See documentation: https://github.com/openSUSE-Rust/obs-service-cargo/blob/master/README.md#cargo-vendor-home-registry ------------------------------------------------------------------- Mon Sep 23 07:19:52 UTC 2024 - Simon Lees <sflees@suse.de> - Don't use macros for Requires ------------------------------------------------------------------- Fri Aug 30 05:18:30 UTC 2024 - Simon Lees <sflees@suse.de> - Update package name back to "huggingface-hub" to match pypi ------------------------------------------------------------------- Tue Aug 27 05:48:31 UTC 2024 - Guang Yee <gyee@suse.com> - Update package name "huggingface-hub" to "huggingface_hub" ------------------------------------------------------------------- Tue Aug 20 07:27:42 UTC 2024 - Simon Lees <sflees@suse.de> - Fix testsuite on 15.6 ------------------------------------------------------------------- Sun Aug 18 16:49:56 UTC 2024 - Soc Virnyl Estela <obs@uncomfyhalomacro.pl> - Replace vendor tarball to zstd compressed vendor tarball - Force gcc version on leap. Thanks @marv7000 for your zed.spec - Use `CARGO_*` environmental variables to force generate full debuginfo and avoid stripping. - Enable cargo test in %check. - Update to version 0.20.0: * remove enforcement of non special when adding tokens * [BREAKING CHANGE] Ignore added_tokens (both special and not) in the decoder * Make USED_PARALLELISM atomic * Fixing for clippy 1.78 * feat(ci): add trufflehog secrets detection * Switch from cached_download to hf_hub_download in tests * Fix "dictionnary" typo * make sure we don't warn on empty tokens * Enable dropout = 0.0 as an equivalent to none in BPE * Revert "[BREAKING CHANGE] Ignore added_tokens (both special and not) … * Add bytelevel normalizer to fix decode when adding tokens to BPE * Fix clippy + feature test management. * Bump spm_precompiled to 0.1.3 * Add benchmark vs tiktoken * Fixing the benchmark. * Tiny improvement * Enable fancy regex * Fixing release CI strict (taken from safetensors). * Adding some serialization testing around the wrapper. * Add-legacy-tests * Adding a few tests for decoder deserialization. * Better serialization error * Add test normalizers * Improve decoder deserialization * Using serde (serde_pyo3) to get str and repr easily. * Merges cannot handle tokens containing spaces. * Fix doc about split * Support None to reset pre_tokenizers and normalizers, and index sequences * Fix strip python type * Tests + Deserialization improvement for normalizers. * add deserialize for pre tokenizers * Perf improvement 16% by removing offsets. ------------------------------------------------------------------- Wed Jul 3 14:55:36 UTC 2024 - Christian Goll <cgoll@suse.com> - initial commit on rust based python-tokenizers