  This is the changelog for the open source version of tiktoken.

+ ## [v0.8.0]
+
+ - Support for `o1-` and `chatgpt-4o-` models
+ - Build wheels for Python 3.13
+ - Add possessive quantifiers to limit backtracking in regular expressions, thanks to @l0rinc!
+ - Provide a better error message and type for invalid token decode
+ - Permit tuples in type hints
+ - Better error message for passing invalid input to `get_encoding`
+ - Better error messages during plugin loading
+ - Add a `__version__` attribute
+ - Update versions of `pyo3`, `regex`, `fancy-regex`
+ - Drop support for Python 3.8
+
  ## [v0.7.0]
+
  - Support for `gpt-4o`
  - Performance improvements

  ## [v0.6.0]
+
  - Optimise regular expressions for a 20% performance improvement, thanks to @paplorinc!
  - Add `text-embedding-3-*` models to `encoding_for_model`
  - Check content hash for downloaded files
@@ -16,14 +31,17 @@ This is the changelog for the open source version of tiktoken.
  Thank you to @paplorinc, @mdwelsh, @Praneet460!

  ## [v0.5.2]
+
  - Build wheels for Python 3.12
  - Update version of PyO3 to allow multiple imports
  - Avoid permission errors when using default cache logic

  ## [v0.5.1]
+
  - Add `encoding_name_for_model`, undo some renames to variables that are implementation details

  ## [v0.5.0]
+
  - Add `tiktoken._educational` submodule to better document how byte pair encoding works
  - Ensure `encoding_for_model` knows about several new models
  - Add `decode_with_offets`
@@ -32,23 +50,28 @@ Thank you to @paplorinc, @mdwelsh, @Praneet460!
  - Update versions of dependencies

  ## [v0.4.0]
+
  - Add `decode_batch` and `decode_bytes_batch`
  - Improve error messages and handling

  ## [v0.3.3]
+
  - `tiktoken` will now make a best effort attempt to replace surrogate pairs with the corresponding
- Unicode character and will replace lone surrogates with the Unicode replacement character.
+ Unicode character and will replace lone surrogates with the Unicode replacement character.

  ## [v0.3.2]
+
  - Add encoding for GPT-4

  ## [v0.3.1]
+
  - Build aarch64 wheels
  - Make `blobfile` an optional dependency

  Thank you to @messense for the environment variable that makes cargo not OOM under emulation!

  ## [v0.3.0]
+
  - Improve performance by 5-20%; thank you to @nistath!
  - Add `gpt-3.5-turbo` models to `encoding_for_model`
  - Add prefix matching to `encoding_for_model` to better support future model versions
@@ -57,16 +80,19 @@ Thank you to @messense for the environment variable that makes cargo not OOM und
  - Add packaging metadata

  ## [v0.2.0]
- - Add ``tiktoken.encoding_for_model`` to get the encoding for a specific model
+
+ - Add `tiktoken.encoding_for_model` to get the encoding for a specific model
  - Improve portability of caching logic

  Thank you to @fritzo, @arvid220u, @khanhvu207, @henriktorget for various small corrections

  ## [v0.1.2]
+
  - Avoid use of `blobfile` for public files
  - Add support for Python 3.8
  - Add py.typed
  - Improve the public tests

  ## [v0.1.1]
+
  - Initial release
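
Note on the v0.3.3 entry: the surrogate handling it describes can be illustrated with the Python standard library alone. The sketch below is an assumption about the general technique, not a copy of tiktoken's code; `normalize_surrogates` is a hypothetical helper name and is not part of the tiktoken API. It round-trips the string through UTF-16 so that valid surrogate pairs are joined into one code point and lone surrogates become U+FFFD.

```python
def normalize_surrogates(text: str) -> str:
    """Best-effort cleanup of surrogate code points in a Python str.

    Hypothetical helper for illustration only; not part of tiktoken.
    """
    try:
        # Strings without surrogates encode cleanly and are returned as-is.
        text.encode("utf-8")
        return text
    except UnicodeEncodeError:
        # "surrogatepass" lets surrogates through to UTF-16 bytes; decoding
        # with "replace" joins valid pairs and maps lone ones to U+FFFD.
        return text.encode("utf-16", "surrogatepass").decode("utf-16", "replace")


if __name__ == "__main__":
    paired = normalize_surrogates("\ud83d\udc4d")  # pair for U+1F44D
    lone = normalize_surrogates("\ud83d")          # high surrogate only
    print(len(paired), hex(ord(paired)), hex(ord(lone)))
```

Ordinary text takes the fast path and is returned unchanged, so the UTF-16 round trip is only paid for input that cannot be encoded as UTF-8.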