Commit cd3c997

1.0.0
1 parent 9835f4f commit cd3c997

4 files changed: +87 -47 lines changed

README.md

+8-2
@@ -10,8 +10,6 @@
 
 Formulaic is a high-performance implementation of Wilkinson formulas for Python.
 
-**Note:** This project, while largely complete, is still a work in progress, and the API is subject to change between major versions (0.<major>.<minor>).
-
 - **Documentation**: https://matthewwardrop.github.io/formulaic
 - **Source Code**: https://github.com/matthewwardrop/formulaic
 - **Issue tracker**: https://github.com/matthewwardrop/formulaic/issues
@@ -31,6 +29,7 @@ It provides:
 - `numpy.ndarray`
 - `scipy.sparse.CSCMatrix`
 - support for symbolic differentiation of formulas (and hence model matrices).
+- and much more.
 
 ## Example code
 
@@ -107,6 +106,13 @@ y, X = Formula('y ~ x + z').get_model_matrix(df)
 </tbody>
 </table>
 
+Note that the above can be short-handed to:
+
+```
+from formulaic import model_matrix
+model_matrix('y ~ x + z', df)
+```
+
 ## Benchmarks
 
 Formulaic typically outperforms R for both dense and sparse model matrices, and vastly outperforms `patsy` (the existing implementation for Python) for dense matrices (`patsy` does not support sparse model matrix output).

benchmarks/benchmarks.csv

+45-45
@@ -1,46 +1,46 @@
 ,formula,tooling,mean,stderr
-0,a,patsy,0.06235763004847935,0.005395749279517771
-1,a,formulaic,0.01614434378487723,0.0032546966398573733
-2,a,formulaic_sparse,0.3261915956224714,0.016261649503701136
-3,a,R,0.28679302760532926,0.04140188245157612
-4,a,R_sparse,0.3757050037384033,0.10653194222854354
-5,A,patsy,5.07630558013916,0.21714776785515827
-6,A,formulaic,0.20960685185023717,0.00647732136737535
-7,A,formulaic_sparse,0.49657981736319406,0.014160553634556148
-8,A,R,0.270564215523856,0.048446057523137825
-9,A,R_sparse,0.6199769633156913,0.046671131192277546
-10,a+A,patsy,5.372520089149475,0.24907024334140557
-11,a+A,formulaic,0.21435914720807756,0.005022992135378462
-12,a+A,formulaic_sparse,0.5922877447945731,0.01141291396766895
-13,a+A,R,0.3385444368634905,0.050949061524224834
-14,a+A,R_sparse,0.8427209513528007,0.05420930370296844
-15,a:A,patsy,5.41690331697464,0.20045329513286106
-16,a:A,formulaic,0.2447828565325056,0.009831368693974841
-17,a:A,formulaic_sparse,0.5952485970088414,0.015557371582484582
-18,a:A,R,0.325153112411499,0.0525960780057241
-19,a:A,R_sparse,0.6293824059622628,0.05176575091323222
-20,A+B,patsy,10.592723488807678,0.36380136013031006
-21,A+B,formulaic,0.39785667828151156,0.004189211004718843
-22,A+B,formulaic_sparse,0.7370290756225586,0.00559989056050753
-23,A+B,R,0.45774364471435547,0.04577860217478202
-24,A+B,R_sparse,1.128925051007952,0.0730276871543806
-25,a:A:B,patsy,13.139377474784851,0.735141396522522
-26,a:A:B,formulaic,0.5296461582183838,0.02897198866398977
-27,a:A:B,formulaic_sparse,0.9496099608285087,0.017482256914701913
-28,a:A:B,R,0.5121760368347168,0.05913549901764887
-29,a:A:B,R_sparse,2.4410063539232527,0.15718149749347077
-30,A:B:C:D,patsy,33.971909284591675,0.0
-31,A:B:C:D,formulaic,1.4003467900412423,0.013122149254960603
-32,A:B:C:D,formulaic_sparse,2.6644029957907542,0.0594315471815126
-33,A:B:C:D,R,1.5739161627633231,0.043335739540618624
-34,A:B:C:D,R_sparse,11.206892251968384,0.07203364372253418
-35,a*b*A*B,patsy,14.135663151741028,0.023609280586242676
-36,a*b*A*B,formulaic,0.7015061037881034,0.015836408587630867
-37,a*b*A*B,formulaic_sparse,1.2936896937234061,0.008783658171925213
-38,a*b*A*B,R,0.7440026828220913,0.0779464030983464
-39,a*b*A*B,R_sparse,8.046716928482056,0.09924621730008089
-40,a*b*c*A*B*C,patsy,52.30743145942688,0.0
-41,a*b*c*A*B*C,formulaic,3.124175344194685,0.015513429204320773
-42,a*b*c*A*B*C,formulaic_sparse,4.722880220413208,0.05794530543951235
-43,a*b*c*A*B*C,R,3.261254208428519,0.03376348572979368
-44,a*b*c*A*B*C,R_sparse,96.12985253334045,0.0
+0,a,patsy,0.05834197998046875,0.00803367432263398
+1,a,formulaic,0.02302394594464983,0.005276772701941135
+2,a,formulaic_sparse,0.21061321667262486,0.010992980906779363
+3,a,R,0.20319366455078125,0.04054977688850774
+4,a,R_sparse,0.25407181467328754,0.06840892083032614
+5,A,patsy,4.188197422027588,0.06544489467164907
+6,A,formulaic,0.1523878574371338,0.0036558939106885466
+7,A,formulaic_sparse,0.3251234803880964,0.011790060964279047
+8,A,R,0.17913893290928432,0.024771923314745647
+9,A,R_sparse,0.2776027406964983,0.030147102033801613
+10,a+A,patsy,4.570634412765503,0.15648980716104596
+11,a+A,formulaic,0.17730648177010672,0.011166617158905944
+12,a+A,formulaic_sparse,0.4082690307072231,0.019076310496142206
+13,a+A,R,0.38174584933689665,0.07049519867629231
+14,a+A,R_sparse,0.5332009451729911,0.21147057077093623
+15,a:A,patsy,4.846947574615479,0.17439756289836647
+16,a:A,formulaic,0.18150435175214494,0.0029803910808661733
+17,a:A,formulaic_sparse,0.40479908670697895,0.02179493572840004
+18,a:A,R,0.209270749773298,0.02890730110968228
+19,a:A,R_sparse,0.3095934050423758,0.03366518302136102
+20,A+B,patsy,8.886903127034506,0.12687320827416307
+21,A+B,formulaic,0.37893104553222656,0.07142932738422411
+22,A+B,formulaic_sparse,0.6603872776031494,0.12048599251620122
+23,A+B,R,0.3503831795283726,0.08556777381884671
+24,A+B,R_sparse,0.6867697579520089,0.1680831720230895
+25,a:A:B,patsy,10.59350836277008,0.006071925163269043
+26,a:A:B,formulaic,0.38779779842921663,0.005652063736152758
+27,a:A:B,formulaic_sparse,0.6174772126334054,0.006656982815848345
+28,a:A:B,R,0.41255525180271696,0.005872361057027324
+29,a:A:B,R_sparse,1.3681020736694336,0.13250905747410396
+30,A:B:C:D,patsy,27.812817335128784,0.0
+31,A:B:C:D,formulaic,1.7389381953648158,0.1128489022833264
+32,A:B:C:D,formulaic_sparse,1.821084805897304,0.04040580196899275
+33,A:B:C:D,R,1.1703059673309326,0.01475477228204255
+34,A:B:C:D,R_sparse,6.603186547756195,0.10862060432577084
+35,a*b*A*B,patsy,14.305930256843567,1.4570282697677612
+36,a*b*A*B,formulaic,0.849949870790754,0.12062261319473745
+37,a*b*A*B,formulaic_sparse,1.194093908582415,0.27659465274967987
+38,a*b*A*B,R,0.633225509098598,0.09868026112633763
+39,a*b*A*B,R_sparse,7.428930600484212,1.4652407668448042
+40,a*b*c*A*B*C,patsy,48.66431951522827,0.0
+41,a*b*c*A*B*C,formulaic,4.352833080291748,0.31885221781014655
+42,a*b*c*A*B*C,formulaic_sparse,4.8097954273223875,0.7055727752553242
+43,a*b*c*A*B*C,R,2.6774498394557407,0.06813521510330559
+44,a*b*c*A*B*C,R_sparse,72.07087659835815,0.0
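
One way to read the updated numbers against the README's benchmark claim is to compare mean timings per formula. The sketch below is not part of this commit: it copies a handful of rows from the new `benchmarks.csv` into a string and computes how many times slower a baseline tool is than `formulaic`. The helper names `mean_times` and `ratio` are hypothetical, and the units of the `mean` column are not stated in the file, so only the ratios are meaningful here.

```python
import csv
import io

# A few rows copied verbatim from the updated benchmarks.csv in this commit.
ROWS = """\
,formula,tooling,mean,stderr
5,A,patsy,4.188197422027588,0.06544489467164907
6,A,formulaic,0.1523878574371338,0.0036558939106885466
8,A,R,0.17913893290928432,0.024771923314745647
31,A:B:C:D,formulaic,1.7389381953648158,0.1128489022833264
33,A:B:C:D,R,1.1703059673309326,0.01475477228204255
"""


def mean_times(text):
    """Map (formula, tooling) -> mean time parsed from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {(row["formula"], row["tooling"]): float(row["mean"]) for row in reader}


def ratio(times, formula, baseline, candidate="formulaic"):
    """Baseline mean divided by candidate mean; >1 means the candidate is faster."""
    return times[(formula, baseline)] / times[(formula, candidate)]


times = mean_times(ROWS)
for formula, baseline in [("A", "patsy"), ("A", "R"), ("A:B:C:D", "R")]:
    print(f"{formula}: {baseline}/formulaic = {ratio(times, formula, baseline):.2f}")
```

On these rows the `patsy` ratio is well above 1, while the `R` ratio is above 1 for `A` and below 1 for `A:B:C:D`, so the comparison with R depends on the formula. The standard errors in the file are not taken into account by this sketch.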

benchmarks/benchmarks.png

434 Bytes

docsite/docs/changelog.md

+34
@@ -3,6 +3,40 @@ For changes since the latest tagged release, please refer to the
 
 ---
 
+## 1.0.0 (24 December 2023)
+
+**Breaking changes:**
+
+* Python tokens are now canonically formatted (see below).
+* Methods deprecated during the 0.x series have been removed: `Formula.terms`,
+`ModelSpec.feature_names`, and `ModelSpec.feature_indices`.
+
+**New features and enhancements:**
+
+* Python tokens are now sanitized and canonically formatted to prevent
+ambiguities and better align with `patsy`.
+* Added official support for Python 3.12 (no code changes were necessary).
+* Added the `hashed` transform for categorically encoding deterministically
+hashed representations of a dataset.
+
+**Bugfixes and cleanups:**
+
+* Fixed transform state not propagating correctly when Python code tokens were
+not canonically formatted.
+* Literals in formulae will no longer be silently ignored, and feature scaling
+is now fully supported.
+* Improved code parsing and formatting utilities and dropped the requirement for
+`astor` for Python 3.9 and newer.
+* Fixed all warnings emitted during unit tests.
+
+**Documentation:**
+
+* Removed incompleteness warnings.
+* Added some lightweight developer documents.
+* Fixed some broken links.
+
+---
+
 ## 0.6.6 (4 October 2023)
 
 This is minor release with one important bugfix.
