Add support for interaction i transform (ala fixest::i()) #265

Open
matthewwardrop wants to merge 3 commits into main from add_interaction_operator

@matthewwardrop matthewwardrop commented Jan 9, 2026

This patch implements initial support for the i() operator, motivated by pyfixest use-cases. There are still some rough edges (and encoder state is not actually preserved), but this implementation should work in most cases. See attached Jupyter notebook for usage demos.

Note that this is not a 1:1 drop-in replacement for the fixest estimator. In particular:

  • This implementation supports interactions of arbitrary order (e.g. i(a,b,c,d,...))
  • This implementation labels columns according to conventions set elsewhere in this package; e.g. i(A, B) might output columns with names like: A[a]:B[g].
  • Unlike normal interactions, there is no guarantee provided for structural full-rankness, and usage of the same column inside of and outside of the i(...) operator can result in linear dependence.
  • Binning and "keep" values are provided by per-feature transforms, rather than by i itself. For example: i(bin(A, ab=['a', 'b'])) and i(C(A, levels=['a', 'b'])).
  • The ref functionality is, for the time being, implemented by a reduce_rank argument passed to C(), as above for levels. I'm not sure I like that.
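
To make the labelling and rank caveats above concrete, here is a minimal pure-Python sketch. It does not use formulaic, and the helper `i_columns` is hypothetical; it only mimics the behaviour described in the bullets, assuming one 0/1 indicator column per combination of levels and no reference level dropped.

```python
from itertools import product


def i_columns(data, factors):
    """Hypothetical helper: one 0/1 indicator column per combination of levels.

    Mimics the naming convention described above (e.g. A[a]:B[g]) and works
    for any number of factors. No level is dropped.
    """
    n_rows = len(data[factors[0]])
    levels = [sorted(set(data[name])) for name in factors]
    columns = {}
    for combo in product(*levels):
        label = ":".join(f"{name}[{level}]" for name, level in zip(factors, combo))
        columns[label] = [
            int(all(data[name][row] == level for name, level in zip(factors, combo)))
            for row in range(n_rows)
        ]
    return columns


data = {"A": ["a", "b", "a", "b"], "B": ["g", "g", "h", "h"]}
cols = i_columns(data, ["A", "B"])
names = list(cols)

# Every row matches exactly one level combination, so the columns sum to a
# column of ones: together with an intercept they are linearly dependent.
row_sums = [sum(col[row] for col in cols.values()) for row in range(4)]

# The main effect A[a] is also the sum of the A[a]:B[*] columns, which is the
# "same column inside and outside i(...)" dependence mentioned above.
a_main = [int(value == "a") for value in data["A"]]
a_from_i = [g + h for g, h in zip(cols["A[a]:B[g]"], cols["A[a]:B[h]"])]
```

Here `names` holds the four labels `A[a]:B[g]`, `A[a]:B[h]`, `A[b]:B[g]` and `A[b]:B[h]`, and `a_main` equals `a_from_i`, so a design matrix containing both would be rank deficient.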

i_operator.ipynb

closes: #244
closes: #263

@leostimpfle
Hi @matthewwardrop ! Thanks very much for this. I've played around with the PR a bit and it looks overall pretty close to what we're hoping for.

Two comments:

> The ref functionality is, for the time being, implemented by a reduce_rank argument passed to C(), as above for levels. I'm not sure I like that.

Specifying C inside of i feels somewhat redundant, because in fixest the first argument to i is always assumed to be categorically encoded. I don't necessarily mind the verbosity/explicitness here, but fixest always drops a reference level (the first if ref is not specified), so i(x,y) would not be equivalent between formulaic and fixest (I guess one would need to use i(C(x, reduce_rank=True), y) in formulaic?). Not implicitly dropping a level is probably not a bad thing, but it's definitely something to be aware of.
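
The difference in question can be sketched in plain Python. This is an illustration only: `interact` is a hypothetical helper, not formulaic or fixest code, and the "drop the first level unless ref is given" rule is taken from the description in this comment rather than checked against fixest itself.

```python
def interact(factor_name, factor, numeric_name, numeric, ref=None, drop_reference=False):
    """Hypothetical helper: interact a categorical column with a numeric one.

    With drop_reference=False every level gets a column (the behaviour this
    comment attributes to the PR). With drop_reference=True the reference
    level is omitted: `ref` if given, otherwise the first sorted level.
    """
    levels = sorted(set(factor))
    if drop_reference:
        dropped = levels[0] if ref is None else ref
        levels = [level for level in levels if level != dropped]
    return {
        f"{factor_name}[{level}]:{numeric_name}": [
            value if observed == level else 0.0
            for observed, value in zip(factor, numeric)
        ]
        for level in levels
    }


x = ["a", "b", "c", "a"]
y = [1.0, 2.0, 3.0, 4.0]

full = interact("x", x, "y", y)                                   # all levels kept
first_dropped = interact("x", x, "y", y, drop_reference=True)     # drops "a"
custom_ref = interact("x", x, "y", y, ref="c", drop_reference=True)  # drops "c"
```

`full` has three columns while `first_dropped` and `custom_ref` have two each, which is the gap a user porting an i(x, y) formula would need to keep in mind.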

Second, is my understanding correct that the binning functionality doesn't cover dropping of reference levels yet? For example, is it possible to replicate syntax of the form i(f_str, bin=list(fruit=c('apple','banana')), ref='fruit')? I'd be happy to work on this if I can be of help.
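
For reference, the semantics I have in mind for that example, written as a small pure-Python sketch. Both helpers (`bin_levels`, `indicators`) are hypothetical and only illustrate "bin first, then drop the binned level as reference"; they are not part of formulaic or fixest.

```python
def bin_levels(values, bins):
    """Hypothetical helper: relabel levels according to {new_label: [old_levels]}."""
    lookup = {old: new for new, olds in bins.items() for old in olds}
    return [lookup.get(value, value) for value in values]


def indicators(name, values, ref=None):
    """Hypothetical helper: 0/1 columns for each level except the reference."""
    levels = sorted(set(values))
    return {
        f"{name}[{level}]": [int(value == level) for value in values]
        for level in levels
        if level != ref
    }


f_str = ["apple", "banana", "carrot", "apple", "leek"]

# Step 1: merge 'apple' and 'banana' into a single 'fruit' level.
binned = bin_levels(f_str, {"fruit": ["apple", "banana"]})

# Step 2: use the binned level as the reference, so it gets no column.
cols = indicators("f_str", binned, ref="fruit")
```

The key point is the ordering: the reference has to be resolved after binning, since `fruit` does not exist as a level in the raw data.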

PS: I think the current branch was missing a from __future__ import annotations for compatibility with Python < 3.14 (see #267)
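
A small sketch of the kind of failure that import avoids. This is an assumption about the general mechanism, not a reproduction of #267: before Python 3.14, annotations are evaluated when the function is defined, so an annotation that names a class still being defined raises NameError unless evaluation is deferred.

```python
import sys

# A method annotated with its own class, which does not exist yet while the
# class body is being executed.
SOURCE = '''
class Node:
    def clone(self) -> Node:
        return Node()
'''


def try_exec(source):
    """Compile and run `source` in a fresh namespace, reporting any NameError."""
    try:
        # dont_inherit=True keeps this module's own __future__ flags out of it.
        code = compile(source, "<sketch>", "exec", dont_inherit=True)
        exec(code, {})
        return "ok"
    except NameError as exc:
        return f"NameError: {exc}"


without_future = try_exec(SOURCE)
with_future = try_exec("from __future__ import annotations\n" + SOURCE)

print(sys.version_info[:2], "| without:", without_future, "| with:", with_future)
```

With the future import the snippet runs on every supported version; without it, the result depends on whether the interpreter defers annotation evaluation by default.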


Development

Successfully merging this pull request may close these issues.

Feature Request: fixest::i() operator to set reference levels and interact Categorical Variables
