masked_fill function of the distilbert model implementation currently has unintuitive logic
candle/candle-transformers/src/models/distilbert.rs, lines 13 to 18 in efd0e68:

fn masked_fill(on_false: &Tensor, mask: &Tensor, on_true: f32) -> Result<Tensor> {
    let shape = mask.shape();
    let on_true = Tensor::new(on_true, on_false.device())?.broadcast_as(shape.dims())?;
    let m = mask.where_cond(&on_true, on_false)?;
    Ok(m)
}
In the current setup, the user must invert the attention mask obtained from the tokenizer before passing it to the model.forward function. This requirement can be confusing, as it differs from the transformers implementation.

Proposition: replace the masked_fill function with:
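A minimal sketch of what such a replacement could look like, assuming the intent is to accept the mask in the tokenizer/transformers convention (1 = real token, 0 = padding) so that no inversion is needed; this is only an illustration, not the exact code from the proposal:

use candle::{Result, Tensor}; // `candle` is the crate alias candle-transformers uses for candle-core

// Illustrative variant: fill `on_true` (e.g. f32::NEG_INFINITY) wherever the
// attention mask is 0, so the mask returned by the tokenizer can be used as-is.
fn masked_fill(on_false: &Tensor, mask: &Tensor, on_true: f32) -> Result<Tensor> {
    let shape = mask.shape();
    let on_true = Tensor::new(on_true, on_false.device())?.broadcast_as(shape.dims())?;
    // 1 where the mask is 0 (padding), 0 elsewhere; comparing against a zeros
    // tensor of the same dtype avoids scalar dtype mismatches with u32 masks.
    let pad_positions = mask.eq(&mask.zeros_like()?)?;
    let m = pad_positions.where_cond(&on_true, on_false)?;
    Ok(m)
}

With a change along these lines, the attention mask produced by the tokenizer could be passed to model.forward unchanged, matching the transformers behaviour described above.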
I second this. The masked_fill function is indeed counterintuitive (though its name isn’t that telling either ^^). In any case, the fact that the forward function actually expects an inverted mask for that reason can be quite troublesome.
I imagine this example isn’t the most critical one, as it’s not the most up-to-date, but I think many people (like me) might try it for initial tests with Candle, since it’s well known. In that sense, the example doesn’t really serve its purpose well because it’s misleading.