Commit 214a1be

added text preprocessing in FastTextLangIdClassifier
Signed-off-by: Sasha Meister <ameister@nvidia.com>
1 parent 45fdc8a commit 214a1be

1 file changed: +2 -1 lines changed

sdp/processors/inference/nlp/fasttext/fasttext.py

Lines changed: 2 additions & 1 deletion
@@ -110,7 +110,8 @@ def process_dataset_entry(self, data_entry: dict):
         """Applies the classifier to a single dataset entry."""
 
         self._load_model()
-        label, prob = self._model.predict(data_entry[self.text_field])
+        text = data_entry[self.text_field].strip().replace("\n", " ")
+        label, prob = self._model.predict(text)
         data_entry[self.output_field] = label[0].replace('__label__', '')
         data_entry[f"{self.output_field}_prob"] = prob[0]
 
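
The commit message does not say why the preprocessing was added. A likely motivation is that the fastText Python binding's `predict` processes one line at a time and raises a `ValueError` when the input string contains `"\n"`. The sketch below reproduces the patched logic on a plain dict so the effect can be checked without a model file. `normalize_for_fasttext`, `StubLangIdModel`, `classify`, and the default field names `"text"` and `"lang"` are hypothetical names introduced for illustration; they are not part of SDP or fastText.

```python
def normalize_for_fasttext(text: str) -> str:
    """Mirror the commit's preprocessing: trim both ends, then turn newlines into spaces."""
    return text.strip().replace("\n", " ")


class StubLangIdModel:
    """Hypothetical stand-in for a loaded fastText model (not a real fastText class)."""

    def predict(self, text: str):
        # Imitates the newline check of the real binding, then returns a
        # fixed label and probability instead of running a classifier.
        if "\n" in text:
            raise ValueError("predict processes one line at a time (remove '\\n')")
        return ("__label__en",), (0.99,)


def classify(entry: dict, model, text_field: str = "text", output_field: str = "lang") -> dict:
    """Sketch of the patched process_dataset_entry logic, assuming dict entries."""
    text = normalize_for_fasttext(entry[text_field])
    label, prob = model.predict(text)
    entry[output_field] = label[0].replace("__label__", "")
    entry[f"{output_field}_prob"] = prob[0]
    return entry


# Multi-line input that would be rejected without the normalization step.
entry = classify({"text": "  Hello there.\nHow are you?\n"}, StubLangIdModel())
print(entry["lang"], entry["lang_prob"])
```

As in the commit, only the string passed to `predict` is normalized; the stored text field is left unchanged.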