Alignment model creation works fine, but during filtering Eflomal crashes with the following error message:
INFO:opusfilter.opusfilter:Running step 5: filter
20343327it [10:23, 32615.14it/s]
INFO:eflomal:Prepared 20343327 sentences for alignment
INFO:eflomal:Reading lexical priors...
INFO:eflomal:1618911 (of 2174631) pairs of lexical priors used
Traceback (most recent call last):
File "/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/bin/opusfilter", line 31, in <module>
of.execute_steps(overwrite=args.overwrite, last=args.last)
File "/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/opusfilter/opusfilter.py", line 224, in execute_steps
self._run_step(step, num + 1, overwrite)
File "/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/opusfilter/opusfilter.py", line 289, in _run_step
self.step_functions[step['type']](parameters, overwrite=overwrite)
File "/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/opusfilter/opusfilter.py", line 96, in wrapper
return self.parallelize(*args, **kwargs)
File "/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/opusfilter/opusfilter.py", line 141, in parallelize
self.func(obj, parameters, overwrite)
File "/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/opusfilter/opusfilter.py", line 380, in filter_data
for idx, pair in enumerate(pairs):
File "/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/opusfilter/word_alignment.py", line 170, in _filtergen
self.aligner.align(
File "/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/eflomal/__init__.py", line 72, in align
align(srcf.name, trgf.name,
File "python/eflomal/eflomal.pyx", line 161, in eflomal.cython.align
File "/usr/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/eflomal/bin/eflomal', '-m', '3', '-s', '/tmp/tmpawsij1rg', '-t', '/tmp/tmpphsceo43', '-n', '3', '-N', '0.2', '-1', '2', '-q', '-2', '1', '-3', '2', '-F', '/tmp/tmpyamo5usj', '-R', '/tmp/tmps4d0ndvi', '-p', '/tmp/tmp18jxqkax']' died with <Signals.SIGKILL: 9>.
The Eflomal unittest (test_eflomal.py) runs fine:
/mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/eflomal/bin/eflomal -m 3 -s /tmp/tmpst1zbe0v -t /tmp/tmps4j5_0m8 -n 3 -N 0.2 -1 721 -2 721 -3 2887 -f /tmp/tmpf50or8p5 -r /tmp/tmp98dw6njz
Read texts (3 sentences): 0.000 s
Vocabulary sizes are 9 (source), 9 (target)
Created alignment structures: 0.000 s
Created alignment structures: 0.000 s
Randomized alignment: 0.002 s
Aligning with model 1 (721 iterations)
Randomized alignment: 0.000 s
Aligning with model 1 (721 iterations)
Done: 0.002 s
Aligning with model 2 (721 iterations)
Done: 0.002 s
Aligning with model 2 (721 iterations)
Done: 0.001 s
Aligning with model 3 (2887 iterations)
Done: 0.001 s
Aligning with model 3 (2887 iterations)
Done: 0.019 s
Final argmax iteration: 0.000 s
Writing alignments to /tmp/tmpf50or8p5 for 3 sentencess
Done: 0.019 s
Final argmax iteration: 0.000 s
Writing alignments to /tmp/tmp98dw6njz for 3 sentencess
./mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/eflomal/bin/eflomal -m 3 -s /tmp/tmpfpe3h_i5 -t /tmp/tmpqggbus3t -n 3 -N 0.2 -1 721 -2 721 -3 2887 -f /tmp/tmp4y0_3tw1 -r /tmp/tmpk5nynnwy -p /tmp/tmp4yygknic
Read texts (3 sentences): 0.000 s
Vocabulary sizes are 9 (source), 9 (target)
Created alignment structures: 0.000 s
Created alignment structures: 0.000 s
Randomized alignment: 0.001 s
Aligning with model 1 (721 iterations)
Randomized alignment: 0.001 s
Aligning with model 1 (721 iterations)
Done: 0.001 s
Aligning with model 2 (721 iterations)
Done: 0.002 s
Aligning with model 2 (721 iterations)
Done: 0.002 s
Aligning with model 3 (2887 iterations)
Done: 0.001 s
Aligning with model 3 (2887 iterations)
Done: 0.019 s
Final argmax iteration: 0.000 s
Writing alignments to /tmp/tmpk5nynnwy for 3 sentencess
Done: 0.019 s
Final argmax iteration: 0.000 s
Writing alignments to /tmp/tmp4y0_3tw1 for 3 sentencess
./mnt/c/Users/yvessche/work/americasnlp2023-st/myenv/lib/python3.8/site-packages/eflomal/bin/eflomal -m 3 -s /tmp/tmpdd0kzzqb -t /tmp/tmpex4wlj51 -n 3 -N 0.2 -1 721 -2 721 -3 2887 -f /tmp/tmpjxe0px3n -r /tmp/tmpu0jpju0y
Read texts (3 sentences): 0.000 s
Vocabulary sizes are 9 (source), 9 (target)
Created alignment structures: 0.000 s
Created alignment structures: 0.000 s
Randomized alignment: 0.002 s
Aligning with model 1 (721 iterations)
Randomized alignment: 0.002 s
Aligning with model 1 (721 iterations)
Done: 0.002 s
Aligning with model 2 (721 iterations)
Done: 0.003 s
Aligning with model 2 (721 iterations)
Done: 0.002 s
Aligning with model 3 (2887 iterations)
Done: 0.001 s
Aligning with model 3 (2887 iterations)
Done: 0.019 s
Final argmax iteration: 0.000 s
Writing alignments to /tmp/tmpu0jpju0y for 3 sentencess
Done: 0.019 s
Final argmax iteration: 0.000 s
Writing alignments to /tmp/tmpjxe0px3n for 3 sentencess
.
----------------------------------------------------------------------
Ran 3 tests in 0.182s
OK
The OpusFilter unit test also seems to run fine:
.........
----------------------------------------------------------------------
Ran 9 tests in 0.911s
OK
Most probably the process was killed for exceeding memory limits (SIGKILL is what the kernel's out-of-memory killer sends). Eflomal uses a considerable amount of memory for large inputs, apparently growing linearly with the corpus size. For a corpus of 20 million sentence pairs, it used 10 gigabytes of memory.
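As a rough back-of-the-envelope check, the single measurement above (10 GB for 20 million sentence pairs) can be scaled proportionally to other corpus sizes. This is only a sketch: it assumes memory is strictly proportional to the number of sentence pairs, while actual usage will also depend on sentence length and vocabulary size. The function names are hypothetical and not part of Eflomal or OpusFilter.

```python
# Reference point taken from the measurement reported in this thread.
REF_PAIRS = 20_000_000
REF_GB = 10.0


def estimate_memory_gb(num_pairs, ref_pairs=REF_PAIRS, ref_gb=REF_GB):
    """Scale the reference measurement proportionally (assumption: linear growth)."""
    return ref_gb * num_pairs / ref_pairs


def max_pairs_for_budget(budget_gb, ref_pairs=REF_PAIRS, ref_gb=REF_GB):
    """Largest corpus size expected to fit in budget_gb under the same assumption."""
    return int(budget_gb * ref_pairs / ref_gb)


if __name__ == "__main__":
    print(estimate_memory_gb(5_000_000))   # 2.5
    print(max_pairs_for_budget(4.0))       # 8000000
```

Leaving some headroom below the estimate is advisable, since other processes and the Python side of the pipeline also need memory.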
Possible solutions:
Split the files into smaller subsets before filtering
If you use multiple filters, set WordAlignFilter as the last one (less data remaining)
The score step, and the filter step with filterfalse=True, chunk the input automatically, but the normal filter step does not. Maybe there should be an option for that.
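The first suggestion, splitting the files before filtering, could be done with a small helper like the one below. It is a minimal sketch using only the standard library; the function name, the output file naming scheme, and the default chunk size are assumptions for illustration, not something OpusFilter provides. It expects the source and target files to be line-aligned.

```python
import itertools
from pathlib import Path


def split_parallel(src_path, tgt_path, out_dir, chunk_size=1_000_000):
    """Split two line-aligned files into numbered chunk pairs.

    Returns a list of (source_chunk, target_chunk) paths. If the inputs
    differ in length, the extra lines of the longer file are ignored.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_files = []
    with open(src_path, encoding="utf-8") as src, \
            open(tgt_path, encoding="utf-8") as tgt:
        pairs = zip(src, tgt)  # keeps source and target lines together
        for index in itertools.count():
            # Only one chunk of sentence pairs is held in memory at a time.
            chunk = list(itertools.islice(pairs, chunk_size))
            if not chunk:
                break
            src_out = out_dir / f"chunk{index:04d}.src"
            tgt_out = out_dir / f"chunk{index:04d}.tgt"
            with open(src_out, "w", encoding="utf-8") as fs, \
                    open(tgt_out, "w", encoding="utf-8") as ft:
                for src_line, tgt_line in chunk:
                    fs.write(src_line)
                    ft.write(tgt_line)
            chunk_files.append((src_out, tgt_out))
    return chunk_files
```

Each chunk pair can then be filtered in its own step (or its own OpusFilter run) and the outputs concatenated afterwards. The chunk size should be chosen so that the aligner's memory use for one chunk stays well below the available RAM.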