app_tests/integration_tests/llm/server_management.py:118: TimeoutError
------------------------------ Captured log setup ------------------------------
INFO app_tests.integration_tests.llm.model_management:model_management.py:102 Copying local model from /data/llama3.1/weights/8b/fp16/llama3.1_8b_fp16_instruct.irpa
INFO app_tests.integration_tests.llm.model_management:model_management.py:138 Downloading tokenizer NousResearch/Meta-Llama-3.1-8B
INFO app_tests.integration_tests.llm.model_management:model_management.py:151 Exporting model with following settings:
MLIR Path: /shark-dev/pytest-of-nod/pytest-516/model_cache1/local/llama3.1_8b_fp16_instruct/model.mlir
Config Path: /shark-dev/pytest-of-nod/pytest-516/model_cache1/local/llama3.1_8b_fp16_instruct/config.json
Batch Sizes: 1,4
INFO app_tests.integration_tests.llm.model_management:model_management.py:172 Model successfully exported to /shark-dev/pytest-of-nod/pytest-516/model_cache1/local/llama3.1_8b_fp16_instruct/model.mlir
INFO app_tests.integration_tests.llm.model_management:model_management.py:178 Compiling model to /shark-dev/pytest-of-nod/pytest-516/model_cache1/local/llama3.1_8b_fp16_instruct/model.vmfb
INFO app_tests.integration_tests.llm.model_management:model_management.py:189 Model successfully compiled to /shark-dev/pytest-of-nod/pytest-516/model_cache1/local/llama3.1_8b_fp16_instruct/model.vmfb
=========================== short test summary info ============================
ERROR app_tests/integration_tests/llm/shortfin/cpu_llm_server_test.py::TestLLMServer::test_basic_generation[llama31_8b_none] - TimeoutError: Server failed to start within 10 seconds
This test is fairly flaky. Can we come up with a better testing methodology?
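One way to reduce this flakiness is to replace the fixed 10-second startup window with readiness polling against a generous deadline: the test passes as soon as the server responds, and slow CI machines get extra headroom without making fast runs wait. A minimal sketch of that idea (the `check` callable, e.g. an HTTP probe against the server's health endpoint, is hypothetical and not the actual code in `server_management.py`):

```python
import time


def wait_for_ready(check, timeout=60.0, interval=0.5, backoff=1.5, max_interval=5.0):
    """Poll check() until it returns True, with exponential backoff.

    Instead of a single fixed sleep, we retry against a deadline:
    fast machines return as soon as the server is up, slow CI
    machines get the full budget before we declare failure.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return
        # Sleep for the current interval, but never past the deadline.
        time.sleep(min(interval, max(0.0, deadline - time.monotonic())))
        interval = min(interval * backoff, max_interval)
    raise TimeoutError(f"Server failed to start within {timeout} seconds")
```

The server fixture would then call `wait_for_ready(probe, timeout=60.0)` during setup, where `probe` hits the server's port and returns `True` on a successful response.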
dan-garvey changed the title from "[shortfin][ci]" to "[shortfin][ci] Flaky startup timer test failure" on Jan 27, 2025
We don't have Meta Llama 3.1 weights on all of the machines.
I recently removed the xfail because the test was moved to Mi300x-3, which has the irpa file. But it makes sense to set it back to xfail, since the file isn't at the same location on all of the machines. We also had some bad irpa files that made the shortfin output look corrupt; that was fixed by downloading the safetensors and regenerating the irpa files, so I had that issue on my mind when I removed the xfail.