
HF model tracker #899

Open · pdhirajkumarprasad opened this issue Jan 9, 2025 · 10 comments

Comments


pdhirajkumarprasad commented Jan 9, 2025

Total no. of models: 545

- PASS: 307 -> 408
- Numeric: 12 -> 37
- compilation
- compiled_inference
- setup and import

Detailed list


amd-vivekag commented Feb 10, 2025

HF test run summary for the 155 tests that were failing at the compilation, compiled_inference, and setup stages. Numerics failures are not part of the following report:

Passing Summary

TOTAL TESTS = 155

| Stage | # Passing | % of Total | % of Attempted |
|---|---|---|---|
| Setup | 143 | 92.3% | 92.3% |
| IREE Compilation | 81 | 52.3% | 56.6% |
| Gold Inference | 29 | 18.7% | 35.8% |
| IREE Inference Invocation | 26 | 16.8% | 89.7% |
| Inference Comparison (PASS) | 26 | 16.8% | 100.0% |

Fail Summary

TOTAL TESTS = 155

| Stage | # Failed at Stage | % of Total |
|---|---|---|
| Setup | 12 | 7.7% |
| IREE Compilation | 62 | 40.0% |
| Gold Inference | 52 | 33.5% |
| IREE Inference Invocation | 3 | 1.9% |
| Inference Comparison | 0 | 0.0% |
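In these reports, "% of Attempted" at each stage is the pass count divided by the number of tests that passed the previous stage (e.g. 81/143 ≈ 56.6% for IREE Compilation), while "% of Total" divides by all 155 tests. A minimal sketch of that cascade arithmetic, using the stage counts from the passing summary above:

```python
# Stage pass counts from the passing summary above (TOTAL TESTS = 155).
TOTAL = 155
stages = [
    ("Setup", 143),
    ("IREE Compilation", 81),
    ("Gold Inference", 29),
    ("IREE Inference Invocation", 26),
    ("Inference Comparison (PASS)", 26),
]

def cascade(total, stages):
    """Return (stage, % of total, % of attempted) rows; 'attempted' at each
    stage is the number of tests that passed the previous stage."""
    rows, attempted = [], total
    for name, passing in stages:
        rows.append((name,
                     round(100 * passing / total, 1),
                     round(100 * passing / attempted, 1)))
        attempted = passing
    return rows

for name, pct_total, pct_attempted in cascade(TOTAL, stages):
    print(f"{name}: {pct_total}% of total, {pct_attempted}% of attempted")
```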

Test Run Detail

Test was run with the following arguments:
```
Namespace(device='local-task', backend='llvm-cpu', target_chip='gfx942', iree_compile_args=None, mode='cl-onnx-iree', torchtolinalg=False, stages=None, skip_stages=None, benchmark=False, load_inputs=False, groups='all', test_filter=None, testsfile='hf_failing.txt.1', tolerance=None, verbose=False, rundirectory='test-run', no_artifacts=False, cleanup='0', report=True, report_file='reports/hf_report.md', get_metadata=False)
```

Test Exit Status Mean Benchmark Time (ms) Notes
hf_albert-base-v2 construct_inputs None
hf_albert-base-v2-squad2 construct_inputs None
hf_albert-japanese-v2 construct_inputs None
hf_albert-xlarge-vitaminc-mnli construct_inputs None
hf_all-mpnet-base-v1 setup None
hf_bart-base native_inference None
hf_beit-base-patch16-224-pt22k compilation None
hf_beit-base-patch16-224-pt22k-ft22k compilation None
hf_bert_turkish_sentiment construct_inputs None
hf_camembert-base construct_inputs None
hf_camembert-base-squadFR-fquad-piaf construct_inputs None
hf_camembert-keyword-extractor construct_inputs None
hf_camembert-ner PASS None
hf_camembert-ner-with-dates construct_inputs None
hf_CentralBankRoBERTa-sentiment-classifier construct_inputs None
hf_checkpoints_3_14 compilation None
hf_ChemBERTa-77M-MLM construct_inputs None
hf_chinese-roberta-wwm-ext construct_inputs None
hf_codebert-base construct_inputs None
hf_codebert-base-mlm construct_inputs None
hf_codebert-java construct_inputs None
hf_codebert-python construct_inputs None
hf_content compilation None
hf_COPA_albert_base_finetuned construct_inputs None
hf_cross-en-de-roberta-sentence-transformer construct_inputs None
hf_deberta-base compilation None
hf_deberta-large-mnli compilation None
hf_deberta-v2-base-japanese compilation None
hf_deberta-v2-base-japanese-char-wwm compilation None
hf_deberta-v3-base compilation None
hf_deberta-v3-base-absa-v1.1 compilation None
hf_deberta-v3-base-injection compilation None
hf_DeBERTa-v3-base-mnli-fever-anli compilation None
hf_deberta-v3-base-squad2 compilation None
hf_deberta-v3-base-zeroshot-v1.1-all-33 compilation None
hf_deberta-v3-base_finetuned_ai4privacy_v2 compilation None
hf_deberta-v3-large compilation None
hf_deberta-v3-large-squad2 compilation None
hf_deberta-v3-large_boolq compilation None
hf_deberta-v3-small compilation None
hf_deberta-v3-xsmall compilation None
hf_deberta_finetuned_pii compilation None
hf_Debertalarg_model_multichoice_Version2 compilation None
hf_deeplabv3-mobilevit-small compilation None
hf_deeplabv3-mobilevit-xx-small compilation None
hf_distilbert-base-cased PASS None
hf_distilbert-base-cased-distilled-squad PASS None
hf_distilbert-base-cased-finetuned-conll03-english PASS None
hf_distilbert-base-multilingual-cased PASS None
hf_distilbert-base-multilingual-cased-ner-hrl PASS None
hf_distilbert-base-multilingual-cased-sentiments-student PASS None
hf_distilbert-base-nli-mean-tokens PASS None
hf_distilbert-base-nli-stsb-mean-tokens PASS None
hf_distilbert-base-uncased PASS None
hf_distilbert-base-uncased-distilled-squad PASS None
hf_distilbert-extractive-qa-project PASS None
hf_distilbert-NER PASS None
hf_distilbert-SBD-en-judgements-laws PASS None
hf_distilbert_distilbert-base-uncased-15-epoch PASS None
hf_distilbert_multiple_choice PASS None
hf_distilbert_science_multiple_choice PASS None
hf_distilcamembert-base-ner PASS None
hf_distilgpt2 construct_inputs None
hf_distilhubert compilation None
hf_dpt-large-ade compilation None
hf_e_care_albert_base_finetuned construct_inputs None
hf_esm2_t36_3B_UR50D import_model None
hf_eva_large_patch14_196.in22k_ft_in22k_in1k native_inference None
hf_fine-tuned-MoritzLaurer-deberta-v3-large-zeroshot-v2.0-arceasy setup None
hf_french-camembert-postag-model construct_inputs None
hf_gpt2 construct_inputs None
hf_gpt2-small-spanish native_inference None
hf_IndicBERTv2-MLM-only construct_inputs None
hf_ivila-row-layoutlm-finetuned-s2vl-v2 native_inference None
hf_kda-albert-xxlarge-v2-race construct_inputs None
hf_keyphrase-extraction-distilbert-inspec PASS None
hf_ko-sroberta-multitask construct_inputs None
hf_KUCI_albert_base_Finetuned construct_inputs None
hf_llama-68m construct_inputs None
hf_llama-7b import_model None
hf_Llama3-8B-1.58-100B-tokens-GGUF setup None
hf_mdeberta-v3-base compilation None
hf_mDeBERTa-v3-base-mnli-xnli compilation None
hf_mdeberta-v3-base-squad2 compilation None
hf_mDeBERTa-v3-xnli-ft-bs-multiple-choice compilation None
hf_Medical-NER compilation None
hf_Meta-Llama-3.1-8B-Instruct-AWQ-INT4 setup None
hf_Meta-Llama-3.1-8B-Instruct-bnb-4bit setup None
hf_Midnight-Miqu-70B-v1.5-4bit setup None
hf_Mistral-7B-Instruct-v0.2-GPTQ setup None
hf_mobilenet_v1_0.75_192 native_inference None
hf_mobilevit-small compilation None
hf_msmarco-distilbert-base-tas-b PASS None
hf_msmarco-distilbert-base-v4 PASS None
hf_msmarco-distilbert-cos-v5 PASS None
hf_msmarco-distilbert-dot-v5 PASS None
hf_multi-qa-distilbert-cos-v1 PASS None
hf_multi-qa-MiniLM-L6-cos-v1 PASS None
hf_multi-qa-mpnet-base-cos-v1 setup None
hf_Multiple_Choice setup None
hf_Multiple_Choice_EN setup None
hf_multiple_choice_model setup None
hf_mxbai-rerank-base-v1 compilation None
hf_mxbai-rerank-xsmall-v1 compilation None
hf_nfnet_l0.ra2_in1k import_model None
hf_nli-deberta-v3-base compilation None
hf_oasst-sft-4-pythia-12b-epoch-3.5 import_model None
hf_opt-125m native_inference None
hf_output compilation None
hf_pedestrian_gender_recognition compilation None
hf_Phi-3-mini-128k-instruct import_model None
hf_Phi-3-mini-4k-instruct import_model None
hf_Phi-3.5-mini-instruct import_model None
hf_phobert-base-finetuned compiled_inference None
hf_phobert-large-finetuned compiled_inference None
hf_piiranha-v1-detect-personal-information compilation None
hf_pnasnet5large.tf_in1k compilation None
hf_Qwen1.5-0.5B-Chat native_inference None
hf_Qwen2-0.5B native_inference None
hf_Qwen2-7B-Instruct import_model None
hf_Qwen2.5-0.5B-Instruct native_inference None
hf_Qwen2.5-1.5B-Instruct import_model None
hf_Qwen2.5-7B-Instruct import_model None
hf_really-tiny-falcon-testing native_inference None
hf_robbert-v2-dutch-base construct_inputs None
hf_robbert-v2-dutch-ner construct_inputs None
hf_roberta-base-chinese-extractive-qa construct_inputs None
hf_robertuito-sentiment-analysis construct_inputs None
hf_ruRoPEBert-e5-base-2k setup None
hf_SapBERT-UMLS-2020AB-all-lang-from-XLMR construct_inputs None
hf_sbert_large_nlu_ru construct_inputs None
hf_sentence-bert-base-ja-mean-tokens-v2 compiled_inference None
hf_splinter-base compilation None
hf_splinter-base-qass compilation None
hf_swin-tiny-patch4-window7-224 compilation None
hf_swin_base_patch4_window7_224.ms_in22k_ft_in1k compilation None
hf_tiny-distilbert-base-cased-distilled-squad PASS None
hf_tiny-dummy-qwen2 native_inference None
hf_tiny-Qwen2ForCausalLM-2.5 native_inference None
hf_tiny-random-GemmaForCausalLM native_inference None
hf_tiny-random-LlamaForCausalLM native_inference None
hf_tiny-random-mistral construct_inputs None
hf_tiny-random-mt5 native_inference None
hf_tiny-random-Phi3ForCausalLM native_inference None
hf_TinyLlama-1.1B-Chat-v1.0 import_model None
hf_vicuna-7b-v1.5 import_model None
hf_wangchanberta-base-att-spm-uncased construct_inputs None
hf_wasmai-7b-v1 import_model None
hf_wavlm-base-plus construct_inputs None
hf_yolos-base compilation None
hf_yolos-fashionpedia compilation None
hf_yolos-small compilation None
hf_yolos-small-finetuned-license-plate-detection compilation None
hf_yolos-small-rego-plates-detection compilation None
hf_zephyr-7b-beta import_model None


amd-vivekag commented Feb 10, 2025

construct_inputs summary after the fix "Skip tokenizer checks in favor of AutoTokenizer" (#442):

Passing Summary

TOTAL TESTS = 36

| Stage | # Passing | % of Total | % of Attempted |
|---|---|---|---|
| Setup | 36 | 100.0% | 100.0% |
| IREE Compilation | 36 | 100.0% | 100.0% |
| Gold Inference | 31 | 86.1% | 86.1% |
| IREE Inference Invocation | 28 | 77.8% | 90.3% |
| Inference Comparison (PASS) | 28 | 77.8% | 100.0% |

Fail Summary

TOTAL TESTS = 36

| Stage | # Failed at Stage | % of Total |
|---|---|---|
| Setup | 0 | 0.0% |
| IREE Compilation | 0 | 0.0% |
| Gold Inference | 5 | 13.9% |
| IREE Inference Invocation | 3 | 8.3% |
| Inference Comparison | 0 | 0.0% |

Test Run Detail

Test was run with the following arguments:
```
Namespace(device='local-task', backend='llvm-cpu', target_chip='gfx942', iree_compile_args=None, mode='cl-onnx-iree', torchtolinalg=False, stages=None, skip_stages=None, benchmark=False, load_inputs=False, groups='all', test_filter=None, testsfile='contruct_input_failures.txt', tolerance=None, verbose=False, rundirectory='test-run', no_artifacts=False, cleanup='0', report=True, report_file='reports/hf_construct_input_report.md', get_metadata=False)
```

Test Exit Status Mean Benchmark Time (ms) Notes
hf_albert-base-v2 PASS None
hf_albert-base-v2-squad2 PASS None
hf_albert-japanese-v2 PASS None
hf_albert-xlarge-vitaminc-mnli PASS None
hf_bert_turkish_sentiment PASS None
hf_camembert-base PASS None
hf_camembert-base-squadFR-fquad-piaf PASS None
hf_camembert-keyword-extractor PASS None
hf_camembert-ner-with-dates PASS None
hf_CentralBankRoBERTa-sentiment-classifier PASS None
hf_ChemBERTa-77M-MLM PASS None
hf_chinese-roberta-wwm-ext PASS None
hf_codebert-base PASS None
hf_codebert-base-mlm PASS None
hf_codebert-java PASS None
hf_codebert-python PASS None
hf_COPA_albert_base_finetuned PASS None
hf_cross-en-de-roberta-sentence-transformer PASS None
hf_distilgpt2 construct_inputs None
hf_e_care_albert_base_finetuned PASS None
hf_french-camembert-postag-model PASS None
hf_gpt2 construct_inputs None
hf_IndicBERTv2-MLM-only PASS None
hf_kda-albert-xxlarge-v2-race PASS None
hf_ko-sroberta-multitask compiled_inference None
hf_KUCI_albert_base_Finetuned PASS None
hf_llama-68m construct_inputs None
hf_robbert-v2-dutch-base PASS None
hf_robbert-v2-dutch-ner PASS None
hf_roberta-base-chinese-extractive-qa PASS None
hf_robertuito-sentiment-analysis compiled_inference None
hf_SapBERT-UMLS-2020AB-all-lang-from-XLMR PASS None
hf_sbert_large_nlu_ru compiled_inference None
hf_tiny-random-mistral construct_inputs None
hf_wangchanberta-base-att-spm-uncased PASS None
hf_wavlm-base-plus construct_inputs None


amd-vivekag commented Feb 12, 2025

142 tests are failing after the AutoTokenizer changes. These were run on GPU, so I kicked off a CPU run on a VDI, but that run was killed due to limited memory; I am re-running it on the sharkmi300x-3 machine.

Failures by stage:

| # | Stage |
|---|---|
| 53 | compilation |
| 33 | compiled_inference |
| 7 | construct_inputs |
| 15 | import_model |
| 16 | native_inference |
| 18 | setup |
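Buckets like the one above can be derived mechanically from the detail table, whose rows have the form "test-name exit-status benchmark-time ...". A minimal sketch (the sample rows are invented for illustration; the real report has 142):

```python
from collections import Counter

# Each detail-table row is "<test> <exit status> <benchmark time> ...".
# The second whitespace-separated field is the stage the test stopped at.
rows = [
    "hf_deberta-base compilation None",
    "hf_gpt2 construct_inputs None",
    "hf_opt-125m native_inference None",
    "hf_deberta-v3-base compilation None",
]

# Count how many tests stopped at each stage.
buckets = Counter(line.split()[1] for line in rows)
for stage, count in sorted(buckets.items()):
    print(count, stage)
```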

Note: the text-classification tests got stuck, so data from that list is missing from the table below.

Following is the detailed summary:

Test Exit Status Mean Benchmark Time (ms) Notes
hf_distilhubert construct_inputs None
hf_wavlm-base-plus construct_inputs None
hf_bart-base native_inference None
hf_conv-bert-base construct_inputs None
hf_cross-en-de-roberta-sentence-transformer setup None
hf_ko-sroberta-multitask compiled_inference None
hf_ruRoPEBert-e5-base-2k setup None
hf_sbert_large_nlu_ru compiled_inference None
hf_sentence-bert-base-ja-mean-tokens-v2 compiled_inference None
hf_tiny-random-mt5 native_inference None
hf_all-mpnet-base-v1 setup None
hf_deberta-base compilation None
hf_deberta-v2-base-japanese compilation None
hf_deberta-v2-base-japanese-char-wwm compilation None
hf_deberta-v3-base compilation None
hf_deberta-v3-large compilation None
hf_deberta-v3-small compilation None
hf_deberta-v3-xsmall compilation None
hf_esm2_t12_35M_UR50D compilation None
hf_esm2_t30_150M_UR50D compilation None
hf_esm2_t36_3B_UR50D import_model None
hf_esm2_t6_8M_UR50D compilation None
hf_mdeberta-v3-base compilation None
hf_multi-qa-mpnet-base-cos-v1 setup None
hf_beit-base-patch16-224-pt22k compilation None
hf_beit-base-patch16-224-pt22k-ft22k compilation None
hf_deit_base_distilled_patch16_224.fb_in1k setup None
hf_densenet121.ra_in1k setup None
hf_eva_large_patch14_196.in22k_ft_in22k_in1k native_inference None
hf_mit-b0 compiled_inference None
hf_mit-b5 compiled_inference None
hf_mobilenet_v1_0.75_192 native_inference None
hf_mobilevit-small compilation None
hf_nfnet_l0.ra2_in1k import_model None
hf_pedestrian_gender_recognition compilation None
hf_pnasnet5large.tf_in1k compilation None
hf_swin-tiny-patch4-window7-224 compilation None
hf_swin_base_patch4_window7_224.ms_in22k_ft_in1k compilation None
hf_tf_mobilenetv3_large_minimal_100.in1k setup None
hf_deeplabv3-mobilevit-small compilation None
hf_deeplabv3-mobilevit-xx-small compilation None
hf_dpt-large-ade compilation None
hf_face-parsing compiled_inference None
hf_segformer-b0-finetuned-ade-512-512 compiled_inference None
hf_segformer-b0-finetuned-cityscapes-1024-1024 compiled_inference None
hf_segformer-b0-finetuned-cityscapes-512-1024 compiled_inference None
hf_segformer-b1-finetuned-ade-512-512 compiled_inference None
hf_segformer-b1-finetuned-cityscapes-1024-1024 compiled_inference None
hf_segformer-b2-fashion compiled_inference None
hf_segformer-b2-finetuned-ade-512-512 compiled_inference None
hf_segformer-b2-finetuned-cityscapes-1024-1024 compiled_inference None
hf_segformer-b3-fashion compiled_inference None
hf_segformer-b3-finetuned-ade-512-512 compiled_inference None
hf_segformer-b3-finetuned-cityscapes-1024-1024 compiled_inference None
hf_segformer-b4-finetuned-ade-512-512 compiled_inference None
hf_segformer-b4-finetuned-cityscapes-1024-1024 compiled_inference None
hf_segformer-b5-finetuned-ade-640-640 compiled_inference None
hf_segformer-b5-finetuned-cityscapes-1024-1024 compiled_inference None
hf_segformer_b2_clothes compiled_inference None
hf_segformer_b3_clothes compiled_inference None
hf_segformer_for_optic_disc_cup_segmentation compiled_inference None
hf_1_microsoft_deberta_V1.0 compilation None
hf_1_microsoft_deberta_V1.1 compilation None
hf_checkpoints_10_1_microsoft_deberta_V1.1_384 compilation None
hf_checkpoints_1_16 compilation None
hf_checkpoints_26_9_microsoft_deberta_21_9 compilation None
hf_checkpoints_28_9_microsoft_deberta_V2 compilation None
hf_checkpoints_28_9_microsoft_deberta_V4 compilation None
hf_checkpoints_28_9_microsoft_deberta_V5 compilation None
hf_checkpoints_29_9_microsoft_deberta_V1 compilation None
hf_checkpoints_30_9_microsoft_deberta_V1.0_384 compilation None
hf_checkpoints_3_14 compilation None
hf_content compilation None
hf_deberta-v3-large_test compilation None
hf_deberta-v3-large_test_9e-6 compilation None
hf_Debertalarg_model_multichoice_Version2 compilation None
hf_fine-tuned-MoritzLaurer-deberta-v3-large-zeroshot-v2.0-arceasy setup None
hf_llm-mdeberta-v3-swag compilation None
hf_mDeBERTa-v3-xnli-ft-bs-multiple-choice compilation None
hf_Multiple_Choice setup None
hf_Multiple_Choice_EN setup None
hf_multiple_choice_model setup None
hf_output compilation None
hf_phobert-base-finetuned compiled_inference None
hf_phobert-large-finetuned compiled_inference None
hf_yolos-base compilation None
hf_yolos-fashionpedia compilation None
hf_yolos-small compilation None
hf_yolos-small-finetuned-license-plate-detection compilation None
hf_yolos-small-rego-plates-detection compilation None
hf_bert-large-finetuned-squad2 setup None
hf_deberta-v3-base-squad2 compilation None
hf_deberta-v3-large-squad2 compilation None
hf_mdeberta-v3-base-squad2 compilation None
hf_splinter-base compilation None
hf_splinter-base-qass compilation None
hf_segformer-b0-finetuned-segments-sidewalk-2 compiled_inference None
hf_segformer-b4-finetuned-segments-sidewalk compiled_inference None
hf_tcd-segformer-mit-b0 compiled_inference None
hf_tcd-segformer-mit-b1 compiled_inference None
hf_tcd-segformer-mit-b2 compiled_inference None
hf_tcd-segformer-mit-b3 compiled_inference None
hf_tcd-segformer-mit-b4 compiled_inference None
hf_distilgpt2 construct_inputs None
hf_gpt2 construct_inputs None
hf_gpt2-small-spanish native_inference None
hf_llama-68m construct_inputs None
hf_llama-7b import_model None
hf_Llama3-8B-1.58-100B-tokens-GGUF setup None
hf_Meta-Llama-3.1-8B-Instruct-AWQ-INT4 setup None
hf_Meta-Llama-3.1-8B-Instruct-bnb-4bit setup None
hf_Midnight-Miqu-70B-v1.5-4bit setup None
hf_Mistral-7B-Instruct-v0.2-GPTQ setup None
hf_oasst-sft-4-pythia-12b-epoch-3.5 import_model None
hf_opt-125m native_inference None
hf_Phi-3-mini-128k-instruct import_model None
hf_Phi-3-mini-4k-instruct import_model None
hf_Phi-3.5-mini-instruct import_model None
hf_Qwen1.5-0.5B-Chat native_inference None
hf_Qwen2-0.5B native_inference None
hf_Qwen2-7B-Instruct import_model None
hf_Qwen2.5-0.5B-Instruct native_inference None
hf_Qwen2.5-1.5B-Instruct import_model None
hf_Qwen2.5-7B-Instruct import_model None
hf_really-tiny-falcon-testing native_inference None
hf_StableBeluga2 import_model None
hf_tiny-dummy-qwen2 native_inference None
hf_tiny-Qwen2ForCausalLM-2.5 native_inference None
hf_tiny-random-GemmaForCausalLM native_inference None
hf_tiny-random-LlamaForCausalLM native_inference None
hf_tiny-random-mistral construct_inputs None
hf_tiny-random-Phi3ForCausalLM native_inference None
hf_TinyLlama-1.1B-Chat-v1.0 import_model None
hf_vicuna-7b-v1.5 import_model None
hf_wasmai-7b-v1 import_model None
hf_zephyr-7b-beta import_model None
hf_bert-base-thai-upos setup None
hf_deberta-v3-base_finetuned_ai4privacy_v2 compilation None
hf_deberta_finetuned_pii compilation None
hf_ivila-row-layoutlm-finetuned-s2vl-v2 native_inference None
hf_Medical-NER compilation None
hf_piiranha-v1-detect-personal-information compilation None

@amd-vivekag

text-classification failures:

Test Exit Status Mean Benchmark Time (ms) Notes
hf_deberta-large-mnli compilation None
hf_deberta-v3-base-absa-v1.1 compilation None
hf_deberta-v3-base-injection compilation None
hf_DeBERTa-v3-base-mnli-fever-anli compilation None
hf_deberta-v3-base-zeroshot-v1.1-all-33 compilation None
hf_deberta-v3-large_boolq compilation None
hf_mDeBERTa-v3-base-mnli-xnli compilation None
hf_mxbai-rerank-base-v1 compilation None
hf_mxbai-rerank-xsmall-v1 compilation None
hf_nli-deberta-v3-base compilation None
hf_robertuito-sentiment-analysis compiled_inference None


amd-vivekag commented Feb 13, 2025

Failure summary:

| # | Stage |
|---|---|
| 61 | compilation |
| 6 | compiled_inference |
| 5 | construct_inputs |
| 15 | import_model |
| 16 | native_inference |
| 12 | setup |

Setup failure categories:
Total Failures: 12

| # | Device | Issue type | Issue message | Issue no. | # Models impacted | Models | Assignee | Status |
|---|---|---|---|---|---|---|---|---|
| 1 | CPU | setup | ImportError: "Loading an AWQ quantized model requires auto-awq library (`pip install autoawq`)" | 918 | 2 | hf_Midnight-Miqu-70B-v1.5-4bit, hf_Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | | |
| 2 | CPU | setup | requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url | 919 | 3 | hf_Multiple_Choice, hf_multiple_choice_model, hf_Multiple_Choice_EN | | |
| 3 | CPU | setup | IndexError: index out of range in self | 920 | 1 | hf_ruRoPEBert-e5-base-2k | | |
| 4 | CPU | setup | Unknown task: fill-mask | 921 | 2 | hf_multi-qa-mpnet-base-cos-v1, hf_all-mpnet-base-v1 | | |
| 5 | CPU | setup | importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes | 922 | 1 | hf_Meta-Llama-3.1-8B-Instruct-bnb-4bit | | |
| 6 | CPU | setup | RuntimeError: Error(s) in loading state_dict for DebertaV2ForMultipleChoice | 923 | 1 | hf_fine-tuned-MoritzLaurer-deberta-v3-large-zeroshot-v2.0-arceasy | | |
| 7 | CPU | setup | TypeError: DisableCompileContextManager.enter....() got an unexpected keyword argument 'dtype' | 924 | 1 | hf_Llama3-8B-1.58-100B-tokens-GGUF | | |
| 8 | CPU | setup | torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'aten::bitwise_and' to ONNX opset version 14 is not supported | 925 | 1 | hf_Mistral-7B-Instruct-v0.2-GPTQ | | |

zjgarvey (Collaborator) commented Feb 13, 2025

I assume the most recent run is on CPU? Can you share the detail table in a gist? Can you also post the IREE version?

@amd-vivekag

> I assume the most recent run is on CPU? Can you share the detail table in a gist? Can you also post the IREE version?

Yes, these were run on CPU; I was getting around 40 more failures on GPU. I'm using the following IREE version:

```
IREE (https://iree.dev):
  IREE compiler version 3.2.0rc20250206 @ f3bef2de123f08b4fc3b0ce691494891bd6760d0
  LLVM version 20.0.0git
  Optimized build
```

Following is the detailed table link:
https://gist.github.com/amd-vivekag/377a7b141b40c118f880b2ced176f95c


amd-vivekag commented Feb 14, 2025

Category: import_model
Total Failures: 15

| # | Device | Issue type | Issue message | Issue no. | # Models impacted | Models | Assignee | Status |
|---|---|---|---|---|---|---|---|---|
| 1 | CPU | import_model | Killed due to OOM | #926 | 1 | hf_StableBeluga2 | | |
| 2 | CPU | import_model | assertNonNull: Assertion `g.get() != nullptr` failed | #927 | 5 | hf_esm2_t36_3B_UR50D, hf_Phi-3.5-mini-instruct, hf_Phi-3-mini-128k-instruct, hf_Phi-3-mini-4k-instruct, hf_zephyr-7b-beta | | |
| 3 | CPU | import_model | assertInVersionRange: Assertion `version >= version_range.first && version <= version_range.second` failed | #928 | 8 | hf_llama-7b, hf_oasst-sft-4-pythia-12b-epoch-3.5, hf_Qwen2.5-1.5B-Instruct, hf_Qwen2.5-7B-Instruct, hf_Qwen2-7B-Instruct, hf_TinyLlama-1.1B-Chat-v1.0, hf_vicuna-7b-v1.5, hf_wasmai-7b-v1 | | |
| 4 | CPU | import_model | Assertion `node->outputs().size() < 4` failed | #929 | 1 | hf_nfnet_l0.ra2_in1k | | |


amd-vivekag commented Feb 14, 2025

Category: compilation
Total Failures: 61

| # | Device | Issue type | Issue message | Issue no. | # Models impacted | Models | Assignee | Status |
|---|---|---|---|---|---|---|---|---|
| 1 | CPU | compilation | error: failed to legalize operation 'torch.operator' that was explicitly marked illegal | #930 | 45 | hf_1_microsoft_deberta_V1.0, hf_1_microsoft_deberta_V1.1, hf_checkpoints_10_1_microsoft_deberta_V1.1_384, hf_checkpoints_1_16, hf_checkpoints_26_9_microsoft_deberta_21_9, hf_checkpoints_28_9_microsoft_deberta_V2, hf_checkpoints_28_9_microsoft_deberta_V4, hf_checkpoints_28_9_microsoft_deberta_V5, hf_checkpoints_29_9_microsoft_deberta_V1, hf_checkpoints_30_9_microsoft_deberta_V1.0_384, hf_checkpoints_3_14, hf_content, hf_deberta-base, hf_deberta_finetuned_pii, hf_deberta-large-mnli, hf_Debertalarg_model_multichoice_Version2, hf_deberta-v2-base-japanese, hf_deberta-v2-base-japanese-char-wwm, hf_deberta-v3-base, hf_deberta-v3-base-absa-v1.1, hf_deberta-v3-base_finetuned_ai4privacy_v2, hf_deberta-v3-base-injection, hf_DeBERTa-v3-base-mnli-fever-anli, hf_deberta-v3-base-squad2, hf_deberta-v3-base-zeroshot-v1.1-all-33, hf_deberta-v3-large, hf_deberta-v3-large_boolq, hf_deberta-v3-large-squad2, hf_deberta-v3-large_test, hf_deberta-v3-large_test_9e-6, hf_deberta-v3-small, hf_deberta-v3-xsmall, hf_llm-mdeberta-v3-swag, hf_mdeberta-v3-base, hf_mDeBERTa-v3-base-mnli-xnli, hf_mdeberta-v3-base-squad2, hf_mDeBERTa-v3-xnli-ft-bs-multiple-choice, hf_Medical-NER, hf_mxbai-rerank-base-v1, hf_mxbai-rerank-xsmall-v1, hf_nli-deberta-v3-base, hf_output, hf_piiranha-v1-detect-personal-information, hf_splinter-base, hf_splinter-base-qass | | |
| 2 | CPU | compilation | error: failed to legalize unresolved materialization from ('i64') to ('index') that remained live after conversion | #931 | 3 | hf_deeplabv3-mobilevit-small, hf_deeplabv3-mobilevit-xx-small, hf_mobilevit-small | | |
| 3 | CPU | compilation | error: 'flow.dispatch.workgroups' op value set has 3 dynamic dimensions but only 2 dimension values are attached | #932 | 3 | hf_beit-base-patch16-224-pt22k, hf_beit-base-patch16-224-pt22k-ft22k, hf_pedestrian_gender_recognition | | |
| 4 | CPU | compilation | error: expected sizes to be non-negative, but got -1 | #933 | 7 | hf_swin_base_patch4_window7_224.ms_in22k_ft_in1k, hf_swin-tiny-patch4-window7-224, hf_yolos-base, hf_yolos-fashionpedia, hf_yolos-small, hf_yolos-small-finetuned-license-plate-detection, hf_yolos-small-rego-plates-detection | | |
| 5 | CPU | compilation | error: 'stream.async.dispatch' op has invalid Read access range | #934 | 1 | hf_dpt-large-ade | | |
| 6 | CPU | compilation | error: 'iree_linalg_ext.pack' op write affecting operations on global resources are restricted to workgroup distributed contexts. | #935 | 1 | hf_distilhubert | | |
| 7 | CPU | compilation | error: expected offsets to be non-negative, but got -1 | #936 | 1 | hf_pnasnet5large.tf_in1k | | |


amd-vivekag commented Feb 14, 2025

Category: construct_inputs
Total Failures: 5

| # | Device | Issue type | Issue message | Issue no. | # Models impacted | Models | Assignee | Status |
|---|---|---|---|---|---|---|---|---|
| 1 | CPU | construct_inputs | ValueError: Asking to pad but the tokenizer does not have a padding token | #938 | 4 | hf_distilgpt2, hf_gpt2, hf_llama-68m, hf_tiny-random-mistral | | |
| 2 | CPU | construct_inputs | name 'tokens' is not defined | #939 | 1 | hf_wavlm-base-plus | @amd-vivekag | |
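The padding failure in issue #938 is a well-known Hugging Face behavior: decoder-only checkpoints such as gpt2 and llama variants ship with no pad token, so calling the tokenizer with `padding=True` raises exactly this ValueError. A common workaround is to reuse the EOS token as the pad token before batching; a minimal sketch (the `ensure_pad_token` helper is hypothetical, not part of the test harness):

```python
def ensure_pad_token(tokenizer):
    """If the tokenizer defines no pad token, reuse its EOS token for padding.

    Common workaround for decoder-only HF tokenizers (gpt2, llama, mistral)
    that otherwise raise "Asking to pad but the tokenizer does not have a
    padding token" when called with padding=True.
    """
    if getattr(tokenizer, "pad_token", None) is None:
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer
```

With transformers this would typically be applied right after loading, e.g. `tok = ensure_pad_token(AutoTokenizer.from_pretrained("gpt2"))`, before any padded batch encoding.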

Category: native_inference
Total Failures: 16

| # | Device | Issue type | Issue message | Issue no. | # Models impacted | Models | Assignee | Status |
|---|---|---|---|---|---|---|---|---|
| 1 | CPU | native_inference | IndexError: tuple index out of range | #940 | 14 | hf_bart-base, hf_gpt2-small-spanish, hf_ivila-row-layoutlm-finetuned-s2vl-v2, hf_opt-125m, hf_Qwen1.5-0.5B-Chat, hf_Qwen2-0.5B, hf_Qwen2.5-0.5B-Instruct, hf_really-tiny-falcon-testing, hf_tiny-dummy-qwen2, hf_tiny-Qwen2ForCausalLM-2.5, hf_tiny-random-GemmaForCausalLM, hf_tiny-random-LlamaForCausalLM, hf_tiny-random-mt5, hf_tiny-random-Phi3ForCausalLM | | |
| 2 | CPU | native_inference | [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: pixel_values for the following indices | #941 | 1 | hf_mobilenet_v1_0.75_192 | | |
| 3 | CPU | native_inference | [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Add node | #942 | 1 | hf_eva_large_patch14_196.in22k_ft_in22k_in1k | | |

Category: compiled_inference
Total Failures: 6

| # | Device | Issue type | Issue message | Issue no. | # Models impacted | Models | Assignee | Status |
|---|---|---|---|---|---|---|---|---|
| 1 | CPU | compiled_inference | INVALID_ARGUMENT; function expected fewer input values; parsing input input.bin | #943 | 4 | hf_ko-sroberta-multitask, hf_robertuito-sentiment-analysis, hf_sbert_large_nlu_ru, hf_sentence-bert-base-ja-mean-tokens-v2 | | |
| 2 | CPU | compiled_inference | :0: FAILED_PRECONDITION; onnx.Expand input has a dim that is not statically 1 | #944 | 2 | hf_phobert-base-finetuned, hf_phobert-large-finetuned | | |
