Disable CPU helper in AUTO when the model is LLM #29233

Status: Open. Wants to merge 21 commits into base: master.

Commits (21):
- 03810de  Disable CPU helper in AUTO when the model is LLM (wgzintel, Mar 3, 2025)
- a6906e3  Merge branch 'master' of https://github.com/openvinotoolkit/openvino … (wgzintel, Mar 3, 2025)
- 0618583  move get_optimum_intel_version to a common API (wgzintel, Mar 3, 2025)
- 0b321b0  Merge branch 'master' of https://github.com/openvinotoolkit/openvino … (wgzintel, Mar 4, 2025)
- f0380c4  use is_large_language_model to match LLM (wgzintel, Mar 4, 2025)
- 04212be  resolve conflict (wgzintel, Mar 5, 2025)
- 6420d0b  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 10, 2025)
- eac23ab  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 12, 2025)
- 46131fd  Fix the error Not Implemented (wgzintel, Mar 12, 2025)
- 9f74701  Add comments (wgzintel, Mar 12, 2025)
- d740d0f  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 12, 2025)
- 6323789  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 12, 2025)
- ea3a8c3  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 13, 2025)
- 7c89221  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 14, 2025)
- ddf2cca  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 17, 2025)
- 62f0427  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 18, 2025)
- c584b08  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 19, 2025)
- 74959c2  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 20, 2025)
- 5441b0f  LLM model handled in filter_device_by_model when model path is empty (wgzintel, Mar 20, 2025)
- af5e285  Merge branch 'master' into guozhong/disable_cpu_helper (wgzintel, Mar 21, 2025)
- 6e928a3  optimize the code (xipingyan, Mar 21, 2025)
19 changes: 8 additions & 11 deletions src/plugins/auto/src/plugin.cpp
@@ -431,12 +431,8 @@ std::shared_ptr<ov::ICompiledModel> Plugin::compile_model_impl(const std::string
     bool is_cumulative =
         (auto_s_context->m_performance_hint == ov::hint::PerformanceMode::CUMULATIVE_THROUGHPUT) ? true : false;
     std::list<DeviceInformation> devices_with_priority(support_devices.begin(), support_devices.end());
-    if (model_path.empty()) {
-        support_devices = filter_device_by_model(support_devices_by_property, model, load_config);
-    } else {
-        // AUTO / MULTI don't support caching explicitly, but can redirect this functionality to actual HW plugin
-        LOG_INFO_TAG("compile model with model path");
-    }
+    auto m_model = model_path.empty() ? model : get_core()->read_model(model_path, std::string{}, {});
+    support_devices = filter_device_by_model(support_devices_by_property, m_model, load_config);
     if (!is_cumulative) {
         devices_with_priority = get_valid_device(support_devices, model_precision);
     }
@@ -890,11 +886,11 @@ std::vector<DeviceInformation> Plugin::filter_device_by_model(const std::vector<

     auto disable_startup_runtime_fallback = [&]() {
         if (load_config.get_property(ov::intel_auto::enable_startup_fallback)) {
-            LOG_WARNING_TAG("Setting property ov::intel_auto::enable_startup_fallback to false for stateful model.");
+            LOG_WARNING_TAG("Setting property ov::intel_auto::enable_startup_fallback to false.");
             load_config.set_property(ov::intel_auto::enable_startup_fallback(false));
         }
         if (load_config.get_property(ov::intel_auto::enable_runtime_fallback)) {
-            LOG_WARNING_TAG("Setting property ov::intel_auto::enable_running_fallback to false for stateful model.");
+            LOG_WARNING_TAG("Setting property ov::intel_auto::enable_running_fallback to false.");
             load_config.set_property(ov::intel_auto::enable_runtime_fallback(false));
         }
     };
@@ -910,12 +906,13 @@ std::vector<DeviceInformation> Plugin::filter_device_by_model(const std::vector<
             stateful_node_names.push_back(op->get_friendly_name());
         }
     }
-    if (stateful_node_names.empty()) {
-        // not stateful model
+    bool is_LLM_model = ov::op::util::is_large_language_model(*model);
+    if (stateful_node_names.empty() && !is_LLM_model) {
+        // if not stateful model and not LLM model
         return meta_devices;
     }

-    // disable CPU_HELP and runtime fallback if model is stateful
+    // disable CPU_HELP and runtime fallback if model is stateful or LLM model
     disable_startup_runtime_fallback();

     bool isCumulative = (get_device_name() == "MULTI") || (load_config.get_property(ov::hint::performance_mode) ==