
[FEATURE] Introduce node level circuit breaker settings for k-NN #2263

Open
kotwanikunal opened this issue Nov 8, 2024 · 0 comments

Is your feature request related to a problem?
The existing k-NN plugin uses a cluster-level circuit breaker to prevent excessive memory consumption. While effective, this approach may not be optimal for all cluster configurations, especially in heterogeneous environments where nodes have varying capacities. It lacks fine-grained control over memory usage on individual nodes.

What solution would you like?
Implement node-level circuit breakers for the k-NN plugin, allowing memory limits to be set and enforced on a per-node basis.
The enhancement will refactor the circuit breaker system in OpenSearch to support differentiated limits based on node attributes.

The approach involves defining circuit breaker limits at the cluster level, but with distinct values for different node types. Nodes would be categorized using attributes such as "node.attr.type" set to values like "big" or "small" in their opensearch.yml configuration.
For example:

PUT _cluster/settings
{
  "persistent": {
    "plugins.knn.circuit_breaker.limit.big": "70%",
    "plugins.knn.circuit_breaker.limit.small": "40%"
  }
}

The k-NN plugin would then apply the appropriate limit based on each node's attributes. This method leverages existing OpenSearch configuration mechanisms, allows for centralized management, and provides the necessary flexibility for mixed-capability clusters. It maintains backwards compatibility by falling back to a default or existing cluster-wide setting for nodes without specified attributes.

Implementation would require modifying the k-NN plugin to read node attributes, select the corresponding circuit breaker limit, and apply it in the circuit breaker logic. This solution offers a balance of granular control and ease of management, tailoring resource allocation to node capabilities while keeping configuration centralized and straightforward.
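
The selection step described above could look roughly like the following. This is only a minimal sketch, not the plugin's actual code: the class and method names are hypothetical, plain maps stand in for the node's attributes and the cluster settings, and it assumes the attribute is looked up under the key "type" (from node.attr.type) and that an unsuffixed plugins.knn.circuit_breaker.limit key acts as the cluster-wide fallback.

```java
import java.util.Map;

// Hypothetical helper; not part of the k-NN plugin.
public class NodeLevelCircuitBreakerLimit {

    // Prefix taken from the example settings in this issue.
    static final String LIMIT_PREFIX = "plugins.knn.circuit_breaker.limit";

    // Assumption: "node.attr.type: big" in opensearch.yml surfaces as attribute key "type".
    static final String NODE_TYPE_ATTRIBUTE = "type";

    /**
     * Picks the limit for one node.
     * Order: limit for the node's type, then the cluster-wide limit, then the supplied default.
     */
    static String resolveLimit(Map<String, String> nodeAttributes,
                               Map<String, String> clusterSettings,
                               String defaultLimit) {
        // Cluster-wide value (assumed key) or the caller-provided default.
        String clusterWide = clusterSettings.getOrDefault(LIMIT_PREFIX, defaultLimit);

        String nodeType = nodeAttributes.get(NODE_TYPE_ATTRIBUTE);
        if (nodeType == null) {
            // Node has no type attribute: keep the existing behaviour.
            return clusterWide;
        }
        // Type-specific limit if one is configured, otherwise fall back.
        return clusterSettings.getOrDefault(LIMIT_PREFIX + "." + nodeType, clusterWide);
    }

    public static void main(String[] args) {
        Map<String, String> settings = Map.of(
            LIMIT_PREFIX + ".big", "70%",
            LIMIT_PREFIX + ".small", "40%");

        System.out.println(resolveLimit(Map.of("type", "big"), settings, "50%"));
        System.out.println(resolveLimit(Map.of("type", "small"), settings, "50%"));
        System.out.println(resolveLimit(Map.of(), settings, "50%"));
    }
}
```

The "50%" passed as the default here is just an illustrative value. Nodes whose type has no matching limit take the same fallback path as nodes with no attribute at all, which is one way to keep existing clusters unaffected.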

What alternatives have you considered?

  • Moving to per-node configuration in opensearch.yml, with each node's value overriding the cluster-level limit
Projects
Status: Backlog (Hot)