Separate and organize endpoint selector manifests #202

Closed
Tracked by #73
ahg-g opened this issue Jan 15, 2025 · 7 comments

Comments

@ahg-g
Contributor

ahg-g commented Jan 15, 2025

We have a bunch of YAML files to install the ext-proc and a sample InferencePool and InferenceModel. I think we need to organize the manifests into four categories, since not all are needed at the same time:

  • manifests to set up an Envoy gateway
  • manifests to set up an HTTPRoute
  • manifests to deploy the container + an InferencePool
  • manifests to install an InferenceModel

/cc @danehans

@danehans
Contributor

@ahg-g PTAL at #217. I set up e2e by simply running Envoy with a static config. A user can use the following steps to run the Endpoint Selector extension:

  • Use this manifest to deploy Envoy.
  • Install the inferencepools and inferencemodels CRDs.
  • Use the repo manifest to create an instance of the InferencePool/InferenceModel custom resources.
  • Deploy EPP using the repo manifest.

@kfswain
Collaborator

kfswain commented Jan 23, 2025

We originally had a static config, and worked with the Envoy folks to find something that was a little more flexible.

The problem with the static config is that it works for the one case, but if a user wants to add another pool or change the HTTPRoute, they have to go mess with the xDS API, which can be a daunting task, especially if your area of expertise is inference/ML.

See the conversation in: #18

@danehans
Contributor

danehans commented Jan 23, 2025

I found the barrier to entry quite high for new users wanting to explore the project. To address this, I created Issue #219 to track the problems with the POC README. While the README lists six steps, a new user must complete approximately ten steps to test EPP. Typically, a quickstart guide focuses on a narrow use case, is executable with a single command, and has minimal dependencies. Although we could wrap the installation process in a script, this would not address the complexity posed by the numerous resource types—particularly custom resources—that users must understand to modify or deviate from the example installation. To lower the barrier for new users, we should reconsider supporting a static configuration in the quickstart guide. This approach would simplify the onboarding process, while advanced users could be directed to the EG installation guide at the end of the quickstart for further customization.

@ahg-g
Contributor Author

ahg-g commented Jan 24, 2025

I think we should have two guides: a quick start where the Envoy setup is bundled in one file, and a detailed guide where we break things down. wdyt?

@danehans
Contributor

@ahg-g Yes, I agree. This is what I intended to convey in my previous comment. The quickstart should consist of a single kubectl apply command that uses a static Envoy configuration, making it very straightforward for first-time users to walk through a basic use case. At the end of the quickstart, there's a kubectl delete command for clean-up. After that, we point readers to the user guide for more advanced scenarios—which will require a Gateway API implementation. The question is: Should we maintain separate guides for each Gateway API implementation that supports EPP, or should that responsibility lie with the individual implementations themselves?

@ahg-g
Contributor Author

ahg-g commented Jan 25, 2025

Great! @danehans would you like to send a quick start guide once we merge #221?

The question is: Should we maintain separate guides for each Gateway API implementation that supports EPP, or should that responsibility lie with the individual implementations themselves?

I think we can postpone answering this question for now, but I would be in favor of having the guides in the gateway repo with a reference to them from this repo.

@danehans
Contributor

@ahg-g I believe recent PRs resolved this issue. If you agree, please close this issue. Otherwise, provide a brief explanation of the remaining work.

@ahg-g ahg-g closed this as completed Jan 29, 2025
@danehans danehans added this to the v0.1.0-rc.1 milestone Jan 30, 2025