feat: REPL with tracing and isolated benchmarking #10

@GandalfTea

Add a REPL to:

  • manage the main API server and let the user discover and handle worker nodes
  • manage local model storage and download restricted models from HF
  • trace the call stack and collect performance information at different levels (including the interpreter level, via sys.setprofile)
  • trace different subsystems (prefill, compute, prefetch) and specific components (tokenizer, attention, etc.)
  • isolate and benchmark different subsystems and components with dummy data
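
For the interpreter-level tracing item above, a minimal sketch of what a sys.setprofile hook could look like. The CallTracer class and the tokenize function are hypothetical names used only for illustration and are not part of this project. The hook only counts Python-level call/return events and ignores C calls.

```python
import sys
import time
from collections import defaultdict


class CallTracer:
    """Collects call counts and inclusive wall time per Python function name."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.total_time = defaultdict(float)
        self._starts = {}  # frame object -> start timestamp

    def _hook(self, frame, event, arg):
        # Only Python-level events are handled; c_call/c_return are ignored.
        if event == "call":
            self._starts[frame] = time.perf_counter()
        elif event == "return":
            start = self._starts.pop(frame, None)
            if start is not None:  # skip frames entered before tracing began
                name = frame.f_code.co_name
                self.calls[name] += 1
                self.total_time[name] += time.perf_counter() - start

    def __enter__(self):
        sys.setprofile(self._hook)
        return self

    def __exit__(self, *exc):
        sys.setprofile(None)
        self._starts.clear()
        return False


def tokenize(text):
    # Hypothetical stand-in for a real component.
    return text.split()


with CallTracer() as tracer:
    for _ in range(3):
        tokenize("a b c")

for name, count in tracer.calls.items():
    print(f"{name}: {count} calls, {tracer.total_time[name]:.6f}s")
```

Note that sys.setprofile only affects the current thread, so a multi-threaded server would likely also need threading.setprofile for threads started later.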

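For the isolated benchmarking item, a rough sketch of timing one component on generated dummy data. The names benchmark, dummy_token_ids, and fake_embedding_lookup are assumptions made for this example; a real component would be swapped in for the stand-in.

```python
import random
import statistics
import time


def benchmark(fn, make_input, warmup=2, repeats=10):
    """Time fn on freshly generated dummy inputs; results are in seconds."""
    for _ in range(warmup):
        fn(make_input())  # warm-up runs are not recorded
    timings = []
    for _ in range(repeats):
        data = make_input()  # input generation is excluded from the timing
        start = time.perf_counter()
        fn(data)
        timings.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(timings),
        "min": min(timings),
        "runs": len(timings),
    }


def dummy_token_ids(n=1024, vocab=32000, seed=0):
    # Deterministic dummy token ids so runs are comparable.
    rng = random.Random(seed)
    return [rng.randrange(vocab) for _ in range(n)]


def fake_embedding_lookup(token_ids):
    # Hypothetical stand-in for the component under test.
    return [t % 7 for t in token_ids]


result = benchmark(fake_embedding_lookup, dummy_token_ids)
print(f"mean {result['mean']:.6f}s, min {result['min']:.6f}s over {result['runs']} runs")
```
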
Labels

enhancement (new feature or request), in progress (active work is being done on this issue)
