
Commit 67cb58b

Author: Luke Hinds

Lag Effect (#11)

* Lag Effect
* Organise imports
1 parent 10ab72c commit 67cb58b

6 files changed: +79 -67 lines changed

README.md
Lines changed: 23 additions & 57 deletions

````diff
@@ -4,8 +4,7 @@
 [![PyPI version](https://badge.fury.io/py/mockllm.svg)](https://badge.fury.io/py/mockllm)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
 
-A FastAPI-based mock LLM server that mimics OpenAI and Anthropic API formats. Instead of calling actual language models,
-it uses predefined responses from a YAML configuration file.
+A FastAPI-based mock LLM server that mimics OpenAI and Anthropic API formats. Instead of calling actual language models, it uses predefined responses from a YAML configuration file.
 
 This is made for when you want a deterministic response for testing or development purposes.
 
@@ -17,8 +16,6 @@ Check out the [CodeGate](https://github.com/stacklok/codegate) project when you'
 - Streaming support (character-by-character response streaming)
 - Configurable responses via YAML file
 - Hot-reloading of response configurations
-- JSON logging
-- Error handling
 - Mock token counting
 
 ## Installation
@@ -128,10 +125,11 @@ curl -X POST http://localhost:8000/v1/messages \
 
 ### Response Configuration
 
-Responses are configured in `responses.yml`. The file has two main sections:
+Responses are configured in `responses.yml`. The file has three main sections:
 
 1. `responses`: Maps input prompts to predefined responses
 2. `defaults`: Contains default configurations like the unknown response message
+3. `settings`: Contains server behavior settings like network lag simulation
 
 Example `responses.yml`:
 ```yaml
@@ -141,71 +139,39 @@ responses:
 
 defaults:
   unknown_response: "I don't know the answer to that. This is a mock response."
-```
-
-### Hot Reloading
-
-The server automatically detects changes to `responses.yml` and reloads the configuration without requiring a restart.
-
-## Development
-
-The project uses Poetry for dependency management and includes a Makefile to help with common development tasks:
-
-```bash
-# Set up development environment
-make setup
-
-# Run all checks (setup, lint, test)
-make all
 
-# Run tests
-make test
-
-# Format code
-make format
-
-# Run all linting and type checking
-make lint
-
-# Clean up build artifacts
-make clean
-
-# See all available commands
-make help
+settings:
+  lag_enabled: true
+  lag_factor: 10 # Higher values = faster responses (10 = fast, 1 = slow)
 ```
 
-### Development Commands
+### Network Lag Simulation
 
-- `make setup`: Install all development dependencies with Poetry
-- `make test`: Run the test suite
-- `make format`: Format code with black and isort
-- `make lint`: Run all code quality checks (format, lint, type)
-- `make build`: Build the package with Poetry
-- `make clean`: Remove build artifacts and cache files
-- `make install-dev`: Install package with development dependencies
+The server can simulate network latency for more realistic testing scenarios. This is controlled by two settings:
 
-For more details on available commands, run `make help`.
+- `lag_enabled`: When true, enables artificial network lag
+- `lag_factor`: Controls the speed of responses
+  - Higher values (e.g., 10) result in faster responses
+  - Lower values (e.g., 1) result in slower responses
+  - Affects both streaming and non-streaming responses
 
-## Error Handling
+For streaming responses, the lag is applied per-character with slight random variations to simulate realistic network conditions.
 
-The server includes comprehensive error handling:
-
-- Invalid requests return 400 status codes with descriptive messages
-- Server errors return 500 status codes with error details
-- All errors are logged using JSON format
+### Hot Reloading
 
-## Logging
+The server automatically detects changes to `responses.yml` and reloads the configuration without restarting the server.
 
-The server uses JSON-formatted logging for:
+## Testing
 
-- Incoming request details
-- Response configuration loading
-- Error messages and stack traces
+To run the tests:
+```bash
+poetry run pytest
+```
 
 ## Contributing
 
-Contributions are welcome! Please feel free to submit a Pull Request.
+Contributions are welcome! Please open an issue or submit a PR.
 
 ## License
 
-This project is licensed under the Apache License, Version 2.0 - see the [LICENSE](LICENSE) file for details.
+This project is licensed under the [Apache 2.0 License](LICENSE).
````
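For a rough sense of the delays this commit introduces, the non-streaming path sleeps for `len(response) / (lag_factor * 10)` seconds (see the `src/mockllm/config.py` hunk below). A minimal sketch of how the wait scales with `lag_factor`, using the example response from `example.responses.yml`:

```python
# Sketch of the non-streaming delay formula introduced in this commit:
#   delay = len(response) / (lag_factor * 10)
response = (
    "According to this mock response, the meaning of life is "
    "to write better mock servers."
)

for lag_factor in (1, 5, 10):
    delay = len(response) / (lag_factor * 10)
    print(f"lag_factor={lag_factor:>2}: ~{delay:.2f}s before the response is returned")
```

The streaming path uses the same `lag_factor * 10` denominator, but applied per character with random jitter, as the `get_streaming_response_with_lag` generator further down shows.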

example.responses.yml
Lines changed: 5 additions & 1 deletion

```diff
@@ -5,4 +5,8 @@ responses:
   "what is the meaning of life?": "According to this mock response, the meaning of life is to write better mock servers."
 
 defaults:
-  unknown_response: "I don't know the answer to that. This is a mock response."
+  unknown_response: "I don't know the answer to that. This is a mock response."
+
+settings:
+  lag_enabled: true
+  lag_factor: 10 # Higher values = faster responses (10 = fast, 1 = slow)
```

pyproject.toml
Lines changed: 1 addition & 1 deletion

```diff
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "mockllm"
-version = "0.0.6"
+version = "0.0.7"
 description = "A mock server that mimics OpenAI and Anthropic API formats for testing"
 authors = ["Luke Hinds <[email protected]>"]
 license = "Apache-2.0"
```

src/mockllm/__init__.py
Lines changed: 1 addition & 1 deletion

```diff
@@ -2,4 +2,4 @@
 Mock LLM Server - You will do what I tell you!
 """
 
-__version__ = "0.0.6"
+__version__ = "0.0.7"
```

src/mockllm/config.py
Lines changed: 40 additions & 1 deletion

```diff
@@ -1,6 +1,8 @@
+import asyncio
 import logging
+import random
 from pathlib import Path
-from typing import Dict, Generator, Optional
+from typing import AsyncGenerator, Dict, Generator, Optional
 
 import yaml
 from fastapi import HTTPException
@@ -20,6 +22,8 @@ def __init__(self, yaml_path: str = "responses.yml"):
         self.last_modified = 0
         self.responses: Dict[str, str] = {}
         self.default_response = "I don't know the answer to that."
+        self.lag_enabled = False
+        self.lag_factor = 10
         self.load_responses()
 
     def load_responses(self) -> None:
@@ -33,6 +37,9 @@ def load_responses(self) -> None:
                 self.default_response = data.get("defaults", {}).get(
                     "unknown_response", self.default_response
                 )
+                settings = data.get("settings", {})
+                self.lag_enabled = settings.get("lag_enabled", False)
+                self.lag_factor = settings.get("lag_factor", 10)
                 self.last_modified = int(current_mtime)
                 logger.info(
                     f"Loaded {len(self.responses)} responses from {self.yaml_path}"
@@ -62,3 +69,35 @@ def get_streaming_response(
         # Yield response character by character
         for char in response:
             yield char
+
+    async def get_response_with_lag(self, prompt: str) -> str:
+        """Get response with artificial lag for non-streaming responses."""
+        response = self.get_response(prompt)
+        if self.lag_enabled:
+            # Base delay on response length and lag factor
+            delay = len(response) / (self.lag_factor * 10)
+            await asyncio.sleep(delay)
+        return response
+
+    async def get_streaming_response_with_lag(
+        self, prompt: str, chunk_size: Optional[int] = None
+    ) -> AsyncGenerator[str, None]:
+        """Generator that yields response content with artificial lag."""
+        response = self.get_response(prompt)
+
+        if chunk_size:
+            for i in range(0, len(response), chunk_size):
+                chunk = response[i : i + chunk_size]
+                if self.lag_enabled:
+                    delay = len(chunk) / (self.lag_factor * 10)
+                    await asyncio.sleep(delay)
+                yield chunk
+        else:
+            for char in response:
+                if self.lag_enabled:
+                    # Add random variation to character delay
+                    base_delay = 1 / (self.lag_factor * 10)
+                    variation = random.uniform(-0.5, 0.5) * base_delay
+                    delay = max(0, base_delay + variation)
+                    await asyncio.sleep(delay)
+                yield char
```
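A minimal usage sketch for the two new coroutines, assuming the class shown in this file is named `ResponseConfig` (the class name sits outside these hunks) and that a `responses.yml` is present in the working directory:

```python
import asyncio

# Class name assumed from context; the hunks above only show its methods.
from mockllm.config import ResponseConfig


async def main() -> None:
    config = ResponseConfig("responses.yml")

    # Non-streaming: a single awaited call, delayed by len(response) / (lag_factor * 10)
    # when lag_enabled is true.
    reply = await config.get_response_with_lag("what is the meaning of life?")
    print(reply)

    # Streaming: an async generator yielding character by character (or in chunks when
    # chunk_size is given), sleeping a jittered 1 / (lag_factor * 10) seconds per character.
    async for chunk in config.get_streaming_response_with_lag("what is the meaning of life?"):
        print(chunk, end="", flush=True)
    print()


asyncio.run(main())
```

When `lag_enabled` is false, both helpers skip the `asyncio.sleep` calls entirely, so they behave like the existing `get_response` and `get_streaming_response` paths.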

src/mockllm/server.py
Lines changed: 9 additions & 6 deletions

```diff
@@ -30,15 +30,14 @@
 
 async def openai_stream_response(content: str, model: str) -> AsyncGenerator[str, None]:
     """Generate OpenAI-style streaming response in SSE format."""
-    # Send the first message with role
     first_chunk = OpenAIStreamResponse(
         model=model,
         choices=[OpenAIStreamChoice(delta=OpenAIDeltaMessage(role="assistant"))],
     )
     yield f"data: {first_chunk.model_dump_json()}\n\n"
 
-    # Stream the content character by character
-    for chunk in response_config.get_streaming_response(content):
+    # Stream the content character by character with lag
+    async for chunk in response_config.get_streaming_response_with_lag(content):
         chunk_response = OpenAIStreamResponse(
             model=model,
             choices=[OpenAIStreamChoice(delta=OpenAIDeltaMessage(content=chunk))],
@@ -58,7 +57,7 @@ async def anthropic_stream_response(
     content: str, model: str
 ) -> AsyncGenerator[str, None]:
     """Generate Anthropic-style streaming response in SSE format."""
-    for chunk in response_config.get_streaming_response(content):
+    async for chunk in response_config.get_streaming_response_with_lag(content):
         stream_response = AnthropicStreamResponse(
             delta=AnthropicStreamDelta(delta={"text": chunk})
         )
@@ -98,7 +97,9 @@ async def openai_chat_completion(
             media_type="text/event-stream",
         )
 
-    response_content = response_config.get_response(last_message.content)
+    response_content = await response_config.get_response_with_lag(
+        last_message.content
+    )
 
     # Calculate mock token counts
     prompt_tokens = len(str(request.messages).split())
@@ -159,7 +160,9 @@ async def anthropic_chat_completion(
             media_type="text/event-stream",
         )
 
-    response_content = response_config.get_response(last_message.content)
+    response_content = await response_config.get_response_with_lag(
+        last_message.content
+    )
 
     # Calculate mock token counts
     prompt_tokens = len(str(request.messages).split())
```
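To observe the lag end to end, one option is to time a non-streaming request against a running server. This is a sketch only: it assumes the server is listening on `http://localhost:8000`, that the OpenAI-style route is `/v1/chat/completions` (the route decorators fall outside these hunks), and that the `requests` package is available:

```python
import time

import requests  # any HTTP client works; requests is assumed here

payload = {
    "model": "mock-llm",  # arbitrary model string for the mock server
    "messages": [{"role": "user", "content": "what is the meaning of life?"}],
}

start = time.monotonic()
resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)  # route assumed
elapsed = time.monotonic() - start

resp.raise_for_status()
# OpenAI-style response shape assumed.
print(resp.json()["choices"][0]["message"]["content"])
print(f"round trip took ~{elapsed:.2f}s")
```

With `lag_enabled: true`, the round trip should grow as `lag_factor` shrinks; setting `lag_enabled: false` in `responses.yml` (picked up by hot reloading) removes the artificial delay.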
