A Node.js Express microservice that processes natural language queries using embeddings, vector search, and LLM-based response generation.
- Query processing with semantic understanding
- Vector similarity search using Qdrant
- LLM-powered response generation with Ollama
- Redis caching for frequently asked queries
- TypeScript for type safety
- Error handling and logging
- Node.js 18+ and npm
- Redis server
- Qdrant vector database
- Ollama running locally with the qwen:7b model
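If the model is not already present, it can be pulled with Ollama's CLI:

```bash
ollama pull qwen:7b
```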
- Clone the repository:
git clone <repository-url>
cd aiqueryms
- Install dependencies:
npm install
- Configure environment variables:
Copy the `.env.example` file to `.env` and update the values:
cp .env.example .env
- Build the project:
npm run build
- Start the service:
npm start
For development with hot reload:
npm run dev
Process a natural language query and return an AI-generated response.
Request:
{
"query": "What do people like about the phone?"
}
Response:
{
"response": "Based on the reviews, people particularly appreciate..."
}
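For reference, the endpoint can be called from a TypeScript client along these lines. This is a sketch: the `/query` path and port 3000 are assumptions based on the default configuration, so adjust them to match the actual route. It relies on the global `fetch` available in Node.js 18+.

```typescript
// Hypothetical client call; the /query path and port 3000 are assumptions.
async function askService(query: string): Promise<string> {
  const res = await fetch("http://localhost:3000/query", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  const data = (await res.json()) as { response: string };
  return data.response;
}

askService("What do people like about the phone?").then(console.log);
```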
Check the service health status.
Response:
{
"status": "OK"
}
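On the server side, the health check can be a single Express route. A minimal sketch, assuming the route is mounted at `/health`:

```typescript
import express from "express";

const app = express();

// Minimal health probe; the /health path is an assumption.
app.get("/health", (_req, res) => {
  res.json({ status: "OK" });
});

app.listen(3000); // default PORT from the configuration section
```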
- `PORT`: Server port (default: 3000)
- `QDRANT_URL`: Qdrant server URL
- `QDRANT_COLLECTION`: Qdrant collection name
- `OLLAMA_API`: Ollama API endpoint
- `OLLAMA_MODEL`: Ollama model name
- `REDIS_HOST`: Redis server host
- `REDIS_PORT`: Redis server port
- `REDIS_TTL`: Cache TTL in seconds
- `EMBEDDING_DIMENSION`: Embedding vector dimension
- `TOP_K`: Number of similar results to retrieve
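One way to centralize these settings is a small typed config module. The following is a sketch assuming `dotenv` loads the `.env` file; apart from `PORT`, the fallback values (Qdrant on 6333, Ollama on 11434, Redis on 6379, and the collection name, TTL, dimension, and top-k) are illustrative assumptions:

```typescript
import dotenv from "dotenv";

dotenv.config();

// Typed view of the environment variables listed above.
// Defaults beyond PORT are illustrative assumptions.
export const config = {
  port: Number(process.env.PORT ?? 3000),
  qdrantUrl: process.env.QDRANT_URL ?? "http://localhost:6333",
  qdrantCollection: process.env.QDRANT_COLLECTION ?? "documents",
  ollamaApi: process.env.OLLAMA_API ?? "http://localhost:11434",
  ollamaModel: process.env.OLLAMA_MODEL ?? "qwen:7b",
  redisHost: process.env.REDIS_HOST ?? "127.0.0.1",
  redisPort: Number(process.env.REDIS_PORT ?? 6379),
  redisTtl: Number(process.env.REDIS_TTL ?? 3600),
  embeddingDimension: Number(process.env.EMBEDDING_DIMENSION ?? 768),
  topK: Number(process.env.TOP_K ?? 5),
};
```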
- Query Processing (see the sketch after this list):
  - Convert the user query to an embedding vector
  - Search for similar vectors in Qdrant
  - Rank and filter the results
  - Generate a response using the Ollama LLM
  - Cache the result in Redis
- Caching:
  - Redis is used to cache query-response pairs
  - Configurable TTL for cache entries
  - Improves response time for frequent queries
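A condensed sketch of this flow, checking Redis first and, on a miss, running embedding, vector search, and generation before caching the answer. The client libraries (`redis`, `@qdrant/js-client-rest`), the `./config` module from the sketch above, and the prompt format are assumptions; the actual service may structure this differently:

```typescript
import { createClient } from "redis";
import { QdrantClient } from "@qdrant/js-client-rest";
import { config } from "./config"; // hypothetical module from the sketch above

// redis.connect() must be called once at startup (omitted here).
const redis = createClient({ url: `redis://${config.redisHost}:${config.redisPort}` });
const qdrant = new QdrantClient({ url: config.qdrantUrl });

// Get an embedding for the query via Ollama's /api/embeddings endpoint.
async function embed(text: string): Promise<number[]> {
  const res = await fetch(`${config.ollamaApi}/api/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: config.ollamaModel, prompt: text }),
  });
  const { embedding } = (await res.json()) as { embedding: number[] };
  return embedding;
}

// Generate an answer grounded in the retrieved context.
async function generate(prompt: string): Promise<string> {
  const res = await fetch(`${config.ollamaApi}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: config.ollamaModel, prompt, stream: false }),
  });
  const { response } = (await res.json()) as { response: string };
  return response;
}

export async function processQuery(query: string): Promise<string> {
  // 1. Serve from cache when the exact query was seen recently.
  const cached = await redis.get(`query:${query}`);
  if (cached) return cached;

  // 2. Embed the query and search Qdrant for the TOP_K nearest vectors.
  const vector = await embed(query);
  const hits = await qdrant.search(config.qdrantCollection, {
    vector,
    limit: config.topK,
  });

  // 3. Build a prompt from the retrieved payloads and generate a response.
  const context = hits.map((h) => JSON.stringify(h.payload)).join("\n");
  const answer = await generate(`Context:\n${context}\n\nQuestion: ${query}`);

  // 4. Cache the answer with the configured TTL.
  await redis.set(`query:${query}`, answer, { EX: config.redisTtl });
  return answer;
}
```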
The service includes comprehensive error handling:
- Input validation
- API error handling
- Service-level error handling
- Logging with Winston
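As an illustration, centralized handling in Express often takes the shape of per-route validation plus an error middleware backed by a Winston logger. This is a sketch of that pattern (with an assumed `/query` path), not the service's exact implementation:

```typescript
import express, { NextFunction, Request, Response } from "express";
import winston from "winston";

const logger = winston.createLogger({
  level: "info",
  format: winston.format.combine(winston.format.timestamp(), winston.format.json()),
  transports: [new winston.transports.Console()],
});

const app = express();
app.use(express.json());

app.post("/query", async (req: Request, res: Response, next: NextFunction) => {
  try {
    // Input validation: reject requests without a non-empty string query.
    const { query } = req.body ?? {};
    if (typeof query !== "string" || query.trim() === "") {
      return res.status(400).json({ error: "query must be a non-empty string" });
    }
    // ... run the query pipeline here and send its result
    res.json({ response: "..." });
  } catch (err) {
    next(err); // forward async failures to the centralized handler below
  }
});

// Centralized error handler: log with Winston, return a generic 500.
app.use((err: Error, _req: Request, res: Response, _next: NextFunction) => {
  logger.error(err.message, { stack: err.stack });
  res.status(500).json({ error: "Internal server error" });
});
```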
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request
ISC