PostgreSQL extension for automatic vector embedding using external embedding services.
- Global embedding service configuration via database settings
- Register tables for automatic vector embedding on INSERT/UPDATE
- Asynchronous embedding computation using background workers
- Synchronous embedding function for queries
- Based on
httpandpg_background_queueextensions
- PostgreSQL 9.5+ with
vectorextension - http extension
- pg_background_queue extension
- pgTAP extension (for testing)
git clone https://github.com/hank-cp/pg_vector_embedding.git
cd pg_vector_embedding
make
sudo make installCREATE EXTENSION IF NOT EXISTS pg_vector_embedding SCHEMA "public" CASCADE;-- Set database-level configuration
ALTER DATABASE your_db SET pg_vector_embedding.embedding_url = 'https://api.siliconflow.cn/v1/embeddings';
ALTER DATABASE your_db SET pg_vector_embedding.embedding_api_key = 'your-api-key';
ALTER DATABASE your_db SET pg_vector_embedding.embedding_model = 'BAAI/bge-m3';
-- Restart Postgres to apply settingsCREATE TABLE documents (
id SERIAL PRIMARY KEY,
title TEXT,
content TEXT,
embedding VECTOR(1024)
);SELECT ve_enable(
'public', -- schema name
'documents', -- table name
ARRAY['title', 'content'], -- columns to embed
'embedding' -- vector column name
);INSERT INTO documents (title, content)
VALUES ('PostgreSQL Extensions', 'Learn how to build powerful PostgreSQL extensions');The embedding will be computed asynchronously via pg_background and stored in the embedding column.
-- Compute embedding for search query
SELECT * FROM documents
ORDER BY embedding <-> ve_compute_embedding('{"title": "PostgreSQL", "content": "extensions"}'::text)
LIMIT 10;SELECT ve_disable('public', 'documents');Create test/.env file:
EMBEDDING_URL=https://api.siliconflow.cn/v1/embeddings
EMBEDDING_API_KEY=your-api-key
EMBEDDING_MODEL=BAAI/bge-m3cd test
./runner.sh# Run with custom database settings
./runner.sh --host localhost --port 5432 --user postgres
# Keep test database after running (for debugging)
./runner.sh --no-cleanup- Trigger-based Detection: When a registered table is modified,
ve_trigger()captures the change - Column Extraction: The trigger extracts configured info columns as JSON using
ve_compact_row_data(). Column comments will also be included in the JSON to improve embedding quality. - Background Processing: A background worker will be launched to process the embedding request and update the vector column.
- Storage: The returned vector is saved to the configured vector column. It could be leveraged in vector similarity search, like RAG.
All configuration is stored at the database level using ALTER DATABASE:
| Key | Description | Example |
|---|---|---|
pg_vector_embedding.embedding_url |
Embedding API endpoint | https://api.siliconflow.cn/v1/embeddings |
pg_vector_embedding.embedding_api_key |
API authentication key | sk-... |
pg_vector_embedding.embedding_model |
Model to use (optional) | BAAI/bge-m3 |
Table-level configuration is passed as trigger arguments and does not require database settings.
-- 1. Setup
CREATE EXTENSION pg_vector_embedding CASCADE;
ALTER DATABASE mydb SET pg_vector_embedding.embedding_url = 'https://api.example.com/v1/embeddings';
ALTER DATABASE mydb SET pg_vector_embedding.embedding_api_key = 'sk-xxxxx';
\c -- Reconnect to apply settings
-- 2. Create and register table
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title TEXT,
content TEXT,
embedding VECTOR(1024)
);
SELECT ve_enable('public', 'articles', ARRAY['title', 'content'], 'embedding');
-- 3. Insert data (embeddings computed automatically in background)
INSERT INTO articles (title, content) VALUES
('PostgreSQL Extensions', 'Learn how to build powerful PostgreSQL extensions'),
('Vector Search', 'Implementing semantic search with pgvector');
-- 4. Wait for background processing (or check if embeddings are ready)
SELECT COUNT(*) FROM articles WHERE embedding IS NOT NULL;
-- 5. Perform similarity search
WITH search_query AS (
SELECT ve_compute_embedding('{"title": "PostgreSQL", "content": "tutorial"}'::text) AS query_embedding
)
SELECT id, title, embedding <-> query_embedding AS distance
FROM articles, search_query
WHERE embedding IS NOT NULL
ORDER BY distance
LIMIT 5;- Check if
pg_backgroundextension is installed and working - Verify database configuration is set and session is reconnected
- Check PostgreSQL logs for background worker errors
- Ensure the embedding API is accessible and credentials are valid
- Verify table is registered: Check for trigger
pg_vector_embedding_triggeron your table - Ensure table has a primary key (required for tracking records)
- Check trigger function exists:
\df ve_trigger