Skip to content

Commit adbdc9b

Browse files
committed
Add RS docs
1 parent 053f08f commit adbdc9b

File tree

1 file changed

+166
-0
lines changed

1 file changed

+166
-0
lines changed

docs/motherduck-rs.md

Lines changed: 166 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,166 @@
1+
# MotherDuck Read-Scaling
2+
3+
MotherDuck read-scaling enables you to handle read-heavy workloads by creating multiple read-only connections to a MotherDuck database from different PostgreSQL sessions. This feature helps avoid performance bottlenecks when connecting many concurrent users or BI tools through a single MotherDuck account.
4+
5+
## Overview
6+
7+
By default, all connections using the same MotherDuck account share a single cloud DuckDB instance (a "duckling"). Read-scaling solves this limitation by:
8+
9+
- **Creating Read-Only Replicas**: Spinning up multiple read-only replicas of your database
10+
- **Distributing Load**: Each replica is powered by its own dedicated duckling
11+
- **Scaling Automatically**: As more users connect via read-scaling tokens, your flock of ducklings expands
12+
- **Maintaining Affinity**: Users are assigned to specific replicas for consistent performance
13+
14+
## How It Works
15+
16+
### Token Types
17+
18+
**Primary Token**: Standard MotherDuck token with full read-write access
19+
**Read Scaling Token**: Special token that provides read-only access and directs connections to dedicated read replicas
20+
21+
### Connection Behavior
22+
23+
1. **Primary Connection**: Uses standard token, has full read-write access to the main duckling
24+
2. **Read-Scaling Connections**: Use read-scaling tokens, are assigned to dedicated read-only replicas
25+
3. **Eventual Consistency**: Read replicas sync changes from the primary instance within a few minutes
26+
4. **Session Affinity**: Users can be consistently routed to the same replica for better caching
27+
28+
## Configuration
29+
30+
### Prerequisites
31+
32+
- Valid MotherDuck account and authentication tokens
33+
- MotherDuck database created and accessible
34+
- Multiple PostgreSQL sessions for testing read-scaling
35+
36+
### Creating Read Scaling Tokens
37+
38+
1. Generate a read-scaling token through the MotherDuck UI
39+
2. When creating an access token, select "Read Scaling Token" as the token type
40+
3. Read-scaling tokens grant read permissions but restrict write operations
41+
42+
Additional resources:
43+
44+
- [MotherDuck Authentication Documentation](https://motherduck.com/docs/key-tasks/authenticating-and-connecting-to-motherduck/)
45+
- [MotherDuck Read Scaling](https://motherduck.com/docs/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling)
46+
47+
### Setting Up Read-Scaling in pg_duckdb
48+
49+
#### Primary (Read-Write) Connection
50+
```sql
51+
-- Connect with standard token for full read-write access
52+
CALL duckdb.enable_motherduck('<your_primary_token>', '<database_name>');
53+
```
54+
55+
#### Read-Scaling (Read-Only) Connections
56+
```sql
57+
-- Connect with read-scaling token for read-only access
58+
CALL duckdb.enable_motherduck('<your_read_scaling_token>', '<database_name>');
59+
```
60+
61+
## Usage Examples
62+
63+
### Basic Read-Scaling Setup
64+
65+
**Session 1: Primary Connection (Read-Write)**
66+
```sql
67+
-- Connect with primary token
68+
CALL duckdb.enable_motherduck('md_primary_token_here', 'sales_db');
69+
70+
-- Create and populate data
71+
CREATE TABLE products(id int, name varchar, price decimal);
72+
INSERT INTO products VALUES
73+
(1, 'Widget A', 19.99),
74+
(2, 'Widget B', 29.99),
75+
(3, 'Widget C', 39.99);
76+
```
77+
78+
**Session 2: Read-Scaling Connection (Read-Only)**
79+
```sql
80+
-- Connect with read-scaling token
81+
CALL duckdb.enable_motherduck('md_read_scaling_token_here', 'sales_db');
82+
83+
-- Wait for data synchronization (replicas sync within a few minutes)
84+
SELECT duckdb.raw_query($$ REFRESH DATABASE sales_db; $$);
85+
86+
-- Query data (read-only access)
87+
SELECT name, price
88+
FROM products
89+
WHERE price > 25.00
90+
ORDER BY price;
91+
```
92+
93+
### Advanced Data Synchronization
94+
95+
For applications requiring stricter synchronization, you can manually control data freshness:
96+
97+
**On Primary Connection:**
98+
```sql
99+
-- Make changes
100+
INSERT INTO products VALUES (4, 'Widget D', 49.99);
101+
102+
-- Create snapshot to ensure consistency
103+
SELECT duckdb.raw_query($$ CREATE SNAPSHOT OF sales_db; $$);
104+
```
105+
106+
**On Read-Scaling Connections:**
107+
```sql
108+
-- Refresh to see the latest snapshot
109+
SELECT duckdb.raw_query($$ REFRESH DATABASE sales_db; $$);
110+
-- Or refresh all databases
111+
SELECT duckdb.raw_query($$ REFRESH DATABASES; $$);
112+
113+
-- Now query the updated data
114+
SELECT COUNT(*) FROM products; -- Will show all 4 products
115+
```
116+
117+
## Session Hints and Connection Management
118+
119+
### Using Session Hints
120+
121+
To ensure users consistently connect to the same replica (improving caching and consistency), you can use session hints when connecting:
122+
123+
```python
124+
# Example using session hint in connection setup
125+
user_spec = {
126+
"database": "my_database",
127+
"token": "read_scaling_token_here",
128+
"hint": "user123" # Consistent hint for this user
129+
}
130+
```
131+
132+
### Connection String Parameters
133+
134+
You can also configure additional parameters in MotherDuck connection strings:
135+
136+
- **session_hint**: Routes clients with the same hint to the same replica
137+
- **dbinstance_inactivity_ttl**: Sets cache TTL in seconds (e.g., `md:?dbinstance_inactivity_ttl=300`)
138+
139+
## Limitations and Behavior
140+
141+
### Read-Only Restrictions
142+
143+
Read-scaling connections cannot perform write operations, including:
144+
145+
- **Data Modifications**: `INSERT`, `UPDATE`, `DELETE` operations
146+
- **Schema Changes**: `CREATE TABLE`, `DROP TABLE`, `ALTER TABLE`
147+
- **Database Management**: Creating new databases, attaching/detaching databases
148+
149+
Attempting write operations on read-scaling connections will result in an error:
150+
```
151+
Cannot execute statement of type "INSERT" on database "database_name" which is attached in read-only mode!
152+
```
153+
154+
### Data Consistency Model
155+
156+
- **Eventually Consistent**: Read replicas typically sync changes from the primary instance within a few minutes
157+
- **Lag Tolerance**: Read operations might see data that slightly lags behind the latest writes
158+
- **Manual Refresh**: Use `REFRESH DATABASE` to ensure read-scaling connections see the latest data
159+
- **Snapshot Control**: Create snapshots on the writer and refresh on readers for stricter consistency
160+
161+
### Scaling Limits
162+
163+
- **Default Replica Limit**: 16 read-scaling replicas by default
164+
- **Configurable**: Contact MotherDuck support to adjust the replica limit
165+
- **Sharing Behavior**: If the limit is exceeded, new connections share existing replicas
166+
- **Affinity Preservation**: MotherDuck maintains user-to-replica affinity where possible

0 commit comments

Comments
 (0)