Skip to content

Segmentation fault with RTREE index #681

@vetleaa

Description

@vetleaa

Description

I've been encountering a segmentation fault specifically when using RTREE index in either @duckdb/node-api or the unofficial DuckDB.NET (both using duckdb v1.3.2, C# on Windows, NodeJS on a Mac). So far the issue appears to happen most of the time when hitting a checkpoint threshold, but it doesnt happen every single time.

Increasing the checkout_threshold delays the seg fault at least until the checkpoint has been hit once.

I've been able to retrieve rows and validate that the data returned is reasonable and running logging pragma with log_level='TRACE' did not seem to give any more info about the crash. Additionally I tried to disable the optimizer, but it didnt help.

Is there anything obvious I'm missing here?

Minimal reproduction

NodeJS reproduction As you can see, I've reduced the checkpoint_threshold to encounter the issue quicker, but I initially had the default 16MB.

The error does not happen on every single checkpoint, but it does most of the time.

import fs from 'fs';
import { DuckDBConnection, DuckDBInstance, version } from "@duckdb/node-api";

const instance = await DuckDBInstance.create("test_db.db")

console.log("DuckDB version:", version()); // v1.3.2

const conn = await DuckDBConnection.create(instance);

await conn.run("INSTALL spatial;");
await conn.run("LOAD spatial;");
await conn.run("SET checkpoint_threshold = '4MB';");
await conn.run("CREATE TABLE IF NOT EXISTS testtable (position GEOMETRY);");
await conn.run("CREATE INDEX IF NOT EXISTS idx_testtable_geom ON testtable USING RTREE (position);");

const reps = 100_000;
let totalRowsAffected = 0;

for (let i = 0; i < reps; i++) {
    const prepared = await conn.prepare("INSERT INTO testtable (position) VALUES (ST_Point($Longitude, $Latitude));");

    prepared.bind({
        Longitude: 0.1 + Math.random() * 40,
        Latitude: 43.1 + Math.random() * 10,
    });

    totalRowsAffected += (await prepared.run()).rowsChanged;

    if (i % 2_000 === 0) {
        console.log(i, "reps, ", fs.statSync("./test_db.db.wal").size / 1_000_000, "MB WAL");
    }
}

console.log("SUCCESS");

conn.closeSync();
instance.closeSync();
C# reproduction
[Test]
public void TestRtreeCrash() {
    var dbFilePath = ""; // path to a DB file (In-memory db doesnt appear to have the issue)
    var connectionString = $"Data Source='{dbFilePath}'";
    using var db = new DuckDBConnection(connectionString);

    db.Open();
    db.Execute("INSTALL spatial;");
    db.Execute("LOAD spatial;");
    db.Execute("SET checkpoint_threshold = '2MB';");

    db.Execute(@"
        CREATE TABLE IF NOT EXISTS testtable (position GEOMETRY);
        CREATE INDEX IF NOT EXISTS idx_testtable_geom ON testtable USING RTREE (position);
    ");

    var rng = new Random();
    var reps = 50_000;
    var totalRowsAffected = 0;

    for (var i = 0; i < reps; i++)
    {
        using var cmd = db.CreateCommand();
        cmd.CommandText = "INSERT INTO testtable (position) VALUES (ST_Point($Longitude, $Latitude))";

        cmd.Parameters.Add(new DuckDBParameter("Longitude", 0.1 + rng.NextDouble() * 40));
        cmd.Parameters.Add(new DuckDBParameter("Latitude", 43.1 + rng.NextDouble() * 10));

        totalRowsAffected += cmd.ExecuteNonQuery();
    }

    Assert.That(totalRowsAffected, Is.EqualTo(reps));
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions