Changes from 4 commits
75 changes: 75 additions & 0 deletions benchmark_report.md
@@ -0,0 +1,75 @@
# Cohort Generation Execution Benchmark

*Date generated:* 2026-03-18 14:24:28.792419

This report compares the execution time of cohort generation between the traditional **CirceR (Java + T-SQL)** and the new **circe_py (Python + Ibis)** implementation.

### Aggregate Performance
- **CirceR/Java Average Median Time:** 19.30 ms
- **CircePy/Ibis Average Median Time:** 18.85 ms

> **Conclusion:** `circe_py` (Ibis) is generally **faster** than Circe-be/`SqlRender` by **~2.3%** when evaluating identically generated cohorts.
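
The ~2.3% figure follows directly from the two aggregate medians above; a quick sketch of the arithmetic, using the values reported in this file:

```python
# Aggregate median times from this report, in milliseconds
java_avg = 19.30  # CirceR/Java
ibis_avg = 18.85  # CircePy/Ibis

# Percent improvement relative to the Java baseline
speedup_pct = (java_avg - ibis_avg) / java_avg * 100
print(f"~{speedup_pct:.1f}% faster")  # -> ~2.3% faster
```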

### Raw Results
| Cohort | Approach | Min | lq | Mean | Median | uq | Max | Neval |
|---|---|---|---|---|---|---|---|---|
| 10.json | CirceR_Java_DBI | 13.90 | 14.10 | 15.17 | 14.40 | 15.90 | 18.44 | 10 |
| | CircePy_Ibis_DuckDB | 4.48 | 4.58 | 5.47 | 5.21 | 6.28 | 7.43 | 10 |
| 100.json | CirceR_Java_DBI | 23.87 | 24.86 | 27.73 | 27.13 | 29.68 | 33.68 | 10 |
| | CircePy_Ibis_DuckDB | 10.41 | 11.21 | 12.46 | 12.40 | 13.83 | 14.40 | 10 |
| 1000.json | CirceR_Java_DBI | 12.55 | 12.67 | 13.46 | 13.01 | 13.90 | 15.70 | 10 |
| | CircePy_Ibis_DuckDB | 2.96 | 3.25 | 3.51 | 3.52 | 3.60 | 4.18 | 10 |
| 1001.json | CirceR_Java_DBI | 13.38 | 13.56 | 14.38 | 14.24 | 14.47 | 17.32 | 10 |
| | CircePy_Ibis_DuckDB | 3.53 | 3.82 | 4.17 | 4.04 | 4.28 | 5.66 | 10 |
| 1002.json | CirceR_Java_DBI | 13.08 | 13.56 | 14.08 | 14.09 | 14.54 | 15.20 | 10 |
| | CircePy_Ibis_DuckDB | 2.93 | 3.01 | 3.35 | 3.39 | 3.65 | 3.78 | 10 |
| 1003.json | CirceR_Java_DBI | 10.94 | 11.31 | 12.26 | 11.55 | 12.06 | 15.87 | 10 |
| | CircePy_Ibis_DuckDB | 4.93 | 5.53 | 6.57 | 5.68 | 6.88 | 12.75 | 10 |
| 1004.json | CirceR_Java_DBI | 11.22 | 11.73 | 13.00 | 12.40 | 13.43 | 16.80 | 10 |
| | CircePy_Ibis_DuckDB | 4.64 | 5.08 | 5.50 | 5.44 | 5.65 | 7.15 | 10 |
| 1005.json | CirceR_Java_DBI | 14.40 | 14.71 | 16.63 | 16.11 | 18.71 | 19.41 | 10 |
| | CircePy_Ibis_DuckDB | 3.87 | 4.15 | 4.65 | 4.29 | 4.45 | 8.27 | 10 |
| 1006.json | CirceR_Java_DBI | 17.88 | 19.09 | 23.73 | 22.73 | 27.82 | 36.86 | 10 |
| | CircePy_Ibis_DuckDB | 4.05 | 4.36 | 5.27 | 5.19 | 5.70 | 7.24 | 10 |
| 1007.json | CirceR_Java_DBI | 27.90 | 29.12 | 31.12 | 29.98 | 31.89 | 37.63 | 10 |
| | CircePy_Ibis_DuckDB | 19.45 | 19.87 | 24.08 | 23.85 | 25.21 | 34.15 | 10 |
| 1009.json | CirceR_Java_DBI | 51.96 | 52.18 | 54.75 | 53.13 | 57.70 | 60.11 | 10 |
| | CircePy_Ibis_DuckDB | 109.65 | 118.49 | 126.79 | 123.25 | 135.94 | 146.27 | 10 |
| 1010.json | CirceR_Java_DBI | 15.56 | 15.90 | 16.91 | 16.52 | 17.45 | 20.41 | 10 |
| | CircePy_Ibis_DuckDB | 32.26 | 34.48 | 36.62 | 35.08 | 37.44 | 47.88 | 10 |
| 1011.json | CirceR_Java_DBI | 11.31 | 11.76 | 12.08 | 12.06 | 12.42 | 13.06 | 10 |
| | CircePy_Ibis_DuckDB | 4.98 | 5.18 | 5.56 | 5.59 | 6.04 | 6.15 | 10 |
| 1012.json | CirceR_Java_DBI | 11.65 | 11.82 | 12.64 | 12.12 | 12.89 | 16.19 | 10 |
| | CircePy_Ibis_DuckDB | 5.59 | 5.81 | 6.13 | 6.11 | 6.27 | 6.86 | 10 |
| 1013.json | CirceR_Java_DBI | 11.42 | 11.77 | 12.29 | 12.32 | 12.92 | 13.18 | 10 |
| | CircePy_Ibis_DuckDB | 4.86 | 5.37 | 5.48 | 5.41 | 5.74 | 6.14 | 10 |
| 1016.json | CirceR_Java_DBI | 14.41 | 14.75 | 18.62 | 16.87 | 20.66 | 30.34 | 10 |
| | CircePy_Ibis_DuckDB | 7.85 | 9.19 | 10.87 | 9.60 | 13.18 | 18.71 | 10 |
| 1017.json | CirceR_Java_DBI | 11.96 | 12.23 | 12.91 | 12.67 | 13.50 | 15.13 | 10 |
| | CircePy_Ibis_DuckDB | 6.49 | 6.75 | 7.40 | 7.11 | 7.35 | 10.22 | 10 |
| 1018.json | CirceR_Java_DBI | 11.65 | 12.45 | 14.46 | 13.53 | 15.68 | 21.79 | 10 |
| | CircePy_Ibis_DuckDB | 4.56 | 5.07 | 7.10 | 5.34 | 8.30 | 12.47 | 10 |
| 1019.json | CirceR_Java_DBI | 16.19 | 17.09 | 18.92 | 17.54 | 19.98 | 25.55 | 10 |
| | CircePy_Ibis_DuckDB | 16.94 | 17.45 | 19.03 | 18.22 | 21.05 | 22.03 | 10 |
| 1020.json | CirceR_Java_DBI | 26.52 | 28.96 | 37.04 | 34.96 | 44.87 | 51.18 | 10 |
| | CircePy_Ibis_DuckDB | 18.99 | 19.20 | 22.83 | 19.41 | 24.89 | 35.34 | 10 |
| 1021.json | CirceR_Java_DBI | 21.82 | 24.82 | 33.44 | 29.32 | 42.61 | 51.70 | 10 |
| | CircePy_Ibis_DuckDB | 40.08 | 46.80 | 55.84 | 53.25 | 69.19 | 72.25 | 10 |
| 1022.json | CirceR_Java_DBI | 19.17 | 21.31 | 27.71 | 22.62 | 35.44 | 49.43 | 10 |
| | CircePy_Ibis_DuckDB | 24.18 | 24.78 | 27.38 | 26.16 | 28.88 | 36.16 | 10 |
| 1023.json | CirceR_Java_DBI | 25.40 | 27.33 | 28.62 | 28.36 | 29.49 | 32.65 | 10 |
| | CircePy_Ibis_DuckDB | 38.29 | 38.63 | 40.16 | 39.66 | 40.84 | 45.80 | 10 |
| 1024.json | CirceR_Java_DBI | 17.94 | 19.07 | 22.18 | 20.38 | 24.48 | 32.79 | 10 |
| | CircePy_Ibis_DuckDB | 21.80 | 22.29 | 24.13 | 22.64 | 23.39 | 36.53 | 10 |
| 1025.json | CirceR_Java_DBI | 29.39 | 30.29 | 31.92 | 30.97 | 32.82 | 37.89 | 10 |
| | CircePy_Ibis_DuckDB | 34.85 | 36.47 | 39.93 | 38.42 | 43.38 | 52.41 | 10 |
| 1026.json | CirceR_Java_DBI | 13.14 | 14.88 | 20.48 | 16.16 | 25.98 | 43.50 | 10 |
| | CircePy_Ibis_DuckDB | 5.78 | 6.12 | 6.71 | 6.39 | 7.23 | 8.68 | 10 |
| 1027.json | CirceR_Java_DBI | 11.86 | 12.30 | 12.68 | 12.70 | 13.13 | 13.49 | 10 |
| | CircePy_Ibis_DuckDB | 5.24 | 5.54 | 5.69 | 5.68 | 5.88 | 6.15 | 10 |
| 1028.json | CirceR_Java_DBI | 12.69 | 13.13 | 13.69 | 13.37 | 13.62 | 16.58 | 10 |
| | CircePy_Ibis_DuckDB | 4.22 | 4.68 | 4.99 | 4.95 | 5.48 | 5.89 | 10 |
| 1029.json | CirceR_Java_DBI | 13.26 | 13.40 | 15.09 | 14.42 | 14.89 | 22.23 | 10 |
| | CircePy_Ibis_DuckDB | 51.15 | 51.99 | 56.75 | 53.93 | 55.72 | 78.29 | 10 |
| 1030.json | CirceR_Java_DBI | 12.27 | 12.96 | 13.62 | 13.43 | 14.31 | 15.24 | 10 |
| | CircePy_Ibis_DuckDB | 5.82 | 6.05 | 6.33 | 6.38 | 6.54 | 6.76 | 10 |
42 changes: 42 additions & 0 deletions benchmark_report_databricks.md
Collaborator Author
@egillax this shows a ~55% improvement, but note that this is measured after the lazy evaluation has completed. A better approach might be to render the SQL and compare just that step. From a user's perspective, Ibis adds processing overhead that may make cohort generation slower.
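
The render-only comparison suggested in this comment could be sketched with a generic timing helper like the one below. The two callables are placeholders (hypothetical): in a real run they would wrap `CirceR::buildCohortQuery` + `SqlRender::translate` on the R side and `circe_api.build_cohort(...)` + `ibis.to_sql(...)` on the Python side.

```python
import statistics
import timeit

def median_ms(fn, n=10):
    """Median wall-clock time of fn over n single calls, in milliseconds."""
    return statistics.median(timeit.timeit(fn, number=1) * 1000 for _ in range(n))

# Placeholder render callables -- stand-ins for the real SQL-rendering paths
render_java = lambda: "SELECT 1;" * 100
render_ibis = lambda: "SELECT 1;" * 100

print(f"java render: {median_ms(render_java):.2f} ms")
print(f"ibis render: {median_ms(render_ibis):.2f} ms")
```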

@@ -0,0 +1,42 @@
# Cohort Generation Execution Benchmark (Databricks)

*Date generated:* 2026-03-18 19:13:18.096735

This report compares the execution time of cohort generation on **Databricks** between the traditional **CirceR (Java + SqlRender)** and the new **circe_py (Python + Ibis)** implementation.

## Benchmark Configuration

- **Pre-compilation**: Both approaches pre-compile SQL/relations before execution
- **Validation**: Python uses `skip_validation=True` to bypass table/row checks (matching the Java implementation's behavior)
- **Spark Optimizations**: Adaptive query execution, partition coalescing, and broadcast joins enabled
- **Iterations**: Each cohort benchmarked once (`Neval = 1`) on Databricks with otherwise identical parameters, so Min, Median, and Max coincide

### Aggregate Performance
- **CirceR/Java Average Median Time:** 48.78 ms
- **CircePy/Ibis Average Median Time:** 21.62 ms

> **Conclusion:** `circe_py` (Ibis) is generally **faster** than Circe-be/`SqlRender` by **~55.7%** when evaluating identically generated cohorts.

### Raw Results
| Cohort | Approach | Min | lq | Mean | Median | uq | Max | Neval |
|---|---|---|---|---|---|---|---|---|
| 10.json | CirceR_Java_Databricks | 53.26 | 53.26 | 53.26 | 53.26 | 53.26 | 53.26 | 1 |
| | CircePy_Ibis_Databricks | 41.86 | 41.86 | 41.86 | 41.86 | 41.86 | 41.86 | 1 |
| 100.json | CirceR_Java_Databricks | 50.24 | 50.24 | 50.24 | 50.24 | 50.24 | 50.24 | 1 |
| | CircePy_Ibis_Databricks | 26.66 | 26.66 | 26.66 | 26.66 | 26.66 | 26.66 | 1 |
| 1000.json | CirceR_Java_Databricks | 50.52 | 50.52 | 50.52 | 50.52 | 50.52 | 50.52 | 1 |
| | CircePy_Ibis_Databricks | 15.47 | 15.47 | 15.47 | 15.47 | 15.47 | 15.47 | 1 |
| 1001.json | CirceR_Java_Databricks | 59.03 | 59.03 | 59.03 | 59.03 | 59.03 | 59.03 | 1 |
| | CircePy_Ibis_Databricks | 37.61 | 37.61 | 37.61 | 37.61 | 37.61 | 37.61 | 1 |
| 1002.json | CirceR_Java_Databricks | 33.83 | 33.83 | 33.83 | 33.83 | 33.83 | 33.83 | 1 |
| | CircePy_Ibis_Databricks | 9.94 | 9.94 | 9.94 | 9.94 | 9.94 | 9.94 | 1 |
| 1003.json | CirceR_Java_Databricks | 33.42 | 33.42 | 33.42 | 33.42 | 33.42 | 33.42 | 1 |
| | CircePy_Ibis_Databricks | 11.35 | 11.35 | 11.35 | 11.35 | 11.35 | 11.35 | 1 |
| 1004.json | CirceR_Java_Databricks | 38.99 | 38.99 | 38.99 | 38.99 | 38.99 | 38.99 | 1 |
| | CircePy_Ibis_Databricks | 11.68 | 11.68 | 11.68 | 11.68 | 11.68 | 11.68 | 1 |
| 1005.json | CirceR_Java_Databricks | 55.79 | 55.79 | 55.79 | 55.79 | 55.79 | 55.79 | 1 |
| | CircePy_Ibis_Databricks | 20.65 | 20.65 | 20.65 | 20.65 | 20.65 | 20.65 | 1 |
| 1006.json | CirceR_Java_Databricks | 69.09 | 69.09 | 69.09 | 69.09 | 69.09 | 69.09 | 1 |
| | CircePy_Ibis_Databricks | 19.52 | 19.52 | 19.52 | 19.52 | 19.52 | 19.52 | 1 |
| 1007.json | CirceR_Java_Databricks | 43.67 | 43.67 | 43.67 | 43.67 | 43.67 | 43.67 | 1 |
| | CircePy_Ibis_Databricks | 21.49 | 21.49 | 21.49 | 21.49 | 21.49 | 21.49 | 1 |
201 changes: 201 additions & 0 deletions scripts/benchmark_cohort_generation.R
@@ -0,0 +1,201 @@
#!/usr/bin/env Rscript
# scripts/benchmark_cohort_generation.R
# Benchmark cohort GENERATION (execution) process: CirceR + SqlRender vs Ibis (Python)

# Ensure packages are installed
required_packages <- c("microbenchmark", "reticulate", "CirceR", "SqlRender", "DBI", "duckdb")
for (pkg in required_packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    cat(sprintf("Installing %s...\n", pkg))
    install.packages(pkg, repos = "https://cloud.r-project.org")
  }
}

library(microbenchmark)
library(reticulate)
library(CirceR)
library(SqlRender)
library(DBI)
library(duckdb)

cat("Copying Eunomia to a temporary writable database for the benchmark...\n")
eunomia_source <- "tests/eunomia.duckdb"
if (!file.exists(eunomia_source)) {
  eunomia_source <- file.path("..", "tests", "eunomia.duckdb")
}
temp_db <- tempfile(fileext = ".duckdb")
file.copy(eunomia_source, temp_db)

cat("Setting up R DuckDB connection using temp Eunomia...\n")
r_con <- dbConnect(duckdb::duckdb(), dbdir = temp_db)
venv_path <- file.path(getwd(), ".venv")
if (dir.exists(venv_path)) {
  use_python(file.path(venv_path, "bin", "python"), required = TRUE)
} # else: fall back to reticulate's default Python environment

cat("Importing Python libraries via reticulate...\n")
circe_api <- import("circe.api")
ibis <- import("ibis")

cat("Preparing Ibis duckdb backend referencing temp Eunomia...\n")
py_backend <- ibis$duckdb$connect(temp_db)

# ==============================================================================
# DEFAULT EUNOMIA BENCHMARK CONFIGURATION
# ==============================================================================
TARGET_DIALECT <- "duckdb"
CDM_SCHEMA <- "main"
TARGET_SCHEMA <- "main"
COHORT_TABLE <- "cohort"

# ==============================================================================
# LIVE DATABASE CONFIGURATION (Uncomment and edit to benchmark real databases)
# ==============================================================================
# library(DatabaseConnector)
# connectionDetails <- createConnectionDetails(dbms = "postgresql", server = "localhost/cdm", user = "user", password = "pw")
# r_con <- connect(connectionDetails)
# py_backend <- ibis$postgres$connect(host="localhost", database="cdm", user="user", password="pw")
# TARGET_DIALECT <- "postgresql" # Or your true dialect
# CDM_SCHEMA <- "cdm"
# TARGET_SCHEMA <- "results"
# COHORT_TABLE <- "cohort"
# ==============================================================================

cohorts_dir <- file.path("tests", "cohorts")
if (!dir.exists(cohorts_dir)) {
  cohorts_dir <- file.path("..", "tests", "cohorts")
}

json_files <- list.files(cohorts_dir, pattern = "\\.json$", full.names = TRUE)
if (length(json_files) == 0) stop("No cohort JSON files found in tests/cohorts/")
sample_files <- head(json_files, 30)

cat("\nStarting Generation Benchmark...\n")

all_results <- list()

for (file in sample_files) {
  cat("\n========================================\n")
  cat(sprintf("Benchmarking Cohort Generation: %s\n", basename(file)))
  cat("========================================\n")

  json_str <- readChar(file, file.info(file)$size)

  # PRE-COMPILE JAVA SQL
  cohort_expr <- CirceR::cohortExpressionFromJson(json_str)
  gen_options <- CirceR::createGenerateOptions(generateStats = FALSE)  # avoid masking base::options
  sql <- CirceR::buildCohortQuery(cohort_expr, options = gen_options)
  rendered_sql <- SqlRender::render(sql,
    vocabulary_database_schema = CDM_SCHEMA,
    cdm_database_schema = CDM_SCHEMA,
    target_database_schema = TARGET_SCHEMA,
    results_database_schema = TARGET_SCHEMA,
    target_cohort_table = COHORT_TABLE,
    target_cohort_id = 1)
  translated_sql <- SqlRender::translate(rendered_sql, targetDialect = TARGET_DIALECT)
  java_queries <- strsplit(translated_sql, ";\n*\\s*")[[1]]
  # Drop empty fragments left over from the split
  java_queries_clean <- Filter(nzchar, trimws(java_queries))

  # PRE-COMPILE IBIS SQL
  py_expr <- circe_api$cohort_expression_from_json(json_str)
  table_expr <- circe_api$build_cohort(py_expr, backend = py_backend, cdm_schema = CDM_SCHEMA)
  ibis_sql_str <- ibis$to_sql(table_expr)

  run_circe_java_execution <- function() {
    # Execute the pre-compiled statements directly on the DBI connection
    for (q in java_queries_clean) {
      if (!is.null(q)) {
        dbExecute(r_con, q)
      }
    }
  }

  run_circe_ibis_execution <- function() {
    # Execute the pre-compiled Ibis SQL directly on duckdb
    py_backend$raw_sql(ibis_sql_str)
  }

  # Time only the execution step; SQL compilation happened above
  cat(" Running Pure Execution Microbenchmark (10 iterations)...\n")
  mb <- microbenchmark(
    CirceR_Java_DBI = run_circe_java_execution(),
    CircePy_Ibis_DuckDB = run_circe_ibis_execution(),
    times = 10
  )
  print(mb)
  all_results[[basename(file)]] <- summary(mb)
}

# Generate Markdown Report
report_file <- "benchmark_report.md"
cat("Generating report...\n")

java_medians <- c()
ibis_medians <- c()

for (cohort_name in names(all_results)) {
  res <- all_results[[cohort_name]]
  ja <- res$median[res$expr == "CirceR_Java_DBI"]
  ib <- res$median[res$expr == "CircePy_Ibis_DuckDB"]
  if (length(ja) > 0) java_medians <- c(java_medians, ja)
  if (length(ib) > 0) ibis_medians <- c(ibis_medians, ib)
}

java_avg <- mean(java_medians)
ibis_avg <- mean(ibis_medians)

# Percent difference expressed relative to the Java/SqlRender baseline in both directions
diff_pct <- abs(java_avg - ibis_avg) / java_avg * 100
faster_stmt <- if (ibis_avg < java_avg) "faster" else "slower"

report_summary <- sprintf("> **Conclusion:** `circe_py` (Ibis) is generally **%s** than Circe-be/`SqlRender` by **~%.1f%%** when evaluating identically generated cohorts.", faster_stmt, diff_pct)

report_lines <- c(
  "# Cohort Generation Execution Benchmark",
  "",
  paste("*Date generated:*", Sys.time()),
  "",
  "This report compares the execution time of cohort generation between the traditional **CirceR (Java + T-SQL)** and the new **circe_py (Python + Ibis)** implementation.",
  "",
  "### Aggregate Performance",
  sprintf("- **CirceR/Java Average Median Time:** %.2f ms", java_avg),
  sprintf("- **CircePy/Ibis Average Median Time:** %.2f ms", ibis_avg),
  "",
  report_summary,
  "",
  "### Raw Results",
  "| Cohort | Approach | Min | lq | Mean | Median | uq | Max | Neval |",
  "|---|---|---|---|---|---|---|---|---|"
)

for (cohort_name in names(all_results)) {
  res <- all_results[[cohort_name]]
  for (i in seq_len(nrow(res))) {
    row <- res[i, ]
    expr_name <- as.character(row$expr)

    # summary(mb) reports times in the unit microbenchmark picks automatically
    # (milliseconds at these magnitudes); round to two decimals for the table.
    format_num <- function(x) sprintf("%.2f", x)

    report_lines <- c(report_lines, sprintf(
      "| %s | %s | %s | %s | %s | %s | %s | %s | %d |",
      if (i == 1) cohort_name else "",
      expr_name,
      format_num(row$min), format_num(row$lq), format_num(row$mean),
      format_num(row$median), format_num(row$uq), format_num(row$max), row$neval))
  }
}

writeLines(report_lines, report_file)
cat(sprintf("\nBenchmark Complete. Report saved to %s\n", report_file))
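
The script's aggregation step (average of per-cohort medians, percent difference against the Java baseline) can be mirrored in a few lines of Python for sanity-checking a generated report. The sample values below are the medians of the first three cohorts in `benchmark_report.md`:

```python
import statistics

# Per-cohort median times (ms): cohorts 10, 100, 1000 from benchmark_report.md
java_medians = [14.40, 27.13, 13.01]
ibis_medians = [5.21, 12.40, 3.52]

java_avg = statistics.mean(java_medians)
ibis_avg = statistics.mean(ibis_medians)

# Percent difference relative to the Java baseline, as in the report text
diff_pct = abs(java_avg - ibis_avg) / java_avg * 100
faster = "faster" if ibis_avg < java_avg else "slower"
print(f"circe_py (Ibis) is {faster} by ~{diff_pct:.1f}%")
```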