A Python-based benchmarking tool to compare query performance and cost between Snowflake and Databricks using the TPC-H SF1000 (1TB) dataset.
This benchmark executes all 22 TPC-H queries against both Snowflake and Databricks, measuring execution time, cost (credits/DBUs), and detailed performance metrics. Results are stored in DuckDB for analysis and comparison using dbt-generated views.
The tool supports five benchmarking scenarios:

- **Normal (Sequential)**
  - Executes all 22 queries sequentially on a warm warehouse
  - Warehouse remains running throughout the entire run
  - The first query experiences a cold start; subsequent queries benefit from the warm warehouse
  - Default mode for comprehensive performance testing
- **Coldstart**
  - Warehouse suspended/stopped between each query
  - Each query experiences the full cold start overhead (warehouse startup time)
  - Tests worst-case performance when no cache is available
  - Defaults to queries 1, 3, 5, 10, 18 (configurable)
  - Useful for understanding cold cache performance
- **Concurrent**
  - All 22 queries executed in parallel on the same warehouse
  - Tests warehouse concurrency and resource contention
  - Measures performance degradation under concurrent load
  - Uses multi-cluster warehouses to handle parallel execution
  - Defaults to all 22 queries (configurable)
- **CTAS (Create Table As Select)**
  - Benchmarks write performance by creating tables with different data shapes
  - Tests data ingestion and table creation performance
  - Five variants with different row counts and column widths (one variant is sketched after this list):
    - `narrow_tall`: Many rows, few columns (~6B rows, minimal columns)
    - `standard_tall`: Many rows, standard columns (~6B rows)
    - `medium_wide`: Moderate rows, more columns
    - `very_wide`: Fewer rows, many columns (wide denormalized table)
    - `filtered`: Subset of data with filters applied
  - Tables are automatically dropped after benchmark completion
- **DML (Delete + Insert)**
  - Benchmarks data modification performance using DELETE and INSERT operations
  - Simulates a partition refresh pattern: delete a monthly slice, then re-insert it (sketched after this list)
  - Operations:
    - `delete`: Delete June 1995 lineitem data (~7.5M rows)
    - `insert`: Insert June 1995 lineitem data from the source table (~7.5M rows)
  - Tests real-world data update patterns common in data warehouses
  - Requires a `lineitem_dml` table (a copy of `lineitem`), which is created automatically
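For a concrete picture of the two write scenarios, here is illustrative SQL for one CTAS variant and for the DML partition refresh, wrapped as Python strings. This is a sketch only: the column list, the `l_shipdate` date filter, and the target table names are assumptions, and the statements the tool actually issues may differ per platform.

```python
# Illustrative SQL for the CTAS and DML scenarios described above, as Python
# string constants. Column choices, the date filter, and table names are
# assumptions; the benchmark's real statements may differ per platform.

# CTAS, narrow_tall variant: many rows, few columns.
CTAS_NARROW_TALL = """
CREATE TABLE ctas_narrow_tall AS
SELECT l_orderkey, l_partkey, l_quantity
FROM lineitem
"""

# DML scenario: delete one monthly slice from lineitem_dml, then re-insert it
# from the source lineitem table (the partition-refresh pattern).
DML_DELETE = """
DELETE FROM lineitem_dml
WHERE l_shipdate >= DATE '1995-06-01' AND l_shipdate < DATE '1995-07-01'
"""

DML_INSERT = """
INSERT INTO lineitem_dml
SELECT * FROM lineitem
WHERE l_shipdate >= DATE '1995-06-01' AND l_shipdate < DATE '1995-07-01'
"""
```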
Use `--scenario normal`, `--scenario coldstart`, `--scenario concurrent`, `--scenario ctas`, or `--scenario dml` to run a single scenario, or `--scenario all` (the default) to run all scenarios with a unified run ID.
If you do not have `uv` installed, run `curl -LsSf https://astral.sh/uv/install.sh | sh`. See the uv docs for details and troubleshooting.
uv sync

Note: `uv` will automatically install the Snowflake and Databricks CLI tools for you.
Run the automated setup script from the project root:
uv run setup_config.py

This interactive script will:
- ✓ Prompt for your Snowflake connection name
- ✓ Prompt for Databricks workspace credentials
- ✓ Discover available catalogs and schemas
- ✓ Generate your `.env` file automatically
Note: SQL Warehouses are created and destroyed automatically during benchmark runs. No manual warehouse configuration needed.
Warehouse Naming: Warehouses are named with the pattern `{PREFIX}_{SIZE}_{SCENARIO}_{RUN_ID}` to prevent conflicts when running multiple scenarios. For example: `BENCHMARK_WH_MEDIUM_NORMAL_001` and `BENCHMARK_WH_MEDIUM_COLDSTART_001`.
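For illustration, the naming scheme amounts to a helper along these lines (a sketch only; the function is hypothetical, not the tool's actual code):

```python
def warehouse_name(prefix: str, size: str, scenario: str, run_id: str) -> str:
    """Build a warehouse name following {PREFIX}_{SIZE}_{SCENARIO}_{RUN_ID}."""
    return f"{prefix}_{size}_{scenario}_{run_id}".upper()

# warehouse_name("BENCHMARK_WH", "medium", "normal", "001")
# -> "BENCHMARK_WH_MEDIUM_NORMAL_001"
```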
If you prefer to configure manually, copy `.env.example` to `.env` and set all user-specific environment variables for your environment.
Ensure you have the Snowflake CLI installed and configured.
First, add your Snowflake CLI connection:
snow connection add --connection <your-connection-name>
snow connection test --connection <your-connection-name>

Then configure these variables in `.env`:
- `SNOWFLAKE_CONNECTION` - Your Snowflake CLI connection name (required)
- `SNOWFLAKE_ROLE` - Role to use for the benchmark (default: `BENCHMARK`)
- `SNOWFLAKE_DATABASE` - Database for benchmark tables (default: `BENCHMARK`)
- `SNOWFLAKE_SCHEMA` - Schema for benchmark tables (default: `PUBLIC`)
- `SNOWFLAKE_WAREHOUSE_PREFIX` - Prefix for warehouse names (default: `BENCHMARK_WH`)
  - Full warehouse names will be `{PREFIX}_small`, `{PREFIX}_medium`, `{PREFIX}_xlarge`, with run IDs appended
Example:
export SNOWFLAKE_CONNECTION=my_connection
export SNOWFLAKE_ROLE=BENCHMARK
export SNOWFLAKE_DATABASE=MY_BENCHMARK_DB
export SNOWFLAKE_SCHEMA=PUBLIC
export SNOWFLAKE_WAREHOUSE_PREFIX=MY_WH

IMPORTANT: TPC-H SF1000 Data Requirement
Unlike Snowflake (which provides SNOWFLAKE_SAMPLE_DATA.TPCH_SF1000), Databricks does not include a pre-loaded TPC-H SF1000 (1TB) dataset. You must ETL the TPC-H data into your Databricks catalog before running benchmarks.
The benchmark expects these tables in your catalog/schema:
`customer`, `lineitem`, `nation`, `orders`, `part`, `partsupp`, `region`, `supplier`
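Once the data is loaded and the `DATABRICKS_*` variables below are set, you can optionally sanity-check that all eight tables are visible. A minimal sketch using the `databricks-sql-connector` package; `DATABRICKS_HTTP_PATH` is an extra variable introduced here for illustration only and should point at any running SQL warehouse (the benchmark's own warehouses are created on the fly):

```python
# Optional sanity check (not part of the benchmark tool): verify the eight
# TPC-H tables exist in your catalog/schema. Requires databricks-sql-connector.
import os
from databricks import sql

EXPECTED = {"customer", "lineitem", "nation", "orders",
            "part", "partsupp", "region", "supplier"}

with sql.connect(
    server_hostname=os.environ["DATABRICKS_HOST"].removeprefix("https://"),
    http_path=os.environ["DATABRICKS_HTTP_PATH"],  # assumption: HTTP path of any running SQL warehouse
    access_token=os.environ["DATABRICKS_TOKEN"],
) as conn, conn.cursor() as cur:
    cur.execute(
        f"SHOW TABLES IN {os.environ['DATABRICKS_CATALOG']}.{os.environ['DATABRICKS_SCHEMA']}"
    )
    found = {row[1] for row in cur.fetchall()}  # column 1 of SHOW TABLES output is tableName
    missing = EXPECTED - found
    print("Missing tables:", ", ".join(sorted(missing)) or "none")
```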
Configure these variables in .env:
- `DATABRICKS_HOST` - Your Databricks workspace URL (required)
  - Format: `https://dbc-xxxxxxxxx.cloud.databricks.com`
- `DATABRICKS_TOKEN` - Your Databricks personal access token (required)
- `DATABRICKS_CATALOG` - Catalog containing your TPC-H SF1000 tables (required, user-specific)
- `DATABRICKS_SCHEMA` - Schema containing your TPC-H SF1000 tables (required, user-specific)
Note: SQL Warehouses are created and destroyed automatically during benchmark runs. No pre-configuration of warehouses needed.
Example:
export DATABRICKS_HOST=https://dbc-abc123.cloud.databricks.com
export DATABRICKS_TOKEN=dapi_abc123xyz789
export DATABRICKS_CATALOG=your_tpch_catalog
export DATABRICKS_SCHEMA=sf1000

# Run all scenarios with default settings (medium warehouse)
uv run main.py

| Flag | Options | Description |
|---|---|---|
| `--warehouse-size` | `small`, `medium`, `large`, `xl`, `2xl`, `all`, or comma-separated | Warehouse size(s) to use. Automatically maps to platform-specific sizes (see the lookup sketch after this table): • `small`: Snowflake Small / Databricks Small • `medium`: Snowflake Medium / Databricks Small (default) • `large`: Snowflake Large / Databricks Medium • `xl`: Snowflake XLarge / Databricks Large • `2xl`: Snowflake 2XLarge / Databricks XLarge • `all`: run all sizes • comma-separated: e.g., `medium,xl` |
| `--scenario` | `normal`, `coldstart`, `concurrent`, `ctas`, `dml`, `all` | Benchmark scenario to run: • `normal`: sequential queries on a warm warehouse • `coldstart`: warehouse suspended between each query (defaults to queries 1,3,5,10,18 if not specified) • `concurrent`: all queries executed in parallel on the same warehouse • `ctas`: Create Table As Select benchmarks with multiple data shapes • `dml`: DELETE + INSERT operations simulating a partition refresh • `all`: run all scenarios with a unified run ID (default) |
| `--queries` | e.g., `1,2,3` or `1-5` | Specific queries to run (default: all 22 TPC-H queries) |
| `--ctas-variants` | comma-separated list | CTAS variants to run (default: all). Options: `narrow_tall`, `standard_tall`, `medium_wide`, `very_wide`, `filtered` |
| `--snowflake-only` | (flag) | Run only the Snowflake benchmark (skip Databricks) |
| `--databricks-only` | (flag) | Run only the Databricks benchmark (skip Snowflake) |
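For reference, the size mapping described in the table above boils down to a simple lookup (illustrative only; the tool's internal representation may differ):

```python
# Illustrative lookup mirroring the --warehouse-size mapping in the table above.
# The names are the human-readable sizes from the table, not necessarily the
# exact identifiers the tool passes to each platform's API.
WAREHOUSE_SIZE_MAP = {
    "small":  {"snowflake": "Small",   "databricks": "Small"},
    "medium": {"snowflake": "Medium",  "databricks": "Small"},   # default
    "large":  {"snowflake": "Large",   "databricks": "Medium"},
    "xl":     {"snowflake": "XLarge",  "databricks": "Large"},
    "2xl":    {"snowflake": "2XLarge", "databricks": "XLarge"},
}
```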
# Run all scenarios with default medium warehouse
uv run main.py
# Run all scenarios with ALL warehouse sizes (small, medium, large)
uv run main.py --warehouse-size all
# Run with specific warehouse sizes (comma-separated)
uv run main.py --warehouse-size small,large
# Run with large warehouse only
uv run main.py --warehouse-size large
# Run specific queries (all scenarios)
uv run main.py --queries 1,2,3
uv run main.py --queries 1-5
# Run ONLY normal scenario (warm warehouse)
uv run main.py --scenario normal
# Run ONLY cold start scenario (warehouse suspended between queries)
uv run main.py --scenario coldstart
# Run ONLY concurrent scenario (all queries in parallel)
uv run main.py --scenario concurrent
# Explicitly run all scenarios (same as default)
uv run main.py --scenario all
# Run ALL scenarios with ALL warehouse sizes (the full matrix!)
uv run main.py --scenario all --warehouse-size all
# Run cold start with specific queries
uv run main.py --scenario coldstart --queries 1,5,10
# Run concurrent with specific queries
uv run main.py --scenario concurrent --queries 1-10
# Run only Databricks
uv run main.py --databricks-only
# Run only Snowflake
uv run main.py --snowflake-only
# Combine flags: large warehouse, specific queries, all scenarios
uv run main.py --warehouse-size large --queries 1-10 --scenario all
# Run CTAS scenario with all variants
uv run main.py --scenario ctas
# Run CTAS with specific variants (skip very_wide)
uv run main.py --scenario ctas --ctas-variants narrow_tall,standard_tall,medium_wide,filtered
# Run CTAS on multiple warehouse sizes
uv run main.py --scenario ctas --warehouse-size medium,large,xl
# Run CTAS on Snowflake small only, skipping very_wide
uv run main.py --scenario ctas --snowflake-only --warehouse-size small --ctas-variants narrow_tall,standard_tall,medium_wide,filtered
# Run CTAS on Databricks XL only (uses 2xl mapping)
uv run main.py --scenario ctas --databricks-only --warehouse-size 2xl
# Run DML scenario with default medium warehouse
uv run main.py --scenario dml
# Run DML on multiple warehouse sizes
uv run main.py --scenario dml --warehouse-size medium,large,xl
# Run DML on Snowflake only with large warehouse
uv run main.py --scenario dml --snowflake-only --warehouse-size large
# Run DML on Databricks only with XL warehouse
uv run main.py --scenario dml --databricks-only --warehouse-size xl

Both Snowflake and Databricks collect detailed cost and performance metrics in system tables, but this data is not immediately available. After running benchmarks, enrich all unenriched queries in the DuckDB database.
Run at least 1-2 hours after benchmark completion:
uv run enrich.py

This unified enrichment script runs the following in order:
- Snowflake query enrichment - data from `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY`
- Snowflake warehouse usage - data from `SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY`
- Databricks warehouse usage - data from `system.billing.usage`
Snowflake enrichment adds:
- Query compilation time
- Queue time (provisioning + repair + overload)
- Bytes scanned
- Cloud services credits used
- Total elapsed time (server-side)
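For reference, the fields above can be inspected manually with an ACCOUNT_USAGE query along these lines (a sketch; `enrich.py` may issue different SQL, and the `query_tag` filter is an assumption about how benchmark queries are identified):

```python
# Sketch of an ACCOUNT_USAGE lookup covering the fields listed above. Run the
# SQL in a Snowflake worksheet or via `snow sql -q "..."`. The query_tag filter
# is an assumption; enrich.py may match queries differently (e.g., by query_id).
SNOWFLAKE_ENRICHMENT_SQL = """
SELECT query_id,
       compilation_time,
       queued_provisioning_time + queued_repair_time + queued_overload_time AS queued_time_ms,
       bytes_scanned,
       credits_used_cloud_services,
       total_elapsed_time
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('hour', -24, CURRENT_TIMESTAMP())
  AND query_tag ILIKE '%benchmark%'
"""
```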
Databricks enrichment adds:
- Warehouse usage data for DBU cost approximation
- Approximate DBU cost per query (proportionally distributed from warehouse-hour costs)
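The proportional distribution amounts to the following (a sketch of the approximation, not the tool's exact code):

```python
def allocate_dbu_cost(query_runtimes_s: dict[str, float],
                      warehouse_dbus: float) -> dict[str, float]:
    """Approximate per-query DBU cost by splitting the warehouse's metered DBUs
    across queries in proportion to each query's share of total runtime."""
    total = sum(query_runtimes_s.values())
    return {q: warehouse_dbus * t / total for q, t in query_runtimes_s.items()}

# Example: a warehouse metered at 4.0 DBUs ran Q1 for 120 s and Q6 for 40 s
# -> Q1 is charged 3.0 DBUs, Q6 is charged 1.0 DBU.
```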
Important Notes:
- Timing: Snowflake data becomes available after about 45 minutes; Databricks data may take 1-2 hours
- Databricks costs: DBU costs are approximations (proportionally distributed from warehouse-hour costs)
- Snowflake costs: Exact per-query credit tracking
- Permissions required:
  - Snowflake: `SELECT` on `SNOWFLAKE.ACCOUNT_USAGE.*`
  - Databricks: `SELECT` on `system.query.history` and `system.billing.usage`
After running benchmarks, view comparison reports using the dbt-generated views.
Database Schema:
- Database file: `benchmark_results.duckdb` (in the project root)
- Schema: `benchmark_results.main`
- Raw tables: `snowflake_results`, `databricks_results`
- Analysis views: generated by dbt (see below)
# Build/refresh all analysis views
cd common/transformations
uvx dbt build

Command-line:
# Query results for normal scenario (warm warehouse)
duckdb benchmark_results.duckdb -c "SELECT * FROM platform_comparison_normal;"
# Query results for coldstart scenario (suspended warehouse)
duckdb benchmark_results.duckdb -c "SELECT * FROM platform_comparison_coldstart;"
# Query results for concurrent scenario (parallel execution)
duckdb benchmark_results.duckdb -c "SELECT * FROM platform_comparison_concurrent;"
# Query results for DML scenario (delete + insert operations)
duckdb benchmark_results.duckdb -c "SELECT * FROM platform_comparison_dml;"
# Query latest run (all scenarios)
duckdb benchmark_results.duckdb -c "SELECT * FROM platform_comparison_latest;"GUI Tool (Recommended):
Use DBeaver for interactive querying and visualization:
- Download and install DBeaver
- Create a new DuckDB connection
- Point it to `benchmark_results.duckdb` in the project root
- Query the views with a full SQL editor and export capabilities
Available Views:
- `platform_comparison_normal` - Latest normal scenario (sequential queries, warm warehouse)
- `platform_comparison_coldstart` - Latest coldstart scenario (warehouse suspended between queries)
- `platform_comparison_concurrent` - Latest concurrent scenario (all queries in parallel)
- `platform_comparison_dml` - Latest DML scenario (delete + insert operations)
- `platform_comparison_latest` - Latest run with all scenarios combined
See common/transformations/README.md for detailed documentation on the analysis views.
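The same views can also be queried from Python instead of the duckdb CLI; a minimal sketch using the `duckdb` package:

```python
# Minimal sketch: query a dbt-generated comparison view from Python.
# Requires the `duckdb` Python package.
import duckdb

con = duckdb.connect("benchmark_results.duckdb", read_only=True)
rows = con.execute("SELECT * FROM platform_comparison_latest LIMIT 10").fetchall()
for row in rows:
    print(row)
con.close()
```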
After running benchmarks and enriching data, you can launch an interactive React dashboard to explore and compare results visually.
Export the latest results from DuckDB to JSON format:
uv run visualization/update_data.py

This script:
- Queries the `run_summary_agg` and `query_details` dbt views
- Exports formatted data to `visualization/src/data/benchmarkData.json`
- Includes scenario summaries and per-query details with proportional cost allocation
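Conceptually, the export is a DuckDB-to-JSON dump along these lines (a simplified sketch; the real `update_data.py` applies additional formatting and cost allocation, and the JSON keys shown here are assumptions, not the file's actual schema):

```python
# Simplified sketch of the export step: read the two dbt views from DuckDB and
# write them to the JSON file the dashboard loads. Key names are assumptions.
import json
import duckdb

con = duckdb.connect("benchmark_results.duckdb", read_only=True)
summary = con.execute("SELECT * FROM run_summary_agg").fetchdf()   # requires pandas
details = con.execute("SELECT * FROM query_details").fetchdf()

payload = {
    "runSummary": summary.to_dict(orient="records"),
    "queryDetails": details.to_dict(orient="records"),
}

with open("visualization/src/data/benchmarkData.json", "w") as f:
    json.dump(payload, f, indent=2, default=str)  # default=str handles timestamps/decimals
```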
Install dependencies and start the development server:
cd visualization
npm install
npm run dev

The dashboard will open at http://localhost:5173 (or another port if 5173 is in use).
Features:
- Interactive comparison charts for all scenarios
- Query-level performance breakdowns
- Cost comparisons (Snowflake credits vs Databricks DBUs)
- Filterable by scenario and warehouse tier
- Responsive design with detailed tooltips
Note: The dashboard uses static JSON data. Re-run uv run visualization/update_data.py whenever you want to refresh with new benchmark results.
- Python 3.x with the `uv` package manager
- Snowflake account with access to `SNOWFLAKE_SAMPLE_DATA.TPCH_SF1000` (pre-loaded sample data)
- Databricks workspace with the TPC-H SF1000 dataset in Delta Lake
  - Note: TPC-H SF1000 data is NOT pre-loaded in Databricks - you must ETL it yourself
  - Set `DATABRICKS_CATALOG` to the catalog containing the TPC-H tables
  - Required tables: `customer`, `lineitem`, `nation`, `orders`, `part`, `partsupp`, `region`, `supplier`
- `main.py` - Main benchmark execution script
- `snowflake/` - Snowflake benchmark implementation
- `databricks/` - Databricks benchmark implementation
- `common/` - Shared utilities and transformations
  - `transformations/` - dbt models for analysis views
- `tests/` - Test suite for benchmark logic
- `project_plan.md` - Detailed implementation plan
- `CLAUDE.md` - Development guidelines
Run the test suite to verify benchmark logic and ensure code quality:
# Run all tests with test runner script
./run_tests.sh
# Or run manually with pytest
uv run pytest tests/ -v

The test suite includes 34 tests covering:
- Warehouse manager lifecycle (create, destroy, suspend/resume)
- Query executor logic and metrics collection
- Scenario integration (normal, coldstart, all)
- Run type classification (cold, semi-warm, warm)
Tests use mocks and do not require database connections.