Commit 8105871

add tpc-ds tests and property-based testing utilities
This change introduces a new `property_based.rs` test utility that lets us evaluate correctness using properties. Properties are useful when we do not know the expected output of a test (e.g. if we were to fuzz the database with randomized data or randomized queries, we could only verify the output using properties).

The two oracles are:
- ResultSetOracle: compares the result set between single-node and distributed datafusion
- OrderingOracle: uses plan properties to figure out the expected ordering and asserts it

This change does not introduce a fuzz test, but it does introduce a TPC-DS test. The test randomly generates data using the duckdb CLI and runs 99 queries on a distributed cluster. The query outputs are validated against single-node datafusion using test utils in `metamorphic.rs`. The test also randomizes the test cluster parameters; there's no harm in doing so.

Next steps:
- Add fuzzing
  - Now that we have property-based testing utils, we can properly fuzz the project using SQLancer
  - SQLancer produces INSERT and SELECT statements which we could point at a datafusion distributed cluster and verify against single-node datafusion
  - Although it doesn't support nested select statements, 70% of the queries were valid datafusion queries, meaning these are good test cases for us
- Add a metrics oracle to validate the output_rows metric / metrics propagation
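
To illustrate the two properties described above, here is a minimal sketch in plain Rust. It is not the code from `property_based.rs`: the function names `result_sets_match` and `is_sorted_by_columns` are hypothetical, rows are modeled as vectors of integers rather than Arrow record batches, and the sort columns are passed in by hand, whereas the commit says the real OrderingOracle derives the expected ordering from plan properties.

```rust
use std::collections::HashMap;

/// Assumption for this sketch: a row is a vector of integer cells.
type Row = Vec<i64>;

/// Hypothetical result-set property: the single-node and distributed outputs
/// must contain the same rows with the same multiplicities, in any order.
fn result_sets_match(single_node: &[Row], distributed: &[Row]) -> bool {
    // Count how many times each distinct row occurs.
    fn counts(rows: &[Row]) -> HashMap<&Row, usize> {
        let mut seen = HashMap::new();
        for row in rows {
            *seen.entry(row).or_insert(0) += 1;
        }
        seen
    }
    single_node.len() == distributed.len() && counts(single_node) == counts(distributed)
}

/// Hypothetical ordering property: rows must be non-decreasing on the given
/// column indices (ascending only, compared left to right).
fn is_sorted_by_columns(rows: &[Row], sort_columns: &[usize]) -> bool {
    // Project a row onto its sort key.
    let key = |row: &Row| -> Vec<i64> { sort_columns.iter().map(|&i| row[i]).collect() };
    rows.windows(2).all(|pair| key(&pair[0]) <= key(&pair[1]))
}
```

Comparing rows as a multiset rather than position by position matters because a distributed plan may return rows in a different order than a single-node plan when the query has no ORDER BY; the ordering property covers the cases where an order is required.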
1 parent 4867a18 commit 8105871

109 files changed: +6867 -387 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -3,3 +3,5 @@
 /benchmarks/data/
 testdata/tpch/*
 !testdata/tpch/queries
+testdata/tpch/data/
+testdata/tpcds/data/
