Comparing models and thinking effort - Without modeling

Each model was given 11 insurance-domain questions (claims, policies, premiums) and asked to answer each one using two methods: the Semantic Layer (MetricFlow queries) and SQL (direct SQL generation). Each model/effort combination was run 5 times to account for run-to-run variance. 3 of the 11 questions are "too many hops" questions: they require joins the Semantic Layer cannot express, so they test whether a model correctly refuses or fails gracefully.
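The arithmetic of this setup can be sketched in a few lines of Python. This is a minimal illustration, not the harness used for this page: the function names, the method labels, and the scoring rule (accuracy as the fraction of correct answers per run) are all assumptions, since the page does not state how runs are scored or aggregated.

```python
from statistics import mean, stdev

# Figures taken from the description above.
QUESTIONS = 11          # insurance-domain questions per run
TOO_MANY_HOPS = 3       # questions the Semantic Layer cannot express
RUNS = 5                # repetitions per model/effort combination
METHODS = ("semantic_layer", "sql")  # hypothetical labels for the two methods


def answers_per_combination(questions=QUESTIONS, methods=METHODS, runs=RUNS):
    """Total answers produced by one model/effort combination."""
    return questions * len(methods) * runs


def summarize(correct_per_run, questions=QUESTIONS):
    """Mean and sample standard deviation of per-run accuracy.

    ASSUMPTION: accuracy is the fraction of questions answered correctly
    in a run; the page does not define the scoring rule.
    """
    accuracies = [correct / questions for correct in correct_per_run]
    return mean(accuracies), stdev(accuracies)


if __name__ == "__main__":
    print(answers_per_combination())   # 11 questions x 2 methods x 5 runs
    print(QUESTIONS - TOO_MANY_HOPS)   # questions the Semantic Layer can express
```

Reporting the spread alongside the mean is one way to make the 5 repetitions useful: a combination whose accuracy swings widely between runs is less dependable than its average suggests.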

Note: SQL runs on this page use the schema without modeling (no additional dbt models, raw DDL only).

Summary


Accuracy


Latency


Cost


Tradeoffs

Ideal: top-left corner (high accuracy, low cost/latency)
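One way to read "top-left is ideal" programmatically is to keep only the combinations that no other combination beats on both axes (a Pareto front). The sketch below is an illustration under stated assumptions: the `Point` type, the `dominates` and `pareto_front` helpers, and the sample values are hypothetical placeholders, not results from this benchmark.

```python
from typing import NamedTuple


class Point(NamedTuple):
    label: str       # hypothetical model/effort label
    cost: float      # x-axis: cost (or latency); lower is better
    accuracy: float  # y-axis: accuracy; higher is better


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    no_worse = a.cost <= b.cost and a.accuracy >= b.accuracy
    strictly_better = a.cost < b.cost or a.accuracy > b.accuracy
    return no_worse and strictly_better


def pareto_front(points: list[Point]) -> list[Point]:
    """Points not dominated by any other point, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # PLACEHOLDER values for illustration only, not measured data.
    sample = [
        Point("A", 1.0, 0.6),
        Point("B", 2.0, 0.9),
        Point("C", 3.0, 0.8),
        Point("D", 1.5, 0.5),
    ]
    print([p.label for p in pareto_front(sample)])
```

Points on the front are the candidates worth comparing by hand; anything off the front is both more expensive (or slower) and less accurate than some alternative.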
