Comparing models and thinking effort - Without modeling
Each model was given 11 insurance-domain questions (claims, policies, premiums) and asked to answer each one with two methods: Semantic Layer (MetricFlow queries) and SQL (direct SQL generation). Each model/effort combination was run 5 times to account for run-to-run variance. Three of the 11 questions are "too many hops" questions: they require joins the Semantic Layer cannot express, and they test whether models correctly refuse or fail gracefully.
Note: SQL runs on this page use the schema without modeling (no additional dbt models, raw DDL only).
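As a rough illustration of the setup described above, the sketch below shows one way that repeated runs could be rolled up into a mean accuracy and a spread for each model/effort/method combination. The record layout, the `summarize` function, and the sample rows are hypothetical and are not taken from the benchmark's own code or results. How a refusal on a "too many hops" question is scored is also an assumption here: it is simply whatever the `correct` flag says.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical record layout (an assumption, not the benchmark's real schema):
# (model, effort, method, run, question_id, correct)

def summarize(records):
    """Mean and spread of per-run accuracy for each (model, effort, method)."""
    per_run = defaultdict(lambda: defaultdict(list))
    for model, effort, method, run, _question_id, correct in records:
        per_run[(model, effort, method)][run].append(1.0 if correct else 0.0)

    summary = {}
    for combo, runs in per_run.items():
        # Accuracy of each run = fraction of questions answered correctly.
        run_accuracy = [mean(scores) for scores in runs.values()]
        summary[combo] = {
            "runs": len(run_accuracy),
            "mean_accuracy": mean(run_accuracy),
            "stdev": pstdev(run_accuracy),  # population std dev across runs
        }
    return summary

# Placeholder rows for illustration only; these are not benchmark results.
sample = [
    ("model-a", "low", "sql", 1, "q1", True),
    ("model-a", "low", "sql", 1, "q2", True),
    ("model-a", "low", "sql", 2, "q1", True),
    ("model-a", "low", "sql", 2, "q2", False),
]
result = summarize(sample)
```

Averaging per run first, and only then across runs, keeps the run-to-run spread visible instead of hiding it inside one pooled percentage.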
Summary
Accuracy
Latency
Cost
Tradeoffs
Ideal: top-left corner (high accuracy, low cost/latency)
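One way to make the "top-left corner" idea concrete is to keep only the combinations that no other combination beats on both axes, which is a Pareto front. The sketch below is a minimal version that assumes each combination is reduced to a (cost, accuracy) pair. The `pareto_front` function, the labels, and the numbers are hypothetical placeholders, not values from this page; the same function applies to latency by passing latency in place of cost.

```python
def pareto_front(points):
    """Return the labels of points that no other point dominates.

    points maps a label to (cost, accuracy); lower cost and higher
    accuracy are better. A point is dominated when another point is at
    least as good on both axes and strictly better on at least one.
    """
    front = []
    for name, (cost, accuracy) in points.items():
        dominated = any(
            (c <= cost and a >= accuracy) and (c < cost or a > accuracy)
            for other, (c, a) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)

# Placeholder values for illustration only; these are not benchmark results.
example = {
    "a": (1.0, 0.6),
    "b": (2.0, 0.9),
    "c": (2.5, 0.8),  # dominated by "b": costs more and is less accurate
    "d": (1.0, 0.5),  # dominated by "a": same cost, less accurate
}
front = pareto_front(example)
```

Combinations on the front are the candidates worth comparing by hand; everything else is both more expensive (or slower) and no more accurate than some alternative.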
