Causal · Counterfactual · RLHF Research Console
Tenant: consolidated · 14 tenants
Window
live · refreshed 14s ago
Causal ROI · Counterfactuals · Simulation Lab · Learning Timeline · Rec Quality · Policy Evolution · Treatment vs Control
Simulation Lab
Monte Carlo · 10,000 trajectories · scenario comparison
live · 28d window · n=18,420
Scenario controls
Automation level: 0.78
Confidence threshold: 0.82
Risk aversion (λ): 0.40
Horizon (days): 30
Trajectories (n): 10,000
Random seed: 42
last run · 2m 41s · 10,000 trajectories
Outcome distribution
$ savings · 10k runs
p10: $3.62M
p50: $4.82M
p90: $5.74M
VaR95: $0.62M
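As a rough illustration of how percentile and VaR figures like these can be derived from a set of simulated trajectories, here is a minimal sketch using only the Python standard library. The outcome model (normally distributed savings in $M), its parameters, and the function name `summarize_outcomes` are hypothetical. The console does not state how VaR95 is defined, so the sketch assumes it is the median minus the 5th percentile.

```python
import random
import statistics

def summarize_outcomes(savings, var_level=0.95):
    """Return p10/p50/p90 and a VaR-style shortfall for simulated outcomes."""
    # quantiles(n=100) yields the 99 cut points p1..p99.
    cuts = statistics.quantiles(savings, n=100)
    p10, p50, p90 = cuts[9], cuts[49], cuts[89]
    # Assumption: VaR is the median minus the (1 - level) quantile.
    # The console's actual definition is not shown and may differ.
    tail_index = round((1 - var_level) * 100) - 1
    return {"p10": p10, "p50": p50, "p90": p90, "var": p50 - cuts[tail_index]}

# Hypothetical outcome model: savings in $M drawn from a normal distribution.
rng = random.Random(42)  # fixed seed so reruns are reproducible
simulated = [rng.gauss(4.8, 0.8) for _ in range(10_000)]
summary = summarize_outcomes(simulated)
```

Fixing the seed, as the Random seed control above does, makes two runs with the same settings directly comparable.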
Scenario comparison
4 active strategies
ID   Scenario                  p10     p50     p90     VaR95 ($M)
S-A  Baseline policy           $3.91M  $4.82M  $5.74M  0.62
S-B  Aggressive automation     $3.62M  $5.41M  $7.11M  1.12
S-C  Conservative guardrails   $3.78M  $4.20M  $4.61M  0.31
S-D  Optimal (RL-tuned)        $4.41M  $5.62M  $6.81M  0.84
Would-have-saved analysis
autonomous vs human policy
Autonomous: $5.62M
Human-only: $3.81M
Δ uplift: +$1.81M
Backtested over 30 days. The autonomous policy outperforms the human-only policy in 87% of trajectories (confidence 0.91). Tail risk (p10) improves by $0.41M after the risk regularizer is applied.
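The uplift here is the difference between the two headline figures ($5.62M − $3.81M = $1.81M). The sketch below shows one way an uplift and a per-trajectory win rate could be computed from paired outcomes. It assumes both policies are evaluated on the same trajectories and that the headline figures are medians, neither of which the console confirms. The function name `compare_policies` and the sample values are hypothetical.

```python
import statistics

def compare_policies(autonomous, human):
    """Compare paired per-trajectory savings for two policies."""
    if len(autonomous) != len(human):
        raise ValueError("outcomes must be paired trajectory by trajectory")
    # Per-trajectory difference: positive means the autonomous policy won.
    diffs = [a - h for a, h in zip(autonomous, human)]
    return {
        # Assumption: the headline figures are medians of each distribution.
        "median_uplift": statistics.median(autonomous) - statistics.median(human),
        "win_rate": sum(d > 0 for d in diffs) / len(diffs),
    }

# Hypothetical paired outcomes in $M (illustrative only, not console data).
autonomous = [5.0, 6.0, 4.0, 5.5]
human = [4.0, 4.5, 4.2, 3.0]
result = compare_policies(autonomous, human)
```

A win rate computed this way is only meaningful if the two policies face the same simulated conditions in each trajectory.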
Constraint impact
break OTIF guardrail at 92%
Savings if relaxed: +$0.92M
OTIF risk (p90): 91.2%
Customer SLA breach: +1.2%
Relaxing the OTIF guardrail to 92% unlocks ~$0.92M but risks SLA breaches for 3 enterprise customers (Acme, Globex, Initech).