A whole-building energy model is only as useful as its calibration to measured performance. ASHRAE Guideline 14-2014 specifies thresholds for cv(RMSE) (coefficient of variation of the root-mean-square error) and NMBE (normalised mean bias error) that determine whether a model counts as “calibrated”, and therefore whether it is acceptable as the basis for measurement & verification (M&V) decisions in retrofit savings calculations.
This guide explains both metrics, the ASHRAE G14 thresholds, and the practical workflow for calibrating an Indian commercial-building model.
Why calibration matters
A new-build project’s energy model is a forward prediction. A retrofit project’s model is a baseline against which measured savings are computed. Both need defensible accuracy.
For LEED v4.1 EAc6 (Measurement & Verification) and many retrofit-savings contracts, the model must be calibrated within the ASHRAE G14 thresholds before its predictions are accepted. Savings claimed from an uncalibrated model are open to challenge.
The two metrics
cv(RMSE) — Coefficient of Variation of Root-Mean-Square Error
cv(RMSE) = √(Σ(predicted_i - measured_i)² / N) / Mean(measured)
Expressed as a percentage. Lower is better. Measures both random and systematic errors.
ASHRAE G14 threshold:
- Monthly billing data: cv(RMSE) ≤ 15%
- Hourly metered data: cv(RMSE) ≤ 30%
The hourly threshold is more permissive because hourly variability is naturally higher than monthly.
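The formula above translates almost line for line into code. The following is a minimal sketch, not a reference implementation: the function name `cv_rmse` and the optional `p` argument are assumptions made here. With `p=0` the function divides by N exactly as the formula is written; some statements of the Guideline 14 formula divide by N − p instead, which `p` allows for.

```python
import math


def cv_rmse(predicted, measured, p=0):
    """Return cv(RMSE) as a percentage.

    p is a hypothetical degrees-of-freedom adjustment; p=0 reproduces
    the formula as written in this guide (division by N).
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("series must be non-empty and of equal length")
    n = len(measured)
    squared_error = sum((pr - me) ** 2 for pr, me in zip(predicted, measured))
    rmse = math.sqrt(squared_error / (n - p))
    mean_measured = sum(measured) / n
    return 100.0 * rmse / mean_measured


# Three made-up months (MWh): errors of +10, -10 and 0 do not cancel
print(round(cv_rmse([110, 90, 100], [100, 100, 100]), 2))  # 8.16
```

Because the errors are squared before averaging, over- and under-predictions both add to the result.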
NMBE — Normalised Mean Bias Error
NMBE = (Σ(predicted_i - measured_i) / N) / Mean(measured)
Expressed as a percentage. Measures systematic over- or under-prediction.
ASHRAE G14 threshold:
- Monthly billing data: NMBE within ±5%
- Hourly metered data: NMBE within ±10%
An NMBE of 0% means the average prediction equals the average measurement: there is no systematic bias, though individual hours or months may still differ because positive and negative errors cancel. An NMBE of +8% means the model over-predicts by 8% on average.
A model passes G14 if both cv(RMSE) and NMBE thresholds are met.
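As a sketch of the pass/fail rule, the snippet below computes NMBE with this guide's sign convention (predicted minus measured, so positive means over-prediction) and checks a pair of metric values against the limits quoted above. The names `nmbe`, `G14_LIMITS` and `passes_g14`, and the optional `p` argument, are assumptions for illustration.

```python
def nmbe(predicted, measured, p=0):
    """Return NMBE as a percentage; positive means over-prediction."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("series must be non-empty and of equal length")
    n = len(measured)
    bias = sum(pr - me for pr, me in zip(predicted, measured)) / (n - p)
    return 100.0 * bias / (sum(measured) / n)


# Limits quoted in this guide: (cv(RMSE) limit, |NMBE| limit), in percent
G14_LIMITS = {"monthly": (15.0, 5.0), "hourly": (30.0, 10.0)}


def passes_g14(cv_rmse_pct, nmbe_pct, resolution="monthly"):
    """True only if both metrics are within the limits for the resolution."""
    cv_limit, nmbe_limit = G14_LIMITS[resolution]
    return cv_rmse_pct <= cv_limit and abs(nmbe_pct) <= nmbe_limit


# Errors of +10 and -10 cancel in NMBE, which is why both metrics are needed
print(nmbe([110, 90, 100], [100, 100, 100]))  # 0.0
print(passes_g14(12.0, -6.0))                 # False: NMBE outside ±5%
print(passes_g14(12.0, -6.0, "hourly"))       # True
```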
What “calibrated” actually requires
For a typical office building with monthly billing data over one year:
- Predicted vs measured monthly energy use, 12 data points
- cv(RMSE) ≤ 15%
- NMBE within ±5%
This is a low bar in principle but high in practice — initial models routinely show cv(RMSE) of 25-45% and NMBE of 10-20% before iteration.
Calibration workflow
Stage 1: Define inputs from physical observation
- Envelope dimensions from architectural drawings (R/U values from material specs)
- Occupancy schedule from building HR (or default schedule if office)
- Lighting density from lighting schedule (W/m²)
- Plug load density from electrical schedule
- HVAC equipment specifications + nameplate efficiencies
- Setpoint schedules (cooling 24°C in occupied; setback 28°C unoccupied)
Stage 2: Run baseline simulation, compare to measured
- Compute monthly cv(RMSE) and NMBE
- Identify the months with largest deviations
- Look for systematic patterns (e.g. winter under-prediction = wrong heating system; summer over-prediction = wrong cooling efficiency)
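Ranking the months by percentage error makes the pattern-spotting step concrete. The sketch below uses four months from the v1 column of the worked example later in this guide; the function name `monthly_deviations` is an assumption.

```python
def monthly_deviations(months, predicted, measured):
    """Return (month, percent error) pairs, largest absolute error first."""
    rows = [
        (month, 100.0 * (pr - me) / me)
        for month, pr, me in zip(months, predicted, measured)
    ]
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


months = ["Jan", "Apr", "Jun", "Oct"]
predicted = [28, 47, 88, 58]  # MWh, v1 model
measured = [35, 48, 75, 55]   # MWh, utility data

for month, error in monthly_deviations(months, predicted, measured):
    print(f"{month}: {error:+.1f}%")
```

A winter month with a large negative error next to a summer month with a large positive error points at the seasonal inputs (heating and cooling) before the year-round ones.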
Stage 3: Iteration
Adjust inputs and re-run. Common adjustments:
- Schedules (occupancy, lighting, plug load), usually ±15% from default
- Cooling setpoint (rarely 24°C all year; reality is 25-26°C summer with comfort drift)
- Equipment efficiencies (manufacturer datasheet vs actual derating)
- Infiltration rate (often higher than the design value of 0.4 cfm/sf, about 2.0 L/s·m²; can be 0.6-1.2 cfm/sf in real buildings)
- HVAC controls (DCV setpoint actually achieved; minimum OA actually delivered)
Stage 4: Repeat until calibrated
Typical iterations: 5-10 for first calibration. Document each iteration’s input changes.
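One lightweight way to keep that record is a small structured log. The sketch below is only one possible layout: the class names, field names and the metric values in the example entry are placeholders, not data from a real project.

```python
from dataclasses import dataclass, field


@dataclass
class CalibrationIteration:
    label: str
    changes: dict          # input name -> (old value, new value)
    justification: str     # physical evidence for the change
    cv_rmse_pct: float
    nmbe_pct: float


@dataclass
class CalibrationLog:
    iterations: list = field(default_factory=list)

    def record(self, iteration):
        self.iterations.append(iteration)

    def summary(self):
        """One line per iteration: metrics plus the inputs that changed."""
        lines = []
        for it in self.iterations:
            changed = ", ".join(
                f"{name}: {old} -> {new}"
                for name, (old, new) in it.changes.items()
            ) or "no changes"
            lines.append(
                f"{it.label}: cv(RMSE) {it.cv_rmse_pct:.1f}%, "
                f"NMBE {it.nmbe_pct:+.1f}% | {changed}"
            )
        return lines


log = CalibrationLog()
# Placeholder metric values, for illustration only
log.record(CalibrationIteration(
    label="iter-1",
    changes={"plug_load_w_per_m2": (7, 13)},
    justification="site walk-through equipment count",
    cv_rmse_pct=22.0,
    nmbe_pct=9.0,
))
print("\n".join(log.summary()))
```

Storing the justification next to each change is what later makes a savings claim traceable.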
Common calibration challenges in Indian buildings
Under-predicted lighting energy
Cause: Lighting schedule assumes design hours; in practice occupants leave lights on after hours.
Fix: Increase nighttime lighting fraction from 5% to 15-25%.
Over-predicted cooling energy
Cause: Setpoint modelled at 24°C continuously; in practice the effective setpoint drifts towards 26°C as occupants switch off AC.
Fix: Implement setpoint reset schedule.
Under-predicted plug load
Cause: Original design used office defaults of 7 W/m²; actual use is 12-18 W/m² with personal heaters, monitors, USB-charging stations.
Fix: Increase plug load density to measured 12-15 W/m².
Under-predicted ventilation OA
Cause: Design assumed DCV reduces OA; building manager has bypassed DCV during high-occupancy periods.
Fix: Model OA at design rate (no DCV); more conservative and matches reality.
Worked calibration example
5,000 m² office building. 12 months of utility data.
| Month | Measured (MWh) | Predicted v1 (MWh) | v1 error | Predicted v2 (MWh, after 3 iterations) | v2 error |
|---|---|---|---|---|---|
| Jan | 35 | 28 | -20% | 32 | -9% |
| Feb | 38 | 31 | -18% | 36 | -5% |
| Mar | 42 | 38 | -10% | 41 | -2% |
| Apr | 48 | 47 | -2% | 49 | +2% |
| May | 65 | 75 | +15% | 65 | 0% |
| Jun | 75 | 88 | +17% | 75 | 0% |
| Jul | 70 | 82 | +17% | 71 | +1% |
| Aug | 68 | 80 | +18% | 70 | +3% |
| Sep | 62 | 70 | +13% | 64 | +3% |
| Oct | 55 | 58 | +5% | 56 | +2% |
| Nov | 45 | 42 | -7% | 45 | 0% |
| Dec | 40 | 35 | -13% | 39 | -3% |
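v1 metrics (computed from the table with the formulas above): cv(RMSE) ≈ 15.0%, NMBE ≈ +4.8%. cv(RMSE) is marginally over the 15% threshold (15.04% before rounding) and NMBE is close to the ±5% limit, so v1 fails; the monthly errors, swinging from -20% in winter to +18% in summer, show the seasonal bias that the annual NMBE partly hides.
v2 metrics: cv(RMSE) ≈ 2.7%, NMBE ≈ 0%. Both thresholds are met, so the model is calibrated.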
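Both metrics can be recomputed directly from the rounded MWh values in the table, which is a useful check on any quoted figure. The sketch below follows the formulas as written in this guide (division by N); the helper name `metrics` and the optional `p` argument are assumptions.

```python
import math

# Monthly values from the table above (MWh), January to December
measured = [35, 38, 42, 48, 65, 75, 70, 68, 62, 55, 45, 40]
predicted_v1 = [28, 31, 38, 47, 75, 88, 82, 80, 70, 58, 42, 35]
predicted_v2 = [32, 36, 41, 49, 65, 75, 71, 70, 64, 56, 45, 39]


def metrics(predicted, measured, p=0):
    """Return (cv(RMSE), NMBE) in percent; p=0 divides by N as in this guide."""
    n = len(measured)
    mean_measured = sum(measured) / n
    errors = [pr - me for pr, me in zip(predicted, measured)]
    cv = 100.0 * math.sqrt(sum(e * e for e in errors) / (n - p)) / mean_measured
    bias = 100.0 * (sum(errors) / (n - p)) / mean_measured
    return cv, bias


for label, series in (("v1", predicted_v1), ("v2", predicted_v2)):
    cv, bias = metrics(series, measured)
    print(f"{label}: cv(RMSE) = {cv:.1f}%, NMBE = {bias:+.1f}%")
```

Calling `metrics(series, measured, p=1)` shows how sensitive a borderline result is to the choice of denominator.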
Adjustments made between v1 and v2:
- Lighting nighttime fraction: 5% → 18%
- Plug load density: 7 → 13 W/m²
- Cooling setpoint: 24 °C → 25.5 °C summer
- Infiltration: 0.4 → 0.7 cfm/sf
- Cooling tower performance derate at high-WBT: 95% → 88%
Five common calibration mistakes
1. Adjusting many inputs at once. Each adjustment should be physically defensible; multi-variable tuning produces over-fit models that fail in retrofit scenarios.
2. Ignoring weather variation. Weather year used in simulation must match the year of measured data; using TMY weather while measured-year weather differs creates artificial deviation.
3. Cherry-picking calibration months. Calibrate against full year, not just summer or winter; partial-year calibration gives illusory accuracy.
4. No documentation of iterations. Future retrofit savings claims need traceability of calibration assumptions.
5. Using NMBE/cv(RMSE) on hourly data with quality issues. Sensor failures and meter resets generate huge spikes; data cleaning must precede calibration.
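For the data-cleaning step in mistake 5, one common heuristic is to flag readings that are negative or that sit far from the median, measured in median absolute deviations (MAD). This is a generic outlier rule rather than anything Guideline 14 prescribes; the function name, the threshold `k=5.0` and the sample readings are all assumptions for illustration.

```python
import statistics


def flag_suspect_readings(hourly_kwh, k=5.0):
    """Return indices of readings that are negative or far from the median.

    Uses a median-absolute-deviation (MAD) rule; k is a hypothetical
    tuning threshold, not a value taken from any standard.
    """
    median = statistics.median(hourly_kwh)
    mad = statistics.median([abs(v - median) for v in hourly_kwh])
    suspects = []
    for index, value in enumerate(hourly_kwh):
        too_far = mad > 0 and abs(value - median) > k * mad
        if value < 0 or too_far:
            suspects.append(index)
    return suspects


# Made-up hourly readings (kWh) with one spike and one meter-reset artefact
readings = [52, 55, 51, 54, 980, 53, -40, 56]
print(flag_suspect_readings(readings))  # [4, 6]
```

Real hourly profiles swing widely between day and night, so a rule like this is better applied to like-for-like hours (for example, all weekday 14:00 readings) than to the whole series at once. Flagged points still need a manual check before they are removed.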
Quick checklist
- [ ] At least 12 months of measured energy data (utility billing minimum)
- [ ] Weather data for measured period (actual, not TMY)
- [ ] All inputs documented from physical observation
- [ ] Iterative refinement of inputs with physical justification
- [ ] cv(RMSE) ≤ 15% (monthly) or 30% (hourly)
- [ ] NMBE within ±5% (monthly) or ±10% (hourly)
- [ ] All calibration iterations documented
- [ ] Final model + assumptions saved with project documentation
References: ASHRAE Guideline 14-2014 Measurement of Energy, Demand, and Water Savings; IPMVP 2022 International Performance Measurement and Verification Protocol; FEMP M&V Guidelines v4 (US Federal Energy Management Program); ASHRAE Handbook on Building Energy Performance.
