Energy Model Calibration: cv(RMSE) and NMBE per ASHRAE Guideline 14

A whole-building energy model is only as useful as its agreement with measured performance. ASHRAE Guideline 14-2014 specifies thresholds for cv(RMSE) (coefficient of variation of the root-mean-square error) and NMBE (normalised mean bias error) that determine whether a model counts as “calibrated”, and therefore whether it is acceptable as the basis for measurement & verification (M&V) decisions in retrofit savings calculations.

This guide explains both metrics, the ASHRAE G14 thresholds, and the practical workflow for calibrating an Indian commercial-building model.

Why calibration matters

A new-build project’s energy model is a forward prediction. A retrofit project’s model is a baseline against which measured savings are computed. Both need defensible accuracy.

For LEED v4.1 EAc6 (Measurement & Verification) and many retrofit-savings contracts, the model must be calibrated within ASHRAE G14 thresholds before its predictions are accepted. Without calibration, claimed savings are challenged.

The two metrics

cv(RMSE) — Coefficient of Variation of Root-Mean-Square Error


cv(RMSE) = √(Σ(predicted_i - measured_i)² / N) / Mean(measured)

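Expressed as a percentage; lower is better. Because each error is squared before averaging, positive and negative errors cannot cancel, so the metric captures both random scatter and systematic error.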

ASHRAE G14 thresholds:

  • Monthly billing data: cv(RMSE) ≤ 15%
  • Hourly metered data: cv(RMSE) ≤ 30%

The hourly threshold is more permissive because hourly variability is naturally higher than monthly.
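The cv(RMSE) formula above can be sketched in a few lines of Python. This is a minimal illustration that divides by N exactly as the formula in this guide does; the function name `cv_rmse` and the three-month sample values are hypothetical, not taken from a real building.

```python
import math


def cv_rmse(predicted, measured):
    """Return cv(RMSE) in percent, dividing by N as in the formula above."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    mean_measured = sum(measured) / n
    # Mean of the squared errors, then root, then normalise by the measured mean
    mse = sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n
    return 100.0 * math.sqrt(mse) / mean_measured


# Hypothetical three-month sample (MWh), for illustration only
sample_predicted = [11.0, 19.0, 33.0]
sample_measured = [10.0, 20.0, 30.0]
print(f"cv(RMSE) = {cv_rmse(sample_predicted, sample_measured):.2f}%")
```

The same function works for monthly or hourly series; only the threshold it is compared against changes.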

NMBE — Normalised Mean Bias Error


NMBE = (Σ(predicted_i - measured_i) / N) / Mean(measured)

Expressed as a percentage. Measures systematic over- or under-prediction.

ASHRAE G14 thresholds:

  • Monthly billing data: |NMBE| ≤ 5%
  • Hourly metered data: |NMBE| ≤ 10%

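An NMBE of 0% means the average prediction equals the average measurement: there is no systematic bias, though individual months or hours may still differ because positive and negative errors cancel in the sum. With the sign convention used here (predicted minus measured), an NMBE of +8% means the model over-predicts total consumption by 8% of the measured mean, and a negative value means under-prediction.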

A model passes G14 if both cv(RMSE) and NMBE thresholds are met.
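The combined pass/fail test could look like the sketch below. It assumes the divide-by-N formulas and the thresholds listed in this guide; `G14_LIMITS`, `nmbe`, `cv_rmse` and `passes_g14` are illustrative names, and the sample data are hypothetical. The example is chosen so that the errors cancel (NMBE of zero) while cv(RMSE) is large, which is why both metrics are checked.

```python
import math

# Thresholds in percent as listed in this guide: (cv(RMSE) limit, |NMBE| limit)
G14_LIMITS = {"monthly": (15.0, 5.0), "hourly": (30.0, 10.0)}


def cv_rmse(predicted, measured):
    """cv(RMSE) in percent (divide-by-N form)."""
    n = len(measured)
    mse = sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n
    return 100.0 * math.sqrt(mse) / (sum(measured) / n)


def nmbe(predicted, measured):
    """NMBE in percent; positive means over-prediction (predicted - measured)."""
    n = len(measured)
    bias = sum(p - m for p, m in zip(predicted, measured)) / n
    return 100.0 * bias / (sum(measured) / n)


def passes_g14(predicted, measured, interval="monthly"):
    """True only if both the cv(RMSE) and |NMBE| limits are met."""
    cv_limit, nmbe_limit = G14_LIMITS[interval]
    return (cv_rmse(predicted, measured) <= cv_limit
            and abs(nmbe(predicted, measured)) <= nmbe_limit)


# Hypothetical data: alternating +30 / -30 errors cancel in NMBE but not in cv(RMSE)
measured = [100.0, 100.0, 100.0, 100.0]
predicted = [130.0, 70.0, 130.0, 70.0]
print(nmbe(predicted, measured), cv_rmse(predicted, measured))
print(passes_g14(predicted, measured, "monthly"))
```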

What “calibrated” actually requires

For a typical office building with monthly billing data over one year:

  • Predicted vs measured monthly energy use, 12 data points
  • cv(RMSE) ≤ 15%
  • NMBE ≤ 5%

This is a low bar in principle but high in practice — initial models routinely show cv(RMSE) of 25-45% and NMBE of 10-20% before iteration.

Calibration workflow

Stage 1: Define inputs from physical observation

  • Envelope dimensions from architectural drawings (R/U values from material specs)
  • Occupancy schedule from building HR (or default schedule if office)
  • Lighting density from lighting schedule (W/m²)
  • Plug load density from electrical schedule
  • HVAC equipment specifications + nameplate efficiencies
  • Setpoint schedules (cooling 24°C in occupied; setback 28°C unoccupied)

Stage 2: Run baseline simulation, compare to measured

  • Compute monthly cv(RMSE) and NMBE
  • Identify the months with largest deviations
  • Look for systematic patterns (e.g. winter under-prediction = wrong heating system; summer over-prediction = wrong cooling efficiency)
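Finding the months with the largest deviations is easy to automate. The sketch below ranks months by absolute percent error and keeps the sign, so a run of negative winter values or positive summer values stands out; the function name and the four-month sample are hypothetical.

```python
def monthly_deviations(months, predicted, measured):
    """Return (month, percent error) pairs, largest absolute error first.

    Percent error is (predicted - measured) / measured, so a negative
    value means the model under-predicts that month.
    """
    rows = [
        (name, 100.0 * (p - m) / m)
        for name, p, m in zip(months, predicted, measured)
    ]
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


# Hypothetical four-month sample (MWh), for illustration only
months = ["Jan", "Apr", "Jul", "Oct"]
measured = [40.0, 50.0, 80.0, 60.0]
predicted = [32.0, 49.0, 92.0, 63.0]

for name, error in monthly_deviations(months, predicted, measured):
    print(f"{name}: {error:+.1f}%")
```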

Stage 3: Iteration

Adjust inputs and re-run. Common adjustments:

  • Schedules (occupancy, lighting, plug load) — usually +/- 15% from default
  • Cooling setpoint (rarely 24°C all year; reality is 25-26°C summer with comfort drift)
  • Equipment efficiencies (manufacturer datasheet vs actual derating)
  • Infiltration rate (often higher than design 0.4 cfm/sf — can be 0.6-1.2 in real buildings)
  • HVAC controls (DCV setpoint actually achieved; minimum OA actually delivered)

Stage 4: Repeat until calibrated

Typical iterations: 5-10 for first calibration. Document each iteration’s input changes.
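One possible way to keep the iteration record is a small log structure that stores, for each run, what changed, why, and the resulting metrics. The class names, field names, and the values recorded below are all hypothetical; this is a sketch of the idea, not a prescribed format.

```python
from dataclasses import dataclass, field


@dataclass
class Iteration:
    number: int
    changes: dict        # input name -> (old value, new value)
    justification: str   # physical evidence supporting the change
    cv_rmse_pct: float
    nmbe_pct: float


@dataclass
class CalibrationLog:
    iterations: list = field(default_factory=list)

    def record(self, changes, justification, cv_rmse_pct, nmbe_pct):
        """Append one iteration, numbering it automatically from 1."""
        entry = Iteration(len(self.iterations) + 1, changes,
                          justification, cv_rmse_pct, nmbe_pct)
        self.iterations.append(entry)
        return entry

    def is_calibrated(self, cv_limit=15.0, nmbe_limit=5.0):
        """Check the latest iteration against the monthly limits by default."""
        if not self.iterations:
            return False
        last = self.iterations[-1]
        return last.cv_rmse_pct <= cv_limit and abs(last.nmbe_pct) <= nmbe_limit


# Hypothetical entries, for illustration only
log = CalibrationLog()
log.record({"plug_load_w_per_m2": (7, 13)}, "desk walk-through survey", 22.0, 9.0)
log.record({"cooling_setpoint_c": (24.0, 25.5)}, "BMS trend logs", 12.0, 3.0)
print(log.is_calibrated())
```

Recording one change per entry also makes it harder to slip into tuning several inputs at once without a stated reason.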

Common calibration challenges in Indian buildings

Under-predicted lighting energy

Cause: Lighting schedule assumes design hours; in practice, occupants leave lights on after hours.

Fix: Increase nighttime lighting fraction from 5% to 15-25%.

Over-predicted cooling energy

Cause: Setpoint at 24°C continuously; reality has drift to 26°C as occupants disable AC.

Fix: Implement setpoint reset schedule.

Under-predicted plug load

Cause: Original design used office defaults of 7 W/m²; actual use is 12-18 W/m² with personal heaters, monitors, USB-charging stations.

Fix: Increase plug load density to the measured or surveyed value, typically 12-15 W/m² or higher.

Under-predicted ventilation OA

Cause: Design assumed DCV reduces OA; building manager has bypassed DCV during high-occupancy periods.

Fix: Model OA at design rate (no DCV); more conservative and matches reality.

Worked calibration example

5,000 m² office building. 12 months of utility data.

Month | Measured (MWh) | v1 predicted (MWh) | v1 error | v2 predicted (MWh, after 3 iterations) | v2 error
Jan   | 35             | 28                 | -20%     | 32                                     | -9%
Feb   | 38             | 31                 | -18%     | 36                                     | -5%
Mar   | 42             | 38                 | -10%     | 41                                     | -2%
Apr   | 48             | 47                 | -2%      | 49                                     | +2%
May   | 65             | 75                 | +15%     | 65                                     | 0%
Jun   | 75             | 88                 | +17%     | 75                                     | 0%
Jul   | 70             | 82                 | +17%     | 71                                     | +1%
Aug   | 68             | 80                 | +18%     | 70                                     | +3%
Sep   | 62             | 70                 | +13%     | 64                                     | +3%
Oct   | 55             | 58                 | +5%      | 56                                     | +2%
Nov   | 45             | 42                 | -7%      | 45                                     | 0%
Dec   | 40             | 35                 | -13%     | 39                                     | -3%

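v1 metrics (computed from the table with the formulas above): cv(RMSE) ≈ 15.04%, NMBE ≈ +4.8%. cv(RMSE) is marginally above the 15% threshold, so v1 fails, and NMBE sits only just inside ±5%. The monthly errors also show a clear seasonal pattern: winter under-prediction of up to 20% and summer over-prediction of up to 18%.

v2 metrics: cv(RMSE) ≈ 2.7%, NMBE = 0% (the monthly errors sum to zero). Both thresholds are met with a wide margin. Model is calibrated.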

Adjustments made between v1 and v2:

  • Lighting nighttime fraction: 5% → 18%
  • Plug load density: 7 → 13 W/m²
  • Cooling setpoint: 24 °C → 25.5 °C summer
  • Infiltration: 0.4 → 0.7 cfm/sf
  • Cooling tower performance derate at high-WBT: 95% → 88%
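The metrics for this example can be recomputed directly from the monthly values in the table. The sketch below uses the divide-by-N formulas defined earlier in this guide; the function and variable names are illustrative.

```python
import math

# Monthly values (MWh), Jan to Dec, copied from the table above
measured = [35, 38, 42, 48, 65, 75, 70, 68, 62, 55, 45, 40]
predicted_v1 = [28, 31, 38, 47, 75, 88, 82, 80, 70, 58, 42, 35]
predicted_v2 = [32, 36, 41, 49, 65, 75, 71, 70, 64, 56, 45, 39]


def cv_rmse(predicted, measured):
    """cv(RMSE) in percent (divide-by-N form)."""
    n = len(measured)
    mse = sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n
    return 100.0 * math.sqrt(mse) / (sum(measured) / n)


def nmbe(predicted, measured):
    """NMBE in percent; positive means over-prediction."""
    n = len(measured)
    bias = sum(p - m for p, m in zip(predicted, measured)) / n
    return 100.0 * bias / (sum(measured) / n)


for label, predicted in (("v1", predicted_v1), ("v2", predicted_v2)):
    print(f"{label}: cv(RMSE) = {cv_rmse(predicted, measured):.2f}%, "
          f"NMBE = {nmbe(predicted, measured):+.2f}%")
```

Running the check on every iteration, rather than only on the final model, gives the per-iteration metrics that the documentation step asks for.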

Five common calibration mistakes

1. Adjusting many inputs at once. Each adjustment should be physically defensible; multi-variable tuning produces over-fit models that fail in retrofit scenarios.

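2. Ignoring weather variation. The weather file used in the simulation must cover the same period as the measured data; running the model on TMY (typical meteorological year) weather when the measured year was hotter or cooler creates deviation that has nothing to do with the model inputs.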

3. Cherry-picking calibration months. Calibrate against full year, not just summer or winter; partial-year calibration gives illusory accuracy.

4. No documentation of iterations. Future retrofit savings claims need traceability of calibration assumptions.

5. Computing NMBE and cv(RMSE) on hourly data with quality issues. Sensor failures and meter resets generate huge spikes; data cleaning must precede calibration.
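As one illustration of the data-cleaning step, the sketch below flags hourly readings that sit far from the median, using a robust z-score based on the median absolute deviation. This is an assumed technique, not one prescribed by Guideline 14; the 3.5 cut-off is a common rule of thumb, the readings are hypothetical, and a real workflow would usually apply the test per hour-of-day or per day type, since building load varies strongly over the day.

```python
import statistics


def flag_spikes(readings, threshold=3.5):
    """Return indices of readings whose robust z-score exceeds the threshold.

    The robust z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. Both 0.6745 and the default threshold are
    conventional choices, treated here as assumptions.
    """
    median = statistics.median(readings)
    mad = statistics.median([abs(x - median) for x in readings])
    if mad == 0:
        return []  # no spread to compare against; nothing can be flagged
    return [
        i for i, x in enumerate(readings)
        if 0.6745 * abs(x - median) / mad > threshold
    ]


# Hypothetical hourly kWh readings with one sensor spike and one meter reset
readings = [50, 52, 49, 51, 500, 50, -3, 48]
suspect = flag_spikes(readings)
cleaned = [x for i, x in enumerate(readings) if i not in suspect]
print(suspect, cleaned)
```

Flagged hours should be reviewed and documented rather than silently dropped, so the cleaned dataset stays traceable.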

Quick checklist

  • [ ] At least 12 months of measured energy data (utility billing minimum)
  • [ ] Weather data for measured period (actual, not TMY)
  • [ ] All inputs documented from physical observation
  • [ ] Iterative refinement of inputs with physical justification
  • [ ] cv(RMSE) ≤ 15% (monthly) or 30% (hourly)
  • [ ] NMBE within ±5% (monthly) or ±10% (hourly)
  • [ ] All calibration iterations documented
  • [ ] Final model + assumptions saved with project documentation

References: ASHRAE Guideline 14-2014 Measurement of Energy, Demand, and Water Savings; IPMVP 2022 International Performance Measurement and Verification Protocol; FEMP M&V Guidelines v4 (US Federal Energy Management Program); ASHRAE Handbook on Building Energy Performance.
