FireBench

Wildfire Interdisciplinary Research Center (WIRC)

February 01, 2026

Aurélien Costes¹,², Muthu K. Selvaraj¹,³, Albert Simeoni¹,³, Adam Kochanski¹,²
¹ Wildfire Interdisciplinary Research Center (WIRC)
² San José State University
³ Worcester Polytechnic Institute

Example of benchmarking a coupled fire-atmosphere model developed at WIRC. Left panel: fire reconstruction from observations showing fire extent, smoke, and vertical wind structure. Right panel: a model simulation being benchmarked.

Introduction
The increasing frequency and size of catastrophic wildfires have driven investment and advances in wildfire modeling. As a result, academia and industry have developed many complex and varied models to help government agencies, companies, and landowners predict wildfire hazards and plan mitigation. However, no existing framework validates models against multiple sources of observation, accounts for uncertainty, and is scientifically rigorous. FireBench proposes a model-agnostic validation and intercomparison framework that can be integrated with various data sources and customized to suit both model type and end-user needs.

FireBench is a framework for evaluating fire models that helps end users determine which wildfire model to trust for a given scenario. FireBench was created by the wildfire modeling team at San José State University in conjunction with our Industry Advisory Board members from the Wildfire Interdisciplinary Research Center (WIRC), a National Science Foundation (NSF)-funded Industry-University Cooperative Research Center (IUCRC). FireBench is based on a custom file format that standardizes data processing for both observations and models, which in turn makes it possible to standardize metrics and evaluation processes. The FireBench codebase is open-source (Costes & Selvaraj 2026) and distributed as a Python package. The benchmark datasets and ready-to-use scripts are distributed as open datasets on the Zenodo open-access platform. FireBench also offers certification capabilities that allow an independent third party to verify the model validation process. Because of its flexibility, FireBench can be used to validate models across different contexts, such as operational forecasting, risk assessment, or event reanalysis.

Project Specifics
The FireBench workflow (Figure 1) relies on a custom file format to centralize observational and model datasets. The file standard eliminates format-specific parsing errors, enabling automated cross-model comparisons. A Python routine ingests these files, calculates performance scores, and generates a scorecard summarizing key metrics (e.g., perimeter overlap) in an intuitive format (Figure 2). Users can select from a catalog of benchmark scenarios to tailor the analysis to their specific operational needs. Each benchmark evaluates models against predefined Key Performance Indicators (KPIs). Normalized KPI values are converted into an overall score (0 = poor, 100 = excellent), facilitating cross-benchmark comparisons. FireBench includes automated preprocessing modules that convert raw observational data, such as California Department of Forestry and Fire Protection (CAL FIRE) building-damage reports and remotely sensed burn-severity maps, into the standardized FireBench format.
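As an illustration of the scoring step described above, the following minimal Python sketch computes one possible perimeter-overlap KPI and combines normalized KPIs into a 0-100 score. It does not use the FireBench package or its API. The function names, the choice of the Jaccard index as the overlap measure, the weighted-mean aggregation, and the "arrival_time" KPI are all assumptions made for illustration; the actual FireBench metrics and aggregation rule may differ.

```python
from typing import Iterable, Mapping, Optional, Tuple

Cell = Tuple[int, int]  # (row, col) index of a burned grid cell


def jaccard_overlap(observed: Iterable[Cell], modeled: Iterable[Cell]) -> float:
    """Hypothetical perimeter-overlap KPI: intersection over union of burned cells."""
    obs, mod = set(observed), set(modeled)
    union = obs | mod
    if not union:
        # Neither dataset burns any cell: treated here as perfect agreement.
        return 1.0
    return len(obs & mod) / len(union)


def overall_score(
    normalized_kpis: Mapping[str, float],
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Assumed aggregation: weighted mean of KPIs in [0, 1], scaled to 0-100."""
    if not normalized_kpis:
        raise ValueError("at least one KPI is required")
    for name, value in normalized_kpis.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"KPI {name!r} must be normalized to [0, 1]")
    if weights is None:
        # Default to equal weighting of all KPIs.
        weights = {name: 1.0 for name in normalized_kpis}
    total_weight = sum(weights[name] for name in normalized_kpis)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * value for name, value in normalized_kpis.items())
    return 100.0 * weighted_sum / total_weight


# Toy example: the model burns the 4 observed cells plus 2 extra cells.
observed_cells = {(1, 1), (1, 2), (2, 1), (2, 2)}
modeled_cells = observed_cells | {(1, 3), (2, 3)}

kpis = {
    "perimeter_overlap": jaccard_overlap(observed_cells, modeled_cells),  # 4 / 6
    "arrival_time": 0.5,  # hypothetical, already normalized KPI
}
print(round(overall_score(kpis), 2))  # 58.33
```

Keeping every KPI on the same [0, 1] scale before aggregation is what makes scores comparable across benchmarks; the weights give a place to express end-user priorities without changing the individual metrics.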

Figure 2. Benchmarking results for a coupled fire-atmosphere model (WRF-SFIRE) forecast of an example building-damage benchmark for the Caldor Fire. The overall score in this example is 46.21.

Working with the WIRC modeling team, our Industry Advisory Board members can currently access a comprehensive database and systematic evaluation framework for fire models, submit their models for intercomparison, and receive a quantitative ranking and assessment of performance against our benchmarks. Outputs include multidimensional (0D, 2D, and 3D) analyses that support adaptable evaluations. The project is creating a centralized and growing codebase that will bring together various methods and models. FireBench is available as an open-source tool at https://github.com/wirc-sjsu/firebench. Benchmarking datasets are available at https://zenodo.org/communities/firebench/records.

Impact
FireBench fulfills a critical need of our Industry Advisory Board members and of decision-makers from local to national scale: a systematic method for evaluating uncertainty in wildfire models (Wildfire Safety Advisory Board, 2025). It advances the field of fire modeling by establishing a centralized database and a comprehensive benchmarking and evaluation framework for wildfire models. Integrating diverse models and methods into a unified codebase facilitates in-depth, comparative analysis across multiple dimensions. The systematic comparison and benchmarking offered by FireBench will improve the accuracy and applicability of wildfire models. It will also enable cross-model comparison, allowing stakeholders to assess candidate models and adopt the one best suited to their specific needs.

Citations
Wildfire Safety Advisory Board. (2025, June 30). Recommendations to the Office of Energy Infrastructure Safety (Energy Safety). Office of Energy Infrastructure Safety. https://energysafety.ca.gov/wp-content/uploads/2025/06//recommendations-to-energy-safety-2025-adopted-1.pdf
Costes, A., & Kumaran Selvaraj, M. (2026). firebench (0.8.0). Zenodo. https://doi.org/10.5281/zenodo.18251039
