A topological data analysis (TDA) of 200,000 U.S. wildfires larger than 5 acres indicates that the events with the largest final burned areas are associated with systematically low fuel moistures, low precipitation, and high vapor pressure deficits in the 30 days prior to the fire start. These parameters are widely used in empirical fire forecasting tools, confirming that an unguided machine learning (ML) analysis can reproduce known relationships. The simple, short-time-scale parameters identified can therefore provide quantifiable forecast skill for wildfires of extreme size. In contrast, longer aggregates of weather observations over the year prior to the fire start, including specific humidity, normalized precipitation indices, average temperature, average precipitation, and vegetation indices, are not strongly coupled to extreme fire size and thus afford limited or no additional forecast skill. The TDA demonstrates that fuel moistures and short-term weather parameters should be prioritized in training ML algorithms for fire forecasting, whereas longer-term climate and ecological measures could be downweighted or omitted. The most useful short-term meteorological and fuels metrics are widely available with low latency for the conterminous U.S. and are not computationally intensive to calculate, suggesting that ML tools using these data streams may suffice to improve situational awareness of wildfire hazards in the U.S.