Physics-Informed Large AI Models for Hydrology and the Water Cycle

September 27, 2025

Large AI models -- so-called foundation models -- are reshaping language and vision research and, increasingly, Earth system science. In hydrology, the challenge is to combine strict physical consistency with data-driven flexibility. Physics-informed large AI models aim to do exactly that: they build hydrological laws (e.g., mass balance, Darcy’s law, infiltration physics) into large neural architectures so that the models respect physical constraints while learning from data.

Microsoft’s Aurora: A Case Study in Earth System Foundation Modeling

Microsoft’s Aurora is a 1.3 billion-parameter foundation model for atmospheric and Earth system prediction. Aurora was pretrained on more than a million hours of diverse climate and atmospheric data (analyses, reanalyses, forecasts, simulations) and can be fine-tuned to downstream tasks like weather forecasting, air quality, ocean waves, and tropical cyclone tracking.

Some key features of Aurora:
1. Pretrain-then-fine-tune paradigm: Aurora learns general atmospheric representations during pretraining and is then fine-tuned for specific tasks.

2. LoRA (Low-Rank Adaptation) in fine-tuning: For long-lead prediction (multi-step rollouts), Aurora uses LoRA to adapt the large model to forecasting tasks while training only a small number of additional parameters.

3. Efficiency versus physics-based models: Aurora can generate forecasts orders of magnitude faster than traditional numerical weather prediction (NWP) models, yet matches or exceeds their accuracy on many metrics.

4. Multitask flexibility: The same core model architecture can be adapted to new tasks (e.g. pollutant concentration, wave forecasting) with modest fine-tuning data.

Thus, Aurora is a state-of-the-art example of how to build a foundation model for the Earth system, not just for text or images.
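
The low-rank idea behind LoRA can be sketched in a few lines. The snippet below is a minimal NumPy illustration, not Aurora's implementation: the layer sizes, the rank, the scaling factor alpha, and the function name lora_forward are all assumed for illustration. A frozen weight matrix W is left untouched, and only two small factors A and B are trainable, so the adapted layer computes x W^T + (alpha / rank) x A^T B^T.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumed, not taken from Aurora).
d_in, d_out, rank, alpha = 512, 512, 8, 16.0

# Frozen pretrained weight: a stand-in for one linear layer of a large model.
W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially reproduces the pretrained layer exactly.
A = rng.standard_normal((rank, d_in)) * 0.01
B = np.zeros((d_out, rank))


def lora_forward(x, W, A, B, alpha, rank):
    """Return x W^T + (alpha / rank) * x A^T B^T without forming W + B A."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T


x = rng.standard_normal((4, d_in))
y_base = x @ W.T
y_lora = lora_forward(x, W, A, B, alpha, rank)

full_params = W.size            # parameters updated by full fine-tuning
lora_params = A.size + B.size   # parameters updated by LoRA

print("unchanged at initialization:", np.allclose(y_base, y_lora))
print("trainable fraction:", lora_params / full_params)
```

With these assumed sizes, LoRA trains 8,192 parameters for this layer instead of 262,144, about 3% of the full count, which is why the approach suits fine-tuning with limited compute.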

Integrating Aurora-Style Models into Hydrology and Our Research

In our hydrology-AI lab, we aim to bring the lessons of Aurora into soil and water modeling in several ways:
1. Embed hydrological equations (e.g., continuity, infiltration, evaporation) in large models, either as physics-informed regularization terms or as built-in constraints, so that learned predictions respect mass conservation.

2. Fine-tune foundation models for soil moisture, runoff, and evapotranspiration using LoRA or similar parameter-efficient methods, leveraging in-situ networks (like our Core Validation Site) and satellite products.

3. Use multimodal learning: combine satellite brightness temperature, meteorological data, topographic/soil maps, climate reanalysis, and in-situ sensor networks as inputs to a unified model.

4. Evaluate generalization and transferability: test models in new climate zones or under extreme events to see whether the physics-informed model extrapolates better than purely data-driven ones.
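
Physics-informed regularization, the first item above, can be illustrated with a soft mass-balance penalty. The sketch below is one simple possibility under stated assumptions, not the lab's actual method: it assumes a single control volume with storage S obeying dS/dt = P - ET - Q, and the function names, the weight lam, and the synthetic numbers are all hypothetical.

```python
import numpy as np


def water_balance_residual(storage, precip, et, runoff, dt=1.0):
    """Residual of dS/dt = P - ET - Q on a discrete time grid.

    storage has length T + 1 (mm); the fluxes have length T (mm per step).
    """
    d_storage = np.diff(storage)
    return d_storage - (precip - et - runoff) * dt


def physics_informed_loss(pred_storage, obs_storage, precip, et, runoff,
                          lam=1.0, dt=1.0):
    """Data misfit plus a soft penalty on the mass-balance residual."""
    data_term = np.mean((pred_storage - obs_storage) ** 2)
    residual = water_balance_residual(pred_storage, precip, et, runoff, dt)
    physics_term = np.mean(residual ** 2)
    return data_term + lam * physics_term, data_term, physics_term


# Synthetic fluxes (hypothetical values, mm per step).
precip = np.array([5.0, 0.0, 12.0, 0.0])
et = np.array([2.0, 2.5, 1.5, 3.0])
runoff = np.array([1.0, 0.5, 4.0, 0.5])

# Storage series that closes the balance exactly.
storage_true = 100.0 + np.concatenate([[0.0], np.cumsum(precip - et - runoff)])

# A prediction that respects the balance, and one with a spurious 3 mm jump.
loss_ok, data_ok, phys_ok = physics_informed_loss(
    storage_true, storage_true, precip, et, runoff)
storage_bad = storage_true.copy()
storage_bad[2] += 3.0
loss_bad, data_bad, phys_bad = physics_informed_loss(
    storage_bad, storage_true, precip, et, runoff)

print("balanced prediction:", loss_ok, phys_ok)
print("unbalanced prediction:", loss_bad, phys_bad)
```

A penalty of this kind discourages mass-balance violations but does not rule them out; a strict guarantee would need the constraint built into the model structure itself.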

Why This Matters

1. Scalable hydrology AI: Using foundation models adapted to water systems means the same architecture might serve soil moisture, streamflow, drought, and groundwater tasks.

2. Physical consistency: Embedding hydrological laws keeps predictions consistent with mass balance and boundary conditions. Soft penalties discourage violations, while hard constraints built into the architecture can rule them out.

3. Rapid prototyping: Methods like LoRA let us fine-tune large models with limited data and compute, making the approach accessible for PhD-level research.

4. Frontier research space: This is prime territory for students interested in AI + geoscience, building the next generation of hybrid models that bridge physics and deep learning.

Read other projects

WRF-Hydro: Advancing Hydrological Modeling with Next-Generation Tools

Multi-Scale Soil Moisture Monitoring Using ELBARA-III and Drone Radiometer

The Portable L-Band Radiometer, mounted on a drone, enables high-resolution soil moisture retrieval across diverse terrains. Operating at 1.4 GHz (L-band), it collects brightness temperature (TB) data over rice paddies and varied vegetation zones, complementing satellite missions like SMAP and SMOS. By integrating airborne observations with ground-based sensors, the system enhances spatial coverage and improves soil moisture retrieval models for climate and agricultural applications. This drone-based approach significantly increases data accuracy and efficiency compared to traditional methods.

Developing the Long-Term Brightness Temperature Measurement Site in South Korea

Soil moisture (SM) is essential for agriculture and hydrometeorology but difficult to measure. To improve validation, Korea’s first Core Validation Site is being built in Hampyeong-gun and Naju-si with TEROS sensors, ESA’s ELBARA-III radiometer, and drone-based PoLRa. The site will provide continuous SM and temperature data, advancing satellite validation, retrieval models, and climate resilience research.

AI-Based Water Quality Prediction for Inland Water using Satellite and Land Surface Model Data

Climate change-driven heatwaves and water cycle disruptions threaten inland water quality (WQ), necessitating efficient monitoring. Traditional methods are labor-intensive with limited coverage, prompting us to develop an AI-based model for predicting chlorophyll-a concentrations in lakes and rivers. By integrating high-resolution satellite imagery (Landsat-8/9, Sentinel-2/3) with land surface models (ERA5-Land, GLDAS, MERRA-2), our model tailors predictions to different water bodies. Future plans include incorporating socio-statistical data (e.g., population, livestock) and climate scenarios (RCP, SSP). Using AI, GIS, and high-performance computing, we explore low-concentration chlorophyll-a prediction, precipitation and flow speed impacts on WQ, transfer learning, multi-sensor data fusion, and uncertainty quantification. Beyond chlorophyll-a, we aim to extend predictions to turbidity and dissolved oxygen, providing a comprehensive AI-driven approach to monitoring and mitigating climate change effects on inland water quality.

Harnessing Deep Learning to Predict and Decode the Mysteries of Flash Droughts (GAN/SHAP/3D-CNN with Transfer Learning)

The application of deep learning in predicting flash droughts offers a transformative approach to understanding and anticipating these rapid-onset events, significantly enhancing preparedness and response strategies. By unraveling the complex mechanisms behind flash droughts, this project aims to provide precise, timely forecasts, thereby mitigating the severe agricultural, ecological, and socioeconomic impacts associated with these phenomena.

Streamflow and Drought Predictions over Ungaged Regions using Deep and Transfer Learning Approaches

Streamflow and flash drought predictions are essential for managing water resources and mitigating potential disasters in ungaged regions. With remotely-sensed data, deep and transfer learning approaches provide powerful tools to analyze complex hydrological data, enabling more accurate predictions and better decision-making in these areas.

Applications of Bayesian Machine Learning in Big Data in Earth Science

Bayesian methods help us improve our guesses by using new information. In Earth science, these methods are applied to big data to better understand our planet. This approach is useful for predicting things like natural disaster patterns and climate changes. By continuously updating our knowledge with new data, we can make more accurate predictions and decisions in Earth science.

Water Balance Budgeting with Bayesian Machine Learning

The water balance equation in Earth science, P = E + R + ΔS, relates precipitation (P), evaporation (E), runoff (R), and the change in storage (ΔS; e.g., soil moisture and groundwater) in a given area. Because each term is observed or modeled with error, the budget rarely closes exactly. Bayesian inference addresses this by combining prior knowledge with new data to update the probability distributions of the terms, ultimately improving water resource management and prediction.
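
One simple way to make this concrete is to treat each budget term as an independent Gaussian estimate and condition on the budget closing exactly. The sketch below is an illustration under those assumptions, not necessarily the project's method: it lumps all remaining terms into a single storage-change term dS, so the constraint is P - E - R - dS = 0, and the function name and the example numbers (mm per month) are hypothetical.

```python
import numpy as np


def close_water_balance(mean, std):
    """Condition independent Gaussian estimates of [P, E, R, dS]
    on the linear constraint P - E - R - dS = 0.

    Returns the posterior mean, posterior covariance, and prior imbalance.
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.diag(np.asarray(std, dtype=float) ** 2)
    a = np.array([1.0, -1.0, -1.0, -1.0])  # coefficients of the constraint

    imbalance = a @ mean                    # how far the prior is from closing
    gain = cov @ a / (a @ cov @ a)          # spreads the imbalance by variance
    post_mean = mean - gain * imbalance
    post_cov = cov - np.outer(gain, a @ cov)
    return post_mean, post_cov, imbalance


# Hypothetical prior estimates and uncertainties (mm per month).
prior_mean = np.array([100.0, 60.0, 30.0, 5.0])   # P, E, R, dS
prior_std = np.array([10.0, 5.0, 5.0, 5.0])

post_mean, post_cov, imbalance = close_water_balance(prior_mean, prior_std)
post_std = np.sqrt(np.diag(post_cov))

print("prior imbalance:", imbalance)
print("posterior mean:", np.round(post_mean, 3))
print("posterior std:", np.round(post_std, 3))
```

The prior imbalance of 5 mm is redistributed in proportion to each term's variance, so the most uncertain term (here precipitation) absorbs the largest correction, and every posterior standard deviation is smaller than its prior value.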

Integrating Earth Science and Engineering for Climate Resilience: Innovative Approaches to Infrastructure and Societal Justice

Earth science informs infrastructure development by providing insights into site suitability, resource management, and sustainable design, enhancing the resilience and long-term viability of projects. It also plays a crucial role in addressing societal justice related to climate change by helping identify vulnerable communities and develop mitigation strategies, ensuring equitable access to resources and protection from environmental hazards.

Enhancing Earth Science Predictions through Advanced Data Assimilation Techniques

Data assimilation is vital in Earth science because it combines diverse observations with model simulations, improving the accuracy of forecasts and predictions. This process enhances our understanding of complex Earth systems, enabling better decision-making for environmental management and climate adaptation.

Floods and Droughts Predictions using Machine Learning Approaches

Satellite data and machine learning have transformed Earth science by improving the prediction and monitoring of natural disasters. Together they deliver precise and timely predictions, which are crucial for mitigating the impacts of events like floods and droughts.

Data Error Characterizations

Characterizing the error of satellite data and land surface models is vital in Earth science, as it ensures the accuracy and reliability of information used for monitoring and predicting environmental phenomena. By understanding these errors, scientists can refine data interpretation, enhance models, and ultimately make better-informed decisions about the Earth's complex systems.

Developing Algorithms to Improve the Temporal Sampling of Satellite Data

Increasing the temporal sampling (revisit frequency) of satellite soil moisture data is a vital research area because of its implications for agriculture, water resource management, climate change research, and ecosystem health. More frequent observations support informed decisions, higher productivity, and reduced impacts from natural disasters, and they contribute to our understanding of the global climate system.

Exploring the Impact of Human Activities on the Subdaily Global Terrestrial Water Cycle

Humans have been modifying the Earth's surface for thousands of years, with practices like clearing forests for agriculture and creating uniform land covers. But how do these changes impact the subdaily global terrestrial water cycle? That is the question this project aims to answer.

Satellite Image Disaggregation with Machine Learning

Microwave soil moisture data are critical for agriculture, weather, and climate modeling, but they have low spatial resolution. Machine learning can capture the complex relationships between microwave signals and soil moisture, so machine-learning-based disaggregation can improve the resolution and provide detailed local soil moisture data.