Date Awarded


Document Type


Degree Name

Doctor of Philosophy (Ph.D.)


Virginia Institute of Marine Science


Jian Shen

Committee Member

Donna M Bilkovic

Committee Member

William G Reay

Committee Member

Harry V Wang

Committee Member

Kyeong Park


Water quality in coastal waters is of great socio-economic concern. Anthropogenic activities along the coast have led to an increasing number of impaired waterbodies and degraded ecosystems. To manage water quality issues, accurate modeling of coastal water quality is vital. The traditional approach is numerical modeling. Despite great advances in hydrodynamic modeling over the past few decades, water quality simulation remains challenging because the performance of a water quality model depends on how well the complex biogeochemical processes are parameterized. While numerical models are the dominant tool for water quality modeling, there are increasing efforts to develop data-driven models in marine sciences. This dissertation addresses several major challenges associated with data-driven models for coastal water quality: difficulties in high-dimensional simulation, missing records in observational data, and uncertain watershed loadings. A data-driven model for coastal water quality is introduced. The proposed model has three major components: (1) forcing transformation auto-selection, (2) empirical orthogonal functions (EOF), and (3) a neural network. It uses EOF to extract the principal components of the target variable and applies a neural network to simulate the temporal variations of the nontrivial components. Unlike previous empirical models, the approach can simulate three-dimensional variations of water quality variables, and it uses only external forcings, not in situ measured physical conditions, as model inputs. The robustness of the model is verified by applying it to predict the spatiotemporal distributions of key water quality variables, including dissolved oxygen (DO) and chlorophyll-a (Chl-a) concentrations, in Chesapeake Bay.
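The EOF step described above can be illustrated with a minimal sketch. It assumes a data matrix with time along the rows and spatial points along the columns, and computes the decomposition with NumPy's SVD. The function names `eof_decompose` and `eof_reconstruct` and the synthetic field are hypothetical and are not taken from the dissertation. In the proposed model a neural network would predict the principal-component time series from external forcings; here they are simply computed from the data.

```python
import numpy as np

def eof_decompose(field, n_modes):
    """Split a (time, space) array into a temporal mean, principal
    components (time series) and EOFs (spatial patterns) via SVD."""
    mean = field.mean(axis=0)               # temporal mean at each point
    anomaly = field - mean
    u, s, vt = np.linalg.svd(anomaly, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]      # shape (time, n_modes)
    eofs = vt[:n_modes]                     # shape (n_modes, space)
    return mean, pcs, eofs

def eof_reconstruct(mean, pcs, eofs):
    """Rebuild the field from the retained modes."""
    return mean + pcs @ eofs

# Synthetic example (illustrative only): two spatial patterns,
# each modulated by its own time series, plus a constant offset.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 120)
patterns = rng.normal(size=(2, 50))
field = (np.outer(np.sin(t), patterns[0])
         + np.outer(np.cos(2.0 * t), patterns[1])
         + 8.0)

mean, pcs, eofs = eof_decompose(field, n_modes=2)
recon = eof_reconstruct(mean, pcs, eofs)
max_error = float(np.max(np.abs(recon - field)))
```

Because the synthetic anomaly field has only two independent modes, keeping two modes reproduces it to within floating-point error. With real observations, the number of retained modes would be a modeling choice.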
Using a major portion of the historical monthly shipboard measurements and the corresponding external forcings for training, the model performs well in predicting both seasonal and interannual variations over the testing period. The model is also tested for high-resolution simulation using Visible Infrared Imaging Radiometer Suite (VIIRS) Chl-a data. The missing records in the satellite data are effectively interpolated by Data Interpolating Empirical Orthogonal Functions (DINEOF). The overall satisfactory model performance demonstrates that, by combining DINEOF and machine learning, it is feasible to use data-driven models to predict high-resolution spatiotemporal variations of water quality variables in coastal waters. Finally, to address the uncertainty in watershed loading, typically an important forcing for coastal water quality, an inverse method is introduced to estimate loading by combining observations with a numerical model. In this method, an estuary is divided into multiple segments. Water and material fluxes between neighboring segments are computed from a set of linear equations derived from mass balance and the relationship between residence time and water fluxes. Even with sparse observational data, the inversely estimated loadings agree well with loadings from a previously calibrated watershed model, demonstrating the reliability of the method. Overall, this dissertation demonstrates the feasibility of using data-driven approaches to model three-dimensional coastal water quality. With rapidly accumulating observational data and rapid advances in machine learning techniques, data-driven approaches have great potential for water quality modeling and environmental management in the future.




© The Author

Available for download on Wednesday, December 13, 2023

Included in

Oceanography Commons