Date Thesis Awarded


Access Type

Honors Thesis -- Open Access

Degree Name

Bachelor of Science (BS)


Data Science


Dana Willner

Committee Members

Dan Cristol

Ron Smith

Adrian Bacong


Asian mothers, as an aggregate, may be at increased risk for adverse perinatal outcomes, but heterogeneity among disaggregated Asian American subgroups is understudied. The first objective of this study is to examine differences in perinatal outcomes among disaggregated Asian American subgroups and between each subgroup and Non-Hispanic Whites (NHWs). The second objective is to develop models that predict gestational diabetes, one perinatal outcome, in Asian Indian mothers to explore how precision medicine might advance pregnancy care. Using the National Vital Statistics System Natality Dataset (n=10,823,868), odds ratios (OR) with 95% confidence intervals (CI) were calculated for four perinatal outcomes (gestational diabetes, gestational hypertension, low birth weight, preterm birth) in six Asian subgroups (Indian, Chinese, Filipino, Japanese, Korean, Vietnamese) compared to NHWs and by nativity. The models adjusted for mother’s age, educational attainment, pre-pregnancy hypertension, pre-pregnancy diabetes, and pre-pregnancy BMI. In addition, three types of models (logistic regression, random forest, and gradient boosted) were built to predict gestational diabetes in Asian American mothers. The odds ratios showed that Asian Americans generally had an increased risk for gestational diabetes and low birth weight and a decreased risk for gestational hypertension and preterm birth; however, results varied between disaggregated subgroups. Foreign-born Asians generally had an increased risk of gestational diabetes and a decreased risk for the other three perinatal outcomes compared to US-born Asians, but variation between subgroups persisted. Accuracy and recall scores varied substantially between the predictive models, and undersampling techniques also affected model performance. Ultimately, the results show that Asian Americans face different risks in perinatal outcomes compared to NHWs, that results are heterogeneous across disaggregated Asian subgroups, and that machine learning can be used to make personalized risk predictions.
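
As an illustration of the odds ratio and 95% confidence interval calculation described in the abstract, the following is a minimal sketch of an unadjusted odds ratio computed from a 2x2 table with a Wald interval on the log scale. It is not the thesis's method: the thesis reports odds ratios adjusted for maternal covariates, which would require a regression model. The function name `odds_ratio_ci` and the cell counts are hypothetical and are not taken from the Natality Dataset.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: comparison group, outcome present
    b: comparison group, outcome absent
    c: reference group, outcome present
    d: reference group, outcome absent
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not study data)
or_est, ci_low, ci_high = odds_ratio_ci(120, 880, 60, 940)
print(f"OR = {or_est:.2f}, 95% CI ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1 suggests the odds of the outcome differ between the two groups. An adjusted estimate like those in the abstract would instead typically come from exponentiating a logistic regression coefficient and its confidence bounds.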