Sarah C. Shi


University of California, Berkeley, Ph.D. in Earth Science (2024-); Chancellor's Fellow, Department Fellow
Data Science Fellow at the Lamont-Doherty Earth Observatory, Columbia University (2022-2024)
University of Cambridge, M.Phil in Earth Science (2021-2022); Euretta J. Kellett Fellow
Columbia University, B.A. in Earth Science (2016-2020)

C.V.
GitHub
@sarahcshi
sarah.shi@columbia.edu

Research

Autoencoder
Leveraging Bayesian neural networks with variational inference and autoencoders to classify common minerals in global geochemical data repositories and in EDS maps.
QEMSCAN zoning
Utilizing machine learning to automate petrographic and mineralogical observation from energy dispersive X-ray spectroscopy (EDS) and electron microprobe (EPMA) data.
FTIR Baselines
Developing PyIRoGlass, an open-source package that provides a reproducible and documented method for reducing FTIR spectra, with routines for determining concentrations of H2O and CO2.
Olivines
Probing syn-eruptive dynamics of the Fuego 2018 eruption with diffusion chronometry in olivine and volatiles in melt inclusions.
Thermometry
Developing new olivine-saturated melt geothermometers with inversions for reducing temperature uncertainties.

Code

Open, accessible, and reproducible science is critical. I am deeply interested in using statistics and machine learning to develop open-source tools for petrologic questions.
PyIRoGlass, a Bayesian MCMC algorithm for fitting baselines to the FTIR spectra of basaltic-andesitic glasses
mineralML, a Python package leveraging machine learning for probabilistic mineral classification in repository and analytical data

Publications

[2, in press] 2024 Moussallam, Y., Towbin, W.H., Plank, T.A., Bureau, H., Khodja, H., Guan, Y., Ma, C., Baker, M.B., Stolper, E.M., Naab, F.U., Monteleone, B.D., Gaetani, G.A., Shimizu, K., Ushikubo, T., Lee, H., Ding, S., Shi, S.C., Rose-Koga, E.F., ND70 series basaltic glass reference materials for volatile elements (H2O, CO2, S, Cl, F) analysis and the C ionisation efficiency suppression effect of water in silicate glasses in SIMS analysis. Geostandards and Geoanalytical Research.
[1, in press] 2023 Shi, S.C., Towbin, W.H., Plank, T.A., Barth, A.C., Rasmussen, D., Moussallam, Y., Lee, H., Menke, W., PyIRoGlass: An Open-Source, Bayesian MCMC Algorithm for Fitting Baselines to FTIR Spectra of Basaltic-Andesitic Glasses. Volcanica.

Conferences

[16] Shi, S.C., Antoshechkina, P., Lehnert, K., Profeta, L., Figueroa, J.D., Cao, S., Class, C., Wieser, P., Toth, N., Harnessing Flexible Search Tools and Machine Learning for Data-Driven Discovery with EarthChem, Goldschmidt 2024 (Talk).
[15] Shi, S.C., Towbin, W.H., Plank, T.A., Barth, A.C., Rasmussen, D., Moussallam, Y., Lee, H., Menke, W., Quantifying H2O and CO2 Concentrations and Uncertainties with PyIRoGlass: An Open-Source Bayesian MCMC Algorithm for Fitting Baselines to Basaltic-Andesitic FTIR Spectra, Goldschmidt 2024 (Talk).
[14] Shi, S.C., Wieser, P., Toth, N., Antoshechkina, P., Lehnert, K., mineralML: Leveraging Machine Learning for Probabilistic Mineral Classification, Gordon Research Seminar, Geochemistry of Mineral Deposits (Invited Talk).
[13] Tweedy, R., Shi, S.C., Uno, K.T., Machine Learning Analysis of n-Alkanes from Woody and Grassy African Plants, NE GSA 2024 (Talk).
[12] Shi, S.C., Wieser, P., Toth, N., Antoshechkina, P., Lehnert, K., MIN-ML: Leveraging Machine Learning for Probabilistic Mineral Classification in Geochemical Databases, AGU 2023 (Talk).
[11] Tweedy, R., Shi, S.C., Uno, K.T., African Plant Functional Type Identification from n-Alkanes Chain Lengths via Non-Linear Methods, AGU 2023 (Talk).
[10] Bidgood, A., Shi, S.C., Prabhu, A., Que, X., Twigg, H., Using Supervised and Unsupervised Machine Learning Methods to Predict Missing Geochemical Data and Determine Geochemical Trends in Multielement Systems: Application to Sediment-Hosted Ore Deposits, AGU 2023 (Poster).
[9] Prabhu, A., Wong, M.L., Morrison, S.M.M., Ostroverkhova, A., Clark, M., Zhong, H., Prestgard, T.J., Li, W., Williams, J.R., Shi, S.C., Mays, J., Hazen, R., From detecting agnostic biosignatures to characterizing chondrites: How network science is perfect for making scientific discoveries with geochemical data, AGU 2023 (Invited Talk).
[8] Shi, S.C., Wieser, P., Toth, N., Antoshechkina, P., Lehnert, K., MIN-ML: A Machine Learning Framework for Exploring Mineral Relations and Classifying Common Igneous Minerals, Goldschmidt 2023 (Invited Workshop Talk).
[7] Shi, S.C., Wieser, P., Lehnert, K., Profeta, L., MIN-ML: A Machine Learning Framework for Exploring Mineral Relations and Classifying Common Igneous Minerals, EGU 2023 (Talk).
[6] Tweedy, R., Shi, S.C., Uno, K.T., Grass in the Past: Eastern African Chemotaxonomy from Plant Wax n-alkanes, AGU 2022 (Poster).
[5] Shi, S.C., Barth, A.C., Plank, T.A., Towbin, W.H., Flores, O., Arias, C.P., Magma stalling weakens eruption: Uncertainty quantification in thermometry and volatile measurements, VMSG 2022 (Talk).
[4] Toth, N., Shi, S.C., Maclennan, J., Automated petrography using machine learning, VMSG 2022 (Poster).
[3] Shi, S.C., Barth, A.C., Plank, T.A., Towbin, W.H., Magma stalling weakens eruption, AGU 2021 (Talk and ePoster).
[2] Shi, S.C., Cerling, T.E., Uno, K.T., What plant is that? Chemotaxonomy from n-alkane molecular distributions of East African plants with implications for paleoecology, AGU 2018 (Poster).
[1] Shi, S.C., Cerling, T.E., Uno, K.T., Resolving taxonomy with n-alkane molecular distributions of East African plants, Columbia University Chandler Society Research Symposium (Invited Talk).

Field

Poás Volcano, Costa Rica
Highlands, Iceland
Aberdare National Park, Kenya
Cornwall (ESB Field Geology), UK

Teaching

I develop Jupyter notebooks to teach computational basics, statistical thinking, machine learning, and petrology. This work supports the educational goals of the IEDA2 data infrastructure, funded by the U.S. National Science Foundation through a cooperative agreement. These materials are available in my earthchem-teaching GitHub repository.

The python_fundamentals notebook was developed as an introduction to Python. We explore some basics with Python, NumPy, pandas, and matplotlib.

The SERC_MORB_colab notebook was developed for and taught to the Earth's Environmental Systems: Solid Earth course at Columbia University. The notebook provides code for visualizing and understanding trends in petrologic melt data. We explore global chemical variability in mid-ocean ridge basalts (MORB), understand variations in mantle melting temperatures, and think about what this all means for crustal thicknesses and seismic velocities.

The MINML_colab notebook was taught during the Goldschmidt 2023 Conference Workshop - Open Data in Geochemistry: Navigating Present Data Infrastructure and during the NFDI4Earth Lecture Series. The notebook provides code for visualizing and understanding trends in petrologic mineral data, and for utilizing machine learning to perform mineral classification. We explore large mineral datasets from PetDB and GEOROC.