Machine learning powers new approach to detecting soil contaminants

The soil used in this study was collected from Harris Gully, a restored basin and natural area on the Rice University campus. Credit: Brandon Martin/Rice University

A team of researchers from Rice University and Baylor College of Medicine has developed a new strategy for identifying harmful contaminants in soil, including ones that have never been isolated or studied in a lab.

The new approach, described in a study published in Proceedings of the National Academy of Sciences, uses light-based imaging, theoretical predictions of compounds' light signatures, and machine learning (ML) algorithms to detect toxic compounds such as polycyclic aromatic hydrocarbons (PAHs) and their derivative compounds (PACs).

Common byproducts of combustion, PAHs and PACs are linked to cancer, developmental problems and other serious health issues.

Identifying contaminants in soil usually requires advanced laboratories and physical reference samples of the suspected contaminants. However, for many environmental pollutants that pose a public health risk, no experimental data is available that could be used to detect them.

“This method allows us to identify chemicals that have not yet been isolated experimentally,” said Naomi Halas, University Professor and the Stanley C. Moore Professor of Electrical and Computer Engineering at Rice.

The new method uses a light-based imaging technique known as surface-enhanced Raman spectroscopy, which analyzes how light interacts with molecules and tracks the unique patterns, or spectra, that result. Each spectrum acts as a “chemical fingerprint” for a compound. The technique is refined by using signature nanoshells designed to enhance the relevant features in the spectra.

Using density functional theory, a computational modeling technique that predicts how atoms and electrons behave in a molecule, the researchers calculated what the spectra of a whole range of PAHs and PACs look like based on the compounds' molecular structure. This allowed them to generate a virtual library of PAH and PAC “fingerprints.”
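To make the idea of a virtual library concrete, here is a minimal Python sketch of one common way to turn predicted vibrational modes into a simulated spectrum: each predicted line is broadened with a Lorentzian shape and the lines are summed. The compound names, peak positions, intensities and line width are invented for illustration; they are not results from the study, and the article does not describe how the authors built their library.

```python
# Hypothetical (wavenumber in cm^-1, relative intensity) pairs standing in for
# the vibrational modes a density functional theory run might predict.
# All values are invented for illustration.
PREDICTED_MODES = {
    "compound_A": [(620.0, 0.4), (1240.0, 1.0), (1400.0, 0.7)],
    "compound_B": [(590.0, 0.8), (1380.0, 1.0), (1600.0, 0.5)],
}


def lorentzian(x, center, half_width):
    """Unit-height Lorentzian line shape centered at `center`."""
    return half_width ** 2 / ((x - center) ** 2 + half_width ** 2)


def simulate_spectrum(modes, axis, half_width=5.0):
    """Sum one broadened line per predicted mode over the wavenumber axis."""
    return [
        sum(intensity * lorentzian(x, center, half_width)
            for center, intensity in modes)
        for x in axis
    ]


def build_library(predicted, axis):
    """Map each compound name to its simulated spectrum on a shared axis."""
    return {name: simulate_spectrum(modes, axis)
            for name, modes in predicted.items()}


# Shared axis: 400 to 1800 cm^-1 in 2 cm^-1 steps (an assumed range).
AXIS = [400.0 + 2.0 * k for k in range(701)]
LIBRARY = build_library(PREDICTED_MODES, AXIS)
```

Putting every simulated spectrum on the same axis is a design convenience: it lets a measured spectrum be compared against any library entry point by point or peak by peak.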

Two complementary ML algorithms, characteristic peak extraction and characteristic peak similarity, were used to pick out the relevant spectral features in real-world soil samples and match them to compounds mapped out in the virtual library of spectra.
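The two-step idea, pulling peaks out of a measured spectrum and then scoring them against a library, can be sketched in a few lines of Python. This is a toy stand-in and not the authors' algorithms: the local-maximum rule, the 10 cm^-1 matching tolerance, the function names and all data values are assumptions made for illustration.

```python
def extract_peaks(axis, intensities, min_height=0.2):
    """Return positions of local maxima at least `min_height` times the tallest point."""
    top = max(intensities)
    peaks = []
    for k in range(1, len(intensities) - 1):
        y = intensities[k]
        if (y >= min_height * top
                and y > intensities[k - 1]
                and y >= intensities[k + 1]):
            peaks.append(axis[k])
    return peaks


def peak_similarity(sample_peaks, reference_peaks, tolerance=10.0):
    """Fraction of reference peaks with a sample peak within `tolerance` cm^-1."""
    if not reference_peaks:
        return 0.0
    hits = sum(
        1 for ref in reference_peaks
        if any(abs(ref - s) <= tolerance for s in sample_peaks)
    )
    return hits / len(reference_peaks)


def best_match(sample_peaks, library):
    """Score every library entry and return (name, score) of the best one."""
    scores = {name: peak_similarity(sample_peaks, ref)
              for name, ref in library.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Invented library of predicted peak positions (cm^-1) and an invented,
# deliberately coarse measured spectrum.
LIBRARY_PEAKS = {
    "compound_A": [620.0, 1240.0, 1400.0],
    "compound_B": [590.0, 1380.0, 1600.0],
}
AXIS = [580.0, 600.0, 620.0, 640.0, 1220.0, 1240.0, 1260.0, 1380.0, 1400.0, 1420.0]
SAMPLE = [0.05, 0.10, 0.45, 0.12, 0.20, 1.00, 0.25, 0.30, 0.70, 0.15]

sample_peaks = extract_peaks(AXIS, SAMPLE)
match = best_match(sample_peaks, LIBRARY_PEAKS)
print(sample_peaks, match)
```

Matching on peak positions rather than on whole spectra is one plausible reason to separate the two steps: a noisy soil background changes intensities everywhere, while the positions of a compound's characteristic peaks stay comparatively stable.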

“We are using PAHs in soil to illustrate this very important new strategy,” Halas said. “There are tens of thousands of PAH-derived chemicals, and this approach, calculating their spectra and using machine learning to connect the theoretically calculated spectra to the spectra observed in a sample, allows us to identify chemicals for which we may not or do not have experimental data.”

The method addresses a key gap in environmental monitoring and opens the door to identifying a much wider range of hazardous compounds, including those that have changed over time. This is especially important given that soil is a dynamic environment in which chemicals undergo transformations that can make them more difficult to detect.

Thomas Senftle, associate professor of chemical and biomolecular engineering at Rice, compared the process to using facial recognition to find an individual in a crowd.

“You can imagine you have a picture of a person from when they were a teenager, but now they’re in their 30s,” Senftle said. “In my group, what we’re doing is, on the theory side, we can predict what the picture will look like.”

The researchers tested the method on soil from a restored basin and natural area, using both artificially contaminated and control samples. The results showed that the new approach reliably picked out even minute traces of PAHs using a simpler and faster process than traditional techniques.

“This method can identify lesser-known and virtually unstudied PAH and PAC contaminant molecules,” said Rice research scientist Oara Neumann, a co-author of the study.

In the future, the method could enable on-site field testing by integrating the ML algorithms and theoretical spectral libraries into portable Raman devices and mobile systems, making it easier for farmers, communities and environmental agencies to test soil for hazardous compounds without having to send samples to specialized labs and wait for results.

More information: Yilong Ju et al, In silico machine learning-enabled detection of polycyclic aromatic hydrocarbons from contaminated soil, Proceedings of the National Academy of Sciences (2025). DOI: 10.1073/pnas.2427069122

Provided by Rice University

Citation: Machine learning powers new approach to detecting soil contaminants (2025, May 9) retrieved 9 May 2025 from https://phys.org/news/2025-05-machine-powers-approach-soil-contaminants.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.
