Multi-condition machine learning models for understanding retention mechanisms and predicting retention time in supercritical fluid chromatography/mass spectrometry

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Modern supercritical fluid chromatography (SFC) enables fast and efficient separations owing to the low viscosity and high diffusivity of supercritical mobile phases. However, its retention mechanisms remain incompletely understood, limiting method development and confident compound identification in SFC/MS. In this study, the retention times (RTs) of 1217 compounds measured under 51 chromatographic conditions (covering 15 stationary phases, three modifier chemistries (neutral, acidic, and basic), and two gradient programs) were analyzed to develop RT prediction models and elucidate the underlying retention mechanisms.

Results: Gradient boosting (GB) models were first trained separately for each condition using the measured RTs together with 2285 molecular descriptors. Then, for the first time, system descriptors encoding chromatographic conditions (i.e., stationary phase, modifier, and gradient type) were introduced to integrate these individual models into multi-condition models. These models achieved high predictive accuracy, with R² values of 0.951 and 0.923 and mean absolute errors (MAE) of 0.613 and 0.520 min for Gradients 1 (G1) and 2 (G2), respectively. To interpret retention mechanisms, GB-selected descriptors were quantified using partial least squares (PLS), classified into 10 physicochemical categories, and evaluated using the normalized combination effect (nCE) across conditions. Subsequently, RT shift analysis revealed the most pronounced differences between neutral and acidic media. Finally, heatmaps for each stationary phase summarized peak quality and detection percentages for functional group clusters.

Significance: By introducing system descriptors, this study established multi-condition RT prediction models that accurately predict retention across diverse SFC conditions. Moreover, comprehensive descriptor-based analysis under 51 conditions elucidated the underlying retention mechanisms and provided a practical framework for selecting optimal analytical conditions.
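
The idea of appending system descriptors to molecular descriptors, and the two reported metrics (R² and MAE), can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the system descriptors were encoded, so one-hot encoding is an assumption here, and the level names (`phase_A`, `G1`, and so on) and the helper functions `one_hot`, `build_row`, `mean_absolute_error`, and `r_squared` are hypothetical names chosen for illustration.

```python
from math import fsum

# Hypothetical level names. The study covers 15 stationary phases, three
# modifier chemistries, and two gradient programs; only a few placeholder
# phases are listed here to keep the example short.
STATIONARY_PHASES = ["phase_A", "phase_B", "phase_C"]
MODIFIERS = ["neutral", "acidic", "basic"]
GRADIENTS = ["G1", "G2"]


def one_hot(value, levels):
    """Return a 0/1 list marking which entry of `levels` equals `value`."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1.0 if value == level else 0.0 for level in levels]


def build_row(molecular_descriptors, stationary_phase, modifier, gradient):
    """Concatenate molecular descriptors with one-hot system descriptors.

    Assumption: one-hot encoding stands in for whatever encoding the
    study actually used for its system descriptors.
    """
    return (
        list(molecular_descriptors)
        + one_hot(stationary_phase, STATIONARY_PHASES)
        + one_hot(modifier, MODIFIERS)
        + one_hot(gradient, GRADIENTS)
    )


def mean_absolute_error(observed, predicted):
    """Mean of |observed - predicted|, in the units of the inputs (e.g. min)."""
    return fsum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = fsum(observed) / len(observed)
    ss_res = fsum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = fsum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# One compound (two made-up descriptor values) measured under one condition.
row = build_row([0.5, 1.2], "phase_B", "acidic", "G2")

# Made-up retention times in minutes, used only to exercise the metrics.
observed_rt = [1.0, 2.0, 3.0]
predicted_rt = [1.5, 2.0, 2.5]
mae = mean_absolute_error(observed_rt, predicted_rt)
r2 = r_squared(observed_rt, predicted_rt)
```

Rows built this way for every compound and condition could be stacked into a single training table, so that one regressor sees both the molecule and the chromatographic system; the choice of regressor and its settings are left out because the abstract gives no such details.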

Original language: English
Article number: 345026
Journal: Analytica Chimica Acta
Volume: 1385
Publication status: Published - Feb 1 2026

All Science Journal Classification (ASJC) codes

  • Analytical Chemistry
  • Environmental Chemistry
  • Biochemistry
  • Spectroscopy
