Classification of Nicotine Treatment Response Based on Gene Expression Profiles Using Support Vector Machine and Gaussian Process Models
DOI: https://doi.org/10.24036/mjmf.v4i1.50
Keywords: Nicotine exposure, Gene expression profiling, Gaussian Process Classification, iPSC, Support Vector Machine, RNA-seq
Abstract
Nicotine is known to impair endothelial function and increase cardiovascular risk through transcriptional dysregulation. This study investigates the gene expression response of human induced pluripotent stem cell (iPSC)-derived endothelial cells to nicotine exposure using the RNA-seq dataset GSE274506. The analysis covered 40 samples: 20 nicotine-treated and 20 untreated controls. Because the sample size is small relative to the high-dimensional gene expression feature space, models were evaluated with 5-fold stratified outer cross-validation, with a 3-fold inner cross-validation for model tuning. Differential expression analysis identified 46 significant genes (28 upregulated and 18 downregulated), indicating perturbations in G protein-coupled receptor (GPCR) signaling, calcium homeostasis, and inflammatory processes. Functional enrichment analyses based on Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome consistently revealed dominant involvement of GPCR signaling, cyclic adenosine monophosphate (cAMP) signaling, calcium signaling, and transient receptor potential (TRP)-related pathways, suggesting a coordinated molecular response to nicotine-induced stress. To discriminate between nicotine-treated and control samples, Support Vector Machine (SVM) and Gaussian Process Classification (GPC) models were evaluated. The linear SVM achieved the best and most stable performance, with an accuracy of 0.875, an F1-score of 0.881, and a G-mean of 0.861, outperforming SVM with radial basis function kernels, single-kernel GPC variants, and a multiple kernel learning (MKL) GPC model. These findings suggest that the underlying transcriptomic structure of the data is predominantly linear, favoring linear-kernel classifiers in high-dimensional gene expression analysis.
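The evaluation setup described in the abstract (stratified 5-fold outer and 3-fold inner splits on 20 + 20 samples, scored by accuracy, F1-score, and G-mean) can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' code: the helper names `stratified_folds` and `binary_metrics`, the round-robin fold assignment, and the confusion-matrix counts are all assumptions chosen for illustration, and G-mean is taken in its common form as the geometric mean of sensitivity and specificity.

```python
import math
from collections import defaultdict


def stratified_folds(labels, k):
    """Split sample indices into k test folds, preserving class balance.

    Hypothetical helper: indices of each class are dealt round-robin
    into the folds (no shuffling, for reproducibility of the sketch).
    """
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


def binary_metrics(tp, fn, tn, fp):
    """Accuracy, F1-score, and G-mean from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on the treated class
    specificity = tn / (tn + fp)   # recall on the control class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "g_mean": math.sqrt(sensitivity * specificity),
    }


# 20 nicotine-treated (1) and 20 control (0) samples, as in the abstract.
labels = [1] * 20 + [0] * 20

# Outer loop: 5 stratified folds, each held out once for testing.
outer = stratified_folds(labels, 5)
held_out = set(outer[0])
outer_train = [i for i in range(len(labels)) if i not in held_out]

# Inner loop: 3 stratified folds on the outer training set, used for tuning.
inner = stratified_folds([labels[i] for i in outer_train], 3)

print("outer test fold sizes:", [len(f) for f in outer])
print("outer training size:", len(outer_train))
print("inner fold sizes:", [len(f) for f in inner])

# Illustrative counts only; these are NOT results from the study.
print(binary_metrics(tp=9, fn=1, tn=8, fp=2))
```

Each outer test fold holds only 8 samples (4 per class), so a single misclassification shifts fold-level accuracy by 0.125. This is one reason performance on a dataset of this size is usually reported as an average across folds.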
License
Copyright (c) 2026 Mathematical Journal of Modelling and Forecasting

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


