Shanghai Journal of Stomatology, 2024, Vol. 33, Issue (6): 600-607. doi: 10.19439/j.sjos.2024.06.006

• Original Article •

Screening of characteristic genes of salivary gland adenoid cystic carcinoma based on weighted co-expression network and machine learning

BU Wen-chao1,2, CHEN Shi-xin1,2, JIANG Yin-hua1,2, CAO Ming-guo1,2, WU Xin-ru1, GUAN Yun-qian1, XIE Si-yuan1

  1. School of Medicine, Lishui University, Lishui 323000, Zhejiang Province, China;
  2. The First Affiliated Hospital of Lishui University, Lishui 323000, Zhejiang Province, China
  • Received: 2023-11-02 Revised: 2024-01-08 Online: 2024-12-25 Published: 2025-01-07
  • Corresponding author: CAO Ming-guo, E-mail: cmg@lsu.edu.cn
  • About the author: BU Wen-chao (born 1992), male, master's degree, E-mail: 2318015868@qq.com
  • Funding:
    Zhejiang Province General Scientific Research Project (Y202351961); Zhejiang Province College Students' Innovation and Entrepreneurship Project (S202110352022, S202110352030); Zhejiang Province Xinmiao Talent Program (2021R434008)

Abstract: PURPOSE: To identify potential biomarkers of salivary gland adenoid cystic carcinoma (SACC) and thereby better understand its underlying pathogenesis. METHODS: Two microarray datasets (GSE59701, GSE88804) were downloaded from the NCBI GEO database. The LIMMA software package was used to screen for genes differentially expressed in SACC. Weighted gene co-expression network analysis (WGCNA) was used to find the module genes most strongly associated with SACC. Two machine learning algorithms, least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVM-RFE), were used to identify hub genes. ROC curves for predicting SACC were then generated to assess diagnostic performance. R 4.2.1 software was used for statistical analysis. RESULTS: Three hub genes (GABBR1, EN1 and LINC01296) were identified, and ROC curves with high predictive performance (AUC, 1.000-1.000) were established from them. CONCLUSIONS: Three hub genes (GABBR1, EN1 and LINC01296) were obtained by WGCNA, LASSO and SVM-RFE as potential biomarkers of SACC. These findings provide a basis for future research on potential key genes of SACC and for its early diagnosis and treatment.

Key words: Salivary gland adenoid cystic carcinoma, Machine learning, Weighted analysis, Bioinformatics
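
As an illustration of the AUC statistic reported in the abstract, the following is a minimal sketch, not the authors' R workflow. It computes the area under the ROC curve as the fraction of (tumor, normal) sample pairs in which the tumor sample has the higher score. The function name `roc_auc` and all expression values are hypothetical and made up for the example. It shows that an AUC of 1.000 arises whenever a gene's expression values separate the two groups completely.

```python
def roc_auc(scores, labels):
    """Return the ROC AUC for binary labels (1 = tumor, 0 = normal).

    The AUC equals the probability that a randomly chosen positive
    sample scores higher than a randomly chosen negative sample,
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values (not taken from GSE59701 or GSE88804).
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Every tumor value exceeds every normal value: perfect separation.
separated = [8.1, 7.9, 9.3, 8.6, 5.2, 4.8, 6.0, 5.5]

# Two tumor values fall inside the normal range: imperfect separation.
overlapping = [8.1, 5.0, 9.3, 5.8, 5.2, 4.8, 6.0, 5.5]

auc_separated = roc_auc(separated, labels)
auc_overlapping = roc_auc(overlapping, labels)
print(auc_separated, auc_overlapping)
```

With small sample sizes, complete separation of this kind is more likely to occur, so an AUC of 1.000 obtained on the discovery data is usually checked against independent samples.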
