کشف الگوها: مطالعه مقایسه ای تکنیک های خوشه بندی در یادگیری ماشین بر روی داده های مصنوعی انواع سردرد جهت پیش بینی داده واقعی

Armina Shafti * ℗, Amir Farrokhian, Mahdiye AbiyarGhamsari, Sepideh Akbari Shadbad

کشف الگوها: مطالعه مقایسه ای تکنیک های خوشه بندی در یادگیری ماشین بر روی داده های مصنوعی انواع سردرد جهت پیش بینی داده واقعی

کد: G-1334

نویسندگان: Armina Shafti * ℗, Amir Farrokhian, Mahdiye AbiyarGhamsari, Sepideh Akbari Shadbad

زمان بندی: زمان بندی نشده!

برچسب: دستیار مجازی هوشمند

دانلود: دانلود پوستر

خلاصه مقاله:

خلاصه مقاله

Background and aims: Headaches are recognized as one of the most prevalent health issues, affecting millions of people worldwide daily. Given its potential to disrupt lives and pose life-threatening risks, accurate diagnosis and effective management are paramount. Considering the limitations posed by real-world data, synthetic data generation emerges as a powerful tool for deeper analysis. This study aims to compare and evaluate various Clustering methods in Machine Learning applied to synthetic datasets representing different types of headaches. The analysis seeks to uncover hidden patterns and structures within the data, ultimately contributing to the enhancement of predictive models and improving headache diagnosis. Method: Comprehensive data on primary and secondary headache disorders were extracted from state-of-the-art versions of resources, including UpToDate, Applied Therapeutics: The Clinical Use of Drugs, The 5-Minute Clinical Consult, Symptoms in the Pharmacy, Community Pharmacy, and Aminoff's Neurology and General Medicine. The collected features of headaches were used to construct a synthetic dataset with the assistance of Synthetic Data Generation Libraries in Python. Subsequently, preprocessing was performed separately for each data type within the heterogeneous dataset, which comprised long-text categorical data, short-text categorical data, and numerical data. The final preprocessed dataset was then utilized to analyze, compare, and evaluate clustering patterns. It was applied to various clustering models, including DBSCAN, HDBSCAN, OPTICS, Agglomerative Clustering, GMM, and Hybrid Agglomerative-DBSCAN, using different hyperparameter configurations. Finally, the Cluster Purity metric was employed to assess the alignment of the dataset with the clustering models. Results: The performance of seven clustering algorithms was evaluated based on Cluster Purity. The results indicate that DBSCAN, HDBSCAN, and OPTICS achieved a purity score of 0.9285, while Hybrid Agglomerative-DBSCAN obtained 0.9285 without hyperparameter tuning and 1 with optimized parameters. In contrast, Agglomerative Clustering yielded 0.7254, and GMM recorded 0.7142. Conclusion: Based on the evaluation results, Hybrid Agglomerative-DBSCAN emerged as the most effective approach, achieving the highest Cluster Purity score of 1, demonstrating superior clustering accuracy. These findings suggest that Hybrid Agglomerative-DBSCAN can effectively identify the appropriate cluster within real-world patient data, facilitating precise classification of headache-related clusters.

کلمات کلیدی

Headache, Machine-Learning, Artificial-Intelligence, Synthetic-Data-Generalization, Clustering-Techniques, Unsupervised-Learning

بازخورد

نظر شما چی هست؟ بر روی ستاره های مورد نظرتون کلیک کنید.

5
  • Review rating
  • Review rating
  • Review rating
  • Review rating
  • Review rating
میانگین نمرات

دیدگاه ها (0)

تاکنون دیدگاهی منتشر نشده است. شما اولین نفر باشید!

ارسال یک دیدگاه