Assignment 2: Unsupervised Learning

Coursework Instructions:

Hello Dear, I am going to submit the file, and you can follow the instructions from there :)) Also, I would appreciate having the .ipynb file to submit to show my work. Thanks, Yalda

Coursework Sample Content Preview:

Fall 2024: CSC-480/680 Assignment 2
Student's Name
University
Course
Professor
Date

Clustering Algorithms, Evaluation Metrics, and Classifiers in Performance Measurement

Classifiers, evaluation metrics, and clustering techniques are central to achieving good results in machine learning. This essay discusses how these elements influence classifier performance and compares the quality of clustering results as assessed by intrinsic and extrinsic measures.

The first evaluation compares two classifiers, Support Vector Machine (SVM) and Naïve Bayes (NB), on the Cleveland Heart Disease dataset using 10-fold Cross-Validation and the Area Under the Curve (AUC). The frequentist t-test and the Bayesian Correlated t-test are used for statistical analysis of the findings; the confidence intervals are discussed alongside the results, and Cohen's d Effect Size is reported.

Next, three classifiers (SVM, Naïve Bayes, and Logistic Regression) are tested on three datasets (Cleveland Heart Disease, Wine Quality, and Adult). The performance measure is the F-measure, and the evaluation method is 10-fold Cross-Validation. Statistical tests, namely Friedman's Test, Nemenyi's Test, and the Bayesian Hierarchical Correlated t-test, are used to verify the differences in performance. In addition, Bootstrapped Confidence Intervals and Kendall's W Effect Size are used to analyze the consistency of classifier performance across the domains.

The last part focuses on clustering algorithms applied to the Cleveland Heart Disease dataset (without labels). K-Means, BIRCH, and DBSCAN are run, and their quality is measured using intrinsic metrics (the Silhouette Coefficient, Davies-Bouldin Index, and Calinski-Harabasz Index) and extrinsic metrics (Purity, Normalized Mutual Information, and Adjusted Rand Index).
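Of the extrinsic measures named above, Purity is the simplest to state: each cluster is credited with its most frequent true class, and the credited points are divided by the total number of points. The following is a minimal sketch using only the Python standard library; the function name `purity` and the toy cluster and class labels are illustrative assumptions, not values from the assignment.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of points that belong to the majority class of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one point is required")

    # Count how often each true class occurs inside each cluster.
    class_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        class_counts[cluster][label] += 1

    # Credit each cluster with the size of its largest class.
    majority_total = sum(max(counts.values()) for counts in class_counts.values())
    return majority_total / len(true_labels)


# Hypothetical toy data: 10 points, 3 clusters, binary class labels.
clusters = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
labels = [1, 1, 0, 0, 0, 0, 1, 1, 1, 1]

print(purity(clusters, labels))  # 0.8  (2 + 3 + 3 majority points out of 10)
```

Purity rises toward 1.0 as the number of clusters grows, which is one reason it is usually reported alongside Normalized Mutual Information and the Adjusted Rand Index rather than alone.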
In addition, the t-SNE projection method is applied to inspect the clusters visually, and the results indicate whether or not the clusters are distinctly separated.

Question 1

SVM Classifier Performance

With the Area Under the Curve (AUC) as the performance measure, the Support Vector Machine (SVM) classifier achieved a mean AUC of 0.842 ± 0.054. This high AUC indicates that the SVM model separated positive and negative heart disease instances well. AUC values above 0.8 are generally taken to indicate a good model: the SVM correctly identified the true positives while keeping false positives low, as elaborated by Feng et al. (2022).

The Naïve Bayes Classifier

The Naïve Bayes (NB) classifier achieved a mean AUC of 0.797 ± 0.063; although this is a good result, it is lower than the SVM's. A likely reason is that NB assumes the features are independent, whereas the data are irregular and the features overlap (Haq et al., 2024).

Frequentist t-test: The t statistic was 2.195 with a p-value of 0.032, supporting the hypothesis that there is a statistically significant difference between the performances of the SVM and Naïve Bayes classifiers.

Cohen's d Effect Size: The calculated value of Cohen's d is 0.758, implying that the effect size between the two classifiers, SVM and Naïve Bayes, is moderate. This gives a quantitative measure of the performance gap and indicates that, on average, the SVM classifier provides better predictive capability than NB.

Confidence Intervals and Interpretation

A graph that depicted the confidence intervals for both classifiers to evaluate...
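The Cohen's d value reported above can be sanity-checked from the published summary statistics alone. The sketch below assumes the two classifiers were scored on the same number of folds, so the pooled standard deviation is the root mean square of the two standard deviations; the function name `cohens_d_from_summary` is an illustrative choice, and the inputs are the rounded means and standard deviations quoted in the text.

```python
import math


def cohens_d_from_summary(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d for two groups of equal size, computed from summary statistics.

    With equal group sizes the pooled standard deviation reduces to
    sqrt((sd_a**2 + sd_b**2) / 2).
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    if pooled_sd == 0:
        raise ValueError("pooled standard deviation is zero; d is undefined")
    return (mean_a - mean_b) / pooled_sd


# Rounded values quoted in the text: SVM 0.842 +/- 0.054, NB 0.797 +/- 0.063.
d = cohens_d_from_summary(0.842, 0.054, 0.797, 0.063)
print(round(d, 2))  # 0.77
```

The result of roughly 0.77 is close to the reported 0.758; a small gap of this kind is expected when the inputs are rounded to three decimals, or if a slightly different pooling formula was used in the original analysis.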

Updated on January 22, 2025
