Machine Learning With IBM SPSS - KMeans Clustering


Introduction

KMeans clustering (closely related to Two-Step Clustering) is an unsupervised machine learning technique, often used when the categories or groups in the data are not known in advance. The aim of the algorithm is to find those groups in the data. The number of groups is the K value, and each group is represented by the mean (center) of its members. Hence the name K-Means.

Before running the SPSS algorithm for KMeans clustering, it is advisable to standardize the variables so that the results are not biased. KMeans is based on distances, so variables measured on larger scales would otherwise dominate; standardizing places all the variables on the same scale. We are using the variables “wheelbase”, “length”, “width” and “height”, and the label variable is “enginelocation_tr”. The variables are highlighted below.

[Screenshot: the variables “wheelbase”, “length”, “width”, “height” and “enginelocation_tr” highlighted in SPSS]

To standardize, go to Analyze >>> Descriptive Statistics >>> Descriptives.

Then select the variables to be standardized and move them into the variables box. The “Save standardized values as variables” option is often unchecked, but it is exactly what we need for this analysis, so make sure it is checked and then click OK. This saves standardized versions of the variables, which we shall use later.

The standardized variables are saved with a “Z” prefix as “Zwheelbase”, “Zlength”, “Zwidth” and “Zheight”.
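For readers who want to see what the saved Z variables contain, here is a minimal Python sketch of the z-score calculation. It assumes the sample standard deviation (n - 1 denominator), and the `wheelbase` values are made up for illustration, not taken from the automobile data.

```python
from statistics import mean, stdev

def zscores(values):
    """Return (x - mean) / sample standard deviation for each value."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in values]

# Hypothetical wheelbase values, for illustration only
wheelbase = [88.6, 94.5, 99.8, 105.8, 110.0]
z_wheelbase = zscores(wheelbase)
print([round(z, 3) for z in z_wheelbase])
```

Each standardized variable ends up with a mean of 0 and a standard deviation of 1, which is what puts the four measurements on the same scale.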

We start the KMeans clustering analysis by going to Analyze >>> Classify >>> K-Means Cluster…

Then select the standardized variables and move them into the variables box. Before clicking OK, review the settings described next.

You may click Iterate… to set the maximum number of iterations and the convergence criterion. In this case we shall leave them at their defaults.
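To make these two settings concrete, here is a rough Python sketch of a KMeans loop that stops either after a maximum number of iterations or once no cluster center moves by more than a tolerance. It is not SPSS's own implementation: the names `kmeans`, `nearest`, `max_iter` and `tol`, the hand-picked starting centers and the toy data are all assumptions for illustration, and SPSS may define its convergence criterion differently.

```python
import math

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))

def kmeans(points, centers, max_iter=10, tol=0.0):
    """Alternate assignment and update steps until the centers settle."""
    centers = [list(c) for c in centers]
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # Assignment step: each case joins its nearest center
        labels = [nearest(p, centers) for p in points]
        # Update step: each center moves to the mean of its members
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        # Largest distance moved by any center in this iteration
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    labels = [nearest(p, centers) for p in points]
    return centers, labels, n_iter

# Hypothetical standardized cases (two variables) and starting centers
points = [(-1.2, -1.0), (-0.8, -1.1), (-1.0, -0.7), (0.9, 1.1), (1.2, 0.8), (0.9, 0.9)]
centers, labels, n_iter = kmeans(points, [points[0], points[3]])
print(labels, n_iter)
```

With this toy data the centers stop moving on the second pass, so the loop ends well before `max_iter` is reached.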

Regarding the output, I often check the ANOVA table option (under Options…) to see statistics of the KMeans clustering such as the p-values. Then click Continue to return to the main dialog.

Additionally, we keep the default of 2 as the number of clusters and select “enginelocation_tr” as the “Label Cases by:” variable.

When you click OK, the output displays the results of the KMeans clustering, such as the number of clusters, the cluster centers and the p-values. One feature I find very handy is the creation of plots directly from SPSS output. I’m going to use that here!

Double-click the output table that you want to create a graph from, then highlight the data that you want to plot. Right-click the highlighted data, select Create Graph and choose the type of graph you want to create.

The bar chart below shows the distribution of the variables for the first and second clusters.

[Figure: bar chart of the variables for clusters 1 and 2]

The tables below show the cluster center for each variable and the iteration history. The iteration history shows that the algorithm converged at the fourth iteration for both clusters 1 and 2. In addition, the ANOVA table shows that the p-value for the variables is < 0.0001.

[Figure: SPSS output tables with the cluster centers, iteration history and ANOVA table]
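The F statistic behind each p-value in the ANOVA table compares the variability between clusters with the variability within clusters for one variable. A minimal Python sketch, using made-up standardized values and a hypothetical helper `anova_f`, could look like this. Keep in mind that the clusters are built to be as different as possible, so these F tests are best read descriptively rather than as formal hypothesis tests.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    k = len(groups)                      # number of clusters
    n = sum(len(g) for g in groups)      # total number of cases
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-cluster and within-cluster sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)    # cluster mean square
    ms_within = ss_within / (n - k)      # error mean square
    return ms_between / ms_within

# Hypothetical standardized values of one variable, split by cluster
cluster_1 = [-1.2, -0.8, -1.0]
cluster_2 = [0.9, 1.2, 0.9]
f_value = anova_f([cluster_1, cluster_2])
print(round(f_value, 2))
```

A large F value means the cluster means for that variable are far apart relative to the spread inside each cluster, which is what a very small p-value in the table reflects.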

Conclusion

In this blog, KMeans clustering in IBM SPSS has been introduced using the automobile data. The algorithm leaves you many choices, and the right ones depend on the goal of your analysis. I often try several numbers of clusters, evaluate each model and select the ones that perform best. As a rule of thumb, 2 to 5 clusters are usually enough; however, when dealing with highly variable data you may need to increase the number of clusters.
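One simple way to compare solutions with different numbers of clusters is the total within-cluster sum of squares, which shrinks as the clusters become tighter. The sketch below is only an assumption about how one might do this outside SPSS: the helper `within_ss`, the toy data and the three candidate cluster memberships are all hypothetical.

```python
def within_ss(points, labels):
    """Total squared distance of every case to the mean of its own cluster."""
    total = 0.0
    for lab in set(labels):
        members = [p for p, q in zip(points, labels) if q == lab]
        center = [sum(col) / len(members) for col in zip(*members)]
        total += sum((x - c) ** 2 for p in members for x, c in zip(p, center))
    return total

# Hypothetical standardized cases (two variables)
points = [(-1.2, -1.0), (-0.8, -1.1), (-1.0, -0.7), (0.9, 1.1), (1.2, 0.8), (0.9, 0.9)]

# Hypothetical cluster memberships from runs with K = 1, 2 and 3
solutions = {
    1: [0, 0, 0, 0, 0, 0],
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 2, 1, 1, 1],
}
wss = {k: within_ss(points, labels) for k, labels in solutions.items()}
for k, value in wss.items():
    print(k, round(value, 3))
```

In this toy example the value drops sharply from one cluster to two and only slightly from two to three, which suggests that two clusters already describe the data well.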

See you in the next technique: Hierarchical Clustering.
