Discriminant Analysis With IBM SPSS

Introduction

Bienvenido de nuevo means "welcome back" in Spanish, so what does discriminant analysis mean in data analytics? In short, it classifies observations into groups, which places it in the “classification” family of analyses. It is widely used by market researchers.

For example, a clinician can use discriminant analysis to identify individuals at high or low risk of stroke, and a supermarket can use it to identify which group of consumers buys a particular fruit.

Business Task or Question.

A loan officer at a bank wants to identify the characteristics of customers who are most likely to miss payments on their loans. This is where discriminant analysis shines: it can be used to characterize and identify good and bad credit risks.

DATA

Suppose bankloan_data has data on 850 past and prospective customers. The customers in the first 700 rows have already received loans; the remaining 150 rows are prospective customers.

We use a random sample of these 700 customers to build a discriminant analysis model and hold out the rest of them to validate it. The model is then used to classify the 150 prospective customers as good or bad credit risks.

Discriminant Analysis Model

The goal of discriminant analysis is to identify the linear combinations of independent variables that best separate the groups of cases.

These combinations, known as discriminant functions, take the following form:

d_ik = b_0k + b_1k x_i1 + ... + b_pk x_ip

Where

d_ik = the value of the k^th discriminant function for the i^th case

p = the number of predictors

b_jk = the value of the j^th coefficient of the k^th function

x_ij = the value of the i^th case of the j^th predictor
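As a rough illustration of how a discriminant function is evaluated, the following Python sketch computes the score for two cases. The intercept, coefficients, and predictor values are made up for illustration; they are not estimates from the bank loan data.

```python
def discriminant_score(intercept, coefficients, case):
    """Return b_0k + sum over j of b_jk * x_ij for a single case."""
    if len(coefficients) != len(case):
        raise ValueError("need exactly one coefficient per predictor")
    return intercept + sum(b * x for b, x in zip(coefficients, case))


# Hypothetical values, NOT estimated from the bank loan data.
b0 = -0.5                   # intercept b_0k
b = [0.1, -0.2, 0.05]       # b_1k .. b_pk, with p = 3 predictors
cases = [
    [10.0, 2.0, 8.0],       # x_1j: predictor values of case 1
    [1.0, 6.0, 20.0],       # x_2j: predictor values of case 2
]

scores = [discriminant_score(b0, b, x) for x in cases]
for i, d in enumerate(scores, start=1):
    print(f"case {i}: d = {d:.2f}")
```

With more than one discriminant function, the same calculation is repeated once per function k with that function's own coefficients.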

Assumptions

The following assumptions apply to the discriminant model:

• The predictors are not highly correlated with each other.

• The mean and variance of a given predictor are not correlated.

• The correlation between two predictors is constant across groups.

• Each predictor’s values follow a normal distribution.
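The first assumption can be screened informally before running the analysis. The sketch below, in plain Python, computes pairwise Pearson correlations and flags pairs above a cutoff. The column names, the toy values, and the 0.8 cutoff are all assumptions chosen for illustration, not a rule from SPSS or from this tutorial's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def highly_correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Toy predictor columns (hypothetical values, not the bank loan data).
columns = {
    "employ": [1, 2, 3, 4, 5],
    "address": [3, 1, 5, 2, 4],
    "debtinc": [2.0, 4.1, 5.9, 8.0, 10.2],
}

flagged = highly_correlated_pairs(columns)
for a, b, r in flagged:
    print(f"{a} and {b} are highly correlated (r = {r:.3f})")
```

In this toy data only the employ and debtinc columns are flagged, because debtinc was constructed to rise almost linearly with employ.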

Data for Analysis Preparation

After importing the data into SPSS, go to Variable View and name the variables as shown in table 1 and figure 1.

[Table 1: variable measurement]

[Figure 1]

Data Transformation and building a discriminant model

Selecting cases at random avoids selection bias, and setting a fixed starting point for the random number generator lets you replicate that random selection exactly. From the menus, choose: Transform => Random Number Generators

Select Set Starting Point, choose Fixed Value, and enter 9191972 as the value. Click OK.

To create the selection variable for validation, choose Transform => Compute Variable from the menus. Type validate in the Target Variable text box (label [1]) and type rv.bernoulli(0.7) in the Numeric Expression text box.

This sets the values of validate to randomly generated Bernoulli variates with probability parameter 0.7.

[Figure 3]

About 70% of the customers who have previously received loans will have a validate value of 1. These customers are used to build the model. The remaining customers who have already received loans are used to validate the model's results.

To perform the computation only for previous customers, click If (label [3]).

Select Include if case satisfies condition and type MISSING(default) = 0 as the conditional expression (label [4] in figure 3).

This ensures that validate is computed only for cases with non-missing values of default, that is, for customers who previously received loans.

A Bernoulli variate takes the value 1 with probability equal to the specified probability parameter and the value 0 otherwise.

The validate variable should be used only with cases that could be used to build the model, i.e., previous customers. The 150 cases in the data file that correspond to potential customers are therefore left without a validate value.
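The logic of the validate variable can be sketched in Python as follows. This is only an illustration of the idea: the list default_flags is a stand-in for the default column (None marks a missing value), and Python's random number generator will not reproduce the exact draws SPSS produces for the same starting value.

```python
import random


def make_validate(default_flags, p=0.7, seed=9191972):
    """Draw a Bernoulli(p) value for each case with a known default status.

    Cases whose default status is missing (None) get no validate value.
    """
    rng = random.Random(seed)   # fixed seed so the selection can be replicated
    return [
        None if flag is None else (1 if rng.random() < p else 0)
        for flag in default_flags
    ]


# Stand-in data: 700 past customers with a known default status (the actual
# 0/1 values do not matter here) followed by 150 prospective customers.
default_flags = [0] * 700 + [None] * 150

validate = make_validate(default_flags)

n_build = sum(1 for v in validate if v == 1)        # used to build the model
n_holdout = sum(1 for v in validate if v == 0)      # used to validate it
n_unassigned = sum(1 for v in validate if v is None)

print(n_build, n_holdout, n_unassigned)
```

Because each case is drawn independently, the share of past customers with validate = 1 is close to 70% but not exactly 70%.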

Using Discriminant Analysis to Assess Credit Risk

To launch the discriminant analysis, choose Analyze => Classify => Discriminant from the menus, as shown in figure 4.

[Figure 4]

Select Previously defaulted as the grouping variable. As independent variables, select Years with current employer, Years at current address, Debt to income ratio (x100), and Credit card debt in thousands.

Select validate as the selection variable. Then select Previously defaulted and click Define Range. Type 0 as the minimum and 1 as the maximum, then click Continue.

In the Discriminant Analysis dialogue, select validate and click Value. Type 1 as the value for the selection variable, then click Continue.

Click Statistics in the Discriminant Analysis dialogue. In the Descriptives category, select Means, Univariate ANOVAs, and Box's M.

In the Function Coefficients category, select Fisher's and Unstandardized. In the Matrices category, select Within-groups correlation. Click Continue.

Click Classify in the Discriminant Analysis dialogue. Click Continue.

Click Save in the Discriminant Analysis dialogue.

Choose Predicted group membership and Probabilities of group membership, then click Continue.
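To give a feel for what these two saved variables contain, here is a minimal Python sketch. It assumes the scores come from linear classification functions whose constants already include the log prior, in which case the group probabilities are proportional to the exponentials of the scores. The two score values are hypothetical, and SPSS's internal computation may differ in detail.

```python
from math import exp


def posterior_probabilities(scores):
    """Turn per-group classification scores into probabilities summing to 1."""
    top = max(scores.values())          # subtract the max for numerical stability
    weights = {group: exp(s - top) for group, s in scores.items()}
    total = sum(weights.values())
    return {group: w / total for group, w in weights.items()}


def predicted_group(scores):
    """The predicted group is the one with the largest score."""
    return max(scores, key=scores.get)


# Hypothetical classification scores for a single customer.
scores = {"No": 3.2, "Yes": 1.1}

probs = posterior_probabilities(scores)
group = predicted_group(scores)
print(group, {g: round(p, 3) for g, p in probs.items()})
```

The predicted group is always the group with the highest probability, so the two saved variables are consistent with each other.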

In the Discriminant Analysis dialogue, click OK.

Statistical Interpretation

The coefficients for Years with current employer and Years at current address are smaller for the Yes classification function. This means that customers who have lived at the same address and worked for the same company for many years are less likely to default.
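The reasoning can be made concrete with a small Python sketch. The coefficients below are invented for illustration (they are not the SPSS output for the bank loan data), and only three predictors are used to keep the example short. A case is assigned to the group whose classification function gives the larger score.

```python
# Hypothetical classification function coefficients, NOT SPSS output.
# Note that employ and address have smaller coefficients for "Yes".
COEFFS = {
    "No": {"constant": -3.0, "employ": 0.25, "address": 0.15, "debtinc": 0.30},
    "Yes": {"constant": -4.0, "employ": 0.10, "address": 0.05, "debtinc": 0.55},
}


def classification_scores(case):
    """Evaluate each group's classification function for one case."""
    return {
        group: c["constant"] + sum(c[name] * value for name, value in case.items())
        for group, c in COEFFS.items()
    }


def classify(case):
    """Assign the case to the group with the largest score."""
    scores = classification_scores(case)
    return max(scores, key=scores.get)


# Two hypothetical customers with the same debt to income ratio.
short_tenure = {"employ": 1, "address": 1, "debtinc": 12}
long_tenure = {"employ": 15, "address": 12, "debtinc": 12}

print(classify(short_tenure))   # score for "Yes" is larger
print(classify(long_tenure))    # score for "No" is larger
```

Each additional year of employment or residence raises the No score by more than the Yes score, so long tenure pushes a customer toward the No (did not default) group.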


Sima Thurlow Feb 07, 2023
