Different algorithms and impurity measures are used for building a Decision Tree (Decision Tree – A Statistical and Analytical tools of better Decisions).
One of the decision tree algorithms is CART (Classification and Regression Tree). CART is developed by Breiman, Friedman, Olshen, & Stone in 1984 (Book - Classification and Regression Trees).
Classification Tree: When decision or target variable is categorical, the decision tree is classification decision tree. e.g. predicting whether customer will default or not (Binary Target variable). Or predicting food choices of the customers (nominal variable) using set of independent variable is an example of Classification Decision Tree.
Regression Tree: When the decision or target variable is continuous variable, the decision tree is called regression decision tree. e.g. predicting house prices using attributes of houses such as size of a house, type of house (independent, apartment etc) and others. The independent variables can continuous and categorical variables.
CART algorithm can be used for building both Classification and Regression Decision Trees. The impurity (or purity) measure used in building decision tree in CART is Gini Index. The decision tree built by CART algorithm is always a binary decision tree (each node will have only two child nodes).
Gini Index: Explained
Consider an example, target variable is Binary variable which means it take two values (Yes and No). There can be 4 combinations.
P(Target=1).P(Target=1) + P(Target=1).P(Target=0) + P(Target=0).P(Target=1) + P(Target=0).P(Target=0) = 1
P(Target=1).P(Target=0) + P(Target=0).P(Target=1) = 1 – P2(Target=0) – P2(Target=1)
Gini Index for Binary Target variable is
= 1 – P2(Target=0) – P2(Target=1)
pt : Proportion of observations with target variable value t. In Binary t takes value 0 and 1.
Similarly if Target Variable is categorical variable with multiple levels, the Gini Index will be still similar. If Target variable takes k different values, the Gini Index will be
Maximum value of Gini Index could be when all target values are equally distributed.
For Binary Target variable, Max Gini Index value
= 1 - (1/2)2 - (1/2)2
= 1 - 2*(1/2)2
= 1- 2*(1/4)
Similarly for Nominal variable with k level, the maximum value Gini Index is
= 1 - 1/k
Minimum value of Gini Index will be 0 when all observations belong to one label.