Data Science: Profile Screening Model for Mid-Management Roles

Business Context: The client was an executive search firm that had built a database of over a million candidate profiles. The client wanted to use this database for smarter candidate selection and recruitment. The aim of this project was to build a predictive model that would help identify a list of ... Read more

Scenarios: Binary Predictive Models

Many business decisions are binary in nature, and we will list a few such scenarios. When a decision variable (also referred to as the target variable, response variable, or dependent variable) is binary (takes only two values), a wide range of supervised statistical and machine learning algorithms can be used. Some of ... Read more

Logistic Regression using R: German Credit Example

Logistic Regression is one of the oldest and most widely used statistical/machine learning techniques for binary decision variable scenarios. In the previous blog, we explained the overall steps to build a predictive model using Logistic Regression. Also, if you are interested in understanding Binary Model Performance Statistics, you can read a detailed blog on Model ... Read more

Decision Tree: Credit Risk Data and Model

Decision Tree is one of the commonly used exploratory data analysis and objective segmentation techniques. A great advantage of a Decision Tree is that its output is relatively easy to understand or interpret. Introduction to Decision Tree and interpreting Decision Tree results: A simple way to understand a decision tree is that it is a hierarchical approach to partition ... Read more

Bagging Algorithm: Concepts with Example

Bagging means Bootstrap Aggregation. Bootstrapping is a process of selecting samples from the original sample (or population) and using these samples for estimating various statistics or model accuracy. Bagging (Bootstrap aggregating) was proposed by Leo Breiman in 1994 for improving classification accuracy. Bootstrapping is a process of creating random samples with replacement for estimating sample statistics. ... Read more
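The resampling step described above can be sketched in a few lines. This is a minimal illustration in Python (not the R used elsewhere in these posts); the function name `bootstrap_means` and the sample values are made up for the example. It draws resamples with replacement and uses the spread of their means to estimate a standard error.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Draw resamples with replacement and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    return [
        statistics.mean(rng.choices(sample, k=n))  # choices() samples with replacement
        for _ in range(n_resamples)
    ]


# Made-up illustrative values, not data from the post.
sample = [12, 15, 9, 20, 14, 11, 18, 16]

means = bootstrap_means(sample)
boot_estimate = statistics.mean(means)    # bootstrap estimate of the mean
boot_std_error = statistics.stdev(means)  # bootstrap estimate of its standard error
```

Bagging applies the same idea to models rather than means: one model is fitted per resample and their predictions are aggregated, for example by majority vote.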

Confusion Matrix and Cost Matrix

A Confusion Matrix is an important tool to measure the accuracy of a classification algorithm. It compares the predicted class of an outcome with the actual outcome. Some examples of classification: Scenario 1: Credit Risk. Based on a credit risk scorecard, applications for credit cards are classified as “Good” and “Bad”. “Good” indicates applicants paying back dues on ... Read more
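As a small sketch of the comparison a confusion matrix makes, the Python snippet below counts actual versus predicted classes for the “Good”/“Bad” credit scenario. The labels follow the scenario above, but the eight observations and the helper name `confusion_matrix` are invented for illustration.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return nested dict: matrix[actual_label][predicted_label] -> count."""
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


labels = ["Good", "Bad"]

# Made-up outcomes for eight applicants, for illustration only.
actual = ["Good", "Good", "Bad", "Good", "Bad", "Bad", "Good", "Good"]
predicted = ["Good", "Bad", "Bad", "Good", "Good", "Bad", "Good", "Good"]

matrix = confusion_matrix(actual, predicted, labels)

# Accuracy is the share of cases on the diagonal (predicted == actual).
accuracy = sum(matrix[label][label] for label in labels) / len(actual)
```

With these values, `matrix["Good"]["Bad"]` counts good applicants wrongly predicted as bad, and `matrix["Bad"]["Good"]` counts bad applicants wrongly predicted as good; a cost matrix would assign a different cost to each of these two error types.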

Training: Predictive Modelling

Predictive Modeling training is a detailed, hands-on Model Development workshop. The training covers detailed steps: target variable definition, data preparation & treatment, and model development. This is, to date, the first detailed training on Predictive Model Development offered in India by any institute. We have experience and example datasets for various ... Read more

Predictive Modelling Technique - Logistic Regression - Interpret Output - Part 2

MAXIMUM LIKELIHOOD AND ODDS RATIO. Analysis of Maximum Likelihood Estimates: Parameters in logistic regression are estimated using Maximum Likelihood Estimation (MLE). The significance of individual explanatory variable parameters is assessed using the Wald Chi-Square test. Parameter: the intercept and explanatory variables used in a logistic model; their weights are estimated using MLE. DF: ... Read more
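The two quantities named in this excerpt follow from a fitted coefficient by standard formulas: the odds ratio is the exponential of the coefficient, and the Wald Chi-Square statistic is the squared ratio of the coefficient to its standard error. A minimal Python sketch, with hypothetical coefficient and standard error values (not taken from the post):

```python
import math


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in the explanatory variable."""
    return math.exp(coefficient)


def wald_chi_square(coefficient, std_error):
    """Wald Chi-Square statistic (1 degree of freedom) for a single parameter."""
    return (coefficient / std_error) ** 2


# Hypothetical estimate and standard error, for illustration only.
coefficient = 0.5
std_error = 0.25

ratio = odds_ratio(coefficient)                 # multiplicative change in the odds
wald = wald_chi_square(coefficient, std_error)  # compared against a chi-square(1) distribution
```

A positive coefficient gives an odds ratio above 1 (higher odds of the event), and a negative one gives a ratio below 1.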

Interview Process - Evaluating Analytical Skills

In the previous blog, we shared the list of questions asked to evaluate communication, confidence, and technical skills (SAS). In the next round, the main expectation was to check the candidate's analytics skills. After the interviewers were comfortable with the technical skills (e.g. SAS in this case), in this round questions were asked ... Read more

Building Predictive Model using SVM and R

Predictive Modelling problems are classified as either classification or regression problems. Within classification, different algorithms could be used based on the level and type of the decision variable (Target Variable). A number of statistical and machine learning techniques are available for both classification and regression problems. Some of the commonly used techniques for ... Read more

Decision Tree: CHAID

There are a number of different Decision Tree building algorithms available for both Regression and Classification problems. One of the great advantages of Decision Tree algorithms is that the output can be easily explained to business users. Some of the decision tree building algorithms are CHAID, CART, and C6.0. In this blog, the focus will be to ... Read more

Support Vector Machine: Simplified

Support Vector Machine (SVM) is a machine learning algorithm used mainly for supervised problems. Other algorithms that SVM can substitute for include Decision Tree, Random Forest, Neural Network, and Logistic Regression, specifically for Binary Classification problems. Due to the mathematical complexity involved in the SVM algorithm, it is sometimes difficult ... Read more

Credit Score: What is it and how is it developed?

When a customer applies for a credit facility (e.g. credit card, personal loan, car/vehicle loan, home/mortgage loan, home equity etc.) at a bank, the bank evaluates the creditworthiness of the customer. What do we mean by creditworthiness? It is the bank's assessment of whether the customer would be able to meet his/her financial obligations. ... Read more

Logic to Madness of Facebook Post Likes?

Regression Model: Improving Facebook Post Likes. In the previous blogs, we have focused on extracting data from the Facebook page of a bank/company, providing descriptive statistics of Facebook Post KPIs, and conducting exploratory data analysis to understand the factors driving the likes or sharing of a post. Extracting Data from Facebook; Association of Likes and Post ... Read more

Learn Predictive Modeling – Why and How?

Data Analytics, Big Data Analytics, Data & Decision Science and Predictive Modeling are some of the hot topics in the digital world. Data analytics and predictive modeling have been used by organizations for many decades, but due to new data sources (from digital and social media) and focus on ... Read more

Neural Network - Tutorial

Inspiration for Neural Networks comes from psychologists and neurobiologists, who wanted to develop computational analogues of neurons. Hence, a computational neural network is also called an artificial neural network. A Neural Network (NN) has neurons and layers in its architecture. The 3 layers in a Neural Network are the Input Layer, Hidden Layer, and Output Layer. Each of these layers has units or neurons in ... Read more
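To make the three-layer architecture concrete, here is a minimal Python sketch of a single forward pass from an input layer through one hidden layer to an output layer. The layer sizes, weights, biases, and the choice of a sigmoid activation are all assumptions made for illustration; they do not come from the tutorial.

```python
import math


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Compute the activations of one layer: one weight row and one bias per neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Input layer: two input units (hypothetical values).
inputs = [0.5, -1.0]

# Hidden layer: three neurons, each with two weights and a bias (made-up numbers).
hidden = layer(inputs, [[0.1, 0.4], [-0.3, 0.2], [0.5, 0.5]], [0.0, 0.1, -0.1])

# Output layer: one neuron that reads the three hidden activations.
output = layer(hidden, [[0.3, -0.2, 0.6]], [0.05])
```

Training a network means adjusting these weights and biases so that `output` moves closer to the known target values; the sketch covers only the prediction step.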

Model Performance: R code and Explanations

In the last blog, we described model performance statistics. Considering the interest and questions from users, we are describing the R functions in a bit more detail. Model Building and Calculating Predicted Probability: For this, we have taken cross-sell data. The name of the data frame is termCrossSell. The dependent variable in this dataset is target. Using ... Read more

Predictive Model Performance Statistics

Predictive Model building is much more than running a logistic regression function or any other technique. In the previous blog, the steps of building a predictive model were discussed. If you are interested in learning how to prepare data for predictive model development, you could explore another blog. Before a predictive model is selected for deployment, it undergoes ... Read more

Survival Modeling Tutorial using R - Part 1

Survival Modeling is a family of techniques used when time to event becomes important. Survival Models can be used for predicting the time of an event (when a customer will take up a product) or estimating the duration until the next event occurs (a customer visit to a retail store). Some of the applications of Survival Modeling across ... Read more

Survival Modeling and Its Applications

Survival Models in Retail Banking: Application of Survival Modeling in Personal Loan. Credit Scoring has been in focus for some time, and the focus is gradually shifting to maximizing profitability and not just minimizing risk. Hence time of default (customer not able to meet its payment obligations) and prepayment (paying off a Personal Loan before ... Read more