What drives Facebook Post Likes?

Digital media and channels have become a key focus for organizations across industries and geographies. Web and social media channels are used for prospect creation, customer acquisition, customer engagement, and customer servicing.
Prospect Creation: One of the basic reasons for creating a Facebook page is to establish a presence on social media (e.g. Facebook, LinkedIn, Instagram). Social media visitors will come across the company, and some of them are expected to become interested in the products and services it offers. Company pages and posts are therefore expected to create interest and prospects.

Customer Acquisition: Along the sales funnel, some of the prospects (prospective customers) get converted into customers. Providing the required information and making it easy for prospects to apply are critical.

Customer Engagement: A large number of organizations focus on acquiring customers (across online and offline channels), since acquisition is easy to measure and to assign responsibility for. They also need to recognize the value of customer engagement and retention. Understanding customer preferences and interests can help them engage customers on social media channels and create long-term value for both the customers and the organization.

Customer Servicing: Over the years, social media channels have become important channels for servicing customers. For example, customers who have questions can get answers on the company's social media pages.

One way to measure a company's engagement on social media is to count the likes its posts receive over time. Understanding which posts succeed in terms of likes can help the company post content and engage customers based on their preferences.

In this blog, we will try to understand the factors, or factor levels, that drive likes on Facebook. One dimension, the type of post, will be explored in detail.

We can use the Facebook API to extract the data from a company page. See the blog on how to extract data from Facebook.

Read Facebook Data


setwd("C:\\Ram\\Learn R\\facebook")
# Assumption: any further read.csv arguments are left at their defaults
facebook.likes <- read.csv(file="nab.page.list.csv")
names(facebook.likes)
##  [1] "X"              "from_id"        "from_name"      "message"       
##  [5] "created_time"   "type"           "link"           "id"            
##  [9] "likes_count"    "comments_count" "shares_count"


When we extract data from a company page, we typically get the following variables.

  • From ID: Who posted on the page (Company ID)
  • From Name: Company Name
  • Message
  • Created Time
  • Type: Whether Video, Event, Link, Photo or Status
  • Link
  • Like Counts
  • Comment Counts
  • Share Counts

Since likes, comments and shares are all important measures of the engagement generated by a post, we could create a variable engagement_volume as the sum of these three measures. To keep it simple, however, we will look at the impact on likes only.
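As a minimal sketch of the combined measure mentioned above, the snippet below builds engagement_volume on a small made-up data frame. The column names follow the extracted data, but the values and the data frame name posts are hypothetical.

```r
# Toy data standing in for the extracted page data; the values are made up.
posts <- data.frame(
  type           = c("photo", "video", "link"),
  likes_count    = c(120, 300, 45),
  comments_count = c(10, 25, 3),
  shares_count   = c(5, 40, 1)
)

# engagement_volume: sum of the three engagement measures for each post
posts$engagement_volume <- with(posts, likes_count + comments_count + shares_count)

print(posts[, c("type", "engagement_volume")])
```

The rest of this post sticks to likes_count alone, so this variable is not used further.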

One base hypothesis is that the type of post drives different levels of likes. We want to compare the average number of likes and test the null hypothesis that there is no difference in mean likes across types of post.

Box Plot

A box plot can help in comparing likes across types of post.

boxplot(likes_count ~ type, data=facebook.likes,  # assumed opening of the call
        xlab="Type of Post")

Box Plot -v1

It seems there are a few posts with a very high level of engagement, and they appear to be outliers. So, we can apply outlier treatment.

# Assumption: the quantile levels are taken from the 40% and 5% figures quoted below
q <- quantile(facebook.likes$likes_count, probs=c(0.40, 0.95))
q


Around 40% of the posts have fewer than 3,500 likes, and 5% of the posts have over 70,000 likes.

We will consider only posts with between 3,500 and 70,000 likes. We could cap the values instead, but to keep it simple, we have excluded posts that were either extremely successful (and probably went viral) or drew little interest from customers.
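To illustrate the difference between capping and exclusion, here is a small sketch on made-up like counts. The vector likes and the variable names are hypothetical; only the two thresholds come from the text above.

```r
# Made-up like counts, including one very low and one very high value.
likes <- c(200, 4000, 9000, 15000, 30000, 250000)

lower <- 3500    # thresholds taken from the text above
upper <- 70000

# Capping: pull values outside [lower, upper] back to the nearest bound
likes_capped <- pmin(pmax(likes, lower), upper)

# Exclusion (the approach used in this post): drop values outside the range
likes_kept <- likes[likes > lower & likes < upper]

print(likes_capped)
print(likes_kept)
```

Capping keeps every post in the sample but flattens the extremes, while exclusion shrinks the sample and leaves the remaining values untouched.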

facebook.likes1 <- facebook.likes[facebook.likes$likes_count>3500 & facebook.likes$likes_count<70000,]

Now we again use a box plot to compare engagement across types. First, the average likes by type:

## : event
## [1] NA
## -------------------------------------------------------- 
## : link
## [1] 8892
## -------------------------------------------------------- 
## : photo
## [1] 20022
## -------------------------------------------------------- 
## : status
## [1] 9673
## -------------------------------------------------------- 
## : video
## [1] 16529
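boxplot(likes_count ~ type, data=facebook.likes1, ylab="Count of Engagement",  # assumed opening of the call
        xlab="Type of Post")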

Box Plot -v2

It seems that a photo post has on average about 20,000 likes, compared with about 16,500 for a video post.
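Per-type averages like the ones listed above can be computed in several ways. The call used in the original analysis is not shown, so the sketch below uses tapply on made-up data as one possible approach. It also shows that a factor level with no posts left yields NA, which may be why event shows NA above.

```r
# Made-up data; "event" is a factor level with no posts.
posts <- data.frame(
  type = factor(c("photo", "photo", "video", "video", "link", "status"),
                levels = c("event", "link", "photo", "status", "video")),
  likes_count = c(18000, 22000, 12000, 21000, 9000, 9500)
)

# Mean likes per type; groups with no observations come back as NA
avg_likes <- tapply(posts$likes_count, posts$type, mean)
print(avg_likes)
```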

Looking at the distribution of likes for video posts, it is evident that a video post has a higher chance of going viral (a few posts have exceptionally high likes), whereas a photo post has a higher chance of getting a reasonable level of likes.

ANOVA - Analysis of Variance

Analysis of Variance (ANOVA) can help establish whether the average number of likes differs by type of post. The aov function fits an ANOVA model in R.

##                        Df   Sum Sq  Mean Sq F value Pr(>F)    
## facebook.likes1$type    3 2.73e+10 9.11e+09    37.8 <2e-16 ***
## Residuals            2083 5.02e+11 2.41e+08                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The F statistic is significant, with a p-value below 2e-16. So the null hypothesis of no difference in average likes across types is rejected.
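A one-way ANOVA of this kind can be sketched as follows. The data here are synthetic (random draws with made-up group means), and the names posts, fit and p_value are hypothetical, so the numbers will not match the table above.

```r
set.seed(42)  # arbitrary seed so the made-up data is reproducible

# Synthetic likes for three post types with different true means
posts <- data.frame(
  type = factor(rep(c("link", "photo", "video"), each = 50)),
  likes_count = c(rnorm(50, mean = 9000,  sd = 3000),
                  rnorm(50, mean = 20000, sd = 3000),
                  rnorm(50, mean = 16500, sd = 3000))
)

# One-way ANOVA: does mean likes_count differ by type?
fit <- aov(likes_count ~ type, data = posts)
anova_table <- summary(fit)[[1]]
print(anova_table)

# p-value of the F test for the type effect (first row of the table)
p_value <- anova_table[["Pr(>F)"]][1]
```

A small p_value suggests that at least one type has a different mean, but it does not say which one.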


We have analysed likes only by type of post, and we can conclude that type does influence likes and that, on average, a photo post receives more likes. Based on this insight, the organization could focus more on photo-based posts. Further text analytics could be done on the comments and on the post text itself, for example to see how keywords and the level of detail in a post's text influence its likes.

It has been noticed that a few video-based posts have abnormally high likes, and understanding the factors behind them could also be interesting. The association between video post attributes and the level of likes could be analysed.

Note: The aim of this blog is to show that a simple analysis can yield insights, not to validate all the assumptions. We recognise that likes may not follow a normal distribution and that a non-parametric alternative to ANOVA should be used for comparing the mean/median like counts.
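One common non-parametric alternative is the Kruskal-Wallis rank sum test, available in base R as kruskal.test. The sketch below runs it on synthetic, right-skewed data. The data and the names posts and kw are hypothetical, and this test was not part of the analysis above.

```r
set.seed(1)  # arbitrary seed so the made-up data is reproducible

# Synthetic, right-skewed (log-normal) likes for three post types
posts <- data.frame(
  type = factor(rep(c("link", "photo", "video"), each = 40)),
  likes_count = c(rlnorm(40, meanlog = 9.0, sdlog = 0.5),
                  rlnorm(40, meanlog = 9.9, sdlog = 0.5),
                  rlnorm(40, meanlog = 9.6, sdlog = 0.9))
)

# Kruskal-Wallis test: rank-based comparison of likes_count across types
kw <- kruskal.test(likes_count ~ type, data = posts)
print(kw)
```

Because the test works on ranks, a handful of viral posts has far less influence on the result than it has on the group means.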

