Data Wrangling using Python - Part 1

In this blog, we will show some commonly used data wrangling steps in Python. We will use a pandas data frame as our data object throughout.

Data Wrangling with Python

Importing Python Packages

In this part of the blog, we will use the pandas and numpy packages available in Python. We need to import these packages before using them.
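The import step might look like the following. The aliases pd and np are the usual convention rather than a requirement, and printing the versions is only an optional check that the imports worked.

```python
import numpy as np
import pandas as pd

# Optional check: print the installed versions to confirm both imports worked.
print("pandas", pd.__version__)
print("numpy", np.__version__)
```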

Creating a Data Frame in Python

Now we want to create a data frame in Python, and there are multiple ways to do that.

  • Creating a data frame from individual columns
  • Reading data into a data frame
  • Converting a different object to a data frame

Creating a data frame: we create a series of random numbers and store them in a data frame, df1.

The resulting data frame has 3 columns: c1, c2 and c3.
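A minimal sketch of this step is shown below. The names df1, c1, c2 and c3 come from the text; the row count of 10, the seed, and the use of numpy's default_rng generator are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so the random numbers are reproducible (seed is an assumption).
rng = np.random.default_rng(42)

# Each column holds 10 random numbers in [0, 1); the row count is an assumption.
df1 = pd.DataFrame({
    "c1": rng.random(10),
    "c2": rng.random(10),
    "c3": rng.random(10),
})

print(df1.head())
```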

We can also create a data frame by combining different columns into a dictionary and converting the dictionary to a data frame.

"a" is a dictionary and "df" is a data frame.
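The dictionary approach can be sketched as follows. The names a and df follow the text; the column names and values inside the dictionary are hypothetical.

```python
import pandas as pd

# "a" is a dictionary: keys become column names, values become column data.
a = {
    "name": ["x", "y", "z"],   # hypothetical column
    "score": [10, 20, 30],     # hypothetical column
}

# "df" is the data frame built from the dictionary.
df = pd.DataFrame(a)

print(df)
```

All lists in the dictionary need the same length, otherwise pandas raises a ValueError.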

Renaming Columns of a Data Frame

In a number of scenarios, we may want to rename the columns of an existing data frame in Python. Some of the ways to rename column(s) are:
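Two common approaches are sketched here: renaming selected columns with a mapping, and replacing all column names at once. The new column names are hypothetical, and the small df1 is rebuilt only to keep the example self-contained.

```python
import pandas as pd

df1 = pd.DataFrame({"c1": [1, 2], "c2": [3, 4], "c3": [5, 6]})

# Option 1: rename selected columns with a mapping.
# rename() returns a new data frame and leaves df1 unchanged.
renamed = df1.rename(columns={"c1": "col1", "c2": "col2"})

# Option 2: replace every column name at once by assigning a full list.
relabeled = df1.copy()
relabeled.columns = ["x", "y", "z"]

print(renamed.columns.tolist())
print(relabeled.columns.tolist())
```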

Drop Column(s) of a Data Frame

We can drop a column by its name or by its position.
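Both variants might look like this. pandas drops columns by label, so the position-based version first looks up the label at that position. The small df1 is rebuilt here only to keep the example self-contained.

```python
import pandas as pd

df1 = pd.DataFrame({"c1": [1, 2], "c2": [3, 4], "c3": [5, 6]})

# Drop by name: pass one or more column labels.
by_name = df1.drop(columns=["c2"])

# Drop by position: select the label(s) at the given position(s), then drop them.
by_position = df1.drop(columns=df1.columns[[0]])

print(by_name.columns.tolist())
print(by_position.columns.tolist())
```

drop() returns a new data frame, so df1 itself still has all three columns afterwards.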

Add New Column(s) to a Data Frame

We can create a new column and add it to an existing data frame. To create the new column, we can use existing columns or add other data.

First, we add a new column "c4" by multiplying the existing column "c1" by 10. Then we repeat "3" and "5" to create a new column "c5".
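The two steps could be written as below. The sample values in c1 are made up, and treating 3 and 5 as integers (rather than strings) and using np.resize to repeat the pattern are assumptions.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"c1": [0.5, 1.5, 2.5, 3.5]})

# New column from an existing one: c1 multiplied by 10.
df1["c4"] = df1["c1"] * 10

# New column from other data: repeat the pattern 3, 5 until it matches the row count.
df1["c5"] = np.resize([3, 5], len(df1))

print(df1)
```

The new column must have as many values as the data frame has rows, which is why the pattern is sized with len(df1).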

Change Existing Column(s) of a Data Frame

There are a number of reasons to change existing columns. We may want to convert floating-point values to integers, convert string values to numeric values, or round off values. Here are ways to handle each of these scenarios.
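One way to cover the three scenarios is sketched here, using astype, pd.to_numeric and round. The sample values are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "c1": [1.7, 2.2, 3.9],        # floats to convert to integers
    "c2": ["10", "20", "30"],     # numbers stored as strings
    "c3": [0.12345, 1.98765, 2.5] # floats to round
})

# Float to integer: astype(int) truncates the decimal part, it does not round.
df1["c1"] = df1["c1"].astype(int)

# String to numeric: to_numeric parses each string into a number.
df1["c2"] = pd.to_numeric(df1["c2"])

# Rounding: keep two decimal places.
df1["c3"] = df1["c3"].round(2)

print(df1)
```

If a column may contain strings that are not valid numbers, pd.to_numeric accepts errors="coerce" to turn those entries into NaN instead of raising an error.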

Finally, for this part of the blog, we may want to find the data types of the columns of a data frame. Here is how to do that.
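A short sketch using the dtypes attribute follows. The sample data frame is made up so that each column has a different type.

```python
import pandas as pd

df1 = pd.DataFrame({"c1": [1.5, 2.5], "c2": [1, 2], "c3": ["a", "b"]})

# dtypes returns a Series that maps each column name to its data type.
print(df1.dtypes)

# A single column exposes its type through its own dtype attribute.
print(df1["c1"].dtype)
```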

