Python Learning - Finding Answers

In this blog post, I am sharing some scenarios that arose while working on a project, along with the steps I took to get the work done. I am sure there are multiple ways to achieve the same outcome, but these are my solutions.


Q1: How do we extract the id value from an HTML "a" tag? The id value indicated the page number, and I wanted to find the id value corresponding to the last page.

Solution: I had extracted the HTML tag, and the output available was

<a class="last" href="" id="btn112">Last</a>

Now I wanted to extract the "btn112" value. The above string was stored in CompCount[3].

I used the following code to get the result.
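
A minimal sketch of one way to do this, assuming the tag is available as a plain string. It uses the standard library's html.parser; the class and function names are illustrative and not taken from the project.

```python
from html.parser import HTMLParser


class AnchorIdCollector(HTMLParser):
    """Collects the id attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag
        if tag == "a":
            attr_map = dict(attrs)
            if "id" in attr_map:
                self.ids.append(attr_map["id"])


def extract_anchor_id(html_text):
    """Return the id of the first <a> tag in html_text, or None if there is none."""
    collector = AnchorIdCollector()
    collector.feed(html_text)
    return collector.ids[0] if collector.ids else None


# Stand-in for the string stored in CompCount[3]
tag_text = '<a class="last" href="" id="btn112">Last</a>'
last_id = extract_anchor_id(tag_text)
print(last_id)
```

If the page was parsed with a library that already exposes tag attributes, reading the id attribute from that object directly is likely simpler than re-parsing the string.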


Q2: How do I remove part of an input string? In the above example, I have "btn112" and I want the number 112.

Solution: In this example, I know that "btn" is always the prefix, but the trailing number can have 1, 2, or 3 digits. So it is better to remove the first 3 characters.

I found the length of the string using the len() function and selected the characters from the 4th position (index 3, as indexing starts from 0) up to the length of the input string.

Then I can convert the result to an integer using int().
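
The steps above can be sketched as follows. The variable names are illustrative; the slice end is written out with len() to mirror the description, although page_id[3:] gives the same result.

```python
page_id = "btn112"

# Slice from index 3 (the 4th character) up to the length of the string
digits = page_id[3:len(page_id)]  # same as page_id[3:]

# Convert the remaining characters to an integer
page_number = int(digits)
print(page_number)
```

Because the slice only depends on the prefix length, the same code works whether the number has 1, 2, or 3 digits.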


Q3: How do we append or vertically combine data frames in Python?

Solution: In one of our projects, we wanted to create a new data frame in the first step and then keep creating further data frames and appending them to the base data frame. The pandas package offered the DataFrame.append() function for appending (vertically combining) data frames; since pandas 2.0 removed it, pd.concat() is the current equivalent.

import pandas as pd

if l1 == 0:
    # First iteration: create the base data frame
    df = pd.DataFrame({
        "Level_1": [Supplier_l1[l1]] * len(Category_l2),
        "Level_2": Category_l2[0:len(Category_l2)],
        "Level_2_link": Category_l2_link[0:len(Category_l2)],
    })
else:
    # Later iterations: build a data frame the same way and append it to the base
    df1 = pd.DataFrame({
        "Level_1": [Supplier_l1[l1]] * len(Category_l2),
        "Level_2": Category_l2[0:len(Category_l2)],
        "Level_2_link": Category_l2_link[0:len(Category_l2)],
    })
    # pd.concat() replaces DataFrame.append(), which was removed in pandas 2.0
    df = pd.concat([df, df1])

Q4: How can we find all the files from a directory or folder?

Solution: We had a set of xlsx files in a folder /path and wanted to collect their names in a list so that we could read those files. Here is the code we used.

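from os import listdir
from os.path import isfile, join

mypath = "/path"  # folder that holds the xlsx files
# Keep only regular files, skipping sub-folders
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]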

Q5: How can you read an Excel (xlsx) file in Python?

Solution: I had a set of Excel (xlsx) files in a folder and wanted to read all of them and append their content to one data frame. The Python code below reads the files one by one and appends each to the data frame.

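# Read each xlsx file and append its content to df (df must already exist)
for f in onlyfiles:
    f1 = "C:\\Python\\" + f
    data = pd.read_excel(f1, sheet_name=' Companies')
    # pd.concat() replaces DataFrame.append(), which was removed in pandas 2.0
    df = pd.concat([df, data])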

Q6: How can we check whether a string contains a given substring in Python?

Solution: We had a list of all the file names and wanted to select only the names that contain ".xlsx". So we check whether each input string contains the substring ".xlsx".

for f in onlyfiles:
    x = ".xlsx" in f

This gives True if the substring is present and False otherwise.

Q7: How do we subset a list based on a condition?
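
One common way to do this is a list comprehension with a condition. Here is a minimal sketch that continues the file-name example from Q6; the sample file names are made up for illustration.

```python
# Hypothetical sample standing in for the onlyfiles list built in Q4
onlyfiles = ["suppliers.xlsx", "notes.txt", "categories.xlsx", "archive.zip"]

# Keep only the elements for which the condition is True
xlsx_files = [f for f in onlyfiles if ".xlsx" in f]
print(xlsx_files)
```

If the match should apply only to the file extension, f.endswith(".xlsx") is a stricter condition than the substring test.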


Q8: How do you read a tab-separated text file in Python?

Solution: See the separate write-up on reading a text file in Python for details.
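
A minimal sketch of one way to read a tab-separated text file, using the standard library's csv module with delimiter="\t". The helper name read_tsv and the sample file contents are made up for illustration. With pandas, pd.read_csv(path, sep="\t") is an alternative that returns a data frame.

```python
import csv
import os
import tempfile


def read_tsv(path):
    """Return the rows of a tab-separated file as dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# Write a tiny sample file so the sketch runs on its own (contents are made up)
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write("name\tcount\nalpha\t3\nbeta\t5\n")
    sample_path = tmp.name

rows = read_tsv(sample_path)
os.remove(sample_path)  # clean up the temporary file
print(rows)
```

Note that the csv module returns every value as a string, so numeric columns need an explicit int() or float() conversion.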


Q9: How can we round off a floating-point number in Python?

Solution: We wanted to run a loop and needed to find the number of iterations required. For example, one category had 421 products overall and each page showed 20 products, so we wanted to find the number of pages to extract.

We used the ceiling function to round up to the next whole number.

import math
intr = math.ceil(prd_cnt/20)

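We could have used the round() function, but it rounded down to the lower value: after division the result was 20.5, and in Python 3 round() rounds a value ending in .5 to the nearest even integer, here 20.

val = prd_cnt/20
intr = round(val)  # rounds down when val is 20.5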
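
To make the difference concrete, here is a small sketch using the 421-product figure from the example above. It assumes Python 3, where / is true division; the variable names other than prd_cnt are illustrative.

```python
import math

prd_cnt = 421   # total products in the category
per_page = 20   # products shown on each page

pages_ceil = math.ceil(prd_cnt / per_page)  # 21.05 rounds up to 22
pages_round = round(prd_cnt / per_page)     # 21.05 rounds to 21, one page short
pages_int = -(-prd_cnt // per_page)         # ceiling division with integers only

half_case = round(20.5)  # Python 3 rounds halves to the nearest even integer

print(pages_ceil, pages_round, pages_int, half_case)
```

The integer-only form avoids floating-point division altogether, which can be useful when the counts are very large.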

