Python Interview Questions Set - 3 Released
Intermediate Level
What is Pandas?
Pandas is an open-source Python library that provides a rich set of data
structures for data-based operations. Its features fit nearly every kind
of data work, whether in academics or in solving complex business
problems. Pandas can read a large variety of file formats and is one of
the most important tools to have a grip on.
What are data frames?
A pandas dataframe is a mutable, two-dimensional data structure. It
supports heterogeneous data (each column can hold a different type)
arranged across two axes (rows and columns).
Reading files into pandas:
import pandas as pd
df=pd.read_csv("mydata.csv")
Here, df is a pandas data frame. read_csv() is used to read a comma
delimited file as a dataframe in pandas.
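A minimal sketch of read_csv() that runs without a file on disk: the CSV text below is made up for illustration and stands in for mydata.csv, since read_csv() accepts a file-like object as well as a path.

```python
import io
import pandas as pd

# Made-up CSV contents standing in for a file such as "mydata.csv".
csv_text = "cars,bikes\nford,bmw\nhyundai,tvs\n"

# read_csv() accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.shape)  # (rows, columns)
```

The first line of the text is treated as the header row, so the dataframe has two columns and two rows.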
What is a Pandas Series?
A Series is a one-dimensional pandas data structure that can hold data of
almost any type. It resembles an Excel column. It supports multiple
operations and is used for single-dimensional data operations.
Creating a series from data:
Code:
import pandas as pd
data=["1",2,"three",4.0]
series=pd.Series(data)
print(series)
print(type(series))
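To illustrate the kind of operations a Series supports, here is a small sketch on a numeric series. The values and index labels are hypothetical.

```python
import pandas as pd

# Hypothetical numeric series with a custom index.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

doubled = s * 2   # element-wise arithmetic
total = s.sum()   # aggregation over all values
big = s[s > 15]   # boolean filtering keeps matching entries

print(doubled)
print(total)
print(big)
```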
What do you understand about pandas groupby?
A pandas groupby is a feature that is used to split an object into
groups. Like GROUP BY in SQL databases such as MySQL and Oracle, it
groups data by classes or entities, and the groups can then be used for
aggregation. A dataframe can be grouped by one or more columns.
Code:
df = pd.DataFrame({'Vehicle':['Etios','Lamborghini','Apache200','Pulsar200'],
                   'Type':["car","car","motorcycle","motorcycle"]})
df
Output:
To perform the groupby, run the following code:
df.groupby('Type').count()
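As a rough sketch of aggregation and of grouping by more than one column, the example below rebuilds the vehicle dataframe with an extra Wheels column. That column is an assumption added for illustration and is not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({
    "Vehicle": ["Etios", "Lamborghini", "Apache200", "Pulsar200"],
    "Type": ["car", "car", "motorcycle", "motorcycle"],
    "Wheels": [4, 4, 2, 2],  # hypothetical column, added for this example
})

# Count vehicles in each type.
counts = df.groupby("Type")["Vehicle"].count()

# Aggregate a numeric column per group.
wheels = df.groupby("Type")["Wheels"].sum()

# Group by two columns at once; size() counts the rows in each group.
by_two = df.groupby(["Type", "Wheels"]).size()

print(counts)
print(wheels)
print(by_two)
```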
How to create a dataframe from lists?
To create a dataframe from lists:
1) create an empty dataframe
2) add each list as an individual column of the dataframe
Code:
df=pd.DataFrame()
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
df["cars"]=cars
df["bikes"]=bikes
df
Output:
How to create a data frame from a dictionary?
A dictionary can be directly passed as an argument to the DataFrame()
function to create the data frame.
Code:
import pandas as pd
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
d={"cars":cars,"bikes":bikes}
df=pd.DataFrame(d)
df
Output:
How to combine dataframes in pandas?
Dataframes can be stacked either vertically or horizontally with the
concat() and join() functions in pandas.
concat() works best when the dataframes have the same columns. By
default (axis=0) it stacks them vertically into a single dataframe;
with axis=1 it places them side by side.
append() also stacked dataframes vertically, one below the other. It
was deprecated in pandas 1.4 and removed in pandas 2.0, so concat()
should be used instead.
join() is used when we need to bring together columns from different
dataframes whose rows share an index (or a key column). The stacking is
horizontal in this case.
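A short sketch of vertical and horizontal stacking, using small made-up dataframes. It uses concat() and join() only, since append() is not available in recent pandas versions.

```python
import pandas as pd

# Made-up dataframes for illustration.
a = pd.DataFrame({"cars": ["ford", "hyundai"]})
b = pd.DataFrame({"cars": ["ferrari"]})
c = pd.DataFrame({"bikes": ["bmw", "tvs"]})

# Vertical stacking: same column, rows of b placed below rows of a.
rows = pd.concat([a, b], ignore_index=True)

# Horizontal stacking: columns placed side by side, aligned on the index.
side = pd.concat([a, c], axis=1)

# join() also combines columns horizontally, aligning on the index.
joined = a.join(c)

print(rows)
print(side)
print(joined)
```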
What kind of joins does pandas offer?
Pandas offers left, right, inner, and outer joins, selected with the how
parameter of merge() and join(). Pandas 1.2 and later also support a
cross join.
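To make the difference between the join types concrete, the sketch below merges two made-up tables on a shared id column and counts the rows each join type returns.

```python
import pandas as pd

# Made-up tables: ids 2 and 3 appear in both, 1 only on the left, 4 only on the right.
left = pd.DataFrame({"id": [1, 2, 3], "car": ["ford", "hyundai", "ferrari"]})
right = pd.DataFrame({"id": [2, 3, 4], "bike": ["tvs", "bmw", "bajaj"]})

# Number of rows produced by each join type.
sizes = {
    how: len(pd.merge(left, right, on="id", how=how))
    for how in ["inner", "left", "right", "outer"]
}

print(sizes)
```

An inner join keeps only the ids present in both tables, left and right joins keep every row of one side, and an outer join keeps every id from either side.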
How to merge dataframes in pandas?
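Dataframes are merged with merge(), which performs a database-style join
on one or more key columns (or on the index). The on, left_on, and
right_on parameters name the keys, and how selects the join type (inner
by default). Dataframes that have the same fields and only need stacking
are combined with concat() along axis 0; to place them side by side they
are combined along axis 1.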
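A minimal sketch of merge() on key columns whose names differ between the two dataframes. The owners and vehicles tables and their column names are hypothetical.

```python
import pandas as pd

# Hypothetical tables: the key is called owner_id on one side and owner on the other.
owners = pd.DataFrame({"owner_id": [1, 2], "name": ["Asha", "Ravi"]})
vehicles = pd.DataFrame({
    "owner": [1, 1, 3],
    "vehicle": ["Etios", "Pulsar200", "Apache200"],
})

# Left join: keep every owner, attach matching vehicles where they exist.
merged = owners.merge(vehicles, left_on="owner_id", right_on="owner", how="left")

print(merged)
```

Owner 1 matches two vehicles and so appears twice, while owner 2 has no match and gets NaN in the columns that come from vehicles.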
Given a dataframe, drop all rows having NaN.
The dropna() function can be used to do that. With inplace=True the
dataframe itself is modified instead of a new one being returned.
df.dropna(inplace=True)
df
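A small sketch of dropna() on made-up data with missing values. It uses the default behaviour, which returns a new dataframe and leaves the original untouched, plus the subset parameter to check only selected columns.

```python
import pandas as pd

# Made-up data with one missing value in each column.
df = pd.DataFrame({
    "cars": ["ford", None, "ferrari"],
    "price": [1.0, 2.0, float("nan")],
})

# Drop rows with a missing value in any column (df itself is unchanged).
cleaned = df.dropna()

# Drop rows only when the cars column is missing.
subset = df.dropna(subset=["cars"])

print(cleaned)
print(subset)
```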
How to access the first five entries of a dataframe?
By using the head(5) function we can get the top five entries of a
dataframe. By default df.head() returns the top 5 rows. To get the top n
rows df.head(n) will be used.
How to access the last five entries of a dataframe?
By using the tail(5) function we can get the last five entries of a
dataframe. By default df.tail() returns the last 5 rows. To get the last n
rows df.tail(n) will be used.
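The head() and tail() calls can be sketched on a hypothetical ten-row dataframe:

```python
import pandas as pd

# Hypothetical dataframe with the numbers 1 to 10 in a single column.
df = pd.DataFrame({"n": range(1, 11)})

first = df.head()    # first 5 rows by default
last = df.tail()     # last 5 rows by default
first3 = df.head(3)  # first 3 rows
last3 = df.tail(3)   # last 3 rows

print(first)
print(last3)
```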
How to fetch a data entry from a pandas dataframe using a given
value in index?
To fetch a row from a dataframe given its index value x, we can use loc:
df.loc[10], where 10 is the value of the index.
Code:
import pandas as pd
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
d={"cars":cars,"bikes":bikes}
df=pd.DataFrame(d)
a=[10,20,30,40,50]
df.index=a
df.loc[10]
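As a related sketch, loc looks rows up by index label, while iloc (not covered above) looks them up by integer position. The two-row dataframe below is a shortened, made-up version of the one in the example.

```python
import pandas as pd

df = pd.DataFrame(
    {"cars": ["lamborghini", "ferrari"], "bikes": ["bajaj", "tvs"]},
    index=[10, 20],
)

by_label = df.loc[10]       # row whose index label is 10
by_position = df.iloc[0]    # first row by position, the same row here
cell = df.loc[20, "bikes"]  # single value: row label 20, column "bikes"

print(by_label)
print(cell)
```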