Intermediate Level

How to merge two DataFrames on a common column?


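import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3], "B": [11, 12, 13]})
df2 = pd.DataFrame({"A": [1, 2, 4], "B": [21, 22, 23]})
# An inner join keeps only the keys found in both frames (A = 1 and 2).
# The clashing column "B" gets the default suffixes _x and _y.
new_df = pd.merge(df1, df2, how='inner', on='A')
new_df
   A  B_x  B_y
0  1   11   21
1  2   12   22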
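
As a follow-up to the inner join above, here is a minimal sketch of an outer join on the same two frames. The names `outer_df`, `_left` and `_right` are illustrative choices, not part of the original answer; `suffixes` and `indicator` are standard `pd.merge` parameters.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3], "B": [11, 12, 13]})
df2 = pd.DataFrame({"A": [1, 2, 4], "B": [21, 22, 23]})

# An outer join keeps every key from both frames; values missing on one
# side become NaN. suffixes renames the clashing "B" columns, and
# indicator=True adds a "_merge" column saying where each row came from.
outer_df = pd.merge(
    df1, df2, how="outer", on="A",
    suffixes=("_left", "_right"), indicator=True,
)
print(outer_df)
```

Rows marked `left_only` or `right_only` in `_merge` are the keys that the inner join would have dropped.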

How to apply a function to each element of a pandas Series?


ser = pd.Series([1,2,3,4,5])

ser
0    1
1    2
2    3
3    4
4    5
dtype: int64

new_ser = ser.apply(lambda x:x*2)

new_ser
0     2
1     4
2     6
3     8
4    10
dtype: int64
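
For simple arithmetic like doubling, `apply` is not the only option. The sketch below compares it with vectorized arithmetic and with `Series.map`; the variable names and the dictionary passed to `map` are made up for illustration.

```python
import pandas as pd

ser = pd.Series([1, 2, 3, 4, 5])

# apply calls the Python function once per element.
doubled_apply = ser.apply(lambda x: x * 2)

# Vectorized arithmetic gives the same result without a Python-level loop.
doubled_vector = ser * 2

# map also accepts a dict; values with no matching key become NaN.
labels = ser.map({1: "one", 2: "two"})

print(doubled_apply.equals(doubled_vector))
print(labels)
```

The vectorized form is usually the better choice when an equivalent operator or built-in method exists; `apply` is most useful for logic that cannot be expressed that way.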

How to filter rows in a DataFrame based on a condition?


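import numpy as np

# 100 rows of random data, so the values shown below differ from run to run.
df = pd.DataFrame({
    "A": np.random.randint(10, 100, 100),
    "B": np.random.choice(['IND', 'JPN', 'UK', 'USA'], 100),
    "C": np.random.randint(99, 999, 100),
})

df.head()
    A    B    C
0  61  IND  429
1  24   UK  738
2  81  IND  604
3  70  USA  446
4  30  IND  571

# The comparison builds a boolean mask; indexing with it keeps the rows where it is True.
new_df = df[df['B'] == 'IND']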

new_df
     A    B    C
0   61  IND  429
2   81  IND  604
4   30  IND  571
11  12  IND  731
15  97  IND  457
17  47  IND  554
21  30  IND  775
24  67  IND  332
28  68  IND  472
34  71  IND  211
41  60  IND  905
43  30  IND  485
47  13  IND  179
48  98  IND  797
55  93  IND  834
61  44  IND  151
66  13  IND  514
68  15  IND  934
72  72  IND  221
75  53  IND  392
78  71  IND  982
80  57  IND  296
89  91  IND  562
91  33  IND  869
92  35  IND  758
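
Filters often need more than one condition. The sketch below uses a small fixed frame (the five rows shown by `df.head()` above) so the result is reproducible; the threshold `A > 50` and the variable names are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [61, 24, 81, 70, 30],
    "B": ["IND", "UK", "IND", "USA", "IND"],
    "C": [429, 738, 604, 446, 571],
})

# Combine conditions with & (and) or | (or); wrap each one in parentheses.
ind_big = df[(df["B"] == "IND") & (df["A"] > 50)]

# isin keeps rows whose value matches any entry in the list.
ind_or_uk = df[df["B"].isin(["IND", "UK"])]

# query expresses the same two-condition filter as a string.
ind_big_q = df.query("B == 'IND' and A > 50")

print(ind_big)
print(ind_or_uk)
```

Filtered frames keep their original index labels, which is why the output above skips numbers. Call `reset_index(drop=True)` if a fresh 0-based index is needed.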

How to calculate the mean, median, and standard deviation of a pandas Series?


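# 100 random integers in [11, 111); pd.Series accepts the NumPy array directly.
ser = pd.Series(np.random.randint(11, 111, 100))

ser

print(ser.mean())    # arithmetic mean
print(ser.median())  # middle value of the sorted data
print(ser.std())     # sample standard deviation (ddof=1 by default)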
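
Because the Series above is random, its statistics change on every run. The sketch below uses a small fixed Series (values chosen purely for illustration) to show the three statistics computed in one call, and the difference between the sample and population standard deviation.

```python
import numpy as np
import pandas as pd

ser = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

# agg computes several statistics at once and returns them as a Series.
stats = ser.agg(["mean", "median", "std"])

# pandas defaults to the sample standard deviation (ddof=1);
# pass ddof=0 for the population value, which is NumPy's default.
sample_std = ser.std()
population_std = ser.std(ddof=0)

print(stats)
print(sample_std, population_std, np.std(ser.to_numpy()))
```

`ser.describe()` is another option when count, min, max and quartiles are wanted alongside the mean and standard deviation.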