让我们来看看如何使用Excel将Excel文件读取到Pandas数据框对象大熊猫.
代码1:使用读取Excel文件read_excel()熊猫的方法。
# import pandas lib as pd
import pandas as pd
# read by default 1st sheet of an excel file
dataframe1 = pd.read_excel( 'SampleWork.xlsx' )
print (dataframe1)
输出:
Name Age Stream Percentage
0 Ankit 18 Math 95
1 Rahul 19 Science 90
2 Shaurya 20 Commerce 85
3 Aishwarya 18 Math 80
4 Priyanka 19 Science 75
代码2:使用read_excel()方法的'sheet_name'读取特定表。
# import pandas lib as pd
import pandas as pd
# read 2nd sheet of an excel file
dataframe2 = pd.read_excel( 'SampleWork.xlsx' , sheet_name = 1 )
print (dataframe2)
输出:
Name Age Stream Percentage
0 Priya 18 Math 95
1 shivangi 19 Science 90
2 Jeet 20 Commerce 85
3 Ananya 18 Math 80
4 Swapnil 19 Science 75
代码3:使用read_excel()方法的'usecols'参数读取特定列。
# import pandas lib as pd
import pandas as pd
require_cols = [ 0 , 3 ]
# only read specific columns from an excel file
required_df = pd.read_excel( 'SampleWork.xlsx' , usecols = require_cols)
print (required_df)
输出:
Name Percentage
0 Ankit 95
1 Rahul 90
2 Shaurya 85
3 Aishwarya 80
4 Priyanka 75
代码4:使用read_excel()方法的'na_values'参数处理丢失的数据。
# import pandas lib as pd
import pandas as pd
# Handling missing values of 3rd sheet of an excel file.
dataframe = pd.read_excel( 'SampleWork.xlsx' , na_values = "Missing" , sheet_name = 2 )
print (dataframe)
输出:
Name Age Stream Percentage
0 Priya 18 Math 95
1 shivangi 19 Science 90
2 Jeet 20 NaN 85
3 Ananya 18 Math 80
4 Swapnil 19 Science 75
代码5:使用read_excel()方法的'skiprows'参数读取Excel文件时,跳过开始行。
# import pandas lib as pd
import pandas as pd
# read 2nd sheet of an excel file after
# skipping starting two rows
df = pd.read_excel( 'SampleWork.xlsx' , sheet_name = 1 , skiprows = 2 )
print (df)
输出:
shivangi 19 Science 90
0 Jeet 20 Commerce 85
1 Ananya 18 Math 80
2 Swapnil 19 Science 75
代码6:使用read_excel()方法的'header'参数,将标题设置为任意一行,并从该行开始读取。
# import pandas lib as pd
import pandas as pd
# setting the 3rd row as header.
df = pd.read_excel( 'SampleWork.xlsx' , sheet_name = 1 , header = 2 )
print (df)
输出:
shivangi 19 Science 90
0 Jeet 20 Commerce 85
1 Ananya 18 Math 80
2 Swapnil 19 Science 75
代码7:使用read_excel()方法的'sheet_name'参数读取多个Excel表。
# import pandas lib as pd
import pandas as pd
# read both 1st and 2nd sheet.
df = pd.read_excel( 'SampleWork.xlsx' , na_values = "Mssing" , sheet_name = [ 0 , 1 ])
print (df)
输出:
OrderedDict([(0, Name Age Stream Percentage
0 Ankit 18 Math 95
1 Rahul 19 Science 90
2 Shaurya 20 Commerce 85
3 Aishwarya 18 Math 80
4 Priyanka 19 Science 75), (1, Name Age Stream Percentage
0 Priya 18 Math 95
1 shivangi 19 Science 90
2 Jeet 20 Commerce 85
3 Ananya 18 Math 80
4 Swapnil 19 Science 75)])
代码8:使用read_excel()方法的'sheet_name'参数一起读取excel文件的所有工作表。
# import pandas lib as pd
import pandas as pd
# read all sheets together.
all_sheets_df = pd.read_excel( 'SampleWork.xlsx' , na_values = "Missing" , sheet_name = None )
print (all_sheets_df)
输出:
OrderedDict([('Sheet1', Name Age Stream Percentage
0 Ankit 18 Math 95
1 Rahul 19 Science 90
2 Shaurya 20 Commerce 85
3 Aishwarya 18 Math 80
4 Priyanka 19 Science 75), ('Sheet2', Name Age Stream Percentage
0 Priya 18 Math 95
1 shivangi 19 Science 90
2 Jeet 20 Commerce 85
3 Ananya 18 Math 80
4 Swapnil 19 Science 75), ('Sheet3', Name Age Stream Percentage
0 Priya 18 Math 95
1 shivangi 19 Science 90
2 Jeet 20 NaN 85
3 Ananya 18 Math 80
4 Swapnil 19 Science 75)])
首先, 你的面试准备可通过以下方式增强你的数据结构概念:Python DS课程。