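Python for Data Analysis (2nd), Task 5: Advanced Pandas
Ten classic exercises for working through data analysis with Pandas
Getting to know your data: exploring the Chipotle fast-food data
Filtering and sorting: exploring the Euro 2012 data
Grouping: exploring the alcohol consumption data
Apply functions: exploring US crime data, 1960-2014
Merging: exploring the fictitious names data
Statistics: exploring the wind speed data
Visualization: exploring the Titanic disaster data
Creating DataFrames: exploring the Pokemon data
Time series: exploring Apple stock price data
Deleting data: exploring the Iris flower data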
Getting to know your data: exploring the Chipotle fast-food data
Exploring the Chipotle fast-food data
List the dataset files in the data directory
ls ../input/pandas_exercise/exercise_data/
Apple_stock.csv  drinks.csv  second_cars_info.csv  wechart.csv
cars.csv  Euro2012_stats.csv  train.csv  wind.data
chipotle.tsv  iris.csv  US_Crime_Rates_1960_2014.csv
Step 1 Import the necessary libraries
import pandas as pd
Step 2 Set the path to the dataset
path1 = "../input/pandas_exercise/exercise_data/chipotle.tsv"
Step 3 Load the dataset into a DataFrame named chipo
chipo = pd.read_csv(path1, sep='\t')
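To see what sep='\t' does without needing the file on disk, here is a minimal sketch that parses a small in-memory stand-in for chipotle.tsv. The three rows are copied from the preview in Step 4; the trailing space after each price is an assumption, inferred from the x[1:-1] slice used in Step 12, and the names sample_tsv and sample are made up for this illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for chipotle.tsv: tab-separated, one header row.
# The empty choice_description field in the first row is read as NaN.
sample_tsv = (
    "order_id\tquantity\titem_name\tchoice_description\titem_price\n"
    "1\t1\tChips and Fresh Tomato Salsa\t\t$2.39 \n"
    "1\t1\tIzze\t[Clementine]\t$3.39 \n"
    "1\t1\tNantucket Nectar\t[Apple]\t$3.39 \n"
)

# sep="\t" tells read_csv to split fields on tabs instead of commas.
sample = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

print(sample.shape)  # (3, 5)
print(sample["choice_description"].isna().sum())  # 1
```

Without sep="\t" each line would be split on commas instead of tabs, so the columns would not come out as intended.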
Step 4 View the first 10 rows
chipo.head(10)
   order_id  quantity  item_name                              choice_description                                  item_price
0  1         1         Chips and Fresh Tomato Salsa           NaN                                                 $2.39
1  1         1         Izze                                   [Clementine]                                        $3.39
2  1         1         Nantucket Nectar                       [Apple]                                             $3.39
3  1         1         Chips and Tomatillo-Green Chili Salsa  NaN                                                 $2.39
4  2         2         Chicken Bowl                           [Tomatillo-Red Chili Salsa (Hot), [Black Beans...   $16.98
5  3         1         Chicken Bowl                           [Fresh Tomato Salsa (Mild), [Rice, Cheese, Sou...   $10.98
6  3         1         Side of Chips                          NaN                                                 $1.69
7  4         1         Steak Burrito                          [Tomatillo Red Chili Salsa, [Fajita Vegetables...   $11.75
8  4         1         Steak Soft Tacos                       [Tomatillo Green Chili Salsa, [Pinto Beans, Ch...   $9.25
9  5         1         Steak Burrito                          [Fresh Tomato Salsa, [Rice, Black Beans, Pinto...   $9.25
Step 5 Check the number of columns (and rows) in the dataset
chipo.shape[1]
5
chipo.shape[0]
4622
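shape is a (rows, columns) tuple, which is why index 1 gives the column count and index 0 the row count. A minimal sketch on a made-up three-row frame (the name df and its values are illustrative only):

```python
import pandas as pd

# Hypothetical toy frame, only to show how shape is laid out.
df = pd.DataFrame({"order_id": [1, 1, 2], "quantity": [1, 1, 2]})

n_rows, n_cols = df.shape  # shape is (number of rows, number of columns)

print(n_rows, n_cols)             # 3 2
print(len(df), len(df.columns))   # 3 2, equivalent ways to get the same counts
```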
Step 6 Print all column names
chipo.columns
Index(['order_id', 'quantity', 'item_name', 'choice_description',
       'item_price'],
      dtype='object')
chipo.info
Note: written without parentheses, chipo.info does not run the method. The notebook only echoes the bound method, whose repr embeds the whole DataFrame (abridged below). Call chipo.info() to get the summary of column dtypes and non-null counts.
<bound method DataFrame.info of       order_id  quantity  item_name  \
0     1         1         Chips and Fresh Tomato Salsa
1     1         1         Izze
...   ...       ...       ...
4620  1834      1         Chicken Salad Bowl
4621  1834      1         Chicken Salad Bowl
      choice_description                                  item_price
0     NaN                                                 $2.39
1     [Clementine]                                        $3.39
...   ...                                                 ...
4620  [Fresh Tomato Salsa, [Fajita Vegetables, Lettu...   $8.75
4621  [Fresh Tomato Salsa, [Fajita Vegetables, Pinto...   $8.75
[4622 rows x 5 columns]>
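The output above starts with "bound method" because info was referenced but never called. As a small sketch of the difference, the following uses a made-up frame and captures the summary in a buffer through the buf parameter of DataFrame.info; the names df, method, buffer and summary are illustrative.

```python
import io

import pandas as pd

# Hypothetical toy frame with one missing value.
df = pd.DataFrame({"order_id": [1, 1, 2], "item_name": ["Chips", "Izze", None]})

method = df.info  # no parentheses: just a reference to the method, nothing runs

buffer = io.StringIO()
result = df.info(buf=buffer)  # the call writes the summary to buf and returns None
summary = buffer.getvalue()

print(callable(method))  # True
print(result)            # None
print(summary)           # index range, per-column non-null counts and dtypes
```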
chipo.describe()
          order_id     quantity
count  4622.000000  4622.000000
mean    927.254868     1.075725
std     528.890796     0.410186
min       1.000000     1.000000
25%     477.250000     1.000000
50%     926.000000     1.000000
75%    1393.000000     1.000000
max    1834.000000    15.000000
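Only order_id and quantity appear in the summary because describe() covers numeric columns by default, and item_price is still stored as text at this point. A minimal sketch on made-up data (the price strings follow the format shown in Step 4, with an assumed trailing space):

```python
import pandas as pd

# Hypothetical toy frame: one numeric column, one text column.
df = pd.DataFrame({
    "quantity": [1, 1, 2],
    "item_price": ["$2.39 ", "$3.39 ", "$16.98 "],
})

numeric_summary = df.describe()            # numeric columns only
full_summary = df.describe(include="all")  # also summarises text columns

print(list(numeric_summary.columns))  # ['quantity']
print(list(full_summary.columns))     # ['quantity', 'item_price']
```

Once item_price is converted to float, as in Step 12, it would show up in the default summary as well.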
Step 7 View the dataset's index
chipo.index
RangeIndex(start=0, stop=4622, step=1)
Step 8 Find the most-ordered item
# Total quantity per item; the string 'sum' selects pandas' own sum aggregation
c = chipo[['item_name', 'quantity']].groupby(['item_name'], as_index=False).agg({'quantity': 'sum'})
# Largest totals first
c.sort_values(['quantity'], ascending=False, inplace=True)
c.head()
    item_name            quantity
17  Chicken Bowl              761
18  Chicken Burrito           591
25  Chips and Guacamole       506
39  Steak Burrito             386
10  Canned Soft Drink         351
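The group, sum, sort pattern from Step 8 can be checked by hand on a few rows. This is a sketch on made-up order lines (the frame orders and its quantities are illustrative, not taken from the dataset):

```python
import pandas as pd

# Hypothetical order lines: Chicken Bowl appears twice, Izze twice, Chips once.
orders = pd.DataFrame({
    "item_name": ["Chicken Bowl", "Izze", "Chicken Bowl", "Izze", "Chips"],
    "quantity": [2, 1, 1, 1, 1],
})

totals = (
    orders.groupby("item_name", as_index=False)["quantity"]
    .sum()                                   # total quantity per item
    .sort_values("quantity", ascending=False)  # most-ordered item first
)

print(totals.iloc[0]["item_name"])   # Chicken Bowl
print(totals["quantity"].tolist())   # [3, 2, 1]
```

With as_index=False the item names stay in an ordinary column, which is why the output in Step 8 keeps the original group positions (17, 18, 25, ...) as its index after sorting.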
Step 9 Count how many distinct items were ordered (item_name column)
chipo['item_name'].nunique()
50
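nunique() returns the number of distinct values and skips missing ones by default. A short sketch on a made-up Series (items is an illustrative name):

```python
import pandas as pd

# Hypothetical item column with one repeat and one missing value.
items = pd.Series(["Chicken Bowl", "Izze", "Chicken Bowl", None])

print(items.nunique())              # 2, missing values are ignored
print(items.nunique(dropna=False))  # 3, the missing value counts as one more
print(len(items.unique()))          # 3, unique() keeps the missing value
```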
Step 10 Find the most frequently ordered values in the choice_description column
chipo['choice_description'].value_counts().head()
[Diet Coke]                                                                          134
[Coke]                                                                               123
[Sprite]                                                                              77
[Fresh Tomato Salsa, [Rice, Black Beans, Cheese, Sour Cream, Lettuce]]                42
[Fresh Tomato Salsa, [Rice, Black Beans, Cheese, Sour Cream, Guacamole, Lettuce]]     40
Name: choice_description, dtype: int64
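value_counts() counts rows per distinct value and sorts the result from most to least frequent, leaving out missing values unless asked otherwise. A minimal sketch on made-up data (choices is an illustrative name; the counts have nothing to do with the real dataset):

```python
import pandas as pd

# Hypothetical choice_description values, including one missing entry.
choices = pd.Series(["[Diet Coke]", "[Coke]", "[Diet Coke]", None, "[Sprite]", "[Diet Coke]"])

counts = choices.value_counts()                  # NaN rows are excluded by default
counts_with_nan = choices.value_counts(dropna=False)

print(counts.index[0], counts.iloc[0])  # [Diet Coke] 3
print(counts.sum())                     # 5
print(counts_with_nan.sum())            # 6
```

Note that this counts rows, not the quantity column, so an order line with quantity 2 still counts once.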
Step 11 Count how many items were ordered in total
total_items_orders = chipo['quantity'].sum()
total_items_orders
4972
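The total of 4972 is larger than the 4622 rows because some order lines have a quantity above 1. The sketch below separates three counts that are easy to confuse, on a made-up frame (orders and the variable names are illustrative):

```python
import pandas as pd

# Hypothetical order lines: two lines in order 1, one line with quantity 2 in order 2.
orders = pd.DataFrame({"order_id": [1, 1, 2], "quantity": [1, 1, 2]})

total_items = orders["quantity"].sum()    # items ordered, counting quantities
n_lines = len(orders)                     # rows (order lines) in the frame
n_orders = orders["order_id"].nunique()   # distinct orders

print(total_items, n_lines, n_orders)  # 4 3 2
```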
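Step 12 Convert item_price to float
# Drop the leading '$' and the last character of each price string, then convert to float
dollarizer = lambda x: float(x[1:-1])
chipo['item_price'] = chipo['item_price'].apply(dollarizer)
chipo['item_price'].head()
0     2.39
1     3.39
2     3.39
3     2.39
4    16.98
Name: item_price, dtype: float64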
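The x[1:-1] slice relies on every price having exactly one character before and one after the number. As an alternative sketch, not the author's method, the same conversion can be written with vectorized string methods that do not depend on character positions. The sample prices below assume a trailing space after each value, which is what the slice implies.

```python
import pandas as pd

# Hypothetical price strings in the assumed "$<number><space>" format.
prices = pd.Series(["$2.39 ", "$3.39 ", "$16.98 "])

# Approach from Step 12: slice off the first and last character.
dollarizer = lambda x: float(x[1:-1])
sliced = prices.apply(dollarizer)

# Alternative: remove the literal '$', strip surrounding whitespace, then cast.
as_float = prices.str.replace("$", "", regex=False).str.strip().astype(float)

print(as_float.tolist())                     # [2.39, 3.39, 16.98]
print(as_float.tolist() == sliced.tolist())  # True
```

The string-method version keeps working if the trailing space is absent, whereas the slice would then cut off the last digit.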