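Contents
1. Characteristics of loc and iloc
2. The official documentation
3. Examples: one of them is bound to fit your case
Note (loc): an "integer" here is interpreted as a label of the index, not as a position along the index
Note (iloc): an "integer" here is interpreted as a position along the index; slices include the start and exclude the stop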
Here, loc stands for "location", and the i in iloc stands for "integer".
In plain terms:
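df.loc["Adam", "Age"] # returns the value at the row labelled "Adam" and the column labelled "Age"
df.loc["Adam"] # returns the row labelled "Adam" as a Series; its index is df's column labels and its values are that row's values
df.iloc[2, 3] # returns the value at row position 2 and column position 3 (positions count from 0, so this is the 3rd row and 4th column)
df.iloc[1:5, 3:6] # returns the rows at positions 1 through 4 and the columns at positions 3 through 5, as a DataFrame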
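The difference between a label and a position is easiest to see when the index labels are themselves integers. The following is a minimal sketch; the DataFrame and its values are made up for illustration and do not come from the original post.

```python
import pandas as pd

# Hypothetical data: integer index labels that are deliberately out of order
df = pd.DataFrame({"Age": [31, 25, 47]}, index=[2, 0, 1])

by_label = df.loc[0, "Age"]    # the row whose index label is 0 (the second row)
by_position = df.iloc[0, 0]    # the first row and first column, whatever their labels

print(by_label)     # 25
print(by_position)  # 31
```

The same integer 0 selects different rows, because loc looks it up among the labels and iloc counts from the top.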
DataFrame.loc
Access a group of rows and columns by label(s) or a boolean array.
.loc[] is primarily label based, but may also be used with a boolean array. Allowed inputs are:
- A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index, and never as an integer position along the index).
- A list or array of labels, e.g. ['a', 'b', 'c'].
- A slice object with labels, e.g. 'a':'f'. Warning: Note that contrary to usual python slices, both the start and the stop are included.
- A boolean array of the same length as the axis being sliced, e.g. [True, False, True].
- A callable function with one argument (the calling Series or DataFrame) and that returns valid output for indexing (one of the above).
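Two of the inputs listed above tend to surprise people: label slices include the stop label, and a callable receives the calling object. A small sketch, using a made-up DataFrame that is not part of the official text:

```python
import pandas as pd

# Hypothetical data with string labels
df = pd.DataFrame({"x": [10, 20, 30, 40]}, index=["a", "b", "c", "d"])

# Label slice: both 'a' and 'c' are included
label_slice = df.loc["a":"c"]

# Callable: called with df itself, returns a boolean Series used as the mask
picked = df.loc[lambda d: d["x"] > 25]

print(list(label_slice.index))  # ['a', 'b', 'c']
print(list(picked.index))       # ['c', 'd']
```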
DataFrame.iloc
Purely integer-location based indexing for selection by position.
.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array. Allowed inputs are:
- An integer, e.g. 5.
- A list or array of integers, e.g. [4, 3, 0].
- A slice object with ints, e.g. 1:7.
- A boolean array.
- A callable function with one argument (the calling Series or DataFrame) and that returns valid output for indexing (one of the above). This is useful in method chains, when you don’t have a reference to the calling object, but would like to base your selection on some value.
.iloc will raise IndexError if a requested indexer is out-of-bounds, except slice indexers which allow out-of-bounds indexing (this conforms with python/numpy slice semantics).
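The out-of-bounds rule in the last sentence can be checked with a short sketch. The three-row DataFrame below is an assumption made for illustration:

```python
import pandas as pd

# Hypothetical three-row frame (positions 0, 1, 2)
df = pd.DataFrame({"x": [10, 20, 30]})

# A slice that runs past the end is truncated rather than rejected
tail = df.iloc[1:10]

# A single position past the end raises IndexError
try:
    df.iloc[10]
    raised = False
except IndexError:
    raised = True

print(len(tail))  # 2
print(raised)     # True
```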
Selecting values:
# Initialize df:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
... index=['cobra', 'viper', 'sidewinder'],
... columns=['max_speed', 'shield'])
>>> df
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8
# Select one row of df: the row is returned as a Series
>>> df.loc['viper']
max_speed    4
shield       5
Name: viper, dtype: int64
# Select several rows of df: the result is returned as a DataFrame
>>> df.loc[['viper', 'sidewinder']] # note the double brackets: the labels are passed as a list
            max_speed  shield
viper               4       5
sidewinder          7       8
# Select a single value of df:
>>> df.loc['cobra', 'shield']
2
# A list of booleans also works: rows marked True are kept, rows marked False are dropped
>>> df.loc[[False, False, True]]
            max_speed  shield
sidewinder          7       8
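# A condition evaluates to a boolean Series, which can also be used for selection
# Keep the rows whose 'shield' value is greater than 6, with all of their columns
>>> df.loc[df['shield'] > 6]
            max_speed  shield
sidewinder          7       8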
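# Keep the rows whose 'shield' value is greater than 6, but return only the 'max_speed' column (comparable to selecting the weight of everyone taller than 1.8 m)
>>> df.loc[df['shield'] > 6, ['max_speed']]
            max_speed
sidewinder          7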
# A lambda that returns a boolean Series can also be used for selection
>>> df.loc[lambda df: df['shield'] == 8]
            max_speed  shield
sidewinder          7       8
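The examples above all use loc. For comparison, here is a sketch of the positional equivalents with iloc on the same df; the variable names are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

row = df.iloc[1]                         # same row as df.loc['viper'], as a Series
rows = df.iloc[[1, 2]]                   # same rows as df.loc[['viper', 'sidewinder']]
value = df.iloc[0, 1]                    # same value as df.loc['cobra', 'shield']
masked = df.iloc[[False, False, True]]   # a boolean list works with iloc as well

print(value)               # 2
print(list(masked.index))  # ['sidewinder']
```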
Assigning values:
Assignment works the same way as selection:
all_data.loc[all_data["GarageType"].isnull(), ["GarageType"]] = "No Garage"
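Since all_data is not defined in this post, the following sketch builds a small stand-in so the assignment can be run end to end. The column values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for all_data, with two missing GarageType entries
all_data = pd.DataFrame({"GarageType": ["Attchd", np.nan, "Detchd", np.nan]})

# Boolean mask of the rows where GarageType is missing
mask = all_data["GarageType"].isnull()

# Assign through loc: only the masked rows of that column are overwritten
all_data.loc[mask, ["GarageType"]] = "No Garage"

print(all_data["GarageType"].tolist())
# ['Attchd', 'No Garage', 'Detchd', 'No Garage']
```

Assigning through a single loc call, rather than chaining two indexing steps, writes to the original DataFrame instead of a possible temporary copy.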