DataFrame:
- Creating a two-dimensional DataFrame
- Basic index information
- Sorting
- Selecting rows/columns
- loc and iloc
- Full code for the above
Creating a two-dimensional DataFrame
First approach: a list of dicts, one dict per row
import pandas as pd
t1 = [{"name":"张山", "age":19, "tel":"10088"},
{"name":"李四", "age":29, "tel":"14792"},
{"name":"王五", "age":25, "tel":"10637"}]
data1 = pd.DataFrame(t1)
print(data1)
name age tel
0 张山 19 10088
1 李四 29 14792
2 王五 25 10637
Second approach: a dict of lists, one list per column
t2 = {"name":["张山", "李四", "王五"],
"age":[19,30,28],
"tel":[10392, 38922, 43829]}
data2 = pd.DataFrame(t2)
print(data2)
name age tel
0 张山 19 10392
1 李四 30 38922
2 王五 28 43829
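The two constructions above differ mainly in orientation: a list of dicts describes the data row by row, while a dict of lists describes it column by column. The following is a minimal sketch of that difference; the names `rows`, `cols`, `by_rows` and `by_cols`, and the row with a missing `tel` value, are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Row-oriented input: one dict per row.
# A key that is missing from a row becomes NaN in that row.
rows = [{"name": "张山", "age": 19, "tel": "10088"},
        {"name": "李四", "age": 29}]            # no "tel" key (illustrative)
by_rows = pd.DataFrame(rows)

# Column-oriented input: one list per column.
# All lists must have the same length.
cols = {"name": ["张山", "李四"], "age": [19, 29]}
by_cols = pd.DataFrame(cols)

print(by_rows)
print(by_rows["tel"].isna().tolist())   # which rows have no tel value
print(by_cols.shape)
```

The row-oriented form is convenient when records arrive one at a time (for example from JSON), while the column-oriented form is usually shorter to type by hand.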
Basic index information
print("data2.index:\n",data2.index)
print("data2.columns:\n",data2.columns)
print("data2.values:\n",data2.values)
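print("data2.astype:\n",data2.astype)  # astype is a method; without parentheses this prints the bound method (use data2.dtypes for the column types)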
print("data2.shape:\n",data2.shape)
print("data2.ndim:\n",data2.ndim)
data2.index:
RangeIndex(start=0, stop=3, step=1)
data2.columns:
Index(['name', 'age', 'tel'], dtype='object')
data2.values:
[['张山' 19 10392]
['李四' 30 38922]
['王五' 28 43829]]
data2.astype:
<bound method NDFrame.astype of name age tel
0 张山 19 10392
1 李四 30 38922
2 王五 28 43829>
data2.shape:
(3, 3)
data2.ndim:
2
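Printing `data2.astype` without calling it shows a bound method rather than type information. If the goal is to inspect or change column types, something like the following sketch may be closer to the intent; the variable `converted` and the choice to convert `tel` to strings are assumptions made for illustration.

```python
import pandas as pd

data2 = pd.DataFrame({"name": ["张山", "李四", "王五"],
                      "age": [19, 30, 28],
                      "tel": [10392, 38922, 43829]})

# dtypes is an attribute: one dtype per column.
print(data2.dtypes)

# astype is a method: calling it returns a converted copy.
# Here the numeric tel column is converted to strings.
converted = data2.astype({"tel": str})
print(converted["tel"].tolist())

# The original frame is not modified by astype.
print(data2["tel"].tolist())
```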
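Sorting
df = data2.sort_values(by="age")  # ascending order by default
- To sort in descending order, pass ascending=False (in PyCharm, Ctrl+B jumps to the source of sort_values, where the available parameters and their defaults can be looked up)
df = data2.sort_values(by="age", ascending=False)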
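A short sketch of what `sort_values` does to the index may help here: the rows are reordered, but each row keeps its original index label, and the original frame is left untouched. The names `asc`, `desc` and `renumbered` are illustrative, and the use of `reset_index` is an addition that does not appear in the original code.

```python
import pandas as pd

data2 = pd.DataFrame({"name": ["张山", "李四", "王五"],
                      "age": [19, 30, 28],
                      "tel": [10392, 38922, 43829]})

asc = data2.sort_values(by="age")                     # ascending is the default
desc = data2.sort_values(by="age", ascending=False)   # descending

# The index labels travel with their rows.
print(asc.index.tolist())
print(desc.index.tolist())

# sort_values returns a new frame; data2 keeps its original order.
print(data2.index.tolist())

# reset_index(drop=True) renumbers the sorted rows from 0.
renumbered = desc.reset_index(drop=True)
print(renumbered)
```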
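Selecting rows/columns
df = df.head(2)  # keep only the first two rows
- Selecting rows or columns directly
- A numeric slice inside the square brackets (e.g. [:1]) selects rows; a string selects the column with that name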
df1 = data2[:1]
df2 = data2["name"]
df3 = data2[:1]["name"]
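The three selections above return different kinds of objects, which is easy to overlook. The sketch below checks the types; `df2b` (selecting with a list of column names) and the `int_key_fails` flag are additions for illustration only.

```python
import pandas as pd

data2 = pd.DataFrame({"name": ["张山", "李四", "王五"],
                      "age": [19, 30, 28],
                      "tel": [10392, 38922, 43829]})

df1 = data2[:1]            # slice -> rows, result is a DataFrame
df2 = data2["name"]        # string -> one column, result is a Series
df2b = data2[["name"]]     # list of strings -> result is a DataFrame
df3 = data2[:1]["name"]    # rows first, then a column -> Series of length 1

print(type(df1).__name__, df1.shape)
print(type(df2).__name__, df2.shape)
print(type(df2b).__name__, df2b.shape)
print(type(df3).__name__, df3.tolist())

# A bare integer is looked up as a column label, not as a row number,
# so it fails here because no column is called 0.
int_key_fails = False
try:
    data2[0]
except KeyError:
    int_key_fails = True
print(int_key_fails)
```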
loc and iloc
- loc: indexing by label
- iloc: indexing by integer position
df4 = data2.loc[:1,["name"]]
print(df4)
df_iloc = data2.iloc[:1, :1]  # first row, first column, selected by position
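With the default 0, 1, 2 index, labels and positions look identical, which hides the difference between `loc` and `iloc`. The sketch below tries to make it visible: `loc` slices include the end label, `iloc` slices exclude the end position, and after sorting the two no longer refer to the same rows. The names `by_label`, `by_pos` and `desc` are illustrative.

```python
import pandas as pd

data2 = pd.DataFrame({"name": ["张山", "李四", "王五"],
                      "age": [19, 30, 28],
                      "tel": [10392, 38922, 43829]})

# loc slices by label and includes the end label: rows 0 and 1.
by_label = data2.loc[:1, ["name"]]
# iloc slices by position and excludes the end position: row 0 only.
by_pos = data2.iloc[:1, :1]
print(by_label.shape, by_pos.shape)

# After sorting, the label 0 and the position 0 are different rows.
desc = data2.sort_values(by="age", ascending=False)
print(desc.loc[0, "name"])    # the row whose index label is 0
print(desc.iloc[0, 0])        # the first row in the sorted order
```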
df5 = data2["name"].str.len()
print(df5)
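df6 = data2["name"].str.split("/")  # these names contain no "/", so each value becomes a one-element list
df7 = data2.dropna(axis=0, inplace=True)  # inplace=True modifies data2 itself and returns None, so df7 is None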
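The string methods and `dropna` are easier to follow on data that actually contains separators and missing values. The following is a small sketch on a made-up frame; `people`, its `tags` column and the other variable names are hypothetical and only serve to show the behaviour.

```python
import pandas as pd

# Hypothetical data with a missing name and "/"-separated tags.
people = pd.DataFrame({"name": ["张山", "李四", None],
                       "tags": ["a/b", "c", "d/e/f"]})

# .str methods work element by element; missing values stay missing.
lengths = people["name"].str.len()
parts = people["tags"].str.split("/")
print(lengths)
print(parts)

# Without inplace, dropna returns a new frame and leaves people as it is.
cleaned = people.dropna(axis=0)
rows_before = len(people)

# With inplace=True, people itself is modified and the return value is None.
result = people.dropna(axis=0, inplace=True)
print(len(cleaned), rows_before, len(people), result)
```

Assigning the result of an `inplace=True` call, as in `df7` above, therefore always stores `None`; either keep the return value and drop `inplace`, or use `inplace=True` and ignore the return value.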
Full code for the above
import pandas as pd
t1 = [{"name":"张山", "age":19, "tel":"10088"},
{"name":"李四", "age":29, "tel":"14792"},
{"name":"王五", "age":25, "tel":"10637"}]
data1 = pd.DataFrame(t1)
print(data1)
t2 = {"name":["张山", "李四", "王五"], "age":[19,30,28], "tel":[10392, 38922, 43829]}
data2 = pd.DataFrame(t2)
print(data2)
print("data2.index:\n",data2.index)
print("data2.columns:\n",data2.columns)
print("data2.values:\n",data2.values)
print("data2.astype:\n",data2.astype)  # prints the bound method, not the column types (data2.dtypes would show those)
print("data2.shape:\n",data2.shape)
print("data2.ndim:\n",data2.ndim)
df = data2.sort_values(by="age", ascending=False)
print(df)
df = df.head(2)
print(df)
df1 = data2[:1]
df2 = data2["name"]
df3 = data2[:1]["name"]
print(df1)
print(df2)
print(df3)
df4 = data2.loc[:1,["name"]]
print(df4)
df5 = data2["name"].str.len()
print(df5)
df6 = data2["name"].str.split("/")
print(df6)
df7 = data2.dropna(axis=0, inplace=True)  # inplace=True returns None, so the next line prints None
print(df7)
D:\python.exe E:/pycharm文件/pandas/dataframe一般使用.py
name age tel
0 张山 19 10088
1 李四 29 14792
2 王五 25 10637
name age tel
0 张山 19 10392
1 李四 30 38922
2 王五 28 43829
data2.index:
RangeIndex(start=0, stop=3, step=1)
data2.columns:
Index(['name', 'age', 'tel'], dtype='object')
data2.values:
[['张山' 19 10392]
['李四' 30 38922]
['王五' 28 43829]]
data2.astype:
<bound method NDFrame.astype of name age tel
0 张山 19 10392
1 李四 30 38922
2 王五 28 43829>
data2.shape:
(3, 3)
data2.ndim:
2
name age tel
1 李四 30 38922
2 王五 28 43829
0 张山 19 10392
name age tel
1 李四 30 38922
2 王五 28 43829
name age tel
0 张山 19 10392
0 张山
1 李四
2 王五
Name: name, dtype: object
0 张山
Name: name, dtype: object
name
0 张山
1 李四
0 2
1 2
2 2
Name: name, dtype: int64
0 [张山]
1 [李四]
2 [王五]
Name: name, dtype: object
None
Process finished with exit code 0