
pandas Examples: Deleting

2020-05-23  橘猫吃不胖

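This continues the earlier pandas exercises.

This one is about data manipulation. Similar problems came up in earlier posts, so treat it as a review.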

import numpy as np
import pandas as pd
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
df = pd.read_csv(url)

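There is a problem here: the file has no header row, so read_csv takes the first data row as the column names.

We need to specify the column names ourselves.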

df = pd.read_csv(url, header=None, names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])
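To see what header=None changes without downloading anything, here is a minimal sketch that parses a few iris-style rows from an in-memory string. The sample rows and the names raw, cols, no_names and with_names are illustrative assumptions, not part of the original exercise.

```python
import io

import pandas as pd

# A few iris-style rows held in memory as a stand-in for the real file.
raw = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
)
cols = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class']

# Default behaviour: the first data row is consumed as the header.
no_names = pd.read_csv(io.StringIO(raw))

# header=None keeps every row as data; names= supplies the column labels.
with_names = pd.read_csv(io.StringIO(raw), header=None, names=cols)

print(len(no_names), len(with_names))
print(list(with_names.columns))
```

With the default settings one row of data silently disappears into the header, which is why the names are passed explicitly.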

1. Is there any missing value in the dataframe?

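Here we check whether the data contains missing values. The info method (df.info()) already reveals this through its non-null counts, but there is another way:

pandas.isna
pandas.isnull
pandas.isnull is an alias of pandas.isna, so the two behave identically.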

This function takes a scalar or array-like object and indicates whether values are missing (NaN in numeric arrays, None or NaN in object arrays, NaT in datetimelike).

pd.isna(df).sum()
pd.isnull(df).sum()
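As a rough illustration of how these counts read, the sketch below runs the same check on a hypothetical toy frame (toy, per_column and any_missing are made-up names) and also reduces the result to a single yes/no answer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing value in each column.
toy = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': ['x', None, 'z']})

# Missing count per column; the method form is equivalent to pd.isna(toy).
per_column = toy.isna().sum()

# Collapse to one boolean: is anything missing anywhere in the frame?
any_missing = toy.isna().any().any()

print(per_column)
print(any_missing)
```

The per-column sum answers "where", while the double any() answers the yes/no question the exercise actually asks.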

2. Let's set the values of the rows 10 to 29 of the column 'petal_length' to NaN

Set a range of rows in one column to NaN.

# One .loc call avoids chained assignment; label slices include both ends
df.loc[10:29, 'petal_length'] = np.nan
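Selecting the rows and the column in a single indexer call is the form pandas recommends for assignment, since assigning through a chained selection may end up writing to a temporary copy. A small sketch on a hypothetical frame (toy and col are made-up names) shows the label-based and position-based variants side by side:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a default RangeIndex (0..39).
toy = pd.DataFrame({'petal_length': np.arange(40, dtype=float), 'other': 0})

# Label-based: both ends of the slice are included (rows 10..29).
toy.loc[10:29, 'petal_length'] = np.nan

# Position-based equivalent: the end is excluded, so the slice is 10:30.
col = toy.columns.get_loc('petal_length')
toy.iloc[10:30, col] = np.nan

print(toy['petal_length'].isna().sum())
```

Both statements touch the same 20 cells here only because the index is the default 0..n-1 range; with a different index, labels and positions would no longer line up.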

3. Good, now let's substitute the NaN values with 1.0

Replace NaN with 1.0.

# Replace every NaN in the frame with 1.0
df.fillna(1.0, inplace=True)
# Look at the rows that were NaN a moment ago
df['petal_length'].iloc[10:30]
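Calling fillna on the whole frame fills every column. If only one column should be touched, fillna also accepts a dict mapping column names to fill values. A minimal sketch on a hypothetical toy frame (toy and filled are made-up names):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NaN in each column.
toy = pd.DataFrame({
    'petal_length': [1.4, np.nan, 1.3],
    'petal_width': [0.2, 0.2, np.nan],
})

# A dict limits the fill to the named columns; other columns keep their NaN.
# Without inplace=True a new frame is returned and toy is left unchanged.
filled = toy.fillna({'petal_length': 1.0})

print(filled)
```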

4. Now let's delete the column class

Delete a column.

df.drop(columns='class', inplace=True)
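For comparison, here is a sketch of the non-inplace form on a hypothetical toy frame (toy, trimmed and trimmed_again are made-up names). It also shows errors='ignore', which is handy when a notebook cell gets run twice.

```python
import pandas as pd

# Hypothetical two-column frame.
toy = pd.DataFrame({
    'petal_length': [1.4, 1.3],
    'class': ['Iris-setosa', 'Iris-setosa'],
})

# Without inplace=True, drop returns a new frame and leaves toy alone.
trimmed = toy.drop(columns='class')

# errors='ignore' makes the call a no-op when the column is already gone.
trimmed_again = trimmed.drop(columns='class', errors='ignore')

print(list(toy.columns), list(trimmed.columns))
```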

5. Set the first 3 rows as NaN

Set the first 3 rows to NaN in every column.

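df.iloc[:3] = np.nan

Careful with the slice end: it is exclusive, so iloc[:3] covers rows 0 to 2, while iloc[:4] would set one row too many.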
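The end of an iloc slice is exclusive, which is easy to miscount. A quick check on a hypothetical toy frame (toy and nan_rows are made-up names) confirms how many rows iloc[:3] actually touches:

```python
import numpy as np
import pandas as pd

# Hypothetical six-row frame of floats.
toy = pd.DataFrame({
    'a': np.arange(6, dtype=float),
    'b': np.arange(6, dtype=float),
})

# Positions 0, 1 and 2 only; position 3 is left untouched.
toy.iloc[:3] = np.nan

# Count the rows in which every value is NaN.
nan_rows = int(toy.isna().all(axis=1).sum())
print(nan_rows)
```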

6. Delete the rows that have NaN

Delete every row that contains a NaN.

df.dropna(inplace=True)
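By default dropna removes a row as soon as any value in it is missing. The how and subset parameters adjust that rule; the sketch below compares the three variants on a hypothetical toy frame (all variable names are made up for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 is all NaN, row 1 is partly NaN.
toy = pd.DataFrame({
    'a': [np.nan, 1.0, 2.0, 3.0],
    'b': [np.nan, np.nan, 5.0, 6.0],
})

any_nan = toy.dropna()             # default how='any': drops rows 0 and 1
all_nan = toy.dropna(how='all')    # drops only row 0, where every value is NaN
by_col = toy.dropna(subset=['a'])  # checks column 'a' only: drops row 0

print(len(any_nan), len(all_nan), len(by_col))
```

In the exercise the first rows are NaN in every column, so the default and how='all' would give the same result there.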

7. Reset the index so it begins with 0 again

Reset the index.

df.reset_index(drop=True, inplace=True)
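The drop=True argument matters: without it, the old index labels are kept as a new column. A small sketch on a hypothetical frame whose index starts at 4, roughly what is left after deleting leading rows (toy, kept and dropped are made-up names):

```python
import pandas as pd

# Hypothetical frame whose index no longer starts at 0.
toy = pd.DataFrame({'a': [10, 20, 30]}, index=[4, 5, 6])

kept = toy.reset_index()              # old labels move into a column named 'index'
dropped = toy.reset_index(drop=True)  # old labels are discarded

print(list(kept.columns))
print(list(dropped.index))
```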

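That wraps up this post. The main thing is to get to know the functions. You won't remember them all at first, so use them often and check the API docs whenever you need them.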
