Required: NumPy, pandas, linear_regression
NumPy
simplifies representing arrays and performing linear algebra operations
import numpy as np
- create an 8-element vector
one_dimensional_array = np.array([1,2,3,4,5,6,7,8])
- create a 3x2 matrix
matrix = np.array([[1,2],[3,4],[5,6]])
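A quick sketch to confirm the matrix dimensions. np.array expects the rows wrapped in one outer list; the .shape and .ndim checks are additions for illustration and are not part of the original notes.

```python
import numpy as np

# Each inner list is one row: three inner lists of two values give 3 rows x 2 columns.
matrix = np.array([[1, 2], [3, 4], [5, 6]])

print(matrix.shape)  # (3, 2)
print(matrix.ndim)   # 2
```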
- all zeros/ones
np.zeros, np.ones
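A minimal sketch of both constructors. The shapes and variable names here are assumptions chosen for illustration.

```python
import numpy as np

# The shape is an int for a vector or a (rows, columns) tuple for a matrix.
zeros_matrix = np.zeros((3, 2))
ones_vector = np.ones(4)

print(zeros_matrix.shape)    # (3, 2)
print(ones_vector.tolist())  # [1.0, 1.0, 1.0, 1.0]
```

Both functions return floating-point arrays by default.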
- populate an array with a sequence of numbers
np.arange includes the lower bound (5) but not the upper bound (12)
sequence_of_integers = np.arange(5,12)
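A short sketch showing which values np.arange(5, 12) actually produces. The stepped variant is an extra illustration, not from the original notes.

```python
import numpy as np

sequence_of_integers = np.arange(5, 12)
print(sequence_of_integers.tolist())  # [5, 6, 7, 8, 9, 10, 11]

# Optional third argument is the step size (illustrative addition).
every_other = np.arange(5, 12, 2)
print(every_other.tolist())  # [5, 7, 9, 11]
```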
- populate a 6-element vector with random integers between 50 and 100 (high is exclusive, hence 101)
random_integers_between_50_and_100 = np.random.randint(low=50, high=101, size=(6))
- create 6 random floating-point values between 0.0 and 1.0 (1.0 itself is excluded)
random_floats_between_0_and_1 = np.random.random([6])
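A small sketch that checks the ranges of both random calls. The values differ on every run, so only the bounds are tested; the shortened variable names are assumptions.

```python
import numpy as np

random_integers = np.random.randint(low=50, high=101, size=6)
random_floats = np.random.random(6)

# randint excludes `high`, so 101 is needed for 100 to be a possible value.
integers_in_range = bool(((random_integers >= 50) & (random_integers <= 100)).all())

# random() draws from the half-open interval [0.0, 1.0).
floats_in_range = bool(((random_floats >= 0.0) & (random_floats < 1.0)).all())

print(integers_in_range, floats_in_range)
```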
pandas
provides an easy way to represent datasets in memory
import numpy as np
import pandas as pd
DataFrames are the central data structure in the pandas API; a DataFrame stores data in cells and has named columns (usually) and numbered rows
- create a DataFrame
my_data = np.array([[0, 3], [10, 7], [20, 9], [30, 14], [40, 15]])
my_column_names = ['temperature', 'activity']
my_dataframe = pd.DataFrame(data=my_data, columns=my_column_names)
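A sketch for inspecting the result, reusing the data from these notes. The shape, column, and index checks are additions for illustration.

```python
import numpy as np
import pandas as pd

my_data = np.array([[0, 3], [10, 7], [20, 9], [30, 14], [40, 15]])
my_column_names = ['temperature', 'activity']
my_dataframe = pd.DataFrame(data=my_data, columns=my_column_names)

print(my_dataframe.shape)          # (5, 2)
print(list(my_dataframe.columns))  # ['temperature', 'activity']
print(list(my_dataframe.index))    # [0, 1, 2, 3, 4]
```

Rows get the default numbering 0 to 4 because no index was passed.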
- add a new column to a DataFrame
my_dataframe["adjusted"] = my_dataframe["activity"] + 2
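A self-contained sketch of the column assignment. The DataFrame is rebuilt from a dict here for brevity, which is an assumption about construction and not how the notes build it.

```python
import pandas as pd

my_dataframe = pd.DataFrame({
    'temperature': [0, 10, 20, 30, 40],
    'activity': [3, 7, 9, 14, 15],
})

# Assigning to a new column name creates the column; the + 2 is applied element-wise.
my_dataframe["adjusted"] = my_dataframe["activity"] + 2

print(my_dataframe["adjusted"].tolist())  # [5, 9, 11, 16, 17]
```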
- specify a subset of a DataFrame
my_dataframe.head(3)
print the first 3 rows
my_dataframe.iloc[[2]]
isolate the row at position 2 (the third row, counting from 0)
my_dataframe[1:4]
isolate rows 1 through 3 (counting from 0; row 4 is excluded)
my_dataframe['activity']
isolate a specific column
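The four selections above can be compared side by side in one sketch. The result variable names are assumptions added for illustration.

```python
import pandas as pd

my_dataframe = pd.DataFrame({
    'temperature': [0, 10, 20, 30, 40],
    'activity': [3, 7, 9, 14, 15],
})

head_rows = my_dataframe.head(3)     # rows 0, 1, 2
row_two = my_dataframe.iloc[[2]]     # one-row DataFrame at position 2
row_slice = my_dataframe[1:4]        # rows 1, 2, 3 (end of slice excluded)
activity = my_dataframe['activity']  # a single column, returned as a Series

print(list(head_rows.index))            # [0, 1, 2]
print(row_two['temperature'].tolist())  # [20]
print(list(row_slice.index))            # [1, 2, 3]
print(type(activity).__name__)          # Series
```

Positions count from 0, so iloc[[2]] returns the third row.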
- Referencing. If you assign a DataFrame to a new variable, both names refer to the same object, so any change made through one is visible through the other.
reference_to_df = df
print the value of a particular cell
df['Jason'][1]
print with format
print(" Starting value of reference_to_df: %d\n" % reference_to_df['Jason'][1])
modify a cell in df
df.at[1, 'Jason'] = df['Jason'][1] + 5
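A runnable sketch of the referencing behavior. The notes do not show how df was built, so the contents below (including the 'Ann' column) are hypothetical.

```python
import pandas as pd

# Hypothetical data; only the 'Jason' column name comes from the notes.
df = pd.DataFrame({'Jason': [10, 20, 30], 'Ann': [1, 2, 3]})

reference_to_df = df  # a second name for the same object, no data is copied

before = reference_to_df.at[1, 'Jason']
df.at[1, 'Jason'] = df.at[1, 'Jason'] + 5
after = reference_to_df.at[1, 'Jason']

print(before, after)          # 20 25
print(reference_to_df is df)  # True
```

The change made through df shows up through reference_to_df because both names point at one DataFrame.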
- Copying. If you call the pd.DataFrame.copy method, you create a true independent copy. Changes to the original DataFrame or to the copy will not be reflected in the other.
copy_of_my_dataframe = my_dataframe.copy()
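A sketch showing that the copy is independent. The sample values and the edited cell are assumptions for illustration.

```python
import pandas as pd

my_dataframe = pd.DataFrame({'temperature': [0, 10, 20], 'activity': [3, 7, 9]})
copy_of_my_dataframe = my_dataframe.copy()  # deep copy by default

# Modify the original only.
my_dataframe.at[1, 'activity'] = 100

print(my_dataframe.at[1, 'activity'])          # 100
print(copy_of_my_dataframe.at[1, 'activity'])  # 7
```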
- pd.read_csv(" ")
loads a .csv file and returns a DataFrame, which retains a name for each column
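A sketch of read_csv that runs without a file on disk. The CSV text is made-up sample data, and io.StringIO stands in for a real file path.

```python
import io

import pandas as pd

# Made-up CSV content; the first line supplies the column names.
csv_text = "temperature,activity\n0,3\n10,7\n20,9\n"

# read_csv accepts a path or a file-like object; StringIO is used so the example is self-contained.
loaded = pd.read_csv(io.StringIO(csv_text))

print(list(loaded.columns))  # ['temperature', 'activity']
print(loaded.shape)          # (3, 2)
```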
Linear Regression
import pandas as pd
import tensorflow as tf
from matplotlib import pyplot as plt