How to apply a standard scaler o

2017-04-25  Kulbear

This is a very short post with some tips you'll need when scaling your data, and (maybe) some problems you'll run into along the way.

There are many state-of-the-art libraries that can handle this problem easily for you. I'll introduce the one I am most familiar with: scikit-learn in Python.

These are the first 5 rows of our sample dataset, where open, high, low, volume and amount are our features and close is the target we want to predict once the model is trained.

Data Example

But wait, before we start throwing our data into the model training process, what did you forget?

You need to standardize features by removing the mean and scaling to unit variance.
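To make "removing the mean and scaling to unit variance" concrete, here is a minimal sketch that standardizes a tiny feature matrix by hand and compares the result with scikit-learn's StandardScaler. The numbers and the variable names (X, X_manual, X_sklearn) are made up for illustration and are not taken from the dataset above.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix: 4 samples, 2 features on very different scales.
# These values are invented purely for illustration.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0],
              [4.0, 800.0]])

# Manual standardization: subtract each column's mean,
# then divide by each column's standard deviation.
mean = X.mean(axis=0)
std = X.std(axis=0)  # population std (ddof=0), the same convention StandardScaler uses
X_manual = (X - mean) / std

# The same transformation through scikit-learn.
scaler = StandardScaler().fit(X)
X_sklearn = scaler.transform(X)

# Both approaches should agree, and every column should now have
# (approximately) zero mean and unit variance.
print(np.allclose(X_manual, X_sklearn))
print(X_sklearn.mean(axis=0), X_sklearn.std(axis=0))
```

After scaling, both columns live on the same scale even though the raw values differed by two orders of magnitude.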

from sklearn import preprocessing as prep


def standard_scaler(X_train, X_test):
    # Flatten each 3-D sample into one row, because StandardScaler expects 2-D input.
    train_samples, train_nx, train_ny = X_train.shape
    test_samples, test_nx, test_ny = X_test.shape
    X_train = X_train.reshape((train_samples, train_nx * train_ny))
    X_test = X_test.reshape((test_samples, test_nx * test_ny))
    # Fit on the training set only, then apply the same statistics to both sets.
    preprocessor = prep.StandardScaler().fit(X_train)
    X_train = preprocessor.transform(X_train)
    X_test = preprocessor.transform(X_test)
    # Restore the original 3-D shapes.
    X_train = X_train.reshape((train_samples, train_nx, train_ny))
    X_test = X_test.reshape((test_samples, test_nx, test_ny))
    return X_train, X_test
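One thing worth noticing about the function above: because each sample is flattened to a single row, every position in the flattened row gets its own mean and standard deviation. If you would rather have one mean and one standard deviation per feature, a possible variant looks like the sketch below. It assumes the arrays are laid out as (samples, time steps, features), which the post does not state, and scale_per_feature is a hypothetical name used only for this example. The input data is random and only there to make the snippet runnable.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler


def scale_per_feature(X_train, X_test):
    """Hypothetical variant: one mean/std per feature, shared across time steps.

    Assumes the last axis indexes features: (samples, time steps, features).
    """
    n_features = X_train.shape[-1]
    # Stack all time steps of all training samples into rows of features.
    scaler = StandardScaler().fit(X_train.reshape(-1, n_features))
    # Apply the training statistics to both sets, then restore the 3-D shapes.
    X_train_s = scaler.transform(X_train.reshape(-1, n_features)).reshape(X_train.shape)
    X_test_s = scaler.transform(X_test.reshape(-1, n_features)).reshape(X_test.shape)
    return X_train_s, X_test_s


# Synthetic data: 100 training and 30 test samples, 20 time steps, 5 features.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=50.0, scale=10.0, size=(100, 20, 5))
X_test = rng.normal(loc=50.0, scale=10.0, size=(30, 20, 5))

X_train_s, X_test_s = scale_per_feature(X_train, X_test)
print(X_train_s.shape, X_test_s.shape)
```

Either way, the scaler is fitted on the training data only, so no information from the test set leaks into the statistics. Which variant suits you better depends on whether the same feature should be treated identically at every time step.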

TODO...
TODO...
TODO...
