Machine Learning Foundations, Part 1

2017-10-19 Emily_3b7b

1. data -> ML -> skill

skill: improve some performance measure

machine learning: improve some performance measure with experience computed from data

decide whether to use ML:

(1) there exists some "underlying pattern" to be learned

(2) there is no easy programmable definition of that pattern

(3) there is data about the pattern

2. The learning model:

training examples, hypothesis set ---> learning algorithm ---> final hypothesis g

3. Differences between ML and related fields:

Machine Learning: use data to compute hypothesis g that approximates target f

Data Mining: use huge data to find properties that are interesting

Artificial Intelligence: compute something that shows intelligent behavior

Statistics: use data to make inference about an unknown process

5. Perceptron Learning Algorithm (PLA)

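-- A fault confessed is half redressed: when the current w misclassifies an example (Xn, Yn), correct it with w <- w + Yn * Xn.

The next example to check can simply follow a naive cycle (1..N).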
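The PLA loop with a naive cycle can be sketched as follows. This is a minimal sketch, not the course's reference code: the names `pla`, `sign`, and `max_passes` are my own, a constant 1.0 is prepended to each x to act as the bias term, and sign(0) is treated as -1 by assumption.

```python
def sign(value):
    # Assumed convention: a score of exactly 0 is predicted as -1.
    return 1 if value > 0 else -1


def predict(w, x):
    # Prepend the constant 1.0 so that w[0] plays the role of the bias.
    xb = [1.0] + list(x)
    return sign(sum(wi * xi for wi, xi in zip(w, xb)))


def pla(examples, max_passes=1000):
    """Perceptron Learning Algorithm visiting examples in a naive cycle 1..N.

    examples: list of (x, y) with x a list of floats and y in {-1, +1}.
    Returns (w, halted). halted is True if a full pass made no mistakes.
    max_passes is a safety cap (an assumption of this sketch), because
    PLA never halts on data that is not linearly separable.
    """
    dim = len(examples[0][0])
    w = [0.0] * (dim + 1)
    for _ in range(max_passes):
        mistakes = 0
        for x, y in examples:
            if predict(w, x) != y:
                # Correct the mistake: w <- w + y * x
                xb = [1.0] + list(x)
                w = [wi + y * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:
            return w, True
    return w, False


# Toy data (made up for illustration): logical AND is linearly separable.
and_data = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
w, halted = pla(and_data)
print(halted, all(predict(w, x) == y for x, y in and_data))
```

On separable data such as the AND example, the loop stops once a full cycle produces no mistakes. On non-separable data it only stops because of the `max_passes` cap.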

6. Linear Separability: the data is linearly separable if and only if PLA can halt (stop), i.e. some w makes no mistakes on the data

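7. Pocket Algorithm: modify PLA by keeping the best weights seen so far in the pocket.

Replace the pocket weights whenever the updated w makes fewer mistakes, and stop after enough iterations.

Pocket is slower than PLA, because after each update it needs to check the new w on all the data, compare it with the old pocket w, and store the better weights.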
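The pocket idea can be sketched as below. This is a minimal sketch under assumptions: the names `pocket`, `count_mistakes`, `max_updates`, and `seed` are hypothetical, the mistake to correct is picked at random, and sign(0) is treated as -1. The call to `count_mistakes` after every update is the extra work that makes pocket slower than PLA.

```python
import random


def predict(w, x):
    # w[0] is the bias weight. A score of 0 is predicted as -1 (assumed convention).
    xb = [1.0] + list(x)
    score = sum(wi * xi for wi, xi in zip(w, xb))
    return 1 if score > 0 else -1


def count_mistakes(w, examples):
    # Full pass over the data: this is the per-update cost pocket adds to PLA.
    return sum(1 for x, y in examples if predict(w, x) != y)


def pocket(examples, max_updates=100, seed=0):
    """Pocket algorithm: run PLA updates, keep the best weights seen so far.

    Returns (best_w, best_mistakes).
    """
    rng = random.Random(seed)
    dim = len(examples[0][0])
    w = [0.0] * (dim + 1)
    best_w, best_mistakes = list(w), count_mistakes(w, examples)
    for _ in range(max_updates):
        wrong = [(x, y) for x, y in examples if predict(w, x) != y]
        if not wrong:
            break  # current w is perfect, nothing left to correct
        x, y = rng.choice(wrong)
        xb = [1.0] + list(x)
        w = [wi + y * xi for wi, xi in zip(w, xb)]  # ordinary PLA correction
        mistakes = count_mistakes(w, examples)
        if mistakes < best_mistakes:
            # The new w beats the pocketed one, so swap it into the pocket.
            best_w, best_mistakes = list(w), mistakes
    return best_w, best_mistakes


# Toy data (made up for illustration): XOR is not linearly separable,
# so plain PLA would never halt, but pocket still returns its best w.
xor_data = [([0, 0], -1), ([0, 1], 1), ([1, 0], 1), ([1, 1], -1)]
best_w, best_mistakes = pocket(xor_data)
print(best_mistakes)
```

The returned weights are never worse on the training data than the starting weights, since the pocket is only replaced by a strictly better w.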

8. Multiclass classification problem: predict which type (class) an input belongs to, y={1, 2, ..., K}

regression: y is a real number, e.g. stock price, temperature

binary classification: y={-1, +1}

structured learning:

9. Supervised learning: every Xn comes with its corresponding Yn

Unsupervised learning: learning without Yn; 'clustering' is the unsupervised counterpart of multiclass classification

e.g. articles => topics

Semi-supervised learning: only some Xn come with Yn, e.g. coin recognition with only some of the coins labeled

Reinforcement learning: learn with "partial/implicit information" (often sequentially)

10. Different input spaces: concrete, raw, and abstract features

11. Hoeffding's Inequality
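In its standard form for sampling, with mu the unknown true fraction, nu the fraction observed in a sample of size N, and eps the tolerance, the inequality reads: P[|nu - mu| > eps] <= 2 * exp(-2 * eps^2 * N). The sketch below compares that bound with a simulated violation rate. The function names, the choice of mu, and the trial count are assumptions made for illustration only.

```python
import math
import random


def hoeffding_bound(eps, n):
    # Right-hand side of the inequality: 2 * exp(-2 * eps^2 * N)
    return 2 * math.exp(-2 * eps ** 2 * n)


def empirical_violation_rate(mu, n, eps, trials=2000, seed=0):
    """Estimate P[|nu - mu| > eps] by repeated sampling.

    Each trial draws n independent samples that are "1" with probability mu
    and measures how far the sample fraction nu lands from mu.
    """
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        nu = sum(rng.random() < mu for _ in range(n)) / n
        if abs(nu - mu) > eps:
            bad += 1
    return bad / trials


bound = hoeffding_bound(eps=0.1, n=100)
rate = empirical_violation_rate(mu=0.5, n=100, eps=0.1)
print(rate <= bound)
```

The bound does not depend on mu, which is why it can be used even though mu is unknown. It is usually loose: the simulated rate tends to sit well below it.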