machine-learning-foundations-3

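Abstract: Lecture 3 of Machine Learning Foundations (Hsuan-Tien Lin), which categorizes learning problems from several different perspectives.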

Learning with Different Output Space $\mathcal{Y}$
- binary classification: $\mathcal{Y}=\lbrace -1, +1 \rbrace$
- multiclass classification: $\mathcal{Y}=\lbrace 1,2,\cdots,K \rbrace$
- regression: $\mathcal{Y}=\mathbb{R}$
- structured learning: $\mathcal{Y}=\text{structures}$ (e.g. the part-of-speech structure of a sentence)
- …
Among these, binary classification and regression are the core building blocks for solving more complex problems.
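
As a rough illustration of the output spaces listed above, the sketch below guesses the kind of learning problem from a list of labels. The function name `output_space_kind` and its decision rules are assumptions made for this example only; real problems are defined by the task, not inferred from label types.

```python
from numbers import Real


def output_space_kind(labels):
    """Guess which output space Y the labels come from (toy heuristic, hypothetical helper)."""
    values = list(labels)
    if not values:
        raise ValueError("need at least one label")

    # Integer labels: either {-1, +1} (binary) or {1, ..., K} (multiclass).
    if all(isinstance(y, int) and not isinstance(y, bool) for y in values):
        distinct = set(values)
        if distinct <= {-1, +1}:
            return "binary classification"
        if min(distinct) >= 1:
            return "multiclass classification"

    # Any other real-valued labels are treated as Y = R.
    if all(isinstance(y, Real) and not isinstance(y, bool) for y in values):
        return "regression"

    # Everything else (sequences, trees, ...) is treated as a structure.
    return "structured learning"


print(output_space_kind([-1, +1, +1]))
print(output_space_kind([1, 3, 2, 3]))
print(output_space_kind([0.7, 12.5]))
print(output_space_kind([["PRON", "VERB", "NOUN"]]))
```

The four calls correspond to the four bullet points: binary labels, class indices, real numbers, and a tagged sentence.
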
Learning with Different Data Label
- supervised learning: every $\pmb{x}_n$ comes with corresponding $y_n$
- unsupervised learning: learning without $y_n$
  - clustering
  - density estimation
  - outlier detection
  - …
- semi-supervised learning: learn with only some $y_n$ given, leveraging unlabeled data to avoid ‘expensive’ labeling
- reinforcement learning: learn with ‘partial/implicit information’ (often sequentially), implicit $y_n$
Among these, supervised learning is currently the core tool.
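
The first three settings above differ only in how many $y_n$ are available, which a few lines of code can make concrete. This is a minimal sketch under the assumption that a missing label is stored as `None`; the name `label_setting` is hypothetical. Reinforcement learning is left out because its feedback is implicit rather than simply missing.

```python
def label_setting(ys):
    """Name the learning setting implied by which labels y_n are present (None = missing)."""
    ys = list(ys)
    if not ys:
        raise ValueError("need at least one example")

    labeled = sum(y is not None for y in ys)
    if labeled == len(ys):
        return "supervised"       # every x_n comes with its y_n
    if labeled == 0:
        return "unsupervised"     # no y_n at all
    return "semi-supervised"      # only some y_n are given


print(label_setting([+1, -1, +1]))
print(label_setting([None, None, None]))
print(label_setting([+1, None, None, -1]))
```
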
Learning with Different Protocol $f\Rightarrow (\pmb{x}_n, y_n)$
- batch learning: learn from all known data
- online learning: hypothesis ‘improves’ through receiving data instances sequentially (passive)
- active learning: improve hypothesis with fewer labels (hopefully) by asking questions strategically
Among these, batch learning remains the most common protocol.
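
To contrast the batch and online protocols, here is a minimal sketch that uses a perceptron-style update as the learner. The perceptron is swapped in purely for illustration and is not prescribed by this section; the function names and the toy data are assumptions. Active learning is not shown.

```python
def predict(w, x):
    """Linear threshold hypothesis: +1 if w.x > 0, otherwise -1."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score > 0 else -1


def online_update(w, x, y):
    """Online protocol: see one instance, correct w only if the hypothesis errs on it."""
    if predict(w, x) != y:
        return [wi + y * xi for wi, xi in zip(w, x)]
    return w


def batch_learn(data, dim, max_passes=100):
    """Batch protocol: all data is known up front, so sweep it until no mistakes remain."""
    w = [0.0] * dim
    for _ in range(max_passes):
        mistakes = 0
        for x, y in data:
            new_w = online_update(w, x, y)
            if new_w is not w:   # a new list means an update happened
                mistakes += 1
                w = new_w
        if mistakes == 0:
            break
    return w


# Toy data: x = (1, t) with a constant bias coordinate, y = +1 iff t > 0.
data = [((1, 2.0), +1), ((1, 1.0), +1), ((1, -1.0), -1), ((1, -2.0), -1)]

w_batch = batch_learn(data, dim=2)

w_online = [0.0, 0.0]
for x, y in data:                # instances arrive one at a time
    w_online = online_update(w_online, x, y)

print(w_batch, w_online)
```

In the batch case the learner may revisit the data as often as it likes; in the online case each instance is seen once, in arrival order, and the hypothesis is revised as it goes.
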
Learning with Different Input Space $\mathcal{X}$
- concrete features: each dimension of $\mathcal{X}\subset \mathbb{R}^d$ represents ‘sophisticated physical meaning’
- raw features: often need human or machines to convert to concrete ones
- abstract features: again need ‘feature conversion/extraction/construction’
Among these, concrete features are the easiest kind of input to handle.
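
The conversion from raw to concrete features can be sketched with a tiny black-and-white image. The two features below, `density` and `symmetry`, and their exact definitions are hypothetical choices made for this example; they only show the idea of replacing raw pixels with a few meaningful numbers.

```python
def density(image):
    """Fraction of 'ink': the mean pixel value of a 0/1 image."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def symmetry(image):
    """1 minus the mean absolute difference between the image and its left-right mirror."""
    diffs = [abs(p - q) for row in image for p, q in zip(row, reversed(row))]
    return 1.0 - sum(diffs) / len(diffs)


def to_concrete(image):
    """Convert raw pixels (rows * cols dimensions) into a 2-dimensional concrete feature vector."""
    return (density(image), symmetry(image))


# Raw features: every pixel is one input dimension.
centered_bar = [[0, 1, 0],
                [0, 1, 0],
                [0, 1, 0]]
left_bar = [[1, 0, 0],
            [1, 0, 0],
            [1, 0, 0]]

print(to_concrete(centered_bar))
print(to_concrete(left_bar))
```

Both images have the same density, but only the centered bar is mirror-symmetric, so the symmetry feature separates them where density alone cannot.
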