machine-learning-foundations-3

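Abstract: Lecture 3 of Machine Learning Foundations (Hsuan-Tien Lin), which categorizes learning problems from several different perspectives.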

  1. Learning with Different Output Space $\mathcal{Y}$

    • binary classification: $\mathcal{Y}=\lbrace -1, +1 \rbrace$
    • multiclass classification: $\mathcal{Y}=\lbrace 1,2,\cdots,K \rbrace$
    • regression: $\mathcal{Y}=\mathbb{R}$
    • structured learning: $\mathcal{Y}=\text{structures}$ (e.g. the part-of-speech structure of a sentence)

    Of these, binary classification and regression are the core building blocks for solving more complex problems.
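
    One way to see binary classification acting as a building block is to recast a multiclass problem as $K$ binary ones. The sketch below assumes a one-versus-all decomposition, which these notes do not name explicitly; the function name `one_versus_all_labels` is hypothetical and only the relabeling step is shown.

```python
def one_versus_all_labels(labels, num_classes):
    """Turn labels in {1, ..., K} into K binary label lists in {-1, +1}.

    Assumption: a one-versus-all decomposition, chosen for illustration.
    """
    binary_problems = {}
    for k in range(1, num_classes + 1):
        # Problem k asks: "is this example class k (+1) or not (-1)?"
        binary_problems[k] = [+1 if y == k else -1 for y in labels]
    return binary_problems


labels = [1, 3, 2, 3]  # multiclass labels with K = 3
problems = one_versus_all_labels(labels, num_classes=3)
print(problems[3])  # [-1, 1, -1, 1]
```

    Each of the $K$ relabeled problems has $\mathcal{Y}=\lbrace -1, +1 \rbrace$, so any binary classifier could in principle be applied to it.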

  2. Learning with Different Data Label

    • supervised learning: every $\pmb{x}_n$ comes with corresponding $y_n$
    • unsupervised learning: learning without $y_n$
      • clustering
      • density estimation
      • outlier detection
    • semi-supervised learning: leverage unlabeled data to avoid ‘expensive’ labeling, with only some $y_n$ given
    • reinforcement learning: learn with ‘partial/implicit information’ (often sequentially), implicit $y_n$

    Of these, supervised learning is currently the core tool.
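
    The difference between the supervised and unsupervised settings is mostly a difference in what the learner is handed. As a minimal sketch on 1-D data: with every $y_n$ available the learner can average each class directly, while without $y_n$ it has to discover the groups itself. The two-cluster averaging loop here (in the spirit of k-means) is an illustrative stand-in for clustering, not an algorithm prescribed by these notes, and both function names are hypothetical.

```python
def class_means(xs, ys):
    """Supervised: every x_n comes with y_n, so average the points of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def two_means(xs, iterations=10):
    """Unsupervised: no y_n, so alternate between grouping points and re-averaging."""
    centers = [min(xs), max(xs)]  # simple initial guess (an assumption)
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Keep the old center if a group ended up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
ys = [-1, -1, -1, +1, +1, +1]

supervised = class_means(xs, ys)  # uses the labels
unsupervised = two_means(xs)      # never sees the labels
print(supervised, unsupervised)
```

    On this toy data both routes land on roughly the same two centers, but only the supervised one knows which center corresponds to which label.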

  3. Learning with Different Protocol $f\Rightarrow (\pmb{x}_n, y_n)$

    • batch learning: learn from all known data
    • online learning: hypothesis ‘improves’ through receiving data instances sequentially (passive)
    • active learning: improve hypothesis with fewer labels (hopefully) by asking questions strategically

    Of these, batch learning is still the most common at present.
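
    The batch and online protocols can be contrasted with the same update rule applied in two ways. The following is a rough sketch assuming a perceptron-style mistake-driven update, which is an illustrative choice rather than something these notes specify; `predict`, `online_update`, and `batch_learn` are hypothetical names.

```python
def predict(w, x):
    """Sign of the linear score, with the bias stored in w[0]."""
    score = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if score > 0 else -1


def online_update(w, x, y):
    """Online protocol: receive one (x, y) and adjust the hypothesis only on a mistake."""
    if predict(w, x) != y:
        w = [w[0] + y] + [wi + y * xi for wi, xi in zip(w[1:], x)]
    return w


def batch_learn(data, dim, max_passes=100):
    """Batch protocol: all known data is available, so sweep it until no mistakes remain."""
    w = [0.0] * (dim + 1)
    for _ in range(max_passes):
        mistakes = 0
        for x, y in data:
            if predict(w, x) != y:
                mistakes += 1
                w = online_update(w, x, y)
        if mistakes == 0:
            break
    return w


data = [((2.0, 1.0), 1), ((1.0, 3.0), 1), ((-1.0, -2.0), -1), ((-2.0, -0.5), -1)]

# Batch: hand over the whole data set at once.
w_batch = batch_learn(data, dim=2)

# Online: examples arrive one at a time and the hypothesis is revised as they come.
w_online = [0.0, 0.0, 0.0]
for x, y in data:
    w_online = online_update(w_online, x, y)

print(w_batch, w_online)
```

    An active learner would differ from both in that it chooses which $\pmb{x}_n$ to ask a label for; that selection strategy is not sketched here.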

  4. Learning with Different Input Space $\mathcal{X}$

    • concrete features: each dimension of $\mathcal{X}\subset \mathbb{R}^d$ represents ‘sophisticated physical meaning’
    • raw features: often need humans or machines to convert them to concrete ones
    • abstract features: again need ‘feature conversion/extraction/construction’

    Of these, concrete features are the easiest type of input to handle.
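
    Converting raw features to concrete ones can be pictured with a small image example. This is a minimal sketch assuming the raw input is a grid of pixel intensities and that average intensity and left-right asymmetry are useful concrete features; both the feature choice and the function names are illustrative assumptions, not taken from these notes.

```python
def average_intensity(image):
    """Mean pixel value over the whole grid."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def asymmetry(image):
    """Mean absolute difference between the image and its left-right mirror.

    0 means perfectly symmetric; larger values mean less symmetric.
    """
    diffs = [abs(p - q) for row in image for p, q in zip(row, reversed(row))]
    return sum(diffs) / len(diffs)


def to_concrete_features(image):
    """Map a raw pixel grid to a 2-D concrete feature vector."""
    return (average_intensity(image), asymmetry(image))


centered_bar = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
left_bar = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
]

print(to_concrete_features(centered_bar))
print(to_concrete_features(left_bar))
```

    Both images have the same average intensity, yet the asymmetry feature separates them, which is the kind of ‘physical meaning’ a hand-designed concrete feature is meant to carry.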