machine-learning-foundations-4

Abstract: Machine Learning Foundations (Hsuan-Tien Lin), Lecture 4, on whether learning is feasible (Feasibility of Learning).

  1. Learning is Impossible?

    • No Free Lunch
      Learning from a dataset D and inferring the mapping's outputs on data outside D is doomed in the worst case, because many possible unknown mappings are all consistent with D.
  2. Probability to the Rescue

    • Inferring Something Unknown
      Example: inferring the fraction of orange balls in a bin containing orange and green balls:
      Truth: orange probability = $\mu$, green probability = $1-\mu$
      Sample: orange fraction = $\nu$, green fraction = $1-\nu$
      Does the in-sample $\nu$ contain information about the out-of-sample $\mu$?
      • possibly not: sample can be mostly green while bin is mostly orange
      • probably yes: in-sample $\nu$ likely close to unknown $\mu$

      Hoeffding’s Inequality
      in big sample (N large), $\nu$ is probably close to $\mu$ (within $\epsilon$)
      $$ \mathbb{P}[|\nu-\mu|>\epsilon]\leq 2\exp(-2\epsilon^{2}N) $$
      so, the statement ‘$\nu=\mu$’ is probably approximately correct (PAC)
      • larger sample size N or looser gap $\epsilon$ $\Rightarrow$ higher probability for ‘$\nu=\mu$’
      • if large N, can probably infer unknown $\mu$ by known $\nu$
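Hoeffding's bound can be checked with a small Monte Carlo sketch; the values $\mu=0.6$, $N=100$, $\epsilon=0.1$ and the trial count below are illustrative assumptions, not from the lecture:

```python
import math
import random

# Illustrative parameters (assumptions): true orange fraction mu,
# sample size N, tolerance eps, number of repeated experiments.
mu, N, eps, trials = 0.6, 100, 0.1, 20000

rng = random.Random(0)
bad = 0  # count of samples whose nu deviates from mu by more than eps
for _ in range(trials):
    # Draw N balls with replacement; nu = observed orange fraction.
    nu = sum(rng.random() < mu for _ in range(N)) / N
    bad += abs(nu - mu) > eps

empirical = bad / trials
bound = 2 * math.exp(-2 * eps ** 2 * N)  # Hoeffding: 2*exp(-2*eps^2*N)
print(f"P[|nu - mu| > eps] ~ {empirical:.4f}  <=  bound {bound:.4f}")
```

The empirical deviation probability stays below the Hoeffding bound, and shrinking $\epsilon$ or $N$ loosens both in the same direction.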
  3. Connection to Learning

    For any single, fixed hypothesis h, when the sample is large enough (N large), Hoeffding's Inequality gives:
    $$ \mathbb{P}[|E_{in}(h)-E_{out}(h)|>\epsilon]\leq 2\exp(-2\epsilon^{2}N) $$
    Therefore:
    $$
    E_{in}(h)\approx E_{out}(h)\ \text{and}\ E_{in}(h)\ \text{small}
    \Rightarrow E_{out}(h)\ \text{small} \Rightarrow h \approx f\ \text{with respect to}\ P
    $$
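For one fixed h this chain can be illustrated numerically; the input distribution, target f, and hypothesis h below are made-up assumptions chosen so that $E_{out}(h)$ is known exactly:

```python
import random

rng = random.Random(1)

# Assumed setup: x ~ Uniform(0, 1), target f(x) = [x >= 0.5],
# and one fixed hypothesis h(x) = [x >= 0.3].  They disagree exactly
# on 0.3 <= x < 0.5, so the true out-of-sample error E_out(h) = 0.2.
f = lambda x: x >= 0.5
h = lambda x: x >= 0.3
E_out = 0.2

N = 10000
sample = [rng.random() for _ in range(N)]
E_in = sum(h(x) != f(x) for x in sample) / N  # in-sample error of the fixed h

print(f"E_in = {E_in:.3f}  vs  E_out = {E_out}")  # close, as Hoeffding predicts
```

Note that h is fixed before the sample is drawn; Hoeffding only licenses the conclusion for such a pre-committed hypothesis.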
    Verification of One h

    Whether the g selected by the algorithm can approximate the true f hinges on the algorithm choosing freely, rather than being forced into a single (biased) choice; with one fixed h there is no choice, so the procedure merely verifies h rather than learns.
    Flow diagram of the verification process:

  4. Connection to Real Learning
    Multiple h
    Example: tossing coins to estimate the probability of heads and tails:

    The most natural algorithm $\mathcal{A}$ picks the $h_m$ with the smallest ${E}_{in}(h_m)$ as g, somewhat like the greedy choice in pocket PLA.

    With many hypotheses, however, a bad sample (one where $E_{in}$ and $E_{out}$ are far apart) becomes likely for at least one of them; for a finite hypothesis set of size M, the union bound gives $\mathbb{P}[|E_{in}(g)-E_{out}(g)|>\epsilon]\leq 2M\exp(-2\epsilon^{2}N)$. So, for a finite hypothesis set, if the sample size N is large enough (so that ${E}_{out}(g)\approx{E}_{in}(g)$) and the algorithm $\mathcal{A}$ finds a g with ${E}_{in}(g)\approx 0$, then ${E}_{out}(g)\approx 0$: learning is feasible.
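The danger of choosing among many hypotheses can be quantified with the coin example: a single fair coin rarely shows all heads, but among many coins some coin almost surely does. The counts below (150 coins, 5 tosses each) are assumed for illustration:

```python
# Assume M fair coins, each tossed `flips` times (illustrative numbers).
# A coin showing all heads plays the role of a hypothesis with E_in = 0
# achieved purely by luck.
M, flips = 150, 5

p_one = 0.5 ** flips              # P[a single coin shows all heads] = 1/32
p_some = 1 - (1 - p_one) ** M     # P[at least one of the M coins does]
print(f"P[some coin of {M} shows {flips}/{flips} heads] = {p_some:.3f}")
```

A "perfect" $E_{in}=0$ thus appears by luck alone with high probability, which is exactly why the Hoeffding bound must be inflated by the factor M when the algorithm picks the best-looking hypothesis.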