machine-learning-foundations-4

Abstract: Machine Learning Foundations (Hsuan-Tien Lin), Lecture 4, on whether learning is feasible (Feasibility of Learning).

  1. Learning is Impossible?

    • No Free Lunch
      Learning from a dataset D and inferring the mapping's outputs on data outside D is doomed in the worst case, because many possible unknown mappings are all consistent with D.
  2. Probability to the Rescue

    • Inferring Something Unknown
      Example: inferring the fraction of orange balls in a bin containing orange and green balls:
      Truth: orange probability = $\mu$, green probability = $1-\mu$
      Sample: orange fraction = $\nu$, green fraction = $1-\nu$
      Does the in-sample $\nu$ contain information about the out-of-sample $\mu$?
      • possibly not: sample can be mostly green while bin is mostly orange
      • probably yes: in-sample $\nu$ likely close to unknown $\mu$

      Hoeffding’s Inequality
      in big sample (N large), $\nu$ is probably close to $\mu$ (within $\epsilon$)
      $$ \mathbb{P}[|\nu-\mu|>\epsilon]\leq 2\exp(-2\epsilon^{2}N) $$
      so, the statement ‘$\nu=\mu$’ is probably approximately correct (PAC)
      • larger sample size N or looser gap $\epsilon$ $\Rightarrow$ higher probability for ‘$\nu=\mu$’
      • if large N, can probably infer unknown $\mu$ by known $\nu$
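Hoeffding's bound can be checked with a small Monte Carlo sketch; the values $\mu=0.6$, $N=100$, $\epsilon=0.1$ and the trial count below are illustrative assumptions, not from the lecture:

```python
import math
import random

# Illustrative parameters (assumptions): true orange fraction mu,
# sample size N, tolerance eps, number of repeated experiments.
mu, N, eps, trials = 0.6, 100, 0.1, 20000

rng = random.Random(0)
bad = 0  # count of samples whose nu deviates from mu by more than eps
for _ in range(trials):
    # Draw N balls with replacement; nu = observed orange fraction.
    nu = sum(rng.random() < mu for _ in range(N)) / N
    bad += abs(nu - mu) > eps

empirical = bad / trials
bound = 2 * math.exp(-2 * eps ** 2 * N)  # Hoeffding: 2*exp(-2*eps^2*N)
print(f"P[|nu - mu| > eps] ~ {empirical:.4f}  <=  bound {bound:.4f}")
```

The empirical deviation probability stays below the Hoeffding bound, and shrinking $\epsilon$ or $N$ loosens both in the same direction.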
  3. Connection to Learning

    For any single, fixed hypothesis h, when the sample is large enough (N large), Hoeffding's Inequality gives:
    $$ \mathbb{P}[|E_{in}(h)-E_{out}(h)|>\epsilon]\leq 2\exp(-2\epsilon^{2}N) $$
    Therefore:
    $$
    E_{in}(h)\approx E_{out}(h)\ \text{and}\ E_{in}(h)\ \text{small}
    \Rightarrow E_{out}(h)\ \text{small} \Rightarrow h \approx f\ \text{with respect to}\ P
    $$
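For one fixed h this chain can be illustrated numerically; the input distribution, target f, and hypothesis h below are made-up assumptions chosen so that $E_{out}(h)$ is known exactly:

```python
import random

rng = random.Random(1)

# Assumed setup: x ~ Uniform(0, 1), target f(x) = [x >= 0.5],
# and one fixed hypothesis h(x) = [x >= 0.3].  They disagree exactly
# on 0.3 <= x < 0.5, so the true out-of-sample error E_out(h) = 0.2.
f = lambda x: x >= 0.5
h = lambda x: x >= 0.3
E_out = 0.2

N = 10000
sample = [rng.random() for _ in range(N)]
E_in = sum(h(x) != f(x) for x in sample) / N  # in-sample error of the fixed h

print(f"E_in = {E_in:.3f}  vs  E_out = {E_out}")  # close, as Hoeffding predicts
```

Note that h is fixed before the sample is drawn; Hoeffding only licenses the conclusion for such a pre-committed hypothesis.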
    Verification of One h

    Whether the g selected by the algorithm can approximate the true f hinges on the algorithm choosing freely, rather than being forced into a single (biased) choice; with one fixed h there is no choice, so the procedure merely verifies h rather than learns.
    Flow diagram of the verification process:

  4. Connection to Real Learning
    Multiple h
    Example: tossing coins to estimate the probability of heads and tails:

    The most natural algorithm $\mathcal{A}$ picks the $h_m$ with the smallest ${E}_{in}(h_m)$ as g, somewhat like the greedy choice in pocket PLA.

    With many hypotheses, however, a bad sample (one where $E_{in}$ and $E_{out}$ are far apart) becomes likely for at least one of them; for a finite hypothesis set of size M, the union bound gives $\mathbb{P}[|E_{in}(g)-E_{out}(g)|>\epsilon]\leq 2M\exp(-2\epsilon^{2}N)$. So, for a finite hypothesis set, if the sample size N is large enough (so that ${E}_{out}(g)\approx{E}_{in}(g)$) and the algorithm $\mathcal{A}$ finds a g with ${E}_{in}(g)\approx 0$, then ${E}_{out}(g)\approx 0$: learning is feasible.
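The danger of choosing among many hypotheses can be quantified with the coin example: a single fair coin rarely shows all heads, but among many coins some coin almost surely does. The counts below (150 coins, 5 tosses each) are assumed for illustration:

```python
# Assume M fair coins, each tossed `flips` times (illustrative numbers).
# A coin showing all heads plays the role of a hypothesis with E_in = 0
# achieved purely by luck.
M, flips = 150, 5

p_one = 0.5 ** flips              # P[a single coin shows all heads] = 1/32
p_some = 1 - (1 - p_one) ** M     # P[at least one of the M coins does]
print(f"P[some coin of {M} shows {flips}/{flips} heads] = {p_some:.3f}")
```

A "perfect" $E_{in}=0$ thus appears by luck alone with high probability, which is exactly why the Hoeffding bound must be inflated by the factor M when the algorithm picks the best-looking hypothesis.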