Abstract: Machine Learning Foundations (Hsuan-Tien Lin), Lecture 4, which discusses whether learning is feasible (Feasibility of Learning).
Learning is Impossible?
- No Free Lunch
Using a dataset D to infer the mapping's outputs on data outside D is doomed in the worst case, because many different unknown mappings are all consistent with D.
Probability to the Rescue
- Inferring Something Unknown
Take as an example inferring the fraction of orange marbles in a bin containing orange and green marbles:
Truth: orange probability = $\mu$, green probability = $1-\mu$
Sample: orange fraction = $\nu$, green fraction = $1-\nu$
Does the in-sample $\nu$ tell us anything about the out-of-sample $\mu$?
- possibly not: the sample can be mostly green while the bin is mostly orange
- probably yes: in-sample $\nu$ likely close to unknown $\mu$
Hoeffding’s Inequality
in big sample (N large), $\nu$ is probably close to $\mu$ (within $\epsilon$)
$$ \mathbb{P}[|\nu-\mu|>\epsilon]\leq 2\exp(-2{\epsilon}^{2}N) $$
so, the statement ‘$\nu=\mu$’ is probably approximately correct (PAC)
- larger sample size N or looser gap $\epsilon$ $\Rightarrow$ higher probability for ‘$\nu=\mu$’
- if large N, can probably infer unknown $\mu$ by known $\nu$
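The bound is easy to check numerically. A minimal Monte Carlo sketch (the bin parameters `mu=0.6`, `N=100`, `eps=0.1` are arbitrary choices for illustration, not from the lecture):

```python
import math
import random

def hoeffding_check(mu=0.6, N=100, eps=0.1, trials=10000, seed=0):
    """Estimate P[|nu - mu| > eps] by repeatedly drawing N marbles from a
    bin whose true orange fraction is mu, and compare against the
    Hoeffding bound 2 * exp(-2 * eps^2 * N)."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        # nu is the in-sample orange fraction of one draw of size N
        nu = sum(rng.random() < mu for _ in range(N)) / N
        if abs(nu - mu) > eps:
            bad += 1
    bound = 2 * math.exp(-2 * eps ** 2 * N)
    return bad / trials, bound

empirical, bound = hoeffding_check()
print(empirical, bound)  # the empirical frequency should sit below the bound
```

Note that the Hoeffding bound holds for any $\mu$ and is usually quite loose; the empirical frequency of bad samples is far smaller than $2\exp(-2\epsilon^2 N)$.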
Connection to Learning
For any single fixed hypothesis h, when the sample size N is large enough, Hoeffding’s Inequality gives:
$$ \mathbb{P}[|{E}_{in}(h)-{E}_{out}(h)|>\epsilon]\leq 2\exp(-2{\epsilon}^{2}N) $$
Therefore:
$$
\text{‘}{E}_{in}(h)\approx{E}_{out}(h)\text{’ and ‘}{E}_{in}(h)\ \text{small’}
\Rightarrow {E}_{out}(h)\ \text{small} \Rightarrow h \approx f\ \text{with respect to}\ P
$$
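This implication can be illustrated for one fixed h. A sketch under made-up assumptions: a hypothetical 1-D target `f(x) = sign(x - 0.5)` and a single fixed hypothesis `h(x) = sign(x - 0.45)` on uniform inputs in [0, 1), so that they disagree exactly on [0.45, 0.5) and $E_{out}(h) = 0.05$:

```python
import random

def sign(v):
    return 1 if v >= 0 else -1

# Hypothetical target f and one fixed hypothesis h on X = [0, 1);
# they disagree exactly on [0.45, 0.5), so E_out(h) = 0.05.
def f(x):
    return sign(x - 0.5)

def h(x):
    return sign(x - 0.45)

def e_in(N=1000, seed=1):
    """In-sample error of the fixed h on N i.i.d. uniform points."""
    rng = random.Random(seed)
    return sum(h(x) != f(x) for x in (rng.random() for _ in range(N))) / N

print(e_in(N=100000))  # close to E_out(h) = 0.05 for large N
```

Because h was fixed before the data were drawn, $E_{in}(h)$ concentrates around $E_{out}(h)$ exactly as Hoeffding promises; this is verification of one h, not yet learning.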
Verification of One h
Whether the g selected by the algorithm can approximate the true f hinges on whether the algorithm chooses freely, rather than being forced into a single fixed (and thus possibly biased) choice.
Flow diagram of the verification process:
Connection to Real Learning
Multiple h
Take tossing coins to estimate the head/tail probability as an example: the most plausible algorithm $\mathcal{A}$ picks the $h_m$ with the smallest ${E}_{in}(h_m)$ as g, somewhat like the greedy nature of the pocket PLA.
That is, for a finite hypothesis set, if the sample size N is large enough (so that ${E}_{out}(g)\approx{E}_{in}(g)$) and the algorithm $\mathcal{A}$ can find a g with ${E}_{in}(g)\approx 0$, then ${E}_{out}(g)\approx 0$, and learning is feasible.
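The danger of choosing among many hypotheses can be simulated with the coin analogy. A sketch with made-up numbers (M = 150 fair coins, N = 5 tosses each, not from the lecture): each coin is a "hypothesis" with $E_{out} = 0.5$, yet with many coins it is almost certain that some coin shows all heads, i.e. achieves $E_{in} = 0$ by luck. This is why the single-h Hoeffding bound must be loosened to a union bound, $\mathbb{P}[\exists h: |E_{in}(h)-E_{out}(h)|>\epsilon] \leq 2M\exp(-2\epsilon^2 N)$, over the finite hypothesis set.

```python
import random

def prob_some_all_heads(M=150, N=5, trials=2000, seed=2):
    """Flip M fair coins N times each, repeated over many trials;
    estimate the probability that at least one coin lands all heads,
    i.e. some 'hypothesis' gets E_in = 0 although every E_out = 0.5."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if any(all(rng.random() < 0.5 for _ in range(N)) for _ in range(M)):
            hits += 1
    return hits / trials

print(prob_some_all_heads())  # near 1 - (31/32)**150, roughly 0.99
```

So a small ${E}_{in}(g)$ alone is not enough; N must be large relative to the number of hypotheses M for the selected g to generalize.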