6 Matching Annotations
  1. Oct 2019
    1. gram matrix must be normalized by dividing each element by the total number of elements in the matrix.

      True: after downsampling, the gradients in later layers get smaller.
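      The normalization mentioned in the highlight can be sketched as follows. This is a minimal NumPy sketch, not any particular implementation; the (channels, height, width) layout and the exact divisor are assumptions, since style-transfer implementations variously divide by the feature-map size, the Gram size, or both:

      ```python
      import numpy as np

      def gram_matrix(features):
          """Gram matrix of one layer's activations, normalized by element count."""
          c, h, w = features.shape            # channels, height, width (assumed layout)
          f = features.reshape(c, h * w)      # flatten the spatial dimensions
          gram = f @ f.T                      # (c, c) channel-correlation matrix
          return gram / (c * h * w)           # divide by total number of elements

      feats = np.random.rand(8, 4, 4)         # toy activation map
      g = gram_matrix(feats)                  # (8, 8), symmetric
      ```

      The divisor keeps the Gram entries on a comparable scale across layers with different spatial resolutions, which matters when style losses from several layers are summed.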

  2. Mar 2019
    1. This is one of many discussions of Kirkpatrick's four levels of evaluation. More of the page is taken up with decoration and graphics than needs to be, but it is included in this list because it offers a printable guide and shows the hierarchy of the four levels clearly. The text is printed in black on a white background and presented as a bulleted list (the bullets are not organized as well as they could be). Nonetheless, it is a usable presentation of this model. Rating: 3/5

  3. Feb 2019
    1. Transfusion: Understanding Transfer Learning with Applications to Medical Imaging

      Parameter-based transfer learning has little effect on final performance, though training is of course faster. Interestingly, transferring only the mean/variance statistics of the trained model's parameters also yields decent transfer results.
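      The statistics-only transfer noted above can be sketched like this. This is a hypothetical NumPy sketch of the idea, not the paper's code: each layer of the new model is re-initialized by sampling from a Gaussian with the pretrained layer's weight mean and standard deviation:

      ```python
      import numpy as np

      def meanvar_init(pretrained_layers, seed=0):
          """Re-initialize each layer from N(mean, std) of its pretrained weights."""
          rng = np.random.default_rng(seed)
          return [rng.normal(w.mean(), w.std(), size=w.shape) for w in pretrained_layers]

      # hypothetical pretrained two-layer model
      pretrained = [0.05 + 0.1 * np.random.randn(64, 32),
                    0.02 * np.random.randn(10, 64)]
      transferred = meanvar_init(pretrained)   # same shapes, same per-layer statistics
      ```

      The transferred model shares no individual weight values with the pretrained one, only the per-layer scale, which is what makes the result surprising.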

  4. Nov 2018
    1. An analytic theory of generalization dynamics and transfer learning in deep linear networks

      This is a theoretical paper on generalization error and transfer learning. Although I have not yet worked through the experimental details, the conclusions are significant: it proposes a new analytic theoretical method and finds that what a network learns first and relies on is the task structure (revealed via early stopping), not the network size. This also explains why real data is learned more easily than random data, and suggests that better non-gradient-descent optimization strategies may exist.

      There are also transfer experiments involving SNR, showing that transfer from high-SNR tasks to low-SNR tasks works.
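      The "task structure is learned first" point can be illustrated with a toy linear model. This is a hedged sketch, not the paper's setup: gradient descent drives the loss on targets generated by a linear teacher close to zero, while structureless random targets of the same scale leave it stuck near the noise floor:

      ```python
      import numpy as np

      rng = np.random.default_rng(0)
      d, n = 10, 200
      X = rng.normal(size=(n, d))
      y_struct = X @ rng.normal(size=d)    # targets with linear task structure
      y_rand = rng.normal(size=n)          # structureless random targets

      def gd_loss(y, steps=1000, lr=0.01):
          """Mean squared error after plain gradient descent on a linear model."""
          w = np.zeros(d)
          for _ in range(steps):
              w -= lr * X.T @ (X @ w - y) / n
          return float(np.mean((X @ w - y) ** 2))

      loss_struct = gd_loss(y_struct)      # driven close to zero
      loss_rand = gd_loss(y_rand)          # stuck near the noise floor
      ```

      With the same optimizer budget, the structured targets are essentially memorized while the random ones are not, mirroring why early stopping isolates the task-structure component.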

    2. A Survey on Deep Transfer Learning

      This not only surveys the current state of transfer learning but also categorizes it. It also introduces the concept of "deep transfer", emphasizing the nonlinear relationship between the two learning tasks being transferred. This is natural: transfer between linearly "similar" tasks is of limited interest and little research value.

    3. Training neural audio classifiers with few data

      This is a fairly preliminary, simple experiment.

      The conclusions shown in the figures are unsurprising: more data of course gives better performance; transfer learning performs well with very little data; the prototypical model may have some advantage due to its specialized architecture; and the less data there is, the worse the overfitting.
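      The prototypical model mentioned above classifies by distance to class-mean embeddings. A minimal NumPy sketch of that decision rule on toy, hypothetical pre-computed embeddings (the embedding network itself is omitted):

      ```python
      import numpy as np

      def prototype_classify(support, support_labels, queries):
          """Each class prototype is the mean of its support embeddings;
          queries get the label of the nearest prototype."""
          classes = np.unique(support_labels)
          protos = np.stack([support[support_labels == c].mean(axis=0) for c in classes])
          # Euclidean distance from every query to every prototype
          dists = np.linalg.norm(queries[:, None, :] - protos[None, :, :], axis=-1)
          return classes[np.argmin(dists, axis=1)]

      # two well-separated toy classes, 3 support embeddings each
      rng = np.random.default_rng(0)
      support = np.concatenate([rng.normal(0, 0.1, (3, 2)), rng.normal(3, 0.1, (3, 2))])
      labels = np.array([0, 0, 0, 1, 1, 1])
      queries = np.array([[0.05, -0.02], [2.9, 3.1]])
      pred = prototype_classify(support, labels, queries)
      ```

      Because the decision rule needs only a handful of support points per class, this structure is a plausible reason the prototypical model held up well in the few-data regime.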