19 Matching Annotations
  1. Aug 2019
  2. Jul 2019
  3. Jan 2019
    1. Optimization Models for Machine Learning: A Survey

      For me, the only genuinely valuable part of this paper is probably the consolidated dataset tables in the appendix at the end...

  4. Nov 2018
    1. Learning with Random Learning Rates

      The authors propose a new optimization algorithm, Alrao, in which every unit (or feature) in the network samples its own learning rate from random distributions spanning different orders of magnitude. The algorithm adds no extra computational cost and quickly approaches the performance of SGD under an ideal learning rate, which makes it great for quickly testing DL models!
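
      A minimal numpy sketch of the per-unit idea (not the authors' implementation; the layer sizes, learning-rate range, and toy data are made up for illustration): each output unit of a toy linear layer draws its own fixed learning rate log-uniformly from a wide range and keeps it throughout training. The paper also treats the output layer specially, which this sketch ignores.

        import numpy as np

        rng = np.random.default_rng(0)

        # Toy linear layer: 4 inputs -> 3 output units, trained on random data.
        n_in, n_out = 4, 3
        W = rng.normal(scale=0.1, size=(n_out, n_in))

        # Alrao-style: each output unit draws its own fixed learning rate,
        # log-uniformly from a deliberately wide range (1e-4 .. 1e0, arbitrary here).
        unit_lr = np.exp(rng.uniform(np.log(1e-4), np.log(1e0), size=n_out))

        X = rng.normal(size=(64, n_in))
        Y = X @ rng.normal(size=(n_in, n_out))    # toy regression targets

        for step in range(300):
            err = X @ W.T - Y                     # (64, n_out) residuals
            grad = err.T @ X / len(X)             # (n_out, n_in) mean-squared-error gradient
            W -= unit_lr[:, None] * grad          # per-unit (per-row) learning rates

        print("per-unit learning rates:", np.round(unit_lr, 5))
        print("final MSE:", float(np.mean((X @ W.T - Y) ** 2)))

      Units that happen to draw a tiny rate barely move, which the method tolerates because the units that drew reasonable rates carry the model.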

    2. On the loss landscape of a class of deep neural networks with no bad local valleys

      The global-minimum training that the paper claims in fact rests mainly on a rather special artificial network architecture, with various skip connections feeding into the output, plus a few additional assumptions as theoretical guarantees.

    3. Revisiting Small Batch Training for Deep Neural Networks

      In short, this paper says that taking mini-batch sizes as small as possible is probably better. Glancing at the paper I am writing right now, I couldn't help but wince a little and think: fine, I'll pick a smaller batch size next time...

    4. Don't Use Large Mini-Batches, Use Local SGD

      Recently (Aug 2018), while listening to a talk on advances in non-convex optimization at the Academy of Mathematics and Systems Science, Dr. Li said that varying the learning rate is no longer really favored; instead, moving step by step from SGD to mini-batch GD and then to full GD, i.e. increasing the batch size, gives better optimization results!
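
      A rough numpy sketch of the local-SGD pattern named in the title (not the paper's code; the quadratic toy problem, worker count, step counts, and learning rate are all arbitrary, and the workers are simulated sequentially): each worker runs several local SGD steps from the shared model, and only then are the models averaged.

        import numpy as np

        rng = np.random.default_rng(0)

        # Shared toy least-squares problem.
        d = 10
        w_true = rng.normal(size=d)
        X = rng.normal(size=(1024, d))
        y = X @ w_true

        def local_sgd(w, n_steps, batch=16, lr=0.05):
            """Run n_steps of plain SGD from w on random mini-batches."""
            w = w.copy()
            for _ in range(n_steps):
                idx = rng.integers(0, len(X), size=batch)
                g = X[idx].T @ (X[idx] @ w - y[idx]) / batch
                w -= lr * g
            return w

        # Local SGD: each worker takes H local steps, then the models are averaged.
        n_workers, H, n_rounds = 4, 8, 50
        w = np.zeros(d)
        for _ in range(n_rounds):
            local_models = [local_sgd(w, H) for _ in range(n_workers)]
            w = np.mean(local_models, axis=0)     # the only communication step per round

        print("distance to optimum:", float(np.linalg.norm(w - w_true)))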

    5. Accelerating Natural Gradient with Higher-Order Invariance

      Every time I see a paper on the theory of gradient-based optimization, I find it utterly amazing. Impressive beyond measure...

    6. Backprop Evolution

      This seems to say that the backpropagation algorithm still has plenty of room for improvement in its functional form itself. Starting from a set of elementary functions and common matrix operations, the authors explored various combinations of operations and found ones whose performance easily beats conventional backpropagation.

      It makes one wonder: why couldn't we also use a network to fit and optimize the gradient-update function?
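
      A toy numpy sketch of what "changing the backward formula" can look like (this is not the paper's evolutionary search; the clipping variant, network sizes, data, and learning rate are arbitrary choices for illustration): the error signal propagated to the hidden layer is itself a small formula, so one can swap pieces of it and compare the resulting training behaviour.

        import numpy as np

        # Tiny two-layer regression net trained twice: once with the standard
        # backward pass, once with a hand-modified backward formula that clips
        # the propagated error before it reaches the first layer.
        def train(backward_fn, steps=500, lr=0.05):
            rng = np.random.default_rng(0)            # same data/init for both runs
            X = rng.normal(size=(256, 5))
            y = np.sin(X.sum(axis=1, keepdims=True))
            W1 = rng.normal(scale=0.3, size=(5, 16))
            W2 = rng.normal(scale=0.3, size=(16, 1))
            for _ in range(steps):
                H = np.maximum(X @ W1, 0.0)           # forward: ReLU hidden layer
                pred = H @ W2
                d_out = (pred - y) / len(X)           # output-layer error
                d_hid = backward_fn(d_out @ W2.T) * (H > 0)   # the piece being varied
                W2 -= lr * H.T @ d_out
                W1 -= lr * X.T @ d_hid
            return float(np.mean((np.maximum(X @ W1, 0.0) @ W2 - y) ** 2))

        print("standard backprop :", train(lambda e: e))
        print("clipped-error rule:", train(lambda e: np.clip(e, -0.01, 0.01)))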

    7. Gradient Descent Finds Global Minima of Deep Neural Networks

      The entire paper is a mathematical proof that deep over-parameterized networks can be trained to zero loss (train loss only, not test loss; and with GD, not SGD).

      Mind-bending! The paper co-authored by CMU, Peking University, and others really does find the global optimum of neural networks.

    8. A Convergence Theory for Deep Learning via Over-Parameterization

      Yet another paper of end-to-end mathematical theory and proofs, but I couldn't figure out what the conclusion actually is; the closest thing is the information in the remarks, and even that is not surprising. Still, it is decent material for getting familiar with the mathematical formalism behind DNNs.

  5. Feb 2018
  6. Dec 2017
  7. Jun 2017
    1. @article{ben2002robust,
         title     = {Robust optimization--methodology and applications},
         author    = {Ben-Tal, Aharon and Nemirovski, Arkadi},
         journal   = {Mathematical Programming},
         volume    = {92},
         number    = {3},
         pages     = {453--480},
         year      = {2002},
         publisher = {Springer}
       }

  8. Feb 2017
  9. Jul 2016
  10. Apr 2016
    1. While there are assets that have not been assigned to a cluster
         If only one asset remaining then
           Add a new cluster whose only member is the remaining asset
         Else
           Find the asset with the Highest Average Correlation (HC) to all assets not yet assigned to a cluster
           Find the asset with the Lowest Average Correlation (LC) to all assets not yet assigned to a cluster
           If Correlation between HC and LC > Threshold
             Add a new cluster made of HC and LC
             Add to the cluster all other assets not yet assigned to a cluster with an average correlation to HC and LC > Threshold
           Else
             Add a cluster made of HC
             Add to that cluster all other assets not yet assigned to a cluster with a correlation to HC > Threshold
             Add a cluster made of LC
             Add to that cluster all other assets not yet assigned to a cluster with a correlation to LC > Threshold
           End if
         End if
       End While

      Fast Threshold Clustering Algorithm

      Looking for equivalent source code to apply to smart content delivery and wireless network optimisation (e.g. Ant Mesh), via @KirkDBorne's status: https://twitter.com/KirkDBorne/status/479216775410626560 http://cssanalytics.wordpress.com/2013/11/26/fast-threshold-clustering-algorithm-ftca/
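
      Since the annotation asks for source code, here is a rough Python sketch of the pseudocode above (not the FTCA author's implementation; the default threshold and the random-returns example are assumptions), built on a pandas correlation matrix.

        import numpy as np
        import pandas as pd

        def ftca(corr: pd.DataFrame, threshold: float = 0.5):
            """Fast Threshold Clustering over a correlation matrix; returns lists of asset names."""
            unassigned = list(corr.columns)
            clusters = []
            while unassigned:
                if len(unassigned) == 1:
                    clusters.append([unassigned.pop()])
                    continue
                sub = corr.loc[unassigned, unassigned]
                # Average correlation of each asset to the other unassigned assets.
                avg = (sub.sum(axis=1) - 1.0) / (len(unassigned) - 1)
                hc, lc = avg.idxmax(), avg.idxmin()
                if corr.loc[hc, lc] > threshold:
                    members = [hc, lc] + [a for a in unassigned if a not in (hc, lc)
                                          and (corr.loc[a, hc] + corr.loc[a, lc]) / 2 > threshold]
                    clusters.append(members)
                    unassigned = [a for a in unassigned if a not in members]
                else:
                    for seed in (hc, lc):
                        members = [seed] + [a for a in unassigned if a not in (hc, lc)
                                            and corr.loc[a, seed] > threshold]
                        clusters.append(members)
                        unassigned = [a for a in unassigned if a not in members]
            return clusters

        # Example with random returns for six hypothetical assets.
        rng = np.random.default_rng(0)
        returns = pd.DataFrame(rng.normal(size=(250, 6)), columns=list("ABCDEF"))
        print(ftca(returns.corr(), threshold=0.5))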

    1. Effect of step size. The gradient tells us the direction in which the function has the steepest rate of increase, but it does not tell us how far along this direction we should step.

      That is why the step size is an important factor in an optimization algorithm. Too small a step makes the algorithm take longer to converge; too large a step changes the parameters too much and can overshoot the optimum.
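
      A tiny Python illustration of both failure modes on f(x) = x², with step sizes picked only for contrast (0.01 converges slowly, 0.4 converges quickly, 1.1 overshoots and diverges):

        # Gradient descent on f(x) = x^2, whose gradient is 2x.
        def run(lr, steps=25, x0=5.0):
            x = x0
            for _ in range(steps):
                x -= lr * 2 * x        # gradient step
            return x

        for lr in (0.01, 0.4, 1.1):
            print(f"lr={lr}: x after 25 steps = {run(lr):.4g}")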

  11. Jan 2015
    1. There are other ways of performing the optimization (e.g. LBFGS), but Gradient Descent is currently by far the most common and established way of optimizing Neural Network loss functions.

      Are there any studies that compare the pros and cons of the different optimization procedures with respect to specific NN architectures (e.g., classical LeNets)?
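
      Not a study, but a minimal scipy/numpy sketch one could start from to compare the two procedures on a toy convex problem (the data, step size, and iteration counts are arbitrary; a LeNet-scale comparison would need a proper benchmark):

        import numpy as np
        from scipy.optimize import minimize

        rng = np.random.default_rng(0)

        # Toy logistic-regression loss, optimized with plain gradient descent
        # and with L-BFGS, just to put the two procedures side by side.
        X = rng.normal(size=(200, 10))
        y = (rng.random(200) < 1 / (1 + np.exp(-X[:, 0]))).astype(float)

        def loss_and_grad(w):
            p = 1 / (1 + np.exp(-X @ w))
            loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
            grad = X.T @ (p - y) / len(X)
            return loss, grad

        # Plain gradient descent with a fixed step size.
        w = np.zeros(10)
        for _ in range(500):
            _, g = loss_and_grad(w)
            w -= 0.5 * g
        gd_loss, _ = loss_and_grad(w)

        # L-BFGS via scipy (the objective returns both loss and gradient, hence jac=True).
        res = minimize(loss_and_grad, np.zeros(10), jac=True, method="L-BFGS-B")

        print("gradient descent loss:", gd_loss)
        print("L-BFGS loss:", res.fun, "in", res.nit, "iterations")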