20 Matching Annotations
  1. Dec 2019
    1. Practical highlights in my opinion:

      • It's important to know about data padding in PG.
      • Be conscious when modelling data tables about columns ordering, but don't be pure-school and do it in a best-effort basis.
      • Gains up to 25% in wasted storage are impressive but always keep in mind the scope of the system. For me, gains are not worth it in the short-term. Whenever a system grows, it is possible to migrate data to more storage-efficient tables but mind the operative burder.

      Here follows my own commands on trying the article points. I added - pg_column_size(row()) on each projection to have clear absolute sizes.

      -- How does row function work?
      
      SELECT pg_column_size(row()) AS empty,
             pg_column_size(row(0::SMALLINT)) AS byte2,
             pg_column_size(row(0::BIGINT)) AS byte8,
             pg_column_size(row(0::SMALLINT, 0::BIGINT)) AS byte16,
             pg_column_size(row(''::TEXT)) AS text0,
             pg_column_size(row('hola'::TEXT)) AS text4,
             0 AS term
      ;
      
      -- My own take on that
      
      SELECT pg_column_size(row()) AS empty,
             pg_column_size(row(uuid_generate_v4())) AS uuid_type,
             pg_column_size(row('hola mundo'::TEXT)) AS text_type,
             pg_column_size(row(uuid_generate_v4(), 'hola mundo'::TEXT)) AS uuid_text_type,
             pg_column_size(row('hola mundo'::TEXT, uuid_generate_v4())) AS text_uuid_type,
             0 AS term
      ;
      
      CREATE TABLE user_order (
        is_shipped    BOOLEAN NOT NULL DEFAULT false,
        user_id       BIGINT NOT NULL,
        order_total   NUMERIC NOT NULL,
        order_dt      TIMESTAMPTZ NOT NULL,
        order_type    SMALLINT NOT NULL,
        ship_dt       TIMESTAMPTZ,
        item_ct       INT NOT NULL,
        ship_cost     NUMERIC,
        receive_dt    TIMESTAMPTZ,
        tracking_cd   TEXT,
        id            BIGSERIAL PRIMARY KEY NOT NULL
      );
      
      SELECT a.attname, t.typname, t.typalign, t.typlen
        FROM pg_class c
        JOIN pg_attribute a ON (a.attrelid = c.oid)
        JOIN pg_type t ON (t.oid = a.atttypid)
       WHERE c.relname = 'user_order'
         AND a.attnum >= 0
       ORDER BY a.attnum;
      
      -- What is it about pg_class, pg_attribute and pg_type tables? For future investigation.
      
      -- SELECT sum(t.typlen)
      -- SELECT t.typlen
      SELECT a.attname, t.typname, t.typalign, t.typlen
        FROM pg_class c
        JOIN pg_attribute a ON (a.attrelid = c.oid)
        JOIN pg_type t ON (t.oid = a.atttypid)
       WHERE c.relname = 'user_order'
         AND a.attnum >= 0
       ORDER BY a.attnum
      ;
      
      -- Whoa! I need to master mocking data directly into db.
      
      INSERT INTO user_order (
          is_shipped, user_id, order_total, order_dt, order_type,
          ship_dt, item_ct, ship_cost, receive_dt, tracking_cd
      )
      SELECT true, 1000, 500.00, now() - INTERVAL '7 days',
             3, now() - INTERVAL '5 days', 10, 4.99,
             now() - INTERVAL '3 days', 'X5901324123479RROIENSTBKCV4'
        FROM generate_series(1, 1000000);
      
      -- New item to learn, pg_relation_size. 
      
      SELECT pg_relation_size('user_order') AS size_bytes,
             pg_size_pretty(pg_relation_size('user_order')) AS size_pretty;
      
      SELECT * FROM user_order LIMIT 1;
      
      SELECT pg_column_size(row(0::NUMERIC)) - pg_column_size(row()) AS zero_num,
             pg_column_size(row(1::NUMERIC)) - pg_column_size(row()) AS one_num,
             pg_column_size(row(9.9::NUMERIC)) - pg_column_size(row()) AS nine_point_nine_num,
             pg_column_size(row(1::INT2)) - pg_column_size(row()) AS int2,
             pg_column_size(row(1::INT4)) - pg_column_size(row()) AS int4,
             pg_column_size(row(1::INT2, 1::NUMERIC)) - pg_column_size(row()) AS int2_one_num,
             pg_column_size(row(1::INT4, 1::NUMERIC)) - pg_column_size(row()) AS int4_one_num,
             pg_column_size(row(1::NUMERIC, 1::INT4)) - pg_column_size(row()) AS one_num_int4,
             0 AS term
      ;
      
      SELECT pg_column_size(row(''::TEXT)) - pg_column_size(row()) AS empty_text,
             pg_column_size(row('a'::TEXT)) - pg_column_size(row()) AS len1_text,
             pg_column_size(row('abcd'::TEXT)) - pg_column_size(row()) AS len4_text,
             pg_column_size(row('abcde'::TEXT)) - pg_column_size(row()) AS len5_text,
             pg_column_size(row('abcdefgh'::TEXT)) - pg_column_size(row()) AS len8_text,
             pg_column_size(row('abcdefghi'::TEXT)) - pg_column_size(row()) AS len9_text,
             0 AS term
      ;
      
      SELECT pg_column_size(row(''::TEXT, 1::INT4)) - pg_column_size(row()) AS empty_text_int4,
             pg_column_size(row('a'::TEXT, 1::INT4)) - pg_column_size(row()) AS len1_text_int4,
             pg_column_size(row('abcd'::TEXT, 1::INT4)) - pg_column_size(row()) AS len4_text_int4,
             pg_column_size(row('abcde'::TEXT, 1::INT4)) - pg_column_size(row()) AS len5_text_int4,
             pg_column_size(row('abcdefgh'::TEXT, 1::INT4)) - pg_column_size(row()) AS len8_text_int4,
             pg_column_size(row('abcdefghi'::TEXT, 1::INT4)) - pg_column_size(row()) AS len9_text_int4,
             0 AS term
      ;
      
      SELECT pg_column_size(row(1::INT4, ''::TEXT)) - pg_column_size(row()) AS int4_empty_text,
             pg_column_size(row(1::INT4, 'a'::TEXT)) - pg_column_size(row()) AS int4_len1_text,
             pg_column_size(row(1::INT4, 'abcd'::TEXT)) - pg_column_size(row()) AS int4_len4_text,
             pg_column_size(row(1::INT4, 'abcde'::TEXT)) - pg_column_size(row()) AS int4_len5_text,
             pg_column_size(row(1::INT4, 'abcdefgh'::TEXT)) - pg_column_size(row()) AS int4_len8_text,
             pg_column_size(row(1::INT4, 'abcdefghi'::TEXT)) - pg_column_size(row()) AS int4_len9_text,
             0 AS term
      ;
      
      SELECT pg_column_size(row()) - pg_column_size(row()) AS empty_row,
             pg_column_size(row(''::TEXT)) - pg_column_size(row()) AS no_text,
             pg_column_size(row('a'::TEXT)) - pg_column_size(row()) AS min_text,
             pg_column_size(row(1::INT4, 'a'::TEXT)) - pg_column_size(row()) AS two_col,
             pg_column_size(row('a'::TEXT, 1::INT4)) - pg_column_size(row()) AS round4;
      
      SELECT pg_column_size(row()) - pg_column_size(row()) AS empty_row,
             pg_column_size(row(1::SMALLINT)) - pg_column_size(row()) AS int2,
             pg_column_size(row(1::INT)) - pg_column_size(row()) AS int4,
             pg_column_size(row(1::BIGINT)) - pg_column_size(row()) AS int8,
             pg_column_size(row(1::SMALLINT, 1::BIGINT)) - pg_column_size(row()) AS padded,
             pg_column_size(row(1::INT, 1::INT, 1::BIGINT)) - pg_column_size(row()) AS not_padded;
      
      SELECT a.attname, t.typname, t.typalign, t.typlen
        FROM pg_class c
        JOIN pg_attribute a ON (a.attrelid = c.oid)
        JOIN pg_type t ON (t.oid = a.atttypid)
       WHERE c.relname = 'user_order'
         AND a.attnum >= 0
       ORDER BY t.typlen DESC;
      
      DROP TABLE user_order;
      
      CREATE TABLE user_order (
        id            BIGSERIAL PRIMARY KEY NOT NULL,
        user_id       BIGINT NOT NULL,
        order_dt      TIMESTAMPTZ NOT NULL,
        ship_dt       TIMESTAMPTZ,
        receive_dt    TIMESTAMPTZ,
        item_ct       INT NOT NULL,
        order_type    SMALLINT NOT NULL,
        is_shipped    BOOLEAN NOT NULL DEFAULT false,
        order_total   NUMERIC NOT NULL,
        ship_cost     NUMERIC,
        tracking_cd   TEXT
      );
      
      -- And, what about other varying size types as JSONB?
      
      SELECT pg_column_size(row('{}'::JSONB)) - pg_column_size(row()) AS empty_jsonb,
             pg_column_size(row('{}'::JSONB, 0::INT4)) - pg_column_size(row()) AS empty_jsonb_int4,
             pg_column_size(row(0::INT4, '{}'::JSONB)) - pg_column_size(row()) AS int4_empty_jsonb,
             pg_column_size(row('{"a": 1}'::JSONB)) - pg_column_size(row()) AS basic_jsonb,
             pg_column_size(row('{"a": 1}'::JSONB, 0::INT4)) - pg_column_size(row()) AS basic_jsonb_int4,
             pg_column_size(row(0::INT4, '{"a": 1}'::JSONB)) - pg_column_size(row()) AS int4_basic_jsonb,
             0 AS term;
      
  2. Aug 2019
    1. Centric web solution is the renowned best web development company.

      We have a very highly experienced and expert development team who are experts in web design & development.

      We provide various services like Wordpress web development, eCommerce web development, Wordpress theme design and plugin development, website maintenance & optimization.

      For more our services call us on +91 98587 58541 or visit our website https://www.centricwebsolution.com/.

      Our Services Are:-

      • Web Design & Development
      • WordPress Development
      • WooCommerce Development
      • Custom Web Application Development
      • Website Migration Services
      • Website Maintenance & Performance optimization
      • Website Plugin and API Development
      • Website Store Optimization
      • PHP Web Development
      • Enterprise Web Application Development
      • Laravel Development Services
      • Angularjs Development Services

  3. Jul 2019
  4. Jan 2019
    1. Optimization Models for Machine Learning: A Survey

      感觉此文于我而言真正有价值的恐怕只有文末附录的 Dataset tables 汇总整理了。。。。。

  5. Nov 2018
    1. Learning with Random Learning Rates

      作者提出了一种新的Alrao优化算法,让网络中每个 unit 或 feature 都各自从不同级别的随机分布中采样获得其自己的学习率。该算法没有额外计算损耗,可以更快速达到理想 lr 下的SGD性能,用来测试 DL 模型很棒!

    2. On the loss landscape of a class of deep neural networks with no bad local valleys

      文章声称的全局最小训练,事实上主要基于一个比较特殊的人工神经网络的结构,用了各种连接到 output 的 skip connection,还有几个额外的assumptions 作为理论保证。

    3. Revisiting Small Batch Training for Deep Neural Networks

      这篇文章简而言之就是mini-batch sizes取得尽可能小一些可能比较好。自己瞅了一眼正在写的 paper,这不禁让我小肝微微一颤,心想:还是下次再把 batch-size 取得小一点吧。。。[挖鼻] ​​​​

    4. Don't Use Large Mini-Batches, Use Local SGD

      最近(2018/8)在听数学与系统科学的非凸最优化进展时候,李博士就讲过:现在其实不太欣赏变 learning rate 了,反而逐步从 SGD 到 MGD 再到 GD 的方式,提高 batch-size 会有更好的优化效果!

    5. Accelerating Natural Gradient with Higher-Order Invariance

      每次看到研究梯度优化理论的 paper,都感觉到无比的神奇!厉害到爆表。。。。

    6. Backprop Evolution

      这似乎是说反向传播的算法,在函数结构本身上,就还有很大的优化空间。作者在一些初等函数和常见矩阵操作基础上探索了一些操作搭配,发现效能轻易的就优于传统的反向传播算法。

      不禁启发:我们为什么不能也用网络去拟合优化梯度更新函数呢?

    7. Gradient Descent Finds Global Minima of Deep Neural Networks

      全篇的数学理论证明:深度过参网络可以训练到0。(仅 train loss,非 test loss)+(GD,非 SGD)

      烧脑!CMU、北大等合著论文真的找到了神经网络的全局最优解

    8. A Convergence Theory for Deep Learning via Over-Parameterization

      又一个全篇的数学理论证明,但是没找到 conclusion 到底是啥,唯一接近的是 remark 的信息,但内容也都并不惊奇。不过倒是一个不错的材料,若作为熟悉DNN背后的数学描述的话。

  6. Feb 2018
  7. Dec 2017
  8. Jun 2017
    1. @article{ben2002robust, title={Robust optimization--methodology and applications}, author={Ben-Tal, Aharon and Nemirovski, Arkadi}, journal={Mathematical Programming}, volume={92}, number={3}, pages={453--480}, year={2002}, publisher={Springer} }

    Tags

    Annotators

  9. Feb 2017
  10. Jul 2016
  11. Apr 2016
    1. While there are assets that have not been assigned to a cluster If only one asset remaining then Add a new cluster Only member is the remaining asset Else Find the asset with the Highest Average Correlation (HC) to all assets not yet been assigned to a Cluster Find the asset with the Lowest Average Correlation (LC) to all assets not yet assigned to a Cluster If Correlation between HC and LC > Threshold Add a new Cluster made of HC and LC Add to Cluster all other assets that have yet been assigned to a Cluster and have an Average Correlation to HC and LC > Threshold Else Add a Cluster made of HC Add to Cluster all other assets that have yet been assigned to a Cluster and have a Correlation to HC > Threshold Add a Cluster made of LC Add to Cluster all other assets that have yet been assigned to a Cluster and have Correlation to LC > Threshold End if End if End While

      Fast Threshold Clustering Algorithm

      Looking for equivalent source code to apply in smart content delivery and wireless network optimisation such as Ant Mesh via @KirkDBorne's status https://twitter.com/KirkDBorne/status/479216775410626560 http://cssanalytics.wordpress.com/2013/11/26/fast-threshold-clustering-algorithm-ftca/

    1. Effect of step size. The gradient tells us the direction in which the function has the steepest rate of increase, but it does not tell us how far along this direction we should step.

      That's the reason why step size is an important factor in optimization algorithm. Too small step can cause the algorithm longer to converge. Too large step can cause that we change the parameters too much thus overstepped the optima.

  12. Jan 2015
    1. There are other ways of performing the optimization (e.g. LBFGS), but Gradient Descent is currently by far the most common and established way of optimizing Neural Network loss functions.

      Are there any studies that compare different pros and cons of the optimization procedures with respect to some specific NN architectures (e.g., classical LeNets)?