5 Matching Annotations
  1. Apr 2018
    1. To improve the system's availability and response time, the application must not block for long on any single event source and thereby stop handling other events, as that would severely degrade responsiveness to clients. To improve throughput, any unnecessary context switching, synchronization, and data movement between CPUs should be avoided. Introducing a new service, or improving an existing one, should require as little change as possible to the existing event demultiplexing and dispatching mechanisms. Large amounts of application code should be shielded from the complexity of multithreading and synchronization mechanisms.

      To achieve this, we need:

      1. non-blocking I/O
      2. an event loop
      3. fine-grained tasks (handling any single event never occupies the CPU for long)
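      The three requirements above can be sketched as a minimal single-threaded event loop. This is an illustrative sketch, not code from the source; the names `EventLoop` and `call_soon` are assumptions (loosely echoing asyncio's vocabulary). Each callback is a fine-grained task that returns quickly, so no handler blocks the others.

```python
from collections import deque

class EventLoop:
    """Minimal single-threaded event loop (illustrative sketch).

    Handlers never block: each callback is a short, fine-grained task,
    so processing one event cannot starve the others.
    """

    def __init__(self):
        self._ready = deque()  # FIFO queue of tasks ready to run

    def call_soon(self, callback, *args):
        # Register a fine-grained task; it will run on the next loop turn.
        self._ready.append((callback, args))

    def run(self):
        # Dispatch until no ready task remains. Because every callback
        # returns quickly, the loop stays responsive to all event sources.
        while self._ready:
            callback, args = self._ready.popleft()
            callback(*args)

log = []
loop = EventLoop()
loop.call_soon(log.append, "handle client A")
loop.call_soon(log.append, "handle client B")
loop.run()
```

      In a real reactor, `call_soon` would be fed by a non-blocking demultiplexer (e.g. `select`/`epoll`) rather than called by hand.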
  2. Oct 2017
    1. And so on, until we've exhausted the training inputs, which is said to complete an epoch of training. At that point we start over with a new training epoch.

      For each epoch, first shuffle the whole training set, then split it into non-overlapping mini-batches and train on each of them in turn.
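      The shuffle-then-split step can be sketched as follows. The helper name `minibatches` is an illustrative assumption, not from the source; it yields non-overlapping batches that together cover the whole training set once per epoch.

```python
import numpy as np

def minibatches(X, y, batch_size, rng):
    """One epoch: shuffle the whole set, then yield non-overlapping mini-batches."""
    idx = rng.permutation(len(X))          # shuffle once per epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]  # disjoint index slices
        yield X[batch], y[batch]

rng = np.random.default_rng(0)
X = np.arange(10).reshape(10, 1)
y = np.arange(10)
batches = list(minibatches(X, y, batch_size=3, rng=rng))
```

      Note the last batch may be smaller when the set size is not a multiple of the batch size; every example is still visited exactly once per epoch.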

    2. But it seems safe to say that at least in this case we'd conclude that the input was a 0.

      But we could also form non-zero shapes by changing the relative positions of these four sub-images.

    1. The intuition behind the backpropagation algorithm is as follows. Given a training example $(x,y)$, we will first run a “forward pass” to compute all the activations throughout the network, including the output value of the hypothesis $h_{W,b}(x)$. Then, for each node $i$ in layer $l$, we would like to compute an “error term” $\delta^{(l)}_i$ that measures how much that node was “responsible” for any errors in our output. For an output node, we can directly measure the difference between the network’s activation and the true target value, and use that to define $\delta^{(n_l)}_i$ (where layer $n_l$ is the output layer). How about hidden units? For those, we will compute $\delta^{(l)}_i$ based on a weighted average of the error terms of the nodes that use $a^{(l)}_i$ as an input. In detail, here is the backpropagation algorithm

      The backpropagation algorithm computes the partial derivative of the cost with respect to every parameter $W_{ij}$. The partials for the output layer are easy to compute, because those parameters enter only the final hypothesis computation (each appears on a single link). The partials for the internal parameters are harder: because of forward propagation, there are multiple links between the connection a parameter defines and the final output. Since the output-layer partials are easy, is there a way to work backwards, using the known results of layer $l$ to compute those of layer $l-1$? It is a bit like assigning blame: when the expected task is not completed, the blame (the error) is apportioned downward layer by layer according to each node's share of responsibility, and converting each node's final share of the blame into "overtime hours" gives us the partial derivative.
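      That blame-passing intuition can be sketched for a two-layer sigmoid network with squared-error cost. This is an illustrative NumPy sketch under assumed names (`forward_backward`, the layer sizes), not the annotated tutorial's code: the output error term is computed directly, then pushed backwards through $W_2$ to apportion blame to the hidden layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_backward(W1, W2, x, y):
    """One forward pass, then error terms propagated backwards.

    Cost: 0.5 * ||a3 - y||^2 for a network x -> sigmoid(W1 x) -> sigmoid(W2 a2).
    Returns the partial derivatives of the cost w.r.t. W1 and W2.
    """
    # Forward pass: compute all activations.
    z2 = W1 @ x
    a2 = sigmoid(z2)
    z3 = W2 @ a2
    a3 = sigmoid(z3)                      # hypothesis output

    # Output-layer error term: measured directly from (a3 - y).
    delta3 = (a3 - y) * a3 * (1 - a3)
    # Hidden-layer error term: downstream blame, weighted by the links in W2.
    delta2 = (W2.T @ delta3) * a2 * (1 - a2)

    # Each node's share of the blame converts into a weight gradient.
    grad_W2 = np.outer(delta3, a2)
    grad_W1 = np.outer(delta2, x)
    return grad_W1, grad_W2
```

      A numerical gradient check (perturb one weight, re-evaluate the cost) is the standard way to confirm these error terms really are the partial derivatives.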