15 Matching Annotations
  1. Jul 2016
    1. With blending, instead of creating out-of-fold predictions for the train set, you create a small holdout set of say 10% of the train set. The stacker model then trains on this holdout set only.

      What? Re-read this.
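The quoted blending scheme can be sketched with toy data and toy models (everything below, the data, the two "models", and the averaging stacker, is assumed for illustration, not the article's code): the base models fit on 90% of the train set, and the stacker learns only from their predictions on the 10% holdout.

```python
# Minimal blending sketch: base models train on 90%, the stacker model
# trains ONLY on the 10% holdout, using base-model predictions as features.
import random

random.seed(0)
# toy data: y = 2*x + noise (assumed for illustration)
X = [[random.random()] for _ in range(100)]
y = [2 * x[0] + random.gauss(0, 0.1) for x in X]

split = int(0.9 * len(X))
X_base, y_base = X[:split], y[:split]   # the 90% the base models see
X_hold, y_hold = X[split:], y[split:]   # the 10% holdout for the stacker

def fit_slope(X, y):
    """Toy base model: least-squares slope through the origin."""
    num = sum(x[0] * t for x, t in zip(X, y))
    den = sum(x[0] ** 2 for x in X)
    slope = num / den
    return lambda x: slope * x[0]

model_a = fit_slope(X_base, y_base)
model_b = lambda x: sum(y_base) / len(y_base)   # second toy model: global mean

# Holdout predictions become the stacker's training features.
meta_features = [[model_a(x), model_b(x)] for x in X_hold]
# A real stacker would be another learner; here it just averages the two.
stack = lambda feats: sum(feats) / len(feats)
preds = [stack(f) for f in meta_features]
```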

  2. May 2016
    1. most solutions did not provide a way to rerun pipelines with different inputs, mechanisms to explicitly capture outputs and/or side effects, visualization of outputs, and conditional steps for tasks like parameter sweeps.

      Not clear; read again.

    1. In fact, we'll allow each neuron in this layer to learn from all 20×5×5 input neurons in its local receptive field.

      Better language could be used here.

    2. We can think of max-pooling as a way for the network to ask whether a given feature is found anywhere in a region of the image. It then throws away the exact positional information.

      Didn't get this; can anyone explain?
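One way to see it: max-pooling reports only whether the feature fired somewhere in each region, not where. A toy 2×2 max-pool over a hypothetical feature map makes the discarded positional information explicit:

```python
# Toy 2x2 max-pooling: each output answers "was the feature found anywhere
# in this 2x2 region?" and throws away where exactly inside it.
feature_map = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
    [0, 0, 0, 0],
]

def max_pool_2x2(fm):
    out = []
    for i in range(0, len(fm), 2):
        row = []
        for j in range(0, len(fm[0]), 2):
            row.append(max(fm[i][j], fm[i][j + 1],
                           fm[i + 1][j], fm[i + 1][j + 1]))
        out.append(row)
    return out

# The 1 and the 5 survive, but their exact positions within each
# 2x2 region are gone.
pooled = max_pool_2x2(feature_map)   # [[1, 0], [0, 5]]
```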

    3. This means that all the neurons in the first hidden layer detect exactly the same feature* *I haven't precisely defined the notion of a feature. Informally, think of the feature detected by a hidden neuron as the kind of input pattern that will cause the neuron to activate: it might be an edge in the image, for instance, or maybe some other type of shape. , just at different locations in the input image.

      Because of the same (shared) weights and biases!
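The note's answer is right. A 1-D toy version (hypothetical image and kernel) shows that, because every position reuses the same weights and bias, the same input pattern produces the same activation wherever it appears:

```python
# Shared weights: every hidden neuron uses the SAME kernel and bias, so a
# given pattern triggers the same response at any location in the input.
image = [0, 0, 1, 2, 0, 0, 1, 2, 0]   # the pattern [1, 2] appears twice
kernel = [1.0, 2.0]                    # one shared weight vector
bias = 0.0

def conv1d(img, k, b):
    width = len(k)
    return [sum(k[j] * img[i + j] for j in range(width)) + b
            for i in range(len(img) - width + 1)]

responses = conv1d(image, kernel, bias)
# The neurons at positions 2 and 6 see the same patch [1, 2] and therefore
# produce identical activations: 1*1 + 2*2 = 5.
```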

    1. A related heuristic explanation for dropout is given in one of the earliest papers to use the technique* *ImageNet Classification with Deep Convolutional Neural Networks, by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton (2012).: "This technique reduces complex co-adaptations of neurons, since a neuron cannot rely on the presence of particular other neurons. It is, therefore, forced to learn more robust features that are useful in conjunction with many different random subsets of the other neurons." In other words, if we think of our network as a model which is making predictions, then we can think of dropout as a way of making sure that the model is robust to the loss of any individual piece of evidence. In this, it's somewhat similar to L1 and L2 regularization, which tend to reduce weights, and thus make the network more robust to losing any individual connection in the network.

      Didn't get it; go over it again.
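A minimal inverted-dropout sketch (drop probability p = 0.5 assumed) may help: each training pass randomly silences neurons, so no downstream neuron can rely on any particular input always being present.

```python
# Inverted dropout: during training, zero each activation with probability p
# and scale survivors by 1/(1-p) so the expected value is unchanged.
import random

random.seed(42)
p = 0.5  # drop probability (assumed)

def dropout(activations, p, training=True):
    if not training:
        return list(activations)      # test time: the full network is used
    out = []
    for a in activations:
        if random.random() < p:
            out.append(0.0)           # this neuron is "dropped" this pass
        else:
            out.append(a / (1 - p))   # scale up to preserve the expectation
    return out

h = [0.2, 0.9, 0.5, 0.7]
train_pass = dropout(h, p)                 # some entries zeroed, rest doubled
test_pass = dropout(h, p, training=False)  # unchanged
```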

    2. The net result is that L1 regularization tends to concentrate the weight of the network in a relatively small number of high-importance connections, while the other weights are driven toward zero.

      Didn't get it. Why?
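The "why": the L1 update subtracts a constant amount η·λ·sgn(w) from every weight, while L2 shrinks each weight by a fraction of its size. A constant step wipes out small weights entirely but barely dents large ones, so the weight concentrates in a few large connections. Toy numbers (η = 0.1, λ = 1 assumed):

```python
# L1 vs L2 shrinkage on one weight per regularization step.
def l1_step(w, eta=0.1, lam=1.0):
    step = eta * lam                # constant shrink, independent of |w|
    if abs(w) <= step:
        return 0.0                  # small weights are driven exactly to zero
    return w - step if w > 0 else w + step

def l2_step(w, eta=0.1, lam=1.0):
    return w * (1 - eta * lam)      # proportional shrink: never reaches zero

small, large = 0.05, 5.0
# L1: the small weight dies, the large one loses only 2% of its size.
a, b = l1_step(small), l1_step(large)   # 0.0 and ~4.9
# L2: both shrink by the same 10% fraction, so neither hits zero.
c, d = l2_step(small), l2_step(large)   # ~0.045 and ~4.5
```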

    3. Now, there are ten points in the graph above, which means we can find a unique 9th-order polynomial y = a_0 x^9 + a_1 x^8 + … + a_9 which fits the data exactly.

      Why do 10 points imply a 9th-order polynomial?!
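It's 10 coefficients, not 9: a 9th-order polynomial has coefficients a_0 through a_9, so 10 points give 10 linear equations in 10 unknowns, which generically has exactly one solution. The same logic at a smaller scale: 3 points determine a unique quadratic (toy points assumed).

```python
# 3 points pin down the 3 coefficients of a quadratic, exactly as 10 points
# pin down the 10 coefficients of a 9th-order polynomial.
from fractions import Fraction

points = [(0, 1), (1, 3), (2, 7)]   # these happen to lie on y = x^2 + x + 1

# Build the 3x3 Vandermonde system: a2*x^2 + a1*x + a0 = y for each point.
A = [[Fraction(x) ** 2, Fraction(x), Fraction(1)] for x, _ in points]
b = [Fraction(y) for _, y in points]

def solve(A, b):
    """Exact Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [v - M[r][col] * w for v, w in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

a2, a1, a0 = solve(A, b)   # the unique quadratic through the 3 points
```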

  3. Apr 2016
    1. To continue the recurrence and to chain the gradient, the add gate takes that gradient and multiplies it to all of the local gradients for its inputs (making the gradient on both x and y 1 * -4 = -4). Notice that this has the desired effect: If x,y were to decrease (responding to their negative gradient) then the add gate's output would decrease, which in turn makes the multiply gate's output increase.

      not clear
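This reads like the cs231n circuit f = (x + y) * z with x = -2, y = 5, z = -4 (the values are assumed from the -4 in the quote). Stepping through it makes the "1 * -4 = -4" concrete:

```python
# f = (x + y) * z, backpropagated gate by gate.
x, y, z = -2.0, 5.0, -4.0

# forward pass
q = x + y          # add gate output: 3
f = q * z          # multiply gate output: -12

# backward pass (chain rule)
df_dq = z          # multiply gate: d(q*z)/dq = z = -4
df_dz = q          # multiply gate: d(q*z)/dz = q = 3
dq_dx = 1.0        # the add gate's local gradient on each input is 1
dq_dy = 1.0
df_dx = df_dq * dq_dx   # 1 * -4 = -4
df_dy = df_dq * dq_dy   # 1 * -4 = -4
# The sign makes sense: decreasing x or y decreases q, and since z is
# negative, a smaller q makes f = q*z LARGER.
```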

    1. You may have noticed that evaluating the numerical gradient has complexity linear in the number of parameters. In our example we had 30730 parameters in total and therefore had to perform 30,731 evaluations of the loss function to evaluate the gradient and to perform only a single parameter update

      Didn't understand, can someone help?!!
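The count works out to n + 1: one baseline loss evaluation plus one perturbed evaluation per parameter, so 30,730 parameters cost 30,731 loss calls for a single gradient (and hence a single update). A toy version (quadratic loss and 100 parameters assumed) shows the linear scaling:

```python
# Numerical gradient: evaluate the loss once at w, then once per parameter
# at w + h*e_i, so n parameters need n + 1 loss evaluations total.
calls = 0

def loss(w):
    """Toy quadratic loss; the counter tracks how often it's evaluated."""
    global calls
    calls += 1
    return sum(v * v for v in w)

def numerical_gradient(w, h=1e-5):
    base = loss(w)                   # 1 evaluation
    grad = []
    for i in range(len(w)):          # n more evaluations, one per parameter
        w_step = list(w)
        w_step[i] += h
        grad.append((loss(w_step) - base) / h)
    return grad

w = [0.5] * 100                      # pretend we have 100 parameters
g = numerical_gradient(w)
# calls is now 101 = n + 1: linear in the number of parameters, which is
# why numerical gradients are hopeless for networks with ~30k parameters.
```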