26 Matching Annotations
  1. May 2023
    1. Researchers may end up inadvertently exploring dead ends that their fellow scientists have already run up against, solely because the information about previous failures has never been given space in the pages of the relevant scientific publications.

      This is another rather disturbing consequence of the current publishing model.

  2. Mar 2023
    1. The first term of the objective again involves Hessian, but it is in the form of Hessian-vector products, which can be computed within O(1) backpropagations.
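
      A hedged sketch of why this is cheap (my own illustration, not from the post): a Hessian-vector product only needs two backward passes via double backprop, e.g. in PyTorch:

      import torch

      def hvp(loss_fn, params, v):
          # Hessian-vector product H @ v of loss_fn at params,
          # computed with two backprops instead of materializing H.
          loss = loss_fn(params)
          (grad,) = torch.autograd.grad(loss, params, create_graph=True)
          (hv,) = torch.autograd.grad(grad @ v, params)
          return hv

      params = torch.randn(5, requires_grad=True)
      v = torch.randn(5)
      toy_loss = lambda p: (p ** 3).sum()   # toy objective with a non-trivial Hessian
      print(hvp(toy_loss, params, v))       # equals 6 * params * v elementwise
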
    1. (Image source: Rombach & Blattmann, et al. 2022)

      Wrong link. Here's the correct one.

    2. the latent variable has high dimensionality (same as the original data).

      isn't this also the case for Flow models?

    1. This is the same gradient you would use for DeepDream-style manipulation of images.

      Great point!

      Except that DeepDream-style manipulation uses the exact gradient, while DDPMs learn to infer an approximation of this kind of gradient (the score) with respect to the data distribution that the generative model is trying to capture.
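
      For reference, the standard DDPM identity behind this (my own wording, not from the post): the learned noise predictor gives an estimate of the score of the noised data distribution,

      $$\nabla_{x_t} \log q(x_t) \approx s_\theta(x_t, t) = -\frac{\epsilon_\theta(x_t, t)}{\sqrt{1-\bar{\alpha}_t}},$$

      whereas DeepDream differentiates a fixed classifier exactly, with no learned approximation involved.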

  3. Feb 2023
    1. Note that the maximum bin magnitude after performing the FFT is related to the length of the input signal. For a four sample sine wave, the maximum is 2, and for an eight sample sine wave the maximum is 4.

      this is the principle behind the general normalization of FFT magnitudes by the signal length (so that amplitudes are comparable across input lengths)
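
      A quick check of this (my own sketch, not from the article), normalizing by N/2 so that a unit-amplitude sine always peaks at 1:

      import numpy as np

      for n_samples in (4, 8):
          t = np.arange(n_samples)
          x = np.sin(2 * np.pi * t / n_samples)            # one cycle, amplitude 1
          mags = np.abs(np.fft.fft(x))
          print(n_samples, mags.max())                     # 2.0 for N=4, 4.0 for N=8
          print(n_samples, mags.max() / (n_samples / 2))   # 1.0 after normalization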

    1. If we take two sine waves with the same frequency and sum them together, the result will always be a sine wave with the exact same frequency. This is a somewhat curious result, and it holds true even if the two sinusoids have completely different phases and amplitudes. No other periodic signal possesses this property.

      even more wow
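
      The worked identity behind this (standard phasor addition, my own note):

      $$a\sin(\omega t + \varphi_1) + b\sin(\omega t + \varphi_2) = R\sin(\omega t + \varphi),$$

      $$R = \sqrt{a^2 + b^2 + 2ab\cos(\varphi_1 - \varphi_2)}, \qquad \tan\varphi = \frac{a\sin\varphi_1 + b\sin\varphi_2}{a\cos\varphi_1 + b\cos\varphi_2},$$

      so the frequency is preserved no matter the amplitudes and phases.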

    2. it can be shown that any two sine waves whose frequencies are multiples of one another are also orthogonal, regardless of their phases.

      wow
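
      A small numerical check (mine, not the author's), using arbitrary phases and an integer multiple of the base frequency over a whole period:

      import numpy as np

      t = np.linspace(0, 1, 1000, endpoint=False)
      a = np.sin(2 * np.pi * 3 * t + 0.7)   # 3 cycles over the interval
      b = np.sin(2 * np.pi * 6 * t + 1.9)   # 6 cycles, an integer multiple, different phase
      print(np.dot(a, b) / len(t))          # ~0: orthogonal despite the phases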

    1. The idea of adding things to the toolkit or building your own tools is so foreign to most people that the idea induces some strange mixture of mental pain, anxiety, and confusion.

      haha indeed

  4. Jan 2023
    1. $$2^{-n(H(X)+\varepsilon)} \leqslant p(x_1, x_2, \dots, x_n) \leqslant 2^{-n(H(X)-\varepsilon)}$$

      ChatGPT:

      In the context of Information Theory, typicality of samples refers to the concept of typical sets. A typical set is a set of sequences that have high probability under a given probability distribution. These sequences are not necessarily the most probable ones (i.e. the mode of the distribution), but they are the ones that are most typical of the distribution.

      In other words, typical sets are sets of sequences that are representative of the underlying distribution, and they are defined by a criterion of asymptotic equiprobability: a sequence of length n is typical if its probability is close to 2^(-nH(X)), i.e. exponentially small in its length at a rate set by the entropy.

      In the context of deep autoregressive generative models, typicality of samples can be used as a way to evaluate the quality of the model's approximations of the underlying data distribution. Samples that are highly likely to have been generated by the model are considered to be typical, and thus the model is able to capture the underlying structure of the data well.
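
      A worked example to make the "typical ≠ most probable" point concrete (mine, not ChatGPT's): for a Bernoulli source with p(1) = 0.1, the entropy is H(X) ≈ 0.469 bits. For n = 100, a typical sequence has about 10 ones and probability

      $$0.1^{10} \cdot 0.9^{90} \approx 2^{-46.9} = 2^{-nH(X)},$$

      while the single most probable sequence (all zeros) has probability 0.9^100 ≈ 2^-15.2. The all-zeros sequence is far more probable individually, yet almost all of the probability mass sits on the vast number of typical sequences with roughly 10 ones, which is why fair samples look typical rather than maximally probable.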

    1. The only use case is gambling on the random price oscillations, attempting to buy low and sell high and cash out positions for wins in a real currency like dollars or euros. Yet crypto cannot create or destroy real money because unlike a stock there is no underlying company that generates income. So if you sell your crypto and make a profit in dollars, it’s exactly because a greater fool bought it at a higher price than you did. So every dollar that comes out of a cryptocurrency is because a later investor put a dollar in. They are inherently zero-sum by design, and when you take into account the casino (i.e. exchanges and miners) taking a rake on the game then the entire structure becomes strictly negative-sum. For every winner there are guaranteed to be multiple losers. It’s a game rigged by insiders by hacking human psychology.

      Extremely well put.

    2. History tends to rhyme with itself

      I like how well the hedged strength of this formulation is matched to how generic it is.

    3. The fixed-supply ideas

      I guess dynamic >> fixed when it comes to representing the sum of our planet's material resources as consumables. As nice as it is to model this supply as fixed, after a bit of consideration it's obvious that it is just as dynamic as the demand for it.

    4. The United States ultimately devalued its currency with the policies of the New Deal which slowly decoupled the dollar’s dependence on gold and which led to an era of economic growth and prosperity. Conversely Europe largely did not engage in these corrective policies and this era saw the rise of populist strong men and fascists who promised to correct the wealth inequality of the common man, and ultimately plunged the continent into the most violent period in human history.

      Still not clear on how exactly coupling to gold makes the value of a currency less stable than no coupling at all (which, in turn, allows for arbitrarily large inflation via unbounded money-printing).

    5. Under a gold standard, inflation, growth and the financial system were all less stable due to trade imbalances. This led to frequent recessions, larger swings in consumer prices and perpetual banking crises.

      Really? Would be interested in details, hope to find some in the Crypto Bubble book.

      A small amount of inflation discourages hoarding and incentivizes investment into productive enterprises which grow the economy and produce prosperity. Conversely a static fixed money supply encourages hoarding, and is inflexible in times of crisis because it does not allow intervention.

      Yet massive hoarding occurs regardless, just not in currencies but in commodities and other assets. Just look at how skewed the world population's wealth distribution is.

      But that's likely a different topic; I wouldn't assume the author meant to suggest that inflation is supposed to address anything other than hoarding of the currency that is subject to inflation.

  5. Nov 2019
  6. Mar 2019
    1. Within the frame-based loss term, we apply a weighting to encourage accuracy at the start of the note.

      From Onsets and Frames paper

      "we define the weighted frame loss as:

      $$L_{frame}(l,p) = \begin{cases} c L'_{frame}(l,p) & t_1 \leq t \leq t_2 \\ \frac{c}{t-t_2} L'_{frame}(l,p) & t_2 < t \leq t_3 \\ L'_{frame}(l,p) & \text{ elsewhere } \end{cases}$$

      where c = 5.0 as determined with coarse hyperparameter search."
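
      A minimal sketch of how I read that piecewise weighting (t1, t2, t3 are the paper's onset-window boundaries, which this excerpt does not define; the function name is mine):

      import numpy as np

      def frame_loss_weight(t, t1, t2, t3, c=5.0):
          # Per-frame multiplier applied to the unweighted frame loss L'_frame.
          t = np.asarray(t, dtype=float)
          w = np.ones_like(t)                  # elsewhere: weight 1
          ramp = (t > t2) & (t <= t3)
          w[ramp] = c / (t[ramp] - t2)         # decaying weight after the onset window
          w[(t >= t1) & (t <= t2)] = c         # full weight c around the onset
          return w

      print(frame_loss_weight(np.arange(10), t1=0, t2=2, t3=6))
      # approximately [5. 5. 5. 5. 2.5 1.667 1.25 1. 1. 1.]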

    2. we also restrict the final output of the model to start new notes only when the onset detector is confident that a note onset is in that frame.

      From Onsets and Frames paper

      "We also use the thresholded output of the onset detector during the inference process, similar to concurrent research described in [24]. An activation from the frame detector is only allowed to start a note if the onset detector agrees that an onset is present in that frame."


      From referenced paper [24]

      "Finally, we peak pick the two-channel activation matrix to convert the framewise piano roll to a list of note events. Per note, we step through each time frame and place an onset at positions where the articulation channel is above a set threshold, and then include all frames onward until the sustain channel is under another fixed threshold, at which point we output an offset. If a new articulation is found during an active note event we simply fragment it by outputting additional offsets and onsets."


      where the articulation channel refers to the parallel piano-roll channel in which only the frames corresponding to note onsets are active (onsets = articulations in the authors' lingo), i.e. our onset labels, and the sustain channel corresponds to our frame-level predictions, i.e. the note-level frame labels.
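
      A rough sketch of that decoding rule as I understand it (mine, simplified: it skips the re-fragmentation on new onsets described in [24]):

      import numpy as np

      def decode_notes(frame_probs, onset_probs, frame_thresh=0.5, onset_thresh=0.5):
          # frame_probs, onset_probs: (time, pitch) activations in [0, 1].
          # A note may only start in a frame where the onset detector agrees.
          active = frame_probs >= frame_thresh
          onsets = onset_probs >= onset_thresh
          notes = []
          for p in range(frame_probs.shape[1]):
              t = 0
              while t < frame_probs.shape[0]:
                  if onsets[t, p] and active[t, p]:
                      start = t
                      while t < frame_probs.shape[0] and active[t, p]:
                          t += 1
                      notes.append((p, start, t))   # (pitch, onset frame, offset frame)
                  else:
                      t += 1
          return notes

      frame = np.array([[0.9], [0.9], [0.9], [0.1]])
      onset = np.array([[0.9], [0.1], [0.1], [0.1]])
      print(decode_notes(frame, onset))             # [(0, 0, 3)]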

  7. Feb 2019
    1. ill-posed

      What class of inverse problems do the authors denote as ill-posed here?

      Are those simply all problems that are inverse to a generative process that can be defined as a surjection but not a bijection?

      $$x \mapsto y$$
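
      For my own reference (the standard Hadamard definition, not from the paper): recovering x from y = f(x) is well-posed only if a solution exists, is unique, and depends continuously on y. A forward map that is surjective but not injective already breaks uniqueness, since

      $$\exists\, x_1 \neq x_2 : \; f(x_1) = f(x_2) = y,$$

      so such inverse problems are ill-posed at least in that sense.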

    1. Note to the document

    2. Go forth and annotate

      Testing annotation functionality

      Quoted text

      Inline markdown code snippet: import tensorflow as tf

      Multi-line markdown code snippet with Python syntax highlight:

      import numpy as np
      

      LaTeX math:

      $$\sum_{i=1}^{N^2} x_i$$

      HyperLink to Google

      Numbered list:

      1. Item
      2. Another one
      3. Last one

      Pointed list:

      • Stuff
      • more stuff