15 Matching Annotations
  1. Last 7 days
    1. Logit Arithmetic Our work builds off DEXPERTS (Liu et al., 2021), which introduced Eq. (1) and showed the effectiveness of ensembling logits from multiple LMs, an idea which was also briefly explored in earlier work (Dou et al., 2019).

      So what's the difference?

    2. Tuning

      Imprecise usage of the word

    3. In this context, “tuning” LMs at decoding-time represents an approach for efficient fine-tuning. Our work shares a similar vision with contemporary work (Mitchell et al., 2024), which applies the same DEXPERTS equation as operationalized in §3. However, they mainly view the equation as a tool for disentangling the effects of scaling up pretraining versus instruction-tuning, and do not measure the method’s effectiveness on existing benchmarks. In contrast, our work demonstrates the empirical strength of proxy-tuning, as well as its generality beyond instruction-tuning alone.

      They didn't have time to compare with Mitchell et al

    4. a base (M-) and a fine-tuned version (M+). The difference in logits between M- and M+ is added to M's logits, effectively transferring the tuning effect.

      simple! nice

    5. There's a new promising method for finetuning LLMs without modifying their weights called proxy-tuning (by Liu et al. arxiv.org/abs/2401.08565). How does it work? It's a simple decoding-time method where you modify the logits of the target LLM. In particular, you compute the logits' difference between a smaller base and finetuning model, then apply the difference to the target model's logits.

      similar problem, different approaches
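
      The logit arithmetic quoted in the last two items can be sketched in a few lines of pure Python. This is a toy illustration with made-up logit vectors, not real model outputs, and the function names are my own:

      ```python
      import math

      def proxy_tuned_logits(target, tuned_expert, base_expert):
          """Apply the proxy-tuning shift: target + (tuned expert - base expert).

          All three are logit vectors over the same vocabulary; the small
          tuned/base pair acts as a proxy for tuning the large target model.
          """
          return [t + (e - b) for t, e, b in zip(target, tuned_expert, base_expert)]

      def softmax(logits):
          """Convert the shifted logits back into a sampling distribution."""
          mx = max(logits)
          exps = [math.exp(v - mx) for v in logits]
          total = sum(exps)
          return [e / total for e in exps]

      # Toy 3-token vocabulary: the small tuned model prefers token 0,
      # so the ensemble shifts the large model's distribution toward it.
      target = [2.0, 1.0, 0.5]   # large base model M
      tuned = [3.0, 0.0, 0.0]    # small tuned expert M+
      base = [1.0, 1.0, 1.0]     # small base anti-expert M-
      shifted = proxy_tuned_logits(target, tuned, base)  # [4.0, 0.0, -0.5]
      probs = softmax(shifted)
      ```

      Note the large model is only queried for its logits, so its weights never change; all the "tuning" lives in the small-model difference term.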

    6. 1️⃣instruction-tuning, 2️⃣domain adaptation, and 3️⃣task finetuning.

      how

    7. logit difference

      definition?

    8. proxy-tuning,

      meaning?

    1. In some appropriate sense

      What sense exactly? This seems really shallow to me ... If only some axis of x is getting heavily changed, would that be considered appropriate?

  2. Mar 2025
    1. Position: AI/ML Influencers Have a Place in the Academic Process

      strength: interesting methodology for the analysis. this is a difficult problem to formulate an analysis for; the control set definition and validity are convincing

      weakness: the critical missing point is... why these two influencers? how different would the findings be with two other influencers?

      and... the difference in the citations... how much of a difference does it make?

      Q1 were the control set papers also tweeted out by some other influencers... even of lower influence?

      Q2 there's work out there to identify the factors… why assume citation count factors?

    2. Finally, we establish statistical significance with three tests comparing the distributions of the experimental data with that of the control sets, Epps-Singleton (ES), Kolmogorov-Smirnov (KS), and Mann-Whitney U (MWU), none of which assume normal distribution, which is essential for our data. Table 5 shows the results, all with p-values well below even a stringent α = 0.001. From this, we can strongly reject the null hypothesis that the citation distributions for the influencer-shared papers and the control papers are the same.

      nah... with such a large sample you're highly likely to get statistical significance... but effect size is what matters... how big is the difference?
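
      The significance-vs-effect-size point can be made concrete. Below is a minimal pure-Python sketch (my own helper, not the paper's code) of the Mann-Whitney U test with its rank-biserial correlation as an effect size; with a large sample and a small consistent shift, the p-value is tiny while the effect size stays modest:

      ```python
      import math

      def mann_whitney_u(x, y):
          """Mann-Whitney U (two-sided, normal approximation, no tie
          correction -- fine for a sketch) plus the rank-biserial
          correlation as an effect size in [-1, 1] (0 = no effect)."""
          n1, n2 = len(x), len(y)
          # U = number of (x_i, y_j) pairs with x_i > y_j, ties counted as 0.5
          u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
          mu = n1 * n2 / 2.0
          sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
          p = math.erfc(abs(u - mu) / sigma / math.sqrt(2.0))  # two-sided
          effect = 2.0 * u / (n1 * n2) - 1.0  # rank-biserial correlation
          return u, p, effect

      # 300 values per group, the second shifted up by 1: p comes out far
      # below 0.001, yet the rank-biserial effect is only -0.19.
      x = [i % 10 for i in range(300)]
      y = [v + 1 for v in x]
      u, p, effect = mann_whitney_u(x, y)
      ```

      So "p well below α = 0.001" on its own says little about how much influencer sharing moves citations; reporting an effect size alongside the three tests would answer the annotator's question.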

    3. Assuming that mean review scores are an accurate measure of paper quality, we conclude that we have effectively controlled for paper quality in our matching

      interesting

    4. collecting a large dataset of potential papers to match against

      match what? ---

    5. In our case, we assume that a paper’s citation count is most strongly influenced by elapsed time, quality, topic, and author prominence.

      there's work out there to identify the factors... why assume

    6. AK Komatsuz

      why just two???