32 Matching Annotations
  1. Sep 2025
    1. Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks (RNNs) exhibit linear scaling in memory and computational requirements but struggle to match the same performance as Transformers due to limitations in parallelization and scalability.

      Transformers' memory and compute scale quadratically with sequence length. RNNs' memory and compute scale linearly, but their performance does not match that of Transformers because of limits on parallelization and scalability.
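
      A minimal NumPy sketch of the scaling contrast (toy sizes, not from the paper): the attention score matrix is n × n, so activation memory grows quadratically with sequence length, while an RNN carries only a fixed-size hidden state but must be unrolled sequentially.

      ```python
      import numpy as np

      n, d = 1024, 64                          # sequence length and width (illustrative)
      x = np.random.randn(n, d)

      # Self-attention: the score matrix is (n, n) -> O(n^2) memory and compute.
      scores = x @ x.T / np.sqrt(d)
      weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
      weights /= weights.sum(axis=-1, keepdims=True)
      attn_out = weights @ x                   # (n, d)

      # RNN: a single fixed-size state updated step by step -> O(n) compute,
      # O(1) state memory, but the loop is inherently sequential.
      W, U = 0.01 * np.random.randn(d, d), 0.01 * np.random.randn(d, d)
      h = np.zeros(d)
      for t in range(n):
          h = np.tanh(x[t] @ W + h @ U)

      print(weights.nbytes, h.nbytes)          # ~8 MB vs 512 bytes of activations
      ```

      In this toy setting the attention matrix already costs about 8 MB at n = 1024; doubling n quadruples it, while the RNN state stays at 512 bytes.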

  2. Jun 2025
  3. Mar 2025
  4. Feb 2025
  5. Aug 2024
  6. Jan 2024
  7. Nov 2023
  8. Oct 2023
    1. (Chen, NeurIPS, 2021) Chen, Lu, Rajeswaran, Lee, Grover, Laskin, Abbeel, Srinivas, and Mordatch. "Decision Transformer: Reinforcement Learning via Sequence Modeling". arXiv preprint arXiv:2106.01345v2, June 2021.

      Quickly became a very influential paper, with a new idea: learn a generative model of action prediction by sequence modeling over demonstration trajectories of returns-to-go, states, and actions. There is no optimization of actions or rewards; instead the target return is given as an input (see the sketch below).
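
      A minimal sketch of that trajectory layout (hypothetical toy data, NumPy only, not the authors' code): each timestep contributes a (return-to-go, state, action) triple, the model is trained to predict the action from the preceding tokens, and at evaluation time a desired target return is fed in place of the demonstrated return-to-go.

      ```python
      import numpy as np

      states = np.random.randn(5, 3)                 # 5 steps, 3-dim states (toy)
      actions = np.random.randint(0, 4, size=5)      # discrete actions (toy)
      rewards = np.array([0.0, 1.0, 0.0, 0.0, 1.0])  # per-step rewards (toy)

      # Return-to-go at step t = sum of rewards from t to the end of the episode.
      returns_to_go = np.cumsum(rewards[::-1])[::-1]

      # Interleave (R_t, s_t, a_t) tokens; training is plain supervised prediction
      # of a_t from the prefix -- no explicit reward maximization.
      trajectory = [(returns_to_go[t], states[t], actions[t]) for t in range(5)]

      for R, s, a in trajectory:
          print(f"target-return={R:.1f}  state={np.round(s, 2)}  action={a}")
      ```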

  9. Jul 2023
  10. Jun 2023
  11. Apr 2023
  12. Jan 2023
  13. Dec 2022
  14. Nov 2022
    1. we propose the Transformer, a model architecture eschewing recurrence and instead relying entirely on an attention mechanism to draw global dependencies between input and output. The Transformer allows for significantly more parallelization

      Using the attention mechanism to determine global dependencies between input and output instead of using recurrent links to past states. This is the essence of their new idea.
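
      A minimal sketch of scaled dot-product attention (NumPy, toy sizes; not the authors' code): every output position attends to every input position in a single step, which is what "global dependencies" means here, in contrast to recurrent links that reach the past only state by state.

      ```python
      import numpy as np

      def attention(Q, K, V):
          # Similarity of every query with every key -> an (n_q, n_k) weight matrix.
          scores = Q @ K.T / np.sqrt(K.shape[-1])
          weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
          weights /= weights.sum(axis=-1, keepdims=True)
          return weights @ V                   # each output mixes all value vectors

      n, d = 8, 16
      Q, K, V = (np.random.randn(n, d) for _ in range(3))
      print(attention(Q, K, V).shape)          # (8, 16)
      ```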

  15. Sep 2022
  16. Feb 2022