11 Matching Annotations
  1. Sep 2025
    1. linear attention mechanism

      Softmax function: converts a vector of raw prediction scores into a probability distribution over the classes. https://www.geeksforgeeks.org/deep-learning/the-role-of-softmax-in-neural-networks-detailed-explanation-and-applications/

      Linear attention: an approximation of softmax attention that replaces the softmax kernel with a linear dot product of kernel feature maps; by the associativity of matrix products, each step of the (causal) computation then becomes an additive state update (see the sketch after this note). https://linear-transformers.com/ https://haileyschoelkopf.github.io/blog/2024/linear-attn/

      Also note that FlashAttention (a newer, more memory-efficient approach) is briefly mentioned at the end of this paper and discussed in more detail in the sequel paper.
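      A minimal NumPy sketch of the idea, illustrative only: the elu(x)+1 feature map, single-head shapes, and function names here are assumptions loosely following the Katharopoulos et al. formulation, not code from the paper. Standard softmax attention materializes an n x n weight matrix, while the kernelized form keeps a small running state that is updated by addition at each step.

      ```python
      import numpy as np

      def phi(x):
          # Positive feature map (elu(x) + 1) standing in for the softmax kernel.
          return np.where(x > 0, x + 1.0, np.exp(x))

      def softmax_attention(Q, K, V):
          # Standard attention: builds the full n x n weight matrix (quadratic in n).
          scores = Q @ K.T / np.sqrt(Q.shape[-1])
          w = np.exp(scores - scores.max(axis=-1, keepdims=True))
          return (w / w.sum(axis=-1, keepdims=True)) @ V

      def linear_attention(Q, K, V):
          # Causal linear attention as an RNN: each step adds phi(k_i) v_i^T into a
          # d x d_v state S (and phi(k_i) into a normalizer z), so memory is O(d * d_v).
          S = np.zeros((Q.shape[1], V.shape[1]))
          z = np.zeros(Q.shape[1])
          out = np.empty_like(V)
          for i in range(Q.shape[0]):
              q, k = phi(Q[i]), phi(K[i])
              S += np.outer(k, V[i])   # additive update equation
              z += k
              out[i] = (q @ S) / (q @ z)
          return out

      rng = np.random.default_rng(0)
      Q, K, V = rng.normal(size=(3, 8, 4))   # sequence length 8, head dim 4
      print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
      ```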

  2. Jan 2025
    1. Basically, flash generally is erased in blocks of ~64-512 kilobytes. Therefore, for every write anywhere within that block, the controller has to erase the entire block, using a write cycle for the entire block.

      Flash is erased at the block level, and each block has a fixed size, so any rewrite within a block costs an erase of the whole block (a toy model is sketched below).
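      A toy model of that erase behaviour, with illustrative assumptions: a 4 KB block instead of the real ~64-512 KB, and page-level programming details ignored. Programming can only clear bits, so changing a byte back forces the controller to erase and reprogram the entire block, consuming an erase cycle.

      ```python
      BLOCK_SIZE = 4 * 1024  # real erase blocks are typically ~64-512 KB

      class FlashBlock:
          def __init__(self):
              self.data = bytearray(b"\xff" * BLOCK_SIZE)  # erased flash reads as all 1s
              self.erase_cycles = 0

          def erase(self):
              self.data[:] = b"\xff" * BLOCK_SIZE
              self.erase_cycles += 1

          def program(self, offset, payload):
              # Programming can only flip bits 1 -> 0.
              for i, b in enumerate(payload):
                  assert self.data[offset + i] & b == b, "needs an erase first"
                  self.data[offset + i] &= b

          def write(self, offset, payload):
              # Controller's read-modify-write: if any bit would have to go 0 -> 1,
              # the whole block is erased and reprogrammed, burning an erase cycle.
              new = bytearray(self.data)
              new[offset:offset + len(payload)] = payload
              if any(cur & want != want for cur, want in zip(self.data, new)):
                  self.erase()
              self.program(0, bytes(new))

      blk = FlashBlock()
      blk.write(0, b"\x12")       # programming into an erased block: no erase needed
      blk.write(0, b"\x34")       # updating that one byte again: full-block erase
      print(blk.erase_cycles)     # -> 1
      ```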

    2. To be pedantic, FLASH memory is merely a form of EEPROM: There is a marketing / branding aspect here. Typically, the distinction used today is that EEPROMS are single-byte (or storage word) erasable / rewritable, while FLASH is block-based for erase/write operations.

      Difference between Flash and EEPROM: Flash erase/write is block-based, while EEPROM is erasable and rewritable per byte (a small contrast sketch follows below).
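      For contrast, a companion toy model in which the erase/rewrite granularity is a single byte, so a 1-byte update never disturbs its neighbours; the size and cycle counting are illustrative assumptions, not a device datasheet.

      ```python
      class Eeprom:
          def __init__(self, size=1024):
              self.data = bytearray(b"\xff" * size)
              self.byte_erase_cycles = 0

          def write_byte(self, addr, value):
              # Each byte is individually erasable/rewritable: a 1-byte update
              # costs one erase/write cycle for that byte only.
              self.data[addr] = value
              self.byte_erase_cycles += 1

      rom = Eeprom()
      rom.write_byte(0, 0x12)
      rom.write_byte(0, 0x34)        # rewriting the same byte touches only that byte
      print(rom.byte_erase_cycles)   # -> 2
      ```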

  3. Oct 2024
  4. Nov 2022
  5. Aug 2022
  6. Sep 2017
  7. Jul 2017
  8. Nov 2015