2 Matching Annotations
  1. Last 7 days
    1. Loop blocking for linear algebra codes often has three levels: register blocking, L2 cache blocking, and L3 cache (or TLB) blocking.

      Some notes on blocking for different purposes in GEMM operations.

  2. Oct 2019
    1. General Matrix Multiplication (GEMM) Optimization Algorithms

      Software optimization strategies: 1) improve memory-access locality; 2) use vector instructions.
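
Both annotations concern restructuring the GEMM loop nest for better locality. As a rough illustration only, the sketch below shows a single level of loop blocking (tiling) in plain Python, next to the naive triple loop. The function names `matmul_naive` and `matmul_blocked` and the `block` parameter are hypothetical and do not come from either annotated source. A production kernel would typically be written in C or assembly, nest several blocking levels (registers, caches, TLB), and use vector instructions in the innermost loop; none of that is shown here, and this Python version is not expected to demonstrate a speedup.

```python
def matmul_naive(A, B):
    """Reference triple loop: C[i][j] = sum over p of A[i][p] * B[p][j]."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                # B is walked down a column here, which has poor locality
                # for a row-major layout.
                C[i][j] += A[i][p] * B[p][j]
    return C


def matmul_blocked(A, B, block=4):
    """One level of loop blocking (hypothetical sketch).

    The i, p, and j ranges are cut into tiles of at most `block` elements,
    so each tile of A, B, and C is reused while it is still "hot".
    `block` is an illustrative tuning parameter, not a recommended value.
    """
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):          # tile over rows of A and C
        for p0 in range(0, k, block):      # tile over the shared dimension
            for j0 in range(0, m, block):  # tile over columns of B and C
                for i in range(i0, min(i0 + block, n)):
                    row_c, row_a = C[i], A[i]
                    for p in range(p0, min(p0 + block, k)):
                        a = row_a[p]
                        row_b = B[p]
                        # Innermost loop walks along rows of B and C with
                        # unit stride; this is the loop a compiled kernel
                        # would vectorize.
                        for j in range(j0, min(j0 + block, m)):
                            row_c[j] += a * row_b[j]
    return C
```

The tile edges use `min(...)` so that matrix dimensions need not be multiples of `block`. Because the shared dimension `p` is still visited in ascending order for every output element, the blocked version accumulates each sum in the same order as the naive one.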