3 Matching Annotations
  1. Oct 2025
    1. We note that improved reconstruction may come at the cost of increased feature absorption (Karvonen et al., 2024)

      Judging from the close agreement in Fig. 5, the SAEs do an excellent job of reconstructing the residual representation at each layer. I am curious about the magnitude of the reconstruction MSE for the hyperparameters covered in Fig. 8. Have you shared any results from the SAE training itself?

      There is a tradeoff between reconstruction error and L0 sparsity, but at what point are you learning more about the SAEs than about ESM2 itself?
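      To make the quantities in this question concrete, here is a minimal sketch of how reconstruction MSE, fraction of variance unexplained, and mean L0 could be computed for a sparse autoencoder. It assumes a generic ReLU SAE with random toy weights; the function names, shapes, and architecture are illustrative assumptions, not the authors' training code or necessarily their SAE variant.

```python
import numpy as np

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Toy ReLU SAE (assumed architecture): returns (latents, reconstruction)."""
    z = np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)  # sparse latent activations
    x_hat = z @ W_dec + b_dec                         # reconstructed activations
    return z, x_hat

def reconstruction_metrics(x, z, x_hat):
    """Reconstruction MSE, fraction of variance unexplained, and mean L0."""
    mse = float(np.mean((x - x_hat) ** 2))
    # Variance of the inputs around their per-dimension mean, used to normalize MSE.
    var = float(np.mean((x - x.mean(axis=0)) ** 2))
    # L0: average number of nonzero latents per input row.
    l0 = float(np.mean(np.count_nonzero(z, axis=1)))
    return {"mse": mse, "fvu": mse / var, "l0": l0}

# Stand-in data: random vectors in place of real residual-stream activations.
rng = np.random.default_rng(0)
d_model, d_sae, n = 16, 64, 256
x = rng.normal(size=(n, d_model))
W_enc = rng.normal(size=(d_model, d_sae)) / np.sqrt(d_model)
W_dec = rng.normal(size=(d_sae, d_model)) / np.sqrt(d_sae)
b_enc = np.zeros(d_sae)
b_dec = np.zeros(d_model)

z, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
metrics = reconstruction_metrics(x, z, x_hat)
print(metrics)
```

      Reporting a normalized error such as the fraction of variance unexplained next to L0 makes runs with different hyperparameters easier to compare than raw MSE alone.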

    2. We developed a latent visualizer, InterProt, to streamline the process of identifying features.

      InterProt is an amazing tool for sorting through all of these findings.

      The Fig. 3C plot is also very nice for a global view of the learned latent features. What do you think about the relatively small fraction of "interesting" features (the "structural", "amino acid", "alpha helix", etc., top features on InterProt) compared to the total number of latents? Do you think this is more about our lack of knowledge of protein structure, or are the "uninteresting" latents just generally at a lower conceptual level (like point residue features) than what we find interesting (motifs with structural effects)?

  2. Jul 2025
    1. Model weights of both Ankh3-Large and Ankh3-XL models are available at https://huggingface.co/ElnaggarLab/ankh3-large and https://huggingface.co/ElnaggarLab/ankh3-xl.

      Thank you for open-sourcing this exciting model! It is impressive that the diverse set of training tasks allows the XL model to keep improving on the Large model's performance. It’s also great to see that this was trained with JAX on TPUs.

      When running inference with the model, I had trouble reproducing the S2S completion examples in the README. For the sequence completion example, was teacher forcing used at inference? I also observed excellent performance with [NLU] for predicting masked sites.
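
      To illustrate why the teacher-forcing question matters for reproducing completions, here is a toy sketch contrasting teacher-forced and free-running decoding. The bigram lookup "model", the sequence, and all function names are hypothetical stand-ins; nothing here calls the Ankh3 models or their API.

```python
def decode(next_token, prompt, target, teacher_forcing):
    """Predict len(target) tokens, feeding back either gold tokens or predictions."""
    context = list(prompt)
    out = []
    for gold in target:
        pred = next_token(context)
        out.append(pred)
        # Teacher forcing conditions the next step on the gold token;
        # free-running decoding conditions on the model's own prediction.
        context.append(gold if teacher_forcing else pred)
    return out

def accuracy(pred, target):
    """Fraction of positions where the prediction matches the target."""
    return sum(p == t for p, t in zip(pred, target)) / len(target)

# Hypothetical bigram "model" with one deliberate error (T -> V instead of T -> A).
table = {"M": "K", "K": "T", "T": "V", "A": "Y", "Y": "I", "I": "G"}

def toy_next_token(context):
    # Unknown previous tokens fall back to the placeholder "X".
    return table.get(context[-1], "X")

prompt, target = list("MK"), list("TAYIG")
forced = decode(toy_next_token, prompt, target, teacher_forcing=True)
free = decode(toy_next_token, prompt, target, teacher_forcing=False)

print("teacher-forced:", "".join(forced), accuracy(forced, target))
print("free-running:  ", "".join(free), accuracy(free, target))
```

      In this toy case a single mistake stays isolated under teacher forcing but derails every later token in free-running decoding, so per-token accuracy measured the two ways can differ substantially.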