- Aug 2023
-
arxiv.org
-
Title: Delays, Detours, and Forks in the Road: Latent State Models of Training Dynamics
Authors: Michael Y. Hu, Angelica Chen, Naomi Saphra, Kyunghyun Cho
Note: This paper seems cool: it uses older, interpretable machine learning models (graphical models) to understand what is going on inside a deep neural network.
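As a rough sketch of the kind of approach the title and note suggest: fit a hidden Markov model over per-checkpoint training metrics, so each checkpoint gets a discrete latent "phase" and transitions between phases mark qualitative shifts in training. The metric choices, the hmmlearn library, the synthetic data, and all hyperparameters below are my assumptions, not the paper's actual setup.

```python
# Illustrative sketch (not the authors' code): fit an HMM over per-checkpoint
# training metrics and decode a discrete latent "phase" per checkpoint.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)

# One training run: (num_checkpoints, num_metrics). Columns could be
# [train_loss, grad_norm, weight_norm]; synthetic stand-ins here.
run = np.column_stack([
    np.exp(-np.linspace(0, 4, 200)) + 0.05 * rng.standard_normal(200),   # loss
    1.0 / (1 + np.linspace(0, 10, 200)) + 0.02 * rng.standard_normal(200),  # grad norm
    np.linspace(1.0, 3.0, 200) + 0.1 * rng.standard_normal(200),         # weight norm
])

# A small number of latent states = "phases" of training.
hmm = GaussianHMM(n_components=3, covariance_type="diag", random_state=0)
hmm.fit(run)

# Decode: assign each checkpoint to a latent phase.
phases = hmm.predict(run)
print(phases[:20])
```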
-
- Feb 2023
-
clementneo.com
-
The code to reproduce our results can be found here.
-
- Jan 2023
-
ar5iv.labs.arxiv.org
-
This input embedding is the initial value of the residual stream, which all attention layers and MLPs read from and write to.
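A minimal PyTorch sketch of this residual-stream view (module sizes and structure are illustrative, and layer norms are omitted for brevity): the input embedding initializes the stream, and every attention layer and MLP reads the current stream and writes back by addition.

```python
# Minimal sketch of the residual-stream dataflow: each sublayer reads the
# current stream and writes its output back in by addition.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, resid: torch.Tensor) -> torch.Tensor:
        # Attention reads from the stream and adds its output back in.
        attn_out, _ = self.attn(resid, resid, resid)
        resid = resid + attn_out
        # The MLP does the same: read, compute, add.
        return resid + self.mlp(resid)

d_model, vocab = 64, 1000
embed = nn.Embedding(vocab, d_model)
blocks = nn.ModuleList([Block(d_model) for _ in range(2)])

tokens = torch.randint(0, vocab, (1, 8))
resid = embed(tokens)          # the input embedding initializes the stream
for block in blocks:
    resid = block(resid)       # each layer reads from and writes to it
```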
-
- Apr 2022
-
distill.pub
-
Starting from random noise, we optimize an image to activate a particular neuron (layer mixed4a, unit 11).
We then use that image as a kind of variable name for the neuron, one that is more helpful than the layer number and the neuron's index within the layer. This explanation is from one of Chris Olah's YouTube videos (https://www.youtube.com/watch?v=gXsKyZ_Y_i8)
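A minimal sketch of that optimization, assuming plain gradient ascent on the input image. The tiny untrained CNN here is a stand-in; the Distill example targets InceptionV1's layer mixed4a, unit 11, and adds regularizers and image transformations that are omitted here.

```python
# Feature visualization by gradient ascent: start from random noise and
# optimize the image to maximize one unit's activation.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
)
model.eval()

unit = 11  # which channel of the last layer to visualize
img = torch.randn(1, 3, 64, 64, requires_grad=True)  # start from noise
opt = torch.optim.Adam([img], lr=0.05)

for step in range(256):
    opt.zero_grad()
    acts = model(img)                # (1, 32, 64, 64)
    loss = -acts[0, unit].mean()     # ascend the unit's mean activation
    loss.backward()
    opt.step()

# `img` now shows (weakly, for an untrained net) what the unit responds to.
```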
-
- Jun 2020
-
psyarxiv.com
-
Moreau, David, and Kristina Wiebels. ‘Assessing Change in Intervention Research: The Benefits of Composite Outcomes’, 2 June 2020. https://doi.org/10.31234/osf.io/t9hw3.
-
- Jun 2019
-
towardsdatascience.com
-
To interpret a model, we require the following insights:
- Which features in the model are most important.
- For any single prediction, the effect of each feature in the data on that prediction.
- The effect of each feature over a large number of possible predictions.
Machine learning interpretability
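A minimal sketch of those three levels in scikit-learn. The dataset, the model, and the mean-ablation measure of "effect" are my assumptions, not the article's own example.

```python
# Three levels of interpretability: global importance, per-prediction
# effects, and average effects over many predictions.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True, as_frame=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)

# 1. Global importance: which features matter most across the test set.
perm = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, perm.importances_mean), key=lambda t: -t[1])
print("global importance:", ranked[:3])

# 2. Effect of each feature on one prediction: a crude ablation that
# replaces a single feature with its training mean and measures the shift.
x = X_te.iloc[[0]].copy()
base = model.predict(x)[0]
for col in X.columns[:3]:
    x_abl = x.copy()
    x_abl[col] = X_tr[col].mean()
    print(col, "effect on this prediction:", base - model.predict(x_abl)[0])

# 3. Effect over many predictions: average the same ablation over the test set.
for col in X.columns[:3]:
    X_abl = X_te.copy()
    X_abl[col] = X_tr[col].mean()
    print(col, "mean effect:", np.mean(model.predict(X_te) - model.predict(X_abl)))
```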
-