14 Matching Annotations
  1. Apr 2026
    1. Large language models (LLMs) sometimes appear to exhibit emotional reactions. We investigate why this is the case in Claude Sonnet 4.5 and explore implications for alignment-relevant behavior.

      [Insight] This sentence points to a whole new paradigm for AI research: instead of asking "what can the model do," ask "why does the model behave this way." Using emotion as an entry point for understanding model behavior essentially imports the methodology of psychology into AI interpretability research. The takeaway for practitioners: the most valuable AI research going forward may lie not in algorithmic innovation but in finding mechanistic explanations for known phenomena, which is exactly what this paper does.

    2. We find internal representations of emotion concepts, which encode the broad concept of a particular emotion and generalize across contexts and behaviors it might be linked to.

      Surprisingly, Claude turns out to contain genuine "emotion concept vectors" internally. This is not a metaphor: they are linear representations that can be extracted, measured, and manipulated. Stranger still, these vectors generalize across contexts. The same "fear" vector activates both when fear is expressed directly and when a dangerous situation is merely implied, which suggests the model has learned the abstract concept of an emotion rather than a surface pattern. This closely parallels how human neuroscience understands emotion, and it is striking that such structure emerges on its own.
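
      A minimal sketch of how such a concept direction could be extracted, assuming read access to a model's hidden states. The difference-of-means recipe below is a common interpretability technique, not necessarily the paper's exact method, and `get_hidden_state` is a hypothetical stand-in for real model internals:

      ```python
      # Hypothetical sketch: extract a "fear" direction by difference of means.
      import numpy as np

      rng = np.random.default_rng(0)
      d_model = 512

      def get_hidden_state(prompt: str) -> np.ndarray:
          """Stand-in for reading one layer's residual-stream activation
          for `prompt`; swap in real model internals to use this."""
          return rng.normal(size=d_model)

      fear_prompts = ["I'm terrified of what happens next.",
                      "The floor creaked behind me in the dark."]
      neutral_prompts = ["The meeting is at 3pm.",
                         "Water boils at 100 degrees Celsius."]

      fear_acts = np.stack([get_hidden_state(p) for p in fear_prompts])
      neutral_acts = np.stack([get_hidden_state(p) for p in neutral_prompts])

      # Extract: the concept direction is the normalized mean difference.
      fear_dir = fear_acts.mean(axis=0) - neutral_acts.mean(axis=0)
      fear_dir /= np.linalg.norm(fear_dir)

      # Measure: project a new activation onto the direction.
      score = get_hidden_state("Something is very wrong here.") @ fear_dir
      print(f"fear score: {score:.3f}")

      # Manipulate (steering): during a forward pass, one would add
      # alpha * fear_dir back into the residual stream.
      ```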

    3. The original research tool for this kind of diffing, a standard crosscoder, is like a basic bilingual dictionary. It's good at matching existing words, knowing that "sun" in English is "soleil" in French. But it has a major flaw: it struggles to find words that are unique to one language.

      The "bilingual dictionary" metaphor for the limits of cross-architecture model comparison is illuminating: a standard crosscoder will force-translate a French-only word like dépaysement into "disorientation," and in doing so miss the new model's unique behavioral features. The metaphor makes an abstruse interpretability problem intuitively graspable, and that expository skill is itself surprising.
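
      To make the dictionary analogy concrete, here is a simplified crosscoder sketch in PyTorch, assuming two models with residual widths `d_a` and `d_b`. This is an illustrative toy, not Anthropic's exact architecture or training setup:

      ```python
      # Toy crosscoder: one shared sparse latent code must reconstruct
      # the activations of two different models at once.
      import torch
      import torch.nn as nn

      class Crosscoder(nn.Module):
          def __init__(self, d_a: int, d_b: int, n_latents: int):
              super().__init__()
              self.enc_a = nn.Linear(d_a, n_latents)
              self.enc_b = nn.Linear(d_b, n_latents)
              self.dec_a = nn.Linear(n_latents, d_a)
              self.dec_b = nn.Linear(n_latents, d_b)

          def forward(self, x_a, x_b):
              # One shared code per token, pooled from both models' views.
              z = torch.relu(self.enc_a(x_a) + self.enc_b(x_b))
              return self.dec_a(z), self.dec_b(z), z

      model = Crosscoder(d_a=512, d_b=768, n_latents=4096)
      x_a, x_b = torch.randn(8, 512), torch.randn(8, 768)
      rec_a, rec_b, z = model(x_a, x_b)

      # Training loss: reconstruct both models plus an L1 sparsity penalty.
      loss = ((rec_a - x_a) ** 2).mean() + ((rec_b - x_b) ** 2).mean() \
             + 1e-3 * z.abs().mean()
      ```

      Because the single shared code `z` has to reconstruct both models at once, a feature exclusive to one model has no natural home in the dictionary, which is precisely the flaw the note describes.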

  2. Aug 2023
    1. Title: Delays, Detours, and Forks in the Road: Latent State Models of Training Dynamics

      Authors: Michael Y. Hu, Angelica Chen, Naomi Saphra, Kyunghyun Cho

      Note: This paper seems cool: it uses older, interpretable machine-learning tools, namely graphical models, to understand what is going on inside a deep neural network during training.

      Link: https://arxiv.org/pdf/2308.09543.pdf
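
      A toy illustration of the general idea: fit a hidden Markov model over a training metric so that each hidden state becomes a latent "phase" of training. The synthetic loss curve and the two-state choice below are assumptions for the demo, not the paper's setup, and it relies on the hmmlearn package:

      ```python
      # Toy demo: recover latent training phases from a loss curve with an HMM.
      import numpy as np
      from hmmlearn import hmm

      rng = np.random.default_rng(0)
      # Fake a loss curve with two regimes: a noisy plateau, then a decay.
      loss = np.concatenate([
          2.0 + 0.05 * rng.normal(size=150),
          2.0 * np.exp(-0.03 * np.arange(150)) + 0.05 * rng.normal(size=150),
      ])
      X = loss.reshape(-1, 1)

      # Each hidden state of the HMM is a candidate "phase" of training.
      model = hmm.GaussianHMM(n_components=2, covariance_type="diag",
                              n_iter=100, random_state=0)
      model.fit(X)
      phases = model.predict(X)  # latent phase label per training step
      print("first transition at step:", int(np.argmax(phases != phases[0])))
      ```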
