valid_ds
?
it likely leads to larger values in the Gram matrix
Negatively correlated?
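For reference, a minimal sketch of a Gram matrix in the spirit of the style-transfer chapter; the division by the number of elements is what keeps the entries from growing simply because the feature map is larger (the input shape is an assumption):
    import torch

    def gram(x):
        # x: feature map of shape (num_channels, height, width)
        num_channels, n = x.shape[0], x.numel() // x.shape[0]
        x = x.reshape(num_channels, n)
        # without dividing by num_channels * n, a larger feature map yields larger entries
        return torch.matmul(x, x.T) / (num_channels * n)

    print(gram(torch.randn(3, 64, 64)).shape)  # torch.Size([3, 3])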
the image printing function requires that each pixel has a floating point value from 0 to 1
why?
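This matches matplotlib's convention: imshow treats floating-point image data as lying in [0, 1], so values are usually clamped into that range before plotting. A minimal sketch with a made-up tensor:
    import torch
    import matplotlib.pyplot as plt

    img = 0.5 + 0.5 * torch.randn(64, 64, 3)   # hypothetical float image; may fall outside [0, 1]
    plt.imshow(img.clamp(0, 1))                # clamp so every pixel is a float in [0, 1]
    plt.show()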
The best way to do this is to first use tesseract to get OCR text in whatever languages you think might be present, use langdetect to find which languages appear in that OCR text, and then run OCR again with the languages found.
how about the accuracy?
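A rough sketch of that two-pass pipeline with pytesseract and langdetect; the file name, the broad first-pass language guess, and the code mapping are assumptions:
    import pytesseract
    from PIL import Image
    from langdetect import detect_langs

    img = Image.open('scan.png')                                   # hypothetical input image
    rough = pytesseract.image_to_string(img, lang='eng+deu+fra')   # first pass with a broad guess
    detected = detect_langs(rough)                                 # e.g. [de:0.72, en:0.27]
    # langdetect returns ISO 639-1 codes; tesseract expects its own codes
    to_tess = {'en': 'eng', 'de': 'deu', 'fr': 'fra', 'es': 'spa'}
    langs = '+'.join(to_tess[d.lang] for d in detected if d.lang in to_tess)
    text = pytesseract.image_to_string(img, lang=langs or 'eng')   # second pass with detected languages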
For comparison, we define an identical model, but initialize all of its model parameters to random values.
Does keeping all parameters at their initial values amount to random assignment?
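If this is the fine-tuning comparison, "initialize to random values" usually just means constructing the same architecture without loading pretrained weights, so the parameters keep their fresh (random) default initialization. A hedged sketch; the string weights argument assumes a recent torchvision:
    import torchvision

    # the same ResNet-18 architecture twice: one copy loads pretrained weights,
    # the other keeps its freshly constructed (random) initialization
    pretrained_net = torchvision.models.resnet18(weights='IMAGENET1K_V1')
    scratch_net = torchvision.models.resnet18(weights=None)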
As is observed in the above results, after an nn.Sequential instance is scripted using the torch.jit.script function, computing performance is improved through the use of symbolic programming.
But it takes a longer time?
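A small timing sketch; the layer sizes, batch size, and loop count are arbitrary. The very first call to the scripted module pays the compilation/optimization cost, which may account for extra time:
    import time
    import torch
    from torch import nn

    net = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 2))
    x = torch.randn(64, 512)

    scripted = torch.jit.script(net)   # compile the Sequential into a TorchScript graph
    scripted(x)                        # warm-up: the first call includes compilation overhead

    for name, model in [('eager', net), ('scripted', scripted)]:
        t0 = time.time()
        for _ in range(1000):
            model(x)
        print(name, time.time() - t0)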
In the context of computer vision this schedule can lead to improved results.
Image augmentation?
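Assuming "this schedule" refers to a learning-rate schedule such as cosine decay (rather than an image-augmentation schedule), a minimal sketch of the decay rule; base_lr, final_lr, and max_epochs are made-up values:
    import math

    def cosine_lr(epoch, max_epochs=20, base_lr=0.3, final_lr=0.01):
        # smooth decay from base_lr at epoch 0 down to final_lr at max_epochs
        return final_lr + (base_lr - final_lr) * (1 + math.cos(math.pi * epoch / max_epochs)) / 2

    print([round(cosine_lr(e), 3) for e in (0, 5, 10, 15, 20)])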
The photorealistic text-to-image examples in Fig. 11.9.5 suggest that the T5 encoder alone may effectively represent text even without fine-tuning.
Shouldn't there still be a network between T5 and the output?
Since we use the fixed positional encoding whose values are always between −1 and 1,
?
position
Does "position" correspond to the time step?
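For reference, a sketch of the fixed sinusoidal positional encoding (assuming an even num_hiddens): every entry is a sine or a cosine, hence always within [−1, 1], and row i encodes position i, i.e. the i-th time step in the sequence:
    import torch

    def sinusoidal_pe(num_steps, num_hiddens):
        # P[i, 2j] = sin(i / 10000^(2j / num_hiddens)), P[i, 2j+1] = cos(i / 10000^(2j / num_hiddens))
        P = torch.zeros(num_steps, num_hiddens)
        pos = torch.arange(num_steps, dtype=torch.float32).reshape(-1, 1)
        div = torch.pow(10000, torch.arange(0, num_hiddens, 2, dtype=torch.float32) / num_hiddens)
        P[:, 0::2] = torch.sin(pos / div)
        P[:, 1::2] = torch.cos(pos / div)
        return P

    P = sinusoidal_pe(60, 32)
    print(P.min().item(), P.max().item())   # both within [-1, 1]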
To compute multiple heads of multi-head attention in parallel, proper tensor manipulation is needed.
Do the different heads have to have the same length (num_hiddens / num_heads)?
Note that h heads can be computed in parallel if we set the number of outputs of linear transformations for the query, key, and value to p_q h = p_k h = p_v h = p_o.
If they are not equal, can the heads not be computed in parallel?
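The tensor manipulation in question, in the spirit of the chapter's transpose_qkv helper: giving every head the same size num_hiddens / num_heads lets all heads be folded into the batch dimension and handled by one batched matrix multiplication, which is what makes the parallel computation convenient. A sketch with assumed shapes:
    import torch

    def split_heads(X, num_heads):
        # (batch, seq_len, num_hiddens) -> (batch * num_heads, seq_len, num_hiddens // num_heads)
        batch, seq_len, num_hiddens = X.shape
        X = X.reshape(batch, seq_len, num_heads, num_hiddens // num_heads)
        X = X.permute(0, 2, 1, 3)              # move the head axis next to the batch axis
        return X.reshape(batch * num_heads, seq_len, -1)

    Q = torch.randn(2, 10, 64)                 # batch=2, seq_len=10, num_hiddens=64
    print(split_heads(Q, num_heads=8).shape)   # torch.Size([16, 10, 8])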
In the case of a (scalar) regression with observations (x_i, y_i) for features and labels respectively, v_i = y_i are scalars, k_i = x_i are vectors, and the query q denotes the new location where f should be evaluated.
Are x_i and q equal?
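A small numeric sketch of that setup with made-up data: the keys are the training inputs x_i, the values are the labels y_i, and the queries q are new locations at which f is evaluated, so q is generally not one of the x_i:
    import torch

    x_train = torch.sort(5 * torch.rand(50)).values           # keys   k_i = x_i
    y_train = torch.sin(x_train) + 0.1 * torch.randn(50)       # values v_i = y_i
    x_query = torch.linspace(0, 5, 100)                        # queries q: new evaluation points

    # Gaussian-kernel attention weights: softmax over -(q - k_i)^2 / 2
    diff = x_query.reshape(-1, 1) - x_train.reshape(1, -1)
    attn = torch.softmax(-0.5 * diff ** 2, dim=1)              # (100, 50); each row sums to 1
    y_hat = attn @ y_train                                     # weighted average of the values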
the conditional probability of each token at time step 3 has also changed in Fig. 10.8.2
Why does it change like that?
Using word-level tokenization, the vocabulary size will be significantly larger than that using character-level tokenization, but the sequence lengths will be much shorter.
the sequence lengths?
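A quick illustration of the trade-off on a made-up sentence: word tokens form a shorter sequence drawn from a much larger vocabulary, character tokens a longer sequence over a tiny vocabulary:
    text = 'the time machine by h g wells'   # hypothetical snippet
    word_tokens = text.split()                # 7 tokens
    char_tokens = list(text)                  # 29 tokens (spaces included)
    print(len(word_tokens), len(char_tokens))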
we can easily get a deep-gated RNN by replacing the hidden state computation in (10.3.1) with that from an LSTM or a GRU.
Is the direction wrong here?
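In code the replacement amounts to swapping the recurrent cell while keeping the stacking, e.g. with PyTorch's built-in layers (the sizes are assumptions):
    import torch
    from torch import nn

    # a deep gated RNN: the plain recurrence is replaced by a GRU cell,
    # and depth comes from stacking layers via num_layers
    deep_gru = nn.GRU(input_size=28, hidden_size=64, num_layers=2)
    X = torch.randn(35, 8, 28)           # (num_steps, batch_size, input_size)
    output, H = deep_gru(X)
    print(output.shape, H.shape)         # (35, 8, 64) and (2, 8, 64)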
Reset gates help capture short-term dependencies in sequences. Update gates help capture long-term dependencies in sequences.
why?
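For reference, the GRU updates (σ is the sigmoid, ⊙ the elementwise product): when the update gate Z_t is close to 1, H_t ≈ H_{t−1}, so old information is carried over (long-term dependencies); when the reset gate R_t is close to 0, the candidate state ignores H_{t−1}, resetting the recent context (short-term dependencies):
$$
\begin{aligned}
\mathbf{R}_t &= \sigma(\mathbf{X}_t \mathbf{W}_{xr} + \mathbf{H}_{t-1} \mathbf{W}_{hr} + \mathbf{b}_r),\\
\mathbf{Z}_t &= \sigma(\mathbf{X}_t \mathbf{W}_{xz} + \mathbf{H}_{t-1} \mathbf{W}_{hz} + \mathbf{b}_z),\\
\tilde{\mathbf{H}}_t &= \tanh\bigl(\mathbf{X}_t \mathbf{W}_{xh} + (\mathbf{R}_t \odot \mathbf{H}_{t-1}) \mathbf{W}_{hh} + \mathbf{b}_h\bigr),\\
\mathbf{H}_t &= \mathbf{Z}_t \odot \mathbf{H}_{t-1} + (1 - \mathbf{Z}_t) \odot \tilde{\mathbf{H}}_t.
\end{aligned}
$$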
Note that only the hidden state is passed to the output layer.
Isn't the output of the previous time step the input of the current time step?
For instance, if the first token is of great importance we will learn not to update the hidden state after the first observation.
Important -> do not update?
neuron
Is a neuron a cell?
detaching the gradient
?
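"Detaching the gradient" here means cutting the computation graph at the hidden state carried over from the previous minibatch, so backpropagation stops at the minibatch boundary (truncated BPTT). A self-contained sketch; the model, sizes, and random data are all made up:
    import torch
    from torch import nn

    rnn = nn.RNN(input_size=8, hidden_size=16)
    head = nn.Linear(16, 8)
    opt = torch.optim.SGD(list(rnn.parameters()) + list(head.parameters()), lr=0.1)
    loss = nn.MSELoss()

    state = None
    for _ in range(5):                       # stand-in for iterating over consecutive minibatches
        X = torch.randn(10, 4, 8)            # (num_steps, batch_size, input_size)
        Y = torch.randn(10, 4, 8)
        if state is not None:
            state = state.detach()           # cut the graph: gradients do not flow into earlier minibatches
        out, state = rnn(X, state)
        l = loss(head(out), Y)
        opt.zero_grad()
        l.backward()
        opt.step()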
Using the chain rule yields
?
Whenever ξ_t = 0 the recurrent computation terminates at that time step t.
?
While we can use the chain rule to compute ∂h_t/∂w_h recursively, this chain can get very long whenever t is large. Let’s discuss a number of strategies for dealing with this problem.
I don't understand why this substitution can be made.
where computation of h_{t−1} also depends on w_h
?
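For reference, the recurrence these lines build up to, with h_t = f(x_t, h_{t−1}, w_h): the total derivative has a direct term plus a term that chains through h_{t−1}, which itself depends on w_h, and unrolling that second term is what makes the chain grow with t:
$$
\frac{\partial h_t}{\partial w_h}
= \frac{\partial f(x_t, h_{t-1}, w_h)}{\partial w_h}
+ \frac{\partial f(x_t, h_{t-1}, w_h)}{\partial h_{t-1}}\,
  \frac{\partial h_{t-1}}{\partial w_h}.
$$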
Having a small value for this upper bound might be viewed as good or bad. On the downside, we are limiting the speed at which we can reduce the value of the objective. On the bright side, this limits by just how much we can go wrong in any one gradient step.
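Assuming this is the gradient-clipping bound g ← min(1, θ/‖g‖) g, a sketch of the clipping step that produces it; with it, one update can move the parameters by at most the learning rate times θ (net is assumed to be an nn.Module whose gradients were filled in by backward()):
    import torch

    def clip_gradients(net, theta):
        # rescale all gradients jointly so that their overall L2 norm is at most theta
        params = [p for p in net.parameters() if p.grad is not None]
        norm = torch.sqrt(sum(torch.sum(p.grad ** 2) for p in params))
        if norm > theta:
            for p in params:
                p.grad[:] *= theta / norm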
exponentially
why?
There will be many plausible three-word combinations that we likely will not see in our dataset.
?
formulae
What is the relationship between the independence assumptions and the unigram, bigram, and trigram models?
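Written out for a sequence of four tokens, the unigram, bigram, and trigram models correspond to the following independence (Markov) assumptions, conditioning on zero, one, and two preceding tokens respectively:
$$
\begin{aligned}
P(x_1, x_2, x_3, x_4) &\approx P(x_1)\,P(x_2)\,P(x_3)\,P(x_4),\\
P(x_1, x_2, x_3, x_4) &\approx P(x_1)\,P(x_2 \mid x_1)\,P(x_3 \mid x_2)\,P(x_4 \mid x_3),\\
P(x_1, x_2, x_3, x_4) &\approx P(x_1)\,P(x_2 \mid x_1)\,P(x_3 \mid x_1, x_2)\,P(x_4 \mid x_2, x_3).
\end{aligned}
$$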
After all, we will significantly overestimate the frequency of the tail, also known as the infrequent words.
Why will it overestimate?
frequency
Why is this described in terms of frequency?
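The frequency wording comes from how these probabilities are estimated: relative n-gram counts in a corpus stand in for the probabilities, and most higher-order n-grams sit in a long tail of very small counts. A small counting sketch; the corpus file name is made up:
    import collections
    import re

    with open('corpus.txt') as f:                          # hypothetical corpus file
        words = re.findall(r'[a-z]+', f.read().lower())

    unigrams = collections.Counter(words)
    trigrams = collections.Counter(zip(words[:-2], words[1:-1], words[2:]))
    print(unigrams.most_common(3))
    print(sum(1 for c in trigrams.values() if c == 1))     # trigrams seen exactly once: the long tail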
Even today’s massive RNN- and Transformer-based language models seldom incorporate more than thousands of words of context.
How much text do large models take as input at a time?
probabilistic classifier
Classifying from a set of different probability distributions?
compare
Not prediction, but comparison?