155 Matching Annotations
  1. Dec 2019
    1. The quality of word representations is generally gauged by its ability to encode syntactical information and handle polysemic behavior (or word senses). These properties result in improved semantic word representations. Recent approaches in this area encode such information into its embeddings by leveraging the context. These methods provide deeper networks that calculate word representations as a function of its context.
      • Syntactical information
      • Polysemic behavior (word senses)
      • Semantic word representations

      I understand that handling word senses means the word representations yield similar measures for similar words.

      What exactly is syntactic information? And how does it relate to semantic word representations?
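      One way to see the polysemy point concretely — a minimal sketch that assumes the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is named in the quoted passage): the same surface word gets a different contextual vector in different sentences, which is how these deeper, context-dependent models handle word senses.

      import torch
      from transformers import AutoTokenizer, AutoModel

      tok = AutoTokenizer.from_pretrained("bert-base-uncased")
      model = AutoModel.from_pretrained("bert-base-uncased")

      def word_vector(sentence, word):
          # contextual vector of the first occurrence of `word` in `sentence`
          enc = tok(sentence, return_tensors="pt")
          with torch.no_grad():
              hidden = model(**enc).last_hidden_state[0]          # (seq_len, 768)
          tokens = tok.convert_ids_to_tokens(enc["input_ids"][0].tolist())
          return hidden[tokens.index(word)]

      v_money1 = word_vector("she deposited cash at the bank", "bank")
      v_money2 = word_vector("the bank approved the loan", "bank")
      v_river = word_vector("they walked along the river bank", "bank")

      cos = torch.nn.functional.cosine_similarity
      print(cos(v_money1, v_money2, dim=0))  # typically higher: same "bank" sense
      print(cos(v_money1, v_river, dim=0))   # typically lower: a different sense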

    2. One solution to this problem, as explored by Mikolov et al. (2013), is to identify such phrases based on word co-occurrence and train embeddings for them separately. More recent methods have explored directly learning n-gram embeddings from unlabeled data (Johnson and Zhang, 2015).

      Word co-occurrence I can understand, but not training the embeddings separately. Would it mean treating the co-occurring words as a single unit in the embedding, instead of each word on its own?
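      A rough sketch of the Mikolov et al. (2013) idea: bigrams whose words co-occur much more often than chance are merged into a single token (e.g. new_york), and that merged token is then trained like any other word, so it gets its own embedding row. The corpus, discount and threshold below are toy values.

      from collections import Counter

      corpus = [
          "i flew to new york last week".split(),
          "new york is cold in winter".split(),
          "i like york peppermint patties".split(),
      ]

      unigrams = Counter(w for sent in corpus for w in sent)
      bigrams = Counter(p for sent in corpus for p in zip(sent, sent[1:]))

      delta, threshold = 1, 0.05   # toy values; the discount keeps rare bigrams out

      def score(a, b):
          # Mikolov et al. (2013): (count(a b) - delta) / (count(a) * count(b))
          return (bigrams[(a, b)] - delta) / (unigrams[a] * unigrams[b])

      def merge_phrases(sent):
          out, i = [], 0
          while i < len(sent):
              if i + 1 < len(sent) and score(sent[i], sent[i + 1]) > threshold:
                  out.append(sent[i] + "_" + sent[i + 1])   # one token -> one embedding row
                  i += 2
              else:
                  out.append(sent[i])
                  i += 1
          return out

      print(merge_phrases("i flew to new york".split()))
      # ['i', 'flew', 'to', 'new_york']: 'new_york' is then trained like any other word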

    3. The context words are assumed to be located symmetrically to the target words within a distance equal to the window size in both directions.

      What does it mean to say the context words are "symmetrically located" around the target word?
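      A quick sketch of what the symmetric window means: for a window size of 2, the context of each target word is the 2 words immediately to its left plus the 2 words immediately to its right, i.e. the same distance in both directions (the sentence and window size below are just illustrative).

      def context_pairs(tokens, window=2):
          pairs = []
          for i, target in enumerate(tokens):
              left = tokens[max(0, i - window):i]        # up to `window` words before
              right = tokens[i + 1:i + 1 + window]       # up to `window` words after
              pairs.append((target, left + right))
          return pairs

      for target, context in context_pairs("the quick brown fox jumps".split()):
          print(target, "->", context)
      # e.g. brown -> ['the', 'quick', 'fox', 'jumps']  (two words on each side)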

    4. This led to the motivation of learning distributed representations of words existing in low-dimensional space (Bengio et al., 2003).

      This is about the curse of dimensionality. Now, what would distributed word representations in lower-dimensional spaces be? It reminds me of PCA and the like.
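      One way to picture the contrast (a toy numpy sketch; the vocabulary and dimensions are illustrative): a one-hot representation needs one dimension per vocabulary word and keeps all words equally far apart, while a distributed representation maps each word to a short dense vector whose coordinates are learned. PCA/SVD over co-occurrence counts (as in LSA) is indeed one classical way to obtain such a space; Bengio et al. (2003) instead learn it jointly with a neural language model.

      import numpy as np

      vocab = ["king", "queen", "man", "woman", "apple", "banana", "car", "train"]
      V = len(vocab)                # real vocabularies have 10^5+ words

      # one-hot: V dimensions, sparse, and all distinct words are equally far apart
      one_hot = np.eye(V)

      # distributed: a dense (V, d) table with d << V; random values stand in for
      # coordinates that would be learned from data
      d = 3
      rng = np.random.default_rng(0)
      embeddings = rng.normal(size=(V, d))

      print(one_hot[vocab.index("king")])      # 8 entries; grows with the vocabulary
      print(embeddings[vocab.index("king")])   # 3 dense entries; fixed as the vocabulary grows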

    1. The word vector is the arrow from the point where all three axes intersect to the end point defined by the coordinates.

      Each of the three axes corresponds to one context.
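      A toy worked example of this picture (the axes and numbers are made up): each axis is one context, a word vector is the arrow from the origin to its coordinates, and arrows pointing in similar directions have high cosine similarity.

      import numpy as np

      axes = ["royalty", "food", "sports"]     # the three context axes (invented)
      king = np.array([8.0, 1.0, 0.5])         # mostly along the "royalty" axis
      queen = np.array([7.5, 1.5, 0.2])
      pizza = np.array([0.2, 9.0, 0.1])        # mostly along the "food" axis

      def cosine(u, v):
          return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

      print(cosine(king, queen))   # ~0.99: the two arrows point the same way
      print(cosine(king, pizza))   # ~0.15: nearly orthogonal arrows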

  2. Jul 2019
  3. May 2019
  4. Apr 2019
  5. Mar 2019
    1. A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification

  6. arxiv.org
    1. To the best of our knowledge, there has not been any other work exploring the use of attention-based architectures for NMT

      At that point, no one else had used attention for machine translation.

  7. Feb 2019
    1. The road to the question-answering championship: QANet, doing QA with CNNs

      Is there some commonality between question answering and chatbots?

      QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension

    1. To overcome this issue, we explore data generation using templates and terminologies and data augmentation approaches. Namely, we report our experiments using paraphrasing and word representations learned on a large EHR corpus with Fasttext and ELMo, to learn a NLU model without any available dataset. We evaluate on a NLU task of natural language queries in EHRs divided in slot-filling and intent classification sub-tasks. On the slot-filling task, we obtain a F-score of 0.76 with the ELMo representation; and on the classification task, a mean F-score of 0.71. Our results show that this method could be used to develop a baseline system

      Data is very scarce in the biomedical domain; to address this, the authors try template- and terminology-based data augmentation techniques. Word vectors are first built with ELMo on a large corpus, and evaluation is split into two sub-tasks: slot filling and intent classification.

      A rather application-oriented paper; the results don't really prove much.
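      A toy illustration of what template + terminology generation could look like (the templates, intents and terms below are invented, not the paper's actual resources): each template carries an intent label and slot placeholders, and terminology lists fill the placeholders to produce labelled utterances for slot filling and intent classification.

      from itertools import product

      # invented templates: (utterance pattern with slot placeholders, intent label)
      templates = [
          ("show me the {exam} results of {patient}", "retrieve_exam"),
          ("when was {patient} last prescribed {drug}", "retrieve_prescription"),
      ]
      # invented terminology lists used to fill the placeholders
      terminology = {
          "exam": ["blood count", "mri"],
          "patient": ["patient 42"],
          "drug": ["aspirin"],
      }

      def generate(template, intent):
          slots = [s for s in terminology if "{" + s + "}" in template]
          for values in product(*(terminology[s] for s in slots)):
              filling = dict(zip(slots, values))
              # the filled utterance, its intent label, and its slot labels
              yield template.format(**filling), intent, filling

      for pattern, intent in templates:
          for utterance, intent_label, slot_labels in generate(pattern, intent):
              print(intent_label, "|", utterance, "|", slot_labels)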

  8. www.iro.umontreal.ca
    1. bi-directional Jordan-type network that takes into account both past and future dependencies among slots works best

      A bi-directional Jordan-type network works best for the slot-filling task.
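      A minimal sketch of the idea (dimensions and names are illustrative; this is not the paper's exact model): a Jordan-type recurrence feeds the previous output back into the hidden layer, and running one pass left-to-right and one right-to-left before combining the per-token slot scores captures both past and future dependencies among slots.

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class JordanSlotTagger(nn.Module):
          def __init__(self, vocab_size, num_slots, emb_dim=100, hid_dim=128):
              super().__init__()
              self.emb = nn.Embedding(vocab_size, emb_dim)
              self.num_slots = num_slots
              # the hidden layer sees the current word and the previous output (Jordan-style feedback)
              self.hidden = nn.Linear(emb_dim + num_slots, hid_dim)
              self.out = nn.Linear(hid_dim, num_slots)

          def _pass(self, embs):
              # one directional pass of the Jordan-type recurrence over (seq_len, emb_dim)
              prev_out = torch.zeros(self.num_slots)
              logits = []
              for x_t in embs:
                  h_t = torch.tanh(self.hidden(torch.cat([x_t, prev_out])))
                  o_t = self.out(h_t)
                  prev_out = F.softmax(o_t, dim=-1)   # output fed back at the next step
                  logits.append(o_t)
              return torch.stack(logits)

          def forward(self, token_ids):
              embs = self.emb(token_ids)               # (seq_len, emb_dim)
              fwd = self._pass(embs)                   # past dependencies
              bwd = self._pass(embs.flip(0)).flip(0)   # future dependencies
              return (fwd + bwd) / 2                   # per-token slot scores

      tagger = JordanSlotTagger(vocab_size=5000, num_slots=10)
      scores = tagger(torch.tensor([4, 17, 256, 3]))   # a 4-token utterance (toy ids)
      print(scores.shape)                              # torch.Size([4, 10])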

    2. Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding

    1. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    1. 10 Exciting Ideas of 2018 in NLP

      The 10 big NLP highlights of 2018

      • 1 Unsupervised machine translation
      • 2 Pre-trained language models
      • 3 Common-sense reasoning datasets
      • 4 Meta-learning
      • 5 Robust unsupervised methods
      • 6 Truly understanding what the learned representations capture; pre-trained language models really do show behaviour similar to pre-trained image models
      • 7 Successful applications of multi-task learning
      • 8 Combining semi-supervised learning with transfer learning
      • 9 QA and reasoning datasets built on large amounts of text
      • 10 Inductive bias
    2. Phrase-Based & Neural Unsupervised Machine Translation (EMNLP 2018)
  9. Jan 2019
  10. Dec 2018
    1. A semantic treebank is a collection of natural language sentences annotated with a meaning representation. These resources use a formal representation of each sentence's semantic structure.