336 Matching Annotations
  1. Apr 2020
    1. 1A. and John - 2015 - Survey on Chatbot Design Techniques in Speech Conv.pdf
  2. Mar 2019
  3. static.googleusercontent.com static.googleusercontent.com
    1. Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks


      "Deep Compression" can reduce the model size by 18× to 49× without hurting the prediction accuracy. We also discovered that pruning and the sparsity constraint not only applies to model compression but also applies to regularization, and we proposed dense-sparse-dense training (DSD), which can improve the prediction accuracy for a wide range of deep learning models. To efficiently implement "Deep Compression" in hardware, we developed EIE, the "Efficient Inference Engine", a domain-specific hardware accelerator that performs inference directly on the compressed model which significantly saves memory bandwidth. Taking advantage of the compressed model, and being able to deal with the irregular computation pattern efficiently, EIE improves the speed by 13× and energy efficiency by 3,400× over GPU

    1. A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification

    1. A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation


    1. Revisiting the tree edit distance and its backtracing: A tutorial

  4. arxiv.org arxiv.org
    1. To the best of our knowledge, there has not been any other work exploring the use of attention-based architectures for NMT


    1. One of the challenges of deep learning is that the gradients with respect to the weights in one layer are highly dependent on the outputs of the neurons in the previous layer especially if these outputs change in a highly correlated way. Batch normalization [Ioffe and Szegedy, 2015] was proposed to reduce such undesirable “covariate shift”. The method normalizes the summed inputs to each hidden unit over the training cases. Specifically, for the i-th summed input in the l-th layer, the batch normalization method rescales the summed inputs according to their variances under the distribution of the data

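      Batch normalization was introduced to reduce the strong dependence between a neuron's inputs and the values currently being computed. Computing the expectation exactly would require all training samples, which is clearly impractical, so the statistics are instead estimated over the same mini-batch used for training. This shifts the limitation onto the mini-batch size and makes the method hard to apply to RNNs, which is why Layer Normalization is needed.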

    2. Layer Normalization
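      The contrast drawn in the note above can be illustrated with a minimal sketch. It assumes plain Python lists in place of tensors and leaves out the learned gain and bias; the function names and the `eps` value are illustrative choices, not taken from either paper. Batch normalization computes statistics per hidden unit across the mini-batch, whereas layer normalization computes them per example across its hidden units, so it does not depend on the batch size.

```python
import math

def _normalize(values, eps=1e-5):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def batch_norm(batch):
    """Normalize each hidden unit across the examples in the mini-batch."""
    columns = [_normalize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]

def layer_norm(batch):
    """Normalize each example across its own hidden units."""
    return [_normalize(row) for row in batch]

# Summed inputs for a mini-batch of 2 examples with 3 hidden units each.
batch = [[1.0, 2.0, 3.0],
         [2.0, 4.0, 6.0]]

bn = batch_norm(batch)
ln = layer_norm(batch)

# With a mini-batch of one example, batch statistics carry no information:
# every unit is normalized to zero, while layer_norm is unaffected.
single_bn = batch_norm([[1.0, 2.0, 3.0]])
single_ln = layer_norm([[1.0, 2.0, 3.0]])
```

      The single-example case at the end is the situation the note points at: the batch statistics degenerate, while the per-example statistics do not.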

  5. arxiv.org arxiv.org

    1. The goal here is explicitly not to improve the state of the art in the narrow domain of restaurant booking, but to take a narrow domain where traditional handcrafted dialog systems are known to perform well, and use that to gauge the strengths and weaknesses of current end-to-end systems with no domain knowledge



    2. Unsurprisingly, perfectly coded rule-based systems can solve the simulated tasks T1-T5 perfectly, whereas our machine learning methods cannot. However, it is not easy to build an effective rule-based




    4. We implemented a rule-based system for this task in the following way. We initialized a dialog state using the 3 relevant slots for this task: cuisine type, location and price range. Then we analyzed the training data and wrote a series of rules that fire for triggers like word matches, positions in the dialog, entity detections or dialog state, to output particular responses, API calls and/or update a dialog state. Responses are created by combining patterns extracted from the training set with entities detected in the previous turns or stored in the dialog state. Overall we built 28 rules and extracted 21 patterns. We optimized the choice of rules and their application priority (when needed) using the validation set, reaching a validation per-response accuracy of 40.7%. We did not build a rule-based system for Concierge data as it is even less constrained.
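      The design described in the excerpt above (a three-slot dialog state plus prioritized rules that fire on word matches, dialog position, or state) might look roughly like the sketch below. The lexicon, the three rules, and the response strings are hypothetical stand-ins; the paper's actual 28 rules and 21 patterns are not reproduced here.

```python
from dataclasses import dataclass, field

SLOTS = ("cuisine", "location", "price")  # the three slots named in the excerpt

# Hypothetical entity lexicon used for word-match entity detection.
LEXICON = {
    "cuisine": {"italian", "french"},
    "location": {"paris", "rome"},
    "price": {"cheap", "expensive"},
}

@dataclass
class DialogState:
    slots: dict = field(default_factory=lambda: {s: None for s in SLOTS})

def update_state(state, utterance):
    """Fill slots whenever a word of the utterance matches the lexicon."""
    for word in utterance.lower().split():
        for slot, values in LEXICON.items():
            if word in values:
                state.slots[slot] = word

def rule_greet(state, utterance, turn):
    # Trigger: position in the dialog (first turn) and an empty state.
    if turn == 0 and not any(state.slots.values()):
        return "hello, what can I help you with today?"

def rule_api_call(state, utterance, turn):
    # Trigger: dialog state is complete.
    if all(state.slots.values()):
        return "api_call {cuisine} {location} {price}".format(**state.slots)

def rule_ask_missing(state, utterance, turn):
    # Trigger: some slot is still unfilled.
    for slot in SLOTS:
        if state.slots[slot] is None:
            return f"which {slot} would you like?"

# Rules are tried in priority order; the first one that fires wins.
RULES = [rule_greet, rule_api_call, rule_ask_missing]

def respond(state, utterance, turn):
    update_state(state, utterance)
    for rule in RULES:
        response = rule(state, utterance, turn)
        if response is not None:
            return response
    return "sorry, I did not understand."

state = DialogState()
r0 = respond(state, "hi", 0)
r1 = respond(state, "cheap italian food", 1)
r2 = respond(state, "in paris", 2)
```

      Reordering `RULES` changes which response wins when several rules could fire, which is the "application priority" the excerpt says was tuned on the validation set.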



    1. A Network-based End-to-End Trainable Task-oriented Dialogue System

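      This end-to-end system uses a CNN plus LSTM for intent recognition and an LSTM for state management (belief state tracking). For the policy it defines its own scheme: a linear model over the preceding output vectors produces the output.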

    2. Finally, the policy network output is generated by a three-way matrix transformation,


    3. a distributed representation generated by an intent network and a probability distribution over slot-value pairs called the belief state

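      The paper introduces the notion of a belief state:

      alongside the distributed representation produced by the intent network, the probability distribution over slot-value pairs is what is called the belief state.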

    1. Neural Approaches to Conversational AI

      Question Answering, Task-Oriented Dialogues and Social Chatbots

      The present paper surveys neural approaches to conversational AI that have been developed in the last few years. We group conversational systems into three categories: (1) question answering agents, (2) task-oriented dialogue agents, and (3) chatbots. For each category, we present a review of state-of-the-art neural approaches, draw the connection between them and traditional approaches, and discuss the progress that has been made and challenges still being faced, using specific systems and models as case studies

    1. An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog

    1. In learning such neural network based dialog model, we propose hybrid offline training and online interactive learning methods. We first let the agent to learn from human-human conversations with offline supervised training. We then improve the agent further by letting it to interact with users and learn from user demonstrations and feedback with imitation and reinforcement learning.


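      • 1 First, offline supervised learning on human-human dialogue data
      • 2 Then the model interacts with people and learns from their feedback and demonstrations with imitation and reinforcement learning

      To address sample efficiency, two schemes are proposed: learning-from-user and learning-from-simulation.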

    2. We design neural network based dialog system that is able to robustly track dialog state, interface with knowledge bases, and incorporate structured query results into system responses to successfully complete task-oriented dialog.


    3. End-to-End Learning of Task-Oriented Dialogs


    1. To ameliorate the effect of dialogue state distribution mismatch between offline training and RL interactive learning, we propose a hybrid imitation and reinforcement learning method. We first let the agent to interact with users using its own policy learned from supervised pre-training. When an agent makes a mistake, we ask users to correct the mistake by demonstrating the agent the right actions to take at each turn. This user corrected dialogue sample, which is guided by the agent’s own policy, is then added to the existing training corpus.


    2. A potential drawback with such pre-training approach is that the model may suffer from the mismatch of dialogue state distributions between supervised training and interactive learning stages. While interacting with users, the agent’s response at each turn has a direct influence on the distribution of dialogue state that the agent will operate on in the upcoming dialogue turns.

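      Policy learning is also an important part of the dialogue process. Recent approaches improve learning with supervised pre-training followed by online reinforcement learning. This has a potential flaw: the offline data is limited in size, so once the agent hits an uncommon situation online it may fail to recover. (Is this problem only conjecture? Is there any empirical evidence?)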


    3. These system components are usually trained independently, and their optimization targets may not fully align with the overall system evaluation criteria (e.g. task success rate and user satisfaction). Moreover, errors made in the upper stream modules of the pipeline propagate to downstream components and get amplified, making it hard to track the source of errors

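      Problems with the traditional pipeline approach: 1. The process is fairly complex; each step is trained independently, yet each step's input depends on the previous step's output, so errors are amplified and hard to trace.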

    4. Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems
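      The imitation step quoted in this group (run the agent with its own policy, let users correct its mistakes, add the corrected samples to the corpus) can be sketched as a toy data-aggregation loop. Everything here is a simplifying assumption: dialogue states are plain strings, "training" is a lookup table, the set of visited states is fixed rather than driven by the agent's responses, and the reinforcement learning stage is not shown.

```python
def fit_policy(corpus):
    """Toy supervised 'training': remember the latest action seen per state."""
    return {state: action for state, action in corpus}

def act(policy, state, fallback="ask_repeat"):
    """Return the learned action, or a fallback for states never seen."""
    return policy.get(state, fallback)

def collect_corrections(policy, visited_states, user_teacher):
    """Let the agent act with its own policy and record the user's
    demonstrated action at every turn where the agent got it wrong."""
    return [(s, user_teacher[s]) for s in visited_states
            if act(policy, s) != user_teacher[s]]

# Hypothetical offline corpus of (state, action) pairs.
corpus = [("greeting", "ask_cuisine"), ("has_cuisine", "ask_location")]

# Hypothetical user who knows the right action in every state,
# including one the offline corpus never covered.
user_teacher = {"greeting": "ask_cuisine",
                "has_cuisine": "ask_location",
                "has_location": "api_call"}

policy = fit_policy(corpus)
corrections = collect_corrections(policy, list(user_teacher), user_teacher)
corpus = corpus + corrections      # aggregate into the training corpus
policy = fit_policy(corpus)        # retrain on the enlarged corpus
```

      The point of the sketch is only that corrections are gathered on states reached under the agent's own policy, which is how the quoted method targets the state-distribution mismatch.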


    1. Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling

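      A single model solves two different kinds of problem: intent detection is classification and slot filling is sequence labeling. Both are handled with an attention-based RNN.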

    2. For joint modeling of intent detection and slot filling, we add an additional decoder for intent detection (or intent classification) task that shares the same encoder with slot filling decoder.


    3. The attention mechanism later introduced in [12] enables the encoder-decoder model to learn a soft alignment and to decode at the same time.


      D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv:1409.0473, 2014

    1. We present a general solution towards building task-oriented dialogue systems for online shopping, aiming to assist online customers in completing various purchase-related tasks, such as searching products and answering questions, in a natural language conversation manner. As a pioneering work, we show what & how existing natural language processing techniques, data resources, and crowdsourcing can be leveraged to build such task-oriented dialogue systems for E-commerce usage. To demonstrate its effectiveness, we integrate our system into a mobile online shopping application. To the best of our knowledge, this is the first time that an dialogue system in Chinese is practically used in online shopping scenario with millions of real consumers. Interesting and insightful observations are shown in the experimental part, based on the analysis of human-bot conversation log. Several current challenges are also pointed out as our future directions



      M = (I, C, A)

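      I is the intent, C is the product category, and A is the product attributes. M is the representation of the information extracted from the user query.

      Intent classification: PhraseLDA with 1000 topics

      Product classification: a CNN-based approach that resembles (Huang et al. 2013) and (Shen et al. 2014)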
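      As a data structure, the triple M = (I, C, A) from the note above could be held in a small record type. This is only an illustrative sketch: the class name, field types, and example values are assumptions, not the paper's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class QueryRepresentation:
    """M = (I, C, A): what the system extracts from one user query."""
    intent: str                                      # I, the detected intent
    category: str                                    # C, the product category
    attributes: dict = field(default_factory=dict)   # A, product attributes

    def as_tuple(self):
        return (self.intent, self.category, self.attributes)

# Hypothetical example values for illustration only.
m = QueryRepresentation(intent="recommendation",
                        category="phone",
                        attributes={"color": "black"})
```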

    2. Main actions that are considered in the online shopping scenario include


      • Recommendation
      • Comparison
      • Opinion Summary
      • Question Answering
      • Proactive Questioning
      • Chit-chat
    3. There are several places where people would like to post their purchase-related intents, i.e., search engines, community sites and social network.


    4. System Formalization


    5. To deal with the problem we mentioned, our work focus on using three kinds of data resources that are common to most E-commerce web service provider or easily crawled from webs, including: (i) product knowledge base, which is provided by the E-commerce partner and contains structured product information; (ii) search log, which is closely linked with products, natural language queries and user selection behaviors (mouse click); (iii) community sites, where user post their intents in natural language and can be used to mine purchase-related intents and paraphrases of product-related terms. Besides, we show that crowd sourcing is necessary to build such AI bot


      • 1 Structured product information
      • 2 User search logs
      • 3 Community sites, for mining purchase intents and product-related terms
    6. However, it is hard to collect a large scale human-human dialogue data in shopping scenario


    1. The Sogou Spoken Language Understanding System for the NLPCC 2018 Evaluation

    2. For intent identification, the trained classifier is applied to the training dataset itself and the error cases are believed to be under-represented by the training set. For each under-represented query q, it is then matched against our own query set using similarity metrics like ngram overlap ratio, edit distance, etc. Very harsh thresholds on the metrics are used. Those queries that satisfy the thresholds are taken as new samples and they are labeled with the same intent as q

      Under-represented queries are matched against the authors' own query set using similarity metrics such as n-gram overlap ratio and edit distance.

    3. The first step is lexical analysis, i.e. word segmentation and part-of-speech (POS) tagging. The words and POS labels are used as features in the subsequent models. For the shared task we used HanLP [1] as our Chinese lexical analyzer.

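      How the SLU model works:

      • 1 The first step is lexical analysis: word segmentation followed by part-of-speech tagging. The paper uses HanLP as the lexical analyzer.

      • 2 The second step is slot boundary detection, treated as sequence labeling with the BILOU scheme. Both character-based and word-based labelers are used. The character-based version is a CRF with a window of 7 using lexical and dictionary features; the word-based CRF uses a window size of 5 with lexical, POS, and dictionary features. A dictionary feature indicates whether the current character or word is a prefix, infix, or suffix of some entry in the entity dictionary. Each CRF emits n (3) outputs, and all 2n outputs are passed to the next step. The character-based labeler compensates for possible word segmentation errors.

      • 3 The third step is slot type classification, using a regularized logistic regression (LR) classifier. Features are the predicted slot, the surrounding characters and words, and the surrounding POS tags.

      • 4 The fourth step is slot correction, which addresses recognition errors introduced by ASR, using a search-based method. Since dictionaries exist for each slot type, if a predicted slot s of type T is not in the corresponding slot dictionary, s is used as a query to retrieve dictionary entries by minimum edit distance. The lookup is done twice, once with s as Chinese characters and once with s as pinyin. The final result is obtained by re-ranking the results of both lookups.

      • 5 The last step is intent classification, using XGBoost with its default parameters. Features are word tokens, query length, and the slots predicted in the earlier steps.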
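      Step 4 above (slot correction by minimum edit distance against a slot dictionary) can be sketched as follows. This is a simplified illustration: it performs only a single character-level lookup, omits the pinyin lookup and the re-ranking, and the `max_distance` cutoff and the example lexicon are assumptions added here.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def correct_slot(value, lexicon, max_distance=2):
    """Return `value` if it is in the lexicon; otherwise return the closest
    lexicon entry, provided it is within `max_distance` edits."""
    if value in lexicon:
        return value
    best = min(lexicon, key=lambda entry: edit_distance(value, entry))
    return best if edit_distance(value, best) <= max_distance else value

# Hypothetical slot dictionary for a "band" slot type.
bands = ["beatles", "eagles"]
fixed = correct_slot("beatls", bands)
unchanged = correct_slot("xyzxyz", bands)
```

      The same function would apply unchanged to pinyin strings, since it only compares sequences of characters.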

    4. Each rule is of the form “if the query q is listed in a particular lexicon L, and the preceding queries and their predicted domain labels satisfy certain conditions, then q is assigned a certain intent label and, with the exception of short commands, the entire q is regarded as a slot of the type corresponding to L.” The rules are arranged in sequential order in accordance with their priorities

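      The rules take the form: "if the query q is listed in a particular lexicon L, and the preceding queries and their predicted domain labels satisfy certain conditions, then q is assigned a particular intent label and, except for short commands, the entire query q is treated as a slot of the type corresponding to L." All rules are organized in priority order.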

    5. Figure 1 shows the framework of our SLU system, which consists of the context-dependent rules for entity-only queries and the context-independent model for queries with IISPs. The entire system feeds the query to the rules first. If the rule-based component returns null result, that means the query is judged to contain IISPs and the model-based component will continue to process it. Otherwise, it means the query is regarded as entity-only and the result of the rules is returned as the final output


    6. As in real use cases of dialog systems, the queries in the shared task can be roughly divided into two kinds, viz. queries with intent-indicating salient phrases and queries without. By intent-indicating salient phrase (IISP) it is meant a phrase in the query that shows the intent of the query. E.g. the phrase “” in the query “” and the phrases “” in the query “” are IISPs.
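      The control flow described in the excerpts above (context-dependent rules first, model-based component only when the rules return nothing) can be sketched like this. The lexicon, the intent and slot names, and the stub standing in for the model pipeline are all hypothetical.

```python
SONGS = {"hey jude", "yesterday"}   # hypothetical lexicon L

# Rules in priority order:
# (lexicon, required domain of the previous query, intent, slot type)
RULES = [
    (SONGS, "music", "music.play", "song"),
]

def apply_rules(query, prev_domains):
    """Return a result for entity-only queries, or None if no rule fires."""
    last = prev_domains[-1] if prev_domains else None
    for lexicon, required_domain, intent, slot_type in RULES:
        if query in lexicon and last == required_domain:
            # The entire query is taken as one slot of the lexicon's type.
            return {"intent": intent, "slots": {slot_type: query}}
    return None

def model_slu(query):
    """Placeholder for the model-based pipeline; returns a fixed dummy result."""
    return {"intent": "OTHERS", "slots": {}}

def understand(query, prev_domains):
    result = apply_rules(query, prev_domains)
    return result if result is not None else model_slu(query)

by_rule = understand("hey jude", ["music"])
by_model = understand("play hey jude", ["music"])
no_context = understand("hey jude", [])
```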


    1. Retrieval-based Methods. Retrieval-based methods choose a response from candidate responses. The key to retrieval-based methods is message-response matching. Matching algorithms have to overcome semantic gaps between messages and responses [28].


      B. Hu, Z. Lu, H. Li, and Q. Chen. Convolutional neural network architectures for matching natural language sentences. In Advances in neural information processing systems, pages 2042–2050, 2014.

      Single-turn matching: match(x, y) = x^T A y

      x: the vector representation of the message; y: the vector representation of the response.
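      For concreteness, the bilinear score x^T A y in the note above can be computed as in the short sketch below. It uses nested Python lists for the matrix A, and the vectors and matrices are made-up numbers; in practice A would be learned.

```python
def bilinear_match(x, A, y):
    """Return x^T A y for vectors x, y and matrix A given as nested lists."""
    Ay = [sum(a_ij * y_j for a_ij, y_j in zip(row, y)) for row in A]
    return sum(x_i * v for x_i, v in zip(x, Ay))

x = [1.0, 2.0]            # message vector
y = [3.0, 4.0]            # response vector
identity = [[1.0, 0.0],
            [0.0, 1.0]]   # with A = I the score reduces to the dot product
swap = [[0.0, 1.0],
        [1.0, 0.0]]       # A can also relate different dimensions of x and y

dot_score = bilinear_match(x, identity, y)
swap_score = bilinear_match(x, swap, y)
```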

      H. Wang, Z. Lu, H. Li, and E. Chen. A dataset for research on short-text conversations. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 935–945, Seattle, Washington, USA, October 2013. Association for Computational Linguistics

      Z. Lu and H. Li. A deep architecture for matching short texts. In International Conference on Neural Information Processing Systems, pages 1367–1375, 2013.

      B. Hu, Z. Lu, H. Li, and Q. Chen. Convolutional neural network architectures for matching natural language sentences. In Advances in neural information processing systems, pages 2042–2050, 2014

      M. Wang, Z. Lu, H. Li, and Q. Liu. Syntax-based deep matching of short texts. In IJCAI, 03 2015

      Y. Wu, W. Wu, Z. Li, and M. Zhou. Topic augmented neural network for short text conversation. CoRR, 2016


    2. TASK-ORIENTED DIALOGUE SYSTEMS. Task-oriented dialogue systems have been an important branch of spoken dialogue systems. In this section, we will review pipeline and end-to-end methods for task-oriented dialogue systems.


      • 1 pipeline, which consists of SLU + DST + PL + NLG
      • 2 end-to-end
    3. 2.2 End-to-End Methods


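      • One is the credit assignment problem: feedback from a user is hard to propagate to each upstream component.
      • The other is process interdependence: each component's input depends on the previous component's output, so changing one part means changing the others. (Is this really a problem?)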


      • A network-based end-to-end trainable task-oriented dialogue system
      • Learning end-to-end goal-oriented dialog.

      The following paper is the first to jointly train dialogue state tracking and policy learning, optimizing for more robust system behavior.

      • Towards end-to-end learning for dialog state tracking and management using deep reinforcement learning




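      • Language understanding (NLU/SLU). The goal is to parse the user input into an intent and slots.

      • Dialogue state tracker. Combines the current input with the dialogue history to produce the current dialogue state.

      • Dialogue policy learning. Chooses the next action to take based on the current dialogue state.

      • Natural language generation (NLG). Converts the selected action into the corresponding output response.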
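      The four pipeline components listed above can be wired together as in the toy sketch below. Each stage is a deliberately trivial keyword- or rule-based stand-in (real systems would use trained models), and the intent name, slot name, and response templates are hypothetical.

```python
def understand(utterance):
    """NLU/SLU: parse the utterance into an intent and slots."""
    words = utterance.lower().split()
    intent = "find_restaurant" if "restaurant" in words else None
    slots = {"cuisine": w for w in words if w in {"italian", "french"}}
    return {"intent": intent, "slots": slots}

def track_state(state, frame):
    """Dialogue state tracker: merge the new frame into the history."""
    return {"intent": frame["intent"] or state.get("intent"),
            "slots": {**state.get("slots", {}), **frame["slots"]}}

def choose_action(state):
    """Dialogue policy: pick the next system action from the state."""
    if state["intent"] is None:
        return ("request", "intent")
    if "cuisine" not in state["slots"]:
        return ("request", "cuisine")
    return ("inform", state["slots"]["cuisine"])

def generate(action):
    """NLG: turn the chosen action into a response string."""
    act, arg = action
    if act == "request":
        return f"Could you tell me the {arg}?"
    return f"I found a restaurant serving {arg} food."

def dialogue_turn(state, utterance):
    state = track_state(state, understand(utterance))
    return state, generate(choose_action(state))

state, reply1 = dialogue_turn({}, "I want a restaurant")
state, reply2 = dialogue_turn(state, "italian please")
```

      Because each stage only consumes the previous stage's output, an error made in `understand` flows straight through to the final reply, which is the error-propagation concern raised about pipelines elsewhere in these notes.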

    5. 2.1.3 Policy learning

      Policy learning. Based on the state representation from the state tracker, policy learning generates the next available system action. Either supervised learning or reinforcement learning can be used to optimize the policy. H. Cuayáhuitl, S. Keizer, and O. Lemon. Strategic dialogue management via deep reinforcement learning. arxiv.org, 2015.

      A rule-based agent is usually used to initialize the system. Z. Yan, N. Duan, P. Chen, M. Zhou, J. Zhou, and Z. Li. Building task-oriented dialogue systems for online shopping. In AAAI Conference on Artificial Intelligence, 2017

      Supervised learning is then applied to the actions generated by the rules (Building task-oriented dialogue systems for online shopping). With reinforcement learning (Strategic dialogue management via deep reinforcement learning), the results are reported to beat many rule-based and supervised systems.

    6. A statistical dialog system



      • S. Young, M. Gašić, S. Keizer, F. Mairesse, J. Schatzmann, B. Thomson, and K. Yu. The hidden information state model: A practical framework for POMDP-based spoken dialogue management. In the DSTC challenges, the output takes the form of a probability distribution over each slot at every dialogue turn. The statistical approaches include:
      • Rule sets: Z. Wang and O. Lemon. A simple and generic belief tracking mechanism for the dialog state tracking challenge: On the believability of observed information. In SIGDIAL Conference, pages 423–432, 2013
      • CRF: S. Lee and M. Eskenazi. Recipe for building robust spoken dialog state trackers: Dialog state tracking challenge system description. In SIGDIAL Conference, pages 414–422, 2013

        S. Lee. Structured discriminative model for dialog state tracking. In SIGDIAL Conference, pages 442–451, 2013

      H. Ren, W. Xu, Y. Zhang, and Y. Yan. Dialog state tracking using conditional random fields. In SIGDIAL Conference, pages 457–461, 2013.

      • Maximum entropy model: J. Williams. Multi-domain learning and generalization in dialog state tracking. In SIGDIAL Conference, pages 433–441, 2013.

      • Web-style ranking: J. D. Williams. Web-style ranking and slu combination for dialog state tracking

      Deep learning for state tracking: a sliding window is used to output a sequence of probability distributions over an arbitrary number of possible values. M. Henderson, B. Thomson, and S. Young. Deep neural network approach for the dialog state tracking challenge. In Proceedings of the SIGDIAL 2013 Conference, pages 467–471, 2013

      Multi-domain RNN state tracking model: B. Thomson, M. Gasic, P.-H. Su, D. Vandyke, T.-H. Wen, and S. Young. Multi-domain dialog state tracking using recurrent neural networks.

      Detecting slot-value pairs with the neural belief tracker (NBT): Neural belief tracker: Data-driven dialogue state tracking.

    7. Dialogue State Tracking

      Tracking the dialogue state is central to keeping a dialog system robust. The main goal is to estimate the user's goal at every turn. The classic state structure is usually called slot filling or a semantic frame.

      Traditional hand-crafted rule methods: D. Goddeau, H. Meng, J. Polifroni, S. Seneff, and S. Busayapongchai. A form-based dialogue manager for spoken language applications. In Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on, volume 2, pages 701–704. IEEE, 1996

      Rule-based methods are prone to frequent errors, and many of their results are not the desired ones. J. D. Williams. Web-style ranking and slu combination for dialog state tracking. In SIGDIAL Conference, pages 282–291, 2014

    8. Slot filling

      Slot filling is mostly treated as a sequence labeling problem: every word in the sentence receives a semantic label. The input is a sentence made up of words, and the output is the slot/concept ID for each word.

      DBN-based approaches:

      • A. Deoras and R. Sarikaya. Deep belief network based semantic taggers for spoken language understanding.

        L. Deng, G. Tur, X. He, and D. Hakkani-Tur. Use of kernel deep convex networks and end-to-end learning for spoken language understanding


      • G. Mesnil, X. He, L. Deng, and Y. Bengio. Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding. Interspeech, 2013.
      • K. Yao, G. Zweig, M. Y. Hwang, Y. Shi, and D. Yu. Recurrent neural networks for language understanding. In Interspeech, 2013
      • R. Sarikaya, G. E. Hinton, and B. Ramabhadran. Deep belief nets for natural language call-routing
      • K. Yao, B. Peng, Y. Zhang, D. Yu, G. Zweig, and Y. Shi. Spoken language understanding using long short-term memory neural networks. In IEEE Institute of Electrical & Electronics Engineers, pages 189–194, 2014
    9. Language Understanding

      The goal is to map a user utterance/query to its semantic slots. Slots are predefined according to the scenario. There are typically two kinds of representation: sentence-level categories, such as the user's intent and the utterance category, and word-level information extraction, such as named entities and slot filling.

      Intent detection identifies the user's intent from an utterance. Deep-learning-based intent detection: L. Deng, G. Tur, X. He, and D. Hakkani-Tur. Use of kernel deep convex networks and end-to-end learning for spoken language understanding. In Spoken Language Technology Workshop (SLT), 2012 IEEE, pages 210–215. IEEE, 2012

      G. Tur, L. Deng, D. Hakkani-Tür, and X. He. Towards deeper understanding: Deep convex networks for semantic utterance classification. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, pages 5045–5048. IEEE, 2012.

      D. Yann, G. Tur, D. Hakkani-Tur, and L. Heck. Zero-shot learning and clustering for semantic utterance classification using deep learning. 2014.

      In particular, this one uses a CNN to extract query vectors for query classification. H. B. Hashemi, A. Asiaee, and R. Kraft. Query intent detection using convolutional neural networks. In International Conference on Web Search and Data Mining, Workshop on Query Understanding, 2016

      P.-S. Huang, X. He, J. Gao, L. Deng, A. Acero, and L. Heck. Learning deep structured semantic models for web search using clickthrough data. In Proceedings of the 22nd ACM international conference on Conference on information & knowledge management, pages 2333–2338. ACM, 2013

      Y. Shen, X. He, J. Gao, L. Deng, and G. Mesnil. Learning semantic representations using convolutional neural networks for web search. In Proceedings of the 23rd International Conference on World Wide Web, pages 373–374. ACM, 2014.

  6. Feb 2019
    1. Spoken language understanding (SLU) comprises two tasks, intent identification and slot filling. That is, given the current query along with the previous queries in the same session, an SLU system predicts the intent of the current query and also all slots (entities or labels) associated with the predicted intent. The significance of SLU lies in that each type of intent corresponds to a particular service API and the slots correspond to the parameters required by this API. SLU helps the dialog system to decide how to satisfy the user’s need by calling the right service with the right information



      • 1 Complexity of intent classification
      • 2 World knowledge
      • 3 User state
    1. The Road to a Question Answering Championship: QANet, Using CNNs for Question Answering


      QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension

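    1. Dialogue management can also be viewed as a classification task in which each dialogue state corresponds to a suitable dialogue action. As with other supervised learning tasks, the classifier can be trained from an annotated corpus. However, the action the system should choose in a given state should not merely imitate the action paired with that state in the training data; it should be the action that leads to a successful dialogue. It is therefore more appropriate to treat the dialogue process as a decision process and to optimize action selection according to the overall success of the dialogue [32]. This makes it a planning problem, and reinforcement learning [33] can be used to learn an optimal result
    2. From the standpoint of ontology structure and business logic, dialogue systems can be divided into domain task-oriented and open-ended information interaction. Domain task-oriented systems target a specific application domain and have fairly clear definitions of business semantic units, ontology structure, and the scope of user goals, for example flight lookup, video search, and device control; such interactions usually aim to complete a specific operational task. Open-ended information interaction does not target a specific domain, or rather covers a very broad range of domains; its goal is not a business task but to satisfy other user needs, such as open encyclopedic question answering and chat. Although it can showcase the power of artificial intelligence to some extent, it does not focus on helping people solve real-world tasks, so its practical range of use is relatively narrow. In recent years, with the rapid development of mobile devices, task-oriented natural human-machine dialogue systems and the related cognitive control theory have received increasing attention from academia and industry, and they are the focus of this paper
    3. Cognitive Technology in Task-Oriented Human-Machine Dialogue Systems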

    1. We complement recent work by showing the effectiveness of simple sequence-to-sequence neural architectures with a copy mechanism. Our model outperforms more complex memory-augmented models by 7% in per-response generation and is on par with the current state-of-the-art on DSTC2, a real-world task-oriented dialogue dataset

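      A simple seq2seq framework with a copy mechanism outperforms more complex memory-augmented models by 7 points in per-response generation and is on par with the current state of the art on the real-world DSTC2 dataset.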

    2. A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue

      Task-oriented dialogue focuses on conversational agents that participate in dialogues with user goals on domain-specific topics. In contrast to chatbots, which simply seek to sustain open-ended meaningful discourse, existing task-oriented agents usually explicitly model user intent and belief states. This paper examines bypassing such an explicit representation by depending on a latent neural embedding of state and learning selective attention to dialogue history together with copying to incorporate relevant prior context. We complement recent work by showing the effectiveness of simple sequence-to-sequence neural architectures with a copy mechanism. Our model outperforms more complex memory-augmented models by 7% in per-response generation and is on par with the current state-of-the-art on DSTC2, a real-world task-oriented dialogue dataset

    1. Both NLU and NLG are implemented with template-based models


    2. with baselines in terms of three evaluation metrics following Li et al. (2017) and Peng et al. (2017a; 2017b), namely, success rate, average reward and the average number of turns per dialogue session. As for classification models, we use accuracy as the metric


      Baselines: 1. SVM 2. random agent 3. rule-based

    3. Symptom Normalization. After symptom expression identification, medical experts manually link each symptom expression to the most relevant concept on SNOMED CT for normalization. Table 2 shows some phrases that describe symptoms in the example and some related concepts in SNOMED CT. The overview of dataset is presented in Table


    4. Symptom Extraction. We follow the BIO (begin-in-out) schema for symptom identification (Figure 1). Each Chinese character is assigned a label of "B", "I" or "O". Also, each extracted symptom expression is tagged with True or False indicating whether the patient suffers from this symptom or not. In order to improve the annotation agreement between annotators, we create two guidelines for the self-report and the conversational data respectively. Each record is annotated by at least two annotators. Any inconsistency would be further judged by the third one. The Cohen’s kappa coefficient between two annotators are 71% and 67% for self-reports and conversations respectively

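      Symptom extraction uses the BIO format: each Chinese character is labeled "B", "I", or "O", and each extracted symptom is tagged True or False according to whether the patient actually has it. Each record is annotated by at least two annotators, with a third adjudicating any disagreement. Cohen's kappa is used to measure annotation agreement.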
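      As a reminder of what the agreement figure measures, Cohen's kappa compares observed agreement with the agreement expected by chance. The sketch below applies it to two made-up BIO label sequences; the data are invented for illustration and are unrelated to the paper's 71% and 67% figures.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Assumes equal-length, non-empty sequences and that chance agreement
    is below 1 (otherwise the denominator would be zero).
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical character-level BIO labels from two annotators.
annotator_1 = list("BIOOBIOO")
annotator_2 = list("BIOOBOOO")   # disagrees on one character

kappa = cohens_kappa(annotator_1, annotator_2)
perfect = cohens_kappa(annotator_1, annotator_1)
```

      Here the annotators agree on 7 of 8 characters (0.875), chance agreement is 26/64, and the resulting kappa is 15/19, noticeably lower than the raw agreement rate.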

    5. In this paper, we make a move to build a dialogue system for automatic diagnosis. We first build a dataset collected from an online medical forum by extracting symptoms from both patients’ self-reports and conversational data between patients and doctors. Then we propose a task-oriented dialogue system framework to make the diagnosis for patients automatically, which can converse with patients to collect additional symptoms beyond their self-reports. Experimental results on our dataset show that additional symptoms extracted from conversation can greatly improve the accuracy for disease identification and our dialogue system is able to collect these symptoms automatically and make a better diagnosis



    1. To overcome this issue, we explore data generation using templates and terminologies and data augmentation approaches. Namely, we report our experiments using paraphrasing and word representations learned on a large EHR corpus with Fasttext and ELMo, to learn a NLU model without any available dataset. We evaluate on a NLU task of natural language queries in EHRs divided in slot-filling and intent classification sub-tasks. On the slot-filling task, we obtain a F-score of 0.76 with the ELMo representation; and on the classification task, a mean F-score of 0.71. Our results show that this method could be used to develop a baseline system



    2. Natural language understanding for task oriented dialog in the biomedical domain in a low resources context

    1. PyDial: A Multi-domain Statistical Dialogue System Toolkit


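      The overall architecture comprises a Semantic Decoder, Belief Tracker, Policy Reply System, and Language Generator. Overall, the system supports rule-based decision processes throughout and also integrates model-based components. The source code is worth reading.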

  7. www.iro.umontreal.ca www.iro.umontreal.ca
    1. (a) Feed-forward NN; (b) Elman-RNN; (c) Jordan-RNN


    2. For the slot filling task, the input is the sentence consisting of a sequence of words, L, and the output is a sequence of slot/concept IDs, S, one for each word. In the statistical SLU systems, the task is often formalized as a pattern recognition problem: Given the word sequence L, the goal of SLU is to find the semantic representation of the slot sequence S that has the maximum a posteriori probability P(S|L).

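      For the slot filling task, the input is a sentence made up of a sequence of words, and the output is the slot/concept ID for each word. In statistical SLU systems the task can be framed as: given the word sequence L, find the slot sequence S that maximizes the posterior probability P(S|L).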
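      Written out as a formula, the maximum a posteriori objective stated above is the following. The symbol Ŝ for the selected slot sequence is a label introduced here; it does not appear in the excerpt.

```latex
\hat{S} = \arg\max_{S} \, P(S \mid L)
```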

    3. Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding

    4. bi-directional Jordan-type network that takes into account both past and future dependencies among slots works best

      The bidirectional Jordan-type network works best for slot filling.

    5. Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding

    1. the dataset used in our experiment has only the tags of filled information slots extracted by pattern matching between dialogue log and final order information

      The dataset consists of coffee-ordering dialogues: 31,567 dialogues and 142,412 utterance pairs. The only labels are the filled-slot tags obtained by pattern matching.

    2. the agent model swaps the input and output sequences, and it also takes the tag of filled information slots as an input which is extracted from dialogue in previous turns by pattern matching with the order information in ground truth

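      The agent model is pre-trained first. Its network structure matches the user model, but with input and output swapped, and it also takes as input the slots already filled in earlier turns. These two sources of information are not simply concatenated; instead, suitable attention weights are learned to make better use of the attention mechanism. No additional semantic or intent labels are needed.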

    3. By directly learning from the raw dialogue logs, the network takes the agent utterance Xa: xa1, xa2, ..., xan as the input sequence and takes corresponding user utterance Yu: yu1, yu2, ..., yum as the target sequence.

      [Figure 2: The network structure: encoder-decoder structure with the attention mechanism]

      User model. Uses a bidirectional LSTM directly, with the agent's utterance as X and the corresponding user utterance as Y.

    4. In the task-oriented dialogues, a user usually firstly shows the intention to the agent and then answers the agent’s questions one by one to specify the demand.


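      Users are usually passive and only occasionally ask a question of their own. In other words, the user mostly answers, within a single turn, the question posed by the agent. The user model can therefore be built on the assumption that the user only needs to consider one turn to respond, leaving multi-turn dialogue to the agent model.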

    5. we propose a uSer and Agent Model IntegrAtion (SAMIA) framework inspired by an observation that the roles of the user and agent models are asymmetric. Firstly, this SAMIA framework model the user model as a Seq2Seq learning problem instead of ranking or designing rules. Then the built user model is used as a leverage to train the agent model by deep reinforcement learning. In the test phase, the output of the agent model is filtered by the user model to enhance the stability and robustness. Experiments on a real-world coffee ordering dataset verify the effectiveness of the proposed SAMIA framework.


    6. Integrating User and Agent Models: A Deep Task-Oriented Dialogue System. Weiyan Wang, Yuxiang WU, Yu Zhang, Zhongqi Lu, Kaixiang Mo, Qiang Yang (Submitted on 10 Nov 2017). Task-oriented dialogue systems can efficiently serve a large number of customers and relieve people from tedious works. However, existing task-oriented dialogue systems depend on handcrafted actions and states or extra semantic labels, which sometimes degrades user experience despite the intensive human intervention. Moreover, current user simulators have limited expressive ability so that deep reinforcement Seq2Seq models have to rely on selfplay and only work in some special cases. To address those problems, we propose a uSer and Agent Model IntegrAtion (SAMIA) framework inspired by an observation that the roles of the user and agent models are asymmetric. Firstly, this SAMIA framework model the user model as a Seq2Seq learning problem instead of ranking or designing rules. Then the built user model is used as a leverage to train the agent model by deep reinforcement learning. In the test phase, the output of the agent model is filtered by the user model to enhance the stability and robustness. Experiments on a real-world coffee ordering dataset verify the effectiveness of the proposed SAMIA framework.

    1. Deep Reinforcement Learning for Dialogue Generation

      Recent neural models of dialogue generation offer great promise for generating responses for conversational agents, but tend to be shortsighted, predicting utterances one at a time while ignoring their influence on future outcomes. Modeling the future direction of a dialogue is crucial to generating coherent, interesting dialogues, a need which led traditional NLP models of dialogue to draw on reinforcement learning. In this paper, we show how to integrate these goals, applying deep reinforcement learning to model future reward in chatbot dialogue. The model simulates dialogues between two virtual agents, using policy gradient methods to reward sequences that display three useful conversational properties: informativity, coherence, and ease of answering (related to forward-looking function). We evaluate our model on diversity, length as well as with human judges, showing that the proposed algorithm generates more interactive responses and manages to foster a more sustained conversation in dialogue simulation. This work marks a first step towards learning a neural conversational model based on the long-term success of dialogues.
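The policy-gradient idea in the abstract above can be illustrated with a toy sketch. This is not the paper's model: the candidate responses, their per-property scores, the reward weights, and the helper names (`reward`, `train`) are all hypothetical, and a three-way softmax policy stands in for a neural generator. It only shows how a REINFORCE update shifts probability toward responses whose combined reward (ease of answering, informativity, coherence) is higher.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the three properties named in the abstract.
# The numbers and weights are illustrative, not taken from the paper.
CANDIDATES = {
    "I don't know.": {"ease": 0.1, "info": 0.2, "coherence": 0.5},
    "What kind of music do you like?": {"ease": 0.9, "info": 0.7, "coherence": 0.8},
    "Yes.": {"ease": 0.3, "info": 0.1, "coherence": 0.6},
}
WEIGHTS = {"ease": 0.25, "info": 0.25, "coherence": 0.5}

def reward(scores):
    """Weighted sum of the three property scores."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

def train(steps=5000, lr=0.2, seed=0):
    """REINFORCE with a moving-average baseline on a softmax policy."""
    rng = random.Random(seed)
    responses = list(CANDIDATES)
    logits = [0.0] * len(responses)
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        # Sample one response from the current policy.
        i = rng.choices(range(len(responses)), weights=probs)[0]
        r = reward(CANDIDATES[responses[i]])
        advantage = r - baseline
        baseline += 0.05 * (r - baseline)
        # d log pi(i) / d logit_j = 1[j == i] - probs[j]
        for j in range(len(logits)):
            grad = (1.0 if j == i else 0.0) - probs[j]
            logits[j] += lr * advantage * grad
    return dict(zip(responses, softmax(logits)))

if __name__ == "__main__":
    for response, prob in train().items():
        print(f"{prob:.3f}  {response}")
```

In this toy setup the question-style response has the highest combined reward, so training should concentrate probability on it; in the paper the policy is a sequence model and the rewards are computed from simulated dialogues between two agents.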

    1. Dialog System & Technology Challenge 6 Overview of Track 1 - End-to-End Goal-Oriented Dialog learning

      End-to-end dialog learning is an important research subject in the domain of conversational systems. The primary task consists in learning a dialog policy from transactional dialogs of a given domain. In this context, usable datasets are needed to evaluate learning approaches, yet remain scarce. For this challenge, a transaction dialog dataset has been produced using a dialog simulation framework developed and released by Facebook AI Research. Overall, nine teams participated in the challenge. In this report, we describe the task and the dataset. Then, we specify the evaluation metrics for the challenge. Finally, the results of the submitted runs of the participants are detailed.

    1. Intent Detection for code-mix utterances in task oriented dialogue systems

      Intent detection is an essential component of task oriented dialogue systems. Over the years, extensive research has been conducted resulting in many state of the art models directed towards resolving user’s intents in dialogue. A variety of vector representation for user utterances have been explored for the same. However, these models and vectorization approaches have more so been evaluated in a single language environment. Dialogue systems generally have to deal with queries in different languages and most importantly Code-Mix form of writing. Since Code-Mix texts are not bounded by a formal structure they are difficult to handle. We thus conduct experiments across combinations of models and various vector representations for Code-Mix as well as multi-language utterances and evaluate how these models scale to a multi-language environment. Our aim is to find the best suitable combination of vector representation and models for the process of intent detection for code-mix utterances. We have evaluated the experiments on two different dataset consisting of only Code-Mix utterances and the other dataset consisting of English, Hindi, and Code-Mix (English-Hindi) utterances

    1. Sequence-to-Sequence Learning for Task-oriented Dialogue with Dialogue State Representation

    1. Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems

    1. Improving Semantic Parsing for Task Oriented Dialog

      Semantic parsing using hierarchical representations has recently been proposed for task oriented dialog with promising results. In this paper, we present three different improvements to the model: contextualized embeddings, ensembling, and pairwise re-ranking based on a language model. We taxonomize the errors possible for the hierarchical representation, such as wrong top intent, missing spans or split spans, and show that the three approaches correct different kinds of errors. The best model combines the three techniques and gives 6.4% better exact match accuracy than the state-of-the-art, with an error reduction of 33%, resulting in a new state-of-the-art result on the Task Oriented Parsing (TOP) dataset
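The two figures quoted above (6.4% better exact match, 33% error reduction) can be related with a little arithmetic. The sketch below assumes the 6.4% is an absolute gain in percentage points and the 33% is a relative reduction in exact-match error; the abstract states neither explicitly, and the function name `implied_accuracies` is only illustrative. The derived accuracies are therefore rough implications of those rounded figures, not numbers reported in the abstract.

```python
def implied_accuracies(abs_gain_points, relative_error_reduction):
    """Back out baseline and new accuracy (in %) from an absolute gain
    and a relative error reduction.

    If the baseline error is e, then abs_gain_points = relative_error_reduction * e.
    """
    baseline_error = abs_gain_points / relative_error_reduction
    baseline_accuracy = 100.0 - baseline_error
    new_accuracy = baseline_accuracy + abs_gain_points
    return baseline_accuracy, new_accuracy

if __name__ == "__main__":
    baseline, improved = implied_accuracies(6.4, 0.33)
    print(f"implied baseline exact match: {baseline:.1f}%")
    print(f"implied new exact match:      {improved:.1f}%")
```

Under these assumptions the previous state of the art would sit at roughly 80 to 81% exact match and the new model at roughly 87%; because both inputs are rounded, the last digit should not be trusted.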

    1. Task-oriented dialog systems

      Coursera course on dialogue systems

    1. JD.com's chatbot architecture

      Existing solutions to task-oriented dialogue systems follow pipeline designs which introduce architectural complexity and fragility. We propose a novel, holistic, extendable framework based on a single sequence-to-sequence (seq2seq) model which can be optimized with supervised or reinforcement learning. A key contribution is that we design text spans named belief spans to track dialogue believes, allowing task-oriented dialogue systems to be modeled in a seq2seq way. Based on this, we propose a simplistic Two Stage CopyNet instantiation which demonstrates good scalability: significantly reducing model complexity in terms of number of parameters and training time by an order of magnitude. It significantly outperforms state-of-the-art pipeline-based methods on two datasets and retains a satisfactory entity match rate on out-of-vocabulary (OOV) cases where pipeline-designed competitors totally fail