23 Matching Annotations
  1. Sep 2019
    1. Simulated race car driving with reinforcement learning

      Is the title fixed? I have a suggestion for the title:

      • Deep reinforcement learning for autonomous car racing
      • Autonomous car racing with deep reinforcement learning
    2. The table of contents looks good in general. But I am still not sure if jumping from "Preliminaries" directly to "Setup" is a good idea. Don't we need a "Method" chapter?

  2. Mar 2017
    1. Let us take a different sequence of n frames and m subsequent frames

      rewrote, original sentence was incomplete

    2. Using pooling, which is a form of non-linear down-sampling and thus extracts higher-level features, would be part of the solution which would only be employed in the D model, since in the G model, the output frame has to be of the same resolution as the input frame

      rewrote

    3. In this section, we will introduce Generative Adversarial Networks (GANs) [TODO], which were first introduced by Ian Goodfellow et al. (2014) and used for image generation from random noise. This approach was then exploited by Michael Mathieu et al. (2016) for frame prediction, where a series of future frames were predicted from a sequence of input frames using two networks trained simultaneously. This is the approach we use in our project

      rewrote
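      For context on the quoted passage, the original GAN objective from Goodfellow et al. (2014) — the two networks G and D trained simultaneously on a minimax game — is usually written as follows (standard notation, not necessarily the paper's own):

      ```latex
      \min_G \max_D \;
        \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
        + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
      ```

      For the conditional frame-prediction setting described here, the noise input z is replaced by the conditioning input frames.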

    4. for learning physical properties (material, mass, density, etc.) from unlabeled videos by encoding a

      rewrote based on your suggestion

    5. Learning intuitive physics directly from high-dimensional visual input is a challenging problem that has interested the computer vision community for decades. Recently, deep neural networks have been achieving great success on various problems in computer vision, which can be considered a promising research direction for solving this task. [TODO: intuitive physics examples

      rewrote based on your suggestion, examples will be added later

    6. Humans learn to understand the physics of the world at a very early age. Evidence suggests that babies develop visual understanding of basic physical concepts during the first years of their lives [TODO Joshua Tenenbaum]. Endowing a robot with the ability to learn common sense intuitive physics of the real world from visual input is an important step towards the general goal of artificial intelligence.

      rewrote based on your suggestion

    7. Deep neural networks enable transformation of raw pixel inputs to useful feature representations in an unsupervised fashion. We investigate the feasibility of using such representations learned from unlabeled videos for predicting trajectories of target objects. Since both the input and the output of the network constitute a sequence of video frames, the network weights capture the notion of intuitive physics learned from the observed videos. We adapt the multi-scale conditional generative adversarial net (CGAN) architecture, which has been shown to perform well on natural image sequence prediction and has been used to generate realistic-looking videos of Pac-Man, to our setting, and demonstrate that it can successfully predict plausible video sequences in which the test object follows a short-range ground truth trajectory

      rewrote based on your suggestion

    8. constructs

      constructs instead of construct

    9. the training set

      training set instead of training dataset

    10. The D is trained to assign a target label 1 to (X_k, Y_k), and to assign a target label 0 to (X_k, G_k(X_k)). Therefore, the loss function we use to train D is

      revised, original was hard to understand
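      The quoted training-target description corresponds to a standard adversarial loss for D, as in Mathieu et al. (2016); the binary cross-entropy notation below is mine, not necessarily the paper's:

      ```latex
      \mathcal{L}^{D}_{\text{adv}}(X_k, Y_k)
        = L_{\text{bce}}\!\left(D(X_k, Y_k),\, 1\right)
        + L_{\text{bce}}\!\left(D(X_k, G_k(X_k)),\, 0\right),
      \qquad
      L_{\text{bce}}(a, b) = -b \log a - (1 - b) \log(1 - a)
      ```

      That is, D is penalized for scoring a real (input, future) pair below 1 and a (input, generated) pair above 0.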

    11. LSTM)

      added space

    12. network (CGAN)

      added space

    13. In Figure 5, we compare two different scenarios (falling scenario and holding scenario

      revised

    14. On the right side, we test our model with four input frames of the holding scenario, in which the test object is held in the hand. In the predicted frames, the trajectory of the test object is a fixed position that is slightly different from the ground truth frames, in which th

      rewrote

    15. On the left side, our G model is tested with four concatenated input frames of the falling scenario, in which the object is falling in the air, and predicts the future trajectory of the falling object, a path in which the object falls towards the ground. Looking at only the last predicted frame Ŷ6 and the last ground truth frame Y6, in which the red part represents the object, we observe that the object reaches the same position on the surface or the ground in both frames. Namely, our model predicts the linear trajectory of the unseen object, with indirect representations of moving speed and direction.

      rewrote, original was hard to understand

    16. Let us take a sequence of n frames and m subsequent frames from the dataset,

      revised, original sentence was incomplete
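      The quoted setup — n input frames and m subsequent target frames — amounts to a sliding window over each video. A minimal sketch of how such training pairs could be built (function name and shapes are illustrative, not from the paper):

      ```python
      import numpy as np

      def make_pairs(video, n, m):
          """Slice a frame sequence into (input, target) training pairs.

          video: array of shape (T, H, W) holding T frames.
          Each pair is (X, Y): n conditioning frames X followed by the
          m subsequent frames Y that the model should predict.
          """
          pairs = []
          for t in range(len(video) - n - m + 1):
              X = video[t : t + n]          # n input frames
              Y = video[t + n : t + n + m]  # m frames to predict
              pairs.append((X, Y))
          return pairs

      # toy example: a 10-frame "video" of 4x4 images, n=4 inputs, m=2 targets
      video = np.arange(10 * 4 * 4).reshape(10, 4, 4)
      pairs = make_pairs(video, n=4, m=2)
      print(len(pairs))         # 5 sliding-window pairs
      print(pairs[0][0].shape)  # (4, 4, 4)
      print(pairs[0][1].shape)  # (2, 4, 4)
      ```

      In the paper's setting the X frames are concatenated along the channel dimension before being fed to G.
      
      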

    17. The training loss of the discriminative model converges fast, in about 3000 steps, and ends up at a low value on average with small variations. The training process of the generative model, however, is much more unstable. The G loss goes down on average but wildly oscillates about the trend.

      rewrote

    18. In this work, we tackle the problem of learning intuitive physics of objects from unlabeled videos and predicting the object's trajectory, using a multi-scale CGAN [TO CITE]. Our experiments show that, b

      rewrote

    19. ANs [TODO],

      to cite

    20. transforming the pixels in input frames of the training set into a set of features, our model can predict plausible video sequences in which the test object follows a short-range ground truth trajectory given that the object is moving in the input frames, or remains in a position when the prior input trajectory of the test object is a fixed point.

      Using multi-scale CGANs to learn intuitive physics from videos has advantages and disadvantages. The advantage is that the learning is fully unsupervised; there is no need to label the videos manually. Additionally, our networks reduce high-dimensional pixel data to a set of feature representations that is capable of synthesizing the input videos and predicting plausible video frames of objects' trajectories. One disadvantage is that the training process of the generative model is quite unstable, which may lead to failed representations. Another disadvantage is that our predictions are linear; namely, trajectories with sudden changes are hard to predict, e.g., our model fails to predict when the object will be dropped, and in which direction the object bounces when it hits the surface.

      Nevertheless, the strategy of using multi-scale CGANs to learn intuitive physics from videos admits many applications

      rewrote

    21. We would like to thank Simone Parisi for providing the tetherball dataset and MIT CSAIL for sharing their Physics101 dataset. We would also like to acknowledge Matt Cooper for the open-source code on GitHub

      revised
