99 Matching Annotations
  1. Last 7 days
  2. Sep 2019
    1. At the moment, GPT-2 uses a binary search algorithm, which means that its output can be considered a ‘true’ set of rules. If OpenAI is right, it could eventually generate a Turing complete program, a self-improving machine that can learn (and then improve) itself from the data it encounters. And that would make OpenAI a threat to IBM’s own goals of machine learning and AI, as it could essentially make better than even humans the best possible model that the future machines can use to improve their systems. However, there’s a catch: not just any new AI will do, but a specific type; one that uses deep learning to learn the rules, algorithms, and data necessary to run the machine to any given level of AI.

      This is a machine-generated response from 2019. We are clearly closer than most people realize to machines that can pass a text-based Turing Test.

    1. Since all neurons in a single depth slice share the same parameters, the forward pass in each depth slice of the convolutional layer can be computed as a convolution of the neuron's weights with the input volume. Therefore, it is common to refer to the sets of weights as a filter (or a kernel), which is convolved with the input. The result of this convolution is an activation map, and the activation maps for the different filters are stacked together along the depth dimension to produce the output volume. Parameter sharing contributes to the translation invariance of the CNN architecture. Sometimes, the parameter sharing assumption may not make sense. This is especially the case when the input images to a CNN have some specific centered structure, for which we expect completely different features to be learned on different spatial locations. One practical example is when the inputs are faces that have been centered in the image: we might expect different eye-specific or hair-specific features to be learned in different parts of the image. In that case it is common to relax the parameter sharing scheme, and instead simply call the layer a "locally connected layer".

      Important terms you hear repeatedly. Great visuals and graphics at https://distill.pub/2018/building-blocks/

    1. Here's a playground where you can select different kernel matrices and see how they affect the original image or build your own kernel. You can also upload your own image or use live video if your browser supports it. (Available kernels: blur, bottom sobel, custom, emboss, identity, left sobel, outline, right sobel, sharpen, top sobel.) The sharpen kernel emphasizes differences in adjacent pixel values. This makes the image look more vivid. The blur kernel de-emphasizes differences in adjacent pixel values. The emboss kernel (similar to the sobel kernel, and the two names are sometimes used interchangeably) gives the illusion of depth by emphasizing the differences of pixels in a given direction. In this case, in a direction along a line from the top left to the bottom right. The identity kernel leaves the image unchanged. How boring! The custom kernel is whatever you make it.

      I'm all about my custom kernels!
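
      A minimal sketch of what such a playground computes, assuming a grayscale image stored as a list of rows. The kernel values below are commonly used ones and are an assumption here, not taken from the annotated page; `apply_kernel` is a hypothetical helper name.

```python
# Commonly used 3x3 kernels (assumed values, for illustration only).
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
BLUR = [[0.0625, 0.125, 0.0625], [0.125, 0.25, 0.125], [0.0625, 0.125, 0.0625]]


def apply_kernel(image, kernel):
    """Slide a 3x3 kernel over a 2D image, skipping the border pixels.

    The kernel is applied without flipping (cross-correlation); for the
    symmetric kernels above this matches a true convolution.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    total += kernel[dr + 1][dc + 1] * image[r + dr][c + dc]
            out_row.append(total)
        out.append(out_row)
    return out


# A flat image with one bright pixel.
image = [
    [10, 10, 10, 10],
    [10, 50, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

unchanged = apply_kernel(image, IDENTITY)  # interior pixels are copied as-is
sharpened = apply_kernel(image, SHARPEN)   # the bright pixel stands out more
blurred = apply_kernel(image, BLUR)        # the bright pixel is smeared out
```

      In this toy image the sharpen kernel pushes the bright pixel further from its neighbours, while the blur kernel pulls it toward them.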

    1. We developed a new metric, UAR, which compares the robustness of a model against an attack to adversarial training against that attack. Adversarial training is a strong defense that uses knowledge of an adversary by training on adversarially attacked images. (To compute UAR, we average the accuracy of the defense across multiple distortion sizes and normalize by the performance of an adversarially trained model; a precise definition is in our paper.) A UAR score near 100 against an unforeseen adversarial attack implies performance comparable to a defense with prior knowledge of the attack, making this a challenging objective.

      @metric
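
      A rough sketch of the computation as the quote describes it: average the defense's accuracy over several distortion sizes and normalize by the accuracy of adversarially trained models. The precise definition is in the authors' paper; the function name `uar` and the accuracy numbers below are made up for illustration.

```python
def uar(defense_accuracies, adv_trained_accuracies):
    """Normalized robustness score on a 0-100 scale (one reading of the quote).

    defense_accuracies[k]: accuracy of the evaluated model at distortion size k.
    adv_trained_accuracies[k]: accuracy of a model adversarially trained
    against the same attack, evaluated at distortion size k.
    """
    if not defense_accuracies or len(defense_accuracies) != len(adv_trained_accuracies):
        raise ValueError("need one accuracy per distortion size for both models")
    # Ratio of averages; the 1/n factors cancel, so a ratio of sums suffices.
    return 100.0 * sum(defense_accuracies) / sum(adv_trained_accuracies)


# Hypothetical accuracies at three distortion sizes.
score = uar([0.4, 0.3, 0.2], [0.8, 0.6, 0.4])
```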

  3. Aug 2019
    1. Using multiple copies of a neuron in different places is the neural network equivalent of using functions. Because there is less to learn, the model learns more quickly and learns a better model. This technique – the technical name for it is ‘weight tying’ – is essential to the phenomenal results we’ve recently seen from deep learning.

      This parameter sharing allows CNNs, for example, to need far fewer parameters/weights than fully connected NNs.
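
      To put rough numbers on this, here is a small parameter-count sketch. The layer sizes (a 32x32x3 input, 16 filters of size 3x3, output kept at 32x32) are assumptions chosen for illustration, and the helper names are hypothetical.

```python
def conv_layer_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus one bias per filter; the filter is reused at every position."""
    return (kernel_h * kernel_w * in_channels + 1) * out_channels


def dense_layer_params(in_units, out_units):
    """Weights plus one bias per output unit; nothing is shared."""
    return (in_units + 1) * out_units


in_h, in_w, in_c = 32, 32, 3   # assumed input volume
out_c = 16                     # assumed number of filters

conv_count = conv_layer_params(3, 3, in_c, out_c)
# A fully connected layer producing the same 32x32x16 output volume.
dense_count = dense_layer_params(in_h * in_w * in_c, in_h * in_w * out_c)

print(conv_count, dense_count)
```

      Under these assumptions the convolutional layer needs a few hundred parameters, while the fully connected layer needs tens of millions.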

    2. The known connection between geometry, logic, topology, and functional programming suggests that the connections between representations and types may be of fundamental significance.

      Examples for each?

    3. Representations are Types With every layer, neural networks transform data, molding it into a form that makes their task easier to do. We call these transformed versions of data “representations.” Representations correspond to types.

      Interesting.

      Like a Queue type represents a FIFO flow and a Stack a FILO flow, where the space we transformed is the operation space of the type (e.g., a Queue has a folded operation space compared to an Array).

      Just free styling here...

    4. In this view, the representations narrative in deep learning corresponds to type theory in functional programming. It sees deep learning as the junction of two fields we already know to be incredibly rich. What we find, seems so beautiful to me, feels so natural, that the mathematician in me could believe it to be something fundamental about reality.

      compositional deep learning

    5. Appendix: Functional Names of Common Layers

       Deep Learning Name    Functional Name
       Learned Vector        Constant
       Embedding Layer       List Indexing
       Encoding RNN          Fold
       Generating RNN        Unfold
       General RNN           Accumulating Map
       Bidirectional RNN     Zipped Left/Right Accumulating Maps
       Conv Layer            “Window Map”
       TreeNet               Catamorphism
       Inverse TreeNet       Anamorphism

      👌translation. I like to think about embeddings as List lookups
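
      A few of the correspondences in the appendix can be sketched with ordinary list functions. The `cell` and `gen_step` functions below are toy stand-ins for learned components, and `unfold` and `window_map` are hypothetical helper names; none of this comes from the original article.

```python
from functools import reduce
from itertools import accumulate


def cell(state, x):
    """Toy stand-in for a learned RNN cell: mixes the old state with the input."""
    return 0.5 * state + x


inputs = [1.0, 2.0, 3.0]

# Embedding layer = list indexing: a token id selects a row of a table.
embedding_table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
embedded = embedding_table[2]

# Encoding RNN = fold: consume the whole sequence, keep only the final state.
encoded = reduce(cell, inputs, 0.0)

# General RNN = accumulating map: keep every intermediate state.
states = list(accumulate(inputs, cell, initial=0.0))[1:]


def unfold(step, state, n):
    """Generating RNN = unfold: grow a sequence from an initial state."""
    outputs = []
    for _ in range(n):
        out, state = step(state)
        outputs.append(out)
    return outputs


def gen_step(state):
    """Toy generator step: emit the current state, then halve it."""
    return state, 0.5 * state


generated = unfold(gen_step, 4.0, 3)


def window_map(f, xs, size):
    """Conv layer = window map: apply the same function to each sliding window."""
    return [f(xs[i:i + size]) for i in range(len(xs) - size + 1)]


windows = window_map(sum, inputs, 2)
```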

    1. As a log-bilinear regression model for unsupervised learning of word representations, it combines the features of two model families, namely the global matrix factorization and local context window methods.

      What does "log-bilinear regression" mean exactly?

  4. Jul 2019
    1. We will discuss classification in the context of support vector machines

      SVMs aren't used that much in practice anymore. It's more of an academic fling, because they're nice to work with mathematically. Empirically, Tree Ensembles or Neural Nets are almost always better.

    1. Compared with neural networks configured by a pure grid search, we find that random search over the same domain is able to find models that are as good or better within a small fraction of the computation time.
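
      A minimal sketch of the two strategies over the same two-hyperparameter domain with the same budget of nine trials. The hyperparameter names, ranges, and the toy `score` function are assumptions for illustration; a real run would train and validate a model for each candidate.

```python
import itertools
import random


def grid_candidates(lr_values, dropout_values):
    """Every combination of the listed values."""
    return list(itertools.product(lr_values, dropout_values))


def random_candidates(n, lr_range, dropout_range, seed=0):
    """n independent uniform draws from the same domain."""
    rng = random.Random(seed)
    return [(rng.uniform(*lr_range), rng.uniform(*dropout_range)) for _ in range(n)]


def score(candidate):
    """Hypothetical validation score; higher is better."""
    lr, dropout = candidate
    return -((lr - 0.01) ** 2) - (dropout - 0.3) ** 2


grid = grid_candidates([0.001, 0.05, 0.1], [0.0, 0.25, 0.5])
rand = random_candidates(9, (0.001, 0.1), (0.0, 0.5))

best_grid = max(grid, key=score)
best_rand = max(rand, key=score)

# Same budget, but the grid only ever tries 3 distinct learning rates,
# while random search tries a different one on every trial.
distinct_lr_grid = len({lr for lr, _ in grid})
distinct_lr_rand = len({lr for lr, _ in rand})
```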
  5. Jun 2019
    1. To interpret a model, we require the following insights: Features in the model which are most important. For any single prediction from a model, the effect of each feature in the data on that particular prediction. Effect of each feature over a large number of possible predictions.

      Machine learning interpretability

    1. By comparison, Amazon’s Best Seller badges, which flag the most popular products based on sales and are updated hourly, are far more straightforward. For third-party sellers, “that’s a lot more powerful than this Choice badge, which is totally algorithmically calculated and sometimes it’s totally off,” says Bryant.

      "Amazon's Choice" is made by an algorithm.

      Essentially, "Amazon" is Skynet.

    1. This problem is called overfitting—it's like memorizing the answers instead of understanding how to solve a problem.

      Simple and clear explanation of overfitting

  6. May 2019
    1. policy change index - machine learning on corpus of text to identify and predict policy changes in China

  7. Mar 2019
    1. Mention McDonald’s to someone today, and they're more likely to think about Big Mac than Big Data. But that could soon change: The fast-food giant has embraced machine learning, in a fittingly super-sized way. McDonald’s is set to announce that it has reached an agreement to acquire Dynamic Yield, a startup based in Tel Aviv that provides retailers with algorithmically driven "decision logic" technology. When you add an item to an online shopping cart, it’s the tech that nudges you about what other customers bought as well. Dynamic Yield reportedly had been recently valued in the hundreds of millions of dollars; people familiar with the details of the McDonald’s offer put it at over $300 million. That would make it the company's largest purchase since it acquired Boston Market in 1999.

      McDonald's is getting into machine learning. Beware.

  8. Feb 2019
    1. For instance, an aborigine who possesses all of our basic sensory-mental-motor capabilities, but does not possess our background of indirect knowledge and procedure, cannot organize the proper direct actions necessary to drive a car through traffic, request a book from the library, call a committee meeting to discuss a tentative plan, call someone on the telephone, or compose a letter on the typewriter.

      In other words: culture. I'm pretty sure that Engelbart would agree with the statement that someone who could order a book from a library would likely not know the best way to find a nearby water source, as the right kind of aborigine would know. Collective intelligence is a monotonically increasing store of knowledge that is maintained through social learning -- not just social learning, but teaching. Many species engage in social learning, but humans are the only primates with visible sclera -- the whites of our eyeballs -- which enables even infants to track where their teacher/parent is looking. I think this function of culture is what Engelbart would call "C work"

      A Activity: 'Business as Usual'. The organization's day to day core business activity, such as customer engagement and support, product development, R&D, marketing, sales, accounting, legal, manufacturing (if any), etc. Examples: Aerospace - all the activities involved in producing a plane; Congress - passing legislation; Medicine - researching a cure for disease; Education - teaching and mentoring students; Professional Societies - advancing a field or discipline; Initiatives or Nonprofits - advancing a cause.
      
      B Activity: Improving how we do that. Improving how A work is done, asking 'How can we do this better?' Examples: adopting a new tool(s) or technique(s) for how we go about working together, pursuing leads, conducting research, designing, planning, understanding the customer, coordinating efforts, tracking issues, managing budgets, delivering internal services. Could be an individual introducing a new technique gleaned from reading, conferences, or networking with peers, or an internal initiative tasked with improving core capability within or across various A Activities.
      
      C Activity: Improving how we improve. Improving how B work is done, asking 'How can we improve the way we improve?' Examples: improving effectiveness of B Activity teams in how they foster relations with their A Activity customers, collaborate to identify needs and opportunities, research, innovate, and implement available solutions, incorporate input, feedback, and lessons learned, run pilot projects, etc. Could be a B Activity individual learning about new techniques for innovation teams (reading, conferences, networking), or an initiative, innovation team or improvement community engaging with B Activity and other key stakeholders to implement new/improved capability for one or more B activities.
      

      In other words, human culture, using language, artifacts, methodology, and training, bootstrapped collective intelligence; what Engelbart proposed, then, was to apply C work to culture's bootstrapping capabilities.

    1. Nearly half of FBI rap sheets failed to include information on the outcome of a case after an arrest—for example, whether a charge was dismissed or otherwise disposed of without a conviction, or if a record was expunged

      This explains my personal experience here: https://hyp.is/EIfMfivUEem7SFcAiWxUpA/epic.org/privacy/global_entry/default.html (Why someone who had Global Entry was flagged for a police incident before he applied for Global Entry).

    2. Applicants also agree to have their fingerprints entered into DHS’ Automatic Biometric Identification System (IDENT) “for recurrent immigration, law enforcement, and intelligence checks, including checks against latent prints associated with unsolved crimes.”

      The mention of intelligence checks is very concerning here, as it suggests pretty much what has already been leaked: that the US is running complex autonomous screening of all of this data all the time. This also opens up the possibility of discriminatory algorithms, since most of these are probably rooted in machine learning techniques and the criminal justice system in the US today tends to be fairly biased towards certain groups of people to begin with.

    3. It cited research, including some authored by the FBI, indicating that “some of the biometrics at the core of NGI, like facial recognition, may misidentify African Americans, young people, and women at higher rates than whites, older people, and men, respectively.”

      This re-affirms the previous annotation that the set of training data for the intelligence checks the US runs on global entry data is biased towards certain groups of people.

  9. Jan 2019
    1. Measurements are variables that can be quantified. All data in the output above are measurements. Some of these measurements, such as state_percentile_16, avg_score_16 and school_rating, are outcomes; these outcomes cannot be used to explain one another. For example, explaining school_rating as a result of state_percentile_16 (test scores) is circular logic. Therefore we need a second class of variables.
  10. Nov 2018
  11. Sep 2018
    1. In equation B for the marginal of a Gaussian, only the covariance of the block of the matrix involving the unmarginalized dimensions matters! Thus “if you ask only for the properties of the function (you are fitting to the data) at a finite number of points, then inference in the Gaussian process will give you the same answer if you ignore the infinitely many other points, as if you would have taken them all into account!” (Rasmussen)

      key insight into Gaussian processes
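
      The marginalization property can be checked numerically. For a joint Gaussian, the marginal over a subset of dimensions keeps only the matching sub-block of the covariance, so for a Gaussian process the covariance over a few points is the same whether or not other points were ever included. The RBF kernel and the point values below are assumptions for illustration.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def kernel_matrix(points):
    """Covariance matrix of the GP prior evaluated at the given points."""
    return [[rbf(p, q) for q in points] for p in points]


def marginal_cov(cov, keep):
    """Marginal covariance over the kept dimensions: just the sub-block."""
    return [[cov[i][j] for j in keep] for i in keep]


all_points = [0.0, 0.5, 1.0, 2.0]
keep = [0, 2]  # only these two points are of interest

# Build the covariance over all four points, then marginalize.
from_full = marginal_cov(kernel_matrix(all_points), keep)

# Build the covariance over the two points directly, ignoring the others.
direct = kernel_matrix([all_points[i] for i in keep])
```

      Both routes give the same 2x2 covariance, which is why ignoring the unobserved points does not change the answer.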

    1. predictive analysis

      Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modelling, and machine learning, that analyze current and historical facts to make predictions about future or otherwise unknown events.

  12. Jul 2018
  13. course-computational-literary-analysis.netlify.com
    1. There is here, moral, if not legal, evidence, that the murder was committed by the Indians.

      This is a very interesting take by Sergeant Cuff on "evidence" as being moral if not legal. It makes me question exactly what he means by that, and whether there is a way to use computational analysis to find out. We could perhaps start by parsing out "evidence" throughout the text with a machine learning algorithm to help define evidence and then, going forward, devise a way (maybe with sentiment analysis) to distinguish moral evidence from legal evidence.

    1. ~32:00 What about the domain of the function being effectively lower dimensional, rather than a strong regularity assumption? That would also work, right? Could this be the case for images? (What's the dimensionality of the manifold of natural images?)

      Nice. I like the idea of regularity <> low dimensional representation. I guess by that general definition, the above is a form of regularity..

      He comments on this at 38:30.

    1. This system of demonstrating tasks to one robot that can then transfer its skills to other robots with different body shapes, strengths, and constraints might just be the first step toward independent social learning in robots. From there, we might be on the road to creating cultured robots.
    2. Soon we might add robots to this list. While our fanciful desert scene of robots teaching each other how to defuse bombs lies in the distant future, robots are beginning to learn socially. If one day robots start to develop and share knowledge independently of humans, might that be the seed for robot culture?
    3. This imaginary scene shows the power of learning from others. Anthropologists and zoologists call this “social learning”: picking up new information by observing or interacting with others and the things others produce. Social learning is rife among humans and across the wider animal kingdom. As we discussed in our previous post, learning socially is fundamental to how humans become fully rounded people, in all our diversity, creativity, and splendor.
    1. "It's so scary that it works," Perelman sighs. "Machines are very brilliant for certain things and very stupid on other things. This is a case where the machines are very, very stupid."
  14. Apr 2018
  15. Mar 2018
    1. Artificial intelligence (AI), machine learning and deep learning

      Graphical explanation of artificial intelligence, machine learning, and deep learning

  16. Dec 2017
    1. Most of the recent advances in AI depend on deep learning, which is the use of backpropagation to train neural nets with multiple layers ("deep" neural nets).

      Neural nets consist of layers of nodes, with edges from each node to the nodes in the next layer. The first and last layers are input and output. The output layer might only have two nodes, representing true or false. Each node holds a value representing how excited it is. Each edge has a value representing strength of connection, which determines how much of the excitement passes through.

      The edges in an untrained neural net start with random values. The training data consists of a series of samples that are already labeled. If the output is wrong, the edges are adjusted according to how much they contributed to the error. It's called backpropagation because it starts with the output nodes and works toward the input nodes.

      Deep neural nets can be effective, but only for single specific tasks. And they need huge sets of training data. They can also be tricked rather easily. Worse, someone who has access to the net can discover ways of adding noise to images that will make the net "see" things that obviously aren't there.
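
      The description above can be sketched in a few lines. This is a deliberately tiny network (two inputs, three hidden nodes, and a single output node instead of the two mentioned above) trained on one labeled sample; the sigmoid activation, squared-error loss, learning rate, and all names are assumptions for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Each node's value is a squashed, edge-weighted sum of the previous layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One backpropagation step; returns the loss measured before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    # Start at the output node: how wrong was it, scaled by the sigmoid slope.
    delta_out = (output - target) * output * (1.0 - output)
    # Work backward: each hidden node's share of the error, via its outgoing edge.
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    # Adjust each edge in proportion to how much it contributed to the error.
    for j in range(len(w_out)):
        w_out[j] -= lr * delta_out * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * delta_hidden[j] * x[i]
    return 0.5 * (output - target) ** 2


rng = random.Random(0)
# Edges start with random values.
w_hidden = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
w_out = [rng.uniform(-1, 1) for _ in range(3)]

sample, label = [1.0, 0.0], 1.0  # one labeled training sample
losses = [train_step(sample, label, w_hidden, w_out) for _ in range(200)]
_, final_output = forward(sample, w_hidden, w_out)
```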

  17. Nov 2017
    1. UML automatically finds these hidden patterns to link seemingly unrelated accounts and customers. These links can be one of thousands of data fields that the UML model ingests.

      Why does this have to be done in a different system?

  18. Oct 2017
  19. Sep 2017
    1. First, I think you need to understand machine learning and algorithms; you can start with online courses. I recommend Andrew Ng's Machine Learning course, which is considered the bible for data scientists. After that, you can start with Python or R and join challenges on Kaggle. Kaggle is a platform where data scientists take part, earn prize money, and compete with each other for rankings. Many people have also told me that Kaggle is the best and shortest path into data science.

      Learning the basics

  20. Aug 2017
  21. Jul 2017
  22. Jun 2017
  23. Apr 2017
    1. Detection of fake news in social media based on who liked it.

      we show that Facebook posts can be classified with high accuracy as hoaxes or non-hoaxes on the basis of the users who "liked" them. We present two classification techniques, one based on logistic regression, the other on a novel adaptation of boolean crowdsourcing algorithms. On a dataset consisting of 15,500 Facebook posts and 909,236 users, we obtain classification accuracies exceeding 99% even when the training set contains less than 1% of the posts.

    1. Obviously, in this situation whoever controls the algorithms has great power. Decisions like what is promoted to the top of a news feed can swing elections. Small changes in UI can drive big changes in user behavior. There are no democratic checks or controls on this power, and the people who exercise it are trying to pretend it doesn’t exist

    2. On Facebook, social dynamics and the algorithms’ taste for drama reinforce each other. Facebook selects from stories that your friends have shared to find the links you’re most likely to click on. This is a potent mix, because what you read and post on Facebook is not just an expression of your interests, but part of a performative group identity.

      So without explicitly coding for this behavior, we already have a dynamic where people are pulled to the extremes. Things get worse when third parties are allowed to use these algorithms to target a specific audience.

    3. any system trying to maximize engagement will try to push users towards the fringes. You can prove this to yourself by opening YouTube in an incognito browser (so that you start with a blank slate), and clicking recommended links on any video with political content.

      ...

      This pull to the fringes doesn’t happen if you click on a cute animal story. In that case, you just get more cute animals (an experiment I also recommend trying). But the algorithms have learned that users interested in politics respond more if they’re provoked more, so they provoke. Nobody programmed the behavior into the algorithm; it made a correct observation about human nature and acted on it.

    1. Really cool venue for publishing online, interactive articles for ML

  24. Mar 2017
    1. the area under the curve (often referred to as simply the AUC) is equal to the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one (assuming 'positive' ranks higher than 'negative')

      Why AUC is a meaningful guide in CTR (click-through rate) applications
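
      The ranking interpretation in the quote can be computed directly by comparing every positive/negative pair. This is a brute-force sketch for small inputs; the function name is hypothetical, and counting ties as half a win is a common convention assumed here.

```python
def auc_by_ranking(scores, labels):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumed convention: a tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical click predictions: 1 = clicked, 0 = not clicked.
auc = auc_by_ranking([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

      Here three of the four positive/negative pairs are ordered correctly, so the AUC is 0.75.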

  25. Feb 2017
    1. Robert Mercer, Steve Bannon, Breitbart, Cambridge Analytica, Brexit, and Trump.

      “The danger of not having regulation around the sort of data you can get from Facebook and elsewhere is clear. With this, a computer can actually do psychology, it can predict and potentially control human behaviour. It’s what the scientologists try to do but much more powerful. It’s how you brainwash someone. It’s incredibly dangerous.

      “It’s no exaggeration to say that minds can be changed. Behaviour can be predicted and controlled. I find it incredibly scary. I really do. Because nobody has really followed through on the possible consequences of all this. People don’t know it’s happening to them. Their attitudes are being changed behind their backs.”

      -- Jonathan Rust, Cambridge University Psychometric Centre

  26. Jan 2017
    1. AI criticism is also limited by the accuracy of human labellers, who must carry out a close reading of the ‘training’ texts before the AI can kick in. Experiments show that readers tend to take longer to process events that are distant in time or separated by a time shift (such as ‘a day later’).
    2. Even though AI annotation schemes are versatile and expressive, they’re not foolproof. Longer, book-length texts are prohibitively expensive to annotate, so the power of the algorithms is restricted by the quantity of data available for training them.
    3. In most cases, this analysis involves what’s known as ‘supervised’ machine learning, in which algorithms train themselves from collections of texts that a human has laboriously labelled.
  27. Dec 2016
    1. The team on Google Translate has developed a neural network that can translate language pairs for which it has not been directly trained. "For example, if the neural network has been taught to translate between English and Japanese, and English and Korean, it can also translate between Japanese and Korean without first going through English."

  28. Oct 2016
    1. In machine learning, the term "ground truth" refers to the accuracy of the training set's classification for supervised learning techniques.

      Ground truth in machine learning

  29. May 2016
    1. the algorithm was somewhat more accurate than a coin flip

      In machine learning it's also important to evaluate not just against random, but against how well other methods (e.g. parole boards) do. That kind of analysis would be nice to see.

  30. Apr 2016
    1. We should have control of the algorithms and data that guide our experiences online, and increasingly offline. Under our guidance, they can be powerful personal assistants.

      Big business has been very militant about protecting their "intellectual property". Yet they regard every detail of our personal lives as theirs to collect and sell at whim. What a bunch of little darlings they are.

  31. Feb 2016
    1. Patrick Ball—a data scientist and the director of research at the Human Rights Data Analysis Group—who has previously given expert testimony before war crimes tribunals, described the NSA's methods as "ridiculously optimistic" and "completely bullshit." A flaw in how the NSA trains SKYNET's machine learning algorithm to analyse cellular metadata, Ball told Ars, makes the results scientifically unsound.
    1. “Search is the cornerstone of Google,” Corrado said. “Machine learning isn’t just a magic syrup that you pour onto a problem and it makes it better. It took a lot of thought and care in order to build something that we really thought was worth doing.”
  32. Jan 2016
    1. UT Austin SDS 348, Computational Biology and Bioinformatics. Course materials and links: R, regression modeling, ggplot2, principal component analysis, k-means clustering, logistic regression, Python, Biopython, regular expressions.

  33. Dec 2015
    1. OpenAI is a non-profit artificial intelligence research company. Our goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.
    1. Big Sur is our newest Open Rack-compatible hardware designed for AI computing at a large scale. In collaboration with partners, we've built Big Sur to incorporate eight high-performance GPUs
  34. Nov 2015
    1. a study by Stephen Schueller, published last year in the Journal of Positive Psychology, found that people assigned to a happiness activity similar to one for which they previously expressed a preference showed significantly greater increases in happiness than people assigned to an activity not based on a prior preference. This, writes Schueller, is “a model for positive psychology exercises similar to Netflix for movies or Amazon for books and other products.”
    1. TPOT is a Python tool that automatically creates and optimizes machine learning pipelines using genetic programming. Think of TPOT as your “Data Science Assistant”: TPOT will automate the most tedious part of machine learning by intelligently exploring thousands of possible pipelines, then recommending the pipelines that work best for your data.

      https://github.com/rhiever/tpot TPOT (Tree-based Pipeline Optimization Tool) Built on numpy, scipy, pandas, scikit-learn, and deap.

    1. Nanodegree Program Summary Machine learning represents a key evolution in the fields of computer science, data analysis, software engineering, and artificial intelligence. It has quickly become industry's preferred way to make sense of the staggering volume of data our modern world produces. Machine learning engineers build programs that dynamically perform the analyses that data scientists used to perform manually. These programs can “learn” based on millions of experiences, all rigorously and numerically defined.
  35. Oct 2015
    1. I have the feeling we do not need to use models as complicated as some outlined in the text; we can (and finally will have to) abstract from most of the issues we can imagine. I expect that "magic" (an undisclosed heuristic, perhaps in combination with machine learning) will deal with the issues, a black box that will be considered inherently flawed and practical enough at the same time. The results from experimental ethics can help form the heuristic while the necessity for easy implementation and maintainability will limit the applications significantly.

  36. Sep 2015
  37. Aug 2015
  38. Jul 2015
  39. Jun 2015
    1. Enter the Daily Mail website, MailOnline, and CNN online. These sites display news stories with the main points of the story displayed as bullet points that are written independently of the text. “Of key importance is that these summary points are abstractive and do not simply copy sentences from the documents,” say Hermann and co.

      Someday, maybe projects like Hypothesis will help teach computers to read, too.

  40. Jan 2015
    1. Logistic regression, also called a logit model, is used to model dichotomous outcome variables. In the logit model the log odds of the outcome is modeled as a linear combination of the predictor variables.
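
      A small sketch of that relationship: the linear combination of predictors gives the log odds, and the logistic function turns log odds into a probability. The coefficient values and function names are made up for illustration.

```python
import math


def log_odds(intercept, coefficients, predictors):
    """Linear combination of the predictor variables."""
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))


def probability(z):
    """Inverse of the logit: converts log odds into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted model with two predictors.
z = log_odds(0.0, [1.0, -1.0], [2.0, 2.0])
p = probability(z)

# Going the other way recovers the log odds from a probability.
p2 = probability(1.5)
recovered = math.log(p2 / (1.0 - p2))
```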