79 Matching Annotations
  1. May 2017
    1. the termconceptto denote the underlying notion.

      Concepts according to section 4 of this paper:

      • Flexible Probabilities: compute prob on the fly by allowing prob to be part of an atom.
      • Distributional Clauses: A shorthand notation to assign a distribution to a random variable
      • Unknown Objects : Dropping the closed world assumption, and dropping the unique names assumption.
      • Stochastic Memoization: "If a random variable in Church is memoized, subsequent calls to it simply look up the result of the first call, similarly to tabling in logic programming"
      • Constraints
      • Negation as Failure
      • Second Order Predicates
      • Meta-Calls
      • Time and Dynamics
      • Generalized Labels for Facts and Queries
    2. It is often assumed that probabilistic facts do not unify with other probabilisticfacts or heads of rules

      I think this is saying that you can't follow a chain of reasoning without considering the probabilities.

      e.g. All men are social + Socrates is a man => Socrates is social, disregards the probabilities of all men are social and Socrates is a man.

    3. Legal groundings of such facts can also be restricted by providing a domain, asin the following variant of our alarm example where all persons have the sameprobability of independently hearing the alarm

      Just a special case where several probabilistic facts have the same probability.

    4. All ground instances o

      Umm, are they non-ground or ground?

    5. b^e^hm^:hj

      This conflicts with the possible world (3), so... is this a new example?

    6. we obtain the success probabilityofcalls(mary),P(calls(mary)) = 0:196

      OK, but this is the marginal distribution that Mary calls? Not conditional on the facts that we already know, that a burglary happened, that an earthquake did not, that Mary heard the alarm?

    7. to provecall

      Does this mean to prove 'there exists a call'?

    8. backward reasoning

      query: call

      It seems to me that we need to prove that 'call' happened. ```

      • For this we need to prove that calls(X) happened.
        • For this we need to prove that alarm,hears_alarm(X) happened.
          • For this we need to prove that either burglary or an earthquake happened.

      Since burglary is true, subbing X = Mary proves calls.

    9. orward reasoning

      Adds additional facts

    10. queryq

      What are two examples of queries?

    11. unique least Herbrand model (i.e., a unique least model using onlysymbols from the program

      What does this mean?

      "using only symbols from the program" as opposed to what?

    12. 0:1 ::burglary:0:7 ::hearsalarm(mary):0:2 ::earthquake:0:4 ::hearsalarm(john):

      Probabilistic facts. Each corresponds to a bool RV.

    13. The corresponding logicprogram obtained by adding the set of rulesRto the set of facts, also called apossible world

      Notice that probabilities are gone.

    14. alarm:earthquake:(1)alarm:burglary:calls(X):alarm;hearsalarm(X):call:calls(X)

      Rules / Definite clauses

    15. To summarize, the key contributions of this paper ar
      • Identify core concepts used by various probabilstic languages
      • Discussion of the execution mechanism that they require
      • Positioning of state-of-the-art prob languages and implementations wrt these concepts
    16. Inference, that is, evaluating the probability distribution de ned by a programor model

      Remember this definition.

    17. Typical probabilistic programming languages, on the other hand, employa variant of Sato's distribution semantics (Sato, 1995), in which random variablesdirectly correspond to ground facts and a traditional program speci es how to de-duce further knowledge from these facts

      Difference:

      SRL: Knowledge-based model construction, logic is used as a template for constructing a graphical model.

      Typical prob prog languages: Sato's distribution semantics, in which RVs directly correspond to ground facts. A traditional program specifies how to deduce further knowledge from these facts.

    18. probabilistic extensions of logic programming languages
      • ICL
      • SLP
      • PRISM
      • BLP
      • CLP(BN)
      • LPAD
      • P-log
      • Dyna
      • CP-Logic
      • ProbLog
      • PROPPR
    19. statistical relational learningmodels
      • RBN
      • PRM
      • MLN
    20. alternative probabilistic programming language
      • IBAL
      • BLOG
      • Church
      • Figaro
    21. By now, it is commonly accepted that the more interestingquestion is concerned with the underlying concepts that these languages employand their e ect on the inference mechanisms, as their expressive power is oftenvery similar.

      Expressiveness is OK. Interesting? This:

      • Underlying concepts employed
      • Effect on inference mechanisms
    22. To obtain a better understanding of probabilistic programming, we identify anumber of core programming concepts underlying the primitives used by variousprobabilistic languages, discuss the execution mechanisms that they require anduse these to position and survey state-of-the-art probabilistic languages and theirimplementation.While doing so, we focus on probabilistic extensions oflogicprogramminglanguages such as Prolog, which have been considered for over 20 years.

      A survey of probabilistic logic languages, with a bit of talk about probabilistic extensions to only logic languages.

  2. Apr 2017
    1. We performed experiments on two real data sets: the first is a pro-prietary address data set and the second is the publicly availableDBLP data set. The address data set contains 1 million strings,each a concatenation of an organization name and address (street,city, zip, state). The DBLP data set contains around 0.5 millionstrings, each a concatenation of authors and title of a publication.

      Datasets

    2. The self-proclaimed be all and end all - does everything exactly and better than everything else - like throwing a bunch of performance metric magnets at the refrigerator and seeing that everything sticks.

      Challenge: Find holes in the supremacy claims.

    1. Motivation of the paper is articulated in the conclusion: We have divided the problem into two aspects: theblack-box functions that match and merge records, and theER algorithm that invokes these functions. In our opinion,this division has two important advantages: (i) it yields gene-ric ER algorithms that can be used, with well-defined se-mantics, in many applications, and (ii) it lets us focus onour performance measure, the number of black-box invoca-tions.

    2. n summary, ER is an inherently expensive process, sothat only relatively small sets of records can be handled.Thus, large data sets need to be partitioned into smaller setsthat can be resolved in detail. How large a data set can beexhaustively resolved depends on the application. It is alsopossible to distribute the ER computations across multipleprocessors, in order to handle larger data sets. In [7] westudy various strategies for distributing the work done byR-Swoosh.

      Worth a look, methinks.

    3. F-Swoosh

      Feature-level

    4. R-Swoosh:

      Record-level

    5. G-Swoosh

      General

    6. We identify the ICAR properties

      Idempotence Commutativity Associativity Representativity

    7. The particular variant of the ER problem that we studyin this paper may not be the most sophisticated

      Amen.

    8. Un-like other works that focus only on identifying matchingrecords

      Unlike horses that don't have wheels, our car comes with 4.

    9. Because record matching is inherently expensive, largesets of input records are often divided into “buckets” usingapplication knowledge, and then ER is run on each bucket.

      Blocking. They're talking about blocking.

    10. The one with a lot of very strong assumptions. "Could this BE any less relational?"

    1. One key challenge for the classi cation-based approachinvolves the selection of training examples from which tolearn the similarity classi ers. Ideally, we want our modelto correctly predict the similarity of every document to everyother document (or every centroid, based on the modelingchoice described above) in the dataset. However, creating atraining example for each document (or document-centroid)pair results in a skewed label distribution, since a large ma-jority of pairs in the training dataset do not belong to thesame event

      Sound familiar?

    2. Radiohead

      Never thought I'd read a paper with "Radiohead" and "my favorite band" in the same sentence. Most certainly not one that wasn't my own anyway.

    1. [6] M. Hernandez and S. Stolfo. Real-world data is dirty:data cleansing and the merge/purge problem.Journalof Data Mining and Knowledge Discovery, 1(2), 1998.

      Say the folks that didn't really run experiments on real-world data :/

    2. The experimental comparisons should be extended to non-DBGendata sets to investigate dependencies on data sourcesand their error characteristics.

      Yeah. Tell me about it.

    1. Weobservedsimilarresultsonthepublicationinformationofabibliographydatabase.Weomitresultsduetospaceconstraints

      Oh come on :/

    1. Table 2. Target databases.

      Wow, they wrote and collected 15million academic articles? Pretty impressive, no?

  3. Oct 2016
  4. nikhilnkini.files.wordpress.com nikhilnkini.files.wordpress.com
    1. Had this model failed to identify a bug, we could subsequently haveenriched the sketched specifications.

      This works backwards, saying, well, we know there's a bug, and we will change the sketched spec until MOLLY is able to find one...

    1. In this survey I explain h
      • Notion of knowledge can be formalized
      • Usefulness of formalization
    1. Let C be a bivalent configuration of P, and let e = (p, m) be an event that is applicable to C. Let %? be the set of configurations reachable from C without applying e, and let 9 = e(g) = (e(E) I E E %? and e is applicable to E). Then, 9 contains a bivalent configuration.

      This is the setup for the second part:

      "Second, we construct an admissible run that avoids ever taking a step that would commit the system to a particular decision."

      As I understand it, what the authors say is that for an arbitrary event e and for an arbitrary configuration C, there is always a bivalent state that is reachable from C (the set fancy D in this case). And reaching a decision means that there needs to be a univalent state.

    2. assumed partial correctness.

      It is a valid assumption because total correctness is assumed which subsumes partial correctness.

      "First, we argue that there is some initial configuration in which the decision is not already predetermined." - this is the proof of this part.

      But why is a proof of this needed when the partial correctness already assumes that it is true that the configuration is bivalent (if all P begins with an initial config)?

    3. The basic idea is to show circumstances under which the protocol remains forever indecisive. This involves two steps. First, we argue that there is some initial configuration in which the decision is not already predetermined. Second, we construct an admissible run that avoids ever taking a step that would commit the system to a particular decision.

      simple enough, right?

    4. The associated sequence of steps is called a run.

      What's the relation between a step and an event?

    5. Thus, the message system acts nondeterministically, subject only to the condition that if receive( p) is performed infinitely many times, then every message (p, m) in the message buffer is eventually delivered. In particular, the message system is allowed to return 0 a finite number of times in response to receive(p), even though a message (p, m) is present in the buffer

      Hard to follow. Explain please?

      Also, why is this particular assumption made, that at infinity receives, every message is eventually delivered - how does not making this assumption change things?

    6. Our system model is rather strong

      Can we discuss the choices of models that we have and why this one might be particularly stronger than the others?

    7. Assumptions:

      1. No Byzantine failures
      2. Reliable message system
      3. No access to sync clocks (timeout based algos can't be used)
      4. If one non-faulty process receives a message, all non-faulty processes will and the sender knows this.

      No assumptions on:

      1. Relative speeds of processes
      2. Delay time in delivering a message
      3. The ability to detect the death of a process
    8. . In one atomic step, a process can attempt to receive a message, perform local computation on the basis of

      One atomic step:

      1. Attempt to receive a message
      2. perform local computation on the basis of whether or not a message was delivered to it
      3. Send an arbitrary but finite set of messages to other processes
    1. adaptsefficientlyasnodesjoinandleavethesystem,andcananswerqueriesevenifthesystemiscontinuouslychanging

      how?

    1. some application developers may not want to write their own conflict resolution mechanisms and choose to push it down to the data store

      Of course they will, it's the job of the data store, right?

    2. give services control over their system properties, such as durability and consistency, and to let services make their own tradeoffs between functionality, performance and cost-effectiveness

      Interesting... need to unpack this

    3. State management then becomes the main component of a service’s SLA.

      Why?

    4. which are in general measured at the 99.9th percentile of the distribution.

      what does this mean?

    5. object versioning

      What's this?

    6. An SLA stated in terms of mean or median response times will not address the performance of this important customer segment.

      Important insight

    7. non-hostile and there are no security related requirements such as authentication and authorizatio

      Oooh, that's a pretty useful assumption

    8. eventually-consistent storage system can be used in production with demanding application

      Very interesting

    9. In the past year, Dynamo has been the underlying storage technology for a number of the core services in Amazon’s e-commerce platform. It was able to scale to extreme peak loads efficiently without any downtime during the busy holiday shopping season. For example, the service that maintains shopping cart (Shopping Cart Service) served tens of millions requests that resulted in well over 3 million checkouts in a single day and the service that manages session state handled hundreds of thousands of concurrently active sessions.

      Much ads

    10. scalability and availabilit

      The two important things they are going for

    11. To achieve this level of availability, Dynamo sacrifices consistency

      Clear trade-off choice

    12. destroyed by tornados

      Oh the drama

  5. Sep 2016
    1. Exception Handling

      Whereas the exceptions relating to the distributed nature of the system (network partition etc) should have been part of this section, it unfortunately isn't. It is reduced to a call failed exception without too much detail about how to handle it.

    2. Having read the Waldo et al. paper first, I instinctively feel the need to defend this paper, which is kind of unprofessional. But I haven't enough time to shake off the emotions, so here goes.

      The problem that this paper has and also what sort of shields it from the Waldo et al. note is that it doesn't really try too hard to be a seminal contribution to the area of DS as we know it today. All it claims to do is provide an efficient, secure, programmer-of-that-day friendly way for more than one computer to communicate. They do back these claims up with their evaluations. So if they were honest, it is clear that in their evaluation, they did not come across problems that are fundamental to DS such as network partitions and node failures.

      Surely, this paper has important contributions considering that the DS paradigm of choice today - Map Reduce, makes use of some form of RPC to make calls to the map and reduce functions.

    3. Violation of this principle seemed likely to lead us into the complexities that have made previous communication packages and protocols difficult to use

      This line makes it pretty clear that "Distributed systems" and the common problems associated with them was not the concern that was being addressed, but it truly was creating a distributed communication protocol.

    4. imbedded in our major language, Mesa

      It does seem to be a major theme that implementation language choices ARE affecting the design choices, as was correctly pointed out in the Waldo et al paper.

    5. use procedure calls as the paradigm for expressing control and data transfers. For example, message passing might be a plausible alternative.

      Could someone elaborate the difference between the two mentioned paradigms? Doesn't procedure calls involve message passing?

    6. he primary purpose

      When I read this, having read the criticism that has been leveled against this paper in the A note on distributed computing paper. I think I see where the problem lies - this paper was set in a different time, and the problems that were trying to be solved were very different from the problems prevalent in 1994. It feels like at this stage of inception, the very problem of making calls to remote machines was a very new field, and various simplifying assumptions were made, and that performance was thought of, but not in the way that one would think of it ten years later.

    7. these should make it easier to build distributed computations, and to get them right. Another is efficiency: procedure calls seem simple enough for the communication to be quite rapid. A third is generality: in singie-machine com- putations, procedures are often the most important mechanism for communica- tion between parts of the algorithm.

      It's very hard to buy in to these assumptions, to be honest. So I hope they have justified why they think so in the pages to come.

    8. these should make it easier to build distributed computations, and to get them right.

      Easier to build, sure. Get them right? How did they get that idea?

    1. that all current invocations or callsfor system services will be eventually converted into callsthat might be to an object residing on some other machine.

      Why couldn't this be true, if for example, TCP were used?

    2. for easily under-standable reasons

      Are they implying the overhead of changing a lot of code? Or something more fundamental? I don't see the easily understandable reason.

    3. Unless the naming interfaceincludes an operation to lock a naming context, there is noway that I can make this operation completely robust.

      But isn't this true even today, and is just inherently a problem of distributed systems rather than any particular assumptions in a particular paradigm?

    4. This is not to argue against pleasant programming inter-faces.

      There seems to be an inherent assumption that programming interfaces that address DS problems have to be unpleasant, and I don't exactly see why this should be true.

    5. The hard problems have to do with differ-ences in memory access paradigms between local and dis-tributed entities.

      How so? In any way that's fundamentally different than process discovery?

    6. Thehard problems in distributed computing concern dealingwith partial failure and the lack of a central resource man-ager.

      Obviously...

    7. A less optimistic explanation of the failure of each attemptat unification holds that any such attempt will fail for thesimple reason that programming distributed applications isnot the same as programming non-distributed applications

      Did the development of BOOM, BLOOM and BUD take the view that this explanation takes?

    8. what-ever programming model is currently in vogue

      What is it in the 2000s and 2010s? Have we converged to a unified model? How have thing changed?