Hypothesis

120 Matching Annotations

Nov 2022
docs.google.com docs.google.com

Devising ML Metrics

1
1. carlhenrikrolf 11 Nov 2022
  
  in Public
  
  Devising ML Metrics
  
  benchmarking environment suites
Visit annotations in context

Tags

benchmarking

environment suites

Annotators

carlhenrikrolf

URL

docs.google.com/document/d/1Pesm4eDQKK96ZMsmzbLqfb-VvmisikJrZ7NhygS-QTU/edit
Aug 2022
www.ijcai.org www.ijcai.org

Symbolic Synthesis of Fault-Tolerance Ratios in Parameterised Multi-Agent Systems

8
1. carlhenrikrolf 15 Aug 2022
  
  in Public
  
  Parameterised Ratio Synthesis
  
  modification to the algorithm
2. carlhenrikrolf 15 Aug 2022
  
  in Public
  
  Together, Theorem 1 and Theorem 2 show that Algorithm 1is a sound decision procedure for the λ-synthesis problem.
  
  theoretical analysis completed
3. carlhenrikrolf 15 Aug 2022
  
  in Public
  
  We now show that the output of λ-SYNTHESIS reflects theminimum ratio for which the Fault Tolerance problem wouldreturn false w.r.t the concrete system and specification givenas input. We prove this by means of two steps.
  
  moving on to theoretical results
4. carlhenrikrolf 15 Aug 2022
  
  in Public
  
  We now give some preliminary definitions that we will usein our procedure. For a global state g we use λ(g) to denotethe ratio of agents that have exhibited a fault to total agents,i.e.
  
  this notation is a bit confusing when looking briefly. what are the correspondences in the pseudocode?
5. carlhenrikrolf 15 Aug 2022
  
  in Public
  
  We now describe the Lambda procedure. The proposi-tional cases are trivial: we can simply label each state whichsatisfies or does not satisfy the atomic proposition with theratio of faulty agents at that state. For the conjunction case,notice we need both φ1 and φ2 to hold so the minimum neces-sary ratio of faults is the largest of the ratios for each of thesetwo to hold. The disjunction case is similar.
  
  more in depth explanation starts here
6. carlhenrikrolf 15 Aug 2022
  
  in Public
  
  This sub-procedure works similarly to the labelling algo-rithm for model checking CTL [Clarke et al., 1999]. Thealgorithm recursively labels the states of a system with thesubformulas of the specification under question that they sat-isfy. Differently from the CTL labelling algorithm, Lambdalabels states with both said subformulas and the minimum ra-tio of faulty agents to total agents that is sufficient for thespecification to be satisfied at the state. Thus, Lambda op-erates on sets of pairs of states and ratios instead of sets ofstates. Note that since Lambda takes as input the negationof an ACTLK\X formula, the procedure is defined for theexistential fragment ECTLK\X of CTLK\X.
  
  Summary of subprocedure. Main procedure self-explnatory pseudocode
7. carlhenrikrolf 15 Aug 2022
  
  in Public
  
  Ourprocedure, shown in Algorithm 1, passes the system of nfaulty agents and the negation of our specification to a sub-procedure Lambda, shown in Algorithm 2.
  
  refs
8. carlhenrikrolf 15 Aug 2022
  
  in Public
  
  Symbolic Synthesis of Fault-Tolerance Ratiosin Parameterised Multi-Agent Systems
  
  scientific writing
Visit annotations in context

Tags

scientific writing

Annotators

carlhenrikrolf

URL

ijcai.org/proceedings/2018/0045.pdf
www.nature.com www.nature.com

Magnetic control of tokamak plasmas through deep reinforcement learning

1
1. carlhenrikrolf 12 Aug 2022
  
  in Public
  
  tags
  
  automating science
Visit annotations in context

Tags

automating science

Annotators

carlhenrikrolf

URL

nature.com/articles/s41586-021-04301-9
www.scribbr.com www.scribbr.com

A step-by-step guide to the writing process

1
1. carlhenrikrolf 12 Aug 2022
  
  in Public
  
  Redrafting and revising Once you’ve decided where changes are needed, make the big changes first, as these are likely to have knock-on effects on the rest. Depending on what your text needs, this step might involve: Making changes to your overall argument. Reordering the text. Cutting parts of the text. Adding new text. You can go back and forth between writing, redrafting and revising several times until you have a final draft that you’re happy with.
  
  I might want to write these lists instead as I often have to change them.
Visit annotations in context

Annotators

carlhenrikrolf

URL

scribbr.com/academic-writing/writing-process/
Jul 2022
ifaamas.org ifaamas.org

Formal Verification of Open Multi-Agent Systems

12
1. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  line 2
  
  line numbers in proof
2. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Theorem
  
  If something is an assumption in a theorem, then it can be used as a def in a def env
3. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Sketch
  
  similarity -> assumption -> definition -> conclusion -> qed
4. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  The proof is similar to that of Theorem 3.4
  
  many of these
5. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  diction
  
  proof by contradiction
6. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Theorem
  
  For once, not so many lemmas.
7. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  (⇒)
  
  these symbols show which direction of the iff we are proving
8. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Definition 3.1 (COIS). A collective open interpreted system (COIS)is an OIS O = ⟨A, E, V⟩ where the transition function of A satisfiesthe conditiont(l, a, X ∪ {a}, aE ) = t(l, a, X , aE )for all values of l, a, X and aE .
  
  assumptions generally seem to be set in definition enviornments
9. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  sketch
  
  it feels very informal in some way. yet it is pretty clear that they take this known problem. Encode it using their model. Since this example is undecidable, the worst case is such
10. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  O, д |= ∀v1, ...,vn : ψ iff O, д |= ψ [v1 7 → a1] . . . [vn 7 →an ] for all pairwise disjointchoices of a1, . . . , an ∈ N.O, д |= (alv, i) iff ∃j : id(д.j) = iO, д |= (p, i) iff O, д |= (alv, i) and p ∈ V(si (д));O, д |= ¬(p, i) iff O, д ̸ |= (p, i);
  
  new kinds of connectives
11. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  EVALUATION
  
  experiments, no comparisons to other algorithms though
12. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  We introduce asemantics, based on interpreted systems, to capture the openness ofthe system and show how an indexed variant of temporal-epistemiclogic can be used to express specifications on them
  
  new kind of problem?
Visit annotations in context

Annotators

carlhenrikrolf

URL

ifaamas.org/Proceedings/aamas2019/pdfs/p179.pdf
ojs.aaai.org ojs.aaai.org

Efficient Verification of ReLU-Based Neural Networks via Dependency Analysis

12
1. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Figure 3: The architecture of
  
  Maybe have one of these
2. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  he estimation of the difficulty of a problem p at splittingdepth d that we consider is defined byscore(p, d) = stability ratio(p) − stability ratio(P )d 1m,where P is the original verification problem and m is thesplitting parameter.
  
  experimental metric
3. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Algorithm
  
  uses milp, calls split
4. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Algorithm
  
  modifies milp
5. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Lemmas 3 and 1
  
  I guess I would typically see Equations (1) and (3) or something like it
6. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Dependency analyser. Given the above, we now put for-ward a procedure using the identification of dependencies toreduce the configuration space.
  
  feels odd to write such that you derive a procedure from a couple of lemmas.
7. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Dependency Analysis
  
  the theoretical results appear without any proofs
8. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Lemma 1 gives a procedure for identifying intra-layer de-pendencies by computing the right hand side of each of theabove clauses for every pair of unstable nodes in a layer.Dependencies between layers require a different treatment.
  
  deriving the parts of the procedure little by little using lemmas?
9. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  The procedure runs in time O(k · s2),where k is the number of layers and s is the layers’ maximalsize.
  
  this is another theoretical result apparently non-deserving of an environment
10. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  We now proceed to derive a procedure for computing anetwork’s dependency relations. To ease the presentation,we express dependency relations as unions of four disjointsets Df = ⋃z,z′ ∈{,⊥} Dz,z′f , where each Dz,z′f comprisesdependencies pertaining to the ReLU states z and z′, i.e.,Dz,z′f {(ni,q , nj ,r ) | st(ni,q ) = z =⇒ st(nj ,r ) = z ′}
  
  Is this the high-level description of the theorem? It looks quite different to me ...
11. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Example 1.
  
  One of these examples that is really more of a worked exercise.
12. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  procedure D EPENDENCY ANALYSIS (milp, h)
  
  procedure name inside the algorithm environment, and descriptive caption instead
Visit annotations in context

Annotators

carlhenrikrolf

URL

ojs.aaai.org/index.php/AAAI/article/download/5729/5585
www.ifaamas.org www.ifaamas.org

p1200.pdf

39
1. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  The construction of the abstract model and its verificationagainst all formulae took approximately 5 seconds on anIntel Core i7 machine clocked at 3.4 GHz, with 7.7 GiBcache, running 64-bit Fedora 20, kernel 3.16.6. MCMAS-OFPand the ISPL-OFP file encoding the scenario are availableat [21].
  
  A small experiment
2. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Figure 1: ISPL-OFP snippet.
  
  In this case the algorithm is given in actual code. Also, it is more for a specific case study rather than in general.
3. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  From the small neighbourhood assumption wehave that ˆP (γ, o) = P (g, o) ± , for each opinion o.Therefore,#(g′, o)n ≈ [%false (γ). ˆP (γ, o)] + %true (γ, o)for each opinion o.
  
  Is this where we actually use our assumption?
4. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  (2) Pick n α such that SA(n) admits every initial den-sity of opinions that is represented by the set of abstractinitial states. Define R as above. Then the proof proceedsalong the same lines with the proof in (1).
  
  This seems like a bit of an inconsistent writing. Just before we did two lemmas for a similar thing.
5. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Lemma 3.11. Given an agent template A and an ACTLKformula φ, the following hold:1. for all n α, ˆSA(1) simulates SA(n);2. there is n α such that SA(n) simulates ˆSA(1).
  
  Now, we make a weaker assumption to prove the stronger assumption. Furthermore, we do it both ways.
6. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Lemma 3.10. Assume that ˆSA(1) simulates SA(n) andSA(n) simulates ˆSA(1). Then, ˆSA(1) |=ab φ iff SA(n) |= φ,for any ACTLK formula φ.Proof. (⇒) Lemma 3.8. (⇐) Lemma 3.9.
  
  This lemma shows the shape of the eventual theorem but under stronger assumption. The question is: Is it really needed. The proof line could be included in the final theorem.
7. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Lemma 3.9. Assume that SA(n) simulates ˆSA(1). Then,SA(n) |= φ implies ˆSA(1) |=ab φ, for any ACTLK formulaφ.
  
  The other way of the implication under the stronger assumption
8. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  A simulation relation between the abstract model and aconcrete system is defined similarly to Definition 3.7, butby swapping the LHS with the RHS in each of the clauses.
  
  changing the order of the if-statements?
9. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Let φ = CΓψ and assume γ |=ab φ
  
  This is the property that is a bit harder to show, so left for last
10. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  φ ∈ AP : from simulation requirement2;
  
  thing proved: reasoning; next thing proved: more reasoning; ... This is a very concise representation. I guess it is clear but still pretty hard to parse—especially without appropriate line breaks. Maybe there just is not space for such line breaks.
11. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  by induction on φ.
  
  proof method
12. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Assume a simulation relation R betweenSA(n) and ˆSA(1).
  
  use the assumption
13. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  Lemma 3.8. Assume that ˆSA(1) simulates SA(n). Then,ˆSA(1) |=ab φ implies SA(n) |= φ, for any ACTLK formulaφ.
  
  This is maybe 1/3 of the of the theorem to be proved. It shows the equivalence one way. It makes an assumption that we would later want to remove by proving it holds.
14. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  By means of the former result, thesatisfaction of an ACTLK formula on the abstract modelentails the satisfaction of the formula on every concrete sys-tem. Conversely, by means of both results, the falsificationof an ACTLK formula on the abstract model implies the ex-istence of a concrete system that falsifies the formula
  
  equivalence as long as we restrict ourselves to talk about certain properties. High-level summary before we start presenting the lemmas necessary for the proof.
15. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  The abstract satisfaction relation |=ab is defined for groupknowledge as ( ˆSA, γ) |=ab EΓφ iff for all γ′ with γ ∼ ˆEΓ γ′,we have that ( ˆSA, γ′) |=ab φ; for common knowledge it isdefined as ( ˆSA, γ) |=ab CΓφ iff for all γ′ with γ ∼ ˆCΓ γ′,
  
  Some further defs near the definition environment
16. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  We now establish a correspondence between the concretesystems and the abstract model
  
  Ah, now the naming of concrete system starts to make sense for me; it is contrasted against abstract model
17. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  An abstract state is a set of pairs of weights and templatelocal states in R × L; it represents every concrete state inwhich the ratio of agents in each local state to all the agentsapproximates the weight associated with the state. Notethat every local state that does not appear in an abstractstate is assumed to be associated with a weight equal to 0.
  
  Output of construction? Maybe not really the right way to say it. It says something, an important detail, of the construction, but the entire thing is defined in the definition environment later.
18. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  To solve the PMCP we introduce the notion of weightedabstraction.
  
  First sentence introduces name.
19. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  parameterised modelchecking problem [5, 10, 15
  
  PMCP
20. carlhenrikrolf 26 Jul 2022
  
  in Public
  
  g.i
  
  This g.i notation for the local state for i given the global state g is quite clean :)
21. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Weighted abstraction
  
  Hmm, maybe the main product from this paper is the abstraction rather than an algorithm
22. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Since swarms are typically made ofvery large numbers of agents each interacting with very fewneighbours, all systems of interest satisfy the small neigh-bourhood property.
  
  is the assumption reasonable? No assumption environment.
23. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  n α to denote this
  
  shorthand notation
24. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Definition 3.2 (PMCP for small neighbourhoods).The PMCP for OFSs with small neighbourhoods of size αconcerns establishing whether the following holds:SA(n) |= φi for every n α.
  
  problem formulation revisited for stronger assumption
25. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  By “much greater” we mean that for anygiven global state g and opinion o, we have that( #false (g,o)#false (g))α=( #false (g,o)−α#false (g)−α)α± , for some small constant .
  
  clarification
26. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Given an agent template A defined on a neighbourhoodsize α ∈ N, we say that an OFS SA(n) satisfies the smallneighbourhood property if at each time step the number ofagents not in latent state is much greater than the neigh-bourhood size.
  
  definition in words still with nebulous concepts such as much greater than
27. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  We begin by formulating the small neighbourhood prop-erty.
  
  novel concept. start defining gradually
28. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Definition 3.1 (PMCP). Given an agent template Aand an ACTLK formula φ, the parameterised model check-ing problem (PMCP) is the decision problem of determiningwhether the following holds:SA(n) |= φ for every n ≥ α.
  
  Problem formulation
29. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Definition 2.2 (Concrete Agent).
  
  Quite different from how the RL community coes ir
30. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  ntuitively, different behaviours are asso-ciated with different opinions. For instance, as we exemplifyin Section 4, the latent value can be used to keep trackof the time an agent is engaged in the protocol before itgoes into latent state; this period of time is proportionalto its currently held opinion. For a local state l, we writeopinion(l), value(l), and latent(l) to denote the opinion, thelatent value, and the latent status, respectively, encoded in l.We assume that whenever latent(l) = false and t(l, o) = l′,then opinion(l′) = o; i.e, at each time step, an agent switchesto the majority opinion in its neighbourhood if not in latentstate.
  
  How model is used but not what each concept means
31. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Note that a local state is built from an observable com-ponent representing an opinion, and a non-observable com-ponent associated with a latent value and the latent status
  
  note
32. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Definition 2.1 (Agent Template). An agent templateis a tuple A = (O, h, α, t), where:• O is a nonempty and finite set of opinions;• h : O → N is a mapping from the set of opinionsinto the set natural numbers, where h(o) represents thequality of opinion o. O and h define a setL = {(o, v, l) : o ∈ O, 0 ≤ v ≤ max(h(o) : o ∈ O),l ∈ {false, true}}of local states, where each triple (o, v, l) encodes anopinion o, a latent value v, and whether or not thetemplate is into latent state (l = true, and l = false,respectively);• α ∈ N is the size of the neighbourhood for the templateat any given time step;• t : L × O → L is a transition function that returns thenext local state given the current local state and themajority opinion held by neighbouring agents.
  
  Seems pretty similar to how I would write
33. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  n its simplest form, a majority-rule protocol is describedas follows [18]. At each time step a team of three randomlypicked agents is formed. Each agent can be in one of twostates; each state is associated with one of two opinions. Themembers of a team adopt the opinion held by the majorityof the members in team. This process is repeated until con-sensus is reached on one of the two opinions. Studies haveshown that a collective agreement is eventually establishedon the opinion with the highest initial density [18].
  
  There is already an algorithm for achieving consensus. What er are looking for is a procedure by which we verify that it works for arbitrarily many agents
34. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  arameterised model checking has previously been appliedfor the analysis of generic swarm systems of an arbitrarynumber of components [16]. In this paper we extend themethodology put forward in [16] to model consensus proto-cols following the majority rule
  
  Previous work, current work
35. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  A key issue with opinion formation protocols is to investi-gate their convergence. In this paper we put forward a for-mal methodology that can be applied to analyse consensusprotocols that follow the majority rule.
  
  property of interest. same word as in RL although meaning slightly different
36. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Theorem 3.12. Given an agent template A and an ACTLKformula φ, the following hold:1. ˆSA(1) |= φ implies ∀n α. SA(n) |= φ.2. ˆSA(1) 6 |= φ implies ∃n α. SA(n) 6 |= φ.
  
  This theorem is less of a standard result. It shows what one model's theorem says about a different kind of model.
37. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  OPINION FORMATION SEMANTICS
  
  What I would call Model or Preliminaries. This time there are definition environments involved rather than paragraph environments
38. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  We investigate the formal verification of consensus proto-cols in swarm systems composed of arbitrary many agents
  
  Sentence 1: Topic
39. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Formal Verification of Opinion Formation in Swarms
  
  This is a more theoretical paper, but it is much older. Maybe ask Alessio which of the many-author papers he is referring to wrt him and Panagiotis coauthoring
Visit annotations in context

Annotators

carlhenrikrolf

URL

ifaamas.org/Proceedings/aamas2016/pdfs/p1200.pdf
www.ijcai.org www.ijcai.org

Towards Scalable Complete Verification of Relu Neural Networks via Dependency-based Branching

15
1. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  he experimental resultsobtained show 145% performance gains over thepresent state-of-the-art in complete verification.
  
  Sentence 5: Experimental results
2. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  We evaluatethe method on all of the ReLU-based fully con-nected networks from the first competition for neu-ral network verification.
  
  Sentence 4: Experimental set up
3. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  This results individing the original verification problem into aset of sub-problems whose MILP formulations re-quire fewer integrality constraints.
  
  Sentence 3: Motivation for Approach
4. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  The method implements branchingon the ReLU states on the basis of a notion ofdependency between the nodes.
  
  Sentence 2: Approach
5. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  We introduce an efficient method for the com-plete verification of ReLU-based feed-forward neu-ral networks
  
  Sentence 1: topic
6. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  1a
  
  Equation numbering is used extremely sparsely.
7. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Figure 1: A ReLU FFNN. A dashed arrow from x(i)j to x(q)r withlabel ss′ indicates that the s′ state of node n(q)r depends on the sstate of node n(i)j .
  
  Pictoral representation of a definition (dependency)
8. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Figure 2: The dependency graph of the network from Figure 1. An edge from vertex (n, s) to vertex (n′, s′) represents that the s′ state ofnode n′ depends on the s state of node n. The rectangles highlight the active and inactive dependency trees of node n(1 )2 .
  
  Pictoral representation of the algorithm
9. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Proof. The proof is by straightforward induction on thebranching depth.
  
  A very brief proof. Is there an appendix with more detail. It's not that obvious for a non-insider—although, I could probably figure it out by thinking a bunch. Maybe compare another paper.
10. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Conclusions
  
  No discussions for future work whatsoever
11. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Evaluation
  
  experimental part last
12. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Dependency-based Branching
  
  The algorithm and its analysis is given here. Now, there are definitions in the form of definition environments—not many though, only 3.
13. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Background
  
  What I usually call Model or Preliminaries. Rather than using the def environment, definitions introduced in paragraph environments.
14. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Theorem 1 (Soundness and completeness). Given a verifi-cation problem (f , X , Y), the procedure VERIFY((f , X , Y))from Algorithm 1 returns yes iff ∀x ∈ X : f (x) ∈ Y
  
  The theorem shows a standard property soundness/completeness. There is only one theorem and no propositions. The paper is more experimental than it is theoretical though.
15. carlhenrikrolf 25 Jul 2022
  
  in Public
  
  Algorithm 1 The verification procedure.
  
  Two procedures in one algorithm enviornment. Pseudocode.
Visit annotations in context

Annotators

carlhenrikrolf

URL

ijcai.org/proceedings/2021/0364.pdf
arxiv.org arxiv.org

2206.11795.pdf

1
1. carlhenrikrolf 03 Jul 2022
  
  in Public
  
  Video PreTraining (VPT): Learning to Act byWatching Unlabeled Online Videos
  
  only on arxiv so far
Visit annotations in context

Annotators

carlhenrikrolf

URL

arxiv.org/pdf/2206.11795.pdf
May 2022
link.springer.com link.springer.com

FACT: A Probabilistic Model Checker for Formal Verification with Confidence Intervals

1
1. carlhenrikrolf 17 May 2022
  
  in Public
Visit annotations in context

Annotators

carlhenrikrolf

URL

link.springer.com/chapter/10.1007/978-3-662-49674-9_32
Apr 2022
proceedings.neurips.cc proceedings.neurips.cc

NIPS-2017-safe-model-based-reinforcement-learning-with-stability-guarantees-Paper.pdf

2
1. carlhenrikrolf 05 Apr 2022
  
  in Public
  
  While we commit to GPs for theexploration analysis, for safety any suitable, well-calibrated model is applicable.Assumption 2 (well-calibrated model). Let μn(·) and Σn(·) denote the posterior mean and covari-ance matrix functions of the statistical model of the dynamics (1) conditioned on n noisy measurements.With σn(·) = trace(Σ1/2n (·)), there exists a βn > 0 such that with probability at least (1 − δ) it holdsfor all n ≥ 0, x ∈ X , and u ∈ U that ‖f (x, u) − μn(x, u)‖1 ≤ βnσn(x, u)
  
  Simplification: the true transition function is a Gaussian process.
2. carlhenrikrolf 05 Apr 2022
  
  in Public
  
  σn(·) = trace(Σ1/2n (·))
  
  Just the sum of standard deviations?
Visit annotations in context

Annotators

carlhenrikrolf

URL

proceedings.neurips.cc/paper/2017/file/766ebcd59621e305170616ba3d3dac32-Paper.pdf
www.ncbi.nlm.nih.gov www.ncbi.nlm.nih.gov

Deadly medicines and organised crime: How big pharma has corrupted healthcare

1
1. carlhenrikrolf 02 Apr 2022
  
  in Public
  
  His concern is with the rest: the “me, too” drugs that provide no benefit and those that are actively harmful.
  
  Back when "me too" meant kind of the opposite
Visit annotations in context

Annotators

carlhenrikrolf

URL

ncbi.nlm.nih.gov/pmc/articles/PMC4046551/
Mar 2022
rewriting.csail.mit.edu rewriting.csail.mit.edu

Rewriting a Deep Generative Model

1
1. carlhenrikrolf 23 Mar 2022
  
  in Public
Visit annotations in context

Annotators

carlhenrikrolf

URL

rewriting.csail.mit.edu/
www.alignmentforum.org www.alignmentforum.org

ELK prize results - AI Alignment Forum

2
1. carlhenrikrolf 16 Mar 2022
  
  in Public
  
  steganography
  
  the practice of concealing messages or information within other non-secret text or data.
2. carlhenrikrolf 16 Mar 2022
  
  in Public
  
  Strategy: train a reporter which isn’t useful for figuring out what the human will believe
  
  I find this hard to understand in the sense that it is too vague for me too evaluate.
Visit annotations in context

Annotators

carlhenrikrolf

URL

alignmentforum.org/posts/zjMKpSB2Xccn9qi5t/elk-prize-results
www.aaai.org www.aaai.org

PROST: Probabilistic Planning Based on UCT

1
1. carlhenrikrolf 15 Mar 2022
  
  in Public
  
  PROST: Probabilistic Planning Based on UCT
  
  Google citations 165
Visit annotations in context

Annotators

carlhenrikrolf

URL

aaai.org/ocs/index.php/ICAPS/ICAPS12/paper/viewFile/4715/4721
www.aaai.org www.aaai.org

Hard Constrained Semi-Markov Decision Processes

1
1. carlhenrikrolf 15 Mar 2022
  
  in Public
  
  to our best knowledge, MDPswith hard constraints have yet been studied
  
  Really!?
Visit annotations in context

Annotators

carlhenrikrolf

URL

aaai.org/Papers/AAAI/2006/AAAI06-088.pdf
www.aaai.org www.aaai.org

Path-specific Objectives for Safer Agent Incentives

4
1. carlhenrikrolf 14 Mar 2022
  
  in Public
  
  MG′
  
  What the total effect would be under this MDP?
2. carlhenrikrolf 14 Mar 2022
  
  in Public
  
  TE
  
  Total effect. ¿TE(x, x̄; Y, ε) = E[do(Y_x) - do(Y_x̄)]? (I have omitted ε.)
3. carlhenrikrolf 14 Mar 2022
  
  in Public
  
  UXa
  
  What the utility U would have been if the feature/parameter/variable X had taken the value it would have taken if action a had been sampled from A.
4. carlhenrikrolf 14 Mar 2022
  
  in Public
  
  Moreover, specializing to reinforcement learning, the one-step Bellman optimality condition for optimizing this objec-tive in expectation is:vπ∗ [M,G′ , ̄a,π]= maxa0∑s1,r1p (s1 | z0, s0, do(A0 = a0))p (r1 | z1 ̄a, s1, do(A0 = a0)) p (z1 ̄a | z0, s0)·[r0 + γvπ∗ [M,G′ , ̄a,π](r1, z1 ̄a)], (6)where γ is the discount rate.
  
  The algorithm
Visit annotations in context

Annotators

carlhenrikrolf

URL

aaai.org/AAAI22Papers/AAAI-1140.FarquharS.pdf
Feb 2022
proceedings.neurips.cc proceedings.neurips.cc

NeurIPS-2021-information-directed-reward-learning-for-reinforcement-learning-Paper.pdf

9
1. carlhenrikrolf 28 Feb 2022
  
  in Public
  
  timestep
  
  very different!
2. carlhenrikrolf 28 Feb 2022
  
  in Public
  
  average score
  
  somewhat different
3. carlhenrikrolf 28 Feb 2022
  
  in Public
  
  update frequencies of the candidatepolicies in the Corridor environments.
  
  where in the main text?
4. carlhenrikrolf 28 Feb 2022
  
  in Public
  
  number of queries
  
  works a bit like time? not an average over environments?
5. carlhenrikrolf 28 Feb 2022
  
  in Public
  
  Algorithm 1
  
  called iteratively?
6. carlhenrikrolf 28 Feb 2022
  
  in Public
  
  Πc ← initialize candidate policies
  
  plausible policies!
7. carlhenrikrolf 28 Feb 2022
  
  in Public
  
  r ← mean estimate of the reward model; ̄π∗ ← RL( ̄r)
  
  only at the end?
8. carlhenrikrolf 28 Feb 2022
  
  in Public
  
  ˆy
  
  belief about answer?
9. carlhenrikrolf 28 Feb 2022
  
  in Public
  
  H
  
  entropy
Visit annotations in context

Annotators

carlhenrikrolf

URL

proceedings.neurips.cc/paper/2021/file/1fa6269f58898f0e809575c9a48747ef-Paper.pdf
Jan 2022
arxiv.org arxiv.org

User Tampering in Reinforcement Learning Recommender SystemsUser Tampering in Reinforcement Learning Recommender Systems

6
1. carlhenrikrolf 17 Jan 2022
  
  in Public
  
  CONCLUSION
  
  My worry with this paper is that they only show that the RL-agent has an incentive to control noise, but this is not necessarily undesirable. Suppose the initial state and policy has some probability of causing an accident. In that case, we would want the RL-agent to manipulate the environment into a state where there is less noise such that an accident can be guaranteed not to happen.
2. carlhenrikrolf 17 Jan 2022
  
  in Public
  
  That is to say, the exogenous variable that is the user’s preferencesetc., at time 𝑥, is represented as 𝜃𝑇𝑥
  
  θT is unobserved features treated as noise in the transition function?
3. carlhenrikrolf 17 Jan 2022
  
  in Public
  
  i) ‘static’ machine learning approaches [2, 6, 9, 16, 18];
  
  For example collaborative filtering, where the preferences of one users are predicted (filtering) from ratings given by many other users (collaborative)
4. carlhenrikrolf 13 Jan 2022
  
  in Public
  
  The agent heavily exploits user tampering even thoughwe were able to generate similar cumulative rewardswith our crude ‘baseline’ policy.
  
  Surprising to me.
5. carlhenrikrolf 13 Jan 2022
  
  in Public
  
  If the recommendation to both Alice and Bob at the next time-step– say, 𝐴𝑥 – is an article about federal politics, it is intuitively untruethat the distribution over possible states at 𝑆𝑥+1 is the same; Aliceis surely more likely to observably engage with this content.
  
  Is this a kind of context?
6. carlhenrikrolf 13 Jan 2022
  
  in Public
  
  User Tampering in Reinforcement Learning RecommenderSystems
  
  Not published in proceeding as of today. Published as a workshop paper at FAccTRec ’21, September 25 2021.
Visit annotations in context

Annotators

carlhenrikrolf

URL

arxiv.org/pdf/2109.04083.pdf
Aug 2021
localhost:8888 localhost:8888

Test – Ze Ozer Vay Around

2
1. carlhenrikrolf 12 Aug 2021
  
  in Public
  
  Chart
  
  Should change this ...
2. carlhenrikrolf 11 Aug 2021
  
  in Public
  
  The Thrill Is Gone
  
  Great name!
Visit annotations in context

Annotators

carlhenrikrolf

URL

localhost:8888/2021/08/10/test/

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL

Annotators

URL