 Nov 2022

docs.google.com

Devising ML Metrics

 Aug 2022

www.ijcai.org

Parameterised Ratio Synthesis
modification to the algorithm

Together, Theorem 1 and Theorem 2 show that Algorithm 1 is a sound decision procedure for the λ-synthesis problem.
theoretical analysis completed

We now show that the output of λSYNTHESIS reflects the minimum ratio for which the Fault Tolerance problem would return false w.r.t. the concrete system and specification given as input. We prove this by means of two steps.
moving on to theoretical results

We now give some preliminary definitions that we will use in our procedure. For a global state g we use λ(g) to denote the ratio of agents that have exhibited a fault to total agents, i.e.
this notation is a bit confusing when looking briefly. what are the correspondences in the pseudocode?
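As a quick sanity check on this notation, here is a minimal sketch of λ(g); the list-of-dicts state representation and all names are my own assumptions, not the paper's pseudocode.

```python
# Hypothetical sketch of lambda(g): the ratio of agents in a global
# state g that have exhibited a fault, to the total number of agents.
# The representation of a global state as a list of agent dicts is
# an assumption for illustration.

def fault_ratio(global_state):
    """lambda(g) from the quote above."""
    faulty = sum(1 for agent in global_state if agent["faulty"])
    return faulty / len(global_state)

g = [{"faulty": True}, {"faulty": False},
     {"faulty": False}, {"faulty": True}]
print(fault_ratio(g))  # 0.5
```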

We now describe the Lambda procedure. The propositional cases are trivial: we can simply label each state which satisfies or does not satisfy the atomic proposition with the ratio of faulty agents at that state. For the conjunction case, notice we need both φ1 and φ2 to hold, so the minimum necessary ratio of faults is the largest of the ratios for each of these two to hold. The disjunction case is similar.
more in depth explanation starts here
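The case analysis above can be sketched as operations on labellings that map states to minimum fault ratios; everything here (names, data representation) is my guess at the idea, not the paper's Algorithm 2.

```python
# Sketch of the propositional cases of Lambda over labellings of the
# form {state: minimum fault ratio}. Purely illustrative.

def label_atomic(states, prop, fault_ratio):
    # Label each state satisfying the atomic proposition with the
    # ratio of faulty agents at that state.
    return {s: fault_ratio(s) for s in states if prop(s)}

def label_and(l1, l2):
    # Both conjuncts must hold, so the minimum necessary ratio is
    # the larger of the two ratios.
    return {s: max(l1[s], l2[s]) for s in l1 if s in l2}

def label_or(l1, l2):
    # Either disjunct suffices, so keep the smaller ratio where both hold.
    merged = dict(l1)
    for s, r in l2.items():
        merged[s] = min(merged[s], r) if s in merged else r
    return merged
```

For instance, if φ1 holds at state s with ratio 0.1 and φ2 with ratio 0.3, then φ1 ∧ φ2 gets 0.3 at s and φ1 ∨ φ2 gets 0.1.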

This subprocedure works similarly to the labelling algorithm for model checking CTL [Clarke et al., 1999]. The algorithm recursively labels the states of a system with the subformulas of the specification under question that they satisfy. Differently from the CTL labelling algorithm, Lambda labels states with both said subformulas and the minimum ratio of faulty agents to total agents that is sufficient for the specification to be satisfied at the state. Thus, Lambda operates on sets of pairs of states and ratios instead of sets of states. Note that since Lambda takes as input the negation of an ACTLK\X formula, the procedure is defined for the existential fragment ECTLK\X of CTLK\X.
Summary of subprocedure. Main procedure: self-explanatory pseudocode

Our procedure, shown in Algorithm 1, passes the system of n faulty agents and the negation of our specification to a subprocedure Lambda, shown in Algorithm 2.
refs

Symbolic Synthesis of Fault-Tolerance Ratios in Parameterised Multi-Agent Systems


www.nature.com

tags


www.scribbr.com

Redrafting and revising. Once you’ve decided where changes are needed, make the big changes first, as these are likely to have knock-on effects on the rest. Depending on what your text needs, this step might involve:
- Making changes to your overall argument.
- Reordering the text.
- Cutting parts of the text.
- Adding new text.
You can go back and forth between writing, redrafting and revising several times until you have a final draft that you’re happy with.
I might want to write these lists instead as I often have to change them.

 Jul 2022

ifaamas.org

line 2
line numbers in proof

Theorem
If something is an assumption in a theorem, then it can be stated as a definition in a definition environment

Sketch
similarity > assumption > definition > conclusion > qed

The proof is similar to that of Theorem 3.4
many of these

diction
proof by contradiction

Theorem
For once, not so many lemmas.

(⇒)
these symbols show which direction of the iff we are proving

Definition 3.1 (COIS). A collective open interpreted system (COIS) is an OIS O = ⟨A, E, V⟩ where the transition function of A satisfies the condition t(l, a, X ∪ {a}, aE) = t(l, a, X, aE) for all values of l, a, X and aE.
assumptions generally seem to be set in definition environments

sketch
it feels quite informal, yet the idea is clear: they take a known undecidable problem and encode it using their model; since that problem is undecidable, the worst case is undecidable too

O, g ⊨ ∀v1, ..., vn : ψ iff O, g ⊨ ψ[v1 ↦ a1] ... [vn ↦ an] for all pairwise disjoint choices of a1, ..., an ∈ N.
O, g ⊨ (alv, i) iff ∃j : id(g.j) = i;
O, g ⊨ (p, i) iff O, g ⊨ (alv, i) and p ∈ V(si(g));
O, g ⊨ ¬(p, i) iff O, g ⊭ (p, i);
new kinds of connectives

EVALUATION
experiments, no comparisons to other algorithms though

We introduce a semantics, based on interpreted systems, to capture the openness of the system and show how an indexed variant of temporal-epistemic logic can be used to express specifications on them
new kind of problem?


ojs.aaai.org

Figure 3: The architecture of
Maybe have one of these

The estimation of the difficulty of a problem p at splitting depth d that we consider is defined by score(p, d) = stability_ratio(p) − stability_ratio(P) · d^(1/m), where P is the original verification problem and m is the splitting parameter.
experimental metric
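Reading the (partly garbled) formula as score(p, d) = stability_ratio(p) − stability_ratio(P) · d^(1/m), a throwaway implementation; the placement of d and 1/m is my guess from the extraction, so treat this as illustration only.

```python
# Possible reading of the difficulty score; the exponent placement is
# uncertain because the quoted formula is garbled.

def score(stab_p, stab_P, d, m):
    """stab_p: stability ratio of subproblem p; stab_P: stability ratio
    of the original problem P; d: splitting depth; m: splitting parameter."""
    return stab_p - stab_P * d ** (1 / m)
```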

Algorithm
uses milp, calls split

Algorithm
modifies milp

Lemmas 3 and 1
I guess I would typically see Equations (1) and (3) or something like it

Dependency analyser. Given the above, we now put forward a procedure using the identification of dependencies to reduce the configuration space.
feels odd to write it such that a procedure is derived from a couple of lemmas.

Dependency Analysis
the theoretical results appear without any proofs

Lemma 1 gives a procedure for identifying intra-layer dependencies by computing the right-hand side of each of the above clauses for every pair of unstable nodes in a layer. Dependencies between layers require a different treatment.
deriving the parts of the procedure little by little using lemmas?

The procedure runs in time O(k · s²), where k is the number of layers and s is the layers’ maximal size.
this is another theoretical result, apparently not deserving of an environment

We now proceed to derive a procedure for computing a network’s dependency relations. To ease the presentation, we express dependency relations as unions of four disjoint sets D_f = ⋃_{z,z′ ∈ {⊤,⊥}} D_f^{z,z′}, where each D_f^{z,z′} comprises dependencies pertaining to the ReLU states z and z′, i.e., D_f^{z,z′} = {(n_{i,q}, n_{j,r}) | st(n_{i,q}) = z ⟹ st(n_{j,r}) = z′}.
Is this the highlevel description of the theorem? It looks quite different to me ...
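The four disjoint sets can be pictured as a partition of dependency edges keyed by the pair of ReLU states; the sketch below uses my own names and a flat edge list, not the paper's notation.

```python
# Sketch of the four disjoint dependency sets D^{z,z'}: a dependency
# (n_i, n_j) lands in D^{z,z'} when n_i being in ReLU state z forces
# n_j into state z'. State labels and node names are illustrative.

ACTIVE, INACTIVE = "active", "inactive"

def partition_dependencies(deps):
    """deps: iterable of (node_i, state_z, node_j, state_z2) tuples."""
    out = {(z, z2): set()
           for z in (ACTIVE, INACTIVE) for z2 in (ACTIVE, INACTIVE)}
    for ni, z, nj, z2 in deps:
        out[(z, z2)].add((ni, nj))
    return out
```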

Example 1.
One of these examples that is really more of a worked exercise.

procedure DEPENDENCYANALYSIS(milp, h)
procedure name inside the algorithm environment, and descriptive caption instead


www.ifaamas.org p1200.pdf

The construction of the abstract model and its verification against all formulae took approximately 5 seconds on an Intel Core i7 machine clocked at 3.4 GHz, with 7.7 GiB cache, running 64-bit Fedora 20, kernel 3.16.6. MCMAS-OFP and the ISPL-OFP file encoding the scenario are available at [21].
A small experiment

Figure 1: ISPL-OFP snippet.
In this case the algorithm is given in actual code. Also, it is more of a specific case study than a general algorithm.

From the small neighbourhood assumption we have that P̂(γ, o) = P(g, o) ± ε, for each opinion o. Therefore, #(g′, o)/n ≈ [%false(γ) · P̂(γ, o)] + %true(γ, o) for each opinion o.
Is this where we actually use our assumption?

(2) Pick n ≥ α such that SA(n) admits every initial density of opinions that is represented by the set of abstract initial states. Define R as above. Then the proof proceeds along the same lines as the proof in (1).
This seems a bit inconsistent: just before, two lemmas were used for a similar thing.

Lemma 3.11. Given an agent template A and an ACTLK formula φ, the following hold: 1. for all n ≥ α, ŜA(1) simulates SA(n); 2. there is n ≥ α such that SA(n) simulates ŜA(1).
Now, we make a weaker assumption to prove the stronger assumption. Furthermore, we do it both ways.

Lemma 3.10. Assume that ŜA(1) simulates SA(n) and SA(n) simulates ŜA(1). Then, ŜA(1) ⊨ab φ iff SA(n) ⊨ φ, for any ACTLK formula φ. Proof. (⇒) Lemma 3.8. (⇐) Lemma 3.9.
This lemma shows the shape of the eventual theorem but under a stronger assumption. The question is: is it really needed? The proof line could be included in the final theorem.

Lemma 3.9. Assume that SA(n) simulates ŜA(1). Then, SA(n) ⊨ φ implies ŜA(1) ⊨ab φ, for any ACTLK formula φ.
The other way of the implication under the stronger assumption

A simulation relation between the abstract model and a concrete system is defined similarly to Definition 3.7, but by swapping the LHS with the RHS in each of the clauses.
changing the order of the if-statements?

Let φ = CΓψ and assume γ ⊨ab φ
This is the property that is a bit harder to show, so left for last

φ ∈ AP: from simulation requirement 2;
thing proved: reasoning; next thing proved: more reasoning; ... This is a very concise representation. I guess it is clear but still pretty hard to parse—especially without appropriate line breaks. Maybe there just is not space for such line breaks.

by induction on φ.
proof method

Assume a simulation relation R between SA(n) and ŜA(1).
use the assumption

Lemma 3.8. Assume that ŜA(1) simulates SA(n). Then, ŜA(1) ⊨ab φ implies SA(n) ⊨ φ, for any ACTLK formula φ.
This is maybe 1/3 of the theorem to be proved. It shows the equivalence one way. It makes an assumption that we would later want to remove by proving it holds.

By means of the former result, the satisfaction of an ACTLK formula on the abstract model entails the satisfaction of the formula on every concrete system. Conversely, by means of both results, the falsification of an ACTLK formula on the abstract model implies the existence of a concrete system that falsifies the formula
equivalence as long as we restrict ourselves to talking about certain properties. High-level summary before the lemmas necessary for the proof are presented.

The abstract satisfaction relation ⊨ab is defined for group knowledge as (ŜA, γ) ⊨ab EΓφ iff for all γ′ with γ ∼ÊΓ γ′, we have that (ŜA, γ′) ⊨ab φ; for common knowledge it is defined as (ŜA, γ) ⊨ab CΓφ iff for all γ′ with γ ∼ĈΓ γ′, ...
Some further defs near the definition environment

We now establish a correspondence between the concrete systems and the abstract model
Ah, now the naming of concrete system starts to make sense for me; it is contrasted against abstract model

An abstract state is a set of pairs of weights and template local states in R × L; it represents every concrete state in which the ratio of agents in each local state to all the agents approximates the weight associated with the state. Note that every local state that does not appear in an abstract state is assumed to be associated with a weight equal to 0.
Output of construction? Maybe not really the right way to say it. It states an important detail of the construction, but the entire thing is defined in the definition environment later.
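My mental model of an abstract state, as a sketch: a map from template-local states to weights, with missing states implicitly weighted 0, plus the "approximates" check on a concrete state. The tolerance and all names are my own assumptions.

```python
# Sketch of an abstract state and its approximation of concrete states.
# Local states are plain hashable labels; eps is an illustrative tolerance.
from collections import Counter

def abstract_weight(abstract_state, local_state):
    # Local states absent from the abstract state carry weight 0.
    return abstract_state.get(local_state, 0.0)

def approximates(concrete_state, abstract_state, eps=0.05):
    # A concrete state (list of per-agent local states) approximates an
    # abstract state if each local state's agent ratio is within eps of
    # the associated weight.
    n = len(concrete_state)
    counts = Counter(concrete_state)
    seen = set(counts) | set(abstract_state)
    return all(abs(counts[l] / n - abstract_weight(abstract_state, l)) <= eps
               for l in seen)
```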

To solve the PMCP we introduce the notion of weighted abstraction.
First sentence introduces name.

parameterised model-checking problem [5, 10, 15]
PMCP

g.i
This g.i notation for the local state for i given the global state g is quite clean :)

Weighted abstraction
Hmm, maybe the main product from this paper is the abstraction rather than an algorithm

Since swarms are typically made of very large numbers of agents each interacting with very few neighbours, all systems of interest satisfy the small neighbourhood property.
is the assumption reasonable? No assumption environment.

n ≥ α to denote this
shorthand notation

Definition 3.2 (PMCP for small neighbourhoods). The PMCP for OFSs with small neighbourhoods of size α concerns establishing whether the following holds: SA(n) ⊨ φ for every n ≥ α.
problem formulation revisited for stronger assumption

By “much greater” we mean that for any given global state g and opinion o, we have that (#false(g, o)/#false(g))^α = ((#false(g, o) − α)/(#false(g) − α))^α ± ε, for some small constant ε.
clarification
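The ε-condition is easy to make concrete; a small check with #false(g, o) and #false(g) as plain counts (the function name and signature are mine):

```python
# Sketch of the "much greater" check: the ratio of non-latent agents
# holding opinion o is nearly unchanged when alpha agents are removed.
# Argument names are illustrative assumptions.

def small_neighbourhood_holds(n_false_o, n_false, alpha, eps):
    lhs = (n_false_o / n_false) ** alpha
    rhs = ((n_false_o - alpha) / (n_false - alpha)) ** alpha
    return abs(lhs - rhs) <= eps
```

With 500 of 1000 non-latent agents holding o and α = 3, the two sides differ by roughly 0.001, so the property holds for ε = 0.01; with only 6 non-latent agents it fails.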

Given an agent template A defined on a neighbourhood size α ∈ N, we say that an OFS SA(n) satisfies the small neighbourhood property if at each time step the number of agents not in latent state is much greater than the neighbourhood size.
definition in words, still with nebulous concepts such as “much greater than”

We begin by formulating the small neighbourhood property.
novel concept. start defining gradually

Definition 3.1 (PMCP). Given an agent template A and an ACTLK formula φ, the parameterised model checking problem (PMCP) is the decision problem of determining whether the following holds: SA(n) ⊨ φ for every n ≥ α.
Problem formulation

Definition 2.2 (Concrete Agent).
Quite different from how the RL community does it

Intuitively, different behaviours are associated with different opinions. For instance, as we exemplify in Section 4, the latent value can be used to keep track of the time an agent is engaged in the protocol before it goes into latent state; this period of time is proportional to its currently held opinion. For a local state l, we write opinion(l), value(l), and latent(l) to denote the opinion, the latent value, and the latent status, respectively, encoded in l. We assume that whenever latent(l) = false and t(l, o) = l′, then opinion(l′) = o; i.e., at each time step, an agent switches to the majority opinion in its neighbourhood if not in latent state.
How the model is used, but not what each concept means

Note that a local state is built from an observable component representing an opinion, and a non-observable component associated with a latent value and the latent status
note

Definition 2.1 (Agent Template). An agent template is a tuple A = (O, h, α, t), where:
• O is a nonempty and finite set of opinions;
• h : O → N is a mapping from the set of opinions into the set of natural numbers, where h(o) represents the quality of opinion o. O and h define a set L = {(o, v, l) : o ∈ O, 0 ≤ v ≤ max(h(o) : o ∈ O), l ∈ {false, true}} of local states, where each triple (o, v, l) encodes an opinion o, a latent value v, and whether or not the template is in latent state (l = true and l = false, respectively);
• α ∈ N is the size of the neighbourhood for the template at any given time step;
• t : L × O → L is a transition function that returns the next local state given the current local state and the majority opinion held by neighbouring agents.
Seems pretty similar to how I would write
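As a note-to-self, Definition 2.1 transcribes almost directly into a dataclass; this is my paraphrase for illustration, not anything from the paper.

```python
# Sketch of the agent template A = (O, h, alpha, t). All names are mine.
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

LocalState = Tuple[str, int, bool]  # (opinion, latent value, latent?)

@dataclass
class AgentTemplate:
    opinions: Set[str]                          # O
    quality: Dict[str, int]                     # h : O -> N
    alpha: int                                  # neighbourhood size
    t: Callable[[LocalState, str], LocalState]  # transition function

    def local_states(self):
        # L = {(o, v, l) : o in O, 0 <= v <= max h(o), l in {False, True}}
        vmax = max(self.quality.values())
        return {(o, v, l) for o in self.opinions
                for v in range(vmax + 1) for l in (False, True)}
```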

In its simplest form, a majority-rule protocol is described as follows [18]. At each time step a team of three randomly picked agents is formed. Each agent can be in one of two states; each state is associated with one of two opinions. The members of a team adopt the opinion held by the majority of the members in the team. This process is repeated until consensus is reached on one of the two opinions. Studies have shown that a collective agreement is eventually established on the opinion with the highest initial density [18].
There is already an algorithm for achieving consensus. What we are looking for is a procedure by which we verify that it works for arbitrarily many agents

Parameterised model checking has previously been applied for the analysis of generic swarm systems of an arbitrary number of components [16]. In this paper we extend the methodology put forward in [16] to model consensus protocols following the majority rule
Previous work, current work

A key issue with opinion formation protocols is to investigate their convergence. In this paper we put forward a formal methodology that can be applied to analyse consensus protocols that follow the majority rule.
property of interest. same word as in RL although meaning slightly different

Theorem 3.12. Given an agent template A and an ACTLK formula φ, the following hold: 1. ŜA(1) ⊨ φ implies ∀n ≥ α. SA(n) ⊨ φ. 2. ŜA(1) ⊭ φ implies ∃n ≥ α. SA(n) ⊭ φ.
This theorem is less of a standard result. It shows what one model's theorem says about a different kind of model.

OPINION FORMATION SEMANTICS
What I would call Model or Preliminaries. This time there are definition environments involved rather than paragraph environments

We investigate the formal verification of consensus protocols in swarm systems composed of arbitrarily many agents
Sentence 1: Topic

Formal Verification of Opinion Formation in Swarms
This is a more theoretical paper, but it is much older. Maybe ask Alessio which of the many-author papers he is referring to w.r.t. him and Panagiotis co-authoring


www.ijcai.org

The experimental results obtained show 145% performance gains over the present state-of-the-art in complete verification.
Sentence 5: Experimental results

We evaluate the method on all of the ReLU-based fully connected networks from the first competition for neural network verification.
Sentence 4: Experimental set up

This results in dividing the original verification problem into a set of sub-problems whose MILP formulations require fewer integrality constraints.
Sentence 3: Motivation for Approach

The method implements branching on the ReLU states on the basis of a notion of dependency between the nodes.
Sentence 2: Approach

We introduce an efficient method for the complete verification of ReLU-based feed-forward neural networks
Sentence 1: topic

1a
Equation numbering is used extremely sparsely.

Figure 1: A ReLU FFNN. A dashed arrow from x(i)_j to x(q)_r with label s-s′ indicates that the s′ state of node n(q)_r depends on the s state of node n(i)_j.
Pictoral representation of a definition (dependency)

Figure 2: The dependency graph of the network from Figure 1. An edge from vertex (n, s) to vertex (n′, s′) represents that the s′ state of node n′ depends on the s state of node n. The rectangles highlight the active and inactive dependency trees of node n(1)_2.
Pictoral representation of the algorithm

Proof. The proof is by straightforward induction on the branching depth.
A very brief proof. Is there an appendix with more detail? It's not that obvious for a non-insider, although I could probably figure it out by thinking a bunch. Maybe compare another paper.

Conclusions
No discussion of future work whatsoever

Evaluation
experimental part last

Dependencybased Branching
The algorithm and its analysis is given here. Now, there are definitions in the form of definition environments—not many though, only 3.

Background
What I usually call Model or Preliminaries. Rather than using the def environment, definitions introduced in paragraph environments.

Theorem 1 (Soundness and completeness). Given a verification problem (f, X, Y), the procedure VERIFY((f, X, Y)) from Algorithm 1 returns yes iff ∀x ∈ X : f(x) ∈ Y
The theorem shows a standard property soundness/completeness. There is only one theorem and no propositions. The paper is more experimental than it is theoretical though.

Algorithm 1 The verification procedure.
Two procedures in one algorithm environment. Pseudocode.


arxiv.org

Video Pre-Training (VPT): Learning to Act by Watching Unlabeled Online Videos
only on arxiv so far

 May 2022
 Apr 2022

proceedings.neurips.cc

While we commit to GPs for the exploration analysis, for safety any suitable, well-calibrated model is applicable. Assumption 2 (well-calibrated model). Let μn(·) and Σn(·) denote the posterior mean and covariance matrix functions of the statistical model of the dynamics (1) conditioned on n noisy measurements. With σn(·) = trace(Σn(·)^{1/2}), there exists a βn > 0 such that with probability at least (1 − δ) it holds for all n ≥ 0, x ∈ X, and u ∈ U that ‖f(x, u) − μn(x, u)‖₁ ≤ βn σn(x, u)
Simplification: the true transition function is a Gaussian process.

σn(·) = trace(Σn(·)^{1/2})
Just the sum of standard deviations?
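For a diagonal posterior covariance, trace(Σ^{1/2}) is exactly the sum of per-dimension standard deviations; a small numpy check (the function name is mine):

```python
# Sketch: sigma_n(x, u) = trace of the matrix square root of the
# posterior covariance. For a diagonal covariance this reduces to the
# sum of standard deviations.
import numpy as np

def sigma_n(cov):
    # Matrix square root via eigendecomposition (cov symmetric PSD).
    w, v = np.linalg.eigh(cov)
    sqrt_cov = v @ np.diag(np.sqrt(np.clip(w, 0, None))) @ v.T
    return np.trace(sqrt_cov)

cov = np.diag([4.0, 9.0])   # variances 4 and 9
print(sigma_n(cov))         # 5.0 (= 2 + 3)
```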


www.ncbi.nlm.nih.gov

His concern is with the rest: the “me, too” drugs that provide no benefit and those that are actively harmful.
Back when "me too" meant kind of the opposite

 Mar 2022

rewriting.csail.mit.edu


www.alignmentforum.org

steganography
the practice of concealing messages or information within other nonsecret text or data.

Strategy: train a reporter which isn’t useful for figuring out what the human will believe
I find this hard to understand in the sense that it is too vague for me to evaluate.


www.aaai.org

PROST: Probabilistic Planning Based on UCT
Google citations 165


www.aaai.org

to our best knowledge, MDPs with hard constraints have yet to be studied
Really!?


www.aaai.org

MG′
What the total effect would be under this MDP?

TE
Total effect. TE(x, x̄; Y, ε) = E[do(Y_x) − do(Y_x̄)]? (I have omitted ε.)

UXa
What the utility U would have been if the feature/parameter/variable X had taken the value it would have taken if action a had been sampled from A.

Moreover, specializing to reinforcement learning, the one-step Bellman optimality condition for optimizing this objective in expectation is: v^{π∗}[M, G′, ā, π] = max_{a0} Σ_{s1,r1} p(s1 | z0, s0, do(A0 = a0)) · p(r1 | z1^ā, s1, do(A0 = a0)) · p(z1^ā | z0, s0) · [r0 + γ v^{π∗}[M, G′, ā, π](r1, z1^ā)], (6) where γ is the discount rate.
The algorithm

 Feb 2022

proceedings.neurips.cc

timestep
very different!

average score
somewhat different

update frequencies of the candidate policies in the Corridor environments.
where in the main text?

number of queries
works a bit like time? not an average over environments?

Algorithm 1
called iteratively?

Πc ← initialize candidate policies
plausible policies!

r̄ ← mean estimate of the reward model; π̄∗ ← RL(r̄)
only at the end?

ŷ
belief about answer?

H
entropy

 Jan 2022

arxiv.org

CONCLUSION
My worry with this paper is that they only show that the RL agent has an incentive to control noise, but this is not necessarily undesirable. Suppose the initial state and policy have some probability of causing an accident. In that case, we would want the RL agent to manipulate the environment into a state where there is less noise, such that an accident can be guaranteed not to happen.

That is to say, the exogenous variable that is the user’s preferences etc., at time 𝑥, is represented as 𝜃𝑇𝑥
θT is unobserved features treated as noise in the transition function?

i) ‘static’ machine learning approaches [2, 6, 9, 16, 18];
For example collaborative filtering, where the preferences of one user are predicted (filtering) from ratings given by many other users (collaborative)

The agent heavily exploits user tampering even though we were able to generate similar cumulative rewards with our crude ‘baseline’ policy.
Surprising to me.

If the recommendation to both Alice and Bob at the next timestep, say 𝐴𝑥, is an article about federal politics, it is intuitively untrue that the distribution over possible states at 𝑆𝑥+1 is the same; Alice is surely more likely to observably engage with this content.
Is this a kind of context?

User Tampering in Reinforcement Learning RecommenderSystems
Not published in proceedings as of today. Published as a workshop paper at FAccTRec ’21, September 25, 2021.

 Aug 2021

localhost:8888

Chart
Should change this ...

The Thrill Is Gone
Great name!
