335 Matching Annotations
  1. Last 7 days
    1. Symmetry breaking

      Two propositional variables are symmetrical iff exchanging them leads to the same problem. Their identity carries no relative value.

    1. Difference logic

      Why are there no propositional operations?

    2. EUF

      Theory of Equality and Uninterpreted Functions

    3. if \(\phi' \notin \mathrm{SAT}\), then return \(\phi \notin \mathrm{SAT}\)

      The first SAT is propositional. The second SAT is modulo T.

  2. Oct 2020
    1. How does the meaning of the formula change if you replace \((x_n \lor y_n \lor a_{n-1})\) by \((x_n \lor a_{n-1}) \land (y_n \lor a_{n-1})\)?

      The inequality becomes strict.

    2. What is the purpose of auxiliary variables?

      Let's assume that \(a_0 = 1\). Let \(j\) be the smallest index such that \(x_j \neq y_j\). Then \(a_i = 1\) for all \(i < j\).

      • If \(x_j > y_j\), then the formula is unsatisfiable.
      • If \(x_j < y_j\), then \(a_i\) is unconstrained for all \(i \geq j\).

      \(a_i\) is forced to 1 iff \(x\) and \(y\) are constrained at index \(i\).

    3. Why can we replace \((x_n \lor y_n \lor a_{n-1}) \land (x_n \lor a_n \lor a_{n-1}) \land (y_n \lor a_n \lor a_{n-1})\) just by \((x_n \lor y_n \lor a_{n-1})\)? Hence we need only \(3n - 2\) clauses and \(n - 1\) auxiliary variables (\(a_n\) is also useless)

      The only purpose of \(a_i\) is to signal to the higher indices of \(x\) and \(y\) whether they are constrained.

    4. Why is \(a_0\) always false and hence useless?

      We must constrain \(a_0\) to 1. Otherwise the whole formula is satisfied by \(a_i = 0\) for all \(i\).

    1. How to encode typical constraints
    2. hard—must be satisfied

      This seems to be possible to implement with transfinite weights.

    3. 𝑝

      Usually \(p < 0.5\).

    4. Unary encoding (order encoding)

      Makes expressing inequalities easy:

      • \(x < i \leftrightarrow \overline{x_i}\)
      • \(x \geq i \leftrightarrow x_i\)
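
      A minimal sketch of the axiom clauses behind this encoding (my own code, assuming DIMACS-style signed integer literals):

      def order_encoding_axioms(x_vars):
          # x_vars[i] is the Boolean variable meaning "x >= i + 1".
          # Axiom: x >= i + 1 implies x >= i, i.e. the clause (-x_{i+1} or x_i).
          return [[-x_vars[i + 1], x_vars[i]] for i in range(len(x_vars) - 1)]

      With these axioms, \(x < i\) is the unit clause [-x_vars[i - 1]] and \(x \geq i\) is the unit clause [x_vars[i - 1]].
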
    5. while \(\phi\) contains a pure literal \(l\) do delete clauses containing \(l\) from \(\phi\)

      In practice pure literal elimination is usually not used because it does not work efficiently with the data structures used to keep track of unit clauses.
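
      A naive sketch of the quoted loop (my own code; clauses are sets of signed integer literals):

      def eliminate_pure_literals(clauses):
          clauses = {frozenset(c) for c in clauses}
          while True:
              literals = {l for c in clauses for l in c}
              pure = {l for l in literals if -l not in literals}
              if not pure:
                  return clauses
              # Delete every clause containing a pure literal.
              clauses = {c for c in clauses if not (c & pure)}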

    1. There are few exercises in this unit and all of them are voluntary (only intended for inexperienced students).

    1. Acronyms are all right as long as they are defined explicitly first, for example:

      personal protective equipment (PPE)

    2. Oxford comma

      The Oxford comma is especially useful when a list item contains "and", for example:

      One, two and three, and four.

  3. Sep 2020
    1. If not, you must pick up the discard pile.

      Do I get to pick up the card I just flipped over as well?

    1. DO NOT use etc., i.e., or e.g.

      Michael: It is ok to use these short forms in brackets.

    2. hedging

      What exactly is hedging? What does it look like?

    3. Be non-personal

      How should I deal with sentences such as:

      we restrict our attention to learning precedences for predicate symbols only

      In this work, we design a system [...]

      We call the vector w the coefficients [...]

      Source: https://github.com/filipbartek/learning-precedences-from-elementary-symbol-features/releases/download/paar2020%2Fceur-2/Learning_Precedences_from_Simple_Symbol_Features.pdf

    4. and / or

      "and / or" does not seem to be an adequate substitute for "etc." to me.

    1. Punctuation in Academic Writing

      Punctuation is governed by style guides. When writing for a journal, ask them what style guide they follow.

    1. Michael prefers the participants to have their video enabled.

      Homework assignments:

      1. Introduction (paragraph or section)
      2. Body (paragraph or section)
    1. monotone

      monotonic

    2. Two formulae \(\phi\) and \(\psi\) are equivalent, we write \(\phi \equiv \psi\) or \(\phi \mathrel{|\!=\!|} \psi\)

      Cf. "equisatisfiable"

    3. \((\phi_1 \land \cdots \land \phi_n) \rightarrow (\psi_1 \lor \cdots \lor \psi_m)\)

      Special example: Horn clauses

    4. \(\phi\) follows from (or is a consequence of) a formula \(\psi\)

      \(\psi\) entails \(\phi\)

    5. all valid formulae are derivable (provable)

      valid -> derivable (completeness)

    6. only valid formulae are derivable (provable)

      derivable -> valid (soundness)

    7. Examples of used formal systems

      Not to be discussed in this course:

      • QBF
      • modal logics
      • HOL
    8. British Museum algorithm

      Random search

    9. foundations of mathematics in the late 19th century and early 20th century

      Cf. Hilbert's program

    10. derive (prove) formulae

      provability

    11. semantic consequence.

      entailment

    12. \(\{x : x \notin x\}\)

      How to define a set? For example by set comprehension, as exemplified here.

    1. The course builds on some previous courses. Namely, we discuss computational aspects of inference.

      Filip Zelezny: The lectures are to be recorded by the students.

      Karel Chvalovsky: Feel free to unmute yourself and ask questions, or write them in the chat.

    2. First-order resolution

      We rely on theoretical knowledge (e.g., of resolution) from previous courses and focus on computational tricks.

    3. Proof assistants

      Is this lecture focused on higher-order logic?

    4. 9: 23.11. Chvalovský, Answer set programming; 10: 30.11. Chvalovský, First-order logic; 11: 7.12. Chvalovský, First-order resolution; 12: 14.12. Chvalovský, Equality and model finding; 13: 4.1. Chvalovský, Proof assistants

      These approaches are more modern than Prolog.

    5. Prolog

      Logic programming is not as important nowadays as it used to be. It is currently not fashionable.

      Filip: Why not use OCaml instead?

      Prolog is an example of a declarative programming language.

    6. SAT solving

      Improving SAT solvers makes many NP-complete problems easier to solve in practice.

  4. Aug 2020
    1. the place value of the "hundreds" digit in base seven is 49

      I don't understand.

    1. Node List

      How to print the info:

      sinfo -o "%9n %9P %.4c %.6m %.8d %14f %G"
      

      How to format with xsv:

      sinfo -o "%n|%P|%c|%m|%d|%f|%G" | xsv table --delimiter "|"
      
  5. Jun 2020
    1. Unfortunately

      I consider it rather fortunate since it allows me to clean up, as demonstrated below.

    1. "There was a 0.95 correlation. I asked them about it. They said, 'We read this article in Chain Store Age magazine that said beer and diapers are correlated, so we put beer next to diapers in all of our stores," he said. "What they did was they created the data that actually validates the data. In effect, they created the signal they used to validate the signal."

      Placement of the goods is a confounder.

  6. May 2020
  7. Apr 2020
    1. Employees with an email address in the format: jmeno.prijmeni@cvut.cz

      Configuration that works in Thunderbird in English in Ubuntu:

      • Incoming:
        • Authentication: Normal password
        • Username: bartefil
      • Outgoing:
        • Authentication: Normal password
        • Username: bartefil
  8. Feb 2020
    1. fisk.pdf(x, c, loc, scale)

      # Runnable version of the quoted formula (matches scipy.stats.fisk)
      import numpy as np

      def fisk_pdf(x, c, loc=0.0, scale=1.0):
          y = (x - loc) / scale
          return c * np.power(y, -c - 1) * np.power(1 + np.power(y, -c), -2) / scale

  9. Jan 2020
    1. Which core is the best to emulate (some console/game)?

      Emulation General Wiki contains comparisons of emulators of various platforms.

  10. Dec 2019
  11. Nov 2019
    1. more than 75% was paid out as student stipends

      Must more than 75% of the personnel costs be paid out as student stipends, or 75% of the personnel costs of academics, or 75% of the total project costs?

    1. strict concavity

      \(\lambda_i > 0\), \(\sum \lambda_i = 1\), \(x_i\) not all equal: \(\log \sum \lambda_i x_i > \sum \lambda_i \log x_i\)

    2. \(P_{\theta_0}\left(\sup_{\theta \in \Theta} |L(\theta, T^m) - L(\theta)| > \epsilon\right) \xrightarrow{m \to \infty} 0\)

      From some \(m\) on, all the following approximations lie within an \(\epsilon\)-wide sleeve around the true \(L\).

    3. zj

      \(z^j = (x^j, y^j)\)

    4. consistency

      The more training data we have, the closer we get to the true optimum.

    5. \(p(y|x) = \frac{\exp[y(\langle w, x \rangle + b)]}{1 + \exp[\langle w, x \rangle + b]}\)

      \(p(y|x) = \frac{p(x, y)}{p(x)} = \frac{p(x|y) p(y)}{p(x)} = ... \)

    6. Generative learning

      We assume a certain model architecture. We need to estimate the parameters of the model.

    7. \(x \in \mathbb{R}^n\), \(y \in \{0, 1\}\) with \(y \sim \mathrm{Bern}(\alpha)\) and \(x|y \sim \mathcal{N}(\mu_y, V)\)

      Unknown parameters:

      • Prior: \(\alpha\)
      • Class-conditional likelihood: \(\mu_0\), \(\mu_1\), \(V\)
    8. Linear Classifier

      Example of discriminative learning version 2

    9. Logistic Regression

      Example of discriminative learning version 1

    10. Gaussian Discriminative Analysis

      Example of generative learning

    11. Discriminative learning (1)

      Example: Softmax layer for classification - estimates class probabilities

    12. max

      Correction: min

    13. conditional

      posterior

    14. maximum likelihood estimator (MLE)

      Why "likelihood" rather than "probability"? If the domain is real valued, p is a PDF and summing over p values does not correspond to probability.

    1. neural network

      What does the network output?

      1. Value estimate (required for this assignment)
      2. Policy (to be introduced in a later lecture)
    1. Forward Message

      JD: The corresponding backward message may be one of the exam assignments.

    2. Universal approximation theorem: one layer is enough

      Wikipedia

      the universal approximation theorem states that a feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions on compact subsets of \(\mathbb{R}^n\), under mild assumptions on the activation function.

    1. The minimum monthly stipend in the full-time form of study, while study obligations are being fulfilled, is CZK 15,000.

      Actual figures:

      • Minimum stipend per month: CZK 9,000. Source: Příkaz rektora č. 11/2019 (Rector's Directive No. 11/2019)
      • Minimum stipend per calendar year: 12 × CZK 12,000. Source: Stipendijní řád (Stipend Regulations), Article 6:1-3
  12. Oct 2019
    1. {trn,val,tst}_kernel_mat.svmlight

      Python: Use the svmlight library. Consider using sklearn.svm.

    2. STATISTICAL MACHINE LEARNING (SML2019), 1. COMPUTER LAB

      The solution consists of:

      • Source code (Python or Matlab)
      • PDF with answers to questions (and nothing else)
    1. \(\frac{1}{n_{in}}\)

      Alternative from Xavier: \(\frac{2}{n_{in} + n_{out}}\)

    2. let \(w_i\) and \(x_i\) be independent random variables

      Is it safe to assume that x_i are independent across i?

    3. =

      Independent across i

    4. E(xi)

      We may need to normalize the input.

    5. \(\alpha_k > 0\) is the learning rate or step size

      The learning rate may be further specialized, e.g., per layer.

    6. \(\theta_k - \alpha_k \nabla L(\theta_k)\)

      We move in the opposite direction of the gradient.
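
      A toy sketch of the update rule (an assumed example minimizing \(\|\theta\|^2\), not from the slides):

      import numpy as np

      def gradient_descent(grad, theta, alpha=0.1, steps=100):
          for _ in range(steps):
              theta = theta - alpha * grad(theta)  # step against the gradient
          return theta

      # The gradient of ||theta||^2 is 2 * theta; iterates approach the origin.
      print(gradient_descent(lambda t: 2 * t, np.array([3.0, -2.0])))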

    7. \(y_k = \sigma_k(x^T W)\)

      Class confidence distribution

    1. ULLN applies

      Proof: Seminar 3, Assignment 4

      1. \(R(h_m) - R_{T^m}(h_m) \leq \sup_{h \in H} |R(h) - R_{T^m}(h)|\) - trivial
      2. \(R_{T^m}(h_H) - R(h_H) \leq \sup_{h \in H} |R(h) - R_{T^m}(h)|\) - easy to see
    2. \(R_{T^m}(h_m) \leq R_{T^m}(h_H)\) because \(h_m \in \argmin_{h \in H} R_{T^m}(h)\).

    3. Neural Networks learned by back-propagation

      Neural networks need not converge to the global minimum. \(R(h_m)\) need not equal \(R(h_H)\).

    4. Vapnik-Chervonenkis dimension

      \(D_{VC}(H) = \max \{ n : \exists x \in X^n\ \forall y \in \{0, 1\}^n\ \exists h \in H : h(x) = y \}\)

    5. \(\ell_{0/1}(y, y') = [\![ y \neq y' ]\!]\)

      We assume 0-1 misclassification loss.

    6. \(\mathbb{R}^n\)

      Space of feature vectors

    7. ULLN for finite hypothesis space

      and a bounded loss (image contained in \([a, b]\))

    8. sup

      "Uniform" because we care about all strategies in H.

    9. \(\lim_{m \to \infty} P(R(h_m) - R(h_H) \geq \varepsilon) = 0\)

      Equivalently:

      \(\forall \delta > 0, \epsilon > 0.\ \exists m_0 \in \mathbb{N}.\ \forall m \geq m_0.\ P(R(h_m) - R(h_H) \geq \epsilon) < \delta\)

      In general:

      \(\lim_{m \to \infty} f(m) = 0 \Leftrightarrow \forall \delta > 0.\ \exists m_0 \in \mathbb{N}.\ \forall m \geq m_0.\ f(m) < \delta\)

    10. estimation error

      How to reduce estimation error?

      • Increase number of samples m.
      • Improve the learning algorithm to utilize T^m more efficiently.
      • Shrink H: excluding strategies better than \(h_m\) raises \(R(h_H)\) and thus shrinks the gap \(R(h_m) - R(h_H)\).

      Enlarging H typically leads to worse generalization, so estimation error increases.

    11. R(hm)

      R(h_m) is a random variable, because h_m is a random variable, because T^m is a random variable.

    12. \(Y^X\)

      All functions \(h: X \rightarrow Y\)

    13. approximation error

      How can we reduce approximation error?

      Enlarge H so that it includes better rules.

    1. Empirical risk minimization

      An empirical risk minimization (ERM) algorithm returns \(h_m \in H\) that minimizes \(R_{T^m}(h)\).

      • Excess error: \(R(h_m) - R^*\)
        • Estimation error: \(R(h_m) - R(h_H)\)
        • Approximation error: \(R(h_H) - R^*\)

      An algorithm is statistically consistent in \(H\) iff \(\forall \epsilon > 0: \lim_{m \rightarrow \infty} P(R(h_m) - R(h_H) \geq \epsilon) = 0\).

      The hypothesis space \(H\) satisfies the uniform law of large numbers (ULLN) iff \(\forall \epsilon > 0: \lim_{m \rightarrow \infty} P(\sup_{h \in H} |R(h) - R_{T^m}(h)| \geq \epsilon) = 0\).

      Theorem 1: If \(H\) satisfies ULLN then ERM is statistically consistent in \(H\).

      Corollary 1: The ULLN is satisfied for a finite \(H\).

      Generalization bound for finite hypothesis space \(H\) with \(Im(l) \subseteq [a, b]\): \(P(\max_{h \in H} |R_{T^m}(h) - R(h)| < \epsilon) \geq 1 - 2 |H| \exp(-\frac{2 m \epsilon^2}{(b - a)^2})\)
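
      Solving the generalization bound for the sample size gives a quick calculator (a sketch; the function name is mine):

      import math

      def min_samples(h_size, eps, delta, a=0.0, b=1.0):
          # Smallest m with 2 |H| exp(-2 m eps^2 / (b - a)^2) <= delta.
          return math.ceil((b - a) ** 2 * math.log(2 * h_size / delta) / (2 * eps ** 2))

      print(min_samples(h_size=1000, eps=0.05, delta=0.01))  # 2442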

    2. Pdf

      Boris:

      We usually publish the slides 1 hour before the lecture.

    1. lunar_lander

      The environment uses reward shaping. Each state has a hidden value, and the reward of an executed action is the difference between the values of the previous and the current state. (?)

    2. The tasks are evaluated automatically using the ReCodEx Code Examiner.

      We can submit one or more Python source files (modules) with the .py file extension.

    3. Time limit for each test is 5 minutes.

      Most likely this is too short a period to perform training in ReCodEx.

    4. average reward of -150

      Episode terminates after 1000 steps.

    5. average return of 490

      There are 500 steps in each episode.

      Is gamma equal to 1.0 in evaluation?

    6. Assignments: Best solution counts. There is no penalty for incorrect solution submission. The students can ask for increasing the submission count limit (50 by default).

    7. monte_carlo

      Train either in ReCodEx or offline.

      If I can't get it running in an hour, I can write to MS and ask for advice.

    8. states

      Generalized states: R^4:

      1. Cart position
      2. Cart velocity
      3. Pole angle
      4. Pole angular velocity

      For this task the states are discretized by binning (8 bins in each dimension; 8^4 states in total).
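
      A sketch of such a discretization (the value ranges are my assumptions, not the assignment's exact numbers):

      import numpy as np

      RANGES = [(-4.8, 4.8), (-4.0, 4.0), (-0.42, 0.42), (-4.0, 4.0)]  # assumed
      EDGES = [np.linspace(lo, hi, 9)[1:-1] for lo, hi in RANGES]  # 7 edges -> 8 bins

      def discretize(state):
          # Map a continuous 4-dimensional state to one of 8^4 discrete states.
          idx = [int(np.digitize(s, e)) for s, e in zip(state, EDGES)]
          return int(np.ravel_multi_index(idx, (8, 8, 8, 8)))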

    9. --steps steps of policy evaluation/policy improvement

      We improve the policy steps times.

    10. --iterations applications of the Bellman equation

      After iterations evaluation sweeps (applications of the Bellman equation), we improve the policy once.
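
      My reading of the two options as a loop, illustrated on a tiny made-up MDP (all numbers are assumptions):

      import numpy as np

      P = np.array([[[0.9, 0.1], [0.2, 0.8]],
                    [[0.5, 0.5], [0.1, 0.9]]])   # P[s, a, s'], made up
      R = np.array([[0.0, 1.0], [2.0, 0.0]])     # R[s, a], made up
      gamma, steps, iterations = 0.9, 10, 5

      value, policy = np.zeros(2), np.zeros(2, dtype=int)
      for _ in range(steps):                     # --steps policy improvements
          for _ in range(iterations):            # --iterations Bellman sweeps
              value = np.array([R[s, policy[s]] + gamma * P[s, policy[s]] @ value
                                for s in range(2)])
          policy = (R + gamma * P @ value).argmax(axis=1)  # greedy improvement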

    11. Average value per assignment: 11 points

    12. I can ask for a justified deadline extension before the deadline passes.

    1. Bellman backup operator

      Given a valuation and a state, returns the expected return provided we follow the action that maximizes the next reward plus the valuated value of the next state.
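
      In symbols (my reading, with \(p\) the transition probability, \(r\) the reward, and \(\gamma\) the discount factor):

      \((BV)(s) = \max_a \sum_{s'} p(s' \mid s, a) \left[ r(s, a, s') + \gamma V(s') \right]\)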

    2. return

      Return at time \(t\)

      The smaller the discount factor, the smaller is the relative weight of the far future rewards.
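
      The standard definition, for discount factor \(\gamma\):

      \(G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}\)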

    3. Monte Carlo and \(\varepsilon\)-soft Policies

      It may make sense to decrease epsilon gradually while training to explore less and exploit more (refine the policy for the successful runs).

    4. Monte Carlo with Exploring Starts

      Can we make the policy stochastic and choose each action with a probability that scales with its expected return?

    5. greedy action

      Under what conditions do we get stuck in a local optimum?

  13. www.ciirc.cvut.cz
    1. Český ústav robotiky a kybernetiky

    1. regularization constant C

      A finite C forces w to lie in a hypersphere of finite radius R that depends on C.

    2. \((w^*, b^*) = \operatorname{argmin}_{w \in \mathbb{R}^n, b \in \mathbb{R}} \Big( \underbrace{\tfrac{1}{2} \|w\|^2}_{\text{penalty term}} + C \sum_{i=1}^m \max\{0,\ 1 - y_i(\langle w, \varphi(x_i) \rangle + b)\} \Big)\)
    1. \(Q_{n+1} = Q_n + \alpha (R_n - Q_n)\)

      Exponential moving average

      \((1 - \alpha) Q_n + \alpha R_n\)

    2. Incremental Implementation

      The new estimate for this machine is the old estimate plus the difference between the current reward and the old estimate, divided by the number of times we have tried this machine.

      Cf. gradient descent
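
      Both updates in code (a sketch; the names are mine):

      def update_sample_average(q, n, reward):
          # Q_{n+1} = Q_n + (R_n - Q_n) / n, where n counts pulls of this machine.
          return q + (reward - q) / n

      def update_exponential(q, reward, alpha):
          # Q_{n+1} = Q_n + alpha (R_n - Q_n) = (1 - alpha) Q_n + alpha R_n
          return q + alpha * (reward - q)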

    3. uniformly random action with probability \(\varepsilon\)

      \(\varepsilon\): rate of exploration

    4. discrete

      Non-differentiable

    5. won 10 out of 11 StarCraft II games against two professional players

      Unfair interface advantage

    6. 9h

      Very high computational demands

    7. chess

      It can beat the previously best program after 3 h of self-play.

    8. beat 60 professional

      Nickname: master

      It played on an online game server and did not reveal it was a bot.

    9. AlphaGo

      Why is go interesting?

      • High branching factor
      • Difficult to write a good state evaluation heuristic
    10. to rule them all

      Plays all the games

    11. human-normalized mean: 623.0%

      Questionable information value because some games can be exploited a lot

    12. 29 games out of 49 comparable or better to professional game players

      A separate agent trained for each game

    13. first

      1959

    14. Trial and error learning

      Learning in animals

    1. VGGNet

      conv3-64: receptive field 3x3, stride 1, 64 output channels

      To resolve boundary: pad with zeros

      MP: max pooling layer. Window 2x2, stride 2, aggregate with max.

      Deep layers: more channels, smaller resolution, large transitive receptive field, more abstract features
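
      One such stage sketched in PyTorch (an illustration of the quoted notation, not the full VGGNet):

      import torch.nn as nn

      stage = nn.Sequential(
          # conv3-64: 3x3 receptive field, stride 1, zero padding, 64 output channels
          nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1),
          nn.ReLU(),
          # MP: 2x2 max pooling with stride 2 halves the resolution
          nn.MaxPool2d(kernel_size=2, stride=2),
      )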

    2. negative log-likelihood of class probabilities (a.k.a. cross entropy)

      Surrogate loss function (or objective function) for 0-1 loss

      Why? So that it is differentiable.

    3. Loss function

      If we used 0-1 loss, the empirical risk would be the error rate.

    4. class probabilities

      To choose the class, we take the class with the highest predicted probability.

    1. Empirical Risk Minimization

      Optimization problem

    2. Testing: confidence intervals

      Filip: Is the estimation based on Hoeffding inequality commonly used in practice (research)?

      A: Yes.

    3. Hoeffding inequality

      Strength: Valid irrespective of the underlying distribution

      What distribution maximizes the error?

    4. \(1 - 2 e^{-\frac{2 l \varepsilon^2}{(b - a)^2}} = \gamma\)

      This equality binds the variables \(\varepsilon\), \(\gamma\), \(l\), and \(b - a\).

    5. \([a, b]\)

      The variables are bounded by the interval \([a, b]\).

    6. Example (rolling a die): \(\mu = 3.5\), \(z_i \in [1, 6]\), \(\varepsilon = 0.5\).
      • Left graph
        • Red line: True expectation (3.5)
        • Blue band: \(\varepsilon\)-band around the assumed expectation
      • Right graph:
        • Red curve: Portion of experiments that estimate a mean outside the blue band in the left graph
        • Blue curve: Upper bound on that portion, established by the Hoeffding inequality
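
      A quick simulation of this example (my own numpy sketch):

      import numpy as np

      rng = np.random.default_rng(0)
      mu, eps, a, b = 3.5, 0.5, 1, 6
      for l in (10, 50, 100):
          rolls = rng.integers(1, 7, size=(10_000, l))  # 10 000 repeated experiments
          outside = np.mean(np.abs(rolls.mean(axis=1) - mu) >= eps)  # red curve
          bound = 2 * np.exp(-2 * l * eps**2 / (b - a)**2)           # blue curve
          print(l, outside, bound)
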
    7. how the interval width \(\varepsilon\) depends on \(l\) and \(\delta\)

      3 interdependent variables:

      • \(\epsilon\)
      • \(l\)
      • \(\delta\)

      3 possible tasks (see the sketch below):

      1. Given \(l\) and \(\delta\), minimize \(\epsilon\).
      2. Given \(\epsilon\) and \(l\), minimize \(\delta\).
      3. Given \(\epsilon\) and \(\delta\), minimize \(l\).
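
      All three follow from solving \(\delta = 2 e^{-2 l \varepsilon^2 / (b - a)^2}\) for the remaining variable (my own sketch):

      import math

      def eps_from(l, delta, a=0, b=1):    # task 1
          return (b - a) * math.sqrt(math.log(2 / delta) / (2 * l))

      def delta_from(l, eps, a=0, b=1):    # task 2
          return 2 * math.exp(-2 * l * eps**2 / (b - a)**2)

      def l_from(eps, delta, a=0, b=1):    # task 3
          return math.ceil((b - a)**2 * math.log(2 / delta) / (2 * eps**2))
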
    8. \((R_{S^l}(h) - \varepsilon,\ R_{S^l}(h) + \varepsilon)\)

      Confidence interval

  14. Sep 2019
    1. NP-complete

      BF:

      NP complete or NP hard

    2. Seminars

      Theoretical assignments every other week published one week in advance.

      No evaluation, just discussion.

    3. Written exam

      90 minutes. Assignments similar to those discussed in the labs.

    4. The course will run irrespective of the number of students. The lecture format will only be held if at least approx. 5 students come consistently.

    1. maximum likelihood estimator

      Boris:

      You should know this already.

    2. unsupervised learning

      Note that the validation set must still be labeled.

    3. Bayes inference rule

      Note that this only works for classification.

    4. inf

      Is this infimum?

    5. how strong can they deviate from each other?

      Example: For 0-1 loss, using the Chebyshev inequality with the variance bounded by 1 (since the loss is bounded by 1), we get the bound \(\frac{1}{\epsilon^2 m}\).

    6. drawn fromp(x,y)

      The evaluation distribution must match the application distribution. In other words, we evaluate the inference with respect to p.

    7. i.i.d.

      independent identically distributed

    8. R(h)

      In case of 0-1 loss in classification, this is the error rate of h.

    9. E(x,y)∼p

      Expectation

    10. minimises the loss

      The distribution of (x, y) will be specified later.

    11. related by an unknownjoint p.d.f.p(x,y)

      We assume that x and y are interdependent.

    12. loss function \(\ell: Y \times Y \to \mathbb{R}^+\)

      Simplest loss for classification: the 0-1 loss: \(\ell(a, b) = 1\) if \(a \neq b\), else 0.

    13. real valued variable

      possibly a vector

    1. Labs/Seminars: Thursday 9:15-10:45, 11:00-12:30 and 12:45-14:15, all in KN:E-112

      Capacity: 30 people

    2. Boris:

      Don't hesitate to contact us.

      Don't hesitate to ask questions during the lectures.

      Immediately after each lecture we will go to the cafeteria where you can contact us privately.

    1. practical labs

      Each task: Around 10 points plus up to 3 bonus points

    2. before the class

      Boris:

      a week ahead

    3. Seminar

      Introductory lab session. Neither theoretical, nor practical.

  15. Aug 2019
    1. public key

      To have a personal certificate generated by CIIRC, follow the instructions in the wiki.

    2. Everyone with CIIRC account can access master node of the cluster via ssh at cluster.ciirc.cvut.cz

      Credentials: CTU username and CIIRC password

  16. Jul 2019
    1. All other usage is reserved.

      Does this mean that modification of Vampire for research purposes is prohibited?

    1. max

      According to the paper (Cohen 1998), maximizing the permutation score is an NP-hard problem. The paper proposes a 2-approximation greedy algorithm.
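
      As I understand the greedy algorithm from the paper (a sketch with assumed names; pref(u, v) is the learned preference for ordering u before v):

      def greedy_order(items, pref):
          remaining, order = set(items), []
          while remaining:
              # Pick the item with the largest outgoing-minus-incoming preference.
              best = max(remaining,
                         key=lambda v: sum(pref(v, u) - pref(u, v)
                                           for u in remaining if u != v))
              order.append(best)
              remaining.remove(best)
          return order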

  17. Jun 2019
  18. May 2019
    1. Too much emotion, too little information and argument.

    2. Pick something to chase after until your own body betrays you.

      I think this attitude is itself part of the ideology of the author of this text.

    3. People who understand that the way our society works is sick! People who see solutions, not problems! People who ask interesting questions and have interesting answers! People who are not afraid to voice their opinion in a society that condemns new opinions yet depends on them!!!

      An appeal to emotions.

    4. we can write whatever we want

      This reminds me of the manipulative slogan of Parlamentní listy:

      Nobody dictates to us what we may write about.

      Critical analysis: Jeden svět na školách

    5. Nobody owns us.

      So far it seems to me that qTube is owned by the author signed below this text.

    6. The epidemic of depression experienced by the "developed countries" is caused by an unhealthy way of thinking and by inherently wrong concepts that we hold to be true.

      Sources are missing.

    7. If you condemn us for our controversy and eccentric demeanor, that is fine. It is how we filter for the right people.

      What if I condemn you for your manipulative rhetoric?

    1. HP Color LaserJet 6040 MFP

      Color LaserJet cm6040 MFP

      See the screenshot in item 6 below.

    2. System -> Správa -> Tisk

      Ubuntu 18: Settings -> Devices -> Printers -> Additional Printer Settings...

      Source: Ask Ubuntu

    3. Server -> Nová -> Tiskárna

      Ubuntu 18: Add

  19. Apr 2019
    1. Of course, it’s okay for them to fly – Emma Thompson jetted first-class from LA to London to lecture us plebs about all our eco-destructive holidaymaking.

      This reminds me of the numerous shots of Al Gore at an airport or on an airplane in the 2006 environmentalist film An Inconvenient Truth.