  1. Apr 2019
    1. Human "content monitors" do some of that censoring, and some of them despise conservatives. A former Facebook employee reported that the human censors sometimes ignored stories trending among Facebook users if the stories came from a conservative website.

      Source?

  2. Jan 2019
    1. Service Level Objectives
    2. Another kind of SLI important to SREs is availability, or the fraction of the time that a service is usable. It is often defined in terms of the fraction of well-formed requests that succeed, sometimes called yield. (Durability—the likelihood that data will be retained over a long period of time—is equally important for data storage systems.) Although 100% availability is impossible, near-100% availability is often readily achievable, and the industry commonly expresses high-availability values in terms of the number of "nines" in the availability percentage. For example, availabilities of 99% and 99.999% can be referred to as "2 nines" and "5 nines" availability, respectively, and the current published target for Google Compute Engine availability is “three and a half nines”—99.95% availability.

      I'm guilty of overusing "available." It has a narrower meaning here than the one I've been using. I'd still say that beyond some extreme SLI thresholds the service should be considered unavailable, but as defined here it's a more binary success/failure metric. (See the sketch below for turning these targets into concrete downtime budgets.)
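
      A minimal Python sketch of the "nines" arithmetic, assuming a ~91-day quarter (the window and targets below are illustrative, not from the text):

      ```python
      def allowed_downtime_seconds(availability: float, window_seconds: float) -> float:
          """Downtime permitted by an availability target over a window."""
          return (1.0 - availability) * window_seconds

      QUARTER = 91 * 24 * 3600  # roughly one quarter, in seconds

      # "2 nines", the GCE target quoted above, and "5 nines"
      for target in (0.99, 0.9995, 0.99999):
          hours = allowed_downtime_seconds(target, QUARTER) / 3600
          print(f"{target:.3%} availability -> {hours:.2f} h downtime/quarter")
      ```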

    3. In any given quarter, if a true failure has not dropped availability below the target, a controlled outage will be synthesized by intentionally taking down the system

      Fucking adorable
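
      The bookkeeping behind it is simple to sketch: if real failures haven't spent the error budget, compute how much controlled downtime would bring availability down to the target. A hypothetical sketch (the names and quarter length are mine, not from the text):

      ```python
      def planned_outage_seconds(measured: float, target: float,
                                 window_seconds: float) -> float:
          """Controlled downtime needed to bring measured availability down
          to the target over the window; 0 if real failures already spent
          the error budget."""
          return max(0.0, (measured - target) * window_seconds)

      QUARTER = 91 * 24 * 3600
      # e.g. real failures cost only 0.01% against a 99.95% target:
      print(planned_outage_seconds(0.9999, 0.9995, QUARTER) / 60, "min to take down")
      ```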

    4. Again, this is more subtle than it might at first appear, in that those two SLIs—QPS and latency—might be connected behind the scenes: higher QPS often leads to larger latencies, and it’s common for services to have a performance cliff beyond some load threshold.

      Should we state the loads at which we can guarantee availability, i.e., only below our rate limits? See the queueing sketch below.

      NB: we're UNAVAILABLE at our rate limits.
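
      The QPS-latency coupling is easy to see in a toy M/M/1 queueing model, where mean time in system is 1/(mu - lambda) and diverges as offered load approaches capacity; that divergence is the performance cliff. A sketch under that assumption (the model and numbers are mine):

      ```python
      def mm1_mean_latency_s(qps: float, capacity_qps: float) -> float:
          """Mean time in system for an M/M/1 queue: 1 / (mu - lambda).
          Diverges as offered load nears capacity -- the performance cliff."""
          if qps >= capacity_qps:
              return float("inf")  # saturated: latency grows without bound
          return 1.0 / (capacity_qps - qps)

      CAPACITY = 1000.0  # hypothetical capacity in QPS
      for load in (100, 500, 900, 990, 999):
          print(f"{load:>4} QPS -> {mm1_mean_latency_s(load, CAPACITY) * 1e3:.1f} ms")
      ```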

    5. client-side latency is often the more user-relevant metric, but it might only be possible to measure latency at the server.

      umami, but also any inter-service comms
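
      A minimal probe-style sketch of client-side measurement (the URL and approach are illustrative): timing at the caller captures DNS, connection setup, and network transit that a server-side timer never sees, and the same wrapper applies to inter-service calls.

      ```python
      import time
      import urllib.request

      def client_side_latency_s(url: str) -> float:
          """Wall-clock latency as the client sees it, including DNS,
          connect/TLS, and network transit a server-side timer misses."""
          start = time.monotonic()
          with urllib.request.urlopen(url) as resp:
              resp.read()
          return time.monotonic() - start

      # Illustrative; real SLIs would aggregate this across many clients or probes.
      print(f"{client_side_latency_s('https://example.com/') * 1e3:.1f} ms")
      ```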

    6. and system throughput, typically measured in requests per second.

      This makes sense for things like queue ingestion rates (MM/Search/Content/Media...)
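
      A toy sliding-window meter for such a throughput SLI (the names are mine; production systems would instead read a monotonic counter and compute a rate in the monitoring layer):

      ```python
      import time
      from collections import deque

      class ThroughputMeter:
          """Requests per second over a trailing window (illustrative only)."""

          def __init__(self, window_s: float = 60.0):
              self.window_s = window_s
              self.events: deque = deque()  # monotonic timestamps

          def record(self) -> None:
              """Call once per ingested request/item."""
              self.events.append(time.monotonic())

          def qps(self) -> float:
              now = time.monotonic()
              while self.events and now - self.events[0] > self.window_s:
                  self.events.popleft()
              return len(self.events) / self.window_s

      # meter = ThroughputMeter(); meter.record() per request; meter.qps() to read
      ```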