27 Matching Annotations
  1. Aug 2023
    1. Partition Leader Balancing

      balances the leader load across brokers. preferred replica = the first replica of each partition. preferred replicas are evenly distributed among brokers, and Kafka tries to make the preferred replica the leader, which balances the load

    2. Replication

      followers that lag behind the leader for longer than replica.lag.time.max.ms are removed from the ISR
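
      a minimal sketch of that rule, not Kafka's actual implementation. the `follower` type, its fields and `shrinkISR` are hypothetical names made up for illustration; `maxLag` stands in for the replica lag time setting.

      ```go
      package main

      import (
      	"fmt"
      	"time"
      )

      // follower is a hypothetical, simplified view of a follower replica.
      type follower struct {
      	brokerID     int
      	lastCaughtUp time.Time // last time this follower was fully caught up with the leader
      }

      // shrinkISR returns the followers that stay in the ISR: those that were
      // caught up with the leader within maxLag of now.
      func shrinkISR(isr []follower, now time.Time, maxLag time.Duration) []follower {
      	kept := make([]follower, 0, len(isr))
      	for _, f := range isr {
      		if now.Sub(f.lastCaughtUp) <= maxLag {
      			kept = append(kept, f)
      		}
      	}
      	return kept
      }

      func main() {
      	now := time.Now()
      	isr := []follower{
      		{brokerID: 1, lastCaughtUp: now.Add(-2 * time.Second)},
      		{brokerID: 2, lastCaughtUp: now.Add(-45 * time.Second)},
      	}
      	// With a 30s limit, broker 2 has lagged too long and is dropped.
      	for _, f := range shrinkISR(isr, now, 30*time.Second) {
      		fmt.Println("still in ISR: broker", f.brokerID)
      	}
      }
      ```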

    3. Kafka

      follower failure is handled by removing the lagging or failed followers from the ISR; the leader then considers only the remaining replicas when it moves the high watermark

    4. Protocol

      a record is committed once all in-sync replicas (the ISR) have it - then the high watermark moves

    5. Data Plane

      only data up to the high watermark is visible to consumers

    6. Replication

      leaders & followers: the leader appends records to its log with an epoch (indicates the generation of the log's lifetime). the follower fetches with an offset number to sync with the leader, and the leader responds with the new records and their epoch. on the next fetch request, the leader moves the high watermark based on the offset in the request (the follower already has everything up to n-1), and in that response it passes the high watermark to the follower
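
      the fetch/high-watermark exchange above can be sketched roughly as follows. this is a simplified model, not the broker's code: the `leader` type and `onFetch` are hypothetical, epochs are left out, and the high watermark is taken here as the first offset not yet replicated to every in-sync follower.

      ```go
      package main

      import "fmt"

      // leader is a hypothetical, simplified partition leader.
      type leader struct {
      	logEndOffset  int64         // offset the next appended record will get
      	followerFetch map[int]int64 // broker ID -> offset from that in-sync follower's latest fetch
      	highWatermark int64         // records below this offset are committed
      }

      // onFetch records the offset a follower asked for, advances the high
      // watermark to the smallest offset all in-sync replicas have reached,
      // and returns it so it can be passed back in the fetch response.
      func (l *leader) onFetch(brokerID int, fetchOffset int64) int64 {
      	l.followerFetch[brokerID] = fetchOffset
      	hw := l.logEndOffset
      	for _, off := range l.followerFetch {
      		if off < hw {
      			hw = off
      		}
      	}
      	if hw > l.highWatermark {
      		l.highWatermark = hw // the high watermark only moves forward in this sketch
      	}
      	return l.highWatermark
      }

      func main() {
      	// The leader holds records 0..4; both followers start with nothing.
      	l := &leader{logEndOffset: 5, followerFetch: map[int]int64{2: 0, 3: 0}}
      	fmt.Println(l.onFetch(2, 5)) // broker 2 has everything, broker 3 has nothing yet
      	fmt.Println(l.onFetch(3, 3)) // broker 3 has records 0..2
      	fmt.Println(l.onFetch(3, 5)) // broker 3 has caught up
      }
      ```

      a fetch at offset n tells the leader that the follower already holds everything up to n-1, which is why the watermark can only advance on the following request.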

    1. course

      how data is received: client -> socket -> n/w thread (lightweight) -> req queue -> i/o thread (crc check & append to commit log [.index and .log files]) -> resp queue -> n/w thread -> socket

      purgatory map -> holds the produce request until the data is replicated

      how data is fetched: client -> socket -> n/w thread -> req queue -> i/o thread (fetch ranges are calculated with .index) -> resp queue -> n/w thread -> socket

      purgatory map -> holds the fetch request until enough data has arrived, based on consumer config

      also follows zero copy - meaning data goes from the file directly to the n/w thread - mostly served from the page cache

      if not cached -> may need to use tiered storage
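
      a rough sketch of how a fetch position could be found with a sparse .index, assuming the index maps some record offsets to byte positions in the .log file. the `indexEntry` type and `lookup` function are hypothetical and the numbers are made up.

      ```go
      package main

      import (
      	"fmt"
      	"sort"
      )

      // indexEntry is a hypothetical, simplified entry of a sparse offset index:
      // it maps a record offset to a byte position in the segment's .log file.
      type indexEntry struct {
      	offset   int64
      	position int64
      }

      // lookup returns the byte position from which to start scanning the .log
      // file for target: the position of the last index entry whose offset is
      // <= target, or 0 (start of the file) if there is no such entry.
      func lookup(index []indexEntry, target int64) int64 {
      	// sort.Search finds the first entry whose offset is greater than target.
      	i := sort.Search(len(index), func(i int) bool { return index[i].offset > target })
      	if i == 0 {
      		return 0
      	}
      	return index[i-1].position
      }

      func main() {
      	index := []indexEntry{{0, 0}, {50, 4096}, {100, 8192}}
      	fmt.Println(lookup(index, 75)) // starts scanning at the entry for offset 50
      }
      ```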

    2. Inside the Apache Kafka Broker

      consumer properties => same idea as the producer: a wait time (fetch.max.wait.ms) and a minimum fetch size (fetch.min.bytes).

    3. Kafka Broker

      key producer properties => linger time -> linger.ms; batch size -> batch.size

      these determine the throughput and latency of Kafka producers
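
      a simplified sketch of how the two settings interact, not the producer's real batching code. `batchConfig` and `shouldSend` are hypothetical names, and the values in `main` are only examples.

      ```go
      package main

      import (
      	"fmt"
      	"time"
      )

      // batchConfig is a hypothetical stand-in for the two producer settings.
      type batchConfig struct {
      	linger    time.Duration // stands in for linger.ms
      	batchSize int           // stands in for batch.size, in bytes
      }

      // shouldSend reports whether a pending batch should be sent: either it is
      // full, or it has waited for the linger time.
      func shouldSend(cfg batchConfig, batchBytes int, age time.Duration) bool {
      	return batchBytes >= cfg.batchSize || age >= cfg.linger
      }

      func main() {
      	cfg := batchConfig{linger: 5 * time.Millisecond, batchSize: 16384}
      	fmt.Println(shouldSend(cfg, 100, 1*time.Millisecond)) // small and young: keep waiting
      	fmt.Println(shouldSend(cfg, 16384, 0))                // full: send now
      	fmt.Println(shouldSend(cfg, 100, 5*time.Millisecond)) // linger expired: send
      }
      ```

      a longer linger or a larger batch means fewer, bigger requests (better throughput) at the cost of records waiting longer before they are sent (worse latency).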

    1. Part 1

      Breakdown

      1. Function or purpose
      2. Interconnections or relationships - can change system massively by changing the information flow within it
      3. Elements - the low-level way of changing a system, like firing/replacing people; this rarely makes a difference.
      4. All of this produces the behaviour of the system.
      5. Event - one snippet of the system's behaviour; most news focuses on events, which gives us no understanding of the long-term behaviour of systems or why they are doing what they are doing.
    2. Dana (Donella) Meadows Lecture: Sustainable Systems

      deep insight of systems thinking - a system's behaviour comes out of the system itself, its interrelationships and its goals, not out of the individual people or actors in it

    1. Amdahl's law

      splitting up a task stops being a useful strategy once part of the task can't be split any further
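
      a small worked example of the law, using the usual formula speedup = 1 / ((1 - p) + p/n), where p is the fraction of the task that can be split and n is the number of workers. the function name and the 0.9 fraction are just illustrative.

      ```go
      package main

      import "fmt"

      // speedup returns the Amdahl's-law speedup for a task whose splittable
      // fraction is p when that fraction is spread over n workers.
      func speedup(p, n float64) float64 {
      	return 1 / ((1 - p) + p/n)
      }

      func main() {
      	// With 90% of the task splittable, the speedup can never reach 10x,
      	// no matter how many workers are added.
      	for _, n := range []float64{2, 8, 64, 1024} {
      		fmt.Printf("n=%4.0f speedup=%.2f\n", n, speedup(0.9, n))
      	}
      }
      ```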

  2. Jul 2023
    1. Containerization solves these proble

      integration test with containerization looks like exactly what I need

    1. calculate the inuse_

      inuse is calculated as allocations minus frees (allocated - freed)
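
      a small sketch of that subtraction using the standard `runtime.MemProfileRecord` type, which also exposes `InUseBytes` and `InUseObjects` helpers. the `inUse` function and the numbers in the record are made up for illustration.

      ```go
      package main

      import (
      	"fmt"
      	"runtime"
      )

      // inUse computes in-use bytes and objects for one heap profile record
      // as allocations minus frees.
      func inUse(r runtime.MemProfileRecord) (inUseBytes, inUseObjects int64) {
      	return r.AllocBytes - r.FreeBytes, r.AllocObjects - r.FreeObjects
      }

      func main() {
      	// Hypothetical record: 8 objects (4096 bytes) allocated, 2 (1024 bytes) freed.
      	r := runtime.MemProfileRecord{AllocBytes: 4096, FreeBytes: 1024, AllocObjects: 8, FreeObjects: 2}
      	b, o := inUse(r)
      	// The manual result matches the helpers on the record itself.
      	fmt.Println(b, o, b == r.InUseBytes(), o == r.InUseObjects())
      }
      ```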

    2. the runtime scales the collected sample values

      the runtime scales the sampled values, so the reported allocations are approximately equal to the actual allocations irrespective of the MemProfileRate
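
      a sketch of what such scaling can look like. the formula below is my understanding of how heap samples are scaled (based on the chance that an allocation of the average size gets sampled at a given rate); treat it as an assumption rather than the exact runtime code. `scaleSample` is a hypothetical name.

      ```go
      package main

      import (
      	"fmt"
      	"math"
      )

      // scaleSample estimates the real number of objects and bytes behind a
      // sampled (count, size) pair, given the sampling rate in bytes.
      // Assumption: an allocation of average size s is sampled with
      // probability 1 - exp(-s/rate), so samples are scaled by its inverse.
      func scaleSample(count, size, rate int64) (int64, int64) {
      	if count == 0 || size == 0 {
      		return 0, 0
      	}
      	if rate <= 1 {
      		return count, size // every allocation is sampled, nothing to scale
      	}
      	avgSize := float64(size) / float64(count)
      	scale := 1 / (1 - math.Exp(-avgSize/float64(rate)))
      	return int64(float64(count) * scale), int64(float64(size) * scale)
      }

      func main() {
      	rate := int64(512 * 1024)
      	// One sampled 64-byte object stands for many unsampled ones.
      	fmt.Println(scaleSample(1, 64, rate))
      	// A very large object is almost always sampled, so it is barely scaled.
      	fmt.Println(scaleSample(1, 10*rate, rate))
      }
      ```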

    3. gained from optimizing this function

      we should look at the percentage rather than at individual functions, since optimizing the latter probably won't make any difference

    4. profiler labels to be missing from CPU profiles

      again, use at least Go 1.18 to avoid these errors

    5. lets you to control the sampling rate of the CPU profiler

      better to use Go 1.18 or later to avoid these limitations

    6. The example above is highly simplified and omits many details around return values, frame pointers, return addresses and function inlining. In fact, as of Go 1.17, the program above may not even need any space on the stack as the small amount of data can be managed using CPU registers by the compiler

      interesting: if the data is small, the compiler can manage it in CPU registers and the stack is not used

    1. recommend disabling optimizations when building the code being debugged

      debugging should be done with compiler optimizations disabled

    2. Go users can create their custom profiles via pprof.Profile and use the existing tools

      are there any custom implementations of these profiles available publicly?
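
      not an answer to that question, but a minimal sketch of what a custom profile can look like with the standard `runtime/pprof` API (`NewProfile`, `Add`, `Remove`, `Count`). the profile name, the `conn` type and the `open`/`close` helpers are hypothetical.

      ```go
      package main

      import (
      	"fmt"
      	"runtime/pprof"
      )

      // openConns is a hypothetical custom profile that tracks open connections.
      var openConns = pprof.NewProfile("example.openConns")

      type conn struct{ id int }

      // open creates a connection and records the stack that opened it.
      func open(id int) *conn {
      	c := &conn{id: id}
      	openConns.Add(c, 1) // skip=1: start the recorded stack at open's caller
      	return c
      }

      // close removes the connection from the profile.
      func (c *conn) close() { openConns.Remove(c) }

      func main() {
      	a, b := open(1), open(2)
      	a.close()
      	// Only b is still tracked by the profile.
      	fmt.Println("open connections:", openConns.Count(), "still open:", b.id)
      }
      ```

      a profile registered this way can be written out with its `WriteTo` method and inspected with the existing pprof tools.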

    1. Between Java, Scala, Kotlin and search tools like ElasticSearch and Solr, or for old school data houses using Hadoop, I don't think there are *any* tech companies in the world that do not use Java or something JVM based. The career penalty for not knowing Java is incredibly high

      i agree on this

    2. I want a tiny binary, for a monitoring/logging agent, or a systems tool like docker agent, or kubectl, fantastic, Go (or Rust) is the best language to go for. I want to make REST APIs, and some DB calls - no worse choices than Go to do that. Even Python and Node are far better

      would like to know more on why Go isn't suited for webservers