6 Matching Annotations
  1. Jun 2018
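    1. Example 1.49. Recall from Example 1.36 that given a set $X$ we define $\mathcal{E}X$ to be the set of partitions on $X$, and that a partition may be defined using a surjective function $s\colon X \twoheadrightarrow P$ for some set $P$. Any surjective function $f\colon X \to Y$ induces a monotone map $f^*\colon \mathcal{E}Y \to \mathcal{E}X$, going “backwards”. It is defined by sending a partition $s\colon Y \twoheadrightarrow P$ to the composite $f.s\colon X \twoheadrightarrow P$.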
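
The "backwards" map in this example can be made concrete with finite sets. The following is a minimal sketch, assuming functions are represented as Python dictionaries; the helper names `blocks` and `pull_back` and the sample sets are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def blocks(labeling):
    """Turn a labeling function (element -> label) into its set of blocks."""
    grouped = defaultdict(set)
    for element, label in labeling.items():
        grouped[label].add(element)
    return {frozenset(block) for block in grouped.values()}


def pull_back(f, s):
    """Send a partition s: Y -> P to the composite X -> Y -> P (f first, then s)."""
    return {x: s[f[x]] for x in f}


# Hypothetical data: X = {1, 2, 3, 4}, Y = {"a", "b", "c"}, P = {"p", "q"}.
f = {1: "a", 2: "a", 3: "b", 4: "c"}  # surjective function X -> Y
s = {"a": "p", "b": "p", "c": "q"}    # surjective function Y -> P, a partition of Y

pulled = pull_back(f, s)

print(blocks(s))       # two blocks on Y: {a, b} and {c}
print(blocks(pulled))  # two blocks on X: {1, 2, 3} and {4}
```

Two elements of X land in the same block exactly when their images under f share a block in Y, which is why a partition of Y yields a partition of X rather than the other way around.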
  2. Jun 2017
    1. You measure the throughput that you can achieve on a single partition for production (call it p) and consumption (call it c). Let’s say your target throughput is t.

      t = target throughput (QPS); p = throughput of a single partition for production; c = throughput of a single partition for consumption
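
A rough sizing calculation can be built from these three quantities. The sketch below assumes the common rule of thumb that you need at least max(t/p, t/c) partitions; that rule is not stated in the excerpt above, and the function name `min_partitions` and the sample numbers are hypothetical.

```python
import math


def min_partitions(t, p, c):
    """Lower bound on partition count under the assumed rule max(t/p, t/c)."""
    if t <= 0 or p <= 0 or c <= 0:
        raise ValueError("t, p and c must all be positive")
    # Round up: a fractional partition is not possible.
    return max(math.ceil(t / p), math.ceil(t / c))


# Hypothetical measurements, all in the same unit (for example QPS).
needed = min_partitions(t=100, p=10, c=20)
print(needed)  # 10
```

Production is the bottleneck in this example: 100 / 10 gives 10 partitions, while consumption alone would need only 5.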

    1. no data loss will occur as long as producers and consumers handle this possibility and retry appropriately.

      Retries should be built into the consumer and producer code. If the leader for the partition fails, you will see a LeaderNotAvailableException.
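
One way to build such retries into client code is a wrapper with exponential backoff. This is a minimal sketch that does not use any real client library: `LeaderNotAvailableError`, `send_with_retry`, and `flaky_send` are hypothetical stand-ins, and the retry limits are arbitrary assumptions.

```python
import time


class LeaderNotAvailableError(Exception):
    """Hypothetical stand-in for a 'leader not available' error from a broker."""


def send_with_retry(send, record, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call send(record), retrying with exponential backoff on leader failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(record)
        except LeaderNotAvailableError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            sleep(base_delay * 2 ** (attempt - 1))  # 0.1s, 0.2s, 0.4s, ...


# Fake sender that fails twice (as if a leader election were in progress).
calls = {"count": 0}


def flaky_send(record):
    calls["count"] += 1
    if calls["count"] <= 2:
        raise LeaderNotAvailableError("leader election in progress")
    return f"acked:{record}"


result = send_with_retry(flaky_send, "event-1", sleep=lambda seconds: None)
print(result)  # acked:event-1
```

Real client libraries usually expose their own retry settings, so a hand-written loop like this is mainly useful for illustrating the pattern or for wrapping calls that the library does not retry for you.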

  3. May 2017
    1. The first limitation is that each partition is physically represented as a directory of one or more segment files. So you will have at least one directory and several files per partition. Depending on your operating system and filesystem this will eventually become painful. However this is a per-node limit and is easily avoided by just adding more total nodes in the cluster.

      The total number of topics supported depends on the total number of partitions per topic.

      partition = directory of one or more segment files. This is a per-node limit.

    1. the number of partitions -- there's no real "formula" other than this: you can have no more parallelism than you have partitions.

      This is an important thing to keep in mind. If we need massive parallelism we need to have more partitions.
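
The cap on parallelism can be illustrated with a toy assignment of partitions to consumers. This sketch assumes that each partition is read by at most one consumer within a group and uses a simplified round-robin assignment; the function names and consumer ids are hypothetical.

```python
def assign_round_robin(num_partitions, consumers):
    """Assign each partition to exactly one consumer, cycling through the list."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def effective_parallelism(num_partitions, num_consumers):
    """Consumers beyond the partition count have nothing to read."""
    return min(num_partitions, num_consumers)


# Hypothetical group: five consumers reading a topic with three partitions.
consumers = ["c1", "c2", "c3", "c4", "c5"]
assignment = assign_round_robin(3, consumers)
idle = [consumer for consumer, parts in assignment.items() if not parts]

print(assignment)                    # c1, c2, c3 get one partition each
print(idle)                          # ['c4', 'c5']
print(effective_parallelism(3, 5))   # 3
```

Adding a sixth consumer would change nothing here; only adding partitions would put the idle consumers to work.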