5 Matching Annotations
  1. Jul 2017
    1. When you start your Kafka broker, you can define a set of properties in the config/server.properties file. This file is just a key-value properties file. One of the properties is auto.create.topics.enable: if it is set to true (the default), Kafka creates a topic automatically when you send a message to a non-existent topic. You can find all config options here. Imho a simple rule for creating topics is the following: the number of replicas must not be greater than the number of nodes that you have. The number of partitions should be a multiple of the number of nodes in your cluster. For example: you have a 9-node cluster, so your topic should have 9 partitions and 9 replicas, or 18 partitions and 9 replicas, or 36 partitions and 9 replicas, and so on.

      Number of replicas = #replicas, number of nodes = #nodes, number of partitions = #partitions

      #replicas <= #nodes

      #partitions = k x #nodes
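
A rough sketch of the sizing rule as a check, to make the arithmetic concrete. It assumes that Kafka rejects a replication factor larger than the number of brokers and treats "partitions are a multiple of the node count" as a rule of thumb only. The function name `check_topic_plan` and its parameters are made up for this note and are not part of any Kafka API.

```python
def check_topic_plan(num_nodes: int, partitions: int, replicas: int) -> list:
    """Return a list of problems with a topic plan; an empty list means it fits the rule."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    problems = []
    if partitions < 1 or replicas < 1:
        problems.append("partitions and replicas must both be at least 1")
    if replicas > num_nodes:
        # Assumption: the broker refuses a replication factor above the broker count.
        problems.append(f"replicas ({replicas}) exceeds nodes ({num_nodes})")
    if partitions % num_nodes != 0:
        # Rule of thumb only: a multiple spreads partitions evenly over the nodes.
        problems.append(f"partitions ({partitions}) is not a multiple of nodes ({num_nodes})")
    return problems


if __name__ == "__main__":
    # The 9-node example: 9 and 18 partitions fit the rule, 20 does not.
    for partitions in (9, 18, 20):
        print(partitions, check_topic_plan(num_nodes=9, partitions=partitions, replicas=9))
```

With 9 nodes, plans of 9, 18, or 36 partitions and 9 replicas come back with no problems, while 20 partitions is flagged as uneven.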

  2. Jun 2017
    1. "isr" is the set of "in-sync" replicas.

      The ISR is pretty important: when a node goes down, its replicas drop out of the ISR and only rejoin later, once they have caught up with the leader again.
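
To picture what "in-sync" means, here is a deliberately simplified model, not the broker's actual logic. It assumes a replica counts as in sync if it last caught up with the leader within some time window (Kafka exposes a broker setting, `replica.lag.time.max.ms`, for a window of this kind). The function name and the dictionary layout are invented for this sketch.

```python
def in_sync_replicas(last_caught_up_ms: dict, now_ms: int, max_lag_ms: int) -> set:
    """Return the ids of replicas that caught up with the leader within max_lag_ms."""
    return {
        replica_id
        for replica_id, caught_up_ms in last_caught_up_ms.items()
        if now_ms - caught_up_ms <= max_lag_ms
    }


if __name__ == "__main__":
    # Replica 3 stopped fetching 8 seconds ago, so it falls out of the ISR.
    last_seen = {1: 10_000, 2: 9_500, 3: 2_000}
    print(in_sync_replicas(last_seen, now_ms=10_000, max_lag_ms=1_000))
```

Once replica 3 catches up again, its timestamp moves forward and it is counted as in sync on the next check.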

    1. no data loss will occur as long as producers and consumers handle this possibility and retry appropriately.

      Retries should be built into the consumer and producer code. If the leader for a partition fails, you will see a LeaderNotAvailableException.
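
A minimal sketch of what "build retries in" could look like on the producer side. The `LeaderNotAvailable` class here is a local, hypothetical stand-in for whatever error your client library raises, and `send` is any callable you supply; neither is a real Kafka client API. The backoff numbers are arbitrary.

```python
import time


class LeaderNotAvailable(Exception):
    """Hypothetical stand-in for the client library's leader-not-available error."""


def send_with_retry(send, message, max_attempts=5, base_delay_s=0.1, sleep=time.sleep):
    """Call send(message), retrying with exponential backoff while no leader is available."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(message)
        except LeaderNotAvailable:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller decide what to do
            # Wait 0.1s, 0.2s, 0.4s, ... to give the cluster time to elect a leader.
            sleep(base_delay_s * 2 ** (attempt - 1))
```

The `sleep` parameter is only there so the waiting can be swapped out, for example in a test. Other errors are not caught on purpose, so only the "no leader yet" case is retried.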

    2. By electing a new leader as soon as possible, messages may be dropped, but we will minimize downtime, as any new machine can be leader.

      Two scenarios to get the leader back: 1.) Wait for the master to come back online. 2.) Elect the first node that comes back up. In the second scenario, if that replica was a bit behind the master, all the data written between the time this replica went down and the time the master went down is lost.

      So there is a trade-off between availability and consistency (durability).
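
The two scenarios can be put into a toy model to see how much data each choice costs. This is only a sketch under simplifying assumptions: each replica is reduced to the last offset it holds, and the `allow_unclean` flag plays the role of Kafka's `unclean.leader.election.enable` setting. The function name and data layout are invented for this note.

```python
def elect_leader(replicas: dict, in_sync: set, alive: set, allow_unclean: bool):
    """Pick a new leader and report how many messages that choice would lose.

    replicas maps replica id -> last offset it holds.
    Returns (leader_id, lost_messages), or (None, 0) if we have to wait.
    """
    latest = max(replicas.values())
    candidates = alive & in_sync
    if not candidates and allow_unclean:
        # Scenario 2: take whichever node is up, even if it is behind.
        candidates = set(alive)
    if not candidates:
        # Scenario 1: stay unavailable until an in-sync replica returns.
        return None, 0
    leader = max(candidates, key=lambda r: replicas[r])
    return leader, latest - replicas[leader]


if __name__ == "__main__":
    offsets = {"master": 100, "replica": 90}
    # The master is down and only the lagging replica has come back up.
    print(elect_leader(offsets, in_sync={"master"}, alive={"replica"}, allow_unclean=False))
    print(elect_leader(offsets, in_sync={"master"}, alive={"replica"}, allow_unclean=True))
```

With unclean election off, the partition stays without a leader and nothing is lost; with it on, the replica takes over right away and the 10 messages it never received are gone.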