Hypothesis

17 Matching Annotations

Nov 2023
nightlies.apache.org nightlies.apache.org

Overview

1
1. wmsok 17 Nov 2023
  
  in Public
  
  Note that the write*() methods on DataStream are mainly intended for debugging purposes. They are not participating in Flink’s checkpointing, this means these functions usually have at-least-once semantics. The data flushing to the target system depends on the implementation of the OutputFormat. This means that not all elements send to the OutputFormat are immediately showing up in the target system. Also, in failure cases, those records might be lost. For reliable, exactly-once delivery of a stream into a file system, use the FileSink. Also, custom implementations through the .addSink(...) method can participate in Flink’s checkpointing for exactly-once semantics.
  
  生产禁用write*()，改用addSink()
  
  flink
Visit annotations in context

Tags

flink

Annotators

wmsok

URL

nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/datastream/overview/
Dec 2022
www.zhihu.com www.zhihu.com

如何看待阿里9000万欧元收购Apache Flink母公司Data Artisans？ - 知乎

1
1. caocao485 14 Dec 2022
  
  in Public
  
  如何看待阿里9000万欧元收购Apache Flink母公司Data Artisans？
  
  flink 大数据
Visit annotations in context

Tags

大数据

flink

Annotators

caocao485

URL

zhihu.com/question/308415397
Jun 2022
nightlies.apache.org nightlies.apache.org

指标

1
1. renyi 19 Jun 2022
  
  in Public
  
  Flink supports Counters, Gauges, Histograms and Meters.
  
  计数器，仪表，直方图，
  
  flink
Visit annotations in context

Tags

flink

Annotators

renyi

URL

nightlies.apache.org/flink/flink-docs-release-1.15/zh/docs/ops/metrics/
nightlies.apache.org nightlies.apache.org

Flink 架构

8
1. renyi 19 Jun 2022
  
  in Public
  
  拥有一个预先存在的集群可以节省大量时间申请资源和启动 TaskManager
  
  flink
2. renyi 19 Jun 2022
  
  in Public
  
  每个 TaskManager 有一个 slot，这意味着每个 task 组都在单独的 JVM 中运行（例如，可以在单独的容器中启动）。具有多个 slot 意味着更多 subtask 共享同一 JVM。同一 JVM 中的 task 共享 TCP 连接（通过多路复用）和心跳信息。它们还可以共享数据集和数据结构，从而减少了每个 task 的开销。
  
  flink
3. renyi 19 Jun 2022
  
  in Public
  
  调整 task slot 的数量，用户可以定义 subtask 如何互相隔离
  
  flink
4. renyi 19 Jun 2022
  
  in Public
  
  每个 worker（TaskManager）都是一个 JVM 进程
  
  flink
5. renyi 19 Jun 2022
  
  in Public
  
  算子链接成 task 是个有用的优化：它减少线程间切换、缓冲的开销，并且减少延迟的同时增加整体吞吐量。链行为是可以配置的；请参考链文档以获取详细信息。
  
  flink
6. renyi 19 Jun 2022
  
  in Public
  
  JobMaster 负责管理单个JobGraph的执行。Flink 集群中可以同时运行多个作业，每个作业都有自己的 JobMaster。
  
  flink
7. renyi 19 Jun 2022
  
  in Public
  
  决定何时调度下一个 task（或一组 task）、对完成的 task 或执行失败做出反应、协调 checkpoint、并且协调从失败中恢复等等
  
  flink
8. renyi 19 Jun 2022
  
  in Public
  
  JobManager 具有许多与协调 Flink 应用程序的分布式执行有关的职责
  
  flink
Visit annotations in context

Tags

flink

Annotators

renyi

URL

nightlies.apache.org/flink/flink-docs-release-1.15/zh/docs/concepts/flink-architecture/
blog.csdn.net blog.csdn.net

(51条消息) 详解 Flink 指标、监控与告警_无精疯的博客-CSDN博客

1
1. renyi 19 Jun 2022
  
  in Public
  
  自定义指标
  
  flink
Visit annotations in context

Tags

flink

Annotators

renyi

URL

blog.csdn.net/a934079371/article/details/107650231
May 2022
nightlies.apache.org nightlies.apache.org

Joining

4
1. renyi 07 May 2022
  
  in Public
  
  we define a tumbling window with the size of 2 milliseconds, which results in windows of the form [0,1], [2,3],
  
  相当于[0,2)，[2,4)
  
  flink
2. renyi 07 May 2022
  
  in Public
  
  Because this behaves like an inner join, elements of one stream that do not have elements from another stream in their tumbling window are not emitted!
  
  inner-join
  
  flink
3. renyi 07 May 2022
  
  in Public
  
  Those elements that do get joined will have as their timestamp the largest timestamp that still lies in the respective window. For example a window with [5, 10) as its boundaries would result in the joined elements having 9 as their timestamp.
  
  那些确实被加入的元素将具有仍然位于相应窗口中的最大时间戳作为它们的时间戳。例如，以 [5, 10) 作为其边界的窗口将导致连接的元素以 9 作为其时间戳。
  
  flink
4. renyi 07 May 2022
  
  in Public
  
  like an inner-join
  
  flink
Visit annotations in context

Tags

flink

Annotators

renyi

URL

nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/operators/joining/
Feb 2021
itnext.io itnext.io

Machine Learning Model Serving Options

1
1. pyxelr 26 Feb 2021
  
  in Public
  
  Consider the amount of data and the speed of the data, if low latency is your priority use Akka Streams, if you have huge amounts of data use Spark, Flink or GCP DataFlow.
  
  For low latency = Akka Streams
  
  For huge amounts of data = Spark, Flink or GCP DataFlow
  
  MLOps Spark Akka Flink GCP
Visit annotations in context

Tags

Akka

MLOps

Spark

Flink

GCP

Annotators

pyxelr

URL

itnext.io/machine-learning-model-serving-options-1edf790d917

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL