Flink Runtime Architecture and Parallelism Settings: A Deep, Line-Level Source Walkthrough

This is a great follow-up request! Below, each part of Flink's runtime architecture and parallelism configuration is broken down to the deeper source methods and key logic (line/class/method level), together with where to find them in the code, so you can build a see-through understanding from architecture down to source.
(Flink 1.16 is used as the reference; some package and class names shift slightly between versions, but the main flow stays the same.)




I. Flink Runtime Architecture: Source-Level Details

1. JobManager

  • Startup entry point
    org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint#main
    • Creates the DispatcherRunner (dispatch coordinator)
    • Creates JobManagerRunnerImpl (the job-manager implementation)
      // JobManagerRunnerImpl constructor
      this.jobMasterService = jobMasterServiceFactory.createJobMasterService(
          jobGraph, completionFuture, userCodeClassLoader);
      
    • JobMaster startup: org.apache.flink.runtime.jobmaster.JobMaster#start
      public CompletableFuture<Acknowledge> start(JobMasterId newJobMasterId) {
          // 1. Register with the ResourceManager
          // 2. Initialize the SlotPool
          // 3. Enter the scheduling phase
      }
      

2. TaskManager

  • Startup entry point
    org.apache.flink.runtime.taskexecutor.TaskManagerRunner#main
    • Creates the TaskExecutor instance
      // TaskManagerRunner.java
      final TaskManagerRunner runner = new TaskManagerRunner(configuration, ...);
      
    • Registers with the ResourceManager
      // TaskExecutor.java
      resourceManagerLeaderRetriever.start(new ResourceManagerLeaderListener());
      
    • Receives and executes tasks
      // TaskExecutor.java
      public CompletableFuture<Acknowledge> submitTask(TaskDeploymentDescriptor tdd, ...) {
          // 1. Create the Task instance
          // 2. Put it into a TaskSlot
          // 3. Start the Task thread
      }
      

3. Slot

  • Core slot allocation
    • org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl#allocateSlot
      public CompletableFuture<LogicalSlot> allocateSlot(...) {
          // Find an available slot
          // Assign the slot to a subtask
      }
      
    • Slot data structures
      • org.apache.flink.runtime.taskexecutor.slot.TaskSlot
      • org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot
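  • A minimal, hedged local sketch of the slot count these structures manage: it only uses the public Configuration / StreamExecutionEnvironment API, and TaskManagerOptions.NUM_TASK_SLOTS is the programmatic form of taskmanager.numberOfTaskSlots. Values are illustrative.
      import org.apache.flink.configuration.Configuration;
      import org.apache.flink.configuration.TaskManagerOptions;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      
      public class LocalSlotsExample {
          public static void main(String[] args) throws Exception {
              Configuration conf = new Configuration();
              // Each local TaskManager offers 4 slots; the same as setting
              // taskmanager.numberOfTaskSlots: 4 in flink-conf.yaml.
              conf.set(TaskManagerOptions.NUM_TASK_SLOTS, 4);
      
              StreamExecutionEnvironment env =
                      StreamExecutionEnvironment.createLocalEnvironment(4, conf);
      
              env.fromElements(1, 2, 3).print();
              env.execute("local-slots-demo");
          }
      }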

II. Flink Job Submission Flow: Source-Level Details

  1. The client submits the job

    • org.apache.flink.client.program.ClusterClient#submitJob
      public CompletableFuture<JobSubmissionResult> submitJob(JobGraph jobGraph) {
          // 1. Serialize the JobGraph
          // 2. Send it to the JobManager via the DispatcherGateway
          dispatcherGateway.submitJob(jobGraph, timeout);
      }
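    • On the user side, the same path is reached through StreamExecutionEnvironment#executeAsync, which returns a JobClient handle. A hedged minimal sketch using only the public API (the job body is illustrative):
      import org.apache.flink.core.execution.JobClient;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      
      public class SubmitAsyncExample {
          public static void main(String[] args) throws Exception {
              StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
              env.fromElements("a", "b", "c").print();
      
              // executeAsync() builds the StreamGraph/JobGraph and hands it to the
              // cluster client, which eventually reaches Dispatcher#submitJob on the JobManager.
              JobClient client = env.executeAsync("submit-demo");
              System.out.println("Submitted job " + client.getJobID());
              System.out.println("Status: " + client.getJobStatus().get());
          }
      }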
      
  2. The JobManager receives the job

    • org.apache.flink.runtime.dispatcher.Dispatcher#submitJob
      public CompletableFuture<Acknowledge> submitJob(JobGraph jobGraph, ...) {
          // 1. Validate the job
          // 2. Create a JobManagerRunner
          // 3. Start the job
      }
      
    • org.apache.flink.runtime.jobmaster.JobMaster#initialize
      private void initialize() {
          // 1. Build the ExecutionGraph (the physical execution plan)
          // 2. Call ExecutionGraphBuilder.buildGraph
      }
      
  3. Resource requests and slot allocation

    • org.apache.flink.runtime.jobmaster.JobMaster#internalStartScheduling
      protected void internalStartScheduling() {
          // 1. Call SchedulerBase.startScheduling
          // 2. The scheduler handles slot allocation and subtask scheduling
      }
      
    • org.apache.flink.runtime.scheduler.SchedulerBase#allocateSlotsAndDeploy
      private void allocateSlotsAndDeploy(List<ExecutionVertexID> verticesToDeploy) {
          // 1. Call SlotPoolImpl.allocateSlot
          // 2. Allocate slots and deploy the subtasks
      }
      
  4. The TaskManager starts the SubTasks

    • org.apache.flink.runtime.taskexecutor.TaskExecutor#submitTask
      public CompletableFuture<Acknowledge> submitTask(TaskDeploymentDescriptor tdd, ...) {
          // 1. Construct the Task object
          // 2. task.startTaskThread()
      }
      
    • org.apache.flink.runtime.taskmanager.Task#run
      public void run() {
          // 1. Create the OperatorChain
          // 2. Start the operator chain
      }
      
  5. State management and fault tolerance

    • org.apache.flink.runtime.checkpoint.CheckpointCoordinator
      • triggerCheckpoint
      • restoreLatestCheckpointedState
    • org.apache.flink.runtime.jobgraph.tasks.StatefulTask#restoreState
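    • On the user side, checkpointing is switched on through StreamExecutionEnvironment#enableCheckpointing, which is what ultimately drives the CheckpointCoordinator above. A hedged configuration sketch (interval and storage path are illustrative):
      import org.apache.flink.streaming.api.CheckpointingMode;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      
      public class CheckpointConfigExample {
          public static void main(String[] args) throws Exception {
              StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
      
              // Ask the CheckpointCoordinator to trigger a checkpoint every 10 seconds.
              env.enableCheckpointing(10_000, CheckpointingMode.EXACTLY_ONCE);
      
              // Where completed checkpoints are persisted (illustrative local path).
              env.getCheckpointConfig().setCheckpointStorage("file:///tmp/flink-checkpoints");
      
              env.fromElements(1, 2, 3).print();
              env.execute("checkpoint-demo");
          }
      }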

III. Parallelism Configuration: Source-Level Details

1. Global parallelism

  • Where the setting takes effect (a combined usage sketch appears after subsection 2 below)
    • StreamExecutionEnvironment#setParallelism
      public StreamExecutionEnvironment setParallelism(int parallelism) {
          config.setParallelism(parallelism); // stored on the ExecutionConfig as the job-wide default
          return this;
      }
      
    • Written into the JobGraph by StreamGraph#getJobGraph (via StreamingJobGraphGenerator)
      jobVertex.setParallelism(operator.getParallelism());
      

2. Operator-level parallelism

  • Where it takes effect
    • SingleOutputStreamOperator#setParallelism
      public SingleOutputStreamOperator<T> setParallelism(int parallelism) {
          transformation.setParallelism(parallelism);
          return this;
      }
      
    • When org.apache.flink.streaming.api.graph.StreamGraph builds the DAG, each node is created with its operator's parallelism. A combined usage sketch follows.
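  • A hedged end-to-end sketch combining the two levels (operator bodies and values are illustrative); a per-operator setParallelism overrides the environment-wide default:
      import org.apache.flink.api.common.typeinfo.Types;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      
      public class ParallelismExample {
          public static void main(String[] args) throws Exception {
              StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
              env.setParallelism(4); // global default for every operator in this job
      
              env.fromElements("a", "b", "a")
                 .map(String::toUpperCase)
                 .returns(Types.STRING)
                 .setParallelism(2)   // overrides the global default for this map only
                 .print()
                 .setParallelism(1);  // single sink subtask
      
              env.execute("parallelism-demo");
          }
      }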

3. How slot allocation relates to parallelism

  • JobVertex#setParallelism
  • ExecutionJobVertex#initialize
    // Creates the ExecutionVertex array
    this.taskVertices = new ExecutionVertex[parallelism];
    
  • SlotPoolImpl#allocateSlot
    • Each ExecutionVertex represents one SubTask and corresponds to one slot request; with the default slot sharing group, the number of slots a job needs equals its maximum operator parallelism (see the sketch below).
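  • A hedged, self-contained illustration of that rule of thumb (plain Java, no Flink API; the parallelism values are made up):
      public class RequiredSlotsEstimate {
          public static void main(String[] args) {
              // Illustrative operator parallelisms, all in the default slot sharing group.
              int source = 2, map = 4, sink = 1;
      
              // One slot can host one subtask of each operator in the group, so the job
              // needs as many slots as the largest parallelism in that group.
              int requiredSlots = Math.max(source, Math.max(map, sink));
              System.out.println("Required slots: " + requiredSlots); // prints 4
          }
      }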

IV. Slot Sharing and Operator Chaining: Source-Level Details

1. Slot Sharing

  • Group-based allocation
    • org.apache.flink.runtime.jobgraph.JobVertex#setSlotSharingGroup
    • org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager
      public SlotSharingManager(SlotSharingGroupId slotSharingGroupId, ...) {
          // Manages slot allocation for all operators in the same sharing group
      }
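  • A hedged sketch of the user-facing side (group names are arbitrary): SingleOutputStreamOperator#slotSharingGroup assigns operators to named groups, and operators in different groups cannot share a slot, so total slot demand becomes the sum of the per-group maxima.
      import org.apache.flink.api.common.typeinfo.Types;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      
      public class SlotSharingExample {
          public static void main(String[] args) throws Exception {
              StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
      
              env.fromElements(1, 2, 3)
                 .slotSharingGroup("sources")       // isolate the source in its own group
                 .map(x -> x + 1)
                 .returns(Types.INT)
                 .slotSharingGroup("processing")    // downstream operators inherit this group
                 .print();
      
              env.execute("slot-sharing-demo");
          }
      }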
      

2. Operator Chaining

  • At the StreamGraph → JobGraph stage
    • org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator#createChain
      // Walks the StreamGraph and merges chainable operators into a single JobVertex
      if (isChainable(edge, streamGraph)) {
          // fold the downstream operator into the current OperatorChain
      }
      
  • When the Task starts
    • org.apache.flink.streaming.runtime.tasks.StreamTask#invoke
      this.operatorChain = new OperatorChain<>(...);
      operatorChain.openOperators();
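  • A hedged user-API sketch of the chaining knobs (operator bodies are illustrative): disableChaining() keeps one operator out of any chain, startNewChain() cuts the chain in front of an operator, and env.disableOperatorChaining() switches chaining off for the whole job.
      import org.apache.flink.api.common.typeinfo.Types;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      
      public class ChainingControlExample {
          public static void main(String[] args) throws Exception {
              StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
              // env.disableOperatorChaining(); // global switch: nothing gets chained
      
              env.fromElements("a", "b", "c")
                 .map(String::toUpperCase)
                 .returns(Types.STRING)
                 .startNewChain()              // begin a new chain at this map
                 .filter(s -> !s.isEmpty())
                 .disableChaining()            // this filter is never chained with its neighbours
                 .print();
      
              env.execute("chaining-demo");
          }
      }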
      

V. Dynamically Adjusting Parallelism: Source-Level Details

1. Restoring from a savepoint

  • Entry point
    • org.apache.flink.client.cli.CliFrontend#run
      // pass -s <savepointPath> and -p <new parallelism> on the command line,
      // e.g. flink run -s <savepointPath> -p 8 <userJar>
      
    • org.apache.flink.runtime.jobgraph.JobGraph#setParallelism
    • org.apache.flink.runtime.executiongraph.ExecutionGraph#restoreLatestCheckpointedState
      // state redistribution, handled by StateAssignmentOperation
      

2. Rescale

  • Mechanism
    • org.apache.flink.runtime.rescale.RescaleSignal
    • org.apache.flink.runtime.executiongraph.ExecutionJobVertex#rescale
      public void rescale(int newParallelism) {
          // Re-create the ExecutionVertex array for the new parallelism
      }
      

VI. Troubleshooting Common Parallelism Issues: Source-Level Details

1. Jobs queue up because of insufficient resources

  • ExecutionJobVertex#scheduleAll
    • The logs show messages along the lines of "Not enough slots available ..."

2. Skewed slot allocation

  • SlotPoolImpl#allocateSlot
    • The allocation strategy can be traced through the slot allocation logs

3. State compatibility checks

  • StateAssignmentOperation#checkParallelismPreconditions
    if (newParallelism > maxParallelism) {
        throw new IllegalStateException(...);
    }
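  • The ceiling in that check comes from the user API. A hedged sketch (values are illustrative); maxParallelism fixes the number of key groups and therefore bounds how far a stateful operator can later be rescaled:
      import org.apache.flink.api.common.typeinfo.Types;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      
      public class MaxParallelismExample {
          public static void main(String[] args) throws Exception {
              StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
              env.setParallelism(4);
              env.setMaxParallelism(128);      // job-wide ceiling (number of key groups)
      
              env.fromElements("a", "b", "a")
                 .keyBy(s -> s)
                 .map(s -> s + "!")
                 .returns(Types.STRING)
                 .setMaxParallelism(32)        // per-operator ceiling overrides the job-wide one
                 .print();
      
              env.execute("max-parallelism-demo");
          }
      }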
    

4. Chaining and monitoring issues

  • StreamExecutionEnvironment#disableOperatorChaining / SingleOutputStreamOperator#disableChaining
    • Disabling chaining splits the chain apart, which makes the chain boundaries easy to follow in the source and shows each operator as a separate task in the web UI

VII. End-to-End Source Trail (Quick Index)

  • Job submission
    ClusterClient.submitJob → Dispatcher.submitJob → JobMaster.initialize
  • DAG generation and parallelism assignment
    StreamGraph → JobVertex.setParallelism → ExecutionJobVertex
  • Slot allocation
    SchedulerBase.allocateSlotsAndDeploy → SlotPoolImpl.allocateSlot
  • Task deployment and execution
    TaskExecutor.submitTask → Task.run → OperatorChain
  • Slot sharing / chaining
    SlotSharingManager / StreamingJobGraphGenerator.createChain
  • Dynamic parallelism adjustment
    ExecutionJobVertex.rescale / StateAssignmentOperation

VIII. Summary and Recommendations

  • From job submission to task execution, the main line through Flink's source is clear, and every step has a core class and method you can jump to.
  • Every key decision point for parallelism and slot allocation can be located by class and method, which makes secondary development and custom optimization much easier.
  • Use your IDE's global search to find the classes and methods above, and combine that with the official documentation and code comments to understand Flink's scheduling and resource allocation in depth.

Quick mnemonic

Jobs get scheduled, Tasks execute, slots are allocated, subtasks land in place;
parallelism flows global → operator → slot, and the source main line is easy to trace;
logs carry monitoring and troubleshooting, and custom development follows the same trail!


If you want concrete source snippets, parameter explanations, or debugging tips for any particular step, feel free to ask about that step and we can go deeper!
