kubeflow-8: Data Passing in Pipelines

Argo Workflows is an open-source, container-native workflow engine for orchestrating jobs on Kubernetes. Argo Workflows is implemented as a Kubernetes CRD.
Features:
(1) Define workflows where each step in the workflow is a container.
(2) Model multi-step workflows as a sequence of tasks, or capture the dependencies between tasks using a directed acyclic graph (DAG).
(3) Argo Workflows on Kubernetes makes it easy to run a large number of compute-intensive jobs in a short amount of time.
(4) Run CI/CD natively on Kubernetes without configuring complex software development products.

1 Small data

Small data is data that you would be comfortable passing as a program's command-line argument. Its size should not exceed a few kilobytes.
Typical kinds of small data are: numbers, URLs, small strings (e.g. a column name).
Small lists, dictionaries and JSON structures are fine, but keep an eye on the size and consider switching to file-based data passing methods, which are more suitable for bigger data (more than several kilobytes) or binary data.
All small data outputs will at some point be serialized to strings, and all small data input values will at some point be deserialized from strings (passed as command-line arguments). There are built-in serializers and deserializers for several common types (e.g. str, int, float, bool, list, dict). All other types of data need to be serialized manually before being returned. Make sure to properly specify type annotations, otherwise there will be no automatic deserialization and the component function will receive strings instead of deserialized objects.
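A minimal sketch of these serialization rules (the component functions below are illustrative):

from kfp.components import func_to_container_op

# Built-in serializers: the int annotations let the system convert
# these values to and from command-line strings automatically.
@func_to_container_op
def add(a: int, b: int) -> int:
    return a + b

# No built-in serializer exists for datetime, so the value is
# serialized manually and the output is declared as str.
@func_to_container_op
def produce_timestamp() -> str:
    from datetime import datetime  # imports must live inside the function body for lightweight components
    return datetime.now().isoformat()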

1.1 kfp.components.func_to_container_op

Converts a Python function into a pipeline component.

func_to_container_op(func: Callable,
                     output_component_file: Union[str, NoneType] = None,
                     base_image: Union[str, NoneType] = None,
                     extra_code: Union[str, NoneType] = '',
                     packages_to_install: List[str] = None,
                     modules_to_capture: List[str] = None,
                     use_code_pickling: bool = False,
                     annotations: Union[Mapping[str, str], NoneType] = None)
Purpose:
Converts a Python function to a component and returns a task (:class:`kfp.dsl.ContainerOp`) factory.

The function's docstring is used as the component description. Argument and return annotations are used as component input/output types.
To declare a function with multiple return values, use the :code:`NamedTuple` return annotation syntax::
from typing import NamedTuple
def add_multiply_two_numbers(a: float, b: float) -> NamedTuple('DummyName', [('sum', float), ('product', float)]):
    """Returns sum and product of two arguments"""
    return (a + b, a * b)
Parameters:
(1) func: Required (commonly used).
The Python function to convert.
(2) base_image: Optional (commonly used).
Specify a custom Docker container image to use in the component.
For lightweight components, the image needs to have Python 3.5+.
Default is tensorflow/tensorflow:1.13.2-py3.

(3) output_component_file: Optional.
Write a component definition to a local file.
Can be used for sharing.
(4) extra_code: Optional.
Extra code to add before the function code.
Can be used as a workaround to define types used in the function signature.
(5) packages_to_install: Optional.
List of [versioned] Python packages to pip install before executing the user function.
(6) modules_to_capture: Optional.
List of module names that will be captured (instead of just referenced) during the dependency scan.
By default the :code:`func.__module__` is captured.
The actual algorithm: starting with the initial function, traverse its dependencies.
If the :code:`dependency.__module__` is in the :code:`modules_to_capture` list, it is captured and its dependencies are traversed.
Otherwise the dependency is only referenced instead of captured, and its dependencies are not traversed.
(7) use_code_pickling: Specifies whether the function code should be captured using pickling, as opposed to source code manipulation.
Pickling has better support for capturing dependencies, but is sensitive to version mismatches between the Python in the component creation environment and the runtime image.
(8) annotations: Optional.
Allows adding arbitrary key-value data to the component specification.

Returns:
A factory function with a strongly-typed signature taken from the Python function.
Once called with the required arguments, the factory constructs a pipeline task instance (:class:`kfp.dsl.ContainerOp`).
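A minimal sketch of using the returned factory (the names are illustrative):

from kfp.components import func_to_container_op

def greet(name: str):
    print('hello ' + name)

# The factory has the strongly-typed signature (name: str).
greet_op = func_to_container_op(greet)

def pipeline():
    # Calling the factory inside a pipeline function creates a ContainerOp task.
    greet_task = greet_op(name='kfp')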

1.2 kfp.Client.create_run_from_pipeline_func

Runs a pipeline on a KFP-enabled Kubernetes cluster.

create_run_from_pipeline_func(self,
                              pipeline_func: Callable,
                              arguments: Mapping[str, str],
                              run_name: Union[str, NoneType] = None,
                              experiment_name: Union[str, NoneType] = None,
                              pipeline_conf: Union[kfp.dsl._pipeline.PipelineConf, NoneType] = None,
                              namespace: Union[str, NoneType] = None)
Purpose:
Runs a pipeline on a KFP-enabled Kubernetes cluster.
This command compiles the pipeline function, creates or gets an experiment, and submits the pipeline for execution.

Parameters:
(1) pipeline_func: Required.
A function that describes a pipeline by calling components and composing them into an execution graph.
(2) arguments: Required.
Arguments to the pipeline function, provided as a dict.

(3) run_name: Optional.
Name of the run, shown in the UI.
(4) experiment_name: Optional.
Name of the experiment to add the run to.
(5) pipeline_conf: Optional.
Pipeline configuration ops that will be applied to all ops in the pipeline func.
(6) namespace: Kubernetes namespace where the pipeline runs are created.
  For single-user deployments, leave it as None;
  for multi-user deployments, pass a namespace where the user is authorized.
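A minimal sketch of a submission call (assumes a reachable KFP endpoint; the pipeline function and names are illustrative):

import kfp

def my_pipeline():
    pass  # component calls would go here

# kfp.Client() picks up the in-cluster endpoint; a host can also be
# passed explicitly, e.g. kfp.Client(host='http://localhost:8080').
kfp.Client().create_run_from_pipeline_func(
    my_pipeline,
    arguments={},
    run_name='demo-run',
    experiment_name='demo-experiment',
)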

1.3 Consuming constant arguments

# pip install kfp
# docker pull tensorflow/tensorflow:1.13.2-py3
(1) Specifying a base image
Print a constant argument directly.

import kfp
from kfp.components import func_to_container_op

def print_fun():
    print("hello world")

if __name__ == '__main__':
    image_name = "tensorflow/tensorflow:1.13.2-py3"
    my_op = func_to_container_op(print_fun, base_image=image_name)
    kfp.Client().create_run_from_pipeline_func(my_op, arguments={})

(2) Using the decorator
The python:3.7 image is pulled by default.

import kfp
from kfp.components import func_to_container_op

@func_to_container_op
def print_fun(text:str):
    print(text)
    
def use():
    print_fun("hello lucy")
    
if __name__ == '__main__':
    kfp.Client().create_run_from_pipeline_func(use, arguments={})

1.4 Consuming variable arguments

(1) Specifying a base image

import kfp
from kfp.components import func_to_container_op

def use(text:str):
    print(text)
    
if __name__ == '__main__':
    image_name = "tensorflow/tensorflow:1.13.2-py3"
    my_op = func_to_container_op(use, base_image=image_name)
    kfp.Client().create_run_from_pipeline_func(my_op, arguments={"text":"hello you"})

(2) Using the decorator
The python:3.7 image is pulled by default.

import kfp
from kfp.components import func_to_container_op

@func_to_container_op
def print_fun(text:str):
    print(text)
    
def use(text:str):
    print_fun(text)
    
if __name__ == '__main__':
    kfp.Client().create_run_from_pipeline_func(use, arguments={"text":"hello me"})

1.5 Producing single-output data

import kfp
from kfp.components import func_to_container_op

@func_to_container_op
def print_fun(text:str):
    print(text)

@func_to_container_op
def produce_fun() -> str:
    return 'Hello world'

def use():
    produce_task = produce_fun()
    # task.output only works for single-output components
    consume_task1 = print_fun(produce_task.output)
    # task.outputs[...] always works
    consume_task2 = print_fun(produce_task.outputs['output']) 

if __name__ == '__main__':
    kfp.Client().create_run_from_pipeline_func(use, arguments={})

The component print_fun is defined, using the default image python:3.7.
The component produce_fun is defined, using the default image python:3.7.
The function use defines the pipeline; data flows from the producer to the consumer.

1.6 Producing multi-output data

import kfp
from kfp.components import func_to_container_op
from typing import NamedTuple

@func_to_container_op
def consume_one_argument(text: str):
    print('name={}'.format(text))

@func_to_container_op
def produce_one_output() -> str:
    return "lucy"


@func_to_container_op
def consume_two_arguments(text: str, number: int):
    print('name={}'.format(text))
    print('age={}'.format(str(number)))

@func_to_container_op
def produce_two_outputs() -> NamedTuple('Outputs', [('name', str), ('age', int)]):
    return ("lily", 24)

def use(text: str = "kerry"):
    produce_task1 = produce_one_output()
    produce_task2 = produce_two_outputs()
    # task.output only works for single-output components
    consume_task1 = consume_one_argument(produce_task1.output)
    # task.outputs[...] always works
    consume_task2 = consume_one_argument(produce_task1.outputs['output']) 
    
    consume_task3 = consume_two_arguments(produce_task2.outputs['name'], produce_task2.outputs['age']) 
    consume_task4 = consume_two_arguments(text, produce_task2.outputs['age']) 
    consume_task5 = consume_two_arguments(produce_task1.outputs['output'], produce_task2.outputs['age']) 
    
if __name__ == '__main__':
    kfp.Client().create_run_from_pipeline_func(use, arguments={})


2 Bigger data

(1) Bigger data should be read from files and written to files.
(2) The paths of the input and output files are chosen by the system and passed into the function (as strings).
(3) Use the InputPath parameter annotation to tell the system that the function wants to consume the corresponding input data as a file. The system will download the data, write it to a local file, and then pass the path of that file to the function.
(4) Use the OutputPath parameter annotation to tell the system that the function wants to produce the corresponding output data as a file. The system will prepare and pass in the path of a file where the function should write the output data. After the function exits, the system will upload the data to the storage system so that it can be passed to downstream components.
(5) You can specify the type of the consumed/produced data via the type argument of InputPath and OutputPath. The type can be a Python type or an arbitrary type-name string. OutputPath('TFModel') states that the data the function writes to the file has type 'TFModel'; InputPath('TFModel') states that the function expects the data it reads from the file to have type 'TFModel'. When the pipeline author connects inputs to outputs, the system checks whether the types match (see the sketch after this list).
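A minimal sketch of type-checked file passing (the 'TFModel' type name and the component bodies are illustrative):

from kfp.components import func_to_container_op, InputPath, OutputPath

@func_to_container_op
def train_model(model_path: OutputPath('TFModel')):
    # Write the produced artifact to the system-provided path.
    with open(model_path, 'w') as f:
        f.write('serialized model weights')

@func_to_container_op
def deploy_model(model_path: InputPath('TFModel')):
    # Read the artifact from the system-provided path; connecting an
    # output of a different type here would fail the type check.
    with open(model_path) as f:
        print(f.read())

def pipeline():
    train_task = train_model()
    deploy_model(train_task.output)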

Note on input/output names:
When the function is converted to a component,
the input and output names generally follow the parameter names,
but the "_path" and "_file" suffixes are stripped from file/path inputs and outputs.
E.g. the number_file_path: InputPath(int) parameter becomes the number: int input.
This makes argument passing look more natural: number=42 instead of number_file_path=42, as in the sketch below.
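A minimal sketch of the name-stripping rule (the component is illustrative):

from kfp.components import func_to_container_op, InputPath

@func_to_container_op
def print_number(number_file_path: InputPath(int)):
    # The parameter is number_file_path, but the component input is named "number".
    with open(number_file_path) as f:
        print(f.read())

def pipeline():
    print_number(number=42)  # not number_file_path=42; the system writes the constant to a file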

2.1 kfp.components.InputPath

class InputPath(builtins.object)
When creating a component from a function,
:class:`.InputPath` should be used as a function parameter annotation
to tell the system to pass the *data file path* to the function
instead of passing the actual data.

__init__(self, type=None)
Initialize self. See help(type(self)) for accurate signature.

2.2 kfp.components.OutputPath

class OutputPath(builtins.object)
When creating a component from a function,
:class:`.OutputPath` should be used as a function parameter annotation
to tell the system that the function wants to output data by writing it into a file with the given path
instead of returning the data from the function.

__init__(self, type=None)
Initialize self. See help(type(self)) for accurate signature.

2.3 Reading and writing bigger data

import kfp
from kfp.components import func_to_container_op, InputPath, OutputPath

# Writing bigger data
@func_to_container_op
def produce_data(line: str, output_text_path: OutputPath(str), count: int = 10):
    '''Repeat the line specified number of times'''
    with open(output_text_path, 'w') as fw:
        for i in range(count):
            fw.write(line + '\n')


# Reading bigger data
@func_to_container_op
def consume_data(text_path: InputPath()): # The "text" input is untyped so that any data can be printed
    '''Print text'''
    with open(text_path, 'r') as reader:
        for line in reader:
            print(line, end = '')

def print_repeating_lines_pipeline():
    produce_task = produce_data(line='world', count=20)
    consume_task = consume_data(produce_task.output) # Don't forget .output !

   
if __name__ == '__main__':
    kfp.Client().create_run_from_pipeline_func(print_repeating_lines_pipeline, arguments={})


3 Multiple outputs

How to make a component with multiple outputs using the Pipelines SDK.

3.1 kfp.compiler.build_python_component

build_python_component(component_func: Callable,
                       target_image: str,
                       base_image: Union[str, NoneType] = None,
                       dependency: List[str] = [],
                       staging_gcs_path: Union[str, NoneType] = None,
                       timeout: int = 600,
                       namespace: Union[str, NoneType] = None,
                       target_component_file: Union[str, NoneType] = None,
                       python_version: str = 'python3',
                       is_v2: bool = False)
Automatically builds a container image for the component_func based on the base_image and pushes it to the target_image.
Parameters:
(1) component_func (python function):
The Python function to build the component upon.
(2) base_image (str):
Docker image to use as a base image.
(3) target_image (str):
The target image path; the full URI to push the target image to.
(4) staging_gcs_path (str): GCS blob that can store temporary build files.
(5) timeout (int):
The timeout for the image build (in seconds); default is 600 seconds.
(6) namespace (str):
The namespace within which to run the Kubernetes Kaniko job. If the job is running on GKE and the value is None, the underlying functions will use the default namespace from GKE.
(7) dependency (list):
The list of VersionedDependency, which includes package names and versions; default is empty.
(8) target_component_file (str):
The path to save the generated component YAML spec.
(9) python_version (str): Choose python2 or python3; default is python3.
(10) is_v2 (bool): Whether or not to generate a v2 KFP component; default is False.
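A minimal usage sketch (assumes GCP credentials, a GCS staging bucket, and a container registry you can push to; the image URI and bucket path are hypothetical):

from kfp import compiler

def product_sum(a: float, b: float) -> float:
    return a + b

# Builds and pushes a container image for product_sum, returning a component.
my_op = compiler.build_python_component(
    component_func=product_sum,
    base_image='python:3.7',
    target_image='gcr.io/my-project/product-sum:latest',  # hypothetical registry path
    staging_gcs_path='gs://my-bucket/build',              # hypothetical staging bucket
)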

3.2 func_to_container_op

(1) Decorator
The python:3.7 image is pulled by default.

import kfp 
from typing import NamedTuple
from kfp.components import func_to_container_op

@func_to_container_op
def product_sum(a: float, b: float) -> NamedTuple('hahah', [('product', float), ('sum', float)]):
    '''Returns the product and sum of two numbers'''
    return (a*b, a+b)


@kfp.dsl.pipeline(
    name='Multiple Outputs Pipeline',
    description='Sample pipeline to showcase multiple outputs'
)
def pipeline(a=2.0, b=2.5, c=3.0):
    prod_sum_task = product_sum(a, b)
    prod_sum_task2 = product_sum(b, c)
    prod_sum_task3 = product_sum(prod_sum_task.outputs['product'],
                                    prod_sum_task2.outputs['sum'])
if __name__ == '__main__':
    arguments = { 'a': 2,'b': 3,'c': 4}
    kfp.Client().create_run_from_pipeline_func(pipeline, arguments=arguments)

(2) Specifying an image
The specified image will be used.

import kfp 
from typing import NamedTuple
from kfp.components import func_to_container_op


def product_sum(a: float, b: float) -> NamedTuple('hahah', [('product', float), ('sum', float)]):
    '''Returns the product and sum of two numbers'''
    return (a*b, a+b)

image_name = "tensorflow/tensorflow:1.13.2-py3"
my_op = func_to_container_op(product_sum, base_image=image_name)

@kfp.dsl.pipeline(
    name='Multiple Outputs Pipeline',
    description='Sample pipeline to showcase multiple outputs'
)
def my_pipeline(a=2.0, b=2.5, c=3.0):
    task1 = my_op(a, b)
    task2 = my_op(b, c)
    task3 = my_op(task1.outputs['product'], task2.outputs['sum'])
    
if __name__ == '__main__':
    arguments = { 'a': 20,'b': 30,'c': 40}
    kfp.Client().create_run_from_pipeline_func(my_pipeline, arguments=arguments)
