A Sudoku-solving serverless endpoint using WebAssembly and OR-Tools

In this article I will go through the process of:

  • Building an existing C++ library as WebAssembly
  • Creating a WebAssembly application that links to this library
  • Deploying the WebAssembly application as part of a serverless endpoint

The example I use is a Sudoku solver using Google OR-Tools for the solving part. There is an online demo of the result here: https://sudoku-solver.krmuller.workers.dev/. To be clear: creating a Sudoku solver is not really the end goal here. Puzzles like Sudoku are a kind of “Hello World” in this context. The solver exemplifies how powerful C++-based libraries can be leveraged within a serverless setting. The GitHub repo for the Sudoku solver can be found at https://github.com/kjartanm/sudoku-solver.

First, a bit of background. Using WebAssembly can have different motivations. Speed is obviously one; consistent speed across environments is another, since WebAssembly may not always be faster. But access to tools, applications, libraries and functionality that would otherwise be hard to use is another important motivation. WebAssembly also has different use cases: in the browser, on serverless, or even in gateways. Here I focus on access to otherwise hard-to-implement functionality in combination with serverless. That is a natural combination. When compiling code that was never meant for a browser, you risk getting file sizes that are huge, even compared to today's React-based JS bundles. Pushing that as part of your JS app could create unacceptable download times. Enabling the same functionality on serverless can then make sense.

Building OR-Tools as WebAssembly

I became aware of Google OR-Tools while researching technology for dealing with task management. OR-Tools is an extensive library for handling problems that fall within the discipline called “Operations Research”:

Operations research (OR) is a discipline that deals with the application of advanced analytical methods to help make better decisions (…) operations research arrives at optimal or near-optimal solutions to complex decision-making problems. (Wikipedia)

Typical OR problems are scheduling, routing, workflows, critical paths and so on: problems with a lot of variables, hard and soft constraints, and potentially a range of possible solutions that need to be prioritized.

These problems can be dealt with in a myriad of ways, using different techniques, methods and technologies. OR-Tools supports several of these through a mixture of its own components and third-party libraries. In addition, OR-Tools includes wrappers for Java, Python and C#. All in all, it is a complex setup with a lot of dependencies.

I was looking for something I could use on the web. Since I was also looking into WebAssembly at the time, I thought I would see if it was possible to compile OR-Tools into Wasm. Being a novice at WebAssembly, having only rudimentary knowledge of C++, and being borderline clueless about OR, that sounded like a good idea :) Fortunately it turned out well!

But the start was bumpy. The usual way to compile existing libraries, as I gathered from the examples, would be to use Emscripten and its ‘make’ commands: emmake and emconfigure. The make process for OR-Tools has two steps, building the third-party dependencies first. Using emmake for the dependencies didn’t work well at all, and the main make process for OR-Tools is very much all-or-nothing: it creates libraries, binaries, wrappers for Java and Python, examples, tests, etc. in one go. This is great for those who want to jump straight in, but it is hard to adapt for something else, at least for me with limited experience creating makefiles.

The breakthrough came after reading this article about cross compiling (http://marcelbraghetto.github.io/a-simple-triangle/2019/03/10/part-06/), which includes a setup for using Emscripten.cmake. OR-Tools has experimental CMake support, so I could adapt the Emscripten.cmake setup from the article for my own purpose. The first run(s) failed, but even so they gave better feedback and showed a lot more promise than using emmake directly. So I iterated on that, and in the end I didn’t really have to change that much: some small changes within the OR-Tools process, mostly turning off things that I didn’t really need, and some changes to two of the dependencies.

The OR-Tools CMake process adapts the dependencies for its own purposes by applying a patch after fetching them. So to make the needed changes, I used standalone versions of the dependencies in question, iterated on them until I got a Wasm build, then created patches with the differences. The patches were added back into the OR-Tools repo for its CMake process. The resulting build took a lot of time, but it actually worked! I compiled some of the examples using Emscripten’s emcc, linking the output of the OR-Tools build, and they behaved as they should.

The changes I made were of the quick and dirty kind, so for now building OR-Tools as Wasm is only possible from this fork (https://github.com/kjartanm/wasm-or-tools). But with so few changes needed, it should be entirely possible to adapt the OR-Tools CMake process to add Wasm as an option in a proper way. That is a job for someone more experienced with CMake than I am at the moment.

Creating a WebAssembly application using OR-Tools

OR-Tools comes with a lot of examples illustrating different methods and techniques. This Sudoku solver is based on a Java example in the contributor section of the examples, ported to C++ and adapted for integration with JavaScript. The solver uses something called Constraint Programming, where “users declaratively state the constraints on the feasible solutions for a set of decision variables”. In the Sudoku solver this is done in just a few steps. First the solver sets up the variables:

std::vector<std::vector<IntVar *>> grid = {
    {}, {}, {}, {}, {}, {}, {}, {}, {}};
std::vector<IntVar *> grid_flat;

// One integer variable with domain 1..9 per cell
for (int i = 0; i < n; i++)
{
    for (int j = 0; j < n; j++)
    {
        std::ostringstream varname;
        varname << "grid[" << i << "," << j << "]";
        IntVar *nr = solver.MakeIntVar(1, 9, varname.str());
        grid[i].push_back(nr);
        grid_flat.push_back(nr);
    }
}

These are the well-known 81 variables, each taking a value from 1 to 9, arranged in a grid (the solver uses both a flat and a 2D version of the grid for convenience).
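
To make the flat and 2D views concrete, here is a small JavaScript sketch of how one cell is addressed in both. The helper names flatIndex and toRows are my own illustration and are not part of the solver; the only thing taken from the article is the row-major layout (i * n + j) that the C++ code uses.

```javascript
// Hypothetical helpers for illustration; not taken from the solver's source.
const N = 9;

// Position of cell (row, col) in the flat, row-major array.
const flatIndex = (row, col) => row * N + col;

// Split a flat array of 81 cells into 9 rows of 9 cells.
const toRows = flat =>
  Array.from({ length: N }, (_, row) => flat.slice(row * N, (row + 1) * N));

// Fill the flat array with its own indices so the mapping is easy to see.
const flat = Array.from({ length: N * N }, (_, i) => i);
const rows = toRows(flat);

// Both views address the same cell.
console.log(rows[4][7] === flat[flatIndex(4, 7)]);
```

The flat version is the one that crosses the JavaScript/Wasm boundary later on, while the 2D version is simply easier to reason about when writing constraints.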

Then we add the constraints:

for (int i = 0; i < n; i++)
{
    std::vector<IntVar *> row;
    for (int j = 0; j < n; j++)
    {
        // Cells given in the puzzle (values > 0) are fixed to that value
        if (sudoku_flat[i * n + j] > 0)
        {
            solver.AddConstraint(solver.MakeEquality(grid[i][j], sudoku_flat[i * n + j]));
        }
        row.push_back(grid[i][j]);
    }
    // All nine cells in a row must differ
    solver.AddConstraint(solver.MakeAllDifferent(row));
}

Here we add constraints for the rows. The first AddConstraint says that if we already have a value from the original problem, the variable should be equal to this value. The second says that all variables in a row should be different. Constraints for the columns and the 3x3 cells are declared in the same way.
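
The column and 3x3 constraints are not shown above, so here is a rough JavaScript sketch of which cells each "all different" constraint would cover. It does not call OR-Tools; buildUnits and isValidSolution are hypothetical names of my own, and the sketch only checks a finished grid against the same rules the solver enforces.

```javascript
const N = 9;
const BOX = 3;

// Build the 27 units (9 rows, 9 columns, 9 boxes) as lists of flat indices.
const buildUnits = () => {
  const units = [];
  for (let i = 0; i < N; i++) {
    const row = [], col = [], box = [];
    for (let j = 0; j < N; j++) {
      row.push(i * N + j);
      col.push(j * N + i);
      // Box i starts at row 3 * floor(i / 3) and column 3 * (i % 3).
      const r = BOX * Math.floor(i / BOX) + Math.floor(j / BOX);
      const c = BOX * (i % BOX) + (j % BOX);
      box.push(r * N + c);
    }
    units.push(row, col, box);
  }
  return units;
};

// A filled grid is valid when every unit holds nine distinct values in 1..9.
const isValidSolution = flat =>
  flat.length === N * N &&
  flat.every(v => Number.isInteger(v) && v >= 1 && v <= 9) &&
  buildUnits().every(unit => new Set(unit.map(i => flat[i])).size === N);

// A simple shifted pattern that happens to be a valid filled grid.
const example = Array.from({ length: N * N }, (_, i) => {
  const r = Math.floor(i / N), c = i % N;
  return ((r * BOX + Math.floor(r / BOX) + c) % N) + 1;
});
console.log(isValidSolution(example));
```

A check like this can also be handy on the JavaScript side as a sanity test of whatever comes back from the Wasm module.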

After defining the variables and setting the constraints, we set up a decision builder and execute a search for solutions:

DecisionBuilder *db = solver.MakePhase(
    grid_flat,
    solver.INT_VAR_SIMPLE,
    solver.INT_VALUE_SIMPLE
);
solver.NewSearch(db);
if (solver.NextSolution())
{
    // Copy the first solution found back into the input array
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < n; j++)
        {
            sudoku_flat[i * n + j] = (int)grid[i][j]->Value();
        }
    }
}
solver.EndSearch();

In this case we write the solution back into the original array, which is already referenced by the JavaScript integration.

The application is compiled using the following script:

# Only -lglog is shown here; the remaining libraries (ca 90 in total) are removed from the example
docker run --rm -v $(pwd):/src emscripten/emsdk \
    emcc \
    -O3 \
    --bind \
    --no-entry \
    -s EXTRA_EXPORTED_RUNTIME_METHODS='["getValue"]' \
    -s ALLOW_MEMORY_GROWTH=1 \
    -s DYNAMIC_EXECUTION=0 \
    -s TEXTDECODER=0 \
    -s MODULARIZE=1 \
    -s ENVIRONMENT='web' \
    -s EXPORT_NAME='emscripten' \
    --pre-js './pre.js' \
    -lm \
    -s ERROR_ON_UNDEFINED_SYMBOLS=0 \
    -I deps/include \
    -L deps/lib \
    -lglog \
    -o ./build/module.js \
    ./src/sudoku.cc

I will not go into details here. Most of these settings come from the Cloudflare Emscripten template. See https://emscripten.org/docs/tools_reference/emcc.html#emccdoc and https://github.com/emscripten-core/emscripten/blob/master/src/settings.js for more details. What I have changed is using emscripten/emsdk as the Docker image instead of the deprecated trzeci/emscripten that most examples reference. I also added --bind since I’m using Emscripten’s Embind for integrating WebAssembly with JavaScript. --no-entry says that the module doesn’t have a main() that can be called, and -I deps/include and -L deps/lib say where to find the dependencies. I haven’t included every OR-Tools library here, since there are around 90 of them, but the repo has the full script.

Integrating with JavaScript on a serverless endpoint

The setup for the serverless endpoint is as follows:

  • First it receives the Sudoku problem as a flat array in JSON in a POST request
  • It sets up a Wasm instance with the Sudoku solver
  • It loads the problem array into the instance’s memory
  • Then it calls instance.solve with the pointer to the array
  • Finally it reads back the solution from the instance’s memory into an array that it returns as JSON to the client
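
Since the posted array ends up being copied straight into Wasm memory, it can be worth checking it first. The following is only a sketch under my own assumptions: parsePuzzle is a hypothetical helper that is not in the repo, and I assume that 0 marks an empty cell, which matches the solver only fixing cells whose value is greater than 0.

```javascript
// Hypothetical helper: parse and validate the POST body before it reaches Wasm.
const parsePuzzle = bodyText => {
  let data;
  try {
    data = JSON.parse(bodyText);
  } catch (err) {
    return { ok: false, error: 'Body is not valid JSON' };
  }
  if (!Array.isArray(data) || data.length !== 81) {
    return { ok: false, error: 'Expected a flat array of 81 cells' };
  }
  // Assumption: 0 means "empty", 1-9 are given values.
  if (!data.every(v => Number.isInteger(v) && v >= 0 && v <= 9)) {
    return { ok: false, error: 'Cells must be integers from 0 to 9' };
  }
  return { ok: true, puzzle: data };
};

const emptyGrid = JSON.stringify(new Array(81).fill(0));
console.log(parsePuzzle(emptyGrid).ok, parsePuzzle('[1, 2, 3]').error);
```

In a request handler, a failed check would typically be turned into a 400 response instead of being passed on to the solver.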

I wanted to use Cloudflare Workers for this example. They seem to emphasize the possibilities with WebAssembly, and I wanted to test the experience. That has mostly been positive, even if they need to update their template for Emscripten. But it should be very easy to adapt this example to other serverless platforms as well (or add it to a SPA). The main work is done in a single method that is called by the function handling the request:

let instance = null;
const solve = async sudokuArray => {
    // Instantiate the Emscripten module once and reuse it across requests
    if (!instance) {
        instance = await emscripten_module;
    }
    const I32_SIZE = 4;
    // Allocate space in Wasm memory and copy the puzzle in
    const sudokuPtr = instance._malloc(sudokuArray.length * I32_SIZE);
    instance.HEAP32.set(sudokuArray, sudokuPtr / I32_SIZE);
    instance.solve(sudokuPtr);
    // Read the solution back from the same memory
    for (let v = 0; v < sudokuArray.length * I32_SIZE; v += I32_SIZE) {
        sudokuArray[v / I32_SIZE] = instance.getValue(sudokuPtr + v, 'i32');
    }
    instance._free(sudokuPtr);
    // The content type belongs in `headers`, not at the top level of the init object
    return new Response(JSON.stringify(sudokuArray), {
        status: 200,
        headers: { 'content-type': 'application/json' },
    });
};

When passing an array to a Wasm function, you have to load it into the memory of the Wasm instance and pass the pointer in the function call. In the same way, a Wasm function can’t return an array; it has to hand back a pointer. Since the Wasm solver writes the result back to the same memory space, we can just use the same pointer here.
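
The copy-in, call, copy-out pattern can be tried without any Wasm module at all. In the sketch below everything is a stand-in of my own: memory and HEAP32 play the role of the instance's memory, toyMalloc is a trivial bump allocator (not Emscripten's malloc), and toySolve just adds 1 to each element in place where the real solver would write the solution.

```javascript
// Stand-in for the Wasm linear memory: one ArrayBuffer with an Int32Array view,
// similar in spirit to instance.HEAP32.
const I32_SIZE = 4;
const memory = new ArrayBuffer(1024);
const HEAP32 = new Int32Array(memory);
let nextFree = 8; // byte offset; offset 0 is left unused, like a null pointer

// Toy bump allocator: returns a byte offset and never frees anything.
const toyMalloc = bytes => {
  const ptr = nextFree;
  nextFree += Math.ceil(bytes / 8) * 8; // keep allocations 8-byte aligned
  if (nextFree > memory.byteLength) throw new RangeError('toy heap exhausted');
  return ptr;
};

// Stand-in for the compiled function: receives a byte offset, works in place.
const toySolve = (ptr, length) => {
  for (let i = 0; i < length; i++) {
    HEAP32[ptr / I32_SIZE + i] += 1;
  }
};

const input = [5, 3, 0, 0, 7];
const ptr = toyMalloc(input.length * I32_SIZE);
HEAP32.set(input, ptr / I32_SIZE); // copy in: byte offset converted to element index
toySolve(ptr, input.length);       // the callee mutates the same memory
const start = ptr / I32_SIZE;
const output = Array.from(HEAP32.subarray(start, start + input.length));
console.log(output);
```

The thing to notice is that the pointer is a byte offset, while HEAP32 is indexed in 4-byte elements, hence the division by I32_SIZE whenever the view is used.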

Emscripten has a lot of helpful functionality for dealing with things like this. The most important thing to remember is that the memory is one single ArrayBuffer, and it is easy to overwrite data by accident. For example, instance.HEAP32 and instance.HEAPF64 are just two views of the same array buffer, so it is important to keep track of the number of bytes each type takes when addressing and allocating memory, and to stay within the allocated space. See https://becominghuman.ai/passing-and-returning-webassembly-array-parameters-a0f572c65d97 for a more thorough discussion about passing/receiving arrays with Wasm.
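
The "two views of the same buffer" point is easy to demonstrate with plain typed arrays. This sketch uses its own small ArrayBuffer instead of a real Emscripten instance; heap32 and heapF64 are just local names mimicking HEAP32 and HEAPF64.

```javascript
// Two typed-array views over the same 16 bytes.
const buffer = new ArrayBuffer(16);
const heap32 = new Int32Array(buffer);    // 4 elements of 4 bytes each
const heapF64 = new Float64Array(buffer); // 2 elements of 8 bytes each

heap32[0] = 1;
heap32[1] = 2;
heap32[2] = 3;

// Writing one double touches bytes 0-7, which are also heap32[0] and heap32[1].
heapF64[0] = 1.5;

// heap32[2] lives in bytes 8-11 and is untouched; the first two values are gone.
console.log(heap32[0], heap32[1], heap32[2]);
```

So an index that is correct for one view is generally wrong for another: the same byte offset corresponds to index offset / 4 in HEAP32 but offset / 8 in HEAPF64.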

For the integration to work this way, two methods need to be added to the Emscripten instance. getValue is added by the EXTRA_EXPORTED_RUNTIME_METHODS setting in the build script (see above). solve is added by using Embind to bind the C++ function to its JavaScript counterpart:

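#include <cstdlib>
#include <emscripten/bind.h>

// operations_research::solve is the solver routine described above
int solve(int sudoku_ptr)
{
    // Embind hands the address over as an int; cast it back to a pointer
    int *sudoku_flat = reinterpret_cast<int *>(sudoku_ptr);
    operations_research::solve(sudoku_flat);
    return EXIT_SUCCESS;
}

EMSCRIPTEN_BINDINGS(module)
{
    emscripten::function("solve", &solve);
}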

Since we can’t pass the pointer directly as a pointer in Embind at the moment, I pass it as an int and cast it to a pointer before processing it further. When the call to Wasm returns, the JavaScript part can read back the solution array from memory.

One concern with using OR-Tools on serverless is that processing time can grow really fast as the complexity of the problems increases. Cloudflare has a 10 ms CPU time cap per request on their free plan, and 50 ms on their bundled plan. 50 ms is more than enough to solve tough Sudoku problems, but not for more advanced OR problems. They are now testing a new Unbound plan in a beta program that comes without this cap, and hopefully the pricing will be acceptable for complex OR-Tools applications as well. On other platforms, the processing time will count against general timeout settings, like 10 seconds on Vercel.

Concluding remarks

The possibility of leveraging complex functionality in serverless through WebAssembly and libraries like OR-Tools represents a huge opportunity, I think. It is early days yet, but I see promise in this area with companies like Cloudflare and, for example, Fastly, who are putting Wasm directly on endpoints without it needing to be instantiated by JavaScript (https://www.fastly.com/blog/announcing-lucet-fastly-native-webassembly-compiler-runtime). Experiments like this Sudoku solver have been very encouraging so far, and this is absolutely something I look forward to exploring further.

Translated from: https://medium.com/swlh/a-suduko-solving-serverless-endpoint-using-webassembly-and-or-tools-df9f7bb10044
