Most efficient ways of building PDF files with backend and frontend JavaScript environments

Experience Feedback

Every developer has to build PDFs at some point in their career.

PDF is a very common format created by Adobe and, at its best, it is vector-based. This means that wherever you open it, on any device, the content scales properly and stays sharp at any resolution.

Let’s be honest: with today’s tools, building PDFs is, in my opinion, one of the hardest features to deliver. Things can get really complicated depending on the need, and if you can’t negotiate tradeoffs with the customer, it is going to be really time-consuming or expensive.

But first, let’s look at the big picture. PDFs can be sorted into two kinds:

  • Simple PDFs: they display text, links, emphasis, and basic assets such as images, with a fairly common layout. Think of a simple article without anything complex: one you could have written in Markdown and then exported to PDF.

  • Complex PDFs: they need metadata, encryption and permissions to prevent editing, charts, section summaries, a table of contents, automatic page numbers, or a mix of landscape and portrait pages. You might also need forms, form filling, attachments, or even PDF composition with editing, merging, and overlays.

To cover those two needs, there are mainly two ways of implementing them:

  • Some PDFs are image-based: the whole PDF, or some part of it, is a screenshot of some runtime UI, so it loses quality when you zoom in. It’s a poor trick, but it works for things like A4 printing because the size is “fixed”. You often see this kind of PDF when someone scans a book’s pages to make it available digitally.

  • The best-quality PDFs are vector-based, which means you can zoom in or print them and still get top-grade quality. That’s what a PDF is generally supposed to be.

Some techniques and libraries offer flexibility and a good developer experience, but they tend to be limited when it comes to advanced features such as a table of contents or digital signing.

⚡️ This article condenses my research and observations. It presents several ways to deal with PDFs, with pros and cons for each practice. It’s also open to your observations, so feel free to reach me on LinkedIn if you have any comments.

Oh, and one last thing! You can build PDFs on the frontend or the backend depending on your need. Let’s see that in detail.

Should I build PDFs on the Frontend / Client-side?

Building PDFs on the frontend means generating the file using only the web browser (or any other kind of frontend client, such as a mobile application).

The browser then displays it or triggers a download of the Blob (binary large object) without making any network request.
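As a minimal sketch of that last step, here is how PDF bytes could be wrapped in a Blob and offered as a download. The function names `pdfBytesToBlob` and `downloadPdf` are mine, not part of any library, and the sample bytes are only a stand-in for what a real generator would return.

```javascript
// Wrap raw PDF bytes (a Uint8Array produced by any generator) in a Blob.
function pdfBytesToBlob(bytes) {
  return new Blob([bytes], { type: 'application/pdf' });
}

// Browser only: offer the Blob as a file download without any network request.
function downloadPdf(blob, filename) {
  const url = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = url;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
  URL.revokeObjectURL(url);
}

// Stand-in bytes for illustration; a real generator would return a full PDF.
const sampleBytes = new TextEncoder().encode('%PDF-1.4\n');
const sampleBlob = pdfBytesToBlob(sampleBytes);
// In a browser: downloadPdf(sampleBlob, 'report.pdf');
```

Revoking the object URL after the click avoids keeping the bytes in memory longer than needed.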

✅ Frontend PDF generation has various benefits:

  • It saves server resources, which is especially useful when you generate a lot of PDFs, very often, for a lot of users.
  • It enables offline capabilities, for example in progressive web apps. You can export PDFs from anywhere, even on an airplane, without any internet connection.
  • It’s privacy-friendly and helps with GDPR compliance, because you can use user data without sending it over the network or even making it known to your server.

❌ But it also has a number of drawbacks:

  • You have less control over access policies because the code lives in the frontend, so it might be harder to restrict access to some features, for example those reserved for a paid plan.
  • You might need to adapt the code to the platform, because browsers neither work the same way nor offer the same capabilities. Sometimes a feature simply lacks support in some browsers.
  • Some restrictions may push you to add a backend anyway, and of course you’ll pay for it every month. Some apps are backendless, and it would be sad to have to build and host one just for that use case.

Write PDF in pure JavaScript

First, let’s see the “classic way”, using imperative programming in JavaScript.

For that, you can use something like https://pdfkit.org/ on the frontend (it also works on the backend, as it’s an isomorphic library, which is a good point!).

Here is a quick example of what you can achieve: https://pdfkit.org/demo/out.pdf

The following snippet shows how easy it is to write text, vector shapes, and SVG paths.

// Assumed setup: PDFKit and blob-stream bundled for the browser
const PDFDocument = require('pdfkit');
const blobStream = require('blob-stream');

// Assumed placeholders: filler text and the iframe that previews the result
const lorem = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit...';
const iframe = document.querySelector('iframe');

// create a document and pipe to a blob
const doc = new PDFDocument();
const stream = doc.pipe(blobStream());

// draw some text
doc.fontSize(25).text('Here is some vector graphics...', 100, 80);

// some vector graphics
doc
  .save()
  .moveTo(100, 150)
  .lineTo(100, 250)
  .lineTo(200, 250)
  .fill('#FF3300');

doc.circle(280, 200, 50).fill('#6600FF');

// an SVG path
doc
  .scale(0.6)
  .translate(470, 130)
  .path('M 250,75 L 323,301 131,161 369,161 177,301 z')
  .fill('red', 'even-odd')
  .restore();

// and some justified text wrapped into columns
doc
  .text('And here is some wrapped text...', 100, 300)
  .font('Times-Roman', 13)
  .moveDown()
  .text(lorem, {
    width: 412,
    align: 'justify',
    indent: 30,
    columns: 2,
    height: 300,
    ellipsis: true
  });

// end and display the document in the iframe to the right
doc.end();
stream.on('finish', function() {
  iframe.src = stream.toBlobURL('application/pdf');
});

Alright! Great! It’s just a matter of sequencing imperative instructions, but that’s all you get: a rather low-level API. Do you need custom charts? You’ll have to build your own with SVG and something like D3.js.

Because the API is imperative and quite low-level, it can take some time to reach the expected result, and it might require a lot of code. You’ll probably need some 2D modeling with vectors to draw what you need, and the tool won’t help you with that.
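To give a feel for how low-level the format underneath these libraries is, here is a rough sketch that writes a one-page PDF by hand, without any library. `buildMinimalPdf` is a hypothetical helper of mine; it assumes ASCII-only text (so string length equals byte length) and relies on the built-in Helvetica font. It is an illustration, not a replacement for PDFKit.

```javascript
// Build a one-page A4 PDF containing a single line of vector text.
// Assumption: `text` is ASCII only, so character offsets equal byte offsets.
function buildMinimalPdf(text) {
  // Backslash and parentheses are special inside a PDF literal string.
  const escaped = text.replace(/[\\()]/g, '\\$&');
  // Content stream: pick font F1 at 24pt, move to (72, 720), show the text.
  const content = `BT /F1 24 Tf 72 720 Td (${escaped}) Tj ET`;
  const objects = [
    '<< /Type /Catalog /Pages 2 0 R >>',
    '<< /Type /Pages /Kids [3 0 R] /Count 1 >>',
    '<< /Type /Page /Parent 2 0 R /MediaBox [0 0 595 842] ' +
      '/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>',
    '<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>',
    `<< /Length ${content.length} >>\nstream\n${content}\nendstream`,
  ];

  let pdf = '%PDF-1.4\n';
  const offsets = [];
  objects.forEach((body, index) => {
    offsets.push(pdf.length); // byte offset of this object, needed by the xref table
    pdf += `${index + 1} 0 obj\n${body}\nendobj\n`;
  });

  // Cross-reference table: one 20-byte entry per object, plus the free entry 0.
  const xrefStart = pdf.length;
  pdf += `xref\n0 ${objects.length + 1}\n`;
  pdf += '0000000000 65535 f \n';
  for (const offset of offsets) {
    pdf += `${String(offset).padStart(10, '0')} 00000 n \n`;
  }
  pdf += `trailer\n<< /Size ${objects.length + 1} /Root 1 0 R >>\n`;
  pdf += `startxref\n${xrefStart}\n%%EOF\n`;
  return pdf;
}

const handWrittenPdf = buildMinimalPdf('Hello, vector world!');
// In Node, this string could be saved with fs.writeFileSync('hello.pdf', handWrittenPdf).
```

The text is stored as drawing instructions rather than pixels, which is why it stays sharp at any zoom level. Everything beyond this (fonts, wrapping, images) is what a library like PDFKit takes off your hands.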

If you’re using React, you can achieve something similar with a declarative wrapper that provides slightly higher-level components.

Using ReactJS

The best tool in the React ecosystem is https://react-pdf.org/. It lets you design PDFs with React components and use a CSS-like language for the layout, which is great.

import React from 'react';
import { Page, Text, View, Document, StyleSheet } from '@react-pdf/renderer';

// Create styles
const styles = StyleSheet.create({
  page: {
    flexDirection: 'row',
    backgroundColor: '#E4E4E4'
  },
  section: {
    margin: 10,
    padding: 10,
    flexGrow: 1
  }
});

// Create Document Component
const MyDocument = () => (
  <Document>
    <Page size="A4" style={styles.page}>
      <View style={styles.section}>
        <Text>Section #1</Text>
      </View>
      <View style={styles.section}>
        <Text>Section #2</Text>
      </View>
    </Page>
  </Document>
);

Alright, this is a bit better and dead simple, isn’t it? It’s also easier to maintain because it relies on JSX and components, which are a great way to split up our interface.

But it comes with the same issues. It remains fairly low-level and also relies on canvas for specific needs such as charts. Here again, you’ll need some external library.

Under the hood, react-pdf uses PDFKit.

For the frontend in JavaScript, I guess tools based on PDFKit are the only viable ones, but I might have missed something. Feel free to reach me if you know a higher-level API with better support :)

We’ll see one more frontend option a bit further in this article, as it’s also a cross-platform tool :)

Or should I build PDFs on Backend / Server with NodeJS or Cloud Function?

In that case, all the logic is delegated to the server, which owns everything needed to render our PDF and send it to the client application.

✅ Backend PDF generation has various benefits:

  • It gives you more options for making PDFs, as you can rely on existing binaries written in other languages and specialized in PDF generation.
  • You have more control over everything, such as access control policies, because the server is the authority. For example, you could simply cut off access if the user hasn’t paid.
  • You can use a third-party service and control who accesses it and how much it is used.
  • You can ensure that the results are the same on all devices, since you control the execution environment. Generating on the backend generally ensures consistency across client platforms.
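To make the access-control benefit concrete, here is a small sketch of a server-side export handler. Everything in it is an assumption for illustration: the user shape (`plan: 'free' | 'paid'`), the function names, and `renderPdf`, which stands for whichever generator you end up choosing.

```javascript
// Hypothetical user shape for illustration: { id: string, plan: 'free' | 'paid' }.
function authorizePdfExport(user) {
  if (!user) return { allowed: false, status: 401 }; // not signed in
  if (user.plan !== 'paid') return { allowed: false, status: 402 }; // payment required
  return { allowed: true, status: 200 };
}

// Standard HTTP headers that make the client treat the response as a PDF download.
function pdfResponseHeaders(filename, byteLength) {
  return {
    'Content-Type': 'application/pdf',
    'Content-Disposition': `attachment; filename="${filename}"`,
    'Content-Length': String(byteLength),
  };
}

// renderPdf is a placeholder for the generator; it must resolve to a Buffer or Uint8Array.
async function handlePdfExport(user, renderPdf) {
  const auth = authorizePdfExport(user);
  if (!auth.allowed) return { status: auth.status, headers: {}, body: null };
  const pdf = await renderPdf(user);
  return {
    status: 200,
    headers: pdfResponseHeaders('report.pdf', pdf.length),
    body: pdf,
  };
}
```

The handler returns a plain object so it can be plugged into any HTTP framework; the server stays the single place where the rule "paid users only" is enforced.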

❌ But it also has a number of cons:

  • The main drawback is that the frontend needs an internet connection.
  • You need a backend, which costs money every month and which you might need to scale depending on traffic.
  • If the backend goes down, the feature goes down with it, whereas with a frontend integration everything would still work.
  • It could also be less secure, because you might need to send the backend user information that it doesn’t already collect for other purposes.

Choosing the proper environment for backend PDFs generation

Before building PDFs on the backend, you must be aware of your environment and the libraries it ships with. For example, you can’t use everything in a cloud function environment, because it might lack some binaries.

  • Do I have an Ubuntu Linux, CentOS, or even Windows server?
  • Is my code running on a bare-metal server or inside a cloud function?
  • What are my RAM and CPU capabilities?

These are some of the questions you should ask yourself. With that in mind, there are various ways of building PDFs on the server side.
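The first two questions can be partly answered from code. Here is a rough sketch, assuming Node.js; `describeRuntime` is a hypothetical helper, and the detection only covers two providers through environment variables they are known to set (`AWS_LAMBDA_FUNCTION_NAME` on AWS Lambda, `K_SERVICE` on Google Cloud Run and 2nd-gen Cloud Functions).

```javascript
// Guess the kind of runtime from well-known environment variables.
function describeRuntime(env, platform) {
  let host = 'server';
  if (env.AWS_LAMBDA_FUNCTION_NAME) host = 'aws-lambda';
  else if (env.K_SERVICE) host = 'google-cloud';
  return {
    host,
    platform, // e.g. 'linux', 'win32', 'darwin'
    // Assumption: serverless hosts rarely ship a system browser or extra binaries.
    mayLackBinaries: host !== 'server',
  };
}

const runtime = describeRuntime(process.env, process.platform);
console.log(runtime);
```

A result with `mayLackBinaries: true` is a hint to prefer pure JavaScript libraries or a package that bundles its own binaries.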

Some are based on HTML and are very useful for small use cases. Others are bare-bones PDF libraries, which are much harder to use but also more efficient for complex use cases.

You can even rely on an external service such as a SaaS product.

Using Puppeteer to generate PDFs on the backend side

Puppeteer is a Node.js library, made by Google engineers, that drives headless Chromium-based browsers. You write code that navigates the browser to some pages or injects some HTML into it, and then you can interact with the HTML document.

You then use the browser’s “Save as PDF” functionality. That way, all you have to do is design a page with CSS specific to print, or add a print stylesheet.
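As a sketch of what such a print-oriented page could look like, the hypothetical helper below builds an HTML string with `@page` and `@media print` rules. The helper names and the chosen rules are mine; with Puppeteer, the resulting string could be passed to `page.setContent(html)` before calling `page.pdf({ preferCSSPageSize: true })`.

```javascript
// Escape user-provided text before injecting it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build a small HTML document whose CSS targets the printed (PDF) output.
function buildPrintableHtml({ title, paragraphs }) {
  const body = paragraphs.map((p) => `<p>${escapeHtml(p)}</p>`).join('\n');
  return `<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>${escapeHtml(title)}</title>
<style>
  /* Page box used by the browser when printing to PDF */
  @page { size: A4; margin: 20mm; }
  @media print {
    .no-print { display: none; }
    h1 { break-after: avoid; }
    p { orphans: 3; widows: 3; }
  }
</style>
</head>
<body>
<h1>${escapeHtml(title)}</h1>
${body}
</body>
</html>`;
}

const printableHtml = buildPrintableHtml({
  title: 'Monthly report',
  paragraphs: ['Revenue went up.', 'Costs stayed flat.'],
});
```

Keeping the print rules inside the document means the same HTML can be previewed in a normal browser tab and exported with identical results.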

We end up with code like this:

const chromium = require('chrome-aws-lambda');

exports.handler = async (req, res) => {
  let browser = null;

  try {
    // Launch the headless Chromium bundled by chrome-aws-lambda
    browser = await chromium.puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath,
      headless: chromium.headless,
      ignoreHTTPSErrors: true,
    });

    // Open the page to export and print it to a PDF buffer
    const page = await browser.newPage();
    await page.goto('https://example.com');
    const pdfBuffer = await page.pdf({ printBackground: true });

    // Send the PDF back as a file download
    res.set("Content-Type", "application/pdf");
    res.set("Content-Disposition", 'attachment; filename="My-report.pdf"');
    res.status(200).send(pdfBuffer);
  } catch (error) {
    return res.status(500).send(error.message);
  } finally {
    // Always close the browser to free memory
    if (browser !== null) {
      await browser.close();
    }
  }
};

This is great but, let’s be honest, kind of hacky.

Speaking of difficulties, things might get harder when you need advanced features, such as mixing landscape and portrait pages within the PDF, generating a table of contents, or merging many PDFs together.

The browser is not supposed to be a PDF-centered tool. Exporting to PDF was originally just a convenience feature for quickly exporting a few pages. We are hijacking that user convenience into a large-scale production feature run by a server.

By doing this we are exposing ourselves to some strange behaviors.

If you choose to proceed this way, have a look at https://github.com/alixaxel/chrome-aws-lambda, a library that is meant to work in cloud functions, whereas plain Puppeteer might not…

I would recommend this method if you’re running short on time and your use case is simple, as it’s really easy to implement.

Using Gotenberg and Docker to generate PDFs

In the same fashion, there is Gotenberg, a Docker-based solution presented as a standalone API server for dealing with PDFs.
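Here is a rough sketch of how a Node.js backend could call it. It assumes Gotenberg 7 or later running locally (for example via `docker run --rm -p 3000:3000 gotenberg/gotenberg:8`), Node 18+ for the global `fetch`, `FormData`, and `Blob`, and the Chromium HTML route as I understand it from the Gotenberg documentation. The helper names are mine, so check the route against the version you deploy.

```javascript
// Build the multipart request for Gotenberg's Chromium HTML conversion route.
function buildGotenbergRequest(baseUrl, html) {
  const form = new FormData();
  // The route expects the main file to be named index.html.
  form.append('files', new Blob([html], { type: 'text/html' }), 'index.html');
  return {
    url: `${baseUrl}/forms/chromium/convert/html`,
    options: { method: 'POST', body: form },
  };
}

// Send the HTML to Gotenberg and resolve with the PDF bytes.
async function htmlToPdf(baseUrl, html) {
  const { url, options } = buildGotenbergRequest(baseUrl, html);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`Gotenberg answered ${response.status}`);
  return Buffer.from(await response.arrayBuffer());
}

// Usage (needs a running Gotenberg container):
// htmlToPdf('http://localhost:3000', '<h1>Hello</h1>').then((pdf) => console.log(pdf.length));
```

Compared with Puppeteer, the browser lives in its own container, so your application code only speaks HTTP and has no binary to ship.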

It’s a great tool and you should definitely have a look at it!

Using the “pdf-lib” library

After my research, this is probably the most promising library for dealing with PDFs, as it supports a wide range of use cases, including forms, metadata, and overlays.

Oh, and yes, I’m covering it in the backend part, but it’s CROSS-PLATFORM:

Written in TypeScript and compiled to pure JavaScript with no native dependencies. Works in any JavaScript runtime, including browsers, Node, Deno, and even React Native.

By merging two different PDFs, we can mix different orientations in one document. You can think of this library as PDFKit with some additional features that avoid relying on canvas for the first use case.
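The merging idea can be sketched as follows. `mergePdfs` is a hypothetical helper of mine; it receives pdf-lib’s `PDFDocument` class as a parameter so the snippet itself has no hard dependency, and it relies on the pdf-lib calls `PDFDocument.create`, `PDFDocument.load`, `copyPages`, `getPageIndices`, `addPage`, and `save`.

```javascript
// Merge several PDFs (given as byte arrays) into one document.
// Each page keeps its own size and orientation, so a portrait PDF and a
// landscape PDF can end up side by side in the same file.
async function mergePdfs(PDFDocument, sources) {
  const merged = await PDFDocument.create();
  for (const bytes of sources) {
    const source = await PDFDocument.load(bytes);
    const pages = await merged.copyPages(source, source.getPageIndices());
    pages.forEach((page) => merged.addPage(page));
  }
  return merged.save(); // resolves to a Uint8Array
}

// Usage, assuming pdf-lib is installed and both inputs are PDF byte arrays:
// const { PDFDocument } = require('pdf-lib');
// const bytes = await mergePdfs(PDFDocument, [portraitBytes, landscapeBytes]);
```

Because the pages are copied rather than re-rendered, the content stays vector-based after the merge.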

⚡️ This is a really great tool, and I guess it’s currently the winner among free and open-source solutions. It’s also worth noting that it’s cross-platform.

Using third-party service to generate PDFs on NodeJS

The last option in this article is SaaS. The thing is that SaaS is expensive most of the time, as you pay to save integration time.

SaaS also has the issue of tight coupling to the external service: you depend on its potentially changing policies and pricing.

But SaaS products are generally easy to integrate and reliable, in the sense that a dedicated team works on them, and only on them, probably with a lot of testing tools. They also provide support if any issue is found.

Here are some tools that look promising:

  • https://pdfgeneratorapi.com/

  • https://docraptor.com/ (based on HTML, but with great features such as mixed page layouts)

  • https://www.docmosis.com/

If you can afford any of those, this might be a good solution for you!

Takeaway

Think about your need and how far you can make tradeoffs on it. Think about the essential questions:

  • Does it need to work offline?
  • How complex is my PDF file?
  • Do I have a backend? Is it a server or a set of cloud functions?
  • How much time do I have to build it?
  • Can I pay to bring a third-party service into the project?

With that in mind, you can choose whether to go server-side or client-side. Then you can use a technique like the Puppeteer one or write a custom implementation. If you’re rolling in money, you can even hand the job over to an external service that takes responsibility for it.
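The questions above can be turned into a tiny decision helper. This is only my reading of the article squeezed into code: the option names, the order of the rules, and the returned labels are assumptions, not a standard.

```javascript
// Hypothetical decision helper based on the takeaway questions.
// complexity is 'simple' or 'complex'; the other options are booleans.
function choosePdfStrategy({ mustWorkOffline, complexity, hasBackend, canPayForSaas, shortOnTime }) {
  // Offline use or no backend at all: generation has to happen on the client.
  if (mustWorkOffline || !hasBackend) {
    return complexity === 'complex' ? 'frontend: pdf-lib' : 'frontend: PDFKit or react-pdf';
  }
  // Money available but little time: delegate to an external service.
  if (canPayForSaas && shortOnTime) return 'backend: third-party SaaS';
  // Simple documents render fine from HTML.
  if (complexity === 'simple') return 'backend: Puppeteer or Gotenberg';
  return 'backend: pdf-lib';
}

console.log(choosePdfStrategy({
  mustWorkOffline: false,
  complexity: 'simple',
  hasBackend: true,
  canPayForSaas: false,
  shortOnTime: true,
}));
```

Real projects will weigh these answers differently, so treat the function as a checklist you can adapt rather than a rule.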

If you’re French: I invite you to discover ⚡️ Coding Spark and to subscribe to my tech newsletter to receive free tech content!

Source: https://medium.com/javascript-in-plain-english/most-efficient-ways-for-building-pdfs-files-with-backend-and-frontend-javascript-environment-68056f73257