"Global Commands" & robotjs: Trying Out Desktop Automation

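This post covers how to implement a global command, how to use robotjs for desktop automation, and a few pitfalls I ran into along the way. The aim is to share experience and spark ideas; space is limited, so I only touch on the key points. The original is on my WeChat official account, in the article 《「全局命令」& robotjs 体验桌面自动化》.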

Steps to implement a global command

1. Create a new project and generate a package.json file

For example, create a project named weakup and run npm init -y in its root directory to generate package.json.

2. Create a bin folder, and inside it a weakup.js file

This file is a Node executable script, for example:

#!/usr/bin/env node

console.log("Hello, Weakup");

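The first line, #!/usr/bin/env node, is a shebang. It tells the operating system to run this file with the node found through the PATH environment variable, which is what lets the file be executed directly. To run the file with Python instead, change it to #!/usr/bin/env python.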

3. Add a `bin` field to package.json

Declare the global command and the file it runs in the bin field, for example:

{
  "name": "weakup",
  "type": "module",
  "bin": {
    "weakup": "bin/weakup.js"
  },
  "files": [
    "bin"
  ]
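}

"weakup": "bin/weakup.js" means the global command is weakup and the file it runs is bin/weakup.js. Once the command is registered (step 4), typing weakup on the command line and pressing Enter runs bin/weakup.js.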

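4. Run `npm link` to register the `weakup` command on the system

npm link simulates the state of a package after installation: it creates a shortcut (a symbolic link) on the system, so the local package can be used directly as if it had been installed.

Run npm link in the project root to register the weakup command on the system. After that, running weakup from any terminal executes bin/weakup.js.

Note that the name field is required; without it, npm link reports an error. On a Mac, you may also need to run it with sudo.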
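Once the command is linked, anything typed after weakup reaches the script through process.argv. The following is a minimal sketch of how bin/weakup.js might read a subcommand; the subcommand names (`color`, `shot`, `ssh`) and the `parseCommand` helper are hypothetical and only loosely mirror the use cases later in this post.

```javascript
// Hypothetical subcommands for illustration; the post itself only defines `weakup`.
const COMMANDS = ["color", "shot", "ssh"];

// process.argv is [nodePath, scriptPath, ...userArgs], so user input starts at index 2.
function parseCommand(argv) {
  const [command = "help", ...args] = argv.slice(2);
  // Anything that is not a known subcommand falls back to "help".
  return COMMANDS.includes(command)
    ? { command, args }
    : { command: "help", args: [] };
}

const parsed = parseCommand(process.argv);
console.log("command:", parsed.command, "args:", parsed.args);
```

With this in place, running `weakup shot --full` would hand `shot` and `["--full"]` to whatever logic the script dispatches to.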

Reference: Developing a Front-End CLI Command from Scratch

5. Use the `inquirer` library to add command-line interaction

import inquirer from "inquirer";
const questions = [{type:'list', ...}, {type:'input', ...}]
inquirer.prompt(questions).then((res) => {
  console.log("Your choices:", res);
}).catch(console.error)

Reference: inquirer documentation
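The questions array above is elided, so here is a sketch of what a filled-in version could look like. The question names, messages, and choices are hypothetical; `type`, `name`, `message`, `choices`, `default`, and `validate` are standard inquirer question fields. The block does not import inquirer, so it can run on its own, and the prompt call appears only as a comment.

```javascript
// Hypothetical questions; each `name` becomes a key in the answers object.
const questions = [
  {
    type: "list",
    name: "action",
    message: "What do you want to do?",
    choices: ["color", "shot"],
  },
  {
    type: "input",
    name: "filename",
    message: "Output file name:",
    default: "screenshot.png",
    // inquirer treats `true` as valid and a string as the error message to display.
    validate: (value) => value.trim().length > 0 || "File name must not be empty",
  },
];

// Turn the answers object into a one-line summary.
function describeAnswers(answers) {
  return `action=${answers.action}, filename=${answers.filename}`;
}

// With inquirer installed, the call keeps the same shape as the snippet above:
// inquirer.prompt(questions).then((answers) => console.log(describeAnswers(answers)));
console.log(describeAnswers({ action: "shot", filename: "screenshot.png" }));
```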

Basic usage of robotjs

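robotjs enables desktop automation by providing control over the mouse, the keyboard, and the screen. It has relatively few methods and the documentation is clear, so looking things up as you go is enough. One thing to note: after a mouse move, a click, or a typing action, it is best to add a short delay, otherwise odd behavior may occur. Reference: robotjs documentation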
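To make the "delay after each action" advice concrete, here is a minimal sketch. `moveMouse`, `mouseClick`, and `typeString` are real robotjs methods, but `clickAndType`, `sleepSync`, and the 200 ms default are assumptions for illustration. The robot object is passed in as a parameter, so the sequence can be tried with a logging stand-in before real input is involved.

```javascript
// Block the current thread for `ms` milliseconds (synchronous, Node.js only).
function sleepSync(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Move, click, and type, pausing after every action.
function clickAndType(robot, x, y, text, pauseMs = 200) {
  robot.moveMouse(x, y);
  sleepSync(pauseMs);
  robot.mouseClick();
  sleepSync(pauseMs);
  robot.typeString(text);
  sleepSync(pauseMs);
}

// Stand-in that only logs; pass the real robotjs module instead to drive the desktop.
const loggingRobot = {
  moveMouse: (x, y) => console.log("moveMouse", x, y),
  mouseClick: () => console.log("mouseClick"),
  typeString: (s) => console.log("typeString", s),
};
clickAndType(loggingRobot, 100, 200, "hello", 50);
```

robotjs also exposes `robot.setMouseDelay(ms)` and `robot.setKeyboardDelay(ms)`, which may be enough when a single global delay suits the whole flow.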

Installation:

npm install robotjs

Mac prerequisites (if it fails to run)

Xcode
Command Line Tools

Windows prerequisites

Python: download and install Python 3.11.x (download page: https://www.python.org/downloads/windows/; after installation, the installed version can be found under C:\Users\xxx\AppData\Local\Programs\Python)
C++: download and install Visual Studio (Community edition: https://visualstudio.microsoft.com/zh-hans/downloads/), and make sure the "Desktop development with C++" workload is selected during installation
npm install --global --production windows-build-tools

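A few of my own use cases

Screen color picker: shows the mouse position and the screen color at that position in real time;
Stylized screenshot: captures the whole screen, processes the RGBA data, and saves it as a PNG file (drag-to-select capture is not implemented yet);
Automatic SSH login: opens the system terminal and runs a preset server login sequence;
Douyin unattended livestream assistant: auto-clicks the page to keep the session alive while watching a livestream in the browser / rotates the 3D model during a 3D unattended livestream / simulates typing to auto-reply to chat in the live room (requires an HTTP service connected to GPT);
Screen recording: capture frames with robotjs, save them as images, then assemble them into a video with ffmpeg (todo);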
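As an illustration of the first use case, the core of a color picker could look like the sketch below. `getMousePos` and `getPixelColor` are real robotjs methods (the latter returns a hex string such as "ff8000"); `hexToRgb`, `sampleUnderCursor`, and the stub object are hypothetical names used so the block runs without the native module.

```javascript
// Convert "rrggbb" (the format getPixelColor returns) into numeric channels.
function hexToRgb(hex) {
  const n = parseInt(hex, 16);
  return { r: (n >> 16) & 0xff, g: (n >> 8) & 0xff, b: n & 0xff };
}

// Read the cursor position, then the color of the pixel under it.
function sampleUnderCursor(robot) {
  const { x, y } = robot.getMousePos();
  const hex = robot.getPixelColor(x, y);
  return { x, y, hex: `#${hex}`, ...hexToRgb(hex) };
}

// Stub with fixed values; pass the real robotjs module to sample the actual screen.
const stubRobot = {
  getMousePos: () => ({ x: 10, y: 20 }),
  getPixelColor: () => "ff8000",
};
console.log(sampleUnderCursor(stubRobot));
```

For a real-time display, the same call could be wrapped in something like `setInterval(() => console.log(sampleUnderCursor(robot)), 200)`.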

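I have used puppeteer before, which drives a headless browser and is more powerful. It is convenient for crawling, screenshots, and automated testing, and it does not take over the user's mouse and keyboard while running; those are its strengths. robotjs has a clear strength of its own: it operates at the desktop level, so in theory, with a carefully choreographed flow, it can simulate real user actions in any interface of any application.

_I find robotjs powerful, but I have not yet thought of a scenario that is both really interesting and practical, so I would love to pool ideas._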

A simple implementation of "capture and save"

import fs from "fs";
import robot from "robotjs";
import { PNG } from "pngjs";
import { fileURLToPath } from "url";
import path, { dirname } from "path";

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

// Screenshot size
const dpr = 2; // device pixel ratio of your own screen (2 on a HiDPI/Retina display)
const size = robot.getScreenSize();
console.log("size:", size);

// Capture the screen
const screenshot = robot.screen.capture(0, 0, size.width, size.height);
console.log("screenshot.image.length:", screenshot.image.length);

// Create the PNG object scaled by dpr, which fixes the incomplete image
const png = new PNG({
  width: size.width * dpr,
  height: size.height * dpr,
});
console.log("\npng.data.length:", png.data.length);

for (let y = 0; y < png.height; y++) {
  for (let x = 0; x < png.width; x++) {
    let idx = (png.width * y + x) * 4;
    let r = screenshot.image[idx];
    let g = screenshot.image[idx + 1];
    let b = screenshot.image[idx + 2];
    let a = screenshot.image[idx + 3];

    // Swap r and b (the capture buffer is in BGRA order)
    png.data[idx] = b;
    png.data[idx + 1] = g;
    png.data[idx + 2] = r;
    png.data[idx + 3] = a;
  }
}

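// Save the PNG object to an image file
const filename = "screenshot.png";
const savePath = path.join(__dirname, filename);

// Write to savePath (next to this script) and log only once the file is fully written
png
  .pack()
  .pipe(fs.createWriteStream(savePath))
  .on("finish", () => console.log(`Screenshot saved to ${savePath}`));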

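Pitfalls

Here are a few problems I ran into while implementing "save a screenshot as an image":

1. How do you save the `image` of the `Bitmap` returned by `robot.screen.capture` as a PNG file?

robotjs does not provide a method for this, so for now I use the pngjs library. Reference: pngjs documentation

2. The PNG saved with `pngjs` is blurry and incomplete?

Here is the first version of the code: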

const screenshot = robot.screen.capture(0, 0, size.width, size.height);
const png = new PNG({
  width: size.width,
  height: size.height,
});

console.log("screenshot.image.length:", screenshot.image.length);
console.log("png.data.length:", png.data.length);

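The log output above shows that the two lengths differ: screenshot.image returned by robot.screen.capture has length x, while png.data has length x/2 in my notes (with a dpr of 2 on both axes the expected ratio would be 4, since width and height both double). Either way, the capture holds more pixels than the PNG has room for, which explains the incomplete image and points to a dpr of 2.

Changing the code as follows solves the problem.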

const dpr = 2; // assumes a HiDPI screen
const png = new PNG({
  width: size.width * dpr,
  height: size.height * dpr,
});
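Instead of hard-coding dpr, the scale can be derived from the capture itself, because the Bitmap returned by robot.screen.capture carries its own `width` and `height`. The sketch below uses synthetic numbers in place of a real capture; `captureScale` and the sample dimensions are hypothetical.

```javascript
// Ratio between captured pixels and logical screen size (2 on a typical Retina display).
function captureScale(bitmap, screenSize) {
  return {
    x: bitmap.width / screenSize.width,
    y: bitmap.height / screenSize.height,
  };
}

// Synthetic values standing in for robot.screen.capture() and robot.getScreenSize().
const sampleBitmap = { width: 2880, height: 1800 };
const sampleScreen = { width: 1440, height: 900 };
console.log(captureScale(sampleBitmap, sampleScreen));
```

In the same spirit, the PNG could be sized directly from the capture with `new PNG({ width: screenshot.width, height: screenshot.height })`, which avoids guessing the dpr at all.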

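3. The generated PNG has red and blue swapped?

When copying the robotjs capture data into the png's data, the r and b channels of each pixel need to be swapped, for example: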

for (let y = 0; y < png.height; y++) {
  for (let x = 0; x < png.width; x++) {
    let idx = (png.width * y + x) * 4;
    let r = screenshot.image[idx];
    let g = screenshot.image[idx + 1];
    let b = screenshot.image[idx + 2];
    let a = screenshot.image[idx + 3];

    // Swap r and b (for some reason the order is bgra rather than rgba)
    png.data[idx] = b;
    png.data[idx + 1] = g;
    png.data[idx + 2] = r;
    png.data[idx + 3] = a; // alpha (opaque)
  }
}

A question for readers: why is the order bgra rather than rgba?
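A likely explanation, offered as an assumption rather than something confirmed here, is that robotjs hands back the buffer in the operating system's native capture layout, which commonly stores 32-bit pixels as B, G, R, A on little-endian machines. The sketch below does the conversion as a standalone function and also honors the Bitmap's `byteWidth` and `bytesPerPixel` fields, in case rows are padded. `bgraToRgba` and the sample data are hypothetical.

```javascript
// Copy a BGRA bitmap into a tightly packed RGBA buffer.
// `bitmap` mirrors the robotjs Bitmap fields: width, height, byteWidth, bytesPerPixel, image.
function bgraToRgba(bitmap) {
  const { width, height, byteWidth, bytesPerPixel, image } = bitmap;
  const out = Buffer.alloc(width * height * 4);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const src = y * byteWidth + x * bytesPerPixel; // source rows may be padded
      const dst = (y * width + x) * 4;               // destination rows are not
      out[dst] = image[src + 2];     // R comes from the third source byte
      out[dst + 1] = image[src + 1]; // G stays in place
      out[dst + 2] = image[src];     // B comes from the first source byte
      out[dst + 3] = image[src + 3]; // A is copied as-is
    }
  }
  return out;
}

// Synthetic 2x1 bitmap with 4 bytes of row padding: a blue pixel, then a red pixel (BGRA).
const sampleCapture = {
  width: 2,
  height: 1,
  byteWidth: 12,
  bytesPerPixel: 4,
  image: Buffer.from([255, 0, 0, 255, 0, 0, 255, 255, 0, 0, 0, 0]),
};
console.log([...bgraToRgba(sampleCapture)]);
```

With a PNG sized to the capture, the result could then be copied in with `bgraToRgba(screenshot).copy(png.data)`.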

4. `ES module` mode does not support `__dirname`; the error looks like this:

const savePath = path.join(__dirname, "screenshot.png");
^

ReferenceError: __dirname is not defined in ES module scope
This file is being treated as an ES module because it has a '.js' file extension and '/Users/xxx/package.json' contains "type": "module". To treat it as a CommonJS script, rename it to use the '.cjs' file extension.

Solution:

import { fileURLToPath } from "url";
import path, { dirname } from "path";

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

const savePath = path.join(__dirname, "screenshot.png");

THE END