Reshaping tRPC Requests for a Spring Backend in the Tyk Gateway with a WebAssembly Middleware Written in Zig

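A technical decision in production is rarely about picking the "best" technology. More often it is about assembling the "right" combination of technologies, under existing constraints, to solve one thorny, concrete problem. The challenge we faced: a large, stable, but aging Spring Framework monolith that exposes a RESTful API designed years ago, structurally complex and redundant. Meanwhile, a new frontend team was building an experience-driven web application and had chosen tRPC for end-to-end type safety and developer productivity.

Making the tRPC client adapt directly to this aging API is painful and inefficient. Data structures do not match, several endpoints have to be aggregated, and fields need to be renamed or computed. The conventional solution is to put a BFF (Backend for Frontend) microservice between the two, but that brings new operational cost, extra network latency, and another potential point of failure. Our goal was a high-performance, low-latency request transformation layer without introducing a new service. The API gateway is the natural host for such a layer. We use Tyk, but its scripted middleware (such as JavaScript or Python plugins) could not meet our strict latency requirements when doing heavy JSON parsing and serialization.

The final architectural decision: at the Tyk gateway layer, use its WebAssembly (WASM) support to run a request transformation middleware written in Zig, a systems programming language. The core trade-off is higher development complexity in exchange for runtime performance and a leaner architecture.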

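Option Comparison and Selection Rationale

Before diving into the implementation, it is worth stating clearly why we rejected the alternatives that look simpler.

Option A: BFF microservice (Node.js/Spring Boot)

Advantages:
- A uniform technology stack that the team already knows well.
- Decoupled from core business logic; can be deployed and scaled independently.
- Testing and debugging are relatively straightforward.
Disadvantages:
- Network latency: adds one full network round trip (Client -> Gateway -> BFF -> Spring Service). In a low-latency scenario this is unacceptable.
- Operational complexity: the BFF needs its own deployment pipeline, monitoring, logging, and alerting, which adds to the infrastructure burden.
- Resource consumption: even a lightweight BFF occupies its own compute and memory.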

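Option B: Tyk's built-in JavaScript middleware

Advantages:
- Simple to configure; no compilation, the script is referenced directly from the API definition.
- No extra network hop; the logic runs inside the gateway process.
Disadvantages:
- Performance bottleneck: under high concurrency, the JS engine (otto) is a significant bottleneck for heavy JSON work. otto is a JavaScript interpreter written in Go with no JIT compiler, and the objects it creates are still subject to Go's garbage collector.
- No type safety: JavaScript's dynamic typing makes transformations of complex, nested data structures error-prone.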

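Final choice: Zig + WebAssembly middleware

Advantages:
- Performance: Zig compiles to compact WASM bytecode with no garbage collector and no language runtime inside the module. Memory management is manual, which allows fine-grained control over performance.
- Safety: WASM runs in an isolated sandbox, so a bug in the middleware surfaces as a trap inside the sandbox instead of taking down the whole gateway process.
- Portability: WASM is a standard bytecode format and can, in principle, run in any host environment that supports it.
- Strong type system: Zig's type system catches a large class of data-handling errors at compile time.
Disadvantages:
- Development complexity: you have to work with raw memory pointers, understand the Tyk WASM ABI (application binary interface), and manage memory by hand.
- Hard to debug: debugging a WASM module that runs inside Tyk (a Go program) is much harder than debugging a standalone service.
- Ecosystem: the Zig and Tyk WASM ecosystems are relatively young, with few reference cases and libraries.

This is a typical architectural trade-off: we accept complexity during development in exchange for high performance and low operational overhead in production.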

Architecture and Data Flow

The full request lifecycle is as follows:

sequenceDiagram
    participant Client as tRPC Client
    participant Tyk as Tyk Gateway
    participant WASM as Zig WASM Middleware
    participant Spring as Spring Boot Service

    Client->>Tyk: Send tRPC-style POST request (JSON)
    Tyk->>WASM: Invoke middleware, pass request body
    activate WASM
    WASM-->>WASM: 1. Parse tRPC-style JSON
    WASM-->>WASM: 2. Transform data structure
    WASM-->>WASM: 3. Serialize to Spring-style JSON
    WASM->>Tyk: Return transformed request body
    deactivate WASM
    Tyk->>Spring: Forward transformed request
    activate Spring
    Spring-->>Tyk: Return response
    deactivate Spring
    Tyk-->>Client: Return response to client

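Core Implementation Overview

We build the components of the system step by step.

1. Backend: the legacy Spring Framework service

This service represents our existing system. It accepts a UserLegacyPayload object in a specific format.

Core dependencies in pom.xml: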

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
    </dependency>
</dependencies>

DTO (Data Transfer Object) definition:

// src/main/java/com/example/legacy/dto/UserLegacyPayload.java
package com.example.legacy.dto;

// Format expected by the legacy system: flat, with verbose field names
public class UserLegacyPayload {
    private String userIdentifier; // maps to tRPC's id
    private String userProfileFullName; // maps to tRPC's profile.name
    private int userProfileAge; // maps to tRPC's profile.age
    private String userAccountCreationSource; // hard-coded or computed in the WASM middleware

    // Getters and Setters ...
}

Controller endpoint:

// src/main/java/com/example/legacy/controller/LegacyUserController.java
package com.example.legacy.controller;

import com.example.legacy.dto.UserLegacyPayload;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import java.util.Map;

@RestController
@RequestMapping("/api/legacy/user")
public class LegacyUserController {

    @PostMapping("/create")
    public ResponseEntity<?> createUser(@RequestBody UserLegacyPayload payload) {
        System.out.println("Received legacy payload: " + payload.toString());

        // In a real project, complex business logic would go here...
        if (payload.getUserIdentifier() == null || payload.getUserIdentifier().isEmpty()) {
            return ResponseEntity.badRequest().body(Map.of("status", "error", "message", "userIdentifier is missing"));
        }

        return ResponseEntity.ok(Map.of("status", "created", "userId", payload.getUserIdentifier()));
    }
}

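This Spring service is straightforward; it defines the target data structure we have to adapt to.

2. Frontend API layer: the tRPC service

The frontend team uses tRPC to define their API. We simulate this tRPC backend on a simple Node.js server. Note that this service is never called directly; it exists only to define the API schema the frontend expects.

tRPC router definition: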

// src/router.ts
import { initTRPC } from '@trpc/server';
import { z } from 'zod';

const t = initTRPC.create();

export const appRouter = t.router({
  // A procedure the frontend expects to call
  createUser: t.procedure
    .input(
      // The more modern, more structured data model the frontend wants to send
      z.object({
        id: z.string().uuid(),
        profile: z.object({
          name: z.string(),
          age: z.number().min(18),
        }),
        metadata: z.record(z.any()).optional(),
      })
    )
    .mutation(async ({ input }) => {
      // In a real deployment, a tRPC resolver would call an external service.
      // Here the request is intercepted and transformed by Tyk, so this resolver never runs.
      // It exists mainly to generate the client-side type definitions.
      console.log('This resolver should never be called in our architecture.');
      return { status: 'ok', userId: input.id };
    }),
});

export type AppRouter = typeof appRouter;
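
One consequence of this setup is worth spelling out: because the resolver never runs, the zod checks on `input` never run on a server either, so the constraints (UUID id, age of at least 18) have to be enforced somewhere else, for example in the middleware or in the Spring service. The following is a minimal, dependency-free sketch of what such a check could look like. The names `validateCreateUserInput`, `ValidationResult`, and `UUID_SHAPE` are hypothetical, and the regular expression only approximates what `z.string().uuid()` accepts.

```typescript
// Hypothetical mirror of the zod schema above, written without dependencies.
interface CreateUserInput {
  id: string;
  profile: { name: string; age: number };
  metadata?: Record<string, unknown>;
}

type ValidationResult =
  | { ok: true; value: CreateUserInput }
  | { ok: false; error: string };

// Generic 8-4-4-4-12 hex shape; an approximation of zod's UUID check.
const UUID_SHAPE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function validateCreateUserInput(raw: unknown): ValidationResult {
  if (!isRecord(raw)) return { ok: false, error: "body must be a JSON object" };

  const id = raw.id;
  if (typeof id !== "string" || !UUID_SHAPE.test(id)) {
    return { ok: false, error: "id must be a UUID" };
  }

  const profile = raw.profile;
  if (!isRecord(profile)) return { ok: false, error: "profile must be an object" };

  const name = profile.name;
  if (typeof name !== "string") return { ok: false, error: "profile.name must be a string" };

  const age = profile.age;
  // `!(age >= 18)` also rejects NaN.
  if (typeof age !== "number" || !(age >= 18)) {
    return { ok: false, error: "profile.age must be a number >= 18" };
  }

  const value: CreateUserInput = { id, profile: { name, age } };
  const metadata = raw.metadata;
  if (metadata !== undefined) {
    if (!isRecord(metadata)) return { ok: false, error: "metadata must be an object" };
    value.metadata = metadata;
  }
  return { ok: true, value };
}
```

The same checks would translate to a handful of comparisons in the Zig module after parsing; where exactly they belong is a design choice the original setup leaves open.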

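The JSON payload the frontend sends through the tRPC client looks like this (assuming a non-batching HTTP link and no data transformer such as superjson):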

{
  "id": "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
  "profile": {
    "name": "John Doe",
    "age": 30
  }
}

Our goal is to convert it, inside the WASM middleware, into the format the Spring service expects:

{
  "userIdentifier": "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
  "userProfileFullName": "John Doe",
  "userProfileAge": 30,
  "userAccountCreationSource": "api-gateway-wasm"
}
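
Before writing the Zig version, it can help to pin the mapping down as a reference implementation on the frontend side of the stack. The sketch below expresses the two shapes as TypeScript types and the field-by-field mapping as a pure function. `TrpcCreateUserInput`, `toLegacyPayload`, and `transformBody` are hypothetical names introduced here for illustration; only the field names and the "api-gateway-wasm" constant come from the payloads above.

```typescript
// Shape sent by the tRPC client (see the router's input schema).
interface TrpcCreateUserInput {
  id: string;
  profile: { name: string; age: number };
  metadata?: Record<string, unknown>;
}

// Shape expected by the legacy Spring endpoint.
interface UserLegacyPayload {
  userIdentifier: string;
  userProfileFullName: string;
  userProfileAge: number;
  userAccountCreationSource: string;
}

const CREATION_SOURCE = "api-gateway-wasm";

// Flatten and rename; `metadata` has no counterpart and is dropped.
function toLegacyPayload(input: TrpcCreateUserInput): UserLegacyPayload {
  return {
    userIdentifier: input.id,
    userProfileFullName: input.profile.name,
    userProfileAge: input.profile.age,
    userAccountCreationSource: CREATION_SOURCE,
  };
}

// Parse -> transform -> serialize, the same three steps the middleware performs.
// The cast trusts the input; validation is assumed to happen elsewhere.
function transformBody(rawBody: string): string {
  const input = JSON.parse(rawBody) as TrpcCreateUserInput;
  return JSON.stringify(toLegacyPayload(input));
}
```

A function like this can double as a test oracle: feed the same request body to it and to the WASM module, then compare the two outputs.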

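3. Core: the Zig WebAssembly middleware

This is the heart of the architecture. We write the WASM module in Zig.

Project structure:

.
├── build.zig         # Zig build script
└── src
    └── main.zig      # Source code

build.zig build script:
This script compiles our Zig code into a Tyk-compatible WASM module.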

const std = @import("std");

// Written against the Zig 0.12/0.13 build API (b.path, b.resolveTargetQuery).
pub fn build(b: *std.Build) void {
    const target = b.resolveTargetQuery(.{
        .cpu_arch = .wasm32,
        .os_tag = .freestanding,
    });

    const exe = b.addExecutable(.{
        .name = "tyk-zig-transformer",
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = .ReleaseSmall,
        // Drop debug info to reduce the size of the final .wasm file.
        .strip = true,
    });

    // A freestanding WASM module has no `main`/`_start`, so disable the entry point.
    exe.entry = .disabled;

    // Export the functions the host is expected to call. Without this, the
    // linker may discard them because nothing inside the module references them.
    exe.root_module.export_symbol_names = &.{
        "tyk_wasm_request_handler",
        "tyk_wasm_allocate",
        "tyk_wasm_free",
    };

    // Plain `zig build` installs zig-out/bin/tyk-zig-transformer.wasm.
    b.installArtifact(exe);

    // `zig build copy-wasm --prefix ../tyk` places the module at
    // ../tyk/middleware/tyk-zig-transformer.wasm. This replaces the original
    // `cp` command, which only worked on hosts that ship `cp`.
    const copy_wasm = b.addInstallFile(exe.getEmittedBin(), "middleware/tyk-zig-transformer.wasm");
    const copy_step = b.step("copy-wasm", "Copy wasm to tyk middleware dir");
    copy_step.dependOn(&copy_wasm.step);
}

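src/main.zig source:
This is where the transformation logic lives. The code has to handle memory carefully, because it exchanges data with Tyk's Go runtime through the module's linear memory, which the host reads and writes directly.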

const std = @import("std");
const json = std.json;

// Host functions imported from the Tyk host. `extern "env"` sets the WASM
// import module name to "env". The names and signatures are the ABI this
// article assumes; check them against the Tyk build you run.
extern "env" fn tyk_wasm_log(ptr: [*]const u8, len: u32) void;
extern "env" fn tyk_wasm_get_request_body(ptr: *[*]u8, len: *u32) u32;
extern "env" fn tyk_wasm_set_request_body(ptr: [*]const u8, len: u32) u32;

// libc is not available on wasm32-freestanding, so std.heap.c_allocator cannot
// be used. std.heap.wasm_allocator grows WASM linear memory directly.
const backing_allocator = std.heap.wasm_allocator;

// Exported to the host for memory management. The host calls
// `tyk_wasm_allocate` to reserve space for data it passes to us and
// `tyk_wasm_free` to release it. Returns null if allocation fails.
export fn tyk_wasm_allocate(size: u32) ?[*]u8 {
    const buf = backing_allocator.alloc(u8, size) catch return null;
    return buf.ptr;
}

export fn tyk_wasm_free(ptr: ?[*]u8, size: u32) void {
    // Rebuild the slice from pointer + length and free it; ignore null.
    const p = ptr orelse return;
    backing_allocator.free(p[0..size]);
}

// Helper for logging from WASM. Makes debugging much easier.
fn log(msg: []const u8) void {
    tyk_wasm_log(msg.ptr, @intCast(msg.len));
}

// Data structures matching the JSON formats.
const TrpcProfile = struct {
    name: []const u8,
    age: u8,
};

const TrpcPayload = struct {
    id: []const u8,
    profile: TrpcProfile,
};

// std.json serializes struct fields by name, in declaration order, so no
// custom jsonStringify is needed. The default value is emitted as well.
const SpringPayload = struct {
    userIdentifier: []const u8,
    userProfileFullName: []const u8,
    userProfileAge: u8,
    userAccountCreationSource: []const u8 = "api-gateway-wasm",
};

// This is the main entry point called by Tyk for each request.
export fn tyk_wasm_request_handler() u32 {
    log("Zig WASM middleware: request handler invoked.");

    // 1. Get the request body from the Tyk host.
    var body_ptr: [*]u8 = undefined;
    var body_len: u32 = 0;
    if (tyk_wasm_get_request_body(&body_ptr, &body_len) != 0) {
        log("Failed to get request body.");
        return 1; // Return non-zero for error
    }
    const request_body = body_ptr[0..body_len];

    // 2. Parse the incoming tRPC-style JSON.
    // An arena allocator owns everything allocated for this request, so all
    // of it is released at once when the handler returns.
    var arena = std.heap.ArenaAllocator.init(backing_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    // The "leaky" parse variant fits an arena: nothing is freed individually.
    // Unknown fields such as the optional `metadata` are skipped rather than
    // failing the parse.
    const trpc_payload = json.parseFromSliceLeaky(TrpcPayload, allocator, request_body, .{
        .ignore_unknown_fields = true,
    }) catch |err| {
        // std.debug.print has no stderr to write to on freestanding WASM,
        // so errors go through the host's log function instead.
        log("JSON parsing failed:");
        log(@errorName(err));
        return 1;
    };

    log("Successfully parsed tRPC payload.");

    // 3. Transform data and create the Spring-style payload.
    const spring_payload = SpringPayload{
        .userIdentifier = trpc_payload.id,
        .userProfileFullName = trpc_payload.profile.name,
        .userProfileAge = trpc_payload.profile.age,
    };

    // 4. Serialize the new payload back to a JSON string.
    var json_buffer = std.ArrayList(u8).init(allocator);
    json.stringify(spring_payload, .{}, json_buffer.writer()) catch |err| {
        log("JSON stringify failed:");
        log(@errorName(err));
        return 1;
    };
    // The buffer lives in the arena, which stays valid until we return.
    const new_body = json_buffer.items;

    // 5. Set the modified request body back into the Tyk request context.
    if (tyk_wasm_set_request_body(new_body.ptr, @intCast(new_body.len)) != 0) {
        log("Failed to set new request body.");
        return 1;
    }

    log("Successfully transformed and set new request body.");
    return 0; // Success
}

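The key points of this Zig code are:

ABI compliance: declare the host imports correctly and export the functions Tyk expects (tyk_wasm_request_handler, tyk_wasm_allocate, tyk_wasm_free).
Memory management: an ArenaAllocator simplifies allocation and release within a single request and avoids memory leaks.
Data handling: std.json does the parsing and serialization, driven by Zig structs that mirror the JSON structures.
Error handling: every step that can fail (getting/setting the body, parsing/serializing JSON) is checked and logged.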
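
For readers less used to pointer-and-length interfaces, the memory side of these points can be modelled in a few lines. The sketch below is not Tyk's implementation and not a real WASM runtime: it uses a `Uint8Array` as a stand-in for linear memory, an offset as the "pointer", and a bump allocator whose `reset` frees everything at once, which is the same idea as the arena in the Zig code. All names (`ArenaMemory`, `passToModule`, the ASCII helpers) are hypothetical.

```typescript
// Hypothetical model of WASM linear memory: one flat byte array plus a bump pointer.
class ArenaMemory {
  private readonly bytes: Uint8Array;
  private next = 0;

  constructor(size: number) {
    this.bytes = new Uint8Array(size);
  }

  // Reserve `len` bytes and return their offset, the "pointer" in this model.
  alloc(len: number): number {
    if (this.next + len > this.bytes.length) throw new Error("out of memory");
    const ptr = this.next;
    this.next += len;
    return ptr;
  }

  write(ptr: number, data: Uint8Array): void {
    this.bytes.set(data, ptr);
  }

  // Copy out `len` bytes starting at `ptr`.
  read(ptr: number, len: number): Uint8Array {
    return this.bytes.slice(ptr, ptr + len);
  }

  // Arena semantics: release every allocation of the request in one step.
  reset(): void {
    this.next = 0;
  }

  used(): number {
    return this.next;
  }
}

// ASCII-only helpers keep the sketch free of platform APIs.
function asciiEncode(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 127) throw new Error("non-ASCII input");
    out[i] = code;
  }
  return out;
}

function asciiDecode(data: Uint8Array): string {
  let text = "";
  data.forEach((byte) => {
    text += String.fromCharCode(byte);
  });
  return text;
}

// Host side: copy a body into memory and hand the module a (ptr, len) pair.
function passToModule(mem: ArenaMemory, body: string): { ptr: number; len: number } {
  const data = asciiEncode(body);
  const ptr = mem.alloc(data.length);
  mem.write(ptr, data);
  return { ptr, len: data.length };
}
```

The module side then recovers the body with `asciiDecode(mem.read(ptr, len))`, which corresponds to `body_ptr[0..body_len]` in the Zig handler, and a single `reset()` at the end of the request plays the role of `arena.deinit()`.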

4. Tyk gateway configuration

The last step is to configure the Tyk API Definition so that it loads and runs our WASM middleware.

{
  "name": "Zig WASM Transformer API",
  "api_id": "zig-wasm-api",
  "org_id": "1",
  "use_keyless": true,
  "auth": {
    "auth_header_name": "Authorization"
  },
  "definition": {
    "location": "header",
    "key": "x-api-version"
  },
  "version_data": {
    "not_versioned": true,
    "versions": {
      "Default": {
        "name": "Default",
        "expires": "3000-01-01 00:00",
        "use_extended_paths": true
      }
    }
  },
  "proxy": {
    "listen_path": "/trpc-to-spring/",
    "target_url": "http://spring-service:8080/",
    "strip_listen_path": true
  },
  "custom_middleware": {
    "pre": [
      {
        "name": "tyk_wasm_request_handler",
        "path": "tyk-zig-transformer.wasm",
        "driver": "wasm",
        "function_name": "tyk_wasm_request_handler"
      }
    ],
    "post": [],
    "driver": "wasm"
  },
  "active": true
}

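The key part is the custom_middleware section:

"driver": "wasm": tells Tyk that this is a WASM middleware.
"path": "tyk-zig-transformer.wasm": the file name of the compiled WASM module. Tyk looks for it under its configured middleware path.
"function_name": "tyk_wasm_request_handler": the exported WASM function to call.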

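Limitations of the Current Approach and Future Outlook

By introducing a WASM module written in Zig at the gateway layer, this approach meets the performance requirements of API adaptation between heterogeneous systems without adding another service to the architecture. It avoids the extra network hop and the operational overhead of a microservice, and pushes the transformation logic down into the infrastructure layer.

It is not a silver bullet, though. Its main limitation is the complexity of development and debugging. The WASM sandbox restricts what the module can do: it cannot make network calls or access the file system directly, and any interaction with the outside world has to go through functions provided by the host (Tyk). If the transformation logic needs to enrich data from an external source such as Redis or a database, the implementation becomes considerably more complex and depends on Tyk exposing suitable host functions.

State management is another challenge. The current implementation is fully stateless; every request is handled independently. If future requirements call for transformation logic that keeps state across requests, we would have to rely on Tyk's session metadata or reach an external state store through host functions, both of which weaken the independence and portability of the WASM module.

Future iterations may include a more complete Zig-WASM-Tyk development toolchain, for example a test framework that simulates the Tyk host environment locally to simplify debugging. In addition, as the WebAssembly Component Model (which absorbed the earlier Interface Types proposal) matures, the current low-level ABI based on shared memory and raw pointers may give way to higher-level, more type-safe interfaces, which would lower the barrier to entry considerably.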
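
As a rough idea of what such a local test framework could start from, the sketch below defines a mock host with the three capabilities the middleware relies on (log, get body, set body) and runs a handler against it. Everything here is an assumption for illustration: `HostFunctions`, `MockHost`, `runHandler`, and the stand-in `renameIdHandler` are hypothetical, strings replace the pointer/length pairs of the real ABI, and a real harness would instantiate the compiled .wasm module instead of calling a TypeScript function.

```typescript
// The capabilities the middleware expects from its host, in simplified form.
interface HostFunctions {
  log(message: string): void;
  getRequestBody(): string;
  setRequestBody(body: string): number; // 0 on success, like the assumed ABI
}

// A handler returns 0 on success and non-zero on error.
type RequestHandler = (host: HostFunctions) => number;

// In-memory stand-in for the gateway: records logs and holds the request body.
class MockHost implements HostFunctions {
  readonly logs: string[] = [];
  body: string;

  constructor(body: string) {
    this.body = body;
  }

  log(message: string): void {
    this.logs.push(message);
  }

  getRequestBody(): string {
    return this.body;
  }

  setRequestBody(body: string): number {
    this.body = body;
    return 0;
  }
}

// Run one request through a handler and report what the host observed.
function runHandler(
  handler: RequestHandler,
  requestBody: string
): { status: number; body: string; logs: string[] } {
  const host = new MockHost(requestBody);
  const status = handler(host);
  return { status, body: host.body, logs: host.logs };
}

// Stand-in handler used only to exercise the harness: renames `id`.
const renameIdHandler: RequestHandler = (host) => {
  try {
    const parsed = JSON.parse(host.getRequestBody()) as { id?: unknown };
    if (typeof parsed.id !== "string") {
      host.log("missing id");
      return 1;
    }
    return host.setRequestBody(JSON.stringify({ userIdentifier: parsed.id }));
  } catch {
    host.log("invalid JSON");
    return 1;
  }
};
```

With this in place, `runHandler(renameIdHandler, '{"id":"42"}')` yields status 0 and a rewritten body, while malformed input yields status 1, an untouched body, and a log entry, which is the behaviour one would want to assert for the real module as well.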
