Migrate WebGPU backend to gogpu/wgpu (core) #40

@kolkov

Summary

Migrate Born's WebGPU backend from go-webgpu/webgpu (Rust FFI via the wgpu-native shared library) to gogpu/wgpu (pure Go). This is a full replacement, not a dual-backend setup.

Context

  • Parent: Backend Strategy: GoGPU Integration Research #20
  • Decision: ADR-005 — use Core API, not HAL-direct (available on request; feel free to ask about specific points in comments)
  • Research: GOGPU_WGPU_ARCHITECTURE_2026-04-10 (internal)
  • gogpu/wgpu v0.24.6 stable, pure Go
  • Eliminates runtime dependency on wgpu_native shared library (.dll/.so/.dylib)
  • True single binary deployment
  • gogpu/naga supports DXIL (Rust naga does NOT)
  • WGSL shaders stay unchanged
  • Milestone: v0.8.0

Technical Approach

Use gogpu/wgpu Core API (root wgpu package):

  • Provides encoder pooling, staging belt, deferred destruction
  • Dispatch overhead is negligible for ML workloads (nanoseconds of CPU-side dispatch vs. microsecond-scale GPU kernels)
  • We maintain both libraries, so the Core API can be optimized for ML workloads as needed

Do NOT use HAL-direct (except potentially for tensor arena allocator in the future).
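
To illustrate what encoder pooling buys, here is a minimal sketch of the general technique. It is not the gogpu/wgpu implementation: `encoder`, `encoderPool`, and `runSubmissions` are hypothetical names invented for this example, and the sketch is single-threaded for brevity.

```go
package main

// encoder is a hypothetical stand-in for a command encoder whose backing
// allocation is expensive to create. It is NOT a gogpu/wgpu type.
type encoder struct {
	commands []string
}

// encoderPool hands out encoders and reuses released ones.
// Not safe for concurrent use; a real pool would need locking.
type encoderPool struct {
	free    []*encoder
	created int // number of fresh allocations, tracked for illustration
}

// Acquire returns a recycled encoder when one is available and
// allocates a new one otherwise.
func (p *encoderPool) Acquire() *encoder {
	if n := len(p.free); n > 0 {
		e := p.free[n-1]
		p.free = p.free[:n-1]
		return e
	}
	p.created++
	return &encoder{}
}

// Release clears the recorded commands (keeping capacity) and puts the
// encoder back on the free list.
func (p *encoderPool) Release(e *encoder) {
	e.commands = e.commands[:0]
	p.free = append(p.free, e)
}

// runSubmissions records one dispatch per submission and returns how
// many encoders had to be allocated in total.
func runSubmissions(p *encoderPool, n int) int {
	for i := 0; i < n; i++ {
		e := p.Acquire()
		e.commands = append(e.commands, "dispatch")
		p.Release(e)
	}
	return p.created
}
```

With sequential submissions, `runSubmissions(&encoderPool{}, 1000)` allocates only one encoder, because every later `Acquire` reuses the one released before it.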

Scope

Files to modify

  • internal/backend/webgpu/backend.go — device init, lifecycle
  • internal/backend/webgpu/compute.go — compute dispatch, encoder/pass
  • internal/backend/webgpu/gpu_ops.go — GPU tensor operations
  • internal/backend/webgpu/gpu_tensor.go — buffer management
  • internal/backend/webgpu/buffer_pool.go — buffer pooling
  • internal/backend/webgpu/lazy_compute.go — lazy mode
  • internal/backend/webgpu/gpu_creation.go — tensor creation
  • internal/backend/webgpu/ops.go, ops_extended.go — minor import updates
  • go.mod — swap dependency

Key changes

  • Replace go-webgpu/webgpu imports → gogpu/wgpu
  • Adapt device initialization to Core API flow (Instance → Adapter → Device)
  • Leverage Core API's staging belt for buffer uploads
  • Leverage Core API's encoder pooling (avoids allocating a 64KB DX12 allocator per frame)
  • Use Core API's DestroyQueue for safe resource lifecycle
  • Update shader module creation API
  • WGSL shaders (shaders.go) — NO changes needed
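
The deferred-destruction idea behind the resource-lifecycle change can be sketched as follows. This illustrates the general pattern only and makes no claim about the actual DestroyQueue API in gogpu/wgpu: `destroyQueue`, `Defer`, `Collect`, and the submission-index scheme are assumptions made for this example. The point is that a buffer still referenced by in-flight GPU work is released only after that work is known to have completed.

```go
package main

// pendingRelease pairs a cleanup callback with the index of the last
// submission that used the resource.
type pendingRelease struct {
	submission uint64
	release    func()
}

// destroyQueue is a hypothetical illustration of deferred destruction.
// It is NOT the gogpu/wgpu DestroyQueue type.
type destroyQueue struct {
	pending []pendingRelease
}

// Defer schedules release to run once the given submission has
// completed on the GPU.
func (q *destroyQueue) Defer(submission uint64, release func()) {
	q.pending = append(q.pending, pendingRelease{submission: submission, release: release})
}

// Collect runs every callback whose submission index is <= completed
// and returns how many callbacks ran. Later entries stay queued.
func (q *destroyQueue) Collect(completed uint64) int {
	var kept []pendingRelease
	ran := 0
	for _, p := range q.pending {
		if p.submission <= completed {
			p.release()
			ran++
		} else {
			kept = append(kept, p)
		}
	}
	q.pending = kept
	return ran
}
```

In this sketch, a caller would invoke `Collect` with the latest completed submission index after each fence or poll, so releasing a pooled buffer never races with a kernel that still reads it.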

Acceptance Criteria

  • go build ./... passes with gogpu/wgpu
  • All WebGPU unit tests pass
  • Compute shaders dispatch correctly
  • Buffer upload/download works
  • Lazy mode works
  • Buffer pool works
  • Flash attention works
  • No go-webgpu imports remain
  • Single binary — no runtime .dll/.so dependency
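
The "No go-webgpu imports remain" criterion could be checked mechanically. Below is a minimal sketch using only the Go standard library; the function names are hypothetical, and the import prefix `github.com/go-webgpu/` is an assumption about the old module path that should be adjusted to whatever go.mod actually listed.

```go
package main

import (
	"go/parser"
	"go/token"
	"io/fs"
	"path/filepath"
	"strconv"
	"strings"
)

// forbiddenImports parses only the import section of one Go file and
// returns every import path starting with prefix. If src is nil the
// file is read from disk; otherwise src is used as the source text.
func forbiddenImports(filename string, src any, prefix string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var hits []string
	for _, imp := range f.Imports {
		p, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		if strings.HasPrefix(p, prefix) {
			hits = append(hits, p)
		}
	}
	return hits, nil
}

// scanTree walks root and returns "file: import" entries for every .go
// file that still imports a path starting with prefix.
func scanTree(root, prefix string) ([]string, error) {
	var report []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || !strings.HasSuffix(path, ".go") {
			return nil
		}
		hits, err := forbiddenImports(path, nil, prefix)
		if err != nil {
			return err
		}
		for _, h := range hits {
			report = append(report, path+": "+h)
		}
		return nil
	})
	return report, err
}
```

Calling `scanTree(".", "github.com/go-webgpu/")` from the repository root would then be expected to return an empty slice once the migration is complete; a non-empty result lists the files that still need updating.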
