A modular Three.js tool for capturing panoramic images and depth maps from any point inside a WebGL scene, with first-person navigation and multiple AI-ready export formats.
- Partial equirectangular panorama — configurable horizontal (default 215°) and vertical (default 120°) field of view, centred on the camera's look direction
- Perspective view capture — standard frustum snapshot of exactly what the camera sees
- LiDAR-style depth maps — 16-bit precision distance encoding via RG channel packing (~3 mm precision at far=200 m)
- Three export formats per capture: colour PNG, annotated heatmap PNG, grayscale depth PNG
- Click-to-measure — click any pixel in the depth preview to read its world-space distance in metres
- First-person navigation — WASD + mouse look + E/Q height control via `PointerLockControls`
- Modular design — `PanoramaCapturer` is a self-contained class that attaches to any existing `THREE.WebGLRenderer` + `THREE.Scene`
npm install
npm run dev

Open http://localhost:5173, click Enter First Person View, navigate with WASD, press ESC to show the capture menu, then click either capture button.
Pano3D/
├── index.html # UI shell: overlay, buttons, canvas container
├── src/
│ ├── main.js # Demo app: scene setup, FPS controls, button wiring
│ └── PanoramaCapturer.js # Core capture class (self-contained, no demo deps)
├── package.json # Vite 5 + Three.js r160
└── DEV_PROGRESS.md # Engineering log
sourceCamera position + orientation
│
▼
┌─────────────────────────────────┐
│ CubeCamera (world-axis-aligned)│ ← renders 6 faces at sourceCamera.position
│ WebGLCubeRenderTarget │ rotation is always identity (no tilt)
└────────────────┬────────────────┘
│ 6 cubemap faces (linear, no tone mapping)
▼
┌───────────────────────────────────────────────────────────────┐
│ Equirectangular GLSL shader (full-screen quad) │
│ │
│ For each output pixel (u, v): │
│ lon = (u − 0.5) × hFOV_rad │
│ lat = (v − 0.5) × vFOV_rad │
│ localDir = [sin(lon)cos(lat), sin(lat), −cos(lon)cos(lat)] │
│ worldDir = uCameraRot × localDir ← full mat3 from │
│ camera.matrixWorld │
│ colour = textureCube(cubemap, worldDir) │
│ output = pow(clamp(colour × exposure, 0,1), 1/2.2) │
└────────────────┬──────────────────────────────────────────────┘
│ WebGLRenderTarget (width = resolution/180 × hFOV)
▼
Y-flip → Canvas → PNG
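
The per-pixel mapping in the diagram can be mirrored on the CPU to sanity-check directions. The following is a minimal sketch, not the shader itself: `equirectDirection` is a hypothetical helper name, and the rotation is passed as an array of three rows purely for readability (the real uniform is a `mat3`).

```javascript
// CPU mirror of the equirectangular mapping: output pixel (u, v) in [0, 1]
// -> unit direction in world space. `rot` is a 3x3 rotation given as rows.
function equirectDirection(u, v, hFovDeg, vFovDeg, rot) {
  const DEG2RAD = Math.PI / 180;
  const lon = (u - 0.5) * hFovDeg * DEG2RAD; // longitude relative to look direction
  const lat = (v - 0.5) * vFovDeg * DEG2RAD; // latitude relative to the horizon

  // Camera-local direction; Three.js cameras look down -Z.
  const local = [
    Math.sin(lon) * Math.cos(lat),
    Math.sin(lat),
    -Math.cos(lon) * Math.cos(lat),
  ];

  // worldDir = rot * localDir
  return rot.map(
    (row) => row[0] * local[0] + row[1] * local[1] + row[2] * local[2]
  );
}

// With an identity rotation the image centre points straight down -Z.
const IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]];
const centreDir = equirectDirection(0.5, 0.5, 215, 120, IDENTITY);
```

Because the horizontal field of view exceeds 180°, directions near the left and right image edges have a positive Z component, i.e. they point slightly behind the camera.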
Why a rotation matrix instead of Euler angles?
Decomposing matrixWorld into yaw + pitch loses information when the camera is rolled or gimbal-locked. Uploading the full upper-left 3×3 as a mat3 uniform (uCameraRot) is exact for any orientation.
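
As an illustration of "the full upper-left 3×3", here is a small sketch that copies it out of a column-major 4×4 array, which is the layout of `Matrix4.elements` in Three.js. `rotationFromMatrixWorld` is a hypothetical name; in Three.js itself, `new THREE.Matrix3().setFromMatrix4(camera.matrixWorld)` performs the same extraction. The sketch assumes the camera has unit scale, so the 3×3 block is a pure rotation.

```javascript
// Extract the upper-left 3x3 of a column-major 4x4 matrix.
// The result is also column-major, matching GLSL mat3 storage.
function rotationFromMatrixWorld(e) {
  return [
    e[0], e[1], e[2],   // column 0: camera +X (right) in world space
    e[4], e[5], e[6],   // column 1: camera +Y (up) in world space
    e[8], e[9], e[10],  // column 2: camera +Z (backward) in world space
  ];
}

// Example: a camera yawed 90 degrees about world Y, positioned at (5, 6, 7).
const yaw90 = [
  0, 0, -1, 0,
  0, 1, 0, 0,
  1, 0, 0, 0,
  5, 6, 7, 1,
];
const camRot = rotationFromMatrixWorld(yaw90);
```

The translation column (elements 12 to 14) is dropped, which is what the shader needs: only orientation affects the cubemap lookup direction.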
Why scene.overrideMaterial not renderer.overrideMaterial?
Three.js r152+ only reads scene.overrideMaterial (see WebGLRenderer.js line 1430). Setting a property on the renderer object has no effect.
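
One way to keep the override from leaking into later frames is to save and restore the scene state around the depth pass. This is a sketch under assumptions: `withDepthOverride` is a hypothetical helper, not part of `PanoramaCapturer`, and it also clears `scene.background` and `scene.fog` because the depth pass in this README renders without them.

```javascript
// Temporarily swap in a depth material, run a render callback, then restore
// the scene exactly as it was, even if the callback throws.
function withDepthOverride(scene, depthMaterial, renderFn) {
  const saved = {
    overrideMaterial: scene.overrideMaterial,
    background: scene.background,
    fog: scene.fog,
  };

  scene.overrideMaterial = depthMaterial; // every mesh now draws with this material
  scene.background = null;                // background must not write colour into depth
  scene.fog = null;                       // fog would tint the packed bytes

  try {
    return renderFn();
  } finally {
    Object.assign(scene, saved);
  }
}
```

A call might look like `withDepthOverride(scene, depthOverrideMat, () => depthCubeCamera.update(renderer, scene))`.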
Critical viewport rule:
renderer.setRenderTarget(target) sets the GL viewport to (0, 0, target.width, target.height) without devicePixelRatio scaling. Calling renderer.setViewport() afterwards multiplies by DPR on retina displays, making the viewport 2× the framebuffer size and causing only the bottom-left quadrant to render. Never call setViewport after setRenderTarget.
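
A sketch of an off-screen render that follows this rule is shown below. `renderToTarget` is a hypothetical helper name; it uses only `getRenderTarget`, `setRenderTarget` and `render` from `THREE.WebGLRenderer`.

```javascript
// Render into an off-screen target and restore the previous target afterwards.
// Note what is absent: there is no renderer.setViewport() call after
// setRenderTarget(), so the viewport stays at the target's pixel size.
function renderToTarget(renderer, scene, camera, target) {
  const previous = renderer.getRenderTarget();
  renderer.setRenderTarget(target);
  try {
    renderer.render(scene, camera);
  } finally {
    renderer.setRenderTarget(previous); // null means the default canvas framebuffer
  }
}
```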
scene.overrideMaterial = depthOverrideMat
│
▼
┌─────────────────────────────────┐
│ depthCubeCamera.update() │ same position as colour CubeCamera
│ (scene.background = null, │ clear colour = white (n=1.0 = far)
│ scene.fog = null) │
└────────────────┬────────────────┘
│ 6 depth cube faces — each pixel stores 16-bit distance
│
│ Vertex shader:
│ vWorldPos = (modelMatrix × position).xyz
│ Fragment shader:
│ dist = length(vWorldPos − cameraPosition)
│ n = clamp(dist / uMaxDist, 0, 1)
│ hi = floor(n × 255) / 255 → R channel
│ lo = fract(n × 255) → G channel
▼
┌───────────────────────────────────────────────┐
│ Depth equirect shader (same angular mapping) │
│ NearestFilter on cube texture (critical — │
│ LinearFilter corrupts RG byte pairs at seams)│
│ output = vec4(packed.r, packed.g, 0, 1) │
└────────────────┬──────────────────────────────┘
│ readRenderTargetPixels → Uint8Array
▼
CPU decode per pixel:
n = R/255 + G/65025 (65025 = 255²)
dist = n × far (metres)
Precision: far / 65025 → ~3 mm at far=200 m
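
The pack and decode steps can be checked in plain JavaScript. The sketch below mirrors the fragment shader and the CPU decode from the diagram; `packDepth` and `unpackDepth` are hypothetical names, and rounding the low byte to the nearest integer is an assumption about how the GPU quantises an 8-bit channel.

```javascript
// Encode a distance (metres) into two bytes, mirroring the fragment shader.
function packDepth(dist, far) {
  const n = Math.min(Math.max(dist / far, 0), 1); // normalised depth in [0, 1]
  const scaled = n * 255;
  const r = Math.floor(scaled);                   // high byte -> R channel
  const g = Math.round((scaled - r) * 255);       // fractional part -> G channel
  return [r, g];
}

// Decode two bytes back to metres: n = R/255 + G/65025.
function unpackDepth(r, g, far) {
  return (r / 255 + g / 65025) * far;
}

// Round trip at far = 200 m.
const [rByte, gByte] = packDepth(12.345, 200);
const decodedMetres = unpackDepth(rByte, gByte, 200);
```

The round-trip error stays within one quantisation step, `far / 65025`, which is where the ~3 mm figure at far=200 m comes from. Distances beyond `far` clamp to `far`.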
The normalised depth n ∈ [0, 1] is remapped with a power curve before colouring so that close objects (which cluster at low n values) spread across the full visible spectrum:
curved = n ^ 0.35
hue = curved × 0.667 (maps 0→red, 1→blue in HSV)
| Distance | n (far=200m) | curved | Colour |
|---|---|---|---|
| 1 m | 0.005 | 0.157 | orange |
| 5 m | 0.025 | 0.275 | yellow |
| 10 m | 0.050 | 0.350 | yellow-green |
| 20 m | 0.100 | 0.447 | green |
| 50 m | 0.250 | 0.616 | green-cyan |
| 100 m | 0.500 | 0.785 | cyan |
The raw linear n values are stored separately in _depthPx (Float32Array) and are not affected by the heatmap curve. Click-to-measure reads from _depthPx, not from the coloured pixels.
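
The curve and hue mapping can be sketched as a small function. This is an illustration only: `depthToHeatmapRgb` and `hsvToRgb` are hypothetical names, and full saturation and value (S = V = 1) are assumptions, since the README only specifies the hue.

```javascript
// Standard HSV -> RGB conversion; h, s, v in [0, 1], output bytes in [0, 255].
function hsvToRgb(h, s, v) {
  const i = Math.floor(h * 6);
  const f = h * 6 - i;
  const p = v * (1 - s);
  const q = v * (1 - f * s);
  const t = v * (1 - (1 - f) * s);
  const sectors = [
    [v, t, p], [q, v, p], [p, v, t],
    [p, q, v], [t, p, v], [v, p, q],
  ];
  return sectors[i % 6].map((c) => Math.round(c * 255));
}

// Normalised depth n -> heatmap colour: power curve, then hue from red to blue.
function depthToHeatmapRgb(n) {
  const clamped = Math.min(Math.max(n, 0), 1);
  const curved = Math.pow(clamped, 0.35);
  const hue = curved * 0.667; // 0 = red (near), 0.667 = blue (far)
  return hsvToRgb(hue, 1, 1);
}
```

Since the exponent is below 1, the curve is steepest near n = 0, so the first few metres consume a large share of the hue range.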
import { PanoramaCapturer } from './src/PanoramaCapturer.js';
const capturer = new PanoramaCapturer(renderer, scene, options);

| Option | Type | Default | Description |
|---|---|---|---|
| `resolution` | `number` | `2048` | Cube face resolution in pixels |
| `near` | `number` | `0.1` | Near clipping plane (metres) |
| `far` | `number` | `1000` | Far clipping plane and max depth range (metres) |
| `hFOV` | `number` | `215` | Horizontal capture angle (degrees) |
| `vFOV` | `number` | `120` | Vertical capture angle (degrees) |
| `exposure` | `number` | `1.0` | Linear exposure multiplier (>1 = brighter) |
Captures a partial equirectangular panorama centred on the current look direction of `camera`.

await capturer.capture(camera, 'panorama.png');

Output image size: (resolution/180 × hFOV) × (resolution/180 × vFOV) pixels.
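
As a worked example of the size formula, the sketch below computes the output dimensions for the default options. `panoramaSize` is a hypothetical helper, and rounding to the nearest integer is an assumption; the capturer may floor or ceil instead.

```javascript
// Output size in pixels: (resolution / 180) pixels per degree in both axes.
function panoramaSize(resolution, hFovDeg, vFovDeg) {
  const pixelsPerDegree = resolution / 180;
  return {
    width: Math.round(pixelsPerDegree * hFovDeg),
    height: Math.round(pixelsPerDegree * vFovDeg),
  };
}

// Defaults: resolution 2048, hFOV 215, vFOV 120 -> 2446 x 1365.
const defaultSize = panoramaSize(2048, 215, 120);
```

With these defaults a full 360° × 180° capture would be 4096 × 2048, so `resolution` is effectively the height of a full equirectangular image.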
Captures the scene directly through `camera` as a standard perspective image (no cubemap involved). Output resolution matches the renderer's current logical size.

await capturer.captureView(camera, 'view.png');

Both methods open an in-browser preview panel with:
- Colour thumbnail
- Depth heatmap (click any pixel to read distance in metres)
- Colour scale legend with tick marks at 1 m, 5 m, 10 m, 20 m, 50 m, 100 m
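
Click-to-measure amounts to mapping a click on the displayed preview back to a pixel in the depth buffer. The sketch below assumes the buffer is a row-major `Float32Array` of normalised depth with row 0 at the top of the image; `distanceAtClick` and its parameter names are hypothetical, and in a browser `clickX` / `clickY` would typically come from `event.offsetX` / `event.offsetY`.

```javascript
// Map a click on a preview element of size shownW x shownH to a distance in
// metres, using a width x height buffer of normalised depth values.
function distanceAtClick(depthPx, width, height, far, clickX, clickY, shownW, shownH) {
  // Scale from displayed size to buffer size, then clamp to valid indices.
  const x = Math.min(width - 1, Math.max(0, Math.floor((clickX / shownW) * width)));
  const y = Math.min(height - 1, Math.max(0, Math.floor((clickY / shownH) * height)));
  return depthPx[y * width + x] * far;
}

// 2 x 2 buffer shown at 200 x 200 CSS pixels, far = 200 m.
const demoDepth = Float32Array.of(0, 0.25, 0.5, 1);
const clickedMetres = distanceAtClick(demoDepth, 2, 2, 200, 150, 50, 200, 200); // 50
```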
| Button | Filename suffix | Content |
|---|---|---|
| Save PNG | `*.png` | sRGB colour image with exposure + gamma correction |
| Save Heatmap | `*_heatmap.png` | HSV depth heatmap with legend scale burned into bottom strip |
| Save Gray | `*_depth_gray.png` | Linear 8-bit grayscale: pixel = n × 255 (0=near/black, 255=far/white) |
Standard linear depth format, in the style used with depth estimation models (MiDaS, DepthAnything) and depth datasets (NYU Depth v2, KITTI):
- Single channel encoded as RGB (R=G=B)
- Linear mapping: `distance_metres = (pixel_value / 255) × far`
- 8-bit precision: `far / 255` metres per step (~0.78 m at far=200 m)
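
The grayscale mapping and its inverse are simple enough to sketch directly. `depthToGray` and `grayToMetres` are hypothetical names, and rounding to the nearest byte is an assumption about how `n × 255` is quantised.

```javascript
// Normalised depth buffer -> 8-bit grayscale values (0 = near, 255 = far).
function depthToGray(depthPx) {
  const gray = new Uint8Array(depthPx.length);
  for (let i = 0; i < depthPx.length; i++) {
    const n = Math.min(Math.max(depthPx[i], 0), 1);
    gray[i] = Math.round(n * 255);
  }
  return gray;
}

// Inverse mapping for a consumer of the exported PNG.
function grayToMetres(pixelValue, far) {
  return (pixelValue / 255) * far;
}

// One grey level at far = 200 m is about 0.78 m.
const metresPerStep = grayToMetres(1, 200);
```

A consumer needs to know the `far` value used at capture time, because the PNG itself stores only the normalised value.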
For vision-language models (GPT-4V, Claude, Gemini) that interpret colour:
- HSV false-colour with power-curve remapping
- Colour scale legend burned into bottom strip with distance labels
- Red = near, blue = far
- The model can read absolute distances from the tick labels on the legend
| Key / Action | Effect |
|---|---|
| Click Enter First Person View | Locks pointer, enters FPS mode |
| W / A / S / D | Move forward / left / backward / right |
| Mouse | Look around |
| E | Move camera up |
| Q | Move camera down |
| ESC | Exit FPS mode, show menu |
| Capture Panorama button | Panoramic capture from current position |
| Capture View button | Perspective capture from current view |
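
The key bindings in the table can be turned into per-frame movement with a small pure function. This is a sketch, not the demo's actual code: `movementFromKeys` is a hypothetical helper, and the `keys` object is assumed to hold `KeyboardEvent.code` values set on keydown and cleared on keyup.

```javascript
// Convert the current key state into displacements (metres) for one frame.
// speed is in metres per second, dt is the frame time in seconds.
function movementFromKeys(keys, speed, dt) {
  const step = speed * dt;
  const axis = (positive, negative) =>
    (keys[positive] ? 1 : 0) - (keys[negative] ? 1 : 0);
  return {
    forward: axis("KeyW", "KeyS") * step, // W forward, S backward
    right: axis("KeyD", "KeyA") * step,   // D right, A left
    up: axis("KeyE", "KeyQ") * step,      // E up, Q down
  };
}

// With PointerLockControls the result could be applied as:
//   controls.moveForward(m.forward);
//   controls.moveRight(m.right);
//   camera.position.y += m.up;
```

Opposite keys held together cancel out, which avoids drift when a keyup event is handled late.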
- Three.js r160 — WebGL renderer, CubeCamera, ShaderMaterial, WebGLRenderTarget
- Vite 5 — dev server and bundler (`npm run dev`)
- WebGL 2.0 — used by default in Three.js r160; required for reliable render-target readback
MIT — see LICENSE