This limitation was discovered and documented during investigation of issues #455 and #456. Both of those reports include a [!CAUTION] callout referencing this forthcoming feature request.
Warning
A note from a Security Architect: I do this for a living, so security is always on my mind, and I want to be upfront that neither of the implementation paths I'm proposing here is what I'd call ideal from a security standpoint. They're the lesser of the available evils, forced on us by Proxmox's decision to hardcode an identity check (root@pam) instead of building a proper permission-based model. The correct fix is for Proxmox to implement a grantable permission, something like VM.ConfigFeatureFlags, that can be assigned to specific users or tokens with a full audit trail. That request has been sitting in their Bugzilla for years with no resolution. Until that happens, PegaProx is stuck choosing between "silently fail on feature flags" and "use elevated auth with guardrails." I'm proposing the latter because at least it works; however, I'd be doing you a disservice if I didn't flag that it's not the architecture I'd design from scratch.
Describe the feature
Cross-cluster LXC replication currently fails silently for any container with feature flags other than nesting=1. The failure happens at the config step with:
403 Permission check failed (changing feature flags (except nesting) is only allowed for root@pam)
Note
This affects any container that uses mknod, keyctl, fuse, mount, nfs, cifs, or any feature flag other than nesting. That covers most real-world LXC deployments that rely on feature flags instead of running privileged containers.
The restriction is hardcoded in Proxmox's LXC.pm as an identity check, not a permission check:
return 1 if $authuser eq 'root@pam';
No role, realm, token, or privilege level bypasses it. Only a root@pam session ticket (not a token) passes. This has been reported to Proxmox directly! See open feature requests at bugzilla.proxmox.com #6614 and #2582.
This feature request proposes two implementation paths to handle this gracefully, plus a minimum viable preflight warning.
Use Case
Any homelab or production environment running LXC containers with capability flags needs cross-cluster DR to actually work. Specifically:
ZeroTier LXC — requires mknod=1 for TUN device
Docker-in-LXC — requires keyctl=1,nesting=1
NFS mounts in LXC — requires nfs=1
FUSE filesystems — requires fuse=1
Without this feature, cross-cluster replication is non-functional for the majority of real-world LXC containers. Site Recovery auto-failover silently fails because the replica is missing its capability flags and can't perform its intended function after failover.
Minimum Viable Fix: Preflight Warning
At minimum, detect feature flags on LXC containers when a replication job is configured and warn the user:
⚠️ Warning: CT {vmid} has feature flags ({flags}) that cannot be restored via the Proxmox API due to a Proxmox hardcoded restriction. The replica will be missing these flags after replication unless feature flag restore is explicitly enabled in the replication job settings.
This prevents users from filing bug reports for a problem that is not PegaProx's fault.
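To make the preflight check concrete, here is a minimal Python sketch. It assumes the container's `features` value arrives as a comma-separated string such as `mknod=1,nesting=1`. The function names (`parse_features`, `restricted_flags`, `preflight_warning`) are hypothetical and not part of PegaProx.

```python
from typing import Dict, Optional


def parse_features(features: str) -> Dict[str, str]:
    """Parse a features string such as 'mknod=1,nesting=1' into a dict."""
    parsed: Dict[str, str] = {}
    for item in features.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        parsed[key.strip()] = value.strip()
    return parsed


def restricted_flags(features: str) -> Dict[str, str]:
    """Return every flag except nesting, i.e. the ones that trigger the 403."""
    return {k: v for k, v in parse_features(features).items() if k != "nesting"}


def preflight_warning(vmid: int, features: str) -> Optional[str]:
    """Return the warning text for a CT, or None if no restricted flag is set."""
    flags = restricted_flags(features)
    if not flags:
        return None
    rendered = ",".join(f"{k}={v}" for k, v in flags.items())
    return (
        f"Warning: CT {vmid} has feature flags ({rendered}) that cannot be "
        "restored via the Proxmox API due to a Proxmox hardcoded restriction. "
        "The replica will be missing these flags after replication unless "
        "feature flag restore is explicitly enabled in the replication job settings."
    )


if __name__ == "__main__":
    print(preflight_warning(101, "mknod=1,nesting=1"))
```

A container with only `nesting=1` yields `None`, so the UI would show nothing for it.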
Preferred Viable Fix: Offer Two Paths
Path A — root@pam Ticket + SSH Key Auth
How it works:
PegaProx requests a root@pam session ticket from the target cluster using stored credentials
Uses the ticket for the specific PUT /lxc/{vmid}/config call to restore feature flags
Discards the ticket immediately — never stored, never reused
Logs the elevated operation explicitly in the audit trail
Confirmed working via live test:
# Get root@pam ticket
curl -s -X POST 'https://10.0.0.2:8006/api2/json/access/ticket' \
-d 'username=root@pam&password=<password>'
# Restore feature flags using ticket (not token), no 403
curl -s -X PUT 'https://10.0.0.2:8006/api2/json/nodes/pve-source/lxc/101/config' \
-H 'Cookie: PVEAuthCookie=<ticket>' \
-H 'CSRFPreventionToken: <csrf-token>' \
-d 'features=mknod%3D1'
# Returns: {"data":null} ← success
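The same two calls could look roughly like this in Python. This is a sketch using only the standard library, not the actual PegaProx code. The helper names and the injectable `send` callable are my own assumptions, added so the flow can be exercised without a live cluster. The endpoints, cookie name, and CSRF header are the ones from the curl test above.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict

API = "/api2/json"
Sender = Callable[[urllib.request.Request], Dict[str, Any]]


def build_ticket_request(base_url: str, password: str) -> urllib.request.Request:
    """POST /access/ticket as root@pam (form-encoded body)."""
    body = urllib.parse.urlencode(
        {"username": "root@pam", "password": password}
    ).encode()
    return urllib.request.Request(
        f"{base_url}{API}/access/ticket", data=body, method="POST"
    )


def build_features_request(
    base_url: str, node: str, vmid: int, features: str, ticket: str, csrf: str
) -> urllib.request.Request:
    """PUT /nodes/{node}/lxc/{vmid}/config, authenticated by ticket cookie + CSRF token."""
    body = urllib.parse.urlencode({"features": features}).encode()
    headers = {"Cookie": f"PVEAuthCookie={ticket}", "CSRFPreventionToken": csrf}
    return urllib.request.Request(
        f"{base_url}{API}/nodes/{node}/lxc/{vmid}/config",
        data=body,
        headers=headers,
        method="PUT",
    )


def http_send(req: urllib.request.Request) -> Dict[str, Any]:
    """Default sender: perform the request and decode the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode())


def restore_features(
    base_url: str, node: str, vmid: int, features: str,
    password: str, send: Sender = http_send,
) -> Dict[str, Any]:
    """Request a fresh ticket, use it for exactly one config call, then drop it."""
    auth = send(build_ticket_request(base_url, password))["data"]
    result = send(
        build_features_request(
            base_url, node, vmid, features,
            auth["ticket"], auth["CSRFPreventionToken"],
        )
    )
    # The ticket only lives in local variables and is never stored or reused.
    return result
```

Passing `send` in as a parameter is only a testing convenience. A real implementation would also need TLS and error handling, which I have left out here.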
Security requirements (non-negotiable):
SSH key authentication must be configured between PegaProx and Proxmox nodes — password auth must not be accepted for this path. Users deploying feature flags instead of privileged containers are security-conscious by definition. Requiring SSH key auth as a prerequisite is consistent with that posture.
Feature flag restore must be explicitly opted-in per replication job — disabled by default
Ticket must be requested fresh per operation, never cached or stored
Ticket scope must be limited to one specific config call only
Full audit log entry required: "Feature flags {flags} restored on CT {vmid} via root@pam ticket — replicated from source, not new privileges"
UI must surface that elevated auth was used for this job
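The opt-in and audit requirements above could be enforced with something like the following sketch. The `ReplicationJob` dataclass, the `pegaprox.audit` logger name, and the `do_restore` callback are all hypothetical; I'm assuming the job settings carry a boolean that defaults to off.

```python
import logging
from dataclasses import dataclass
from typing import Callable

# Hypothetical logger name; PegaProx's real audit channel may differ.
audit_log = logging.getLogger("pegaprox.audit")


@dataclass
class ReplicationJob:
    vmid: int
    restore_feature_flags: bool = False  # explicit opt-in, disabled by default


def audit_message(vmid: int, flags: str) -> str:
    """Build the audit entry text proposed in the requirements list."""
    return (
        f"Feature flags {flags} restored on CT {vmid} via root@pam ticket - "
        "replicated from source, not new privileges"
    )


def restore_if_opted_in(
    job: ReplicationJob, flags: str, do_restore: Callable[[int, str], None]
) -> bool:
    """Run the elevated restore only when the job opted in, and always audit it."""
    if not job.restore_feature_flags:
        return False
    do_restore(job.vmid, flags)
    audit_log.warning(audit_message(job.vmid, flags))
    return True
```

The boolean return value gives the UI something to key off when it needs to show that elevated auth was used for a job.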
Path B — vzdump + pct restore --unique 0
How it works:
Replace clone+migrate with backup+restore for containers with feature flags. pct restore runs as root natively on the target node — no API token, no ticket, no 403.
Benefits:
Solves hostname, MAC address, AND feature flags in one operation
No SSH key requirement
No elevated API auth
No ticket handling
Trade-offs:
Slower — full backup + restore cycle
Requires more temporary storage during restore
Better suited for less frequent DR schedules
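For reference, the two commands Path B would run could be assembled like this. It's a sketch that only builds argument lists. The snapshot mode, zstd compression, dump directory, and target storage are assumptions on my part, and how the archive gets copied to the target node and how the commands get executed there is left out.

```python
from typing import List


def vzdump_command(vmid: int, dumpdir: str) -> List[str]:
    """Backup command for the source node (mode and compression are assumed defaults)."""
    return [
        "vzdump", str(vmid),
        "--mode", "snapshot",
        "--compress", "zstd",
        "--dumpdir", dumpdir,
    ]


def pct_restore_command(vmid: int, archive: str, storage: str) -> List[str]:
    """Restore command for the target node, with --unique 0 as named in Path B."""
    return [
        "pct", "restore", str(vmid), archive,
        "--storage", storage,
        "--unique", "0",
    ]


if __name__ == "__main__":
    # Paths and storage name below are placeholders for illustration only.
    print(" ".join(vzdump_command(101, "/var/tmp/dr")))
    print(" ".join(pct_restore_command(101, "/var/tmp/dr/ct101.tar.zst", "local-lvm")))
```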
Implementation suggestion:
Auto-detect feature flags on LXC containers when replication job is configured
If feature flags present, offer user choice of Path A or Path B
Default to Path B if SSH key auth is not configured
Path A available only when SSH key auth is confirmed
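The selection rules in this list boil down to a small decision function. Here is a sketch with hypothetical names. Falling back to Path B when SSH key auth is confirmed but the user has not picked anything is my assumption, since the list only says the user is offered a choice.

```python
from enum import Enum
from typing import Optional


class RestorePath(Enum):
    NONE = "none"          # no restricted flags, regular replication is fine
    TICKET = "A"           # root@pam ticket + SSH key auth
    BACKUP_RESTORE = "B"   # vzdump + pct restore --unique 0


def choose_path(
    has_restricted_flags: bool,
    ssh_key_auth_confirmed: bool,
    user_choice: Optional[RestorePath] = None,
) -> RestorePath:
    """Pick the restore path for a replication job according to the rules above."""
    if not has_restricted_flags:
        return RestorePath.NONE
    if not ssh_key_auth_confirmed:
        # Path A is unavailable without confirmed SSH key auth.
        return RestorePath.BACKUP_RESTORE
    if user_choice is RestorePath.TICKET:
        return RestorePath.TICKET
    # Assumed default when the user made no explicit choice.
    return RestorePath.BACKUP_RESTORE
```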
Alternatives Considered
Strip features before migration, restore after — requires rebooting a running production container to apply the feature deletion. Not acceptable.
Patch Proxmox LXC.pm — unsupported, overwritten on every Proxmox update.
Run container as privileged — defeats the entire purpose of using feature flags for security.
SSH as root with password auth — works but violates security best practices. Not recommended.
How important is this for you?
Blocking my use case. Cross-cluster replication is non-functional for any LXC container with capability flags, which covers the majority of real-world LXC deployments. Site Recovery failover silently produces a broken replica.
Checklist
I have searched existing issues and discussions to make sure this hasn't been requested before
Tip
I'm happy to assist with backend implementation as I have done in previous bug reports. The Python patches for the ticket approach and audit logging are within my wheelhouse. Front-end UI work for the preflight warning and opt-in settings is outside my skill set, so that piece I'll have to leave to you.