Releases: simplyblock/sbcli
25.10.4
What's Changed
- Remove remote object from node when receiving distrib events by @Hamdy-khader in #743
- Add req id to RPC logs for spdk_proxy by @Hamdy-khader in #752
- Main lvol sync delete (#734) by @Hamdy-khader in #757
- R25.10 hotfix sfam 2471 by @Hamdy-khader in #758
- fix linter and type checker issues by @Hamdy-khader in #759
- Prometheus hostpath by @geoffrey1330 in #761
- Format nvme devices when run sbcli sn configure with --force by @wmousa in #760
- Nsocket env by @geoffrey1330 in #764
- Fix fw connection error handling not to set node status down by @Hamdy-khader in #765
- Fix sfam-2473 by @Hamdy-khader in #766
- Do not install pip package on cluster update by @Hamdy-khader in #749
- Fix calculate total_mem for multi sn nodes on same numa by @wmousa in #767
- use max_size instead as hugepage memory when set by @geoffrey1330 in #754
- refactor node add task runner by @Hamdy-khader in #768
- Fix sfam-2483 by @Hamdy-khader in #773
- fix logger by @Hamdy-khader in #776
- fix SFAM-2476 by @Hamdy-khader in #775
- Fix SFAM-2482 by @Hamdy-khader in #769
Full Changelog: 25.10.3...25.10.4
25.10.3
What's Changed
- 🐛 Optimise Storage node monitor
- 🐛 Fix fdb value exceed limit
- 🐛 Other minor bug fixes, see more in the Full Changelog
Full Changelog: 25.10.2...25.10.3
25.10.2
New Features
- Control Plane: Can alternatively deploy into existing Kubernetes clusters and co-locate on workers with storage nodes.
- Kubernetes Support Matrix: Added OpenShift starting from version XX.XX.
- OpenStack Driver: Now available. Supports most optional features and is tested from OpenStack 2025.1 (Epoxy) onward. (Older OpenStack versions may be supported on request.)
- Lower Memory Footprint: Required memory on storage nodes reduced from 0.2% of storage capacity to 0.05%.
- QoS (Pool-level): Added pool-level QoS controls.
- QoS Service Classes: Assign a service class to a volume; service classes provide full performance isolation within the cluster.
- Flexible Erasure Coding: Support for flexible erasure-coding schemas within a cluster.
- Fabrics: Support for RDMA fabric and mixed fabrics (RDMA, TCP).
- Write Performance: Improvements during first write to volume and during node outage.
- Namespace Volumes: A single NVMe-oF subsystem can now expose up to 32 namespace volumes.
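To put the Lower Memory Footprint item above into numbers, the following is a minimal sketch of the arithmetic. It assumes the quoted percentages apply directly to a storage node's storage capacity. The notes do not say whether that means raw or usable capacity, or whether any fixed overhead comes on top. The function and constant names are hypothetical and not part of sbcli.

```python
# Hypothetical helper, not part of sbcli: estimates storage-node memory from
# storage capacity using only the two ratios quoted in the release notes.
OLD_RATIO = 0.002   # 0.2% of storage capacity (before 25.10.2)
NEW_RATIO = 0.0005  # 0.05% of storage capacity (from 25.10.2)


def required_memory_gib(capacity_tib: float, ratio: float) -> float:
    """Return the estimated memory in GiB for a capacity given in TiB."""
    return capacity_tib * 1024 * ratio


if __name__ == "__main__":
    for capacity_tib in (10, 100):
        old = required_memory_gib(capacity_tib, OLD_RATIO)
        new = required_memory_gib(capacity_tib, NEW_RATIO)
        print(f"{capacity_tib} TiB: {old:.2f} GiB -> {new:.2f} GiB")
```

Under these assumptions, a node with 100 TiB of capacity drops from roughly 204.8 GiB to roughly 51.2 GiB of required memory, a fourfold reduction.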
Fixes
- Control Plane: Fixed an issue that could lead to stuck deletes.
Upgrade Considerations
- Upgrades are supported from 25.7.6 and 25.7.7.
- It’s possible to add RDMA support to the fabric during an online upgrade.
Known Issues
- Using different erasure-coding schemas per cluster is available but experimental (not GA) and, in some tests, can cause I/O interruptions.
25.7.7
What's changed:
- 🐛 Bug fix: QoS settings between lvol and pool must be consistent and must not accept negative values
- 🐛 Bug fix: On bare metal, node auto restart was not triggered after a container crash, yet the node was set to online
- 🐛 Bug fix: Crypto lvol delete: first delete crypto, then lvol
Full Changelog: 25.7.6...25.7.7
25.7.6
What's Changed
- SFAM-2295: Fix _connect_device invocation from port allow service.
- SFAM-2292: check storage network interface ping on node auto restart
- Increase node restart task retry count to 80
- SFAM-2308: FIO interrupt with IO error on spdk container crash failover
- revert node restart task retry count to 8 and start count from success checks and node restart can start
- SFAM-2179: Add snodeapi logs to graylog
- fix firewall back port
- Stop health check auto fix
- SFAM-2309: Change health check service auto fix for problems in distrib cluster map
- SFAM-2310: fix base lvol ref to be found before deleting it
- SFAM-2311: Create snapshot monitor service on cluster update if not found in service ls
- Use expansion migration instead of temp migrations in case of node restart, node recovers from network outage, and node goes from down to online
- Use "physical_label" in cluster map only if we have multi node per host
- skip nsenter command for talos
- SFAM-2179: Apply container logging config to be gelf on node add
- change sn suspend function to print a msg to use shutdown
- skip generate automated deployment if spdk pod already exist
Full Changelog: 25.7.5...25.7.6
25.7.5
Changes
New Features
- Storage Plane: Added support for vfio device driver, with a fallback to the legacy uio driver.
Fixes
- Control Plane: Improved the reliability of migrations when a storage node restarts or recovers from a network outage.
- Storage Plane: Fixed an issue where the vfio driver wouldn't be available on some systems.
Full Changelog: 25.7.4...25.7.5
25.7.4
Changes
- Change health check service auto fix for problems in distrib cluster map
- fix base lvol ref to be found before deleting it
- Create snapshot monitor service on cluster update if not found in service ls
Full Changelog: 25.7.3...25.7.4
25.7.3
What's Changed
SBCLI
- Skip nsenter command for talos
- Updated init job OS check
- skip generate automated deployment if spdk pod already exist by @geoffrey1330 in #599
- fix firewall backport
SPDK
- change check references timeout
Full Changelog: 25.7.2...25.7.3
25.7.2
What's Changed
- Couple of bug fixes
- ref_count decrement when deleting cloned_from_snap lvols
- Pulling e2e from main to 25.6-Pre branch by @RaunakJalan in #571
- Merge task management improvements (sfam-1802) to R25.6 by @Hamdy-khader in #584
- R25.6 pre test fix migration subtask e2e by @boddumanohar in #586
Full Changelog: 25.7.1...25.7.2
25.7.1
New Features
- General: Clarified deployment documentation.
- General: Added documentation for cluster expansion.
- General: Added documentation for storage node migration.
- Kubernetes: Improved the Helm Chart for simplified Kubernetes deployments.
- Kubernetes: Automated applying Core Isolation on Kubernetes worker nodes.
- Proxmox: Added support for quality of service settings.
Fixes
- Control Plane: Fixed an issue where the storage utilization of a logical volume wasn't shown when the primary storage node was offline.
- Control Plane: Fixed an issue where the snapshot health status was shown as unhealthy while it was healthy.
- Control Plane: Fixed an issue where a client would fail to reconnect after a network outage due to a missing property in the configuration.
- Control Plane: Fixed an issue where the cluster would not be shown as degraded while a data migration operation is ongoing.
- Control Plane: Fixed an issue where it was possible to restart a storage node even if it was not in offline state.
- Storage Plane: Fixed an issue which caused a cluster suspension (hence I/O interruption) in case of a partial or full network outage.
- Storage Plane: Fixed an issue where a logical volume wasn't correctly deleted if the operation was issued as asynchronous.
- Storage Plane: Fixed a segfault on secondary nodes.
- Storage Plane: Fixed an error on the journal for large numbers of records.
- Storage Plane: Fixed an I/O leakage between primary and secondary storage nodes for certain I/O patterns.
- Storage Plane: Fixed an issue where rebalancing stopped early after cluster expansion, causing the cluster to become imbalanced.
- Storage Plane: Fixed an issue where a snapshot wasn't correctly re-registered after a failback.
- Storage Plane: Fixed a distrib error on network outages.
- Storage Plane: Fixed an issue where a storage node would get stuck in down state after a restart.
- Storage Plane: Fixed an issue where a checksum error could happen after failing back from a partial outage.
- Proxmox: Fixed an issue when adding additional storage pools.
Important Changes
No changes in this release.
Known Issues
Simplyblock always seeks to provide a stable and strong release. However, smaller known issues happen. The following is a list of known issues for the current simplyblock release.
- GCP: On GCP, multiple Local SSDs are connected as NVMe namespace devices. Simplyblock recommends using C4A-based ARM servers.