Upgrading cloud-native storage Longhorn to 1.7, which adds periodic and on-demand full backups
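Notes on upgrading Longhorn distributed storage to 1.7 on k3s
A quick post written in a spare moment. If you are interested, first read the earlier walkthrough, "Foolproof tutorial: deploying cloud-native storage Longhorn".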
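New features
- A new CLI tool (longhornctl) replaces the earlier scripts (used mostly by new users). I have not looked into its other uses yet; the official notes say it also supports troubleshooting. See the command-line documentation.
- RWX (NFS protocol) enhancements
  - Fast failover for RWX volumes, with quick detection of and response to ShareManager failures
  - Storage network support for RWX (ReadWriteMany) volumes
- Automatic deletion of workload pods when a volume is unexpectedly detached
- Periodic and on-demand full backups
- Improved automatic replica balancing
- Kernel version 6.7 or later is recommended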
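To see whether a node meets the kernel recommendation in the list above, something like the following could be run on each node. This is only a minimal sketch: `kernel_at_least` is a helper name I made up, and it assumes the release string starts with `major.minor` the way `uname -r` normally prints it.

```shell
# Return success if a kernel release string (e.g. "6.8.0-45-generic")
# is at least the given major.minor version.
# kernel_at_least is a hypothetical helper, not part of Longhorn.
kernel_at_least() {
  local release="$1" want_major="$2" want_minor="$3"
  local major minor
  major="${release%%.*}"        # text before the first dot
  minor="${release#*.}"         # text after the first dot
  minor="${minor%%[!0-9]*}"     # keep only the leading digits
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Usage: compare the running kernel against 6.7.
#   kernel_at_least "$(uname -r)" 6 7 && echo "kernel OK" || echo "kernel older than 6.7"
```

An older kernel does not block the upgrade; the list only states a recommendation.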
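New bug
Volumes created before v1.5.2 and v1.4.4 cannot be attached
If your Longhorn cluster contains resources with the following characteristics, avoid upgrading to v1.7.0:
- Resource name: follows the format <volume name>-e-<8-char random id>
- Creation time: created while a Longhorn version earlier than v1.5.2 or v1.4.4 was installed on the cluster
Run the following command to check whether the Longhorn cluster can be safely upgraded to v1.7.0: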
[ $(kubectl -n longhorn-system get engines.longhorn.io -o name | grep -E '\-e\-[a-z0-9]{8}$' | wc -l) -gt 0 ] && echo "Please hold off on upgrading to v1.7.0 until v1.7.1 is available." || echo "Safe to upgrade to v1.7.0."
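The one-liner above only reports whether any match exists. If you also want to see which engines match, the same pattern can be wrapped in a small function. This is a sketch rather than an official tool: the function names and the `KUBECTL` override variable are my own, and the regular expression is the one from the command above.

```shell
# Keep only engine names (one per line on stdin) that end in
# -e-<8 lowercase letters or digits>, the pattern used in the check above.
affected_engines() {
  grep -E -- '-e-[a-z0-9]{8}$' || true   # "no match" is not treated as an error
}

# List matching engines in the cluster.
# KUBECTL is a hypothetical override, handy for testing without a cluster.
list_affected_engines() {
  "${KUBECTL:-kubectl}" get engines.longhorn.io -n longhorn-system -o name | affected_engines
}

# Usage (needs cluster access):
#   list_affected_engines
```

Empty output corresponds to the "Safe to upgrade" branch of the original command.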
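Impressions after the upgrade
Automatic replica rebalancing
This used to need a lot of manual intervention. Now, once the threshold in the setting below is tuned, Longhorn automatically rebuilds the affected replicas on another disk within the same node.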
kubectl get settings.longhorn.io/replica-auto-balance-disk-pressure-percentage  -n longhorn-system
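The command above only reads the current value. One way to change it is to patch the Setting resource, which as far as I know stores its value as a string in a top-level `value` field. Treat this as a sketch under those assumptions: the function names, the `KUBECTL` override, and the 0-100 range check are mine, not taken from the Longhorn documentation.

```shell
# Build a JSON merge patch for the setting.
# The 0-100 check is my own sanity check for a percentage value.
pressure_patch() {
  local pct="$1"
  case "$pct" in
    ''|*[!0-9]*) echo "not a non-negative integer: $pct" >&2; return 1 ;;
  esac
  [ "$pct" -le 100 ] || { echo "out of range: $pct" >&2; return 1; }
  printf '{"value":"%s"}' "$pct"
}

# Apply the patch; KUBECTL is a hypothetical override for testing.
set_disk_pressure_threshold() {
  local patch
  patch="$(pressure_patch "$1")" || return 1
  "${KUBECTL:-kubectl}" patch settings.longhorn.io \
    replica-auto-balance-disk-pressure-percentage \
    -n longhorn-system --type merge -p "$patch"
}

# Usage (needs cluster access):
#   set_disk_pressure_threshold 80
```

The same setting can also be changed from the Longhorn UI, which may be the more comfortable route for a one-off adjustment.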
Upgrade and install
I have already mirrored the required images to a registry inside China.
Add the Longhorn Helm repository
helm repo add longhorn https://charts.longhorn.io
Update the repository index
helm repo update
Prepare values.yaml
Because of constraints in my environment, Longhorn runs only on the storage nodes.
global:
  nodeSelector:
    node-role.kubernetes.io/storage: "true"
  tolerations:
    - operator: Exists
      effect: NoSchedule
persistence:
  defaultClass: false
  defaultDataLocality: best-effort
  migratable: true
defaultSettings:
  createDefaultDiskLabeledNodes: true
  defaultDataPath: /data/k8s/longhorn
  defaultDataLocality: best-effort
  replicaAutoBalance: best-effort
  storageMinimalAvailablePercentage: 10
  systemManagedComponentsNodeSelector: "node-role.kubernetes.io/storage: true"
  taintToleration: ":NoSchedule"
longhornUI:
  replicas: 1
ingress:
  enabled: true
  ingressClassName: nginx
  host: lhsc.ysicing.local
  # annotations:
  #   nginx.ingress.kubernetes.io/auth-type: basic
  #   nginx.ingress.kubernetes.io/auth-secret: basic-auth
longhornManager:
  nodeSelector:
    node-role.kubernetes.io/storage: "true"
  tolerations:
    - operator: Exists
      effect: NoSchedule
longhornDriver:
  nodeSelector:
    node-role.kubernetes.io/storage: "true"
  tolerations:
    - operator: Exists
      effect: NoSchedule
image:
  longhorn:
    engine:
      repository: ccr.ccs.tencentyun.com/k7scn/longhorn-engine
    manager:
      repository: ccr.ccs.tencentyun.com/k7scn/longhorn-manager
    ui:
      repository: ccr.ccs.tencentyun.com/k7scn/longhorn-ui
    instanceManager:
      repository: ccr.ccs.tencentyun.com/k7scn/longhorn-instance-manager
    shareManager:
      repository: ccr.ccs.tencentyun.com/k7scn/longhorn-share-manager
    backingImageManager:
      repository: ccr.ccs.tencentyun.com/k7scn/backing-image-manager
    supportBundleKit:
      repository: ccr.ccs.tencentyun.com/k7scn/support-bundle-kit
  csi:
    attacher:
      repository: ccr.ccs.tencentyun.com/k7scn/csi-attacher
    provisioner:
      repository: ccr.ccs.tencentyun.com/k7scn/csi-provisioner
    nodeDriverRegistrar:
      repository: ccr.ccs.tencentyun.com/k7scn/csi-node-driver-registrar
    resizer:
      repository: ccr.ccs.tencentyun.com/k7scn/csi-resizer
    snapshotter:
      repository: ccr.ccs.tencentyun.com/k7scn/csi-snapshotter
    livenessProbe:
      repository: ccr.ccs.tencentyun.com/k7scn/livenessprobe
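The values above only have an effect on nodes that carry the matching labels. The node selectors expect `node-role.kubernetes.io/storage=true`, and with `createDefaultDiskLabeledNodes: true` Longhorn creates a default disk only on nodes labeled `node.longhorn.io/create-default-disk=true`. A minimal sketch for labeling a node, where `label_storage_node`, the `KUBECTL` override, and the node name `worker-1` are placeholders of mine:

```shell
# Label one node so that the nodeSelector in values.yaml matches it and
# Longhorn creates a default disk on it.
# KUBECTL is a hypothetical override for testing without a cluster.
label_storage_node() {
  local node="$1"
  [ -n "$node" ] || { echo "usage: label_storage_node <node-name>" >&2; return 1; }
  "${KUBECTL:-kubectl}" label node "$node" --overwrite \
    node-role.kubernetes.io/storage=true \
    node.longhorn.io/create-default-disk=true
}

# Usage (worker-1 is a placeholder node name):
#   label_storage_node worker-1
```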
Upgrade
helm upgrade -i longhorn longhorn/longhorn -n longhorn-system -f values.yaml
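After the helm command returns, it can be worth waiting for the main workloads to finish rolling out before attaching volumes again. The sketch below assumes the chart's default workload names (`longhorn-manager`, `longhorn-ui`, `longhorn-driver-deployer`); the function name, the `KUBECTL` override, and the default timeout are my own choices.

```shell
# Wait for the main Longhorn workloads to finish rolling out.
# First argument: timeout per workload (default 10m).
# KUBECTL is a hypothetical override for testing without a cluster.
wait_for_longhorn() {
  local kc="${KUBECTL:-kubectl}" timeout="${1:-10m}"
  "$kc" rollout status daemonset/longhorn-manager -n longhorn-system --timeout="$timeout" &&
  "$kc" rollout status deployment/longhorn-ui -n longhorn-system --timeout="$timeout" &&
  "$kc" rollout status deployment/longhorn-driver-deployer -n longhorn-system --timeout="$timeout"
}

# Usage (needs cluster access):
#   wait_for_longhorn 5m
```

The function stops at the first workload that fails to become ready within the timeout, so its exit status can be checked in a script.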
