How to Backup and Restore ETCD?
Backup ETCD
This guide shows how to back up and restore ETCD. If you encounter any issues, refer to the Troubleshooting section at the end of this guide.
First, find the CA and server certificates that ETCD uses:
cat /etc/kubernetes/manifests/etcd.yaml
Note these 3 flags: --trusted-ca-file, --cert-file, --key-file
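Reading the whole manifest to find three flags is easy to get wrong, so the lookup can be narrowed with grep. This is a minimal sketch: the function name etcd_cert_flags is hypothetical, and the default path is the manifest location used in this guide.

```shell
# Print only the three certificate flags from an etcd static pod manifest.
# etcd_cert_flags is a hypothetical helper name, not part of any tool.
etcd_cert_flags() {
  manifest="${1:-/etc/kubernetes/manifests/etcd.yaml}"
  # The leading "--" in the pattern keeps --peer-cert-file and the other
  # peer flags out of the result; the bare "--" ends grep's own options.
  grep -E -- '--(trusted-ca-file|cert-file|key-file)=' "$manifest"
}

# Usage on a control plane node:
#   etcd_cert_flags
```

The three values printed map to the --cacert, --cert and --key options of etcdctl, in that order of meaning (CA, server certificate, server key).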
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /opt/snapshot.db
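Before relying on the snapshot, it may be worth confirming that the file was actually written and is readable. The sketch below assumes the snapshot path used in this guide; verify_snapshot is a hypothetical helper name. On newer etcd releases the status subcommand is provided by etcdutl instead of etcdctl, so adjust the command if your version reports it as deprecated or missing.

```shell
# Check that a snapshot file exists, is non-empty, and can be read.
# verify_snapshot is a hypothetical helper name.
verify_snapshot() {
  snapshot="${1:-/opt/snapshot.db}"
  if [ ! -s "$snapshot" ]; then
    echo "snapshot $snapshot is missing or empty" >&2
    return 1
  fi
  # Prints the snapshot hash, revision, total keys and size as a table.
  ETCDCTL_API=3 etcdctl snapshot status "$snapshot" --write-out=table
}

# Usage:
#   verify_snapshot /opt/snapshot.db
```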
Restore ETCD to a new folder
ETCDCTL_API=3 etcdctl --data-dir /var/lib/etcd-backup \
snapshot restore /opt/snapshot.db
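etcdctl generally refuses to restore into a data directory that already exists, so a small guard can make that failure explicit before the restore is attempted. This is a sketch under the paths used in this guide; restore_etcd is a hypothetical wrapper name.

```shell
# Restore a snapshot into a new data directory, refusing to overwrite.
# restore_etcd is a hypothetical wrapper name.
restore_etcd() {
  snapshot="${1:-/opt/snapshot.db}"
  data_dir="${2:-/var/lib/etcd-backup}"
  if [ -e "$data_dir" ]; then
    echo "$data_dir already exists; move or remove it before restoring" >&2
    return 1
  fi
  ETCDCTL_API=3 etcdctl --data-dir "$data_dir" snapshot restore "$snapshot"
}

# Usage:
#   restore_etcd /opt/snapshot.db /var/lib/etcd-backup
```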
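Modify etcd.yaml so that data-dir points to the directory restored in the previous step: /var/lib/etcd-backup
Note: The data directory appears in 3 places: the --data-dir flag, the mountPath of the etcd-data volume mount, and the hostPath path of the etcd-data volume. Change all of them. If any of them is missed, the ETCD pod will not use the restored data and you will run into the problems described under Troubleshooting.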
vi /etc/kubernetes/manifests/etcd.yaml
Change the values below:
    --data-dir=/var/lib/etcd-backup
... ... ... ... ... ... ...
    volumeMounts:
    - mountPath: /var/lib/etcd-backup
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd-backup
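Since the same directory has to be changed in three places, the edit can also be scripted rather than done by hand in vi. The sketch below assumes the manifest currently uses /var/lib/etcd (the usual kubeadm default, but check your own file); set_etcd_data_dir is a hypothetical helper name.

```shell
# Replace the etcd data directory everywhere it ends a line in the manifest
# and print how many lines now reference the new directory.
# set_etcd_data_dir is a hypothetical helper name.
set_etcd_data_dir() {
  manifest="$1"
  old_dir="$2"
  new_dir="$3"
  tmp="$(mktemp)" || return 1
  # Anchor on end of line so longer paths that merely start with the old
  # directory are left untouched.
  sed "s|${old_dir}\$|${new_dir}|" "$manifest" > "$tmp" || return 1
  # Write back in place so the manifest keeps its owner and permissions.
  cat "$tmp" > "$manifest"
  rm -f "$tmp"
  grep -c -- "${new_dir}\$" "$manifest"
}

# Usage (expected to print 3 for a standard manifest):
#   set_etcd_data_dir /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd /var/lib/etcd-backup
```

The temporary file is created outside the manifests directory on purpose, because the kubelet may try to load extra files placed there as pod manifests.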
systemctl daemon-reload
systemctl restart kubelet
Wait about 3 minutes for the ETCD static pod to be recreated.
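Instead of waiting a fixed 3 minutes, the check can be retried until it succeeds. This is only a sketch: wait_for is a hypothetical helper, and the 5 second interval and 180 second timeout are assumptions chosen to match the waiting time mentioned above.

```shell
# Retry a command every 5 seconds until it succeeds or the timeout
# (in seconds) has elapsed. wait_for is a hypothetical helper name.
wait_for() {
  timeout="$1"
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep 5
    elapsed=$((elapsed + 5))
  done
}

# Usage: wait up to 180 seconds for the API server to answer again.
#   wait_for 180 kubectl get node
```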
Then check the pod:
kubectl get pod -n kube-system

Troubleshooting
No resources found on node

Reload the systemd daemon and restart the kubelet service:
systemctl daemon-reload
systemctl restart kubelet
Check the controlplane node status after the service has restarted:
kubectl get node
NotReady status on node

This may be caused by the data-dir values in etcd.yaml. Make sure you have changed all of them:
vi /etc/kubernetes/manifests/etcd.yaml
Change the values below:
    --data-dir=/var/lib/etcd-backup
... ... ... ... ... ... ...
    volumeMounts:
    - mountPath: /var/lib/etcd-backup
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd-backup
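A quick way to confirm that no data-dir value was missed is to count how many lines of the manifest end with the new directory. The sketch below assumes a manifest with exactly three such references, as described in this guide; check_etcd_data_dir is a hypothetical helper name.

```shell
# Report whether the manifest references the data directory in all three
# places. check_etcd_data_dir is a hypothetical helper name.
check_etcd_data_dir() {
  manifest="$1"
  dir="$2"
  # grep -c exits non-zero when the count is 0, so ignore its exit status.
  count="$(grep -c -- "${dir}\$" "$manifest" || true)"
  if [ "$count" -eq 3 ]; then
    echo "ok: $count references to $dir"
  else
    echo "mismatch: expected 3 references to $dir, found $count"
    return 1
  fi
}

# Usage:
#   check_etcd_data_dir /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd-backup
```

A path such as /var/lib/etcd/etcd-backup in the hostPath entry would be reported as a mismatch, which is one way a NotReady node can arise.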
Pending status of all Pods
If the data-dir is correct but all pods are in the Pending state or no pods are displayed, remove the new ETCD data-dir:
rm -rf /var/lib/etcd-backup
Re-run the restore process
ETCDCTL_API=3 etcdctl --data-dir /var/lib/etcd-backup \
snapshot restore /opt/snapshot.db
Delete the ETCD pod so that it is recreated, then check its status after it restarts:
kubectl delete pod -n kube-system etcd-controlplane
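The pod name above ends in controlplane because static pods are named after the node they run on. On a node with a different name, the pod name can be derived as sketched below. This assumes the node name equals the machine's hostname, which is common with kubeadm but not guaranteed; confirm it with kubectl get node. etcd_pod_name is a hypothetical helper name.

```shell
# Build the etcd static pod name for a node: "etcd-<node name>".
# etcd_pod_name is a hypothetical helper name; with no argument it
# assumes the node name is the local hostname.
etcd_pod_name() {
  node="${1:-$(hostname)}"
  echo "etcd-${node}"
}

# Usage:
#   kubectl delete pod -n kube-system "$(etcd_pod_name)"
```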
Check running pods
kubectl get pod --all-namespaces