Clarification on Multiple Deployments notes
The main README.md has the following comments:
Multiple Deployments
You may install multiple deployments of each/any driver. It requires the following:
- Use a new helm release name for each deployment
- Make sure you have a unique csiDriver.name in the values file
- Use unique names for your storage classes (per cluster)
- Use a unique parent dataset (i.e., don't try to use the same parent across deployments or clusters)
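To make those requirements concrete, here is a minimal sketch of how the values file for one of two deployments sharing a single TrueNAS box might look. Every name, storage class, and dataset path below is a made-up placeholder, not taken from this issue:

```yaml
# Sketch of per-deployment values; every name and path below is a placeholder.
# First deployment (e.g. helm release "zfs-iscsi-dev"):
csiDriver:
  # must be unique across every democratic-csi deployment that can reach
  # the same TrueNAS backend
  name: org.democratic-csi.iscsi-dev

storageClasses:
  - name: truenas-iscsi-dev          # unique storage class name per cluster
    defaultClass: false
    reclaimPolicy: Delete
    volumeBindingMode: Immediate
    allowVolumeExpansion: true

driver:
  config:
    zfs:
      # each deployment gets its own parent dataset
      datasetParentName: tank/k8s/dev/v
      detachedSnapshotsDatasetParentName: tank/k8s/dev/s
```

A second deployment would repeat the same shape with a different release name, csiDriver.name, storage class name, and dataset paths.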
I have two independent K3s clusters that share nothing but a TrueNAS backend for iSCSI and NFS. Two independent ArgoCD repositories are used to build each; I’ll call them DEV and TEST. The deployment files are essentially identical between them other than ingress route names, IP addresses, and similar minor items. Ansible is used to build each of them.
Regarding democratic-csi, each cluster points to a different ZFS parent dataset, but otherwise they are the same deployment. There is only ONE deployment of the iSCSI provider and the NFS provider per cluster. All other TrueNAS connectivity between the two is the same: same user ID for SSH, same iSCSI connection configuration.
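For context, the relevant portion of each cluster's driver config looked conceptually like the sketch below: the connection details are shared and only the parent dataset differs. Hosts, usernames, and dataset paths are placeholders, not my real values:

```yaml
# freenas-iscsi driver config excerpt; hosts, users, and dataset paths are
# placeholders, not the real values from my clusters.
driver:
  config:
    driver: freenas-iscsi
    sshConnection:
      host: truenas.example.lan              # same TrueNAS host for both clusters
      port: 22
      username: csi                          # same SSH user for both clusters
    iscsi:
      targetPortal: truenas.example.lan:3260 # same iSCSI portal for both clusters
    zfs:
      # the only per-cluster difference at this point
      datasetParentName: tank/k8s/dev/v      # TEST pointed at e.g. tank/k8s/test/v
```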
What I observed on a deployment of Kube Prometheus Stack to DEV: flawless, everything worked as expected, no issues. It created 3 iSCSI PVs in the correct parent dataset.
I then went to deploy it to TEST, and it got a little weird:
- It created three ZVOLs (one for each PV) in the correct (different) parent dataset.
- Each of the three PVCs in TEST has status Bound and points to a volume name in the correct parent dataset.
Based on that I assumed all was OK. However, when I logged into the TEST Grafana (which worked), the data was weird: I saw my work-in-progress dashboards from the DEV environment that have yet to be deployed to TEST. These dashboards were not part of the ArgoCD deployment for TEST; it's somehow picking them up from the database.
I checked on TrueNAS the status of the 3 ZVOLs created for TEST, which are in Bound status, and each of them is still just 88 KiB in size and not growing. They are clearly not being used.
Both clusters generated the same claim names (expected), with the same StorageClass, Reclaim Policy, Access Mode, and Capacity. 6 PVs got created, each with a different volume name; each of the volume names lines up to a ZVOL in the correct parent dataset, and all six report Bound status. It seems like everything got created correctly, but at some layer the CSI is confused.
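One way to narrow down where the confusion sits is to compare what each cluster's PV objects record at the CSI layer. These are standard Kubernetes PV fields; the values below are placeholders, not taken from my clusters:

```yaml
# Illustrative PersistentVolume excerpt; all values are placeholders.
# The fields worth comparing across the two clusters are spec.csi.driver
# (should match that cluster's csiDriver.name) and spec.csi.volumeHandle
# (should map to a distinct target/extent/ZVOL on the shared TrueNAS).
# If two deployments generate identically named targets/extents on the
# backend, both clusters can end up attaching to the same one even though
# their PV and PVC objects all look healthy.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-1f2e3d4c-aaaa-bbbb-cccc-000000000000
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: truenas-iscsi-dev
  persistentVolumeReclaimPolicy: Delete
  csi:
    driver: org.democratic-csi.iscsi-dev     # csiDriver.name from the values file
    volumeHandle: pvc-1f2e3d4c-aaaa-bbbb-cccc-000000000000
```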
I’m wondering about "Make sure you have a unique csiDriver.name in the values file": is that unique per cluster, or does it need to be unique across all clusters?
Added some extra notes in the README and merged.
I was already uninstalling and cleaning up, and completed a full cleanup on both clusters. The CSI cleanup did not need any manual work; everything deleted on its own. I uninstalled the CSI iSCSI Helm release as well.
Now I have Ansible render the nameSuffix as the name of the parent dataset, with any "/" converted to "-" (a sketch of this follows below). As a result, the iSCSI target and extent names look different:
- csi-monitoring-kube-prometheus-stack-grafana-main-kts
- csi-monitoring-kube-prometheus-stack-grafana-main-k8s
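For reference, the change is roughly the shape below. The parent_dataset variable and the Jinja2 filter are my own Ansible plumbing, shown only for illustration; iscsi.namePrefix and iscsi.nameSuffix are the freenas-iscsi config options that feed into target and extent naming:

```yaml
# Helm values excerpt (sketch). "parent_dataset" is a hypothetical Ansible
# variable holding this cluster's parent dataset, e.g. "main/k8s";
# replace('/', '-') renders the suffix as "-main-k8s".
driver:
  config:
    zfs:
      datasetParentName: "{{ parent_dataset }}"
    iscsi:
      namePrefix: csi-
      # unique per cluster because the parent dataset is already unique
      nameSuffix: "-{{ parent_dataset | replace('/', '-') }}"
```

Since each cluster already has a unique parent dataset, deriving the suffix from it keeps the generated target and extent names from colliding between clusters.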
I let ArgoCD do its magic to redeploy the deltas, and all ZVOLs started to show growth.
I’ll let things run overnight and see whether any driver pods get restarted.