
DKP 2.4.1 Known Issues and Limitations

Known issues and limitations

The following items are known issues with this release.

AWS additionalTags cannot contain spaces

Due to an upstream bug in the cluster-api-provider-aws component, it is not possible to specify tags with spaces in their names in the additionalTags section of an AWSCluster. If any such tags are present during an upgrade of the capi-components, you may receive a validation error and must remove those tags. This issue will be corrected in a future DKP release.
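
For reference, the following is a minimal sketch of an additionalTags section that passes validation. The resource name, apiVersion, and tag keys are examples only and may differ in your environment:

YAML
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
metadata:
  name: my-cluster
spec:
  additionalTags:
    # Tag keys must not contain spaces
    costCenter: platform-team
    environment: production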

Use Static Credentials to Provision an Azure Cluster

Only static credentials can be used when provisioning an Azure cluster.
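
Static credentials for Azure typically correspond to a service principal with a client secret. One way to create such a service principal is with the Azure CLI; this is a general sketch, not a DKP-specific requirement, and you must substitute your own subscription ID:

CODE
az ad sp create-for-rbac --role Contributor --scopes /subscriptions/<subscription-id>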

Workaround for missing allowVolumeExpansion on EBS StorageClass

To allow for volume expansion, perform the following steps:

  1. Add the missing allowVolumeExpansion field on the EBS StorageClass by running the command below, which patches the StorageClass to set allowVolumeExpansion: true (a usage sketch for expanding a volume follows these steps):

    CODE
    kubectl patch storageclass ebs-sc --patch '{"allowVolumeExpansion": true}' 
  2. Enable CSI volume expansion by setting allowVolumeExpansion for:
    * EBS CSI driver
    * AzureDisk CSI driver
    * GCP CSI driver
    * vSphere CSI driver

  3. Add the required permissions for the respective cloud provider:
    * AWS
    * Azure
    * GCP
    * vSphere
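
Once allowVolumeExpansion is enabled on the StorageClass, an existing PersistentVolumeClaim can be grown by increasing its requested size. This is a minimal sketch; the claim name and size below are hypothetical:

CODE
kubectl patch pvc my-data-claim --patch '{"spec": {"resources": {"requests": {"storage": "20Gi"}}}}'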

EKS Upgrades do not use Declared Kubernetes Version

When upgrading an EKS cluster, DKP 2.4 does not support specifying a particular Kubernetes patch version. Instead, your cluster is upgraded to the highest patch version currently available in EKS for the MAJOR.MINOR version of Kubernetes you selected.

EKS Clusters use CSI-based EBS Drivers

EKS has disabled support for in-tree EBS volume provisioning in favor of CSI Volumes. See the dropdown labeled “Additional Considerations for EKS” in the (2.4) DKP Enterprise Upgrade | Upgrade the Core Addons page for more details on what steps you must take when upgrading.
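
For reference, a StorageClass that provisions volumes through the EBS CSI driver looks roughly like the following. This is a minimal sketch; the class name and settings are examples only, not the exact manifest DKP installs:

YAML
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true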

Intermittent Error Status when Creating EKS Clusters in the UI

When provisioning an EKS cluster through the UI, you may briefly see an error state because the EKS cluster can sporadically lose connectivity with the management cluster, which results in the following symptoms:

  • The UI shows the cluster is in an error state.

  • The kubeconfig generated and retrieved from Kommander ceases to work.

  • Applications created on the management cluster may not be immediately federated to managed EKS clusters.

After a few moments, the error resolves without any action on your part. A new kubeconfig generated and retrieved from Kommander then works properly, and the UI shows that the cluster is working again. In the meantime, you can continue to use the UI to work on the cluster, such as deploying applications, creating projects, and adding roles.

Installation and upgrade issue in pre-provisioned environments

DKP made the transition from Minio to Rook Ceph for cluster storage. An issue with Rook Ceph’s deployment prevents pre-provisioned environments from installing or upgrading to this DKP version. To solve this issue, you must set up 40 GB of raw storage for your worker nodes and customize your Rook Ceph installation as indicated in Install Kommander in a Pre-provisioned Environment or Upgrade Kommander in a Pre-provisioned Environment.
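
To confirm that a worker node has an unformatted block device available to back the Rook Ceph storage, you can list its block devices and verify that the intended device reports no filesystem. This is a general check; device names vary by environment:

CODE
lsblk -f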

Cluster Roles' information not available in the Projects section of the UI

Selecting a role of the Cluster Role type in the Projects > Roles section of a workspace displays an error message in the UI.

Workaround

You can still access a Cluster Role’s description and configuration from the UI. Take the following alternative path to view or edit the desired role:

  1. Select your workspace from the top navigation bar.

  2. Select Administration > Access Control from the sidebar.

  3. A table appears that lists roles from all Projects in the selected workspace.

  4. Select the Name or ID of the Cluster Role you want to access. A page opens that contains more information and configuration options for the role.

To access a Cluster Role’s description and configuration page, ensure your user has sufficient cluster view or edit rights. You will not be able to select roles for which you do not have access rights.

Resolve issues with failed HelmReleases

An issue with the Flux helm-controller can cause HelmReleases to fail with the error message Helm upgrade failed: another operation (install/upgrade/rollback) is in progress. This can happen when the helm-controller is restarted while a HelmRelease is still installing or upgrading.

Workaround

To verify whether the HelmRelease error was caused by the helm-controller restarting, first try suspending and resuming the HelmRelease:

CODE
kubectl -n <namespace> patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n <namespace> patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
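
You can watch the reconciliation progress by inspecting the HelmRelease status. This is a minimal sketch; replace the placeholders with your namespace and release name:

CODE
kubectl -n <namespace> get helmrelease <HELMRELEASE_NAME>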

The suspend/resume might resolve the issue: you should see the HelmRelease attempt to reconcile, and it either succeeds (with status: Release reconciliation succeeded) or fails with the same error as before.

If the HelmRelease is still in the failed state, the failure is likely related to the helm-controller restarting. The following example assumes the 'reloader' HelmRelease is the one that is stuck.

To resolve the issue, follow these steps:

  1. List secrets containing the affected HelmRelease name:

    CODE
    kubectl get secrets -n <namespace> | grep reloader

    The output should look like this:

    CODE
    kommander-reloader-reloader-token-9qd8b                        kubernetes.io/service-account-token   3      171m
    sh.helm.release.v1.kommander-reloader.v1                       helm.sh/release.v1                    1      171m
    sh.helm.release.v1.kommander-reloader.v2                       helm.sh/release.v1                    1      117m           

    In this example, sh.helm.release.v1.kommander-reloader.v2 is the most recent revision.

  2. Find and delete the most recent revision secret, for example, sh.helm.release.v1.*.<revision>:

    CODE
    kubectl delete secret -n <namespace> <most recent helm revision secret name>
  3. Suspend and resume the HelmRelease to trigger a reconciliation:

    CODE
    kubectl -n <namespace> patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
    kubectl -n <namespace> patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'

You should see the HelmRelease reconcile, and the upgrade or install eventually succeeds.
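
The helm CLI can also be used to cross-check which revision is stuck before deleting its secret; the stuck revision is typically shown with a pending status. This is a minimal sketch using the reloader example above:

CODE
helm history kommander-reloader -n <namespace>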

Calico not updated during DKP upgrade on Flatcar

When upgrading a DKP cluster running on Flatcar OS, you may find that the Calico services were not updated after the upgrade. This occurs because the upgrade procedure does not correctly update the Flatcar-specific CNI ClusterResourceSet (CRS). This issue only impacts the Calico CRS.

Follow these steps to manually correct this issue:

  1. Update the ConfigMap as follows:

    YAML
    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    data:
      custom-resources.yaml: |+
        # This section includes base Calico installation configuration.
        # For more information, see: <https://docs.projectcalico.org/reference/installation/api>
        apiVersion: operator.tigera.io/v1
        kind: Installation
        metadata:
          name: default
        spec:
          # Configures Calico networking.
          calicoNetwork:
            # Note: The ipPools section cannot be modified post-install.
            ipPools:
            - blockSize: 26
              cidr: 192.168.0.0/16
              encapsulation: IPIP
              natOutgoing: Enabled
              nodeSelector: all()
            bgp: Enabled
            nodeAddressAutodetectionV4:
              firstFound: true
          # FlexVolume path must be mounted under /opt on flatcar/coreos systems
          flexVolumePath: /opt/libexec/kubernetes/kubelet-plugins/volume/exec/
    kind: ConfigMap
    metadata:
      name: calico-cni-installation-$CLUSTER_NAME
    EOF
  2. Edit the ClusterResourceSet:

    CODE
    kubectl edit clusterresourceset calico-cni-installation-$CLUSTER_NAME

    In the editor, update spec.clusterSelector.matchLabels so that it contains konvoy.d2iq.io/osHint: flatcar.
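
If you prefer a non-interactive change, the same label can be set with kubectl patch. This is a sketch under the assumption that the ClusterResourceSet name matches the ConfigMap above:

CODE
kubectl patch clusterresourceset calico-cni-installation-$CLUSTER_NAME --type merge -p '{"spec":{"clusterSelector":{"matchLabels":{"konvoy.d2iq.io/osHint":"flatcar"}}}}'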

Kube-oidc-proxy not ready after upgrade

If you installed or attached a cluster in DKP 2.1, kube-oidc-proxy is not available after upgrading to 2.3 and 2.4. This application is required to access the Kubernetes API (with kubectl) using SSO. Affected customers experience issues with authentication via kubectl.

To make the application available, run the following command on each cluster that was installed, created or attached in 2.1, and is now on DKP version 2.4. Replace <namespace> with each cluster’s workspace namespace:

CODE
kubectl -n <namespace> patch appdeployment kube-oidc-proxy --type=json -p '[{"op":"remove","path":"/spec/configOverrides"}]'
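
After applying the patch, you can verify that the application becomes available by checking that the kube-oidc-proxy pods reach a Running state on the affected cluster. This is a general check; the exact pod names vary:

CODE
kubectl get pods -A | grep kube-oidc-proxy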
