DKP 2.4.1 Known Issues and Limitations
Known issues and limitations
The following items are known issues with this release.
AWS additionalTags cannot contain spaces
Due to an upstream bug in the cluster-api-provider-aws component, it is not possible to specify tags with spaces in their name in the additionalTags section of an AWSCluster. If you have any tags like this during an upgrade of the capi-components, you may receive a validation error and will need to remove any such tags. This issue will be corrected in a future DKP release.
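Before upgrading the capi-components, it may help to check whether an existing AWSCluster carries tag names that contain spaces. The following is a minimal sketch, not a DKP tool: the function name find_tags_with_spaces is hypothetical, and it assumes that kubectl points at the cluster holding the AWSCluster object and that the tags live under spec.additionalTags.

```shell
# Hypothetical helper (not shipped with DKP): print every additionalTags key
# of an AWSCluster that contains a space.
# Usage: find_tags_with_spaces <namespace> <awscluster-name>
find_tags_with_spaces() {
  # The go-template prints one tag key per line; grep keeps keys with a space.
  # grep exits non-zero when nothing matches, which here just means
  # "no offending tags", so that exit status is ignored.
  kubectl get awscluster "$2" -n "$1" \
    -o go-template='{{range $k, $v := .spec.additionalTags}}{{$k}}{{"\n"}}{{end}}' \
    | grep ' ' || true
}
```

An empty result suggests no tag names need to be removed; any printed key is a candidate for removal before the upgrade.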
Use Static Credentials to Provision an Azure Cluster
Only static credentials can be used when provisioning an Azure cluster.
Workaround for missing allowVolumeExpansion on EBS StorageClass
To allow for volume expansion, perform the following steps:
Add the missing allowVolumeExpansion field to the EBS StorageClass by running the command below, which patches the StorageClass to set allowVolumeExpansion: true:
kubectl patch storageclass ebs-sc --patch '{"allowVolumeExpansion": true}'
Enable CSI volume expansion by setting allowVolumeExpansion for:
* EBS CSI driver
* AzureDisk CSI driver
* GCP CSI driver
* vSphere CSI driver
Add permissions to the following areas for the respective cloud provider:
* AWS
* Azure
* GCP
* vSphere
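After patching, the setting can be read back from the StorageClass. The sketch below is illustrative only: check_volume_expansion is a hypothetical function name, and the StorageClass name ebs-sc is taken from the patch command above.

```shell
# Hypothetical helper: report whether a StorageClass allows volume expansion.
# Usage: check_volume_expansion <storageclass-name>
check_volume_expansion() {
  # allowVolumeExpansion is a top-level field of the StorageClass object.
  value=$(kubectl get storageclass "$1" -o jsonpath='{.allowVolumeExpansion}')
  if [ "$value" = "true" ]; then
    echo "$1: volume expansion enabled"
  else
    echo "$1: volume expansion NOT enabled"
    return 1
  fi
}
```

For example, check_volume_expansion ebs-sc returns a non-zero status while the field is still missing.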
EKS Upgrades do not use Declared Kubernetes Version
DKP version 2.4 does not support specifying a particular Kubernetes patch version when upgrading EKS clusters. Instead, your cluster is upgraded to the highest patch version currently available in EKS for the MAJOR.MINOR version of Kubernetes you selected.
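To make the effect concrete, the sketch below reduces a declared version to the MAJOR.MINOR part, which is the only part the EKS upgrade takes into account. The function name k8s_major_minor is hypothetical and the version strings are examples only.

```shell
# Hypothetical helper: keep only the MAJOR.MINOR part of a Kubernetes version.
# Usage: k8s_major_minor <version>
k8s_major_minor() {
  # Split on "." and keep the first two fields.
  echo "$1" | cut -d. -f1,2
}
```

Declaring v1.23.7, for example, is treated the same as declaring any other v1.23 patch release.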
EKS Clusters use CSI-based EBS Drivers
EKS has disabled support for in-tree EBS volume provisioning in favor of CSI Volumes. See the dropdown labeled “Additional Considerations for EKS” in the (2.4) DKP Enterprise Upgrade | Upgrade the Core Addons page for more details on what steps you must take when upgrading.
Intermittent Error Status when Creating EKS Clusters in the UI
When provisioning an EKS cluster through the UI, you may see a brief error state. The EKS cluster can sporadically lose connectivity with the management cluster, which results in the following symptoms:
The UI shows the cluster is in an error state.
The kubeconfig generated and retrieved from Kommander ceases to work.
Applications created on the management cluster may not be immediately federated to managed EKS clusters.
After a few moments, the error resolves without any action on your part. A new kubeconfig generated and retrieved from Kommander then works properly, and the UI shows that the cluster is working again. In the meantime, you can continue to use the UI to work on the cluster, for example to deploy applications, create projects, and add roles.
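If you script against the cluster during this window, a simple polling loop can wait until the API answers again. This is a sketch under assumptions: wait_for_api is a hypothetical function name, the kubeconfig path is whatever file you most recently retrieved from Kommander, and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: poll the cluster API until it responds or attempts run out.
# Usage: wait_for_api <kubeconfig-path> [attempts] [delay-seconds]
wait_for_api() {
  wfa_kubeconfig=$1
  wfa_attempts=${2:-10}
  wfa_delay=${3:-15}
  wfa_i=1
  while [ "$wfa_i" -le "$wfa_attempts" ]; do
    # Any cheap read-only call works as a probe; listing nodes is used here.
    if kubectl --kubeconfig "$wfa_kubeconfig" get nodes >/dev/null 2>&1; then
      echo "cluster reachable after $wfa_i attempt(s)"
      return 0
    fi
    wfa_i=$((wfa_i + 1))
    sleep "$wfa_delay"
  done
  echo "cluster still unreachable after $wfa_attempts attempt(s)" >&2
  return 1
}
```

If the loop times out, retrieving a new kubeconfig from Kommander and retrying matches the behavior described above.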
Installation and upgrade issue in pre-provisioned environments
DKP made the transition from Minio to Rook Ceph for cluster storage. An issue with Rook Ceph’s deployment prevents pre-provisioned environments from installing and upgrading to this DKP version. To solve this issue, you must set up 40 GB of raw storage for your worker nodes and customize your Rook Ceph installation as indicated in Install Kommander in a Pre-provisioned Environment or Upgrade Kommander in a Pre-provisioned Environment.
Cluster Roles' information not available in the Projects section of the UI
Selecting a role of the Cluster Role type in the Projects > Roles section of a workspace displays an error message in the UI.
Workaround
You can still access a Cluster Role’s description and configuration from the UI. Take the following alternative path to view or edit the desired role:
Select your workspace from the top navigation bar.
Select Administration > Access Control from the sidebar.
A table appears that lists roles from all Projects in the selected workspace.
Select the Name or ID of the Cluster Role you want to access. A page opens that contains more information and configuration options for the role.
To access a Cluster Role’s description and configuration page, ensure your user has sufficient cluster view or edit rights. You will not be able to select roles for which you do not have access rights.
Resolve issues with failed HelmReleases
An issue with the Flux helm-controller can cause HelmReleases to fail with the error message "Helm upgrade failed: another operation (install/upgrade/rollback) is in progress". This can happen when the helm-controller is restarted while a HelmRelease is still upgrading or installing.
Workaround
To ensure the HelmRelease error was caused by the helm-controller restarting, first try to suspend/resume the HelmRelease:
kubectl -n <namespace> patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n <namespace> patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
You should see the HelmRelease attempt to reconcile, after which it either succeeds (with status: Release reconciliation succeeded) or fails with the same error as before. If it succeeds, no further action is needed.
If the HelmRelease is still in the failed state, the failure is likely related to the helm-controller restarting. The steps below use the 'reloader' HelmRelease as the example of a stuck release.
To resolve the issue, follow these steps:
List secrets containing the affected HelmRelease name:
kubectl get secrets -n ${NAMESPACE} | grep reloader
The output should look like this:
kommander-reloader-reloader-token-9qd8b    kubernetes.io/service-account-token   3   171m
sh.helm.release.v1.kommander-reloader.v1   helm.sh/release.v1                    1   171m
sh.helm.release.v1.kommander-reloader.v2   helm.sh/release.v1                    1   117m
In this example, sh.helm.release.v1.kommander-reloader.v2 is the most recent revision.
Find and delete the most recent revision secret, which is named in the form sh.helm.release.v1.*.<revision>:
kubectl delete secret -n <namespace> <most recent helm revision secret name>
Suspend and resume the HelmRelease to trigger a reconciliation:
kubectl -n <namespace> patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n <namespace> patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
You should see the HelmRelease reconcile, and eventually the upgrade or install succeeds.
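Picking the most recent revision by eye is error-prone once a release has ten or more revisions, because v10 sorts before v9 alphabetically. The sketch below selects the highest revision numerically and reads back the HelmRelease status afterwards. Both function names are hypothetical, and the sketch assumes the secret naming shown in the example output above and a Ready condition on the HelmRelease status.

```shell
# Hypothetical helpers (not part of DKP or Flux); names are illustrative.

# Read `kubectl get secrets` output on stdin and print the name of the Helm
# release secret with the highest revision for the given release.
# Usage: kubectl get secrets -n <namespace> | latest_helm_revision_secret <release-name>
latest_helm_revision_secret() {
  awk -v prefix="sh.helm.release.v1.$1.v" '
    index($1, prefix) == 1 {
      rev = substr($1, length(prefix) + 1) + 0   # revision number after ".v"
      if (rev > max) { max = rev; name = $1 }    # numeric compare, so v10 beats v9
    }
    END { if (name != "") print name }
  '
}

# Print the message of the Ready condition of a HelmRelease.
# Usage: helmrelease_ready_message <namespace> <helmrelease-name>
helmrelease_ready_message() {
  kubectl -n "$1" get helmrelease "$2" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'
}
```

With the example output above, piping the secret list into latest_helm_revision_secret kommander-reloader prints sh.helm.release.v1.kommander-reloader.v2, which is the name to pass to kubectl delete secret.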
Calico not updated during DKP upgrade on Flatcar
When upgrading a DKP cluster running on Flatcar OS, you may find that the Calico services were not updated after the upgrade. This occurs because the upgrade procedure does not correctly update the Flatcar-specific CNI ClusterResourceSet (CRS). This issue only impacts the Calico CRS.
Follow these steps to manually correct this issue:
Update the ConfigMap as follows:
cat <<EOF | kubectl apply -f -
apiVersion: v1
data:
  custom-resources.yaml: |+
    # This section includes base Calico installation configuration.
    # For more information, see: <https://docs.projectcalico.org/reference/installation/api>
    apiVersion: operator.tigera.io/v1
    kind: Installation
    metadata:
      name: default
    spec:
      # Configures Calico networking.
      calicoNetwork:
        # Note: The ipPools section cannot be modified post-install.
        ipPools:
        - blockSize: 26
          cidr: 192.168.0.0/16
          encapsulation: IPIP
          natOutgoing: Enabled
          nodeSelector: all()
        bgp: Enabled
        nodeAddressAutodetectionV4:
          firstFound: true
      # FlexVolume path must be mounted under /opt on flatcar/coreos systems
      flexVolumePath: /opt/libexec/kubernetes/kubelet-plugins/volume/exec/
kind: ConfigMap
metadata:
  name: calico-cni-installation-$CLUSTER_NAME
EOF
Edit the ClusterResourceSet:
kubectl edit clusterresourceset calico-cni-installation-$CLUSTER_NAME
In the editor, update spec.clusterSelector.matchLabels.konvoy.d2iq.io/osHint so that it reads konvoy.d2iq.io/osHint: flatcar
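To confirm the edit took effect, the label can be read back from the ClusterResourceSet. This is a minimal sketch: calico_crs_os_hint is a hypothetical function name, and it assumes the CRS lives in the namespace of your current kubectl context, as the edit command above does.

```shell
# Hypothetical helper: print the osHint label selected by the Calico CRS.
# Usage: calico_crs_os_hint <cluster-name>
calico_crs_os_hint() {
  # Dots inside the label key are escaped so jsonpath treats it as one key.
  kubectl get clusterresourceset "calico-cni-installation-$1" \
    -o jsonpath='{.spec.clusterSelector.matchLabels.konvoy\.d2iq\.io/osHint}'
}
```

Calling calico_crs_os_hint "$CLUSTER_NAME" should print flatcar once the change is saved.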
Kube-oidc-proxy not ready after upgrade
If you installed or attached a cluster in 2.1, kube-oidc-proxy is not available after upgrading to 2.3 and 2.4. This application is required to access the Kubernetes API (with kubectl) using SSO. For affected customers, there are issues with authentication via kubectl.
To make the application available, run the following command on each cluster that was installed, created or attached in 2.1, and is now on DKP version 2.4. Replace <namespace> with each cluster’s workspace namespace:
kubectl -n <namespace> patch appdeployment kube-oidc-proxy --type=json -p '[{"op":"remove","path":"/spec/configOverrides"}]'
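When several workspaces are affected, the same patch can be applied in a loop. The sketch below is illustrative: patch_kube_oidc_proxy is a hypothetical function name and the namespaces you pass in are your own workspace namespaces. Note that a JSON patch remove operation fails when the target path does not exist, so the loop stops at the first namespace where spec.configOverrides is already absent.

```shell
# Hypothetical helper: apply the kube-oidc-proxy patch in each workspace
# namespace given on the command line.
# Usage: patch_kube_oidc_proxy <namespace> [<namespace> ...]
patch_kube_oidc_proxy() {
  for ns in "$@"; do
    echo "patching kube-oidc-proxy in namespace $ns"
    # Same patch as the single command above; stop on the first failure.
    kubectl -n "$ns" patch appdeployment kube-oidc-proxy --type=json \
      -p '[{"op":"remove","path":"/spec/configOverrides"}]' || return 1
  done
}
```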