Merged
67 changes: 58 additions & 9 deletions modules/graceful-restart.adoc
@@ -72,13 +72,36 @@ ip-10-0-211-16.ec2.internal Ready control-plane,master 75m v1.34.2
$ oc get csr
----

.. Review the details of a CSR to verify that it is valid:
.. Review the details of each CSR to verify that it is valid by running the following command:
+
[source,terminal]
----
$ oc describe csr <csr_name> <1>
$ oc get csr <csr_name> -o jsonpath='{.spec.request}' | base64 -d | openssl req -text -noout
----
<1> `<csr_name>` is the name of a CSR from the list of current CSRs.
+
When validating the CSR, verify that the following fields match your infrastructure expectations:
+
* Subject / Common Name (CN): Must follow the format `system:node:<node_name>`, such as `system:node:control-plane-0.example.com`.
* Organization (O): Must be exactly `system:nodes`.
* Requested Extensions (Extended Key Usage): Must list `TLS Web Client Authentication`.
+
.Example output of a valid control plane client CSR
[source,terminal]
----
Certificate Request:
    Data:
        Version: 1 (0x0)
        Subject: O = system:nodes, CN = system:node:control-plane-0.example.com
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
        Attributes:
            Requested Extensions:
                X509v3 Extended Key Usage:
                    TLS Web Client Authentication
----
+
Checking these fields confirms that the requested certificate is a client credential for the named node.
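+
The field checks above can also be scripted. The following is a minimal sketch, assuming the text printed by `openssl req -text -noout` is supplied on standard input. The function name `check_client_csr` is hypothetical and is not part of `oc` or `openssl`. The sketch only pattern-matches the printed text, so treat it as an aid to manual review rather than a replacement for it.
+
```shell
# Hypothetical helper (not part of oc or openssl): reads the text printed by
# `openssl req -text -noout` on stdin and returns 0 only if all checks pass.
check_client_csr() {
  csr_text=$(cat)

  # Organization must be exactly system:nodes.
  # Spacing around "=" can differ between OpenSSL versions, so allow both.
  printf '%s\n' "$csr_text" | grep -Eq 'Subject:.*O ?= ?system:nodes(,|$)' \
    || { echo "unexpected organization" >&2; return 1; }

  # Common Name must be system:node: followed by a node name.
  printf '%s\n' "$csr_text" | grep -Eq 'Subject:.*CN ?= ?system:node:[^, ]+' \
    || { echo "unexpected common name" >&2; return 1; }

  # Extended Key Usage must list client authentication.
  printf '%s\n' "$csr_text" | grep -q 'TLS Web Client Authentication' \
    || { echo "client authentication usage not requested" >&2; return 1; }
}
```
+
To use the sketch, append `| check_client_csr` to the decoding pipeline shown earlier and inspect the exit status before approving the CSR.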

.. Approve each valid CSR:
+
@@ -87,14 +110,14 @@ $ oc describe csr <csr_name> <1>
$ oc adm certificate approve <csr_name>
----

. After the control plane nodes are ready, verify that all worker nodes are ready.
. After the control plane nodes are ready, verify that all compute nodes are ready.
+
[source,terminal]
----
$ oc get nodes -l node-role.kubernetes.io/worker
----
+
The worker nodes are ready if the status is `Ready`, as shown in the following output:
The compute nodes are ready if the status is `Ready`, as shown in the following output:
+
[source,terminal]
----
@@ -104,7 +127,7 @@ ip-10-0-182-134.ec2.internal Ready worker 64m v1.34.2
ip-10-0-250-100.ec2.internal Ready worker 64m v1.34.2
----

. If the worker nodes are _not_ ready, then check whether there are any pending certificate signing requests (CSRs) that must be approved.
. If the compute nodes are _not_ ready, then check whether there are any pending certificate signing requests (CSRs) that must be approved.

.. Get the list of current CSRs:
+
@@ -113,13 +136,39 @@ ip-10-0-250-100.ec2.internal Ready worker 64m v1.34.2
$ oc get csr
----

.. Review the details of a CSR to verify that it is valid:
.. Review the details of each CSR to verify that it is valid by running the following command:
+
[source,terminal]
----
$ oc describe csr <csr_name> <1>
$ oc get csr <csr_name> -o jsonpath='{.spec.request}' | base64 -d | openssl req -text -noout
----
<1> `<csr_name>` is the name of a CSR from the list of current CSRs.
+
Compute node CSRs can request client certificates, which the kubelet uses to authenticate to the API server, or serving certificates, which the kubelet presents when the API server connects to it. Verify that the following fields match your infrastructure expectations:
+
* Subject / Common Name (CN): Must follow the format `system:node:<compute_node_name>`.
* Organization (O): Must be exactly `system:nodes`.
* Extended Key Usage (EKU): Must list `TLS Web Client Authentication` (for client requests) or `TLS Web Server Authentication` (for serving requests).
* Subject Alternative Name (SAN): For serving certificates, this field must contain the correct internal DNS hostname and the internal IP address of the respective compute node.
+
.Example output of a valid compute node serving CSR
[source,terminal]
----
Certificate Request:
    Data:
        Version: 1 (0x0)
        Subject: O = system:nodes, CN = system:node:worker-0.example.com
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
        Attributes:
            Requested Extensions:
                X509v3 Extended Key Usage:
                    TLS Web Server Authentication
                X509v3 Subject Alternative Name:
                    DNS:worker-0.example.com, IP Address:10.0.12.34
----
+
Checking these fields confirms that the requested certificate is a valid server credential for the node's communication within the cluster.
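+
For serving requests, the Subject Alternative Name comparison is the step that is easiest to get wrong by eye. The following is a minimal sketch, assuming the text printed by `openssl req -text -noout` is supplied on standard input and that you already know the DNS hostname and internal IP address you expect for the node. The function name `check_serving_csr` and its two arguments are hypothetical and are not part of `oc` or `openssl`.
+
```shell
# Hypothetical helper (not part of oc or openssl).
# Usage: <openssl text output> | check_serving_csr <expected_dns> <expected_ip>
check_serving_csr() {
  expected_dns=$1
  expected_ip=$2
  csr_text=$(cat)

  # Organization must be exactly system:nodes (spacing around "=" varies).
  printf '%s\n' "$csr_text" | grep -Eq 'Subject:.*O ?= ?system:nodes(,|$)' \
    || { echo "unexpected organization" >&2; return 1; }

  # Common Name must be system:node: followed by a node name.
  printf '%s\n' "$csr_text" | grep -Eq 'Subject:.*CN ?= ?system:node:[^, ]+' \
    || { echo "unexpected common name" >&2; return 1; }

  # Extended Key Usage must list server authentication.
  printf '%s\n' "$csr_text" | grep -q 'TLS Web Server Authentication' \
    || { echo "server authentication usage not requested" >&2; return 1; }

  # Split comma-separated entries onto their own lines and trim whitespace,
  # so each SAN entry can be compared as an exact, whole-line match.
  entries=$(printf '%s\n' "$csr_text" | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')

  printf '%s\n' "$entries" | grep -Fxq "DNS:$expected_dns" \
    || { echo "expected DNS name not in SAN" >&2; return 1; }
  printf '%s\n' "$entries" | grep -Fxq "IP Address:$expected_ip" \
    || { echo "expected IP address not in SAN" >&2; return 1; }
}
```
+
Matching whole entries rather than substrings means that a request for `worker-0.example.com.attacker.test` does not pass a check for `worker-0.example.com`.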

.. Approve each valid CSR:
+