Skip to content

chore: [wip] PQC POC 2#13203

Draft
diegomarquezp wants to merge 24 commits into
mainfrom
chore/pqc-poc-2
Draft

chore: [wip] PQC POC 2#13203
diegomarquezp wants to merge 24 commits into
mainfrom
chore/pqc-poc-2

Conversation

@diegomarquezp
Copy link
Copy Markdown
Contributor

No description provided.

TAG=agy
CONV=0ade5891-3c8d-4e27-a240-b1a8cd6a0b0c
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces Post-Quantum Cryptography (PQC) support across the Java SDK by integrating Bouncy Castle and configuring PQC-hardened SSL contexts for gRPC and HTTP/JSON transports. Key changes include the addition of a pqc-test module, reflection-based configuration for Netty channel builders, and updates to OAuth2Utils and InstantiatingHttpJsonChannelProvider. Feedback identifies issues with potential initialization errors in OAuth2Utils, the use of a SNAPSHOT dependency version, and performance overhead from repeated reflection lookups. Suggestions were also provided for improving exception handling, ensuring logic consistency in transport creation, and cleaning up test provider registrations.

Comment thread google-auth-library-java/oauth2_http/java/com/google/auth/oauth2/OAuth2Utils.java Outdated
Comment thread sdk-platform-java/gapic-generator-java-pom-parent/pom.xml Outdated
…ranch chore/pqc-poc-2

TAG=agy
CONV=0ade5891-3c8d-4e27-a240-b1a8cd6a0b0c
TAG=agy
CONV=0ade5891-3c8d-4e27-a240-b1a8cd6a0b0c
This commit addresses the review feedback by implementing:
1. Graceful fallback to default NetHttpTransport with WARNING logging in OAuth2Utils static initializer.
2. warning-level exception logging in InstantiatingGrpcChannelProvider.
3. Caching of shaded/unshaded gRPC Netty OpenSSL reflection lookups inside thread-safe static OpenSslReflectionHolder to remove runtime overhead.
4. Reverting createHttpTransport in InstantiatingHttpJsonChannelProvider to return null when mTLS is not active, as default transport is already PQC-hardened.
5. Clean unregistration of BC and BCJSSE security providers in integration test server/client teardown.

TAG=agy
@diegomarquezp
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements Post-Quantum Cryptography (PQC) support by hardening HTTP and gRPC transports with Bouncy Castle JSSE. Key changes include updating OAuth2Utils to use a PQC-hardened transport, adding Bouncy Castle dependencies to GAX modules, and introducing reflection-based configuration for gRPC channels to support hybrid PQC key exchange. A new integration test module, pqc-test, is also introduced. Feedback highlights the use of a -SNAPSHOT dependency version in the parent POM, code duplication and brittle string-based class checks in the gRPC reflection logic, and unused imports.

<javax.annotation-api.version>1.3.2</javax.annotation-api.version>
<grpc.version>1.81.0</grpc.version>
<google.http-client.version>2.1.0</google.http-client.version>
<google.http-client.version>2.1.1-SNAPSHOT</google.http-client.version>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Updating google.http-client.version to a -SNAPSHOT version in a parent POM is generally discouraged for release-track projects. This can lead to unstable builds and dependency resolution issues in environments without access to the specific snapshot repository. If this is necessary for the POC, please ensure it is reverted or replaced with a stable version before merging.

Comment on lines +930 to +965
private ManagedChannelBuilder<?> applyPqcConfiguration(ManagedChannelBuilder<?> builder) {
// Configure the PQ and classical hybrid named groups:
// 1. X25519MLKEM768 (codepoint 4588): Hybrid classical (X25519) + post-quantum (ML-KEM-768) key exchange.
// Provides defense-in-depth: if ML-KEM is compromised, security reverts to classical strength of X25519.
// 2. MLKEM768 (codepoint 1896): Pure post-quantum key exchange using ML-KEM-768.
// 3. X25519 (codepoint 29): Classical elliptic curve Diffie-Hellman key exchange, used as a fallback.
String[] hybridGroups = new String[] {"X25519MLKEM768", "MLKEM768", "X25519"};
String builderClassName = builder.getClass().getName();
boolean isShaded = "io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder".equals(builderClassName);
boolean isUnshaded = "io.grpc.netty.NettyChannelBuilder".equals(builderClassName);

if (isShaded && OpenSslReflectionHolder.SHADED_AVAILABLE) {
try {
Object sslContextBuilder = OpenSslReflectionHolder.SHADED_FOR_CLIENT.invoke(null);
OpenSslReflectionHolder.SHADED_OPTION_METHOD.invoke(
sslContextBuilder, OpenSslReflectionHolder.SHADED_GROUPS_OPTION, (Object) hybridGroups);
Object sslContext = OpenSslReflectionHolder.SHADED_BUILD_METHOD.invoke(sslContextBuilder);
OpenSslReflectionHolder.SHADED_SSL_CONTEXT_METHOD.invoke(builder, sslContext);
return builder;
} catch (java.lang.reflect.InvocationTargetException | IllegalAccessException | RuntimeException e) {
LOG.log(Level.WARNING, "Failed to configure shaded PQC transport fallback", e);
}
} else if (isUnshaded && OpenSslReflectionHolder.UNSHADED_AVAILABLE) {
try {
Object sslContextBuilder = OpenSslReflectionHolder.UNSHADED_FOR_CLIENT.invoke(null);
OpenSslReflectionHolder.UNSHADED_OPTION_METHOD.invoke(
sslContextBuilder, OpenSslReflectionHolder.UNSHADED_GROUPS_OPTION, (Object) hybridGroups);
Object sslContext = OpenSslReflectionHolder.UNSHADED_BUILD_METHOD.invoke(sslContextBuilder);
OpenSslReflectionHolder.UNSHADED_SSL_CONTEXT_METHOD.invoke(builder, sslContext);
return builder;
} catch (java.lang.reflect.InvocationTargetException | IllegalAccessException | RuntimeException e) {
LOG.log(Level.WARNING, "Failed to configure unshaded PQC transport fallback", e);
}
}
return builder;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The applyPqcConfiguration method contains significant code duplication between the shaded and unshaded Netty configuration paths. Additionally, checking the builder class name via string comparison (lines 938-939) is brittle and may fail if the class is renamed or wrapped.

Consider refactoring OpenSslReflectionHolder to store two instances of a helper class (e.g., NettyReflectionMetadata) that encapsulates the reflected methods and classes for each variant. You can then use Class.isInstance() for a more robust type check and a single helper method to apply the configuration.

References
  1. If code is duplicated and needs to be shared, move it to a separate helper/utility class.

Comment on lines +45 to +46
import javax.net.ssl.SSLContext;
import java.security.NoSuchAlgorithmException;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The imports for javax.net.ssl.SSLContext and java.security.NoSuchAlgorithmException are unused in this file and should be removed.

ks.load(PqcTestServer.class.getResourceAsStream("/pqctest.p12"), "password".toCharArray());

// Build a custom HttpTransport explicitly trusting the self-signed certificate keystore.
final com.google.api.client.http.HttpTransport httpTransport = new com.google.api.client.http.javanet.NetHttpTransport.Builder()
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be enabled by default via pom.xml in this test

…uncy Castle JSSE in PQC zero-config connectivity integration tests and add extensive documentation comments

- Configures Bouncy Castle server-scoped system properties fallback to enforce ML-KEM-768 on Java 17.
- Keeps compile-safe Java 20 reflection for JRE 20+ runtimes.
- Adds extremely detailed comments describing provider, keystore, managers, server configurators, and netty gRPC secure socket pipelines.

TAG=agy
CONV=5d96c302-27fd-438a-ad0e-ffd6d64e61cb
@diegomarquezp
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new integration test suite for Post-Quantum Cryptography (PQC) auto-upgrades, featuring a common test module, a specialized PqcTestServer that enforces ML-KEM-768, and specific runners for snapshot and release versions. Key feedback includes addressing the missing registration of Bouncy Castle providers in the setup and ensuring global system properties are cleared in the teardown to prevent side effects. Other recommendations focus on removing duplicate code and documentation, increasing server startup timeouts for CI reliability, and cleaning up redundant methods or stale comments.

} catch (ClassNotFoundException e) {
isPqcSupported = false;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The Bouncy Castle providers are not being registered in the setup method, despite the Javadoc and the teardown method suggesting they should be. Without registering BouncyCastleJsseProvider at position 1, the client will not use PQC for the TLS handshake as intended by this test suite.

    if (isPqcSupported) {
      Security.insertProviderAt(new org.bouncycastle.jsse.provider.BouncyCastleJsseProvider(), 1);
      Security.insertProviderAt(new org.bouncycastle.jce.provider.BouncyCastleProvider(), 2);
    }

Comment on lines +252 to +262
public static void teardown() {
if (serverProcess != null) {
// Forcibly destroy the background process and close standard streams to allow
// clean exit
serverProcess.destroyForcibly();
}
if (isPqcSupported) {
Security.removeProvider("BCJSSE");
Security.removeProvider("BC");
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The system properties set in setup (javax.net.ssl.trustStore, etc.) are not cleared in teardown. This can lead to side effects in other tests running in the same JVM. Ensure all global state is restored and that the cleanup is exception-safe to prevent resource leaks.

  public static void teardown() {
    if (serverProcess != null) {
      // Forcibly destroy the background process and close standard streams to allow
      // clean exit
      serverProcess.destroyForcibly();
    }
    if (isPqcSupported) {
      Security.removeProvider("BCJSSE");
      Security.removeProvider("BC");
    }
    System.clearProperty("javax.net.ssl.trustStore");
    System.clearProperty("javax.net.ssl.trustStorePassword");
    System.clearProperty("javax.net.ssl.trustStoreType");
  }
References
  1. When managing a collection of closeable resources, ensure they are closed in the reverse order of their creation (LIFO). The implementation must be exception-safe to prevent resource leaks.

Comment on lines +115 to +142
/**
* Configures the integration test harness environment before test cases are executed.
*
* <p><b>Detailed Security & Keystore Configuration Architecture:</b>
*
* <ul>
* <li><b>What is a Keystore (PKCS12):</b> A PKCS12 keystore (<code>pqctest.p12</code>) is a
* secure key database containing the server's private key and its self-signed public
* certificate. The server uses it during the TLS handshake to prove its identity and
* establish a secure channel.
* <li><b>How Encryption Works:</b> The certificate itself does not encrypt message data
* directly. Instead, during the TLS handshake, the client and server negotiate a symmetric
* session key using post-quantum cryptography (ML-KEM). This session key is then used to
* encrypt and decrypt all sent/received HTTP/gRPC data.
* <li><b>Why a Custom Temporary Truststore is Required:</b> Because the server uses a
* self-signed test certificate, it is not signed by any public Certificate Authority (CA)
* trusted by the standard JRE truststore (<code>cacerts</code>). Without registering a
* custom truststore containing this certificate, standard JRE TLS clients will reject the
* connection with an <code>SSLHandshakeException</code>. We extract the certificate to a
* temporary file and point <code>javax.net.ssl.trustStore</code> to it, thereby trusting it
* scope-specifically for this test run without polluting or mutating the user's system-wide
* JRE truststore.
* <li><b>JCA Provider Registration:</b> Registers <code>BouncyCastleJsseProvider</code> at
* provider position 1. This registers Bouncy Castle as the primary security provider,
* causing all standard default <code>SSLContext</code> and vanilla client factories to
* utilize Bouncy Castle JSSE and negotiate PQC automatically.
* </ul>
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This Javadoc block appears to be a duplicate of the one starting on line 97. Please remove the redundant documentation to improve maintainability.


// Ephemeral port detection timeout (10 seconds) to fail-fast on server startup
// errors
if (System.currentTimeMillis() - startTime > 10000) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

A 10-second timeout for the server process to start and print its ports might be too short for some CI environments. Consider increasing this to at least 30 seconds to avoid flaky tests.

Suggested change
if (System.currentTimeMillis() - startTime > 10000) {
if (System.currentTimeMillis() - startTime > 30000) {

Comment on lines +264 to +269
public void runTests() throws Exception {
assertEquals(isPqcSupported, clientSupportsPqc());
testHttpPqc();
testGrpcPqc();
testBigQueryPqc();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The runTests() method is not annotated with @Test and does not appear to be called by the JUnit runner. Since the individual test methods are overridden in subclasses, this method seems redundant and could be removed.

Comment on lines +125 to +126
// Enforce ALWAYS and ONLY hybrid ML-KEM / Kyber named groups programmatically on
// HttpsServer!
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This comment refers to a stale implementation detail regarding PQC enforcement that is actually handled in PqcEnforcingSSLEngine. Per repository rules, comments referring to stale implementations should be removed rather than updated.

References
  1. Remove comments that refer to stale implementations instead of updating them.

Comment on lines +255 to +269
private static class ByteMarshaller implements io.grpc.MethodDescriptor.Marshaller<byte[]> {
@Override
public InputStream stream(byte[] value) {
return new java.io.ByteArrayInputStream(value);
}

@Override
public byte[] parse(InputStream stream) {
try {
return com.google.common.io.ByteStreams.toByteArray(stream);
} catch (java.io.IOException e) {
throw new RuntimeException(e);
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The ByteMarshaller class is duplicated from PqcConnectivityTest. Consider moving it to a shared utility class within the pqc-test-common module to avoid duplication and maintain proper layering.

References
  1. Avoid inverting layering by making lower-level channel implementations depend on higher-level RPC wrappers. If code is duplicated and needs to be shared, move it to a separate helper/utility class.

try {
while (System.in.read() != -1) {
Thread.sleep(1000);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

System.in.read() is a blocking operation. The Thread.sleep(1000) call is redundant as the loop will naturally wait for input or the stream to close.

        // No sleep needed as read() blocks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant