
Performance tests in Pub/Sub show that the publisher is using 50-100x more CPU than the subscriber

I’ve been running performance tests with Pub/Sub, trying to find the right combination of threads, memory, buffer sizes, etc. I’m seeing the Publisher take ~500% CPU versus 5-8% for the Subscriber, running on the same server and simultaneously handling the same message throughput. Does that surprise anyone? Looking at the running threads, I see that a number of them have the following stack trace in the RSA/SHA256 JWT signing code. I suspect that this is, to a large degree, the real limitation on the Publisher’s performance.

Is anyone else seeing this? Any recommendations on how to work around it? Is there something I can do at the transport layer to turn off these signing calls or reduce their usage? Maybe gRPC configuration tweaks?

Thanks. This is using v0.38 on a Linux box running Java 7 outside of Google’s cloud. Would running on Google Compute Engine instances help?

java.math.BigInteger.oddModPow(BigInteger.java:2716) 
java.math.BigInteger.modPow(BigInteger.java:2459) 
sun.security.rsa.RSACore.crtCrypt(RSACore.java:183) 
sun.security.rsa.RSACore.rsa(RSACore.java:122) 
sun.security.rsa.RSASignature.engineSign(RSASignature.java:175) 
java.security.Signature$Delegate.engineSign(Signature.java:1207) 
java.security.Signature.sign(Signature.java:579) 
com.google.api.client.util.SecurityUtils.sign(SecurityUtils.java:147) 
com.google.api.client.json.webtoken.JsonWebSignature.signUsingRsaSha256(JsonWebSignature.java:637) 
com.google.auth.oauth2.ServiceAccountJwtAccessCredentials.getJwtAccess(ServiceAccountJwtAccessCredentials.java:300) 
com.google.auth.oauth2.ServiceAccountJwtAccessCredentials.getRequestMetadata(ServiceAccountJwtAccessCredentials.java:267) 
com.google.auth.Credentials.blockingGetToCallback(Credentials.java:103) 
com.google.auth.oauth2.ServiceAccountJwtAccessCredentials.getRequestMetadata(ServiceAccountJwtAccessCredentials.java:251) 
io.grpc.auth.GoogleAuthLibraryCallCredentials.applyRequestMetadata(GoogleAuthLibraryCallCredentials.java:90) 
io.grpc.internal.CallCredentialsApplyingTransportFactory$CallCredentialsApplyingTransport.newStream(CallCredentialsApplyingTransportFactory.java:91) 
io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:242) 
io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1.start(CensusTracingModule.java:387) 
io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1.start(CensusStatsModule.java:679) 
io.grpc.ForwardingClientCall.start(ForwardingClientCall.java:32) 
com.google.api.gax.grpc.GrpcHeaderInterceptor$1.start(GrpcHeaderInterceptor.java:95) 
io.grpc.stub.ClientCalls.startCall(ClientCalls.java:293) 
io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:268) 
io.grpc.stub.ClientCalls.futureUnaryCall(ClientCalls.java:177) 
com.google.pubsub.v1.PublisherGrpc$PublisherFutureStub.publish(PublisherGrpc.java:538) 
com.google.cloud.pubsub.v1.Publisher.publishOutstandingBatch(Publisher.java:333) 
com.google.cloud.pubsub.v1.Publisher.access$000(Publisher.java:90) 
com.google.cloud.pubsub.v1.Publisher$1.run(Publisher.java:255) 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
java.util.concurrent.FutureTask.run(FutureTask.java:266) 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
java.lang.Thread.run(Thread.java:745) 

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

2 reactions
pongad commented, Apr 26, 2018

I think I found the issue. From the stack trace, you seem to be using JWT for auth. Until recently, JWT tokens weren’t cached, so you were doing an expensive RSA signing operation on every single call to the publish RPC. The subscribe side wasn’t affected because it uses a long-running streaming call.

This was fixed in https://github.com/google/google-auth-library-java/commit/664754ee1208fe17472e41d10aa752851f610e7e on the auth side. We’ll upgrade our library in this repo.
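
To illustrate the kind of caching described above, here is a minimal sketch using only the JDK. It is not the code from the linked commit: the class name `CachedSignedToken`, its methods, and the token layout are all hypothetical, made up for this example. It only shows the general idea of signing once and reusing the result until it expires, instead of running an RSA signature for every RPC.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.Signature;
import java.util.Base64;

/** Hypothetical helper: reuses one RSA-signed token until its lifetime elapses. */
class CachedSignedToken {
    private final PrivateKey key;
    private final long lifetimeMillis;
    private String cachedToken;   // last signed token, or null before first use
    private long expiresAtMillis; // time at which cachedToken must be re-signed
    private int signCount;        // how many RSA signatures were actually computed

    CachedSignedToken(PrivateKey key, long lifetimeMillis) {
        this.key = key;
        this.lifetimeMillis = lifetimeMillis;
    }

    /** Returns the cached token, signing a new one only when none is valid. */
    synchronized String get(long nowMillis) throws GeneralSecurityException {
        if (cachedToken == null || nowMillis >= expiresAtMillis) {
            cachedToken = sign("issued-at-" + nowMillis);
            expiresAtMillis = nowMillis + lifetimeMillis;
        }
        return cachedToken;
    }

    /** The expensive step: an RSA/SHA-256 signature over the payload. */
    private String sign(String payload) throws GeneralSecurityException {
        signCount++;
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(key);
        signer.update(payload.getBytes(StandardCharsets.UTF_8));
        String sig = Base64.getUrlEncoder().withoutPadding().encodeToString(signer.sign());
        return payload + "." + sig;
    }

    synchronized int signCount() {
        return signCount;
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        // Assumed one-minute lifetime, chosen only for the demo.
        CachedSignedToken tokens =
                new CachedSignedToken(gen.generateKeyPair().getPrivate(), 60_000);
        long now = System.currentTimeMillis();
        for (int i = 0; i < 1000; i++) {
            tokens.get(now + i); // simulates 1000 publish calls within one second
        }
        System.out.println("signatures computed: " + tokens.signCount());
    }
}
```

In this sketch the 1000 simulated calls all fall inside the token lifetime, so only the first one pays for a signature. Without the cache, each call would run the `BigInteger.modPow` work seen in the stack trace.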

1 reaction
j256 commented, Apr 26, 2018

Wow. Huge win with 0.38.0 for example. I still see the Publisher use more CPU than the Subscriber but the difference now is like ~3x instead of 50-100x. Throughput seems to have increased so that my program is now hitting other limits. Thanks much @pongad.

For those who don’t want to wait, it’s an easy pom.xml exclusion stanza. Before you ask, the only difference that I can see between 0.9.0 and 0.9.1 is the addition of the JWT token caching.

<dependency>	
	<groupId>com.google.cloud</groupId>	
	<artifactId>google-cloud-pubsub</artifactId>	
	<version>0.##.0-beta</version>	
	<exclusions>	
		<exclusion>	
			<groupId>com.google.auth</groupId>	
			<artifactId>google-auth-library-credentials</artifactId>	
		</exclusion>	
		<exclusion>	
			<groupId>com.google.auth</groupId>	
			<artifactId>google-auth-library-oauth2-http</artifactId>	
		</exclusion>	
	</exclusions>
</dependency>	
<dependency>	
	<groupId>com.google.auth</groupId>	
	<artifactId>google-auth-library-credentials</artifactId>	
	<version>0.9.1</version>	
</dependency>	
<dependency>	
	<groupId>com.google.auth</groupId>	
	<artifactId>google-auth-library-oauth2-http</artifactId>	
	<version>0.9.1</version>	
</dependency>	
