Pulsar on EBS having poor performance
Describe the bug Pulsar's read performance on EBS falls well short of what the underlying EBS volumes can deliver, even after over-provisioning the EBS settings generously and ruling out every obvious bottleneck.
To Reproduce Steps to reproduce the behavior:
- We're using the Pulsar Helm chart on AWS EKS, at commit https://github.com/apache/pulsar-helm-chart/commit/6e9ad25ba322f6f0fc7c11c66fb88faa6d0218db
- Our values.yaml overrides look like this (an example install command is sketched after the values):
pulsar:
namespace: cory-ebs-test
components:
pulsar_manager: false # UI is outdated and won't load without errors
auth:
authentication:
enabled: true
bookkeeper:
resources:
requests:
memory: 11560Mi
cpu: 1.5
volumes:
journal:
size: 100Gi
ledgers:
size: 5Ti
configData:
# `BOOKIE_MEM` is used for `bookie shell`
BOOKIE_MEM: >
"
-Xms1280m
-Xmx10800m
-XX:MaxDirectMemorySize=10800m
"
# we use `bin/pulsar` for starting bookie daemons
PULSAR_MEM: >
"
-Xms10800m
-Xmx10800m
-XX:MaxDirectMemorySize=10800m
"
# configure the memory settings based on jvm memory settings
dbStorage_writeCacheMaxSizeMb: "2500" #pulsar docs say 25%
dbStorage_readAheadCacheMaxSizeMb: "2500" #pulsar docs say 25%
dbStorage_rocksDB_writeBufferSizeMB: "64" #pulsar docs had 64
dbStorage_rocksDB_blockCacheSize: "1073741824" #pulsar docs say 10%
readBufferSizeBytes: "8096" #attempted doubling
autorecovery:
resources:
requests:
memory: 2048Mi
cpu: 1
configData:
BOOKIE_MEM: >
"
-Xms1500m -Xmx1500m
"
broker:
resources:
requests:
memory: 4096Mi
cpu: 1
configData:
PULSAR_MEM: >
"
-Xms1024m -Xmx4096m -XX:MaxDirectMemorySize=4096m
-Dio.netty.leakDetectionLevel=disabled
-Dio.netty.recycler.linkCapacity=1024
-XX:+ParallelRefProcEnabled
-XX:+UnlockExperimentalVMOptions
-XX:+DoEscapeAnalysis
-XX:ParallelGCThreads=4
-XX:ConcGCThreads=4
-XX:G1NewSizePercent=50
-XX:+DisableExplicitGC
-XX:-ResizePLAB
-XX:+ExitOnOutOfMemoryError
-XX:+PerfDisableSharedMem
"
proxy:
resources:
requests:
memory: 4096Mi
cpu: 1
configData:
PULSAR_MEM: >
"
-Xms1024m -Xmx4096m -XX:MaxDirectMemorySize=4096m
-Dio.netty.leakDetectionLevel=disabled
-Dio.netty.recycler.linkCapacity=1024
-XX:+ParallelRefProcEnabled
-XX:+UnlockExperimentalVMOptions
-XX:+DoEscapeAnalysis
-XX:ParallelGCThreads=4
-XX:ConcGCThreads=4
-XX:G1NewSizePercent=50
-XX:+DisableExplicitGC
-XX:-ResizePLAB
-XX:+ExitOnOutOfMemoryError
-XX:+PerfDisableSharedMem
"
service:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: nlb
external-dns.alpha.kubernetes.io/hostname: pulsar.internal.ckdarby
toolset:
resources:
requests:
memory: 1028Mi
cpu: 1
configData:
PULSAR_MEM: >
"
-Xms640m
-Xmx1028m
-XX:MaxDirectMemorySize=1028m
"
grafana:
service:
annotations:
external-dns.alpha.kubernetes.io/hostname: grafana.internal.ckdarby
admin:
user: admin
password: 12345
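For completeness, installing that chart commit with the overrides above might look roughly like the following; the release name, namespace flags, and local chart path are assumptions based on this issue rather than commands copied from it:
git clone https://github.com/apache/pulsar-helm-chart.git
cd pulsar-helm-chart
git checkout 6e9ad25ba322f6f0fc7c11c66fb88faa6d0218db
# install into the namespace referenced in the overrides, using the values file shown above
helm install cory-ebs-test ./charts/pulsar \
  --namespace cory-ebs-test --create-namespace \
  -f values.yaml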
- Produce messages to a multi-partitioned topic:
- Partitioned into 8 partitions
- Average message size is ~1.5 KB
- Set retention as 7 days
- We’re storing ~ 2-8 TB of retention at times
- Attempt to consume messages with the offset set to earliest (thus bypassing any RocksDB read cache and reading straight from the backlog):
We have tried both the Flink Pulsar connector and Pulsar's perf reader. Below is the perf reader run from the toolset pod against a single-partition topic (a sketch of the invocation follows the config dump):
{
"confFile" : "/pulsar/conf/client.conf",
"topic" : [ "persistent://public/cory/test-ebs-partition-5" ],
"numTopics" : 1,
"rate" : 0.0,
"startMessageId" : "earliest",
"receiverQueueSize" : 1000,
"maxConnections" : 100,
"statsIntervalSeconds" : 0,
"serviceURL" : "pulsar://cory-ebs-test-pulsar-proxy:6650/",
"authPluginClassName" : "org.apache.pulsar.client.impl.auth.AuthenticationToken",
"authParams" : "file:///pulsar/tokens/client/token",
"useTls" : false,
"tlsTrustCertsFilePath" : ""
}
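The JSON above is what pulsar-perf prints at startup; the invocation from the toolset pod was roughly the following (the long-form flag names are an assumption and may differ slightly between Pulsar versions):
bin/pulsar-perf read persistent://public/cory/test-ebs-partition-5 \
  --rate 0 \
  --start-message-id earliest \
  --receiver-queue-size 1000 \
  --service-url pulsar://cory-ebs-test-pulsar-proxy:6650/ \
  --auth-plugin org.apache.pulsar.client.impl.auth.AuthenticationToken \
  --auth-params file:///pulsar/tokens/client/token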
- Check Grafana, EBS graphs, etc. (a CloudWatch cross-check is sketched after this list)
- Observe really poor performance from Pulsar: 60-100 MB/s read off the partition
- No obvious bottlenecks are visible anywhere
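One way to cross-check the EBS side independently of Grafana is to pull CloudWatch's per-volume metrics for the ledger volume; the volume ID and time window below are placeholders, not values from this issue:
aws cloudwatch get-metric-statistics \
  --namespace AWS/EBS \
  --metric-name VolumeReadBytes \
  --dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
  --start-time 2021-01-01T18:31:00Z \
  --end-time 2021-01-01T18:47:00Z \
  --period 60 \
  --statistics Sum
Dividing each Sum by the 60-second period gives bytes per second read from the volume, which can be compared against the 60-100 MB/s Pulsar is delivering.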
Expected behavior Pulsar is reading 60-100 MB/s off each partition. We would expect something closer to the 200-300 MB/s the bookie is actually able to read off EBS directly.
Additional context Here is a real example with everything I could pull; the perf reader starts at 18:31:17 UTC and ends at 18:46:37 UTC. All of the graphs cover that window and are in UTC.
Perf Reader Output
18:31:17.389 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 58250.685 msg/s -- 647.672 Mbit/s
18:31:27.389 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 58523.641 msg/s -- 667.659 Mbit/s
18:31:37.390 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 61314.984 msg/s -- 688.519 Mbit/s
18:31:47.390 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 64920.905 msg/s -- 748.406 Mbit/s
18:31:57.390 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 64340.229 msg/s -- 732.601 Mbit/s
...
18:42:17.416 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 64034.036 msg/s -- 723.160 Mbit/s
18:42:27.419 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 63048.031 msg/s -- 700.458 Mbit/s
18:42:37.421 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 69958.533 msg/s -- 817.095 Mbit/s
18:42:47.422 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 69898.133 msg/s -- 827.770 Mbit/s
18:42:57.422 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 62989.179 msg/s -- 726.990 Mbit/s
18:43:07.422 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 63500.736 msg/s -- 728.683 Mbit/s
...
18:45:37.430 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 55052.395 msg/s -- 645.263 Mbit/s
18:45:47.431 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 72004.353 msg/s -- 804.856 Mbit/s
18:45:57.431 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 86224.170 msg/s -- 954.399 Mbit/s
18:46:07.431 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 80231.708 msg/s -- 905.096 Mbit/s
18:46:17.432 [main] INFO org.apache.pulsar.testclient.PerformanceReader - Read throughput: 73065.824 msg/s -- 864.556 Mbit/s
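As a sanity check, these reader numbers are consistent with the ~1.5 KB average message size quoted earlier:
~64,000 msg/s × 1.5 KB ≈ 96 MB/s ≈ 770 Mbit/s
which lines up with both the Mbit/s column above and the 60-100 MB/s per-partition figure.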
Bookie reading directly from EBS
The disk cache was flushed beforehand, and this measurement was taken before running the perf reader.
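A minimal sketch of how such a raw read measurement can be taken from inside the bookie pod, assuming the ledger volume is mounted under /pulsar/data/bookkeeper/ledgers (the path and entry-log file name are illustrative, not taken from this issue):
# drop the page cache so the read actually hits EBS (needs sufficient privileges)
sync && echo 3 > /proc/sys/vm/drop_caches
# stream an existing entry-log file and report throughput
dd if=/pulsar/data/bookkeeper/ledgers/current/0.log of=/dev/null bs=1M status=progress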
EC2 instances
- Count: 13
- Type: r5.large
- AZ: all in us-west-2c
- All within Kubernetes
EBS
- Journal: 100 GiB, gp2, 300 IOPS
- Ledgers: 5,120 GiB, gp2, 15,360 IOPS
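These IOPS figures are just gp2's baseline of 3 IOPS per provisioned GiB (capped at 16,000):
Journal: 100 GiB × 3 IOPS/GiB = 300 IOPS
Ledgers: 5,120 GiB × 3 IOPS/GiB = 15,360 IOPS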
Graphs (screenshots in the original issue):
- Ledger graphs
- Grafana Overview
- JVM
- Bookie
- Broker
- Recovery
- Zookeeper
- Bookie, specifically public/cory/test-ebs-partition-5
Resolution (from the comments): 100% solved; we ran against production today.
We decreased dispatcherMaxReadBatchSize from 10000 to 500 for our Flink job, because reading from all 8 partitions at once was OOM'ing the brokers with the memory allocation we had set.
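For reference, one way to apply that setting is through the same Helm values overrides shown above, via the broker configData block; this is a sketch that assumes the chart passes the key straight through to broker.conf, as it does for the other configData entries in this issue:
broker:
  configData:
    dispatcherMaxReadBatchSize: "500"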
Running in production with graphs from Flink, this is 4x what we were getting before, and we're seeing matching graphs in Pulsar's Grafana with the same 4x throughput.
@ckdarby awesome!