
Pulsar on EBS having poor performance

See original GitHub issue

Describe the bug: Pulsar’s read performance on EBS is well below what the EBS volumes themselves can deliver, even after over-provisioning the volumes and ruling out every obvious bottleneck.

To Reproduce: Steps to reproduce the behavior:

  1. We’re using Pulsar helm install on AWS EKS https://github.com/apache/pulsar-helm-chart/commit/6e9ad25ba322f6f0fc7c11c66fb88faa6d0218db
  2. Our values.yaml overrides look like this:
pulsar:
  namespace: cory-ebs-test
  components:
    pulsar_manager: false # UI is outdated and won't load without errors
  auth:
    authentication:
      enabled: true
  bookkeeper:
    resources:
      requests:
        memory: 11560Mi
        cpu: 1.5
    volumes:
      journal:
        size: 100Gi
      ledgers:
        size: 5Ti
    configData:
      # `BOOKIE_MEM` is used for `bookie shell`
      BOOKIE_MEM: >
        "
        -Xms1280m
        -Xmx10800m
        -XX:MaxDirectMemorySize=10800m
        "
      # we use `bin/pulsar` for starting bookie daemons
      PULSAR_MEM: >
        "
        -Xms10800m
        -Xmx10800m
        -XX:MaxDirectMemorySize=10800m
        "
      # configure the memory settings based on jvm memory settings
      dbStorage_writeCacheMaxSizeMb: "2500" #pulsar docs say 25%
      dbStorage_readAheadCacheMaxSizeMb: "2500" #pulsar docs say 25%
      dbStorage_rocksDB_writeBufferSizeMB: "64" #pulsar docs had 64
      dbStorage_rocksDB_blockCacheSize: "1073741824" #pulsar docs say 10%
      readBufferSizeBytes: "8096" #attempted doubling
  autorecovery:
    resources:
      requests:
        memory: 2048Mi
        cpu: 1
    configData:
      BOOKIE_MEM: >
        "
        -Xms1500m -Xmx1500m
        "
  broker:
    resources:
      requests:
        memory: 4096Mi
        cpu: 1
    configData:
      PULSAR_MEM: >
        "
        -Xms1024m -Xmx4096m -XX:MaxDirectMemorySize=4096m
        -Dio.netty.leakDetectionLevel=disabled
        -Dio.netty.recycler.linkCapacity=1024
        -XX:+ParallelRefProcEnabled
        -XX:+UnlockExperimentalVMOptions
        -XX:+DoEscapeAnalysis
        -XX:ParallelGCThreads=4
        -XX:ConcGCThreads=4
        -XX:G1NewSizePercent=50
        -XX:+DisableExplicitGC
        -XX:-ResizePLAB
        -XX:+ExitOnOutOfMemoryError
        -XX:+PerfDisableSharedMem
        "
  proxy:
    resources:
      requests:
        memory: 4096Mi
        cpu: 1
    configData:
      PULSAR_MEM: >
        "
        -Xms1024m -Xmx4096m -XX:MaxDirectMemorySize=4096m
        -Dio.netty.leakDetectionLevel=disabled
        -Dio.netty.recycler.linkCapacity=1024
        -XX:+ParallelRefProcEnabled
        -XX:+UnlockExperimentalVMOptions
        -XX:+DoEscapeAnalysis
        -XX:ParallelGCThreads=4
        -XX:ConcGCThreads=4
        -XX:G1NewSizePercent=50
        -XX:+DisableExplicitGC
        -XX:-ResizePLAB
        -XX:+ExitOnOutOfMemoryError
        -XX:+PerfDisableSharedMem
        "
    service:
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: nlb
        external-dns.alpha.kubernetes.io/hostname: pulsar.internal.ckdarby
  toolset:
    resources:
      requests:
        memory: 1028Mi
        cpu: 1
    configData:
      PULSAR_MEM: >
        "
        -Xms640m
        -Xmx1028m
        -XX:MaxDirectMemorySize=1028m
        "
  grafana:
    service:
      annotations:
        external-dns.alpha.kubernetes.io/hostname: grafana.internal.ckdarby
    admin:
      user: admin
      password: 12345
  3. Produce messages to a multi-partitioned topic:
  • Partitioned by 8
  • Average message size is ~1.5 KB
  • Retention set to 7 days
  • We’re storing ~2-8 TB of retained data at times
  4. Attempt to consume messages with the offset set to earliest (thus skipping the RocksDB read cache and going to the backlog):

We have tried both the Flink Pulsar connector and Pulsar’s perf reader, run from the toolset pod against a single-partition topic (see the invocation sketch after this list). The perf reader reported this configuration:

{
  "confFile" : "/pulsar/conf/client.conf",
  "topic" : [ "persistent://public/cory/test-ebs-partition-5" ],
  "numTopics" : 1,
  "rate" : 0.0,
  "startMessageId" : "earliest",
  "receiverQueueSize" : 1000,
  "maxConnections" : 100,
  "statsIntervalSeconds" : 0,
  "serviceURL" : "pulsar://cory-ebs-test-pulsar-proxy:6650/",
  "authPluginClassName" : "org.apache.pulsar.client.impl.auth.AuthenticationToken",
  "authParams" : "file:///pulsar/tokens/client/token",
  "useTls" : false,
  "tlsTrustCertsFilePath" : ""
}
  5. Check Grafana, the EBS graphs, etc.:
  • We see really poor performance from Pulsar: 60-100 MB/s on the partition
  • We don’t see any bottlenecks
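
As referenced in step 4, the perf-reader configuration above corresponds roughly to an invocation like the following from the toolset pod. This is a sketch rather than the exact command from the issue: flag names can differ between Pulsar versions (check bin/pulsar-perf read --help), and authentication may also be picked up from /pulsar/conf/client.conf, which the config dump lists as confFile.

# Reader benchmark against a single partition; auth settings may come from
# /pulsar/conf/client.conf in the toolset pod (see "confFile" above).
bin/pulsar-perf read \
  --service-url pulsar://cory-ebs-test-pulsar-proxy:6650/ \
  --start-message-id earliest \
  --receiver-queue-size 1000 \
  persistent://public/cory/test-ebs-partition-5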

Expected behavior: Pulsar reads only 60-100 MB/s off each partition. We would expect something closer to the 200-300 MB/s that the bookie can actually read off EBS.

Additional context: Here is a real example with everything I could pull. The perf reader starts at 18:31:17 UTC and ends at 18:46:37 UTC; all of the graphs cover that window and are in UTC.

Perf Reader Output

18:31:17.389 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 58250.685  msg/s -- 647.672 Mbit/s
18:31:27.389 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 58523.641  msg/s -- 667.659 Mbit/s
18:31:37.390 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 61314.984  msg/s -- 688.519 Mbit/s
18:31:47.390 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 64920.905  msg/s -- 748.406 Mbit/s
18:31:57.390 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 64340.229  msg/s -- 732.601 Mbit/s
...
18:42:17.416 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 64034.036  msg/s -- 723.160 Mbit/s
18:42:27.419 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 63048.031  msg/s -- 700.458 Mbit/s
18:42:37.421 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 69958.533  msg/s -- 817.095 Mbit/s
18:42:47.422 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 69898.133  msg/s -- 827.770 Mbit/s
18:42:57.422 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 62989.179  msg/s -- 726.990 Mbit/s
18:43:07.422 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 63500.736  msg/s -- 728.683 Mbit/s
...
18:45:37.430 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 55052.395  msg/s -- 645.263 Mbit/s
18:45:47.431 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 72004.353  msg/s -- 804.856 Mbit/s
18:45:57.431 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 86224.170  msg/s -- 954.399 Mbit/s
18:46:07.431 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 80231.708  msg/s -- 905.096 Mbit/s
18:46:17.432 [main] INFO  org.apache.pulsar.testclient.PerformanceReader - Read throughput: 73065.824  msg/s -- 864.556 Mbit/s
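
For readers comparing units: the first line above works out to roughly 647.672 Mbit/s ÷ 8 ≈ 81 MB/s (about 1.4 KB per message at 58,250 msg/s), so the Mbit/s figures in this log and the 60-100 MB/s figures quoted elsewhere in the issue describe the same read throughput.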

Bookie reading directly from EBS (disk cache flushed beforehand; measured before running the perf reader): [screenshot Selection_292]
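
For anyone reproducing that baseline, one way to measure raw sequential reads from the ledger volume inside the bookie pod is sketched below. The mount path and entry-log file name are assumptions (check the actual paths on your bookie pod), and dropping the page cache requires root.

# Flush the Linux page cache so the read actually hits EBS (requires root).
sync && echo 3 > /proc/sys/vm/drop_caches

# Sequentially read an existing entry-log file with O_DIRECT so the page
# cache is bypassed; the path and file name below are illustrative only.
dd if=/pulsar/data/bookkeeper/ledgers/current/0.log of=/dev/null \
   bs=1M iflag=direct status=progress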

EC2 instances: 13 × r5.large, all in us-west-2c, all within Kubernetes

EBS volumes (gp2 provides a baseline of 3 IOPS per GiB):

  • Journal: 100 GiB, gp2, 300 IOPS
  • Ledgers: 5120 GiB, gp2, 15,360 IOPS

Ledger graphs: [screenshot Selection_009]

Grafana overview: [screenshot Selection_008]

JVM:
  • Bookie: [screenshot Selection_002]
  • Broker: [screenshot Selection_003]
  • Recovery: [screenshot Selection_004]
  • Zookeeper: [screenshot Selection_005]

Bookie: [screenshots Selection_006, Selection_007]

Specifically public/cory/test-ebs-partition-5: [screenshot Selection_001]

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 15 (4 by maintainers)

Top GitHub Comments

2 reactions
ckdarby commented, Jun 2, 2020

100% solved, ran against production today.

We decreased dispatcherMaxReadBatchSize from 10000 to 500 for our Flink job, because reading from all 8 partitions at once was OOM’ing the brokers with the memory allocation we had set.

Running in production, the Flink graphs show 4x the throughput we were getting before, and Pulsar’s Grafana dashboards show a matching 4x increase.

[screenshot: production throughput graphs]
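
For reference, with the Helm chart used above this setting could be merged into the existing broker configData block of the values file. A minimal sketch, assuming configData entries are passed straight through to broker.conf as with the other overrides in this issue:

broker:
  configData:
    # Cap how many entries the dispatcher reads from BookKeeper per read
    # request; with 8 partitions catching up at once, smaller batches keep
    # the broker's direct-memory usage bounded.
    dispatcherMaxReadBatchSize: "500"

Smaller read batches trade some per-read efficiency for a bounded amount of in-flight data per dispatcher, which is what resolved the broker OOMs described here.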

0 reactions
sijie commented, Jun 3, 2020

@ckdarby awesome!
