FlowControl / max_messages is not working as expected

Hi,

I am trying to implement a simple task queue using Pub/Sub and I think I am running into a bug with FlowControl.

I have a few workers that need to execute a callback one message at a time (without concurrency), and I use FlowControl with max_messages=1 for that. I expect every worker to receive one message, process it, ack it, and then pull another one.

import time
from google.cloud import pubsub
from google.cloud.pubsub import types

subscriber = pubsub.SubscriberClient()

subscription_name = 'projects/{project_id}/subscriptions/{sub}'.format(
    project_id='my-project-id',
    sub='my-subscription'
)

flow_control = types.FlowControl(max_messages=1)
    
subscription = subscriber.subscribe(subscription_name, flow_control=flow_control)

def callback(message):
    print('Received')    
    time.sleep(5)
    print('Done')
    message.ack()

future = subscription.open(callback)

future.result()
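
The expectation behind max_messages=1 can be pictured as a cap on the number of messages a worker holds without having acked them. The sketch below is only a standard-library simulation of that idea, not the Pub/Sub client's implementation: `OutstandingLimit`, `deliver` and `simulate` are hypothetical names, and a semaphore stands in for flow control.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class OutstandingLimit:
    """Hypothetical stand-in for flow control: caps unacked messages."""

    def __init__(self, max_messages):
        self._slots = threading.Semaphore(max_messages)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of callbacks seen running at once

    def deliver(self, callback, message):
        self._slots.acquire()  # wait until an outstanding-message slot is free
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        try:
            callback(message)
        finally:
            with self._lock:
                self._active -= 1
            self._slots.release()  # plays the role of ack(): frees the slot


def simulate(max_messages, num_messages=4):
    """Deliver num_messages on a thread pool and return the peak concurrency."""
    limit = OutstandingLimit(max_messages)
    with ThreadPoolExecutor(max_workers=num_messages) as pool:
        for i in range(num_messages):
            # The sleep stands in for the slow callback in the snippet above.
            pool.submit(limit.deliver, lambda m: time.sleep(0.05), i)
    return limit.peak
```

With a limit of 1, `simulate(1)` never sees more than one callback running at a time, which is the behaviour I was expecting from the subscriber above.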

But when I publish a few messages using the following code, they are all received and processed concurrently, by a single worker.

from google.cloud import pubsub
from google.cloud.pubsub import types

publisher = pubsub.PublisherClient()

topic = publisher.topic_path('my-project-id', 'my-topic')

for i in range(1, 5):
    publisher.publish(topic, b'my-message')

Am I missing something, or is this a bug?

Thanks!

My environment details

  • OS: Ubuntu 17.10
  • Python: 3.6
  • google-cloud-pubsub: 0.30.1

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

3 reactions
dan-ladd commented, Nov 6, 2019

For anyone who stumbles across this, this has been fixed here: https://github.com/googleapis/google-cloud-python/issues/7677

2 reactions
Etendard7 commented, Feb 22, 2018

Thanks for your answer, but the issue with the shared lock is that it doesn’t prevent a worker from receiving several messages and holding them until it has finished processing the first one.

Is there any way to distribute 1000 messages to 1000 workers with Pub/Sub so that each worker receives one message?

My use case is that I have an embarrassingly parallel task that can be split into 1000 sub-tasks, and each one consumes the entire memory of my worker machine.

I have tried to nack() every unwanted message, but it seems that other workers can’t see them until the current message has finished processing. That means I cannot parallelize my task.
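
The distribution asked about here (each worker takes exactly one message, finishes it, and only then asks for the next) can be sketched as a pull loop with no prefetching. This is a minimal simulation under stated assumptions, not Pub/Sub code: an in-memory `queue.Queue` stands in for the subscription, `get_nowait()` stands in for a pull limited to one message, `task_done()` stands in for ack, and `worker` and `run` are hypothetical names.

```python
import queue
import threading


def worker(name, subscription, handled, lock):
    """Take one message, process it, then ask for the next (no prefetching)."""
    while True:
        try:
            # Stand-in for a pull that requests a single message.
            message = subscription.get_nowait()
        except queue.Empty:
            return  # nothing left to do
        # Stand-in for the memory-hungry sub-task: record who handled what.
        with lock:
            handled.setdefault(name, []).append(message)
        subscription.task_done()  # stand-in for ack()


def run(num_messages=8, num_workers=4):
    """Preload the queue, start the workers, and return the work each one did."""
    subscription = queue.Queue()
    for i in range(num_messages):
        subscription.put(i)
    handled = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(
            target=worker, args=(f"worker-{n}", subscription, handled, lock)
        )
        for n in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled
```

Because a worker only takes a message when it is idle, no message sits reserved behind another one, and every message is handled exactly once. How the messages end up split between workers depends on thread scheduling.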
