No messages delivered on brokered addresses anymore
We have done some performance tests, first by sending messages on an anycast address and then on a brokered address.
During the test with the brokered address, we got some errors in the broker that the corresponding queue is full:
2018-05-22 14:37:57,324 WARN [org.apache.activemq.artemis.protocol.amqp.proton.ProtonServerReceiverContext] AMQ119102: Address “event/LOAD_TEST_TENANT” is full.: ActiveMQAddressFullException[errorType=ADDRESS_FULL message=AMQ119102: Address “event/LOAD_TEST_TENANT” is full.]
In the router stats, these errors correspond to rejected messages.
Now the issue: Later messages on the brokered address don’t get delivered anymore - even when the size of the Artemis address is zero again. On the sender side in Hono we get a timeout. Looking at the router stats, these new messages increment the “undelivered” counter. Stats before:
Router Links
type dir conn id id peer class addr phs cap undel unsett del presett psdrop acc rej rel mod admin oper
==================================================================================================================================================================
endpoint in 1 6 250 0 0 26 0 0 26 0 0 0 enabled up
endpoint out 1 7 local temp.Zc9fZGY5eu8QBMf 250 0 0 26 26 0 0 0 0 0 enabled up
endpoint out 15 30 mobile event/LOAD_TEST_TENANT 0 250 0 0 2623 0 0 2623 0 0 0 enabled up
endpoint out 5 31 mobile telemetry/HEALTH_CHECK_TENANT 0 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint out 6 32 mobile event/HEALTH_CHECK_TENANT 0 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint in 95 83 250 0 0 34728 0 0 34728 0 0 0 enabled up
endpoint out 95 84 local temp.KsDwq5hpjW5bInE 250 0 0 34728 34728 0 0 0 0 0 enabled up
endpoint in 169 128 mobile event/HEALTH_CHECK_TENANT 1 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint out 169 129 mobile event/HEALTH_CHECK_TENANT 0 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint in 170 130 mobile event/HUB_TEAM_TENANT 1 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint out 170 131 mobile event/HUB_TEAM_TENANT 0 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint in 171 132 mobile event/LOAD_TEST_TENANT 1 250 0 0 15653 0 0 15653 0 0 0 enabled up
endpoint out 171 133 mobile event/LOAD_TEST_TENANT 0 250 200 0 15900 0 0 15653 247 0 0 enabled up
endpoint in 172 134 mobile event/SMOKE_TEST_TENANT 1 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint out 172 135 mobile event/SMOKE_TEST_TENANT 0 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint in 173 136 mobile event/PERF_TEST_TENANT 1 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint out 173 137 mobile event/PERF_TEST_TENANT 0 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint in 33081 15620 mobile $management 0 250 0 0 1 0 0 1 0 0 0 enabled up
endpoint out 33081 15621 local temp.GeDiBp2KLr7VhZ+ 250 0 0 0 0 0 0 0 0 0 enabled up
Then, after having sent 7 messages on event/LOAD_TEST_TENANT:
Router Links
type dir conn id id peer class addr phs cap undel unsett del presett psdrop acc rej rel mod admin oper
==============================================================================================================================================================
endpoint in 1 6 250 0 0 26 0 0 26 0 0 0 enabled up
endpoint out 1 7 local temp.Zc9fZGY5eu8QBMf 250 0 0 26 26 0 0 0 0 0 enabled up
endpoint in 95 83 250 0 0 36648 0 0 36648 0 0 0 enabled up
endpoint out 95 84 local temp.KsDwq5hpjW5bInE 250 0 0 36648 36648 0 0 0 0 0 enabled up
endpoint in 169 128 mobile event/HEALTH_CHECK_TENANT 1 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint out 169 129 mobile event/HEALTH_CHECK_TENANT 0 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint in 170 130 mobile event/HUB_TEAM_TENANT 1 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint out 170 131 mobile event/HUB_TEAM_TENANT 0 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint in 171 132 mobile event/LOAD_TEST_TENANT 1 250 0 0 15653 0 0 15653 0 0 0 enabled up
endpoint out 171 133 mobile event/LOAD_TEST_TENANT 0 250 207 0 15900 0 0 15653 247 0 0 enabled up
endpoint in 172 134 mobile event/SMOKE_TEST_TENANT 1 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint out 172 135 mobile event/SMOKE_TEST_TENANT 0 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint in 173 136 mobile event/PERF_TEST_TENANT 1 250 0 0 0 0 0 0 0 0 0 enabled up
endpoint out 173 137 mobile event/PERF_TEST_TENANT 0 250 0 0 5 0 0 5 0 0 0 enabled up
endpoint in 11 16383 mobile event/PERF_TEST_TENANT 0 50 0 0 1 0 0 1 0 0 0 enabled up
endpoint in 35814 16488 mobile $management 0 250 0 0 1 0 0 1 0 0 0 enabled up
endpoint out 35814 16489 local temp.YPqAZFlsiDm7UT_ 250 0 0 0 0 0 0 0 0 0 enabled up
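To make the difference between the two snapshots easier to see, here is a small sketch that diffs the counters of the outgoing link for event/LOAD_TEST_TENANT (conn id 171, link id 133). The helper `parse_link_row` is hypothetical and not part of any router tooling. It assumes the last twelve whitespace-separated fields of each row are `cap` through `oper`, which matches the header above; the leading fields are ignored because some of them are blank in the output.

```python
# Counter columns as they appear in the header, from "cap" up to "mod".
COUNTERS = ["cap", "undel", "unsett", "del", "presett", "psdrop",
            "acc", "rej", "rel", "mod"]


def parse_link_row(row):
    """Hypothetical helper: read the numeric counters of one link row.

    Fields are taken from the right-hand side, because the peer, class,
    addr and phs columns are empty for some links.
    """
    tokens = row.split()
    # The last two tokens are the admin and oper status; the ten tokens
    # before them are the counters listed in COUNTERS.
    values = tokens[-12:-2]
    return dict(zip(COUNTERS, map(int, values)))


# Rows copied from the two snapshots above.
before = parse_link_row(
    "endpoint out 171 133 mobile event/LOAD_TEST_TENANT 0 250 200 0 15900 0 0 15653 247 0 0 enabled up"
)
after = parse_link_row(
    "endpoint out 171 133 mobile event/LOAD_TEST_TENANT 0 250 207 0 15900 0 0 15653 247 0 0 enabled up"
)

# Keep only the counters that changed between the snapshots.
delta = {name: after[name] - before[name]
         for name in COUNTERS if after[name] != before[name]}
print(delta)  # {'undel': 7}
```

Only `undel` moves, by exactly the 7 messages sent; `del`, `acc` and `rej` stay at 15900, 15653 and 247, so none of the new messages left the router.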
Restarting either router or broker resolves the issue.
The test is reproducible in our environment. Router and broker are from EnMasse 0.19.
We have assembled some log output with PN_TRACE_FRM=1:
- Test resulting in undelivered messages (address is event/LOAD_TEST_TENANT): ErrorCase_LOAD_TEST_TENANT_Router.log ErrorCase_LOAD_TEST_TENANT_Broker.log
- for comparison: Test with messages that get delivered (address is event/PERF_TEST_TENANT): OKCase_PERF_TEST_TENANT_Router.log OKCase_PERF_TEST_TENANT_Broker.log
Top GitHub Comments
Thanks, these show the problem clearly. It appears to be a bug in the broker, whereby it is not issuing credit once space opens up on the queue: https://issues.apache.org/jira/browse/ARTEMIS-1898
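For readers less familiar with AMQP link credit, the following toy model illustrates why a receiver that stops granting credit leaves messages stuck as "undelivered" on the sending side. It is only a sketch under simplifying assumptions: `ToyReceiver`, `ToySender` and `simulate` are hypothetical names, the classes are not Artemis or router code, and real credit is exchanged through flow frames rather than a shared attribute.

```python
class ToyReceiver:
    """Hypothetical stand-in for the broker end of a link."""

    def __init__(self, capacity, replenish_on_drain):
        self.capacity = capacity
        self.replenish_on_drain = replenish_on_drain
        self.queue = []
        self.credit = capacity  # credit currently granted to the sender

    def receive(self, message):
        # Each transfer consumes one unit of credit.
        self.credit -= 1
        self.queue.append(message)

    def drain(self):
        # Consumers empty the queue, so space is available again.
        self.queue.clear()
        if self.replenish_on_drain:
            # Expected behaviour: grant fresh credit to the sender.
            self.credit = self.capacity


class ToySender:
    """Hypothetical stand-in for the router's outgoing link."""

    def __init__(self, receiver):
        self.receiver = receiver
        self.undelivered = []

    def send(self, message):
        self.undelivered.append(message)
        # Forward queued messages only while credit is available.
        while self.undelivered and self.receiver.credit > 0:
            self.receiver.receive(self.undelivered.pop(0))


def simulate(replenish_on_drain):
    receiver = ToyReceiver(capacity=10, replenish_on_drain=replenish_on_drain)
    sender = ToySender(receiver)
    for i in range(10):      # fill the address and use up all credit
        sender.send(i)
    receiver.drain()         # address size is zero again
    for i in range(7):       # later messages
        sender.send(i)
    return len(sender.undelivered)


print(simulate(replenish_on_drain=False))  # 7
print(simulate(replenish_on_drain=True))   # 0
```

With no credit granted after the drain, all 7 later messages stay queued on the sender, which matches the growing `undel` counter in the router stats.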
@grs artemis issue is resolved, so can we close this issue as well?