Rewrite the way of working with UDP traffic in XiaomiGateway._send_cmd()

The way XiaomiGateway._send_cmd() deals with UDP traffic is fundamentally wrong. Unlike TCP, UDP guarantees neither delivery nor ordering. https://github.com/Danielhiversen/PyXiaomiGateway/blob/master/xiaomi_gateway/__init__.py#L268 The code assumes that the next datagram it receives is the response to the request it has just sent, which is too naive. When several requests are dispatched at the same time, each recvfrom() can pick up the response meant for a neighbouring request. Here is an example of this bug in HA:

20:00:53 DEBUG (SyncWorker_12) [xiaomi_gateway] >> b'{ "cmd":"read","sid":"1"}'
20:00:53 DEBUG (SyncWorker_2) [xiaomi_gateway] >> b'{"cmd": "write", "sid": "2", "data": {"rgb": 0, "key": "xxx"}}'
20:00:53 DEBUG (SyncWorker_2) [xiaomi_gateway] << {'cmd': 'read_ack', 'model': 'magnet', 'sid': '1', 'short_id': 43528, 'data': '{"voltage":3035,"status":"open"}'}
20:00:53 ERROR (SyncWorker_2) [xiaomi_gateway] Non matching response. Expecting write_ack, but got read_ack. {'cmd': 'read_ack', 'model': 'magnet', 'sid': '1', 'short_id': 43528, 'data': '{"voltage":3035,"status":"open"}'}
20:00:53 ERROR (SyncWorker_2) [xiaomi_gateway] No data in response from hub None
20:00:53 DEBUG (SyncWorker_12) [xiaomi_gateway] << {'cmd': 'write_ack', 'model': 'gateway', 'sid': '2', 'short_id': 0, 'data': '{"rgb":0,"illumination":312,"proto_version":"1.0.9"}'}
20:00:53 ERROR (SyncWorker_12) [xiaomi_gateway] Non matching response. Expecting read_ack, but got write_ack. {'cmd': 'write_ack', 'model': 'gateway', 'sid': '2', 'short_id': 0, 'data': '{"rgb":0,"illumination":312,"proto_version":"1.0.9"}'}
20:00:53 ERROR (SyncWorker_12) [xiaomi_gateway] No data in response from hub None

The devices did not get the right responses, so their states were not updated. With two simultaneous reads, devices can end up with each other’s states. As you can see, there is endless room for bugs 😃.
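
The crossed responses can be reproduced on loopback without the real gateway. This is only a sketch under assumptions: the `hub` socket below is a hypothetical stand-in that answers every request with a `<cmd>_ack`, and it deliberately replies in reverse order to imitate what UDP is allowed to do. It is not the library's code.

```python
import json
import socket


def demo_crossed_responses():
    """Two callers share one UDP socket and each reads 'its' reply."""
    hub = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # fake hub
    hub.bind(("127.0.0.1", 0))
    hub.settimeout(2)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # shared socket
    client.bind(("127.0.0.1", 0))
    client.settimeout(2)
    try:
        # Both requests are sent before either caller reads a reply.
        requests = [{"cmd": "read", "sid": "1"}, {"cmd": "write", "sid": "2"}]
        for req in requests:
            client.sendto(json.dumps(req).encode(), hub.getsockname())

        # The fake hub answers in reverse order (UDP permits reordering).
        received = [hub.recvfrom(1024) for _ in requests]
        for data, addr in reversed(received):
            req = json.loads(data.decode())
            ack = {"cmd": req["cmd"] + "_ack", "sid": req["sid"]}
            hub.sendto(json.dumps(ack).encode(), addr)

        # Each caller now does its "immediate recvfrom()" in send order.
        results = []
        for req in requests:
            data, _ = client.recvfrom(1024)
            got = json.loads(data.decode())["cmd"]
            results.append((req["cmd"] + "_ack", got))  # (expected, got)
        return results
    finally:
        hub.close()
        client.close()


if __name__ == "__main__":
    for expected, got in demo_crossed_responses():
        print("expected", expected, "got", got)
```

Each caller ends up holding the other caller's ack, which is the same "Non matching response" situation as in the log above.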

Sending and then immediately receiving is fine when you are 100% sure there will be no simultaneous dispatches. That is not the case for write_to_hub() and get_from_hub(), which can be executed concurrently, without waiting for the previous call to return its result.

What do I propose?

  1. Run a single while True: recvfrom() loop per XiaomiGateway, at least after initialization. It checks each response for an error or invalid key and, on success, calls push_data(), which updates the device state.
  2. Don’t call recvfrom() in _send_cmd(), at least after initialization.
  3. Remove the check of the write_to_hub() result in HA (e.g. https://github.com/home-assistant/home-assistant/blob/dev/homeassistant/components/light/xiaomi_aqara.py#L98). The hub will respond and the single listener will call push_data().
  4. Add a tracker that keeps a list of the responses we are waiting for. If no response arrives for a request within a 10-second time frame, log an error. If a response arrives that we were not waiting for, log an error. You can use sid + cmd to identify the response (it’s the best we can do, as there is no way to supply or receive request IDs).
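
The steps above could be sketched roughly as follows. This is not the library's implementation: `PendingResponses` and `listen` are hypothetical names introduced for illustration, `push_data` is passed in as a plain callback, and the sketch assumes a response can be matched by its sid plus the request cmd with an `_ack` suffix, as seen in the log above. Error and invalid-key handling from step 1 is left out.

```python
import json
import logging
import socket
import threading
import time

_LOGGER = logging.getLogger(__name__)


class PendingResponses:
    """Hypothetical tracker: requests awaiting an ack, keyed by (sid, ack cmd)."""

    def __init__(self, timeout=10.0, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock  # injectable so the timeout can be tested
        self._pending = {}   # (sid, "<cmd>_ack") -> deadline
        self._lock = threading.Lock()

    def expect(self, sid, cmd):
        """Record that a request was sent; called by the sender, no recvfrom()."""
        with self._lock:
            self._pending[(sid, cmd + "_ack")] = self._clock() + self._timeout

    def resolve(self, response):
        """Return True if this response was awaited, else log an error."""
        key = (response.get("sid"), response.get("cmd"))
        with self._lock:
            found = self._pending.pop(key, None) is not None
        if not found:
            _LOGGER.error("Response nobody was waiting for: %s", response)
        return found

    def expire(self):
        """Drop, log and return the requests whose deadline has passed."""
        now = self._clock()
        with self._lock:
            late = [k for k, deadline in self._pending.items() if deadline <= now]
            for key in late:
                del self._pending[key]
        for sid, cmd in late:
            _LOGGER.error("No %s from sid %s in %s s", cmd, sid, self._timeout)
        return late


def listen(sock, tracker, push_data, stop):
    """The single receive loop: the only place that calls recvfrom()."""
    sock.settimeout(1.0)  # wake up regularly to check `stop` and deadlines
    while not stop.is_set():
        try:
            data, _ = sock.recvfrom(1024)
        except socket.timeout:
            tracker.expire()
            continue
        try:
            response = json.loads(data.decode())
        except ValueError:  # covers invalid JSON and invalid UTF-8
            _LOGGER.error("Cannot decode response: %r", data)
            continue
        if isinstance(response, dict) and tracker.resolve(response):
            push_data(response)  # state update happens here, not in the sender
        tracker.expire()
```

With this shape, a sender would only call `tracker.expect(sid, cmd)` and `sendto()`; it never reads from the socket, so two simultaneous requests cannot steal each other's responses. Whether unexpected responses should still be forwarded to `push_data` is a design choice this sketch leaves open.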

Related issue: https://community.home-assistant.io/t/xiaomi-gateway-errors-but-seems-to-be-working-fine/38633

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 2
  • Comments: 24 (3 by maintainers)

Top GitHub Comments

2 reactions
PaulAnnekov commented, Sep 17, 2018

I’m gonna fix this bug this week.

0 reactions
cosinguyen commented, Aug 14, 2019

Hi, sorry, my coding skills are very poor. What do I need to do in this case? I’m using the newest version of HassIO with Docker.

Thanks.
