Rewrite the way of working with UDP traffic in XiaomiGateway._send_cmd()
The whole way of dealing with UDP traffic is wrong there. Compared to TCP, UDP guarantees almost nothing.
https://github.com/Danielhiversen/PyXiaomiGateway/blob/master/xiaomi_gateway/__init__.py#L268
The code assumes it will get a response to the request it has just sent, but that is too naive. UDP guarantees neither ordering nor delivery. Multiple UDP dispatches at the same time can cause a lot of problems, such as each recvfrom() receiving the response meant for its neighbour. Here is an example of this bug in HA:
20:00:53 DEBUG (SyncWorker_12) [xiaomi_gateway] >> b'{ "cmd":"read","sid":"1"}'
20:00:53 DEBUG (SyncWorker_2) [xiaomi_gateway] >> b'{"cmd": "write", "sid": "2", "data": {"rgb": 0, "key": "xxx"}}'
20:00:53 DEBUG (SyncWorker_2) [xiaomi_gateway] << {'cmd': 'read_ack', 'model': 'magnet', 'sid': '1', 'short_id': 43528, 'data': '{"voltage":3035,"status":"open"}'}
20:00:53 ERROR (SyncWorker_2) [xiaomi_gateway] Non matching response. Expecting write_ack, but got read_ack. {'cmd': 'read_ack', 'model': 'magnet', 'sid': '1', 'short_id': 43528, 'data': '{"voltage":3035,"status":"open"}'}
20:00:53 ERROR (SyncWorker_2) [xiaomi_gateway] No data in response from hub None
20:00:53 DEBUG (SyncWorker_12) [xiaomi_gateway] << {'cmd': 'write_ack', 'model': 'gateway', 'sid': '2', 'short_id': 0, 'data': '{"rgb":0,"illumination":312,"proto_version":"1.0.9"}'}
20:00:53 ERROR (SyncWorker_12) [xiaomi_gateway] Non matching response. Expecting read_ack, but got write_ack. {'cmd': 'write_ack', 'model': 'gateway', 'sid': '2', 'short_id': 0, 'data': '{"rgb":0,"illumination":312,"proto_version":"1.0.9"}'}
20:00:53 ERROR (SyncWorker_12) [xiaomi_gateway] No data in response from hub None
The devices didn’t get the right responses, so their states were not updated. With 2 simultaneous reads, devices can even receive each other’s states. As you can see, there is infinite room for bugs 😃.
It’s OK to do send + immediate receive when you’re 100% sure there will be no simultaneous dispatches. But that is not the case for write_to_hub() and get_from_hub(), which can be executed asynchronously, without waiting for the previous call to return its result.
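The log above can be reproduced in miniature. The following is only a simulation under stated assumptions: a `queue.Queue` stands in for the shared UDP socket, and `fake_hub_reply` and `recv_and_check` are hypothetical helpers, not functions from PyXiaomiGateway.

```python
import json
import queue

# Stand-in for the shared UDP socket: whoever reads next gets whatever arrived next.
shared_socket = queue.Queue()


def fake_hub_reply(request):
    """Hypothetical hub: answers a request with the matching *_ack."""
    return {"cmd": request["cmd"] + "_ack", "sid": request["sid"]}


def recv_and_check(expected_cmd):
    """Mimics send + immediate receive: read the next datagram, then verify it."""
    response = json.loads(shared_socket.get_nowait())
    return response["cmd"] == expected_cmd, response


read_req = {"cmd": "read", "sid": "1"}
write_req = {"cmd": "write", "sid": "2"}

# Both requests are in flight; the replies arrive as read_ack, then write_ack.
shared_socket.put(json.dumps(fake_hub_reply(read_req)))
shared_socket.put(json.dumps(fake_hub_reply(write_req)))

# The writer happens to read first and takes the reply meant for the reader.
writer_ok, writer_got = recv_and_check("write_ack")
reader_ok, reader_got = recv_and_check("read_ack")
```

Both checks fail here, which is the same "Non matching response" pattern as in the HA log: neither reply was lost, each was simply consumed by the wrong caller.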
What do I propose?
- Make a single while True: recvfrom() loop per XiaomiGateway, at least after initialization. It will check for an error/invalid key, or call push_data() on success, which will update the device state.
- Don’t call recvfrom() in _send_cmd(), at least after initialization.
- Remove the check of the write_to_hub() result in HA (e.g. https://github.com/home-assistant/home-assistant/blob/dev/homeassistant/components/light/xiaomi_aqara.py#L98). The hub will respond and the single listener will call push_data().
- Add a tracker that keeps a list of the responses we are waiting for. If we don’t get a response to a request within a 10-second time frame, log an error. If we get a response we were not waiting for, log an error. You can use sid + cmd to identify the response (it’s the best we can do, as there is no way to supply/receive request ids).
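The proposal above could be sketched roughly as follows. This is not the library's code: `ResponseTracker`, `handle_datagram` and `listen` are hypothetical names, the `push_data` and `log_error` callbacks are passed in as plain callables, and dropping unexpected responses after logging them is an assumption. The sketch also assumes the socket has a timeout set, so the loop wakes up regularly to report overdue requests.

```python
import json
import socket
import threading
import time


class ResponseTracker:
    """Tracks requests awaiting a reply, keyed by (sid, expected ack cmd)."""

    def __init__(self, timeout=10.0, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._lock = threading.Lock()  # expect() and resolve() run in different threads
        self._pending = {}  # (sid, ack_cmd) -> time the request was sent

    def expect(self, sid, cmd):
        """Record a sent request, e.g. cmd='read' expects 'read_ack'."""
        with self._lock:
            self._pending[(sid, cmd + "_ack")] = self._clock()

    def resolve(self, response):
        """Return True if this response was awaited, False if it was unexpected."""
        key = (response.get("sid"), response.get("cmd"))
        with self._lock:
            return self._pending.pop(key, None) is not None

    def expired(self):
        """Remove and return the keys that waited longer than the timeout."""
        with self._lock:
            now = self._clock()
            late = [k for k, sent in self._pending.items()
                    if now - sent > self._timeout]
            for key in late:
                del self._pending[key]
            return late


def handle_datagram(tracker, response, push_data, log_error):
    """What the single listener does with each decoded datagram."""
    if not tracker.resolve(response):
        log_error("Unexpected response: %r" % (response,))
        return
    push_data(response)


def listen(sock, tracker, push_data, log_error):
    """Single receive loop per gateway; assumes sock.settimeout() was called."""
    while True:
        try:
            data, _addr = sock.recvfrom(4096)
        except socket.timeout:
            data = None
        if data is not None:
            handle_datagram(tracker, json.loads(data.decode("utf-8")),
                            push_data, log_error)
        for sid, ack_cmd in tracker.expired():
            log_error("No %s from sid %s within timeout" % (ack_cmd, sid))
```

With this shape, _send_cmd() would only call `tracker.expect(sid, cmd)` and send the datagram, never `recvfrom()`. Because the key is just sid + cmd, two identical requests in flight at once collapse into one entry, which is the limitation the issue already acknowledges.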
Related issue: https://community.home-assistant.io/t/xiaomi-gateway-errors-but-seems-to-be-working-fine/38633
Issue Analytics
- Created 6 years ago
- Reactions:2
- Comments:24 (3 by maintainers)
Top GitHub Comments
I’m gonna fix this bug this week.
Hi, sorry, my coding skills are very poor. What do I need to do in this case? I am using the newest version of HassIO with Docker.
Thanks.