AWS IoT endpoints not resolved by many default router DNS providers - "Name or service not known"
I have a service/script which allows people to control their TV using Alexa: https://github.com/eclair4151/AlexaControlledSamsungTV
It uses this lib to connect over MQTT, and one issue many users get is the following error:
https://github.com/eclair4151/AlexaControlledSamsungTV/issues/21 https://github.com/eclair4151/AlexaControlledSamsungTV/issues/11 https://github.com/eclair4151/AlexaControlledSamsungTV/issues/6 https://github.com/eclair4151/AlexaControlledSamsungTV/issues/4
It can also be seen in previous issues of this lib here, which were closed for inactivity: https://github.com/aws/aws-iot-device-sdk-python/issues/101 https://github.com/aws/aws-iot-device-sdk-python/issues/133 https://github.com/aws/aws-iot-device-sdk-python/issues/27
  File "/home/pi/AlexaControlledSamsungTV/helpers/mqtt_server.py", line 247, in startServer
    myMQTTClient.connect()
  File "/home/pi/.local/lib/python3.5/site-packages/AWSIoTPythonSDK/MQTTLib.py", line 408, in connect
    return self._mqtt_core.connect(keepAliveIntervalSecond)
  File "/home/pi/.local/lib/python3.5/site-packages/AWSIoTPythonSDK/core/protocol/mqtt_core.py", line 168, in connect
    self.connect_async(keep_alive_sec, self._create_blocking_ack_callback(event))
  File "/home/pi/.local/lib/python3.5/site-packages/AWSIoTPythonSDK/core/protocol/mqtt_core.py", line 179, in connect_async
    rc = self._internal_async_client.connect(keep_alive_sec, ack_callback)
  File "/home/pi/.local/lib/python3.5/site-packages/AWSIoTPythonSDK/core/protocol/internal/clients.py", line 113, in connect
    rc = self._paho_client.connect(host, port, keep_alive_sec)
  File "/home/pi/.local/lib/python3.5/site-packages/AWSIoTPythonSDK/core/protocol/paho/client.py", line 654, in connect
    return self.reconnect()
  File "/home/pi/.local/lib/python3.5/site-packages/AWSIoTPythonSDK/core/protocol/paho/client.py", line 776, in reconnect
    sock = socket.create_connection((self._host, self._port), source_address=(self._bind_address, 0))
  File "/usr/lib/python3.5/socket.py", line 694, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/usr/lib/python3.5/socket.py", line 733, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
Every case of this issue I have seen comes down to the user's default DNS server failing to resolve my endpoint, so the connection fails. In each case, having the user manually set the machine's DNS servers to 8.8.8.8 and 8.8.4.4 fixes the issue immediately.
The problem is that I can't ask all my users to change their DNS settings just to use my software. I can't for the life of me figure out why this happens or how to fix it.
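To tell a DNS failure apart from other connection problems before calling connect(), the script could probe the resolver first. The following is a minimal sketch, not part of the SDK: can_resolve is a hypothetical helper name, and the injectable resolver parameter is an assumption added only so the check can be exercised without a network. It makes the same getaddrinfo call that appears at the bottom of the traceback.

```python
import socket


def can_resolve(hostname, port=8883, resolver=socket.getaddrinfo):
    """Return True if the resolver finds at least one address for hostname.

    Hypothetical helper: mirrors the getaddrinfo(host, port, 0, SOCK_STREAM)
    call from the traceback and turns socket.gaierror into a boolean.
    """
    try:
        return len(resolver(hostname, port, 0, socket.SOCK_STREAM)) > 0
    except socket.gaierror:
        # Covers "[Errno -2] Name or service not known" and other lookup errors.
        return False
```

A script could call can_resolve("afkx1f9takwol.iot.us-east-1.amazonaws.com") at startup and print a hint about DNS settings instead of the raw traceback when it returns False.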
This is how the script connects, taken directly from an example here:
from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient

myMQTTClient = AWSIoTMQTTClient(clientid)
myMQTTClient.configureEndpoint("afkx1f9takwol.iot.us-east-1.amazonaws.com", 8883)
myMQTTClient.configureCredentials(".auth/root.pem", ".auth/private.pem.key", ".auth/certificate.pem.crt")
myMQTTClient.configureOfflinePublishQueueing(-1) # Infinite offline Publish queueing
myMQTTClient.configureDrainingFrequency(2) # Draining: 2 Hz
myMQTTClient.configureConnectDisconnectTimeout(10) # 10 sec
myMQTTClient.configureMQTTOperationTimeout(5) # 5 sec
myMQTTClient.connect()
Do you have any thoughts or suggestions on how to solve this?

There are multiple customer reports of DNS failures (on IoT addresses) on actual devices, where host-level resolution had to be reconfigured to get the IoT endpoints to resolve. I am resurrecting this as a model use case for some new DNS functionality I’d like to add to the v2 SDKs.
In particular, I’d like to add the ability to configure a fallback resolution method when the default host resolver fails (or after a configurable amount of time with no response). We’ll start with dns-over-udp and dns-over-tls; dns-over-https can come later. I believe that moving to async versions of getaddrinfo (via integration with the platform event loop) is a prerequisite to this work (primarily to move the default host resolver away from its synchronous loop to an event-driven model).
The primary benefit of this addition would be a transparent software-only solution (i.e. configure the fallback to 8.8.8.8 or something similar) rather than a user-hardware-software-configuration solution.
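The fallback idea can be illustrated in application code with the Python standard library alone. This is a rough sketch under stated assumptions, not the v2 SDK design and not something the v1 SDK provides: all function names are hypothetical, only A records over plain DNS-over-UDP are handled, and the 8.8.8.8 default comes from the workaround reported in this issue. It tries the host resolver first and queries the fallback server directly only when that raises socket.gaierror.

```python
import secrets
import socket
import struct


def build_query(hostname, query_id):
    """Encode a DNS query for the A record of hostname (RFC 1035 wire format)."""
    # Flags 0x0100 = recursion desired; one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(message, offset):
    """Return the offset just past an encoded (possibly compressed) name."""
    while True:
        length = message[offset]
        if length & 0xC0 == 0xC0:  # compression pointer: two bytes, ends the name
            return offset + 2
        if length == 0:            # root label ends the name
            return offset + 1
        offset += 1 + length


def parse_a_records(message, query_id):
    """Extract IPv4 addresses from the answer section of a DNS response."""
    rid, flags, qdcount, ancount, _, _ = struct.unpack("!HHHHHH", message[:12])
    if rid != query_id or flags & 0x000F != 0:  # wrong ID or non-zero RCODE
        return []
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(message, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = _skip_name(message, offset)
        rtype, rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", message[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rclass == 1 and rdlength == 4:
            addresses.append(socket.inet_ntoa(message[offset:offset + 4]))
        offset += rdlength  # non-A records such as CNAME are skipped
    return addresses


def resolve_with_fallback(hostname, fallback_server="8.8.8.8", timeout=3.0):
    """Try the host resolver; on gaierror, ask fallback_server over UDP."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
        return [info[4][0] for info in infos]
    except socket.gaierror:
        pass
    query_id = secrets.randbits(16)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname, query_id), (fallback_server, 53))
        response, _ = sock.recvfrom(512)
    return parse_a_records(response, query_id)
```

Note that an address obtained this way cannot simply replace the hostname when connecting: TLS certificate validation still depends on the endpoint's hostname, which is one reason a resolver hook inside the SDK is a cleaner place for this logic than application code.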
Note that this will NOT fix the issue in the v1 SDKs.