edgeHub crashes after changing config.yaml when configured for additional offline storage

Expected Behavior

Changing RuntimeLogLevel to “debug” in /etc/iotedge/config.yaml and restarting the iotedge runtime (systemctl restart iotedge) while the edgeHub is configured for additional storage should change the log level of the edgeAgent and edgeHub without causing the edgeHub to crash repeatedly.

Current Behavior

If I configure the edgeHub to use non-container storage (https://docs.microsoft.com/en-us/azure/iot-edge/offline-capabilities), edit /etc/iotedge/config.yaml to change the RuntimeLogLevel, and restart the runtime (systemctl restart iotedge), the edgeHub fails to come back up. Its logs show that it received a 500 Internal Server Error when calling the workload API to decrypt.

Changing the config.yaml file back to the way it was before and restarting again does not resolve the issue.

Steps to Reproduce

Create the directory for offline/non-container edgeHub storage and change its ownership, if you haven’t already done so:
sudo mkdir /etc/iotedge/storage && sudo chown -R 1000:1000 /etc/iotedge/storage

Update your deployment.json to use this directory for storage. My systemModules section looks like the following:

"systemModules": {
    "edgeAgent": {
        "settings": {
            "image": "mcr.microsoft.com/azureiotedge-agent:1.0.6",
            "createOptions": ""
        },
        "type": "docker"
    },
    "edgeHub": {
        "settings": {
            "image": "mcr.microsoft.com/azureiotedge-hub:1.0.6",
            "createOptions": "{\"HostConfig\":{\"Binds\":[\"/etc/iotedge/storage/:/iotedge/storage/\"],\"PortBindings\":{\"8883/tcp\":[{\"HostPort\":\"8883\"}],\"443/tcp\":[{\"HostPort\":\"443\"}],\"5671/tcp\":[{\"HostPort\":\"5671\"}]}}}"
        },
        "type": "docker",
        "env": {
            "storageFolder": {
                "value": "/iotedge/storage/"
            }
        },
        "status": "running",
        "restartPolicy": "always"
    }
}
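
The createOptions value above is itself a JSON document stored as an escaped string, which is easy to get wrong when edited by hand. As a minimal sketch (the function and constant names are hypothetical, not part of IoT Edge), the string can be generated from a plain dictionary instead:

```python
import json

# Host directory and in-container mount point, taken from the report above.
HOST_STORAGE_DIR = "/etc/iotedge/storage/"
CONTAINER_STORAGE_DIR = "/iotedge/storage/"


def build_edgehub_create_options(host_dir: str, container_dir: str) -> str:
    """Return the edgeHub createOptions value as a JSON-encoded string."""
    options = {
        "HostConfig": {
            # Bind-mount the host directory into the container.
            "Binds": [f"{host_dir}:{container_dir}"],
            # Publish each container port on the same host port.
            "PortBindings": {
                port: [{"HostPort": port.split("/")[0]}]
                for port in ("8883/tcp", "443/tcp", "5671/tcp")
            },
        }
    }
    # Compact separators give the whitespace-free form used in the report.
    return json.dumps(options, separators=(",", ":"))


create_options = build_edgehub_create_options(HOST_STORAGE_DIR, CONTAINER_STORAGE_DIR)

# Dumping the settings a second time escapes the inner quotes, which is the
# form that appears in deployment.json.
edge_hub_settings = {
    "image": "mcr.microsoft.com/azureiotedge-hub:1.0.6",
    "createOptions": create_options,
}
print(json.dumps(edge_hub_settings, indent=4))
```

Parsing create_options back with json.loads is a quick way to confirm that the bind path and the storageFolder environment variable point at the same container directory.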

Deploy and make sure everything is working and that the additional storage is actually in use.

Edit /etc/iotedge/config.yaml and change RuntimeLogLevel to debug (or, if you already have it at debug, change it to info or whatever your favorite log level happens to be). Save your changes, then restart the edge runtime so the changes are applied: systemctl restart iotedge

If you follow the edgeHub and edge daemon logs, you should now see errors. journalctl -u iotedge:

May 06 14:19:51 betsyc-iotedge9 iotedged[4327]: 2019-05-06T14:19:51Z [ERR!] - Internal server error: Could not decrypt
May 06 14:19:51 betsyc-iotedge9 iotedged[4327]:         caused by: A error occurred in the key store.
May 06 14:19:51 betsyc-iotedge9 iotedged[4327]:         caused by: HSM failure
May 06 14:19:51 betsyc-iotedge9 iotedged[4327]:         caused by: HSM API failure occurred: 417
May 06 14:19:51 betsyc-iotedge9 iotedged[4327]: 2019-05-06T14:19:51Z [INFO] - [work] - - - [2019-05-06 14:19:51.665964641 UTC] "POST /modules/%24edgeHub/genid/636927468235895329/decrypt?api-version=2018-06-28 HTTP/1.1" 500 Internal Server Error 150 "-" "-" pid(5198)
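
To check a longer journal for the same symptom, the daemon output can be scanned for decrypt requests that were answered with a server error. This is only a sketch: the function name and regular expression are hypothetical, and it assumes the access-log layout matches the line shown above.

```python
import re
from typing import Iterable, List, Tuple

# Matches the quoted request and the status code in an iotedged access-log
# line, e.g.  "POST /modules/.../decrypt?api-version=... HTTP/1.1" 500
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def failed_decrypt_requests(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (path, status) for every decrypt request answered with a 5xx status."""
    failures = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not an access-log line
        path = match.group("path")
        status = int(match.group("status"))
        if "/decrypt" in path and status >= 500:
            failures.append((path, status))
    return failures


# Sample input copied from the journalctl output in this report.
sample_lines = [
    "May 06 14:19:51 betsyc-iotedge9 iotedged[4327]:         caused by: HSM API failure occurred: 417",
    'May 06 14:19:51 betsyc-iotedge9 iotedged[4327]: 2019-05-06T14:19:51Z [INFO] - [work] - - - '
    '[2019-05-06 14:19:51.665964641 UTC] "POST /modules/%24edgeHub/genid/636927468235895329/decrypt'
    '?api-version=2018-06-28 HTTP/1.1" 500 Internal Server Error 150 "-" "-" pid(5198)',
]
for path, status in failed_decrypt_requests(sample_lines):
    print(status, path)
```

In practice the lines would come from journalctl -u iotedge rather than a hard-coded list.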

docker logs edgeHub:

2019-05-06 14:19:15.488 +00:00 [INF] [EdgeHub] - Starting Edge Hub
2019-05-06 14:19:15.489 +00:00 [INF] [EdgeHub] - 
        █████╗ ███████╗██╗   ██╗██████╗ ███████╗
       ██╔══██╗╚══███╔╝██║   ██║██╔══██╗██╔════╝
       ███████║  ███╔╝ ██║   ██║██████╔╝█████╗
       ██╔══██║ ███╔╝  ██║   ██║██╔══██╗██╔══╝
       ██║  ██║███████╗╚██████╔╝██║  ██║███████╗
       ╚═╝  ╚═╝╚══════╝ ╚═════╝ ╚═╝  ╚═╝╚══════╝

 ██╗ ██████╗ ████████╗    ███████╗██████╗  ██████╗ ███████╗
 ██║██╔═══██╗╚══██╔══╝    ██╔════╝██╔══██╗██╔════╝ ██╔════╝
 ██║██║   ██║   ██║       █████╗  ██║  ██║██║  ███╗█████╗
 ██║██║   ██║   ██║       ██╔══╝  ██║  ██║██║   ██║██╔══╝
 ██║╚██████╔╝   ██║       ███████╗██████╔╝╚██████╔╝███████╗
 ╚═╝ ╚═════╝    ╚═╝       ╚══════╝╚═════╝  ╚═════╝ ╚══════╝

2019-05-06 14:19:15.489 +00:00 [INF] [EdgeHub] - Version - 1.0.6.19913336 (8288bc9bd6f6e15295fea506cd3f99d7f6347a6a)
2019-05-06 14:19:15.491 +00:00 [INF] [EdgeHub] - Loaded server certificate with expiration date of "2019-08-04T13:52:25.0000000+00:00"
2019-05-06 14:19:15.523 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Core.Storage.MessageStore] - Created new message store
2019-05-06 14:19:15.523 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Core.Storage.MessageStore] - Started task to cleanup processed and stale messages
2019-05-06 14:19:15.592 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.CloudProxy.DeviceConnectivityManager] - Created DeviceConnectivityManager with connected check frequency 00:05:00 and disconnected check frequency 00:02:00
2019-05-06 14:19:20.350 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Connecting socket /var/run/iotedge/workload.sock
2019-05-06 14:19:20.351 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Connected socket /var/run/iotedge/workload.sock
2019-05-06 14:19:20.351 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Sending request http://workload.sock/modules/%24edgeHub/genid/636927468235895329/decrypt?api-version=2018-06-28
2019-05-06 14:19:20.354 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Response received InternalServerError
2019-05-06 14:19:20.354 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadClient] - Retrying Http call to unix:///var/run/iotedge/workload.sock to Decrypt because of error Error, retry count = 3
2019-05-06 14:19:30.062 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Connecting socket /var/run/iotedge/workload.sock
2019-05-06 14:19:30.063 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Connected socket /var/run/iotedge/workload.sock
2019-05-06 14:19:30.063 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Sending request http://workload.sock/modules/%24edgeHub/genid/636927468235895329/decrypt?api-version=2018-06-28
2019-05-06 14:19:30.066 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Response received InternalServerError

Unhandled Exception: System.AggregateException: One or more errors occurred. (Error calling Decrypt: Could not decrypt
	caused by: A error occurred in the key store.
	caused by: HSM failure
	caused by: HSM API failure occurred: 417) ---> Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadCommunicationException: Error calling Decrypt: Could not decrypt
	caused by: A error occurred in the key store.
	caused by: HSM failure
	caused by: HSM API failure occurred: 417
   at Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadClient.Execute[T](Func`1 func, String operation) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/WorkloadClient.cs:line 109
   at Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadClient.DecryptAsync(String initializationVector, String encryptedText) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/WorkloadClient.cs:line 83
   at Microsoft.Azure.Devices.Edge.Storage.EncryptedStore`2.<>c__DisplayClass17_0.<<IterateBatch>b__0>d.MoveNext() in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Storage/EncryptedStore.cs:line 89
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Devices.Edge.Storage.RocksDb.ColumnFamilyDbStore.IterateBatch(Action`1 seeker, Int32 batchSize, Func`3 callback, CancellationToken cancellationToken) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Storage.RocksDb/ColumnFamilyDbStore.cs:line 162
   at Microsoft.Azure.Devices.Edge.Util.TaskEx.TimeoutAfter(Task task, TimeSpan timeout) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/TaskEx.cs:line 142
   at Microsoft.Azure.Devices.Edge.Hub.Core.DeviceScopeIdentitiesCache.ReadCacheFromStore(IKeyValueStore`2 encryptedStore) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/DeviceScopeIdentitiesCache.cs:line 135
   at Microsoft.Azure.Devices.Edge.Hub.Core.DeviceScopeIdentitiesCache.Create(IServiceProxy serviceProxy, IKeyValueStore`2 encryptedStorage, TimeSpan refreshRate) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/DeviceScopeIdentitiesCache.cs:line 55
   at Microsoft.Azure.Devices.Edge.Hub.Service.Modules.CommonModule.<Load>b__17_9(IComponentContext c) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/modules/CommonModule.cs:line 211
   at Microsoft.Azure.Devices.Edge.Hub.Service.Modules.RoutingModule.<Load>b__20_10(IComponentContext c) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/modules/RoutingModule.cs:line 193
   at Microsoft.Azure.Devices.Edge.Hub.Service.Modules.RoutingModule.<Load>b__20_12(IComponentContext c) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/modules/RoutingModule.cs:line 225
   at Microsoft.Azure.Devices.Edge.Hub.Service.Modules.RoutingModule.<Load>b__20_25(IComponentContext c) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/modules/RoutingModule.cs:line 392
   at Microsoft.Azure.Devices.Edge.Hub.Service.Modules.RoutingModule.<Load>b__20_28(IComponentContext c) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/modules/RoutingModule.cs:line 450
   at Microsoft.Azure.Devices.Edge.Hub.Service.Program.MainAsync(IConfigurationRoot configuration) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/Program.cs:line 62
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at Microsoft.Azure.Devices.Edge.Hub.Service.Program.Main() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/Program.cs:line 30

Context (Environment)

Device (Host) Operating System: Ubuntu 18.04 LTS

Container Operating System: Linux Containers

Runtime Versions

iotedged: iotedge 1.0.6.1 (3fa6cbef8b7fc3c55a49a622735eb1021b8a5963)
Edge Agent: 1.0.6
Edge Hub: 1.0.6
Docker: 3.0.5

Logs

edgedlogs.log

Additional Information

When the edgeHub is not configured for additional storage, I don’t see this issue: I can change the RuntimeLogLevel repeatedly and restart the runtime without problems. After getting into this bad state, I can return to a working state by stopping the runtime, clearing out the storage directory, and restarting the runtime:
systemctl stop iotedge
sudo rm -rf /etc/iotedge/storage/*
systemctl restart iotedge
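
The workaround can also be scripted. The following is a rough sketch under stated assumptions, not an official tool: the function names are hypothetical, it must run as root, and clearing the directory presumably discards whatever edgeHub had stored there. Unlike rm -rf with a trailing /*, it also removes hidden entries.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, Sequence


def clear_directory(directory: Path) -> None:
    """Delete everything inside `directory` but keep the directory itself."""
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def run_checked(command: Sequence[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(list(command), check=True)


def reset_edgehub_storage(
    storage_dir: Path,
    run: Callable[[Sequence[str]], object] = run_checked,
) -> None:
    """Stop the runtime, wipe the storage directory, then start the runtime again."""
    run(["systemctl", "stop", "iotedge"])
    clear_directory(storage_dir)
    run(["systemctl", "restart", "iotedge"])


# Example call, using the path from this report (requires root):
# reset_edgehub_storage(Path("/etc/iotedge/storage"))
```

The run parameter exists only so the command sequence can be exercised without touching systemd, for example by passing a function that records the commands.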

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 3
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

myagley commented, May 6, 2019 (1 reaction)

Thank you for the bug report. You are running into this bug: https://github.com/Azure/iotedge/pull/1082

The fix is in the 1.0.7 release, which will be out today (in a couple of hours).

chriswue commented, Oct 1, 2019 (0 reactions)

Well, it looks like the actual issue is different; only the symptom is the same.

I forgot that we bind-mount a named volume for the edgeHub storage instead of /etc/iotedge/storage. Stopping iotedge, deleting the named volume, and restarting iotedge caused the deployment to be re-applied and the named volume to be re-created, at which point edgeHub started up successfully. I’ll create a new issue for this.
