ChiaDog is not recovering from a remote harvester being down
Hi, I have ChiaDog running on a CentOS box. I mapped my harvesters' log folders to local folders on that box. Works great.
However, when a harvester box is restarted, ChiaDog gets stuck: it no longer sees that log file until I restart the ChiaDog instance for that harvester. Maybe when ChiaDog detects that a harvester is down (no access to the file), it should keep checking whether file access has been restored?
- Setup
  - One box for the harvester, one for ChiaDog
  - Map the harvester log folder to a local folder on the ChiaDog box
  - Run ChiaDog
- Unplug the network cable from the ChiaDog box
  - ChiaDog starts sending “Harvester Down” notifications
- Reconnect the network on the ChiaDog box
  - ChiaDog keeps sending “Harvester Down” notifications
Environment:
- OS: CentOS (for ChiaDog box)
- Python version: 3.9.6
- ChiaDog version: hmm, latest? Maybe the ChiaDog version should be included in those notifications, or in the first log line when it starts?
- Harvester: remote; however, its log folder is mapped to a local folder, so ChiaDog sees it as local (maybe this is why ChiaDog does not check whether file access was restored, assuming that this is a catastrophic failure and a reboot is due?)
Here is the exception generated when the harvester went down:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/chia_logs/chiadog/ox/venv/lib/python3.9/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/mnt/chia_logs/chiadog/ox/venv/lib/python3.9/site-packages/retry/api.py", line 73, in retry_decorator
    return __retry_internal(partial(f, *args, **kwargs), exceptions, tries, delay, max_delay, backoff, jitter,
  File "/mnt/chia_logs/chiadog/ox/venv/lib/python3.9/site-packages/retry/api.py", line 33, in __retry_internal
    return f()
  File "/mnt/chia_logs/chiadog/ox/src/chia_log/log_consumer.py", line 75, in _consume_loop
    for log_line in Pygtail(self._expanded_log_path, read_from_end=True, offset_file=self._offset_path):
  File "/mnt/chia_logs/chiadog/ox/venv/lib/python3.9/site-packages/pygtail/core.py", line 89, in __init__
    if self._offset_file_inode != stat(self.filename).st_ino or \
OSError: [Errno 112] Host is down: '/mnt/chia_logs/ox/debug.log'
Exception ignored in: <function Pygtail.__del__ at 0x7f8f87633c10>
Traceback (most recent call last):
  File "/mnt/chia_logs/chiadog/ox/venv/lib/python3.9/site-packages/pygtail/core.py", line 97, in __del__
    if self._filehandle():
  File "/mnt/chia_logs/chiadog/ox/venv/lib/python3.9/site-packages/pygtail/core.py", line 179, in _filehandle
    self._fh = open(filename, "r", 1)
OSError: [Errno 112] Host is down: '/mnt/chia_logs/ox/debug.log'
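
To illustrate the behaviour being asked for, here is a minimal sketch of a consume loop that survives the `OSError` shown in the traceback by polling until the log file is reachable again. It is not ChiaDog's actual code: `wait_until_accessible`, `consume_with_recovery`, `read_new_lines`, `handle_line`, and the 5-second poll interval are all hypothetical names and values chosen for the example. `read_new_lines` stands in for whatever reads new lines from the log (the `Pygtail(...)` loop in `log_consumer.py`, per the traceback).

```python
import os
import time
from typing import Callable, Iterable, Optional


def wait_until_accessible(path: str, poll_seconds: float = 5.0,
                          max_checks: Optional[int] = None,
                          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll os.stat(path) until it succeeds.

    Returns True once the file is reachable, or False if max_checks
    attempts all failed. max_checks=None means keep trying forever.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        try:
            os.stat(path)
            return True
        except OSError:
            # Covers "Host is down", a missing file, a stale mount, etc.
            checks += 1
            sleep(poll_seconds)
    return False


def consume_with_recovery(path: str,
                          read_new_lines: Callable[[str], Iterable[str]],
                          handle_line: Callable[[str], None],
                          poll_seconds: float = 5.0,
                          max_cycles: Optional[int] = None,
                          sleep: Callable[[float], None] = time.sleep) -> None:
    """Read new log lines in a loop and keep going after access errors.

    max_cycles exists only so the loop can be bounded in tests;
    a long-running watcher would leave it as None.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        cycles += 1
        try:
            for line in read_new_lines(path):
                handle_line(line)
        except OSError:
            # The mapped folder went away: wait for it to come back
            # instead of letting the consumer thread die.
            wait_until_accessible(path, poll_seconds, sleep=sleep)
            continue
        sleep(poll_seconds)
```

With something along these lines, the thread would resume reading once the mapped folder is back, without a manual restart of ChiaDog.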

I would also suggest sending just one notification per harvester-down event. I guess we all know what to do when we get notified, so the extra notifications are both redundant and (to me only?) annoying.
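
A rough sketch of what "one notification per event" could look like: remember whether the harvester is currently considered down, and only send a message when that state changes. The class name `OutageNotifier`, the `send` callback, and the "Harvester Recovered" message are assumptions made for illustration, not part of ChiaDog.

```python
from typing import Callable


class OutageNotifier:
    """Send one message when the harvester goes down and one when it returns."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send   # hypothetical callback that delivers a notification
        self._down = False  # last known state

    def update(self, harvester_ok: bool) -> None:
        """Call on every health check; notifies only on a state change."""
        if not harvester_ok and not self._down:
            self._send("Harvester Down")
            self._down = True
        elif harvester_ok and self._down:
            self._send("Harvester Recovered")  # assumed message text
            self._down = False
```

Repeated failed checks during the same outage then produce no further messages.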
That said, I would also like to see a notification when a bunch of plots is added (which would indicate that a new drive with plots was connected, e.g. when moving HDs around). That notification would most often complement the one sent when plots disappear from the harvester (HD unplugged from the plotter). This way, it would confirm that the added drive was recognized by the harvester, so we would not need to rely on the rather hopeless full node UI.
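
As a sketch of the plots-added idea, assuming the plot count from two consecutive checks is already available as integers: compare them and report a change in either direction once it exceeds a threshold. The function name `describe_plot_change` and the default threshold of 10 plots are hypothetical.

```python
from typing import Optional


def describe_plot_change(previous: int, current: int,
                         min_delta: int = 10) -> Optional[str]:
    """Return a notification text if the plot count moved by at least min_delta."""
    delta = current - previous
    if delta >= min_delta:
        return f"{delta} plots added (now {current})"
    if delta <= -min_delta:
        return f"{-delta} plots removed (now {current})"
    return None  # small fluctuation, nothing to report
```

The threshold keeps a single new plot from triggering a message, while a whole drive appearing or disappearing would.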
@Jacek-ghub I am only letting chiadog in over SSH with a dedicated user who only has read access to the log file