Failed plots from Chia GUI crash plotman interactive


Describe the bug

I fired up interactive mode in a terminal last night and everything was looking good. I only have NVMe / SSD as temp and internal HDD as dest. Separately, in the Chia GUI, I decided to experiment with plotting to an external HDD, giving it just 1 thread. I was mainly curious how bin count affects throughput on a platter drive. I went to bed, and two hours later the external HDD unmounted (I think it overheated). That crashed Plotman interactive, which I had left up in the terminal, and no new jobs were scheduled all night long.

Why did a chia plot kicked off outside Plotman, on a drive that doesn’t appear in the yaml, crash it? Is that intended?

To Reproduce

Steps to reproduce the behavior, e.g.:

  1. Set up config with normal plotting parameters
  2. Start a plot from the Chia GUI on a drive not in the plotman.yml
  3. Unmount the drive that the additional plot was running on
  4. See error:
Traceback (most recent call last):
  File "/home/username/chia-blockchain/venv/bin/plotman", line 8, in <module>
    sys.exit(main())
  File "/home/username/chia-blockchain/venv/lib/python3.8/site-packages/plotman/plotman.py", line 173, in main
    interactive.run_interactive()
  File "/home/username/chia-blockchain/venv/lib/python3.8/site-packages/plotman/interactive.py", line 334, in run_interactive
    curses.wrapper(curses_main)
  File "/usr/lib/python3.8/curses/__init__.py", line 105, in wrapper
    return func(stdscr, *args, **kwds)
  File "/home/username/chia-blockchain/venv/lib/python3.8/site-packages/plotman/interactive.py", line 261, in curses_main
    jobs_win.addstr(0, 0, reporting.status_report(jobs, n_cols, jobs_h, 
  File "/home/username/chia-blockchain/venv/lib/python3.8/site-packages/plotman/reporting.py", line 106, in status_report
    plot_util.human_format(j.get_tmp_usage(), 0),
  File "/home/username/chia-blockchain/venv/lib/python3.8/site-packages/plotman/job.py", line 351, in get_tmp_usage
    with os.scandir(self.tmpdir) as it:
FileNotFoundError: [Errno 2] No such file or directory: '/media/username/easystore/chia-plots'

Expected behavior

Errors with plots scheduled on drives unknown to plotman shouldn’t halt scheduling.

One drive disconnecting shouldn’t halt scheduling for plotman. (If there are destination drives remaining.)
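As a rough illustration of the expected behavior, here is a minimal sketch of a temp-usage helper that treats a vanished directory as zero usage instead of raising. The function name `tmp_usage_or_zero` is hypothetical and this is not plotman's actual code; only the `os.scandir` call and the `FileNotFoundError` are taken from the traceback above.

```python
import os


def tmp_usage_or_zero(tmpdir: str) -> int:
    """Sum the sizes of regular files in tmpdir; return 0 if it is unavailable.

    Hypothetical helper for illustration, not plotman's implementation.
    """
    total = 0
    try:
        with os.scandir(tmpdir) as it:
            for entry in it:
                try:
                    if entry.is_file(follow_symlinks=False):
                        total += entry.stat(follow_symlinks=False).st_size
                except OSError:
                    # A temp file can disappear between listing and stat.
                    continue
    except OSError:
        # Covers FileNotFoundError (directory gone, e.g. drive unmounted)
        # and I/O errors from a failing mount. Any partial sum is discarded.
        return 0
    return total
```

With a helper like this, the status report could show 0 for a job whose tmp dir has gone away and keep rendering, rather than taking down the whole interactive session. Whether reporting 0 or flagging the job as broken is the better choice is a design decision for the maintainers.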

System setup:

  • OS: Linux Mint
  • Method of archiving: none

Config

full configuration
# Default/example plotman.yaml configuration file

# Options for display and rendering
user_interface:
        # Call out to the `stty` program to determine terminal size, instead of
        # relying on what is reported by the curses library.   In some cases,
        # the curses library fails to update on SIGWINCH signals.  If the
        # `plotman interactive` curses interface does not properly adjust when
        # you resize the terminal window, you can try setting this to True. 
        use_stty_size: True

# Where to plot and log.
directories:
        # One directory in which to store all plot job logs (the STDOUT/
        # STDERR of all plot jobs).  In order to monitor progress, plotman
        # reads these logs on a regular basis, so using a fast drive is
        # recommended.
#        log: /home/username/chia/logs
        log: /home/username/.chia/mainnet/plotter

        # One or more directories to use as tmp dirs for plotting.  The
        # scheduler will use all of them and distribute jobs among them.
        # It assumes that IO is independent for each one (i.e., that each
        # one is on a different physical device).
        #
        # If multiple directories share a common prefix, reports will
        # abbreviate and show just the uniquely identifying suffix.
        tmp:
                - /home/username/plotter-1/chia-plot-temp
                - /media/username/plotter-2/chia-plot-temp
                - /media/username/plotter-3/chia-plot-temp
                - /media/username/plotter-4/chia-plot-temp
                - /media/username/ssd-os/home/username/ssd-chia-plot-temp

        # Optional: Allows overriding some characteristics of certain tmp
        # directories. This contains a map of tmp directory names to
        # attributes. If a tmp directory and attribute is not listed here,
        # it uses the default attribute setting from the main configuration.
        #
        # Currently support override parameters:
        #     - tmpdir_max_jobs
#        tmp_overrides:
                # In this example, /mnt/tmp/00 is larger than the other tmp
                # dirs and it can hold more plots than the default.
        #        "/mnt/tmp/00":
        #                tmpdir_max_jobs: 5

        # Optional: tmp2 directory.  If specified, will be passed to
        # chia plots create as -2.  Only one tmp2 directory is supported.
        # tmp2: /mnt/tmp/a

        # One or more directories; the scheduler will use all of them.
        # These again are presumed to be on independent physical devices,
        # so writes (plot jobs) and reads (archivals) can be scheduled
        # to minimize IO contention.
        dst:
                - /media/username/farmer-01/chia-plots
                - /media/username/farmer-02/chia-plots
                - /media/username/farmer-03/chia-plots
                - /media/username/farmer-04/chia-plots
                - /media/username/farmer-05/chia-plots
                - /media/username/farmer-06/chia-plots
                - /media/username/farmer-07/chia-plots
                - /media/username/farmer-08/chia-plots

        # Archival configuration.  Optional; if you do not wish to run the
        # archiving operation, comment this section out.
        #
        # Currently archival depends on an rsync daemon running on the remote
        # host.
        # The archival also uses ssh to connect to the remote host and check
        # for available directories. Set up ssh keys on the remote host to
        # allow public key login from rsyncd_user.
        # Complete example: https://github.com/ericaltendorf/plotman/wiki/Archiving
#        archive:
#                rsyncd_module: plots # Define this in remote rsyncd.conf.
#                rsyncd_path: /plots # This is used via ssh. Should match path
#                                    # defined in the module referenced above.
#                rsyncd_bwlimit: 80000  # Bandwidth limit in KB/s
#                rsyncd_host: myfarmer
#                rsyncd_user: chia
#                # Optional index.  If omitted or set to 0, plotman will archive
                # to the first archive dir with free space.  If specified,
                # plotman will skip forward up to 'index' drives (if they exist).
                # This can be useful to reduce io contention on a drive on the
                # archive host if you have multiple plotters (simultaneous io
                # can still happen at the time a drive fills up.)  E.g., if you
                # have four plotters, you could set this to 0, 1, 2, and 3, on
                # the 4 machines, or 0, 1, 0, 1.
                #   index: 0


# Plotting scheduling parameters
scheduling:
        # Run a job on a particular temp dir only if the number of existing jobs
        # before [tmpdir_stagger_phase_major : tmpdir_stagger_phase_minor]
        # is less than tmpdir_stagger_phase_limit.
        # Phase major corresponds to the plot phase, phase minor corresponds to
        # the table or table pair in sequence, phase limit corresponds to
        # the number of plots allowed before [phase major : phase minor].
        # e.g, with default settings, a new plot will start only when your plot
        # reaches phase [2 : 1] on your temp drive. This setting takes precedence
        # over global_stagger_m
        tmpdir_stagger_phase_major: 2
        tmpdir_stagger_phase_minor: 1
        # Optional: default is 1
        tmpdir_stagger_phase_limit: 1

        # Don't run more than this many jobs at a time on a single temp dir.
        tmpdir_max_jobs: 3

        # Don't run more than this many jobs at a time in total.
        # Setting 6 because each plotting drive (2 currently) has room for 3, maybe 4 if optimized
#        global_max_jobs: 0
        global_max_jobs: 15

        # Don't run any jobs (across all temp dirs) more often than this, in minutes. 
        # (default was 30)
        global_stagger_m: 10

        # How often the daemon wakes to consider starting a new plot job, in seconds.
        polling_time_s: 60


# Plotting parameters.  These are pass-through parameters to chia plots create.
# See documentation at
# https://github.com/Chia-Network/chia-blockchain/wiki/CLI-Commands-Reference#create
plotting:
        k: 32
        e: False             # Use -e plotting option
        n_threads: 2         # Threads per job
        n_buckets: 128       # Number of buckets to split data into
        job_buffer: 3389     # Per job memory (default: 3389)
        # If specified, pass through to the -f and -p options.  See CLI reference.
        #   farmer_pk: ...
        #   pool_pk: ...

Additional context & screenshots

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

ericaltendorf commented, May 23, 2021 (1 reaction)

Thanks for the stacktrace and filing an issue. Sounds like the basic problem is that if one of the dirs plotman depends on disappears (i.e., the drive gets unmounted), we die instead of cleanly recovering?

CrossBread commented, May 29, 2021 (0 reactions)

Hi sorry, I thought I subscribed to this issue, but I’ve somehow subscribed to all activity in the repo and lost your reply in the noise.

It’s possible that is the case, but I think what I’m observing was slightly different.

I was using the Chia GUI to experiment with plotting to an external hard drive. That was logging to a default location in .chia/mainnet/plotter.

Separately, I had plotman configured to log into that same directory, because I noticed it would scan the logs of plots from other sources. That way I could keep an eye on the experimental plots, and let plotman take them into account when scheduling.

Some kind of error happened with the external hard drive and the mount got really messed up. I eventually had to force unmount it. Any calls to stat that drive were locking up processes.

So it’s possible that was related. Even just trying to list the root directory of the drive with ls /dev/sdm would just hang forever. Trying to check the smart data and grab temperature for instance would hang forever.

So if plotman is doing any of that under the hood, maybe it was stuck waiting on a hung process.
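For the hung-mount scenario described above, one general pattern is to probe a path from a daemon thread and give up after a timeout, so the caller is never blocked by a stuck `stat`. This is only a sketch: the name `path_responds` and the 5-second default are assumptions for illustration, and the thread does not establish whether plotman actually stats the drive.

```python
import os
import threading


def path_responds(path: str, timeout_s: float = 5.0) -> bool:
    """Return True if os.stat(path) succeeds within timeout_s seconds.

    Hypothetical helper for illustration, not part of plotman.
    """
    result = {"ok": False}

    def probe() -> None:
        try:
            os.stat(path)
            result["ok"] = True
        except OSError:
            # A missing or unreadable path counts as not responding.
            pass

    # daemon=True so a thread stuck on a hung mount cannot block
    # interpreter exit; the stuck thread itself is simply abandoned.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_s)
    return result["ok"]
```

A scheduler could call something like `path_responds(tmpdir)` before touching a directory and skip that directory for the current polling cycle when it returns False. The trade-off is that each hung probe leaves one blocked thread behind until the mount recovers or the process exits.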
