Incorrect calculation of disk usage and availability during import

Observed behavior

This issue reports two possibly related bugs:

1/ Incorrect calculation of available disk space:

Actual: 0 Expected: 33G

2/ Misleading calculation of the file size already on disk:

Context: the import of a large channel, KA (ru), failed about 70% of the way through (due to a network error, see below). As a result, 70% of the files are already downloaded to /storage but haven’t been marked as “available” in the Kolibri DB, so they are not included in the calculation.

Expected behavior

  • 1/ Interfacing with the OS to check the available disk space should give an accurate result.
  • 2/ The channel import/update user interface should show the content nodes available and the size of the files in /storage already downloaded (since this is what users need to know to judge the size of the download).
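For reference on point 1/, here is a minimal sketch of asking the OS for free space at the moment it is needed. This is not Kolibri's actual code: free_space_bytes and format_gb are hypothetical names, and shutil.disk_usage needs Python 3.3+ (on Python 2 under Linux, os.statvfs gives the same information). That a restart corrected the value might point to a stale figure being reused, but the issue does not confirm the cause.

```python
import shutil


def free_space_bytes(path="."):
    """Return the bytes currently free on the filesystem that holds `path`.

    Hypothetical helper: it queries the OS on every call instead of
    reusing a value computed earlier (for example at startup).
    """
    return shutil.disk_usage(path).free


def format_gb(num_bytes):
    """Format a byte count as whole gibibytes, e.g. 33 * 1024**3 -> '33G'."""
    return "{:.0f}G".format(num_bytes / 1024 ** 3)


if __name__ == "__main__":
    # Report free space for the current directory's filesystem.
    print(format_gb(free_space_bytes(".")))
```

In a real deployment the path passed in would be the content storage directory, so that the figure reflects the disk the import actually writes to.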

User-facing consequences

Users see a wrong figure for available disk space (until they restart Kolibri).

Users see misleading “already downloaded” information that doesn’t account for files that were downloaded but not marked available.
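To illustrate the second consequence, here is a rough sketch of estimating what still has to be downloaded by looking at the files on disk rather than at the “available” flag in the DB. The function name, the `required` mapping, and the demo directory layout are assumptions made for illustration; this is not how Kolibri computes the figure.

```python
import os
import tempfile


def remaining_download_bytes(required, storage_dir):
    """Sum the sizes of required files that are not fully present on disk.

    `required` is a hypothetical mapping of file name -> expected size in
    bytes. A file counts as present only if its size on disk matches, so a
    partially downloaded file is still counted as remaining.
    """
    on_disk = {}
    for root, _dirs, files in os.walk(storage_dir):
        for name in files:
            try:
                on_disk[name] = os.path.getsize(os.path.join(root, name))
            except OSError:
                continue  # file vanished or is unreadable; treat as missing
    return sum(size for name, size in required.items()
               if on_disk.get(name) != size)


# Demo with a throwaway directory: one of two required files is on disk.
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "7", "4"))
with open(os.path.join(demo_dir, "7", "4", "a.jpg"), "wb") as f:
    f.write(b"x" * 10)

required = {"a.jpg": 10, "b.jpg": 25}
remaining = remaining_download_bytes(required, demo_dir)
```

Comparing sizes is only a cheap approximation; verifying checksums would be more reliable but slower on a 70G channel.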

Errors and logs

@laurenlichtman was importing through the web interface around 14:31:29 UTC; the first import then errored, possibly due to a network timeout:

INFO 2018-06-22 14:31:29,076 importchannel Downloading data for channel id 303df4e42aac519796a3f49bed613cb4
INFO 2018-06-22 14:31:30,282 channel_import Importing ContentTag data
INFO 2018-06-22 14:31:30,285 channel_import Importing ContentNode_has_prerequisite data
INFO 2018-06-22 14:31:30,286 channel_import Importing ContentNode_related data
INFO 2018-06-22 14:31:30,287 channel_import Importing ContentNode_tags data
INFO 2018-06-22 14:31:30,288 channel_import Importing ContentNode data
INFO 2018-06-22 14:31:31,348 channel_import Importing Language data
INFO 2018-06-22 14:31:31,353 channel_import Importing File data
INFO 2018-06-22 14:31:32,500 channel_import Importing LocalFile data
INFO 2018-06-22 14:31:42,336 channel_import Importing AssessmentMetaData data
INFO 2018-06-22 14:31:42,339 channel_import Importing ChannelMetadata data
INFO 2018-06-22 14:31:43,186 annotation Setting availability of File objects based on LocalFile availability
INFO 2018-06-22 14:31:43,296 annotation Setting availability of non-topic ContentNode objects based on File availability
INFO 2018-06-22 14:31:43,582 annotation Setting availability of ContentNode objects with children for 2 levels
INFO 2018-06-22 14:31:43,583 annotation Setting availability of ContentNode objects with children for level 2
INFO 2018-06-22 14:31:43,587 annotation Setting availability of ContentNode objects with children for level 1

ERROR 2018-06-22 15:47:55,066 importcontent An error occured during content import: 504 Server Error: Gateway Time-out for url: https://studio.learningequality.org/content/storage/7/4/7489c3176ca502d0c91f7d9a67829d5e.jpg

ERROR 2018-06-22 16:28:13,775 importcontent An error occured during content import: HTTPSConnectionPool(host='studio.learningequality.org', port=443): Max retries exceeded with url: /content/storage/b/3/b30e4ce8c480f46e27b0eabddc06ac95.jpg (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')],)",),))

ERROR 2018-06-22 16:35:12,585 importcontent An error occured during content import: HTTPSConnectionPool(host='studio.learningequality.org', port=443): Read timed out. (read timeout=20)
WARNING 2018-06-22 16:35:13,426 base Job 331d2d0e55aa4dffa59c47cb26b602ab raised an exception: Traceback (most recent call last):
  File "/home/kolibri/.pex/install/kolibri-0.10.0b5-py2.py3-none-any.whl.231bf69099f195f7a4092238814111ee2fcdd688/kolibri-0.10.0b5-py2.py3-none-any.whl/kolibri/dist/iceqube/worker/backends/inmem.py", line 75, in handle_finished_future
    result = future.result()
  File "/home/kolibri/.pex/install/kolibri-0.10.0b5-py2.py3-none-any.whl.231bf69099f195f7a4092238814111ee2fcdd688/kolibri-0.10.0b5-py2.py3-none-any.whl/kolibri/dist/py2only/concurrent/futures/_base.py", line 422, in result
    return self.__get_result()
  File "/home/kolibri/.pex/install/kolibri-0.10.0b5-py2.py3-none-any.whl.231bf69099f195f7a4092238814111ee2fcdd688/kolibri-0.10.0b5-py2.py3-none-any.whl/kolibri/dist/py2only/concurrent/futures/thread.py", line 62, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/kolibri/.pex/install/kolibri-0.10.0b5-py2.py3-none-any.whl.231bf69099f195f7a4092238814111ee2fcdd688/kolibri-0.10.0b5-py2.py3-none-any.whl/kolibri/dist/iceqube/worker/backends/inmem.py", line 149, in wrap
    raise e
ReadTimeout: HTTPSConnectionPool(host='studio.learningequality.org', port=443): Read timed out. (read timeout=20)

The second import attempt failed for an unknown reason:

INFO 2018-06-23 06:42:01,437 importchannel Downloading data for channel id 303df4e42aac519796a3f49bed613cb4
WARNING 2018-06-23 06:42:03,542 channel_import Version 1 of channel 303df4e42aac519796a3f49bed613cb4 already exists in database; cancelling import of version 1
INFO 2018-06-23 06:42:03,547 annotation Setting availability of File objects based on LocalFile availability
INFO 2018-06-23 06:42:03,805 annotation Setting availability of non-topic ContentNode objects based on File availability
INFO 2018-06-23 06:42:04,360 annotation Setting availability of ContentNode objects with children for 2 levels
INFO 2018-06-23 06:42:04,361 annotation Setting availability of ContentNode objects with children for level 2
INFO 2018-06-23 06:42:04,365 annotation Setting availability of ContentNode objects with children for level 1
INFO 2018-06-23 06:42:05,911 apps Running Kolibri with the following settings: kolibri.deployment.default.settings.base
ERROR 2018-06-23 07:09:38,228 importcontent An error occured during content import: [Errno 5] Input/output error

After restarting Kolibri, the available disk space was reported correctly (33G). Still, the import UI did not let me tick the “select all” checkbox and then import (because it thinks there is not enough disk space).

The third attempt, via the command line, stalled at first, but after restarting Kolibri 4 minutes later it seems to have finished the task OK:

INFO 2018-06-23 12:17:04,666 importchannel Downloading data for channel id 303df4e42aac519796a3f49bed613cb4
WARNING 2018-06-23 12:17:06,791 channel_import Version 1 of channel 303df4e42aac519796a3f49bed613cb4 already exists in database; cancelling import of version 1
INFO 2018-06-23 12:17:06,796 annotation Setting availability of File objects based on LocalFile availability
INFO 2018-06-23 12:17:07,267 annotation Setting availability of non-topic ContentNode objects based on File availability
INFO 2018-06-23 12:17:08,041 annotation Setting availability of ContentNode objects with children for 2 levels
INFO 2018-06-23 12:17:08,043 annotation Setting availability of ContentNode objects with children for level 2
INFO 2018-06-23 12:17:08,046 annotation Setting availability of ContentNode objects with children for level 1

INFO 2018-06-23 12:17:10,825 apps Running Kolibri with the following settings: kolibri.deployment.default.settings.base
INFO 2018-06-23 12:22:05,964 annotation Setting availability of 7464 LocalFile objects based on passed in checksums
INFO 2018-06-23 12:22:06,336 annotation Setting availability of File objects based on LocalFile availability
INFO 2018-06-23 12:22:06,764 annotation Setting availability of non-topic ContentNode objects based on File availability
INFO 2018-06-23 12:22:07,770 annotation Setting availability of ContentNode objects with children for 2 levels
INFO 2018-06-23 12:22:07,772 annotation Setting availability of ContentNode objects with children for level 2
INFO 2018-06-23 12:22:08,229 annotation Setting availability of ContentNode objects with children for level 1

So the demo server is in a working state now: http://ka-ru-demo.learningequality.org/learn/

Steps to reproduce

I'm not sure what caused the network error, so this is difficult to reproduce. Presumably, if the network import task had finished correctly, the files would have been marked available and bug 2/ would not be visible.

If it helps in chasing bug 1/, I can create a demo server identical to the one above (100G disk, trying to import a 70G channel). The Kolibri UI should report 30G available after the import finishes.

Context

  • Kolibri version: Kolibri 0.10.0b5
  • Operating system: Linux
  • Browser: Chrome

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

indirectlylit commented, Aug 16, 2018 (1 reaction)

For the first problem, I wonder if it would be less confusing to say:

Your available space: 33GB
Resources selected: 5,168 (78GB)

rather than

Your remaining space: 0GB
Resources selected: 5,168 (78GB)

because ‘Your remaining space’ is ambiguous about whether it refers to the space before or after the import.

benjaoming commented, Aug 24, 2018 (0 reactions)

I agree with @indirectlylit - this text change is urgently needed. I know that we have a string freeze in 0.10.x, but we could fix this by only adding text and changing the order?

Resources selected: 5,168 (78GB)
Your remaining space: 0GB (after import)
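The difference between the two labels can be sketched as follows. The helper name import_summary is hypothetical, and clamping at zero is an assumption about how a post-import “remaining” figure could end up as 0GB; the issue does not confirm that Kolibri computes it this way.

```python
def import_summary(available_gb, selected_count, selected_gb):
    """Build unambiguous summary lines for an import selection.

    "Available" is the free space before the import. "Remaining" is what
    would be left afterwards, clamped at zero when the selection does not
    fit (an assumption made for this sketch).
    """
    remaining_gb = max(available_gb - selected_gb, 0)
    return (
        "Resources selected: {:,} ({}GB)".format(selected_count, selected_gb),
        "Your available space: {}GB".format(available_gb),
        "Your remaining space: {}GB (after import)".format(remaining_gb),
    )


if __name__ == "__main__":
    # Numbers taken from the comments above: 33GB free, 5,168 resources, 78GB.
    for line in import_summary(33, 5168, 78):
        print(line)
```

With the numbers from this thread, the pre-import figure is 33GB and the post-import figure is 0GB, which is why a label that does not say which one it means reads like a bug.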
