Segfault of some algorithms on cluster


Hi,

I am trying to run all the algorithms on the TwoArmTransport environment, and I ran into segmentation faults when running td3_bc, bcq, and cql on our school’s cluster (with a GeForce GTX 1080 with 8120 MB of memory). Below is an example of the segmentation fault when running the td3_bc algorithm on the low_dim dataset. I tried to investigate, but it is not clear to me what is causing the segfault (the terminal output is attached below). The issue does not occur when I run these algorithms on my own laptop. It would be great if there were a solution to the segfault so that I can run my experiments on the cluster. Thanks a lot in advance.

SequenceDataset (
	path=robomimic_data/low_dim.hdf5
	obs_keys=('object', 'robot0_eef_pos', 'robot0_eef_quat', 'robot0_gripper_qpos')
	seq_length=1
	filter_key=none
	frame_stack=1
	pad_seq_length=True
	pad_frame_stack=True
	goal_mode=none
	cache_mode=all
	num_demos=200
	num_sequences=93752
)

 10%|#         | 519/5000 [00:28<04:03, 18.43it/s]Segmentation fault (core dumped)
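
The terminal output above stops at "Segmentation fault (core dumped)" without showing which Python frame was active at the time. One general way to get that information is Python's standard `faulthandler` module, which dumps the Python tracebacks of all threads when the process receives a fatal signal such as SIGSEGV. The sketch below is not taken from the issue or from robomimic: the helper name `enable_crash_tracebacks` and the log path are hypothetical, and it assumes you can add a few lines to the top of the training script.

```python
import faulthandler
import sys


def enable_crash_tracebacks(log_path=None):
    """Turn on faulthandler so a segfault prints Python tracebacks.

    If log_path is None, tracebacks go to stderr. Otherwise they are
    written to log_path, and the open file object is returned so the
    caller can keep it alive for the lifetime of the process.
    """
    if log_path is None:
        faulthandler.enable(file=sys.stderr, all_threads=True)
        return None
    # faulthandler writes to the file descriptor directly, so the file
    # must stay open until the process exits.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    # Hypothetical usage: call this once before training starts.
    enable_crash_tracebacks()
    print("faulthandler enabled:", faulthandler.is_enabled())
```

The same effect is available without editing any code by launching the interpreter with `python -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable. The traceback only shows where the Python side was when the crash happened; the fault itself may originate in a native extension.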

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
amandlek commented, Nov 4, 2021

You can go ahead and monitor it in this function here.
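
The comment as captured here does not say what to monitor or which function is meant, since the link was lost. If the quantity of interest is host memory (an assumption, prompted only by `cache_mode=all` in the dataset summary), a minimal sketch using the standard `resource` module could look like the following. The helper name `peak_rss_mb` is hypothetical and is not part of robomimic. The `resource` module is available on Unix-like systems only.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident set size of this process in megabytes.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS,
    so the value is normalized here.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


if __name__ == "__main__":
    # Hypothetical usage: log this periodically, e.g. once per epoch.
    print(f"peak RSS: {peak_rss_mb():.1f} MB")
```

Logging this value at regular intervals during training would show whether memory use keeps growing up to the point of the crash, which is one thing that can differ between a laptop and a cluster node with a job memory limit.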

0 reactions
amandlek commented, Dec 20, 2021

Closing this issue for now. Please re-open it if the problem persists.
