
ValueError: invalid literal for int() with base 10: '3.6.8' OR ValueError: not enough values to unpack (expected 2, got 1)


First of all, the framework has already been very useful!

I am running into two kinds of errors and don't know why. I'm using the GPT-2 architecture (from the repository's example notebook), successfully trained, with MidiTok 1.1.9.

Code structure

Encoding:

pitch_range = range(21, 109)
beat_res = {(0, 4): 8}
nb_velocities = 32
additional_tokens = {'Chord': False, 'Rest': False, 'Tempo': True, 'Program': True, 'TimeSignature': True,
                     'nb_tempos': 32,
                     'tempo_range': (40, 250),
                     'time_signature_range': (8, 2)}
tokenizer = Octuple(pitch_range, beat_res, nb_velocities, additional_tokens)
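
For context on the shapes involved later: with these settings, Octuple represents each time step as a tuple of sub-tokens (pitch, velocity, duration, position, bar, plus program / tempo / time signature here), so one track tokenizes to a list of 8-element lists rather than a flat list of ints. A quick, hypothetical way to see this ('example.mid' is a placeholder, and midi_to_tokens is the MidiTok 1.x method as I understand it):

from miditoolkit import MidiFile

midi = MidiFile('example.mid')           # placeholder file name
tokens = tokenizer.midi_to_tokens(midi)  # tokenizer defined above

print(tokens[0])    # one time step: a list of 8 sub-token ids
print(len(tokens))  # number of time steps, not number of ints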

Preprocessing:

import json
import pathlib

# Converts MIDI files to tokens saved as JSON files
tokenizer.tokenize_midi_dataset(paths, relative_path_to_json, midi_valid)

json_paths = list(pathlib.Path(relative_path_to_json).glob('*.json'))
entire_pop909_json_with_bools = []

for json_file in json_paths:
    with open(json_file) as f:
        data = json.load(f)
        entire_pop909_json_with_bools.extend(data)  # data is a list of lists; collect all its elements

entire_pop909_json_list = []
# just take song tokens, not boolean track signs
for slot in entire_pop909_json_with_bools:
    if False not in slot[0]: # TAKE CARE: just for Pop909 dataset
        entire_pop909_json_list.append(slot)

flatten_different_songs = [item for sublist in entire_pop909_json_list for item in sublist]
# flatten so token units fit the [4, 1024] shape; otherwise it would be [4, 1024, 8]
flatten_time_steps = [item for sublist in flatten_different_songs for item in sublist]

train_data = []
train_data.extend(flatten_time_steps)
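
If I am reading the saved structure right (this is an assumption about the JSON layout), the second flatten above is what dissolves Octuple's 8-sub-token time steps into a bare stream of ints, making the [4, 1024] shape possible. A toy illustration with made-up values:

# Assume each element of entire_pop909_json_list is a track: a list of
# time steps, each time step being 8 Octuple sub-token ids (values made up).
entire_pop909_json_list = [
    [[60, 20, 8, 0, 1, 0, 10, 4],    # track 0, time step 0
     [64, 22, 4, 8, 1, 0, 10, 4]],   # track 0, time step 1
]

flatten_different_songs = [item for sublist in entire_pop909_json_list for item in sublist]
# -> [[60, 20, 8, 0, 1, 0, 10, 4], [64, 22, 4, 8, 1, 0, 10, 4]]  (time steps)

flatten_time_steps = [item for sublist in flatten_different_songs for item in sublist]
# -> [60, 20, 8, 0, 1, 0, 10, 4, 64, ...]  (flat ints: the 8-sub-token
#    grouping that Octuple decodes by is no longer explicit in the data)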

Output tensor shapes from the DataLoader:

Train loader
X shape: torch.Size([4, 1024])
Target shape: torch.Size([4, 1024])

Generating from scratch:

rand_seq = model.generate(torch.Tensor([1]), target_seq_length=512)
out = rand_seq[0].cpu().numpy().tolist()

converted_back_midi = tokenizer.tokens_to_midi([out], None)
converted_back_midi.dump('output.mid')
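
One detail worth noting here (my reading of the call, not a confirmed diagnosis): out is a flat list of ints, so passing [out] makes tokens[0] inside tokens_to_midi the entire 512-token sequence, and the last element it inspects is simply whatever the model happened to generate last. A quick sanity check on the shapes, continuing from the snippet above:

print(len(out), type(out[0]))     # 512 <class 'int'>

# If Octuple expects one 8-element list per time step, regrouping would look
# like this -- though regrouping alone cannot guarantee the sub-tokens were
# generated in a valid order:
steps = [out[i:i + 8] for i in range(0, len(out), 8)]
print(len(steps), len(steps[0]))  # 64 8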

Errors

When the generation step is executed, one of two errors can appear. The first:

MidiTok Model Generator
Generating sequence of max length: 512
50 / 512
100 / 512
150 / 512
200 / 512
250 / 512
300 / 512
350 / 512
400 / 512
450 / 512
500 / 512

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_5234/3425966451.py in <module>
     14 out = rand_seq[0].cpu().numpy().tolist()
     15 
---> 16 converted_back_midi = tokenizer.tokens_to_midi([out], None)
     17 converted_back_midi.dump('4_model_1_OUTPUT(256).mid')
     18 

~/miniconda3/envs/remiTest/lib/python3.9/site-packages/miditok/octuple.py in tokens_to_midi(self, tokens, _, output_path, time_division)
    230 
    231         if self.additional_tokens['TimeSignature']:
--> 232             time_sig = self._parse_token_time_signature(self.tokens_to_events(tokens[0])[-1].value)
    233         else:  # default
    234             time_sig = TIME_SIGNATURE

~/miniconda3/envs/remiTest/lib/python3.9/site-packages/miditok/midi_tokenizer_base.py in _parse_token_time_signature(token_time_sig)
    447         :return: the numerator and denominator of a time signature
    448         """
--> 449         numerator, denominator = map(int, token_time_sig.split('/'))
    450         return numerator, denominator
    451 

ValueError: invalid literal for int() with base 10: '3.6.8'

Or this one, with an identical progress log and traceback (failing on the same _parse_token_time_signature line), but a different final error:

ValueError: not enough values to unpack (expected 2, got 1)

In the first error, the invalid literal is not always '3.6.8'; it can be any 'x.x.x' string, changing with every execution.
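
Both messages can in fact come from the same line shown in the tracebacks: _parse_token_time_signature assumes a value of the form 'numerator/denominator'. Anything else either fails int() (the 'x.x.x' strings look like MidiTok's Duration token values, which use a beat.position.resolution format, though that is my guess) or yields a single element to unpack. A standalone reproduction of just that line:

def parse_token_time_signature(token_time_sig: str):
    # the line from midi_tokenizer_base.py that raises in both tracebacks
    numerator, denominator = map(int, token_time_sig.split('/'))
    return numerator, denominator

for value in ('6/8', '3.6.8', '64'):
    try:
        print(value, '->', parse_token_time_signature(value))
    except ValueError as err:
        print(value, '->', err)

# 6/8   -> (6, 8)
# 3.6.8 -> invalid literal for int() with base 10: '3.6.8'
# 64    -> not enough values to unpack (expected 2, got 1)

So the two errors are one and the same symptom: the last generated token is not a TimeSignature token of the 'num/den' form, which seems consistent with training on the flattened 1D stream described above.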

Thanks in advance!

PS: Sorry if I made it too long, just wanted to be clear on each point 😃.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments:24 (12 by maintainers)

Top GitHub Comments

2 reactions
Natooz commented, May 29, 2022

Hi Env,

After a few tests I did not run into bugs, so I released the update in v1.2.0! If you get new bugs / crashes, please re-open this issue or create a new one! 😃

BTW, Octuple is pretty “demanding” in computing resources: the multiple inputs / outputs require a relatively large number of model parameters (and therefore GPU memory). The original authors used 8 V100s (32GB VRAM each), which is quite a lot. My results with one V100 weren’t very good either, with the model often producing errors like predicting Bars / Positions already passed (going backward in time). For smaller hardware / model sizes, representations like REMI / Structured are more suitable.

1 reaction
envilk commented, May 30, 2022

Amazing!

I will try training for many epochs on an Amazon GPU, and when I have results I'll let you know. The new version seems to work well!
