
ValueError: invalid literal for int() with base 10: '3.6.8' OR ValueError: not enough values to unpack (expected 2, got 1)


First of all, the framework has already been very useful!

I am running into two kinds of errors and don't know why. I'm using the GPT-2 architecture (from the repository's example notebook), successfully trained, with MidiTok 1.1.9.

Code structure

Encoding:

pitch_range = range(21, 109)
beat_res = {(0, 4): 8}
nb_velocities = 32
additional_tokens = {'Chord': False, 'Rest': False, 'Tempo': True, 'Program': True, 'TimeSignature': True,
                     'nb_tempos': 32,
                     'tempo_range': (40, 250),
                     'time_signature_range': (8, 2)}
tokenizer = Octuple(pitch_range, beat_res, nb_velocities, additional_tokens)
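
For context on the shapes involved later: with these settings, Octuple represents each time step as a tuple of sub-tokens (pitch, velocity, duration, position, bar, plus program / tempo / time signature here), so one track tokenizes to a list of 8-element lists rather than a flat list of ints. A quick, hypothetical way to see this ('example.mid' is a placeholder, and midi_to_tokens is the MidiTok 1.x method as I understand it):

from miditoolkit import MidiFile

midi = MidiFile('example.mid')           # placeholder file name
tokens = tokenizer.midi_to_tokens(midi)  # tokenizer defined above

print(tokens[0])    # one time step: a list of 8 sub-token ids
print(len(tokens))  # number of time steps, not number of ints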

Preprocessing:

import json
import pathlib

# Converts MIDI files to tokens saved as JSON files
tokenizer.tokenize_midi_dataset(paths, relative_path_to_json, midi_valid)

json_paths = list(pathlib.Path(relative_path_to_json).glob('*.json'))
entire_pop909_json_with_bools = []

for json_file in json_paths:
    with open(json_file) as f:
        data = json.load(f)
        entire_pop909_json_with_bools.extend(data)  # data is a list of lists; collect all its elements

entire_pop909_json_list = []
# just take song tokens, not boolean track signs
for slot in entire_pop909_json_with_bools:
    if False not in slot[0]: # TAKE CARE: just for Pop909 dataset
        entire_pop909_json_list.append(slot)

flatten_different_songs = [item for sublist in entire_pop909_json_list for item in sublist]
# flatten so token units fit the [4, 1024] shape; otherwise it would be [4, 1024, 8]
flatten_time_steps = [item for sublist in flatten_different_songs for item in sublist]

train_data = []
train_data.extend(flatten_time_steps)
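
If I am reading the saved structure right (this is an assumption about the JSON layout), the second flatten above is what dissolves Octuple's 8-sub-token time steps into a bare stream of ints, making the [4, 1024] shape possible. A toy illustration with made-up values:

# Assume each element of entire_pop909_json_list is a track: a list of
# time steps, each time step being 8 Octuple sub-token ids (values made up).
entire_pop909_json_list = [
    [[60, 20, 8, 0, 1, 0, 10, 4],    # track 0, time step 0
     [64, 22, 4, 8, 1, 0, 10, 4]],   # track 0, time step 1
]

flatten_different_songs = [item for sublist in entire_pop909_json_list for item in sublist]
# -> [[60, 20, 8, 0, 1, 0, 10, 4], [64, 22, 4, 8, 1, 0, 10, 4]]  (time steps)

flatten_time_steps = [item for sublist in flatten_different_songs for item in sublist]
# -> [60, 20, 8, 0, 1, 0, 10, 4, 64, ...]  (flat ints: the 8-sub-token
#    grouping that Octuple decodes by is no longer explicit in the data)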

Output tensor shapes from the DataLoader:

Train loader
X shape: torch.Size([4, 1024])
Target shape: torch.Size([4, 1024])

Generating from scratch:

rand_seq = model.generate(torch.Tensor([1]), target_seq_length=512)
out = rand_seq[0].cpu().numpy().tolist()

converted_back_midi = tokenizer.tokens_to_midi([out], None)
converted_back_midi.dump('output.mid')
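
One detail worth noting here (my reading of the call, not a confirmed diagnosis): out is a flat list of ints, so passing [out] makes tokens[0] inside tokens_to_midi the entire 512-token sequence, and the last element it inspects is simply whatever the model happened to generate last. A quick sanity check on the shapes, continuing from the snippet above:

print(len(out), type(out[0]))     # 512 <class 'int'>

# If Octuple expects one 8-element list per time step, regrouping would look
# like this -- though regrouping alone cannot guarantee the sub-tokens were
# generated in a valid order:
steps = [out[i:i + 8] for i in range(0, len(out), 8)]
print(len(steps), len(steps[0]))  # 64 8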

Errors

When the generation step is executed, one of two errors can appear. The first:

MidiTok Model Generator
Generating sequence of max length: 512
50 / 512
100 / 512
150 / 512
200 / 512
250 / 512
300 / 512
350 / 512
400 / 512
450 / 512
500 / 512

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_5234/3425966451.py in <module>
     14 out = rand_seq[0].cpu().numpy().tolist()
     15 
---> 16 converted_back_midi = tokenizer.tokens_to_midi([out], None)
     17 converted_back_midi.dump('4_model_1_OUTPUT(256).mid')
     18 

~/miniconda3/envs/remiTest/lib/python3.9/site-packages/miditok/octuple.py in tokens_to_midi(self, tokens, _, output_path, time_division)
    230 
    231         if self.additional_tokens['TimeSignature']:
--> 232             time_sig = self._parse_token_time_signature(self.tokens_to_events(tokens[0])[-1].value)
    233         else:  # default
    234             time_sig = TIME_SIGNATURE

~/miniconda3/envs/remiTest/lib/python3.9/site-packages/miditok/midi_tokenizer_base.py in _parse_token_time_signature(token_time_sig)
    447         :return: the numerator and denominator of a time signature
    448         """
--> 449         numerator, denominator = map(int, token_time_sig.split('/'))
    450         return numerator, denominator
    451 

ValueError: invalid literal for int() with base 10: '3.6.8'

Or this one, with an identical progress log and traceback (failing on the same _parse_token_time_signature line), but a different final error:

ValueError: not enough values to unpack (expected 2, got 1)

In the first error, the invalid literal is not always '3.6.8'; it can be any 'x.x.x' string, changing with every execution.
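
Both messages can in fact come from the same line shown in the tracebacks: _parse_token_time_signature assumes a value of the form 'numerator/denominator'. Anything else either fails int() (the 'x.x.x' strings look like MidiTok's Duration token values, which use a beat.position.resolution format, though that is my guess) or yields a single element to unpack. A standalone reproduction of just that line:

def parse_token_time_signature(token_time_sig: str):
    # the line from midi_tokenizer_base.py that raises in both tracebacks
    numerator, denominator = map(int, token_time_sig.split('/'))
    return numerator, denominator

for value in ('6/8', '3.6.8', '64'):
    try:
        print(value, '->', parse_token_time_signature(value))
    except ValueError as err:
        print(value, '->', err)

# 6/8   -> (6, 8)
# 3.6.8 -> invalid literal for int() with base 10: '3.6.8'
# 64    -> not enough values to unpack (expected 2, got 1)

So the two errors are one and the same symptom: the last generated token is not a TimeSignature token of the 'num/den' form, which seems consistent with training on the flattened 1D stream described above.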

Thanks in advance!

PS: Sorry if I made it too long, just wanted to be clear on each point 😃.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments:24 (12 by maintainers)

Top GitHub Comments

2 reactions
Natooz commented, May 29, 2022

Hi Env,

After a few tests I did not run into bugs, so I released the update in v1.2.0! If you get new bugs / crashes, please re-open this issue or create a new one! 😃

BTW, Octuple is pretty “demanding” in computing resources: the multiple inputs / outputs require a relatively large number of model parameters (and therefore GPU memory). The original authors used 8 V100s (32GB VRAM each), which is quite a lot. My results with one V100 weren’t very good either, with the model often producing errors like predicting Bars / Positions already passed (going backward in time). For smaller hardware / model sizes, representations like REMI / Structured are more suitable.

1 reaction
envilk commented, May 30, 2022

Amazing!

I will try training for many epochs on an Amazon GPU, and when I have results I'll let you know. The new version seems to work well!
