[App]: 'You must call wandb.init() before wandb.log()' at the end of the training

Current Behavior

Hello, I am running into a strange behavior with wandb. I get the message “You must call wandb.init() before wandb.log()”, but only at the end of my training! I am running my code on Vertex AI (using a Dockerfile). I don't get any error or warning during training, but at the end (after the wandb run summary) I get this:

wandb: Synced test_CB_wandb: https://wandb.ai/bodyguard/huggingface/runs/m19hgjkq
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20221007_145819-m19hgjkq/logs
wandb: ERROR You must call wandb.init() before wandb.log()
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/app/my_trainer/task.py", line 100, in <module>
    main()
  File "/app/my_trainer/task.py", line 96, in main
    experiment.run(args)
  File "/app/my_trainer/experiment.py", line 154, in run
    metrics = trainer.evaluate(eval_dataset=val_dataset)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer.py", line 2807, in evaluate
    self.log(output.metrics)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer.py", line 2401, in log
    self.control = self.callback_handler.on_log(self.args, self.state, self.control, logs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer_callback.py", line 390, in on_log
    return self.call_event("on_log", args, state, control, logs=logs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer_callback.py", line 407, in call_event
    **kwargs,
  File "/opt/conda/lib/python3.7/site-packages/transformers/integrations.py", line 731, in on_log
    self._wandb.log({**logs, "train/global_step": state.global_step})
  File "/opt/conda/lib/python3.7/site-packages/wandb/sdk/lib/preinit.py", line 36, in preinit_wrapper
    raise wandb.Error(f"You must call wandb.init() before {name}()")
wandb.errors.Error: You must call wandb.init() before wandb.log()
ERROR:root:Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/app/my_trainer/task.py", line 100, in <module>
    main()
  File "/app/my_trainer/task.py", line 96, in main
    experiment.run(args)
  File "/app/my_trainer/experiment.py", line 154, in run
    metrics = trainer.evaluate(eval_dataset=val_dataset)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer.py", line 2807, in evaluate
    self.log(output.metrics)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer.py", line 2401, in log
    self.control = self.callback_handler.on_log(self.args, self.state, self.control, logs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer_callback.py", line 390, in on_log
    return self.call_event("on_log", args, state, control, logs=logs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/trainer_callback.py", line 407, in call_event
    **kwargs,
  File "/opt/conda/lib/python3.7/site-packages/transformers/integrations.py", line 731, in on_log
    self._wandb.log({**logs, "train/global_step": state.global_step})
  File "/opt/conda/lib/python3.7/site-packages/wandb/sdk/lib/preinit.py", line 36, in preinit_wrapper
    raise wandb.Error(f"You must call wandb.init() before {name}()")
wandb.errors.Error: You must call wandb.init() before wandb.log()
NoneType: None

Note that my training code has the init, log and finish calls in the right places:

wandb.init(project="my_project")
wandb.login(key=my_key)

#...
trainer.train()
wandb.finish()
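
Going by the traceback above, the failing wandb.log() call is triggered by trainer.evaluate() in experiment.py, and it appears after the wandb run summary. That ordering suggests evaluate() runs after wandb.finish() has already closed the run, although the issue does not show the full script. A minimal sketch of the reordering, assuming that is the cause: run_experiment is a hypothetical helper, and trainer and tracker are passed in so the ordering can be tried without either library installed.

```python
def run_experiment(trainer, eval_dataset, tracker):
    """Train, evaluate, and only then close the tracking run.

    Assumptions (not taken from the issue): `trainer` exposes train() and
    evaluate(eval_dataset=...), as transformers.Trainer does, and `tracker`
    exposes finish(), as the wandb module does.
    """
    try:
        trainer.train()
        # evaluate() reports its metrics through the trainer's logging
        # callbacks, so the tracking run must still be open at this point.
        metrics = trainer.evaluate(eval_dataset=eval_dataset)
    finally:
        # Close the run last, even if training or evaluation raised.
        tracker.finish()
    return metrics
```

In the reporter's script this would correspond to calling run_experiment(trainer, val_dataset, wandb) after wandb.init(...), instead of calling wandb.finish() directly after trainer.train().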

Expected Behavior

The error message should be at the beginning of my training or I shouldn’t get it at all.

Issue Analytics

Created a year ago
Comments: 27

Top GitHub Comments

1 reaction

ma-batita commented, Dec 1, 2022

Hi @thanos-wandb, this is what my Dockerfile looks like. I really appreciate your efforts. Thank you!

FROM us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-11:latest

WORKDIR /app

RUN pip install google-cloud-storage transformers sentencepiece wandb accelerate toma deepspeed

COPY ./my_accelerate/__init__.py /app/my_accelerate/__init__.py
COPY ./my_accelerate/config_file.yaml /app/my_accelerate/config_file.yaml
COPY ./my_accelerate/script.py /app/my_accelerate/script.py


ENTRYPOINT ["python", "-m", "my_accelerate.script"]

1 reaction

ma-batita commented, Nov 22, 2022

Hi @thanos-wandb, here is a shared file with the same test data that I used in my code.