[Question] Trained models do not give the best rewards (Hugging Face models)

❓ Question

I am trying to load and test trained models. I downloaded a trained model from Hugging Face and evaluated it, which gave mean_reward=6.60 +/- 4.758150901348128. Do I need to set the hyperparameters? If so, how can I do that?

Code:

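import gym

from huggingface_sb3 import load_from_hub
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.common.vec_env import VecFrameStack

from google.colab.patches import cv2_imshow

# Download the trained checkpoint from the Hugging Face Hub
checkpoint = load_from_hub(
    repo_id="sb3/ppo-BreakoutNoFrameskip-v4",
    filename="ppo-BreakoutNoFrameskip-v4.zip",
)
model = PPO.load(checkpoint)

# Build the evaluation environment with Atari preprocessing and 4 stacked frames
eval_env = make_atari_env("Breakout-v4", n_envs=4, seed=0)
eval_env = VecFrameStack(eval_env, n_stack=4)

# Run 10 evaluation episodes with deterministic actions
mean_reward, std_reward = evaluate_policy(
    model, eval_env, n_eval_episodes=10, deterministic=True, warn=False
)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward}")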

Hyperparameters:

OrderedDict([('batch_size', 256), ('clip_range', 'lin_0.1'), ('ent_coef', 0.01), ('env_wrapper', ['stable_baselines3.common.atari_wrappers.AtariWrapper']), ('frame_stack', 4), ('learning_rate', 'lin_2.5e-4'), ('n_envs', 8), ('n_epochs', 4), ('n_steps', 128), ('n_timesteps', 10000000.0), ('policy', 'CnnPolicy'), ('vf_coef', 0.5), ('normalize', False)])
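
A note on the 'lin_' values above: they look like the RL Baselines3 Zoo shorthand for a value that decays linearly to zero over training. If that reading is right, they describe how the model was trained and should not need to be set again just to evaluate a loaded policy. The sketch below shows one way such a string could be interpreted; `parse_schedule` is a hypothetical helper written for illustration, not a function from Stable-Baselines3 or the zoo.

```python
from typing import Callable, Union


def parse_schedule(value: Union[str, float]) -> Callable[[float], float]:
    """Turn a hyperparameter such as 'lin_2.5e-4' or 0.01 into a schedule.

    Hypothetical helper. Assumption: 'lin_X' means "start at X and decay
    linearly to 0", driven by progress_remaining, which goes from 1.0 at
    the start of training to 0.0 at the end.
    """
    if isinstance(value, str) and value.startswith("lin_"):
        initial = float(value[len("lin_"):])
        return lambda progress_remaining: progress_remaining * initial
    constant = float(value)
    # Plain numbers are treated as constant schedules.
    return lambda _progress_remaining: constant


learning_rate = parse_schedule("lin_2.5e-4")
clip_range = parse_schedule("lin_0.1")
ent_coef = parse_schedule(0.01)
```

Calling `learning_rate(1.0)` gives the initial value and `learning_rate(0.0)` gives zero, so under this assumption the listed values only matter while training is in progress.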

Link - https://huggingface.co/sb3/ppo-BreakoutNoFrameskip-v4
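
One thing that may be worth checking in the snippet above: the checkpoint is named after "BreakoutNoFrameskip-v4", while the evaluation environment is created with "Breakout-v4". These are different Gym environment IDs, and to my knowledge "Breakout-v4" already skips frames on its own, so the Atari wrapper's frame skipping would be applied on top of it. The sketch below is a small sanity check under the assumption that SB3 Hub repositories are named `<org>/<algo>-<env_id>`; both helper names are hypothetical.

```python
from typing import Optional


def env_id_from_repo(repo_id: str) -> str:
    """Extract the env id from a repo id like 'sb3/ppo-BreakoutNoFrameskip-v4'.

    Assumption: the repo name follows the '<algo>-<env_id>' pattern.
    """
    repo_name = repo_id.split("/")[-1]
    _algo, _, env_id = repo_name.partition("-")
    return env_id


def check_eval_env(repo_id: str, eval_env_id: str) -> Optional[str]:
    """Return a warning message if the evaluation env differs from the training env."""
    trained_env_id = env_id_from_repo(repo_id)
    if trained_env_id != eval_env_id:
        return (
            f"model was trained on {trained_env_id!r} "
            f"but is being evaluated on {eval_env_id!r}"
        )
    return None


warning = check_eval_env("sb3/ppo-BreakoutNoFrameskip-v4", "Breakout-v4")
if warning is not None:
    print(warning)
```

If the check reports a mismatch, passing the training environment ID to `make_atari_env` is the first thing to try. The `frame_stack` value of 4 in the hyperparameters already matches the `n_stack=4` used in the question.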

Checklist

  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • If code there is, it is minimal and working
  • If code there is, it is formatted using the markdown code blocks for both code and stack traces.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
simoninithomas commented, Oct 23, 2022

Hey there 👋 . My bad, we changed it some months ago, we need to specify the zip file since we can load_from_hub any file. But indeed I forgot to update it on the SB3 documentation (it was updated on the Hub documentation and tutorials): https://huggingface.co/docs/hub/stable-baselines3#using-existing-models

I’ll make a doc update PR today, thanks @indramal for pointing this out 🤗

0 reactions
indramal commented, Oct 24, 2022

@simoninithomas no problem
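
As the maintainer's comment says, `load_from_hub` needs the file name as well as the repository, because a repository can hold any file. A minimal sketch of a wrapper that fills in the file name is shown below. It assumes the zip is named after the repository, as in the question ("ppo-BreakoutNoFrameskip-v4.zip"); `default_zip_name` and `download_checkpoint` are hypothetical helpers, and only `load_from_hub(repo_id=..., filename=...)` comes from `huggingface_sb3`.

```python
from typing import Optional


def default_zip_name(repo_id: str) -> str:
    """Guess the checkpoint file name from the repo id.

    Assumption: the zip file is named '<repo name>.zip'.
    """
    return repo_id.split("/")[-1] + ".zip"


def download_checkpoint(repo_id: str, filename: Optional[str] = None) -> str:
    """Download a checkpoint and return its local path."""
    # Imported lazily so default_zip_name works without huggingface_sb3 installed.
    from huggingface_sb3 import load_from_hub

    return load_from_hub(
        repo_id=repo_id,
        filename=filename or default_zip_name(repo_id),
    )
```

For example, `download_checkpoint("sb3/ppo-BreakoutNoFrameskip-v4")` would request "ppo-BreakoutNoFrameskip-v4.zip". If a repository uses a different file name, pass it explicitly through `filename`.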
