Issues getting antmaze-medium-play-v0 results with IQL

Hi there, thank you for releasing the CORL benchmark. I cloned the latest repo and used the parameters below to run the antmaze-medium-play-v0 experiment. However, the normalized reward stayed near 0 for the first 430,000 gradient steps.
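
For context on what "normalized reward" means here, a minimal sketch of a D4RL-style normalized score follows. The function name `normalized_score` is hypothetical, and the AntMaze reference scores of 0.0 and 1.0 are an assumption; in practice the D4RL environment's `get_normalized_score` method does this lookup.

```python
def normalized_score(raw_return: float, ref_min: float, ref_max: float) -> float:
    """D4RL-style normalized score, scaled to the 0-100 range."""
    return 100.0 * (raw_return - ref_min) / (ref_max - ref_min)


# Assumption: AntMaze reference scores are 0.0 (min) and 1.0 (max), so the
# normalized score is just the goal-reaching success rate in percent.
success_rate = 3 / 100  # e.g. 3 successful episodes out of 100 evaluation runs
score = normalized_score(success_rate, ref_min=0.0, ref_max=1.0)
print(f"normalized score: {score:.1f}")
```

Under that assumption, a normalized score near 0 means the agent almost never reaches the goal during evaluation.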

I did not change the code except for these parameters:

class TrainConfig:
    # Experiment
    device: str = "cpu"
    env: str = "antmaze-medium-play-v0"  # OpenAI gym environment name
    seed: int = 0  # Sets Gym, PyTorch and Numpy seeds
    eval_freq: int = int(1e4)  # How often (time steps) we evaluate
    n_episodes: int = 100  # How many episodes run during evaluation
    max_timesteps: int = int(1e6)  # Max time steps to run environment
    checkpoints_path: str = "./models/iql"  # Save path
    load_model: str = ""  # Model load file name, "" doesn't load
    # IQL
    buffer_size: int = 10_000_000  # Replay buffer size
    batch_size: int = 256  # Batch size for all networks
    discount: float = 0.99  # Discount factor
    tau: float = 0.005  # Target network update rate
    beta: float = 10.0  # Inverse temperature. Small beta -> BC, big beta -> maximizing Q
    iql_tau: float = 0.9  # Coefficient for asymmetric loss
    iql_deterministic: bool = False  # Use deterministic actor
    normalize: bool = True  # Normalize states
    normalize_reward: bool = False  # Normalize reward
    # Wandb logging
    project: str = "CORL-default"
    group: str = "IQL-D4RL"
    name: str = "IQL"

And the results are as below:

 % python iql.py
objc[33597]: Class GLFWApplicationDelegate is implemented in both /Users/xxx/.mujoco/mujoco210/bin/libglfw.3.dylib (0x11aa13778) and /opt/anaconda3/envs/iql2/lib/python3.10/site-packages/glfw/libglfw.3.dylib (0x11aabc7e8). One of the two will be used. Which one is undefined.
objc[33597]: Class GLFWWindowDelegate is implemented in both /Users/xxx/.mujoco/mujoco210/bin/libglfw.3.dylib (0x11aa13700) and /opt/anaconda3/envs/iql2/lib/python3.10/site-packages/glfw/libglfw.3.dylib (0x11aabc810). One of the two will be used. Which one is undefined.
objc[33597]: Class GLFWContentView is implemented in both /Users/xxx/.mujoco/mujoco210/bin/libglfw.3.dylib (0x11aa137a0) and /opt/anaconda3/envs/iql2/lib/python3.10/site-packages/glfw/libglfw.3.dylib (0x11aabc860). One of the two will be used. Which one is undefined.
objc[33597]: Class GLFWWindow is implemented in both /Users/xxx/.mujoco/mujoco210/bin/libglfw.3.dylib (0x11aa13818) and /opt/anaconda3/envs/iql2/lib/python3.10/site-packages/glfw/libglfw.3.dylib (0x11aabc8d8). One of the two will be used. Which one is undefined.
Warning: Flow failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'flow'
Warning: CARLA failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'carla'
pybullet build time: Oct 16 2022 01:59:14
/opt/anaconda3/envs/iql2/lib/python3.10/site-packages/gym/envs/registration.py:505: UserWarning: WARN: The environment antmaze-medium-play-v0 is out of date. You should consider upgrading to version `v2` with the environment ID `antmaze-medium-play-v2`.
  logger.warn(
/Users/xxx/Documents/project_offlineexploration/D4RL_6330b4e09e36a80f4b706a3885d59d97745c05a9/d4rl/locomotion/ant.py:180: UserWarning: This environment is deprecated. Please use the most recent version of this environment.
  offline_env.OfflineEnv.__init__(self, **kwargs)
Target Goal:  (20.64647417679362, 21.089515421327548)
/opt/anaconda3/envs/iql2/lib/python3.10/site-packages/gym/spaces/box.py:84: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
load datafile: 100%|██████████████████████████████| 8/8 [00:03<00:00,  2.14it/s]
Dataset size: 999092
Checkpoints path: ./models/iql
---------------------------------------
Training IQL, Env: antmaze-medium-play-v0, Seed: 0
---------------------------------------
wandb: Currently logged in as: lxu. Use `wandb login --relogin` to force relogin
wandb: wandb version 0.13.4 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.12.21
wandb: Run data is saved locally in /Users/xxx/Documents/default_repo/CORL/algorithms/wandb/run-20221019_133015-2d1a2d9d-8f35-4295-bac7-e39fa293699c
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run IQL
wandb: ⭐️ View project at https://wandb.ai/xxx/CORL-default
wandb: 🚀 View run at https://wandb.ai/xxx/CORL-default/runs/2d1a2d9d-8f35-4295-bac7-e39fa293699c
wandb: WARNING Calling wandb.run.save without any arguments is deprecated.Changes to attributes are automatically persisted.
[Image: iql_results]

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

4 reactions
egg-west commented, Oct 19, 2022

My problem is solved by setting normalize_reward to True.
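
For readers wondering why this flag matters, here is a minimal sketch. It assumes that for AntMaze datasets the reward normalization simply shifts the sparse 0/1 rewards by -1, as the original IQL implementation does; the helper name `shift_antmaze_rewards` and the toy arrays are illustrative only and are not taken from CORL.

```python
import numpy as np


def shift_antmaze_rewards(rewards: np.ndarray) -> np.ndarray:
    """Assumed AntMaze 'normalization': map sparse rewards {0, 1} to {-1, 0}."""
    return rewards - 1.0


# Toy trajectory: the reward is 1 only on the step that reaches the goal.
raw = np.array([0.0, 0.0, 0.0, 1.0], dtype=np.float32)
shifted = shift_antmaze_rewards(raw)

print("raw rewards:    ", raw.tolist())
print("shifted rewards:", shifted.tolist())
```

With the shift, every step that does not reach the goal carries a penalty, so the value targets are no longer almost all zero. In the config from the question, this corresponds to changing the line to `normalize_reward: bool = True`.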

0 reactions
DT6A commented, Oct 19, 2022

Thanks for your report. The AntMaze configs are fixed now: https://github.com/tinkoff-ai/CORL/pull/8
