Cannot set log level of `ignite.distributed` because of calls to setup_logger inside auto functions

🐛 Bug description

The ignite.distributed auto functions, such as ignite.distributed.auto_dataloader, use setup_logger to get a logging.Logger instance. However, this function always sets the logging level to logging.INFO and, furthermore, removes and sets up the logging handlers again. Library functions should not reconfigure a logger object that is already initialized, so that user code (e.g., the main function) can specify a different log level, handlers, or log message format string.

See for example:

Note that in the case of ignite.distributed.launcher.Parallel, one can modify the self.logger instance, but this logger already prints messages right after the setup_logger call in its __init__ function. This makes it impossible to adjust the log level, format, or handlers for these first messages as well.
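To make the reported behavior concrete, here is a minimal sketch using only Python's logging module. The function setup_logger_stand_in and the logger name "demo.auto_dataloader" are hypothetical; the stand-in only mimics the behavior described above (forcing the level and replacing the handlers) and is not ignite's actual implementation.

```python
import logging


def setup_logger_stand_in(name: str, level: int = logging.INFO) -> logging.Logger:
    """Hypothetical stand-in that mimics the behavior described in this issue."""
    logger = logging.getLogger(name)
    logger.setLevel(level)  # always overwrites the current level
    for handler in list(logger.handlers):  # drops handlers the user configured
        logger.removeHandler(handler)
    logger.addHandler(logging.StreamHandler())
    return logger


# Application code configures the logger first ...
user_handler = logging.NullHandler()
app_logger = logging.getLogger("demo.auto_dataloader")
app_logger.setLevel(logging.WARNING)
app_logger.addHandler(user_handler)

# ... and a later library call undoes that configuration on the same object.
lib_logger = setup_logger_stand_in("demo.auto_dataloader")

level_was_reset = lib_logger.level == logging.INFO
handler_was_dropped = user_handler not in lib_logger.handlers
```

Because logging.getLogger returns the same object for the same name, the user's WARNING level and handler are lost as soon as the library function runs.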

Environment

PyTorch Version (e.g., 1.4):
Ignite Version (e.g., 0.3.0):
OS (e.g., Linux):
How you installed Ignite (conda, pip, source):
Python version:
Any other relevant information:

Issue Analytics

State:
Created 3 years ago
Comments:14 (8 by maintainers)

Top GitHub Comments

1 reaction

schuhschuh commented, May 11, 2021

Hey, this is great! Thanks for working on this and keeping me in the loop. Sorry I didn’t provide feedback as I was distracted.

The general recommendation for any Python lib should be to never use the root logger internally. Root is for applications. So I think this works well with how you’ve been adapting it now. I would probably have picked the default name to be simply ignite, because technically that is the “root” logger for the ignite library. The . separators in the logger name define a hierarchy. So if I want to modify all loggers used by ignite only, I can use the logging.getLogger("ignite") object in my application code. Was there another motivation for choosing ignite.root.logger instead?
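The hierarchy point can be illustrated with the standard logging module alone. In this sketch the child logger name "ignite.distributed.auto" is an assumption chosen for illustration; the only thing that matters is that it starts with "ignite.".

```python
import logging

# Library side: a module-level logger below the (assumed) "ignite" namespace.
child = logging.getLogger("ignite.distributed.auto")

# Application side: one call adjusts every logger below "ignite".
logging.getLogger("ignite").setLevel(logging.WARNING)

# The child has no level of its own, so it inherits WARNING from "ignite".
effective = child.getEffectiveLevel()
info_enabled = child.isEnabledFor(logging.INFO)
```

The root logger is never touched here, so other libraries and the application's own loggers keep whatever level they had.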

I think it’s good to have the reset flag default to False, even though it changes the behavior of setup_logger() compared to previous ignite versions. But as it’s a pre-stable release version, this should certainly be fine with users. Otherwise, I think the reset flag may not actually have any meaningful use? I was mostly suggesting it in the first place to maintain existing behavior, but be able to alter it via opt-in. Having the reset=False behavior should be the new norm, though, so not sure there is any meaningful use case for reset=True. Application code should normally set up logging at the very start of the main function, after which there seems no need to reset. But fine if the option exists, of course.

Instead of a reset flag, another option may have been to split the functionality of setup_logger() into separate, more reusable functions, each with a clearer and more limited scope. To set the logging level, one should just use the API provided by Python’s logging module anyway. I guess this mainly leaves a utility function to (re-)set the handlers depending on distributed_rank. If one were to remove level setting from setup_logger(), the function would indeed only be used to set the handlers as desired (as in setup_logger_handlers() kind of)?

Anyway, these were just some thoughts I had and neglected to share earlier. It is looking fine as is for my use case.
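A rough sketch of what such a handlers-only helper might look like follows. The name setup_logger_handlers is taken from the suggestion above, but its signature, the fmt parameter, and the logger name "demo.handlers_only" are assumptions made for illustration; nothing like this is confirmed to exist in ignite.

```python
import logging
from typing import Optional, TextIO


def setup_logger_handlers(
    logger: logging.Logger,
    distributed_rank: int = 0,
    stream: Optional[TextIO] = None,
    fmt: str = "%(asctime)s %(name)s %(levelname)s: %(message)s",
) -> logging.Logger:
    """Hypothetical helper: (re-)set the handlers only, never touch the level."""
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
    if distributed_rank > 0:
        # Non-zero ranks get only a NullHandler; records may still propagate
        # to ancestor loggers unless the caller disables propagation.
        logger.addHandler(logging.NullHandler())
    else:
        handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
        handler.setFormatter(logging.Formatter(fmt))
        logger.addHandler(handler)
    return logger


# The level is set with the standard logging API, independent of the handlers.
log = logging.getLogger("demo.handlers_only")
log.setLevel(logging.DEBUG)
setup_logger_handlers(log, distributed_rank=1)
```

With this split, the level chosen by the application survives any number of handler resets.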

1 reaction

schuhschuh commented, Mar 15, 2021

I would say you already have global logging objects for these functions anyway by the use of logging.getLogger() in setup_logger(). This is what makes it possible to adjust the logger without accessing a module level logger instance reference. Multiple calls to auto_*() functions will give you the same logger instance. The issue is just that this instance will also always reset the level, format, and handlers.
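The claim that repeated lookups yield the same logger can be checked directly with the standard library. The name "demo.auto" is a placeholder, not a logger name used by ignite.

```python
import logging

# Two independent lookups by name, as separate auto_*() calls would perform.
first = logging.getLogger("demo.auto")
second = logging.getLogger("demo.auto")
same_instance = first is second

# Configuration applied through one reference is visible through the other.
first.setLevel(logging.ERROR)
shared_level = second.level
```

This shared state is why user code can configure a library logger by name alone, and also why a later reset inside the library overwrites that configuration.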

Would it work to just use a setup_logger variant (maybe a flag for this function) which, instead of removing any previous handlers and setting up the logger retrieved by logging.getLogger(name), would simply check logger.hasHandlers() and return the already set up logger object if True at this line? One could still check and remove all handlers for distributed_rank > 0. But given that these loggers have a NullHandler set by any previous call to the auto_* functions, I don’t think this would be necessary. Also, by only checking hasHandlers() and changing the logger object only if it returns False, you leave the user the option to set up any desired handlers, even for loggers where the distributed_rank of the process is greater than zero.

For example:

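import logging
from typing import Optional, TextIO

def setup_logger(
    name: Optional[str] = None,
    level: int = logging.INFO,
    stream: Optional[TextIO] = None,
    format: str = "%(asctime)s %(name)s %(levelname)s: %(message)s",
    filepath: Optional[str] = None,
    distributed_rank: Optional[int] = None,
    reset: bool = True,  # proposed flag; True keeps the previous behavior
) -> logging.Logger:
    logger = logging.getLogger(name)
    # hasHandlers() also considers handlers attached to ancestor loggers.
    if logger.hasHandlers() and not reset:
        # Already configured by the user: return it unchanged.
        return logger
    # [...]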

EDIT: It may be desirable, though, to have the handlers always be removed and replaced by a NullHandler if distributed_rank > 0, regardless of the reset flag (to be used in auto_*() functions when calling setup_logger()). That way, a user can set up the handler for the named logger in the main process before any processes are spawned.
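One possible reading of this EDIT, written as a self-contained sketch: handlers are always swapped for a NullHandler on non-zero ranks, while rank 0 respects an existing configuration unless reset is requested. The function setup_logger_sketch, its reduced parameter list, the reset=False default, and the "demo.*" logger names are all assumptions for illustration and do not reflect ignite's implementation.

```python
import logging
from typing import Optional


def setup_logger_sketch(
    name: Optional[str] = None,
    level: int = logging.INFO,
    distributed_rank: int = 0,
    reset: bool = False,
) -> logging.Logger:
    """Hypothetical variant of the proposal, not ignite's implementation."""
    logger = logging.getLogger(name)
    if distributed_rank > 0:
        # Always replace the handlers on non-zero ranks, whatever `reset` says.
        for handler in list(logger.handlers):
            logger.removeHandler(handler)
        logger.addHandler(logging.NullHandler())
        return logger
    if logger.hasHandlers() and not reset:
        # Rank 0 with an existing configuration: leave it alone.
        return logger
    # Otherwise fall back to a default configuration.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
    logger.setLevel(level)
    logger.addHandler(logging.StreamHandler())
    return logger


# Main process: the user configures the named logger before anything else runs.
user_handler = logging.StreamHandler()
main_logger = logging.getLogger("demo.reset_flag")
main_logger.setLevel(logging.DEBUG)
main_logger.addHandler(user_handler)

kept = setup_logger_sketch("demo.reset_flag")  # rank 0 keeps the user setup
worker = setup_logger_sketch("demo.worker", distributed_rank=1)
```

In a real distributed run the same logger name would be used in every process; separate names are used here only so both branches can be shown in a single process.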