[feature] Abstract global KFP compilation context/state

Feature Area

/area sdk

What feature would you like to see?

Gather all the mutable globals used during KFP compilation into one singleton. Users could then supply their own implementation to support whatever compilation state model they need, instead of the synchronous, global-variable-based approach used today, and install it in a single step by replacing one singleton with another. Third-party plugins could then provide KFP SDK support for frameworks such as Flask or Django, independently of KFP itself.

At the same time, the singleton doesn’t need to be aware of all the state KFP needs; it only has to provide a generic mechanism for setting and retrieving global values by unique key (name).

What is the use case or pain point?

Currently, KFP compilation relies on numerous global variables such as Pipeline._default_pipeline, _components._container_task_constructor, and _container_op._register_op_handler, which means that at least the compile_pipeline function must be @synchronized (locked) between calls. This is very inefficient when KFP DSL compilation runs in a concurrent environment, and it could be avoided by moving the global state behind a proxy.

Is there a workaround currently?

One could manually replace each private global that KFP mutates with a proxy, but this has serious drawbacks: it is very hacky, and because the variables are private, they can be added or removed at any time in later versions of KFP.

Implementation details

I imagine the base class of such a global context could look as follows:

from abc import ABC, abstractmethod
from typing import Any, Dict, Optional

class KfpCompilationCtx(ABC):
  @abstractmethod
  def set_global(self, name: str, value: Any) -> None: ...

  @abstractmethod
  def get_global(self, name: str) -> Any: ...

  def get_set_global(self, name: str, value: Any) -> Any:
    try:
      old_value = self.get_global(name)
    except (KeyError, AttributeError):  # the global has not been set yet
      old_value = None
    self.set_global(name, value)
    return old_value

# could also be a static inside of the class, but that would enable other classes to overwrite it
_kfp_ctx: Optional[KfpCompilationCtx] = None

def set_kfp_ctx(ctx: KfpCompilationCtx):
  global _kfp_ctx
  _kfp_ctx = ctx

def get_kfp_ctx() -> KfpCompilationCtx:
  return _kfp_ctx

class KfpGlobalCtx(KfpCompilationCtx):
  def __init__(self):
    super().__init__()
    self._globals_dict: Dict[str, Any] = {}

  def set_global(self, name: str, value: Any) -> None:
    self._globals_dict[name] = value

  def get_global(self, name: str) -> Any:
    return self._globals_dict[name]

# by default, the context is global-based
set_kfp_ctx(KfpGlobalCtx())

but it would also trivially enable the following implementation for Flask:

from flask import g

class KfpFlaskCtx(KfpCompilationCtx):
  def set_global(self, name: str, value: Any) -> None:
    setattr(g, name, value)

  def get_global(self, name: str) -> Any:
    return getattr(g, name)
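
In the same spirit, a context that isolates state per thread or asyncio task could be built on the standard library's `contextvars` module. The sketch below is only an illustration: `KfpContextVarCtx` is a hypothetical name, and the base class is repeated in reduced form so the example runs on its own.

```python
import contextvars
import threading
from abc import ABC, abstractmethod
from typing import Any


class KfpCompilationCtx(ABC):
    """Reduced copy of the proposed base class, repeated so this runs alone."""

    @abstractmethod
    def set_global(self, name: str, value: Any) -> None: ...

    @abstractmethod
    def get_global(self, name: str) -> Any: ...


class KfpContextVarCtx(KfpCompilationCtx):
    """Hypothetical implementation backed by a ContextVar."""

    def __init__(self) -> None:
        # Assumption: a single long-lived instance is created per process,
        # since ContextVar objects are meant to be created once.
        self._state = contextvars.ContextVar('kfp_compilation_state')

    def set_global(self, name: str, value: Any) -> None:
        # Copy-on-write: never mutate a dict another context may share.
        new_state = dict(self._state.get({}))
        new_state[name] = value
        self._state.set(new_state)

    def get_global(self, name: str) -> Any:
        # Raises KeyError when unset, like the dict-based implementation.
        return self._state.get({})[name]


ctx = KfpContextVarCtx()
ctx.set_global('default_pipeline', 'main')
seen = []


def worker() -> None:
    # A value set in the worker thread stays in the worker's context.
    ctx.set_global('default_pipeline', 'worker')
    seen.append(ctx.get_global('default_pipeline'))


thread = threading.Thread(target=worker)
thread.start()
thread.join()
```

After the worker finishes, the main thread still sees its own value, so two compilations would not need a shared lock to keep their state apart.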

The current implementation of the Pipeline class would then change to:

class Pipeline():
  def __init__(self, name: str):
    self.name = name
    self.ops = {}
    # Add the root group.
    self.groups = [_ops_group.OpsGroup('pipeline', name=name)]
    self.group_id = 0
    self.conf = PipelineConf()
    self._metadata = None

  @staticmethod
  def get_default_pipeline() -> Optional['Pipeline']:
    return get_kfp_ctx().get_global('default_pipeline')

  @staticmethod
  def _set_default_pipeline(value: Optional['Pipeline']) -> None:
    get_kfp_ctx().set_global('default_pipeline', value)

  def __enter__(self) -> 'Pipeline':
    if self.get_default_pipeline():
      raise Exception('Nested pipelines are not allowed.')

    self._set_default_pipeline(self)
    ctx = get_kfp_ctx()
    self._old_container_task_constructor = ctx.get_set_global(
      'container_task_constructor',
      _component_bridge._create_container_op_from_component_and_arguments
    )

    def register_op_and_generate_id(op):
      return self.add_op(op, op.is_exit_handler)

    self._old__register_op_handler = ctx.get_set_global(
      'register_op_handler',
      register_op_and_generate_id
    )
    return self

  def __exit__(self, *args):
    self._set_default_pipeline(None)
    ctx = get_kfp_ctx()
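    # Restore the handlers that were active before entering the pipeline.
    ctx.set_global('register_op_handler', self._old__register_op_handler)
    ctx.set_global('container_task_constructor', self._old_container_task_constructor)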

  ...

…which is trivially compatible with both implementations.

Since KFP usually only needs to set and unset a global, a set_global/unset_global pair might be a better interface; it would certainly make global locking mechanisms easier to implement. A setdefault method could also be useful. Either way, the general idea is as outlined above.
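
One possible reading of that alternative interface is sketched below. Everything in it is an assumption made for illustration: the class name `KfpLockedGlobalCtx`, the `default` parameter on `get_global`, and the choice to guard each operation with a single lock (a stricter design might hold the lock from set until unset).

```python
import threading
from typing import Any, Dict


class KfpLockedGlobalCtx:
    """Hypothetical set/unset/setdefault context guarded by one lock."""

    def __init__(self) -> None:
        self._globals_dict: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def set_global(self, name: str, value: Any) -> None:
        with self._lock:
            self._globals_dict[name] = value

    def unset_global(self, name: str) -> None:
        # Unsetting a name that was never set is treated as a no-op.
        with self._lock:
            self._globals_dict.pop(name, None)

    def get_global(self, name: str, default: Any = None) -> Any:
        with self._lock:
            return self._globals_dict.get(name, default)

    def setdefault(self, name: str, default: Any) -> Any:
        # Atomically return the current value, storing `default` if unset.
        with self._lock:
            return self._globals_dict.setdefault(name, default)


ctx = KfpLockedGlobalCtx()
ctx.set_global('default_pipeline', 'p')
kept = ctx.setdefault('default_pipeline', 'q')  # existing value wins
ctx.unset_global('default_pipeline')
after_unset = ctx.get_global('default_pipeline')
```

With explicit unset, `Pipeline.__exit__` would no longer need to write None into the context to signal "no pipeline active".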


Love this idea? Give it a 👍. We prioritize fulfilling features with the most 👍.

Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions: 3
  • Comments: 10 (7 by maintainers)

Top GitHub Comments

1 reaction
animeshsingh commented, Jan 19, 2022

@chensun please let us know if there are any valid arguments why not to do this - else let’s move this forward

0 reactions
Tomcli commented, Apr 26, 2022

/assign
