
Augmenting trial run metadata with metadata from `Runner.stop` call

See original GitHub issue

Hi, I would like to thank you for such a detailed abstraction of the experimentation steps. Although it takes a deep dive into the modules to understand the purpose of each component (and I sometimes misinterpret them), eventually all the pieces fit together beautifully.

I want to discuss a use case that relies on the implementation of a Runner. I checked that the only method required to implement is Runner.run(), which may return some metadata about the deployment. I assume I can use this method to read the arms of the trial and prepare the actual parameters that configure and start the real process. The runner may then collect further metadata while the process runs, which Metric.fetch_trial_data() later uses to compute the value of a metric. This part may not work exactly as I predict but, unlike synthetic metrics that implement an offline function f(), I couldn't think of another way to handle process data.
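As a sketch of that contract, here is roughly what such a runner could look like. Note that this is a hypothetical illustration: the `Runner` and `Trial` stubs below stand in for Ax's real `ax.core.runner.Runner` and `ax.core.trial.Trial` classes, and the field names (`arm_parameters`, `job_id`) are assumptions, not Ax's API.

```python
# Hypothetical sketch: stub classes stand in for Ax's Runner and Trial.
from typing import Any, Dict


class Runner:  # stand-in for ax.core.runner.Runner
    def run(self, trial) -> Dict[str, Any]:
        raise NotImplementedError


class Trial:  # stand-in holding only the fields this sketch needs
    def __init__(self, index: int, arm_parameters: Dict[str, Any]):
        self.index = index
        self.arm_parameters = arm_parameters


class MyProcessRunner(Runner):
    def run(self, trial) -> Dict[str, Any]:
        # Read the trial's arm parameters and launch the external process.
        params = trial.arm_parameters
        job_id = f"job-{trial.index}"  # e.g. an ID from the process launcher
        # Whatever run() returns is stored as the trial's run_metadata,
        # which the metric can later read when fetching trial data.
        return {"job_id": job_id, "params": params}


trial = Trial(index=0, arm_parameters={"x": 1.5})
metadata = MyProcessRunner().run(trial)
print(metadata)  # {'job_id': 'job-0', 'params': {'x': 1.5}}
```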

In my specific case, I want Runner.run() to open a TCP connection to a real-time target computer and deploy the required parameters. However, the target cannot transfer real-time process data before the process stops, because doing so would overload the computer. So Runner.run() will not return any process metadata. Conceptually, I would use Runner.stop() to stop the process, close the connection, and collect the data that my metric needs to compute its value. However, BaseTrial.complete() does not collect any metadata.
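The intended split of responsibilities between run() and stop() could be sketched as follows. This is an assumption-laden illustration: `TargetConnection` is a stand-in for a real TCP client to the target machine, and the method names (`deploy`, `stop_and_fetch`) are made up for this sketch rather than taken from Ax.

```python
# Hypothetical sketch: run() deploys parameters over a connection;
# stop() halts the process and collects the data the metric needs.
from typing import Any, Dict


class TargetConnection:  # stand-in for a real TCP client to the target
    def __init__(self) -> None:
        self.deployed: Dict[str, Any] = {}

    def deploy(self, params: Dict[str, Any]) -> None:
        self.deployed = params

    def stop_and_fetch(self) -> Dict[str, Any]:
        # On the real target, data can only be transferred after stopping,
        # to avoid overloading the real-time computer.
        return {"samples": [0.1, 0.2], "params": self.deployed}


class RealTimeRunner:
    def __init__(self) -> None:
        self._connections: Dict[int, TargetConnection] = {}

    def run(self, trial: Dict[str, Any]) -> Dict[str, Any]:
        conn = TargetConnection()
        conn.deploy(trial["params"])
        self._connections[trial["index"]] = conn
        return {}  # no process data is available yet

    def stop(self, trial: Dict[str, Any]) -> Dict[str, Any]:
        conn = self._connections.pop(trial["index"])
        # This metadata is what the metric would need to compute its value.
        return conn.stop_and_fetch()


t = {"index": 0, "params": {"gain": 2.0}}
runner = RealTimeRunner()
runner.run(t)
result = runner.stop(t)
print(result)  # {'samples': [0.1, 0.2], 'params': {'gain': 2.0}}
```

The sticking point the question raises is exactly the last step: nothing in the trial-completion flow calls stop() and stores its return value.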

As a workaround, I could implement Runner.run() to execute all of the above steps, but then what would be the purpose of Runner.stop()? Eventually I would define something like stop_my_process() and call it inside Runner.run().

I will probably work around this by implementing a MyTrial that overrides only .complete(), so that:

...
metadata = not_none(self._runner).stop(self)
if metadata:
    self.update_run_metadata(metadata)
...
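A fuller, runnable version of that override might look like the following. The stubs (`StubRunner`, the `MyTrial` scaffolding, and the local `not_none`) stand in for Ax's real classes; in practice MyTrial would subclass `ax.core.trial.Trial` and `not_none` would come from Ax's type utilities.

```python
# Hypothetical sketch: a trial whose complete() pulls metadata from
# Runner.stop() before marking itself completed.
from typing import Any, Dict, Optional


def not_none(value):
    # stand-in for Ax's not_none type-narrowing helper
    assert value is not None
    return value


class StubRunner:  # stand-in runner whose stop() returns process data
    def stop(self, trial) -> Optional[Dict[str, Any]]:
        return {"final_cost_data": [1.0, 2.0]}


class MyTrial:  # would subclass ax.core.trial.Trial in practice
    def __init__(self, runner) -> None:
        self._runner = runner
        self.run_metadata: Dict[str, Any] = {}
        self.completed = False

    def update_run_metadata(self, metadata: Dict[str, Any]) -> None:
        self.run_metadata.update(metadata)

    def complete(self) -> "MyTrial":
        # Stop the process and fold its metadata into run_metadata,
        # so the metric can read it when fetching trial data.
        metadata = not_none(self._runner).stop(self)
        if metadata:
            self.update_run_metadata(metadata)
        self.completed = True
        return self


trial = MyTrial(StubRunner()).complete()
print(trial.run_metadata)  # {'final_cost_data': [1.0, 2.0]}
```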

But I would like your comments on it: will this be handled directly in BaseTrial in the future, or is there a more Ax-like way to do it?

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7 (6 by maintainers)

Top GitHub Comments

2 reactions
Balandat commented, Jun 24, 2021

@ugurmengilli I just realized that I happened to work on exactly this feature without actually being aware of this issue. I should be able to put out a PR some time soon, would be great to get your eyes on it 😃 Will be sure to tag you.

1 reaction
lena-kashtelyan commented, Jun 21, 2021

An “updates” section with a date of the update on the main page would probably be very beneficial to track the changes to the website.

That’s a great idea and is actually on our roadmap for the second half of this year, so stay tuned!
