
[PTAL] [WIP] Questions regarding MLIR Environment for CompilerGym


Hi Chris,

I’ve uploaded @sogartar’s and my latest changes to my local fork. I don’t think it’s quite ready for a pull request at the moment. In particular, I’ve made breaking changes to the LLVM environment to get it working for now (I’m aware of your branch working on this), and that’s not to mention some leftover cleanup.

That said, I think it would be good to get some high-level feedback from you to make sure we’re heading in a direction that you would feel comfortable merging in the (hopefully very near) future. To be clear, I’m not looking for a full code review, just a skim and feedback on a couple of design questions.

  1. Do you expect the feature for picking an LLVM revision to be something that you would like to complete and have us build off of, or should we go ahead and create our own solution? This could get a little ugly because of files like compiler_gym/third_party/autophase/InstCount.cc, where the function getNumArgOperands does not exist (at least under that name) in LLVM 13. There might be an appropriate version-agnostic syntax, but the straightforward solution would be to use compiler directives to switch the code based on the version chosen.

  2. In a similar vein, if we restrict builds to using only one version of LLVM at a time, the most straightforward solution would be to conditionally build environments: throwing errors if environments that want conflicting versions are enabled, and either generating certain files or doing conditional imports in files like compiler_gym/random_search.py and setup.py. For our purposes, I worry that we would implement such a basic/inelegant solution to 1) and 2) that it might cause you trouble down the line, particularly if you already have a structure/solution in mind for either of these problems.

  3. We want to be able to dynamically generate benchmarks. You’ve done something that almost achieves this in the LLVM environment with the csmith benchmark, but that workaround won’t work for us, because we want to be able to pass in multiple parameters (right now, just three integers), which makes the space quite a bit larger and fairly prohibitive. I’m not sure if there’s anything you can easily do about this given that you’re following the Gym API, but I wanted to bring it to your attention because we’re having to use somewhat messy workarounds. Also, we can’t easily/meaningfully have the benchmark just reference a filename with arbitrarily populated code, because of…

  4. MLIR’s analog to compiler passes doesn’t ensure the code is runnable at each step, and in particular, there isn’t always a known way to lower MLIR code to executable bitcode*. MLIR code is also generally best not run alone, but rather linked against a main file defined in C++ that initializes data and calls out to the MLIR. This means that we probably want to pass around two source files instead of just one for a benchmark, which violates the pattern even further. My workaround right now is to use the getSiteDataPath function to get a directory in which to semi-persistently store the additional C++ file. You can see how I’m doing it right now in compiler_gym/envs/mlir/datasets/matmul.py. I can obviously store the string elsewhere, but for accessing the file from the benchmark, it’s not clear to me that there’s a cleaner way. (A rough sketch of how 3) and 4) fit together follows the footnote below.)

*There’s no way to always lower ARBITRARY MLIR, but for restricted subsets like the ones we are currently using, we can establish a set of passes, run before execution and after any optimization passes, that ensures the code is runnable. In general, I expect each MLIR benchmark to follow a pattern of that sort, which can currently be signaled through the MlirActionSpace enum. Finding a good way to let developers create new benchmarks/extensions to the environment is going to be the largest technical challenge, but I’m getting ahead of myself. We want a single benchmark working first.
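
To make 3) and 4) concrete, here’s a rough Python sketch of the shape I have in mind; it is not the actual code in my fork. The site_data_path import and the Benchmark.from_file call are my assumptions about which public helpers to lean on, and both templates are stand-ins for the real files in the dataset.

```python
# Rough sketch only (not the fork's actual code): generate an MLIR matmul
# benchmark from three integer parameters and stash the companion C++ driver
# in a semi-persistent per-benchmark directory.
from pathlib import Path

from compiler_gym.datasets import Benchmark                 # assumed import
from compiler_gym.util.runfiles_path import site_data_path  # assumed import

# Placeholder templates standing in for the real MLIR kernel and C++ driver.
MLIR_TEMPLATE = "// matmul kernel specialized for {m}x{n}x{k}\n"
MAIN_CC_TEMPLATE = "// C++ driver: allocates {m}x{n} and {n}x{k} buffers, calls the kernel\n"


def make_matmul_benchmark(m: int, n: int, k: int) -> Benchmark:
    """Build a benchmark whose URI encodes its parameters."""
    uri = f"benchmark://matmul-v0/{m}x{n}x{k}"

    # Semi-persistent scratch space keyed by the parameters, so the C++ driver
    # can be found again later from the benchmark URI alone.
    workdir: Path = site_data_path(f"mlir-v0/matmul/{m}x{n}x{k}")
    workdir.mkdir(parents=True, exist_ok=True)

    mlir_path = workdir / "matmul.mlir"
    mlir_path.write_text(MLIR_TEMPLATE.format(m=m, n=n, k=k))
    (workdir / "main.cc").write_text(MAIN_CC_TEMPLATE.format(m=m, n=n, k=k))

    # Pack the MLIR source into the benchmark; the driver stays on disk.
    return Benchmark.from_file(uri, mlir_path)
```

A dataset could then sample or enumerate (m, n, k) triples instead of enumerating a fixed list of URIs, which is the part the csmith-style approach doesn’t give us.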

Thank you, Kyle

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 20 (19 by maintainers)

Top GitHub Comments

1 reaction
ChrisCummins commented, Mar 27, 2022

> An Event has semantic meaning that does not correspond to a feature in the Benchmark message type

I’m not so sure. An “Event”, in the context of an Action, is a catch-all for any type of action. In the context of an Observation, it is a catch-all for any type of feature. Could we not just use the same catch-all for benchmark features?

You could retain the Any behavior using the any_value event type:

Benchmark(uri="...", features={"a": Event(any_value=Any(...))})
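
Spelled out a bit more, something like the following. This assumes the Python bindings from compiler_gym.service.proto and a features field of map<string, Event>, i.e. exactly the part we are debating; Int64Value is just a stand-in payload type.

```python
from google.protobuf.any_pb2 import Any
from google.protobuf.wrappers_pb2 import Int64Value  # stand-in payload type

# Assumes the proto bindings export Benchmark and Event, and that Benchmark
# gains a features: map<string, Event> field.
from compiler_gym.service.proto import Benchmark, Event

payload = Any()
payload.Pack(Int64Value(value=42))  # any message type can be packed here

benchmark = Benchmark(
    uri="benchmark://example-v0/foo",
    features={"a": Event(any_value=payload)},
)
```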

> You also lose out on the feature of the protobuf JSON parser packing arbitrary protobufs for you and instead have to recall the various names of event fields

Ooh okay this sounds interesting, how does it work?

BTW, aside from splitting hairs over the features field (which no environment currently requires 🙂 ), I think you’re good to go if you want to start a branch to introduce the multi-file map.

Cheers, Chris

1 reaction
KyleHerndon commented, Mar 23, 2022

This new suggestion makes a lot more sense to me, and I would be fine with it as stated. It does occur to me that it may be preferable, depending on what you want to optimize for, to make features a map<string, google.protobuf.Any>. This would give you type safety (only at nested levels, though, not at the top level, unless you count explicitly checking the Any’s type) at the expense of having to pack/unpack the message.
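
For comparison, the consumer-side cost I mean looks roughly like this; Int64Value is just a stand-in for whatever message a feature would actually carry.

```python
from google.protobuf.any_pb2 import Any
from google.protobuf.wrappers_pb2 import Int64Value  # stand-in payload type


def read_int_feature(any_msg: Any) -> int:
    # Type safety only arrives once you explicitly check and unpack.
    if not any_msg.Is(Int64Value.DESCRIPTOR):
        raise TypeError(f"unexpected feature type: {any_msg.TypeName()}")
    value = Int64Value()
    any_msg.Unpack(value)  # copies the payload into the typed message
    return value.value
```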


