Improve the compilation pipeline (introduce stages)


For now, we have a somewhat messed-up pipeline: the IR nodes do everything by themselves. Type checking, member resolution, lowering, codegen: we have everything everywhere. Right now, only parsing is properly isolated from everything else.

We should introduce more formal stages of compilation, and maybe even create several different layers of IR? For example, we have certain constructs that have to be lowered (such as +=-styled operators). We may get rid of these constructs in the “lowered IR layer” and thus decrease the number of type checks required in the codegen.
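To make the “lowered IR layer” idea concrete, here is a minimal sketch of a lowering pass. It is written in Python purely for illustration (it is not the project's code), and every name in it (`CompoundAssign`, `Assign`, `BinaryOp`, `lower`) is hypothetical. It assumes the assignment target is a plain variable, so rewriting `x += e` as `x = x + e` does not duplicate any side effects.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class BinaryOp:
    op: str
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Assign:
    target: Var
    value: "Expr"


@dataclass(frozen=True)
class CompoundAssign:
    """High-level-only node, e.g. `x += 1`. It must not survive lowering."""
    op: str  # the binary operator without the '=', e.g. "+"
    target: Var
    value: "Expr"


Expr = Union[Var, Const, BinaryOp, Assign, CompoundAssign]


def lower(node: Expr) -> Expr:
    """Rewrite a high-level tree into one that contains no CompoundAssign."""
    if isinstance(node, CompoundAssign):
        # x op= e  becomes  x = x op e (valid here because target is a plain Var)
        return Assign(node.target, BinaryOp(node.op, node.target, lower(node.value)))
    if isinstance(node, Assign):
        return Assign(node.target, lower(node.value))
    if isinstance(node, BinaryOp):
        return BinaryOp(node.op, lower(node.left), lower(node.right))
    return node  # Var and Const are already in lowered form
```

With a pass like this in front of it, a code generator would only need to handle `Var`, `Const`, `BinaryOp` and `Assign`, which is the reduction in per-node checks the paragraph above is hoping for.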

I don’t know how it should be organized, though. Just start by separating the IR layers, and everything else will click into place?

Thoughts?

cc @kant2002, @impworks

When implementing, look for the number 201 in the source and try to eliminate every instance of that number.

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

ForNeVeR commented, Sep 7, 2022 (1 reaction)

Your example demonstrates that we should have a way to preserve the source information after preprocessing, and I very much agree with that.

I still very strongly disagree with the idea of changing the grammar. We have a lot of hacks already, and it is very hard to reason about the parser: how does it correspond to the C standard? Does it have any parser conflicts? What should we change to migrate to C23 once it emerges?

And these questions will quickly become impossible to answer if we change the parser completely, inventing some unholy hybrid of C grammar and C preprocessor grammar. Moreover, I cannot see how it helps to achieve anything: this combined grammar won’t have any source information embedded, either.

I believe that the preprocessor should generate some kind of annotated result (so that we know where each token comes from). If this was (part of) your point, then I agree. Simple plain text output from the preprocessor won’t work, and I agree on that, too.

Unfortunately, I believe Yoakke isn’t able to work with such token streams out of the box. We may choose either to migrate to some other library (or a manual parser), or to provide a bridge between the preprocessor-generated text and the source. One way I can imagine to preserve the source information without changing Yoakke is the following:

  1. Parse the source file into a stream of preprocessor tokens (note that each token still has the source information: in-file position and range, but there’s no way to distinguish different files).
  2. The preprocessor should provide a string content and a mapping from each text token of this content to the original source, i.e. (string, Dictionary<TextRange, SourceInformation>). Of course, there may be some peculiarities involved when we are trying to determine “the origin” of a C token created by the preprocessor (since the token may be glued together by several different macros), but that’s in any case a question for our interpretation.
  3. When parsing the file in the C language, we receive the text ranges from the parser. These ranges may then be remapped to the original source code using the Dictionary<TextRange, SourceInformation>, because at that point, both text ranges (from the dictionary and from the C parser) operate on the same source: the text blob from the preprocessor output.
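The three steps above can be sketched roughly as follows. This is an illustrative Python model, not the project's C# code and not Yoakke's API: `SourceInformation`, `preprocess` and `remap` are hypothetical names, tokens are assumed to be joined by single spaces, and a parser range is resolved to the origin of the preprocessor token that contains its start offset (tokens glued together by macros are not modeled).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

TextRange = Tuple[int, int]  # [start, end) offsets into the preprocessed text


@dataclass(frozen=True)
class SourceInformation:
    file: str
    line: int
    column: int


def preprocess(
    tokens: List[Tuple[str, SourceInformation]],
) -> Tuple[str, Dict[TextRange, SourceInformation]]:
    """Join token texts with single spaces and record where each one came from."""
    parts: List[str] = []
    mapping: Dict[TextRange, SourceInformation] = {}
    offset = 0
    for text, origin in tokens:
        if parts:
            offset += 1  # account for the separating space
        mapping[(offset, offset + len(text))] = origin
        parts.append(text)
        offset += len(text)
    return " ".join(parts), mapping


def remap(
    parser_range: TextRange,
    mapping: Dict[TextRange, SourceInformation],
) -> SourceInformation:
    """Translate a range reported by the C parser back to the original source."""
    start, _ = parser_range
    for (tok_start, tok_end), origin in mapping.items():
        if tok_start <= start < tok_end:
            return origin
    raise KeyError(parser_range)
```

A linear scan is enough for a sketch; a real implementation would more likely keep the ranges sorted and use binary search, and would need a policy for ranges that span tokens from several files.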

ForNeVeR commented, Sep 8, 2022 (0 reactions)

To me, it’s not a big deal whether we do it in this repo or contribute to Yoakke. The latter is a bit more complicated because we’ll need to invent abstractions that are useful for other people as well as for ourselves. But it’s still doable.
