including applicable extensions in the opcode syntax

I have been using this repo as the official source of encodings for our internal design and verification tools. One issue I keep running into is the lack of concrete information about which extension(s) an instruction is applicable under. I want to decode only the instructions that are applicable to a user-defined ISA: if the user specifies RV64IMC, then only instructions under those three extensions must be decoded. The current file-naming convention is somewhat useful, but it does not address all the issues, two of which I describe below:

  1. c.flw should be applicable only when F and C are both implemented. Placing it inside opcodes-rvc confuses the tools, and having a separate file opcodes-rv32fc increases maintenance.
  2. Instructions like pack are present under multiple extensions (zbp, zbf and zbe). Placing pack into the opcodes file of each sub-extension might work but is not scalable: one has to remember to edit all of those files for any change to the instruction.

So a file-naming convention alone might not work, and the syntax of opcode entries will need to change slightly. The following is a very quick and dirty proposal (which will need refining) of what I think can address the above issues:

Add a comma-separated list of the extensions under which the encoding is a legal instruction, wrapped within | | at the end of the line.

Examples

c.flw      1..0=0 15..13=3 12=ignore 11..2=ignore |RV32FC|

pack       rd rs1 rs2 31..25=4  14..12=4 6..2=0x0C 1..0=3 |RV32Zbp, RV32Zbf, RV32Zbe, RV64Zbp, RV64Zbf, RV64Zbe|

Tools can then use substring matching to identify if that instruction is applicable for the user-defined ISA or not.
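As a rough illustration of how a tool might consume the comma-separated form, here is a minimal Python sketch. The helper names (split_tags, parse_isa, is_applicable) are hypothetical and not part of this repo. It also assumes one particular reading of "matching": a literal substring test would not find RV32FC inside a user ISA string such as RV32IMFC, so the sketch compares sets of extension names instead, and it assumes multi-letter extensions in the user string are written like RV64IMC_Zbp.

```python
import re

# Hypothetical helpers: none of these names exist in the repo.
TRAILER = re.compile(r"^(.*?)\s*\|(.*)\|\s*$")   # "<encoding> |tag, tag, ...|"
ISA = re.compile(r"^RV(32|64)(.*)$")             # base width, then extensions
TOKEN = re.compile(r"Z[a-z0-9]+|[A-Z]")          # "Zbp"-style names or single letters


def split_tags(line):
    """Return (encoding_part, [tags]); tags is empty when no |...| trailer exists."""
    m = TRAILER.match(line)
    if m is None:
        return line.strip(), []
    tags = [t.strip() for t in m.group(2).split(",") if t.strip()]
    return m.group(1).strip(), tags


def parse_isa(text):
    """Split e.g. 'RV64IMC_Zbp' into ('64', {'I', 'M', 'C', 'Zbp'})."""
    m = ISA.match(text.strip())
    if m is None:
        raise ValueError(f"not an RV32/RV64 string: {text!r}")
    return m.group(1), set(TOKEN.findall(m.group(2)))


def is_applicable(tags, user_isa):
    """True if any tag has the user's base width and only extensions the user has.

    An empty tag list returns False here; how untagged lines should be
    treated is left open by the proposal.
    """
    xlen, exts = parse_isa(user_isa)
    for tag in tags:
        tag_xlen, tag_exts = parse_isa(tag)
        if tag_xlen == xlen and tag_exts <= exts:
            return True
    return False


_, cflw_tags = split_tags("c.flw      1..0=0 15..13=3 12=ignore 11..2=ignore |RV32FC|")
print(is_applicable(cflw_tags, "RV32IMFC"))  # F and C both present
print(is_applicable(cflw_tags, "RV32IMC"))   # F missing
```

Existing tools that do not care about the extension list could call split_tags and keep only the first element of the result.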

A better way of doing the above would be to use regex (less readable but extremely powerful):

c.flw      1..0=0 15..13=3 12=ignore 11..2=ignore |RV(32).*(F).*(C).*|

pack       rd rs1 rs2 31..25=4  14..12=4 6..2=0x0C 1..0=3 |RV(32|64).*(Zbp|Zbf|Zbe).*|

The regexes will need to follow a few strict guidelines when written, but that should be manageable.
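The regex variant could be checked along these lines. This is only a sketch: split_pattern and matches_isa are hypothetical names, the pattern is taken to be everything between the first and last | on the line (so that | inside the regex survives), and treating untagged lines as always applicable is an assumption rather than part of the proposal.

```python
import re


def split_pattern(line):
    """Return (encoding_part, pattern); pattern is None when no |...| trailer exists.

    Uses the first and last '|' on the line, so alternations such as
    (32|64) inside the pattern are kept intact.
    """
    start = line.find("|")
    end = line.rfind("|")
    if start == -1 or end <= start:
        return line.strip(), None
    return line[:start].strip(), line[start + 1:end].strip()


def matches_isa(pattern, user_isa):
    """True if the whole user ISA string matches the entry's pattern."""
    if pattern is None:
        return True  # assumption: untagged entries apply everywhere
    return re.fullmatch(pattern, user_isa) is not None


_, cflw = split_pattern(
    "c.flw      1..0=0 15..13=3 12=ignore 11..2=ignore |RV(32).*(F).*(C).*|")
_, pack = split_pattern(
    "pack       rd rs1 rs2 31..25=4  14..12=4 6..2=0x0C 1..0=3 |RV(32|64).*(Zbp|Zbf|Zbe).*|")

print(matches_isa(cflw, "RV32IMFC"))     # F then C present
print(matches_isa(cflw, "RV32IMC"))      # no F
print(matches_isa(pack, "RV64IMC_Zbp"))  # Zbp present
```

Note that the c.flw pattern only matches when F appears before C in the user's string, which is one example of the kind of ordering rule the patterns and the ISA strings would both need to respect.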

Pros of the proposal:

  • the syntax is pretty regex-able and simply adds on to the current syntax. Current tools depending on this repo will simply need to ignore everything between | |.
  • minimal changes to existing scripts in this repo to generate the current set of artifacts
  • does not require a strict file naming convention - improves scalability
  • number of files in the repo will reduce - improves maintenance

Before I start work on a PR for the above, I wanted to get a sense of whether such a change would be welcome or acceptable.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 13 (2 by maintainers)

Top GitHub Comments

aswaterman commented, Jan 29, 2022

My immediate reaction is that I prefer a different approach that’s more similar to what we’re currently doing: use the file names to make this distinction, rather than adding metadata to the individual instructions.

For instructions that belong to multiple extensions, we could use the existing @ aliasing scheme when they appear in multiple files, or invent some new prefix that means “I know this is defined elsewhere, but I’m including it here anyway, without explicitly rewriting its operands”.

Regardless, I agree we should solve the problem you’re trying to solve, and your solution is a reasonable approach. I’d like others to weigh in.

neelgala commented, May 3, 2022

closed in #106
