Petr Kobalicek
@kobalicek
@rschoene:tu-dresden.de This is currently unexplored. I think CpuInfo would have to be changed to offer multiple lists of CpuFeatures information. However, I don't know how to get information about a core other than the one your thread is currently running on.
Robert Schöne (INF-TeI-RA; ZIH)
@rschoene:tu-dresden.de
@kobalicek: that's unfortunate, but thanks for the swift reply
Petr Kobalicek
@kobalicek
I would like to keep this simple. On Linux you can read it from /proc/cpuinfo, but that means parsing text, and it only works on Linux.
Not sure how this is exposed on Windows etc.
However, when it comes to CPU features, I think it doesn't matter, because every core type only exposes features that the other core types also have,
so if you detect AVX2, all cores support AVX2.
I think that's why Intel blocked AVX-512 on those high performance cores.
BTW I don't even have Alder Lake to test this. If anyone is willing to explore, I'm all ears. I think refactoring CpuInfo to support this would be straightforward.
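A minimal sketch of the /proc/cpuinfo route mentioned above, assuming the x86 Linux layout in which each logical processor has a `processor` line followed later by a `flags` line. `parseCpuFlags` is a hypothetical helper, not part of asmjit; it reads from any input stream so it can be tried on a string before being pointed at the real file.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper (not an asmjit API): parses text in the /proc/cpuinfo
// format and returns, for every logical processor index, the set of names
// listed on its "flags" line.
std::map<int, std::set<std::string>> parseCpuFlags(std::istream& in) {
  std::map<int, std::set<std::string>> result;
  std::string line;
  int current = -1;  // index of the processor block being read

  while (std::getline(in, line)) {
    std::size_t colon = line.find(':');
    if (colon == std::string::npos)
      continue;  // blank separator lines have no colon

    // Keys are padded with tabs/spaces before the colon, so trim them.
    std::string key = line.substr(0, colon);
    key.erase(key.find_last_not_of(" \t") + 1);
    std::string value = line.substr(colon + 1);

    if (key == "processor") {
      current = std::stoi(value);
    } else if (key == "flags" && current >= 0) {
      std::istringstream words(value);
      std::string flag;
      while (words >> flag)
        result[current].insert(flag);
    }
  }
  return result;
}
```

On a real system the stream would be `std::ifstream("/proc/cpuinfo")`. Comparing the returned sets across processor indices would show whether the core types differ in what they advertise.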
Robert Schöne (INF-TeI-RA; ZIH)
@rschoene:tu-dresden.de
Unfortunately, I do not have the time for this. But if I can find a student worker and funding, I'll let you know.
Lukas Gerlach
@s8lvg
Is it possible to use asmjit in such a way that it selects the instructions to emit at runtime? So for example to emit random instructions with the correct operands?
Petr Kobalicek
@kobalicek
Can you elaborate?
Short answer: you can. asmjit is highly dynamic and you can simply emit anything. This is how asmtk works, btw: it uses the asmjit database.
If you do this, you would use the low-level API instead of the high-level instruction mnemonic API offered by emitters.
Lukas Gerlach
@s8lvg
I'm trying to use asmjit to fuzz short instruction sequences. My current approach is:
  • Parse the x86emitter.h file and create a wrapper function for each mnemonic offered by the emitter
  • Create extra functions that return a randomly selected register or memory location (of the right type)
  • Put these wrapper functions into function arrays, which the fuzzer can then use to emit instructions by calling the function at the right index
I found out that this approach is quite limited because there are many edge cases (AVX512 instructions with different KRegs, consecutive registers, implicit operands, ...). Thanks in advance.
Michael McLoughlin
@mmcloughlin
I'm curious what inputs you used to build your arm instruction database? I'm looking at the ISA documentation they distribute in XML format (https://developer.arm.com/downloads/-/exploration-tools) which is awesome. However, maybe I've missed it, but I don't see the RW action on operands represented in this spec in a simple way, at least without parsing out the "Arm Pseudocode".
deltars
@deltars
Hello, I am interested in pre-compiling my asmjit code and serializing it to binary, then deserializing it later to be executed (the idea is to reduce initialization time).
I think I need to serialize CodeHolder after compiler.finalize() but before runtime.add(), as this is where we have the compiled code but it hasn't been given a physical address yet.
Has anyone else serialized the compiled asmjit to file for execution later?
Is there a more practical approach? CodeHolder looks like quite a complex data structure for serializing.
Petr Kobalicek
@kobalicek
@s8lvg You have two options - parse the JSON database (asmdb) asmjit is using, or use asmjit to query the instructions it supports. I have created a small gist that could be a baseline for this: https://gist.github.com/kobalicek/6a11464e2c6777f1c87d44f779e82b7d - note that this is raw information from asmjit, essentially this is how asmjit sees the instructions so if an instruction supports m8, but the pointer size is not required, it would show both mem and m8. In addition, you can also check out how CULT does this (https://github.com/asmjit/cult/blob/master/src/cult/instbench.cpp) - it also iterates instructions dynamically and benchmarks them.
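A table-driven generator along these lines could be sketched as follows. Everything here is illustrative: `InstForm`, `OpKind`, and the table rows are hypothetical stand-ins for data you would fill from asmdb or from asmjit's own instruction queries, and the output is plain assembly text rather than calls into an emitter.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical operand kinds; a real table would carry many more.
enum class OpKind { Gp32, Gp64, Mem, Imm8 };

// One row per instruction form (mnemonic plus operand signature).
struct InstForm {
  const char* name;
  std::vector<OpKind> operands;
};

// Returns a random operand of the requested kind as text.
std::string randomOperand(OpKind kind, std::mt19937& rng) {
  static const char* gp32[] = {"eax", "ecx", "edx", "ebx"};
  static const char* gp64[] = {"rax", "rcx", "rdx", "rbx"};
  std::uniform_int_distribution<int> reg(0, 3);
  std::uniform_int_distribution<int> imm(0, 255);

  switch (kind) {
    case OpKind::Gp32: return gp32[reg(rng)];
    case OpKind::Gp64: return gp64[reg(rng)];
    case OpKind::Mem:  return std::string("[") + gp64[reg(rng)] + "]";
    case OpKind::Imm8: return std::to_string(imm(rng));
  }
  return std::string();
}

// Picks a random row and fills each operand slot with a matching operand.
// The table must not be empty.
std::string randomInstruction(const std::vector<InstForm>& table, std::mt19937& rng) {
  std::uniform_int_distribution<std::size_t> pick(0, table.size() - 1);
  const InstForm& form = table[pick(rng)];

  std::string text = form.name;
  for (std::size_t i = 0; i < form.operands.size(); i++) {
    text += (i == 0) ? " " : ", ";
    text += randomOperand(form.operands[i], rng);
  }
  return text;
}
```

The point of the design is that edge cases become extra columns in the table instead of extra wrapper functions. The generated text could then be handed to a text assembler, for example asmtk's parser.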
@mmcloughlin It's a hand made database. The problem of importing third party data is that there is sometimes missing stuff that asmjit needs.
Petr Kobalicek
@kobalicek

@deltars

It would be possible to serialize and deserialize CodeHolder. However, I would suggest instead emitting code that is PIC (position independent code). Then you would not need to serialize/deserialize the content of CodeHolder, only to maintain a table of addresses that would be resolved by the deserializer.

The reason is that if the code calls a function, for example, you would serialize/deserialize that function's address too, and that address doesn't have to be valid in some other process (especially if the OS/linker does address randomization for you). To fix this, you can maintain a function table, or symbol table, which would contain pointers, and another structure which would describe the name of the symbol at each index. In the code, you would use a separate section for this table, which would be flattened and put at the end, and patched by the deserializer before the code is actually used.

This way, you would have code that is position independent and that uses a table of addresses that can change across processes.
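A minimal sketch of the deserializer side of that idea, under the assumption that the blob is laid out as code followed by pointer-sized table slots. `SerializedCode` and `patchAddressTable` are hypothetical names for illustration, not asmjit APIs, and the resolver map stands in for whatever symbol lookup the host process provides.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical serialized form: machine code followed by an address table,
// plus the symbol name that belongs to each table slot. The names survive
// across processes; the pointers are filled in at load time.
struct SerializedCode {
  std::vector<std::uint8_t> bytes;       // code, then the table slots
  std::size_t tableOffset = 0;           // where the address table starts
  std::vector<std::string> symbolNames;  // name for slot 0, 1, 2, ...
};

// Writes this process' address of every named symbol into its slot.
// Returns false if a symbol is unknown or the table does not fit the blob.
bool patchAddressTable(SerializedCode& code,
                       const std::map<std::string, std::uintptr_t>& resolver) {
  const std::size_t slotSize = sizeof(std::uintptr_t);
  if (code.tableOffset > code.bytes.size() ||
      code.symbolNames.size() > (code.bytes.size() - code.tableOffset) / slotSize)
    return false;

  for (std::size_t i = 0; i < code.symbolNames.size(); i++) {
    auto it = resolver.find(code.symbolNames[i]);
    if (it == resolver.end())
      return false;
    std::memcpy(code.bytes.data() + code.tableOffset + i * slotSize,
                &it->second, slotSize);
  }
  return true;
}
```

The generated code never embeds an absolute function address; it only loads from a slot at a fixed distance from itself, so the same bytes stay valid wherever they are mapped.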

deltars
@deltars
thank you @kobalicek !
For generating PIC, I think I do a similar thing to JitRuntime::_add(), but call code->relocateToBase(0).
And the relocateToBase function shows me that I need to keep track of the RelocEntry records in CodeHolder._relocations (a simple table with offset, value size, and RelocType) so that I can apply my baseAddress later on.
And for my table of physical addresses, I create a ConstPool, which I can also fill out later on.
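Applying such a table at load time could look roughly like the sketch below. `Reloc` is a deliberately simplified, hypothetical record with a single relocation kind (absolute address = base + payload); asmjit's own RelocEntry carries more fields and more relocation types, so treat this as an outline of the idea only.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified relocation record (not asmjit's RelocEntry).
struct Reloc {
  std::size_t offset;     // where in the code buffer the value is stored
  std::size_t size;       // width of the stored value in bytes (4 or 8)
  std::uint64_t payload;  // target, expressed as an offset from the code base
};

// Rewrites every relocated value so that it is valid for code loaded at
// `base`. Values are written in host byte order. Returns false on an
// out-of-bounds record, an unsupported width, or a value that does not fit.
bool applyRelocs(std::vector<std::uint8_t>& code,
                 const std::vector<Reloc>& relocs,
                 std::uint64_t base) {
  for (const Reloc& r : relocs) {
    if ((r.size != 4 && r.size != 8) ||
        r.offset > code.size() || r.size > code.size() - r.offset)
      return false;

    std::uint64_t value = base + r.payload;
    if (r.size == 4) {
      if (value > 0xFFFFFFFFu)
        return false;  // does not fit a 32-bit slot
      std::uint32_t v32 = static_cast<std::uint32_t>(value);
      std::memcpy(code.data() + r.offset, &v32, sizeof(v32));
    } else {
      std::memcpy(code.data() + r.offset, &value, sizeof(value));
    }
  }
  return true;
}
```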
StephanJBI
@StephanJBI
Should the library take care of splitting a64 cmp(var, Imm(value)) into two instructions if the immediate value exceeds 4095, and of handling negative values, or should this be done by the user of the library?
Petr Kobalicek
@kobalicek
In general this should be handled by users
The reason is that sometimes it's not trivial to do this with arbitrary imm
and you would need a register
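A user-side check could be sketched like this. It relies on the AArch64 rule that CMP/CMN immediates are 12-bit unsigned values, optionally shifted left by 12, so some values above 4095 (multiples of 4096 up to 0xFFF000) are still encodable. `CmpForm`, `isAddSubImm`, and `classifyCmpImm` are hypothetical helper names, not part of asmjit.

```cpp
#include <cstdint>

// Hypothetical classification of how a compare against `value` can be emitted.
enum class CmpForm {
  CmpImm,       // cmp xN, #imm12 {, lsl #12}
  CmnImm,       // cmn xN, #imm12 {, lsl #12}, comparing against the negation
  NeedsScratch  // load the constant into a register, then cmp xN, xM
};

// True when `value` fits the 12-bit unsigned immediate used by ADD/SUB-class
// instructions, either as-is or shifted left by 12.
inline bool isAddSubImm(std::uint64_t value) {
  return (value & ~std::uint64_t(0xFFF)) == 0 ||
         (value & ~(std::uint64_t(0xFFF) << 12)) == 0;
}

inline CmpForm classifyCmpImm(std::int64_t value) {
  if (value >= 0)
    return isAddSubImm(static_cast<std::uint64_t>(value)) ? CmpForm::CmpImm
                                                          : CmpForm::NeedsScratch;

  // Negate in unsigned arithmetic so INT64_MIN does not overflow.
  std::uint64_t negated = std::uint64_t(0) - static_cast<std::uint64_t>(value);
  return isAddSubImm(negated) ? CmpForm::CmnImm : CmpForm::NeedsScratch;
}
```

The `NeedsScratch` case is the one that is hard to automate inside an assembler, since it needs a free register that only the caller knows about.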
Lukas Gerlach
@s8lvg
@kobalicek Thanks for your swift reply. That works perfectly.
Michael McLoughlin
@mmcloughlin

> @mmcloughlin It's a hand made database. The problem of importing third party data is that there is sometimes missing stuff that asmjit needs.

Yeah, I see that. Presumably though when you say "hand made" you actually mean "derived from one or more third-party sources with ad hoc scripts and partial automation"? Could you elaborate a bit more?

Petr Kobalicek
@kobalicek

Well, not even that. I just downloaded the ARM Architecture Reference Manual and created the DB from each instruction listed in that manual (well, except SVE and some others). So I really added each instruction manually. It's okay for me, as I get an overview of the architecture and the patterns in encodings even before I start writing the assembler.

BTW even assemblers in LLVM are not fully automated - they use "tablegen" and have a specific language designed to generate tables and code.

I actually have a branch in which there is an ARM32 assembler that is fully generated.
The problem is, however, that instruction tables provided by vendors are for humans, not machines. So when you look at AArch32 and AArch64, for example, you will notice that there are basically expressions that transform immediates into opcode fields, sometimes fairly complicated functions, actually. It's very hard to get this from the various instruction tables. So instead of trying to do this, I just store something like "ShiftImm(x)" in the instruction DB, which describes the extra logic needed to do the transformation, and this logic is implemented in C++ code instead of being described in the DB.
Petr Kobalicek
@kobalicek
What I would like to achieve in the future is to have assemblers for all architectures generated from the DB, and to ship the DB with asmjit instead of having it in another repo.
I would even sacrifice some space to do this. If the generated assembler compiled to 2kB bigger code I would be fine with that, because today the x86 assembler is essentially hand-coded, and it's simply impossible to write thousands of lines without bugs, so historically there were bugs. Not to mention new instructions that require specific logic to encode them, etc.
Michael McLoughlin
@mmcloughlin
Yeah, I know LLVM is done differently. And I think it's fixed now but in the early days it was buggy as a result https://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf
"vendors are for humans, not machines" I mean, my understanding was that the XML distribution together with a spec for ASL "Architecture Specification Language" was intended to make the spec consumable by machines
Context is I'm dipping my toe in the water of adding arm support to my avo project.
Initially I naively thought it would be easy to generate a DB from the ARM XML. I'm learning it's far from trivial
Michael McLoughlin
@mmcloughlin
It's so frustrating. It seems pretty clear that there's some "source of truth" that ARM has internally, and they're generating derived materials from it (the XML). The derived materials are still awkward to use for tools developers though.
Anyway, I appreciate you sharing your experience. We'll see how far I get before giving up :sweat_smile:
Michael McLoughlin
@mmcloughlin
Screenshot from 2022-05-18 19-08-34.png
^ from the CL that added the arm64 disassembler to golang.org/x/arch, including a 2.5k+ line function with a massive switch statement
Michael McLoughlin
@mmcloughlin
Hahaha I went into this a couple days ago with so much hubris
Petr Kobalicek
@kobalicek
BTW if you wanna check out the incomplete ARM32 support, you can look here:

https://github.com/asmjit/asmjit/tree/abi_2_0

The instruction db is here:

https://github.com/asmjit/asmjit/blob/abi_2_0/db/armdata.js

and is similar to asmdb, but I have reorganized it a bit:

    {"inst": "vsri.x8-64 Dx, Dn, #n"                                   , "t32": "1111|11111|Vx'|imm[5:0]|Vx|0100|imm[6]|0|Vn'|1|Vn"      , "ext": "ASIMD", "imm": "VecShiftNImm(sz, n)"},

So this is an instruction, very similar to the ARM definition, but I have added more metadata regarding encoding, like "imm": "VecShiftNImm(sz, n)". This is an inline function in the assembler that is responsible for verifying the immediate and encoding it into an opcode field. This helps because the assembler generator doesn't have to be that smart: it just processes the fields and calls utility functions where necessary. Theoretically it would also work for a disassembler (the transformation would be just the opposite).
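To make the "verify and encode" role concrete, here is a guess at what such a helper might compute for a right-shift immediate like the one VSRI takes. It assumes the A32/T32 Advanced SIMD convention in which the 7-bit field (imm[6] followed by imm[5:0]) holds 2 * elementBits - shift. `encodeVecShiftRightImm` and `ShiftFields` are hypothetical names; the real VecShiftNImm in the branch may differ.

```cpp
#include <cstdint>

// The two opcode fields named in the encoding string above.
struct ShiftFields {
  std::uint32_t imm6;  // imm[5:0]
  std::uint32_t l;     // imm[6]
};

// Hypothetical sketch: validates a right-shift amount for the given element
// width and splits the combined 7-bit value into its two opcode fields.
// Returns false when the element width or the shift is not encodable.
inline bool encodeVecShiftRightImm(unsigned elementBits, unsigned shift,
                                   ShiftFields& out) {
  if (elementBits != 8 && elementBits != 16 &&
      elementBits != 32 && elementBits != 64)
    return false;
  if (shift < 1 || shift > elementBits)
    return false;

  // The leading one bit of the field marks the element size and the
  // remaining bits count down from the maximum shift.
  std::uint32_t imm7 = 2 * elementBits - shift;
  out.imm6 = imm7 & 0x3F;
  out.l = (imm7 >> 6) & 1;
  return true;
}
```

A disassembler would run the same relation backwards: find the element size from the leading one bit, then recover the shift as 2 * elementBits - imm7.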

What I have done differently compared to manuals is specifying multiple data types in a single instruction, where possible, so .x8-64 is essentially saying it's s8, s16, s32, s64, and u8, u16, u32, u64 datatypes
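A generator consuming the DB would need to expand that shorthand. A minimal sketch, assuming only the form described here ("x" followed by a power-of-two range, meaning both signed and unsigned types); `expandDataTypes` is a hypothetical helper and any other prefixes the DB may use are not handled.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: expands a suffix such as "x8-64" into
// {"s8", "s16", "s32", "s64", "u8", "u16", "u32", "u64"}.
// Returns an empty list for anything that is not of that form.
std::vector<std::string> expandDataTypes(const std::string& spec) {
  std::vector<std::string> out;
  std::size_t dash = spec.find('-');
  if (spec.size() < 4 || spec[0] != 'x' || dash == std::string::npos)
    return out;

  int lo = 0, hi = 0;
  try {
    lo = std::stoi(spec.substr(1, dash - 1));
    hi = std::stoi(spec.substr(dash + 1));
  } catch (const std::exception&) {
    return out;  // non-numeric bounds
  }
  if (lo < 8 || hi > 64)
    return out;

  for (char sign : {'s', 'u'})
    for (int bits = lo; bits <= hi; bits *= 2)
      out.push_back(std::string(1, sign) + std::to_string(bits));
  return out;
}
```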
What I'm missing is better support for memory operands and relative displacements
It has almost 10k lines, yeah, not really that compact I think. But a few thousand lines more, if they are generated, is okay, as this saves weeks of development, and as a bonus it matches the DB: no bugs in encoding so far.
What is scary about ARM is that it's essentially 3 architectures: ARM, Thumb, and AArch64. ARM and Thumb have some similarities, but also a lot of painful areas, like IT instructions in Thumb, different restrictions on immediates, etc.