Rules to avoid common extended inline assembly mistakes

59
15
ingve
21 hours ago
nullprogram.com

wyldfire · 13 hours ago

> Because it’s so treacherous, the first rule is to avoid it if at all possible. Modern compilers are loaded with intrinsics and built-ins that replace nearly all the old inline assembly use cases.

If you take away anything from this article, it should be at least this. Intrinsics/builtins should be your first approach. Only use inline assembly if you can't express what you need using intrinsics.
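
As a minimal sketch of the builtins-first approach, assuming GCC or Clang (the wrapper names `count_bits` and `swap_bytes` are made up for illustration), two operations that used to be common reasons for inline assembly can be written with no asm at all:

```c
#include <stdint.h>

/* Hypothetical wrappers around GCC/Clang builtins. The compiler picks the
 * best instruction for the target (for example POPCNT or BSWAP on x86-64
 * when available) and can still constant-fold and schedule around them. */

/* Number of set bits in x. */
static inline int count_bits(uint64_t x)
{
    return __builtin_popcountll(x);
}

/* Reverse the byte order of a 32-bit value. */
static inline uint32_t swap_bytes(uint32_t x)
{
    return __builtin_bswap32(x);
}
```

There are no constraints or clobbers to get wrong here, which is the main argument for reaching for builtins before inline assembly.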

bjackman · 5 hours ago

I have a fun exception to this!

When writing BPF programs, sometimes it's tricky to get the compiler to generate code that passes the verifier. This can lead you down a path of writing bizarre C in order to try and produce the right order of checks and register allocations.

So, use inline asm!

It's not portable any more... But it's a BPF program! There's nothing to port it to.

It's less readable... Wait no, it's _more_ readable because BPF asm has good syntax and you can avoid the bizarre C backflips to satisfy the verifier.

It's unsafe... Wait no, it's checked for safety at load time!

ryandrake · 2 hours ago

For the curious, BPF in this case probably means Berkeley Packet Filter [1]. I'm not sure; it's a fairly obscure acronym, but a Google search seems to show a consensus.

1: https://en.wikipedia.org/wiki/Berkeley_Packet_Filter

wyldfire · 2 hours ago

It does. These days, it's probably eBPF, and a popular target is the Linux kernel [1].

You can write C hooks for tracing and profiling, etc. - with inline asm!

[1] https://docs.kernel.org/bpf/

fuhsnn · 10 hours ago

One of my earliest surprises is: input-only and output-only may be mapped to the same register, and explicitly mapping one of them will not prevent this: https://godbolt.org/z/bo3r749Ge
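
A small sketch of this hazard, not taken from the linked Godbolt example and assuming GCC or Clang on x86-64 (with a plain C fallback elsewhere). The hypothetical helper `twice_plus` writes its output before it has read its last input, so the output needs the early-clobber modifier `&`. Without it, the compiler is free to place `out` and `b` in the same register, and the first instruction would destroy `b`.

```c
#include <stdint.h>

/* Hypothetical helper: returns 2*a + b using two instructions. */
static inline uint64_t twice_plus(uint64_t a, uint64_t b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    uint64_t out;
    __asm__ (
        "lea (%1,%1), %0\n\t"  /* out = a + a, written while b is still unread */
        "add %2, %0"           /* out += b */
        : "=&r"(out)           /* '&': out must not share a register with any input */
        : "r"(a), "r"(b)
        : "cc");               /* add modifies the flags */
    return out;
#else
    /* Portable fallback so the sketch builds on other targets. */
    return 2 * a + b;
#endif
}
```

Dropping the `&` can still appear to work, because the register allocator may happen to choose distinct registers. That is what makes this kind of mistake easy to miss.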

tom_ · 14 hours ago

It's always been a mystery to me why people put up with this stuff. Adding strings to the assembler output is fine if you want to assemble some unsupported instruction, and it's a useful get-out clause. But as the only option, it sucks, and it's no fun if you want to insert more than one instruction.

I used CodeWarrior for PowerPC about 15 years ago, and its inline assembler was very easy to use. No markup required, and you didn't even have to really understand the ABI. Write a C function, add "register" to the parameters, put register variables inside the function, add an asm block inside it, then do your worst. It'd track variable liveness to allocate registers, and rearrange instructions to lengthen dependency chains. Any problems, you'd get an error. Very nice.

throwaway_1224 · 5 hours ago

+1, ^5, ditto and Amen for CodeWarrior and its inline asm. CW was way ahead of its time in terms of UX. Its C++ compiler was well above average too, particularly in terms of codegen (although all C++ compilers were effectively broken in that era.)

The only thing that held it back was the lack of scripting. It was probably a rebound rejection of the MPW days, when everything was script-based (and with a crazy custom language.) I remember thinking that the design team probably didn't want to open that Pandora's box, lest scripting might lazily become required and spoil the UX.

Unfortunately, this made CW unsuited to the advent of CI. Even then, I still think it was stupid for Apple not to acquire Metrowerks. The first 5-10 years of Xcode versions had a worse UX and way worse codegen.

kccqzy · 13 hours ago

I haven't used CodeWarrior for PowerPC, but that approach sounds like it requires the C compiler to understand the assembler instructions you are using. Does it? Most use cases of inline assembler I've seen these days are for instructions that the compiler will not emit.

Conscat · 12 hours ago

Raw multi-line R"()" strings in C++ reduce some of the tedium. I wrote myself an Emacs tree sitter pattern to highlight asm syntax nicer than a string normally would, which helps. There is also the stasm library (which I haven't used) that looks like a pleasant syntax. https://github.com/stasinek/stasm

Clang (but not GCC) also supports the MSVC assembly syntax, which is derived from Borland inline assembly. Unlike MSVC, Clang supports it in 64-bit mode and also for ARM.

astrange · 11 hours ago

Most of the time I've used inline assembly it's because the compiler was optimizing something badly. I don't want it to rearrange anything.

(Scheduling is almost useless on modern desktop CPUs anyway, except for some instruction fusion patterns.)

mst · 5 hours ago

> Despite this, please use volatile anyway! When I do not see volatile it’s likely a defect. Stopping to consider if it’s this special case slows understanding and impedes code review.

There are quite a few things I reflexively write where I know that in the specific case I don't actually need to do that, but also know that it'll make the code easier to skim read later.

I hate having my skimming interrupted by "is this the case where this is safe?" type situations.

One can, of course, overdo it - and to what extent it's worth doing it depends on who you expect to be reading it - but it can often be quite handy.

A concrete example:

    this.attrs = { style: {}, ...attrs }

will work fine if the 'attrs' variable is null, because ...<null> is not an error in JavaScript and expands to nothing ... but I'll still often write

    this.attrs = { style: {}, ...(attrs??{}) }

instead because it makes it clear that the code is allowing the null case deliberately and because it means the person reading it doesn't have to remember that the null case would've worked anyway (and also because my brain finds it weird that the null case does work so it often makes me pause even though I well know it's fine once I stop and think for a second).

smitelli · 2 hours ago

Is that vanilla JavaScript or TypeScript? I had always thought that one of the main benefits of TS is that it would probably yell about the first case. (I currently only dabble in the JS world.)

zdragnar · 8 minutes ago

The spread syntax is native to JavaScript. TS wouldn't complain about the first case, because as the parent said, it is a valid operation.

TS only complains about valid operations if there's some potential mistake due to ambiguity, usually when relying on to-string conversions such as adding a string and an array together or some other nonsense.

nubinetwork · 4 hours ago

If I'm using assembly, the entire project is assembly... granted, I don't do any low level programming on modern hardware (anything newer than 586)...

brigade · 9 hours ago

There aren’t many reasons to write an inline asm block that the compiler will elide because of no apparent effects; more likely you screwed up the constraints. If it’s due to ensuring correct memory accesses relative to the compiler, it’s usually better to define appropriate “m” constraints to give the compiler appropriate visibility, or if it’s complex/loopy enough to make that impossible then that is what the “memory” clobber is for, not volatile.

So I strongly disagree with 2 and 3.
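
For illustration, a minimal sketch of the "m" constraint approach described above, assuming GCC or Clang on x86-64 (with a plain C fallback elsewhere). `store32` is a hypothetical helper: the "=m" output tells the compiler exactly which object the asm writes, so later reads of that object are ordered after the asm without `volatile` or a "memory" clobber.

```c
#include <stdint.h>

/* Hypothetical helper: store value into *dst from inside an asm block. */
static inline void store32(uint32_t *dst, uint32_t value)
{
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ (
        "movl %1, %0"     /* *dst = value */
        : "=m"(*dst)      /* output: the asm writes this exact memory object */
        : "r"(value));    /* input: any general-purpose register */
#else
    /* Portable fallback so the sketch builds on other targets. */
    *dst = value;
#endif
}
```

Because `*dst` is a declared output, the compiler keeps the asm whenever the stored value is later read, and it is free to drop it when the value is provably never used, which is ordinary dead-store behavior rather than a bug.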