>A will to live (optional but recommended)
>LLVM is NOT required. BarraCUDA does its own instruction encoding like an adult.
>Open an issue if there's anything you want to discuss. Or don't. I'm not your mum.
>Based in New Zealand
The Oceanian sense of humor is like no other, haha.
The project owner strongly emphasizes the no-LLM dependency; in a world of AI slope this is so refreshing.
The sheer amount of knowledge required to even start such a project is really something, and proving the manual wrong at the machine-language level is something else entirely.
When it comes to AMD, "no CUDA support" is the biggest "excuse" to join NVIDIA's walled garden.
Godspeed to this project; the more competition there is, the less NVIDIA can continue destroying PC parts pricing.
The project owner is talking about LLVM, a compiler toolkit, not an LLM.
A very small total number of commits, plus AI-like documentation and code comments.
But even if LLMs were used, the overall project does feel steered by a human, given some decisions like not using bloated build systems. If this actually works then that's great.
Most of my free-time projects are developed in one of two ways: either I shoot the shit with code on disk for a couple of months until it's in a working state and then make one first commit, or I commit a bunch iteratively but fold it all into one commit before making it public, which becomes the init. 20K lines in the initial commit is not that uncommon; it depends a lot on the type of project though.
I'm sure I'm not alone with this sort of workflow(s).
A lot of people in this thread have argued for squashing, but I don't see why one would do that for a personal project. In large-scale open-source or corporate projects I can imagine they would like to have clean commit histories, but why for a personal project?
Reasoning: skip SCM 'cost' by not making commits I'd squash and ignore, anyway. The project lifetime and iteration loop are both short enough that I don't need history, bisection, or redundancy. Yet.
Point being... priorities vary. Not to make a judgement here, I just don't think the number of commits makes for a very good LLM purity test.
Enshrining "end of day commits", "oh, that didn't work" mistakes, etc is not only demoralizing for the developer(s), but it makes tracing changes all but impossible.
I guess this is the difference: I expect a commit to represent a somewhat working version, at least once it's upstream; locally it doesn't matter that much.
> Why do this, what is the advantage?
Cleaner, I suppose. It doesn't make sense to have 10 commits where 9 are broken and half-finished and the 10th is the only one that works; I'd just rather have one larger commit.
> they would like to have clean commit histories but why for a personal project?
Not sure why it'd matter if it's personal, open source, corporate or anything else; I want my git log clean so I can do `git log --oneline` and actually understand what I'm seeing. If there are 4-5 commits saying "WIP almost working" between each proper commit, then that's too much noise for me, personally.
But this isn't something I'm dictating everyone to follow, just my personal preference after all.
Yep, no excuse for this, feature branches exist for this very reason. wip commits -> git rebase -i master -> profit
On a solo project I do the opposite: I make sure there is an error where I stopped last. Typically I put in a call to the function that is needed next, so I get a linker error.
Six months later, when I go back to the project, that link error tells me all I need to know about what comes next.
From that point bisect is not needed.
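A minimal sketch of that trick, with made-up names (the function here is hypothetical, not from any real project): declare the function you need next, call it, and never define it. Everything compiles, the link step fails, and the error message is effectively a compiler-enforced TODO.

```cuda
// next_step.cu - "linker error as a to-do note" sketch (hypothetical names).
#include <cstdio>

// Declared but intentionally never defined: the next thing to implement.
int emit_codegen_for_kernel(const char *kernel_name);

int main() {
    std::printf("lexer, parser, and IR lowering all work\n");
    // Last edit before stopping: wire in the next stage. The build now fails
    // at link time with "undefined reference to `emit_codegen_for_kernel'",
    // which is the reminder of exactly where to pick up months later.
    return emit_codegen_for_kernel("saxpy");
}
```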
Further, I guess if the author is expecting contributions to the code in the future, it might be more "professional" for the commits to be only the ones which are relevant.
I consider my own projects to be just for my own learning and understanding, so I never cared about this, but I do see the point now.
Regardless, I think it still remains a reasonable sign of someone doing one-shot agent-driven code generation.
> Regardless, I think it still remains a reasonable sign of someone doing one-shot agent-driven code generation.
Yeah, why change your perception in the face of new evidence? :)
Regarding changing the perception, I think you did not understand the underlying distrust. I will try to use your examples.
It's a moderate-size project. There are two scenarios: the author used git/some VCS, or they did not. If they did not use it, that's quite weird, but maybe fine. If they did use git, then perhaps they squashed commits. But at a certain point those commits did exist. Let's assume all these commits were pristine. It's 16K LOC, so there must be a decent number of these pristine commits that were squashed. But what was the harm in leaving them?
So the history must have contained both clean commits and broken commits. But we have seen this author likes to squash commits. Hmm, so why didn't they do it before, rather than only towards the end?
Yes, I have been introduced to a new perception, but the world does not work on "if X, then not Y" principles. And this is a case where the two things being discussed are not mutually exclusive, as you are assuming. But I appreciate this conversation, because I learnt the importance and advantages of keeping a clean commit history, and I will take that into account next time before reaching the conclusion that something is just another one-shot LLM-generated project. Nevertheless, I will always consider the latter a reasonable possibility.
I hope the nuance is clear.
For example, when the commits were made. I would not like to share publicly with the whole world when I have worked on some project of mine. The commits themselves, or the commit messages, could also contain something that you don't want to share.
At least I approach stuff differently depending on whether I am sharing it with the whole world, keeping it to myself, or sharing it with people I trust.
Scrubbing git history when going from private to public should be seen as totally normal.
For me it's quite funny to sometimes read my older commit messages. To each their own.
But my opinion on this is the same as with other things that have become tell-tale signs of AI-generated content: if something you used to do starts getting questioned as AI-generated, it's better to change that approach, assuming you find being labelled as AI-generated offensive.
People are too keen to say something was produced with an LLM if they feel it's something they couldn't readily produce themselves.
Don't care though. AI can work wonders in skilled hands, and I'm looking forward to using this project.
I do use LLMs (specifically Ollama), particularly for test summarisation and writing up some boilerplate, and I've also used Claude/ChatGPT on the web when my free tier allows. It's good for when I hit problems such as AMD SOP prefixes being different than I expected.
It looks like a project made by a human and I mean that in a good way.
Reminded me of the beached whale animated shorts[1].
[1]: https://www.youtube.com/watch?v=ezJG0QrkCTA&list=PLeKsajfbDp...
https://github.com/Zaneham/BarraCUDA/blob/master/src/lexer.c...
> The project owner strongly emphasizes the no-LLM dependency; in a world of AI slope this is so refreshing.
"Has tech literacy deserted the tech insider websites of silicon valley? I will not beleove it is so. ARE THERE NO TRUE ENGINEERS AMONG YOU?!"
This is not an advantage since you will now not benefit from any improvements in LLVM.
ZLUDA used LLVM and ended up bundling a patched version to achieve what they wanted: https://vosen.github.io/ZLUDA/blog/zluda-update-q4-2025/
> Although we strive to emit the best possible LLVM bitcode, the ZLUDA compiler simply is not an optimizing, SSA-based compiler. There are certain optimizations relevant to machine learning workloads that are beyond our reach without custom LLVM optimization passes.
(except that it applies to Zluda, not necessarily this project)
The scientific term for this is “gradient descent”.
Ah I'm glad it's just optional, I was concerned for a second.
First, it's a porting kit, not a compatibility layer, so you can't run arbitrary CUDA apps on AMD GPUs. Second, it only runs on some of their GPUs.
This absolutely does not solve the problem.
GPU drivers, Adrenalin, Windows chipset drivers...
How many generations into the Ryzen platform are they, and they still can't get USB to work properly all the time?
You have to port every piece of software you want to use. It's ridiculous to call this a solution.
If you want to fight against the Nvidia monopoly, then don't just rant; buy a GPU other than Nvidia and build on it. Check my GitHub and you'll see what I'm doing.
You don't understand what HIP is. HIP is AMD's runtime API; it resembles the CUDA runtime API but it's not the same thing, and it doesn't need to be - the hard part of porting CUDA isn't the runtime APIs. hipify is the tool that translates both runtime calls and kernels. Now, is hipify a drop-in replacement? No, of course not, but that's because the two vendors have different architectures. So it's absolutely laughable to imagine that some random person could come anywhere near a "drop-in replacement" when AMD can't (again: because of fundamental architecture differences).
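To make the hipify point concrete, here is a rough sketch of the kind of mechanical renaming involved on a toy CUDA snippet (the kernel is made up, data initialisation is omitted for brevity, and the HIP names in the comments reflect the well-known cudaX-to-hipX correspondence rather than literal output from the tool):

```cuda
// saxpy.cu - toy CUDA example; comments note the HIP names a hipify pass
// would substitute when porting to AMD.
#include <cuda_runtime.h>

__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // same builtins exist in HIP
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMalloc((void **)&x, n * sizeof(float));      // -> hipMalloc
    cudaMalloc((void **)&y, n * sizeof(float));      // -> hipMalloc
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);  // hipcc keeps the launch syntax
    cudaDeviceSynchronize();                         // -> hipDeviceSynchronize
    cudaFree(x);                                     // -> hipFree
    cudaFree(y);
    return 0;
}
```

The API surface translates almost mechanically; what doesn't translate is everything the kernel implicitly assumes about the hardware (wavefront width, shared memory and cache behaviour, occupancy), which is the architecture gap being pointed at above.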
AMD's best option is a greenfield GPU architecture that puts CUDA in the crosshairs, which is what they already did for datacenter customers with AMD Instinct.
...but that requires buy-in from the rest of the industry, and it's doubtful FAANG is willing to thread that needle together. Nvidia's hedged bet against industry-wide cooperation is making Jensen the 21st century Mansa Musa.
Let's say you put 50-100 seasoned devs on the problem; within 2-3 years you'd probably get ZLUDA to the point where most mainstream CUDA applications (ML training/inference, scientific computing, rendering) run correctly on AMD hardware at 70-80% of the performance you'd get from a native ROCm port. Even if it's not optimal due to hardware differences, it would be genuinely transformative and commercially valuable.
This would give them runway for their parallel effort to build native greenfield libraries and toolkits and get adoption, and perhaps make some tweaks to future hardware iterations that make compatibility easier.
And while compatibility layers aren't illegal, they ordinarily have to be a cleanroom design. If AMD knew that the ZLUDA dev was decompiling CUDA drivers to reverse-engineer a translation layer, then legally they would be on very thin ice.
Those billions are much better-off being spent on new hardware designs, and ROCm integrations with preexisting projects that make sense. Translating CUDA to AMD hardware would only advertise why Nvidia is worth so much.
> it would be genuinely transformative and commercially valuable.
Bullshit. If I had a dime for every time someone told me "my favorite raster GPU will annihilate CUDA eventually!" then I could fund the next Nvidia competitor out of pocket. Apple didn't do it, Intel didn't do it, and AMD has tried three separate times and failed. This time isn't any different, there's no genuine transformation or commercial value to unlock with outdated raster-focused designs.
> invest BILLIONS to make this happen
As I have already said twice, they already have, it's called hipify and it works as well as you'd imagine it could (ie poorly because this is a dumb idea).
HIP has been dismissed for years because it was a token effort at best. Linux only until the last year or two, and even now it only supports a small number of their cards.
Meanwhile CUDA runs on damn near anything, and both Linux and Windows.
Also, have you used AMD drivers on Windows? They can't seem to write drivers or Windows software to save their lives. AMD Adrenalin is a slow, buggy mess.
Did I mention that compute performance on AMD cards was dogshit until the last generation or so of GPUs?
Andrzej Janik.
He started working on it at Intel; they passed because there was no business case there.
AMD picked it up and funded it from 2022. They stopped in 2024, but his contract allowed the release of the software in such an event.
Now it's ZLUDA.
So the strategy of publishing independently and waiting to see if Nvidia's lawyers have anything to say about it would be a very smart move.
Huh? This is obvious AI slop from the readme. Look at that "ASCII art" diagram with the misaligned "|" at the end of the lines. That's a very clear AI-slop tell; anyone editing by hand would instinctively delete the extra spaces to align those.
Didn't realise this was posted here (again, lol), but where I originally posted, on the r/Compilers subreddit, I do mention that I used ChatGPT to generate some ASCII art for me. I was tired, it was 12am, and I then had to spend another few minutes deleting all the emojis it threw in there.
I've also been open about my AI use with people who know me and who I work with in the OSS space. I have a lil Ollama model that helps me from time to time, especially with test result summaries (if you've ever seen what happens when a mainframe emulator explodes on a NIST test you'd want AI too lol, 10k lines of individual errors ain't fun to walk through), and you can even see some ChatGPT-generated CUDA in notgpt.cu, which I mixed and mashed a little bit. All in all, I'm of the opinion that this is perfectly acceptable use of AI.
Another obvious tell.
https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing#...
https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing#...
This is also just what I intentionally avoided when making this by the way. I don't really know how else to phrase this because LLVM and HIP are quite prolific in the compiler/GPU world it seems.
Code/doc generators are just another tool. A carpenter uses power tools to cut or drill things quickly, instead of screwing everything manually. That doesn't mean they're doing a sloppy job, because they're still going to obsessively pore over every detail of the finished product. A sloppy carpenter will be sloppy even without power tools.
So yeah, I don't think it's worth spending extra effort to please random HN commenters, because the people who face the problem that you're trying to solve will find it valuable regardless. An errant bold or pipe symbol doesn't matter to people who actually need what you're building.
I keep hoping that low-effort comments like these will eventually get downvoted (because it's official HN policy). I get that it's fashionable to call things AI slop, but please put some effort into reading the code and making an informed judgment.
It's really demeaning to call someone's hard work "AI slop".
What you're implying is that the quality of the work is poor. Did you actually read the code? Do you think the author didn't obsessively spend time over the code? Do you have specific examples to justify calling this sloppy? Besides a misaligned "|" symbol?
And I doubt you even read anything because the author never talked about LLMs in the first place.
My beef isn't with you personally, it's with this almost auto-generated trend of comments on HN calling everyone's work "AI slop". One might say, low-effort comments like these are arguably "AI slop", because you could've generated them using GPT-2 (or even simple if-conditionals).
One of the reasons that slop gets such an immediate knee-jerk reaction is that it has become so prolific online. It is really hard to read any programming message board without someone posting something half-baked, entirely generated by Claude, and asking you to spend more effort critiquing it than they ever did prompting for it.
I glanced through the code, but I will admit that the slop in the README put me off digging into it too deeply. It looked like even if it was human written, it's a very early days project.
Yeah, calling something slop is low effort. It's part of a defense mechanism against slop; it helps other folks evaluate if they want to spend the time to look at it. It's an imperfect metric, especially judging if it's slop based only on the README, but it's gotten really hard to participate in good faith in programming discussions when so many people just push stuff straight out of Claude without looking at it and then expect you to do so.
> It's really demeaning to call someone's hard work "AI slop".
I agree. I browsed through some files and found AI-like comments in the code. The readme and several other places have AI-like writing. Regarding the author not spending time on this project: this is presumably a 16k LOC project that was committed in a single commit two days ago, so the author never committed any draft/dev version in that time. I find that quite hard to believe. Again, my opinion is that LLMs were used, not that the code is slop. It may be. It may not be.
Yes, this whole comment chain is the top comment misreading LLVM as LLMs, which is hilarious.
> My beef isn't with you personally, it's with this almost auto-generated trend of comments on HN calling everyone's work "AI slop".
Now, this isn't necessarily about this particular project, but if you post something on a public forum for reactions, then you are asking for the time of the people who will read and interact with it. So if they encounter something that the original author did not even bother to write, why should they read it? You're seeing many comments like that because there's just a lot of slop like that. And I think people should continue calling that out.
Again, this project specifically may or may not be slop. So here the reactions are a bit too strong.
I'm self-taught in this field. I was posting on r/Compilers and shared this around with some friends who work in this space, for genuine critique. I've been very upfront with people about where I use LLMs. It's actually getting a bit "too much" with the overwhelming attention.
Regarding the writing style, it's unfortunate that LLMs have claimed a lot of writing styles from us. My personal opinion is to avoid using these AI-isms but I completely get that for people who wrote like that from the start, it's quite annoying that their own writing is now just labelled as LLM generated content.
It's quite common to work locally and publish a "finished" version (even if you use source control). The reasons can vary, but I highly doubt that Google wrote Tilt Brush in 3 commits - https://github.com/googlevr/tilt-brush
All I'm saying is that assuming everyone one-shots code (and insulting them, like people do on HN) is unnecessary. I'm not referring to you, but it's quite a common pattern now, counter to HN's commenting guidelines.
> found AI-like comments in the code
Sure, but respectfully, so what? Like I posted in a [separate comment](https://news.ycombinator.com/item?id=47057690), code generators are like power tools. You don't call a carpenter sloppy because they use power tools to drill or cut things. A sloppy carpenter will be sloppy regardless, and a good carpenter will obsess over every detail even if they use power tools. A good carpenter doesn't need to prove their worth by screwing in every screw by hand, even if they can. :)
In some cases, code generators are like sticks of dynamite - they help blow open large blocks of the mountain in one shot, which can then be worked on and refined over time.
The basic assumption that annoys me is that anyone who uses AI to generate code is incompetent and that their work is of poor quality. Because that assumes that people just one-shot the entire codebase and release it. An experienced developer will mercilessly edit code (whether written by an AI or by a human intern) until it fits the overall quality and sensibility. And large projects have tons of modules in them; it's sub-optimal to one-shot them all at once.
For example, with tests: I've written enough tests in my life that I don't need to type every character from scratch each time. I list the test scenarios, hit generate, and then mercilessly edit the output. The final output is exactly what I would've written anyway, but I'm done with it faster. Power tool. The final output is still my responsibility, and I obsessively review every character that's shipped in the finished product - that is my responsibility.
Sure plenty of people one-shot stuff, just like plenty of Unity games are asset flips, and plenty of YouTube videos are just low-effort slop.
But assuming everything that used AI is crap is just really tiring. Like [another commenter said](https://news.ycombinator.com/item?id=47054951), it's about skilled hands.
> something that the original author did not even bother to write
Again, this is an assumption. If I give someone bullet points (the actual meat of the content) and they put them into sentences, do the sentences not reflect my actual content? And is the assumption that the author didn't read what was finally written and edit it until it reflected the exact intent?
In this case, the author says they used AI to generate the ASCII art in question. How does that automatically mean that the author AI-generated the entire readme, let alone the entire project? I agree, the knee-jerk reactions are way out of proportion.
Where do you draw the line? Will you not use grammar tools now? Will you not use translation tools (to translate to another language) in order to communicate with a foreign person? Will that person argue back that "you" didn't write the text, so they won't bother to read it?
Should we stop using Doxygen for generating documentation from code (because we didn't bother with building a nice website ourselves)?
Put simply, I don't understand the sudden obsession with hammering every nail and pressing every comma by hand, when we're clearly okay with other tools that do that.
Should we start writing assembly code by hand now? :)
Also I can see you and I both agree that it's disingenuous to call all LLM generated content slop. I think slop has just become a provocative buzzword at this point.
Regarding drawing the line: in the end, it comes down to the person using the tools. What others think, as these tools become more and more pervasive, will become irrelevant. If you as a person outsource your thinking, then it's you who will suffer.
In all my comments, I personally never used the word slop for this project, but I maintained that LLMs were used significantly. I still think that. Your other comparison of LLMs with things like Doxygen or translation tools is puzzling to me. Also, the points about hammering every nail and every comma are just strawmen. 5-6 years ago people used those things and nobody had any issues. There's a reason why people dislike LLM use, though. If you cannot understand why it frustrates people, then I don't know what to say.
Also people do write assembly by hand when it is required.
Using a code generator != outsourcing your thinking. I know that's the popular opinion, and yes, you can use it that way. But if you do that, I agree you'll suffer. It'll make sub-optimal design decisions, and produce bloated code.
But you can use code generators and still be the one doing the thinking and making the decisions in the end. And maintain dictatorial control over the final code. It just depends on how you use it.
In many ways, it's like being a tech lead. If you outsource your thinking, you won't last very long.
It's a tool, you're the one wielding it, and it takes time, skill and experience to use it effectively.
I don't really have much more to say. I just spoke up because someone who built something cool was getting beat up unnecessarily, and I've seen this happen on HN way too many times recently. I wasn't pointing fingers at you at any point, I'm glad to have had this discussion :)
I think as a human I am significantly more likely to give up on senseless pixelpushing like this than an LLM.
I'll be the party pooper here, I guess. The manual is still right, and no amount of reverse-engineering will fix the architecture AMD chose for their silicon. It's absolutely possible to implement a subset of CUDA features on a raster GPU, but we've been doing that since OpenCL and CUDA is still king.
The best thing the industry can do is converge on a GPGPU compute standard that doesn't suck. But Intel, AMD and Apple are all at-odds with one another so CUDA's hedged bet on industry hostility will keep paying dividends.
I would love to see these folks working together on this to break apart Nvidia's stranglehold on the GPU market (which, according to the internet, allows them insane 70% profit margins, thereby raising costs for all users worldwide).
> make
Beautiful.
Supporting CUDA on AMD would only build a bigger moat for NVidia; there's no reason to cede the entire GPU programming environment to a competitor and indeed, this was a good gamble; as time goes on CUDA has become less and less essential or relevant.
Also, if you want a practical path towards drop-in replacing CUDA, you want ZLUDA; this project is interesting and kind of cool but the limitation to a C subset and no replacement libraries (BLAS, DNN, etc.) makes it not particularly useful in comparison.
When it comes to GPUs, AMD just has the vibe of a company that basically shrugged and gave up. It's a shame because some competition would be amazing in this environment.
Specifically:
cuBLAS (limited/partial scope), cuBLASLt (limited/partial scope), cuDNN (limited/partial scope), cuFFT, cuSPARSE, NVML (very limited/partial scope)
Notably Missing: cuSPARSELt, cuSOLVER, cuRAND, cuTENSOR, NPP, nvJPEG, nvCOMP, NCCL, OptiX
I'd estimate it's around 20% of CUDA library coverage.
The primary competitors are Google's TPUs, which are programmed using JAX, and Cerebras, which has an unrivaled hardware advantage.
If you insist on a hobbyist-accessible underdog, you'd go with Tenstorrent, not AMD. AMD is only interesting if you've already been buying Blackwells by the pallet and you're okay with building your own inference engine in-house for a handful of models.
What sucks is that such projects at some point become too big and make so much noise that big tech buys them, and everybody gets fuck all.
All it takes to beat a proprietary walled garden is somebody with knowledge and a will to make things happen. Linus with git and Linux is the perfect example of it.
Fun fact: BitKeeper said fuck you to the Linux community in 2005, and Linus created git within 10 days.
BitKeeper made their code open source in 2016, but by then nobody knew who they were lol
So give it time :)
It all ended up good because of one man's genius, but let's not rewrite history.
More like wouldn't* most of the time.
Well isn't that the case with a few other things? FSR4 on older cards is one example right now. AMD still won't officially support it. I think they will though. Too much negativity around it. Half the posts on r/AMD are people complaining about it.
They're working the problem, but slandering them over it isn't going to make it come out any faster.
It works fine.
> They're working the problem, but slandering them over it isn't going to make it come out any faster.
You have insider info everyone else doesn't? They haven't said any such thing yet last I checked. If that were true, they should have said that.
That is incorrect, and the FP8 issue is both the officially stated reason and the reason that the community has independently verified.
> You have insider info everyone else doesn't?
AMD has been rather open about it.
But I digress; just had a quick look around... I don't know what I'm looking at, but it's impressive.
I guess CUDA got a lot more traction and there isn't much of a software base written for OpenCL. Kind of what happened with Unix and Windows - You could write code for Unix and it'd (compile and) run on 20 different OSs, or write it for Windows, and it'd run on one second-tier OS that managed to capture almost all of the desktop market.
I remember Apple did support OpenCL a long time ago, but I don't think they still do.
This project is a super cool hobby/toy project, but ZLUDA is the "right" drop-in CUDA replacement for almost any practical use case.
Open-source projects are being inundated with PRs from AIs; not depending on them doesn't limit a project.
The project owner seems pretty knowledgeable about what is going on, and keeping it free of dependencies is not an easy skill. Many developers would have written the code with tons of dependencies and copy/paste from an LLM. Some call the latter coding :)
I mean this with respect to the other person, though: please don't vibe-code this if you want to contribute, or keep the compiler for yourself. This isn't because I'm against using AI assistance when it makes sense; it's because LLMs will really fail in this space. There are things in the specs you won't find until you try them, and LLMs find it really hard to get things right when literal bits matter.
But help me understand something. BarraCUDA does its own codegen and therefore has to implement its own optimisation layer? It's incredibly impressive to get "working" binaries, but will it ever become a "viable" alternative to Nvidia's CUDA if it has to reinvent decades of optimisation techniques? Is there a performance comparison between the binaries produced by this compiler and the Nvidia one? Is this something you're working on as an interesting technical project, to learn from and prove that it "can be done"? Or are you trying to create something that can make CUDA a realistic option on AMD GPUs?
It's a LOT less bad than it used to be; AMD deserves serious credit. Codex should be able to crush it once you get the env going.
What is the problem with such approaches?
Storage capacity everywhere rejoices
Shout out to https://github.com/vosen/ZLUDA which is also in this space and quite popular.
I got ZLUDA to generally work well enough with ComfyUI.
I'm not the one who posted this to HN, but I am the project author. I'm working my way toward doing multiple architectures as well as more modern GPUs too. I only did this because I used LLVM to check my work and I have an AMD GFX11 card in my partner's desktop (which I use to test on sometimes when it's free).
If you do have access to this kind of hardware and you're willing to test my implementations on it, then I'm all ears! (You don't have to, obviously :-) )
If you want portability, you need a machine-learning compiler à la TorchInductor, TinyGrad, or OpenXLA.
If your needs can be expressed as tensor operations or neural network stuff that tinygrad supports, might as well use that (or one of the ten billion other higher order tensor libs).
Seeing insane investments (in time/effort/knowledge/frustration) like this makes me enjoy HN!!
(And there is always the hope that someone at AMD will see this and actually pay you to develop the thing.. Who knows)
Good luck -
Write CUDA code. Run Everywhere. Your CUDA skills are now universal. SCALE compiles your unmodified applications to run natively on any accelerator, ending the nightmare of maintaining multiple codebases.