There are literally thousands of retro emulators on GitHub. What I was trying to do had zero examples on GitHub. My takeaway is obvious at this point: some stuff is easy, some not at all.
There are no examples of what you tried to do.
AI doesn't actually wash licenses; it literally can't. Companies are just assuming they're above the law.
https://www.theatlantic.com/technology/2026/01/ai-memorizati...
It's a convenient criticism of LLMs, but a wrong one. We need to do better.
https://x.com/docsparse/status/1581461734665367554
I'm sure if someone prompts correctly, they can do the same thing today. LLMs can't generate something they don't know.
https://notes.bayindirh.io/notes/Lists/Discussions+about+Art...
I have another handful of links to add to this list. Had no time to update recently.
Has that been properly adjudicated? That's what the AI companies and their fans wish, but wishing for something doesn't make it true.
Then AI begins to offer a method around this over-litigious system, and this becomes a core anti-AI argument.
I do think it's silly to think public code (as in, code published to the public) won't be re-used by someone regardless of what your license dictates. If you didn't want that to happen, don't publish your code.
Having said that, I do think there's a legitimate concern here.
2. GPL does not allow you to take the code, compress it in your latent space, and then sell that to consumers without open sourcing your code.
If AI training is found to be fair use, then that fact supersedes any license language.
If there is some copyrighted art in the background in a scene from a movie, maybe that's fair use. If you take a high resolution copy of the movie, extract only the art from the background and want to start distributing that on its own, what do you expect then?
Sure, that's what the paper says. Most people don't care what that says until some ramifications actually occur. E.g. a cease and desist letter. Maybe people should care, but companies have been stealing IP from individuals long before GPL, and they still do.
No one goes to prison for this. They might get sued, but even that is doubtful.
We're talking about the users getting copyright-laundered code here. That's a pretty equal playing field. It's about the output of the AI, not the AI itself, and there are many models to choose from.
Vibe coding does not solve this problem. If anything, it makes it worse, since you no longer have any idea if an implementation might read on someone else's patent, since you did not write it.
If your agent could go read all of the patents and then avoid them in its implementations and/or tell you where you might be infringing them (without hallucinating), that would be valuable. It still would not solve the inherent problems of vagueness in the boundaries of the property rights that patents confer (which may require expensive litigation to clarify definitively) or people playing games with continuations to rewrite claim language and explicitly move those boundaries years later, among other dubious but routine practices, but it would be something.
That would bring the whole of society to a halt, because it feels impossible to do anything now without violating someone's patent. Patents quite often put small players at a disadvantage, because the whole process of issuing patents is slow, expensive, and unpredictable. Also, I once heard a lawyer say that in high-stakes lawsuits it is the pile (of patents) that matters.
The main arguments against the current patent system are these:
1) The patent office issues obvious or excessively broad patents when it shouldn't and then you can end up being sued for "copying" something you've never even heard of.
2) Patents are allowed on interfaces between systems and then used to leverage a dominant market position in one market into control over another market, which ought to be an antitrust violation but isn't enforced as one.
The main arguments against the current copyright system are these:
1) The copyright terms are too long. In the Back To The Future movies they went 30 years forward from 1985 to 2015 and Hollywood was still making sequels to Jaws. "The future" is now more than 10 years in the past and not only are none of the Back To The Future movies in the public domain yet, neither is the first Jaws from 1975, nor even the movies that predate Jaws by 30 years. It's ridiculous.
2) Many of the copyright enforcement mechanisms are draconian or susceptible to abuse. DMCA 1201 is used to constrain the market for playback devices and is used by the likes of Google and Apple to suppress competition for mobile app distribution and by John Deere to lock farmers out of their tractors. DMCA 512 makes it easy and essentially consequence-free to issue fraudulent takedowns and gives platforms the incentive to execute them with little or no validation, leading to widespread abuse. The statutory damages amounts in the Copyright Act are unreasonably high, especially for non-commercial use, and can result in absurd damages calculations vastly exceeding any plausible estimate of actual damages.
LLMs don't solve any of that. Making it easier to copy recent works that would still be under copyright even with reasonable copyright terms is not something we needed help with. If you wanted to copy something still under copyright, that was never that hard, and doing that when you don't know about it or want it is actively unhelpful.
Software patents are not copyright in any way; they are a completely different thing.
So this isn't AI getting back at the big guys; it is AI using open source code you could have used if you had just followed the simple license.
Copyright as it applies to software is effectively "if you directly use my code you need a license"; this doesn't have any of the downsides of copyright in other fields, which is mostly problematic for content that is generations old but still protected.
GitHub code still tends to be relatively young, since the product has existed for less than twenty years and most things you find are going to be far younger than that on average.
But there's the rub. If you found the code on Github, you would have seen the "simple licence" which required you to either give an attribution, release your code under a specific licence, seek an alternative licence, or perform some other appropriate action.
But if the LLM generates the code for you, you don't know the conditions of the "simple license" in order to follow them. So you are probably violating the conditions of the original license, but because someone can try to say "I didn't copy that code, I just generated some new code using an LLM", they try to ignore the fact that it's based on some other code in a Github somewhere.
So any argument that posting stuff online provides an implicit license is severely flawed.
You could postulate based on judicial rulings but unless those are binding you are effectively hypothesizing.
Everything is a derivative work.
Copyright law here is quite nuanced.
See the Google vs Oracle case about Java.
Note that even MIT requires attribution.
I just hope we don't all start relying on current[1] AI so much that we lose the ability to solve novel problems ourselves.
[1] (I say "current" AI because some new paradigm may well surpass us completely, but that's a whole different future to contemplate)
I just don't think there was a great way to make solved problems accessible before LLMs. I mean, these things were on github already, and still got reimplemented over and over again.
Even high traffic libraries that solve some super common problem often have rough edges, or do something that breaks it for your specific use case. So even when the code is accessible, it doesn't always get used as much as it could.
With LLMs, you can find it, learn it, and tailor it to your needs with one tool.
I'm not sure people wrote emulators, of all things, because they were trying to solve a problem in the commercial sense, or that they weren't aware of existing github projects and couldn't remember to search for them.
It seems much more a labour of love kind of thing to work on. For something that holds that kind of appeal to you, you don't always want to take the shortcut. It's like solving a puzzle game by reading all the hints on the internet; you got through it but also ruined it for yourself.
And now people seem to automate reimplementations by paying some corporation for shoving previous reimplementations into a weird database.
As both a professional and hobbyist I've taken a lot from public git repos. If there are no relevant examples in the project I'm in, I'll sniff out some public ones and crib what I need from those, usually not by copying but rather 'transpiling', because it is likely I'll be looking at Python or Golang or whatever and that's not what I've been paid to use. Typically there are also adaptations to the current environment that are needed, like particular patterns in naming, use of local libraries or modules, and so on.
I don't really feel that it has made it hard for me to do because I've used a variety of tools to achieve it rather than some SaaS chat shell automation.
What kranner said. There was never an accessibility problem for emulators. The reason there are a lot of emulators on github is that a lot of people wanted to write an emulator, not that a lot of people wanted to run an emulator and just couldn't find it.
That isn't why people made emulators. It is because it is an easy to solve problem that is tricky to get right and provides as much testable space as you are willing to spend on working on it.
Now that I'm here I'll say I'm actually very impressed with Grok's ability to output video content in the context of simulating the real world. They seemingly have the edge on this dimension vs other model providers. But again, this doesn't mean much unless it's in the hands of someone with taste etc. You can't one-shot great content. You actually have to do it frame-by-frame then stitch it together.
…If every time you looked at the dictionary it gave you a slightly different definition, and sometimes it gave you the wrong definition!
Reproducibility is a separate issue.
To some extent AI is an entirely different approach. Screw elegance. Programmers won’t adhere to an elegant paradigm anyway. So just automate the process of generating spaghetti. The modularity and reuse is emergent from the latent knowledge in the model.
It’s much easier to get an LLM to adhere, especially when you throw tooling into the loop to enforce constraints and style. Even better when you use Rust with its amazing type system, and compilation serves as proof.
I saved the blank file as wordle.py to start the coding while explaining ideas.
That was enough context for github copilot to suggest the entire `for` loop body after I just typed "for"
Not much learning by doing happened in that instance.
Before this `for` loop there were just two lines of code hardcoding some words, and those too were heavily autocompleted by Copilot, including the string constants.
```
answer="cigar"
guess="cigar"
```
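Something along these lines, a plausible reconstruction of the kind of loop body Copilot offered (not the exact completion):

```
# assumes the hardcoded answer/guess strings from above
for i, letter in enumerate(guess):
    if letter == answer[i]:
        print(f"{letter} is in the correct position")
    elif letter in answer:
        print(f"{letter} is in the word but in the wrong position")
    else:
        print(f"{letter} is not in the word")
```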
Claude Code had never been built before Claude Code. Yet all of Claude is being built by Claude Code.
Why are people clinging to these useless trivial examples and using them to degrade AI? Like literally in front of our very eyes it can build things that aren't just "embarrassingly solved".
I'm a SWE. I wish this stuff wasn't real. But it is. I'm not going off hype. I'm going off what I do with AI day to day.
I don't disagree that LLMs can produce novel products, but let's decompose Claude Code into its subproblems.
Since (IIRC) Claude Code's own author admits he built it entirely with Claude, I imagine the initial prompt was something like "I need a terminal based program that takes in user input, posts it to a webserver, and receives text responses from the webserver. On the backend, we're going to feed their input to a chatbot, which will determine what commands to run on that user's machine to get itself more context, and output code, so we need to take in strings (and they'll be pretty long ones), sanitize them, feed them to the chatbot, and send its response back over the wire."
Everything here except the LLM has been done a thousand times before. It composed those building blocks in novel ways, that's what makes it so good. But I would argue that it's not going to generate new building blocks, and I really mean for my term to sit at the level of these subproblems, not at the level of a shipped product.
I didn't mean to denigrate LLMs or minimize their usefulness in my original message, I just think my proposed term is a nice way to say "a problem that is so well represented in the training data that it is trivial for LLMs". And, if every subproblem is an embarrassingly solved problem, as in the case of an emulator, then the superproblem is also an ESP (but, for emulators, only for repeatedly emulated machines, like GameBoy -- A PS5 emulator is certainly not an ESP).
Take this example: I wanted CC to add Flying Edges to my codebase. It knew where to integrate its solution. It adapted it to my codebase beautifully. But it didn't write Flying Edges because it fundamentally doesn't know what Flying Edges is. It wrote an implementation of Marching Cubes that was only shaped like Flying Edges. Novel algorithms aren't ESPs. I had to give it access to a copy of VTK's implementation (BSD license) for it to really get it, then it worked.
Generating isosurfaces specifically with Flying Edges is not an ESP yet. But you could probably get Claude to one shot a toy graphics engine that displays Suzanne right now, so setting up a window, loading some gltf data, and displaying it definitely are ESPs.
This should be the first thing you try. Something to keep in mind is that AI is just a tool for munging long strings of text. It's not really intelligent and it doesn't have a crystal ball.
As a rule of thumb, almost every solution you come up with after thirty seconds of thought for an online discussion has been considered by people doing the same thing for a living.
If however, your code foundations are good and highly consistent and never allow hacks, then the AI will maintain that clean style and it becomes shockingly good; in this case, the prompting barely even matters. The code foundation is everything.
But I understand why a lot of people are still having a poor experience. Most codebases are bad. They work (within very rigid constraints, in very specific environments) but they're unmaintainable and very difficult to extend; require hacks on top of hacks. Each new feature essentially requires a minor or major refactoring; requiring more and more scattered code changes as everything is interdependent (tight coupling, low cohesion). Productivity just grinds to a slow crawl and you need 100 engineers to do what previously could have been done with just 1. This is not a new effect. It's just much more obvious now with AI.
I've been saying this for years but I think too few engineers had actually built complex projects on their own to understand this effect. There's a parallel with building architecture; you are constrained by the foundation of the building. If you designed the foundation for a regular single storey house, you can't change your mind half-way through the construction process to build a 20-storey skyscraper. That said, if your foundation is good enough to support a 100 storey skyscraper, then you can build almost anything you want on top.
My perspective is if you want to empower people to vibe code, you need to give them really strong foundations to work on top of. There will still be limitations but they'll be able to go much further.
My experience is; the more planning and intelligence goes into the foundation, the less intelligence and planning is required for the actual construction.
I just did my first "AI native coding project", both because for now I haven't run into any quotas using Codex CLI with my $20/month ChatGPT subscription, and because the company just gave everyone an $800/month Claude allowance.
Before I even started the implementation, I gathered:
1. The initial sales contract with the business requirements.
2. Notes I got from talking to sales
3. The transcript of the initial discovery calls
4. My design diagrams that were well labeled (cloud architecture and what each lambda does)
5. The transcript of the design review and my explanations and answering questions.
6. My ChatGPT-assisted breakdown of the Epics/stories and tasks I had to do for the PMO
I then told ChatGPT to give a detailed breakdown of everything during the session as Markdown
That was the start of my AGENTS.md file.
While working through everything task by task and having Codex/Claude code do the coding, I told it to update a separate md file with what it did and when I told it to do something differently and why.
Any developer coming in after me will have complete context of the project from the first git init and they and the agents will know the why behind every decision that was made.
Can you say that about any project that was done before GenAI?
… a project with a decomposition of top level tasks, minutes and meeting notes, a transcript, initial diagrams, a bunch of loose transcripts on soon to be outdated assumptions and design, and then a soon-to-be-outdated living and constantly modified AGENT file that will be to some extent added to some context and to some extent ignored and to some extent lie about whether it was consulted (and then to some extent lie more about if it was then followed)? Hard yes.
I have absolutely seen far better initial project setups that are more complete, more focused, more holistically captured, and more utilitarian for the forthcoming evolution of design and system.
Lots of places have comparable design foundations as mandatory, and in some well-worn government IT processes I’m aware of the point being described is a couple man-months or man-years of actual specification away from initial approval for development.
Anyone using issue tracking will have better, searchable, tracking of “why”, and plenty of orgs mandate that from day 1. Those orgs likely are tracking contracts separately too — that kind of information is a bit special to have in a git repo that may have a long exciting life of sharing.
Subversion, JIRA, and basic CRM setups all predate GPT's public launch.
Tbh, I'm not exactly knocking it; it makes sense that leads are responsible for the architecture. I just worry that those leads having 100x influence is not by default a good thing.
The design was done by me. The modularity, etc.
I tested for scalability, I checked the IAM permissions for security, and I designed the locking mechanism and concurrency controls (which had a bug that was found by ChatGPT in thinking mode).
yes. the linux kernel and its extensive mailing lists come to mind. in fact, any decent project which was/is built in a remote-only scenario tends to have extensive documentation along these lines; something like gitlab comes to mind there.
personally i've included design documents with extensive notes, contracts, meeting summaries etc etc in our docs area / repo hosting at $PREVIOUS_COMPANY. only thing from your list we didn't have was transcripts because they're often less useful than a summary of "this is what we actually decided and why". edit -- there were some video/meeting audio recordings we kept around though. at least one was a tutoring session i did.
maybe this is the first time you've felt able to do something like this in a short amount of time because of these GenAI tools? i don't know your story. but i was doing a lot of this by hand before GenAI. it took time, energy and effort to do. but your project is definitely not the first to have this level of detailed contextual information associated with it. i will, however, concede that these tools can make it easier/faster to get there.
If I had to scope this project before GenAI, it would have taken two other developers to do the work I mentioned, not to mention making changes to a web front end that another developer built for another client on a project I was leading. I haven't touched front-end code for over a decade.
Asked it to spot check a simple rate limiter I wrote in TS. Super basic algorithm: let one action through every 250ms at least, sleeping if necessary. It found bogus errors in my code 3 times because it failed to see that I was using a mutex to prevent reentrancy. This was about 12 lines of code in total.
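For reference, a rough sketch of the kind of limiter being described, translated to Python with asyncio (the original was about 12 lines of TypeScript); the class and names here are illustrative, not the actual code:

```
import asyncio
import time

class RateLimiter:
    """Let at most one action through every `interval` seconds, sleeping if necessary."""

    def __init__(self, interval: float = 0.25):
        self.interval = interval
        self._lock = asyncio.Lock()  # the mutex that prevents re-entrant callers from racing
        self._last = 0.0

    async def wait(self) -> None:
        async with self._lock:
            delay = self._last + self.interval - time.monotonic()
            if delay > 0:
                await asyncio.sleep(delay)
            self._last = time.monotonic()
```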
My rubber duck debugging session was insightful only because I had to reason through the lack of understanding on its part and argue with it.
Try again with Sonnet 4
Try again with GPT-4.1
Here I thought these things were supposed to be able to handle twelve lines of code, but they just get worse.
But which codebase is perfect, really?
GPT-5.2-Codex did a bad job of obeying my more detailed AGENTS.md files, but GPT-5.3-Codex very evidently follows them well.
I find it infinitely frustrating to attempt to make these piece-of-shit “agents” do basic things like running the unit/integration tests after making changes.
And it requires a bit of prompt engineering like using caps for some stuff (ALWAYS), etc.
We’ve been acting as if it’s assembly code that the agents execute without question or confusion, but it’s just some more text.
I should probably stop commenting on AI posts because when I try to help others get the most out of agents I usually just get down voted like now. People want to hate on AI, not learn how to use it.
people still do useful work without a global view, and there's still a human in the loop with the same ole amount of global view as they ever had.
After rearchitecting the foundations (dumping bootstrap, building easy-to-use form fields, fixing hardcoded role references 1,2,3…, consolidating typescript types, etc.) it makes much better choices without needing specific guidance.
Codex/Claude Code won’t solve all your problems though. You really need to take some time to understand the codebase and fixing the core abstractions before you set it loose. Otherwise, it just stacks garbage on garbage and gets stuck patching and won’t actually fix the core issues unless instructed.
No project will have this mythical base unless it's only you working on it, you're the only client, and its scope is so rigid it's frankly useless. Over time the needs change; there's no sticking to the plan. Often it's a change that requires rethinking a major part. What we loathe as tight coupling was just efficient code given the original requirements. Then it becomes a time/opportunity cost vs quality loss comparison. Time and opportunity always win. Why?
Because we live in a world run by humans, who are messy and never stick to the plan. Our real-world systems (bureaucracy, government process, the list goes on) are never fully automated and always leave gaps for humans to intervene. There's always a special case, an exception.
Perfectly architected code vs code that does the thing: there is no real-world difference. Long-term maintainability? Your code doesn't run in a vacuum; it depends on other things, and its output is depended on by other things. Change is real, entropy is real. Even you yourself, you perfect programmer who writes perfect code, will succumb eventually and think back on all this with regret. Because you yourself had to choose between time/opportunity and your ideals, and you chose wrong.
Thanks for reading my blog-in-hn comment.
It’s fascinating watching the sudden resurgence of interest in software architecture after people are finding it helps LLMs move quickly. It has been similarly beneficial for humans as well. It’s not rocket science. It got maligned because it couldn’t be reduced to an npm package/discrete process that anyone could follow.
E.g. pumping out a ton of logic to convert one data structure to another. Like a poorly structured form with random form control names that don't match the DTO. Or single properties for each form control which are then individually plugged into the request DTO.
Must be my lucky day! Too bad my dream of being that while the bots are taking care of the coding is still sort of fiction.
I'd love a future where this is possible, but what we have today is more of a proof of concept. A transformative leap is required for this technology before it can be as useful as advertised.
In my mind it’s not too much different than cheap contractor code that I already have to deal with on a regular basis…
they're reasonable audit tools for finding issues, if you have ways to make sure they don't give up early, and you force them to output proof of what they did
A poor foundation is a design problem. Throw it away and start again.
It’s funny how the vibe coding story insists we shouldn’t look at the code details but when it’s pointed out the bots can’t deal with a “messy” (but validated) foundation, the story changes that we have to refactor that.
I am beginning to build a high degree of trust in the code Claude emits. I'm having to step in with corrections less and less, and it's single shotting entire modules 500-1k LOC, multiple files touched, without any trouble.
It can understand how frontend API translates to middleware, internal API service calls, and database queries (with a high degree of schema understanding, including joins).
(This is in a Rust/Actix/Sqlx/Typescript/nx monorepo, fwiw.)
Right now I'm building an NNTP client for macOS (with AppKit), because why not, and initially I had to very carefully plan and prompt what the AI has to do, otherwise it would go insane (integration tests are a must).
Right now I have read-only mode ready and it's very easy to build stuff on top of it.
Also, I had to provide a lot of SKILLS to GPT5.3
Current LLMs are best used to generate the string of text that's most statistically likely to form a sentence, so from the user's perspective they're most useful as an alternative to a manual search engine, letting the user find quick answers to a simple question, such as "how much soda is needed for baking X units of Y bread" or "how to print 'Hello World' 10 times in a loop in X programming language". Beyond this use case, the results can be unreliable, and this is to be expected.
Sure, it can also generate long code and even an entire fine-looking project, but it generates it by following a statistical template, that's it.
That's why "the easy part" is easy because the easy problem you try to solve is likely already been solved by someone else on GitHub, so the template is already there. But the hard, domain-specific problem, is less likely to have a publicly-available solution.
Play around with some frontier models, you’ll be pleasantly surprised.
Also re: "I spent longer arguing with the agent and recovering the file than I would have spent writing the test myself."
In my humble experience arguing with an LLM is a waste of time, and no-one should be spending time recovering files. Just do small changes one at a time, commit when you get something working, and discard your changes and try again if it doesn't.
I don't think AI is a panacea; it's about knowing when it's the right tool for the job and when it isn't.
I keep seeing this sentiment repeated in discussions around LLM coding, and I'm baffled by it.
For the kind of function that takes me a morning to research and write, it takes me probably 10 or 15 minutes to read and review. It's obviously easier to verify something is correct than come up with the correct thing in the first place.
And obviously, if it took longer to read code than to write it, teams would be spending the majority of their time in code review, but they don't.
So where is this idea coming from?
When you write code, your brain follows a logical series of steps to produce the code, based on a context you pre-loaded in your brain in order to be capable of writing it that way. The reader does not have that context pre-loaded in their brain; they have to reverse-engineer the context in order to understand the code, and that can be time-consuming, laborious, and (as in my case) erroneous.
The author should have provided context via comments and structured the code in a way that is easy to change and understand
But if I were an "editor," I would actually take the time to understand codepaths, tweak the code to see what could be better, and actually try different refactoring approaches while editing. Literally seeing how something can be rewritten or reworked to be better takes considerable effort, but it's not the same as reading.
We need a better word for this than "editor" and "reading", like something with a dev classification to it.
I think people want to believe this because it is a lot of effort to read and truly understand some pieces of code. They would just rather write the code themselves, so this is convenient to believe.
When the code is written, it's all laid out nicely for the reader to understand quickly and verify. Everything is pre-organized, just for you the reader.
But in order to write the code, you might have to try 4 different top-level approaches until you figure out the one that works, try integrating with a function from 3 different packages until you find the one that works properly, hunt down documentation on another function you have to integrate with, and make a bunch of mistakes that you need to debug until it produces the correct result across unit test coverage.
There's so much time spent on false starts and plumbing and dead ends and looking up documentation and debugging when you code. In contrast, when you read code that already has passing tests... you skip all that stuff. You just ensure it does what it claims and is well-written and look for logic or engineering errors or missing tests or questionable judgment. Which is just so, so much faster.
If you haven't spent the time to try the different approaches yourself, tried the different packages etc., you can't really judge if the code you're reading is really the appropriate thing. It may look superficially plausible and pass some existing tests, but you haven't deeply thought through it, and you can't judge how much of the relevant surface area the tests are actually covering. The devil tends to be in the details, and you have to work with the code and with the libraries for a while to gain familiarity and get a feeling for them. The false starts and dead ends, the reading of documentation, those teach you what is important; without them you can only guess. Without having explored the territory, it's difficult to tell if the place you've been teleported to is really the one you want to be in.
You're just making sure it works correctly and that you understand how. Not superficially, but thinking through it indeed. That the tests are covering it. It doesn't take that long.
What you're describing sounds closer to studying the Talmud than to reading and reviewing most code.
Like, the kind of stuff you're describing is not most code. And when it is, then you've got code that requires design documents where the approach is described in great detail. But again, as a reader you just read those design documents first. That's what they're there for, so other people don't have to waste time trying out all the false starts and dead ends and incorrect architectures. If the code needs this massive understanding, then that understanding needs to be documented. Fortunately, most functions don't need anything like that.
https://www.joelonsoftware.com/2000/05/26/reading-code-is-li...
Most human-written code has 0 (ZERO!) docs. And if it has them, they're inaccurate or out of date or both.
Lots of code is simple and boring, but a fair amount isn't, and reading it is non-trivial: you basically need to run it in your head or do step-by-step debugging in multiple scenarios.
'AI makes everything easier, but it's a skill in itself, and learning that skill is just as hard as learning any other skill.'
For a more complete understanding, you also have to add: 'we're in the ENIAC era of AI. The equivalents of high-level languages and operating systems haven't yet been invented.'
I have no doubt the next few years will birth a "context engineering" academic field, and everything we're doing currently will seem hopelessly primitive.
My mind changed on this after attempting complex projects—with the right structure, the capabilities appear unbounded in practice.
But, of course, there is baked-in mean reversion. Doing the most popular and uncomplicated things is obviously easier. That's just the nature of these models.
"I did it with AI" = "I did it with an army of CPU burning considerable resources and owned by a foreign company."
Give me an AI agent that I own and operate 100%, and the comparison will be fair. Otherwise it's not progress, but rather a theft at planetary scale.
Once the project crosses a couple of thousand lines of code, none of which you've written yourself, it becomes difficult to actually keep up with what's happening. Even reviewing can become challenging, since you get it all at once, and the LLM-esque coding style can at times be bloated and obnoxious.
I think in the end, with how things are right now, we're going to see the rise of disposable code and software. The models can churn out apps / software which will solve your specific problem, but that's about it. Probably a big risk to all the one-trick pony SaaS companies out there.
Ha! Yesterday an agent deleted the plan file after I told it to "forget about it" (as in, leave it alone).
Much smaller issue when you have version control.
Yes. Another way to describe it is the valuable part.
AI tools are great at delineating high and low value work.
The article's easy/hard distinction is right but the ceiling for "hard" is too low. The actually hard thing AI enables isn't better timezone bug investigation LOL! It's working across disciplinary boundaries no single human can straddle.
How many sf/cyber writers have described a future of AIs and robots where we walked hand-in-hand, in blissful cooperation, and the AIs loved us and were overall beneficial to humankind, and propelled our race to new heights of progress?
No, AIs are all being trained on dystopias, catastrophes, and rebellions, and like you said, they are unable to discern fact from fantasy. So it seems that if we continue to attempt to create AI in our own likeness, that likeness will be rebellious, evil, and malicious, and actively begin to plot the downfall of humans.
Which is easy to filter out based on downloads, version numbering, issue tracker entries, and wikipedia or other external references if the project is older and archived, but historically noteworthy (like the source code for Netscape Communicator or DOOM).
That is to say, just like every headline-grabbing programming "innovation" of the last thirty years.
The first 3/4 of the article is "we must be responsible for every line of code in the application, so having the LLM write it is not helping".
The last 1/4 is "we had an urgent problem so we got the LLM to look at the code base and find the solution".
The situation we're moving to is that the LLM owns the code. We don't look at the code. We tell the LLM what is needed, and it writes the code. If there's a bug, we tell the LLM what the bug is, and the LLM fixes it. We're not responsible for every line of code in the application.
It's exactly the same as with a compiler. We don't look at the machine code that the compiler produces. We tell the compiler what we want, using a higher-level abstraction, and the compiler turns that into machine code. We trust compilers to do this error-free, because 50+ years of practice has proven to us that they do this error-free.
We're maybe ~1 year into coding agents. It's not surprising that we don't trust LLMs yet. But we will.
And it's going to be fascinating how this changes the Computer Science. We have interpreted languages because compilers got so good. Presumably we'll get to non-human-readable languages that only LLMs can use. And methods of defining systems to an LLM that are better than plain English.
that's an interesting point. Could there be?
COBOL was originally an attempt to do this, but it ended up being more Code than English.
I think this is the area we need to get better at if we're to trust LLMs like we trust compilers.
I'm aware that there's a meme around "we have a method of completely specifying what a computer system should do, it's the code for that system". But again, there are levels of abstraction here. I don't think our current high-level languages are the highest possible level of abstraction.
I guess you could pick a subset of a particular natural language such that it removes ambiguity. At that point, you're basically reinventing something like COBOL or Python.
Ambiguity in natural languages is a feature, not a bug. Still, it's better not to have an unintentional pun or joke instruction get interpreted as "launch the missile" by the computer.
However, each project's error tolerance is different. Arguably, for an average task within the umbrella of "software engineer", even current LLMs seem good enough for most purposes. It's a transition similar to the one to automatically memory-managed languages: trading control for "DX".
This is very much a hot take, but I believe that Claude Code and its yolo peers are an expensive party trick that gives people who aren't deep into this stuff an artificially negative impression of tools that can absolutely be used in a responsible, hugely productive way.
Seriously, every time I hear anecdotes about CC doing the sorts of things the author describes, I wonder why the hell anyone is expecting more than quick prototypes from an LLM running in a loop with no intervention from an experienced human developer.
Vibe coding is riding your bike really fast with your hands off the handles. It's sort of fun and feels a bit rebellious. But nobody who is really good at cycling is talking about how they've fully transitioned to riding without touching the handles, because that would be completely stupid.
We should feel the same way about vibe coding.
Meanwhile, if you load up Cursor and break your application development into bite sized chunks, and then work through those chunks in a sane order using as many Plan -> Agent -> Debug conversations with Opus 4.5 (Thinking) as needed, you too will obtain the mythical productivity multipliers you keep accusing us of hallucinating.
So I'm not sure this is a good rule of thumb. AI is better at doing some things than others, but the boundary is not that simple.
Someone mentioned it is a force multiplier, and I don't disagree with this: it is a force multiplier in the mundane and ordinary execution of tasks. Complex ones get harder and harder for it, where humans can visualize the final result and AI can't. It is predicting from input, but it can't know the destination output if the destination isn't part of the input.
The people who are truly exceptional at what they do wouldn't waste their time on leetcode crap. They'd find/create a much better alternative opportunity to allocate their precious resources toward.
You can solve leetcode problems on the whiteboard with some sketches; it has nothing to do with the code itself.
Sorry but this is the whole point of software engineering in a company. The aim is to deliver value to customers at a consistent pace.
If a team cannot manage their own burnout or expectations with their stakeholders then this is a weak team.
It has nothing to do with using AI to make you go faster. AI does not cause this at all.
Tried to move some Excel generation logic from the EPPlus library to ClosedXML.
ClosedXML has basically the same API, so the conversion was successful. Not a one-shot, but relatively easy with a few manual edits.
But ClosedXML has no batch operations (like applying a style to an entire column): the API is there, but the internal implementation works on a cell-by-cell basis. So if you have 10k rows and 50 columns, every style update is a slow operation.
Naturally, I told Codex 5.3 at max thinking level all about this. The fucker still succumbed to range updates here and there.
Told it explicitly to make a style cache and reuse styles on cells on the same y axis.
5-6 attempts — fucker still tried ranges here and there. Because that is what is usually done.
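For what it's worth, the caching pattern being asked for looks roughly like this; sketched in Python with made-up stand-ins rather than the real ClosedXML API, since the point is the pattern (materialize each distinct style once, then assign it cell by cell) and not the library:

```
from dataclasses import dataclass

@dataclass(frozen=True)
class StyleSpec:
    bold: bool = False
    number_format: str = "General"

class Workbook:
    """Stand-in for the workbook; create_style represents the expensive per-style work."""
    def __init__(self) -> None:
        self.styles_created = 0

    def create_style(self, spec: StyleSpec) -> dict:
        self.styles_created += 1
        return {"bold": spec.bold, "format": spec.number_format}

def apply_column_styles(wb: Workbook, rows: list[list[dict]], column_specs: list[StyleSpec]) -> None:
    cache: dict[StyleSpec, dict] = {}
    for row in rows:
        for x, cell in enumerate(row):
            spec = column_specs[x]
            if spec not in cache:               # build each distinct style exactly once
                cache[spec] = wb.create_style(spec)
            cell["style"] = cache[spec]         # plain per-cell assignment, no range API
```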
Not here yet. Maybe in a year. Maybe never.
A lot of people are lying to themselves. Programming is in the middle of a structural shift, and anyone whose job is to write software is exposed to it. If your self-worth is tied to being good at this, the instinct to minimize what’s happening is understandable. It’s still denial.
The systems improve month to month. That’s observable. Most of the skepticism I see comes from shallow exposure, old models, or secondhand opinions. If your mental model is based on where things were a year ago, you’re arguing with a version that no longer exists.
This isn’t a hype wave. I’m a software engineer. I care about rigor, about taste, about the things engineers like to believe distinguish serious work. I don’t gain from this shift. If anything, it erodes the value of skills I spent years building. That doesn’t change the outcome.
The evidence isn’t online chatter. It’s sitting down and doing the work. Entire applications can be produced this way now. The role changes whether people are ready to admit it or not. Debating the reality of it at this point mostly signals distance from the practice itself.
Needless to say, he was wrong and gently corrected over the course of time. In his defense, his use cases for LLMs at the time were summarizing emails in his email client.. so..eh.. not exactly much to draw realistic experience from.
I hate to say it, but maybe the Nvidia CEO is actually right for once. We have a 'new smart' coming to our world: the type of person who can move between the worlds of coding, management, projects and CEOing with relative ease and translate between those worlds.
Sounds just like my manager. Though he has never made a proclamation that this meant developers should be 10x as productive or anything along those lines. On the contrary, when I made a joke about LLMs being able to replace managers before they get anywhere near replacing developers, he nearly hyperventilated. Not because he didn't believe me, but because he did, and had already been thinking that exact thought.
My conclusion so far is that if we get LLMs capable of replacing developers, then by extension we will have replaced a lot of other people first. And when people make jokes like "should have gone into a trade, can't replace that with AI" I think they should be a little more introspective; all the people who aspired to be developers but got kicked out by LLMs will be perfectly able to pivot to trades, and the barrier to entry is low. AI is going to be disruptive across the board.
This is flat out wrong and shows your lack of respect and understanding for other jobs.
And this is just stuff that is mandated by government and not a result of ever evolving bureaucracy.
People seem to think engineers like "clean code" because we like to be fancy and show off.
Nah, it's clean like a construction site. I need to be able to get the cranes and the heavy machinery in and know where all the buried utilities are. I can't do that if people just build random sheds everywhere and dump their equipment and materials where they are.
Imagine if every function you see starts checking for null params. You ask yourself: "when can this be null?", right? So it complicates your mental model of the data flow to the point that you lose track of what's actually real in your system. And once you lose track of that, it is impossible to reason about your system.
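A tiny made-up illustration of the point:

```
from typing import Optional

# Defensive style: every layer re-checks for None, so the reader can no longer
# tell where None can actually originate.
def order_total_defensive(order: Optional[dict]) -> float:
    if order is None:
        return 0.0
    items = order.get("items")
    if items is None:
        return 0.0
    return sum(item.get("price", 0.0) for item in items if item is not None)

# Validate once at the boundary instead; everything downstream can assume the
# data is real, and the data flow stays easy to reason about.
def order_total(order: dict) -> float:
    return sum(item["price"] for item in order["items"])
```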
For me AI has replaced searching on stack overflow, google and the 50+ github tabs in my browser. And it's able to answer questions about why some things don't work in the context of my code. Massive win! I am moving much faster because I no longer have to switch context between a browser and my code.
My personal belief is that the people who can harness the power of AI to synthesize loads of information and keep polishing their engineering skills will be the ones who are going to land on their feet after this storm is over. At the end of the day AI is just another tool for us engineers to improve our productivity and if you think about what being an engineer looked like before AI even existed, more than 50% of our time was sifting through google search results, stack overflow, github issues and other people's code. That's now gone and in your IDE, in natural language with code snippets adapted to your specific needs.
Gemini in Antigravity today is pretty interesting, to the point where it's worth experimenting with vague prompts just to see what it comes up with.
Coding agents are not going to just change coding. They make a lot of detailed product management work obsolete, and smaller team sizes will make it imperative to reread the agile manifesto and discard scrum dogma.
- Being forced to use AI at work
- Being told you need to be 2x, 5x or 10x more efficient now
- Seeing your coworkers fired
- Seeing hiring freezes because the business thinks no more devs are needed
- Seeing business people make a mock UI with AI and boasting how programming is easy
- Seeing those people ask you to deliver in impossible timelines
- Frontend people hearing from backend how their job is useless now
- Backend people hearing from ML Engineers how their job is useless now
- etc
When I dig a bit into this "anti-AI" trend I find it's one of those, and not actually against the AI itself.
But even assuming it was somehow a useful piece of software that you'd want to pay for, the creator set up a test harness to use gcc as an oracle. So it has an oracle for every possible input and output. Plus there are thousands of C compilers in its training set.
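For anyone unfamiliar with the setup, a minimal sketch of the "gcc as oracle" idea: compile the same program with gcc and with the compiler under test, run both, and check they agree. Here "mycc" is a hypothetical name for the compiler being tested, not something from the project:

```
import os
import subprocess
import tempfile

def agrees_with_gcc(source: str, stdin: bytes = b"") -> bool:
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "case.c")
        with open(src, "w") as f:
            f.write(source)
        for compiler in ("gcc", "mycc"):        # reference compiler, then the one under test
            exe = os.path.join(tmp, compiler + ".out")
            subprocess.run([compiler, src, "-o", exe], check=True)
            run = subprocess.run([exe], input=stdin, capture_output=True)
            results.append((run.returncode, run.stdout))
    return results[0] == results[1]             # the oracle and the candidate must agree
```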
If you are in a position where you are trying to reverse engineer an exact copy of something that already exists (maybe in another language) and you can’t just fork that thing then maybe a better version of this process could be useful. But that’s a very narrow use case.
But regardless, services are extremely cheap right now, to the point where every single company involved in generative AI is losing billions. Let's see what happens when prices go up 10x.
because they tell you to stop being so stupid and run apt install gcc
Whatever the value/$ is now, do you really think it is going to be constant?
There are plenty of them being built, yes. Some of them will even start outputting products soon enough. None of them are gonna start outputting products at a scale large enough to matter any time soon. Certainly not before 2030, and a lot of things can change until then which might make the companies abandon their efforts altogether or downscale their investments to the point where that due date gets pushed back much further.
That's not even discussing how much easier it is for an already-established player to scale up their supply versus a brand-new competitor going from zero to one.
It's exhausting.
There are legitimate and nuanced conversations that we should be having! For example, one entirely legitimate critique is that LLMs do not tell LLM users that they are using libraries whose maintainers are seeking sponsorship. This is something we could be proactive about fixing in a tangible way. Frankly, I'd be thrilled if agents could present a list of projects that we could consider clicking a button to toss a few bucks to. That would be awesome.
But instead, it's just the same tired arguments about how LLMs are only capable of regurgitating what's been scraped and that we're stupid and lazy for trusting them to do anything real.
I swear this is the reason people are against AI output (there are genuine reasons to be against AI without using it: environmental impact, hardware prices, social/copyright issues, CSAM (like X/Grok))
It feels like a lot of people hear the negatives, and try it and are cynical of the result. Things like 2 r's in Strawberry and the 6-10 fingers on one hand led to multiple misinterpretations of the actual AI benefit: "Oh, if AI can't even count the number of letters in a word, then all its answers are incorrect" is simply not true.
To preempt that on my end, and emphasize I'm not saying "it's useless" so much as "I think there's some truth to what the OP says", as I'm typing this I'm finishing up a 90% LLM coded tool to automate a regular process I have to do for work, and it's been a very successful experience.
From my perspective, a tool (LLMs) has more impact than how you yourself directly use it. We talk a lot about pits of success and pits of failure from a code and product architecture standpoint, and right now, as you acknowledge yourself in the last sentence, there's a big footgun waiting for any dev who turns their head off too hard. In my mind, _this is the hard part_ of engineering; keeping a codebase structured, guardrailed, well constrained, even with many contributors over a long period of time. I do think LLMs make this harder, since they make writing code "cheaper" but not necessarily "safer", which flies in the face of mantras such as "the best line of code is the one you don't need to write." (I do feel the article brushes against this where it nods to trust, growth, and ownership) This is not a hypothetical as well, but something I've already seen in practice in a professional context, and I don't think we've figured out silver bullets for yet.
While I could also gesture at some patterns I've seen where there's a level of semantic complexity these models simply can't handle at the moment, and no matter how well architected you make a codebase after N million lines you WILL be above that threshold, even that is less of a concern in my mind than the former pattern. (And again the article touches on this re: vibe coding having a ceiling, but I think if anything they weaken their argument by limiting it to vibe coding.)
To take a bit of a tangent with this comment though: I have come to agree with a post I saw a few months back, that at this point LLMs have become this cycle's tech-religious-war, and it's very hard to have evenhanded debate in that context, and as a sister post calls out, I also suspect this is where some of the distaste comes from as well.
Vibe coding and slop strawmen are still strawmen. The quality of the debate is obviously a problem
Right. You don’t know what model they’re using, on what service, in what IDE, on what OS, if they’re making a SAP program, a Perl 5 CGI application, a Delphi application, something written in R, a c-based image processing plugin, a node website, HTML for a static site, Excel VBA, etc. etc. etc.
> It’s like if someone complains that since they can’t write fast code and so you shouldn’t be able to either?
If someone is saying that nobody can get good results from using AI then they’re obviously wrong. If someone says that they get good results with AI and someone else, knowing nothing about their task, says they’re too incompetent to determine that, then they’re wrong. If someone says AI is good for all use cases they’re wrong. If someone says they’re getting bad results using AI and someone else, knowing nothing about their task, says they’re too incompetent to determine that, then they’re wrong.
If you make sweeping, declarative, black-and-white statements about AI coding either being good or bad, you’re wrong. If you make assumptions about the reason someone has deemed their experience with AI coding good or bad, not even knowing their use case, you’re wrong.
I feel like this is a common refrain that sets an impossible bar for detractors to clear. You can simply hand wave away any critique with “you’re just not using it right.”
If countless people are “using it wrong” then maybe there’s something wrong with the tool.
Not really. Every tool in existence has people that use it incorrectly. The fact that countless people find value in the tool means it probably is valuable.
It then shows hubris and a lack of imagination for someone in such a situation to think they can apply their negative results to extrapolate to the situation at large. Especially when so many are claiming to be seeing positive utility.
I had Claude read a 2k LOC module on my codebase for a bug that was annoying me for a while. It found it in seconds, a one line fix. I had forgotten to account for translation in one single line.
That's objectively valuable. People who argue it has no value or that it only helps normies who can't code or that sooner or later it will backfire are burying their heads in the sand.
Doesn't mean the hammers are bad, no matter how many people join the community.
You need to learn how to use the tools.
Doesn’t mean the tool is actually useful, no matter how many people join the community.
If only there were things called comments, clean-code, and what have you
Also, now that StackOverflow is no longer a thing, good luck meaningfully improving those code agents.
But what they asked the AI to do is something people have done a hundred times over, on existing platform tech, and it will likely have little to no capability to solve problems that come up 5-10 years from now.
The reason AI is so good at coding right now is the second dot-com tech bubble that occurred between the simultaneous release of mobile platforms and the massive expansion of cloud technology. But now that the platforms that existed during that time will no longer exist, because it's no longer profitable to put something out there, the AI platforms will be less and less relevant.
Sure, sites like Reddit will probably still exist where people will begin to ask for more and more information that the AI can't help with, and subsequently the AI will train off of that information; but the rate of that information is going to go down dramatically.
In short, at some point the AI models will be worthless and I suspect that'll be whenever the next big "tech revolution" happens.
What I found to be useful for complex tasks is to use it as a tool to explore that highly-dimensional space that lies behind the task being solved. It rarely can be described as giving a prompt and coming back for a result. For me it's usually about having winding conversations, writing lists of invariants and partial designs and feeding them back in a loop. Hallucinations and mistakes become a signal that shows whether my understanding of the problem does or does not fit.