In the last couple weeks, both Gemini and Claude have asked me, "Can I use the computer?" to answer some particular question. In both cases, my question to each was, "What computer? Mine, or do you have your own?" Here I had thought they were computers, in the vague Star Trek sense. I'm just using the free version in the browser, so I would have been surprised if it had been able to use my computer.
They had their own, and I could watch them script something up in Python to run the calculations I was looking for. It made me wonder who it was at Google/Anthropic who first figured out that the way to get LLMs to stop wetting their metaphorical pants when asked to do calculations was to give them a computer to use.
It did make me scratch my head when I was trying to prompt Nano Banana to generate something and Gemini started talking about the image generator in the third person: "The AI is getting stuck on the earlier instruction, even though we've now abandoned that approach." Felt a little "turtles all the way down" with that one!
Maybe I'm missing something, but it seems trivial to implement reading the magic bytes. I haven't tested it, but I'd expect most Linux image displayers/editors to automatically work with misnamed files, as that is almost entirely the purpose of magic bytes.
Personally, I think Microsoft is to blame for everyone relying too heavily on file extensions; it was a bad idea that led to a lot of security issues.
# curl -s https://upload.wikimedia.org/wikipedia/commons/6/61/Sun.png | file -
/dev/stdin: PNG image data, 256 x 256, 8-bit/color RGBA, non-interlaced
That's it: two utilities almost everybody has installed.

Much as I love kākāpō, there is no way I was going to invest more than a few minutes in figuring out how to do that.
I love this new world where I can "delegate my thinking" to a computer and get a GIF of a dumpy New Zealand flightless parrot where I would otherwise be unable to do so because I didn't have the time to figure it out.
(I published it as a looping MP4 because that was smaller than the GIF, another thing I didn't have to figure out how to do myself.)
It stops an LLM from being blocked by the inability to do this thing. Removing this barrier might enable the LLM to complete a task that would be considerable work for a human.
For instance, identifying which files are PNG files containing pictures of birds, regardless of filename or the presence or absence of a suffix. An image-handling LLM can identify whether an image is of a bird much more easily than it can determine that an arbitrary file is a PNG. It could probably still do it, wasting a lot of tokens along the way, but using a few commands to determine which files are even worth looking at as images means the LLM can do what it is good at.
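A minimal sketch of that pre-filter in Go (the filenames and the print-the-candidates behavior are just illustrative): check the 8-byte PNG signature and only pass matching files on to the image model.

    // pngfilter.go — keep only files whose first 8 bytes are the PNG
    // signature, regardless of filename or extension.
    package main

    import (
        "bytes"
        "fmt"
        "io"
        "os"
    )

    var pngMagic = []byte{0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'}

    func isPNG(path string) bool {
        f, err := os.Open(path)
        if err != nil {
            return false
        }
        defer f.Close()
        header := make([]byte, len(pngMagic))
        if _, err := io.ReadFull(f, header); err != nil {
            return false
        }
        return bytes.Equal(header, pngMagic)
    }

    func main() {
        // print only the paths worth sending to the image model
        for _, path := range os.Args[1:] {
            if isPNG(path) {
                fmt.Println(path)
            }
        }
    }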
I'm not sure when these new features landed because they're not listed anywhere in the official ChatGPT release notes, but I checked it with a free account and it's available there as well.
https://chatgpt.com/share/69781bb5-cf90-800c-8549-c845259c33...
> gmail (read-only)
> gmail.search_email_ids → any
> Description: Search Gmail message IDs by query/tags (read-only).
The ChatGPT app on Android disavows having this... In what context does ChatGPT get (read) access to Gmail? The desktop app?
Looks like it's for this feature: https://mashable.com/article/chatgpt-5-openai-gmail-calendar
Presumably you have to opt-in to turning this on somewhere.
I wonder when they'll start offering virtual, persistent dev environments...
You can start a session there and chat with it to get a bunch of work done, then come back to that session a day later and the virtual filesystem is in the same state as when you left it.
I haven't figured out if this has a time limit on it - it's possible they're doing something clever with object storage such that the cost of persisting those environments is really low, see also Fly's Sprites.dev: https://fly.io/blog/design-and-implementation/
A lot of companies have been wanting to move in this direction. Instead of maintaining a fleet of machines, you just get a bunch of thin clients and pay Microsoft or whoever to host the actual workloads. They already do this "kiosk"-style stuff for a lot of front-line staff.
Honestly, not having my own local hardware for development sounds like a living hell, but seems like the way we are going.
Honestly, I feel like it bores me, or (overwhelms?) me, because now it's "okay, I'll do this, then that, then that" and the scope of the project drastically expands. That comes with its own fatigue, plus the limits of free tokens or context with exe.dev, so I end up publishing it to a git provider, running it through gitingest, pasting that into Gemini in the web browser and asking for updates (it has a 1 million token context), and then pasting the result into Opencode with an OpenRouter Devstral key.
I used this workflow to drastically improve the UI of a project, but aside from some tinkering, I felt like the "fun" of the project definitely got reduced.
It was always fun for me to use LLMs when I was in the loop (I didn't use agents, just a copy-paste workflow from the web), but now agents have kind of replicated that too, and have gotten (I must admit) pretty good at it.
I don't know, man, any thoughts on how to make such things fun again? When LLMs first came out, or even before, using them to create single scripts was fun, but creating whole projects with huge scope feels very fun-sucking, imo.
Nobody offered multiplatform and we really needed it!
— You’re absolutely right. I should not have done that. Would you like me to help undo the launch?
— Yes! Quickly! Do it!
— <completely made up crap which does not work>
[1] https://www.npr.org/2025/07/09/nx-s1-5462609/grok-elon-musk-...
Then, I witnessed the answers unfolding before my eyes in real time - torrential TV and Web propaganda, warmongering, nationalism and, worst of all, total acceptance of the unacceptable in a critically large portion of the country's population. Among the grandchildren of those who fought against the same things at the price of tens of millions of lives. Immediately after the Crimean takeover it was clear to me that there would be war. Many denied this, mocking me and calling me a tinfoil hatter.
Well, I also always used to wonder who those morons were who let things go south in Terminator, 1984, The Matrix, Cat's Cradle and other well-known dystopias: what kind of people were they, and what were they thinking?
It doesn't really matter that these concerns are on the opposite sides of the imaginary axis.
What really matters is this universal drive in too many people to dig their own and the next guy's graves, always finding an excuse: "if not us, then someone else will do it". And: "The times are different now". And: "So you're comparing AI and fascism?".
It didn't start in 2022; it just entered a new phase. This conflict had already been going on since 2014, and the writing was on the wall the whole time. Warnings about Russia under Putin go back at least two decades; it was all speculated about, and to some degree known, where he was heading.
> Because from what I saw wars tend to pop up all of the sudden
Usually not. The precise moment they break out is often sudden, but wars are usually the result of long processes. Most of the time, the people who are involved and informed know quite well what's going on, and it just needs a single spark for the situation to explode in the predicted way.
Take the USA, for example, and the fears of a civil war that have been around for a while now. It might happen or not, but if the country explodes, it won't be a sudden development that happened overnight and that nobody could have seen coming; it will be the result of a long-running process that kept heating up the political climate.
Then 2014 Maidan happened in Ukraine and all the propagandist hell broke loose in Russia.
Yes. It’s crazy to me this would even need to be asked but I guess most don’t pay attention until it’s of individual significance to them.
Dependencies introduce unnecessary LOC and features, which are, more and more, just written by LLMs themselves. It is easier to just write the necessary functionality directly. Whether that is more maintainable or not is a bit YMMV at this stage, but I would wager it is improving.
Maybe the smallest/most convenient packages (looking at you, is-even) are obsolete, but meaningful packages still abstract a lot of complexity that IMO isn't easier to one-shot with an LLM.
Scikit-learn
Pandas
Polars
Vanity metrics should not be used for engineering decisions.
This can be fixed in npm if you publish pre-compiled binaries but that has its own problems.
Same goes for Rust. Sometimes one package implicitly imports another in a different version. And poring over cargo tree output (cargo tree --duplicates lists crates pulled in at more than one version) to resolve the issue just doesn't seem very appealing.
Don't get me wrong, I'm not a luddite, I use claude code and cursor but the code generated by either of those is nowhere near what I'd call good maintainable code and I end up having to rewrite/refactor a big portion before it's in any halfway decent state.
That said, with the most egregious packages like left-pad in the Node.js world, it was always a better idea to build your own instead of depending on them.
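For something like left-pad, "build your own" really is just a few lines; here's a sketch in Go, since that's the language most of this thread settles on (the function name and padding semantics are just illustrative):

    // leftpad.go — the canonical trivial dependency, written directly.
    package main

    import (
        "fmt"
        "strings"
        "unicode/utf8"
    )

    // leftPad pads s with the rune pad until it is width runes long.
    func leftPad(s string, width int, pad rune) string {
        if n := width - utf8.RuneCountInString(s); n > 0 {
            return strings.Repeat(string(pad), n) + s
        }
        return s
    }

    func main() {
        fmt.Println(leftPad("42", 5, '0')) // prints "00042"
    }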
For a decent number of relatively pedestrian tasks though, I can see it.
I don't think it really affects the point discussed above for now, because we were discussing average users, and by definition, the first person to code a plausible web browser with an agent isn't an average user - unless of course that can be reliably replicated with any average user.
But on that note, the takeaways on the post you linked are relevant, because the author bucked a few trends to do this, and concluded among other things that "The human who drives the agent might matter more than how the agents work and are set up, the judge is still out on this one."
This will obviously change, but the areas that LLMs need to improve on here are ones they're notoriously weak on, so it could take a while.
The transcript doesn't show it (I think it faked it) but here's the code in the sidebar:
> bash -lc mkdir -p /mnt/data/cowsay-demo && cd /mnt/data/cowsay-demo && npm init -y >/dev/null && npm i cowsay@latest >/dev/null && echo 'Installed cowsay version:' && node -e "console.log(require('cowsay/package.json').version)"
npm error code E401
npm error Incorrect or missing password.
npm error If you were trying to login, change your password, create an
npm error authentication token or enable two-factor authentication then
npm error that means you likely typed your password in incorrectly.
npm error Please try again, or recover your password at:
npm error https://www.npmjs.com/forgot
npm error
npm error If you were doing some other operation then your saved credentials are
npm error probably out of date. To correct this please try logging in again with:
npm error npm login
npm error A complete log of this run can be found in: /home/oai/.npm/_logs/2026-01-26T21_20_00_322Z-debug-0.log
> Checking and overriding npm registry
> It seems like the registry option is protected, possibly pointing to an internal OpenAI registry that requires authentication. To bypass this, I can override the registry in the command with npm i cowsay --registry=https://registry.npmjs.org/. Let's give this a try and see if it works.

It's unclear if that helped.
I tried again and it worked. It seems like I have to ask for it to do things "in the container" or it will just give me directions about how to do it.
It appears to have 4GB of RAM and 56 (!?) CPU cores https://chatgpt.com/share/6977e1f8-0f94-8006-9973-e9fab6d244...
If people are getting this for free, or even as part of a ChatGPT offering, it's effectively subsidized too. Low-end providers with their $7/year deals are somewhat under threat if ChatGPT provides 56 cores for free. It doesn't seem right to provide so many cores for (free??)
Are you running this in your free account, as you mention in the blog post, Simon, or in your paid account?
I used a free account to check if the feature was available there and it tried to get me to upgrade two prompts in (just enough for me to confirm the container worked and could install packages).
> I used a free account to check if the feature was available there and it tried to get me to upgrade two prompts in (just enough for me to confirm the container worked and could install packages).
Wait, it tried... to make you upgrade your ChatGPT account from free to paid? Sorry, I didn't get what you meant here.
(Funnily enough, I asked ChatGPT what it thinks of your text, and it also thinks it was asking you to pay up.)
Is this thing (maybe with some additions to make it like sprites.dev?) plus some ad features for basic queries going to be how OpenAI monetizes?
I mean, I am part of the lowend community (the indie community of hosting providers), and they are all really pissed, with some shutting down because of RAM price increases. OpenAI has all the RAM in the world right now, so is it trying to become a monopoly in this instance?
I just found it really dystopian that it asked you to pay. Can you share a pic of it if possible, or share the free conversation? Heck, I might have to try it now on my free account as well.
Curiosity's piqued right now.
It did what I asked - proving that the container feature works even for free accounts - but then displayed a message saying that I was out of free prompts and would need to upgrade or wait before I could run more.
These cores are shared with all the other containers - there could be hundreds more.
Here's a full list which looks accurate to me: https://chatgpt.com/share/6977ffa0-df14-8006-9647-2b8c90ccbb...
Esp. with Go's quick compile time, I can see myself using it more and more even in my one-off scripts that would have used Python/Bash otherwise. Plus, I get a binary that I can port to other systems w/o problem.
Compiled is back?
Am I in the Truman show? I don’t think AI has generated even 1% of the code that I run in prod, nor does anyone I respect. Heavily inspired by AI examples, heavily assisted by AI during research sure. Who are these devs that are seeing such great success vibecoding? Vibecoding in prod seems irresponsible at best
AI written code != vibecoding. I think anyone who believes they are the same is truly in trouble of being left behind as AI assisted development continues to take hold. There's plenty of space between "Claude build me Facebook" and "I write all my code by hand"
Our test coverage has improved dramatically, our documentation has gotten better, our pace of development has gone up. There is also a _big_ difference between the quality of the end product between junior and senior devs on the team.
Junior devs tend to be just like "look at this ticket and write the code."
Senior devs are more like: okay, can you read the ticket, try to explain it to me in your own words, let's refine the description, can you propose a solution -- ugh, that's awful, what if we did this instead.
You would think you would not save a lot of time that way, but even spending an _hour_ trying to direct claude to write the code correctly is less than the 5-6 hours it would take to write it yourself for most issues, with more tests and better documentation when you are finished.
When you first start using claude code, it feels like you are spending more time to get worse work out of it, but once you sort of build up the documentation/skills/tools it needs to be successful, it starts to pay dividends. Last week, I didn't open an IDE _once_ and I committed several thousands lines of code across 2 or 3 different internal projects. A lot of that was a major refactor (smaller files, smaller function sizes, making things more DRY) that I had been putting off for months.
Claude itself made a huge list of suggestions, which I knocked back to about 8 or 10, it opened a tracking issue in jira with small, tractable subtasks, then started knocking out one at a time, each of them being a fairly reviewable PR, with lots of test coverage (the tests had been built out over the previous several months of coding with cursor and claude that sort of mandated them to stop them from breaking functionality), etc.
I had a coworker and ChatGPT estimate how long the issue would take without AI. The coworker looked at the code base and said "two weeks". Both Claude and ChatGPT estimated somewhere in the 6-8 week range (which I thought was a wild overestimate, even without AI). Claude Code knocked the whole thing out in 8 hours.
I think a lot of people wrote it off initially because it was low quality. But Gemini 3 Pro or Sonnet 4.5 saves me a ton of time at work these days.
Perfect? Absolutely not. Good enough for tons of run of the mill boilerplate tasks? Without question.
Frontend has always been a shitshow, ever since dynamic JS web UIs were invented. Between JS and CSS, no one cares what runs the page or how many MB it takes to show one button.
But on the backend, vibecoding is still rare, and we are lucky it is that way and that there has been no train crash because of it. Yet.
It’s been interesting to observe when people rave about AI or want to show you the thing they built, to stop and notice what’s at stake. I’m finding more and more, the more manic someone comes across about AI, the lower the stakes of whatever they made.
Looking at the quality crisis at Microsoft, between GitHub reliability and broken Windows updates, I fear LLMs are hurting them.
I totally see how LLMs make you feel more productive, but I don't think I'm seeing end customer visible benefits.
Ultimately I doubt LLMs have much of an impact on code quality either way compared to the increased coordination costs, increased politics, and the increase of new commercial objectives (generating ads and services revenue in new places). None of those things are good for product quality.
That also probably means that LLMs aren't going to make this better, if the problem is organizational and commercial in the first place.
The LLM still benefits from the abstraction provided by Python (fewer tokens and less cognitive load). I could see a pipeline where one model writes in Python or the like, and another model is then tasked with compiling it into a more performant language.
- Libraries don't necessarily map one-to-one from Python to Rust/etc.
- Paradigms don't map neatly; Python is OO, Rust leans more towards FP.
- Even if the code can be rewritten in Rust, it's probably not the most idiomatic ("Rustic"?) approach, nor the most performant.
Also, what happens when bug fixes are needed? Again first in Py and then in Rs?
At that point, the legibility and prevalence of humans who can read the code becomes almost more important than which language the machine "prefers."
The future belongs to generalists!
I really do want to live in the world where P = NP and we can trivially get polynomial-time algorithms for problems believed to be intractable.
I reject your reality and substitute my own.
Couldn't be more correct.
The experienced generalists with techniques of verification testing are the winners [0] in this.
But one thing you cannot do is openly admit, or be found out to have said, something like "I don't know a single line of Rust/Go/Typescript/$LANG code but I used an AI to do all of it" when the system breaks down and you can't fix it.
It would be quite difficult to take seriously a SWE who prides themselves on having zero understanding and experience of building production systems, and who runs the risk of losing the company time and money.
The fewer category errors a language or framework introduces, the more successful LLMs will be at interacting with it. Developers enjoy freedom and many ways to solve problems, but LLMs thrive in the presence of constraints. Frontiers here will be extensions of Rust or C-compatible languages that solve whole categories of issues through tedious language features, and especially build/deploy software that yields verifiable output and eliminates choice from the LLMs.
> ... and eliminates choice from the LLMs.
Perl is right out! Maybe the LLMs could help us decipher extant Perl "write once, maintain never" code.

It even guessed the vintage correctly!
> This appears to be a custom template system from the mid-2000s era, designed to separate presentation logic from PHP code while maintaining database connectivity for dynamic content generation.
(I did Delphi back when VB6 was the other option so remember this problem well)
Got anything to back up this wild statement?
Please know that I am asking as I am curious and do not intend to be disrespectful.
The user of the LLM provides a new input, which might or might not closely match the existing smudged-together inputs, to produce an output in the same general pattern as the outputs that would be expected from the training dataset.
We aren't anywhere near general intelligence yet.
Functionally, on many suitably scoped tasks in areas like coding and mathematics, LLMs are already superintelligent relative to most humans - which may be part of why you’re having difficulty recognizing that.
Do I know the code base like the back of my hand? Nope. Can I confidently talk to how certain functions work? Not a chance.
Can I deploy what the business wants? Yep. Can I throw error logs into LLMs and work out the cause of issues? Mostly.
I get that some of you may want to go above and beyond for your company and truly create something beautiful, but then guess what: that codebase is theirs. They aren't your family. Get paid and move on.
A lot of things are "so much faster" than the right thing. "Vibe traffic safety laws" are much faster than ones that increase actual traffic safety: http://propublica.org/article/trump-artificial-intelligence-... . You, your team, and colleagues are producing shiny trash at unbelievable velocity. Is that valuable?
Is this true? It seems to be a massive assumption.
Some of the code is janky garbage, but that's what most code is. There's no use pearl-clutching.
Human engineering time is better spent at figuring out which problems to solve than typing code token by token.
Identifying what to work on, and why, is a great research skill to have, and I'm glad the technology is getting realistic enough to make that a baseline skill.
The vast majority of code is garbage, and has been for several decades.
The real money we used to get paid was for business success, not directly for code quality; the quality metrics we told ourselves about were closer to CV-driven development than anything the people with the money understood, let alone cared about. That, in turn, is why the term "technical debt" was coined: as a way to get leadership to care about what we care about.
There's some domains where all that stuff we tell ourselves about quality, absolutely does matter… but then there's the 278th small restaurant that wants a website with a menu, opening hours, and table booking service without having e.g. 1500 American corporations showing up in the cookie consent message to provide analytics they don't need but are still automatically pre-packaged with the off-the-shelf solution.
One ironic thing about LLM-generated bad code is that churning out millions of lines just makes it less likely the LLM is going to be able to manage the results, because token capacity is neither unlimited nor free.
(Note I’m not saying all LLM code is bad; but so far the fully vibecoded stuff seems bad at any nontrivial scale.)
This is like dissing software from 2004 because it used 2 GB of extra memory.
In the last year, token context windows increased by about 100x and halved in cost at the same time.
If this is the crux of your argument, technology advancement will render it moot.
I would rather make N bad prototypes to understand the feasibility of solving N problems than trying to write beautiful code for one misguided problem which may turn out to be a dead end.
There are a few orders of magnitude more problems worth solving than you can write good code for. Your time is your most important resource, writing needlessly robust code, checking for situations that your prototype will never encounter, just wastes time when it gets thrown away.
A good analogy for this is how we built bridges in the Roman empire, versus how we do it now.
From the other side, the vast majority of customers will happily take the cheap/free/ad-supported buggy software. This is why we have all these random Google apps, for example.
Take a look at the bug tracker of any large open source codebase, there will be a few tens of thousands of reported bugs. It is worse for closed corporate codebases. The economics to write good code or to get bugs fixed does not make sense until you have a paying customer complain loudly.
So much of the discussion here on HN critiquing "vibe code" etc. implies that a human would have written it better, which in the vast, vast majority of cases is simply not true.
And most of the code a compiler is expected to compile - seen from the perspective of fixing bugs and issues in compilers - is absolutely terrible. The day it can be reliably rewritten or improved with AI can't come fast enough.
I've seen lots of different codebases from the inside, some good some bad. As a rule smaller + small team = better and bigger + more participants = worse.
Then you just let it iterate until tests pass. If you are not happy with the design, suggest a newer design and let it rip.
All this is expensive and wasteful now, but stuff becoming 100-1000x cheaper has happened for every technology we have invented.
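For concreteness, the kind of test gate that loop runs against might look like this: a table-driven Go test the agent must keep green while it iterates. Slugify and its cases are invented stand-ins, defined in the same _test.go file so the demo is self-contained:

    // slug_test.go — run with `go test`. The agent may rewrite Slugify
    // however it likes, as long as these cases keep passing.
    package slug

    import (
        "strings"
        "testing"
    )

    // Slugify is a deliberately naive first draft for the agent to improve.
    func Slugify(s string) string {
        return strings.ReplaceAll(strings.ToLower(strings.TrimSpace(s)), " ", "-")
    }

    func TestSlugify(t *testing.T) {
        cases := []struct{ in, want string }{
            {"Hello World", "hello-world"},
            {"  Trim Me  ", "trim-me"},
        }
        for _, c := range cases {
            if got := Slugify(c.in); got != c.want {
                t.Errorf("Slugify(%q) = %q, want %q", c.in, got, c.want)
            }
        }
    }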
It gives me a bit of a 'turtles all the way down' feeling because if the test set can be 'good' why couldn't the code be good as well?
I'm quite wary of all of this, as you've probably gathered by now: the idea that you can toss a bunch of "pass" tests into a box and then generate code until all of the tests pass is effectively a form of fuzzing. You've got something that passes your test set, but it may do a lot more than just that, and your test set is not going to be able to exhaustively enumerate the negative cases.
This could easily result in 'surprise functionality' that you did not anticipate during the specification phase. The only way to deal with that then is to audit the generated code, which I presume would then be farmed out to yet another LLM.
This all places a very high degree of trust into a chain of untrusted components and that doesn't sit quite right with me. It probably means my understanding of this stuff is still off.
What you are missing is that the thing driving this untrusted pile of hacks keeps getting better at a rapid pace.
So much so that the quality of the output is passable now, mimicking man-years of software engineering in a matter of hours.
If you don’t believe me, pick a project that you have always wanted to build from scratch and let cursor/claude code have a go at it. You get to make the key decisions, but the quality of work is pretty good now, so much that you don’t really have to double check much.
The days of indiscriminately scraping every scrap of code on the internet and pumping it all in are long gone, from what I can tell.
The next version of LLMs: write with GPT-5.2 now, improve the quality using 5.3 in a couple of months; best of both worlds.
The Go standard library is a particularly good fit for building network services and web proxies, which fits this project perfectly.
It turns out that verbosity isn't really a problem when LLMs are the one writing the code based on more high level markdown specs (describing logic, architecture, algorithms, concurrency, etc), and Go's extreme simplicity, small range of language constructs, and explicitness (especially in error handling and control flow) make it much easier to quickly and accurately review agent code.
It also means that Go's incredible (IMO) runtime, toolchain, and standard library are no longer marred by the boilerplate either, and I can begin to really appreciate their brilliance. It has me really reconsidering a lot of what I believed about language design.
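As a small illustration of that network-service fit (the upstream address and listen port are invented for the example): a working reverse proxy on top of the standard library is about a dozen lines.

    // proxy.go — a minimal reverse proxy using only the Go standard library.
    package main

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        // hypothetical upstream service to proxy to
        backend, err := url.Parse("http://localhost:8080")
        if err != nil {
            log.Fatal(err)
        }
        proxy := httputil.NewSingleHostReverseProxy(backend)
        // forward everything arriving on :9090 to the backend
        log.Fatal(http.ListenAndServe(":9090", proxy))
    }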
I've written probably tens of thousands of lines of Rust at this point, and while I used to absolutely adore it, I've really completely fallen out of love with it, and part of it is that it's not just the syntax that's horrible to look at (which I only realized after spending some time with Go and Python), but you have to always keep in mind a lot of things:
- the borrow checker
- lifetimes
- all the different kinds of types that represent different ways of doing memory management
- parsing out sometimes extremely complex and nearly point-free iterator chaining
- dealing with a complex type system that can become very unwieldy if you're not careful
- and more I'm probably not thinking of right now
Not to mention the way the standard library exposes you to the full bore of all the platform-specific complexities it's designed on top of, and forces you to deal with them, instead of exposing a best-effort POSIX-like unified interface, so path and file handling can be hellish. (this is basically the reverse of fasterthanlime's point in the famous "I want off mr. golang's wild ride" essay).
It's just a lot more cognitive overhead to just getting something done if all you want is a fast statically compiled, modern programming language. And it makes it even harder to review code. People complain about Go boilerplate, but really, IME, Rust boilerplate is far, far worse.
On top of that, Go has pretty much replaced my Python usage for scripting since it’s cheap to generate code and let the compiler catch obvious issues. Iteration in Rust is a lot slower, even with LLMs.
I get fasterthanlime’s rant against Go, but none of those criticisms apply to me. I write distributed-systems code for work where Go absolutely shines. I need fast compilation, self-contained binaries, and easy concurrency support. Also, the garbage collector lets me ignore things I genuinely couldn’t care less about - stuff Rust is generally good at. So choosing Go instead of Rust was kinda easy.
Golang's libraries are phenomenal, and porting over to multiple servers is pretty easy; it's really portable.
I actually find Golang good for CLI projects, Web projects and just about everything.
Usually the only time I still use Python (via uvx) or vibe code with it is when I'm manipulating images or PDFs, or building a really minimalist tkinter UI in Python/uv.
Although when I tried converting the Python to Go code (which ended up using Fyne for the GUI projects), it was surprisingly robust; I might still use Python in some niche use cases.
Check out my other comment in here about a vibe-coded project written in a single prompt when Gemini 3 Pro launched on the web (I hope it's not promotion, because it's open source with zero telemetry - I didn't even ask for that to be added, haha!)
Golang is love. Golang is life.
Same boat! In fact I used to (and still do) dislike Go's syntax and error handling (the same four lines repeated every time you call a function), but given that LLMs can write the code and do the cross-model review for me, I literally don't even see the Go source code, which is nice because I'd hate it if I did (my dislike of Go's syntax plus all the AI slop in the code would drive me nuts).
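For anyone who hasn't written Go, this is the repetition being described; readConfig and its caller are hypothetical:

    // errcheck.go — the "same four lines" in context: check, wrap, return.
    package main

    import (
        "fmt"
        "os"
    )

    func readConfig(path string) ([]byte, error) {
        data, err := os.ReadFile(path)
        if err != nil {
            // the ritual that follows nearly every fallible call in Go
            return nil, fmt.Errorf("reading %s: %w", path, err)
        }
        return data, nil
    }

    func main() {
        if _, err := readConfig("app.conf"); err != nil {
            fmt.Fprintln(os.Stderr, err)
        }
    }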
But at the end of the day, Go has good scaffolding, the best tooling (maybe on par with Rust's, definitely better than Python even with uv), and tons of training data for LLMs. It's also a rather simple language, unlike Swift (which I wish was simpler because it's a really nice language otherwise).
I'm sure it will eventually be true, but this seems very unlikely right now. I wish it were true, because we're in a time where generic software developers are still paid well, so doing nothing all day, with this salary, would be very welcome!
For example, Claude can fluently generate Bevy code as of the training cutoff date, and there's no way there's enough training data on the web to explain this. There's an agent somewhere in a compile test loop generating Bevy examples.
A custom LLM language could have fine grained fuzzing, mocking, concurrent calling, memoization and other features that allow LLMs to generate and debug synthetic code more effectively.
If that works, there's a pathway to a novel language having higher quality training data than even Python.
I wrote this custom language. It's on Github, but the example code that would have been available would be very limited.
I gave it two inputs -- the original bash script and an example of my pipeline language (unrelated jobs).
The code it gave me was syntactically correct, and was really close to the final version. I didn't have to edit very much to get the code exactly where I wanted it.
This is to say -- if a novel language is somewhat similar to an existing syntax, the LLM will be surprisingly good at writing it.
I’ve thought about this and arrived at a rough sketch.
The first principle is that models like ChatGPT do not execute programs; they transform context. Because of that, a language designed specifically for LLMs would likely not be imperative (do X, then Y), state-mutating, or instruction-step driven. Instead, it would be declarative and context-transforming, with its primary operation being the propagation of semantic constraints.

The core abstraction in such a language would be the context, not the variable. In conventional programming languages, variables hold values and functions map inputs to outputs. In a ChatGPT-native language, the context itself would be the primary object, continuously reshaped by constraints. The atomic unit would therefore be a semantic constraint, not a value or instruction.
An important consequence of this is that types would be semantic rather than numeric or structural. Instead of types like number, string, bool, you might have types such as explanation, argument, analogy, counterexample, formal_definition.
These types would constrain what kind of text may follow, rather than how data is stored or laid out in memory. In other words, the language would shape meaning and allowable continuations, not execution paths. An example:
@iterate: refine explanation until clarity ≥ expert_threshold
A non-deterministic programming language, with options to drop down into JavaScript or even C if you need to specify certain behaviors.
I'd need to be much better at this though.
You could also work backwards from this paper: https://arxiv.org/abs/2512.18470
I'm imagining something like:
"Hi Ralph, I've already coded a function called GetWeather in JS, it returns weather data in JSON can you build a UI around it. Adjust the UI overtime"
At runtime, modify the application with improvements: say all of a sudden we're getting air-quality data in the JSON; the Ralph loop will notice and update the application.
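A toy sketch of that "notice new fields" trigger in Go (the endpoint URL and polling interval are made up, and a real Ralph loop would hand the new field to the agent rather than just print it):

    // fieldwatch.go — poll a JSON endpoint and report any top-level field
    // we haven't seen before (say, air_quality appearing alongside weather).
    package main

    import (
        "encoding/json"
        "fmt"
        "net/http"
        "time"
    )

    func main() {
        seen := map[string]bool{}
        for {
            // hypothetical endpoint backed by the GetWeather function
            resp, err := http.Get("http://localhost:8080/weather")
            if err == nil {
                var payload map[string]json.RawMessage
                if json.NewDecoder(resp.Body).Decode(&payload) == nil {
                    for k := range payload {
                        if !seen[k] {
                            seen[k] = true
                            fmt.Println("new field, hand it to the agent:", k)
                        }
                    }
                }
                resp.Body.Close()
            }
            time.Sleep(time.Minute)
        }
    }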
The Arxiv paper is cool, but I don't think I can realistically build this solo. It's more of a project for a full team.
Go is positioned really well here, and Steve Yegge wrote a piece on why. The language is fast, less bloated than Python/TS, and less dogmatic than Java/Kotlin. LLMs can go wham with Go and the compiler will catch most of the obvious bugs. Faster compilation means you can iterate through a process pretty quickly.
Also, if I need abstraction that’s hard to achieve in Go, then it better be zero-cost like Rust. I don’t write Python for anything these days. I mean, why bother with uv, pip, ty, mypy, ruff, black, and whatever else when the Go compiler and the standard tooling work better than that decrepit Python tooling? And it costs almost nothing to make my scripts faster too.
I don’t yet know how I feel about Rust since LLMs still aren’t super good with it, but with Go, agentic coding is far more pleasurable and safer than Python/TS.
Plus the JS/Python dependency ecosystem is tiring. Yeah, I know there’s uv now, but even then I don’t see much reason to suffer through that when opting for an actually type-safe language costs me almost nothing.
Dynamic languages won’t go anywhere, but Go/Rust will eat up a pretty big chunk of the pie.
First I don't think this is the end of those languages. I still write code in Ruby almost daily, mostly to solve smaller issues; Ruby acts as the ultimate glue that connects everything here.
Having said that, Ruby is on a path to extinction. That started way before AI, though, and has many different reasons; it happened to Perl before, and now Ruby is following suit. Lack of trust in Ruby Central as our divine new ruler is one (recent) reason, after they decided to turn against the community. Soon Ruby can be renamed Suby, to indicate that Shopify is running the show now. What is interesting is that you still see articles saying "Ruby is not dead, Ruby is not dead". Just the frequency of those articles is worrying - it's like someone trying to pitch last-minute sales just before the company goes bankrupt. The human mind is a strange thing.
One good advantage of e.g. Python and Ruby is that they are excellent at prototyping ideas into code. That part won't go away, even if AI infiltrates more computers.
Why wouldn't they go away for prototyping? If an LLM can help you prototype in whatever language, why pick Ruby or Python?
(This isn't a gotcha question. I primarily use python these days, but I'm not married to it).
Pause for a moment and think through a realistic estimation of the numbers and proportions involved.
Instructions files are just pre-made decisions that steer the agent. We try to reduce the surface area for nondeterminism using these specs, and while the models will get better at synthesizing instructions and code understanding, every decision we remove pays dividends in reduced token usage/time/incorrectness.
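As a hedged illustration, such a pre-made-decisions file might look something like this (the file name and every rule are invented for the example):

    # AGENTS.md (hypothetical)
    - Use Go 1.22+; run gofmt and go vet before committing.
    - Wrap every returned error: fmt.Errorf("context: %w", err).
    - New endpoints require a table-driven test in the same package.
    - Never add a dependency without an issue explaining why.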
I think this is what orgs like Supabase see, and are trying to position themselves as solutions to data storage, auth, events etc within the LLM coding space, and are very successful albeit in the vibe coder area mostly. And look at AWS Bedrock, they’ve abstracted every dimension of the space into some acronym.
Frameworks might go the way of the dinosaur. If an LLM can manage a lot of complex code without human-serving abstractions, why even use something like React?
Sure, you could write a frontend without something like React, and create a backend without something like Django, but the code generated by an LLM will become just as convoluted and hard to maintain as if a human had written it.
LLMs are still _quite_ bad at writing maintainable code - even for themselves.
Interestingly, since we are talking about Go specifically, I never found that I was spending too much time typing... types. Obviously more than with a Python script, but never at a level where I would consider it a problem. And now, with newer Python projects using type annotations, the difference has gotten smaller.
Just FWIW, you don't actually have to put type annotations in your own code in order to use annotated libraries.
This is a big assumption. I write a lot of Ansible, and it can’t even format the code properly, which is a pretty big deal in yaml. It’s totally brain dead.
I don't think I've ever seen Opus 4.5 or GPT-5.2 get stuck in a loop like that. They're both very good at spotting when something doesn't work and trying something else instead.
Might be a problem with older, weaker models I guess.
Have you tried? I've had surprisingly good results with Gleam.
hoho - I did a 20/80 human/claude project over the long weekend using Janet: https://git.sr.ht/~lsh-0/pj/tree (dead simple Lerna replacement)
... but I otherwise agree with the sentiment. Go code is so simple it scrubs any creative fingerprints anyway. The Clojure/Janet/scheme code I've seen it writing isn't _great_ but it gets the job done quickly and correct enough for me to return to it later and golf it some.
The surmise that compiled languages fit that just doesn't follow, in the same way LLMs have trouble finishing HTML because the open/close tags are too far apart.
The language that an LLM would succeed with is one where:
1. Context is not far apart
2. The training corpus is wide
3. Keywords, variables, etc are differentiated in the training.
4. REPL like interactivity allows for a feedback loop.
So I think it's premature to assume the LLM will do any better with compiled languages just because those languages are less used owing to human limitations.
Astronaut 2: Always has been...
I mean, people mention Rust and how AI can write proper Rust code with a linter and some other tooling, but man, trust me, AI can write some pretty good Go code.
I mean, though, I don't want everyone to suddenly start writing Go code with AI, because I have been doing it for over a year and it's something I vibe with; it's my personal style. I would lose some points of uniqueness if everyone started doing the same, haha!
Man, my love for Go runs deep. It's simple, cross-platform (usually) and compiles super fast. I "vibe code" but have faith that I can always manage the code afterwards.
(Self-promotion? Sorry about that, but I created a single-main.go Go project with a timer/pomodoro over websockets using gorilla (single dep): https://spocklet-pomodo.hf.space/)
So Shhh let's keep it a secret between us shall we! ;)
(Oh yeah! I recently created a WHMCS alternative written in Go that hooks up to any podman/gvisor instance to build your own mini VPS with my own tmate server. Lots of glue code, but it actually generated it on the first try! It's surprisingly good. I will try to release it as open source, and I'm thinking of charging just once if people want everything set up, or something custom.
Though one minor nitpick: from what I can tell, the complexity rises many-fold between a single-file project and anything that requires a database in Go. But Go's pretty simple and I just LOVE it.)
Also, AI's pretty good at niche languages too: I tried to vibe code an fzf alternative, ported from Go to V-lang, and found the results really promising!
So cross-platform vibe-coded malware is the future then?
Then it became a cat-and-mouse game between obfuscators and deobfuscators.
John Hammond has a *BRILLIANT* video on this topic. 100% recommended.
Honestly, going by John Hammond, I feel like Nim or V-lang is where vibe-coded malware will probably come from. Nim has been used for hacking so much that, IIRC, Windows actually flagged the Nim compiler itself as malware!
Nim's biggest issue is that hackers don't know it, but if LLMs fix that, Nim becomes a really lucrative language for hackers; John Hammond noted that Nim's libraries for hacking are still very decent.
I jumped on the Claude Code bandwagon and I dropped off chatgpt.
I find the chatgpt voice interface to be infuriating; it literally talks in circles and just spews summary garbage whenever I ask it anything remotely specific.
You cannot use `sudo apt install` inside it.
They use gVisor, and other container isolation mechanisms: https://ryan.govost.es/2025/openai-code-interpreter/
Golden years for cybersecurity people
If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.
Here's my email because I have nothing to hide:
> Hey, could you clarify why you shadow-banned my account, or am I just breaking your circlejerk by posting opinions your mods disagree with? Also, how are my posts related to IC design flagged as dead?
> Literally every other comment that is slightly political being removed I'd understand, but apparently your moderators are just mentally insane. Can you also explain why you harbor AI-made garbage on the site? It doesn't help the website's "Quality".
I can see the sandbox escapes, remote code execution paths, exfiltration methods, and all the vibe-coded sandcastles waiting to be knocked down, because we have folks openly admitting they do not know a single line of the code they are prompting the AI to write.
I don't think we know the scale of the security issues we will see, given the level of hubris around AI taking care of all the coding.
Someone will have to clean up the mess made by those creators who think they can "create" anything reliable with their ChatGPT.