This reminds me of when I tried to have Claude port an Android libgdx-based game to a WASM-based libgdx version so I could play it in the browser.

No matter how much I tried to force it to stick to a mostly line-by-line port, it kept trying to "improve" the code. At some point it had to undo everything as it introduced a number of bugs. I asked it: "What should I add to your prompt so you won't do this again?" and it gave me this:

  ### CRITICAL LESSON: Don't "Improve" During Porting
  -  **BIGGEST MISTAKE: Reorganizing working code**
    - **What I did wrong:** Tried to "simplify" by splitting `createStartButton()` into separate creation and layout methods
    - **Why it failed:** Introduced THREE bugs:
      1. Layout overlap (getY() vs getY() - getHeight())
      2. Children not sized (Group.setSize() doesn't affect children)
      3. Origins not updated (scaling animations broken)
    - **The fix:** Deleted my "improvements" and copied the original Android pattern faithfully
    - **Root cause:** Arrogance - assuming I could improve production-tested code without understanding all the constraints
    - **Solution:** **FOLLOW THE PORTING PRINCIPLES ABOVE** - copy first, don't reorganize
    - **Time wasted:** ~1 hour debugging self-inflicted bugs that wouldn't exist if I'd just copied the original
    - **Key insight:** The original Android code is correct and battle-tested. Your "improvements" are bugs waiting to happen.

I like Claude's self-reflection. Unfortunately, even adding this to CLAUDE.md didn't fix it; it kept taking wrong turns, so I had to abandon the effort.
I wish there was a feature to say "you must re-read X" after each compaction.
Some people use hooks for that. I just avoid CC and use Codex.
  • Lionga · 14 minutes ago
Well, it's close to AGI. Can you really expect AGI to follow simple instructions from dumbos like you when it can do the work of god?
Some quotes from the article stand out: "Claude after working for some time seem to always stop to recap things" Question: Were you running out of context? That's why certain frameworks like intentional compaction are being worked on. Large codebases have specific needs when working with an LLM.

"I've never interacted with Rust in my life"

:-/

How is this a good idea? How can I trust the generated code?

  • johnfn · 38 minutes ago
The author says that he runs both the reference implementation and the new Rust implementation through 2 million (!) randomly generated battles and flags every battle where the results don't line up.
Yeah, and he claims a pass rate of 99.96%; at that point the remaining mismatches might be bugs in the original implementation.
  • simonw · 32 minutes ago
This is the key to the whole thing in my opinion.

If you ask a coding agent to port code from one language to another and don't have a robust mechanism to test that the results are equivalent, you're inevitably going to waste a lot of time and money on junk code that doesn't work.

I'm very skeptical, but this is also something that's easy to compare using the original as a reference implementation, right? Providing lots of random input and fixing any disparities is a classic approach for rewriting/porting a system.
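
The random-input comparison described in this subthread is usually called differential testing. Here is a minimal Python sketch, assuming both implementations can be driven from the same seed. `reference_battle` and `ported_battle` are hypothetical stand-ins, not the author's actual harness, and the "port" contains a deliberately simulated bug:

```python
import random


def reference_battle(seed: int) -> int:
    """Hypothetical stand-in for the trusted original implementation."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 100) for _ in range(10))


def ported_battle(seed: int) -> int:
    """Hypothetical stand-in for the port, with a simulated porting bug."""
    result = reference_battle(seed)
    # Simulated bug: the port diverges whenever the seed is a multiple of 2500.
    return result + 1 if seed % 2500 == 0 else result


def differential_test(n_cases: int):
    """Run both implementations on the same seeded inputs and collect mismatches."""
    mismatches = [
        seed for seed in range(n_cases)
        if reference_battle(seed) != ported_battle(seed)
    ]
    pass_rate = 1 - len(mismatches) / n_cases
    return pass_rate, mismatches


if __name__ == "__main__":
    rate, bad_seeds = differential_test(10_000)
    print(f"pass rate: {rate:.2%}, seeds to investigate: {bad_seeds}")
```

Keeping the failing seeds is the useful part: each one is a reproducible case to debug. For scale, a 99.96% pass rate over 2 million battles would still leave roughly 800 mismatching battles to triage.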
  • ethin · 46 minutes ago
This only works up to a certain point. Given that the author openly admits they don't know/understand Rust, there is a really high likelihood that the LLM made all kinds of mistakes that would be avoided, and the dev is going to be left flailing about trying to understand why they happen/what's causing them/etc. A hand-rewrite would've actually taught the author a lot of very useful things I'm guessing.
Hopefully they have a test suite written by QA, otherwise they're for sure going to have a buggy mess on their hands. People need to learn that if you must rewrite something (often you don't actually need to), then an incremental approach is best.
1 month of Claude Code would be an incremental approach

It would honestly try to one-shot the whole conversion in a 30 minute autonomous session

I think it could work if they have tests with good coverage, like the "test farm" described by someone who worked at Oracle.
His goal was to get a faster oracle that encoded the behavior of Pokemon that he could use for a different training project. So this project provides that without needing to be maintainable or understandable itself.
  • atonse · 45 minutes ago
My answer to this is to often get the LLMs to do multiple rounds of code review (depending on the criticality of the code, even on every commit, though this was clearly a zero-impact hobby project).

They are remarkably good at catching things, especially if you do it every commit.

> My answer to this is to often get the LLMs to do multiple rounds of code review

So I am supposed to trust the machine, that I know I cannot trust to write the initial code correctly, to somehow do the review correctly? Possibly multiple times? Without making NEW mistakes in the review process?

Sorry not sorry, but that sounds like trying to clean a dirty floor by rubbing more dirt over it.

  • rvz · 57 minutes ago
> How is this a good idea? How can I trust the generated code?

You don't. The LLM wrote the code and is absolutely right. /s

What could possibly go wrong?

Same way you trust any auto translation for a document. You wrote it in English (or whatever language you’re most proficient in), but someone wants it in Thai or Czech, so you click a button and send them the document. It’s their problem now.
How do you create a mental model of that Rust code?

You’re just creating slop.

  • _pdp_ · 35 minutes ago
To be honest I think it should be the other way around.

TypeScript is a good high-level language that is versatile and well generated by LLMs, and there is good support for various linters and other code support tools. You can probably knock out more TS code than Rust, and at a faster rate (just my hypothesis). For most intents and purposes this will be fine, but in case you want faster, lower-level code, you can use an LLM-backed compiler/translator. A specialised tool that compiles high-level code to Rust would be awesome, actually, and I can see how it could potentially be a dedicated agent of sorts.

>I've tried asking Claude to optimize it further, it created a plan that looks reasonable (I've never interacted with Rust in my life) and it spent a day building many of these optimizations but at the end of the day, none of them actually improved the runtime and some even made it way worse.

This is the kind of thing where if this was a real developer tweaking a codebase they're familiar with, it could get done, but with AI there's a glass ceiling

Yeah, I had Claude spend a lot of time optimizing a JS bundling config (I'm a fairly senior frontend dev) and it started some things that looked insanely promising, which a newer FE dev would be thrilled about.

I later realized it sped up the metric I'd asked about (build time) at the cost of all users downloading like 100x the amount of JS.
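
One way to guard against this kind of trade-off is to check the metric you did not ask about alongside the one you did. A small sketch in Python follows; the `Metrics` fields, the numbers, and the 5% tolerance are all illustrative assumptions, not taken from the setup described above:

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    build_seconds: float  # the metric being optimized
    bundle_kb: float      # a guard metric that must not regress


def accept_change(before: Metrics, after: Metrics,
                  max_bundle_growth: float = 1.05) -> bool:
    """Accept only if the build got faster AND the bundle grew by at most 5%."""
    faster = after.build_seconds < before.build_seconds
    bundle_ok = after.bundle_kb <= before.bundle_kb * max_bundle_growth
    return faster and bundle_ok


if __name__ == "__main__":
    baseline = Metrics(build_seconds=60.0, bundle_kb=300.0)
    # Faster build, but users download 100x more JS: rejected.
    print(accept_change(baseline, Metrics(build_seconds=20.0, bundle_kb=30_000.0)))
    # Faster build with a nearly unchanged bundle: accepted.
    print(accept_change(baseline, Metrics(build_seconds=40.0, bundle_kb=305.0)))
```

Giving the agent a check like this up front states the constraint explicitly instead of leaving it to be discovered after the fact.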

  • cies · 28 minutes ago
This is what LLMs are good at: generating what "look[s] insanely promising" to us humans.
I just ran into the problem of extremely slow uploads in an app I was working on. Told Gemini to work on it, and it tried to get the timing of everything, then tried to optimize the slow parts of the code. After a long time, there might have been some improvements, but the basic problem remained: 5-10 seconds to upload an image from the same machine. Increasing the chunk size fixed the problem immediately.

Even though the other optimizations might have been ok, some of them made things more complicated, so I reverted all of them.
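
A rough back-of-the-envelope model of why chunk size can dominate: if every chunk pays a fixed per-request cost, the number of chunks matters more than the raw bytes. The file size, per-request overhead, and bandwidth below are made-up numbers for illustration, not measurements from the app described above:

```python
def chunk_count(file_size: int, chunk_size: int) -> int:
    """Number of requests needed to send file_size bytes in chunk_size pieces."""
    return (file_size + chunk_size - 1) // chunk_size  # integer ceiling division


def estimated_upload_seconds(file_size: int, chunk_size: int,
                             per_request_overhead_s: float,
                             bandwidth_bytes_per_s: float) -> float:
    """Fixed cost per chunk plus the time to move the bytes themselves."""
    requests = chunk_count(file_size, chunk_size)
    return requests * per_request_overhead_s + file_size / bandwidth_bytes_per_s


if __name__ == "__main__":
    image_bytes = 5 * 1024 * 1024   # assumed 5 MiB image
    overhead = 0.005                # assumed 5 ms of fixed cost per request
    bandwidth = 100_000_000         # assumed 100 MB/s on the same machine
    for chunk in (4 * 1024, 1024 * 1024):
        secs = estimated_upload_seconds(image_bytes, chunk, overhead, bandwidth)
        print(f"chunk={chunk:>8} B  requests={chunk_count(image_bytes, chunk):>5}  ~{secs:.2f} s")
```

Under these assumptions, 4 KiB chunks need 1280 requests and take roughly 6.5 seconds, while 1 MiB chunks need 5 requests and finish in under 0.1 seconds. Profiling individual functions would not surface this, because the cost sits in the request count rather than in any one slow code path.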

How much does it cost to run Claude Code 24 hrs/day like this? Does the $200/month plan hold up? My spend on Cursor has been high... I'm wondering if I can just collapse it into a $200/month CC subscription.
This guy tested it: https://she-llac.com/claude-limits

"Suspiciously precise floats, or, how I got Claude's real limits" 19hs ago 25 points https://news.ycombinator.com/item?id=46756742

OTOH, with ChatGPT/Codex limits are less of a problem, in general.

  • esafak · 48 minutes ago
Because Codex effectively rate limits you by being so slow.
If you're using it 24h/day you probably will run into it unless you're very careful about managing context and/or the requests are punctuated by long-running tool use (e.g. time-consuming test suites).

I'm on the $200/month plan, and I do have Claude running unattended for hours at a time. I have hit the weekly limits at times of particularly aggressive use (multiple sessions in parallel for hours at a time), but since that involved more than one session at a time, I'm not really sure how close I got to the equivalent of one session 24/7.

There's a daily token limit. While I've never run into that limit while operating Claude as a human, I have received warnings that I'm getting close. I imagine that an unattended setup will blow through the token limit in not too much time.
I have no first-hand experience with the Max subscription (which the $200 plan is), but having read a few discussions here and on GitHub [1], it seems that Anthropic has tanked the usage limits in the last few weeks. I would argue that you would run into limits pretty quickly if you were using it (unsupervised) 24h a day.

[1] https://github.com/anthropics/claude-code/issues/16157

> I have never written any line of Rust before in my life

As an experiment/exercise this is cool, but having a 100k loc codebase to maintain in a language I’ve never used sounds like a nightmare scenario.

I think the plan is for Claude to maintain it. He hasn't read a single line of code.
  • cies · 30 minutes ago
I kind of expect that code to be full of non-idiomatic Rust code that mimics a GC'ed language...

Once that's also "fixed", it may well be a lot faster than the current Rust version.

I'm hoping that one day we can use AI to port the millions of lines in the modules of the Python ecosystem to a GIL-free version of Python.
For typing “yes” or “y” automatically into command prompts without interacting, you could have used the ‘yes’ command and piped it into the process you’re running as a first attempt at solving the yes problem. https://man7.org/linux/man-pages/man1/yes.1.html
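
For programs that read their confirmations from standard input, the idea can be sketched from Python as follows. This assumes a Unix-like system with the coreutils `yes` binary on the PATH; `PROMPTING_SCRIPT` and `run_with_yes` are hypothetical names made up for this illustration, and it will not help with tools that read the terminal directly instead of stdin:

```python
import subprocess
import sys

# Hypothetical stand-in for an interactive program: it reads three answers
# from stdin and echoes them back.
PROMPTING_SCRIPT = r"""
import sys
answers = [sys.stdin.readline().strip() for _ in range(3)]
print(",".join(answers))
"""


def run_with_yes(cmd: list[str]) -> str:
    """Run cmd with the output of `yes` connected to its standard input."""
    yes = subprocess.Popen(["yes"], stdout=subprocess.PIPE)
    try:
        result = subprocess.run(cmd, stdin=yes.stdout,
                                capture_output=True, text=True)
    finally:
        # `yes` never stops on its own, so shut it down explicitly.
        yes.stdout.close()
        yes.kill()
        yes.wait()
    return result.stdout


if __name__ == "__main__":
    print(run_with_yes([sys.executable, "-c", PROMPTING_SCRIPT]))
```

This is the equivalent of `yes | some-command` in a shell. It answers every prompt the same way, including ones you would have wanted to decline.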
  • rvz · 1 hour ago
I don't think this is an actual problem and the prompt is there for a reason.

Piping 'yes' to command prompts just to auto-approve any change isn't really a good idea, especially when the code / script can be malicious.

And here I was hoping OP was being sarcastic. Yet it’s reasonable to think we’re nearing an AI-fueled Homer drinking-bird scenario.

Some concepts people try out using AI (for lack of a more specific word) are interesting. They will add to our collective understanding of when these tools, paired with meaningful methods, can be used to effectively achieve what seemed out of reach before.

Unfortunately, it also comes with many people rediscovering, badly, insights I thought we already had. Others use tools without giving consideration to what they were looking to accomplish, and how they would know if they did.

Isn't that the point of vibe coding? You don't even look at the code. Just trust the llm to take the wheel.

https://x.com/karpathy/status/1886192184808149383?lang=en

Did you ever consider using something like Oh My Opencode [1]? I first saw it in the wake of Anthropic locking out Opencode. I haven’t used it but it appears to be better at running continuously until a task is finished. Wondering if anyone else has tried migrating a huge codebase like this.

[1] https://github.com/code-yeongyu/oh-my-opencode

This gives me hope that some people will use AI to port Javascript desktop apps to faster languages.
This is actually pretty incredible. Cannot really argue against the productivity in this case.
One possible argument against the productivity is if the migration introduced too many bugs to be usable.

In which case the code produced has zero value, resulting in a wasted month.

I suppose what’s impressive is that (with the author’s help) it did ultimately get the port to work, in spite of all the caveats described by the author that make Claude sound like a really bad programmer. The code is likely terrible, and the 3.5x speedup way low compared to what it could be, but I guess these days we’re supposed to be impressed by quantity rather than quality.
  • Mizza · 57 minutes ago
At this rate, I am expecting that an AI will be able to port the entire Linux kernel to Rust by the end of the year.
  • Curzel · 50 minutes ago
I don’t know about the Linux kernel, but I’ll be surprised if we don’t have some “fully vibe coded OS” for Christmas (which would be cool to see).
  • Havoc · 39 minutes ago
I recall seeing a claim about a vibe-coded OS already on Reddit somewhere. It looked very Windows 3.1, but I didn’t investigate further.
Honestly, I am really interested in trying to port the Rust code to multiple languages like Go and Zig, and even niche languages like V, Odin, and Nim.

It would be interesting if we use this as a benchmark similar to https://benjdd.com/languages/ or https://benjdd.com/languages2/

I used gitingest on the repository that they provided, and it's around 150k tokens.

I pasted it into the free Gemini web app and asked it to rewrite it in Go. It said that a line-by-line port feels impossible, but I specifically asked for line by line, so it will be interesting to see what the project becomes. (I don't have high hopes for the free tier of Gemini 3 Pro, but if someone has the budget, they should probably try it.)

Edit: Reached rate limits lmao