No matter how much I tried to force it to stick to a mostly line-by-line port, it kept trying to "improve" the code. At some point it had to undo everything as it introduced a number of bugs. I asked it: "What should I add to your prompt so you won't do this again?" and it gave me this:
### CRITICAL LESSON: Don't "Improve" During Porting
- **BIGGEST MISTAKE: Reorganizing working code**
- **What I did wrong:** Tried to "simplify" by splitting `createStartButton()` into separate creation and layout methods
- **Why it failed:** Introduced THREE bugs:
1. Layout overlap (getY() vs getY() - getHeight())
2. Children not sized (Group.setSize() doesn't affect children)
3. Origins not updated (scaling animations broken)
- **The fix:** Deleted my "improvements" and copied the original Android pattern faithfully
- **Root cause:** Arrogance - assuming I could improve production-tested code without understanding all the constraints
- **Solution:** **FOLLOW THE PORTING PRINCIPLES ABOVE** - copy first, don't reorganize
- **Time wasted:** ~1 hour debugging self-inflicted bugs that wouldn't exist if I'd just copied the original
- **Key insight:** The original Android code is correct and battle-tested. Your "improvements" are bugs waiting to happen.
I like Claude's self-reflection. Unfortunately, even adding this to CLAUDE.md didn't fix it, and it kept taking wrong turns, so I had to abandon the effort.
"I've never interacted with Rust in my life"
:-/
How is this a good idea? How can I trust the generated code?
If you ask a coding agent to port code from one language to another and don't have a robust mechanism to test that the results are equivalent, you're inevitably going to waste a lot of time and money on junk code that doesn't work.
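To make "test that the results are equivalent" concrete, here is a minimal sketch of a differential test: feed the same random inputs to the original and to the port, and stop at the first mismatch. Everything named here (`find_divergence`, `reference_normalize`, `faithful_port`, `improved_port`) is a hypothetical stand-in, not code from the project being discussed. In a real port the two callables would more likely wrap the old and new binaries.

```python
import random
from typing import Callable, Optional, Tuple


def find_divergence(
    reference: Callable[[str], str],
    port: Callable[[str], str],
    trials: int = 1000,
    seed: int = 0,
) -> Optional[Tuple[str, str, str]]:
    """Return (input, reference_output, port_output) for the first mismatch, or None."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    alphabet = "ab \t\n"
    for _ in range(trials):
        text = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 12)))
        expected, actual = reference(text), port(text)
        if expected != actual:
            return text, expected, actual
    return None


# Hypothetical "original": collapse every run of whitespace into one space.
def reference_normalize(text: str) -> str:
    return " ".join(text.split())


# Hypothetical line-by-line port: clumsier, but the same behavior.
def faithful_port(text: str) -> str:
    words, current = [], ""
    for ch in text:
        if ch.isspace():
            if current:
                words.append(current)
                current = ""
        else:
            current += ch
    if current:
        words.append(current)
    return " ".join(words)


# Hypothetical "improved" port: shorter, but it only splits on single spaces,
# so tabs, newlines, and repeated spaces are handled differently.
def improved_port(text: str) -> str:
    return " ".join(text.split(" "))


if __name__ == "__main__":
    print(find_divergence(reference_normalize, faithful_port))
    print(find_divergence(reference_normalize, improved_port))
```

The point of the sketch is that the check runs without anyone reading the ported code, so it can run on every commit. It only covers behavior that the input generator actually exercises, though.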
It would honestly try to one-shot the whole conversion in a 30-minute autonomous session.
They are remarkably good at catching things, especially if you do it every commit.
So I am supposed to trust the machine, that I know I cannot trust to write the initial code correctly, to somehow do the review correctly? Possibly multiple times? Without making NEW mistakes in the review process?
Sorry no sorry, but that sounds like trying to clean a dirty floor by rubbing more dirt over it.
You don't. The LLM wrote the code and is absolutely right. /s
What could possibly go wrong?
You’re just creating slop.
TypeScript is a good high-level language that is versatile and well generated by LLMs, and there is good support for various linters and other code support tools. You can probably knock out more TS code than Rust, and at a faster rate (just my hypothesis). For most intents and purposes this will be fine, but in case you want faster, lower-level code, you can use an LLM-backed compiler/translator. A specialised tool that compiles high-level code to Rust would be awesome, actually, and I can see how it could potentially be a dedicated agent of sorts.
This is the kind of thing where, if this were a real developer tweaking a codebase they're familiar with, it could get done, but with AI there's a glass ceiling.
I later realized it sped up the metric I'd asked about (build time) at the cost of all users downloading like 100x the amount of JS.
Even though the other optimizations might have been ok, some of them made things more complicated, so I reverted all of them.
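One way to catch that kind of trade-off earlier is to record a few guard metrics alongside the one being optimized, and flag any that get noticeably worse. A minimal sketch, where the metric names, the numbers, and the 5% tolerance are all made up for illustration:

```python
from typing import Dict, List


def regressions(
    before: Dict[str, float],
    after: Dict[str, float],
    tolerance: float = 0.05,
) -> List[str]:
    """Names of metrics (lower is better) that got worse by more than `tolerance`."""
    return [
        name
        for name, old in before.items()
        if after[name] > old * (1 + tolerance)
    ]


# Made-up measurements: the build got faster, the JS download got ~100x bigger.
before = {"build_seconds": 120.0, "bundle_kb": 300.0}
after = {"build_seconds": 40.0, "bundle_kb": 30000.0}

if __name__ == "__main__":
    print(regressions(before, after))  # ['bundle_kb']
```

This only helps if the guard metrics are chosen before the optimization starts. It would not have flagged the changes that merely made things more complicated.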
"Suspiciously precise floats, or, how I got Claude's real limits" 19hs ago 25 points https://news.ycombinator.com/item?id=46756742
OTOH, with ChatGPT/Codex limits are less of a problem, in general.
I'm on the $200/month plan, and I do have Claude running unattended for hours at a time. I have hit the weekly limits at times of particularly aggressive use (multiple sessions in parallel for hours at a time), but since that involved more than one session at a time, I'm not really sure how close I got to the equivalent of one session 24/7.
As an experiment/exercise this is cool, but having a 100k loc codebase to maintain in a language I’ve never used sounds like a nightmare scenario.
Once that's also "fixed", it may well be a lot faster than the current Rust version.
Piping 'yes' to command prompts just to auto-approve any change isn't really a good idea, especially when the code / script can be malicious.
Some concepts people try out using AI (for lack of a more specific word) are interesting. They will add to our collective understanding of when these tools, paired with meaningful methods, can be used to effectively achieve what seemed out of reach before.
Unfortunately, it also comes with many people rediscovering, badly, insights I thought we already had. Others use the tools without considering what they were looking to accomplish, or how they would know if they had accomplished it.
In which case the code produced has zero value, resulting in a wasted month.
It would be interesting if we used this as a benchmark similar to https://benjdd.com/languages/ or https://benjdd.com/languages2/
I used gitingest on the repository they provided, and it's around 150k tokens.
I just pasted it into the free Gemini web app and asked it to write it in Go. It said that a line-by-line port feels impossible, but I specifically asked for line by line, so it will be interesting to see what the project becomes. (I don't have high hopes for the free tier of Gemini 3 Pro, but if someone has the budget, they should probably try it.)
Edit: Reached rate limits lmao