AI is incredibly dangerous because it can do the simple things very well, which prevents new programmers from learning the simple things ("Oh, I'll just have AI generate it") which then prevents them from learning the middlin' and harder and meta things at a visceral level.

I'm a CS teacher, so this is where I see a huge danger right now and I'm explicit with my students about it: you HAVE to write the code. You CAN'T let the machines write the code. Yes, they can write the code: you are a student, the code isn't hard yet. But you HAVE to write the code.

  • orev · 2 hours ago
It’s like weightlifting: sure you can use a forklift to do it, but if the goal is to build up your own strength, using the forklift isn’t going to get you there.

This is the ultimate problem with AI in academia. We all inherently know that “no pain no gain” is true for physical tasks, but the same is true for learning. Struggling through the new concepts is essentially the point of it, not just the end result.

Of course this becomes a different thing outside of learning, where delivering results is more important in a workplace context. But even then you still need someone who does the high level thinking.

I think this is a pretty solid analogy, but I look at the metaphor this way - people used to get strong naturally because they had to do physical labor. Because we invented things like the forklift, we had to invent things like weightlifting to get strong instead. You can still get strong, you just need to be more deliberate about it. It doesn't mean you shouldn't also use a forklift, which is its own distinct skill you also need to learn.

It's not a perfect analogy though because in this case it's more like automated driving - you should still learn to drive because the autodriver isn't perfect and you need to be ready to take the wheel, but that means deliberate, separate practice at learning to drive.

  • thesz · 1 hour ago
Weightlifting and weight training were invented long before forklifts. Even levers were not properly understood back then.

My favorite historic example of typical modern hypertrophy-specific training is the training of Milo of Croton [1]. By legend, his father gifted him a calf and asked daily "how is your calf, how is it doing? bring it here so I can look at it", which Milo did. As the calf's weight grew, so did Milo's strength.

This is the application of external resistance (the calf) and progressive overload (the growing calf) at work.

[1] https://en.wikipedia.org/wiki/Milo_of_Croton

Milo lived before Archimedes.

Dad needs to respect that we need rest days.

> people used to get strong naturally because they had to do physical labor

I think that's a bit of a myth. The Greeks and Romans had weightlifting and boxing gyms, but no forklifts. Many of the most renowned Romans in the original form of the Olympics and in boxing were Roman Senators with the wealth and free time to lift weights and box and wrestle. One of the things that we know about the famous philosopher Plato was that "Plato" was essentially a nickname from wrestling (meaning "Broad") as a first career (somewhat like Dwayne "The Rock" Johnson, which adds a fun twist to reading Socratic dialogues or thinking about relationships as "platonic").

Arguably the "meritocratic ideal" of the gladiator arena was that even "blue collar" Romans could compete and maybe survive. But even in the stories of it that survive, few did.

There may be a lesson in that myth, too: the people who succeed in some sports often aren't the people doing physical labor because they must do physical labor (for a job); they are the ones intentionally practicing in the ways that make them do well in sports.

> if the goal is to build up your own strength

I think you missed this line. If the goal is just to move weights or lift the most - forklift away. If you want to learn to use a forklift, drive on and best of luck. But if you're trying to get stronger, the forklift will not help that goal.

Like many educational tests, the outcome is not the point - doing the work to get there is. If you're asked to code fizz buzz, it's not because the teacher needs you to solve fizz buzz for them; it's because you will learn things while you make it. AI, copying Stack Overflow, using someone's code from last year - it all solves the problem while missing the purpose of the exercise. You're not learning - and presumably that is your goal.
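
For reference, the entire exercise is tiny - a minimal Python version is something like:

  # FizzBuzz: the classic beginner exercise referenced above
  for n in range(1, 101):
      if n % 15 == 0:
          print("FizzBuzz")
      elif n % 3 == 0:
          print("Fizz")
      elif n % 5 == 0:
          print("Buzz")
      else:
          print(n)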

I do appreciate the visual of driving a forklift into the gym.

The activity would train something, but it sure wouldn't be your ability to lift.

Misusing a forklift might injure the driver and a few others; but it is unlikely to bring down an entire electric grid, expose millions to fraud and theft, put innocent people in prison, or jeopardize the institutions of government.

There is more than one kind of leverage at play here.

> but it is unlikely to bring down an entire electric grid

Unless you happen to drive a forklift in a power plant.

> expose millions to fraud and theft

You can if you drive a forklift in a bank.

> put innocent people in prison

You can use a forklift to put innocent people in prison.

> jeopardize the institutions of government.

It's pretty easy with a forklift, just try driving through the main gate.

> There is more than one kind of leverage at play here.

Forklifts typically have several axes of travel.

  • pjc50 · 41 minutes ago
> Misusing a forklift might injure the driver and a few others; but it is unlikely to bring down an entire electric grid

That's the job of the backhoe.

(This is a joke about how diggers have caused quite a lot of local internet outages by hitting cables, sometimes supposedly "redundant" cables that were routed in the same conduit. Hitting power infrastructure is rare but does happen.)

  • jrm4 · 1 hour ago
I like this analogy along with the idea that "it's not an autonomous robot, it's a mech suit."

Here's the thing -- I don't care about "getting stronger." I want to make things, and now I can make bigger things WAY faster because I have a mech suit.

edit: and to stretch the analogy, I don't believe much is lost "intellectually" by my use of a mech suit, as long as I observe carefully. Me doing things by hand is probably overrated.

  • orev · 23 minutes ago
The point of going to school is to learn all the details of what goes into making things, so when you actually make a thing, you understand how it’s supposed to come together, including important details like correct design that can support the goal, etc. That’s the “getting stronger” part that you can’t skip if you expect to be successful. Only after you’ve done the work and understand the details can you be successful using the power tools to make things.

This analogy works pretty well. Too much time doing everything in it and your muscles will atrophy. Some edge cases will be better if you jump out and use your hands.

There's also plenty of mech tales where the mech pilots need to spend as much time out of the suits making sure their muscles (and/or mental health) stay in good shape, precisely because the mechs are a "force multiplier" and are only as strong as their pilot. That's a somewhat common thread in such worlds.

  • PKop · 31 minutes ago
> I want to make things

You need to be strong to do so. Things of any quality or value at least.

How seriously do you mean the analogy?

I think forklifts probably carry more weight over longer distances than people do (though I could be wrong, 8 billion humans carrying small weights might add up).

Certainly forklifts have more weight * distance when you restrict to objects that are over 100 pounds, and that seems like a good decision.

I think it's a good analogy. A forklift is a useful tool and objectively better than humans for some tasks, but if you've never developed your muscles because you use the forklift every time you go to the gym, then when you need to carry a couch up the stairs you'll find that you can't do it and the forklift can't either.

So the idea is that you should learn to do things by hand first, and then use the powerful tools once you're knowledgeable enough to know when they make sense. If you start out with the powerful tools, then you'll never learn enough to take over when they fail.

A forklift can do things no human can. I've used a forklift for things that no group of humans could - you can't physically get enough humans around that size object to lift it. (of course levers would change this)

  • _flux · 1 hour ago
You're making the analogy work: the point of weightlifting as a sport or exercise is not to actually move the weights, but to condition your body such that it can move the weights.

Indeed, usually after doing weightlifting, you return the weights to the place where you originally took them from, so I suppose that means you did no net work in the first place...

That's true of exercise in general. It's bullshit make-work we do to stay fit, because we've decoupled individual survival from hard physical labor, so it doesn't happen "by itself" anymore. A blessing and a curse.
The real challenge will be that people almost always pick the easier path.

We have a decent sized piece of land and raise some animals. People think we're crazy for not having a tractor, but at the end of the day I would rather do it the hard way and stay in shape while also keeping a bit of a cap on how much I can change or tear up around here.

Unlike weightlifting, the main goal of our jobs is not to lift heavy things, but to develop a product that adds value to its users.

Unfortunately, many devs don't understand this.

Yes but the goal of school is to lift heavy things, basically. You're trying to do things that are difficult (for you) but don't produce anything useful for anyone else. That's how you gain the ability to do useful things.
Even after school, you need to lift weights once in a while or you lose your ability.

I wouldn't want to write raw bytes like Mel did though. Eventually some things are not worth getting good at.

Let's just accept that this weightlifting metaphor is leaky, like any other, and brings us to absurdities like "forklift operators need to lift dumbbells to stay relevant in their jobs."

Forklift operators need to do something to exercise. They sit in the seat all day. At least as a programmer I have a standing desk. This isn't relevant to the job though.

I kinda get the point, but why is that? The goal of school is to teach something that's applicable in industry or academia.

Forklift operators don't lift things in their training. Even CS students start at a pretty high level of abstraction; very few start from x86 asm instructions.

We need to make them implement ALUs out of logic gates and wires if we want them to lift heavy things.

We begin teaching math by having students solve problems that are trivial for a calculator.

Though I also wonder what advanced CS classes should look like. If the agent can code nearly anything, what project would challenge the student+agent pairing and teach the student how to accomplish CS fundamentals with modern tools?

But what has changed? Students never had a natural reason to learn how to write fizz buzz. It's been done before and it's not even useful. There has always been an arbitrary nature to these exercises.

I actually fear more for the middle-of-career dev who has shunned AI as worthless. It's easier than ever for juniors to learn and be productive.

  • Isamu · 5 minutes ago
Same with essay assignments, you exercise different neural pathways by doing it yourself.

Recently in comments people were claiming that working with LLMs has sharpened their ability to organize thoughts, and that could be a real effect that would be interesting to study. It could be that watching an LLM organize a topic could provide a useful example of how to approach organizing your own thoughts.

But until you do it unassisted you haven’t learned how to do it.

I haven't done long division in decades, am probably unable to do it anymore, and yet it has never held me back in any tangible fashion (and won't unless computers and calculators stop existing).

What you as a teacher teach might have to adapt a bit. Teaching how code works is more important than teaching how to code. Most academic computer scientists aren't necessarily very skilled as programmers in any case. At least, I learned most of that after I stopped being an academic myself (Ph. D. and all). This is OK. Learning to program is more of a side effect of studying computer science than it is a core goal (this is not always clearly understood).

A good analogy here is programming in assembler. Manually crafting programs at the machine code level was very common when I got my first computer in the 1980s. Especially for games. By the late 90s that had mostly disappeared. Roller Coaster Tycoon was one of the last games with huge commercial success that was coded like that. C/C++ took over, and these days most game studios license an engine and then do a lot of work in languages like C# or Lua.

I never did any meaningful amount of assembler programming. It was mostly no longer a relevant skill by the time I studied computer science (94-99). I built an interpreter for an imaginary CPU at some point using a functional programming language in my second year. Our compiler course was taught by people like Erik Meijer (who later worked on things like F# at MS), who just saw it as a great excuse to teach people functional programming instead. In hindsight, that was actually a good skill to have as functional programming interest heated up a lot about 10 years later.

The point of this analogy: compilers are important tools. It's more important to understand how they work than it is to be able to build one in assembler. You'll probably never do that. Most people never work on compilers. Nor do they build their own operating systems, databases, etc. But it helps to understand how they work. The point of teaching how compilers work is understanding how programming languages are created and what their limitations are.

  • vidarh · 8 minutes ago
> A good analogy here is programming in assembler. Manually crafting programs at the machine code level was very common when I got my first computer in the 1980s. Especially for games. By the late 90s that had mostly disappeared.

Indeed, a lot of us looked with suspicion and disdain at people that used those primitive compilers that generated awful, slow code. I once spent ages hand-optimizing a component that had been written in C, and took great pleasure in the fact I could delete about every other line of disassembly...

When I wrote my first compiler a couple of years later, it was in assembler at first, and supported inline assembler so I could gradually convert to bootstrap it that way.

Because I couldn't imagine writing it in C, given the awful code the C compilers I had available generated (and how slow they were)...

These days most programmers don't know assembler, and increasingly don't know languages as low level as C either.

And the world didn't fall apart.

People will complain that it is necessary for them to know the languages that will slowly be eaten away by LLMs, just like my generation argued it was absolutely necessary to know assembler if you wanted to be able to develop anything of substance.

I agree with you people should understand how things work, though, even if they don't know it well enough to build it from scratch.

> The point of this analogy: compilers are important tools. It's more important to understand how they work than it is to be able to build one in assembler. You'll probably never do that. Most people never work on compilers. Nor do they build their own operating systems, databases, etc. But it helps to understand how they work. The point of teaching how compilers work is understanding how programming languages are created and what their limitations are.

I don't know that it's all these things at once, but most people I know that are good have done a bunch of spikes / side projects that go a level lower than they have to. Intense curiosity is good, and to the point you're making, most people don't really learn this stuff just by reading or doing flash cards. If you want to really learn how a compiler works, you probably do have to write a compiler. Not a full-on production-ready compiler, but hands on keyboard, typing and interacting with and troubleshooting code.

Or maybe to put it another way, it's probably the "easiest" way, even though it's the "hardest" way. Or maybe it's the only way. Everything I know how to do well, I know how to do well from practice and repetition.

I only learn when I do things, not when I hear how they work. I think the teacher has the right idea.

Yes, I do too, but the point they were trying to make is that "learning how to write code" is not the point of CS education, but only a side effect.

A huge portion of the students in CS do intend the study precisely for writing code, and the CS itself is more of a side effect.

When I did a CS major, there was a semester of C, a semester of assembly, a semester of building a Verilog CPU, etc. I’d be shocked if an optimal CS education involved vibecoding these courses to any significant degree.

They don't always do the simple things well which is even more frustrating.

I do Windows development and GDI stuff still confuses me. I'm talking about memory DC, compatible DC, DIB, DDB, DIBSECTION, BitBlt, SetDIBits, etc... AIs also suck at this stuff. I'll ask for help with a relatively straightforward task, and it almost always produces code such that, when you ask it to defend the choices it made, it finds problems, apologizes, and goes in circles. One AI (I forget which) actually told me I should refer to Petzold's Programming Windows book because it was unable to help me further.

I remember reading about a metal shop class, where the instructor started out by giving each student a block of metal, and a file. The student had to file an end wrench out of the block. Upon successful completion, then the student would move on to learning about the machine tools.

The idea was to develop a feel for cutting metal, and to better understand what the machine tools were doing.

--

My wood shop teacher taught me how to use a hand plane. I could shave off wood with it that was so thin it was transparent. I could then join two boards together with a barely perceptible crack between them. The jointer couldn't do it that well.

In middle school (I think) we spent a few days in math class hand-calculating trigonometry values (cosine, sine, etc.). Only after we did that did our teacher tell us that the mandated calculators we had all been using for the last few months have a magic button that will "solve" for the values for you. It definitely made me appreciate the calculator more!
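
For the curious, the flavor of that hand calculation is roughly summing the first few terms of a series - a small Python sketch, not necessarily the exact classroom method:

  # cos(x) ~= 1 - x^2/2! + x^4/4! - x^6/6! + ...
  def cos_approx(x, terms=6):
      total, term = 0.0, 1.0
      for k in range(terms):
          total += term
          term *= -x * x / ((2 * k + 1) * (2 * k + 2))
      return total

  print(cos_approx(0.5))  # ~0.8776, matches the calculator's cos(0.5) in radians
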
Yes! You are best served by learning what a tool is doing for you by doing it yourself or carefully studying what it uses and obfuscates from you before using the tool. You don't need to construct an entire functioning processor in an HDL, but understanding the basics of digital logic and computer architecture matters if you're EE/CompE. You don't have to write an OS in asm, but understanding assembly and how it gets translated into binary and understanding the basics of resource management, IPC, file systems, etc. is essential if you will ever work in something lower level. If you're a CS major, algorithms and data structures are essential. If you're just learning front end development on your own or in a boot camp, you need to learn HTML and the DOM, events, how CSS works, and some of the core concepts of JS, not just React. You'll be better for it when the tools fail you or a new tool comes along.

  • nso · 1 hour ago
I agree 100%. But as someone with 25 years of development experience, holy crap it's nice not having to do the boring parts as much anymore.

  • dfxm12 · 58 minutes ago
As a teacher, do you have any techniques to make sure students learn to write the code?
I'm an external examiner for CS students in Denmark and I disagree with you. What we need in the industry is software engineers who can think for themselves, can interact with the business and understand its needs, and who know how computers work. What we get are mass-produced coders who have been taught some outdated way of designing and building software that we need to hammer out of them. I don't particularly care if people can write code like they work at the assembly line. I care that they can identify bottlenecks and solve them. That they can deliver business value quickly. That they will know when to do abstractions (which is almost never). Hell, I'd even like developers who will know when the code quality doesn't matter because shitty code will cost $2 a year but every hour they spend on it is $100-200.

Your curriculum may be different than it is around here, but here it's frankly the same stuff I was taught 30 years ago. Except most of the actual computer science parts are gone, replaced with even more OOP, design pattern bullshit.

That being said. I have no idea how you'd actually go about teaching students CS these days, considering a lot of them will probably use ChatGPT or Claude regardless of what you do. That is what I see in the grade statistics around here. For the first 9 years I was a well-calibrated grader, but this past year and a half it's usually either top marks or bottom marks with nothing in between. Which puts me outside where I should be, but it matches the statistical calibration for everyone here. I obviously only see the product of CS educations, but even though I'm old, I can imagine how many corners I would have cut myself if I had LLMs available back then. Not to mention all the distractions the internet has brought.

> I don't particularly care if people can write code like they work at the assembly line. I care [...] That they can deliver business value quickly.

In my experience, people who talk about business value expect people to code like they work at the assembly line. Churn out features, no disturbances, no worrying about code quality, abstractions, bla bla.

To me, your comment reads contradictory. You want initiative, and you also don't want initiative. I presume you want it when it's good and don't want it when it's bad, and if possible the people should be clairvoyant and see the future so they can tell which is which.

I think we very often confuse engineers with scientists in this field. Think of the old joke: “anyone can build a bridge, it takes an Engineer to build one that barely stands”. Business value and the goal of engineering is to make a bridge that is fast to build, cheap to make, and stays standing exactly as long as it needs to. This is very different from the goals of science which are to test the absolute limits of known performance.

What I read from GP is that they’re looking for engineering innovation, not new science. I don’t see it as contradictory at all.

  • 37 minutes ago
> people who talk about business value expect people to code like they work at the assembly line. Churn out features, no disturbances, no worrying about code quality, abstractions, bla bla.

That's a typical misconception that "I'm an artist, let me rewrite it in Rust" people often have. Code quality has a direct money equivalent; you just need to be able to justify it to the people who pay your salary.

  • bambax · 30 minutes ago
> That being said. I have no idea how you'd actually go about teaching students CS these days, considering a lot of them will probably use ChatGPT or Claude regardless of what you do.

My son is in a CS school in France. They have finals with pen and paper, with no computer whatsoever during the exam; if they can't do that they fail. And these aren't multiple choice questions, but actual code that they have to write.

Let them use AI and then fall on their faces during exam time - simple as that. If you can't recall the theory, paradigm, methodology, whatever by memory then you have not "mastered" the content and thus, should fail the class.
> That they will know when to do abstractions

The only way to learn when abstractions are needed is to write code, hit a dead end, then try and abstract it. Over and over. With time, you will be able to start seeing these before you write code.

AI does not do abstractions well. From my experience, it completely fails to abstract anything unless you tell it to. Even when similar abstractions are already present. If you never learn when an abstraction is needed, how can you guide an AI to do the same well?

> I'm an external examiner for CS students

> Hell, I'd even like developers who will know when the code quality doesn't matter because shitty code will cost $2 a year but every hour they spend on it is $100-200.

> Except most of the actual computer science parts are gone, replaced with even more OOP, design pattern bullshit.

Maybe you should consider a different career; you sound pretty burnt out. These are terrible takes, especially for someone who is supposed to be fostering the next generation of developers.

It can take some people a few years to get over OOP, in the same way that some kids still believe in Santa a bit longer. Keep at it though and you’ll make it there eventually too.
What is an "external examiner"?
A proctor?
Maybe I'm "vibecoding" wrong but to me at least this misses a clear step which is reviewing the code.

I think coding with an AI changes our role from code writer to code reviewer, and you have to treat it as a comprehensive review where you comment not just on code "correctness" but on the other aspects the author mentions: how functions fit together, codebase patterns, architectural implications. While I feel like using AI might have made me a lazier coder, it's made me a significantly more active reviewer, which I think at least helps to bridge the gap the author is referencing.

In the long run, vibe coding is undoubtedly going to rot people's skills. If AGI is not showing up anytime soon, actually understanding what the code does, why it exists, how it breaks, and who owns the fallout will matter just as much as it did before LLM agents showed up.

It'll be really interesting to see in the decades to come what happens when a whole industry gets used to steering black boxes by vibe coding the hell out of it.

I came to "vibe coding" with an open mind, but I'm slowly edging in the same direction.

It is hands down good for code which is laborious or tedious to write, but once done, obviously correct or incorrect (with low effort inspection). Tests help but only if the code comes out nicely structured.

I made plenty of tools like this, a replacement REPL for MS-SQL, a caching tool in Python, a matplotlib helper. Things that I know 90% how to write anyway but don't have the time, but once in front of me, obviously correct or incorrect. NP code I suppose.

But business critical stuff is rarely like this, for me anyway. It is complex, has to deal with various subtle edge cases, be written defensively (so it fails predictably and gracefully), well structured etc. and try as I might, I can't get Claude to write stuff that's up to scratch in this department.

I'll give it instructions on how to write some specific function, it will write this code but not use it, and use something else instead. It will pepper the code with rookie mistakes like writing the same logic N times in different places instead of factoring it out. It will miss key parts of the spec and insist it did it, or tell me "Yea you are right! Let me rewrite it" and not actually fix the issue.

I also have a sense that it got a lot dumber over time. My expectations may have changed of course too, but still. I suspect even within a model, there is some variability of how much compute is used (eg how deep the beam search is) and supply/demand means this knob is continuously tuned down.

I still try to use Claude for tasks like this, but increasingly find my hit rate so low that the whole "don't write any code yet, let's build a spec" exercise is a waste of time.

I still find Claude good as a rubber duck or to discuss design or errors - a better Stack Exchange.

But you can't split your software spec into a set of SE questions then paste the code from top answers.

> Not only does an agent not have the ability to evolve a specification over a multi-week period as it builds out its lower components, it also makes decisions upfront that it later doesn’t deviate from.

That's your job.

The great thing about coding agents is that you can tell them "change of design: all API interactions need to go through a new single class that does authentication and retries and rate-limit throttling" and... they'll track down dozens or even hundreds of places that need updating and fix them all.

(And the automated test suite will help them confirm that the refactoring worked properly, because naturally you had them construct an automated test suite when they built those original features, right?)
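
As a rough sketch of what that single class might look like (hypothetical names, not from any real project):

  import time
  import requests

  class ApiClient:
      """One place for API calls: auth header, retries, rate-limit throttling."""

      def __init__(self, base_url, token, max_retries=3, min_interval=0.2):
          self.base_url = base_url
          self.token = token
          self.max_retries = max_retries
          self.min_interval = min_interval  # minimum seconds between requests
          self._last_call = 0.0

      def request(self, method, path, **kwargs):
          headers = {"Authorization": f"Bearer {self.token}", **kwargs.pop("headers", {})}
          for attempt in range(self.max_retries):
              # crude throttle: space calls at least min_interval apart
              wait = self.min_interval - (time.time() - self._last_call)
              if wait > 0:
                  time.sleep(wait)
              self._last_call = time.time()
              resp = requests.request(method, f"{self.base_url}{path}", headers=headers, **kwargs)
              if resp.status_code not in (429, 502, 503) or attempt == self.max_retries - 1:
                  return resp
              time.sleep(2 ** attempt)  # back off, then retry transient failures

Once something like that exists, sweeping the dozens of call sites over to it is exactly the kind of mechanical, test-checkable change the agents handle well.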

Going back to typing all of the code yourself (my interpretation of "writing by hand") because you don't have the agent-managerial skills to tell the coding agents how to clean up the mess they made feels short-sighted to me.

  • asadjb · 22 minutes ago
Unfortunately I have started to feel that using AI to code - even with a well-designed spec - ends up with code that, in the author's words, looks like

> [Agents write] units of changes that look good in isolation.

I have only been using agents for coding end-to-end for a few months now, but I think I've started to realise why the output doesn't feel that great to me.

Like you said; "it's my job" to create a well designed code base.

Without writing the code myself however, without feeling the rough edges of the abstractions I've written, without getting a sense of how things should change to make the code better architected, I just don't know how to make it better.

I've always worked in smaller increments, creating the small piece I know I need and then building on top of that. That process highlights the rough edges, the inconsistent abstractions, and that leads to a better codebase.

AI (it seems) decides on a direction and then writes 100s of LOC at once. It doesn't need to build abstractions because it can write the same piece of code a thousand times without caring.

I write one function at a time, and as soon I try to use it in a different context I realise a better abstraction. The AI just writes another function with 90% similar code.

> (And the automated test suite will help them confirm that the refactoring worked properly, because naturally you had them construct an automated test suite when they built those original features, right?)

I dunno, maybe I have high standards but I generally find that the test suites generated by LLMs are both over and under determined. Over-determined in the sense that some of the tests are focused on implementation details, and under-determined in the sense that they don't test the conceptual things that a human might.

That being said, I've come across loads of human written tests that are very similar, so I can see where the agents are coming from.

You often mention that this is why you are getting good results from LLMs so it would be great if you could expand on how you do this at some point in the future.

I work in Python which helps a lot because there are a TON of good examples of pytest tests floating around in the training data, including things like usage of fixture libraries for mocking external HTTP APIs and snapshot testing and other neat patterns.

Or I can say "use pytest-httpx to mock the endpoints" and Claude knows what I mean.

Keeping an eye on the tests is important. The most common anti-pattern I see is large amounts of duplicated test setup code - which isn't a huge deal, I'm much more tolerant of duplicated logic in tests than I am in implementation, but it's still worth pushing back on.

"Refactor those tests to use pytest.mark.parametrize" and "extract the common setup into a pytest fixture" work really well there.

Generally though the best way to get good tests out of a coding agent is to make sure it's working in a project with an existing test suite that uses good patterns. Coding agents pick the existing patterns up without needing any extra prompting at all.

I find that once a project has clean basic tests the new tests added by the agents tend to match them in quality. It's similar to how working on large projects with a team of other developers work - keeping the code clean means when people look for examples of how to write a test they'll be pointed in the right direction.

One last tip I use a lot is this:

  Clone datasette/datasette-enrichments
  from GitHub to /tmp and imitate the
  testing patterns it uses
I do this all the time with different existing projects I've written - the quickest way to show an agent how you like something to be done is to have it look at an example.
I work in Python as well and find Claude quite poor at writing proper tests, might be using it wrong. Just last week, I asked Opus to create a small integration test (with pre-existing examples) and it tried to create a 200-line file with 20 tests I didn't ask for.

I am not sure why, but it kept trying to do that, although I made several attempts.

Ended up writing it on my own, very odd. This was in Cursor, however.

> Generally though the best way to get good tests out of a coding agent is to make sure it's working in a project with an existing test suite that uses good patterns. Coding agents pick the existing patterns up without needing any extra prompting at all.

Yeah, this is where I too have seen better results. The worse ones have been in places where it was greenfield and I didn't have an amazing idea of how to write tests (a data person working on a django app).

Thanks for the information, that's super helpful!

In my experience asking the model to construct an automated test suite, with no additional context, is asking for a bad time. You'll see tests for a custom exception class that you (or the LLM) wrote that check that the message argument can be overwritten by the caller, or that a class responds to a certain method, or some other pointless and/or tautological test.

If you start with an example file of tests that follow a pattern you like, along with the code the tests are for, it's pretty good at following along. Even adding a sentence to the prompt about avoiding tautological tests and focusing on the seams of functions/objects/whatever (integration tests) can get you pretty far to a solid test suite.

Embrace TDD? Write those tests and tell the agent to write the subject under test?

The article said:

> So I’m back to writing by hand for most things. Amazingly, I’m faster, more accurate, more creative, more productive, and more efficient than AI, when you price everything in, and not just code tokens per hour

At least he said "most things". I also did "most things" by hand, until Opus 4.5 came out. Now it's doing things in hours I would have worked an entire week on. But it's not a prompt-and-forget kind of thing, it needs hand holding.

Also, I have no idea _what_ agent he was using. OpenAI, Gemini, Claude, something local? And with a subscription, or paying by the token?

Because the way I'm using it, this only pays off because it's the $200 Claude Max subscription. If I had to pay per token (which, once again, is hugely marked up), I would have gone bankrupt.

> Going back to typing all of the code yourself (my interpretation of "writing by hand") because you don't have the agent-managerial skills to tell the coding agents how to clean up the mess they made feels short-sighted to me.

I increasingly feel a sort of "guilt" when going back and forth between agent-coding and writing it myself. When the agent didn't structure the code the way I wanted, or it just needs overall cleanup, my frustration will get the best of me and I will spend too much time writing code manually or refactoring using traditional tools (IntelliJ). It's clear to me that with current tooling some of this type of work is still necessary, but I'm trying to check myself about whether a certain task really requires my manual intervention, or whether the agent could manage it faster.

Knowing how to manage this back and forth reinforces a view I've seen you espouse: we have to practice and really understand agentic coding tools to get good at working with them, and it's a complete error to just complain and wait until they get "good enough" - they're already really good right now if you know how to manage them.

> Going back to typing all of the code yourself (my interpretation of "writing by hand") because you don't have the agent-managerial skills to tell the coding agents how to clean up the mess they made feels short-sighted to me.

Or those skills are a temporary side effect of the current SOTA and will be useless in the future, so honing them is pointless right now.

Agents shouldn't make messes, at least if they did what it says on the tin; and if folks are wasting considerable time cleaning up after them, they should've just written the code themselves.

  • ap99 · 2 hours ago
> That's your job.

Exactly.

AI assisted development isn't all or nothing.

We as a group and as individuals need to figure out the right blend of AI and human.

  • thesz · 51 minutes ago

  > AI assisted development isn't all or nothing.
  > We as a group and as individuals need to figure out the right blend of AI and human.
This is what makes the current LLM debate very much like the strong-typing debate of 15-20 years ago.

"We as a group need to figure out the right blend of strong static and weak dynamic typing."

One can look around and see where that old discussion brought us. In my opinion, nowhere; things are the same as they were.

So, where will LLM-assisted coding bring us? Rhyming it with the static-typing debate, I see no outcome other than "nowhere."

Seriously. I've known for a very long time that our community has a serious problem with binary thinking, but AI has done more to reinforce that than anything I can think of in modern memory. Nearly every discussion I get into about AI is dead out of the gate because at least one person in the conversation has a binary view that it's either handwritten or vibe coded. They have an insanely difficult time imagining anything in the middle.

Vibe coding is the extreme end of using AI, while handwriting is the extreme end of not using AI. The optimal spot is somewhere in the middle. Where exactly that spot is, I think, is still up for debate. But the debate is not advanced in any way by latching on to the extremes and assuming that they are the only options.

I think you will find this is not specific to this community or to AI, but to any topic involving nuance and trade-offs without a right answer.

For example, most political flamefests


  • 1 hour ago
I agree. As a pretty experienced coder, I wonder if the newer generation is just rolling with the first shot. I find myself having the AI rewrite things a slightly different way 2-3x per feature, or maybe even 10x, because I know quality when I see it, having done so much by hand and so much reading.

  • 2 hours ago
> The AI had simply told me a good story. Like vibewriting a novel, the agent showed me a good couple paragraphs that sure enough made sense and were structurally and syntactically correct. Hell, it even picked up on the idiosyncrasies of the various characters. But for whatever reason, when you read the whole chapter, it’s a mess. It makes no sense in the overall context of the book and the preceding and proceeding chapters.

This is the bit I think enthusiasts need to argue doesn't apply.

Have you ever read a 200 page vibewritten novel and found it satisfying?

So why do you think a 10 kLoC vibecoded codebase will be any good engineering-wise?

"So why do you think a 10 kLoC vibecoded codebase will be any good engineering-wise?"

I've been coding a side-project for a year with full LLM assistance (the project is quite a bit older than that).

Basically I spent over a decade developing CAD software at Trimble and now have pivoted to a different role and different company. So like an addict, I of course wanted to continue developing CAD technology.

I pretty much know how CAD software is supposed to work. But it's _a lot of work_ to put together. With LLMs I can basically speedrun through my requirements that require tons of boilerplate.

The velocity is incredible compared to if I would be doing this by hand.

Sometimes the LLM outputs total garbage. Then you don't accept the output, and start again.

The hardest parts are never coding but design. The engineer does the design. Sometimes I agonize for weeks or months over a difficult detail (it's a side project, I have a family, etc.). Once the design is crystal clear, it's fairly obvious whether the LLM output is aligned with the design or not. Once I have a good design, I can just start the feature / boilerplate speedrun.

If you have a Windows box you can try my current public alpha. The bugs are on me, not on the LLM:

https://github.com/AdaShape/adashape-open-testing/releases/t...

Because a novel is about creative output, and engineering is about understanding a lot of rules and requirements and then writing logic to satisfy that. The latter has a much more explicitly defined output.

  Have you ever read a 200 page vibewritten novel and found it satisfying?
I haven't, but my son has. For two separate novels authored by GPT 4.5.

(The model was asked to generate a chapter at a time. At each step, it was given the full outline of the novel, the characters, and a summary of each chapter so far.)

  • andai · 43 minutes ago
Interesting. I heard that model was significantly better than what we ended up with (at least for writing), and they shut it down because it was huge and expensive.

Did the model also come up with the idea for the novel, the characters, the outline?

For one novel, I gave the model a sentence about the idea, and the names and a few words about each of the characters.

For the other, my son wrote ~200 words total describing the story idea and the characters.

In each case, the model created the detailed outline and did all the writing.

I like this way of framing the problem, and it might even be a good way to self-evaluate your use of AI: Try vibe-writing a novel and see how coherent it is.

I suspect part of the reason we see such a wide range of testimonies about vibe-coding is some people are actually better at it, and it would be useful to have some way of measuring that effectiveness.

I wrote this a day ago but I find it even more relevant to your observation:

I would never use, let alone pay for, a fully vibe-coded app whose implementation no human understands.

Whether you’re reading a book or using an app, you’re communicating with the author by way of your shared humanity in how they anticipate what you’re thinking as you explore the work. The author incorporates and plans for those predicted reactions and thoughts where it makes sense. Ultimately the author is conveying an implicit mental model (or even evoking emotional states or sensations) to the reader.

The first problem is that many of these pathways and edge cases aren’t apparent until the actual implementation, and sometimes in the process the author realizes that the overall product would work better if it were re-specified from the start. This opportunity is lost without a hands on approach.

The second problem is that, the less human touch is there, the less consistent the mental model conveyed to the user is going to be, because a specification and collection of prompts does not constitute a mental model. This can create subconscious confusion and cognitive friction when interacting with the work.

> In retrospect, it made sense. Agents write units of changes that look good in isolation. They are consistent with themselves and your prompt. But respect for the whole, there is not. Respect for structural integrity there is not. Respect even for neighboring patterns there was not.

Well yea, but you can guard against this in several ways. My way is to understand my own codebase and look at the output of the LLM.

LLMs allow me to write code faster and it also gives a lot of discoverability of programming concepts I didn't know much about. For example, it plugged in a lot of Tailwind CSS, which I've never used before. With that said, it does not absolve me from not knowing my own codebase, unless I'm (temporarily) fine with my codebase being fractured conceptually in wonky ways.

I think vibecoding is amazing for creating quick high fidelity prototypes for a green field project. You create it, you vibe code it all the way until your app is just how you want it to feel. Then you refactor it and scale it.

I'm currently looking at 4009 lines of JS/JSX combined. I'm still vibecoding my prototype. I recently looked at the codebase and saw some ready made improvements so I did them. But I think I'll start to need to actually engineer anything once I reach the 10K line mark.

> My way is to understand my own codebase and look at the output of the LLM.

Then you are not vibe coding. The core, almost exclusive requirement for "vibe coding" is that you DON'T look at the code. Only the product outcome.

This seems to be a major source of confusion in these conversations. People do not seem to agree on the definition of vibe coding. A lot of debates seem to be between people who are using the term because it sounds cool and people who have defined it specifically to only include irresponsible tool use, and then they get into a debate about whether the person was being irresponsible or not. It’s not useful to have that debate based on the label rather than the particulars.

  • aeonik · 41 minutes ago
The original use of the word "vibe code" was very clear.

You don't even look at the diffs. You just yolo the code.

https://x.com/i/status/1886192184808149383

That's a fair point.
I think this is my main confusion around vibe coding.

Is it a skill for the layman?

Or does it only work if you have the understanding you would need to manage a team of junior devs to build a project.

I feel like we need a different term for those two things.

"Vibe coding" isn't a "skill", is a meme or a experiment, something you do for fun, not for writing serious code where you have a stake in the results.

Programming together with AI however, is a skill, mostly based on how well you can communicate (with machines or other humans) and how well your high-level software engineering skills are. You need to learn what it can and cannot do, before you can be effective with it.

I use "vibe coding" for when you prompt without even looking at the code - increasingly that means non-programmers are building code for themselves with zero understanding of how it actually works.

I call the act of using AI to help write code that you review, or managing a team of coding agents "AI-assisted programming", but that's not a snappy name at all. I've also skirted around the idea of calling it "vibe engineering" but I can't quite bring myself to commit to that: https://simonwillison.net/2025/Oct/7/vibe-engineering/

I don't think the OP was using the classic definition of vibe coding, it seemed to me they were using the looser definition where vibe coding means "using AI to write code".
The blog appears to imply that the author only opened the codebase after a significant period of time.

> It’s not until I opened up the full codebase and read its latest state cover to cover that I began to see what we theorized and hoped was only a diminishing artifact of earlier models: slop.

This is true vibe coding, they exclusively interacted with the project through the LLM, and only looked at its proposed diffs in a vacuum.

If they had been monitoring the code in aggregate the entire time they likely would have seen this duplicative property immediately.

The paragraph before the one you quoted there reads:

> What’s worse is code that agents write looks plausible and impressive while it’s being written and presented to you. It even looks good in pull requests (as both you and the agent are well trained in what a “good” pull request looks like).

Which made me think that they were indeed reading at least some of the code - classic vibe coding doesn't involve pull requests! - but weren't paying attention to the bigger picture / architecture until later on.

I know what you mean but to look that black and white at it seems dismissive of the spectrum that's actually there (between vibecoding and software engineering). Looking at the whole spectrum is, I find, much more interesting.

Normally I'd know 100% of my codebase, now I understand 5% of it truly. The other 95% I'd need to read it more carefully before I daresay I understand it.

Call it "AI programming" or "AI pairing" or "Pair programming with AI" or whatever else, "vibe coding" was "coined" with the explicit meaning of "I'm going by the vibes, I don't even look at the code". If "vibe coding" suddenly mean "LLM was involved somehow", then what is the "vibe" even for anymore?

I agree there is a spectrum, and all the way to the left you have "vibe coding" and all the way to the right you have "manual programming without AI", of course it's fine to be somewhere in the middle, but you're not doing "vibe coding" in the way Karpathy first meant it.

I never really got onto "vibe coding". I treat AI as a better auto-complete that has stack overflow knowledge.

I am writing a game in MonoGame; I am not primarily a game dev or a C# dev. I find AI is fantastic here for "Set up a configuration class for this project that maps key bindings" and having it handle the boilerplate and smaller configuration. It's great at "give me an A* implementation for this graph". But when it becomes x -> y -> z without larger context and evolution, it falls flat. I still need creativity. I just don't worry too much about boilerplate, utility methods, and figuring out the specifics of wiring a framework together.

Karpathy coined the term vibecoding 11 months ago (https://x.com/karpathy/status/1886192184808149383). It caused quite a stir - not only was it a radically new concept, but fully agentic coding had only recently become possible. You've been vibe coding for two years??

  • andai · 37 minutes ago
I had GPT-4 design and build a GPT-4-powered Python programmer in 2023. It was capable of self-modification and built itself out after the bootstrapping phase (where I copy-pasted chunks of code based on GPT-4's instructions).

It wasn't fully autonomous (the reliability was a bit low -- e.g. had to get the code out of code fences programmatically), and it wasn't fully original (I stole most of it from Auto-GPT, except that I was operating on the AST directly due to the token limitations).

My key insight here was that I allowed GPT to design the apis that itself was going to use. This makes perfect sense to me based on how LLMs work. You tell it to reach for a function that doesn't exist, and then you ask it to make it exist based on how it reached for it. Then the design matches its expectations perfectly.

GPT-4 now considers self modifying AI code to be extremely dangerous and doesn't like talking about it. Claude's safety filters began shutting down similar conversations a few months ago, suggesting the user switch to a dumber model.

It seems the last generation or two of models passed some threshold regarding self replication (which is a distinct but highly related concept), and the labs got spooked. I haven't heard anything about this in public though.

The term was coined then, but people have been doing it with claude code and cursor and copilot and other tools for longer. They just didn't have a word for it yet.

  • reedf1 · 14 minutes ago
Claude Code was released a month after this post, and Cursor did not yet have an agent concept - mostly just integrated chat and code completion. I know because I was using it.

Very good point. Also, what the OP describes is something I went through in the first few months of coding with AI. I pushed past the “the code looks good but it’s crap” phase and now it’s working great. I’ve found the fix is to work with it during the research/planning phase and get it to lay out all its proposed changes and push back on the shit. Once you have a research doc that looks good end to end, then hit “go”.

The author is using the term to mean AI-assisted coding. That's been around for longer than the term "vibe coding".

  • andai · 32 minutes ago
This remains a point of great confusion every time there is such a discussion.

When some people say vibe coding, they mean they're copy-pasting snippets of code from ChatGPT.

When some people say vibe coding, they give a one sentence prompt to their cluster of Claude Code instances and leave for a road trip!

  • eddyg · 4 minutes ago
Previous discussion on the video: https://news.ycombinator.com/item?id=46744572
Good for the author. Me, I'm never going back to hands-only coding. I am producing more, higher-quality code that I understand and feel confident in. I tell AI to not just “write tests”; I tell it exactly what to test as well. Then I’ll often prompt it: “hey, did you check for the xyz edge cases?” You need code reviews. You need to intervene. You will need frequent code rewrites and refactors. But AI is the best pair-coding partner you could hope for (at this time) and one that never gets tired.

So while there’s no free lunch, if you are willing to pay - your lunch will be a delicious unlimited buffet for a fraction of the cost.

I use AI to develop, but at every code review I find stuff to be corrected, which motivates me to continue the reviews. It's still a win, I think. I've incrementally increased my use of AI in development [1], but I'm at a plateau now. I don't plan to go over to complete vibe coding for anything serious or to be maintained.

1: https://asfaload.com/blog/ai_use/

  • andai · 48 minutes ago
It probably depends on what you're doing, but my use case is simple straightforward code with minimal abstraction.

I have to go out of my way to get this out of llms. But with enough persuasion, they produce roughly what I would have written myself.

Otherwise they default to adding as much bloat and abstraction as possible. This appears to be the default mode of operation in the training set.

I also prefer to use it interactively. I divide the problem into chunks. I get it to write each chunk. The whole makes sense. Work with its strengths and weaknesses rather than against them.

For interactive use I have found smaller models to be better than bigger models. First of all because they are much faster. And second because, my philosophy now is to use the smallest model that does the job. Everything else by definition is unnecessarily slow and expensive!

But there is a qualitative difference at a certain level of speed, where something goes from not interactive to interactive. Then you can actually stay in flow, and then you can actually stay consciously engaged.

In my experience it's great at writing sample code or solving obscure problems that would have been hard to google a solution for. However it sometimes fails and can't get past some block, but neither can I unless I work hard at it.

Examples.

Thanks to Claude I've finally been able to disable the ssh subsystem of the GNOME keyring infrastructure that opens a modal window asking for ssh passphrases. What happened before is that I always had to cancel the modal, look for the passphrase in my password manager, and restart whatever made the modal open. What I have now is either a password prompt inside a terminal or a non-modal dialog. Both ssh-add to an ssh agent.

However, new emacs windows still open at about 100x100 px on my new Debian 13 install; nothing suggested by Claude works. I'll have to dig into it, but I'm not sure that's important enough. I usually don't create new windows after emacs starts with the saved desktop configuration.

  • dv_dt
  • ·
  • 2 hours ago
  • ·
  • [ - ]
I think there is going to be an AI eternal summer. Partly from the developer-to-AI-spec side, where the AI implements the spec to some level of quality, but closing the gap after that is an endless chase of smaller items that never all resolve at the same time. And partly from people getting frustrated with some AI-implemented app, going off, and AI-implementing another one, with a different set of features and failings.
Process and plumbing become very important when using AI for coding. Yes, you need good prompts. But as the code base gets more complex, you also need to spend significant time developing test guides, standardization documents, custom linters, etc., to manage the agents over time.
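
As a rough sketch of what one of those custom linters can look like (the rules here are made-up house rules, not from any particular project):

    import pathlib
    import re
    import sys

    # Made-up house rules we don't want agents to slip past review:
    # bare "except:", leftover print() debugging, TODOs without a ticket number.
    RULES = [
        (re.compile(r"except\s*:\s*$", re.MULTILINE), "bare except"),
        (re.compile(r"^\s*print\(", re.MULTILINE), "stray print()"),
        (re.compile(r"\bTODO\b(?!\(\w+-\d+\))"), "TODO without a ticket reference"),
    ]

    def lint(root: str = "src") -> int:
        failures = 0
        for path in pathlib.Path(root).rglob("*.py"):
            text = path.read_text(encoding="utf-8")
            for pattern, message in RULES:
                for match in pattern.finditer(text):
                    line = text.count("\n", 0, match.start()) + 1
                    print(f"{path}:{line}: {message}", file=sys.stderr)
                    failures += 1
        return failures

    if __name__ == "__main__":
        sys.exit(1 if lint() else 0)

Run something like that in CI or as a pre-commit gate so agent output gets the same treatment as human output.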
There is certainly some truth to this, but why does it have to be black-and-white?

Nobody forces you to completely let go of the code and do pure vibe coding. You can also do small iterations.

I tell my students that they can watch sports on tv, but it will not make them fit.

On a personal note, vibe coding leaves me with the same empty, hollow sort of tiredness as a day filled with meetings.

Last week I just said f it and developed a feature by hand. No Copilot, no agents. Just good old typing and a bit of IntelliSense. I ran into a lot of problems with the library I used, but slowly and surely I got closer to the result I wanted. In the end my feature worked as expected, I understand the code I wrote, and I know about all the little quirks the lib has.

And as an added benefit: I feel accomplished and proud of the feature.

I felt everything in this post quite emphatically until the “but I’m actually faster than the AI.”

Might be my skills, but I can tell you right now I will not be as fast as the AI, especially in new codebases, other languages, or different environments, even with all the debugging and the hell that is AI pull request review.

I think the answer here is fast AI for things it can do on its own, and slow, composed, human-in-the-loop AI for the bigger things, to make sure it gets them right. (At least until it gets most things right through innovative orchestration and model improvement.)

But those are the parts where it's important to struggle through the learning process, even if you're slower than the AI. If you defer to an LLM because it can do your work in a new codebase faster than you, that codebase will stay new to you forever. You'll never be able to review the AI's code effectively.
I think what many people do not understand is that software development is communication: communication from the customers/stakeholders to the developer, and communication from the developer to the machine. At some fundamental level there needs to be some precision about what you want, and someone/something needs to translate that into a system that provides the solution. Software can help check for errors, check constraints, and execute instructions precisely, but it cannot replace the fact that someone needs to tell the machine what to do (precise intent).

What AI (LLMs) does is raise the level of abstraction to human language via translation. The problem is that human language is imprecise in general. You can see this with legal or scientific writing. Legalese is almost illegible to laypeople because there are precise things you need to specify and you need to be precise in how you specify them. Unfortunately the tech community is misleading the public, telling laypeople they can just sit back, casually tell the AI what they want, and it will give them exactly what they wanted. Users are just lying to themselves, because most likely they did not take the time to think through what they wanted, and they are rationalizing (after the fact) that the AI is giving them exactly what they wanted.

  • hgs3
  • ·
  • 32 minutes ago
  • ·
  • [ - ]
I'm flabbergasted why anyone would voluntarily vibe code anything. For me, software engineering is a craft. You're supposed to enjoy building it. You should want to do it yourself.
Do you honestly get satisfaction out of writing code that you've written dozens of times in your career? Does writing yet another REST client endpoint fill you with satisfaction? Software is my passion, but I want to write code where I can add the maximum value. I add more value by using my experience to solve new problems than by rehashing code I've written before. Using GenAI as a helper tool lets me quickly write the boilerplate and get to the value-add. I review every line of code written before sending it for PR review. That's not controversial, it's just good engineering.
I never trust the opinion of a single LLM anymore - especially for more complex projects. I have seen Claude guarantee something is correct and then immediately apologize when I feed it a critical review from Codex or Gemini. And, many times, the issues are not minor but significant, critical oversights by Claude.

My habit now: always get a 2nd or 3rd opinion before assuming one LLM is correct.
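
A minimal sketch of how that cross-review can be wired up (the model names are placeholders and the prompts are simplified; this assumes the standard openai and anthropic Python SDKs):

    from openai import OpenAI
    from anthropic import Anthropic

    def second_opinion(code: str, claim: str) -> str:
        """Ask a different model to critique what the first model swore was correct."""
        prompt = (
            f"Another model claims the following code is correct: {claim}\n\n"
            f"Code:\n{code}\n\n"
            "List any bugs, missed edge cases, or incorrect assumptions you find."
        )
        review = OpenAI().chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        return review.choices[0].message.content or ""

    def feed_back(code: str, critique: str) -> str:
        """Hand the critique back to the original model and see if it stands by its answer."""
        reply = Anthropic().messages.create(
            model="claude-sonnet-4-5",  # placeholder model name
            max_tokens=1024,
            messages=[{"role": "user", "content":
                       f"A reviewer raised these issues:\n{critique}\n\n"
                       f"Code:\n{code}\n\nAre they right? Fix whatever needs fixing."}],
        )
        return reply.content[0].text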

  • ozten
  • ·
  • 1 hour ago
  • ·
  • [ - ]
It doesn’t have to be different foundation models. As long as the temperature is up, ask the same model 100 times.
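
A rough sketch of that resampling trick (placeholder model name; assumes the openai Python SDK):

    from collections import Counter
    from openai import OpenAI

    def sample_answers(question: str, n: int = 100) -> Counter:
        """Ask the same model n times at nonzero temperature and tally the answers."""
        client = OpenAI()
        tally = Counter()
        for _ in range(n):
            resp = client.chat.completions.create(
                model="gpt-4o-mini",   # placeholder model name
                temperature=1.0,       # keep the temperature up so samples differ
                messages=[{"role": "user", "content": question}],
            )
            tally[(resp.choices[0].message.content or "").strip()] += 1
        return tally

    # 95 of 100 samples agreeing is a very different signal from a 50/50 split.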
The author also has multiple videos on his YouTube channel going over the specific issues hes had with AI that I found really interesting: https://youtube.com/@atmoio
+1, I've lost the mental model of most projects. I also added disclaimers to my projects that parts were generated, so as not to fool anyone.
My observation is that vibe-coded applications are significantly lower quality than traditional software. Anthropic software (which they claim to be 90% vibe coded) is extremely buggy, especially the UI.
  • gowld
  • ·
  • 33 minutes ago
  • ·
  • [ - ]
That's a misunderstanding based on a loose definition of "vibe coding". When companies threw around the "90% of code is written by AI" claims, they were counting characters of autocomplete based on users actually typing code (most of which was equivalent to the "AI generated" code Eclipse tab-completion produced a decade ago), and sometimes writing hyperlocal prompts for a single method.

We can identify 3 levels of "vibe coding":

1. GenAI Autocomplete

2. Hyperlocal prompting about a specific function. (Copilot's original pitch)

3. Developing the app without looking at code.

Level 1 is hardly considered "vibe" coding, and Level 2 is iffy.

"90% of code written by AI" in some non-trivial contexts only very recently reached level 3.

I don't think it ever reached Level 2, because that's just a painfully tedious way of writing code.

I believe Anthropic is already doing Level 3 vibe coding for >90% of their code.
They have not said that. They've only said that most of their code is written by Claude. That is different from "vibe coding". If competent engineers review the code, then it is little different from any other coding.
  • andai
  • ·
  • 30 minutes ago
  • ·
  • [ - ]
I mostly work at level 2, and I call it "power coding", like power armor, or power tools. Your will and your hand still guide the process continuously, but now your force is greatly multiplied.
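
For example (a made-up function, not from my codebase), at level 2 I write the signature, the docstring, and the constraints by hand, and the model only fills in the body:

    # Human-written: the signature, the docstring, and the constraints.
    def chunk_overlapping(items: list[str], size: int, overlap: int) -> list[list[str]]:
        """Split items into chunks of `size` where consecutive chunks share
        `overlap` elements. Raise ValueError if overlap >= size."""
        # Model-completed: everything below this comment.
        if overlap >= size:
            raise ValueError("overlap must be smaller than size")
        if not items:
            return []
        step = size - overlap
        return [items[i:i + size] for i in range(0, max(len(items) - overlap, 1), step)]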
I'm impressed that this person has been vibecoding longer than vibecoding has been a thing. A real trailblazer!
GitHub copilot was released in 2021, and Cursor was released around October 2023[0].

[0]: https://news.ycombinator.com/item?id=37888477

  • reedf1
  • ·
  • 9 minutes ago
  • ·
  • [ - ]
Early Cursor was just integrated chat and code completion. No agents.
At the earliest, "vibecoding" was only possible with Claude 3.5, released July 2024 ... maaaybe Claude 3, released in March of that year...

It's worth mentioning that even today, Copilot is an underwhelming-to-the-point-of-obstructing kind of product. Microsoft sent salespeople and instructors to my job, all for naught. Copilot is a great example of how product > everything, and if you don't have a good product... well...

Is Claude through Github Copilot THAT much worse? I know there are differences, but I don't find it to be obstructing my vibe coding.
I haven't tried it in the last 9-12 months. At the time it was really bad and I had a lot more success copy/pasting from web interfaces. Is it better now? Can you do agentic coding with it? How's the autocomplete?
Yes, I vibecoded small personal apps from start to finish with it. Planning mode, edit mode, mcp, tool calling, web searches. Can easily switch between Gemini, ChatGPT, Grok or Claude within the same conversation. I think multiple agents work, though not sure.

All under one subscription.

Does not support upload / reading of PDF files :(

In the enterprise deployments of GitHub Copilot I've seen at my clients that authenticate over SSO (typically OIDC with OAuth 2.0), connecting Copilot to anything outside of what Microsoft has integrated means reverse engineering the closed authentication interface. I've yet to run across someone's enterprise Github Copilot where the management and administrators have enabled the integration (the sites have enabled access to Anthropic models within the Copilot interface, but not authorized the integration to Claude Code, Opencode, or similar LLM coding orchestration tooling with that closed authentication interface).

While this is likely feasible, I imagine it is also an instant fireable offense at these sites if not already explicitly directed by management. Also not sure how Microsoft would react upon finding out (never seen the enterprise licensing agreement paperwork for these setups). Someone's account driving Claude Code via Github Copilot will also become a far outlier of token consumption by an order(s) of magnitude, making them easy to spot, compared to their coworkers who are limited to the conventional chat and code completion interfaces.

If someone has gotten the enterprise Github Copilot integration to work with something like Claude Code though (simply to gain access to the models Copilot makes available under the enterprise agreement, in a blessed golden path by the enterprise), then I'd really like to know how that was done on both the non-technical and technical angles, because when I briefly looked into it all I saw were very thorny, time-consuming issues to untangle.

Outside those environments, there are lots of options to consume Claude Code via GitHub Copilot, such as Visual Studio Code extensions, so smaller companies and individuals seem to be at the forefront of adoption for now. I'm sure this picture will improve, but given the rapid rate of change in the field, those whose work environment resembles the enterprise-constrained ones I described, and who don't experiment on their own, will be quite behind the industry's leading edge by the time it is all sorted out in the enterprise context.

GitHub Copilot used to be inline completion only. That is not vibe coding.
  • ·
  • 2 hours ago
  • ·
  • [ - ]
Was GitHub Copilot LLM-based in 2021? I thought the first version was something more rudimentary.
It seems the term was introduced by Andrej Karpathy in February 2025, so yes, but very often people say "vibe coding" when they mean "heavily (or totally) LLM-assisted coding", which is not synonymous but sounds better to them.
I think that something in between works.

I have AI build self-contained, smallish tasks and I check everything it does to keep the result consistent with global patterns and vision.

I stay in the loop and commit often.

Looks to me like the problem a lot of people are having is that they have AI do the whole thing.

If you ask it to "refactor code to be more modern", it might guess what you mean and do it in a way you like, but most likely it won't.

If you keep tasks small and clearly specced out, it works just fine. A lot better than doing it by hand in many cases, especially for prototyping.

After reading the article (and watching the video), I think the author makes very clear points that comments here are skipping over.

The opener is 100% true. Our current approach with AI code is "draft a design in 15 mins" and have the AI implement it. That contrasts with the thoughtful approach a human would take with other human engineers: plan something, pitch the design, get some feedback, take some time thinking through pros and cons. Begin implementing, pivot, realizations, improvements; the design morphs.

The current vibe coding methodology is eager to fire and forget, passing incomplete knowledge on to an AI model with limited context, limited awareness, and 1% of your mental model and intent at the moment you wrote the quick spec.

This is clearly not a recipe for reliable, resilient, long-lasting code, or even efficient code. Spec-driven development doesn't work when the spec is frozen and the builder cannot renegotiate intent mid-flight.

The second point made clearer in the video is the kind of learned patterns that can delude a coder, who is effectively 'doing the hard part', into thinking that the AI is the smart one. Or into thinking that the AI is more capable than it actually is.

I say this as someone who uses Claude Code and Codex daily. The claims of the article (and video) aren't strawman.

Can we progress past them? Perhaps, if we find ways to have agents iteratively improve designs on the fly rather than sticking with an original spec that, let's be honest, wasn't given rigor proportional to what we've asked the LLMs to accomplish. If our workflows somehow make the spec a living artifact again, then agents can continuously re-check assumptions, surface tradeoffs, and refactor toward coherence instead of clinging to the first draft.

The tale of the coder who finds a legacy codebase (sometimes of their own making) and looks at it with bewilderment is not new. It's a curious one, to a degree, but I don't think it has much to do with vibe coding.
I've gone through this cycle too, and what I realized is that as a developer a large part of your job is making sure the code you write works, is maintainable, and you can explain how it works.
  • jrm4
  • ·
  • 1 hour ago
  • ·
  • [ - ]
I feel like the vast majority of articles on this are little more than the following:

"AI can be good -- very good -- at building parts. For now, it's very bad at the big picture."

Claude Code slOpus user. No surprise this is their conclusion.
I read that people just allow Claude Code free rein, but after using it for a few months and seeing what it does, I wonder how much of that ends up in front of users. CC is as incredible as it is frustrating, and a lot of what it churns out is utter rubbish.

I also keep seeing that writing more detailed specs is the answer and retorts from those saying we’re back to waterfall.

That isn’t true. I think more of the iteration has moved into the spec. Writing the code is so quick now that you can make spec changes you wouldn’t have dared to before.

You also need gates like tests and you need very regular commits.

I’m gradually moving towards more detailed specs in the form of use cases and scenarios along with solid tests and a constantly tuned agent file + guidelines.

Through this I’m slowly moving back to letting Claude loose on implementation, knowing I can scan the git diffs, versus dealing with a thousand ask-before-edit prompts and slowing things down.

When this works you start to see the magic.

  • leesec
  • ·
  • 50 minutes ago
  • ·
  • [ - ]
OK. Top AI labs have people using LLMs for 100% of their code. Enjoy writing by hand tho
"They got more VC than me, therefore they are right".

You gotta have a better argument than "AI Labs are eating their own dogfood". Are there any other big software companies doing that successfully? I bet yes, and think those stories carry more weight.

Unless someone shows their threads of prompts or an unedited stream of them working, it's pointless to put any weight on their opinions.

This is such an individualized technology that two people at the same starting point two years ago could've developed wildly different workflows.

That's the sad part. Empiricism is scarce when people and companies are incentivized to treat their AI practices as trade secrets. It's fundamentally distinct from prior software movements which were largely underwritten by open, accessible, and permissively-licensed technologies.
False dichotomy. There is a happy medium where you can orchestrate the agent to give you the code you want even when the spec changes
  • ·
  • 49 minutes ago
  • ·
  • [ - ]
Good luck finding an employer that lets you do this moving forward. The new reality is that no one can give the estimates they previously gave for tasks.

"Amazingly, I’m faster, more accurate, more creative, more productive, and more efficient than AI, when you price everything in, and not just code tokens per hour."

For 99.99% of developers this just won't be true.

I wish more critics would start showcasing examples of code slop. I'm not saying this because I defend the use of AI coding, but because many junior devs who read these types of articles/blog posts may not know what slop is or what it looks like. Simply put, you don't know what you don't know.
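
For what it's worth, here's a toy illustration of the flavor people usually mean: needless indirection wrapped around a one-line operation, versus the straightforward version.

    # The "slop" flavor: a factory, a config dict, and a lambda
    # wrapped around what is ultimately one line of work.
    class GreetingStrategyFactory:
        def __init__(self, config: dict):
            self.config = config

        def create(self):
            return lambda name: f"{self.config.get('prefix', 'Hello')}, {name}!"

    def greet_user(name: str) -> str:
        factory = GreetingStrategyFactory({"prefix": "Hello"})
        return factory.create()(name)

    # The straightforward version of the same behavior.
    def greet(name: str) -> str:
        return f"Hello, {name}!"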
  • joomy
  • ·
  • 2 hours ago
  • ·
  • [ - ]
The title alone reads like the "digging for diamonds" meme.
He might be coding by hand again, but the article itself is AI slop
two years of vibecoding experience already?

His points about why he stopped using AI are the things we reluctant AI adopters have been saying since this all started.

The practice is older than the name, which is usually the way: first you start doing something frequently enough you need to name it, then you come up with the name.
Everything the OP says can be true, but there’s a tipping point where you learn to break through the cruft and generate good code at scale.

It requires refactoring at scale, but GenAI is fast so hitting the same code 25 times isn’t a dealbreaker.

Eventually the refactoring is targeted at smaller and smaller bits until the entire project is in excellent shape.

I’m still working on Sharpee, an interactive fiction authoring platform, but it’s fairly well-baked at this point and 99% coded by Claude and 100% managed by me.

Sharpee is a complex system and a lot of the inner-workings (stdlib) were like coats of paint. It didn’t shine until it was refactored at least a dozen times.

It has over a thousand unit tests, which I’ve read through and refactored by hand in some cases.

The results speak for themselves.

https://sharpee.net/ https://github.com/chicagodave/sharpee/

It’s still in beta, but not far from release status.

When using GenAI (and not), enforcing guardrails is critical. Documenting decisions is critical.

Sharpee’s success is rooted in this, and it’s recorded:

https://github.com/ChicagoDave/sharpee/tree/main/docs/archit...