I kinda regret going through the SeLU paper lol back in the late 2010s.
How do you both hold that the technology is so revolutionary because of its productivity gains, but at the same time so esoteric that you better be on top of everything all the time?
This stuff is all like a weird toy compared to other things I have taken the time to learn in my career. The sense of expertise people claim comes off to me like a guy who knows the Taco Bell secret menu, or the best set of coupons to use at Target. It's the opposite of intimidating!
Is this the product? I don't want to jump on the detractor wagon, but I read the post and watched the video, and all I gathered is that it dumps the context into the commit. I already do this.
(I will give the agent boom a bit of credit: I write a lot more documentation now, because it's essentially instruction and initial instruction to anything else that works on it. That's a total inversion, and I think it's good.)
The bigger problem is, like others have said, there's no one true flow. I use different agents for different things. I might summarize a lot of reasoning with a cheap model to create a design document, or use a higher reasoning model to sanity check a plan, whatever. It's a lot like programming in English. I don't want my tool to be prescriptive and imposing its technical restrictions on me.
All of that aside: it's impossible that this tool raised $60 million. The problem with this post is that it's supposed to be a hype post about changing the game "entirely" but it doesn't give us a glimpse into whatever we're supposed to be hyped about.
Then later if it goes off piste in another session tell it to re-read the ADDs for x, y and z.
If someone could make that process less clunky, that would be great. However, it's very much not just funneling every turd uttered in the prompt onto a git branch and trying to chug the lot down every session.
The same very online group endlessly hyping messy techs and frontend JS frameworks, oblivious to the Facebook and Google sized mechanics driving said frameworks, are now 100x-ing themselves with things like “specs” and “tests” and dreaming big about type systems and compilers we’ve had for decades.
I don’t wanna say this cycle is us watching Node jockeys discover systems programming in slow motion through LLMs, but it feels like that sometimes.
Just say what your thing does. Or, better yet, show it to me in under 60 seconds.
Web sites are the new banner ads and headings like that are the new `<blink>`.
It's been like this since the Dotcom era
Or did you forget that you can do anything at zombo.com?
It appears to be rather slow today, but here's a Wiki link for the uninitiated- https://en.wikipedia.org/wiki/Zombo.com
It's still around, but has been redesigned and it's under "new management". Further proof that the internet is dying.
Edit: Actually it may just be aimed at investors. Who cares about having a product?
The fact that the first image you see has "$60M seed" in big text, I have to agree, this does not feel aimed at devs.
It's almost like an extension of the "if you're not paying for the product, you are the product" idea. If you're assessing a tool like this and the marketing isn't even trying to communicate to you, the user, what the product does, aren't you also kind of "the product" in this case too?
Yes yes a Dropbox comment. But the problem here is 1 million people are doing the same thing. For this to be worth 60M seed I suspect they need to do something more than you can achieve by messing around locally.
"Claude build me a script in bash to implement a Ralph loop with a KV store tied to my git commits for agent memory."
But I'm skeptical of building this as a separate platform rather than as tooling on top of git. The most useful AI dev workflow improvements I've seen (cursor rules, aider conventions, claude hooks) all succeeded precisely because they stayed close to existing tools. The moment you ask developers to switch their entire SDLC stack, adoption becomes the real engineering challenge - not the tech.
Curious whether the open source commitment means the checkpoint format itself will be an open spec that other tools can build on.
So is this just a few context.md files that you tell the agent to update as you work and then push it when you are done???
The AI fatigue is real, and the cooling-off period is going to hurt. We’re deep into concept overload now. Every week it’s another tool (don’t get me started on Gas Town) confidently claiming to solve… something. “Faster development”, apparently.
Unless you’re already ideologically committed to this space, I don’t see how the average engineer has the energy or motivation to even understand these tools, never mind meaningfully compare them. That’s before you factor in that many of them actively remove the parts of engineering people enjoy, while piling on yet another layer of abstraction, configuration, and cognitive load.
I’m so tired of being told we’re in yet another “paradigm shift”. Tools like Codex can be useful in small doses, but the moment it turns into a sprawling ecosystem of prompts, agents, workflows, and magical thinking, it stops feeling like leverage and starts feeling like self-inflicted complexity.
The “I’m so tired of being told we’re in another paradigm shift” comments are widely heard and upvoted on HN and are just so hard to comprehend today. They are not seeing the writing on the wall and following where the ball is going to be even in 6-12 months. We have scaling laws, multiple METR benchmarks, internal and external evals of a variety of flavors.
“Tools like codex can be useful in small doses” the best and most prestigious engineers I know inside and outside my company do not code virtually at all. I’m not one of them but I also do not code at all whatsoever. Agents are sufficiently powerful to justify and explain themselves and walk you through as much of the code as you want them to.
This is why I use the copilot extension in VS code. They seem to just copy whatever useful thing climbs to the surface of the AI tool slop pile. Last week I loaded up and Opus 4.6 was there ready to use. Yesterday I found it has a new Claude tool built in which I used to do some refactoring... it worked fine. It's like having an AI tool curator.
I also keep getting job applications for AI-native 'developers' whatever that means.
This sounds like my current "phase" of AI coding. I have had so many project ideas for years that I can just spec out, everything I've thought about, all the little ideas and details, things I only had time to think about, never implement. I then feed it to Claude, and watch it meet my every specification, I can then test it, note any bugs, recompile and re-test. I can review the code, as you would a Junior you're mentoring, and have it rewrite it in a specific pattern.
Funnily enough, I love Beads, but did not like that it uses git hooks for the DB, and I can't tie tickets back to ticketing systems, so I've been building my own alternative; mine just syncs to and from GitHub issues. I think this is probably overkill for what's been a solved thing: ticketing systems.
And I use git hooks on the tool event to print the current open gate (subtask) from task.md so the agent never deviates from the plan; this is important if you use yolo mode. It might be an original technique, as I have never heard of anyone using it: a sticky note in the tool response, printed by a hook, that highlights the current task and where the current task.md is located. I have seen stretches of 10 or 15 minutes of good work done this way with no user intervention. Like a "Markdown Turing Machine".
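A minimal sketch of the sticky-note idea above, in Python. It assumes task.md lists one gate per line as a Markdown checkbox (`- [ ]` open, `- [x]` closed); that format, the `[plan]` prefix, and the function names `current_gate` and `sticky_note` are all illustrative assumptions, not the commenter's actual setup. How the hook gets wired into a given agent tool is left out.

```python
import re
from typing import Optional

# Assumed format: one gate per line as a Markdown checkbox,
# e.g. "- [ ] can I curl this?" (open) or "- [x] can I build this?" (closed).
CHECKBOX = re.compile(r"^\s*[-*]\s+\[(?P<mark>[ xX])\]\s+(?P<text>.+?)\s*$")


def current_gate(task_md: str) -> Optional[str]:
    """Return the text of the first unchecked item, or None if every gate is closed."""
    for line in task_md.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group("mark") == " ":
            return match.group("text")
    return None


def sticky_note(task_md: str, path: str = "task.md") -> str:
    """Format the one-line reminder a hook would print into the tool response."""
    gate = current_gate(task_md)
    if gate is None:
        return f"[plan] All gates in {path} are closed."
    return f"[plan] Current open gate: {gate} (see {path})"
```

A hook script would read the file and print `sticky_note(text, "task.md")` after each tool call, so the open gate is always the most recent thing in the agent's context.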
For me a gate is: a dependency that must pass before a task is closed. It could be human verification, unit testing, or even "can I curl this?" "can I build this?" and gates can be re-used, but every task MUST have one gate.
My issue with git hooks integration at that level is, and I know this sounds crazy, that not everyone is using git. I run into legacy projects, or maybe it's still greenfield as heck, and all you have is a POC zip file your manager emailed you for whatever awful reason. I like my tooling to be agnostic to models and external tooling so it can easily integrate everywhere.
Yours sounds pretty awesome for what it's worth, just not for me, wish you the best of luck.
I'm confused how this is any different to the pretty standard agentic coding workflow?
Beads is a nightmare.
Gotta bully that thing man. There's probably room in the market for a local tool that strips the superfluous niceties from instructions. Probably gonna save a material amount of tokens in aggregate.
Oh, never mind, it’s some MS dude.
EDIT: Or just keep a proper (technical) changelog.txt file in the repo. A lot of the "agentic/LLM engineering frameworks" boil down to best approaches and proper standards the industry should have been following decades ago.
I don't see the need for a full platform that is separate from where my code already lives. If I'm migrating away, it's to something like tangled, not another VC funded company
The readme is a bit more to the point.
I wanted to more or less build Jira for agents and track the context there.
If I had to guess, 60 million is just enough to build the POC out. I don't see how this can compete though; OpenAI or Anthropic could easily spin up a competitor internally.
Commit hook > Background agent summarizes (in a data structure) the work that went into the commit > saves to a note
Built similar (with a better name) a week ago at a hackathon: https://github.com/eqtylab/y
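The hook > summarize > note pipeline described above could look roughly like this in Python. This is a sketch under stated assumptions, not how the linked project works: `summarize_session` is a placeholder for the background agent, the `agent-context` notes ref is a made-up namespace, and the summary fields are invented for illustration. Only `git notes add` with `--ref`, `-f`, and `-m` is real git CLI.

```python
import json
import subprocess
from typing import Dict, List

NOTES_REF = "agent-context"  # hypothetical notes namespace, not a git default


def summarize_session(transcript: List[str]) -> Dict[str, object]:
    """Placeholder for the background summarizer; a real one would call a model."""
    return {
        "turns": len(transcript),
        "first_prompt": transcript[0] if transcript else "",
    }


def notes_command(commit: str, summary: Dict[str, object]) -> List[str]:
    """Build the `git notes add` invocation that attaches the summary as JSON."""
    payload = json.dumps(summary, sort_keys=True)
    return ["git", "notes", f"--ref={NOTES_REF}", "add", "-f", "-m", payload, commit]


def save_note(commit: str, transcript: List[str]) -> None:
    """Run from a post-commit hook: summarize, then store the note on the commit."""
    subprocess.run(notes_command(commit, summarize_session(transcript)), check=True)
```

Keeping the summary in a notes ref leaves commit hashes untouched, and `git log --notes=agent-context` shows it alongside the history.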
Context management is still an important human skill in working with an agent, and this makes it harder.
I guess when you are Ex-Github CEO, it is that easy raising a $60M seed. I wonder what the record for a seed round is. This is crazy.
I tried a similar(-ish) thing last year at https://github.com/imjasonh/cnotes (a Claude hook to write conversations to git notes) but ended up not getting much out of it. Making it integrated into the experience would have helped, I had a chrome extension to display it in the GitHub UI but even then just stopped using it eventually.
I welcome more innovation in the code forge space, but if you’re looking for an OSS alternative just for tracking agent sessions with your commits you should check out agentblame
Tech marketing has become a lot like dating, no technical explanation and intellectual honesty, just word words words and unreasonable expectations.
People usually cannot be honest in their romantic affairs, and here it is the same. Nobody can state: we just want to be between you and whatever you want to accomplish, rent seeking forever!
Will they ever care to elaborate HOW things work and the rationale behind stating this provides any benefit whatsoever? Perhaps this is not intended for the type of humans that care about understanding and logic?
But seriously, $300M valuation for a CLI tool that adds some metadata to Git commits. I don't know what to say.
If you're approaching this problem-space from the ground up, there are just so many fundamental problems to solve that it seems to me that no amount of money or quality of team can increase your likelihood of arriving at enough right answers to ensure success. Pulling off something like this vision in the current red-ocean market would require dozens of brilliant ideas and hundreds of correct bets.
I see zero reason for a person to care about the checkpoints.
And for agents, full sessions just needlessly fill context.
So not sure what is being solved by this.
It's like complaining about the availability of the printing press because it proliferated tabloid production, while preferring beautifully hand-crafted tomes. It's reactively trendy to hate on it because of the vulgar production it enables and to elevate the artisanal extremes that escape its apparent influence
Surely if all software is augmented with agentic development now, our most important space probes have had their software augmented too, right?
What about my blog that I serve static pages on? What about the X-ray machine my dentist uses? What about the firmware in my toaster? Does the New York Stock Exchange use AI to action stock trades? What about my telescope's ASCOM driver?
Blog: I use AI to make and blog developers are using agentic tools
X-ray machine: again a little late here, plus if you want to start dragging in places that likely have a huge amount of bureaucracy I don’t know that that’s very fair
Firmware in your toaster: cmon these are old basic things, if it’s new firmware maybe? But probably not? These are not strong examples
NYSE to action on stock trades; no they don’t use AI to action on stock trades (that would be dumb and slow and horribly inefficient and non-deterministic), but may very well now be using AI to work on the codebase that does
Let’s try to find maybe more impactful examples than small embodied components in toasters and telescopes, 1970s era telescopes that are already past our solar system.
The denial runs deep
"Essentially all software is augmented with Stack Overflow now, or if not, built with technology or on platforms that is."
Agentic development isn't a panacea nor as widespread as you claim. I'd wager that the vast majority of developers treat AI as a more specific search engine to point them in the direction they're looking for.
AI hallucination is still a massive problem. Can't tell you the number of times I've used agentic prompting with a top model that writes code for a package based on the wrong version number or flat out invents functionality that doesn't exist.
If I do it myself, I get the added bonus of actually understanding what the code is doing, which makes debugging any issues down the line way easier. It's also generally better for teams b/c you can ask the 'owner' of a part of the codebase what their intuition is on an issue (trying to have AI fill in for this purpose has been underwhelming for me so far).
Trying to maintain a vibecoded codebase essentially involves spelunking through an unfamiliar codebase every time manual action is needed to fix an issue (including reviewing/verifying the output of an AI tool's fix for the issue).
(For small/pinpointed things, it has been very good. e.g.: write a python script to comb through this CSV and print x details about it/turn this into a dashboard)
Opus 4.5 and 4.6 is where those instances have gone down, waaay down (though still true). Two personal projects I had abandoned after sonnet built a large pile of semi working cruft it couldn’t quite reason about, opus 4.6 does it in almost one shot.
You are right about learning, but consider: you can educate yourself along the way. In some cases it’s no substitute for writing the code yourself, and in many cases you learn a ton more because it’s an excellent teacher and you can try out ideas to see which work best or get feedback on them. I feel I have learned a TON about the space, though unlike when I code it myself I may not be extremely comfortable with the details. I would argue we are about 30% of the way to the point where writing things yourself is not just no longer relevant but a disservice to your company.
His use of bombastic language in this announcement suggests that he has never personally worked on serious software. The deterioration of GitHub under his tenure is not confidence inspiring either, but that of course may have been dictated by Nadella.
If you are very generous, this is just another GitHub competitor dressed up in AI B.S. in order to get funding.
As for SDLC, you can do some good automations if you're very opinionated, but people have diverse tastes in the way they want to work, so it becomes a market selection thing.
Productizing the building blocks of the platform seems like the smart play in today's environment honestly.
I am already overloaded with information (generated by AI and humans) in my day-to-day job. Why do I need this additional context, unless the company I work for just wants to spend more money to store more slop?
How is it different than reversing it, given a PR -> generate prompt based on business context relevant to the repo or mentioned issues -> preserve it as part of PR description
I barely look at git commit history, why should I look for even higher cardinality data, in this case: WTF, are you doing, idiot, I said don't change the logic to make tests pass, I said properly write tests!
There is no Composer 2.0. There is Cursor 2.0 and Composer 1.5.
I couldn't find any references of Composer 2.0 anywhere. When did that come out?
Personally, I don't let LLMs commit directly. I git add -p and write my own commit messages -- with additional context where required -- because at the end of the day, I'm responsible for the code. If something's unclear or lacks context, it's my fault, not the robot's.
But I would like to see a better GitHub, so maybe they will end up there.
Do we have new words for smaller amounts or is this inflation at work?
1. Tom Preston-Werner (Co-founder). 2008 – 2014 (Out for, eh... look it up)
2. Chris Wanstrath (Co-founder). 2014 – 2018
(2018: Acquisition by Microsoft: https://news.ycombinator.com/item?id=17227286)
3. Nat Friedman (Gnome/Ximian/Microsoft). 2018 – 2021
4. Thomas Dohmke (Founder of HockeyApp, some A/B testing thing, acquired by Microsoft in 2014). 2021 – 2025
There is no Github CEO now, it's just a team/org in Microsoft. (https://mrshu.github.io/github-statuses/)
"As a result, every change can now be traced back not only to a diff, but to the reasoning that produced it."
This is a good idea, but I just don't see how you build an entire platform around this. This feels like a feature that should be added to GitHub. Something to see in the existing PR workflow. Why do I want to go to a separate developer platform to look at this information?
In my case I don't want my tools to assume git; my tools should work whether I open SVN, TFS, Git, or a zip file. It should also sync back into my 'human' tooling, which is what I do currently. Still working on it, but it's also free, just like Beads.
On the one hand they think these things provide 1337x productivity gains, can be run autonomously, and will one day lead to "the first 1 person billion dollar company".
And in complete cognitive dissonance also somehow still have fantasies of future 'acquisition' by their oppressors.
Why acquire your trash dev tool?
They'll just have the agents copy it. Hell, you could even outright steal it, because apparently laundering any licensing issues through LLMs short circuits the brains of judges to protohuman clacking rocks together levels.
Imagine being so intellectually lazy that you can't even be bothered to form your own opinion about a product. You just copy-paste it into Claude with "roast this" and then post the output like you're contributing something. That's not criticism, that's outsourcing your personality to an API call. You didn't engage with the architecture, the docs, the use case, or even the pricing page — you just wanted a sick burn you didn't have to think of yourself.
2026: The year everyone fried their brain with Think for Me SaaS.
I personally rarely need to use Google Maps, and if I do it's a glance at the beginning of a trip, and I can find my way there through normal navigation. I might look again if I get lost, whereas I have friends that use it to give directions to go five blocks. I don't think sense of direction is innate either, but it's a muscle you build, and some people choose not to work on that muscle and they suffer the consequences, albeit minor consequences.
I think we are seeing something similar with LLMs with the development and maintenance of reading, planning, creative and critical thinking skills. While some people might have a higher baseline, I think everyone has the ability to strengthen those muscles and the world implores us to in many situations. However, now we can pay Altman $0.0010 cents to offload that workout onto a GPU, much like people do with navigation and maps. Tech companies love to exploit the dopamine-driven response from taking shortcuts and getting somewhere quickly; it's no different here.
I think (/know) the implications of this are much more hazardous than the consequences of not exercising your navigational abilities, and at least with navigation there are fallbacks to assist people (signs, landmarks, etc.). There are no societal fallbacks for LLM-assisted thinking once someone becomes dependent on it for all aspects of analysis, planning and creativity. Once it is taken away (or they can't afford the quality of output they previously had), where do those natural abilities stand? The implications are very terrifying in my opinion.
I'm personally trying to stay as far away as possible from these things. I see where this is heading and it's not as inconsequential as needing Maps to navigate 5 blocks. I do not want my critical thinking skills correlated 1:1 to the quality and quantity of tokens I can afford or have access to, any more than I want my navigational abilities correlated 1:1 to the quality of Maps service available to me.
People will say that this is cope, it's the new calculator, whatever. Have fun; I promise you that not knowing trigonometry but having access to an LLM does not give you the ability to write CAD software. I actually think not using these will give you a huge competitive advantage in the future. Someone who has great navigation skills will likely win a navigational competition in the mountains, or survive longer in certain situations. While the scope of those skills is narrow, it still proves a point[0]. The scope of your reading, critical thinking, creativity and planning skills is not limited.
[0]: It should be noted that some of the world's most high-agency and successful people actually participate in navigation as a sport called orienteering, and spend boatloads of money on it. I wonder why that is?
For any new piece of technology, there is a subset of people whom it will completely and utterly destroy.