replace github.com/Masterminds/semver/v3 => github.com/Masterminds/semver/v3 v3.4.0
I found this very questionable PR[0]. It appears to have been triggered by dependabot creating an issue for a version upgrade, which is probably unnecessary to begin with. The copilot agent then implemented it by adding a replace statement, which is not how you are supposed to do this. It also included some seemingly unrelated changes. The copilot reviewer called out the unrelated changes, but the human maintainer apparently didn't notice and merged anyway. There is just so much going wrong here.
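For reference, the idiomatic Go fix is to bump the `require` directive, not to add a `replace` (which is meant for local forks and temporary overrides). A sketch of what go.mod should contain:

```go
// go.mod -- pin the version in the require block; running
// `go get github.com/Masterminds/semver/v3@v3.4.0` makes this
// edit (and updates go.sum) for you.
require github.com/Masterminds/semver/v3 v3.4.0
```

A `replace` that maps a module to itself at a fixed version mostly just masks whatever version the require graph would otherwise select.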
It's worse with renaming things in code. I've yet to see an agent be able to use refactoring tools (if they even exist in VS Code) instead of brute-forcing renames with string replacement or sed. Agents use edit -> build -> read errors -> repeat, instead of using a reliable tool, and it burns a lot more GPU...
For the second, I totally agree. I continue to hope that agents will get better at refactoring, and I think using LSPs effectively would make this happen. Claude took dozens of minutes to perform a rename which JetBrains would have executed perfectly in like five seconds. Its approach was to make a change, run the tests, do it again. Nuts.
That's their strategy for everything the training data can't solve. This is the main reason the autonomous agent swarm approach doesn't work for me. 20 bucks in tokens just obliterated with 5 agents exchanging hallucinations with each other. It's way too easy for them to amplify each other's mistakes without a human to intervene.
When using codex, I usually have something like `Never add 3rd party libraries unless explicitly requested. When adding new libraries, use `cargo add $crate` without specifying the version, so we get the latest version.` and it seems to prevent this issue entirely.
Though that is, at least to me, a bit of an anti-pattern for exactly that reason. I've found it far more successful to blow away the context and restart with a new prompt from the old context instead of having a very long-running back-and-forth.
It's better than it was with the latest models; I can have them stick around longer, but it's still a useful pattern even with 4.6/5.3.
Think about what a developer would do:
- check the latest version online;
- look at the changelog;
- evaluate whether it's worth upgrading, or whether an intermediate version is enough in case code changes are necessary.
Of course, you can keep these operations among the human ones, but if you really want to automate this part (and are ready to pay the consequences) you need to mimic the same workflow. I use Gemini and codex to look up package version information online; they check the changelogs from the version I'm on to the one I'd like to upgrade to. I spawn a Claude Opus subagent to check whether anything in the code needs to be updated. In case of major releases, I git clone the two packages and another subagent checks whether the interfaces I use changed. Finally, I run all my tests and verify everything's alright.
Yes, it still might not be perfect, but neither am I.
The AI hasn't understood what's going on, instead it has pattern matched strings and used those patterns to create new strings that /look/ right, but fail upon inspection.
(The human involved is also failing my Turing test... )
I stopped using GH actions when I ran into this issue: https://github.com/orgs/community/discussions/151956#discuss...
That was almost a year ago, and to this date I still get updates from people running into the same issue.
The solution seems simple. Buy their product.
That's why it's an issue.
I've quoted the response on that ticket below. Is there something you disagree with? The "issue" is that usage exceeds the amount that's been paid. The solution sounds pretty simple: pay for your usage. Is your experience different somehow?
> If usage is exceeded, you need to add a payment method and set a spending limit (you can even set it to $0 if you don’t want to allow extra charges).
> If you don’t want to add billing, you’ll need to wait until your monthly quota resets (on the first day of the next month).
Edit: also, one of the other comments says this:
> If you’re experiencing this issue, there are two primary potential causes:
> Your billing information is incorrect. Please update your payment method and ensure your billing address is correct.
> You have a budget set for Actions that is preventing additional spend. Refer to Billing & Licensing > Budgets.
Buying half baked software would probably encourage this. Quarter baked software!
> people falling into the same issue.
Every SaaS provider with a free tier has this issue. How do you suggest it should be addressed?
I literally do not use it, and no my account isn’t compromised. Trying to trick people into paying? Seems cartoonishly stupid but…
GitHub Actions is the last organization I would trust to recognize a security-first design principle.
Especially on the angle of automatic/continuous improvement (https://github.github.io/gh-aw/blog/2026-01-13-meet-the-work...)
Often code is seen as an artifact that is valuable by itself. This was an incomplete view before, and it is now a completely wrong view.
What is valuable is how code encodes the knowledge of the organization building it.
But what is even more valuable is that knowledge itself, embedded in the people of the organization.
Which is why continuous and automatic improvement of a codebase is so important. We all know that code rots with time and feature requests.
But at the same time, abruptly changing the whole codebase architecture destroys the mental model of the people in the organization.
What I believe will work is a slow stream of small improvements, a stream that can be digested by the people in the organization.
In this context I find it more useful to mix and control deterministic execution with a sprinkle of intelligence on top. So a deterministic system figures out what is wrong, with whatever definition of wrong makes sense. And then LLMs actually fix the problem, when necessary.
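A minimal sketch of that split, where the deterministic check is a plain rule and `fix_with_llm` is a hypothetical stand-in for whatever model call you would wire up:

```python
def find_long_lines(source: str, limit: int = 100) -> list[int]:
    """Deterministic step: a plain rule decides what counts as 'wrong'."""
    return [n for n, line in enumerate(source.splitlines(), start=1)
            if len(line) > limit]

def fix_with_llm(source: str, line_numbers: list[int]) -> str:
    """Hypothetical LLM step: only invoked once the deterministic
    check has already found something to fix."""
    raise NotImplementedError("wire up your model client here")

def run(source: str) -> str:
    issues = find_long_lines(source)
    if not issues:  # nothing wrong -> no tokens spent
        return source
    return fix_with_llm(source, issues)
```

The point is that the model never decides *whether* something is wrong, only *how* to fix it, and it is only invoked (and billed) when the deterministic check fires.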
I expend a lot of effort preparing instructions in order to steer agents in this way, it’s annoying actually. Think Deep Wiki-style enumeration of how things work, like C4 Diagrams for agents.
This is on GitHub's official account. For some reason GitHub is deploying this on GitHub Pages rather than a proper domain?
So this being from github.github.io implies it's published by the "github" account on github.
I would say that GitHub is particularly bad about this as they also use `github.blog` for announcements. I'm not sure if they have any others, but then that's the problem, you can't expect people to magically know which of your different domains are and aren't real if you use more than one. They even announced the github.com SSH key change on github.blog.
Bank: Avoid phishing links, this is what they look like.
Also bank: Here is a link from our actual marketing department that looks exactly like phishing.
but we had a redirect set to https://github.github.io/gh-aw/
Both work and we've fixed the redirect now, thanks
We recently moved this out of the githubnext org to the github org, but short of dedicating some route in github.com/whatever, github.github.io is the domain for pages from the github org.
It’s not like someone else can or could own this link, could they?
I tried out `gh aw init` and hit Y at the wrong prompt. It created a COPILOT_GITHUB_TOKEN on the github repo I happened to be in presumably with a token from my account. That's something that really should have an extra confirmation.
For example, https://github.github.io/gh-aw/blog/2026-01-13-meet-the-work... has several examples of agentic workflows for managing issues and PRs, and those examples link to actual agentic workflow files you can read and use as a starting point for your own workflows.
The value is "delegate chores that cannot be handled by a heuristic". We're figuring out how to tell the story as we go, appreciate the callout!
Basically it feels like a long article that says "we have this new thing that does cool things", but never gives enough concrete details. It probably worked great for you, but it needs to communicate to random people off the street what the win is.
Given GitHub’s already lackluster reputation around security in GHA, I think I’d like to see them address some of GHA’s fundamental weaknesses before layering additional abstractions atop it.
But the implementation is comically awful.
Sure, you can "just write natural language" instructions and hope for the best.
But they couldn't fully get away from their old demons and you still have to pay the YAML tax to set the necessary guardrails.
I can't help but laugh at their example: https://github.com/github/gh-aw?tab=readme-ov-file#how-it-wo...
They wrote 16 words in Markdown and... 19 in YAML.
Because you can't trust the agent, you still have to write tons of gibberish YAML.
I'm trying to understand it: first you grant permissions, and here they only grant read permissions.
And then you grant output permissions, which are actually write permissions on a smaller scope than the previous ones.
Obviously they also absolve themselves from anything wrong that could happen by telling users to be careful.
And they also suggest to setup an egress firewall to avoid the agents being too loose: https://github.com/github/gh-aw-firewall
Why set up an actual workflow engine on infra managed by IT with actual security tooling when you can just stick together a few bits of YAML and Markdown on GitHub, right?
We've fixed the example on the README and hopefully it's clearer now what's going on.
because helping you isn't the goal
the goal is to generate revenue by consuming tokens
and a never ending swarm of "AI" "agents" is a fantastic way to do that
If I had a nice CI/CD workflow that was built into GitHub rather than rolling my own that I have running locally, that might just make it a little more automatic and a little easier.
The sensible case for this is for delivering human-facing project documentation, not actual code. (E.g. ask the AI agent to write its own "code review" report after looking at recent commits.) It's implemented using CI/CD solutions under the hood, but not real CI/CD.
I’m working on an open source project called consensus-tools that sits above systems like this and focuses on that gap. Agents do not just act, they stake on decisions. Multiple agents or agents plus humans evaluate actions independently, and bad decisions have real cost. This reduces guessing, slows risky actions, and forces higher confidence for security sensitive decisions. Execution answers what an agent can do. Consensus answers how sure we are that it should do it.
[0] https://github.blog/changelog/2025-08-15-github-actions-poli...
Also, a reminder: if you run Codex/Claude Code/whatever directly inside a GitHub Action without strong guardrails, you risk leaking credentials or performing unsafe write actions.
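As one concrete guardrail (standard GitHub Actions syntax, nothing gh-aw-specific), you can scope the job token down to read-only so the agent can't push or open PRs with it; the environment name below is a hypothetical example:

```yaml
permissions:
  contents: read   # job token can read the repo but not write; unlisted scopes default to "none"

jobs:
  agent:
    runs-on: ubuntu-latest
    environment: agent-sandbox   # hypothetical: gate real secrets behind a protected environment
    steps:
      - uses: actions/checkout@v4
      # run the agent here with only the access granted above
```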
Two years, then we'll know if and how this industry has completely been revolutionized.
By then we'd probably have an AGI emulator, emulated through agents.
If you are changing your product for AI - you don’t understand AI. AI doesn’t need you to do this, and it doesn’t make you an AI company if you do.
AI companies like Anthropic, OpenAI, and maybe Google will simply integrate at a more human level and use the same tools humans used in the past, but with higher speed and reliability.
All this effort is wasted, as AI doesn’t need it, and your company is spending millions, maybe billions, to be an AI company that will likely be severely devalued as AI advances.
As for the domain, this is the same account that has been hosting Github projects for more than a decade. Pretty sure it is legit. Org ID is 9,919 from 2008.
Or you can be full accelerationist and give an agent the role of standing up all the agents. But then you need someone with the job of being angry when they get a $7000 cloud bill.
GH just doesn't really have much of a value proposition for anything that isn't a non-trivial, star-gathering-obsessed project IMO...
1: https://thenewstack.io/github-will-prioritize-migrating-to-a...
Edit: typo
This is early research out of GitHub Next building on our continuous AI [1] theme, so we'd love for you to kick the tires and share your thoughts. We'd be happy to answer questions, give support, whatever you need. One of the key goals of this project is to figure out how to put guardrails around agents running in GitHub Actions. You can read more about our security architecture [2], but at a high level we do the following:
- We run the agent in a sandbox, with minimal to no access to secrets
- We run the agent in a firewall, so it can only access the sites you specify
- We have created a system called "*safe outputs*" that limits what write operations the agent can perform to only the ones you specify. For example, if you create an Agentic Workflow that should only comment on an issue, it will not be able to open a new issue, propose a PR, etc.
- We run MCPs inside their own sandboxes, so an attacker can’t leverage a compromised server to break out or affect other components
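For readers who haven't seen one: an agentic workflow is a Markdown file with YAML frontmatter. The sketch below follows the shape described above (a read-only token plus a single safe output), though the exact frontmatter field names should be checked against the gh-aw docs rather than taken from here:

```markdown
---
on:
  issues:
    types: [opened]
permissions:
  contents: read    # the agent's token is read-only
safe-outputs:
  add-comment:      # the only write the agent may perform
---

Read the newly opened issue and post one short, helpful comment
pointing at relevant documentation. Do not open issues or PRs.
```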
We find that there's something very compelling about the shape of this — delegating chores to agents in the same way that we delegate CI to actions. It's certainly not perfect yet, but we're finding new applications for this every day and teams at GitHub are already creating agentic workflows for their own purposes, whether it's engineering or issue management or PR hygiene.
> Why is it on github.github.io and not github.com?
GitHub Pages domains are always ORGNAME.github.io. Now that we've moved the repo over to the `github` org, that's the domain. When this graduates from being a technology preview to a full-on product, we imagine it'll get a spot on github.com/somewhere.
> Why is GitHub Next exploring this?
Our job at GitHub is to build applications that leverage the latest technology. There are a lot of applications of _asynchronous_ AI which we suspect might become way bigger than _synchronous_ AI. Agentic Workflows can do things that are not possible without an LLM. For example, there's no linter in existence that can tell me if my documentation and my code has diverged. That's just one new capability. We think there's a huge category of these things here and the only way to make it good is to … make it!
> Where can I go to talk with folks about this and see what others are cooking with it?
https://gh.io/next-discord in the #continuous-ai channel!
[1] https://githubnext.com/projects/continuous-ai/
[2] https://github.github.io/gh-aw/introduction/architecture/
(edit: right I forgot that HN doesn't do markdown links)
YAML: check
Markdown: check
Wrong level of abstraction: check
Shit slop which will be irrelevant in less than a year time: check
Manager was not PIP'd: check
I’m getting to the point of throwing Jenkins back in, it’s that bad.
GitHub gives git a bad name and reputation.
People like Nadella must think that developers are the weakest link: Extreme tolerance for Rube Goldberg machines, no spine, no sense of self-protection.
I'll cancel my paid GitHub account though.