I researched this the other day: the recommended (by Anthropic) way to do this is to have a CLAUDE.md with a single line in it:
@AGENTS.md
Then keep your actual content in the other file: https://docs.claude.com/en/docs/claude-code/claude-code-on-t...
The recommended approach has the advantage of separating information specific to Claude Code, but I think that in the long run, Anthropic will have to adopt the AGENTS.md format.
Also, when using separate files, memories will be written to CLAUDE.md, and periodic triaging will be required: deciding what to leave there and what to move to AGENTS.md
Anthropic say "put @AGENTS.md in your CLAUDE.md file" and my own experiments confirmed that this dumps the content into the system prompt in the same way as if you had copied it to CLAUDE.md manually, so I'm happy with that solution - at least until Anthropic give in and support AGENTS.md directly.
Only sane (guaranteed portable) option is for it to be a relative symlink to another file within the same repo, of course. i.e. CLAUDE.md would be -> 'AGENTS.md', not '/home/simonw/projects/pelicans-on-bicycles/AGENTS.md' or whatever.
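To make the relative-vs-absolute point concrete, here is a minimal sketch of creating that link from Python. The helper name `link_claude_to_agents` is made up for illustration, and it assumes a filesystem where symlinks are allowed (Windows may need extra privileges).

```python
import os
import tempfile
from pathlib import Path


def link_claude_to_agents(repo_root: Path) -> Path:
    """Create CLAUDE.md as a relative symlink to AGENTS.md in the same directory."""
    link = repo_root / "CLAUDE.md"
    # The target is the bare filename, so it is resolved relative to the
    # link's own directory and keeps working when the repo is moved or cloned.
    os.symlink("AGENTS.md", link)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "AGENTS.md").write_text("# Project guidance\n")
        link = link_claude_to_agents(root)
        print(os.readlink(link))  # the stored target, not an absolute path
        print(link.read_text())   # reads through the link to AGENTS.md
```

The only thing that matters is the first argument to `os.symlink`: passing an absolute path there would bake one machine's directory layout into the repo.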
But I can’t speak to it working across OSes.
I thought git by default treats symlinks simply as file copies on a fresh clone.
I.e., git may not be aware of the symlink.
discouraging, actually, considering how frequently Claude ignores my AGENTS.md guidance.
> Instead of a bloated API, an MCP should be a simple, secure gateway that provides a few powerful, high-level tools [...] In this model, MCP’s job isn’t to abstract reality for the agent; its job is to manage the auth, networking, and security boundaries and then get out of the way.
Our auth, log diving, infra state, etc, is all usable via cli, and it feels pretty good when pointing Claude at it.
You can do anything you want via a CLI but MCP still exists as a standard that folks and platforms might want to adopt as a common interface.
It's not that practical to have an MCP that can connect to, for example, ALL of your corporate Google Drive. Not happening.
Why isn't it possible to limit it to a specific whitelisted set?
But in general I still don’t really use MCP. Agents are just so good at solving problems themselves. I wish MCP would mostly focus on the auth part instead of the tool part. Getting an agent access to an API with credentials usually gives them enough power to solve problems on their own.
[1]: https://x.com/mitsuhiko/status/1984756813850374578?s=46
I use Claude all the time, and this is probably my biggest issue with it. I just work around it by manually supplying context in prompts, but it’s kind of annoying to do so.
Does anyone else struggle with this or am I just doing something horribly wrong?
I've found myself doing similar workarounds. I'm guessing Anthropic will just make the /compact command do this instead soon enough.
Fun fact, a large chunk of context is reserved for compaction. When you are shown that you have "0% context remaining," it's actually like 30% remaining that's reserved for compaction.
And yet, for some reason I feel like 50% of the time, compaction fails because it runs out of context or hits (non-rate) API limits.
read the document at https://blog.sshh.io/p/how-i-use-every-claude-code-feature and tell me how to improve my Claude code setup
Em dash and "it's not X, it's Y" in one sentence. Tired of reading posts written by AI. Feels disrespectful to your readers
The people who just copy paste output from ai and ship it as a blog post however, deserve significant condemnation for that.
I use AI for code, but I never use it for any writing that is for human eyes.
> the author clearly read through, organized, and edited the output.
Also worth noting, I've read plenty of human written stuff that has errors in it, so I read everything skeptically anyway.
Didn’t realize you were forced to read this?
> Feels disrespectful to your readers
I didn’t feel disrespected—I felt so respected I read the whole thing.
Who gives a shit?
If you can’t stand AI writing and you made it pretty far along before getting upset, who are you mad at, the author or yourself? Would you be happier if you found out this was written without AI and that you were just bad at detecting AI writing?
This is how we are doing it too, for the same reasons. For now, much easier to administer than trying to figure out who spends enough to get a real Claude Code seat. The other nice thing about using API keys is that you basically never hit rate limits.
Are the CLI-based agents better (much better?) than the Cursor app? Why?
I like how easy it is to get Cursor to focus a particular piece of code. I select the text and Cmd-L, saying "fix this part, it's broken like this ____."
I haven't really tried a CLI agent; sending snippets of code by CLI sounds really annoying. "Fix login.ts lines 148-160, it's broken like this ___"
Part of it is the snappy more minimal UX but also just pure efficacy seems consistently better. Claude does its best work in CC. I'm sure the same is true of Codex.
There's even an official Anthropic VS Code extension to run CC in VS Code. The biggest advantage is being able to use VS Code's diff views, which I like more than in the terminal. But the VS Code CC extension doesn't support all the latest features of the terminal CC, so I'm usually still in the terminal.
Really, the interface isn't a meaningful part of it. I also like cmd-L, but claude just does better at writing code.
...also, it's nice that Anthropic is just focusing on making cool stuff (like skills), while the folk from cursor are... I dunno. Whatever it is they're doing with cursor 2.0 :shrug:
The agentic part of the equation is improving on both sides all the time.
Whereas I tried Kilo Code and Copilot and JetBrains' agent and others direct against Sonnet 4 and the output was ... not good ... in comparison.
I have my criticisms of Claude but still find it very impressive.
There is no customer advantage to developing cheap and fast if the delivered product isn't well conceived from a current and future customer-needs perspective, and a quickly shipped product full of bugs isn't going to help anyone.
I think the same goes for AI in general - CEOs are salivating over adopting "AI" (which people like Altman and Amodei are telling them will be human level tomorrow, or yesterday in the case of Amodei), and using it to reduce employee head count, but the technology is nowhere near the human level needed to actually benefit customers. An "AI" (i.e. LLM) customer service agent/chatbot is just going to piss off customers.
In my current project I have a top-level chat, then one chat in each of the four component subdirectories.
I have a second terminal with QA-feature.
So 10 tabs total. Plus I have one to run occasional commands real quick (like docker ps).
I’m using qwen.
Most of the time I'm just pasting code blocks directly into raycast and once I've fixed the bug or got the properly transformed code in the shape that I aimed for, then I paste it back into neovim. Next I'm going to try out "opencode"[0], because I've heard some good things about it. For now, I'm happy with my current workflow.
I recommend using it directly instead of via the plugin
Just to avoid confusion: MCP is like an API, but the underlying API can execute a Skill. So it's not MCP vs. Skill as a contest. It's just the broad concept of a "flexible" skill vs. a "parameter"-based API. And again, parameter-based APIs can also be flexible depending on how we write them, except that they lack the SKILL.md that, in the case of Skills, guides the LLM to be more generic than a pure API.
By the way, if you are a Mac user, you can execute Skills locally via OpenSkills[1], which I created using Apple containers.
[1]: OpenSkills - https://github.com/BandarLabs/open-skills
My concern with hardcoding paths inside a doc is that they will likely become outdated as the codebase evolves.
One solution would be to script it and have it run pre-commit to regenerate the CLAUDE.md with the new paths.
There is probably potential for even more dev tooling that 1. ensures reference paths are always correct, and 2. enforces a standard for how references are documented in CLAUDE.md (and lints things like length).
Perhaps using some kind of inline documentation standard like jsdoc if it’s a ts file, or a naming convention if it’s an md file.
Example:
// @claude.md // For complex … usage or if you encounter a FooBarError, see ${path} for advanced troubleshooting steps
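A rough sketch of the "ensure reference paths are always correct" half, as a check a pre-commit script could call. It assumes a convention that is not from the thread: file references are written in backticks and contain at least one `/`. The function name `find_missing_paths` is hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Assumed convention: file references are wrapped in backticks and contain
# at least one "/" (e.g. `docs/troubleshooting.md`). Backticked URLs would
# also match, so a real checker would need to filter those out.
PATH_REF = re.compile(r"`([^`\s]+/[^`\s]*)`")


def find_missing_paths(doc_text: str, repo_root: Path) -> list[str]:
    """Return referenced paths that do not exist under repo_root, in document order."""
    missing = []
    for ref in PATH_REF.findall(doc_text):
        if not (repo_root / ref).exists():
            missing.append(ref)
    return missing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "docs").mkdir()
        (root / "docs" / "troubleshooting.md").write_text("steps\n")
        doc = "For FooBarError see `docs/troubleshooting.md` or `docs/old-guide.md`."
        for ref in find_missing_paths(doc, root):
            print(f"stale reference: {ref}")
```

A pre-commit script would run this against the real CLAUDE.md and exit non-zero when the returned list is non-empty. It only detects stale paths; regenerating them is a separate problem.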
So now you need to get CC to understand _how_ to do that for various tools in a way that's context efficient, because otherwise you're relying on either potentially outdated knowledge that Claude has built in (leading to errors b/c CC doesn't know about recent versions) or chucking the entirety of a man page into your default context (inefficient).
What the Skill files do is then separate the when from the how.
Consider the git cli.
The skill file has a couple of sentences on when to use the git cli and then a much longer section on how it's supposed to be used, and the "how" section isn't loaded until you actually need it.
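The when/how split can be sketched as a tiny loader: the short description is always indexed, and the long body is read only on demand. This is an illustration, not how Claude Code implements it. It assumes a simplified file layout (a `---` line, flat `key: value` lines, another `---`, then the body) and that each skill lives in a directory named after the skill. All function names are hypothetical.

```python
import tempfile
from pathlib import Path


def split_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (frontmatter fields, body).

    Handles only flat 'key: value' frontmatter; a real YAML parser does more.
    """
    lines = [line.strip() for line in text.splitlines()]
    if not lines or lines[0] != "---":
        return {}, text
    end = lines.index("---", 1)  # closing delimiter; ValueError if missing
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:]).strip()


def build_index(skills_dir: Path) -> dict[str, str]:
    """The 'when': skill name -> short description, cheap enough to always keep."""
    index = {}
    for path in sorted(skills_dir.glob("*/SKILL.md")):
        meta, _ = split_skill(path.read_text())
        index[meta.get("name", path.parent.name)] = meta.get("description", "")
    return index


def load_instructions(skills_dir: Path, name: str) -> str:
    """The 'how': read the long body only once the skill is actually needed."""
    _, body = split_skill((skills_dir / name / "SKILL.md").read_text())
    return body


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skills = Path(tmp)
        (skills / "git").mkdir()
        (skills / "git" / "SKILL.md").write_text(
            "---\nname: git\ndescription: Use for commits, branches and history.\n---\n"
            "Prefer small commits.\nNever force-push shared branches.\n"
        )
        print(build_index(skills))
        print(load_instructions(skills, "git"))
```

Only the output of `build_index` would sit in the default context; `load_instructions` is the part that costs tokens, and it runs per skill, per use.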
I've got skills for stuff like invoking the native screenshot CLI tool on the Mac, for calling a custom shell script that uses the github API to download and pull in screenshots from issues (b/c the cli doesn't know how to do this), for accessing separate APIs for data, etc.
I think of it literally as a collection of .md files and scripts to help perform some set of actions. I'm excited for it not really as a "new thing" (as mentioned in the post) but as effectively an endorsement for this pattern of agent-data interaction.
So if you're building your own agent, this would be a directory of markdown documents with headers that you tell the agent to scan so that it's aware of them, and then if it thinks they could be useful it can choose to read all the instructions into its context? Is it any more than that?
I guess I don't understand how this isn't just RAG with an index you make the agent aware of?
The skills that I use all direct a next action and how to do it. Most of them instruct to use Tasks to isolate context. Some of them provide abstraction specific context (when working with framework code, find all consumers before making changes. add integration tests for the desired state if it’s missing, then run tests to see…) and others just inject only the correct company specific approach to solving only this problem into Task context.
They are composable and you can build the logic table of when an instance is “skilled” enough. I found them worse than hooks with subagents when I started, but now I see them as the coolest thing in Claude code.
The last benefit is nobody on your team even has to know they exist. You can just have them as part of onboarding and everyone can take advantage of what you’ve learned even when working on greenfield projects that don’t have a CLAUDE.md.
For example, if you're writing a command line tool in Python, it doesn't really matter what model you use since they're all really great at Python (LOL). However, if you're writing a complicated SPA that uses say, Vue 3 with Vite (and some fancy CSS framework) and Python w/FastAPI... You want the "smartest" model that knows about all these things at once (and regularly gets updated knowledge of the latest versions of things). For me, that means Claude Code.
I am cheap though and only pay Anthropic $20/month. This means I run out of Claude Credits every damned week (haha). To work around this problem, I used to use OpenAI's pay-per-use API with gpt5-mini with VS Code's Copilot extension, switching to GPT5-codex (medium) with the Codex extension for more complicated tasks.
Now that I've got more experience, I've figured out that GPT5-codex costs way too much (in API credits) for what you get in nearly all situations. Seriously: Why TF does it use that much "usage". Anyway...
I've tried them all with my very, very complicated collaborative editor (CRDTs), specifically to learn how to better use AI coding assistants. So here's what I do now:
* Ollama cloud for gpt-oss:120b (it's so fast!)
* Claude Code for everything else.
I cannot overstate how impressed I am with gpt-oss:120b... It's like 10x faster than gpt5-mini and yet seems to perform just as well. Maybe better, actually, because it forces you to narrow your prompts (due to smaller context window). But because it's just so damned fast, that doesn't matter.
With Claude Code, it's like magic: You give it a really f'ing complicated thing to troubleshoot or implement and it just goes, and keeps going until it finishes or you run out of tokens! It's a "the future is now!" experience for sure.
With gpt-oss:120b it's more like having an actual conversation, where the only time you stop typing is when you're reviewing what it did (which you have to do for all the models... Some more than others).
FYI: The worst is Gemini 2.5. I wouldn't even bother! It's such trash, I can't even fathom how Google is trying to pass it off as anything more than a toy. When it decides to actually run (as opposed to responding with, "Failed... Try again"), it'll either hallucinate things that have absolutely nothing to do with your prompt or it'll behave like some petulant middle school kid that pretends to spend a lot of time thinking about something but ultimately does nothing at all.
GPT5-codex (medium) is such a token hog for some reason.
You’ll also end up dealing with merge conflicts if you haven’t carefully split the work or modularized the code.
Please stop expecting every engineer on the team to be an AI engineer just to get started with coding agents.
I have started experimenting with a skills/ directory in my open source software, and then made a plugin marketplace that just pulls them in. It works well, but I don't know how scalable it will be.
It wasn't possible before for me to do any of this at this kind of scale. Before, getting stuck on a bug could mean hours, days, or maybe even weeks of debugging. I never made the kind of progress I wanted before.
Many of the things I want, do already exist, but are often older, not as efficient or flexible as they could be, or just plain _look_ dated.
But now I can pump out react/shadcn frontends easily, generate apis, and get going relatively quickly. It's still not pure magic. I'm still hitting issues and such, but they are not these demotivating, project-ending, roadblocks anymore.
I can now move at a speed that matches the ideas I have.
I am giving up something to achieve that, by allowing AI to take control so much, but it's a trade that seems worth it.
This is basically a "thinking tax".
If you don't want to think and offload it to an LLM, it burns through a lot of tokens to implement, inefficiently, something you could often do in 10 lines if you thought about it for a few minutes.
If you tell me I didn’t really need an LLM to be able to do all that in a week and just some thought and 10 lines of code would do, I suspect you are not really familiar with the latest developments in AI and just vastly underestimate the capabilities they have to do tricky stuff.
That's why it took a week with an LLM. And for you it makes sense as this is new tech.
But if someone knows those technologies - it would still take a week with an LLM and like 2 days without.
Before LLMs we simply wouldn't implement many of those features since they were not exactly critical and required a lot of time, but now that the required development time is cut significantly, they suddenly make sense to implement.
Latest version from 2 months ago, >4700 stars on GitHub
Or I could just tell Claude Code to do it and then spend some time cleaning it up afterwards. I had that thing working quite robustly in days! D A Y S!
(Then I had the bright idea of implementing a "track changes" mode which I'm still working on like a week and a half later, haha)
Even if you were already familiar with all that stuff, it's a lot of code to write to make it work! The stylesheets alone... Ugh! So glad I could tell the AI something like, "make sure it implements light and dark mode using VueUse's `useDark()` feature."
Almost all of my "cleanup" work was just telling it about CSS classes it missed when adding dark mode variants. In fact, most of my prompts are asking it to add features (why not?) or cleaning up the code (e.g. divide things into smaller, more concise files—all the LLMs really love to make big .vue files).
"Writing most of the code"? No. Telling it how to write the code with a robust architecture, using knowledge developed over two decades of coding experience: Yes.
I have to reject some things because they'd introduce security vulnerabilities but for the most part I'm satisfied with what Claude Code spits out. GPT5, on the other hand... Everything needs careful inspection.
If there's enough interest, I might replicate some examples in an open source project.
To see if it is easy to digest, no repeated code etc or is it just slop that should be consumed by another agent and never by human.
Code is no different! You can tell an AI model to write something for you and that's fine! Except you have to review it! If the code is bad quality just take a moment to tell the AI to fix it!
Like, how hard is it to tell the AI that the code it just wrote is too terse and hard to read? Come on, folks! Take that extra moment! I mean, I'm pretty lazy when working on my hobby projects but even I'm going to get irritated if the code is a gigantic mess.
Just tell it, "this code is a gigantic mess. Refactor it into concise, human-readable files using a logical structure and make sure to add plenty of explanatory comments where anything might be non-obvious. Make sure that the code passes all the tests when you're done."
I think we'll be dealing with slop issues for quite some time, but I also have hopes that AI will raise the bar of code in general.
This feels like a false economy to me for real sized changes, but maybe I’m just a weak code reviewer. For code I really don’t care about, I’m happy to do this, but if I ever need to understand that code I have an uphill battle. OTOH reading intermediate diffs and treating the process like actual pair programming has worked well for me, left me with changes I’m happy with, and codebases I understand well enough to debug.
It is why I am a bit puzzled by the people who use an LLM to generate code in anything other than a "tightly scoped" fashion (boilerplate, throwaway code, standalone script, single file, or at the function level). I'm not sure how that makes your job later on any easier if you have an even worse mental model of the code because you didn't even write it. And debugging is almost always more tedious than writing code, so you've traded off the fun/easy part for a more difficult one. Seems like a Faustian deal.
It's much easier to review larger changes when you've aligned on a Claude generated plan up front.
Right now these are reading like a guide to Prolog in the 1980s.
Skills doesn't totally deprecate documenting things in CLAUDE.md but agree that a lot of these can be defined as skills instead.
Skill frontmatter also still sits in the global context so it's not really a token optimization either.
I suggest everyone who can try the voice mode: https://getvoicemode.com/
A sibling comment on hooks mentions some approaches. You could also try leveraging the UserPromptSubmit hook to do some prompt analysis and force relevant skill activation.
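The "prompt analysis" part could be as simple as keyword matching. Below is a sketch of what such a hook script might do. It assumes the hook receives a JSON payload with the user's text under a `prompt` key and that whatever the script prints gets added to context; check the hooks documentation before relying on either. The `TRIGGERS` table and both function names are invented for illustration.

```python
import json
import re

# Hypothetical mapping from skill name to trigger keywords; not from the thread.
TRIGGERS = {
    "git": ["commit", "rebase", "branch"],
    "screenshots": ["screenshot", "screen capture"],
}


def suggest_skills(prompt: str, triggers: dict[str, list[str]]) -> list[str]:
    """Return names of skills whose keywords appear as whole words in the prompt."""
    text = prompt.lower()
    hits = []
    for skill, keywords in triggers.items():
        if any(re.search(rf"\b{re.escape(k.lower())}\b", text) for k in keywords):
            hits.append(skill)
    return hits


def handle_hook_input(raw_json: str) -> str:
    """Turn the hook's JSON payload into a reminder line, or '' if nothing matches.

    Assumes the payload carries the user's text under a "prompt" key.
    """
    payload = json.loads(raw_json)
    skills = suggest_skills(payload.get("prompt", ""), TRIGGERS)
    if not skills:
        return ""
    return "Relevant skills for this request: " + ", ".join(skills)
```

A hook script would then do something like `print(handle_hook_input(sys.stdin.read()))`. Keyword matching is crude, so expect both missed and spurious activations; it is a nudge, not a guarantee.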
If you are using literally any of Claude Code’s features the experience isn’t close, and regardless of model preference (Claude is my least favorite model by far) you should probably use Claude Code. It’s just a much more extensible product for teams.
Losing access to GPT 5 Pro is also a big hit… it is by far the best for reading full files/repos and creating plans (though it also by far has the worst out of the box tooling)
Codex writes higher quality code, but is slower and less feature rich. I imagine this will change within months. The jury is still out. Exciting times!
Maybe CC users haven’t figured out how to parallelize their work because it’s fast enough to just wait or be distracted, and so the Codex waiting seems unbearable.
If no anonymous access is provided, is there a way to create an account with a noscript/basic (x)html/classic web browsers in order to get an API key secret?
Because I do not use web engines from the "whatng" cartel.
To add insult to injury, my email is self-hosted with IP literals to avoid funding the DNS people which are mostly now in strong partnership with the "whatng" cartel (email with IP literals are "stronger" than SPF since it does the same and more). An email is often required for account registration.
At the moment, though, I also code on and off with an agent. I’m not ready or willing to only vibe code my projects. For one, I’ve had tons of examples where the agent gaslighted me only to turn around at the last stage. And in some cases the code output was too result-focused and didn’t think about the broader general usage. And sure, that’s in part because I’m holding it wrong: I don’t specify 10 million markdown files, etc. But it’s a feedback loop system. If I don’t trust the results I don’t jump in deeper. And I feel a lot of developers have no issue with jumping ever deeper: write MCPs, now CLIs, and describe projects with custom markdown files. But I think we really need both camps. Otherwise we don’t move forward.
IMO the best advice in life is try not to be fearful of things that happen to everyone and you can't change.
Good news! What you are afraid of will happen, but it'll happen to everyone all at once, and nothing you can do can change it.
So you no longer need to feel fear. You can skip right on over to resignation. (We have cookies, for we are cooked)