Here's a summary of the key themes from the Hacker News discussion on AI coding agents, with direct quotes:
The Efficacy and Limitations of AI in Code Generation
A central theme is the debate around whether AI, particularly LLMs like Claude, can effectively generate code, and what its practical limitations are. Many users express skepticism or highlight significant drawbacks.
"Using LLMs to code poses a liability most people can't appreciate, and won't admit" (Joel_Mckay)
"Can it work without Linear, using md files?" (bazhand)
"Is it a good idea to generate more code faster to solve problems? Can I solve problems without generating code?" (Frannky)
"If code is a liability and the best part is no part, what about leveraging Markdown files only?" (Frannky)
"Code you didn't write is an even bigger liability, because if the AI gets off track and you can't guide it back, you may have to spend the time to learn its code and fix the bugs." (ehnto)
"Their ability to refactor a codebase goes in the toilet pretty quickly." (CuriouslyC)
"I tried to get Claude to move some code from one file to another. Some of the code went missing. Some of it was modified along the way." (zarzavat)
"I was bored yesterday and I tried to vibe code a simple react app using claude code and it was basically useless. It created a good shell of a code initially, but after 10 minutes I basically had to take over (It would do a feature, then regress the previous one.)" (misiti3780)
"Am I the only one convinced that all of the hype around coding agents like codex and claude is 85% BS ?" (misiti3780)
Subagents: Potential vs. Practicality
The discussion heavily features the use and effectiveness of "subagents" – smaller, specialized AI agents designed to handle specific tasks within a larger project. While some see value in them for modularity and context management, many find them unreliable, especially in complex or large codebases.
"Subagents don't get a full system prompt (including stuff like CLAUDE.md directions) so they are flying very blind in your projects, and as such will tend to get derailed by their lack of knowledge of a project and veer into mock solutions and "let me just make a simpler solution that demonstrates X."" (CuriouslyC)
"I advise people to only use subagents for stuff that is very compartmentalized because they're hard to monitor and prone to failure with complex codebases where agents live and die by project knowledge curated in files like CLAUDE.md." (CuriouslyC)
"If your main Claude instance doesn't give a good handoff to a subagent, or a subagent doesn't give a good handback to the main Claude, shit will go sideways fast." (CuriouslyC)
"It seems like just basic prompting gets me much further than all these complicated extras." (redrove)
"At some point you gotta stop and wonder if you’re doing way too much work managing claude rather than your business problem." (redrove)
"The ideal sub agent is one that can take a simple question, use up massive amounts of tokens answering it, and then return a simple answer, dropping all those intermediate tokens as unnecessary." (lucraft)
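The pattern lucraft describes can be sketched in a few lines. `run_model` below is a stand-in for a real LLM call (the function name and its simulated output are invented for illustration); the point is only that the subagent's intermediate tokens never enter the caller's context:

```python
# Sketch of the "ideal subagent" pattern: run a question in an isolated
# context, let it burn as many intermediate tokens as it needs, and hand
# back only the final answer.

def run_model(messages):
    # Placeholder for a real LLM API call. We simulate a long chain of
    # intermediate reasoning turns followed by a short conclusion.
    return ["intermediate step"] * 50 + ["final answer: 42"]

def ask_subagent(question):
    """Run `question` in a fresh context; return only the last message."""
    scratch = run_model([{"role": "user", "content": question}])
    # All intermediate tokens are discarded along with the subagent.
    return scratch[-1]
```

The caller's history grows by one question and one short answer, regardless of how much scratch work happened inside.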
"Make agents for tasks, not roles." (macrolime)
"Subagents suffer from context amnesia during context handoffs, which is why this subagent use is flawed for the purpose of coding product features." (faangguyindia)
"Subagents SEEM good when you use them on greenfield projects, you can grind out a whole first pass without burning through much of your main context, it seems magical. But when you have a complex project that handoff is the kiss of death." (CuriouslyC)
"Subagents suffer from the same overriding problem with "Claude Contexting", which is context wrangling. Subagents "should" help to compartmentalize and manage your context better, but not in my experience so far." (rapind)
"I see lots of people saying you should be doing it, but not actually doing it themselves. Or at least, not showing full examples of exactly how to handle it when it starts to fail or scale, because obviously when you dont have anything, having a bunch of agents doing any random shit works fine. Frustrating." (noodletheworld)
Context Management and Prompt Engineering
The crucial role of managing the LLM's context window, prompt engineering, and providing structured information (like Markdown files) is a recurring theme. Users discuss various strategies to mitigate context limitations and improve AI performance.
"I am working on my own coding agent and seeing massive improvements by rewriting history using either a smaller model or a freestanding call to the main one. It really mitigates context poisoning." (olivermuty)
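A minimal sketch of what "rewriting history" might look like, assuming a `summarize` stand-in for the smaller model (all names here are illustrative, not olivermuty's actual implementation): once the conversation exceeds a budget, everything but the most recent turns is replaced by a single summary message.

```python
# Hypothetical history compaction: keep the tail of the conversation
# verbatim and collapse the rest into one summary message.

def summarize(messages):
    # Placeholder: a real implementation would ask a cheap model to
    # compress the old turns; here we just record how many there were.
    return {"role": "system",
            "content": f"[summary of {len(messages)} earlier turns]"}

def compact_history(messages, keep_last=4, max_len=10):
    """Return `messages` unchanged if short, else summary + recent tail."""
    if len(messages) <= max_len:
        return messages
    old, recent = messages[:-keep_last], messages[-keep_last:]
    return [summarize(old)] + recent

history = [{"role": "user", "content": f"turn {i}"} for i in range(20)]
compacted = compact_history(history)  # 1 summary + last 4 turns
```

This keeps stale (and possibly poisoned) context out of later calls while preserving the most recent exchange verbatim.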
"One key insight I have from having worked on this from the early stages of LLMs (before chatgpt came out) is that the current crop of LLM clients or "agentic clients" don't log/write/keep track of success over time. It's more of a "shoot and forget" environment right now, and that's why a lot of people are getting vastly different results." (NitwickLawyer)
"I've experimented with feature chats, so start a new chat for every change, just like a feature branch. At the end of a chat I’ll have it summarize the feature chat and save it as a markdown document in the project, so the knowledge is still available for next chats." (ako)
"You can also ask the llm at the end of a feature chat to prepare a prompt to start the next feature chat so it can determine what knowledge is important to communicate to the next feature chat." (ako)
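The feature-chat workflow above could be wired up along these lines (paths and function names are invented for illustration; the summary text itself would come from the LLM at the end of the chat):

```python
# Hypothetical sketch: persist a per-feature markdown summary in the repo,
# then gather all prior summaries to seed the next feature chat.
from pathlib import Path

def save_feature_summary(repo_root, feature, summary):
    """Write the end-of-chat summary to docs/feature-chats/<feature>.md."""
    doc_dir = Path(repo_root) / "docs" / "feature-chats"
    doc_dir.mkdir(parents=True, exist_ok=True)
    path = doc_dir / f"{feature}.md"
    path.write_text(f"# Feature: {feature}\n\n{summary}\n")
    return path

def load_context_for_next_chat(repo_root):
    """Concatenate all prior feature summaries as starting context."""
    doc_dir = Path(repo_root) / "docs" / "feature-chats"
    return "\n\n".join(p.read_text() for p in sorted(doc_dir.glob("*.md")))
```

Because the summaries are committed alongside the code, they version with the project and survive any individual chat session.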
"I match github project issues to md files committed to the repo, essentially; the github issue content is just a link to the md file in the repo. Also, epics are folders with links (+ a readme that gets updated after each task). I am very happy about it too." (rufasterisco)
"DOC_INDEX.md is built around the concept of "read this if you are working on X (infra, db, frontend, domain, ...)"; COMMON_TASKS.md (if you need to do X read Y; if you need to add a new frontend component read HOW_TO_ADD_A_COMPONENT.md). Common tasks tend to increase in quality when they are expressed in a checklist format." (rufasterisco)
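Hypothetically, the DOC_INDEX.md being described might look something like this (all file names except HOW_TO_ADD_A_COMPONENT.md are invented for illustration):

```markdown
# DOC_INDEX.md

Read the doc matching the area you are working on.

- **infra** → read INFRA_OVERVIEW.md
- **db** → read DB_SCHEMA.md
- **frontend** → read FRONTEND_GUIDE.md; to add a component, read HOW_TO_ADD_A_COMPONENT.md
- **domain** → read DOMAIN_GLOSSARY.md

For recurring work, see COMMON_TASKS.md (checklist format).
```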
"There's a large body of research on context pruning/rewriting (I know because I'm knee deep in benchmarks in release prep for my context compiler), definitely don't ad hoc this." (CuriouslyC)
"I do something similar and I have the best results of not having a history at all, but setting the context new with every invocation." (ixsploit)
"Claude can only reliably do this refactoring if it can keep the start and end files in context. This was a large file, so it got lost. Even then it needs direct supervision." (zarzavat)
"Claude’s utility really drops when any task requires a working set larger than the context window." (brookst)
"The subagents are like freelance contractors... Good when they need little handoff... little overseeing and their results are good advice, not an action." (prash2488)
"I've been using subagents since they were introduced and it has been a great way to manage context size / pollution." (stingraycharles)
"I'm wondering if in large projects, you want subagents to avoid having tasks flush out the main context? If you're working with large source files, you might want to do each piece of work in an independent context with the information discarded afterwards?" (jpollock)
"I feel like my cognitive constraints become the limits of this parallelized system. With a single workstream I pollute context, but feel way more secure somehow." (alxh)
"I feel like "we all" are trying to do something similar, in different ways, and in a fast moving space... My gut feeling from past experiences is that we have git, but not git-flow yet: a standardized approach that is simple to learn and implement across teams." (rufasterisco)
"parent/CLAUDE.md provides a highlevel view of the stack "FastAPI backend with postgres, Next.js frontend using with tailwind, etc". The parent/CLAUDE.md also points to the childrens CLAUDE.md's which have more granular information." (skimojoe)
"I then just spawn a claude in the parent folder, set up plan mode, go back and forth on a design and then have it dump out to markdown to RFC/ and after that go to work. I find it does really well then as all changes it makes are made with a context of the other service." (skimojoe)
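A rough sketch of the layout skimojoe is describing, with illustrative directory names (only the stack description and the RFC/ folder come from the quote):

```text
repo/
├── CLAUDE.md        # high level: "FastAPI backend with postgres,
│                    #  Next.js frontend with tailwind"; links to the
│                    #  children CLAUDE.md files below
├── backend/
│   └── CLAUDE.md    # granular backend conventions
├── frontend/
│   └── CLAUDE.md    # granular frontend conventions
└── RFC/             # design docs dumped from plan mode before coding
```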
AI as an Assistant vs. Autonomous Agent
There's a clear split in opinion on whether LLMs should be treated as autonomous agents or as sophisticated assistants that require human guidance and oversight. The former is often seen as unreliable, while the latter offers more promise.
"Claude is a junior. The more you work with it, the more you get a feel for which tasks it will ace unsupervised (some subset of grunt work) and which tasks to not even bother using it for." (zarzavat)
"For me the slow part is determining what to write. And while AI helps with that (search, brainstorm, etc) by the time I know what to write trying to get the AI to enter those lines is often just a slow down. Much like writing up a ticket for a junior, I could write the code faster than I could write the English language rules describing how to write that code." (shaisjsh)
"I don't use subagents to do things, they're best for analysing things." (theshrike79)
"I have a project that's still in early stages that can monitor queries in clickhouse for agent failures, group/aggregate into post mortem classes, then do system parameter optimization on retrieval/document annotation system and invoke DSPy on low efficacy prompts." (CuriouslyC)
"I too have discovered that feature chats are surely a winner (as well as a pre-requirement for parallelization)" (rufasterisco)
"I ask the bot to come up with a list of "don't dos"/lessons learned based on what went right or required lots of edits. Then I have it merge them in to an ongoing list. It works OK." (dpkirchner)
"I don't trust Claude to write reams of code that I can't maintain except when that code is embarrassingly testable, i.e it has an external source of truth." (zarzavat)
"I've got this down to a science." (user3939382)
"I'm training myself to have the muscle memory for putting it into planning mode before I start telling it what to do." (taspeotis)
"I pretty much always attach (insert library here) LLM.txt as context, or a direct link to the documentation page for (insert framework feature) Not very agentic but it works a lot more." (jondwillis)
"I feel like any time gained by overusing an LLM will be offset by having to debug its code when it messes things up." (dutchCourage)
"All of this stuff seems completely insane to me and something my coding agent should handle for me. And it probably will in a year." (jackblemming)
"We’re still in the very early days of AI agents. Honestly, just orchestrating CC subagents alone could already be a killer product." (tonkinai)
"The problem is that a lot of people work on these things in silos. The industry is much more geared towards quick returns now, having to show something now, rather than building strong foundations based on real data. Kind of an analogy to early linux dev. We need our own Linus, it would seem :)" (NitwickLawyer)
"One can hardly control one coding agent for correctness, let alone multiple ones... It's cool, but not very reliable or useful." (agigao)
"Why not? I'm assuming we're not talking about "vibe coding" as it's not a serious workflow, it was suggested as a joke basically, and we're talking about working together with LLMs. Why would correctness be any harder to achieve than programming without them?" (diggan)
"I feel like the time it takes the agent to code is best spent thinking about the problem. This is where I see the real value of LLMs. They can free you up to think more about architecture and high level concepts." (jongjong)
"Fast decision-making is terrible for software development. You can't make good decisions unless you have a complete understanding of all reasonable alternatives. There's no way that someone who is juggling 4 LLMs at the same time has the capacity to consider all reasonable alternatives when they make technical decisions." (jongjong)
Cost and Scalability Concerns
The financial implications and the ability of these systems to scale with project complexity are also significant points of discussion.
"Chaining agents, especially in a loop, will increase your token usage significantly. This means you’ll hit the usage caps on plans like Claude Pro/Max much faster. You need to be cognizant of this and decide if the trade-off—dramatically increased output and velocity at the cost of higher usage—is worth it." (raminf)
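A back-of-the-envelope model shows why: if every agent in a chain re-reads the shared context on every pass, usage scales with both chain length and loop count. The numbers below are illustrative, not measured:

```python
# Rough cost model for chained agents in a loop: each agent re-reads the
# context and emits its own output, every iteration.

def loop_token_cost(context_tokens, output_tokens, agents, iterations):
    """Total tokens consumed by `iterations` passes of a chain of `agents`."""
    per_pass = agents * (context_tokens + output_tokens)
    return per_pass * iterations

single = loop_token_cost(20_000, 2_000, agents=1, iterations=1)   # one plain call
chained = loop_token_cost(20_000, 2_000, agents=4, iterations=5)  # agent loop
```

Under these (invented) numbers, the four-agent, five-iteration loop costs 20x a single call, which is how usage caps get hit so quickly.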
"Spent around 50 USD on a failed project, context contaminated and the project eventually had to be re-written." (prash2488)
"It's resume driven development" (siva7)
"Is there any hard evidence that subagent flows give actual developers better experience than just using CC without?" (siva7)
"What's the difference between using agents and playing the casino? Large part of the industry is a casino hidden in other clothes. I see people who never coded in their life signing up for loveable or some other code agent and try their luck. What cements this thought pattern in your post is this: "If the agents get it wrong, I don’t really care—I’ll just fire off another run"" (beefcake)
The Evolution of Developer Tools and Workflows
Several users touch upon the broader implications for the future of software development, comparing current AI agent efforts to past technological shifts and outlining potential future states.
"The last programs I created were just CLI agents with Markdown files and MCP servers(some code here but very little). The feedback loop is much faster, allowing me to understand what I want after experiencing it, and self-correction is super fast. Plus, you don't get lost in the implementation noise." (Frannky)
"Remember 20 years ago when Eclipse could move a function by manipulating the AST and following references to adjust imports and callers, and it didn't lose any code?" (lupire)
"I think it's likely that these agent-based development will inevitably add more imperative tools to their arsenal to lower cost, improve speed and accuracy." (Yeroc)
"Codex’s model is much better at actually reading large volumes of code which improves its results compared with CC" (wahnfrieden)
"Support for multiple branches at once - I should be able to spin off multiple agents that work on multiple branches simultaneously." (simianwords)
"This already exists. Look at cursor with Linear, you can just reply with @cursor & some instructions and it starts working in a vm." (posix86)
"0 Days since AI post on HN" (x1unix)
"Once (if?) someone will just "get it right", and has a reliable way to break this down to the point that engineer(s) can efficiently review specs and code against expectations, it'll be the moment where being a coder will have a different meaning, at large." (rufasterisco)
"I have seen a friend build a rule based system and have been impressed at how well LLM work within that context" (ares623)
"I hate work. Work sucks. I try to minimize the amount of time I spend working; the best way to achieve that is by staring into space." (jongjong)