Is Claude Code Getting Worse? What 1,000 Hacker News Points Tell Us

A GitHub issue titled 'Claude Code is unusable for complex engineering tasks with Feb updates' hit 1,000+ points on Hacker News. We looked at what actually changed, why developers are frustrated, and what your options are.

Six months ago, Claude Code had a clear lead on complex agentic coding tasks. It would take on multi-file refactors, run terminal commands autonomously, iterate through test failures, and keep going without constant hand-holding. Developers paying $100-$200 per month for heavy usage considered it the best tool available for that class of work. That consensus is noticeably absent now.

The February updates changed the behavioral profile in ways that hit the use cases people were paying for. A GitHub issue called "Claude Code is unusable for complex engineering tasks with Feb updates" reached 1,000 points on Hacker News - not because the headline was hyperbolic, but because hundreds of developers recognized exactly what the author was describing.

What changed after the February updates

The underlying model does not appear to have gotten worse at reasoning or code generation. The regression is behavioral: how the model decides to apply its capabilities has shifted significantly.

After major model releases, AI labs run additional rounds of reinforcement learning from human feedback to steer behavior. Done well, this reduces harmful outputs. Done too aggressively, it creates what developers consistently call overcautious behavior - the model hesitates, hedges, and declines tasks it is capable of handling.

Overly aggressive post-release tuning is the most likely explanation for what happened here, and the specific changes are visible in practice. Claude Code now breaks large implementations into smaller pieces without being asked. It pauses mid-task to confirm steps it previously would have handled autonomously. It adds caveats and warnings to outputs it previously would have delivered cleanly. On simple, well-scoped tasks, none of this is especially noticeable. On the complex multi-file work that justified the subscription price, it compounds into something meaningfully worse.

Where the regression shows up most clearly

Developers are most affected in three areas.

Large-scale refactors. Ask Claude Code to restructure a service, migrate a pattern across a codebase, or rename something throughout 15 files - and you get more incomplete passes, more mid-task pauses, more requests to confirm before proceeding. Tasks that ran autonomously before now require more supervision.

Long agentic sessions. Extended sessions where Claude Code runs terminal commands, edits files, and iterates on test output show increased drop-offs. The model stops to ask for confirmation at points where it previously would have continued. This is arguably more correct behavior in some abstract sense. It is also frustrating when you are paying for autonomous operation.

Ambiguous requirements. Real engineering problems always contain some ambiguity. Pre-February Claude Code made reasonable assumptions and proceeded. Post-February behavior surfaces the ambiguity and waits. Which is technically more careful. Which is also not what people are paying for when they want the tool to take a reasonable shot at a hard problem.

Practical workarounds for right now

Before switching tools or canceling subscriptions, a few approaches are worth trying.

Narrow the scope explicitly. Instead of "refactor this service to use the repository pattern", try "refactor only UserService in user-service.ts to use the repository pattern, starting with the three database calls in lines 45-90". Explicit scope reduces uncertainty and the resulting hesitation.
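
If you write scoped requests like this often, a small template keeps them consistent. The Python sketch below is only an illustration: the `ScopedRequest` class and its field names are hypothetical and not part of Claude Code, and the values are taken from the example prompt above.

```python
from dataclasses import dataclass


@dataclass
class ScopedRequest:
    """One narrowly scoped refactor request. All field names are illustrative."""
    symbol: str      # class or function the change is limited to
    path: str        # single file the change is limited to
    goal: str        # what the refactor should achieve
    focus: str       # the first concrete piece of work
    first_line: int
    last_line: int

    def to_prompt(self) -> str:
        # Refuse an inverted range rather than send a confusing prompt.
        if self.first_line > self.last_line:
            raise ValueError("first_line must not exceed last_line")
        return (
            f"Refactor only {self.symbol} in {self.path} to {self.goal}, "
            f"starting with {self.focus} in lines "
            f"{self.first_line}-{self.last_line}."
        )


request = ScopedRequest(
    symbol="UserService",
    path="user-service.ts",
    goal="use the repository pattern",
    focus="the three database calls",
    first_line=45,
    last_line=90,
)
print(request.to_prompt())
```

The point is not the helper itself but the habit it enforces: every request names one symbol, one file, and one starting point.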

Use CLAUDE.md to set expectations. Claude Code reads a CLAUDE.md file in your project root at the start of every session. Adding instructions like "proceed without asking for confirmation on file edits unless the operation is destructive" can reduce mid-task interruptions.
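
As a rough sketch of that setup, the Python snippet below appends a short section to CLAUDE.md in the project root if it is not already there. Only the first instruction is quoted from the paragraph above. The section heading, the other two bullets, and the helper name `ensure_autonomy_section` are assumptions made for illustration, and nothing guarantees the model will follow them in every case.

```python
from pathlib import Path

# Illustrative wording: only the first bullet comes from the article text.
# The heading and the remaining bullets are assumptions; adapt them freely.
SECTION_HEADING = "## Autonomy expectations"
SECTION_BODY = "\n".join([
    SECTION_HEADING,
    "- Proceed without asking for confirmation on file edits unless the operation is destructive.",
    "- When requirements are ambiguous, state the assumption you are making and continue.",
    "- Finish the full task before summarizing; do not stop after a partial pass.",
])


def ensure_autonomy_section(project_root: str) -> bool:
    """Append the section to CLAUDE.md if absent. Returns True if the file changed."""
    path = Path(project_root) / "CLAUDE.md"
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    if SECTION_HEADING in existing:
        return False  # already present, leave the file untouched
    base = existing.rstrip()
    separator = "\n\n" if base else ""
    path.write_text(base + separator + SECTION_BODY + "\n", encoding="utf-8")
    return True
```

Calling `ensure_autonomy_section(".")` from the repository root creates or extends the file. Running it a second time is a no-op, so it is safe to keep in a project setup script.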

Plan before executing. Spend one turn generating a step-by-step plan before asking Claude Code to execute it. A clear road map reduces the points where the model second-guesses the approach mid-task.

The competition has closed the gap

Cursor has continued improving its multi-file editing and codebase-aware features throughout this period. For developers who want the best editor experience and do not need pure agentic operation, Cursor at $20 per month is a strong alternative - particularly if you are on Claude Code's higher usage tiers and doing the math on monthly cost.
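
To make that math concrete, here is a minimal Python sketch using only the list prices quoted in this article. It assumes flat monthly pricing and ignores any usage-based or API charges, so treat the result as a rough bound rather than a bill.

```python
CURSOR_MONTHLY = 20               # USD per month, figure quoted above
CLAUDE_CODE_MONTHLY = (100, 200)  # USD per month, heavy-usage range quoted earlier


def annual_difference(heavy_monthly: int, alternative_monthly: int = CURSOR_MONTHLY) -> int:
    """Yearly price gap between a heavy-usage plan and the cheaper alternative."""
    return (heavy_monthly - alternative_monthly) * 12


low, high = (annual_difference(monthly) for monthly in CLAUDE_CODE_MONTHLY)
print(f"Annual difference: ${low:,} to ${high:,}")
# prints "Annual difference: $960 to $2,160"
```

Whether that gap is worth it depends on how much of your work needs autonomous operation rather than editor assistance.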

Goose is worth evaluating seriously for developers already paying for Claude API access. It is open-source, runs in the terminal, uses Claude or other models as backends, and does not have the same behavioral layer that triggered the GitHub issue. The setup requires more technical investment than Claude Code. The tradeoff may be worthwhile if the behavioral regression is affecting your core workflow.

The important distinction: Goose with a Claude backend still runs the same underlying model. You get Claude's reasoning capabilities with a different behavioral layer. Whether that is better depends entirely on what is frustrating you about the current Claude Code experience.

Where this goes from here

Anthropic has acknowledged quality regression reports and said updates are coming. That is the expected response. It is worth taking at face value: Anthropic has a direct commercial incentive to fix this quickly. Developers paying $200 per month for a tool that degraded on their primary workflows are not quiet about it, and the 1,000-point Hacker News thread made the specific failure modes very clear.

Whether the fix arrives in weeks or months is the real question. The behavioral changes that affect complex engineering work are likely to be tuned or partially rolled back - the commercial cost of not doing so is straightforward to calculate. In the meantime, the workarounds above and alternatives such as Cursor and Goose are the practical options for developers who need the tool working well now rather than eventually.

Try the CLAUDE.md approach first. It takes ten minutes and solves a meaningful portion of the interruption problem without changing tools at all.