Essential insights from Hacker News discussions

GLM 4.5 with Claude Code

Concerns Regarding Marketing and Transparency

A significant theme in the discussion revolves around the article's initial presentation and a perceived conflict of interest. Users expressed skepticism when it emerged that the article originated with Z.AI, the company behind the GLM models. The original title, seen as overly promotional and misleading, compounded the concern.

  • "I stopped when I got to this sentence and then realized the article is written by one of the companies mentioned. ... Maybe it is great, but with a conflict of interest so obvious I can't exactly take their word for it." (apparent)
  • "My issue was with an article being posted with a title saying how amazing two things are together (making it seem like it was somehow an independent review), when it was actually just a marketing post by one of the companies." (apparent)
  • "I wonder how you justify this editorialized title, and if HN mods share your justification. The linked article has no the word "killer" in it. I think this is why many people have concerns about AI. This group can't express neutral ideas. They have to hype about a simple official documentation page." (raincole)

The discussion also noted that the original title was changed in response to feedback, indicating a willingness to address the transparency concerns.

  • "The title has been changed. The original title was wildly positive, and OP has acknowledged it was inappropriate and changed it (see comments below)." (apparent)
  • "feedback accepted got rid of the killer bits" (vincirufus)

Performance, Cost, and Comparisons to Claude

A central point of discussion is the perceived performance and cost-effectiveness of GLM 4.5 and GLM 4.5 Air compared to established models, particularly those from Anthropic like Claude Sonnet and Opus. Many users found GLM models to be a compelling alternative due to their significantly lower pricing.

  • "GLM 4.5 Air costs about 10% of what Claude Sonnet does (when hosted on DeepInfra, at least), and it can perform simple coding tasks quite quickly." (ekidd)
  • "They have attractive plans, especially if their models actually perform better than Opus. ... I would also like to know who the people behind Z.ai are — I haven’t heard of them before." (stingraycharles)
  • "Well I'd call them the poor person's claude code, wouldnt compare it with Opus but very close to Sonnet and Kimi" (vincirufus)
  • "alternative would need to be very cheap if you consider daily use" (Szpadel)
  • "When it comes to "real-world development scenarios" they claim to be closer to Sonnet 4." (SparkyMcUnicorn)
  • "I was blown away by this model. It was definitely comparable to sonnet 4. In some of my tests, it performed as good as Opus." (sagarpatil)
  • "I think the number of models latching onto Claude codes harness. I'm still using Cursor for work and personal but tried out open code and Claude for a bit. I just miss having the checkpoints and whatnot." (Jcampuzano2)
  • "I've tried using the more expensive model for planning and something a bit cheaper for doing the bulk of changes (the Plan / Ask and Code modes in RooCode) which works pretty nicely, but settling on just one model like GLM 4.5 would be lovely! Closest to that I've gotten to up until now has been the Qwen3 Coder model on OpenRouter." (KronisLV)
  • "After Claude models have recently become dumb, I switched to Qwen3-Coder (there's a very generous free tier) and GLM4.5, and I'm not looking back." (jedisct1)
  • "Anthropic can't compete with this on cost. They're probably bleeding money as it is." (unsupp0rted)

However, some users pointed out that while the base price is lower, factors such as prompt caching (which Anthropic offers but many GLM providers do not) and the need for capable local hardware can change the effective cost. The GLM models' shorter context length was also flagged as a consideration for more complex tasks.

  • "But you can't easily run GLM 4.5 Air quickly without professional workstation- or server-grade hardware (RTX 6000 Pro 96GB would be nice), at least not without a serious speed hit." (ekidd)
  • "For agentic coding I found the price difference more modest due to prompt caching, which most GLM providers on Openrouter don't offer, but Anthropic does." (esafak)
  • "Okay, I'm going to try it, but why didn't you link the information on how to integrate it with Claude Code: ..." (arjie)
  • "With the lower context length I'm wonder how it holds up for problems requiring slightly larger context given we know most models tend to degrade fairly quickly with context length. Maybe it's best for shorter tasks or condensed context?" (Jcampuzano2)
  • "Also fascinating how they solved the issue that Claude expects a 200+k token model while GLM 4.5 has 128k." (steipete)
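The caching point can be made concrete with back-of-the-envelope arithmetic. The sketch below uses purely illustrative per-token prices (not published rates) to show how prompt caching narrows the headline 10x gap for agentic sessions that resend a large shared prefix every turn, which is the effect esafak describes:

```python
# Back-of-the-envelope cost comparison for an agentic coding session.
# All prices are hypothetical ($ per million input tokens), chosen only
# to illustrate how caching compresses a nominal 10x price difference.
def session_cost(price_per_mtok, cached_price_per_mtok,
                 prefix_tokens, new_tokens_per_turn, turns):
    """Total cost when the shared prefix is cache-priced after turn one."""
    first = (prefix_tokens + new_tokens_per_turn) * price_per_mtok / 1e6
    rest = (turns - 1) * (
        prefix_tokens * cached_price_per_mtok / 1e6
        + new_tokens_per_turn * price_per_mtok / 1e6
    )
    return first + rest

# Assumed rates: a Sonnet-like model with a 90% cache discount versus a
# GLM-like model at one tenth the base price but with no cache discount.
cached = session_cost(3.0, 0.3, prefix_tokens=100_000,
                      new_tokens_per_turn=2_000, turns=50)
uncached = session_cost(0.3, 0.3, prefix_tokens=100_000,
                        new_tokens_per_turn=2_000, turns=50)
print(f"cached expensive model: ${cached:.2f}, cheap uncached model: ${uncached:.2f}")
```

Under these assumptions the 10x sticker-price gap shrinks to well under 2x for a long session, which is consistent with the "more modest" difference reported above.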

Integration with Claude Code and Agentic Environments

The discussion highlighted the growing trend of pointing tools like Claude Code at other LLMs for agentic coding workflows. Users explored how to do this, with several noting that Claude Code's prompts may be tuned specifically to Anthropic's models, which can hurt cross-model compatibility.

  • "If you can easily afford all the Claude Code tokens you want, then you'll probably get better results from Sonnet. But if you already know enough programming to work around any issues that arise, the GLM models are quite usable." (ekidd)
  • "But in my testing, other models do not work well. It looks like prompts are either very optimized for Claude, or other models are just not great yet with such an agentic environment. ... The reason why Claude Code is good is because Anthropic knows Claude Sonnet is good, and that they only need to create prompts that work well with their models." (sdesol)
  • "you can use any model with Claude code thanks to ..." (Szpadel)
  • "You don't need claude code router to use GLM, just set the env var to the GLM url. Also, I generally advise people not to bother with claude code router, Bifrost can do the same job and it's much better software." (CuriouslyC)
  • "So you can use Claude Code with other models? I had assumed that it was tied to your subscription and that was that." (abrookewood)
  • "It is, but people figure out the Claude Code API and provide API compatible endpoints." (adastra22)
  • "Their plans $3 and $15 plans work even better with tools like Roo Code." (jedisct1)
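The "just set the env var" approach mentioned above amounts to overriding the variables Claude Code reads to locate its API endpoint. A minimal sketch, assuming the provider exposes an Anthropic-compatible endpoint (the Z.AI URL shown is an example to verify against the provider's docs, and the key is a placeholder):

```shell
# Point Claude Code at an Anthropic-compatible endpoint instead of api.anthropic.com.
# ANTHROPIC_BASE_URL / ANTHROPIC_AUTH_TOKEN are the variables Claude Code reads;
# the URL below is an assumed example -- confirm it in the provider's documentation.
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-glm-api-key"   # placeholder key

# Then launch Claude Code as usual in your project directory:
# claude
echo "Claude Code will talk to: $ANTHROPIC_BASE_URL"
```

This avoids a proxy like claude-code-router entirely when the provider speaks the Anthropic API natively, which is the point CuriouslyC makes above.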

OpenRouter Quality and Provider Practices

A significant portion of the conversation centered on OpenRouter as a platform for accessing various LLMs, with concerns raised about the quality and potential quantization of models offered by different providers.

  • "I'm really concerned that some of the providers are using quantized versions of the models so they can run more models per card and larger batches of inference." (chisleu)
  • "yeah I too have heard similar concerns with Open models on OpenRouter, but haven't been able to verify it, as I don't use that a lot" (vincirufus)
  • "(OpenRouter COO here) We are starting to test this and verify the deployments. More to come on that front -- but long story short is that we don't have good evidence that providers are doing weird stuff that materially affects model accuracy." (numlocked)
  • "However your providers do have such an incentive." (blitzar)
  • "I get better results from Qwen 3 coder 30b a3b locally than I get from Qwen 3 Coder 480b through open router." (chisleu)
  • "This doesn't match my experience precisely, but I've definitely had cases where some of the providers had consistently worse output for the same model than others, the solution there was to figure out which ones those are and to denylist them in the UI." (KronisLV)
  • "You can see that these providers run FP4 versions: ... And these providers run FP8 versions:" (KronisLV)
  • "Unsolicited advice: Why doesn’t open router provide hosting services for OSS models that guarantee non-quantised versions of the LLMs? Would be a win-win for everyone." (chandureddyvari)
  • "In fact I thought that's what OpenRouter was hosting them all along" (jatins)
  • "Seems like users are losing their minds over this.. at least from all the reddit threads I'm seeing" (indigodaddy)
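The denylisting and quantization filtering that commenters describe can also be expressed per request. The sketch below builds an OpenRouter chat-completions payload using the `provider` routing object; the `ignore` and `quantizations` fields reflect OpenRouter's documented provider-routing options, but treat the exact field names and provider slugs as something to verify (the slug here is a placeholder):

```python
import json

# Build an OpenRouter chat-completions request that constrains routing:
# "ignore" skips named providers; "quantizations" restricts serving precision.
# The provider slug is a placeholder -- substitute the ones you have denylisted.
payload = {
    "model": "z-ai/glm-4.5",
    "messages": [{"role": "user", "content": "Write a binary search in Go."}],
    "provider": {
        "ignore": ["some-provider"],       # hypothetical provider slug
        "quantizations": ["fp8", "bf16"],  # skip fp4-served deployments
    },
}

# POST this to https://openrouter.ai/api/v1/chat/completions with an
# "Authorization: Bearer <key>" header; the print below just shows the body.
print(json.dumps(payload, indent=2))
```

This makes the manual UI denylisting described above reproducible in code, at the cost of maintaining the list yourself.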

OpenRouter's COO addressed these concerns, stating that they are actively working to verify provider deployments and do not have strong evidence of widespread, impactful quantization. They also emphasized their incentive to ensure high-quality inference and invited users to share data points supporting their concerns.

Origin and Background of Z.AI and GLM Models

The discussion also touched upon the origins of Z.AI and the GLM models, with commenters clarifying that Z.AI is a spin-off of a large laboratory within Tsinghua University's computer science department, a recognized institution for AI research. This information provided useful context on the company's background.

  • "Actually Z.ai is a spinoff of Tsinghua University and one of the first China labs open sourcing its own large models (GLM released in 2021)" (turingbook)
  • "It's a spinoff of the whole university?" (throwaway314155)
  • "With a little search you can find it's a laboratory within the CS department of THU. It's a fairly large lab though, not those led by just one or two professors." (cyp0633)

User Experience and Specific Tool Integrations

Users shared their experiences with specific tools and workflows, such as Cline and RooCode, and how GLM models fit into these. The usability of models on different hardware, like MacBooks and Mac Studios, was also mentioned.

  • "The Air model is light enough to run on a macbook pro and is useful for Cline. I can run the full GLM model on my Mac Studio, but the TPS is so slow that it's only useful for chatting." (chisleu)
  • "This is quite nice. Will try it out a bit longer over the weekend. I tested it using Claude Code with env variables overrides." (sergiotapia)
  • "This is really cool and should work well with something like RooCode as well." (KronisLV)

Captcha Design and User Experience

A minor but notable thread in the discussion centered on the design of captchas, specifically referencing "Chinese captchas" and their increased interactivity compared to some Western counterparts.

  • "Chinese software always has such a design language: - prepaid and then use credit to subscribe - strange serif font - that slider thing for captcha" (arjie)
  • "I still call it 'chinnese chatpcha', back then chinnese chaptcha is so much harder than western counterpart but now gchaptcha spam me with 5 different image if I missing a tiles for crossroad, so chinnese chaptcha is much better in my opinion" (tonyhart7)
  • "I assume both of the approaches are useless at actually stopping bots" (awestroke)
  • "They deter newbies but this is not a problem for experienced developers." (whatevermom)