Essential insights from Hacker News discussions

GPT-5 Thinking in ChatGPT (a.k.a. Research Goblin) is good at search

Here's a summary of the themes discussed on Hacker News, with direct quotes:

Simon Willison's Blog and the "Research Goblin"

A significant portion of the discussion revolves around Simon Willison's blog post and his writing style. Some users express strong appreciation for his insights, particularly regarding AI advancements, while others perceive a bias towards OpenAI and question the originality or depth of his content.

  • "Simon's writing is consistently either highly practical, or extremely high quality, or both. What's your reference frame to call it 'bad' - your own comments?" (scrollaway)
  • "FWIW I take his writings with a hefty pinch of salt these days. It seems incredibly concentrated on OpenAI to the detriment of anything else." (mattlondon)
  • "This is fine. He is his own person and can write about whatever he wants and work with whoever he wants, but the days when I'd eagerly read his blog to get a finger of the pulse of all of the main developments in the main labs/models has passed, as he seems to only really cover OpenAI these days..." (mattlondon)
  • "I really liked this article and the coining of the term 'Research Goblin'. That is how I use it too sometimes. Which is also how I used to use Google." (firesteelrain)
  • "I use Substack as a free email delivery service - it's just content copied over from my blog..." (simonw)
  • "HN is very cult-of-personality based. People see SimonW they upvote without reading, while at the same time a much better article could be posted on the same topic and get zero traction." (CuriouslyC)
  • "The mundane examples were the point. I'm not picking things to show it in the best possible light, I picked a representative sample of the ways I've been using it." (simonw)
  • "I think the popularity of certain authors has as much to do with trust as anything else. If I see one of Simon’s posts, I know there’s a good chance it’s more signal than noise, and I know how to contextualize what he’s saying based on his past work." (haswell)
  • "The original author submitted it, then when it didn't get traction it looks like two fans of his blog both submitted it around 12 hours later. Whether for internet upvote points or because they personally thought the article particularly great, I don't know." (redeyedtreefrog)
  • "HN is a bit weird because it's got 99 articles about how evil LLMs are and one article that's like 'oh hey I asked an LLM questions and got some answers' and people are like 'wow amazing'." (ants_everywhere)

The Efficacy and Limitations of LLM Search and "Deep Research"

A central theme is the evolving capability of LLMs, particularly their integrated search functions, and whether these advances are genuinely transformative or merely incremental. Users debate the quality of the synthesized answers compared with traditional search, the speed of these tools, and their potential for spreading misinformation. A rough sketch contrasting the older single-shot search pattern with the iterative loop simonw describes appears after the quotes below.

  • "I was amused that it used the neologism 'steel-man' -- redundantly, too." (esafak)
  • "Did you check the facts? Did you click through all the links and see what the sources are?" (sixtyj)
  • "A while ago I bragged at a conference about how ChatGPT had 'solved' something... Yeah, we know, it's from Wikipedia and it's wrong :)" (sixtyj)
  • "As someone who is AI skeptical, there's so many breathless posts like 'Jizz-7 Thinking (Good) (Big Balls) can order my morning coffee!' which are a lot of words talking about one person's subjective experience of using some LLM to do one specific thing." (gdbsjjdn)
  • "People posting their subjective experience is precisely what a lot of these pieces should be doing, good or bad, their experience is the data they have to contribute." (Lerc)
  • "These answers take a shockingly long time to resolve considering you can put the questions into Brave search and get basically the same answers in seconds." (meshugaas)
  • "The thing is, with Chat+Search you don't have to click various links, sift through content farms, or be subject to ads and/or accidental malware download." (ignoramous)
  • "In practice this means that you get the same content farm answer dressed up as a trustworthy answer without even getting the opportunity to exercise better judgement." (dns_snek)
  • "This is why I am so excited about the way GPT-5 uses its search tool. GPT-4o and most other AI-assisted search systems in the past worked how you describe: they took the top 10 search results and answered uncritically based on those." (simonw)
  • "Most real knowledge is stored outside the head, so intelligent agents can't rely solely on what they've remembered. That's why libraries are so fundamental to universities." (ants_everywhere)
  • "The mundane examples were the point. I'm not picking things to show it in the best possible light, I picked a representative sample of the ways I've been using it." (simonw)
  • "I called out the terrible scatter plot of the latitude/longitude points because it helped show that this thing has its own flaws." (simonw)
  • "I know so many people who are convinced that ChatGPT's search feature is entirely useless. This post is mainly for them." (simonw)
  • "The thing about models getting incrementally better is that occasionally they cross a milestone where something that didn't work before starts being useful." (simonw)
  • "When I last tested them with topics I'm familiar with, I found the quality of research to be poor. These tools seem to spend their time searching for as much content as possible, then they dump it all into a report." (niklassheth)
  • "I find ChatGPT to be great at research too-but there are pathological failure modes where it is biased to shallow answers that are subtly wrong, even when definitive primary sources are readily available online:" (larsiusprime)
  • "The original author submitted it, then when it didn't get traction it looks like two fans of his blog both submitted it around 12 hours later." (redeyedtreefrog)
  • "The thing about students who cheat is most of them are (at least in the context of schoolwork) very lazy and don't care if their work is high quality. i would guess waiting multiple minutes for Thinking mode to give thorough results is very unappealing." (currymj)
  • "I do miss the earlier 'heavy' models that had encyclopedic knowledge vs the new 'lighter' models that rely on web search." (psadri)
  • "Perhaps we should call AI agents 'Goblins' instead." (senko)

The Impact of LLMs on Education and Research Practices

The discussion touches on the ramifications of LLM proliferation for educational institutions and how students and educators might need to adapt. Commenters weigh how these tools might assist or hinder genuine learning and critical thinking.

  • "Pretty wild! I wonder how much high school teachers and college professors are struggling with the inevitable usage though?" (indigodaddy)
  • "Idea: workshops for teachers that teach them some kind of Socratic method that stimulates kids to support what they got from G with their own thinking, however basic and simple it may be." (wtbdbrrr)
  • "Formulating the state of your current knowledge graph, that was just amplified by ChatGPT's research might be a way to offset the loss of XP ... XP that comes with grinding at whatever level kids currently find themselves ..." (wtbdbrrr)

AI's Refusal to Identify Individuals and Visual Content

A specific limitation of current AI models, their refusal to identify individuals from images, even dead historical figures, is a point of discussion. Users share their experiences and ponder the underlying reasons and partial workarounds. A side thread debates whether the article's "Official name for the University of Cambridge" example was worth mentioning at all.

  • "Slightly off topic but chatGPT’s refusal to visually identify people, including dead historical personalities, has been a big let down for me. I can paste in an image of JFK and it will refuse to tell me who it is." (spaceman_2020)
  • "Can be sometimes circumvented with cropping / stronger compression, but it made looking up who a given image is of / what imageset is it from pretty annoying - the opposite of what these people would want in this case too." (perching_aix)
  • "I think it makes sense? Given the vast 'knowledge' of ChatGPT it'd be a perfect doxxing tool with the deep research. To straight-up refuse any identification is I think a better idea than to try to circumvent it with arbitrary limitations?" (hetspookjee)
  • "I don't understand why the 'Official name for the University of Cambridge' example is worth mentioning in the article." (rs186)
  • "It's an interesting and fun example?" (blast)
  • "I don't know, I didn't find anything interesting about that example. I would think anyone who has used ChatGPT since Nov 2022 at least once would have expected it to work like that." (rs186)

User Interface and Experience Issues

Some users report practical problems with the app's user interface and operational experience, particularly around background processing and connectivity on mobile, and trade workarounds. A sketch of the server-side job pattern one commenter suggests appears after the quotes below.

  • "It's going to take a minute, so why do I need to keep looking at it and can't go read some more Wikipedia in the mean time? This is insanely user hostile. Is it just me who encounters this?" (croemer)
  • "Yeah it should be able to perform these entirely as a process on their end and the app should just check in on progress." (wolttam)
  • "I'm also on Android with the Plus subscription and I also get this. It usually reconnects by itself a few seconds later, but if it doesn't, I've found that you can get to the answer by closing the app and reopening it." (timpera)
  • "For Samsungs, Apps ‐> ChatGPT -> Battery -> Unrestricted completely fixed the issue for me, it continues thinking/outputting in the background now." (Tenemo)