DeepWiki: Understand Any Codebase

Here's a summary of the themes expressed in the Hacker News discussion about DeepWiki:

Potential Usefulness and Novelty

Many users acknowledge that DeepWiki represents an interesting application of AI for code understanding and documentation. Some see it as a valuable tool for navigating complex codebases, especially for open-source projects.

"oriettaxx: very good review (and yes: deepwiki is just amazing!!!"

"rgoulter: The DeepWiki tool itself seems pretty neat. It has a pretty good go at collecting documentation from across the codebase and organising it in one place. It has a pretty good guess at coming up with documentation which isn't there."

"swiftcoder: Ok, yeah, this feels like a reasonable use case for AI. I generated a DeepWiki from one of my repos, and it's pretty informative."

"nikisweeting: Deepwiki was instrumental in our refactor of a large codebase away from playwright to pure CDP @ browser-use. Huge props to the team that built it, I regularly refer to it as one of the few strictly net positive AI coding tools."

Accuracy Concerns and Hallucinations

A significant portion of the discussion concerns the accuracy of the documentation DeepWiki generates. Several users report incorrect or incomplete information, particularly for large or complex projects. They attribute this to the LLM hallucinating or misreading code, especially where a name no longer matches what the code does or where the implementation departs from the relevant specification.

"ignoramous: But ... I've caught DeepWiki hallucinating pretty convincingly far more than once just because a struct / a package / a function was named for something it wasn't doing anymore / wasn't doing it by the book (think: RFCs, docs, specifications etc)."

"jcranmer: I'm pretty dubious that the value it adds is in fact positive."

"jcranmer: For a smaller project, I rummaged through compiler-explorer since I once poked around in that codebase. And when looking through its description of the property files (specifically https://deepwiki.com/compiler-explorer/compiler-explorer/3.3...), I noticed that it has some very subtly incorrect description of what they do, the kind of mistake that's likely to boomerang on you only a month or so later."

"fergie: But for my libs (that aren't super popular, but OTOH have a few million downloads per year) it generates documentation that is incorrect, and this is not good for users."

"jcranmer: At pretty much every step of the process, DeepWiki has given me answers that are distinctly worse than what I would have found just traipsing through the code myself..."

"jcranmer: jcranmer: I then decided to see the quality of the ask-a-question system. At this point, I happened to be poking around CLP from COIN-OR trying to gauge how accurate it was about the simplex details, and I noticed it mentioned pivot tolerance here: https://deepwiki.com/coin-or/Clp/2.4-factorization-and-linea... . Playing a newbie, and given that it doesn't really explain pivot tolerance, I asked it to explain it in detail." (Followed by a detailed explanation of why DeepWiki's answer was poor).

"jcranmer: jcranmer: If I had to guess, it's overly fixated on things that happen to be very large files--I think everything it decided to focus on in a single page happens to be a 30kloc file or something. But that means it also misses the things that are so gargantuan they're split into multiple files..."

User Interface and Interactivity Issues

Some users pointed out usability problems on mobile: diagrams are hard to zoom and pan, and the prompt box covers too much of the screen. Others questioned whether the diagrams are detailed enough to be useful.

"mxmilkiib: don't know how to zoom the diagrams on mobile tho, n they can easily almost disappear from view when panning around"

"mxmilkiib: the prompt box could do with a way to move it out of the way of the bottom couple of cm in portrait, or from covering more than a quarter of the screen in landscape orientation"

"IceHegel: I really want to like deepwiki, but just looking at the diagrams of repos, they are too handwavy to be useful."

"IceHegel: They are a conceptual overview and don’t seem tied down enough to the actual implementation details of a particular project."

Unsolicited Generation and SEO Spam Concerns

A notable concern is that anyone can request DeepWiki documentation for any GitHub repository, without permission or endorsement from the project's maintainers. This has led to accusations that DeepWiki is an "SEO slop-spammer" and to worries that the generated pages will be mistaken for official project documentation.

"Nullabillity: "Uses it" sounds strong.. I don't see any link to it from https://github.com/kieler/elkjs?"

"Nullabillity: Annoyingly, anyone can just.. request a deepwiki for any GitHub repo. That one exists doesn't mean that it's endorsed or reviewed by the project."

"Nullabillity: They just kind of barged in, welcome or not. Just another SEO slop-spammer."

"tacker2000: So in the end people will believe that these are the official docs…"

"buovjaga: It's a pity that there is no clear way to send takedown requests. We didn't ask for deceptive garbage to be generated as documentation for LibreOffice, but here it is and newbies are discovering it..."

LLM Behavior and "Senior Engineer" Analogy

The discussion touched upon the nature of LLMs as assistants, drawing a comparison to a "senior engineer." While LLMs are seen as patient, there's skepticism about their ability to act as a "senior" by proactively identifying and correcting flawed ideas or suggesting better alternatives without explicit prompting.

"rgoulter: > "Treat it like a patient senior engineer.""

"rgoulter: I trust that LLMs are patient (you can ask them stupid questions without consequence)."

"rgoulter: I do not trust LLMs to act as 'senior'. (i.e. Unless you ask it to, it won't push back against dumb ideas, or suggest better ideas that would achieve what you're trying to do. -- And if you just ask it to 'push back', it's going to push back more than necessary)."

Potential for Improvement and Future Directions

Despite criticisms, some users express a desire for DeepWiki to improve. Suggestions for enhancement include incorporating more context from issues and discussions, better integration of specific technical details, and addressing the unsolicited generation problem.

"grokblah: (I haven’t read how it works but…) I wonder if removing file sizes, commit counts, and other numerical metadata would have a significant impact on the output. Or if all of the files were glommed into one large input with path+filename markers?"

"1317: It would be nice if it could also read github issues etc if they were available, so it could have more context about the decisions that were made."

"neilv: Bonus if the LLM was trained on the original repository."

AI-Generated Content and Voice

One user commented on the writing style of the DeepWiki promotional material, suggesting that the AI-generated prose could be distracting and that a more authentic "own voice" would be preferable.

"opdahl: The first sentence already is obviously AI generated, and reading through it it, it is obviously completely written by AI to the point of it being distracting."

"opdahl: I understand the author probably feels that AI is better at writing than they are, but I would heavily recommend they use their own voice."

Open Source Alternatives and Related Tools

The discussion also included mentions of open-source efforts to replicate DeepWiki's functionality and references to other AI coding tools, indicating a broader trend in this space.

"oriettaxx: I would love code could be opensource: I just saw now a couple of attempts * https://github.com/AsyncFuncAI/deepwiki-open * https://github.com/AIDotNet/OpenDeepWiki"

"faangguyindia: I just use context7, they launched api recently. It's my goto solution for coding agent docs."

"manishsharan: Gemini and chatgpt and github copilot subscriptions also provide similar functionality."

"mkagenius: ...I had done a show HN for https://gitpodcast.com earlier, created with a similar goal in mind."