Here's a breakdown of the key themes and opinions expressed in the Hacker News discussion, with direct quotes:
Learning and Personal Projects in Rust and ML
The initial comment highlights the use of the project as a learning tool, specifically for Rust and ML concepts.
- "Nice! I made a small toy version myself to learn Rust and freshen up on ML." - tnlogy
This reflects a common motivation for building such projects: personal development and skill enhancement. The user links to their own project, "telegrad", further illustrating this theme, and mentions an interest in eventually extending it to run on a GPU.
GPU Acceleration and Data Structures
A significant portion of the discussion revolves around the possibility and challenges of running the computation graph on a GPU.
- "I wanted to store the graph in a heap to be able to send it to the gpu later on, but then I got lazy and abandoned it." - tnlogy
- "My idea was to make a Vec of nodes with pointers to indexes in the vec, so it would be easier to send this array into the gpu." - tnlogy
tnlogy's comments express an interest in optimizing the data structure (specifically, a Vec of nodes with index-based pointers) to make the graph easier to hand off to a GPU; a rough sketch of such an arena-style layout follows the quote below. This shows an awareness of the memory models and performance considerations involved in GPU programming. The user also mentions specific GPU APIs:
- "...making a micrograd network run on the gpu, with wgpu or macroquad..." - tnlogy
Accuracy and Clarity of Title
The user "kragen" suggests a more descriptive title.
- "Probably it would be good to put "backward-mode" in the title." - kragen
This implies that the project's focus on backward-mode differentiation might not be immediately apparent and that clarifying the title could improve understanding and discoverability.
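For context on why the distinction is worth flagging: backward-mode (reverse-mode) differentiation obtains the partial derivatives with respect to all inputs from a single backward sweep over the graph, whereas forward-mode needs one evaluation per input. A rough dual-number sketch of forward-mode (the `Dual` type and `f` are illustrative assumptions, not part of the project) makes the contrast concrete:

```rust
#[derive(Clone, Copy)]
struct Dual {
    val: f64,
    der: f64, // derivative with respect to the chosen seed input
}

fn f(x: Dual, y: Dual) -> Dual {
    // z = x*y + x, differentiated with the product and sum rules
    Dual {
        val: x.val * y.val + x.val,
        der: x.der * y.val + x.val * y.der + x.der,
    }
}

fn main() {
    let (x, y) = (2.0, 3.0);
    // Seed x (der = 1 for x, 0 for y) to get dz/dx ...
    let dz_dx = f(Dual { val: x, der: 1.0 }, Dual { val: y, der: 0.0 }).der;
    // ... and a second full evaluation, seeded on y, to get dz/dy.
    let dz_dy = f(Dual { val: x, der: 0.0 }, Dual { val: y, der: 1.0 }).der;
    println!("dz/dx = {}, dz/dy = {}", dz_dx, dz_dy); // 4 and 2
}
```

A backward-mode implementation, by contrast, produces both partials from one backward pass over the recorded graph, which is why the label in the title carries real information about how the library works.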
Concerns About Shared State and Thread Safety
The comment from "amelius" raises concerns about the use of global/shared state and its potential impact on thread safety and correctness in concurrent or parallel environments.
- "Looks like this uses mutation of global/shared state...then it seems dangerous to collect the results of the backward pass (partial derivatives) in the shared variables x and y and make them accessible through x.get_grad() and y.get_grad()." - amelius
"amelius" highlights a potential race condition: multiple threads concurrently performing backward passes might overwrite each other's partial derivative results if they are stored in shared variables without proper synchronization. The user suggests an alternative API design to mitigate these risks.
- "Imho, in a better design, you'd say z.get_grad(x) and z.get_grad(y), and w.get_grad(x) and w.get_grad(y) to get the partial derivatives." - amelius
The proposed API accesses the gradient through the output variable while naming the input variable with respect to which the partial derivative is wanted. This avoids accumulating results in globally shared state and makes concurrent backward passes easier to keep thread-safe; a sketch of one way such an API could look follows below.
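A minimal sketch of that API shape, under assumed names (`Expr`, `backward`, `get_grad`; the project's actual types may differ): the backward pass returns its gradients in a map owned by the caller instead of writing them into fields on the shared x and y values, so z.get_grad(x)-style queries issued from different threads cannot clobber one another.

```rust
use std::collections::HashMap;

enum Expr {
    Var(usize), // identified by a caller-supplied id
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

fn value(e: &Expr, env: &HashMap<usize, f64>) -> f64 {
    match e {
        Expr::Var(id) => env[id],
        Expr::Add(a, b) => value(a, env) + value(b, env),
        Expr::Mul(a, b) => value(a, env) * value(b, env),
    }
}

// Reverse sweep: `seed` is d(output)/d(this subexpression). Partials are
// accumulated into a map owned by this call, not into shared variables.
fn backward(e: &Expr, env: &HashMap<usize, f64>, seed: f64, grads: &mut HashMap<usize, f64>) {
    match e {
        Expr::Var(id) => *grads.entry(*id).or_insert(0.0) += seed,
        Expr::Add(a, b) => {
            backward(a, env, seed, grads);
            backward(b, env, seed, grads);
        }
        Expr::Mul(a, b) => {
            backward(a, env, seed * value(b, env), grads);
            backward(b, env, seed * value(a, env), grads);
        }
    }
}

// The z.get_grad(x) style: ask an output for its partial w.r.t. one input.
fn get_grad(output: &Expr, env: &HashMap<usize, f64>, input: usize) -> f64 {
    let mut grads = HashMap::new();
    backward(output, env, 1.0, &mut grads);
    *grads.get(&input).unwrap_or(&0.0)
}

fn main() {
    let (x, y) = (0usize, 1usize);
    let env = HashMap::from([(x, 2.0), (y, 3.0)]);
    // Two outputs over the same inputs: z = x*y and w = x + y.
    let z = Expr::Mul(Box::new(Expr::Var(x)), Box::new(Expr::Var(y)));
    let w = Expr::Add(Box::new(Expr::Var(x)), Box::new(Expr::Var(y)));
    println!("dz/dx = {}", get_grad(&z, &env, x)); // 3
    println!("dz/dy = {}", get_grad(&z, &env, y)); // 2
    println!("dw/dx = {}", get_grad(&w, &env, x)); // 1
}
```

Because each call to get_grad builds and owns its own gradient map, nothing here mutates shared state, which is the property amelius is asking for.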