When they began to feed the generated slop back to the models, they deepened the “fundamental problem” even further.

The most recent models produce excellent verbiage, but the code is out of sync with it.

This fact is very easy to understand and explain: the verbiage and the code live in very different “regions” of the model’s representation (the weight matrices or whatever they are), and these regions get tuned (reinforced) at very different rates and thus do not converge as they expect.

The verbiage is full of correct, principle-guided semantic promises (assertions), based on just the right type-level techniques, but the code is just simplistic imperative crap which uses some of the right abstractions.

I am Jack’s absolute lack of surprise here. The training data does not contain refined, principle-guided, literally crafted pure functional Rust code (which is easy to write by restricting the code to move semantics and compiler-guaranteed referential transparency via immutable references, mimicking the immutable bindings and persistent data of the pure languages).
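To make the parenthetical concrete, here is a minimal sketch of that restricted style. The `Stack` type and the `push` and `sum` functions are hypothetical names invented for illustration, and the guarantee only holds as long as the types involved avoid interior mutability (`Cell`, `RefCell` and the like).

```rust
/// A tiny persistent-style stack (hypothetical example type).
/// No `mut` appears anywhere: values are either moved or borrowed immutably.
#[derive(Debug, Clone, PartialEq)]
enum Stack {
    Empty,
    Cons(i64, Box<Stack>),
}

/// Takes the old stack by value (a move) and returns a new one.
/// The caller can no longer touch the old binding, so nothing can observe a change.
fn push(stack: Stack, item: i64) -> Stack {
    Stack::Cons(item, Box::new(stack))
}

/// Borrows immutably: the compiler rejects any attempt to modify the
/// stack through `&Stack`, so the same input always gives the same output.
fn sum(stack: &Stack) -> i64 {
    match stack {
        Stack::Empty => 0,
        Stack::Cons(head, tail) => head + sum(tail),
    }
}
```

Calling `sum(&s)` twice on the same `s` has to return the same number, which is the property the pure languages give you by default.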

What it spews out are just-right, theoretically correct, yet false promises and false assertions about the code. The code only appears to conform to the accompanying verbiage. The Maya (the Cognitive Illusion) is almost perfect, but one can see through it.

The reason is that the code has a lot more actual “structure” in its syntax (and in the underlying semantics, which a model does not even ‘see’), so it only provides the best syntax it has, but, of course, it is, in principle, NOT what the surrounding verbiage asserts.

Even the comments can easily be off, but this divergence is even more subtle. The comments can easily be stale or simply erroneous, and there is no way, in principle, the model could “see” it, just as it cannot see “how many ‘r’s are in strawberry”: the same principle at work.
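A hypothetical illustration of such a divergence, written for this post rather than taken from any model output: the doc comment on `normalize` promises one thing, the body does another, and both compile happily. Only an executable check (the invented helper `keeps_promise`) makes the gap visible.

```rust
/// Returns the items sorted in ascending order, without duplicates.
// NOTE: the doc comment above is deliberately wrong; that is the point of the example.
fn normalize(items: &[u32]) -> Vec<u32> {
    let mut out = items.to_vec();
    // What actually happens: descending order, duplicates kept.
    out.sort_unstable_by(|a, b| b.cmp(a));
    out
}

/// The promise from the doc comment, restated as code:
/// strictly increasing neighbours means "ascending and no duplicates".
fn keeps_promise(output: &[u32]) -> bool {
    output.windows(2).all(|pair| pair[0] < pair[1])
}
```

The compiler has nothing to say about the comment; `keeps_promise(&normalize(&[1, 3, 2, 3]))` is what reports the broken promise.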

Think of yourself as an Upanishadic seer who sees the veil of Maya (and sees through it) for the very first time.

One more time – what you are looking at in your LLM’s output is a very sophisticated, computationally intense and expensive (but still a brute-force) cognitive illusion.

Since Anthropic added the actual compiler to the feedback loop, it gives you code that runs and has no errors, but, again, when the engine is fixing the compiler errors it moves away from the verbiage and does not update it: the same fundamental problem of stale or outdated comments, but at the level of human cognition.

Here is an example of verbiage even a local model can produce:

  • Style: Functional programming textbook style (pure, immutable, referential transparency).
  • Language: Rust 2024 (latest features: `Result`, `Option`, `match`, `traits`, `Algebraic Data Types`).
  • Architecture: Layered DSLs with abstraction barriers, orthogonal modules, high-level stateless interfaces.
  • Coding Standards:
    • No `mut`, no mutable references (use pipelines/moves).
    • No nulls, no imperative loops (use iterators/combinators).
    • No naked primitives (use “new-type” wrappers/ADTs).
    • Smart constructors for invariant enforcement.
    • Traits for duck-typing/constraints (Eq, Ord, Semigroup, Monoid, Functor-like).

But the code is imperative crap, because it autocompletes from what it has been trained on, and the code blocks are almost always “verbatim” (it will fail if it “chooses” a slightly wrong variable name or a different symbol).
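The contrast can be sketched side by side. Everything here is a made-up illustration, not captured model output: `total_imperative` stands for the kind of code that tends to come out, while `Percent` and `total_functional` are one possible reading of what the listed standards actually ask for.

```rust
/// The typical result: `mut`, an explicit loop, naked primitives,
/// and the validity rule buried inside the loop body.
fn total_imperative(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        if *v >= 0 && *v <= 100 {
            total += *v;
        }
    }
    total
}

/// New-type wrapper: a `Percent` is always within 0..=100.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Percent(u8);

impl Percent {
    /// Smart constructor: enforces the invariant once, at the boundary.
    /// (The field is private, so code outside this module must go through here.)
    fn new(raw: i32) -> Option<Percent> {
        if (0..=100).contains(&raw) {
            Some(Percent(raw as u8))
        } else {
            None
        }
    }
}

/// The same computation under the stated rules:
/// no `mut`, no loop, invalid inputs dropped by the constructor.
fn total_functional(values: &[i32]) -> u32 {
    values
        .iter()
        .filter_map(|v| Percent::new(*v))
        .map(|Percent(p)| u32::from(p))
        .sum()
}
```

Both functions add up the in-range values, so a test that only compares results cannot tell them apart; the difference is in whether the code keeps the promises made in the prose around it.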

There will be a lot of surprises for those who cannot see things as they really are.