AUTHOR: <lngnmn2@yahoo.com>
When I was a kid they told me not to stare at the Sun. I had this vision that brain structures are sort of like trees, while the “branches” are just like paths through our yard after fresh snow.
Some of them remain thin, as if someone just walked across absentmindedly; some get broadened by heavy re-use. Who would plow through fresh snow when one could follow a path that is already there?
The so-called use of Reinforcement Learning for LLM tuning (after which the model is able to mimic “reasoning”) is in principle just the same thing: the most “seen” or “fed-in” paths (of weights) get updated.
This, of course, is frequency-based statistical “learning”, which in principle only mimics and never actually learns.
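Here is a toy sketch of that “broadened snow path” dynamic. To be clear, this is not any actual RLHF or policy-gradient pipeline; the paths, weights, and function names are all made up, purely to illustrate the rich-get-richer update:

```python
import random
from collections import defaultdict

# Hypothetical "paths" stand in for chains of weights/token transitions.
weights = defaultdict(lambda: 1.0)   # every path starts as thin, fresh snow
paths = ["A", "B", "C"]

def sample_path():
    # The more a path was walked (the higher its weight),
    # the likelier it is to be taken again.
    total = sum(weights[p] for p in paths)
    r = random.uniform(0, total)
    for p in paths:
        r -= weights[p]
        if r <= 0:
            return p
    return paths[-1]

def reinforce(path, reward=1.0):
    # "Broaden" the chosen path; nothing about *why* it was
    # rewarded is ever stored -- only the count-like weight.
    weights[path] += reward

for _ in range(1000):
    reinforce(sample_path())

print(dict(weights))  # one path ends up heavily broadened
```

Run it a few times: whichever path gets an early lead tends to dominate, exactly like the trodden track through the snow.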
The canonical example is “whether the Sun will rise tomorrow”. Statistical learning says “it will, because it did so many times”. The real answer is “it will, because such giant processes do not fluctuate, due to their very nature: slow and steady nuclear reactions within a particular locality”.
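That frequency argument has a classic formalization: Laplace’s rule of succession, which assigns probability (n + 1) / (n + 2) to the next sunrise after n observed ones. A minimal sketch (the function name is mine):

```python
# Laplace's rule of succession: after observing n sunrises in n days,
# the purely frequency-based estimate that the Sun rises tomorrow is
# (n + 1) / (n + 2). It creeps toward 1 but encodes no physics at all.
def p_sunrise_tomorrow(n_observed_sunrises: int) -> float:
    return (n_observed_sunrises + 1) / (n_observed_sunrises + 2)

print(p_sunrise_tomorrow(10))         # ~0.917
print(p_sunrise_tomorrow(1_000_000))  # ~0.999999 -- still just counting
```

No matter how large n gets, the number says nothing about nuclear fusion; it is a count dressed up as knowledge.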
Another principle is my own: no amount of [frequency-based] statistical observation of trees directly from above (where they all look like spheres of leaves) will reveal the fact that there are roots, a single trunk, and lots of branches growing out of it.
These are the principal limitations of your “reasoning” LLMs, which are still just “dumb data structures” (data files).
Again, the so-called “reasoning” is mimicked, just as an actor in a movie reads from a script.