To understand what non-deterministic, “syntax level” probabilistic models are actually producing (an illusion), we have to understand how the Mind (of an external observer) works, and how it produces Maya (which has been intuitively understood since the early Upanishads) – an ultimate illusion created by the Mind itself, an inner representation of the “outside” world, which the mind (and body) uses for “decision making”.
The “outside” world is inherently complex, non-deterministic and “concurrent” at the level of “compositions” and, at the same “time”, deterministic enough at the level of the “most basic building blocks” (of biology, let’s say) – “simple” molecular structures (“small” molecules) are exact copies of each other (otherwise everything would break), while larger molecular structures (like whole proteins and their compositions) may have “flaws” or “mutations” or just simple “kinks” – a slightly different shape or resulting form.
This is a universal principle, of course. The larger the compound structure gets (grows), the more “non-deterministic changes” occur. No two trees are the same, no two stones, no two biological processes.
In general, no two outcomes of complex processes are the same, while the “conditioned” Mind tends to “see” them as such (which is extremely useful, otherwise the system would be overwhelmed with irrelevant details and subtle differences, just as some are overwhelmed with emotions).
This is presumably what some people mean when they proclaim that “everything is probabilistic”, which is, of course, an instance of the same gross oversimplification at a more general and abstract level.
Yes, no two cooked dishes are the same, no two cases of “the same illness” are the same, and there are not even two illnesses that are exactly the same (a fact that doctors will never tell you, since their oversimplified and “overly-generic” approach would collapse).
The classic example is the biological study of identical twins over time. It is an absolutely astonishing fact how similarly some genetically identical twins may develop over, let’s say, 30 years (which tells one that almost everything is “genetic”).
So, the world is “more deterministic” at a small scale, and non-deterministic at larger and larger scales – the levels of more and more complex compositions of more and more complex [layers of] building blocks.
This ultimate fact has been intuitively well understood and accepted since the earliest human civilizations of the Indian subcontinent. It has also been “captured” by biological evolution itself, in the way our (and higher animals’) brains happened to evolve structurally.
The fundamental principle here is that every biological system continuously builds and maintains a grossly oversimplified, mostly inaccurate, “barely good enough to survive and reproduce” crude representation of the outside world, which it uses for decision making.
The fact that the human mind never “uses” the outside world directly and relies on “what it thinks of it/how it sees it” (and it “sees” what it wants or expects to see) has been intuitively understood by the best thinkers and philosophy-of-the-mind traditions of the East.
This principle is, of course, also true for “the underlying biology” – the “sensors” are crude and very limited, yet good enough to regulate the biological processes and to be synchronized with the recurring conditions and phases in the environment (notice, no “time” or “ticking”).
The ultimate principle is that the vast complexity of causal relations and the non-deterministic nature of the world at the level of “large structural compounds” and “complex processes” (like weather patterns) have been “dealt with” by using oversimplified, good-enough “views” (not even “models”) of reality at all levels.
And this is, basically, it. End of the story. Everything can be understood and explained properly, once this universal principle is realized. Any complex human activity and any “social formation”, in principle.
Here comes the maxim “The map is not the territory”, which captures yet another universal principle. All maps or models are “wrong”, but some are useful.
Consider so-called Chinese “Medicine” – the whole theoretical framework of the opposites (hot and cold, sweet and sour) is utterly naive abstract bullshit (while the abstract notion of “restoring a lost balance” captures the essence), and yet it still “works” somehow.
It works because, while a “model” may be utterly wrong, it somehow “captures”, usually by accident (and buried under mountains of abstract bullshit verbiage), enough of What Is to be useful.
The same pattern is literally everywhere, in all the major abstract bullshitting doctrines, from organized religions, via “psychoanalysis” (which is utter bullshit), to the actual “forces” which move the crypto and stock markets – the same few underlying principles at work.
And, of course, if we consider the “endless diarrhea of abstract bullshit verbiage” itself, we will find the very same principles underneath, and this is exactly what an LLM produces.
One more time – what is Maya? It is an oversimplified but partially useful illusion the mind (the brain) creates (for itself) to cope with the vast inherent complexity of the “outside” world. It is considered an “illusion” because only a very (very) few aspects of the “physical environment” are captured “just right” (as in all higher animals), while everything else is just constructed, socially constructed “abstract” bullshit.
Think of this as the cost of having an intellect, constantly preoccupied with the words of a language. So-called intelligence is language-based in principle (which is another serious, complex and large topic in itself).
So, what an LLM spews out is an illusion of intelligence made out of the words of a human language. It “works” because the “torrent of verbiage”, created by the process of sampling from a data structure (an information artifact) based on a pre-trained conditional probability distribution, has somehow “captured” at least some aspects of reality, so it is partially useful.
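The “sampling from a conditional probability distribution” part can be illustrated with a toy. What follows is only a minimal sketch, not how any actual LLM is built: it assumes a tiny made-up corpus and a bigram table (each word conditioned only on the one word before it), and the names `CORPUS`, `train_bigram`, `sample_next` and `generate` are hypothetical, invented for this illustration.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy corpus, standing in for the "mountains of verbiage".
CORPUS = (
    "the map is not the territory . "
    "the map is useful . "
    "the mind is not the world ."
)


def train_bigram(text):
    """Count, for every word, which words followed it in the text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def sample_next(counts, prev, rng):
    """Draw one word from the estimated distribution P(next | prev)."""
    followers = counts[prev]
    words = list(followers)
    weights = [followers[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(counts, start, length, rng):
    """Chain samples together into a 'torrent' of at most `length` words."""
    out = [start]
    for _ in range(length - 1):
        if not counts[out[-1]]:  # no continuation was ever observed
            break
        out.append(sample_next(counts, out[-1], rng))
    return " ".join(out)


model = train_bigram(CORPUS)
print(generate(model, "the", 8, random.Random(0)))
```

Every adjacent pair of words in the output was seen somewhere in the corpus, so the result looks locally plausible, yet nothing in the table refers to any map, territory or mind. Whatever “sense” the output makes was already in the recorded words.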
How did it “capture” anything “real” at all? Well, human language is an evolved “tool” (a system of subsystems) for describing and communicating some observed aspects of the shared environment [intuitively captured by the mind]. Some (very, very few) of these “recorded” observations (socially preserved in so-called language-based culture and tradition) correctly capture, name, and generalize some “stable” or “recurring” aspects of the outside world, so the resulting “shared knowledge” is, sometimes, partially useful.
Most of it is utter bullshit – verbalized noise – of course.
Since the beginning of time, the process of “parsing through” the mountains of solidified verbal diarrhea in search of “grains of truth” (similar to what chickens do on a pile of compost) has been considered “knowledge work”, and this is exactly what it meant to be “a man of words”.
Discarding the accumulated verbal compost and tracing correctly captured and properly generalized abstractions back to What Is (which is exactly what I am doing here) is a rare skill, and it is exactly the opposite of what LLMs do – they produce more and more of such stuff.
So, this is how the modern Maya works and how one could see through it. The process of writing modern Upanishads is still going on, and you have just read one.