LLM | Notes from the digital underground by Lngnmn

It Feels Like A Cheating Because It Is

The purpose of an examination in a college or university setting is to empirically verify that a student has his own genuine understanding and can actually apply it. It is that simple. This is why some courses even allow students to bring a textbook to an exam: if one does not [already] understand the principles and techniques, the textbook is of no use. However, bringing pocket calculators or smartphones is considered cheating, precisely because by using these devices one could appear to possess an understanding of one’s own. ...

openai-gpt-oss-20b

OpenAI has recently blessed us with a 20 billion parameter “open source” model, which is a significant step up from the previous 7 billion parameter model. This model is designed to be more efficient and effective in understanding and generating human-like text, or rather to appear to do so. It is total crap, by the way, at least compared to the online free-tier Grok (which is also very basic and limited). It is, obviously, “competing” with DeepSeek’s “open source” offerings, and has a very similar “feel”. ...

Let the bubble burst, for Christ's sake!

I have noticed a recent dramatic change in the behavior of the major online GPT providers. Most notably, Gemini now provides just an outline of the code, full of stubs and “mocks” of real APIs, instead of the full code. This is a significant change from the previous behavior, where they would provide a semi-complete (but error-ridden) code solution. Perhaps they are mimicking the behavior of ChatGPT, which has been doing this for a while now: they “optimize” for what appears to be a “dialogue” (more like a normie-level chat), to create a better illusion of “actually conversing with an artificial intelligence”. ...

The Knowledge Work Bubble

We are living through a paradigm shift, the kind described in “The Structure of Scientific Revolutions” by Thomas Kuhn. As I have mentioned many times, texts and even crappy code have become very, very cheap, just like processed junk food or low-effort street-food slop. This is the “shift” and the end of so-called “knowledge work” as we know it. At least this is the end of the pretentious “knowledge work”, where one just pretends to be an expert in social settings, using very straightforward verbal and non-verbal cues to signal “knowledge” and “expertise”, just as a priest would do in the not so distant past. ...

Look ma, 100x engineers

Today’s bullshit was “Surge CEO Says ‘100x Engineers’ Are Here”. Yesterday it was “Gemini at IMO Gold level”. Again, mere appearances are not the facts of reality, but this realization requires a bit more education and old-fashioned intelligence, similar to that of Hesse or Sartre. What does it mean nowadays to be an engineer, and to be a 100x one? It seems that the meaning is what they call “productivity”, which is the time it takes to put together some spaghetti webshit, without any understanding whatsoever, from hundreds of lowest-quality amateur node_modules in a few minutes. This is what 100x means for them, and this is, of course, bullshit. ...

Gold medal-level performance at IMO.

Oh, look: “gold medal-level performance on the world’s most prestigious math competition, the International Math Olympiad (IMO)”. Let’s be very careful with this very clever, deceptive and suggestive wording. The model “solved” (came up with acceptable solutions for) some of the problems and failed on others. The claim is that this is a gold medal-level performance, made as a response to Musk’s claim of a PhD level in all fields. The facts, however, are as follows: ...

Grok4 launch video

So I watched some. The launch video, I mean. Closed the frame when they began that “voice” thing. The “PhD level across all subjects” is just a meme. This can easily be shown by asking questions that require reasoning beyond memorized textbooks. I don’t want to “register” for “Grok4”, but here are some example problems which will break the “PhD level” meme. Ask for a recursive function in, say, OCaml or Scala, without explicitly mentioning the accumulator pattern as the required way to avoid stack overflows in languages which do not do TCO at compile time. This is very basic stuff, which all “PhDs” have to know. The inner lambda with an extra argument, and the “trampoline”, is such a classic pattern that some compilers do it automatically. Again, without mentioning it, the model will fail by writing non-TCO code. ...
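To make the pattern concrete, here is a minimal sketch of the accumulator plus “trampoline” combination. It is written in Rust only because Rust, too, does not guarantee TCO; the names `Step`, `sum_step`, `run` and `sum_to` are made up for this illustration and come from no library.

```rust
// All names here (Step, sum_step, run, sum_to) are hypothetical, for illustration only.

/// One bounce of the trampoline: either a final value or a deferred next step.
enum Step {
    Done(u64),
    More(Box<dyn FnOnce() -> Step>),
}

/// Tail-recursive in shape: the running total travels in the extra
/// argument `acc`, so nothing is left to do after the "recursive call".
fn sum_step(n: u64, acc: u64) -> Step {
    if n == 0 {
        Step::Done(acc)
    } else {
        // Return the next call as a closure instead of making it here.
        Step::More(Box::new(move || sum_step(n - 1, acc + n)))
    }
}

/// The trampoline itself: a plain loop, so the stack depth stays constant.
fn run(mut step: Step) -> u64 {
    loop {
        match step {
            Step::Done(value) => return value,
            Step::More(next) => step = next(),
        }
    }
}

/// Sum of 1..=n without growing the call stack.
fn sum_to(n: u64) -> u64 {
    run(sum_step(n, 0))
}
```

Calling `sum_to(1_000_000)` goes through a million bounces at constant stack depth, at the price of one heap allocation per step. A naive `n + sum(n - 1)` has no such guarantee.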

Illusion Of Intelligence

There is a very simple trick to break the illusion and to see through “the veil of Maya”. Locally-run models, like Deepseek-R1-14b-0528 (at full fp16 quantization), which is the best I could have, produce vastly different “answers” for exactly the same prompt, not just between two runs, but also when one uses a different math library stack (for example, by recompiling and forcing Intel MKL). Every time we run a prompt, a reasonably good model spits out something which “looks very reasonable” (unless you are an actual expert in the field), because it captures “common sense”, expressed in the training data. ...
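One plausible mechanism behind the math-library effect (my assumption, the runs above do not prove it) is that floating-point addition is not associative, so a library that reduces the same numbers in a different order can return a different result. A minimal sketch in Rust, with the made-up helper names `sum_forward` and `sum_reverse`:

```rust
// Hypothetical helpers: the same numbers, summed in two different orders.

/// Adds the values left to right in single precision.
fn sum_forward(xs: &[f32]) -> f32 {
    xs.iter().fold(0.0_f32, |acc, x| acc + x)
}

/// Adds the same values right to left.
fn sum_reverse(xs: &[f32]) -> f32 {
    xs.iter().rev().fold(0.0_f32, |acc, x| acc + x)
}
```

For the input `[1e8, -1e8, 1.0]` the forward sum is `1.0`, while the reverse sum is `0.0`, because the `1.0` is absorbed by rounding when it is added to `-1e8` first. Tiny differences of this kind in the logits can be enough to flip a sampled token, and everything generated after it diverges.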

The Age Of Copy-Pasting

Just a few years ago we had that running joke about “pasting code from stackoverflow”, which describes a coder who just finds and copy-pastes the code (the “right” answers) without any understanding whatsoever. It is well-understood at the level of cognitive neuroscience that skipping the part of “doing it yourself” (and so never getting those necessary 10,000 hours of deliberate practice) basically makes us dumber (lots of supporting MIT studies, google them) and ultimately wastes our time, the only irreplaceable and the most valuable resource we have ever had. ...

LLMs-generated Rust code

Rust code is the ultimate evidence of the fundamental inability of a probability-based generating algorithm (based on sampling from a “learned” probability distribution over “tokens”) to come up with something that passes the type checker, except in the most trivial cases. The “causality” is that generating complex syntactic forms without an actual, proper understanding of the principles and heuristics is, well, “problematic”. The running example is the subtle “already borrowed” panics, where the issue is with the underlying semantics, while the syntax is “correct”. The problem is that a recursive function has to drop all the borrows before a recursive call, and this constraint cannot be expressed in the syntax except through redundant bindings, which will automatically be dropped at the end of the scope. ...
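A minimal sketch of this failure mode, assuming the shared state lives in a `std::cell::RefCell` (the function names `visit_bad` and `visit_ok` are made up for the illustration). Both versions compile; only the scoping of the borrow differs.

```rust
use std::cell::RefCell;

/// Compiles fine, but panics at run time for depth > 0: the `RefMut`
/// guard is still alive when the recursive call tries to borrow again.
fn visit_bad(log: &RefCell<Vec<u32>>, depth: u32) {
    let mut guard = log.borrow_mut();
    guard.push(depth);
    if depth > 0 {
        visit_bad(log, depth - 1); // second borrow_mut while `guard` lives
    }
} // `guard` is dropped only here

/// Same logic, but the borrow sits in its own scope, so it is dropped
/// before the recursive call.
fn visit_ok(log: &RefCell<Vec<u32>>, depth: u32) {
    {
        let mut guard = log.borrow_mut();
        guard.push(depth);
    } // `guard` is dropped here, before recursing
    if depth > 0 {
        visit_ok(log, depth - 1);
    }
}
```

With `let log = RefCell::new(Vec::new());`, calling `visit_ok(&log, 3)` leaves `[3, 2, 1, 0]` in the log, while `visit_bad(&log, 1)` panics inside the second `borrow_mut`. Nothing in the surface syntax of `visit_bad` hints at the problem; one has to know when the guard is dropped.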