Let's talk about something really hard. There are at least three whole asynchronous, concurrent “full stacks” on top of the Java Runtime written in Scala (which compiles to JVM bytecode and whose standard library is largely a wrapper around Java’s).
The first one, arguably the most widely used, is Twitter’s platform. Then comes Lightbend (formerly Typesafe), and then the stack upon which Spark has been built.
The most amazing thing is that vastly complex codebases, like Twitter’s, “just work”. Not just that, but everything has been open-sourced, and the development continues in the open for literally everyone to see or even participate in. Simply amazing. Here is why.
A JVM implementation, like OpenJDK, is an imperative multi-threaded application written in crappy C++ (which only recently moved to the C++14 standard). It runs as an ordinary user-space multithreaded process under an OS.
The process is an interpreter and just-in-time compiler for Java bytecode (hot code gets compiled into optimized machine code), which also runs the compiled code and carries (implements) the runtime system (usually called the “Java Runtime”).
This runtime library uses very few services provided by an OS, apart from threads, low-level memory management and basic low-level I/O (atomics and fences come straight from the CPU). It re-implements almost everything else, starting from basic buffers and even file abstractions, in Java.
This is already a recipe for disaster – relying on a complex C++ codebase which delegates almost nothing and tries to do literally everything.
Why does this work? The answer is layers upon layers of proper abstraction barriers and a static type discipline, which establishes so-called memory safety.
The Java platform (JDK) provides APIs and services, including asynchronous tasks, thread pools and “executors”, which run tasks on a bunch of actual POSIX threads (each with its own stack, but all sharing the heap and the address space of the same single JVM process). Just imagine – everything within a single multithreaded user-level process that shares memory.
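To make the executor idea concrete, here is a minimal sketch that drives a plain `java.util.concurrent` thread pool from Scala. The object name `ExecutorSketch` and the method `runTasks` are made up for illustration; only the JDK calls are real. Every task bumps the same counter, which works precisely because all pool threads share one heap.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical example object, not part of any of the stacks discussed here.
object ExecutorSketch {
  // Runs `tasks` small jobs on a fixed pool of `threads` OS threads.
  // All jobs update one counter that lives on the shared JVM heap.
  def runTasks(tasks: Int, threads: Int): Int = {
    val pool = Executors.newFixedThreadPool(threads)
    val counter = new AtomicInteger(0)
    try {
      val pending = (1 to tasks).map { i =>
        pool.submit(new Callable[Int] {
          // addAndGet is atomic, so concurrent updates are not lost
          def call(): Int = counter.addAndGet(i)
        })
      }
      pending.foreach(_.get()) // block until every task has finished
      counter.get()
    } finally {
      pool.shutdown()
      pool.awaitTermination(5, TimeUnit.SECONDS)
    }
  }
}
```

Calling `ExecutorSketch.runTasks(100, 4)` sums 1 to 100 across four threads. The blocking `get()` calls are only there to keep the sketch short; the libraries discussed next exist largely to avoid that kind of blocking.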
At the first layer, Scala translates everything into calls to the Java Runtime, including the most basic things like buffered I/O.
Then, using Java APIs, Scala implements the Futures API. Twitter has its own API and implementation for Futures, and so does Lightbend. They even have their own mini-runtimes which actually run the Futures on that very same thread pool which the JRE provides.
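A small sketch of that layering, using the standard library's `scala.concurrent.Future` (not Twitter's `com.twitter.util.Future`, whose API differs). The names `FutureOnPool` and `threadNameOfFuture` are hypothetical. The point is only that the Future's body ends up running on a worker thread of an ordinary JDK pool.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

// Hypothetical example object, for illustration only.
object FutureOnPool {
  // Wraps a JDK thread pool in a Scala ExecutionContext, runs one Future
  // on it, and reports the name of the thread that executed the body.
  def threadNameOfFuture(): String = {
    val pool = Executors.newFixedThreadPool(2)
    implicit val ec: ExecutionContextExecutorService =
      ExecutionContext.fromExecutorService(pool)
    try {
      val f: Future[String] = Future { Thread.currentThread().getName }
      Await.result(f, 5.seconds) // blocking only to get the value out of the sketch
    } finally ec.shutdown()
  }
}
```

The returned name follows the JDK's default thread factory naming (`pool-N-thread-M`), which shows that the Scala-level abstraction is a thin layer over the Java executor underneath.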
The main difference is that Akka (which is a set of libraries too) builds its concurrency around the “Actor Model”, with the standard library’s Futures running alongside the actors, while Twitter’s Futures are “actor-less”. This is the basis of a “concurrency stack” – lots and lots of layers of libraries, all based on Futures for implementing async I/O.
Here is the first take-home message and an aha moment – the thesis that Scala is an advanced mostly-functional language designed especially for writing libraries full of layered advanced abstractions and related embedded DSLs is not just buzzwords and marketing-speak; it actually works with some of the most sophisticated codebases humanity has ever produced. Twitter is huge and runs 24/7.
Another, even more important, message is that this is NOT because of Java or the JVM (it is actually despite Java!) but due to the fundamental principles of proper abstraction and stable abstract interfaces, applied in each of these advanced Scala libraries.
It started with Scala, which was a principle-guided academic effort to fix Java, which was (and still is) a fucking abomination, “designed” by the unqualified and uneducated, just like PHP or Ethereum.
It is not that Java is any good, it is that applied fundamental principles, proper academic knowledge and strict discipline and formality finally won. Scala just comes from a much better tradition and the resulting culture. Nothing compares to these actual results (accomplishments) – Rust and especially fucking Kotlin are nothing but fucking jokes.
This is only the beginning of the story. These layers of advanced libraries are all asynchronous (and concurrent), which means that the actual workflows are non-sequential. Nevertheless, the underlying libraries have been built on the principle of composition (composability), using combinators like flatMap, and almost everything is a proper asynchronous stream, so everything composes and actually works.
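Composition with flatMap can be sketched in a few lines. This uses the standard library's `scala.concurrent.Future` and the global execution context; `fetchUserId` and `fetchScore` are made-up stand-ins for real async calls, not anything from Twitter's or Lightbend's code.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical example object, for illustration only.
object ComposeSketch {
  // Two made-up asynchronous steps; real ones would do network or disk I/O.
  def fetchUserId(name: String): Future[Int] = Future(name.length)
  def fetchScore(id: Int): Future[Int] = Future(id * 10)

  // The second step depends on the result of the first, so they are
  // chained with flatMap; map then transforms the final value.
  def scoreFor(name: String): Future[Int] =
    fetchUserId(name).flatMap(id => fetchScore(id)).map(_ + 1)

  // The same pipeline as a for-comprehension, which the compiler
  // rewrites into the flatMap/map calls above.
  def scoreForSugar(name: String): Future[Int] =
    for {
      id    <- fetchUserId(name)
      score <- fetchScore(id)
    } yield score + 1

  // Blocks for the result; used here only to inspect the value.
  def block[A](f: Future[A]): A = Await.result(f, 5.seconds)
}
```

Neither `scoreFor` nor `scoreForSugar` blocks a thread while waiting; each returns a new Future immediately, which is what lets whole layers of libraries stack on top of one another.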
Now it is time to take a look at the APIs and to become instantly overwhelmed by the amount of nesting and underlying details.
Again, all this is possible to understand and to maintain only because of the underlying principles of proper, non-leaking abstractions and abstraction barriers made out of abstract interfaces – a sort of semi-permeable partition, like a cell membrane – the same underlying universal principle.
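An abstraction barrier of this kind might look like the following sketch. Everything here (`Store`, `InMemoryStore`, `greet`) is invented for illustration. The client function sees only the trait, so the in-memory implementation could be swapped for a networked one without touching it.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical example object, for illustration only.
object BarrierSketch {
  // The barrier: an abstract, asynchronous key-value interface.
  trait Store {
    def put(key: String, value: String): Future[Unit]
    def get(key: String): Future[Option[String]]
  }

  // One implementation behind the barrier; its internals never leak out.
  final class InMemoryStore extends Store {
    private val data = TrieMap.empty[String, String]
    def put(key: String, value: String): Future[Unit] =
      Future.successful(data.update(key, value))
    def get(key: String): Future[Option[String]] =
      Future.successful(data.get(key))
  }

  // Client code written purely against the trait.
  def greet(store: Store, key: String)(implicit ec: ExecutionContext): Future[String] =
    store.get(key).map(_.fold("hello, stranger")(name => s"hello, $name"))

  def demo(): (String, String) = {
    import ExecutionContext.Implicits.global
    val store: Store = new InMemoryStore
    val result = for {
      _ <- store.put("u1", "Ada")
      a <- greet(store, "u1") // key present
      b <- greet(store, "u2") // key missing
    } yield (a, b)
    Await.result(result, 5.seconds)
  }
}
```

The "membrane" is the `Store` trait: values pass through it as Futures, while the choice of data structure stays on the inside.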
“Horizontal” (orthogonal libraries) and “vertical” (layers of functional DSLs) partitioning is the central aspect of making complex systems that actually work.
The die-hard part is the complexity of these stacks. To even approach them is to realize how they have been built. The most amazing part is that the early Twitter team started with Ruby (or something) and then just switched to Scala without being experts or having any previous experience with the language, and yet, just by being principled and disciplined, they wrote their own stack and then a whole platform.
This is how to program, with non-bullshit die-hard vibes. I wish I were there. Again, the message is that this is the real programming, perhaps second only to Carmack himself.