Using the standard formalisms only within appropriate contexts

Fitting (using an optimization method) a weighted sum, a line, a curve, or a whole “plane” to match the data.
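
A minimal sketch of such fitting, with made-up data points, using NumPy’s least-squares `polyfit`:

```python
# Fit a line y = a*x + b to noisy observations by least squares,
# i.e. pick the weights a, b which minimize the mismatch with the data.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])   # noisy readings around y = 2x + 1

a, b = np.polyfit(x, y, deg=1)             # fit a degree-1 polynomial (a line)
print(f"fitted line: y = {a:.2f}*x + {b:.2f}")
```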

Statistics

This is, by definition, observing, categorizing and counting observations (which also means “measuring”) that have already happened. There is no notion of any potential or possible outcomes. We just observe and count (measure).

My favorite example is from archery: one just measures the distances between the hits (arrows), sums them and divides the sum by the number of shots fired (which is averaging the distances). The player with the smallest average distance is objectively the most skilled one, with the most consistent actual outcomes.
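
A sketch of that averaging, with made-up distances:

```python
# The archery example: measure the distances, sum them, divide by the
# number of shots fired. The smallest average marks the most skilled player.
distances = [3.0, 1.5, 2.5, 4.0, 1.0]       # measured distances, one per shot

average = sum(distances) / len(distances)   # sum, then divide by the count
print(average)                               # 2.4
```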

We do not talk about any potentiality, possibility or potential outcomes, in principle. We just observe, count (measure), compare, sort and take the minimum.

Probability

A set of all distinct possible (actually attainable or reachable) outcomes, “weighted” by coefficients which all sum to 1. The crucial part here is that all the outcomes are in the set.
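
A minimal illustration of the definition, with a fair six-sided die as the classic example:

```python
# A set of all distinct possible outcomes, each carrying a weight,
# with the weights summing to exactly 1 -- the crucial invariant.
import math

outcomes = {1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6}

assert math.isclose(sum(outcomes.values()), 1.0)
```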

This implies that missing a single crucial factor (a possible outcome), or adding one (or more) imaginary ones (which would never happen but carry some weights), will render the whole “model” into sophisticated mathematical bullshit.

Missing a crucial factor is more “deadly” than having imaginary ones: with merely imaginary additions the “real” (actual) outcomes still carry approximately “correct” weights, while a missing outcome is implicitly given the weight of zero.

Counting observations

Just as in the classic settings (coin tosses and dice), we ultimately observe, bucket-sort and count observations; thus these are frequency-based – the only valid probabilities.
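
A sketch of that counting, with simulated die rolls standing in for the observations:

```python
# Frequency-based probabilities: observe, bucket-sort (count), then divide
# each bucket by the total number of observations.
import random
from collections import Counter

rolls = [random.randint(1, 6) for _ in range(10_000)]   # observe
counts = Counter(rolls)                                  # bucket-sort and count

frequencies = {face: n / len(rolls) for face, n in counts.items()}
print(frequencies)   # each value is close to 1/6 for a fair die
```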

In-principle incompatible with changes

The proper notion of a probability (and the less-bullshit notion of “deep learning” based on back-propagation) is by definition applicable only to stable environments, which do not change structurally while you observe them, just as the given coins or dice remain stable.

This principle implies that most applications of probabilities to the markets are, in principle, bullshit. Do not do it.

Deep learning

This is a generalization of “fitting” to a whole multi-dimensional [vector] space.

The main principle is that back-propagation (updating the weights according to the calculated partial derivatives) is the ultimate fitting algorithm, generalized to any number of “dimensions”.
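
A minimal one-dimensional sketch of the principle, fitting a single weight to made-up data:

```python
# Fit w in y = w * x by repeatedly computing the partial derivative of the
# squared error with respect to w and updating w against it.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]    # roughly y = 2x

w, lr = 0.0, 0.01
for _ in range(200):
    # d/dw of sum((w*x - y)^2) is sum(2 * (w*x - y) * x)
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys))
    w -= lr * grad           # move the weight down the slope
print(w)                      # close to 2.0
```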

Again, the “inputs” or observations must come from a stable (consistent) environment, otherwise some nonsense will be “learned”, in principle.

Markov processes

This is an example of superimposing an over-simplified view (a model). One pretends that everything relevant has been properly captured in the estimated probabilities (which cannot be true, in principle - this is only wishful thinking).

One more time - probabilities are applicable only to stable, discrete, fully-observable systems, like coins, cards or dice.

Kalman filters

Averaging of noisy measurements to estimate the actual value. This is what all the moving averages are – an attempt to estimate some “true” value of the price (or volume).
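
A drastically simplified sketch of the averaging idea (not a full Kalman filter; the “true” value and the noise are made up):

```python
# Blend each new noisy reading into a running estimate of the "true" value.
# The incremental update below is exactly the running average; a Kalman
# filter generalizes it by deriving the gain (here 1/n) from noise estimates.
import random

true_value = 10.0
estimate = 0.0

for n in range(1, 101):
    reading = true_value + random.gauss(0, 1.0)   # a noisy measurement
    estimate += (reading - estimate) / n          # incremental running mean
print(estimate)                                    # close to 10.0
```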

This is the less-bullshit part, because we measure and deal with imperfect, potentially faulty “instruments” which produce noisy readings.

This is, however, a super-imposed oversimplified view of reality, which is ultimately different from what the actual measurements signify.

From the first principles

Ultimately, we reverse (take the inverses of) all the operations which constructed (produced) the pattern.

Candles have been constructed by putting together (addition) and scaling (multiplication).

All these are differentiable operations (partial derivatives can be taken).

Each partial derivative, as a ratio, shows how much the factor (a weight) contributes (it is the slope along that dimension).
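
A sketch of this, taking numerical partial derivatives of a toy function built by putting together (addition) and scaling (multiplication):

```python
# Estimate how much each factor contributes by taking a numerical partial
# derivative -- the rise-over-run ratio along each "dimension".
def f(a, b):
    return a * 3.0 + b * 5.0    # built from scaling and addition only

h = 1e-6
df_da = (f(1.0 + h, 2.0) - f(1.0, 2.0)) / h   # slope along "a"
df_db = (f(1.0, 2.0 + h) - f(1.0, 2.0)) / h   # slope along "b"
print(df_da, df_db)   # approximately 3.0 and 5.0
```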

Time

Superimposing an abstract time scale, and then discarding the whole notion of time.

Discrete time - regular “notches” on an abstract “scale” (regular intervals). Time-invariant - the intervals are always the same (and never change).

Rise over Run - since the Run is always a unit, the Rise alone fully captures the notion.
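
A sketch with made-up readings:

```python
# With a unit Run (evenly spaced readings) the slope between consecutive
# readings is just their difference -- the Rise alone.
readings = [10.0, 10.5, 11.5, 11.0, 12.0]

rises = [b - a for a, b in zip(readings, readings[1:])]
print(rises)   # [0.5, 1.0, -0.5, 1.0] -- each is Rise/Run with Run = 1
```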

Order

A sequence (series) denotes only an order of events (evenly spaced “in time”).

Arithmetic

Putting together and taking apart

These are the ultimate universal notions, from which the abstract notion of the Natural Numbers and the operations of addition and subtraction, multiplication and division (including a remainder) have been properly generalized (abstracted out).

“Putting together” is so universal that it is even prior to the notion of a Set, which is how the mind of an external observer categorizes observed “things”.

Everything has to be traced back to these universals without losing (or distorting) the meaning of the abstract concept. This is the key to having proper, valid abstractions.

Difference

A difference - what remains when a quantity has been taken away from (subtracted from) another.

Distance

A distance - what is the difference (in notches on a Number Line) between two values (points).

Speed

How many notches on a superimposed abstract Time Scale it took to “traverse” the distance; the distance divided by that number of notches is the speed.
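
A worked instance tying the three notions together (the numbers are made up): a value moved from 9 to 15 over 3 notches.

\[\text{difference} = 15 - 9 = 6, \quad \text{distance} = |15 - 9| = 6, \quad \text{speed} = \frac{6}{3} = 2 \text{ per notch}\]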

Division

The division operation can be thought of as an inverse of multiplication (which, in this context, is a repeated addition, not scaling), but technically it isn’t, because of the remainders which may occur in a repeated subtraction.

Division - how many “times” this quantity can be subtracted away, and what is the remainder, if any.
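
A sketch of division as repeated subtraction (the helper name `divide` is just for illustration):

```python
# How many times can the divisor be subtracted away, and what remains?
def divide(quantity, divisor):
    times = 0
    while quantity >= divisor:
        quantity -= divisor   # take the divisor away once more
        times += 1
    return times, quantity    # (quotient, remainder)

print(divide(17, 5))   # (3, 2) -- the same as divmod(17, 5)
```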

A ratio

A ratio is the inverse notion of scaling (by a constant factor).

Just by itself, a ratio of \[\frac{1}{10}\] means a quantity has been divided into 10 parts of equal size and 1 of these pieces has been “taken”. Or just one tenth.

All by itself it signifies a “split”, so \[\frac{2}{3}\], which is two thirds or 2 over 3, is, literally, two of the one-thirds.

To scale (multiply) by a fraction is to calculate the value of the ratio for a given number.
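
A worked instance: to take two thirds of 12, split it into 3 equal parts (4 each) and take 2 of them.

\[\frac{2}{3} \times 12 = \frac{12}{3} \times 2 = 4 \times 2 = 8\]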

Rise-over-run

Rise-over-run is a ratio, literally just \[\frac{Rise}{Run}\].

This is a fundamental notion – because it is also a slope.

A slope

A line \[y = ax + b\] has the same slope \(a\) everywhere, so it is a slope of itself.

Trend lines

When we plot and connect the individual “dots” (points) by fitting a smooth curve, we get a trend line.

If, however, we connect the points with straight lines, each segment will be a slope of itself.
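
A sketch with made-up points, evenly spaced so that each Run is 1:

```python
# Each straight segment between consecutive points "is a slope of itself":
# its Rise over its Run.
points = [(0, 1.0), (1, 2.5), (2, 2.0), (3, 4.0)]

slopes = [(y2 - y1) / (x2 - x1)             # Rise over Run per segment
          for (x1, y1), (x2, y2) in zip(points, points[1:])]
print(slopes)   # [1.5, -0.5, 2.0]
```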

Averages

Moving averages

A moving average is a finite sum of the last \(n\) readings (measurements), divided by the window size (\(n\)).

The sum could be weighted with coefficients, to diminish the “past” values.

When we plot and connect the moving averages, we get smoothed trend lines.

Moving averages of different sizes, when plotted against each other, show relative “average speeds”.
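
A sketch of both variants (the readings and the weights are made up):

```python
# A plain moving average of the last n readings, and a weighted one
# whose coefficients diminish the "past" values.
def moving_average(readings, n):
    return [sum(readings[i - n:i]) / n for i in range(n, len(readings) + 1)]

def weighted_moving_average(readings, weights):
    n = len(weights)          # e.g. [1, 2, 3]: the latest reading weighs most
    total = sum(weights)
    return [sum(w * r for w, r in zip(weights, readings[i - n:i])) / total
            for i in range(n, len(readings) + 1)]

prices = [10, 11, 13, 12, 14, 15]
print(moving_average(prices, 3))                   # [11.33..., 12.0, 13.0, 13.66...]
print(weighted_moving_average(prices, [1, 2, 3]))  # leans towards recent prices
```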

Derivatives

A derivative is a slope with a subtly different definition.
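
The subtlety: the Run is no longer a fixed unit; the derivative is the rise-over-run ratio in the limit as the run shrinks to zero.

\[f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}\]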
