James and I were travelling by train to Geneva. We were to participate in the prestigious International Congress on Espionage & Sabotage. Time on a train is ideal for leafing through conference materials.
Suddenly, I noticed that James was studying a figure that looked quite familiar to me.
``I didn’t know you dealt with spectroscopy,'' I said. ``I see, this is a Strontium L_3 edge as it appears in Electron Energy-Loss Spectroscopy. Wait a moment... I have a similar spectrum, but mine is slightly better calibrated.''
James glanced absent-mindedly at my image and became dumbfounded. A moment later, I observed a rather rare natural phenomenon - astonishment on his face.
``I am not involved in spectroscopy. My graph is the annual budget of NASA, since its inception. But your graph - if you say it is a particular spectrum - is essentially the same...''
However, rational thinking soon prevailed.
``I don’t think there is any hidden relation between excitations of atomic shells and the excitation of Americans about space. Most probably, it is just a coincidence,'' he said. ``The number of technical documents is growing exponentially. An enormous variety of combinations appears, and sooner or later even a very improbable parallelism may occur. It's like meteorites. Have you ever observed a meteorite falling?''
``Never.''
``Neither have I. And yet, every day several fall on Earth. If we were a community of only a thousand individuals, we would consider such events esoteric and therefore mythological. But we are several billion, and so we know precisely that meteorites exist. Who knows which other rare events would have to be acknowledged as real if our statistics were a million times larger...''
He fell silent, lost in thought.
``But such rare events do not affect the general picture. They are washed out after averaging over all available samples,'' I argued.
``Mostly,'' James agreed. ``But not always. We can easily imagine that a singular, rare, and unbelievable event dramatically changes an entire life and a global trend.''
``Black swan?''
``Exactly. By the way, PCA offers a fine illustration of this point...''
As there was still half an hour left before the terminal station, we quickly devised an example of a data distribution in which a single outlier completely overrode the PCA trend.
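Why did this happen? Because PCA effectively finds the line that best fits all data points in the least-squares sense: it minimizes the sum of squared perpendicular distances from the points to the line. Each point's contribution therefore grows quadratically with its distance, so a single point lying far away from the others can dominate the result.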
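The effect is easy to reproduce. The following is a minimal sketch, not the code from the PDF version: the synthetic data, the seed, the outlier position `(-40, 80)`, and the helper name `first_pc_angle` are all assumptions chosen for illustration. It fits the first principal axis to 200 points scattered along a line of slope 0.5, then repeats the fit after adding one distant point placed perpendicular to that trend.

```python
import numpy as np

def first_pc_angle(points):
    """Angle (degrees, in [0, 180)) of the first principal axis of 2-D points."""
    centered = points - points.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dx, dy = vt[0]
    # The sign of a principal direction is arbitrary, so fold the angle into [0, 180).
    return np.degrees(np.arctan2(dy, dx)) % 180.0

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice
x = rng.normal(0.0, 1.0, 200)
y = 0.5 * x + rng.normal(0.0, 0.1, 200)  # trend of slope 0.5 plus small noise
cloud = np.column_stack([x, y])

# One hypothetical "black swan", far from the cloud and across the trend.
outlier = np.array([[-40.0, 80.0]])
with_outlier = np.vstack([cloud, outlier])

angle_clean = first_pc_angle(cloud)
angle_outlier = first_pc_angle(with_outlier)
print(f"trend without outlier: {angle_clean:.1f} deg")
print(f"trend with outlier:    {angle_outlier:.1f} deg")
```

In this setup the 200 ordinary points together contribute a few hundred units of squared deviation, while the single distant point contributes several thousand, which is why the principal axis swings round to point at it.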
``It is a pity that Nassim Taleb, the author of The Black Swan: The Impact of the Highly Improbable, is not participating in the congress,'' James commented. ``We could present him with a perfect example of his Black Swan.''
Still, we had no desire to live in Taleb’s Extremistan, and so we devised a simple workaround. With two additional lines of code, we disabled the influence of extremities. Namely, all data points lying outside a ±4σ range, where σ is the standard deviation of the data, were excluded from the PCA calculation.
We returned to our dull Mediocristan and enjoyed life again.
Disclaimer. We did not manipulate the data. We merely defined a reasonable range for calculating the PCA trend. It was, in fact, a very generous range; one would have to be exceptionally reckless to venture beyond it.
Nor did we completely ignore the extreme data point. After projection onto the line defined by the ``outlier-robust PCA,'' it exhibited rather civilized behaviour.
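A minimal sketch of such a workaround follows. It is an illustration under stated assumptions rather than the code from the PDF version: the ±4σ test is applied per coordinate around the mean, and the data, the seed, and the names `first_pc`, `keep`, and `scores` are hypothetical. The principal axis is fitted on the retained points only, and then every point, the outlier included, is projected onto it.

```python
import numpy as np

def first_pc(points):
    """Return the mean and the unit vector of the first principal axis."""
    mean = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[0]

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice
x = rng.normal(0.0, 1.0, 200)
y = 0.5 * x + rng.normal(0.0, 0.1, 200)
# Last row is a hypothetical outlier placed across the trend.
data = np.vstack([np.column_stack([x, y]), [[-40.0, 80.0]]])

# Assumed form of the "two additional lines": keep points within 4 sigma on every axis.
z = np.abs(data - data.mean(axis=0)) / data.std(axis=0)
keep = np.all(z <= 4.0, axis=1)

# Fit on the retained points only, then project ALL points onto that axis.
mean, axis = first_pc(data[keep])
scores = (data - mean) @ axis

angle = np.degrees(np.arctan2(axis[1], axis[0])) % 180.0
print(f"points kept: {keep.sum()} of {len(data)}")
print(f"robust trend: {angle:.1f} deg")
print(f"outlier score: {scores[-1]:.2f}, spread of kept scores: {scores[keep].std():.2f}")
```

In this particular configuration the outlier lies almost perpendicular to the recovered trend, so its projection lands inside the ordinary range of scores. One caveat of the sketch: the outlier itself inflates σ, and a single point can never sit more than (n-1)/√n population standard deviations from the mean of n points, so a 4σ rule of this form can only trigger in samples of at least 18 points.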
The Python code can be found in the PDF version of this document: Full Text with Codes.
If you have any comments or suggestions, please email pavel@temdm.com.
Posted February 8, 2026
Anonymus (February 27, 2026)
Pavel, do you really believe in all the black-swan garbage of this Gauss hater, Taleb?
Pavel (February 28, 2026)
Funnily enough, I put exactly the same question to James on the train.
"Me? A follower of Nassim Taleb?" he asked, surprised,
and then smiled sarcastically:
"I am. I especially like his amazing discovery that 'Not every distribution is a Gaussian and not every Gaussian is a distribution…'"