Showing posts with label Alex's Adventures in Numberland. Show all posts

Sunday, January 3, 2016

Regression-Correlation

In a mathematical context, regression to the mean is the statement that an extreme event is likely to be followed by a less extreme event.
**

Regression and correlation were major breakthroughs in scientific thought. For Isaac Newton and his peers, the universe obeyed deterministic laws of cause and effect. Everything that happened had a reason. Yet not all science is so reductive. In biology, for example, certain outcomes—such as the occurrence of lung cancer—can have multiple causes that mix together in a complicated way. Correlation provided a way to analyse the fuzzy relationships between linked sets of data. For example, not everyone who smokes will develop lung cancer, but by looking at the incidence of smoking and the incidence of lung cancer mathematicians can work out your chances of getting cancer if you do smoke. Likewise, not every child from a big class in school will perform less well than a child from a small class, yet class sizes do have an impact on exam results. Statistical analysis opened up whole new areas of research—in subjects from medicine to sociology, from psychology to economics. It allowed us to make use of information without knowing exact causes. Galton’s original insights helped make statistics a respectable field: ‘Some people hate the very name of statistics, but I find them full of beauty and interest,’ he wrote. ‘Whenever they are not brutalized, but delicately handled by the higher methods, and are warily interpreted, their power of dealing with complicated phenomena is extraordinary.’
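That ‘fuzzy relationship’ between two linked data sets can be condensed into a single number, the correlation coefficient. A quick sketch in Python—the class-size and exam-score figures below are invented purely for illustration, not real data:

```python
# A minimal Pearson correlation coefficient, computed from scratch.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Hypothetical figures: bigger classes, lower average exam scores.
class_size = [15, 18, 22, 25, 28, 30, 33]
avg_score = [78, 75, 74, 70, 69, 66, 64]

print(round(pearson(class_size, avg_score), 3))  # strongly negative
```

A value near −1 or +1 signals a tight relationship; a value near 0 signals none. Crucially, the coefficient says nothing about cause—it only measures how closely the two sets of figures move together.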

**
In 2002 the Nobel Prize in Economics was not won by an economist. It was won by the psychologist Daniel Kahneman, who had spent his career (much of it together with his colleague Amos Tversky) studying the cognitive factors behind decision-making. Kahneman has said that understanding regression to the mean led to his most satisfying ‘Eureka moment’. It was in the mid 1960s and Kahneman was giving a lecture to Israeli air-force flight instructors. He was telling them that praise is more effective than punishment for making cadets learn. On finishing his speech, one of the most experienced instructors stood up and told Kahneman that he was mistaken. The man said: ‘On many occasions I have praised flight cadets for clean execution of some aerobatic manœuvre, and in general when they try it again, they do worse. On the other hand, I have often screamed at cadets for bad execution, and in general they do better the next time. So please don’t tell us that reinforcement works and punishment does not, because the opposite is the case.’ At that moment, Kahneman said, the penny dropped. The flight instructor’s opinion that punishment is more effective than reward was based on a lack of understanding of regression to the mean. If a cadet does an extremely bad manœuvre, then of course he will do better next time—irrespective of whether the instructor admonishes or praises him. Likewise, if he does an extremely good one, he will probably follow that with something less good. ‘Because we tend to reward others when they do well and punish them when they do badly, and because there is regression to the mean, it is part of the human condition that we are statistically punished for rewarding others and rewarded for punishing them,’ Kahneman said.


Regression to the mean is not a complicated idea. All it says is that if the outcome of an event is determined at least in part by random factors, then an extreme event will probably be followed by one that is less extreme. Yet despite its simplicity, regression is not appreciated by most people. I would say, in fact, that regression is one of the least grasped but most useful mathematical concepts you need for a rational understanding of the world. A surprisingly large number of simple misconceptions about science and statistics boil down to a failure to take regression to the mean into account.
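You can watch regression to the mean happen in a simulation. In this sketch each person's score is a fixed ‘skill’ plus random luck (the model and all its numbers are arbitrary, chosen only to make the effect visible): the top scorers on one trial drift back towards the average on the retest, with no intervention of any kind.

```python
import random

random.seed(42)

N = 10000
skills = [random.gauss(100, 10) for _ in range(N)]  # fixed ability

def trial(skills):
    # performance = ability + random luck
    return [s + random.gauss(0, 10) for s in skills]

first = trial(skills)
second = trial(skills)

# Select the most extreme performers on the first trial...
top = sorted(range(N), key=lambda i: first[i], reverse=True)[:500]

mean_first = sum(first[i] for i in top) / len(top)
mean_second = sum(second[i] for i in top) / len(top)

# ...and the very same people score closer to the population mean next time.
print(round(mean_first, 1), round(mean_second, 1))
```

The top group stays above average on the second trial—their skill is real—but the lucky half of their first-trial score evaporates, so the group as a whole falls back.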

Take the example of speed cameras. If several accidents happen on the same stretch of road, this could be because there is one cause—for example, a gang of teenage pranksters have tied a wire across the road. Arrest the teenagers and the accidents will stop. Or there could be many random contributing factors—a mixture of adverse weather conditions, the shape of the road, the victory of the local football team or the decision of a local resident to walk his dog. Accidents are equivalent to an extreme event. And after an extreme event, the likelihood is of less extreme events occurring: the random factors will combine in such a way as to result in fewer accidents. Often speed cameras are installed at spots where there have been one or more serious accidents. Their purpose is to make drivers go more slowly so as to reduce the number of crashes. Yes, the number of accidents tends to be reduced after speed cameras have been introduced, but this might have very little to do with the speed camera. Because of regression to the mean, whether or not one is installed, after a run of accidents it is already likely that there will be fewer accidents at that spot. (This is not an argument against speed cameras, since they may indeed be effective. Rather it is an argument about the argument for speed cameras, which often displays a misuse of statistics.)
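The speed-camera effect can be reproduced with no cameras at all. In this toy model (invented for illustration, not real accident data) every site on the road network has the same small daily accident risk; pick out the worst ‘blackspots’ of one year and, on average, they improve the next year entirely by themselves.

```python
import random

random.seed(1)

SITES, DAYS, P = 2000, 365, 0.005  # hypothetical network, uniform daily risk

def year():
    # accident count at each site: 365 days of small independent risk
    return [sum(random.random() < P for _ in range(DAYS)) for _ in range(SITES)]

year1, year2 = year(), year()

# 'Install cameras' at last year's worst sites -- but actually do nothing.
worst = [i for i, a in enumerate(year1) if a >= 5]

before = sum(year1[i] for i in worst) / len(worst)
after = sum(year2[i] for i in worst) / len(worst)
print(round(before, 2), round(after, 2))  # accidents fall at the 'blackspots'
```

Judged by before-and-after counts at the worst spots, the imaginary cameras look like a triumph—which is exactly why such comparisons need a control group.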

Bell Curve and Normal Distribution





Half a century before Poincaré saw the bell curve in bread, another mathematician was seeing it wherever he looked. Adolphe Quételet has good claim to being the world’s most influential Belgian. (The fact that this is not a competitive field in no way diminishes his achievements.) A geometer and astronomer by training, he soon became sidetracked by a fascination with data—more specifically, with finding patterns in figures. In one of his early projects, Quételet examined French national crime statistics, which the government started publishing in 1825. Quételet noticed that the number of murders was pretty constant every year. Even the proportion of different types of murder weapon—whether it was perpetrated by a gun, a sword, a knife, a fist, and so on—stayed roughly the same. Nowadays this observation is unremarkable—indeed, the way we run our public institutions relies on an appreciation of, for example, crime rates, exam pass rates and accident rates, which we expect to be comparable every year. Yet Quételet was the first person to notice the quite amazing regularity of social phenomena when populations are taken as a whole. In any one year it was impossible to tell who might become a murderer. Yet in any one year it was possible to predict fairly accurately how many murders would occur. Quételet was troubled by the deep questions about personal responsibility this pattern raised and, by extension, about the ethics of punishment. If society was like a machine that produced a regular number of murderers, didn’t this indicate that murder was the fault of society and not the individual?

Quételet’s ideas transformed the use of the word statistics, whose original meaning had little to do with numbers. The word was used to describe general facts about the state; as in the type of information required by statesmen. Quételet turned statistics into a much wider discipline, one that was less about statecraft and more about the mathematics of collective behaviour. He could not have done this without advances in probability theory, which provided techniques to analyse the randomness in data. In Brussels in 1853 Quételet hosted the first international conference on statistics.

Quételet’s insights on collective behaviour reverberated in other sciences. If by looking at data from human populations you could detect reliable patterns, then it was only a small leap to realize that populations of, for example, atoms also behaved with predictable regularities. James Clerk Maxwell and Ludwig Boltzmann were indebted to Quételet’s statistical thinking when they came up with the kinetic theory of gases, which explains that the pressure of a gas is determined by the collisions of its molecules travelling randomly at different velocities. Though the velocity of any individual molecule cannot be known, the molecules overall behave in a predictable way. The origin of the kinetic theory of gases is an interesting exception to the general rule that developments in the social sciences are the result of advances in the natural sciences. In this case, knowledge flowed in the other direction.

**
The history of the bell curve, in fact, is a wonderful parable about the curious kinship between pure and applied scientists. Poincaré once received a letter from the French physicist Gabriel Lippmann, who brilliantly summed up why the normal distribution was so widely exalted: ‘Everybody believes in the [bell curve]: the experimenters because they think it can be proved by mathematics; and the mathematicians because they believe it has been established by observation.’ In science, as in so many other spheres, we often choose to see what serves our interests.
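One reason the bell curve turns up so widely is the central limit theorem: add together many small independent effects and the total is bell-shaped, whatever the individual effects look like. A quick sketch—the counts are arbitrary, chosen so the theoretical variance works out to a round number:

```python
import random

random.seed(0)

# Sum 48 independent uniform(-1, 1) 'effects', 20,000 times over.
samples = [sum(random.uniform(-1, 1) for _ in range(48)) for _ in range(20000)]

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
# Each uniform(-1, 1) has variance 1/3, so the sum of 48 has variance 16 (sd 4).
within = sum(1 for s in samples if abs(s - mean) < 4) / len(samples)

print(round(mean, 2), round(var, 1), round(within, 2))
```

The sample mean sits near 0, the variance near 16, and roughly 68 per cent of the samples fall within one standard deviation of the mean—the classic signature of the normal distribution, even though each ingredient was flat, not bell-shaped.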


Existence of God


Spirals


Algebra






Algebra is the generic term for the maths of equations, in which numbers and operations are written as symbols. The word itself has a curious history. In medieval Spain, barbershops displayed signs saying Algebrista y Sangrador. The phrase means ‘Bonesetter and Bloodletter’, two trades that used to be part of a barber’s repertoire. (This is why a barber’s pole has red and white stripes—the red symbolizes blood, and the white symbolizes the bandage.)

The root of algebrista is the Arabic al-jabr, which, in addition to referring to crude surgical techniques, also means restoration or reunion. In ninth-century Baghdad, Muhammad ibn Musa al-Khwarizmi wrote a maths primer entitled Hisab al-jabr w’al-muqabala, or Calculation by Restoration and Reduction.

**
Al-Khwarizmi wasn’t the first person to use restoration and reduction—these operations could also be found in Diophantus; but when Al-Khwarizmi’s book was translated into Latin, the al-jabr in the title became algebra. Al-Khwarizmi’s algebra book, together with another one he wrote on the Indian decimal system, became so widespread in Europe that his name was immortalized as a scientific term: Al-Khwarizmi became Alchoarismi, Algorismi and, eventually, algorithm.

**
Between the fifteenth and seventeenth centuries mathematical sentences moved from rhetorical to symbolic expression. Slowly, words were replaced with letters. Diophantus might have started letter symbolism with his introduction of a symbol for the unknown quantity, but the first person to effectively popularize the habit was François Viète in sixteenth-century France. Viète suggested that upper-case vowels—A, E, I, O, U—and Y be used for unknown quantities, and that the consonants B, C, D, etc., be used for known quantities.

Within a few decades of Viète’s death, René Descartes published his Discourse on Method. In it, he applied mathematical reasoning to human thought. He started by doubting all of his beliefs and, after stripping everything away, was left with only certainty that he existed. The argument that one cannot doubt one’s own existence, since the process of thinking requires the existence of a thinker, was summed up in the Discourse as I think, therefore I am. The statement is one of the most famous quotations of all time, and the book is considered a cornerstone of Western philosophy. Descartes had originally intended it as an introduction to three appendices of his other scientific works. One of them, La Géométrie, was equally a landmark in the history of maths.

In La Géométrie Descartes introduces what has become standard algebraic notation. It is the first book that looks like a modern maths book, full of as, bs and cs and xs, ys and zs. It was Descartes’s decision to use lower-case letters from the beginning of the alphabet for known quantities, and lower-case letters from the end of the alphabet for the unknowns. When the book was being printed, however, the printer started to run out of letters. He enquired if it mattered if x, y or z was used. Descartes replied not, so the printer chose to concentrate on x since it is used less frequently in French than y or z. As a result, x became fixed in maths—and the wider culture—as the symbol for the unknown quantity. That is why paranormal happenings are classified in the X-Files and why Wilhelm Röntgen came up with the term X-ray. Were it not for issues of limited printing stock, the Y-factor could have become a phrase to describe intangible star quality and the African-American political leader might have gone by the name Malcolm Z.









With Descartes’ symbology, all traces of rhetorical expression had been expunged.


**
In 1621, a Latin translation of Diophantus’s masterpiece Arithmetica was published in France. The new edition rekindled interest in ancient problem-solving techniques, which, combined with better numerical and symbolic notation, ushered in a new era of mathematical thought. Less convoluted notation allowed greater clarity in describing problems. Pierre de Fermat, a civil servant and judge living in Toulouse, was an enthusiastic amateur mathematician who filled his own copy of Arithmetica with numerical musings. Next to a section dealing with Pythagorean triples—any set of natural numbers a, b and c such that a² + b² = c², for example 3, 4 and 5—Fermat scribbled some notes in the margin. He had noticed that it was impossible to find values for a, b and c such that a³ + b³ = c³. He was also unable to find values for a, b and c such that a⁴ + b⁴ = c⁴. Fermat wrote in his Arithmetica that for any number n greater than 2, there were no possible values a, b and c that satisfied the equation aⁿ + bⁿ = cⁿ. ‘I have a truly marvellous demonstration of this proposition which this margin is too narrow to contain,’ he wrote.
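Fermat's claim is easy to probe by brute force for small numbers, which is part of what makes the theorem so tantalizing. A sketch like the following (the search bound is arbitrary) finds plenty of solutions for n = 2 and none at all for n = 3:

```python
# Exhaustive search for natural-number solutions of a^n + b^n = c^n
# with a <= b <= c below a small bound.
def solutions(n, limit):
    return [(a, b, c)
            for a in range(1, limit)
            for b in range(a, limit)
            for c in range(b, limit)
            if a ** n + b ** n == c ** n]

print(len(solutions(2, 50)))  # many Pythagorean triples, e.g. (3, 4, 5)
print(len(solutions(3, 50)))  # none -- as Fermat claimed and Wiles proved
```

Of course, no finite search can prove the theorem—it only fails to find a counterexample—which is precisely the gap that took mathematicians three and a half centuries to close.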

Fermat never produced a proof—marvellous or otherwise—of his proposition even when unconstrained by narrow margins. His jottings in Arithmetica may have been an indication that he had a proof, or he may have believed he had a proof, or he may have been trying to be provocative. In any case, his cheeky sentence was fantastic bait to generations of mathematicians. The proposition became known as Fermat’s Last Theorem and was the most famous unsolved problem in maths until the Briton Andrew Wiles cracked it in 1995. Algebra can be very humbling in this way—ease in stating a problem has no correlation with ease in solving it. Wiles’s proof is so complicated that it is probably understood by no more than a couple of hundred people.

Tuesday, December 29, 2015

Rubik’s Cube


Every week or so, somewhere around the world now hosts an official speedcubing tournament. To make sure that the starting position is sufficiently difficult in these competitions, the regulations stipulate that cubes must be scrambled by a random sequence of moves generated by a computer program. The current record of 7.08 seconds was set in 2008 by Erik Akkersdijk, a 19-year-old Dutch student. Akkersdijk also holds the record for the 2 × 2 × 2 cube (0.96secs), the 4 × 4 × 4 cube (40.05secs) and the 5 × 5 × 5 cube (1min 16.21 secs). He can also solve the Rubik’s Cube with his feet—his time of 51.36secs is fourth-best in the world. However, Akkersdijk really must improve his performance at solving the cube one-handed (33rd in the world) and blindfolded (43rd). The rules for blindfolded solving are as follows: the timer starts when the cube is shown to the competitor. He must then study it, and put on a blindfold. When he thinks it is solved he tells the judge to stop the stopwatch. The current record of 48.05secs was set by Ville Seppänen of Finland in 2008. Other speedcubing disciplines include solving the Rubik’s Cube on a rollercoaster, under water, with chopsticks, while idling on a unicycle, and during freefall.
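The official competition scramblers generate a genuinely random cube state; the toy sketch below (not the real program—just an illustration of the idea) merely strings together random moves in standard face notation, taking care never to turn the same face twice in a row.

```python
import random

random.seed(3)

FACES = "UDLRFB"            # Up, Down, Left, Right, Front, Back
SUFFIXES = ["", "'", "2"]   # quarter turn, inverse turn, half turn

def scramble(length=25):
    moves, last = [], None
    while len(moves) < length:
        face = random.choice(FACES)
        if face == last:        # skip consecutive turns of the same face
            continue
        last = face
        moves.append(face + random.choice(SUFFIXES))
    return " ".join(moves)

print(scramble())
```

A sequence of 25 such moves is comfortably enough to reach a hard starting position, though the competition rules achieve something stronger: every legal cube state is equally likely.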



Number

The more I pushed Pica for facts and figures, the more reluctant he was to provide them. I became exasperated. It was unclear if underlying his responses was French intransigence, academic pedantry or simply a general contrariness. I stopped my line of questioning and we moved on to other subjects. It was only when, a few hours later, we talked about what it was like to come home after so long in the middle of nowhere that he opened up. ‘When I come back from Amazonia I lose sense of time and sense of number, and perhaps sense of space,’ he said. He forgets appointments. He is disoriented by simple directions. ‘I have extreme difficulty adjusting to Paris again, with its angles and straight lines.’ Pica’s inability to give me quantitative data was part of his culture shock. He had spent so long with people who can barely count that he had lost the ability to describe the world in terms of numbers.

**
It is Pica’s belief that understanding quantities approximately in terms of estimating ratios is a universal human intuition. In fact, humans who do not have numbers—like Indians and young children—have no alternative but to see the world in this way. By contrast, understanding quantities in terms of exact numbers is not a universal intuition; it is a product of culture. The precedence of approximations and ratios over exact numbers, Pica suggests, is due to the fact that ratios are much more important for survival in the wild than the ability to count. Faced with a group of spear-wielding adversaries, we needed to know instantly whether there were more of them than us. When we saw two trees we needed to know instantly which had more fruit hanging from it. In neither case was it necessary to enumerate every enemy or every fruit individually. The crucial thing was to be able to make quick estimates of the relevant amounts and compare them, in other words to make approximations and judge their ratios.
**
There are tribes whose only number words are ‘one’, ‘two’ and ‘many’. The Munduruku, who go all the way up to five, are a relatively sophisticated bunch.

Numbers are so prevalent in our lives that it is hard to imagine how people survive without them. Yet while Pierre Pica stayed with the Munduruku he easily slipped into a numberless existence. He slept in a hammock. He went hunting and ate tapir, armadillo and wild boar. He told the time from the position of the sun. If it rained, he stayed in; if it was sunny, he went out. There was never any need to count.

**
In 1992, Karen Wynn, at the University of Arizona, sat a five-month-old baby in front of a small stage. An adult placed a Mickey Mouse doll on the stage and then put up a screen to hide it. The adult then placed a second Mickey Mouse doll behind the screen, and the screen was then pulled away to reveal two dolls. Wynn then repeated the experiment, this time with the screen pulling away to reveal a wrong number of dolls: just one doll or three of them. When there were one or three dolls, the baby stared at the stage for longer than when the answer was two, indicating that the infant was surprised when the arithmetic was wrong. Babies understood, argued Wynn, that one doll plus one doll equals two dolls.





The Mickey experiment was later performed with the Sesame Street puppets Elmo and Ernie. Elmo was placed on the stage. The screen came down. Then another Elmo was placed behind the screen. The screen was taken away. Sometimes two Elmos were revealed, sometimes an Elmo and an Ernie together and sometimes only one Elmo or only one Ernie. The babies stared for longer when just one puppet was revealed, rather than when two of the wrong puppets were revealed. In other words, the arithmetical impossibility of 1 + 1 = 1 was much more disturbing than the metamorphosis of Elmos into Ernies. Babies’ knowledge of mathematical laws seems much more deeply rooted than their knowledge of physical ones.


The Swiss psychologist Jean Piaget (1896–1980) argued that babies build up an understanding of numbers slowly, through experience, so there was no point in teaching arithmetic to children younger than six or seven. This influenced generations of educators, who often preferred to let primary-age pupils play around with blocks in lessons rather than introduce them to formal mathematics. Now Piaget’s views are considered outdated. Pupils come face to face with Arabic numerals and sums as soon as they get to school.

Dot experiments are also the cornerstone of research into adult numerical cognition. A classic experiment is to show a person dots on a screen and ask how many dots he or she sees. When there are one, two or three dots, the response comes almost instantly. When there are four dots, the response is significantly slower, and with five slower still.
So what! you might say. Well, this probably explains why in several cultures the numerals for 1, 2 and 3 have been one, two and three lines, while the number for 4 is not four lines. When there are three lines or fewer we can tell the number of lines straight away, but when there are four of them our brain has to work too hard and a different symbol is needed.




Indian-Arabic Numerals

Owing to its ease of use, the Indian method spread to the Middle East, where it was embraced by the Islamic world, which accounts for why the numerals have come to be known, erroneously, as Arabic. From there they were brought to Europe by an enterprising Italian, Leonardo Fibonacci, his last name meaning ‘son of Bonacci’. Fibonacci was first exposed to the Indian numerals while growing up in what is now the Algerian city of Béjaïa, where his father was a Pisan customs official. Realizing that they were much better than Roman ones, Fibonacci wrote a book about the decimal place-value system called the Liber Abaci, published in 1202. It opens with the happy news:
The nine Indian figures are:
9     8     7     6     5     4     3     2     1
With these nine figures, and with the sign 0, which the Arabs call zephyr, any number whatsoever is written, as will be demonstrated.

More than any other book, the Liber Abaci introduced the Indian system to the West. In it Fibonacci demonstrated ways to do arithmetic that were quicker, easier and more elegant than the methods the Europeans had been using. Long multiplication and long division might seem dreary to us now, but at the beginning of the thirteenth century they were the latest technological novelty.

Not everyone, however, was convinced to switch immediately. Professional abacus operators felt threatened by the easier counting method, for one thing. (They would have been the first to realize that the decimal system was essentially the abacus with written symbols.) On top of that, Fibonacci’s book appeared during the period of the Crusades against Islam, and the clergy was suspicious of anything with Arab connotations. Some, in fact, considered the new arithmetic the Devil’s work precisely because it was so ingenious. A fear of Arabic numerals is revealed through the etymology of some modern words. From zephyr came ‘zero’ but also the Portuguese word chifre, which means ‘[Devil] horns’, and the English word cipher, meaning ‘code’. It has been argued that this was because using numbers with a zephyr, or zero, was done in hiding, against the wishes of the Church.

In 1299 Florence banned Arabic numerals because, it was said, the slinky symbols were easier to falsify than solid Roman Vs and Is. A 0 could easily become a 6 or 9, and a 1 morph seamlessly into a 7. As a consequence, it was only around the end of the fifteenth century that Roman numerals were finally superseded, though negative numbers took much longer to catch on in Europe, gaining acceptance only in the seventeenth century, because they were said to be used in calculations of illegal money-lending, or usury, which was associated with blasphemy. In places where no calculation is needed, however, such as legal documents, chapters in books and dates at the end of BBC programmes, Roman numerals still live on.

With the adoption of Arabic numbers, arithmetic joined geometry to become part of mathematics in earnest, having previously been more of a tool used by shopkeepers, and the new system helped open the door to the scientific revolution.



Non-Euclidean Geometry


Gauss’s final contribution to research on the fifth postulate came shortly before he died, when, already seriously ill, he set the title for the probationary lecture of one of his brightest students, 27-year-old Bernhard Riemann: ‘On the hypotheses that lie at the foundations of geometry’. The cripplingly shy son of a Lutheran pastor, Riemann at first had some kind of breakdown struggling with what he would say, yet his solution to the problem would revolutionize maths. It would later revolutionize physics too, since his innovations were required by Einstein to formulate his general theory of relativity.

Riemann’s lecture, given in 1854, consolidated the paradigm shift in our understanding of geometry resulting from the fall of the parallel postulate by establishing an all-embracing theory that included the Euclidean and non-Euclidean within it. The key concept behind Riemann’s theory was the curvature of space. When a surface has zero curvature, it is flat, or Euclidean, and the results of The Elements all hold. When a surface has positive or negative curvature, it is curved, or non-Euclidean, and the results of The Elements do not hold.

The simplest way to understand curvature, continued Riemann, is by considering the behaviour of triangles. On a surface with zero curvature, the angles of a triangle add up to 180 degrees. On a surface with positive curvature, the angles of a triangle add up to more than 180 degrees. On a surface with negative curvature, the angles of a triangle add up to less than 180 degrees.
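Riemann's triangle test is easy to run on the most familiar positively curved surface, a sphere. The sketch below takes the ‘octant’ triangle whose corners sit on the three coordinate axes of a unit sphere and adds up its angles:

```python
import math

def angle_at(a, b, c):
    # Angle at vertex a between the great-circle arcs a->b and a->c.
    def tangent(p, q):
        # Direction of the great circle from p towards q, tangent at p.
        dot = sum(pi * qi for pi, qi in zip(p, q))
        t = [qi - dot * pi for pi, qi in zip(p, q)]
        norm = math.sqrt(sum(x * x for x in t))
        return [x / norm for x in t]
    u, v = tangent(a, b), tangent(a, c)
    return math.acos(sum(ui * vi for ui, vi in zip(u, v)))

# The 'octant' triangle: one vertex on each coordinate axis.
A, B, C = (1, 0, 0), (0, 1, 0), (0, 0, 1)
total = math.degrees(angle_at(A, B, C) + angle_at(B, C, A) + angle_at(C, A, B))
print(round(total))  # 270 -- ninety degrees more than a flat triangle
```

Each of the three angles is a right angle, so the sum is 270 degrees—a full 90 degrees more than Euclid allows, which is the sphere's positive curvature making itself felt.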

A surface with negative curvature is called hyperbolic. So, the surface of a Pringle is hyperbolic. The Pringle, however, is only an hors d’oeuvre in understanding hyperbolic geometry since it has an edge. Show a mathematician an edge and he or she will want to go over it.



Vedic Mathematics


In India in those days [shortly after Independence] there was a strong feeling that we needed to get back [from the British] what we lost by hook or by crook. It was mostly in terms of artefacts, stuff that the British might have taken away. Because we lost such a lot, I thought we should have the equivalent back of what we lost.

‘Vedic Mathematics is a misguided attempt to claim arithmetic back for India.’

Randomness and iPod


The human brain finds it incredibly difficult, if not impossible, to fake randomness. And when we are presented with randomness, we often interpret it as non-random. For example, the shuffle feature on an iPod plays songs in a random order. But when Apple launched the feature, customers complained that it favoured certain bands because often tracks from the same band were played one after another. The listeners were guilty of the gambler’s fallacy. If the iPod shuffle were truly random, then each new song choice would be independent of the previous choice. As the coin-flipping experiment shows, counterintuitively long streaks are the norm. If songs are chosen randomly, it is very possible, if not entirely likely, that there will be clusters of songs by the same artist. Apple CEO Steve Jobs was totally serious when he said, in response to the outcry: ‘We’re making [the shuffle] less random to make it feel more random.’
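The complaint is easy to reproduce in simulation. With an invented library of ten artists and ten tracks each, truly independent random picks still repeat an artist back to back roughly one transition in ten:

```python
import random

random.seed(7)

# Hypothetical library: 10 artists with 10 tracks each, 100 tracks in all.
artists = [a for a in range(10) for _ in range(10)]

# A 'shuffle' that picks each play independently at random.
plays = [random.choice(artists) for _ in range(1000)]

# Count back-to-back plays by the same artist.
repeats = sum(plays[i] == plays[i + 1] for i in range(len(plays) - 1))
print(repeats)  # roughly 100 of the 999 transitions repeat the artist
```

With ten artists the chance that the next pick matches the current artist is one in ten, so around a hundred same-artist runs in a thousand plays is exactly what true randomness looks like—even though, to a listener, it feels anything but random.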