Our world is changing fast, and much of that change is driven by machines and the technology behind them. Who would have thought that one day we would be able to talk to, and interact with, machines?

Today, algorithms enable computers to communicate with us. They power self-driving cars, write news stories and even help catch criminals. Did all the impact machine learning is making on society begin only recently? No, it didn't. Let's go back in time and unravel the past of machine learning.

Machine learning has a long past but a short formal history of roughly 80 years. Its roots lie in mathematics and statistics.

The history of machine learning dates back to the 18th century, when Thomas Bayes showed how probabilistic reasoning could be applied to practical problems. Later, Pierre-Simon Laplace generalized and popularized Bayes' theorem in his 1812 work on probability.
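In modern notation, the rule that grew out of Bayes' and Laplace's work is usually written as:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

That is, the probability of a hypothesis A after seeing evidence B is the prior probability of A, reweighted by how well A explains B.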

In 1805, Adrien-Marie Legendre published the now-famous least squares method for fitting data, a technique still widely used in mathematics, economics, physics and statistics. Later, in 1913, Andrey Markov developed analysis techniques for chains of dependent events, which were subsequently named Markov chains. It seems most scientists of the era hoped their names would outlive them, as many discoveries were taken seriously and named after their founders.
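To give a flavor of Legendre's idea, here is a minimal sketch of a least squares line fit in plain Python. The data points are made up for illustration; the closed-form formulas below solve for the slope and intercept that minimize the sum of squared residuals.

```python
# Illustrative data, roughly following y = 2x + 1 with a little noise
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 9.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least squares solution for a straight line y = slope*x + intercept
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
```

Running this recovers a slope close to 2 and an intercept close to 1, the line the noisy data was drawn from.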

The late 1940s were fundamental in the history of machine learning, as engineers began building stored-program computers that kept both instructions and data in memory. The Manchester Small-Scale Experimental Machine (1948), EDSAC (1949), the Manchester Mark 1 (1949) and EDVAC (1951) were among the first-generation machines that began the modern computing revolution. In those days, what we could call a computer filled a whole warehouse, occupying an enormous amount of space. Quite a lot of discoveries were recorded in that period and built upon in subsequent years.

The history of machine learning cannot be well told without mentioning Alan Turing, "the machine thinker". In 1950 he published his famous paper Computing Machinery and Intelligence, addressing a question we still grapple with today. The paper was a groundbreaking vision of how science could develop artificial intelligence, and remember: machine learning is a subset of artificial intelligence.

Computing Machinery and Intelligence proposed the imitation game as a test of whether a computer could be considered intelligent. In the test, a person converses with both a human and a computer through typed messages and tries to tell which is which. The game probes whether a machine can use language as convincingly as a person, though passing it does not mean the computer is more intelligent than a human.

1951 saw Marvin Minsky and Dean Edmonds experiment with neural networks. These two built what many consider the first artificial neural network machine, designed to imitate the way the brain works; essentially, an artificial neural network is a computer-based simulation of an organic brain. Their machine, SNARC, was used in an experiment to search a maze.

SNARC, the Stochastic Neural Analog Reinforcement Calculator, was built on the principle of connectionism: the idea that the mind is made up of simple units from which intelligence arises. Following this principle, SNARC learned to search the maze much as a rat would in a laboratory experiment. Minsky did not stop there; he later moved to MIT, where his continued research produced further breakthroughs that remain useful today.


The 1950s and 1960s brought great and promising research discoveries in machine learning and artificial intelligence. In later years, however, those promises failed to materialize, and research in the field began to lose funding and support.

Machine learning fell behind in research funding partly because results failed to live up to the expectations researchers such as Minsky had raised. A multipurpose robot, Shakey, was built, and astoundingly, Shakey could make decisions simply by reasoning about its surroundings. But the robot was very slow: it could not process information quickly enough to act in real time. Shakey worked by building a spatial map of whatever it saw before deciding how to act. If it moved a few meters, it had to stop to process and update the spatial map it was working with, and it could pause for hours to plan its next move, which was rather frustrating to watch.

Even though there was not much funding during this time, research took a turnaround with the rise of expert systems in the 1980s, in which established techniques were adapted to new situations. Artificial intelligence also took on several new names, including machine learning, computational intelligence and informatics.

Expert systems seemed expert at almost everything, and public interest and hope in artificial intelligence sprang up again. The renewed momentum culminated in 1997, when Deep Blue, an IBM computer, beat world chess champion Garry Kasparov. To win, Deep Blue relied on brute-force computing power and special-purpose chess chips. The machine had also been tuned by evaluating many old chess games to recognize likely routes to checkmate, and it could reportedly look up to 20 moves ahead of Garry. This was fantastic, and many people began to believe in the future of machine learning.

The Roomba vacuum, the first household robot, was created in 2002. Thanks to the engineering behind it, more than ten million units have been sold worldwide. Roomba was created by iRobot, a spin-off company co-founded by Rodney Brooks.

The robot vacuum used simple behavior-generating systems, far simpler than the algorithms that made Shakey take an eternity to carry out a basic task. With minimal processing power, Roomba was nonetheless capable enough to clean a home efficiently. iRobot's intervention was an eye-opener: many autonomous robots designed for specific tasks followed, and people gladly bought the idea.

Development continued within the field until neural networks resurged around 2006. The back-propagation algorithm had initially been introduced in the 1970s but became widely adopted in 1986, when it proved an effective way to train artificial neural networks. Interest faded again until Geoff Hinton and others showed that, with the power of modern processors, much deeper networks could be trained effectively. The approach was rebranded as deep learning, which today is seen as the mainstay of machine learning.
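The core of back-propagation is the chain rule: errors at the output are propagated backwards to compute how each weight should change. A minimal sketch, using a single sigmoid neuron on made-up data (not the setup of the 1986 paper), looks like this:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy data: target is 1 when x > 0, else 0
data = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]

w, b, lr = 0.1, 0.0, 0.5  # weight, bias, learning rate

def loss():
    # Mean squared error over the toy dataset
    return sum((sigmoid(w * x + b) - t) ** 2 for x, t in data) / len(data)

loss_before = loss()
for _ in range(200):
    for x, t in data:
        y = sigmoid(w * x + b)
        # Chain rule: dL/dz = dL/dy * dy/dz, then dz/dw = x and dz/db = 1
        grad = 2 * (y - t) * y * (1 - y)
        w -= lr * grad * x
        b -= lr * grad
loss_after = loss()
```

After training, the loss has dropped and the neuron fires for positive inputs; in a multi-layer network the same chain-rule step is simply applied layer by layer, from the output back to the input.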

Today, a great deal of progress has been made in machine learning. For example, speech recognition appeared in Google's iPhone app in 2008, comfortably beating the roughly 80% accuracy that systems had been stuck at for decades; Google claimed about 92% accuracy at the time, and accuracy has only improved since. Many more such systems have been created and are proving remarkably capable today.
