Artificial Intelligence history road map

This chapter traces the history of neural networks, machine learning, and deep learning. Knowing where these ideas came from makes it easier to understand and appreciate the current state of the field.

Neural Networks (1950s – 1970s)

A neural network, properly known as an artificial neural network (ANN), is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information. Its key element is the structure of the information processing system: a large number of interconnected processing elements (neurons) that work together to solve the problem at hand. Like human beings, ANNs learn by example. Through a learning process, an ANN is configured for a particular application, such as data classification or pattern recognition. Just as biological systems learn by adjusting the synaptic connections between neurons, an ANN learns by adjusting the connections between its artificial neurons (Psych, n.d.).

Although ANNs may appear to be a recent development, their history predates the advent of modern computers. The neurophysiologist Warren McCulloch, together with the logician Walter Pitts, developed the first model of an artificial neuron in 1943.

In the late 1940s, Hebb proposed the Hebbian learning model, which was based on neural plasticity. In 1954, Farley and Clark used "calculators" (what we would now call computational machines) to simulate a Hebbian network. By 1956, Rochester, Holland, Haibt, and Duda had built other neural network computational machines (Stanford, n.d.).

Rosenblatt created the perceptron, an algorithm designed for pattern recognition, in 1958. Using mathematical notation, he also described circuitry that went beyond the basic perceptron and that the neural networks of the time were unable to process.

In 1959, Hubel and Wiesel (both later Nobel laureates) proposed a biological model based on their discovery of two types of cells in the primary visual cortex: simple cells and complex cells (Andreykurenkov, 2015).

The first functional network with multiple layers, however, was not published until 1965, by Ivakhnenko and Lapa. Their approach later became known as the Group Method of Data Handling.

Neural network research stagnated after 1969, when Minsky and Papert published machine learning research that identified two key problems with the computational machines used to process neural networks. First, the basic perceptron could not process the exclusive-or circuit. Second, computers were not powerful enough to handle the work required by large neural networks. As a result, neural network research slowed significantly until computers with greater processing power were developed (Stanford, n.d.).
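The exclusive-or limitation is easy to reproduce. The sketch below is a minimal illustration, not Rosenblatt's or Minsky and Papert's original formulation: it trains a single-layer perceptron with the classic error-correction rule on the AND function and then on the exclusive-or (XOR) function. The function name `train_perceptron`, the learning rate, and the epoch limit are all assumptions chosen for illustration.

```python
def train_perceptron(samples, epochs=25, lr=1.0):
    """Train a two-input, single-layer perceptron.

    samples is a list of ((x1, x2), target) pairs with targets 0 or 1.
    Returns (weights, bias, converged), where converged is True only if
    a full pass over the data produced no classification errors.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            # Step activation: fire only if the weighted sum is positive.
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - output
            if delta != 0:
                # Error-correction rule: nudge weights toward the target.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

_, _, and_converged = train_perceptron(AND)
_, _, xor_converged = train_perceptron(XOR)

print("AND learned:", and_converged)  # True
print("XOR learned:", xor_converged)  # False
```

AND is linearly separable, so a single line can split its outputs and training settles after a few passes. No single line separates the XOR outputs, so the perceptron keeps making errors no matter how many epochs it is given.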

The criticisms of ANNs that Minsky and Papert raised in 1969 were widely accepted at the time without further analysis. Even so, the field has since enjoyed a resurgence of interest, and investors have begun pouring money into it because of the promise it holds.

Machine Learning (1980s – 2010s)

Machine learning is a method of data analysis that automates the building of analytical models. As a branch of AI, it rests on the idea that systems can learn from data, recognize patterns, and make decisions with minimal human involvement. The basic idea is that computers should be able to act without being explicitly programmed. Machine learning has given us effective web search, practical speech recognition, and self-driving cars, and it has helped scientists gain deeper insight into the human genome.

The machine learning of today is not the machine learning of the past. The field's roots go back as far as the late 1950s, when pattern recognition was taking shape and the idea that machines could learn without being explicitly programmed was still only a theory. The term "machine learning" was coined in 1959 by Arthur Samuel, who had already made his name as a pioneer of artificial intelligence and computer gaming. Machine learning grew out of AI. At the time, academic researchers were already interested in whether machines could learn from data. They approached the problem with various symbolic methods, as well as with what were then called neural networks, many of which would later be understood as reinventions of linear models from statistics. Probabilistic reasoning was also used, particularly in automated medical diagnosis (Marr, 2016).

By 1980, statistics had fallen out of favor because expert systems had come to dominate artificial intelligence. The rift between AI and machine learning was at its widest during this period, since so much emphasis was placed on logical, knowledge-based approaches. Both AI and computer science had abandoned neural network research. Like work on ANNs, machine learning research continued outside the AI and CS fields as part of connectionism, pursued by researchers such as Hinton, Rumelhart, and Hopfield.

In the 1990s, machine learning reorganized itself as an independent field and took huge steps forward. Whereas in the 1980s its goal had been to achieve artificial intelligence, the field now focused on solvable, practical problems. It moved away from the symbolic approaches it had inherited from AI and concentrated on methods and models drawn from probability theory and statistics. The emergence of the Internet, and in particular the digitization of information, greatly benefited the field. Scientists could now write computer programs that analyzed large amounts of data and drew conclusions (learned) from the analysis (Provalisresearch, 2017).

By the 2000s, recurrent neural networks (RNNs) and support vector machines (SVMs) had grown in popularity, and researchers were also paying more attention to unsupervised machine learning methods. This is the period in which machine learning established itself, thanks to support vector clustering and other kernel methods (Marr, 2016).

In the 2010s, deep learning came to be seen as feasible, and machine learning became an essential component of many applications and software services. This period also saw the launch of Kaggle, a website that hosts machine learning competitions.

Deep Learning (Present day)

When machine learning established itself as a separate discipline, it gave up its original goal of achieving artificial intelligence. Deep learning aims to bring machine learning back to that objective. It is a subfield of machine learning concerned with algorithms inspired by the brain's structure and function, built from ANNs with many layers. The development of GPUs and falling hardware prices over the past few years have been instrumental in its growth.

Deep learning includes a number of architectures, such as deep belief networks, deep neural networks, and recurrent neural networks. These have been applied to speech recognition, computer vision, audio recognition, natural language processing, drug design, bioinformatics, machine translation, material inspection, medical image analysis, social network filtering, and board game programs. In these areas, deep learning has produced results comparable to those of human experts, and in some cases superior to them.

Deep learning has made major strides in recent years. In 2012, Dahl and his team won the Merck Molecular Activity Challenge, which involved predicting the biomolecular target of a drug with the aid of multi-task deep neural networks. In 2014, Hochreiter's group used deep learning to detect the toxic effects of environmental chemicals in drugs, household products, and nutrients, and won the Tox21 Data Challenge (Fogg, 2017).

The years 2011 and 2012 also brought significant progress in image and object recognition. Although convolutional neural networks (CNNs) trained by backpropagation had been around for quite a while, progress in computer vision had to wait for fast implementations of CNNs with max-pooling on graphics processing units (GPUs). In 2011, this approach achieved superhuman performance in a visual pattern recognition contest. It also won the 2011 ICDAR Chinese handwriting contest and the 2012 ISBI image segmentation contest. Before these contests, CNNs had not played a major role at computer vision conferences.
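To make the max-pooling step concrete, here is a minimal sketch in plain Python. It runs on the CPU and is not the GPU implementation used in the contests above; the function name `max_pool_2x2` and the sample feature map are assumptions chosen for illustration. Max-pooling shrinks a feature map by keeping only the largest value in each small window, here non-overlapping 2x2 blocks.

```python
def max_pool_2x2(feature_map):
    """Downsample a 2-D list by keeping the maximum of each
    non-overlapping 2x2 block. A trailing odd row or column is dropped."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Hypothetical 4x4 feature map, as might come out of a convolution layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 2, 4],
]

pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [6, 7]]
```

Each block is processed independently of the others, which is one reason operations like this map well onto the parallel hardware of a GPU.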

In June 2012, Ciresan et al. presented a paper at the CVPR conference demonstrating how max-pooling CNNs on GPUs could dramatically improve vision benchmark records. Later in 2012, their system won a contest on cancer detection through the analysis of large medical images, and in 2013 it also won the MICCAI Grand Challenge. A similar system by Krizhevsky et al. won the ImageNet contest in October 2012. In 2013 and 2014, the error rate on the ImageNet task was reduced further. These improvements were publicized by the Wolfram Image Identification project.

The task was then made more challenging: contestants are now expected to generate captions for images. Many experts believe that the October 2012 ImageNet victory marked the point at which deep learning took a whole new direction, one that has since transformed the artificial intelligence industry.
