{"id":3718,"date":"2017-11-06T08:12:22","date_gmt":"2017-11-06T08:12:22","guid":{"rendered":"https:\/\/datakeen.co\/8-machine-learning-algorithms-explained-in-human-language\/"},"modified":"2021-11-18T08:19:50","modified_gmt":"2021-11-18T08:19:50","slug":"8-machine-learning-algorithms-explained-in-human-language","status":"publish","type":"post","link":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/","title":{"rendered":"8 Machine Learning Algorithms explained in Human language"},"content":{"rendered":"<p><a href=\"https:\/\/www.datakeen.co\/wp-content\/uploads\/2021\/11\/andy-kelly-0E_vhMVqL9g-unsplash.jpg\"><img decoding=\"async\" class=\"aligncenter wp-image-4020 size-large\" src=\"https:\/\/www.datakeen.co\/wp-content\/uploads\/2021\/11\/andy-kelly-0E_vhMVqL9g-unsplash-1024x683.jpg\" alt=\"\" width=\"1024\" height=\"683\" srcset=\"https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash-1024x683.jpg 1024w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash-300x200.jpg 300w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash-768x512.jpg 768w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash-1536x1024.jpg 1536w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash-1080x720.jpg 1080w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash-1280x853.jpg 1280w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash-980x653.jpg 980w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash-480x320.jpg 480w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash.jpg 1920w\" sizes=\"(max-width: 
1024px) 100vw, 1024px\" \/>Little schoolgirl with Pepper Robot. Credits : Andy Kelly (Unsplash)<\/a><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">What we call &#8220;Machine Learning&#8221; is none other than the meeting of statistics and the immense computing power available today (in terms of memory, CPUs, GPUs). This domain has become increasingly visible and important with the digital revolution of companies, which produces massive data of different forms and types at ever-increasing rates: Big Data. On a purely mathematical level, most of the algorithms used today are already several decades old. In this article I will explain the underlying logic of 8 machine learning algorithms in the simplest possible terms.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-size: 18px; font-family: arial, sans-serif;\">I. Some global concepts before describing the algorithms<\/span><\/span><\/p>\n<p><span style=\"font-size: medium;\"><strong><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\">1. Classification and Prediction \/ Regression<\/span><\/span><\/strong><\/span><\/p>\n<p><span style=\"font-size: medium;\"><strong><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\">Classification<\/span><\/span><\/strong><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Assigning a class \/ category to each of the observations in a dataset is called classification. It is done <em>a posteriori<\/em>, once the data has been collected. 
<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Example: classifying consumers&#8217; reasons for visiting a store in order to send them a personalized campaign.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><strong>Prediction<\/strong><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">A prediction is made on a new observation. When the value to predict is a numerical (continuous) variable, we speak of regression.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Example: predicting a heart attack based on data from an electrocardiogram.<\/span><\/span><\/span><\/p>\n<p><b><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">2. Supervised and unsupervised learning<\/span><\/span><\/span><\/b><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>Supervised<\/b><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">You already have labels on historical data and want to classify new data according to these labels. The number of classes is known. 
<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Example: in botany, you have made measurements (length of the stem, of the petals, &#8230;) on 100 plants of 3 different species. Each measurement is labeled with the species of the plant. You want to build a model that will automatically tell which species a new plant belongs to from the same measurements.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>Unsupervised<\/b><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">On the contrary, in unsupervised learning you have no labels and no predefined classes. You want to identify common patterns in order to form homogeneous groups from your observations.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>Examples:<\/b> You want to segment your customers based on their browsing history on your website, but you have no predefined groups and are taking an exploratory approach to discover what they might have in common. In this case, a clustering algorithm is appropriate.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Some neural network algorithms can also differentiate between human and animal images without prior labeling. 
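The clustering idea described above can be sketched with a toy k-means in plain Python. Everything here is illustrative (the data points, the function name, the choice of k), not taken from the article:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Toy k-means: group 2-D points into k clusters (illustration only)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # start from k random points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            clusters[i].append(p)
        # Update step: move each center to the mean of its cluster.
        for i, cl in enumerate(clusters):
            if cl:
                centers[i] = (sum(p[0] for p in cl) / len(cl),
                              sum(p[1] for p in cl) / len(cl))
    return centers, clusters

# two visibly distinct groups of "browsing behaviour" points
data = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = kmeans(data, k=2)
```

No labels are ever supplied: the algorithm discovers the two groups purely from the geometry of the points, which is exactly the exploratory setting described above.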
<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: 18px;\">II. Machine Learning Algorithms<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We will describe 8 algorithms used in Machine Learning. The objective here is not to go into the details of the models but rather to give the reader elements of understanding on each of them.<\/span><\/span><\/span><\/p>\n<p><b><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">1. &#8220;The Decision Tree&#8221;<\/span><\/span><\/span><\/b><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">A decision tree is used to classify future observations given a body of already labeled observations. This is the case in our botanical example, where we already have 100 observations classified into species A, B and C.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The tree begins with a root (where we still have all our observations), followed by a series of branches whose intersections are called nodes and whose ends are called leaves, each leaf corresponding to one of the classes to predict. 
The depth of the tree refers to the maximum number of nodes before reaching a leaf. Each node of the tree represents a rule (for example: length of the petal greater than 2.5 cm). Browsing the tree means checking a series of rules. The tree is constructed so that each node corresponds to the rule that best divides the set of initial observations (variable and threshold).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Example:<\/span><\/span><\/span><b><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><br \/>\n<\/span><\/span><\/span><\/b><\/p>\n<p><img decoding=\"async\" class=\"wp-image-634 aligncenter\" src=\"https:\/\/www.datakeen.co\/wp-content\/uploads\/2017\/11\/S\u00e9lection_139.png\" alt=\"Decision tree data\" width=\"410\" height=\"195\" srcset=\"https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141625\/S%C3%A9lection_139.png 787w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141625\/S%C3%A9lection_139-300x143.png 300w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141625\/S%C3%A9lection_139-768x365.png 768w\" sizes=\"(max-width: 410px) 100vw, 410px\" \/><\/p>\n<p><img decoding=\"async\" class=\"size-full wp-image-633 aligncenter\" src=\"https:\/\/www.datakeen.co\/wp-content\/uploads\/2017\/11\/S\u00e9lection_140.png\" alt=\"Decision Tree\" width=\"537\" height=\"401\" srcset=\"https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141625\/S%C3%A9lection_140.png 537w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141625\/S%C3%A9lection_140-300x224.png 300w\" sizes=\"(max-width: 537px) 100vw, 537px\" \/><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The tree has a depth of 2 (one node plus the root). 
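A tree this small amounts to a couple of nested if/else rules. A minimal sketch in plain Python (the variable names and thresholds below are invented for illustration, not read off the figure):

```python
def classify_plant(petal_length, petal_width):
    """Walk a depth-2 decision tree: each node checks one rule."""
    if petal_length <= 2.5:      # root rule: the best first split
        return "B"
    # second node: split the remaining observations
    if petal_width <= 1.8:
        return "A"
    return "C"

# classify_plant(2.0, 1.0) returns "B": the root rule already decides.
species = classify_plant(2.0, 1.0)
```

Classifying a new plant is simply a matter of checking the rules from the root down to a leaf.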
The length of the petal is the first measure used because it best separates the 4 observations according to class membership (here class B).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>2. &#8220;Random Forests&#8221;<\/b><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">As the name suggests, the random forest algorithm is based on a multitude of decision trees.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">To better understand the advantages and logic of this algorithm, let&#8217;s start with an example:<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">You are looking for a good travel destination for your next vacation. You ask your best friend for his opinion. He asks you questions about your previous trips and makes a recommendation.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">You then decide to ask a group of friends, who each ask you questions at random. They each make a recommendation. The chosen destination is the one most recommended by your friends.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The recommendations made by your best friend and by the group can both be good destination choices. But while the first recommendation method works very well for you, the second will be more reliable for other people. 
<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">This comes from the fact that your best friend, who builds a decision tree to give you a destination recommendation, knows you so well that his decision tree has over-learned about you (we speak of overfitting).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Your group of friends represents the random forest of multiple decision trees, and it is a model that, when used properly, avoids the pitfall of overfitting. How is this forest built?<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Here are the main steps:<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">1. We take a number X of observations from the starting dataset (with replacement).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">2. 
<\/span><\/span><\/span><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We take a number K of the M available variables (features), for example: only the temperature and the population density.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">3. We create a decision tree on this dataset.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">4. Steps 1 to 3 are repeated N times so as to obtain N trees.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">To obtain the class of a new observation, we run it down the N trees. Each tree predicts a class. The class chosen is the one most represented among all the trees in the forest (majority vote \/ &#8216;Ensemble&#8217;).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>3. &#8220;Gradient Boosting&#8221; \/ &#8220;XG Boost&#8221;<\/b><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Gradient boosting is used to reinforce a model that produces weak predictions, such as a decision tree (see below for how we judge the quality of a model). 
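Before moving on, the random-forest steps above (bootstrap sample, random feature, fit a tree, majority vote) can be sketched in plain Python. To keep the sketch tiny, each "tree" is reduced to a one-rule stump and the destination data is invented:

```python
import random
from collections import Counter

def train_forest(data, n_trees=25, seed=0):
    """data: list of (feature_vector, label) pairs. Each 'tree' is a
    single-rule stump, a caricature of a real decision tree."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        # 1. bootstrap: sample observations WITH replacement
        sample = [rng.choice(data) for _ in data]
        # 2. pick a random feature among the M available
        f = rng.randrange(len(data[0][0]))
        # 3. fit a rule: threshold at the sample mean of that feature
        t = sum(x[f] for x, _ in sample) / len(sample)
        low = Counter(y for x, y in sample if x[f] <= t)
        high = Counter(y for x, y in sample if x[f] > t)
        default = Counter(y for _, y in sample).most_common(1)[0][0]
        stumps.append((f, t,
                       low.most_common(1)[0][0] if low else default,
                       high.most_common(1)[0][0] if high else default))
    return stumps

def predict(stumps, x):
    # 4. majority vote over all trees (the 'ensemble')
    votes = Counter((lo if x[f] <= t else hi) for f, t, lo, hi in stumps)
    return votes.most_common(1)[0][0]

# toy destinations: (temperature, population density) -> verdict
data = [((30, 1), "good"), ((28, 2), "good"), ((5, 9), "bad"), ((8, 8), "bad")]
forest = train_forest(data)
```

Individual stumps can be badly trained (a bootstrap sample may even contain only one class), but the majority vote smooths their errors out, which is the whole point of the ensemble.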
<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We will explain the principle of gradient boosting with decision trees, but it could be done with another model.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">You have a database of individuals with demographic information and past activities. You know the age of 50% of the individuals; for the other half it is unknown.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">You want to predict the age of a person from his activities: food shopping, television, gardening, video games &#8230; You choose a decision tree as your model; in this case it is a regression tree, because the value to predict is numeric.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Your first regression tree is satisfactory but can be improved: it predicts, for example, that an individual is 19 years old when in fact he is 13, and 55 years old for another who is actually 68. 
<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The principle of the gradient boosting is that you will redo a model on the difference between the predicted value and the true value to be predicted.<\/span><\/span><\/span><\/p>\n<table width=\"461\" cellspacing=\"0\" cellpadding=\"7\">\n<colgroup>\n<col width=\"34\" \/>\n<col width=\"141\" \/>\n<col width=\"80\" \/>\n<col width=\"148\" \/> <\/colgroup>\n<tbody>\n<tr>\n<td width=\"34\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">Age<\/span><\/span><\/span><\/td>\n<td width=\"141\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">Prediction Tree 1<\/span><\/span><\/span><\/td>\n<td width=\"80\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">Difference<\/span><\/span><\/span><\/td>\n<td width=\"148\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">Prediction Tree 2<\/span><\/span><\/span><\/td>\n<\/tr>\n<tr>\n<td width=\"34\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">13<\/span><\/span><\/span><\/td>\n<td width=\"141\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">19<\/span><\/span><\/span><\/td>\n<td width=\"80\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">-6<\/span><\/span><\/span><\/td>\n<td width=\"148\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">15<\/span><\/span><\/span><\/td>\n<\/tr>\n<tr>\n<td width=\"34\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">68<\/span><\/span><\/span><\/td>\n<td width=\"141\"><span 
style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">55<\/span><\/span><\/span><\/td>\n<td width=\"80\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">+13<\/span><\/span><\/span><\/td>\n<td width=\"148\"><span style=\"color: #000000;\"><span style=\"font-family: Arial;\"><span style=\"font-size: small;\">63<\/span><\/span><\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">This step N is repeated where N is determined by successively minimizing the error between the prediction and the true value. <\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The method to optimize is the gradient descent method that we will not explain here. The XG Boost (eXtreme Gradient Boosting) model is one of the implementations of the boosting gradient founded by Tianqi Chen and has seduced the Kaggle datascientist community with its efficiency and performance. The publication explaining the algorithm is here. <\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>4. &#8220;Genetic Algorithms&#8221;<\/b><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">As their name suggests genetic algorithms are based on the process of genetic evolution that has made us who we are &#8230; <\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">More prosaically they are mainly used when there are no observations of departure and it is hoped that a machine will learn to learn as and when testing. 
<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">These algorithms are not the most effective for a single specific problem, but rather for a set of subproblems (e.g. learning balance and walking in robotics).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Let&#8217;s take a simple example: we want to find the code of a safe made of 15 letters: &#8220;MACHINELEARNING&#8221;<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The genetic algorithm approach will be as follows:<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We start from a population of 10,000 &#8220;chromosomes&#8221; of 15 letters each, assuming that the code is a word or a combination of words, for example:<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">&#8220;DEEP-LEARNING&#8221;<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">&#8220;STATISTICAL-INFERENCE&#8221;<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">&#8220;HUMAN MACHINE INTERFACE&#8221; etc.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We will define a reproduction method: for example, combining the beginning of one chromosome with the end of another. 
<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Ex: &#8220;DEEP-LEARNING&#8221; + &#8220;STATISTICAL-INFERENCE&#8221; = &#8220;DEEP-INFERENCE&#8221;<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Then we will define a mutation method, which allows us to alter an offspring that is stuck: in our case, it could be changing one of the letters at random. Finally, we define a score that rewards the best offspring chromosomes. In our case, where the code is hidden, we can imagine a sound the safe would make when 80% of the letters are correct, growing louder as we approach the right code.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Our genetic algorithm will start from the initial population and breed chromosomes until the solution is found.<\/span><\/span><\/span><\/p>\n<p><b><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">5. &#8220;Support Vector Machines&#8221;<\/span><\/span><\/span><\/b><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Also known as &#8220;SVM&#8221;, this algorithm is mainly used for classification problems, even though it has been extended to regression problems (Drucker et al., 96).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Let&#8217;s go back to our example of ideal holiday destinations. For simplicity, consider only 2 variables to describe each city: the temperature and the population density. 
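Returning for a moment to the genetic algorithm above: the chromosome / crossover / mutation / score loop can be written out in a few lines of plain Python. The population size, selection rule and mutation rate below are arbitrary choices for illustration, and the "sound" of the safe is simply the count of matching letters:

```python
import random
import string

TARGET = "MACHINELEARNING"         # the hidden 15-letter safe code
LETTERS = string.ascii_uppercase

def score(chrom):
    """Fitness: how many letters match the hidden code (the 'sound' cue)."""
    return sum(a == b for a, b in zip(chrom, TARGET))

def evolve(pop_size=100, generations=500, seed=0):
    rng = random.Random(seed)
    # initial population of random 15-letter chromosomes
    pop = ["".join(rng.choice(LETTERS) for _ in TARGET) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        if score(pop[0]) == len(TARGET):
            return pop[0]                       # code found
        parents = pop[: pop_size // 5]          # selection: keep the fittest
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(TARGET))  # reproduction: crossover
            child = list(a[:cut] + b[cut:])
            i = rng.randrange(len(TARGET))       # mutation: one random letter
            child[i] = rng.choice(LETTERS)
            children.append("".join(child))
        pop = children + parents                 # elitism: parents survive
    return max(pop, key=score)
```

With the best chromosomes always surviving, the score can only improve from one generation to the next, and the population converges to the code after a modest number of generations.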
We can therefore represent cities in 2 dimensions.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We represent the cities you most appreciated with circles and those you least appreciated with squares. When you consider a new city, you want to know which group it is closest to.<\/span><\/span><\/span><\/p>\n<div id=\"attachment_632\" style=\"width: 897px\" class=\"wp-caption alignnone\"><img decoding=\"async\" aria-describedby=\"caption-attachment-632\" class=\"size-full wp-image-632\" src=\"https:\/\/www.datakeen.co\/wp-content\/uploads\/2017\/11\/S\u00e9lection_141.png\" alt=\"SVM Optimal Plane\" width=\"887\" height=\"423\" srcset=\"https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141626\/S%C3%A9lection_141.png 887w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141626\/S%C3%A9lection_141-300x143.png 300w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141626\/S%C3%A9lection_141-768x366.png 768w\" sizes=\"(max-width: 887px) 100vw, 887px\" \/><p id=\"caption-attachment-632\" class=\"wp-caption-text\">SVM Example<\/p><\/div>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">As we see in the graph on the right, there are many planes (straight lines when you only have 2 dimensions) that separate the two groups.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We will choose the line that lies at the maximum distance from the two groups. To build it, we do not need all the points: it is enough to take the points at the border of their group. These points, or vectors, are called the support vectors. The planes passing through these support vectors are called support planes. 
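In one dimension, the support-vector idea is easy to write down: only the border points of each group matter, and the separator sits halfway between them. A toy sketch with made-up temperatures (this is geometry only, not a real SVM solver):

```python
def midline_1d(liked, disliked):
    """1-D 'SVM': the support vectors are the facing border points of
    the two groups; the separator is equidistant from them.
    Assumes every liked value is below every disliked value."""
    sv_liked = max(liked)        # border point of the 'circle' group
    sv_disliked = min(disliked)  # border point of the 'square' group
    separator = (sv_liked + sv_disliked) / 2
    margin = (sv_disliked - sv_liked) / 2
    return separator, margin

# temperatures of liked vs disliked cities (invented numbers)
sep, margin = midline_1d([18, 22, 25], [33, 36, 40])
# sep == 29.0, margin == 4.0: only the points 25 and 33 mattered
```

Note that moving any non-border point (18, 22, 36, 40) changes nothing: the solution depends only on the support vectors, which is why the algorithm carries their name.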
The separating plane will be the one equidistant from the two support planes.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">What should we do if the groups are not so easily separable, for example if along one of the dimensions circles are mixed with squares, or vice versa?<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We transform these points with a function so that they can be separated, as in the example below:<\/span><\/span><\/span><\/p>\n<div id=\"attachment_631\" style=\"width: 734px\" class=\"wp-caption alignnone\"><img decoding=\"async\" aria-describedby=\"caption-attachment-631\" class=\"size-full wp-image-631\" src=\"https:\/\/www.datakeen.co\/wp-content\/uploads\/2017\/11\/S\u00e9lection_142.png\" alt=\"SVM transformation example\" width=\"724\" height=\"272\" srcset=\"https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141626\/S%C3%A9lection_142.png 724w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141626\/S%C3%A9lection_142-300x113.png 300w\" sizes=\"(max-width: 724px) 100vw, 724px\" \/><p id=\"caption-attachment-631\" class=\"wp-caption-text\">SVM transformation example<\/p><\/div>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The SVM algorithm therefore consists of searching for the optimal hyperplane while minimizing classification errors.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>6. &#8220;K nearest neighbors&#8221;<\/b><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Pause. 
After 5 relatively technical models, the K nearest neighbors algorithm will feel like a formality. Here&#8217;s how it works:<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">An observation is assigned the majority class of its K nearest neighbors.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">&#8220;That&#8217;s it?!&#8221; you might ask.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Yes, that&#8217;s all. But as the following example shows, the choice of K can matter a lot.<\/span><\/span><\/span><\/p>\n<div id=\"attachment_630\" style=\"width: 913px\" class=\"wp-caption alignnone\"><img decoding=\"async\" aria-describedby=\"caption-attachment-630\" class=\"size-full wp-image-630\" src=\"https:\/\/www.datakeen.co\/wp-content\/uploads\/2017\/11\/S\u00e9lection_143.png\" alt=\"K nearest neighbours\" width=\"903\" height=\"460\" srcset=\"https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141627\/S%C3%A9lection_143.png 903w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141627\/S%C3%A9lection_143-300x153.png 300w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141627\/S%C3%A9lection_143-768x391.png 768w\" sizes=\"(max-width: 903px) 100vw, 903px\" \/><p id=\"caption-attachment-630\" class=\"wp-caption-text\">K nearest neighbours<\/p><\/div>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We typically try different values of K to obtain the most satisfactory separation.<\/span><\/span><\/span><\/p>\n<p><span style=\"font-size: medium;\"><strong><span style=\"color: #212121;\"><span style=\"font-family: arial, 
sans-serif;\">7. &#8220;Logistic Regression&#8221;<\/span><\/span><\/strong><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Let&#8217;s start with a reminder about linear regression. Linear regression is used to predict a numerical variable, e.g. the price of cotton, from other numeric or binary variables: the number of cultivable hectares, the demand for cotton from various industries, and so on.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The goal is to find the coefficients a1, a2, &#8230; that give the best estimate:<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Cotton price = a1 * Number of hectares + a2 * Demand for cotton + &#8230;<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Logistic regression is used for classification in the same way as the algorithms presented so far. 
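<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">As an aside, fitting such a linear model takes only a few lines in practice. Below is a hypothetical sketch using scikit-learn&#8217;s LinearRegression; the figures and variable names are invented for illustration:<\/span><\/span><\/span><\/p>\n

```python
# Hypothetical sketch: fitting the cotton-price linear model above with
# scikit-learn. All numbers are invented for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row: [number of cultivable hectares, demand for cotton]
X = np.array([[100, 20], [150, 35], [200, 30], [250, 50], [300, 45]], dtype=float)
# Observed cotton prices (invented)
y = np.array([12.0, 15.5, 14.8, 19.2, 18.1])

model = LinearRegression().fit(X, y)
print(model.coef_)                 # the coefficients a1, a2
print(model.intercept_)            # the constant term
print(model.predict([[220, 40]]))  # estimated price for a new situation
```

<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">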
Once again, let&#8217;s take the example of trips, considering only two classes: good destination (Y = 1) and bad destination (Y = 0).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">P(1): probability that the city is a good destination.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">P(0): probability that the city is a bad destination.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">A city is described by a number of variables; we will consider only two: temperature and population density.<\/span><\/span><\/span><\/p>\n<ul>\n<li><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">X = (X1: temperature, X2: population density)<\/span><\/span><\/span><\/li>\n<\/ul>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We are therefore interested in building a function that gives us, for a city X:<\/span><\/span><\/span><\/p>\n<ul>\n<li><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">P(1 | X): the probability that the destination is good given X, that is, the probability that a city with characteristics X is a good destination.<\/span><\/span><\/span><\/li>\n<\/ul>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We would like to relate this probability to a linear combination, as in a linear regression. 
However, the probability P(1 | X) varies between 0 and 1, whereas we want a quantity that can span the whole set of real numbers (from -infinity to +infinity).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">To do this, we start by considering P(1 | X) \/ (1 &#8211; P(1 | X)), the ratio between the probability that the destination is good and the probability that it is bad. <\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">For high probabilities this ratio approaches +infinity (for example a probability of 0.99 gives 0.99 \/ 0.01 = 99), and for low probabilities it approaches 0 (a probability of 0.01 gives 0.01 \/ 0.99 = 0.0101).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We went from [0, 1] to [0, +infinity). 
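<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">These two computations are easy to check directly; here is a quick, purely illustrative Python sketch:<\/span><\/span><\/span><\/p>\n

```python
# Quick numeric check of the odds ratio p / (1 - p) discussed above.
def odds(p):
    # maps a probability in [0, 1) to the interval [0, +infinity)
    return p / (1 - p)

print(round(odds(0.99), 2))  # 99.0   (high probability -> large odds)
print(round(odds(0.01), 4))  # 0.0101 (low probability -> odds near 0)
print(odds(0.5))             # 1.0    (even odds)
```

<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">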
To extend the range of possible values to the negative side as well, and thus cover the whole real line, we take the natural logarithm of this ratio.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">It follows that we are looking for b0, b1, b2, &#8230; such that:<\/span><\/span><\/span><\/p>\n<ul>\n<li><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">ln(P(1 | X) \/ (1 - P(1 | X))) = b0 + b1X1 + b2X2<\/span><\/span><\/span><\/li>\n<\/ul>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The right-hand side is the regression part; the natural logarithm of the odds is the logistic part.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The logistic regression algorithm therefore finds the best coefficients to minimize the error between the predictions made for the visited destinations and their true labels (good, bad).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>8. &#8220;Clustering&#8221;<\/b><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Supervised vs. unsupervised learning. Do you remember?<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Until now we have described supervised learning algorithms: the classes are known and we want to classify or predict a new observation. But what can we do when there is no predefined group? 
When we are simply looking for patterns shared by groups of people?<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">This is where unsupervised learning and clustering algorithms come in.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Take the example of a company that has started its digital transformation. It has new sales and communication channels through its site and one or more associated mobile applications. In the past, it addressed its clients based on demographics and purchase history. But how can it exploit its customers&#8217; navigation data? Does online behavior match the classic customer segments?<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">These questions can motivate the use of clustering, to see whether major trends emerge. This will confirm or invalidate any business intuitions you may have.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">There are many clustering algorithms (hierarchical clustering, k-means, DBSCAN, &#8230;). One of the most widely used is the k-means algorithm. Here is how it works, simply put:<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Even though we do not know in advance how the clusters will be constituted, the k-means algorithm requires us to specify the expected number of clusters. Techniques exist to find this optimal number.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Consider the example of cities. 
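<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">In practice, k-means is rarely coded by hand. Below is a hypothetical sketch using scikit-learn&#8217;s KMeans on made-up two-dimensional city data (temperature, population density); n_clusters is the imposed number of clusters:<\/span><\/span><\/span><\/p>\n

```python
# Hypothetical sketch: k-means with an imposed number of clusters,
# using scikit-learn. The 2-D city data below is made up.
import numpy as np
from sklearn.cluster import KMeans

# Columns: temperature, population density (invented values)
cities = np.array([
    [25.0, 1200.0], [27.0, 1100.0], [26.0, 1300.0],  # a warm, dense group
    [8.0, 150.0], [10.0, 200.0], [9.0, 180.0],       # a cold, sparse group
])

# n_init repeats the random draw of the initial means several times,
# which stabilises the clusters that are found.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(cities)
print(km.labels_)           # cluster assigned to each city
print(km.cluster_centers_)  # final means of each cluster
```

<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">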
Our dataset has 2 variables, so we have 2 dimensions. After a first study, we expect to find 2 clusters. We begin by randomly placing two points; they represent our starting &#8216;means&#8217;. We assign each observation to the cluster of its nearest mean. Then we compute the average of the observations in each cluster and move each mean to the computed position. We re-assign the observations to the nearest mean, and so on.<\/span><\/span><\/span><\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-629\" src=\"https:\/\/www.datakeen.co\/wp-content\/uploads\/2017\/11\/S\u00e9lection_144.png\" alt=\"Clustering K means\" width=\"956\" height=\"314\" srcset=\"https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141500\/S%C3%A9lection_144.png 956w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141500\/S%C3%A9lection_144-300x99.png 300w, https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/11\/28141500\/S%C3%A9lection_144-768x252.png 768w\" sizes=\"(max-width: 956px) 100vw, 956px\" \/><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">To ensure the stability of the groups found, it is recommended to repeat the random draw of the initial &#8216;means&#8217; several times, because some initial draws can yield a configuration different from that of the vast majority of runs.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>Factors of Relevance and Quality of Machine Learning Algorithms<\/b><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Machine learning algorithms are evaluated on their ability to correctly classify or predict not only the observations used to build the model (training and test sets) but also, and especially, observations 
for which the label or value is known but was not used in building the model (validation set).<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Proper classification means placing observations in the correct group while not placing them in the wrong groups.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">The chosen metric may vary depending on the intent of the algorithm and its business usage.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Several data factors can play a big role in the quality of the algorithms. Here are the main ones:<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">1. The number of observations:<\/span><\/span><\/span><\/p>\n<ul>\n<li><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">the fewer observations there are, the more difficult the analysis;<\/span><\/span><\/span><\/li>\n<li><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">but the more there are, the higher the memory requirements and the longer the analysis.<\/span><\/span><\/span><\/li>\n<\/ul>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">2. 
The number and quality of the attributes describing these observations<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">For example, the distance between two numeric variables (price, size, weight, light intensity, noise level, etc.) is easy to establish, <\/span><\/span><\/span><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">while that between two categorical attributes (color, beauty, utility &#8230;) is more delicate;<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">3. The percentage of filled-in versus missing data<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">4. &#8220;Noise&#8221;: the number and &#8220;location&#8221; of dubious values (potential errors, outliers, &#8230;), or of values that do not conform to the general distribution of the examples over their feature space, will affect the quality of the analysis.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>Conclusion<\/b><\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">We have seen that machine learning algorithms serve two purposes, classifying and predicting, and that they divide into supervised and unsupervised approaches. Among the many possible algorithms, we have covered 8 of them, including logistic regression and random forests to classify an observation, and clustering to bring out homogeneous groups from the data. 
We also saw that the fit of an algorithm depends on its associated cost or loss function, and that its predictive power depends on several factors related to the quality and volume of the data.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">I hope this article has given you some insight into what is called Machine Learning. Feel free to use the comment section to tell me which aspects you would like clarified or explored further.<\/span><\/span><\/span><\/p>\n<p><span style=\"color: #000000;\">Ga\u00ebl Bonnardot,<\/span><\/p>\n<p><span style=\"color: #000000;\">Cofounder and CTO at Datakeen<\/span><\/p>\n<p><span style=\"color: #000000;\">At Datakeen we seek to simplify the use and understanding of new machine learning paradigms for the business functions of all industries.<\/span><\/p>\n<p><span style=\"color: #000000;\">Contact us for more information: contact@datakeen.co<\/span><\/p>\n<p><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\"><b>Upcoming articles:<\/b><\/span><\/span><\/span><\/p>\n<ul>\n<li><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">3 Deep Learning Architectures explained in Human Language<\/span><\/span><\/span><\/li>\n<li><span style=\"color: #212121;\"><span style=\"font-family: arial, sans-serif;\"><span style=\"font-size: medium;\">Key Successes of Deep Learning and Machine Learning in production<\/span><\/span><\/span><\/li>\n<\/ul>\n<p><b>Sources<\/b><\/p>\n<ul>\n<li><a href=\"http:\/\/blog.kaggle.com\/2017\/01\/23\/a-kaggle-master-explains-gradient-boosting\/\"><span style=\"font-weight: 400;\">http:\/\/blog.kaggle.com\/2017\/01\/23\/a-kaggle-master-explains-gradient-boosting\/<\/span><\/a><\/li>\n<li><a href=\"http:\/\/dataaspirant.com\/2017\/05\/22\/random-forest-algorithm-machine-learing\/\"><span 
style=\"font-weight: 400;\">http:\/\/dataaspirant.com\/2017\/05\/22\/random-forest-algorithm-machine-learing\/<\/span><\/a><\/li>\n<li><a href=\"https:\/\/burakkanber.com\/blog\/machine-learning-genetic-algorithms-part-1-javascript\/\"><span style=\"font-weight: 400;\">https:\/\/burakkanber.com\/blog\/machine-learning-genetic-algorithms-part-1-javascript\/<\/span><\/a><\/li>\n<li><a href=\"http:\/\/docs.opencv.org\/3.0-beta\/doc\/py_tutorials\/py_ml\/py_svm\/py_svm_basics\/py_svm_basics.html#svm-understanding\"><span style=\"font-weight: 400;\">http:\/\/docs.opencv.org\/3.0-beta\/doc\/py_tutorials\/py_ml\/py_svm\/py_svm_basics\/py_svm_basics.html#svm-understanding<\/span><\/a><\/li>\n<li><a href=\"https:\/\/fr.wikipedia.org\/wiki\/Apprentissage_automatique\"><span style=\"font-weight: 400;\">https:\/\/fr.wikipedia.org\/wiki\/Apprentissage_automatique<\/span><\/a><\/li>\n<\/ul>\n<p><!--:--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Little schoolgirl with Pepper Robot. Credits : Andy Kelly (Unsplash) What we call\u00a0&#8220;Machine Learning&#8221; is none other than the meeting of statistics and\u00a0the incredible computation power available today (in terms of memory, CPUs, GPUs). 
This domain\u00a0has become increasingly visible\u00a0important because of the digital revolution of companies leading\u00a0to the production of massive data of different forms [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":4020,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_yoast_wpseo_focuskw":"","_yoast_wpseo_title":"","_yoast_wpseo_metadesc":"","_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","content-type":"","footnotes":""},"categories":[1],"tags":[203,204,205,155,206,207,208],"class_list":["post-3718","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-non-classifiee","tag-algorithms-en","tag-analytics-en","tag-classification-en","tag-machine-learning-en","tag-prediction-en","tag-supervised-learning-en","tag-unsupervised-learning-en"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>8 Machine Learning Algorithms explained in Human language - Datakeen<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"8 Machine Learning Algorithms explained in Human language - Datakeen\" \/>\n<meta property=\"og:description\" content=\"Little schoolgirl with Pepper Robot. Credits : Andy Kelly (Unsplash) What we call\u00a0&#8220;Machine Learning&#8221; is none other than the meeting of statistics and\u00a0the incredible computation power available today (in terms of memory, CPUs, GPUs). 
This domain\u00a0has become increasingly visible\u00a0important because of the digital revolution of companies leading\u00a0to the production of massive data of different forms [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/\" \/>\n<meta property=\"og:site_name\" content=\"Datakeen\" \/>\n<meta property=\"article:published_time\" content=\"2017-11-06T08:12:22+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-11-18T08:19:50+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1920\" \/>\n\t<meta property=\"og:image:height\" content=\"1280\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Ga\u00ebl\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DatakeenCO\" \/>\n<meta name=\"twitter:site\" content=\"@DatakeenCO\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Ga\u00ebl\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"15 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"8 Machine Learning Algorithms explained in Human language - Datakeen","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/","og_locale":"en_US","og_type":"article","og_title":"8 Machine Learning Algorithms explained in Human language - Datakeen","og_description":"Little schoolgirl with Pepper Robot. 
Credits : Andy Kelly (Unsplash) What we call\u00a0&#8220;Machine Learning&#8221; is none other than the meeting of statistics and\u00a0the incredible computation power available today (in terms of memory, CPUs, GPUs). This domain\u00a0has become increasingly visible\u00a0important because of the digital revolution of companies leading\u00a0to the production of massive data of different forms [&hellip;]","og_url":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/","og_site_name":"Datakeen","article_published_time":"2017-11-06T08:12:22+00:00","article_modified_time":"2021-11-18T08:19:50+00:00","og_image":[{"width":1920,"height":1280,"url":"https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash.jpg","type":"image\/jpeg"}],"author":"Ga\u00ebl","twitter_card":"summary_large_image","twitter_creator":"@DatakeenCO","twitter_site":"@DatakeenCO","twitter_misc":{"Written by":"Ga\u00ebl","Est. reading time":"15 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/#article","isPartOf":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/"},"author":{"name":"Ga\u00ebl","@id":"https:\/\/legacy-wp.datakeen.co\/en\/#\/schema\/person\/201d4b0eea1c7a6ca576cd44f2188730"},"headline":"8 Machine Learning Algorithms explained in Human 
language","datePublished":"2017-11-06T08:12:22+00:00","dateModified":"2021-11-18T08:19:50+00:00","mainEntityOfPage":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/"},"wordCount":2926,"commentCount":0,"publisher":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/#organization"},"image":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/#primaryimage"},"thumbnailUrl":"https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash.jpg","keywords":["algorithms","analytics","classification","machine learning","prediction","supervised learning","unsupervised learning"],"articleSection":["Non classifi\u00e9(e)"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/","url":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/","name":"8 Machine Learning Algorithms explained in Human language - 
Datakeen","isPartOf":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/#primaryimage"},"image":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/#primaryimage"},"thumbnailUrl":"https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash.jpg","datePublished":"2017-11-06T08:12:22+00:00","dateModified":"2021-11-18T08:19:50+00:00","breadcrumb":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/#primaryimage","url":"https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash.jpg","contentUrl":"https:\/\/media.datakeen.co\/wp-content\/uploads\/2021\/11\/28141451\/andy-kelly-0E_vhMVqL9g-unsplash.jpg","width":1920,"height":1280},{"@type":"BreadcrumbList","@id":"https:\/\/legacy-wp.datakeen.co\/en\/8-machine-learning-algorithms-explained-in-human-language\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Accueil","item":"https:\/\/legacy-wp.datakeen.co\/en\/"},{"@type":"ListItem","position":2,"name":"8 Machine Learning Algorithms explained in Human language"}]},{"@type":"WebSite","@id":"https:\/\/legacy-wp.datakeen.co\/en\/#website","url":"https:\/\/legacy-wp.datakeen.co\/en\/","name":"Datakeen","description":"[:fr]Democratise l&#039;IA en Entreprise[:en]Democratize AI Solutions[:de]KI-L\u00f6sungen demokratisieren[:es]Democratizar las soluciones de 
IA[:ja]\u4eba\u5de5\u77e5\u80fd\u30bd\u30ea\u30e5\u30fc\u30b7\u30e7\u30f3\u306e\u6c11\u4e3b\u5316[:zh]\u6c11\u4e3b\u5316\u4eba\u5de5\u667a\u80fd\u89e3\u51b3\u65b9\u6848[:]","publisher":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/legacy-wp.datakeen.co\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/legacy-wp.datakeen.co\/en\/#organization","name":"Datakeen","url":"https:\/\/legacy-wp.datakeen.co\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/legacy-wp.datakeen.co\/en\/#\/schema\/logo\/image\/","url":"https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/03\/28141638\/datakeen_large_blue_nobg.png","contentUrl":"https:\/\/media.datakeen.co\/wp-content\/uploads\/2017\/03\/28141638\/datakeen_large_blue_nobg.png","width":2000,"height":826,"caption":"Datakeen"},"image":{"@id":"https:\/\/legacy-wp.datakeen.co\/en\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DatakeenCO","https:\/\/www.linkedin.com\/company\/11023029"]},{"@type":"Person","@id":"https:\/\/legacy-wp.datakeen.co\/en\/#\/schema\/person\/201d4b0eea1c7a6ca576cd44f2188730","name":"Ga\u00ebl","url":"https:\/\/legacy-wp.datakeen.co\/en\/author\/admin\/"}]}},"_links":{"self":[{"href":"https:\/\/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/posts\/3718","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/comments?post=3718"}],"version-history":[{"count":0,"href":"https:\/\
/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/posts\/3718\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/media\/4020"}],"wp:attachment":[{"href":"https:\/\/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/media?parent=3718"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/categories?post=3718"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/legacy-wp.datakeen.co\/en\/wp-json\/wp\/v2\/tags?post=3718"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}