Chapter 1. Machine Learning for "dummies"

Pattern Recognition (PR) is the scientific field that aims to classify objects into classes (typically two or three). There are four main tasks in PR: classification, regression, clustering, and retrieval. This blog entry focuses on the classification task.

Classification and regression tasks belong to supervised learning, so we are going to work with two main subsets (i.e., training and testing)...


Chapter 2. Websites to download datasets for classification tasks

Before applying machine learning (ML) techniques, we need reliable datasets. In some cases, research groups around the world have datasets that are not publicly available, and we need to request them by email.

Fortunately, there are reliable websites that let us download datasets from different fields, such as the UCI Machine Learning Repository and KEEL...


Chapter 3. Division of the dataset

We need two main subsets before classification, i.e., training and testing. To obtain these subsets, it is necessary to apply a technique such as stratified K-fold cross-validation, stratified hold-out, or leave-one-out...
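To make the idea of a stratified division concrete, here is a minimal sketch in plain Python. The function names (`stratified_holdout`, `stratified_kfold`), the 30% test fraction, and the toy labels are assumptions chosen for illustration; they are not taken from the full chapter. Leave-one-out can be seen as the extreme case of K-fold where every fold holds a single sample.

```python
import random
from collections import defaultdict


def group_by_class(labels):
    """Map each class label to the list of sample indices that carry it."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    return by_class


def stratified_holdout(labels, test_fraction=0.3, seed=0):
    """Split indices into train/test, sampling each class separately
    so both subsets keep roughly the original class proportions."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in group_by_class(labels).values():
        rng.shuffle(indices)
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def stratified_kfold(labels, k=5, seed=0):
    """Deal the shuffled indices of each class round-robin into k folds.
    Each fold is used once as the testing subset; the rest is training."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in group_by_class(labels).values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Toy, imbalanced label vector (hypothetical data): 10 of class "a", 20 of class "b".
labels = ["a"] * 10 + ["b"] * 20

train_idx, test_idx = stratified_holdout(labels, test_fraction=0.3)
print(len(train_idx), len(test_idx))

folds = stratified_kfold(labels, k=5)
for fold in folds:
    print(sorted(labels[i] for i in fold))
```

In both functions the shuffling happens inside each class, which is what keeps a minority class from disappearing from the testing subset by chance.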


Chapter 4. Handling missing values and outliers

In datasets, we can find missing values, which can be denoted as "?", "Null", etc. Outliers, on the other hand, are extreme or erroneous values. An example is a temperature value of 2000 degrees when typical temperatures are between 19 and 40 degrees...
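The temperature example can be sketched in a few lines of plain Python. The marker set, the helper names, and the choice of median imputation are assumptions made for this illustration; the valid range of 19 to 40 degrees comes from the example above and would normally be set from knowledge of the field.

```python
import statistics

# Strings treated as "missing" (an assumed, non-exhaustive set).
MISSING_MARKERS = {"?", "Null", "null", "NA", ""}


def parse_column(raw_values):
    """Convert raw strings to floats, using None for missing entries."""
    parsed = []
    for value in raw_values:
        if value is None or str(value).strip() in MISSING_MARKERS:
            parsed.append(None)
        else:
            parsed.append(float(value))
    return parsed


def flag_out_of_range(values, low, high):
    """Mark as outliers the known values that fall outside [low, high]."""
    return [v is not None and not (low <= v <= high) for v in values]


def impute_with_median(values, outlier_flags):
    """Replace missing values and flagged outliers with the median
    of the remaining valid values (one simple option among many)."""
    valid = [v for v, bad in zip(values, outlier_flags) if v is not None and not bad]
    median = statistics.median(valid)
    return [median if (v is None or bad) else v for v, bad in zip(values, outlier_flags)]


# Hypothetical temperature readings containing two missing values and one outlier.
raw = ["21.5", "?", "19", "2000", "Null", "38", "25"]

temps = parse_column(raw)
flags = flag_out_of_range(temps, low=19, high=40)
clean = impute_with_median(temps, flags)

print(temps)
print(flags)
print(clean)
```

Whether to impute, drop the sample, or keep the outlier depends on the dataset; the median is used here only because it is not distorted by the extreme value itself.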


Chapter 5. Performance metrics

To measure the performance of a classifier, it is essential to understand the phenomenon in the field where you want to apply ML, so that you can choose the correct performance metric. For example, in the Brain-Computer Interface field, Dr. Lotte suggests the area under the ROC curve as a metric...
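As a small illustration of the area under the ROC curve (AUC), the sketch below computes it for a two-class problem from its ranking interpretation: the AUC equals the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. The function name `roc_auc` and the toy scores are assumptions for this example, not code from the full chapter.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative),
    computed by comparing every positive score with every negative score."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for four samples.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_true, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ranked correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect one, which is part of why the AUC is less sensitive to class imbalance than plain accuracy.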


Chapter 6. Classifiers

Choosing a classifier is sometimes a difficult task. It depends on the problem to solve, and it is also essential to know the "no free lunch" theorem, which states: "On the criterion of generalization performance, there are no context- or problem-independent reasons to favor one learning or classification method over another...


Chapter 7. Gamma Classifier

Mexican classifier applied to the Iris Dataset

As a member of the Alfa-Beta research group, I translated the theory of the Gamma classifier into Python 3.0 code. Gamma is a classifier that has been shown to be competitive in classification and prediction tasks. The primary operators of the Gamma classifier are:
