## Building and evaluating Naive Bayes classifier with WEKA

This is a follow-up to the previous post, where we calculated a Naive Bayes prediction on a given data set. This time I want to demonstrate how all this can be implemented using the WEKA application. For those who don't know what WEKA is, I highly recommend visiting their website and getting the latest release. It is really powerful machine learning software written in Java. You can find plenty of tutorials on YouTube on how to get started with WEKA, so I won't get into details. I'm sure you'll be able to follow anyway.

## Simple explanation of Naive Bayes classifier

You have probably heard about the Naive Bayes classifier and have likely used it in GUI-based tools like the WEKA package. It is often the first algorithm people try to get initial classification results. Sometimes, surprisingly, it outperforms other models in speed, accuracy and simplicity. Let's see how this algorithm looks and what it does. As you may know, the algorithm is based on Bayes' theorem of probability, which allows us to predict the class of unknown data. I hope you are comfortable with probability math, at least the basics.
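To make the idea concrete, here is a minimal sketch of a Naive Bayes classifier for categorical features. The toy weather-style data, the function names `train_naive_bayes` and `predict`, and the lack of smoothing are all assumptions made for illustration; this is not how WEKA implements it.

```python
from collections import Counter


def train_naive_bayes(rows, labels):
    """Count how often each class and each (feature, value, class) combination occurs."""
    class_counts = Counter(labels)
    feature_counts = Counter()
    for row, label in zip(rows, labels):
        for index, value in enumerate(row):
            feature_counts[(index, value, label)] += 1
    return class_counts, feature_counts


def predict(row, class_counts, feature_counts):
    """Return normalized posterior probabilities P(class | row).

    Uses Bayes' theorem with the "naive" assumption that features are
    independent given the class. No smoothing is applied, so a value never
    seen with a class gives that class a probability of zero.
    """
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for index, value in enumerate(row):
            # likelihood P(feature value | class)
            score *= feature_counts[(index, value, label)] / count
        scores[label] = score
    norm = sum(scores.values())
    if norm == 0:
        return scores
    return {label: score / norm for label, score in scores.items()}


# Hypothetical toy data: (outlook, temperature) -> play outside?
rows = [
    ("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"),
    ("rainy", "mild"), ("rainy", "cool"), ("sunny", "cool"),
]
labels = ["no", "no", "no", "yes", "yes", "yes"]

class_counts, feature_counts = train_naive_bayes(rows, labels)
posterior = predict(("rainy", "mild"), class_counts, feature_counts)
print(posterior)
```

For the query `("rainy", "mild")` the "yes" class scores 1/2 · 2/3 · 1/3 and the "no" class 1/2 · 1/3 · 1/3, so after normalization "yes" is twice as likely as "no".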

## Linear regression with multiple features

Single-feature linear regression is really simple. All you need is to find the function that best fits the training data. It is also easy to plot the data and the learned curve. But in reality, regression analysis is usually based on multiple features, so in most cases we cannot picture the multidimensional space where the data would be plotted and have to rely on the methods we use. You have to feel comfortable with linear algebra, where matrices and vectors are used. If previously we had one feature (temperature), now we need to introduce more of them, so we need to expand the hypothesis to accept more features. From now on, instead of output y we are going to use the h(x) notation: h(x) = θ0 + θ1 x1 + θ2 x2 + … + θn xn. As you can see, with more variables (features) we also end up with more parameters θ that have to be learned. Before we move on, let's find suitable data that we could use for building…
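The expanded hypothesis can be sketched in a few lines of Python. The function name `hypothesis` and the parameter values below are made up for illustration; the only idea taken from the text is that each feature gets its own parameter θ, plus the intercept θ0.

```python
def hypothesis(theta, features):
    """Compute h(x) = theta[0] + theta[1]*x1 + ... + theta[n]*xn."""
    if len(theta) != len(features) + 1:
        raise ValueError("theta needs one parameter per feature plus an intercept")
    # Prepend x0 = 1 so the intercept is handled like any other term.
    x = [1.0, *features]
    return sum(t * xi for t, xi in zip(theta, x))


# Hypothetical parameters and a sample with two features.
theta = [1.0, 2.0, 3.0]
sample = [10.0, 100.0]
prediction = hypothesis(theta, sample)
print(prediction)
```

Prepending a constant feature x0 = 1 is a common trick: it turns the hypothesis into a plain dot product of the parameter vector and the feature vector, which is where the linear algebra comes in.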

## Linear regression – learning algorithm with Python

In this post we are going to demystify the learning algorithm of linear regression. We are going to analyze the simplest univariate case with a single feature x, which in the previous example was temperature, while the output was cricket chirps/sec. Let's use the same cricket data to build a learning algorithm and see if it produces a hypothesis similar to the one from Excel. As you may already know from this example, we need to find the linear equation parameters θ0 and θ1 that fit a line optimally to the given data set: y = θ0 + θ1 x. Here x is the feature (temperature) and y is the output value (chirps/sec). So how are we going to find the parameters θ0 and θ1? The whole point of a learning algorithm is doing this iteratively. We need to find optimal θ0 and θ1 values so that the error between the approximation line and the plotted training set is minimal. By doing successive corrections to randomly selected…
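The iterative idea can be sketched as batch gradient descent on the mean squared error. This is a minimal sketch under stated assumptions: the data points are made up (they lie exactly on y = 1 + 2x, they are not the cricket data), the parameters start at zero rather than at random values, and the learning rate `alpha` and iteration count are arbitrary choices that happen to converge for this data.

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit y = theta0 + theta1 * x by batch gradient descent."""
    theta0, theta1 = 0.0, 0.0  # assumed starting point
    m = len(xs)
    for _ in range(iterations):
        # Prediction error for every training example with the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost (1 / 2m) * sum(error^2).
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters simultaneously.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Hypothetical training set lying on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)
```

Both gradients are computed from the same set of errors before either parameter changes, which is what makes the update simultaneous. If `alpha` is too large for the scale of the features, the corrections overshoot and the parameters diverge instead of settling.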

## Simplest machine learning algorithm – linear regression with excel

Some may say that linear regression is more of a statistical problem, and this is true to some degree. But when the problem is approached from a machine learning perspective, things get easier, especially when moving towards more complex problems. First of all, let's understand a few important terms. We can start with regression. When speaking of linear regression, we try to find the best-fitting line through given points. In other words, we need to find the optimal linear equation to fit the given data points. This is a supervised learning problem, where we have a set of data pairs that can be plotted on x-y axes. I understand that theory is a boring thing, even for me, so let's move to a practical example and learn by solving a problem. In order to work with examples we need sample data. There are many data sources available on the internet. For instance, a great source is college cengage, which has several sets with…
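For a single feature, the best-fitting line in the least-squares sense can be computed directly. The sketch below uses the standard closed-form formulas; the function name `fit_line` and the four data points are made up for illustration, and this is only an assumption about what a spreadsheet trendline would produce for the same points.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data pairs.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```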

## Overview of machine learning algorithms

A few years ago machine learning caught my attention, and since then my interest in this field has kept growing. Every day we see more and more intelligent solutions surrounding us. The most obvious is the internet. You have probably noticed that shopping sites adapt to our interests and suggest targeted offers. Another example is spam email filters: if we mark emails as spam, they keep disappearing from our lives. Another area is robotics, where robots learn how to navigate independently and perform various tasks. Autonomous flying robots, helicopters, quads, handwriting recognition, computer vision, data mining in various fields like markets, biomedicine, biology: all this is covered by various machine learning algorithms. Here are a few reasons why machine learning is very important and sometimes necessary: Data mining. Sometimes it isn't possible to understand the nature of the data and the relations within it, so machine learning algorithms are used to extract these hidden relations. Adaptation. It is hard…