Introduction To Machine Learning - Real Talk

2/22/2018

Introduction To The Series

We're going to explore Machine Learning in this article and a few others. This article is the introduction, where we'll learn the fundamentals of machine learning. In later articles, we'll look at specific techniques and libraries.

What Is Machine Learning?

So what is Machine Learning? Honestly, machine learning is a term that covers a bunch of different techniques, and squeezing it into a single definition is sort of an insult to its scope. It would be like asking what Calculus is. Calculus is a branch of math that mainly covers derivatives and integrals, but it goes far beyond that and the complexity can get really deep. Machine learning is similar: people outside the field treat it like magic, but the people who actually know machine learning know that it's sort of a buzzword.

Machine Learning is basically teaching a computer program how to answer a question. This is done by giving the program "training data". The program should take a new piece of data and predict something about that new data.

The training data usually has an input and an output. The inputs are called "features", and the outputs are called "labels". For a machine learning program to work well, the features need to be correlated with the labels. For example, we could try to predict if someone likes the color pink, based on their age and gender. The "features" are the age and gender, and the "label" is whether they like the color pink. We can have a bunch of examples as training data (like a 12 year old girl likes pink, a 14 year old girl likes pink, a 30 year old man doesn't like pink, etc.), and use it to train a machine learning classifier. Once trained, we can use the machine learning classifier to make predictions for new ages/genders.
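To make the features/labels idea concrete, here's a minimal sketch in plain Python. It uses the three examples above and a "nearest neighbor" rule (find the most similar training example and copy its label). The nearest neighbor rule, the 0/1 encoding for gender, and the function names are all my own assumptions for illustration, not the one true way to build a classifier.

```python
# Training data from the example above: features (age, gender) -> label (likes pink?).
# Assumption: gender is encoded as a number so we can measure distance (0 = female, 1 = male).
TRAINING_DATA = [
    ((12, 0), True),   # 12 year old girl likes pink
    ((14, 0), True),   # 14 year old girl likes pink
    ((30, 1), False),  # 30 year old man doesn't like pink
]


def distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_likes_pink(features, training_data=TRAINING_DATA):
    """Return the label of the training example closest to `features`."""
    _, nearest_label = min(
        training_data, key=lambda example: distance(example[0], features)
    )
    return nearest_label


print(predict_likes_pink((13, 0)))  # 13 year old girl -> True
print(predict_likes_pink((28, 1)))  # 28 year old man -> False
```

With only three training examples the predictions aren't worth much, but the shape is the same for real problems: features go in, a label comes out.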

That's a very simple example, but other more interesting examples include Visual Recognition (a program knowing what an image is of) and Natural Language Processing (a program understanding human language, whether written or spoken).

However, we're going to keep things basic for now and learn about the two main types of machine learning.

Classification vs. Regression

There are generally two types of machine learning: Classification and Regression. The difference is really simple.

In classification, we ask a machine learning program to literally classify something. "Here's some input. What is it?" The input is our features (age/gender, or even features extracted from words or images), and the output is the label. The machine learning classifier will predict what something is, given the features.

Regression is more math-like. We give the machine some features, and try to use a model to predict (usually) some number output. For example, a house with 2 bedrooms and 1 bathroom sells for $5. A house with 4 bedrooms and 2 bathrooms sells for $10. A house with 6 bedrooms and 3 bathrooms sells for $15. How much do you think a house with 8 bedrooms and 4 bathrooms sells for? The answer is (predictably) $20.
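The house example can be worked out with a simple least-squares line fit. This is just a sketch under a couple of assumptions: I'm only using the bedroom count as the feature (the bathroom count is always exactly half the bedrooms in these examples, so it adds no new information), and `fit_line` is a helper name I made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


bedrooms = [2, 4, 6]   # feature
prices = [5, 10, 15]   # label (a number, so this is regression)

slope, intercept = fit_line(bedrooms, prices)
predicted = intercept + slope * 8  # price for an 8 bedroom house

print(slope)      # 2.5 dollars per bedroom
print(predicted)  # 20.0
```

The fitted line is price = 2.5 * bedrooms, which is where the $20 comes from.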

Easy so far, right? "Classifications" literally classify features into specific labels. "Regressions" are like math plots, where the features are used to predict a number (a value on a continuous scale, rather than one of a fixed set of labels). Next we'll look at two more machine learning concepts, Supervised and Unsupervised learning.

Supervised vs. Unsupervised

Supervised and Unsupervised learning are actually simple concepts. In "Supervised" learning, we tell the machine learning program the answer. In other words, the output is included in the training data. A 12 year old girl "likes pink". A 2 bedroom/1 bathroom house sells for "$5".

In Unsupervised learning, there is no specific output label. The most common form of this is clustering, where the program groups similar examples together. This is helpful for making observations about the training data. For example, say we ran a bookstore and recorded the details of which book each customer buys (for example "A 20 year old male buys The Great Gatsby"). We could use unsupervised learning to try to find similarities between groups. For instance, the same people who buy "The Lord of the Rings" books probably buy "Game of Thrones" books. There's no correct input/output, we're just observing similarities.
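Here's a rough sketch of clustering, using a tiny version of the k-means algorithm on customer ages. The ages are made-up data, k-means is just one clustering technique I picked for illustration, and the starting centroids are chosen in a fixed way (spread between the smallest and largest value) so the result is repeatable. Notice that there are no labels anywhere in the input.

```python
def k_means_1d(values, k=2, max_iterations=100):
    """Group numbers into k clusters (k >= 2). Returns (centroids, clusters)."""
    # Deterministic start: spread the initial centroids evenly from min to max.
    low, high = min(values), max(values)
    centroids = [low + i * (high - low) / (k - 1) for i in range(k)]

    for _ in range(max_iterations):
        # Step 1: assign every value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)

        # Step 2: move each centroid to the average of its cluster.
        new_centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]

        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return centroids, clusters


# Hypothetical customer ages; no output labels are given.
ages = [18, 20, 22, 45, 50, 52]
centroids, clusters = k_means_1d(ages, k=2)

print(clusters)   # [[18, 20, 22], [45, 50, 52]]
print(centroids)  # [20.0, 49.0]
```

The program found a "younger" group and an "older" group on its own. It's up to us to look at the groups and decide what they mean.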

So now we know about Supervised and Unsupervised machine learning. We're getting to the final section, where we bring everything together...Technique!

Technique!

We've gone over what machine learning is, but how do we actually use it? Okay, so real talk...It depends.

First, you need to decide what it is that you're using the machine learning program for. What question is it answering?

Depending on what it is you're trying to do, you pick a machine learning algorithm to try out. You feed it training data and test the output on data it hasn't seen before. If you're satisfied with the output, then you keep the program. If you're not satisfied, then usually you tweak the algorithm's parameters a bit, and possibly tweak your features (input). You can also try different algorithms.
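That "try it, check it, tweak it" loop can be sketched in a few lines. Everything here is an assumption for illustration: the data is made up (including one deliberately odd example), the algorithm is a simple k-nearest-neighbors vote, and the parameter being tweaked is `k`, the number of neighbors that get a vote.

```python
from collections import Counter

# Hypothetical data: (age, gender) -> likes pink? Gender: 0 = female, 1 = male.
TRAIN = [
    ((12, 0), True), ((14, 0), True), ((16, 0), True),
    ((15, 0), False),  # an odd example that goes against the trend
    ((25, 1), False), ((30, 1), False), ((40, 1), False),
]
# Held-out test data that the program never trains on.
TEST = [((15, 0), True), ((13, 0), True), ((35, 1), False), ((28, 1), False)]


def knn_predict(features, training_data, k):
    """Majority vote among the k training examples closest to `features`."""
    by_distance = sorted(
        training_data,
        key=lambda example: sum((a - b) ** 2 for a, b in zip(example[0], features)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def accuracy(training_data, test_data, k):
    """Fraction of test examples that are predicted correctly."""
    correct = sum(
        1 for features, label in test_data
        if knn_predict(features, training_data, k) == label
    )
    return correct / len(test_data)


# Try a couple of parameter values and keep whichever scores best.
results = {k: accuracy(TRAIN, TEST, k) for k in (1, 3)}
best_k = max(results, key=results.get)

print(results)  # {1: 0.75, 3: 1.0}
print(best_k)   # 3
```

With `k=1` the odd training example fools the program, and with `k=3` its neighbors outvote it. Real projects do the same thing with bigger data and more parameters.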

I know what I said sounds vague, but honestly that's what it is: trying different things and checking if the accuracy is good.

So assuming that you DID know all the machine learning algorithms for all of the machine learning question types, then the question remains...how do we code it? Honestly, there are a lot of different machine learning libraries out there. There's almost no reason to make your own, so check out open source libraries for your favorite programming language. We'll go over these in another article.

Conclusion

Okay, so we didn't do much, but the foundation is set. We know what machine learning is, and we know the types of machine learning.

In the next article, we're going to go over some machine learning examples using simple Python libraries, to get some hands-on experience. From there, we'll expand our horizons by learning about all sorts of machine learning questions and algorithms. We'll also look at a bunch of libraries that have those algorithms already implemented, making our lives easier. Finally, we'll go deep by using Neural Networks and Deep Learning, as well as special areas, such as NLP and Visual Recognition.

Like this content and want more? Feel free to look around and find another blog post that interests you. You can also contact me through one of the various social media channels.

Twitter: @srcmake
Discord: srcmake#3644
Youtube: srcmake
Twitch: www.twitch.tv/srcmake
Github: srcmake

References
1. www.quora.com/What-is-the-difference-between-supervised-and-unsupervised-learning-algorithms

Author

Hi, I'm srcmake. I play video games and develop software.

License: All code and instructions are provided under the MIT License.
