Here are some definitions for commonly used terms/technologies in machine learning. I’ll try to update and improve this page with new entries over time.
Apache Spark — A framework for distributed computing, used for large-scale data manipulation and machine learning.
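As a rough sketch, assuming PySpark is installed and running locally, a distributed aggregation might look like this:

```python
from pyspark.sql import SparkSession

# Start a local Spark session (assumes a working PySpark installation).
spark = SparkSession.builder.master("local[*]").appName("demo").getOrCreate()

# Build a small DataFrame and run a distributed group-by aggregation.
df = spark.createDataFrame([(1, "a"), (2, "b"), (3, "a")], ["value", "key"])
df.groupBy("key").count().show()

spark.stop()
```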
Artificial Neural Networks — Machine learning algorithms inspired by biological neural networks.
Back-propagation — An algorithm for training neural networks in which errors are propagated backwards through the network.
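To make the idea concrete, here is a minimal sketch (illustrative only, not a full treatment) of a one-hidden-layer network trained on XOR with Numpy, where the squared-error gradient is propagated backwards through both layers:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(20000):
    # Forward pass through the network.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: the error is propagated back layer by layer.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(axis=0)

print(out.round(2).ravel())   # should approach [0, 1, 1, 0] (may vary with initialisation)
```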
Big Data — Data which is too large or unwieldy to process on a single machine, typically on the order of terabytes or more. The term is also used for machine learning and other analyses performed on data at this scale.
Classification — A machine learning problem in which each observation is assigned to one of two or more classes.
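For example, a minimal classification sketch with scikit-learn (defined below) might look like:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)        # observations and their class labels
clf = LogisticRegression(max_iter=1000)  # a simple linear classifier
clf.fit(X, y)                            # learn from the labelled data
print(clf.predict(X[:5]))                # predicted classes for new observations
```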
Clustering — The process of grouping observations that are similar according to a particular criterion.
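As an illustrative sketch, k-means clustering in scikit-learn groups observations by their distance to cluster centres:

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # which cluster each observation belongs to
print(kmeans.cluster_centers_)  # the centre of each cluster
```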
Cython — A Python-like language used to give C-like performance to Python code.
Cross Validation — A method for evaluating the performance of a learning algorithm. Particularly useful for small datasets.
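A sketch with scikit-learn, where the data is split into 5 folds and the model is trained and scored on each split in turn:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
scores = cross_val_score(KNeighborsClassifier(), X, y, cv=5)  # 5-fold cross validation
print(scores, scores.mean())   # per-fold accuracy and the average
```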
Data Science — A field covering machine learning, data cleaning and preparation, and data analysis techniques such as visualisation.
Deep Learning — A class of machine learning algorithms which use artificial neural networks with many layers.
Face Detection — The problem of determining whether an image contains a face, and if so where it is.
Face Recognition — The problem of identifying whose face appears in an image.
Feature Extraction — The process of finding relevant features in a set of data.
Gradient Descent — An optimization method which can find a minimum of a function by following the gradient.
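For instance, minimising f(x) = (x - 3)^2 by repeatedly stepping against the gradient f'(x) = 2(x - 3):

```python
x = 0.0
learning_rate = 0.1
for _ in range(100):
    gradient = 2 * (x - 3)        # derivative of (x - 3)^2
    x -= learning_rate * gradient
print(x)                          # converges towards the minimum at x = 3
```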
Hyper-parameter — A parameter of a machine learning algorithm which is set by the user rather than learned from the data, such as the number of neighbours k in k-nearest neighbours.
k-nearest Neighbors — An algorithm which makes a prediction based on the k-nearest observations.
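A small sketch with scikit-learn, predicting the majority label of the 3 nearest training observations:

```python
from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [10], [11], [12]]   # one-dimensional observations
y = [0, 0, 0, 1, 1, 1]                  # their labels
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[1.5], [10.5]]))     # [0 1]: the majority vote of the 3 nearest neighbours
```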
Kaggle — A platform which hosts data science competitions.
Linear Algebra — A field of mathematics concerning linear mappings between vector spaces. Essential to machine learning.
Machine Learning — Algorithms which improve their performance with experience. A computational branch of statistics.
Model Selection — The process of choosing hyper-parameters for a machine learning algorithm.
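One common approach, sketched here with scikit-learn, is a grid search over candidate hyper-parameters scored by cross validation:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
search = GridSearchCV(KNeighborsClassifier(),
                      {"n_neighbors": [1, 3, 5, 7]}, cv=5)
search.fit(X, y)
print(search.best_params_)   # the hyper-parameter value chosen by cross validation
```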
Natural Language Processing — A field of computer science concerned with the analysis of natural (human) languages.
Numpy — A Python array/matrix library.
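For example:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
x = np.array([1.0, -1.0])
print(A @ x)           # matrix-vector product
print(A.mean(axis=0))  # column means
```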
OpenCV — A computer vision library in C++ with bindings for Python.
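A small sketch using the Python bindings on a synthetic image, so no image file is needed:

```python
import cv2
import numpy as np

image = np.zeros((100, 100, 3), dtype=np.uint8)       # a blank colour image
cv2.circle(image, (50, 50), 20, (255, 255, 255), -1)  # draw a filled white circle
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)        # convert to grayscale
edges = cv2.Canny(gray, 50, 150)                      # detect edges
print(edges.shape, int(edges.max()))                  # (100, 100) 255
```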
Optimization — The branch of mathematics concerned with finding the minimum or maximum of a function. Essential to many machine learning algorithms.
Pandas — The Python Data Analysis library.
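For example:

```python
import pandas as pd

df = pd.DataFrame({"height": [1.7, 1.8, 1.6], "weight": [70, 80, 55]})
print(df.describe())            # summary statistics for each column
print(df[df["height"] > 1.65])  # select rows by a condition
```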
Principal Components Analysis — A classic feature extraction algorithm based on projection of the data onto a lower-dimensional subspace.
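A sketch with scikit-learn, projecting random data onto its two main directions of variance:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 5)        # 100 observations with 5 features
pca = PCA(n_components=2).fit(X)  # find the two principal components
X_reduced = pca.transform(X)      # project the data into that subspace
print(X_reduced.shape)            # (100, 2)
```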
Python — A high-level programming language, popular for machine learning applications.
Regression — A machine learning problem involving the prediction of a real-valued scalar or vector.
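A minimal sketch with scikit-learn, fitting a line to noisy data and predicting a real-valued output:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4]])
y = np.array([2.1, 3.9, 6.2, 8.1])   # roughly y = 2x
model = LinearRegression().fit(X, y)
print(model.predict([[5]]))          # a real-valued prediction, close to 10
```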
Singular Value Decomposition — A well-known matrix factorisation method.
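With Numpy, the decomposition A = U S V^T can be computed and checked directly:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0], [0.0, 2.0]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.allclose(A, U @ np.diag(s) @ Vt))   # True: A is recovered from its factors
```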
Scikit-learn — A library for Machine Learning in Python.
Scipy — A Python library for scientific computing.
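For example, its optimisation module can minimise a function numerically:

```python
from scipy import optimize

result = optimize.minimize_scalar(lambda x: (x - 3) ** 2)
print(result.x)   # close to 3, the minimiser of (x - 3)^2
```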
Statistics — A branch of mathematics concerned with finding useful patterns in data.
Stochastic Gradient Descent — A variant of gradient descent which estimates the gradient from a single example (or a small batch) at each step, making it fast on large datasets; commonly used for training deep learning models.
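A hand-rolled sketch with Numpy, fitting the slope of a line from one randomly chosen example per step:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 1000)
y = 2.0 * x + rng.normal(0, 0.1, 1000)   # the true slope is 2

w, lr = 0.0, 0.1
for _ in range(5000):
    i = rng.integers(len(x))              # sample a single example
    grad = 2 * (w * x[i] - y[i]) * x[i]   # gradient of that example's squared error
    w -= lr * grad
print(w)                                  # close to 2
```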
Tensor — A multidimensional array.
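In Numpy, for instance:

```python
import numpy as np

t = np.zeros((2, 3, 4))   # a three-dimensional tensor: a 2 x 3 x 4 array
print(t.ndim, t.shape)    # 3 (2, 3, 4)
```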
Tensorflow — A deep learning library developed by Google.
Test Set — A set of examples/observations used for evaluating the prediction performance of an algorithm.
Theano — A tensor manipulation library for Python which can run code on the GPU.
Training Set — A set of examples/observations used for training a machine learning algorithm.
Validation Set — A set of examples/observations used for tuning the hyper-parameters of an algorithm during training.
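A sketch with scikit-learn showing how the three sets defined above fit together: hold out a test set for the final evaluation, then carve a validation set out of the remaining data for tuning:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)
print(len(X_train), len(X_val), len(X_test))   # roughly a 60% / 20% / 20% split
```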