- [[principal component analysis]]
- [[manifold learning]]
- [[factor analysis]]
- [[random projections]]
- [[autoencoders]]
- [[topic modeling]]

# Idea

Unsupervised learning lets us approach problems with little or no idea of what the results should look like. The model isn't trained on the "right answer," so the model/algorithm isn't "supervised": it works with **unlabeled** data. The data come with inputs $x$ but no output labels $y$ (unlike [[supervised machine learning|supervised learning]]), and the algorithm has to find structure in the data on its own.

Because there is no correct answer to learn from, there is no feedback based on prediction results. We can still derive structure from data even when we don't know the effects of the individual variables, for example by clustering the data based on relationships among the variables.

# Examples

Common and important algorithm families:

- [[clustering algorithms]]: group similar points together (a minimal sketch follows the references below)
- [[dimension reduction]]: compress data
- [[anomaly detection]]: find unusual data points

Common applications of **clustering** include grouping news articles on the web, grouping people by gene expression, organizing large computer clusters, social network analysis, market segmentation (understanding customers), and astronomical data analysis. Another classic unsupervised problem is the [[cocktail party problem]] ([[source separation]]).

# References

- [Unsupervised learning part 1 - Week 1: Introduction to Machine Learning | Coursera](https://www.coursera.org/learn/machine-learning/lecture/TxO6F/unsupervised-learning-part-1)
- [Unsupervised learning part 2 - Week 1: Introduction to Machine Learning | Coursera](https://www.coursera.org/learn/machine-learning/lecture/jKBHE/unsupervised-learning-part-2)
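
A minimal sketch of the clustering case referenced above: k-means fit on unlabeled points. scikit-learn and the synthetic 2-D blobs are illustrative assumptions, not prescribed by the lectures or this note.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic, unlabeled 2-D inputs x drawn from three blobs (hypothetical data);
# no labels y exist anywhere in this example.
x = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(100, 2)),
    rng.normal(loc=(5, 5), scale=0.5, size=(100, 2)),
    rng.normal(loc=(0, 5), scale=0.5, size=(100, 2)),
])

# fit() receives only the inputs: the algorithm finds the group structure itself.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(x)

print(kmeans.cluster_centers_)  # learned group centers
print(kmeans.labels_[:10])      # cluster assignments for the first 10 points
```

The point of the sketch is that `fit()` never sees a "right answer"; the grouping comes entirely from relationships among the input variables.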