Machine learning algorithms fall into two fundamentally different types at a high level: supervised and unsupervised. Supervised algorithms are better known to the general public than unsupervised approaches. Classifying breast cancer in images that have been annotated by doctors is one real-world example of supervised machine learning. Supervised algorithms can be extremely powerful, but they are often limited by the availability of labeled data: tedious and costly manual labor is needed to prepare data sets on which they can reach the expected performance. Unsupervised machine learning algorithms, by contrast, are meant to find structures and relationships in the raw data itself, without any labels or prior information provided by human supervisors.

This course introduces several unsupervised machine learning techniques that can be applied in different domains, from finding hidden structures in time series data and representing text numerically to generating new image data. To that end, we introduce the algorithms one by one in a rigorous manner, guided by examples. Participants will learn when and how an unsupervised machine learning technique is applicable. They will also be able to implement these techniques themselves and so expand the data analysis tools at their disposal.

Summarized goals and scope:
- understand the difference between unsupervised and supervised machine learning
- clustering (K-means, DBSCAN, agglomerative clustering)
- dimensionality reduction (robust PCA, t-SNE)
- semi-supervised machine learning algorithms
  o introduction to autoencoders and their applications (e.g. automated feature engineering)
  o word2vec algorithm to generate numerical embeddings of textual data
- generative models
  o discriminative vs. generative models
  o creating images with variational autoencoders