Probabilistic Modeling of User-Behavior Data

Together with a startup in Hamburg, we developed a pipeline that gave the team deeper insight into user-behavior clusters.
The raw user data came from a CRM API and was loaded into relational database systems. Using a combination of statistical and unsupervised ML methods, we extracted the important features and then fitted probabilistic models in R and Stan with MCMC methods. The probabilistic models allowed us to draw meaningful inferences from a sparse data basis.
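
To illustrate why MCMC-fitted probabilistic models help with sparse data, here is a minimal sketch in Python. It is not the project's code: the actual models were written in R and Stan, whose sampler is based on Hamiltonian Monte Carlo, whereas this sketch swaps in a plain random-walk Metropolis sampler to stay dependency-free. The model (a single rate with a Beta prior), the example counts, and all function names are assumptions made for illustration.

```python
import math
import random


def log_posterior(theta, successes, trials, alpha, beta):
    """Unnormalized log posterior of a rate theta: Beta(alpha, beta) prior, binomial likelihood."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior mass outside (0, 1)
    return ((successes + alpha - 1) * math.log(theta)
            + (trials - successes + beta - 1) * math.log(1.0 - theta))


def metropolis(successes, trials, alpha=2.0, beta=2.0,
               n_samples=20000, step=0.1, seed=0):
    """Random-walk Metropolis sampler; returns draws after discarding the first quarter as burn-in."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, trials, alpha, beta)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials, alpha, beta)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples[n_samples // 4:]


# Hypothetical sparse observation: 3 conversions among 20 users.
draws = metropolis(successes=3, trials=20)
posterior_mean = sum(draws) / len(draws)
print(f"posterior mean rate: {posterior_mean:.3f}")
```

With only 20 observations the raw rate (3/20) is noisy; the prior pulls the estimate toward a plausible range, and the draws give a full distribution of uncertainty instead of a single point estimate. For this conjugate model the exact posterior is Beta(5, 19), so the sampler's mean can be checked against 5/24.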

Classify Cancer Clusters

For the Biohackathon Copenhagen 2020, we developed a machine-learning pipeline to classify breast cancer clusters from gene expression data and CNA data. We used XGBoost, a more classical ML approach. You can read all about it here.