Pancake: A Python package for model stacking
In a previous post, I discussed model stacking, a popular approach in data science competitions for boosting predictive performance. Since then, the post has attracted some attention, so I have decided to put together a Python package which provides a simple API to stack models with minimal effort… Read More
An overview of feature selection strategies
Feature selection and engineering are among the most important factors affecting the success of predictive modeling. This remains true even today, despite the success of deep learning, which comes with automatic feature engineering. Parsimonious and interpretable models provide simple insights into business problems and are therefore deemed very valuable. Furthermore, on many occasions the underlying size and structure of the data being analyzed may not allow the use of complex models that have many parameters to tune… Read More
Time series classification with TensorFlow
Time-series data arise in many fields including finance, signal processing, speech recognition and medicine. A standard approach to time-series problems usually requires manual engineering of features which can then be fed into a machine learning algorithm. Engineering of features generally requires some domain knowledge of the discipline in which the data originated. For example, if one is dealing with signals (e.g. classification of EEG signals), then possible features would involve power spectra at various frequency bands, Hjorth parameters and several other specialized statistical properties… Read More
Machine Learning for Alchemy
It is no news to anyone that applications of machine learning span a vast range of fields, from artificial intelligence to social sciences. An application that I have been excited about is the possibility of discovering and designing new compounds. The list of unprecedented consequences is very long, including development of better methods in drug discovery, and computational design of compounds (digital alchemy!)… Read More
Feature Engineering with Tidyverse
In this blog post, I will discuss feature engineering using the Tidyverse collection of libraries. Feature engineering is crucial for a variety of reasons, and it requires some care to produce any useful outcome. In this post, I will consider a dataset that contains descriptions of crimes in San Francisco between 2003 and 2015 … Read More
Yet another introduction to neural networks
There are many great tutorials on neural networks that one can find online nowadays. Simply searching for the words “Neural Network” will produce numerous results on GithubGist. Even though there are many examples floating around on the web, I decided to have my own Introduction to Neural Networks! … Read More
Deciphering the neural language model
Recently, I have been working on the Neural Networks for Machine Learning course offered by Coursera and taught by Geoffrey Hinton. Overall, it is a nice course and provides an introduction to some of the modern topics in deep learning. However, there are instances where the student has to do a lot of extra work to understand the topics covered in full detail. … Read More
Stacking models for improved predictions
If you have ever competed in a Kaggle competition, you are probably familiar with combining different predictive models for improved accuracy, which will creep your score up the leaderboard. While the technique is widely used, there are only a few resources that I am aware of where a clear description is available (one that I know of is here, and there is also a caret package extension for it). Therefore, I will try to work out a simple example here to illustrate how different models can be combined. … Read More
Machine Learning Meets Quantum Mechanics
Recently, I published an article in the Journal of Chemical Physics, entitled Tree based machine learning framework for predicting ground state energies of molecules (link to article and preprint). The article discusses in detail the application of machine learning algorithms to predict ground state energies of molecules. … Read More