My machine learning workflow on Spark: Classification of customer behavior

Using a real-world example, I demonstrate a complete machine learning workflow with as little code as possible. Learn More


Tips to make machine learning algorithms better: Imbalanced samples, stacking, and more

Now we have a functioning workflow. But there is more we can try to make our predictions more accurate! Learn More


GLM Application in Spark: a case study

In the insurance industry, modeling the loss ratio is an important topic. We will use the ml library in Spark to implement a popular frequency-severity model. Learn More


Sklearn: Some lesser-known techniques of this awesome package

Spark is awesome, and I never said otherwise. However, for smaller projects, sklearn is also a good choice and offers a lot of flexibility! Learn More


How to make your code faster: some thoughts on high-performance computing.

Faster and more elegant code has always been what we are after. But among all the tools that enhance performance, is there a clear winner? Coming soon