How does Data Science work in daily practice?

10 practical examples of Data Science applications

Alex de Ronde
Data Scientist
4 min
07 May 2019

Organisations that want to get started with data are quick to ask for Data Science solutions. Data Science is often seen as the holy grail of data-driven working. But what does a successful Data Science project actually look like in practice? And how can it serve your organisation? In this series of articles, we take you through all the elements you need to achieve success for your organisation with Data Science.

As an introduction, this article contains 10 practical examples of successful projects carried out by Digital Power's Data Science consultants. Use them to get an idea of what Data Science is and how it can serve your organisation.

What does a Data Scientist do? 

Data Science applications lie at the intersection of statistics, IT and knowledge of what your organisation does. The work of a Data Scientist starts where the work of an 'ordinary' Data Analyst ends. It ends where hardcore Software or Data Engineering begins, or where statistics and machine learning are no longer practically applicable (and therefore useless).

Data Scientists' Biggest Challenges

For organisations with a good basic infrastructure, a clear data strategy and measurable objectives, Data Science solutions can add a lot of value. The biggest challenges for Data Scientists are therefore to:

  • Discover where and how they can add value to an organisation or process
  • Determine how they make the data available and usable
  • Develop a Data Science model that fits the organisation, both in terms of technology and people
  • …and then implement and maintain it!

10 practical examples of Data Science applications

1. Insurance Recommendation System

For one of the largest insurers in the Netherlands, we developed a predictive model based on various machine learning algorithms. We trained the model using historical customer cases. The model can predict which insurance policies new customers are most likely to choose.

If required, it also predicts what price category a customer falls into. In this way, the correct premium can be calculated and displayed automatically. This allows the insurer to make real-time recommendations to its customers.

For this Data Science project we worked with R, H2O, Python and Spotfire.

2. Visual social media brand analysis using scraping and a predictive model

We participated in a large-scale social media study with the aim of gaining insight into the effects of visual social media posts on brands. For this we scraped a large number of Instagram posts and photos about various popular brands. We then classified these based on various image characteristics (both content and design).

Based on that classification, we built a predictive model that predicts the popularity of brand-related social media posts. The model does this based on image characteristics of the posted photo and properties of the person who posts it.

We used Python and R.

3. Churn analysis for e-commerce

A fast-growing e-commerce platform devoted a lot of capacity to personalised email marketing. It was suspected that some of the recipients never opened the emails and/or would never convert again.

We developed a Data Science model that automatically determines which customers no longer need to receive emails. The model also provided valuable insights for the CRM team.

For this Data Science project we used R and Alteryx.

4. Enrichment of a real estate player's database

A rental agent wanted to add the sun position of the garden to every property on their new website. This information was not available in the existing database. By combining public geodata in a smart way, we were able to enrich the database for them. This improves the online information provision for home seekers.

We did this using Python and QGIS.
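
As a rough illustration of the idea (not the method used in this project), the orientation of a garden can be approximated by the compass bearing from the centre of the house to the centre of the garden. The function names, the eight compass labels and the input format are assumptions made for this sketch. Coordinates are assumed to be in a projected system where x points east and y points north.

```python
import math

COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def bearing_degrees(house_xy, garden_xy):
    """Compass bearing (0 = north, 90 = east) from house centre to garden centre."""
    dx = garden_xy[0] - house_xy[0]  # offset towards the east
    dy = garden_xy[1] - house_xy[1]  # offset towards the north
    return math.degrees(math.atan2(dx, dy)) % 360


def garden_orientation(house_xy, garden_xy):
    """Map the bearing to one of eight compass directions."""
    sector = int((bearing_degrees(house_xy, garden_xy) + 22.5) // 45) % 8
    return COMPASS[sector]


# Hypothetical example: the garden lies directly south of the house
print(garden_orientation((0, 0), (0, -10)))
```

A garden whose orientation comes out as "S" or "SW" would typically be the sunniest in the Netherlands. A real enrichment would also have to account for shade from neighbouring buildings.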

5. Expected Customer Lifetime Value model for Google Ads

For an e-commerce platform, we developed a model that determines the Expected Customer Lifetime Value of every customer after two weeks. We used online data, attribution data and order data as input for the model. As output, the model automatically provides recommendations for the Google Ads strategy. The budget is now effectively spent on the campaigns with the highest revenue and positive ROI.

The main tools and programming languages we used were R and PHP.

6. Online recommender system

An internationally operating e-commerce company wanted to increase turnover per website visitor. We developed an algorithm that makes real-time, personally relevant product recommendations based on click behaviour and the visitor's purchase history. We linked CRM data with the webshop. After implementation, the turnover per visitor increased.

We used the Data Science tools R, Alteryx and MongoDB for this.
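
To give an idea of how recommendations based on purchase history can work, here is a minimal sketch in Python. It uses simple co-occurrence counting ("customers who bought this also bought that"), which is an assumption for illustration and not necessarily the algorithm used in this project. All function and product names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(histories):
    """Count how often each pair of products appears in the same history."""
    co = defaultdict(Counter)
    for history in histories:
        for a, b in permutations(set(history), 2):
            co[a][b] += 1
    return co


def recommend(co, history, top_n=3):
    """Score products the visitor has not seen by co-occurrence with their history."""
    seen = set(history)
    scores = Counter()
    for product in seen:
        for other, count in co.get(product, {}).items():
            if other not in seen:
                scores[other] += count
    # Highest score first; ties are broken alphabetically for a stable order
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:top_n]]


# Hypothetical purchase histories
histories = [
    ["shoes", "socks"],
    ["shoes", "socks", "laces"],
    ["shoes", "laces"],
    ["hat", "scarf"],
]
co = build_cooccurrence(histories)
print(recommend(co, ["shoes"]))
```

In a production setting, the counts would be updated continuously from click and order data so that recommendations can be served in real time.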

7. Optimal distribution of marketing budget using Data Science

For an energy company with fixed subscription costs, we developed a Data Science model that determines which type of customer has the lowest costs. Based on this, the marketing budget can be used as efficiently as possible.

We built this model using Alteryx.

8. Expected Customer Lifetime Value model for budget planning

We also developed an Expected Customer Lifetime Value model for budget planning, which required a different type of model from the one for Google Ads. It can be used to calculate how much the existing customer base is still worth. This information serves as important input for strategic decisions of the marketing department.

For the development of this model, we used R.

9. Image Recognition for Fog Detection

As part of a pilot study, a government agency wanted to see whether fog could be detected in pictures taken by highway cameras. For this we trained state-of-the-art neural networks that are specifically focused on image recognition. The test results of this pilot study served as the basis for more in-depth research into fog recognition using highway cameras.

For this Data Science project we worked with R, H2O, Python and Spotfire.

10. Data Science for Planning: Determining Optimal Schedules

The assignment was to find the best possible timetable for a large number of students, each with an individual package of subjects. It was important that as few students as possible had multiple subjects scheduled at the same time. Each subject also had to be distributed in a reasonable manner over the week (i.e. not all teaching hours of one subject on the same day).

To get as close as possible to the global optimum (the best timetable), we applied a simulated annealing algorithm.

We wrote the algorithm in Python.
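
The sketch below shows what simulated annealing for such a timetable could look like. It is a simplified illustration, not the project's code. The cost function, the cooling schedule, the parameter values and all names are assumptions. A lesson is a hypothetical (subject, lesson number) pair, and a timetable maps each lesson to a time slot.

```python
import math
import random


def cost(schedule, students, slots_per_day):
    """Number of student clashes plus same-day repeats of a subject."""
    clashes = 0
    for subjects in students:
        slots = [slot for (subj, _), slot in schedule.items() if subj in subjects]
        clashes += len(slots) - len(set(slots))
    days_by_subject = {}
    for (subj, _), slot in schedule.items():
        days_by_subject.setdefault(subj, []).append(slot // slots_per_day)
    same_day = sum(len(d) - len(set(d)) for d in days_by_subject.values())
    return clashes + same_day


def anneal(lessons, students, n_days=5, slots_per_day=4,
           t_start=5.0, t_end=0.01, cooling=0.995, seed=0):
    """Search for a low-cost timetable with simulated annealing."""
    rng = random.Random(seed)
    n_slots = n_days * slots_per_day
    current = {lesson: rng.randrange(n_slots) for lesson in lessons}
    current_cost = cost(current, students, slots_per_day)
    best, best_cost = dict(current), current_cost
    t = t_start
    while t > t_end and best_cost > 0:
        # Propose a neighbour: move one random lesson to a random slot
        lesson = rng.choice(lessons)
        old_slot = current[lesson]
        current[lesson] = rng.randrange(n_slots)
        new_cost = cost(current, students, slots_per_day)
        delta = new_cost - current_cost
        # Always accept improvements; accept worse moves with probability exp(-delta / t)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current_cost = new_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
        else:
            current[lesson] = old_slot  # reject the move
        t *= cooling  # geometric cooling
    return best, best_cost


# Hypothetical input: three subjects with two lessons each, three students
lessons = [(s, i) for s in ["math", "physics", "history"] for i in range(2)]
students = [{"math", "physics"}, {"math", "history"}, {"physics", "history"}]
timetable, remaining_cost = anneal(lessons, students)
print(remaining_cost)
```

Accepting an occasional worse timetable while the temperature is high is what allows the search to escape local optima. As the temperature drops, the algorithm behaves more and more like a plain hill climber.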

Data Science projects that add value to your business

As we wrote in the introduction to this article, Data Science applications add the most value to organisations that already have a solid data infrastructure in place. In addition, you need a sufficiently good Data Science infrastructure if you want to put models into production, with the correct use of MLOps.

If this is not the case, we advise you to first map out how you can get the basics in order. We can help you with this too, step by step, using the Digital Power model of building blocks for data-driven work.
