Last Update: 17/06/2023
I am a data scientist who helps companies leverage data for business growth.
I use Python, SQL, Machine Learning, Power BI and Statistics to collect and analyze data,
create visualizations and dashboards, build predictive models, test hypotheses and communicate results.
My Resume
My Latest Project
Rossmann Store Sales Forecasting
This project aims to predict the sales of Rossmann stores for the next 6 weeks using time series analysis.
The methods used include data cleaning, exploratory data analysis (EDA), feature engineering, time series analysis and the XGBoost algorithm.
The dataset is publicly available from a Kaggle competition and contains historical sales data for 1,115 Rossmann stores. Rossmann itself operates drug stores in seven European countries.
This project is a movie recommendation system that uses the MovieLens Dataset and Pandas library.
It implements an item-based collaborative filter with different similarity metrics to suggest movies to users.
It also has a web application built with the Streamlit library.
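To give a feel for how item-based collaborative filtering works, here is a minimal sketch in Pandas. It is not the code from the repository: the toy ratings, the column names (`userId`, `title`, `rating`) and the helper functions `item_similarity` and `similar_movies` are all assumptions made for illustration, and cosine and Pearson are just two examples of similarity metrics.

```python
import numpy as np
import pandas as pd

# Toy ratings in a MovieLens-style long format.
# These values are made up for illustration; they are not real MovieLens data.
ratings = pd.DataFrame({
    "userId": [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "title":  ["A", "B", "C", "A", "B", "B", "C", "D", "A", "D"],
    "rating": [5.0, 4.0, 1.0, 4.0, 5.0, 2.0, 5.0, 4.0, 5.0, 2.0],
})

# User x movie matrix; movies a user has not rated stay NaN.
matrix = ratings.pivot_table(index="userId", columns="title", values="rating")


def item_similarity(matrix, metric="cosine"):
    """Return a movie x movie similarity table (hypothetical helper)."""
    if metric == "pearson":
        # Pairwise correlation over users who rated both movies.
        return matrix.corr(method="pearson", min_periods=2)
    if metric == "cosine":
        # Treat missing ratings as 0, then compare the movie columns.
        filled = matrix.fillna(0.0).to_numpy()
        norms = np.linalg.norm(filled, axis=0)
        sim = (filled.T @ filled) / np.outer(norms, norms)
        return pd.DataFrame(sim, index=matrix.columns, columns=matrix.columns)
    raise ValueError(f"unknown metric: {metric}")


def similar_movies(title, sim, n=3):
    """Return the n movies most similar to `title`, best match first."""
    scores = sim[title].drop(labels=title).dropna()
    return scores.sort_values(ascending=False).head(n)


cosine_sim = item_similarity(matrix, metric="cosine")
print(similar_movies("A", cosine_sim))
```

Swapping `metric="cosine"` for `metric="pearson"` changes only how the movie-to-movie scores are computed; the recommendation step stays the same.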
A credit card company wants to know its customers better and tailor its ads and offers accordingly.
It has hired a data scientist to create customer segments using Hierarchical Clustering.
This repository shows how to solve this case study on a dataset with 8,950 rows and 18 columns.
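As a rough illustration of the idea, the sketch below runs agglomerative (hierarchical) clustering with SciPy on synthetic data. It is not the repository's code: the two made-up features stand in for the real 18 columns, and the Ward linkage, the standardization step and the choice of two clusters are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic stand-in for two customer features (say, balance and purchases).
# These numbers are generated for illustration; they are not the real dataset.
low_spenders = rng.normal(loc=[1000, 200], scale=[100, 50], size=(20, 2))
high_spenders = rng.normal(loc=[6000, 3000], scale=[300, 200], size=(20, 2))
X = np.vstack([low_spenders, high_spenders])

# Standardize so that no feature dominates the distance computation.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Build the merge tree bottom-up; Ward linkage is an assumed choice.
Z = linkage(X_scaled, method="ward")

# Cut the tree so that at most two segments remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

In practice the number of segments is usually chosen by inspecting the dendrogram built from `Z` rather than fixed in advance.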
This project looks at what made a marketing campaign for a Portuguese bank successful.
I used exploratory data analysis to understand how phone calls influenced customers to subscribe to a bank term deposit.
The data has 17 features and about 4,000 rows, which are described in the repository.
The goal is to find out what makes customers say ‘yes’ or ‘no’ to the bank's offer.
Do you want to know how many bikes will be rented on any given day?
Try my machine learning app that uses historical data to make accurate predictions.
You will also learn how I analyzed the data, cleaned it, selected the best features, and built a regression model using Python and Streamlit.
It’s a fun and easy way to explore the bike sharing business with data science.
I used the Adult Census Income Dataset from Kaggle to build a model that predicts if a person earns more than $50K a year based on various factors.
The dataset contains 15 features and 32K records from the 1994 Census Bureau database.
I wanted to explore how education and other aspects affect income levels and who the highest earners in the population are.
I applied feature engineering, data visualization, exploratory data analysis, hypothesis testing, and machine learning (classification) techniques using Python, Jupyter Notebook, and several libraries.
I used the Medical Insurance Cost Dataset from Kaggle to build a model that predicts the cost of medical insurance based on various factors such as BMI, age, and smoking.
The dataset contains 7 features and 3,630 records of people with different insurance costs.
I wanted to explore how smoking and BMI influence the insurance price and which customers are the most expensive for insurance companies.
I applied feature engineering, data visualization, exploratory data analysis, hypothesis testing, regression analysis, and machine learning (linear regression) techniques using Python, Jupyter Notebook, and several libraries.