5 Advanced Projects for Data Science Portfolio
Work on data analytics, time series, natural language processing, machine learning, and ChatGPT projects to improve your chances of getting hired.
Image by Author
In this blog, we'll explore five essential data science projects that can boost the job profiles of both final-year students and professionals. Through these projects, you'll gain a deeper understanding of data science workflows and master essential tools for data cleaning, manipulation, visualization, and modeling. Additionally, you'll learn how to write project reports and deploy machine learning models on the cloud for maximum impact.
1. Recycled Energy Saved in Singapore
In the Recycled Energy Saved in Singapore project, you will analyze how much energy Singapore saves per year by recycling plastics, paper, glass, and ferrous and non-ferrous metals. The project involves data joining, data cleaning, and data wrangling. After that, you will perform in-depth data analysis with statistical and visualization tools.
Image from Project
In the end, you will use data manipulation techniques to answer the original question: what is the total energy saved from 2003 to 2020 across the five waste types in Singapore?
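The join-and-aggregate step can be sketched with pandas. This is a minimal illustration, not the project's code: the column names, the `total_energy_saved` helper, and every number below are placeholders made up for the example, so substitute the real recycling statistics and energy factors from the project's datasets.

```python
import pandas as pd

# Placeholder figures for illustration only; these are NOT real Singapore data.
waste = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020],
    "waste_type": ["Plastics", "Paper", "Plastics", "Paper"],
    "recycled_tonnes": [100.0, 200.0, 150.0, 250.0],
})
factors = pd.DataFrame({
    "waste_type": ["Plastics", "Paper"],
    "kwh_saved_per_tonne": [10.0, 4.0],  # hypothetical energy factors
})

def total_energy_saved(waste_df, factors_df):
    """Join tonnage with per-tonne energy factors and sum the savings per year."""
    merged = waste_df.merge(factors_df, on="waste_type", how="left")
    merged["energy_saved_kwh"] = (
        merged["recycled_tonnes"] * merged["kwh_saved_per_tonne"]
    )
    return merged.groupby("year")["energy_saved_kwh"].sum()

per_year = total_energy_saved(waste, factors)
total = per_year.sum()
print(per_year)
print("Total energy saved (kWh):", total)
```

A left join keeps every waste record, so a waste type with no matching energy factor shows up as a missing value instead of silently disappearing from the total.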
2. Time Series Forecasting with statsmodels and Prophet
The Time Series Forecasting with statsmodels and Prophet project will teach you essential skills for handling time series data, performing data analysis, and forecasting.
Image from Project
You will start by training an ARIMA forecasting model on the data and evaluating it. After that, you will perform time series forecasting with the Python package Prophet.
3. spaCy Resume Analysis
In the spaCy Resume Analysis project, you will use spaCy for entity recognition on 200 resumes and various NLP tools for text analysis. The goal of the project is to help recruiters make fast and accurate decisions on thousands of job applications.
Image from Project
You will start by loading the scraped dataset and the spaCy base model for the English language. Next, you will create an entity ruler and clean the dataset. After that, you will perform data visualization, entity recognition, and dependency parsing. In the end, you will create a function that computes a resume matching score and perform topic modeling.
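The entity ruler and matching score can be sketched as follows. To avoid a model download, this example uses a blank English pipeline instead of the base model the project loads; the `SKILL` label, the three patterns, and the `matching_score` function are hypothetical names chosen for illustration.

```python
import spacy

# Assumption: a blank English pipeline, so no pretrained model is required.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# Hypothetical skill patterns; a real project would load a much larger list.
ruler.add_patterns([
    {"label": "SKILL", "pattern": [{"LOWER": "python"}]},
    {"label": "SKILL", "pattern": [{"LOWER": "sql"}]},
    {"label": "SKILL", "pattern": [{"LOWER": "machine"}, {"LOWER": "learning"}]},
])

def extract_skills(text):
    """Return the set of lowercased SKILL entities found in the text."""
    doc = nlp(text)
    return {ent.text.lower() for ent in doc.ents if ent.label_ == "SKILL"}

def matching_score(resume_text, required_skills):
    """Percentage of required skills that appear in the resume."""
    required = {skill.lower() for skill in required_skills}
    found = extract_skills(resume_text)
    return 100.0 * len(required & found) / len(required)

resume = "Experienced data analyst skilled in Python, SQL and machine learning."
score = matching_score(resume, ["Python", "SQL", "Tableau", "Machine Learning"])
print(score)
```

Here the resume covers three of the four required skills. Because the score is a simple set overlap, it is easy to explain to recruiters, but it gives every skill the same weight.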
4. Tripadvisor Data Analysis
The Tripadvisor Data Analysis portfolio project covers all aspects of data science, from data loading to data modeling. You will be analyzing reviews and ratings based on customer experience.
Image from Project
In this project you will perform:
- Data Exploration
- Sentiment Analysis using VADER
- Data Visualization
- Adding Keywords (Gensim)
- Text Processing (NLTK)
- Building Deep Learning model (BiLSTM) using Keras
- Training and Validation
- Model Evaluation
- Prediction
- Saving Model
It is an introduction to text classification using deep learning models. Before jumping into training, you will preprocess the data (Text Lemmatization), perform data analysis, and prepare the data (Tokenization) for a deep learning model.
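To show what the tokenization step produces, here is a dependency-free sketch that maps words to integer IDs and pads each review to a fixed length, which is the input shape a BiLSTM expects. The project itself relies on NLTK and Keras for this; `build_vocab`, `encode`, and the sample reviews are hypothetical and only mirror the idea.

```python
from collections import Counter

PAD_ID = 0  # fills short sequences up to max_len
OOV_ID = 1  # stands in for words not seen when building the vocabulary

def build_vocab(texts, max_words=1000):
    """Assign integer IDs (starting at 2) to the most frequent words."""
    counts = Counter(word for text in texts for word in text.lower().split())
    most_common = counts.most_common(max_words)
    return {word: idx for idx, (word, _) in enumerate(most_common, start=2)}

def encode(text, vocab, max_len=6):
    """Convert text to a fixed-length list of IDs, truncating or padding."""
    ids = [vocab.get(word, OOV_ID) for word in text.lower().split()][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))

# Made-up reviews for illustration only.
reviews = ["great hotel great staff", "terrible room"]
vocab = build_vocab(reviews)
encoded = encode("great pool", vocab, max_len=4)
print(encoded)
```

The word "pool" never appears in the training reviews, so it is encoded with the out-of-vocabulary ID, and the two trailing zeros are padding.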
5. End-to-End Loan Approval Project with ChatGPT
The End-to-End Loan Approval Project with ChatGPT is my favorite. You will learn to master GPT prompting for all the steps involved in a real-life data science project. In the project, you will be asking ChatGPT to help you create an end-to-end loan approval project using the data that was extracted from LendingClub.com.
Image from Loan Classifier
You will learn to write prompts to generate ideas, analyze data, engineer features, preprocess and balance the data, select and tune models, evaluate them, build an app, and deploy it on a server.
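One way to keep such prompts organized is a small set of reusable templates, one per project stage. The sketch below is only an illustration: the stage names, the prompt wording, and the `build_prompt` helper are hypothetical and are not taken from the project.

```python
# Hypothetical prompt templates; the wording is illustrative only.
STAGE_PROMPTS = {
    "planning": "Outline the steps for an end-to-end {task} project using {dataset}.",
    "feature_engineering": (
        "Suggest feature engineering ideas for a {task} model trained on {dataset}."
    ),
    "evaluation": "Which metrics suit an imbalanced {task} classifier, and why?",
}

def build_prompt(stage, task="loan approval", dataset="a LendingClub loan dataset"):
    """Fill in the template for one project stage."""
    return STAGE_PROMPTS[stage].format(task=task, dataset=dataset)

for stage in STAGE_PROMPTS:
    print(f"[{stage}] {build_prompt(stage)}")
```

Each generated string can be pasted into ChatGPT as is, and keeping the templates in one place makes it easier to refine the wording as you learn which prompts give useful answers.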
We are at the stage where companies will start to ask employees to learn prompting skills and get better at using the new AI tools. Prompt engineering will become an essential skill for data scientists, and recruiters will ask for experience in using GPT for data science tasks.
So, why wait? Start using GPT-4 and other AI tools to get productive and become future-proof.
Conclusion
Even if you lack experience, these projects can help you land your dream job. After working on the projects, I highly recommend that you share them on GitHub, DagsHub, Deepnote, or Kaggle. These platforms are used by developers and data scientists to showcase their projects and skills.
In this post, we have reviewed 5 advanced projects that cover data analytics, time series, natural language processing, machine learning, and prompt engineering using ChatGPT. If you are interested in learning about projects that deal with specific fields of data science, check out the complete collection of data science projects – Part 1 and Part 2.
I hope this list of advanced projects helps you. Do let me know if you have better suggestions.
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.