10 AI Project Ideas in Computer Vision
The field of computer vision has seen the development of very powerful applications leveraging machine learning. These projects will introduce you to these techniques and guide you to more advanced practice to gain a deeper appreciation for the sophistication now available.
By Manika Nagpal, Technical Content Analyst at ProjectPro.
“Artificial intelligence is the science of making machines do things that would require intelligence if done by men.” -- Marvin Minsky, co-founder of MIT’s Artificial Intelligence laboratory.
The quote above nicely sums up the beauty of Artificial Intelligence (AI). Using AI to automate simple tasks frees humans to invest in solving more challenging problems. That is why AI is gaining so much traction despite the technology being in its infancy. One can see this in a recent Gartner survey, which projected that by the end of 2024, 75% of organizations would shift from piloting to operationalizing AI.
Artificial intelligence techniques like machine learning, deep learning, natural language processing, etc., allow their users to draw insightful conclusions from data that wouldn’t have been revealed otherwise. They also allow individuals to make predictions about specific parameters, thereby preparing them for the future. And please do not think of a dataset as just a collection of numbers. Gone are the days when that used to be the case. With the recent advancements in AI, extracting information from images and text has become possible.
The branch of AI that deals with harnessing the potential of data in the form of images and videos is called Computer Vision. There are many exciting applications of Computer Vision (CV), and in this blog, we are going to list AI project ideas that a CV enthusiast can work on. The project ideas have been split into categories mentioned below so you can smoothly browse through them as per your experience in the industry.
- AI Projects in Computer Vision for Beginners
- AI Projects in Computer Vision for Intermediate Professionals
- Challenging AI Projects in Computer Vision for Experts
AI Projects in Computer Vision for Beginners
1) Face Recognition Application
Face Recognition is a fun computer-vision-based application that most beginners enjoy building. Just think of it: an application that sees your picture and identifies you by name. Sounds cool, right? With so many computer vision libraries available, creating such an application is not as difficult as you may think.
Solution Approach: Building a face-recognition system in Python is quite simple if you start with Haar Cascade Classifiers. A Haar cascade is a pre-trained model that detects the presence of a face in a given image; it does not tell you whose face it is. You can use this model to locate and crop the face in an image and then use the KNN machine learning algorithm to estimate how closely it matches the faces you have already labeled.
Dataset: Use the Yale Face Database for this project, which has 165 grayscale images of 15 people.
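To make the matching step concrete, here is a minimal sketch of KNN over flattened face crops using NumPy. It assumes the faces have already been located, cropped, and resized to a common size by a detector; the function name `knn_predict` and the toy pixel values are made up for illustration and are not part of any library.

```python
import numpy as np


def knn_predict(train_faces, train_labels, query_face, k=3):
    """Return the majority label among the k training faces closest to query_face.

    train_faces: array of shape (n_samples, n_pixels), one flattened face crop per row.
    train_labels: sequence of n_samples names.
    query_face: array of shape (n_pixels,), a flattened crop of the same size.
    """
    train_faces = np.asarray(train_faces, dtype=np.float64)
    query_face = np.asarray(query_face, dtype=np.float64)

    # Euclidean distance from the query to every stored face.
    distances = np.linalg.norm(train_faces - query_face, axis=1)

    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]

    # Majority vote among the neighbours' labels.
    votes = {}
    for idx in nearest:
        label = train_labels[idx]
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Toy data standing in for flattened grayscale crops (hypothetical values).
faces = np.array([
    [10, 12, 11, 9],
    [11, 13, 10, 8],
    [200, 198, 205, 210],
    [202, 197, 207, 209],
])
names = ["alice", "alice", "bob", "bob"]

predicted = knn_predict(faces, names, np.array([201, 199, 204, 211]), k=3)
print(predicted)
```

With real images, each row would be a full face crop (for example 100 x 100 pixels flattened to 10,000 values), and the value of k is something you would tune on held-out images.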
Use-Case: Face Recognition is widely used as a security feature, for example, on the lock screen of mobile phones to prevent random individuals from unlocking it.
2) Mask Detection
With China closing its schools and cancelling its flights again to combat a recent surge in coronavirus cases, citizens worldwide feel alarmed. We all know by now that maintaining a physical distance of at least 2 metres and wearing masks are the two primary steps that we can take to control the spread of the virus. Yet we see so many people not wearing masks when in public places. A solution to this problem can be to use CV to build a system that can detect people who are not wearing masks.
Solution Approach: Take a CNN model pre-trained on the ImageNet dataset (VGG-16, for example) and fine-tune it to learn the difference between faces with a mask and faces without one. Once a decent accuracy has been achieved, the next step is to detect the faces in the given image. Lastly, apply the model to each detected face to test for the presence of a mask.
Dataset: You can use the COVID-19 images dataset by Prajna Bhandary for this project that has 690 images of people wearing masks and 686 images of people without masks.
Use-Case: This model can be deployed at public places to ensure people who are not wearing masks are fined.
3) Dog and Cat Classification Project
The goal of this project is to learn Image classification using computer vision. It is a fun computer vision project idea for beginners where they will train a deep learning algorithm to distinguish between the images of dogs and cats.
Solution Approach: For this problem, you can build a simple CNN model from scratch using TensorFlow and Keras in Python and train it to learn the features of cats and dogs. As an alternative, you can also fine-tune a pre-trained CNN model like VGG-16 to distinguish between the two animals automatically.
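Before reaching for TensorFlow and Keras, it can help to see what a single CNN building block computes. The sketch below is plain NumPy, not Keras code: it applies one hand-written filter, a ReLU, and 2 x 2 max pooling to a tiny made-up image. The function names and the edge-detecting kernel are illustrative assumptions; in a trained network the kernel values are learned from the cat and dog images.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide the kernel over a 2-D image with no padding and stride 1."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Element-wise product of the window and the kernel, summed up.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out


def relu(x):
    """Zero out negative responses."""
    return np.maximum(x, 0.0)


def max_pool_2x2(x):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    blocks = x[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


# A 6x6 toy image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = np.array([
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
])

feature_map = max_pool_2x2(relu(conv2d_valid(image, vertical_edge)))
print(feature_map)
```

A Keras model stacks many such filter, activation, and pooling steps and ends with dense layers that turn the resulting feature maps into a cat-or-dog prediction.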
Dataset: Dogs vs. Cats Dataset on Kaggle
Use-Case: This project idea is a great way to learn how convolutional neural network (CNN) models are built from scratch using the TensorFlow and Keras libraries in Python.
4) Click My Selfie! System
Clicking selfies is now a hobby of Gen Z! They’re learning things faster because they belong to the generation that has witnessed smartphones everywhere right from birth. And, most of them do not hesitate to share what they learned on social media with their friends. So, we came up with an excellent computer vision project idea for our Gen Z, making an automated selfie system that clicks pictures when the person looks at the camera with a smile.
Solution Approach: For this project, you can take a convolutional neural network model like VGG-16 and train it to differentiate between a smiling face and a non-smiling face. Once you have achieved a decent accuracy, move ahead with testing the model on your own image. After that, you can use the OpenCV library to run this model on each frame of the live camera feed and trigger a capture whenever a smiling face is detected. Make sure to perform face detection first, both when training and when testing, so that the model only ever sees cropped faces.
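The capture trigger itself is worth a quick sketch. A classifier's output can flicker from frame to frame, so one option is to take the photo only after the smile score has stayed high for a few frames in a row. The function below is a hypothetical helper in plain Python; the threshold and the number of consecutive frames are assumptions you would tune, and the scores stand in for whatever your model outputs per frame.

```python
def frames_to_capture(smile_scores, threshold=0.8, min_consecutive=3):
    """Return indices of frames at which a photo should be taken.

    smile_scores: per-frame smile probabilities in [0, 1].
    A capture fires once the score has stayed at or above `threshold`
    for `min_consecutive` frames in a row, then the streak resets.
    """
    captures = []
    streak = 0
    for index, score in enumerate(smile_scores):
        if score >= threshold:
            streak += 1
        else:
            streak = 0
        if streak == min_consecutive:
            captures.append(index)
            streak = 0  # wait for a fresh run of smiles before the next photo
    return captures


# Made-up scores for eight consecutive frames.
scores = [0.1, 0.9, 0.85, 0.2, 0.9, 0.95, 0.92, 0.3]
print(frames_to_capture(scores))
```

In a live application, the loop body would sit inside the OpenCV frame-reading loop, and a capture would mean saving the current frame to disk.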
Dataset: Smile-detection Dataset on Kaggle
Use-Case: Gen Z can use it for clicking selfies, and it can also benefit digital marketing teams that run campaigns offering free samples to users who share a review on social media.
AI Projects in Computer Vision for Intermediate Professionals
5) Text Recognising System
Visiting a foreign country where people don’t speak the same language as you do can be challenging. But that shouldn’t stop you from exploring such countries and experiencing the culture they have to offer. Fortunately, with computer vision technology, the experience of traveling to different countries around the world has vastly improved. One of the reasons behind that is its application in text recognition systems, which can read text in one language and translate it into a user-specified language.
Solution Approach: For this project, the primary task is Optical Character Recognition (OCR), and you can use Tesseract by Google for it along with an object detection model like YOLO v4. You can download the pre-trained YOLO weights and then build your custom object detection model on top of them. After that, use LabelImg to annotate the images for training. Next, train the YOLO model using the annotated images. Finally, use the pytesseract library to extract text from the test images.
Dataset: Text-Image-OCR Dataset on Kaggle
Use-Case: This project can be used in language translation applications.
6) Digit Recognizer using MNIST
The MNIST dataset is quite popular in the Data Science community. It has images of handwritten digits and was created by resampling the original dataset by NIST. The MNIST dataset has 70,000 grayscale images of size 28 x 28 pixels, split into 60,000 training images and 10,000 test images. For this project, you can build a digit recognition system using this dataset.
Solution Approach: The first step for this project will be to analyze the MNIST dataset properly. It will allow us to understand how the data must be preprocessed before applying any algorithm. Once the analysis and preprocessing have been performed, you can design a CNN model for classifying digits in Python. After you have achieved a fair accuracy, move on to evaluating the model on the test images. You can use a confusion matrix to see which digits the model confuses with one another.
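As a rough illustration of the preprocessing and evaluation steps, here is a small NumPy sketch. It assumes the images arrive as 8-bit arrays of shape (n, 28, 28) and the labels as integers from 0 to 9; the random batch and the label lists below are placeholders, not real MNIST data or real model predictions.

```python
import numpy as np


def preprocess(images):
    """Scale 8-bit pixel values to [0, 1] and add a channel axis for a CNN."""
    images = np.asarray(images, dtype=np.float32) / 255.0
    return images[..., np.newaxis]  # (n, 28, 28) -> (n, 28, 28, 1)


def confusion_matrix(true_labels, predicted_labels, num_classes=10):
    """Count predictions: rows are true digits, columns are predicted digits."""
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(true_labels, predicted_labels):
        matrix[t, p] += 1
    return matrix


# Placeholder batch with MNIST's image shape (random pixels, not real digits).
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 28, 28))
x = preprocess(batch)

# Hypothetical labels and predictions, just to exercise the function.
y_true = [3, 3, 8, 1]
y_pred = [3, 8, 8, 1]
cm = confusion_matrix(y_true, y_pred)
accuracy = np.trace(cm) / cm.sum()
print(x.shape, accuracy)
```

The diagonal of the matrix holds the correct predictions, so any large off-diagonal entry, such as row 3 and column 8 here, points to a pair of digits the model mixes up.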
Dataset: MNIST handwritten digit database by Yann LeCun, Corinna Cortes, and Chris Burges
Use-Case: This project can be scaled up to build an application that reads handwritten texts in different languages and transforms them into digital information. One can then apply language translation techniques to convert it to their choice of language.
7) Image Colorization
While looking at those old grayscale images, so many of us have a hard time imagining the colors that the moment captured would have contained. To ease our pain, computer vision technology has the perfect solution because one can use it to make a smart image colorization system.
Solution Approach: For implementing this project idea, you can use the VGG-16 model. After initializing the model parameters, use ImageDataGenerator for rescaling the images. Next, convert the images from the RGB format to the LAB format. After that, build a sequential autoencoder model using Keras and test its performance using test images.
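The RGB-to-LAB conversion is the step that makes colorization tractable, because LAB separates lightness (L) from the two colour channels (a and b). Below is a minimal NumPy sketch of the standard sRGB to CIE LAB conversion with a D65 white point. In practice you would more likely call an image library for this; the function name `rgb_to_lab` is just a label for this sketch, and the split into a lightness input and colour targets at the end reflects a common colorization setup rather than anything prescribed above.

```python
import numpy as np

# sRGB (D65) to CIE XYZ matrix and the D65 reference white.
_RGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])
_WHITE_D65 = np.array([0.95047, 1.0, 1.08883])


def rgb_to_lab(rgb):
    """Convert sRGB values in [0, 1], shape (..., 3), to CIE LAB."""
    rgb = np.asarray(rgb, dtype=np.float64)

    # Undo the sRGB gamma curve to get linear light.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)

    # Linear RGB -> XYZ, normalised by the reference white.
    xyz = linear @ _RGB_TO_XYZ.T / _WHITE_D65

    # Piecewise cube-root function from the CIE LAB definition.
    delta = 6.0 / 29.0
    f = np.where(xyz > delta ** 3, np.cbrt(xyz), xyz / (3 * delta ** 2) + 4.0 / 29.0)

    lightness = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)


# Two sample pixels: pure black and pure white.
pixels = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
lab = rgb_to_lab(pixels)

l_channel = lab[..., 0]    # what a grayscale photo already gives you
ab_channels = lab[..., 1:]  # what a colorization model would have to predict
print(np.round(lab, 2))
```

Black maps to a lightness of 0 and white to a lightness of about 100, with both colour channels near zero, which is a quick sanity check for any conversion routine you end up using.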
Dataset: Landscape Pictures on Kaggle
Use-Case: This project can be used to color old historical images to obtain more information from them.
Challenging AI Projects in Computer Vision for Experts
8) Social Distancing Tracker
Social distancing, that is, maintaining a physical distance of two meters between people, is one of the best preventive measures against the coronavirus. The virus is deadly, and if citizens want occasional lockdowns to not happen in the near future, social distancing norms have to be followed. Computer vision technology can be of great help as one can use it to build a system that estimates the distance between any two individuals in a given frame.
Solution Approach: The first step in this project will be to use an object detection model like Faster RCNN and train it to identify people in a frame. Once that is done, you will have to set the scale for pixels and use that scale to transform pixel distance into the actual distance. If that distance is less than 2 meters, a warning message should pop up on the screen.
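The distance check can be sketched in a few lines of plain Python. This assumes the detector has already produced one bounding-box centre per person, and that a single metres-per-pixel scale holds across the whole frame, which is only roughly true for a camera looking straight down. The function name and the sample coordinates are hypothetical.

```python
from itertools import combinations
from math import hypot


def too_close_pairs(centroids_px, metres_per_pixel, min_distance_m=2.0):
    """Return index pairs of people standing closer than min_distance_m.

    centroids_px: list of (x, y) pixel coordinates, one per detected person.
    metres_per_pixel: calibration scale, assumed constant across the frame.
    """
    violations = []
    for (i, (x1, y1)), (j, (x2, y2)) in combinations(enumerate(centroids_px), 2):
        # Pixel distance converted to metres with the calibration scale.
        distance_m = hypot(x2 - x1, y2 - y1) * metres_per_pixel
        if distance_m < min_distance_m:
            violations.append((i, j))
    return violations


# Hypothetical detections: box centres in pixels, with 1 pixel = 0.01 m.
people = [(100, 200), (220, 200), (600, 250)]
pairs = too_close_pairs(people, metres_per_pixel=0.01)
if pairs:
    print("Warning: people too close:", pairs)
```

For an angled camera, the scale changes with depth, so a perspective transform to a top-down view would be needed before measuring distances.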
Dataset: Social Distancing Dataset
Use-Case: This project can be deployed at public places like airports, bus stops, markets, etc., to ensure social distancing.
9) Parking Management System
So many of us do not enjoy standing in long queues waiting for the parking space to be allotted. But now that we have computer vision technology with us, the long queues are expected to go away pretty soon. That’s primarily because we can harness the Artificial Intelligence technology for creating an automated parking system where one’s car is parked automatically.
Solution Approach: This project consists of several mini projects: number plate recognition, vehicle identification, path identification, and an auto-debiting system. For the first three of these, you can use an object detection model and train it to identify vehicle license plates and vehicle models. After that, use computer vision to navigate a path for the vehicle based on the identification. The next step is to scan the records
Dataset: We recommend you spend time building your own dataset, especially for this project. For trial methods, you can use Stanford Cars Dataset and Car License Plate Detection that are available on Kaggle.
Use-Case: This project can be implemented in malls, metro stations, etc. to speed up the parking process.
10) Automated Attendance System
Maintaining physical records of employees/students at an institution is sometimes difficult because of the space it may require. Thanks to developments in the IT industry, software-based attendance systems are now easily accessible. These have made it possible to store information digitally, which is much more convenient and efficient than registers. However, AI experts want to make attendance systems smoother and more automated by using computer vision. Such a system will capture an individual’s face and scan the previously stored records to identify that person. Once the face matches one of the records, it will mark that person as present automatically.
Solution Approach: The first step will be to make a CNN model learn to identify the people whose attendance must be marked. After that, test the performance of the system by submitting the image of one individual and performing face detection over it. Next, use the trained CNN model to identify the person. Once a person has been identified, update their record in the database by marking them present.
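The final bookkeeping step might look something like the sketch below. It assumes the trained CNN outputs one probability per enrolled person, and it uses a plain dictionary in place of a real database. The function name, the names in the list, and the 0.9 confidence threshold are all illustrative assumptions.

```python
def mark_attendance(records, class_probabilities, names, min_confidence=0.9):
    """Mark the most likely person present if the model is confident enough.

    records: dict mapping each enrolled name to True (present) or False (absent).
    class_probabilities: one probability per enrolled person, e.g. a softmax output.
    names: enrolled names in the same order as class_probabilities.
    Returns the recognised name, or None when no one clears the threshold.
    """
    best_index = max(range(len(class_probabilities)),
                     key=lambda i: class_probabilities[i])
    if class_probabilities[best_index] < min_confidence:
        return None  # unknown or ambiguous face: leave the records untouched
    name = names[best_index]
    records[name] = True
    return name


enrolled = ["asha", "ben", "chen"]
attendance = {name: False for name in enrolled}

first = mark_attendance(attendance, [0.02, 0.95, 0.03], enrolled)
second = mark_attendance(attendance, [0.40, 0.35, 0.25], enrolled)
print(first, second, attendance)
```

The threshold matters here: without it, a visitor who is not enrolled would still be assigned to whichever employee or student looks most similar.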
Dataset: It will be good to create a dataset on your own for this project as that will be more fun. Otherwise, you can use the CelebA Dataset.
Use-Case: Various companies can use this project to automate their attendance systems.
If you are interested in exploring the exciting domain of Artificial Intelligence further, we recommend you try your hand at a few projects. If you don’t know where to start, check out these solved end-to-end data science and machine learning projects with source code to kickstart your learning journey.