DIY Deep Learning Projects
Inspired by the great work of Akshay Bahadur, this article presents some projects that apply Computer Vision and Deep Learning, with implementations and details so you can reproduce them on your computer.
LinkedIn Data Science Community
Akshay Bahadur is one of the great examples to come out of the Data Science community on LinkedIn. There are great people on other platforms like Quora, StackOverflow, YouTube, here, and in lots of forums, helping each other in many areas of science, philosophy, math, language and, of course, Data Science and its companions.
But I think that in the past ~3 years the LinkedIn community has excelled at sharing great content in the Data Science space, from sharing experiences to detailed posts on how to do Machine Learning or Deep Learning in the real world. I always recommend that people entering this area be part of a community, and LinkedIn is one of the best. You will find me there all the time :).
Starting with Deep Learning and Computer Vision
https://github.com/facebookresearch/Detectron
Research in Deep Learning on classifying things in images, detecting them, and taking action when a system “sees” something has been very important in this decade, with amazing results such as surpassing human-level performance on some problems.
In this article I will walk you through the projects Akshay Bahadur has published in the space of Computer Vision (CV) and Deep Learning (DL).
1. Hand Movement Using OpenCV
From Akshay:
To perform video tracking an algorithm analyzes sequential video frames and outputs the movement of targets between the frames. There are a variety of algorithms, each having strengths and weaknesses. Considering the intended use is important when choosing which algorithm to use. There are two major components of a visual tracking system: target representation and localization, as well as filtering and data association.
Video tracking is the process of locating a moving object (or multiple objects) over time using a camera. It has a variety of uses, some of which are: human-computer interaction, security and surveillance, video communication and compression, augmented reality, traffic control, medical imaging and video editing.
This is all the code you need to reproduce it:
import cv2
import numpy as np
from collections import deque

cap = cv2.VideoCapture(0)
pts = deque(maxlen=64)  # most recent tracked centers, newest first

# HSV range of the tracked color. Despite the variable names, hue 110-130
# on OpenCV's 0-179 hue scale is blue; adjust the range to your marker.
Lower_green = np.array([110, 50, 50])
Upper_green = np.array([130, 255, 255])
kernel = np.ones((5, 5), np.uint8)

while True:
    ret, img = cap.read()
    if not ret:  # no frame available (camera missing or disconnected)
        break
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # Threshold on color, then clean up the mask with morphology
    mask = cv2.inRange(hsv, Lower_green, Upper_green)
    mask = cv2.erode(mask, kernel, iterations=2)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    # mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    mask = cv2.dilate(mask, kernel, iterations=1)
    res = cv2.bitwise_and(img, img, mask=mask)
    # [-2:] works whether findContours returns 2 or 3 values
    cnts, heir = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
    center = None

    if len(cnts) > 0:
        # Track the largest contour and compute its centroid from image moments
        c = max(cnts, key=cv2.contourArea)
        ((x, y), radius) = cv2.minEnclosingCircle(c)
        M = cv2.moments(c)
        if M["m00"] > 0:  # avoid division by zero on degenerate contours
            center = (int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"]))
        if radius > 5 and center is not None:
            cv2.circle(img, (int(x), int(y)), int(radius), (0, 255, 255), 2)
            cv2.circle(img, center, 5, (0, 0, 255), -1)

    pts.appendleft(center)
    # Draw the trail; older segments are drawn thinner
    for i in range(1, len(pts)):
        if pts[i - 1] is None or pts[i] is None:
            continue
        thick = int(np.sqrt(len(pts) / float(i + 1)) * 2.5)
        cv2.line(img, pts[i - 1], pts[i], (0, 0, 225), thick)

    cv2.imshow("Frame", img)
    cv2.imshow("mask", mask)
    cv2.imshow("res", res)

    k = cv2.waitKey(30) & 0xFF
    if k == 32:  # space bar quits
        break

# cleanup the camera and close any open windows
cap.release()
cv2.destroyAllWindows()
Yep, roughly 50 lines of code. Very simple, right? You will need to have OpenCV installed on your computer; installation guides are available for Mac, Ubuntu, and Windows.
2. Drowsiness Detection OpenCV
This can be used by drivers who tend to drive for long periods of time, which may lead to accidents. The code detects your eyes and raises an alert when you are drowsy.
Dependencies
- cv2
- imutils
- dlib
- scipy
Algorithm
Each eye is represented by 6 (x, y)-coordinates, starting at the left-corner of the eye (as if you were looking at the person), and then working clockwise around the eye:
Condition
It checks 20 consecutive frames, and if the eye aspect ratio stays below 0.25 across them, an alert is generated.
Relationship
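The relationship between the six landmarks and the eye aspect ratio (EAR) is commonly written as EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|), where p1..p6 follow the ordering described above. Here is a minimal sketch of that formula together with the 20-frame condition. It is not the project's own code: the function names and the sample landmark coordinates are made up for illustration, and only the 0.25 threshold and the 20-frame window come from the text.

```python
from math import dist

EAR_THRESHOLD = 0.25  # threshold quoted in the condition above
CONSEC_FRAMES = 20    # number of consecutive frames quoted above


def eye_aspect_ratio(eye):
    """eye: six (x, y) landmarks p1..p6, left corner first, then clockwise."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)  # two eyelid-to-eyelid distances
    horizontal = dist(p1, p4)               # corner-to-corner distance
    return vertical / (2.0 * horizontal)


def is_drowsy(ear_per_frame):
    """True once EAR stays below the threshold for CONSEC_FRAMES frames in a row."""
    streak = 0
    for ear in ear_per_frame:
        streak = streak + 1 if ear < EAR_THRESHOLD else 0
        if streak >= CONSEC_FRAMES:
            return True
    return False


# Hypothetical landmark sets: a wide-open eye and a nearly closed one
open_eye = [(0, 0), (2, 2), (4, 2), (6, 0), (4, -2), (2, -2)]
closed_eye = [(0, 0), (2, 0.3), (4, 0.3), (6, 0), (4, -0.3), (2, -0.3)]

print(eye_aspect_ratio(open_eye), eye_aspect_ratio(closed_eye))
print(is_drowsy([eye_aspect_ratio(closed_eye)] * CONSEC_FRAMES))
```

In a real pipeline the landmarks would come from a facial landmark detector (the dependency list suggests dlib), and the EAR of both eyes is often averaged before thresholding.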
Summing up
3. Digit Recognition using Softmax Regression
This code helps you classify different digits using softmax regression. You can install Conda for Python, which resolves all the dependencies for machine learning.
Description
Softmax Regression (synonyms: Multinomial Logistic, Maximum Entropy Classifier, or just Multi-class Logistic Regression) is a generalization of logistic regression that we can use for multi-class classification (under the assumption that the classes are mutually exclusive). In contrast, we use the (standard) Logistic Regression model in binary classification tasks.
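To make the description concrete, here is a minimal NumPy sketch of softmax regression trained with plain batch gradient descent. It is not Akshay's implementation: the toy three-cluster dataset, the function names, the learning rate, and the number of epochs are all assumptions chosen so the example runs in a moment.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def train_softmax(X, y, n_classes, lr=0.1, epochs=1000):
    """Fit weights W and bias b by gradient descent on the cross-entropy loss."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]           # one-hot encoded labels
    for _ in range(epochs):
        P = softmax(X @ W + b)         # predicted class probabilities
        grad = (P - Y) / n             # gradient of the loss w.r.t. the logits
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b


def predict(X, W, b):
    return softmax(X @ W + b).argmax(axis=1)


# Toy data: three well-separated 2-D clusters, 50 points each
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
y = np.repeat(np.arange(3), 50)
X = centers[y] + rng.normal(scale=0.5, size=(150, 2))

W, b = train_softmax(X, y, n_classes=3)
accuracy = (predict(X, W, b) == y).mean()
print("training accuracy:", accuracy)
```

For MNIST the same code would apply with each 28 x 28 image flattened into a 784-dimensional row of X and n_classes=10, although you would want to scale the pixel values first.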
Python Implementation
The dataset used was MNIST with images of size 28 X 28, and the plan here is to classify digits from 0 to 9 using Logistic Regression, Shallow Network and Deep Neural Network.
One of the best parts here is that he coded all three models from scratch in NumPy, including optimization and forward and back propagation.
For Logistic Regression: see code here
For a Shallow Neural Network: see code here
And finally with a Deep Neural Network: see code here
Execution for writing through webcam
To run the code, type:
python Dig-Rec.py
Execution for showing images through webcam
To run the code, type:
python Digit-Recognizer.py
Devanagiri Recognition
This code helps you classify different characters of the Hindi language (Devanagari script) using convnets. You can install Conda for Python, which resolves all the dependencies for machine learning.
Technique Used
I have used convolutional neural networks. I am using TensorFlow as the framework and the Keras API to provide a high level of abstraction.
Architecture
CONV2D → MAXPOOL → CONV2D → MAXPOOL → FC → Softmax → Classification
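To get a feel for what this stack does to a 32 x 32 input, the sketch below traces the tensor shape through each layer. The article does not state the kernel sizes, filter counts, or number of output classes, so the values used here (5 x 5 kernels, 32 and 64 filters, 2 x 2 pooling, 46 classes) are purely hypothetical placeholders, and no padding is assumed.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def maxpool_out(size, pool, stride=None):
    """Spatial output size of max pooling; stride defaults to the pool size."""
    stride = stride or pool
    return (size - pool) // stride + 1


def trace_shapes(input_size=32, n_classes=46):
    # Hypothetical hyperparameters, chosen only for illustration
    layers = [
        ("CONV2D", {"kernel": 5, "filters": 32}),
        ("MAXPOOL", {"pool": 2}),
        ("CONV2D", {"kernel": 5, "filters": 64}),
        ("MAXPOOL", {"pool": 2}),
    ]
    size, channels = input_size, 1  # single-channel (grayscale) input
    shapes = [("INPUT", (size, size, channels))]
    for name, cfg in layers:
        if name == "CONV2D":
            size = conv2d_out(size, cfg["kernel"])
            channels = cfg["filters"]
        else:
            size = maxpool_out(size, cfg["pool"])
        shapes.append((name, (size, size, channels)))
    shapes.append(("FLATTEN", (size * size * channels,)))
    shapes.append(("FC + Softmax", (n_classes,)))
    return shapes


for layer_name, shape in trace_shapes():
    print(f"{layer_name:<13} {shape}")
```

Tracing shapes like this is a quick way to check how many inputs the fully connected layer receives before adding more conv layers, as suggested in the points that follow.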
Some additional points
- You can go for additional conv layers.
- Add regularization to prevent overfitting.
- You can add additional images to the training set for increasing the accuracy.
Python Implementation
Dataset: DHCD (Devnagari Character Dataset), with images of size 32 x 32, classified with a convolutional network.
To run the code, type:
python Dev-Rec.py
4. Facial Recognition using FaceNet
This code performs facial recognition using FaceNet (https://arxiv.org/pdf/1503.03832.pdf). The concept of FaceNet was originally presented in a research paper, whose main idea is a triplet loss function for comparing images of different people. The implementation uses an Inception network taken from an external source, and fr_utils.py is taken from deeplearning.ai for reference. I have added several functionalities of my own to provide stability and better detection.
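The triplet loss mentioned above pulls the embedding of an anchor image towards a positive (same person) and pushes it away from a negative (different person) by at least a margin. Here is a minimal NumPy sketch of that idea, not the project's training code; the function name, the tiny 2-D embeddings, and the margin value are assumptions for illustration only.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean over the batch of max(||a - p||^2 - ||a - n||^2 + margin, 0)."""
    pos_dist = np.sum((anchor - positive) ** 2, axis=-1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(pos_dist - neg_dist + margin, 0.0).mean()


# Hypothetical 2-D embeddings (real face embeddings have many more dimensions)
anchor = np.array([[1.0, 0.0]])
same_person = np.array([[0.9, 0.1]])
other_person = np.array([[0.0, 1.0]])

# Well-separated triplet: the negative is already far away, so the loss is zero
easy = triplet_loss(anchor, same_person, other_person)
# Roles swapped: the "positive" is far and the "negative" is close, so the loss is large
hard = triplet_loss(anchor, other_person, same_person)
print(easy, hard)
```

During training, the loss is only non-zero for triplets that violate the margin, which is why choosing informative triplets matters in practice.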
Code Requirements
You can install Conda for Python, which resolves all the dependencies for machine learning. You’ll also need:
- numpy
- matplotlib
- cv2
- keras
- dlib
- h5py
- scipy
Description
A facial recognition system is a technology capable of identifying or verifying a person from a digital image or a video frame from a video source. There are multiple methods by which facial recognition systems work, but in general they compare selected facial features from a given image with faces within a database.
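The "compare with faces within a database" step can be sketched as a nearest-neighbour search over embeddings with a distance threshold. This is only an illustration under stated assumptions: the identify function, the names in the database, the 3-D vectors, and the 0.7 threshold are all hypothetical, and in a real system the embeddings would come from the trained network.

```python
import numpy as np


def identify(embedding, database, threshold=0.7):
    """Return (name, distance) of the closest database entry,
    or (None, distance) if nothing is within the threshold."""
    best_name, best_dist = None, float("inf")
    for name, reference in database.items():
        d = float(np.linalg.norm(embedding - reference))
        if d < best_dist:
            best_name, best_dist = name, d
    if best_dist > threshold:
        return None, best_dist
    return best_name, best_dist


# Hypothetical database of known faces (one embedding per person)
database = {
    "alice": np.array([0.1, 0.9, 0.0]),
    "bob": np.array([0.8, 0.1, 0.2]),
}

name, distance = identify(np.array([0.15, 0.85, 0.05]), database)
stranger, _ = identify(np.array([-1.0, -1.0, -1.0]), database)
print(name, round(distance, 3), stranger)
```

The threshold trades false accepts against false rejects, so it would need tuning on real embeddings.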
Functionalities added
- Detecting a face only when your eyes are open (a security measure).
- Using the face alignment functionality from dlib to predict effectively while live streaming.
Python Implementation
- Network Used- Inception Network
- Original Paper — Facenet by Google
Procedure
- If you want to train the network, run Train-inception.py. However, you don't need to do that, since I have already trained the model and saved it as the face-rec_Google.h5 file, which gets loaded at runtime.
- Now you need to have images in your database. The code checks the /images folder for that. You can either paste your pictures there or capture them using the webcam. To do that, run create-face.py; the images get stored in the /incept folder. You have to manually paste them into the /images folder.
- Run rec-feat.py to run the application.
5. Emojinator
Repository: akshaybahadur21/Emojinator (github.com). Emojinator - A simple emoji classifier for humans.
This code helps you to recognize and classify different emojis. As of now, we are only supporting hand emojis.
Code Requirements
You can install Conda for Python, which resolves all the dependencies for machine learning. You’ll also need:
- numpy
- matplotlib
- cv2
- keras
- dlib
- h5py
- scipy
Description
Emojis are ideograms and smileys used in electronic messages and web pages. Emoji exist in various genres, including facial expressions, common objects, places and types of weather, and animals. They are much like emoticons, but emoji are actual pictures instead of typographics.
Functionalities
- Filters to detect hand.
- CNN for training the model.
Python Implementation
- Network Used- Convolutional Neural Network
Procedure
- First, you have to create a gesture database. For that, run CreateGest.py. Enter the gesture name and you will get 2 frames displayed. Look at the contour frame and adjust your hand to make sure that you capture the features of your hand. Press 'c' to capture the images. It will take 1200 images of one gesture. Try moving your hand a little within the frame to make sure that your model doesn't overfit at the time of training.
- Repeat this for all the features you want.
- Run CreateCSV.py to convert the images to a CSV file.
- If you want to train the model, run TrainEmojinator.py.
- Finally, run Emojinator.py to test your model via webcam.
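The image-to-CSV step is a common way to prepare a small image dataset for training: each image is flattened into one row, with its label in the first column. The sketch below illustrates that general idea only. The actual layout produced by CreateCSV.py is not described in this article, so the images_to_csv function, the column order, and the tiny 2 x 2 "images" are assumptions.

```python
import csv
import io


def images_to_csv(samples, out):
    """Write one CSV row per sample: the label followed by the flattened pixels.

    samples: iterable of (label, image), where image is a 2-D list of pixel values.
    out: any writable text file object.
    """
    writer = csv.writer(out)
    for label, image in samples:
        flat = [pixel for row in image for pixel in row]  # row-major flattening
        writer.writerow([label] + flat)


# Hypothetical 2 x 2 grayscale "images" for two gesture classes
samples = [
    (0, [[0, 255], [255, 0]]),
    (1, [[255, 255], [0, 0]]),
]

buffer = io.StringIO()  # stands in for a real file opened with newline=""
images_to_csv(samples, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With real captures you would load and resize each image first, then pass the pixel grids through the same flattening step.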
Contributors
Akshay Bahadur and Raghav Patnecha.
Final Words
I can only say I’m incredibly impressed by these projects. You can run all of them on your computer, or more easily on Deep Cognition’s platform, which runs online, if you don’t want to install anything.
I want to thank Akshay and his friends for making these great Open Source contributions, and for all the others that will come. Try them, run them, and get inspired. This is only a small example of the amazing things DL and CV can do, and it is up to you to take this and turn it into something that can help the world become a better place.
Never give up; we need everyone to be interested in lots of different things. I think we can change the world for the better, improve our lives and the way we work, think and solve problems, and if we channel all the resources we have right now to make these areas of knowledge work together for a greater good, we can make a huge positive impact on the world and our lives.
We need more people interested, more courses, more specializations, more enthusiasm. We need you :)
Thanks for reading this. I hope you found something interesting here :)
If you have questions, just add me on Twitter and LinkedIn. See you there :)
Bio: Favio Vazquez is a physicist and computer engineer working on Data Science and Computational Cosmology. He has a passion for science, philosophy, programming, and music. Right now he is working on data science, machine learning and big data as the Principal Data Scientist at Oxxo. He is also the creator of Ciencia y Datos, a Data Science publication in Spanish. He loves new challenges, working with a good team and having interesting problems to solve. He is part of the Apache Spark collaboration, helping in MLlib, Core and the Documentation. He loves applying his knowledge and expertise in science, data analysis, visualization, and automatic learning to help the world become a better place.
Original. Reposted with permission.