Think Like an Amateur, Do As an Expert: Lessons from a Career in Computer Vision
Dr. Takeo Kanade shared his life lessons from an illustrious 50-year career in Computer Vision at last year's Embedded Vision Summit. You have a chance to attend the 2019 Embedded Vision Summit, from May 20-23, in the Santa Clara Convention Center, Santa Clara CA.
Key Takeaways from the 2018 Embedded Vision Summit Keynote by Dr. Takeo Kanade
Dr. Takeo Kanade is the U.A. and Helen Whitaker Professor at Carnegie Mellon University. The title of his talk was: Think Like an Amateur, Do As an Expert: Lessons from a Career in Computer Vision
Dr. Kanade gave an extremely engaging talk, packed with informative nuggets and punctuated with humor. Click here for the video link.
- He’s been working for 50 years on Computer Vision and Robotics since his graduate study. His Ph.D. thesis: Computer recognition of human faces.
- His Moment of Fame: On January 28, 2001 – at Super Bowl XXXV – Action Replay using EyeVision using 33 cameras, each accompanied by a robot (Mitsubishi PA-10 costing $80k at that time) with a Sony camera and a big zoom lens. The hardware cost totaled $3+ million. The combined cabling length was over 20 km. He described the challenges of coordinating robotic cameras to pan, tilt, zoom and focus during active play.
- Another moment of fame: Dr. Kanade appears as himself in the Bruce Willis movie Surrogates in a 2.5-second clip.
- What is “good research”? Even experienced researchers find it difficult to answer this question clearly. He quoted the late Prof. Allen Newell:
- Good science responds to real phenomena or real problems
- Good science is in the details
- Good science makes a difference
- To conduct useful and fun research:
- Start with a scenario: Picture the success – what can happen; how and where will it be used?
- Think freely with fun
- The story expands, and you talk about it
- Other people can join
- Dr. Kanade’s lessons:
- Successful ideas are often surprisingly simple and even naively trivial
- Impediment to such simple thinking: “I know” mentality
- Reliable execution requires substantive knowledge and skill – and here, expertise is necessary; however, we may have to overcome the ‘expert’ bias; this is not about pairing up an amateur to come up with something simple and have the expert execute it, but an expert adapting as needed
- Think like an amateur, do as an expert – when you are starting to solve a problem, think like a novice, an amateur by consciously avoiding preconceived biases. When implementing a solution, adopt a meticulous and thorough approach. Dr. Kanade has even written a book on this in Japanese (and translated into some languages other than English)
- Many-camera System:
- His very first paper on a many-camera real-time system was rejected (one reviewer's comment said: "... devices that use these many unnecessary cameras are too expensive to be useful.")
- He started out by modeling real reality into virtual reality and came up with the Ladar (Scanning Laser Range Imager)
- To overcome the video limitations of Ladar, his team led the development of 3D camera and a Soft Camera to overcome occlusion
- Around the year 2000, they developed a completely digital system that created dynamic virtual views (virtualized reality)
- Later work included face detection, pose detection
- Many-camera systems, large-scale 3D cameras are now used in many places like Entertainment (remember the movie 'Matrix'?)
- Navlab:
- Navlab was an autonomous driving program started with DARPA in 1984. This went through multiple generations from Navlab 1 through Navlab 10.
- No Hands Across America - in 1995: Navlab 5 traveled 98.2% of the distance from Washington DC to San Diego controlled by a computer
- Smart Headlight:
- Snowflakes and raindrops are highly reflective
- Solution: Avoid the light beam hitting the raindrops!
- Even with a framerate of about 1000 fps and a latency of 1.5 ms, other system limitations came into the picture. Many problems become easier to solve as you get faster!
- A practical application of this would be an adaptive high-beam that works for all drivers on the road
- Value of Research:
- Research does not have to be new
- “Newness itself is not a virtue, usefulness is.”
- It's hard to predict research: Dr. Kanade shared the story of Lucas-Kanade Optical Flow [Lucas, Kanade ‘81]. Around '80, they were working the problem of tracking. His student Bruce Lucas came up with an idea: the solution as a combination of the Taylor series expansion (well known for 300 years) and the graphical solution for a quadratic equation. Lucas insisted they should write a paper and Dr. Kanade insisted they should not publish as it seemed trivial … the student insisted and finally, the Professor gave in … allowing the student to publish in an obscure journal. Lo, behold! Of the 400+ papers Dr. Kanade has published, this has become the most-referenced paper of all. Every algorithm that has video processing uses this concept.
- Computer Vision today is in the state of ‘a perfect storm’
- New, ubiquitous, embedded, sometimes super-human vision sensors
- Dr. Kanade predicts explosive growth and progress in this field
About the Embedded Vision Alliance and 2018 Embedded Vision Summit
Credit: Jeff Bier, Founder and President – BDTI, Founder – Embedded Vision Alliance
- The Embedded Vision Alliance a partnership of companies dedicated to accelerating and facilitating the use of computer vision to solve real-world problems – "We aim to inspire and create products that see"
- The Embedded Vision Alliance is about enabling the incorporation of visual perception into all kinds of systems and devices
- Started in 2011 with 16 members, Embedded Vision Alliance now has 90 members
- Theme: Enabling computer vision at the edge and in the cloud – almost half of the developers in this Alliance are already using cloud and about 75% are already using vision at the edge. Cloud will be a major enabler for embedded vision solutions.
- Vision Product of the Year Awards started in 8 categories: processor, camera, software, automotive, AI, cloud, developer, end-product
- A single developer with no computer vision experience can create an effective cloud-based vision solution in months (Chris Adzima’s talk)
- 3D camera modules now cost around $20-30 (Guillaume Girardin’s talk)
- Binary neural networks can dramatically lower the cost and power consumption for DNN inference (Abdullah Raouf, Mohammad Rastegari - talk)
- One DNN can generate new training data for another DNN (Peter Corcoran’s talk)
- We can protect privacy even as smart cameras proliferate (Charlotte Dryden’s talk)
- Vision Tank: An exciting startup competition for winning the “Most Innovative Vision-Based Product” prize. Finalists:
- The 2018 Embedded Vision Summit boasted of:
- 1200 attendees
- 90+ presentations across 2 days, 6 tracks
- 100+ working demonstrations from 50+ exhibitors
- 84% of attendees actively developing or near-term planning vision-based products
Each year, this event has been gathering more momentum and buzz.
The 2019 Embedded Vision Summit is happening next week in the Santa Clara Convention Center, Santa Clara, CA. Here are the key dates and events:
- Monday, May 20: Training classes on Deep Learning for Computer Vision with TensorFlow 2.0 and Computer Vision Applications in OpenCV. [Needs separate registration on the event website]
- Tuesday and Wednesday, May 21-22: Main Conference and Vision Technology Showcase
- Thursday, May 23: Vision Technology Workshops by Intel, Khronos and Synopsys [Needs separate registration via the event website]
So, hurry up, there’s still a chance to register for the 2019 Embedded Vision Summit. Here’s the link: https://www.embedded-vision.com/summit
Some KDNuggets posts on Computer Vision
- Computer Vision by Andrew Ng - 11 Lessons Learned: Ryan Shrott systematically reviews Andrew Ng’s Computer Vision course on Coursera, sharing his learning.
- Introducing VisualData: A Search Engine for Computer Vision Datasets: Jie Feng introduces VisualData a search engine for computer vision datasets in this post.
- How to do Everything in Computer Vision: In this post, George Seif, walks us through classification, detection, segmentation, pose estimation, enhancement and restoration tasks in Computer Vision using Deep Learning.