How to start a career in Computer Vision?

Sep 19, 2020

A practical (tried and tested) guide on starting a career in Computer vision.
“If you want to build a career in Computer vision, follow this roadmap to launch yourself in the industry. Or, if you are already experienced, use this guide to brush up your skills before an interview.”

I have been working as a Computer vision engineer for over three years now, and I have had the opportunity to work on products involving:

Human detection and tracking in surveillance videos
Facial recognition
Object detection (like detecting brand logos in websites)
Optical character recognition (identifying text in images)
Image matching
And more…

The work has been stimulating, of course, but it was tough because I jumped into the field of Computer vision without any prior formal education or experience. So, throughout those three years, I had to work two shifts: 1) my job during the day and 2) self-study at night. From this experience, I have formulated a learning roadmap that helped me keep my career alive and thriving. It comprises a set of programming languages, software tools, algorithms and concepts. So, if you want to start a career in Computer vision, this roadmap will help you launch yourself in the industry. Or, if you are already experienced, you can use it as a guide to brush up your skills before an interview.

So, let’s get started with Computer Vision!

1) Learn to program:

Python is the most popular language for data science domains, including Computer vision. However, it is a vast language and you might get lost if you try to learn everything, so I am listing the most crucial Python packages for this domain. Beyond the basics of Python, you need a strong command of the following libraries:

OpenCV (Open Source Computer Vision)
Pillow (the maintained fork of PIL, the Python Imaging Library)
Imutils
NumPy
matplotlib.pyplot
matplotlib.image
Dlib
Scikit-image

These packages will cover most of the Machine learning side of Computer vision. Once you have become proficient in these, it is time to dive ‘deep’ into Deep learning for Computer vision by studying TensorFlow and Keras.

In addition, it is useful to have an understanding of these miscellaneous concepts:

Managing Anaconda and Python environments
Saving trained models with Joblib and Pickle
Managing large datasets with Pandas
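
As a minimal sketch of the second item, here is a save-and-restore round trip with Pickle. The `model` dictionary is a hypothetical stand-in for a trained model, not a real one; any picklable Python object behaves the same way.

```python
import pickle

# Hypothetical stand-in for a trained model: any picklable object works.
model = {"weights": [0.25, -1.5, 3.0], "bias": 0.1}

# Serialize to bytes. pickle.dump(model, f) writes to an open binary file instead.
blob = pickle.dumps(model)

# Restore the object later. Only unpickle data that comes from a source you trust.
restored = pickle.loads(blob)

print(restored == model)
```

Joblib offers a similar pair, `joblib.dump` and `joblib.load`, which work on file paths and are often preferred for objects that hold large NumPy arrays.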

2) Learn Basic Mathematics:

A strong mathematical background, such as the one you get from engineering, computer science or statistics, is essential, since most of the mathematics involved in Computer vision comes from Linear Algebra and Calculus. There is no shortcut around it.

The following are the concepts you must master:

Solving linear equations
Convolution
Matrix algebra (multiplication and inverse)
Differentiation (partial derivatives, gradient, chain rule) and convex optimization
Different distance measures (Hamming, Euclidean, Manhattan, Minkowski, Mahalanobis)
Different similarity measures (Covariance, Correlation, Cosine similarity)
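
To make the last two items concrete, here is a small sketch of several of the listed measures using only the standard library. The function names and sample vectors are my own choices for illustration. Mahalanobis distance is left out because it also needs the covariance matrix of the data.

```python
import math

def euclidean(a, b):
    # Straight-line distance: square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute differences along each axis.
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p):
    # Generalization: p=1 gives Manhattan, p=2 gives Euclidean.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def hamming(a, b):
    # Number of positions where two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors (1 = same direction).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

u = [1.0, 2.0, 3.0]
v = [4.0, 6.0, 3.0]

print(euclidean(u, v))             # 5.0
print(manhattan(u, v))             # 7.0
print(hamming("10110", "10011"))   # 2
print(cosine_similarity(u, v))
```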

This answer on Quora briefly summarizes which mathematical topics you need to focus on and why.

3) Learn Basic Machine Learning:

Although Deep learning has largely taken over the Computer vision field, there are classical Machine learning (ML) tools, concepts and algorithms that are sometimes more efficient, easier and more practical. It is therefore essential for Computer vision engineers to be hands-on with ML. You must be familiar with:

Major Computer vision algorithms in Machine Learning: SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features), BRIEF (Binary Robust Independent Elementary Features)
Basic ML concepts: gradient descent, supervised/unsupervised learning, bias-variance trade-off, confusion matrix, loss functions, learning curves
Basic ML algorithms: Regression, Decision trees, Support vector machines, Dimensionality reduction algorithms like Principal component analysis
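
Several of these concepts (gradient descent, loss functions, regression) fit into one tiny example. The sketch below fits a line to made-up data generated from y = 2x + 1 by minimizing mean squared error. The learning rate and step count are arbitrary values that happen to work for this toy data, not recommended defaults.

```python
# Toy data generated from y = 2x + 1 (made-up values for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def mse(w, b):
    # Loss function: mean squared error of the line y = w*x + b.
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit(learning_rate=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = fit()
print(round(w, 3), round(b, 3))   # approximately 2.0 and 1.0
print(mse(w, b))
```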

I would recommend the following books for beginners:

Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron
Machine Learning by Tom Mitchell

4) Learn Basic Image Processing:

Since you will be working with images all the time, you will frequently need to crop, resize, grey-scale, flip and rotate them. Other operations you will need to grasp are:

Adding noise
De-noising (for example, by applying a Gaussian blur)
Segmenting different object layers
Color space augmentation

These techniques are widely used for image pre-processing, that is, preparing images for training Computer vision models. Try to implement these operations in Python using the image processing packages mentioned in section 1.
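
As a starting point, here is a rough NumPy-only sketch of several of these operations on a synthetic image, so that it runs without any image file. The random image, the noise level and the 3x3 kernel are assumptions chosen for illustration. The hand-written `convolve2d` helper is only there to show what a blur does under the hood.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 6x8 RGB image (height, width, channels) standing in for a loaded photo.
image = rng.integers(0, 256, size=(6, 8, 3), dtype=np.uint8)

cropped = image[1:5, 2:6]    # rows 1-4 and columns 2-5
flipped = image[:, ::-1]     # horizontal mirror
rotated = np.rot90(image)    # 90 degrees counter-clockwise

# Grey-scale as a weighted sum of the R, G, B channels (BT.601 luma weights).
gray = image @ np.array([0.299, 0.587, 0.114])

# Add Gaussian noise, then clip back into the valid 0-255 range.
noisy = np.clip(gray + rng.normal(0.0, 10.0, size=gray.shape), 0, 255)

def convolve2d(img, kernel):
    """'Valid' 2D convolution: flip the kernel, then slide it over the image."""
    kh, kw = kernel.shape
    flipped_kernel = kernel[::-1, ::-1]
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * flipped_kernel)
    return out

# Small 3x3 approximation of a Gaussian kernel; the weights sum to 1.
gaussian_kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16.0
blurred = convolve2d(noisy, gaussian_kernel)

print(cropped.shape, rotated.shape, gray.shape, blurred.shape)
```

In day-to-day work you would use library calls such as `cv2.resize`, `cv2.GaussianBlur` or Pillow's `Image.resize` instead of hand-written loops, but writing the loop once makes it clearer what those calls compute.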

While you are at it, it is also handy to study image hashing algorithms such as pHash, aHash, dHash and wHash, which are simple and efficient tools for image matching problems.
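
To give a feel for how these hashes work, here is a simplified sketch of average hashing (aHash) and difference hashing (dHash) on tiny grey-scale grids. It assumes the image has already been reduced to a small grid. Library implementations typically shrink the image to a fixed size such as 8x8 first, and that step is skipped here. The pixel values are made up.

```python
def average_hash(pixels):
    # aHash: one bit per pixel, set when the pixel is brighter than the mean.
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def difference_hash(pixels):
    # dHash: one bit per horizontal neighbor pair, set when brightness drops
    # from left to right.
    return [
        1 if row[i] > row[i + 1] else 0
        for row in pixels
        for i in range(len(row) - 1)
    ]

def hamming_distance(h1, h2):
    # Number of differing bits; a small distance suggests similar images.
    return sum(a != b for a, b in zip(h1, h2))

# Made-up 4x4 grey-scale image: dark on the left, bright on the right.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
brighter = [[p + 30 for p in row] for row in original]   # same picture, brighter
mirrored = [row[::-1] for row in original]               # different layout

print(hamming_distance(average_hash(original), average_hash(brighter)))   # 0
print(hamming_distance(average_hash(original), average_hash(mirrored)))   # 16
print(hamming_distance(difference_hash(original), difference_hash(mirrored)))   # 12
```

The uniformly brightened copy hashes identically to the original, while the mirrored copy differs in every bit, which is the property that makes these hashes useful for finding near-duplicate images.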

5) Learn Deep Learning (Finally):

Now that you have learned all the basics required for a career in Computer vision, you can delve into Deep learning. It is the field where you learn to build, train and deploy Deep neural networks, which are the driving force behind all the cool applications like self-driving cars, facial recognition systems and social media image filters.

To learn Deep learning, I would simply suggest starting with Andrew Ng’s 6-month Deep learning specialization on Coursera. In my personal experience, this specialization was a brief yet thorough program that gave me everything I needed for the initial years of my career.

Closing remarks:

Good luck with your dive into the world of Computer vision and Artificial Intelligence.

If you find this guide useful, hit the clap button below to encourage me to write more. Your suggestions and critiques are as valuable to me as your appreciation :)