How do computers see so well?

William Law
6 min read · Oct 4, 2019

Transportation has been around for a long time. From riding horses 🐎, to the first car built in 1885, to the cars we drive today, transportation is a 🔑 part of our lifestyle.

Look at the cars on the road today and take a moment to think about how we got here.

  • The late 1800s to 1900s: gasoline-powered cars were introduced ⛽
  • Then, electric cars came along ⚡
  • Now, self-driving cars 🚗

To this day, most vehicles are operated by a human, but with machine learning advancing super fast, we might soon not need a driver behind the wheel.

How?

Computer vision 🤖.

A quick overview of Computer Vision

If you aren’t familiar with this field, that’s totally cool! Here’s a brief explanation:

Computer vision focuses on training computers to gain a higher-level understanding of pictures and video clips. Let’s take this example:

At a high level, the goal is to train the computer to recognize that some of the objects in the picture are trees, people, and dogs. This is done by giving the computer thousands of labeled images of trees, people, and dogs to learn from before showing it an image it has never seen!
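To make the "learn from examples, then test on something new" idea concrete, here's a tiny sketch. It is not how real computer vision systems work (those use deep neural networks trained on thousands of real photos). Instead it assumes a much simpler stand-in technique, a nearest-centroid classifier, and made-up 2x2 "images" flattened into four brightness values. The names `train`, `predict`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def train(examples):
    """Average the pixel vectors of each label into one centroid per class."""
    sums = {}
    counts = defaultdict(int)
    for pixels, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(pixels)
        sums[label] = [s + p for s, p in zip(sums[label], pixels)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, pixels):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, pixels))

    return min(centroids, key=lambda label: distance(centroids[label]))


# Hypothetical toy data: 2x2 "images" flattened to 4 brightness values.
training_set = [
    ([0.9, 0.8, 0.1, 0.2], "tree"),
    ([0.8, 0.9, 0.2, 0.1], "tree"),
    ([0.1, 0.2, 0.9, 0.8], "dog"),
    ([0.2, 0.1, 0.8, 0.9], "dog"),
]

model = train(training_set)

# An "image" the model has never seen during training.
unseen = [0.85, 0.75, 0.15, 0.25]
print(predict(model, unseen))
```

The shape is the same as in the real thing: a training step that summarizes labeled examples, and a prediction step that applies what was learned to new input.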

You might be able to see how this is SUPER beneficial while driving.

However, in an image like the one above, with that many objects present, the computation gets expensive.

Being able to identify and recognize what we see is great, but how do we make this more efficient, scalable, and practical?

Well, there are many new approaches, but the one that I implemented was a network from Facebook AI Research (FAIR) called Mask R-CNN.

Let’s look into that.

Here’s how Mask R-CNNs work

