AEC Innovations

What is Supervised Machine Learning for AEC Domain Experts?

In the previous article, we learned that Artificial Intelligence (AI) is like a tree with roots in philosophy, mathematics, and other sciences, and that Machine Learning (ML) is one branch of that tree. We also learned that all ML methods are AI-based, but not all AI methods are learning-based. But what are the different branches of ML, and how do they work? As we discussed, ML has four sub-branches: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. In this article, we will review supervised learning, which is the most common ML method.

What is Supervised Machine Learning?

My brother was born when I was eight years old. I was so excited to teach him everything I knew and make him the smartest brother ever! I started teaching him colors on his second birthday. To do so, I used Lego bricks in three colors: red, blue, and green. In the first activity, I showed him each Lego brick and called out its color: “This is red,” “This is blue,” “This is green.” While I repeated this over and over, I also asked him to tell me the color of each Lego brick:

Me: “Which color is this Lego brick?”

My brother: “Blueeee.”

Me: “Yeeeeeeess, good job my sweetie. It’s blue.”

In this activity, I assessed his ability to learn how to sort the Lego bricks by color. If he made a mistake, I went over the colors and let him try again. The process involved a lot of repetition. I was acting as a supervisor by teaching my brother about colors. Afterward, I determined if he had learned the colors and could even classify new objects by their colors. At the time, I did not know this method was called “Supervised Learning.”

“Supervised Learning” means exactly what the name indicates: the machine learns under your supervision. But how can you supervise a machine? You need to provide a lot of labeled examples, known as a training set. Let’s say you have images of traffic signs (input) and you want to train a machine learning model to identify whether a sign is shaped like a circle, square, or triangle (output). To do this, you need to take three steps: a) create a labeled dataset, b) train and test a machine learning model, and c) verify the model with a new dataset.

a) Create a labeled dataset: In this step, you supervise the model by providing a set of inputs (images of traffic signs) and their outputs (the shapes). But why do you need to do that? The computer sees an image as a grid of pixels, which it stores as an array of numbers like {0,0,0,1,1,1,1}. However, the computer cannot tell what an image shows just from these numbers. Thus, to give the computer that context, you need to tag or label each image as a circle, square, or triangle (your outputs). This step is called dataset labeling.

b) Train and test a machine learning model: In this step, you feed the labeled dataset you created into an algorithm that learns the relationship between the inputs and outputs. While you are training the model, you need to test it and update it until it can accurately predict the expected outputs. Back to our traffic sign example: from the labeled images, the algorithm learns the features that distinguish the pixel representations of circles, squares, and triangles. This step is called model training, and it is how a computer learns to identify the images.

c) Verify the model with a new dataset: After completing the training process, you need to verify the model's accuracy on a new dataset to ensure it can predict the expected outputs for inputs it has never seen. To verify the model in the previous example, show it new traffic sign images of circles, squares, or triangles in different sizes. The model should be able to recognize the images. If not, you need to retrain the model until it can recognize them correctly. This step is called the verification process.
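To make the three steps concrete, here is a minimal Python sketch. It is only an illustration under several assumptions: the "images" are tiny synthetic shapes rather than real traffic sign photos, the single feature (how much of its bounding box a shape fills) is hand-picked, and the learning algorithm is a simple nearest-centroid rule. Every function name below is hypothetical.

```python
# Toy sketch of the three steps. All names are hypothetical, and the
# "images" are tiny synthetic shapes rather than real traffic sign photos.

SHAPES = ("circle", "square", "triangle")


def draw_shape(kind, size):
    """Return the set of (row, col) pixels that are 'on' for a filled shape."""
    if kind == "square":      # size = side length in pixels
        return {(r, c) for r in range(size) for c in range(size)}
    if kind == "circle":      # size = radius in pixels
        return {(r, c)
                for r in range(-size, size + 1)
                for c in range(-size, size + 1)
                if r * r + c * c <= size * size}
    if kind == "triangle":    # size = height in rows
        return {(r, c) for r in range(size) for c in range(-r, r + 1)}
    raise ValueError(f"unknown shape: {kind}")


def fill_ratio(pixels):
    """Hand-picked feature: share of the bounding box covered by the shape."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    box_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return len(pixels) / box_area


def make_dataset(sizes):
    """Step a) Pair every image (input) with its shape label (output)."""
    return [(draw_shape(kind, size), kind)
            for kind in SHAPES for size in sizes]


def train(dataset):
    """Step b) Learn one average fill ratio (a centroid) per label."""
    by_label = {}
    for pixels, label in dataset:
        by_label.setdefault(label, []).append(fill_ratio(pixels))
    return {label: sum(v) / len(v) for label, v in by_label.items()}


def predict(model, pixels):
    """Pick the label whose learned centroid is closest to this image."""
    ratio = fill_ratio(pixels)
    return min(model, key=lambda label: abs(model[label] - ratio))


def accuracy(model, dataset):
    """Fraction of images whose predicted label matches the human label."""
    hits = sum(predict(model, pixels) == label for pixels, label in dataset)
    return hits / len(dataset)


model = train(make_dataset(sizes=(4, 5, 6)))                   # train
test_accuracy = accuracy(model, make_dataset(sizes=(7,)))      # test
verify_accuracy = accuracy(model, make_dataset(sizes=(8, 9)))  # step c) verify
print(test_accuracy, verify_accuracy)
```

Real image models usually learn their own features from the pixels instead of relying on one hand-picked number, but the workflow is the same: labeled examples go in, a model comes out, and shapes in sizes it has never seen are used to check it.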

Supervised learning involves dataset labeling, model training, and model verification.

Let’s revisit the above process with a simple construction example. If you work at a construction site, you know that monitoring the safety of a construction site is challenging and complex for a human. Let’s say you want to develop a machine learning algorithm that can detect whether workers are wearing their safety helmets. To do so, you need to take the following steps:

1. Create a labeled dataset: You need to label thousands of photos captured at construction sites. In each photo, draw a rectangle around every worker and every safety helmet (like labeling Lego bricks for my little brother).

2. Train and test a machine learning model: You need to feed the labeled photos (training dataset) into an algorithm to train a model. While training the model, you need another labeled dataset to test whether the model is learning. Eventually, the model will learn to detect people and check whether each one is wearing a safety helmet.

3. Verify the model with a new dataset: Although you have trained the model, you still need to keep checking whether it accurately recognizes if workers are wearing their safety helmets in a new dataset (new photos). If not, you’ll need to retrain the model until it can recognize a lack of safety helmets.
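As a rough illustration of what a rectangle label looks like and how a predicted rectangle might be checked against it, consider the Python sketch below. The `Box` record, the coordinates, and the 0.5 threshold are all assumptions made for this example and do not come from a particular labeling tool. The overlap score used here, intersection over union (IoU), is a common way to compare two rectangles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """One labeled rectangle in a photo (hypothetical record format)."""
    label: str      # e.g. "worker" or "helmet"
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box):
    """Area of a rectangle in square pixels."""
    return max(0.0, box.x_max - box.x_min) * max(0.0, box.y_max - box.y_min)


def iou(a, b):
    """Intersection over union: 0.0 means no overlap, 1.0 means identical."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (area(a) + area(b) - inter)


def matches(predicted, labeled, threshold=0.5):
    """Count a prediction as correct if the label agrees and the boxes overlap.

    The 0.5 threshold is an assumed convention, not a requirement.
    """
    return predicted.label == labeled.label and iou(predicted, labeled) >= threshold


# A rectangle drawn by a person, and a made-up model prediction for it.
labeled_helmet = Box("helmet", 100, 40, 160, 80)
predicted_helmet = Box("helmet", 110, 40, 170, 80)

overlap = iou(predicted_helmet, labeled_helmet)
is_correct = matches(predicted_helmet, labeled_helmet)
print(round(overlap, 3), is_correct)
```

Counting how many predictions match the human-drawn rectangles in new photos gives one possible signal for deciding whether the model needs retraining.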

Once the model has learned to recognize if someone is wearing a safety helmet, it can analyze the images and determine risks more quickly than a human. Such technology can be developed to provide real-time insights for tracking and monitoring various safety requirements at construction sites.

Real-time insights for tracking and monitoring safety requirements can help mitigate accidents at construction sites.

Supervised machine learning is the most common method of ML. This method involves three major steps: dataset labeling, model training, and model verification. One simple way to understand this method is to think about teaching kids. You supervise them, show them examples, and test them to see if they have learned. In the upcoming articles, we will review the other methods of ML.