This is a guest blog by Lincoln High School student Richard Deng, who built the Air Quality Prediction Tool. Listen to an interview with Richard on DEQ’s GreenState podcast.
In recent years, air quality in the Pacific Northwest has worsened sharply multiple times due to wildfire smoke. These swings are hard to predict, so families are caught off guard. Such incidents have occurred repeatedly and will continue to surface. Without a reliable source of prediction, people are left unprepared for these events, and their health may be put at risk as a result.
While thinking about this, I realized that one thing I could do was create an air quality prediction tool. If air quality can be predicted, it will be easier for residents to adapt to the changing environment. This would not only support their overall health, but also help them schedule their daily activities around the air quality in their area.
So how did I develop the air quality prediction tool, or AQPT for short? Before I could predict air quality, I needed to preprocess the data used to build the AQPT. I merged historical meteorological, air pollutant and Air Quality Index data from the Oregon Department of Environmental Quality and the U.S. Environmental Protection Agency into one central database and converted it into daily averages. Then I used feature engineering methods to identify and choose the relevant data for testing.
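The merge-and-average step described above could look something like the following in Python. This is only a minimal sketch: the readings, the field names (pm25, temp_c) and the helper functions are made up for illustration, and the actual DEQ and EPA data layouts are not described in this post.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical hourly readings as (date, hour, value) tuples.
# The real DEQ/EPA columns and units are assumptions here.
hourly_pm25 = [
    ("2020-09-10", 0, 40.0),
    ("2020-09-10", 12, 60.0),
    ("2020-09-11", 0, 150.0),
    ("2020-09-11", 12, 250.0),
]
hourly_temp_c = [
    ("2020-09-10", 0, 14.0),
    ("2020-09-10", 12, 26.0),
    ("2020-09-11", 0, 16.0),
    ("2020-09-11", 12, 28.0),
]


def daily_average(readings):
    """Collapse (date, hour, value) readings into {date: mean value}."""
    by_day = defaultdict(list)
    for date, _hour, value in readings:
        by_day[date].append(value)
    return {date: mean(values) for date, values in by_day.items()}


def merge_daily(**sources):
    """Join several {date: value} tables on the dates they all share."""
    shared = set.intersection(*(set(table) for table in sources.values()))
    return {
        date: {name: table[date] for name, table in sources.items()}
        for date in sorted(shared)
    }


# One row per day, with one column per data source.
daily = merge_daily(
    pm25=daily_average(hourly_pm25),
    temp_c=daily_average(hourly_temp_c),
)
```

Keeping only the dates that every source shares is one possible design choice; it avoids rows with missing values, at the cost of dropping days where one source has no data.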
There were several steps I had to take before I could start testing. After preprocessing the data, I used Orange Data Mining, an open-source software package, to test different machine learning methods on the preprocessed data and compare their accuracies. Machine learning is a branch of artificial intelligence and computer science in which you feed a computer a large amount of data and have it make decisions based on different factors. As you put more data into the machine, it learns and corrects its algorithm to become more accurate.
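To make the idea of comparing accuracies concrete, here is a small sketch in Python. It is not how Orange does it internally. It assumes each method predicts an Air Quality Index category for each day, and the method names, predictions and observed values are all invented for the example.

```python
def accuracy(predicted, actual):
    """Fraction of days where the predicted category matches the observed one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Made-up observed AQI categories for five held-out days.
actual = ["Good", "Good", "Moderate", "Unhealthy", "Moderate"]

# Made-up predictions from three placeholder methods.
predictions = {
    "method_a": ["Good", "Good", "Moderate", "Unhealthy", "Good"],
    "method_b": ["Good", "Moderate", "Moderate", "Moderate", "Good"],
    "method_c": ["Good", "Good", "Moderate", "Unhealthy", "Moderate"],
}

# Score every method on the same days, then rank from best to worst.
scores = {name: accuracy(pred, actual) for name, pred in predictions.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Scoring every method against the same held-out days is what makes the accuracies comparable.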

The AQPT combines the highest-accuracy machine learning methods. But the question is, how do you actually combine machine learning methods? I applied a novel weights formula to the methods’ outputs that gave “heavier” weights to the more accurate methods. For example, when I saw that X was a very good predictor of air quality, I gave it more weight, which tells the program to pay more attention to X. This created an ensemble machine learning method that was able to predict air quality in accordance with the Air Quality Index.
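The weights formula itself is not given in this post, so the sketch below is not the AQPT’s formula. It shows one common way to weight methods by accuracy, a weighted vote in which each method’s say is proportional to its accuracy. The function name, method names and numbers are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, accuracies):
    """Pick the category with the largest accuracy-weighted support.

    Each method's weight is its accuracy divided by the sum of all
    accuracies, so the weights add up to 1. This proportional rule is
    an assumption for illustration, not the formula used in the AQPT.
    """
    total = sum(accuracies.values())
    support = defaultdict(float)
    for name, category in predictions.items():
        support[category] += accuracies[name] / total
    return max(support, key=support.get)


# Hypothetical accuracies measured for three methods.
accuracies = {"method_a": 0.95, "method_b": 0.50, "method_c": 0.40}

# Hypothetical predictions for a single day.
todays_predictions = {
    "method_a": "Unhealthy",
    "method_b": "Moderate",
    "method_c": "Moderate",
}

ensemble_prediction = weighted_vote(todays_predictions, accuracies)
```

In this made-up case the most accurate method outvotes the two weaker ones, which would not happen if every method counted equally.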

After testing the ensemble method, I gathered the results and analyzed them by comparing them to the individual machine learning methods used to create the ensemble, as well as to other machine learning methods for predicting air quality reported in the literature. With a 94.92% prediction accuracy, the ensemble method achieved a higher accuracy than all of the individual machine learning methods I tested, as well as the methods reported in the literature.

The journey to create the AQPT has been a great learning experience that has benefited me enormously. The end goal for this project was to help the people in my life and my community deal with the drastic changes in air quality. So many people have assisted me in this journey, from answering questions to providing valuable feedback and information. I would like to thank DEQ members Margaret Miller, Daniel Johnson and Lauren Wirtis. I would also like to thank Professor Perkowski at Portland State University for teaching me about the machine learning methods.
Currently, this tool is not yet available for general use. Releasing it is really important, but there are still small steps to take before it gets there. Let’s all do our part, and help each other adapt to the seemingly unpredictable environment we live in.

A note from DEQ
It has been such a pleasure to work with Richard on his Air Quality Prediction Tool. Predicting air quality is incredibly difficult and we are encouraged by his work and enthusiasm to solve this problem that affects so many Oregonians. DEQ is connecting Richard with people at the National Weather Service as well so he can continue working to develop this tool. We’re excited to keep talking with him.