Air Quality Data with Atmotube

In this notebook I analyze location-tagged air quality data that I collected with an Atmotube PRO. The Atmotube PRO detects PM1, PM2.5, and PM10 particulate pollutants such as dust, pollen, soot, and mold, plus a wide range of volatile organic compounds (VOCs), all in real time.

Context: I downloaded the data from the Atmotube PRO mobile app and uploaded it to a Google Sheet. I then used Google Dataprep to clean up the data before creating a dataset and a table in Google BigQuery.

This notebook shows the analysis work on that dataset table.

First we have to connect to Google Cloud Platform so we can use BigQuery

Because the data is stored in BigQuery, I first have to authenticate with Google Cloud Platform (GCP).

In [ ]:
from google.colab import auth
auth.authenticate_user()
print('Authenticated')
Authenticated

Understand the data structure

It's helpful to inspect the schema and a sample of the data we're working with.

I query my dataset table in BigQuery and ask for all of the data.

In [ ]:
%%bigquery --project social-climate-tech df
SELECT * FROM
  `social-climate-tech.personal_climate_tech.perosnal_climate_data`
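
Outside a notebook, the same query could be run with the BigQuery Python client instead of the `%%bigquery` magic. The sketch below is only an illustration of that idea, not the code used for this notebook: `build_query` and `load_dataframe` are hypothetical helper names, and the `LIMIT` is an added assumption to keep a first sample small.

```python
from typing import Optional


def build_query(table: str, limit: Optional[int] = None) -> str:
    """Return a SELECT * statement for a fully qualified BigQuery table."""
    sql = f"SELECT * FROM `{table}`"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql


def load_dataframe(sql: str, project: str):
    """Run the query and return a pandas DataFrame (needs GCP credentials)."""
    # Imported lazily so this file loads without the client library installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    return client.query(sql).to_dataframe()


# Table name copied from the query cell above.
TABLE = "social-climate-tech.personal_climate_tech.perosnal_climate_data"
sql = build_query(TABLE, limit=100)
print(sql)
# df = load_dataframe(sql, project="social-climate-tech")  # requires authentication
```

The last line is left commented out because it only works after authenticating against a project that can read the table.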

To show the data I use the pandas DataFrame method head(n), which returns the first n rows of the object based on position. It is useful for quickly checking that the object holds the right type of data. The DataFrame df is defined by the %%bigquery cell above.

In [ ]:
df.head(100)
Out[ ]:
date voc aqs temperature humidity pressure pm1 pm25 pm10 latitude longitude address postalcode city country coordinates
0 2020-10-23 12:42:00 0.170 76.0 26.8 36.0 1010.0 18.0 21.0 23.0 52.944803 5.062550 A7 1779 Den Oever Netherlands 52.9448033, 5.0625504
1 2020-10-23 12:41:00 0.259 82.0 27.5 33.0 1010.0 12.0 14.0 16.0 52.944803 5.062550 A7 1779 Den Oever Netherlands 52.9448033, 5.0625504
2 2020-10-23 12:08:00 0.409 75.0 25.3 33.0 1011.1 7.0 9.0 10.0 53.042728 5.642525 Pophornsterbrug 8602 Sneek Netherlands 53.0427284, 5.6425247
3 2020-10-23 11:49:00 0.130 80.0 25.8 35.0 1011.6 14.0 17.0 18.0 53.019348 5.705013 Hendrik Bulthuisweg 30 Sneek Netherlands 53.0193483, 5.7050128
4 2020-10-23 11:47:00 0.232 80.0 28.7 30.0 1011.7 14.0 16.0 17.0 53.019348 5.705013 Hendrik Bulthuisweg 30 Sneek Netherlands 53.0193483, 5.7050128
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
95 2020-09-29 01:34:00 0.206 87.0 18.8 60.0 1013.1 7.0 11.0 13.0 52.349492 4.797737 52.3494918, 4.7977369
96 2020-09-29 01:23:00 0.180 89.0 18.8 61.0 1013.1 7.0 10.0 12.0 52.349492 4.797737 52.3494918, 4.7977369
97 2020-09-29 01:12:00 0.154 90.0 18.8 61.0 1013.0 7.0 8.0 9.0 52.349492 4.797737 52.3494918, 4.7977369
98 2020-09-29 01:01:00 0.129 88.0 18.7 60.0 1013.0 8.0 10.0 11.0 52.349492 4.797737 52.3494918, 4.7977369
99 2020-09-29 00:50:00 0.122 88.0 18.7 60.0 1013.0 8.0 10.0 11.0 52.349492 4.797737 52.3494918, 4.7977369

100 rows × 16 columns
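
The sample shows the values but not the column types. A minimal sketch of how the schema could be inspected as well is given below. It builds a small stand-in DataFrame called `sample` (a hypothetical name) from three rows and a subset of the columns in the output above, so it runs without BigQuery. The real dtypes of `df` depend on the BigQuery table schema and may differ.

```python
import pandas as pd

# Three rows copied from the output above; only a subset of the columns.
sample = pd.DataFrame(
    {
        "date": ["2020-10-23 12:42:00", "2020-10-23 12:41:00", "2020-09-29 00:50:00"],
        "voc": [0.170, 0.259, 0.122],
        "aqs": [76.0, 82.0, 88.0],
        "pm25": [21.0, 14.0, 10.0],
        "latitude": [52.944803, 52.944803, 52.349492],
        "longitude": [5.062550, 5.062550, 4.797737],
    }
)

# Parse the timestamp strings so the column gets a datetime dtype.
sample["date"] = pd.to_datetime(sample["date"])

print(sample.dtypes)  # column name -> dtype
print(sample.shape)   # (number of rows, number of columns)
```

On the real data the same two calls would be `df.dtypes` and `df.shape`.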

Fix NaN issue

To make sure no fields contain NaN, I update my DataFrame and replace every NaN with 0.

In [ ]:
df = df.fillna(0)
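
Before overwriting anything, it can help to count how many values are actually missing. Note also that a blanket `fillna(0)` turns a missing latitude and longitude into the point (0, 0), which is nowhere near the Netherlands. The sketch below illustrates both points on made-up rows: the `sample` frame and its NaN positions are hypothetical, and dropping rows without coordinates is one possible alternative, not what this notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration; the NaN positions are made up.
sample = pd.DataFrame(
    {
        "voc": [0.170, np.nan, 0.122],
        "pm25": [21.0, 14.0, np.nan],
        "latitude": [52.944803, np.nan, 52.349492],
        "longitude": [5.062550, np.nan, 4.797737],
    }
)

# Count missing values per column before changing anything.
missing = sample.isna().sum()
print(missing)

# What the notebook does: replace every NaN with 0.
filled = sample.fillna(0)

# Alternative sketch: drop rows without coordinates, then fill the rest with 0.
located = sample.dropna(subset=["latitude", "longitude"]).fillna(0)
print(len(filled), len(located))
```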

Illustrate on Map

To illustrate the data on a map, I use the Folium library to visualize where I took the Atmotube to collect data.

In [ ]:
import folium
from folium import plugins

Build the map that shows all the locations where I collected data, both outside and inside.

In [ ]:
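# Initialize the map around the starting coordinates given in `location`.
# Note: recent Folium releases may no longer bundle the 'Stamen Toner' tiles;
# if this line fails, a built-in style such as 'CartoDB positron' is one option.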
amsMap = folium.Map(location=[52.349494,4.797808], tiles='Stamen Toner', zoom_start=9)

# For each row in the Atmotube DataFrame, plot the corresponding latitude and longitude on the map
for i,row in df.iterrows():
    folium.CircleMarker((row.latitude,row.longitude), radius=3, weight=2, color='red', fill_color='red', fill_opacity=.5).add_to(amsMap)

# Save the map as an HTML file
amsMap.save('amsPointMap.html')
amsMap
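
Two small refinements to the map cell are sketched below, as assumptions rather than part of the original analysis. Many readings share the same coordinates, so duplicates can be dropped before plotting, and the map center can be derived from the data instead of being hard-coded. The `coords` frame is a stand-in built from coordinates visible in the `df.head(100)` output, and `unique_points` and `center` are hypothetical names.

```python
import pandas as pd

# Coordinates copied from rows shown in the df.head(100) output.
coords = pd.DataFrame(
    {
        "latitude": [52.944803, 52.944803, 53.042728, 53.019348, 52.349492, 52.349492],
        "longitude": [5.062550, 5.062550, 5.642525, 5.705013, 4.797737, 4.797737],
    }
)

# Repeated readings at one spot draw overlapping markers; keep one per location.
unique_points = coords.drop_duplicates().reset_index(drop=True)

# Use the mean position of the locations as the map center.
center = [unique_points["latitude"].mean(), unique_points["longitude"].mean()]
print(len(coords), len(unique_points))
print(center)
```

`center` can then be passed as `location=center` to `folium.Map`, and a loop over `unique_points.itertuples()` adds one `CircleMarker` per place. Keeping the duplicates is still a reasonable choice when the overlap of semi-transparent markers is meant to show how often a spot was measured.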