Organizing the Indicator Zoo: Can a New Taxonomy Make It Easier for Citizen Science Data to Contribute to the United Nations Sustainable Development Goal Indicators?

In order to measure progress towards the aims outlined by the United Nations (UN) 2030 Agenda, data are needed for the different indicators that are linked to each UN Sustainable Development Goal (SDG). Where statistical or scientific data are not sufficient or available, alternative data sources, such as data from citizen science (CS) activities


INTRODUCTION
The UN resolution "Transforming our world: the 2030 Agenda for Sustainable Development" has shaped many national, regional, and local development policies since its ratification in 2015. To reach the ambitious vision for 2030, the agenda lists 17 Sustainable Development Goals (SDGs). Accompanied by 169 targets, which are processes by which the SDGs can be addressed, the SDGs help to evaluate progress in areas of relevance for sustainability (United Nations 2015). A framework of currently 231 indicators ensures the monitoring of progress on the targets and therefore success towards the 17 SDGs at a global level (United Nations n.d.).
Data from different sources are available to support the quantification of the UN indicators. Data for monitoring the SDGs are collected by, amongst others, national statistical offices (NSOs), international organizations, and national administrations (Fritz et al. 2019). A number of nontraditional data sources have been recognized as complementary for this purpose, including earth observation data, commercially available data, data from sensor networks, and data generated by public volunteers for scientific or monitoring purposes (Fritz et al. 2019; Woods et al. 2022). These participatory research activities bear many names; amongst them, the term Citizen Science (CS) is used most commonly (Franzoni et al. 2022; Haklay et al. 2021; Kullenberg and Kasperowski 2016; Moses 2022; Shirk et al. 2012). For the purpose of this paper, we take an all-encompassing approach and define CS as "active engagement of the general public in scientific research tasks" (Vohland et al. 2021, p. 1).
This paper describes a new approach for using CS data as the basis for SDG indicators by enriching them with structured paradata and other metadata to better assess their qualities and possible areas of use. The term "citizen" is not to be considered exclusive but encompasses all volunteers in the participatory processes.

CITIZEN SCIENCE DATA FOR MONITORING PROGRESS TOWARDS THE SUSTAINABLE DEVELOPMENT GOALS
The SDGs have been established to cope with the "wicked problems" of this century, which are characterized by complexity, high stakes of uncertainty, and divergence in viewpoints, values, and intentions of the stakeholders involved (Crowley and Head 2017; Head 2022; Rittel and Webber 1973; Sauermann et al. 2020). CS can be a useful tool in this context. Participants in CS activities can help identify sustainability problems, set the research agenda, and contribute efforts and knowledge. CS can also support the SDGs by contributing to the definition of national and subnational targets (Sauermann et al. 2020; Stockholm Environment Institute 2017). Also, the SDGs themselves build on features that have traits similar to CS, like encouraging new kinds of multistakeholder partnerships and collaborations (through, e.g., representation of CS networks), sustainable living, and global citizenship (Shulla et al. 2020). CS can support the SDGs further through educating and informing volunteers, resulting in changes in their behavior (Lämmerhirt et al. 2018; Shulla et al. 2020; Woods et al. 2022).
Although a large number of scholars and NSOs question whether data generated by CS activities are of sufficient quality for statistical purposes (e.g., Balázs et al. 2021), there is an emerging trend towards acknowledging the usefulness of those data to track SDG progress. In a recent policy brief, several international organizations present how CS data can contribute to producing evidence, strengthening accountability, or developing solutions (CROWD4SDG 2022). CS as a means to reach the SDGs has gained wider appreciation by global stakeholders, amongst them the UN. In 2017, UN institutions supported the establishment of the Citizen Science Global Partnership (Citizen Science Global Partnership n.d.), a network that is promoting CS for sustainability and is a partner of the Global Partnership for Sustainable Development Data (Global Partnership for Sustainable Development Data 2022). In November 2021, UNESCO adopted a Recommendation on Open Science that acknowledges CS as an essential element of open science (UNESCO 2021). These are positive signs for CS to play a more relevant role in supporting sustainable transitions and attaining the SDGs.
There is growing recognition that CS provides both means and opportunities for collecting various kinds of data to close data gaps and to contribute to monitoring progress towards individual SDGs (Fraisl et al. 2020; Lämmerhirt et al. 2018; Stockholm Environment Institute 2017; Woods et al. 2022). In particular, data from CS activities related to environmental issues have the potential to contribute to a significant number of the SDG indicators (Fraisl et al. 2020). In certain areas, this potential is also recognized by the UN (UN Water 2018).
Recognizing the challenges and insecurities related to the quality of CS data when measuring progress towards the SDGs, this paper presents a recently developed taxonomy for classifying sustainability indicators. One of the categories in the taxonomy specifically addresses the issue of indicator quality by classifying indicators according to the quality classes (1-3) from the European Statistical System (European Statistical System n.d.).
For the purpose of better linking the indicator framework to CS data, we have defined categories for both activities and data from CS initiatives (Supplemental Files 1 and 2: Appendices A and B). We find that scholars have chosen different approaches for the categorization of CS activities, each of them rooted in their definition and understanding of CS. We choose to categorize the CS activities according to level of participation, because in our understanding the participation of non-professional scientists in scientific work is the main factor that distinguishes CS activities from other scientific efforts.
For the purpose of better linking the indicator taxonomy to CS data, we also have defined categories for the output from CS activities: the CS data. This is important because these data might become the basis of indicators, either on their own or in combination with other data. We have identified five main categories to better describe data obtained through CS activities (Supplemental File 2: Appendix B).

THE NEW TAXONOMY: WHAT IS IT, AND FOR WHAT CAN IT BE USED?
Since the term "sustainable development" was coined in the 1987 report "Our common future" (World Commission on Environment and Development 1987), efforts have been made to monitor and track progress on the different fields of sustainable development across sectors, mainly by using sustainable development indicators (SDIs). This resulted in large numbers of possible indicators with little or no guidance for individuals or organizations in choosing the most relevant ones, or as Pinter et al. (2005) termed it, an "Indicator Zoo." This changed somewhat with the advent of the Millennium Development Goals and eventually the UN SDGs (United Nations 2015) and their respective indicator sets, both gaining wider global acknowledgement; but the situation today remains largely the same. It is still challenging for actors across sectors to choose in which way they monitor sustainability or compare indicators to find the ones best suited to assess their efforts towards sustainability (Pinter et al. 2005; Rasche 2010). So, the challenge is not to find indicators, but rather to know which ones to choose.
In Norway this need became evident for the Norwegian Association of Local and Regional Authorities (KS), as Norwegian municipalities and regions started looking for the best ways to measure sustainability following the establishment of the SDGs. To support the Norwegian local authorities in this work, KS developed a method for classifying SDG-related indicators, a taxonomy (Statistics Norway 2021). KS commissioned Statistics Norway (SSB) to establish the taxonomy. Its main purpose was to enable organizations, regardless of size and sector, to distinguish more quickly between indicators and indicator sets and to find the ones most relevant to their strategic priorities on their way towards sustainable development. The taxonomy was tested in 2021 and internationally launched in 2022.
A taxonomy is a system to classify objects in an orderly fashion, a purposeful sorting (Juliadotter and Choo 2015). The positioning, or classification, of the objects is not done just to make them look nice, but also indicates how they should be used in the best possible manner. This is the main idea behind the KS/SSB taxonomy for SDG indicators (Figure 1). It can be used to classify or assess SDIs to "clarify their use and usability, either on its own or in comparison to others" (Statistics Norway 2021, p. 6). Indicators classified according to the dimensions and classes in the taxonomy are therefore not just sorted but also presented in a way that makes it easier for different stakeholders to choose the most relevant indicators or indicator set that is fit-for-purpose, seen from the point of view of the respective stakeholder.
As shown in Figure 1, the taxonomy classifies indicators along three dimensions: Goal, Perspective, and Quality. Goal denotes the "what" of the indicator, classifying which SDG(s), SDG target(s) and triple bottom line(s) (TBLs: social justice, economic prosperity, and environmental quality [see Elkington 1998]) it might be relevant for. The Perspective dimension is used to say something about the "why," specifying the viewpoint of different users for, e.g., policy making or governance related activities. The Quality dimension points to the usefulness of the indicator-how well it is designed for its intended usage. Each dimension is divided into several categories. These make use of internationally known categorizations (where available), for better global dissemination of the model. Some examples are the SDGs and TBL for the Goal dimension, and a logic model for Evaluation in the Perspective dimension, as well as the classes from the European Statistical System (ESS) framework for the Quality dimension (Statistics Norway 2021).
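To make the three dimensions concrete, the following minimal sketch shows one way a classified indicator could be represented as a structured record. It is an illustration only: the class and field names (`Goal`, `Perspective`, `Quality`, `sdgs`, `tbl`, `evaluation`, `ess_class`) are our own assumptions and are not prescribed by the KS/SSB taxonomy, and the example values are limited to those reported in this paper for the experimental indicator.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Goal:
    """The "what": SDGs, SDG targets, and triple bottom line (TBL) pillars."""
    sdgs: List[int] = field(default_factory=list)
    targets: List[str] = field(default_factory=list)
    tbl: List[str] = field(default_factory=list)


@dataclass
class Perspective:
    """The "why": the viewpoint of the user (field names are hypothetical)."""
    evaluation: Optional[str] = None  # e.g. "output", "outcome", "impact"
    sectors: List[str] = field(default_factory=list)


@dataclass
class Quality:
    """Usefulness of the indicator, expressed as an ESS quality class."""
    ess_class: Optional[int] = None  # 1, 2, or 3


@dataclass
class IndicatorClassification:
    name: str
    goal: Goal
    perspective: Perspective
    quality: Quality


# Values taken from the worked example in this paper; the SDG and target
# lists are left empty here because they are reported in Table 1.
example = IndicatorClassification(
    name="Perceived air quality in urban areas",
    goal=Goal(tbl=["People", "Planet"]),
    perspective=Perspective(evaluation="output"),
    quality=Quality(ess_class=2),
)
print(example.name, example.quality.ess_class)
```

A nested record of this kind keeps the three dimensions separate, so that each can be filled in, debated, or revised independently.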

A PRACTICAL EXAMPLE OF CLASSIFICATION
The following section gives a concrete example of classification of an indicator based on CS data from a CS activity with a limited scope and a small, well-structured dataset (Supplemental File 3: Appendix C). The CS data set we applied stems from the EU FP7-funded CITI-SENSE project (2012-2016) (European Commission 2017). With the help of a smartphone application, the participants could indicate their perception of the surrounding air quality through a four-color code (from green = very good to red = very poor). The participants could indicate the position on a map with the help of their GPS location. They could also generate a user profile with basic sociodemographic information (age, gender, education level). Public perception data were collected in this way during a period of 18 months. A total of 332 reports were submitted for the larger Oslo area (Norway) during this period, and 241 valid reports were used for further analysis (see Grossberndt et al. 2020).
The CITI-SENSE project had collected data on citizens' perception of air quality at a given point in time at a specific geographic location. The data gathered in the project were not collected to establish a specific indicator, so for the purpose of this paper, we have used the available data set and created an experimental indicator "Perceived air quality in urban areas" (Supplemental File 4: Appendix D). We classified this indicator according to the taxonomy (see Figure 1), the CS typology of activities, and data category (see Supplemental Files 1 and 2: Appendices A and B). An overview of the results is available in Table 1.
We started with the classification in the dimension "Additional dimension(s)." The indicator was categorized according to the type of CS activity and category of data gathering. Volunteers were asked to collect data (observations of perceived air quality); therefore, we classified the CS activities as "Contribution or crowdsourcing." All data that were used as a basis for the experimental indicator were subjective observations of air quality, and therefore belong to the CS data category "Observation reporting." When we looked at the taxonomy, the classification procedure was partly straightforward, but triggered debate in certain dimensions and categories. It was relatively easy to classify the indicator according to the Goal dimension. A number of SDGs directly address air quality. Connecting to the TBL was also quite simple. Local air quality is mainly shaped by human activities (People) that influence not only the physical surroundings (Planet) but also the social living conditions of the inhabitants of that particular area (People). Our discussions on the classification according to the Goal dimension led us quite naturally to the next dimension, Perspective.
In this dimension, we discovered links to 8 of 14 development sectors, more than we first had anticipated. This is because the topic of air quality addresses both the sources of what affects air quality and the actors being affected by air quality. The time series and geolocation of the data, alongside the sociodemographic information from the respondents, also made it quite simple to classify this indicator according to the Distribution perspective. This includes hourly data from specific regions and to some extent disaggregated data according to age, gender, and level of education. Here, we definitely see weaknesses in the data set. Since it was voluntary to create a user profile, only parts of the data set could be disaggregated according to sociodemographic values. Potential bias in the sociodemographic data (see Grossberndt et al. 2020) might limit the reliability of the data behind this indicator. But the indicator itself is useful, as it provides additional information to complement official (environmental) monitoring in a given geographical area.
The data set is a historical data set from a given period in time, which is no longer updated. This limits its usefulness for live monitoring purposes, but it is still a useful information archive, or a possible baseline towards further data gathering in the future. If the data set is tagged with the specific time period during which the actual data were collected, this will help in determining the usage of this indicator based on its own specific time series. We discussed whether the attribute "Time period" should be recommended as part of an updated, future taxonomy, but this could also be solved by adding a voluntary field in a metadata structure based on the taxonomy.
The data set contains subjective observations, but this does not necessarily have negative implications for their further usability. According to Garrett and Latawiec (2015, p. 18), "indicators are not objective and, in fact, they do not need to be (as long as they are adequate and reflect assumptions behind sustainability)." The data on perceived air quality given by the users of the app are primarily providing output. However, classifying indicators according to the Evaluation perspective can, in some cases, trigger controversy: "Whether the indicator measures output, outcome or impact may vary from one user to another, depending on the adopted logical framework of evaluation" (Statistics Norway 2021, p. 31).
Statistics Norway (2021) also state that the logical framework of evaluation can affect the assessment of the last dimension, Quality. This is an important dimension when it comes to CS data. In our example, the indicator was classified as Class 2, since it belonged neither to Class 1 (not all elements of the ESS standard quality framework, such as timeliness, could be met) nor to Class 3 (data, method, and measurable concept were given).
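The reasoning behind the Class 2 assignment can be sketched as a simple decision rule. This is our reading of the paragraph above and not the official definition of the ESS classes: the function name, its parameters, and the order of the checks are assumptions made for illustration, and only the quality element named in the text (timeliness) is included in the example.

```python
from typing import Dict


def quality_class(has_data: bool, has_method: bool, has_concept: bool,
                  ess_elements_met: Dict[str, bool]) -> int:
    """Return a quality class (1-3) under the assumptions stated above."""
    # Assumed rule: an indicator lacking data, method, or a measurable
    # concept falls into Class 3.
    if not (has_data and has_method and has_concept):
        return 3
    # Assumed rule: Class 1 requires every assessed quality element to be met.
    if ess_elements_met and all(ess_elements_met.values()):
        return 1
    # Otherwise the indicator sits between the two: Class 2.
    return 2


# Experimental indicator from this paper: data, method, and measurable
# concept were given, but timeliness could not be met.
print(quality_class(True, True, True, {"timeliness": False}))
```

Writing the rule down in this form makes explicit where expert judgment enters, namely in deciding whether each quality element counts as met.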

A WALK IN THE ZOO: APPLYING THE TAXONOMY AND TYPOLOGY
In Pinter et al. (2005), the term "zoo" has been used to describe chaos and lack of structure. What if we used the term in a different way? Normally, a zoo is the opposite of a confusing place to be. Maps make it easier for the visitors to navigate their way. Different enclosures and safety measures facilitate the interaction with the different animals in the right way. They distinguish between animals that can be petted and those that are to be admired from a safe distance. The enclosures are also properly marked with relevant signs or fact sheets to learn more about the animals and how they relate to each other. So, maybe an Indicator Zoo is not a bad idea. Maybe it is the best scenario for a world where CS activities, CS data, and SDG indicators are somewhat fragmented? From our point of view, a properly organized SDI Zoo would actually be quite preferable.
We therefore want to explore the possibilities of creating an orderly CS SDI Zoo by combining the typologies of CS activities and CS data with the KS/SSB taxonomy, to better categorize possible CS indicators that can support the SDGs. This can be done in different ways, since it is possible to order the CS SDI Zoo according to the categories in the taxonomy (e.g., by SDG, by TBL, or by data quality) and by the typologies for CS activities and data (see Supplemental Files 1 and 2: Appendices A and B). In this article, we are focusing on CS, and have therefore sorted the example map of our CS SDI Zoo (Figure 2) by the typology for CS activities. This makes it easier to look for and find indicators belonging to different "species" of CS activities. We then show an example of how to label an enclosure, or home, of the different SDIs, through metadata from both the taxonomy and CS typologies. This helps us to know more about what the indicator (animal) in each enclosure is usable for in the work of measuring sustainability, whether it is safe to pet (based on reliable data sources of good quality), or should be handled with more caution (experimental indicators).

THE CITIZEN SCIENCE SUSTAINABLE DEVELOPMENT INDICATOR ZOO: TAXONOMY AND TYPOLOGIES COMBINED
Classifying the CS activities can be done in different ways. In this paper, we have chosen to classify according to the degree of volunteers' participation in the scientific activities. The degree of participation is also indicative of the point in time when the engagement is happening, for example, at the planning phase or during the collection of data only. This information will be useful for NSOs. But to organize the indicators this way is only one way of navigating them.
Data collection is the single CS task that is performed across all categories of CS activities. We could therefore also have chosen to draw the map of the zoo (Figure 2) with the various CS data categories (see Supplemental File 2: Appendix B) as enclosures. Then, the indicators would be presented according to categories such as indicators from data collection, indicators from sorting and classifications, or indicators from algorithm development.
The map of the CS SDI Zoo can therefore be ordered and re-ordered by all the available metadata used to categorize the CS indicators. This creates a dynamic map, where different visitors can navigate through the zoo differently, depending on their point of entry. They can look at the indicators from, for example, the perspectives of the SDG and/or SDG target for which they might be useful, the development area they can measure, how often the data are gathered, or the statistical quality of the indicator. This should present citizen scientists, statisticians, and SDG experts with a dynamic tool to better navigate the different possibilities that might be found in SDIs based on data from various CS activities.
To make this possible, each indicator needs to be categorized properly. Once this is done, the sign outside the indicator enclosure will provide the necessary information to look at this specimen of indicators from the perspectives of both the KS/SSB taxonomy and the CS typology of activities and data category, as shown in Figure 3.
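The idea of re-ordering the map by any available metadata field can be sketched as a grouping operation. The sketch below is a hypothetical illustration, not part of the taxonomy: the function `zoo_map` and the dictionary keys are our own naming, the first record uses the classification reported in this paper, and the second record is an invented placeholder included only to show the regrouping.

```python
from collections import defaultdict
from typing import Any, Dict, List


def zoo_map(indicators: List[Dict[str, Any]], key: str) -> Dict[Any, List[str]]:
    """Group indicator names into "enclosures" by one metadata field."""
    enclosures: Dict[Any, List[str]] = defaultdict(list)
    for indicator in indicators:
        values = indicator.get(key, "unclassified")
        # A field may hold one value or several (e.g. several TBL pillars).
        if not isinstance(values, (list, tuple, set)):
            values = [values]
        for value in values:
            enclosures[value].append(indicator["name"])
    return dict(enclosures)


indicators = [
    {   # classification as reported in this paper
        "name": "Perceived air quality in urban areas",
        "cs_activity": "Contribution or crowdsourcing",
        "data_category": "Observation reporting",
        "tbl": ["People", "Planet"],
        "quality_class": 2,
    },
    {   # hypothetical placeholder, for illustration only
        "name": "Hypothetical indicator B",
        "cs_activity": "Hypothetical activity type",
        "tbl": ["Planet"],
        "quality_class": 3,
    },
]

# The same zoo, entered through two different gates.
print(zoo_map(indicators, "cs_activity"))
print(zoo_map(indicators, "tbl"))
```

Because the grouping key is a parameter, each visitor can redraw the map from their own point of entry without the underlying records changing.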

CHALLENGES OF CLASSIFICATION
Sometimes, an indicator is easily categorized, but the task of classifying indicators (or placing the animal in the right enclosure) contains different challenges. An indicator that clearly belongs to Class 1 of the quality dimension and that is easily placed within all other categories, is most likely a "domesticated animal" that can be "petted," or quite safely be an indicator relevant to measure the specific SDG under which it is categorized. But this is not the case for all indicators.
For some indicators, it might be difficult to distinguish clearly between certain attributes within the categories of the taxonomy. This might be due to the attributes of the indicator or to differences in experts' opinions on how to categorize the indicator in question. One such example is the Evaluation perspective, as stated by Statistics Norway (2021).
Another challenge is linked to new or experimental indicators, which challenge our concept of both taxonomy and CS typology. If everybody agrees that something looks a lot like a horse, but it also has a long, spiraled alicorn on its forehead, then what is it? How do we categorize it? Which enclosure is its rightful home? With the horses? With the antelopes? In the pool with the narwhales? In this case, we are probably dealing with a unicorn, so why not create an enclosure for "Creatures of myth" until more is known about its functions and possible uses? If an indicator is of Class 3 quality and there is much debate concerning how it should be categorized, if at all, we might be looking at such a new species, one that is in need of additional categorizations to be properly understood and utilized. The point is, we do not throw it out of the zoo just because it cannot be easily classified. We use the categories we have, learn more about it, and in time it hopefully finds its place with all the other specimens and species in a properly ordered CS SDI Zoo.
The experimental indicator to which we applied the taxonomy combines two special characteristics: (1) the data are subjective and (2) they are collected by volunteers. The exercise of applying the KS/SSB taxonomy to the CS data set provided us with more information about what this dataset actually contains and therefore what it can be used for, compared with other data sets that might be classified in the future. The experimental indicator can also be benchmarked to other indicators, both official and unofficial, maybe filling a knowledge gap, strengthening a theory, or complementing other data being used in policy making. The taxonomy facilitates the comparison of different indicators, as they can be compared according to the same criteria. This can include indicators based on CS-generated data and data sets generated through other activities.
Our suggested approach could also be of use in the development of new sustainability indicators. A crucial characteristic of indicators is their alignment with the goals of the respective users. This can be achieved by their codevelopment with the users, a principle that is key to citizen science activities (Garrett and Latawiec 2015; Mitchell 1996). As a complement to indicators developed by experts in a top-down manner and with explicit methodologies, CS offers opportunities for bottom-up approaches, developing qualitative indicators together with different stakeholders and implicit methodologies. In this way, different kinds of (local) knowledge from different stakeholders can be combined in a democratic way, fostering opportunities for learning, empowerment, and ownership (Waas et al. 2014). For "only through active community involvement can indicators facilitate progress toward sustainable development goals" (Reed et al. 2005, p. 1).

OPPORTUNITIES AND CHALLENGES FOR CITIZEN SCIENCE NETWORKS TO CONNECT TO THE TAXONOMY
The taxonomy has potential as a useful tool for CS networks, both nationally and internationally. It can help classify the data generated by CS activities and point to the indicators for which they may be useful. A robust data infrastructure would be needed to collect, store, and grant access to CS data, but if in place, such infrastructure could be used by NSOs services to complement official SDG reporting (CROWD4SDG 2022; MacFeely and Barnat 2017). Open access according to FAIR data principles, in concordance with data protection regulations such as the European Union (EU) General Data Protection Regulation (GDPR), could provide access to data collected through CS activities, not only for those interested in CS, but also for authoritative organizations, such as NSOs, many of which claim they do not have access to these data (CROWD4SDG 2022). The infrastructure would also be a useful repository for local and regional authorities to develop quality criteria for uptake of CS data in official SDG reporting, and it would offer new ways of collaboration between statistical services, research, and academia, as well as societal stakeholders and municipalities.
Figure 3 An example of labeling an indicator according to the Norwegian Association of Local and Regional Authorities/Statistics Norway (KS/SSB) taxonomy and citizen science typology for activities and data category. Image source: Freepik.com.

CONCLUSIONS
In this paper, we present a new taxonomy that has both the intention and the potential to better organize the CS SDI Zoo. It suggests sorting the CS indicators according to their "species of CS activity" and labeling the "indicator enclosures" with the necessary information to look at each specimen of indicators from different perspectives. The taxonomy can help facilitate a better understanding of the respective data sets created through CS activities. This provides new opportunities for data generated by CS activities other than professional science or monitoring activities. As long as there is enough information about the origin of the data and other contextual metadata, and as long as we can gather data and knowledge from a wide spectrum of sources, data generated by CS activities have great potential to inform about the progress towards sustainability.

DATA ACCESSIBILITY STATEMENT
Human subjects using the smartphone application in the CITI-SENSE project were informed about the purpose of the data collection and its nature during recruitment. The use of the app was on a voluntary basis. Provision of personal information (gender, year of birth, education level) was voluntary, and if a subject decided not to provide this information, there was no restriction to functions in the app. No individual information other than the voluntarily provided information and the position of the user at the time of reporting was retained in the database.

SUPPLEMENTAL FILES
The supplemental files for this article can be found as follows: