<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article
  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS ONE</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">plosone</journal-id>
<journal-title-group>
<journal-title>PLOS ONE</journal-title>
</journal-title-group>
<issn pub-type="epub">1932-6203</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, CA USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">PONE-D-15-02269</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0133505</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>You Are What You Tweet: Connecting the Geographic Variation in America’s Obesity Rate to Twitter Content</article-title>
<alt-title alt-title-type="running-head">You Are What You Tweet: Connecting Twitter and America’s Obesity Rate</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes" xlink:type="simple">
<name name-style="western">
<surname>Gore</surname> <given-names>Ross Joseph</given-names></name>
<xref ref-type="corresp" rid="cor001">*</xref>
<xref ref-type="aff" rid="aff001"/>
</contrib>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Diallo</surname> <given-names>Saikou</given-names></name>
<xref ref-type="aff" rid="aff001"/>
</contrib>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Padilla</surname> <given-names>Jose</given-names></name>
<xref ref-type="aff" rid="aff001"/>
</contrib>
</contrib-group>
<aff id="aff001">
<addr-line>Virginia Modeling, Analysis and Simulation Center, Old Dominion University, Norfolk, VA, United States of America</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor" xlink:type="simple">
<name name-style="western">
<surname>Meyre</surname> <given-names>David</given-names></name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"/>
</contrib>
</contrib-group>
<aff id="edit1">
<addr-line>McMaster University, CANADA</addr-line>
</aff>
<author-notes>
<fn fn-type="conflict" id="coi001">
<p>The authors have declared that no competing interests exist.</p>
</fn>
<fn fn-type="con" id="contrib001">
<p>Conceived and designed the experiments: RJG SD JP. Performed the experiments: RJG. Analyzed the data: RJG. Contributed reagents/materials/analysis tools: RJG. Wrote the paper: RJG SD JP.</p>
</fn>
<corresp id="cor001">* E-mail: <email xlink:type="simple">rgore@odu.edu</email></corresp>
</author-notes>
<pub-date pub-type="collection">
<year>2015</year>
</pub-date>
<pub-date pub-type="epub">
<day>2</day>
<month>9</month>
<year>2015</year>
</pub-date>
<volume>10</volume>
<issue>9</issue>
<elocation-id>e0133505</elocation-id>
<history>
<date date-type="received">
<day>16</day>
<month>1</month>
<year>2015</year>
</date>
<date date-type="accepted">
<day>3</day>
<month>6</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-year>2015</copyright-year>
<copyright-holder>Gore et al</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/" xlink:type="simple">
<license-p>This is an open access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/" xlink:type="simple">Creative Commons Attribution License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="info:doi/10.1371/journal.pone.0133505" xlink:type="simple"/>
<abstract>
<p>We conduct a detailed investigation of the relationship between the obesity rate of urban areas and expressions of happiness, diet and physical activity on social media. We do so by analyzing a massive, geo-tagged data set comprising over 200 million words generated over the course of 2012 and 2013 on the social network service Twitter. Among many results, we show that areas with lower obesity rates: (1) have happier tweets and frequently discuss (2) food, particularly fruits and vegetables, and (3) physical activities of any intensity. Additionally, we provide evidence that each of these results offers different and unique insight into the variation of the obesity rate in urban areas within the United States. Our work shows how the contents of social media may potentially be used to estimate real-time, population-scale measures of factors related to obesity.</p>
</abstract>
<funding-group>
<funding-statement>The authors have no support or funding to report.</funding-statement>
</funding-group>
<counts>
<fig-count count="6"/>
<table-count count="2"/>
<page-count count="16"/>
</counts>
<custom-meta-group>
<custom-meta id="data-availability" xlink:type="simple">
<meta-name>Data Availability</meta-name>
<meta-value>All relevant data are within the manuscript and its Supporting Information files.</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec id="sec001" sec-type="intro">
<title>Introduction</title>
<p>Obesity is becoming increasingly problematic and common in the United States population [<xref ref-type="bibr" rid="pone.0133505.ref001">1</xref>, <xref ref-type="bibr" rid="pone.0133505.ref002">2</xref>]. More than one-third of U.S. adults are obese, resulting in an annual medical cost of over $150 billion [<xref ref-type="bibr" rid="pone.0133505.ref001">1</xref>, <xref ref-type="bibr" rid="pone.0133505.ref003">3</xref>, <xref ref-type="bibr" rid="pone.0133505.ref004">4</xref>]. These medical costs occur because obese people are significantly more prone to the leading causes of preventable death, including heart disease, stroke and type 2 diabetes [<xref ref-type="bibr" rid="pone.0133505.ref005">5</xref>]. Obesity is defined using the Body-Mass Index (BMI), which is an individual’s weight divided by the square of their height. Obese individuals have a BMI of 30 kg/m<sup>2</sup> or greater. Obesity rate is defined as the percentage of the people in a Metropolitan Statistical Area (MSA) who have a BMI of 30 kg/m<sup>2</sup> or greater [<xref ref-type="bibr" rid="pone.0133505.ref002">2</xref>, <xref ref-type="bibr" rid="pone.0133505.ref006">6</xref>].</p>
<p>Despite the prevalence of obesity in the U.S., it is not problematic to the same degree across the country. According to the 2012–2013 Gallup-Healthways Wellness Survey (GHWS) the obesity rate of U.S. MSAs ranges from 12.4% (Boulder, CO) to 39.5% (Huntington, WV). The lack of uniformity in the obesity rate has motivated researchers to identify the factors that can affect obesity and offer insight into the variation in the data [<xref ref-type="bibr" rid="pone.0133505.ref007">7</xref>].</p>
<p>While the GHWS and other approaches to quantifying the well-being of a city rely almost exclusively on survey data, there is now a range of complementary, remote-sensing methods available to researchers. The explosion in the amount and availability of data relating to social media in the past 10 years has driven a rapid increase in the application of data-driven techniques to the social sciences and other analyses of large-scale populations.</p>
<p>Our overall aim in this paper is to investigate how the obesity rate of an urban geographic area correlates with the contents of geo-tagged tweets in that area. Here, tweets refer to 140 character microblogs expressed on the social media platform <ext-link ext-link-type="uri" xlink:href="http://www.twitter.com" xlink:type="simple">www.twitter.com</ext-link> and urban areas reflect the 189 MSAs defined by the U.S. Office of Management and Budget [<xref ref-type="bibr" rid="pone.0133505.ref008">8</xref>]. In particular we ask four research questions using geo-tagged tweets from 2012–2013:
<list list-type="order"><list-item><p>How is the average happiness of the tweets in an urban area related to the population’s obesity rate?</p></list-item> <list-item><p>How is the overall discussion of food consumption on Twitter, and the nutritional density of the food discussed, in an urban area related to the population’s obesity rate?</p></list-item> <list-item><p>How is the overall discussion of physical activity on Twitter, and the intensity of the activity discussed, in an urban area related to the population’s obesity rate?</p></list-item> <list-item><p>To what extent do the measures used to answer these questions offer unique insight and how well does each correlate with a MSA-level survey measure of a similar variable?</p></list-item></list></p>
<p>Our methodology for answering the first question uses word frequency distributions collected from a large corpus of geo-tagged tweets posted on Twitter, with individual words scored for their happiness independently by users of Amazon’s Mechanical Turk service [<xref ref-type="bibr" rid="pone.0133505.ref009">9</xref>]. This measure was introduced by Dodds and Danforth [<xref ref-type="bibr" rid="pone.0133505.ref010">10</xref>], tested for robustness and sensitivity [<xref ref-type="bibr" rid="pone.0133505.ref011">11</xref>], and employed by Mitchell et al. in a similar pursuit [<xref ref-type="bibr" rid="pone.0133505.ref012">12</xref>].</p>
<p>In answering questions 2 and 3 we explore the extent to which the level of granularity needed to answer the first question is required for the second and third question. To answer the final question we compute the correlations among the measures used to answer the first three questions to gauge how much unique insight they provide. We also evaluate how well each of our derived Twitter measures correlates with a MSA-level survey measure of a similar variable. This analysis helps determine if the measures actually capture the intended variables (happiness, diet and physical activity) as opposed to other unrelated variables.</p>
<p>The answers to these questions are not always intuitive and provide significant insight into the health-related habits of Twitter users in different urban areas. Ultimately, they show how social media may potentially be used to estimate population-scale measures of factors related to obesity.</p>
<p>The remainder of the paper is structured as follows. In the Methods section, we describe the data sets in our study and our measures of happiness, diet and physical activity derived from tweets. In the Results section we demonstrate that obesity rate and happiness have a similar relationship in 2012 and 2013 as the two variables did in 2011. Next, we explore the relationship between the discussion of food consumption on Twitter and the obesity rate in urban areas. Then, we shift our focus to discussions of physical activity. Finally, we explore the extent to which these measures: (1) contain unique insight and (2) match MSA-level survey measures of similar variables. We conclude with a discussion of the validity and limitations of our study along with directions for future work.</p>
</sec>
<sec id="sec002" sec-type="materials|methods">
<title>Methods</title>
<sec id="sec003">
<title>Datasets</title>
<p>We examine the relationship between the content of a corpus of geo-tagged tweets (not retweets) and the obesity rate of 189 urban areas in the contiguous United States during the calendar years 2012 and 2013. Our data collection procedure adheres to Twitter’s terms of use/service. It uses Twitter’s streaming API which provides low latency access to Twitter’s global stream of Tweet data. The data we collected reflects a ∼ 10% random sample of all tweets in 2012–2013. From that random sample, 1.5% of the tweets were geo-tagged resulting in a corpus of over 25 million geo-tagged tweets. The geographic boundaries of the urban areas we explore reflect the MSAs defined by the U.S. Office of Management and Budget. It is important to note that these urban area boundaries often agglomerate small towns together, particularly when there are small towns geographically close to larger towns or cities.</p>
<p>The obesity rates of the MSAs are provided by the 2012–2013 Gallup Healthways Wellbeing Survey. While other sources of geographic obesity rates exist (e.g. BRFSS and NHANES) [<xref ref-type="bibr" rid="pone.0133505.ref013">13</xref>, <xref ref-type="bibr" rid="pone.0133505.ref014">14</xref>], we use the GHWS because (1) its data were collected during the same time frame (2012–2013) as our Twitter corpus and (2) it measures other MSA-level variables related to happiness, diet and physical activity, which allow us to evaluate additional aspects of our work (i.e. Question 4).</p>
<p>The relationship between these datasets is examined using six measures derived from our Twitter corpus: (a) one related to happiness, (b) three related to diet and (c) two related to physical activity. We define each of these measures next.</p>
</sec>
<sec id="sec004">
<title>Measure of Happiness</title>
<p>To quantify the happiness of a tweet we employ Mitchell et al.’s measure <italic>h</italic><sub><italic>avg</italic></sub> which reflects the <italic>happiness</italic> of a tweet. In previous work Mitchell et al. showed that the <italic>happiness</italic> of tweets is correlated with several population-scale measures including household income, education levels and the 2011 obesity rate in MSAs [<xref ref-type="bibr" rid="pone.0133505.ref012">12</xref>].</p>
<p>The <italic>happiness</italic> of a tweet is measured using the Language Assessment by Mechanical Turk (LabMT) word list, assembled by combining the 5,000 most frequent words occurring in each of four text sources: Google Books (English), music lyrics, the New York Times and Twitter. Ten thousand of these individual words have been scored by users of Amazon’s Mechanical Turk service on a scale of 1 (sad) to 9 (happy), resulting in a measure of happiness, <italic>h</italic>, for each given word [<xref ref-type="bibr" rid="pone.0133505.ref009">9</xref>]. For example, ‘rainbow’ is one of the happiest words in the list with a score of 8.10, while ‘earthquake’ is one of the saddest, with a score of 1.90. Neutral words like ‘the’ or ‘thereof’ tend to score in the middle of the scale, with <italic>h</italic>(<italic>the</italic>) = 4.98 and <italic>h</italic>(<italic>thereof</italic>) = 5.00 respectively.</p>
<p>For a given tweet <italic>T</italic> containing <italic>N</italic> unique words, the average happiness, <italic>h</italic><sub><italic>avg</italic></sub>, is calculated by:
<disp-formula id="pone.0133505.e001"><alternatives><graphic id="pone.0133505.e001g" mimetype="image" xlink:type="simple" position="anchor" xlink:href="info:doi/10.1371/journal.pone.0133505.e001"/><mml:math id="M1" display="block" overflow="scroll"><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>v</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:mi>h</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:mi>h</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:math></alternatives> <label>(1)</label></disp-formula></p>
<p>In <xref ref-type="disp-formula" rid="pone.0133505.e001">Eq 1</xref>, <italic>f</italic><sub><italic>i</italic></sub> is the frequency of the <italic>i</italic>th word <italic>w</italic><sub><italic>i</italic></sub> in <italic>T</italic> for which we have a happiness value <italic>h</italic>(<italic>w</italic><sub><italic>i</italic></sub>) and <inline-formula id="pone.0133505.e002"><alternatives><graphic id="pone.0133505.e002g" mimetype="image" xlink:type="simple" position="anchor" xlink:href="info:doi/10.1371/journal.pone.0133505.e002"/><mml:math id="M2" display="inline" overflow="scroll"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>/</mml:mo><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:math></alternatives></inline-formula> is the normalized frequency of the word <italic>w</italic><sub><italic>i</italic></sub>.</p>
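<p>As an illustration of <xref ref-type="disp-formula" rid="pone.0133505.e001">Eq 1</xref>, the following is a minimal Python sketch rather than the code used in the study. It assumes a plain whitespace tokenizer, and the dictionary <italic>HAPPINESS</italic> is a hypothetical four-word excerpt holding only the example scores quoted above; the function name <italic>h_avg</italic> is likewise chosen for illustration.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from collections import Counter

# Hypothetical excerpt of a word-happiness lexicon. The four scores are the
# examples quoted in the text; a real run would load the full word list.
HAPPINESS = {"rainbow": 8.10, "earthquake": 1.90, "the": 4.98, "thereof": 5.00}


def h_avg(tweet, lexicon=HAPPINESS):
    """Frequency-weighted mean happiness of the scored words in a tweet (Eq 1).

    Returns None when the tweet contains no word from the lexicon.
    """
    # Naive whitespace tokenization; this is an assumption of the sketch.
    words = tweet.lower().split()
    # f_i: frequency of each word w_i that has a happiness score h(w_i).
    freqs = Counter(w for w in words if w in lexicon)
    total = sum(freqs.values())
    if total == 0:
        return None
    # Sum of h(w_i) * f_i divided by the sum of f_i.
    return sum(lexicon[w] * f for w, f in freqs.items()) / total


print(h_avg("the rainbow after the earthquake"))
```
]]></preformat>
<p>For the example tweet, the word “after” has no score in the excerpt and is skipped, so the result is (2 × 4.98 + 8.10 + 1.90) / 4 = 4.99 up to floating-point rounding.</p>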
</sec>
<sec id="sec005">
<title>Measures of Diet</title>
<p>To quantify the dietary content of the foods one tweets about we explore three different measures at varying degrees of granularity. Each of these three measures requires that we partition our corpus of tweets using the following binary criterion: if a tweet contains a word(s) describing at least one food in the USDA National Nutrient Database (USDANDB) [<xref ref-type="bibr" rid="pone.0133505.ref015">15</xref>] it is placed in the <italic>Food Tweets</italic> set <italic>FT</italic>; otherwise it is placed in the <italic>Non-Food Tweets</italic> set <italic>NFT</italic>.</p>
<p>Given this partitioning, the <italic>Food Tweet %</italic> (<italic>FT</italic>%) of a MSA, is the ratio of <italic>Food Tweets</italic> in the MSA compared to the total number of tweets within the MSA. This reflects our first measure of diet and is shown in <xref ref-type="disp-formula" rid="pone.0133505.e003">Eq 2</xref>.
<disp-formula id="pone.0133505.e003"><alternatives><graphic id="pone.0133505.e003g" mimetype="image" xlink:type="simple" position="anchor" xlink:href="info:doi/10.1371/journal.pone.0133505.e003"/><mml:math id="M3" display="block" overflow="scroll"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd columnalign="right"><mml:mrow><mml:mi>F</mml:mi> <mml:mi>T</mml:mi> <mml:mo>%</mml:mo> <mml:mo>=</mml:mo> <mml:mfrac><mml:mrow><mml:mo>|</mml:mo> <mml:mi>F</mml:mi> <mml:mi>T</mml:mi> <mml:mo>|</mml:mo></mml:mrow> <mml:mrow><mml:mo>(</mml:mo> <mml:mo>|</mml:mo> <mml:mi>F</mml:mi> <mml:mi>T</mml:mi> <mml:mo>|</mml:mo> <mml:mo>+</mml:mo> <mml:mo>|</mml:mo> <mml:mi>N</mml:mi> <mml:mi>F</mml:mi> <mml:mi>T</mml:mi> <mml:mo>|</mml:mo> <mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></alternatives> <label>(2)</label></disp-formula></p>
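<p>The partition and the ratio in <xref ref-type="disp-formula" rid="pone.0133505.e003">Eq 2</xref> can be sketched in a few lines of Python. This is an illustrative sketch only: <italic>FOOD_TERMS</italic> is a hypothetical stand-in for the food names in the USDANDB, matching is done on single whitespace-separated words, and the function names are assumptions.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Hypothetical stand-in for the food names drawn from the nutrient database.
FOOD_TERMS = {"spinach", "strawberries", "soda", "pizza"}


def is_food_tweet(tweet, food_terms=FOOD_TERMS):
    """Binary criterion: True if the tweet mentions at least one food term."""
    return any(word in food_terms for word in tweet.lower().split())


def food_tweet_share(tweets, food_terms=FOOD_TERMS):
    """FT% of Eq 2: |FT| / (|FT| + |NFT|) for the tweets of one MSA."""
    if not tweets:
        raise ValueError("need at least one tweet")
    n_food = sum(is_food_tweet(t, food_terms) for t in tweets)
    # Every tweet is in exactly one of FT or NFT, so the denominator is
    # simply the total number of tweets.
    return n_food / len(tweets)


msa_tweets = [
    "pizza night with friends",
    "stuck in traffic again",
    "fresh strawberries at the market",
    "great game last night",
]
print(food_tweet_share(msa_tweets))
```
]]></preformat>
<p>Two of the four toy tweets mention a food term, so the share for this made-up MSA is 0.5.</p>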
<p>While the <italic>FT</italic>% of a MSA quantifies how frequently people tweet about food, it does not offer any insight into the actual food about which people tweet. To measure how nutritious each food included in each tweet is we measure the average nutrient density, <italic>nd</italic><sub><italic>avg</italic></sub>, of the tweet by using the Nutrient-Rich Foods Index (NRF) formula [<xref ref-type="bibr" rid="pone.0133505.ref016">16</xref>].</p>
<p>While other formulae to determine the nutrient density of foods exist, we use the NRF because its scores have been shown to be highly correlated with the recommendations of the USDA’s Healthy Eating Index [<xref ref-type="bibr" rid="pone.0133505.ref017">17</xref>] and diets featuring high nutrient dense foods on the NRF have been shown to reduce obesity, while diets consisting of low nutrient dense foods increase the prevalence of obesity [<xref ref-type="bibr" rid="pone.0133505.ref018">18</xref>, <xref ref-type="bibr" rid="pone.0133505.ref019">19</xref>]. Furthermore the NRF is not restricted to any subset of foods. It is generalizable to any food in the USDANDB [<xref ref-type="bibr" rid="pone.0133505.ref020">20</xref>].</p>
<p>Nutrient density in the NRF is determined by computing the daily recommended intake value of protein, dietary fiber, vitamin A, vitamin C, vitamin E, calcium, magnesium, iron and potassium provided per 100 kCals of a given food and then subtracting the daily recommended intake values for saturated fat, sodium and added sugars in 100 kCals of the food. Using this formula, fruits and vegetables are some of the most nutrient dense foods (<italic>nrf</italic>(<italic>spinach</italic>) = 694.8; <italic>nrf</italic>(<italic>strawberries</italic>) = 375.9) while soda is one of the least (<italic>nrf</italic>(<italic>soda</italic>) = −55.8). For a given tweet <italic>T</italic> containing <italic>N</italic> unique foods we calculate the average nutrient density <italic>nd</italic><sub><italic>avg</italic></sub> using <xref ref-type="disp-formula" rid="pone.0133505.e004">Eq 3</xref>.
<disp-formula id="pone.0133505.e004"><alternatives><graphic id="pone.0133505.e004g" mimetype="image" xlink:type="simple" position="anchor" xlink:href="info:doi/10.1371/journal.pone.0133505.e004"/><mml:math id="M4" display="block" overflow="scroll"><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:mi>n</mml:mi><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>v</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:mi>n</mml:mi><mml:mi>r</mml:mi><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>f</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:msub><mml:mi>d</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:mi>n</mml:mi><mml:mi>r</mml:mi><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>f</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:msub><mml:mi>d</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:math></alternatives> 
<label>(3)</label></disp-formula></p>
<p>The calculation of <italic>nd</italic><sub><italic>avg</italic></sub> in <xref ref-type="disp-formula" rid="pone.0133505.e004">Eq 3</xref> is similar to the calculation of <italic>h</italic><sub><italic>avg</italic></sub>. In <xref ref-type="disp-formula" rid="pone.0133505.e004">Eq 3</xref>, <italic>f</italic><sub><italic>i</italic></sub> is the frequency of the <italic>i</italic>th food <italic>food</italic><sub><italic>i</italic></sub> in <italic>T</italic> with NRF value <italic>nrf</italic>(<italic>food</italic><sub><italic>i</italic></sub>) and <inline-formula id="pone.0133505.e005"><alternatives><graphic id="pone.0133505.e005g" mimetype="image" xlink:type="simple" position="anchor" xlink:href="info:doi/10.1371/journal.pone.0133505.e005"/><mml:math id="M5" display="inline" overflow="scroll"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>/</mml:mo><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:math></alternatives></inline-formula> is the normalized frequency of the food <italic>food</italic><sub><italic>i</italic></sub>. The result is a measure of the average nutrient density of the foods mentioned in a single tweet.</p>
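<p>A small worked example may help make <xref ref-type="disp-formula" rid="pone.0133505.e004">Eq 3</xref> concrete. The Python sketch below is illustrative only and reads <italic>p</italic><sub><italic>i</italic></sub> as <italic>f</italic><sub><italic>i</italic></sub> divided by the sum of the food frequencies, as in <xref ref-type="disp-formula" rid="pone.0133505.e001">Eq 1</xref>. The table <italic>NRF</italic> holds only the three scores quoted above, and the single-word matching and the function name <italic>nd_avg</italic> are assumptions.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from collections import Counter

# NRF scores quoted in the text; a full table would cover every listed food.
NRF = {"spinach": 694.8, "strawberries": 375.9, "soda": -55.8}


def nd_avg(tweet, nrf=NRF):
    """Average nutrient density of the foods mentioned in a tweet (Eq 3).

    Returns None when the tweet mentions no food from the table.
    """
    # Naive single-word matching; multi-word food names are out of scope here.
    words = tweet.lower().split()
    freqs = Counter(w for w in words if w in nrf)
    total = sum(freqs.values())
    if total == 0:
        return None
    # Second form of Eq 3: sum of nrf(food_i) * p_i with p_i = f_i / total.
    return sum(nrf[food] * (f / total) for food, f in freqs.items())


print(nd_avg("spinach salad and a soda"))
```
]]></preformat>
<p>The example tweet mentions spinach and soda once each, so each has <italic>p</italic><sub><italic>i</italic></sub> = 0.5 and the result is (694.8 − 55.8) / 2 = 319.5 up to floating-point rounding.</p>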
<p>There is a significant difference between the level of granularity in our first measure (<italic>FT</italic>%) and our second (<italic>nd</italic><sub><italic>avg</italic></sub>). To bridge this gap we formulate one more measure of the diet of an MSA: <italic>Produce %</italic> (<italic>Prod</italic>%). <italic>Prod</italic>% combines the nutritional aspects of <italic>nd</italic><sub><italic>avg</italic></sub> with the coarse granularity of <italic>FT</italic>%.</p>
<p>Recall, fruits and vegetables are among the most nutritionally dense items on the NRF Index. Any tweet that mentions at least one food listed in either the <italic>Fruits and Fruit Juices</italic> or the <italic>Vegetable and Vegetable Products</italic> section of the USDANDB is in set <italic>Prod</italic>. Given this partitioning, <italic>Prod</italic>% is the ratio of tweets in set <italic>Prod</italic> compared to the total number of tweets in the MSA. This measure is shown in <xref ref-type="disp-formula" rid="pone.0133505.e006">Eq 4</xref>.
<disp-formula id="pone.0133505.e006"><alternatives><graphic id="pone.0133505.e006g" mimetype="image" xlink:type="simple" position="anchor" xlink:href="info:doi/10.1371/journal.pone.0133505.e006"/><mml:math id="M6" display="block" overflow="scroll"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd columnalign="right"><mml:mrow><mml:mi>P</mml:mi> <mml:mi>r</mml:mi> <mml:mi>o</mml:mi> <mml:mi>d</mml:mi> <mml:mo>%</mml:mo> <mml:mo>=</mml:mo> <mml:mfrac><mml:mrow><mml:mo>|</mml:mo> <mml:mi>P</mml:mi> <mml:mi>r</mml:mi> <mml:mi>o</mml:mi> <mml:mi>d</mml:mi> <mml:mo>|</mml:mo></mml:mrow> <mml:mrow><mml:mo>(</mml:mo> <mml:mo>|</mml:mo> <mml:mi>F</mml:mi> <mml:mi>T</mml:mi> <mml:mo>|</mml:mo> <mml:mo>+</mml:mo> <mml:mo>|</mml:mo> <mml:mi>N</mml:mi> <mml:mi>F</mml:mi> <mml:mi>T</mml:mi> <mml:mo>|</mml:mo> <mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></alternatives> <label>(4)</label></disp-formula></p>
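<p>Because <xref ref-type="disp-formula" rid="pone.0133505.e003">Eq 2</xref> and <xref ref-type="disp-formula" rid="pone.0133505.e006">Eq 4</xref> share the same denominator and every produce tweet is also a food tweet, <italic>Prod</italic>% can never exceed <italic>FT</italic>% for the same MSA. The Python sketch below illustrates this by computing both in one pass. The two term sets are hypothetical excerpts, not the actual USDANDB sections, and the function name is an assumption.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Hypothetical excerpts standing in for the database sections.
PRODUCE_TERMS = {"spinach", "strawberries", "kale"}
OTHER_FOOD_TERMS = {"soda", "pizza", "burger"}


def diet_shares(tweets):
    """Return (FT%, Prod%) as fractions of all tweets in one MSA."""
    if not tweets:
        raise ValueError("need at least one tweet")
    food = produce = 0
    for tweet in tweets:
        words = set(tweet.lower().split())
        if words & PRODUCE_TERMS:
            produce += 1
            food += 1  # every produce tweet is also a food tweet
        elif words & OTHER_FOOD_TERMS:
            food += 1
    # Both measures divide by the total number of tweets (Eqs 2 and 4).
    return food / len(tweets), produce / len(tweets)


toy_tweets = ["kale smoothie", "burger and soda", "rainy day", "spinach pizza"]
print(diet_shares(toy_tweets))
```
]]></preformat>
<p>For the four toy tweets, three mention a food and two of those mention produce, giving the pair (0.75, 0.5).</p>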
</sec>
<sec id="sec006">
<title>Measures of Physical Activity</title>
<p>Along with happiness and diet, research has shown that the physical activity level of individuals affects obesity [<xref ref-type="bibr" rid="pone.0133505.ref021">21</xref>–<xref ref-type="bibr" rid="pone.0133505.ref023">23</xref>]. With this foundation we explore two different measures to quantify discussions of physical activity within our Twitter data set. Each of these measures requires that we partition our corpus of tweets into those that discuss physical activities and those that do not. To perform this partition we use a binary criterion similar to our food tweet criterion. If a tweet contains a word(s) discussing at least one physical activity in the guidelines for exercise testing published by the American College of Sports Medicine (ACSM) and the Centers for Disease Control and Prevention (CDC) [<xref ref-type="bibr" rid="pone.0133505.ref024">24</xref>] it is placed in the <italic>Physical Activity Tweets</italic> set <italic>PA</italic>; otherwise it is placed in the <italic>Non-Physical Activity Tweets</italic> set <italic>NPA</italic>. While the guidelines for exercise published by the ACSM and CDC are not exhaustive and do not contain every possible physical activity descriptor, we employ them in our work because they list over 400 activities and are well established. They have been used by the American Heart Association [<xref ref-type="bibr" rid="pone.0133505.ref025">25</xref>], national cross-sectional studies [<xref ref-type="bibr" rid="pone.0133505.ref026">26</xref>] and public health recommendations [<xref ref-type="bibr" rid="pone.0133505.ref027">27</xref>].</p>
<p>Our first physical activity metric, <italic>Physical Activity %</italic> (<italic>PA</italic>%) is shown in <xref ref-type="disp-formula" rid="pone.0133505.e007">Eq 5</xref>. It measures the ratio of <italic>Physical Activity Tweets</italic> compared to the total number of tweets.
<disp-formula id="pone.0133505.e007"><alternatives><graphic id="pone.0133505.e007g" mimetype="image" xlink:type="simple" position="anchor" xlink:href="info:doi/10.1371/journal.pone.0133505.e007"/><mml:math id="M7" display="block" overflow="scroll"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd columnalign="right"><mml:mrow><mml:mi>P</mml:mi> <mml:mi>A</mml:mi> <mml:mo>%</mml:mo> <mml:mo>=</mml:mo> <mml:mfrac><mml:mrow><mml:mo>|</mml:mo> <mml:mi>P</mml:mi> <mml:mi>A</mml:mi> <mml:mo>|</mml:mo></mml:mrow> <mml:mrow><mml:mo>(</mml:mo> <mml:mo>|</mml:mo> <mml:mi>P</mml:mi> <mml:mi>A</mml:mi> <mml:mo>|</mml:mo> <mml:mo>+</mml:mo> <mml:mo>|</mml:mo> <mml:mi>N</mml:mi> <mml:mi>P</mml:mi> <mml:mi>A</mml:mi> <mml:mo>|</mml:mo> <mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></alternatives> <label>(5)</label></disp-formula></p>
<p>The guidelines of physical activities from the ACSM and CDC divide activities into two categories which serve as the basis for our second measure. The two categories of activities are: (1) moderately intense activities that burn 3.5 kCals a minute and (2) strenuously intense activities that burn 7.0 kCals a minute. Moderately intense physical activities include yoga, walking and stretching while strenuously intense physical activities include jogging, mountain climbing and aerobics. For a given tweet <italic>T</italic> discussing <italic>M</italic> moderately intense physical activities and <italic>S</italic> strenuously intense physical activities we calculate <italic>pa</italic><sub><italic>weighted</italic></sub> in <xref ref-type="disp-formula" rid="pone.0133505.e008">Eq 6</xref>. <italic>pa</italic><sub><italic>weighted</italic></sub> is the <italic>weighted</italic> number of calories burned by participating in all the physical activities discussed in the tweet for one minute.
<disp-formula id="pone.0133505.e008"><alternatives><graphic id="pone.0133505.e008g" mimetype="image" xlink:type="simple" position="anchor" xlink:href="info:doi/10.1371/journal.pone.0133505.e008"/><mml:math id="M8" display="block" overflow="scroll"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd columnalign="right"><mml:mrow><mml:mi>p</mml:mi> <mml:msub><mml:mi>a</mml:mi> <mml:mrow><mml:mi>w</mml:mi> <mml:mi>e</mml:mi> <mml:mi>i</mml:mi> <mml:mi>g</mml:mi> <mml:mi>h</mml:mi> <mml:mi>t</mml:mi> <mml:mi>e</mml:mi> <mml:mi>d</mml:mi></mml:mrow></mml:msub> <mml:mrow><mml:mo>(</mml:mo> <mml:mi>T</mml:mi> <mml:mo>)</mml:mo></mml:mrow> <mml:mo>=</mml:mo> <mml:mrow><mml:mo>(</mml:mo> <mml:mn>3</mml:mn> <mml:mo>.</mml:mo> <mml:mn>5</mml:mn> <mml:mo>×</mml:mo> <mml:mi>M</mml:mi> <mml:mo>)</mml:mo></mml:mrow> <mml:mo>+</mml:mo> <mml:mrow><mml:mo>(</mml:mo> <mml:mn>7</mml:mn> <mml:mo>.</mml:mo> <mml:mn>0</mml:mn> <mml:mo>×</mml:mo> <mml:mi>S</mml:mi> <mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></alternatives> <label>(6)</label></disp-formula></p>
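<p><xref ref-type="disp-formula" rid="pone.0133505.e008">Eq 6</xref> can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the two sets contain only the single-word example activities named above (multi-word activities such as mountain climbing would need phrase matching, which is omitted), and <italic>M</italic> and <italic>S</italic> are taken to count distinct activities in the tweet rather than repeated mentions.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Single-word example activities from the text; the real guideline list is
# much longer and includes multi-word activities not handled by this sketch.
MODERATE = {"yoga", "walking", "stretching"}
STRENUOUS = {"jogging", "aerobics"}

MODERATE_KCAL_PER_MIN = 3.5
STRENUOUS_KCAL_PER_MIN = 7.0


def pa_weighted(tweet):
    """Weighted kCals per minute for the activities named in a tweet (Eq 6)."""
    words = set(tweet.lower().split())
    # Assumption: M and S count distinct activities, not repeated mentions.
    m = len(words & MODERATE)
    s = len(words & STRENUOUS)
    return MODERATE_KCAL_PER_MIN * m + STRENUOUS_KCAL_PER_MIN * s


print(pa_weighted("yoga then jogging then stretching"))
```
]]></preformat>
<p>The example tweet names two moderate activities and one strenuous activity, so the result is 3.5 × 2 + 7.0 × 1 = 14.0.</p>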
</sec>
<sec id="sec007">
<title>Objectivity and Limitations</title>
<p>None of the measures in Eqs <xref ref-type="disp-formula" rid="pone.0133505.e003">2</xref>–<xref ref-type="disp-formula" rid="pone.0133505.e008">6</xref> attempts to take the context of words or the meaning of a tweet into account. While this may limit the ability of our measures to appropriately score tweets containing only a few words, previous researchers have employed this approach and obtained reliable results. Furthermore, by ignoring the context of words we gain a degree of impartiality: we are not the ones deciding a priori whether a given word, food or activity is associated with obesity. This strategy reduces experimental bias and maintains objectivity.</p>
</sec>
</sec>
<sec id="sec008" sec-type="results">
<title>Results</title>
<sec id="sec009">
<title>Happiness and Obesity Rate</title>
<p>The first measure we explore is the <italic>happiness</italic> conveyed in individual words from tweets. Mitchell et al. showed that the <italic>happiness</italic> of tweets is correlated with the 2011 obesity rate in MSAs [<xref ref-type="bibr" rid="pone.0133505.ref012">12</xref>]. To validate this result we explore the correlation between the <italic>happiness</italic> of tweets and the obesity rate of MSAs in our random sample of Twitter data. Recall that our Twitter data contain ∼ 25 million tweets collected during 2012 and 2013, while Mitchell et al.’s data contain ∼ 10 million tweets collected during 2011. In addition, Mitchell et al. used GHWS obesity rates collected during 2011, while we use obesity rates collected during 2012 and 2013.</p>
<p>
<xref ref-type="fig" rid="pone.0133505.g001">Fig 1</xref> shows the correlation of <italic>h</italic><sub><italic>avg</italic></sub> and the obesity rate in all the MSAs for: (a) 2011 (Mitchell et al.) and (b) 2012–2013 (our work). The data show that the happiness people express in tweets generally decreases as the obesity rate increases. This result holds true in 2011 as well as in 2012–2013. Furthermore, the strength of the relationship and the subtleties of the data points are similar. For example, Boulder, CO is the city with the lowest obesity rate and is among the three happiest cities in each data set. Similarly, Beaumont, TX is among the 10 MSAs with the highest obesity rates and among the five least happy cities in both data sets. The Spearman correlation coefficients are similar (<italic>r</italic> = -0.339 in 2011, <italic>r</italic> = -0.318 in 2012–2013) and each has a <italic>p</italic>-value far below .001, indicating that the negative correlations are statistically significant. Next, we explore the relationship between obesity rate and five measures of other factors affecting obesity (diet and physical activity) that can be gleaned from Twitter data in a manner similar to the <italic>happiness</italic> metric, <italic>h</italic><sub><italic>avg</italic></sub>.</p>
<fig id="pone.0133505.g001" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0133505.g001</object-id>
<label>Fig 1</label>
<caption>
<title>Correlation of <italic>h</italic><sub><italic>avg</italic></sub> and obesity rate over all MSAs in: (a) 2011 and (b) 2012–2013.</title>
</caption>
<graphic mimetype="image" xlink:type="simple" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.g001"/>
</fig>
</sec>
<sec id="sec010">
<title>Dietary Health and Obesity Rate</title>
<p>Research has shown that diet influences obesity [<xref ref-type="bibr" rid="pone.0133505.ref028">28</xref>, <xref ref-type="bibr" rid="pone.0133505.ref029">29</xref>]. However, the happiness metric, <italic>h</italic><sub><italic>avg</italic></sub>, does not account for diet. Many foods that are widely considered unhealthy have high happiness values (<italic>h</italic>). For example, the term <italic>cake</italic> has an <italic>h</italic> value of 7.58. Conversely, healthy foods can have relatively low happiness values. The term <italic>vegan</italic> has an <italic>h</italic> value of 4.82 despite reflecting a diet featuring fruits and vegetables. Furthermore, many healthy and unhealthy foods are not included in the list of terms scored for happiness. As a result, they are completely ignored in the previous analysis.</p>
<p>To gain insight into the relationship between the foods one tweets about and obesity, we explore the correlation between three different measures of the dietary content of tweets and the obesity rate of MSAs. The first measure we explore is <italic>nd</italic><sub><italic>avg</italic></sub>, shown in <xref ref-type="disp-formula" rid="pone.0133505.e004">Eq 3</xref>. Recall that <italic>nd</italic><sub><italic>avg</italic></sub> reflects the average nutrient density of a tweet. The Twitter data we use for this analysis include more than two million tweets from 2012–2013 mentioning more than six hundred of the 8,000 different foods listed in the USDANDB. The Spearman correlation between <italic>nd</italic><sub><italic>avg</italic></sub> and obesity rate in all MSAs over 2012–2013 is shown in <xref ref-type="fig" rid="pone.0133505.g002">Fig 2</xref>.</p>
<fig id="pone.0133505.g002" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0133505.g002</object-id>
<label>Fig 2</label>
<caption>
<title>Correlation of <italic>nd</italic><sub><italic>avg</italic></sub> and obesity rate over all MSAs for 2012–2013.</title>
</caption>
<graphic mimetype="image" xlink:type="simple" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.g002"/>
</fig>
<p>
<xref ref-type="fig" rid="pone.0133505.g002">Fig 2</xref> shows that there is not a statistically significant relationship between the nutrient density of the foods people discuss in their tweets and obesity rate. This result is unexpected. Given our previous result related to the happiness of tweets and the established relationship between diet and obesity, we anticipated a statistically significant negative correlation. We pursue an explanation by identifying the ten foods that are most strongly negatively and positively correlated with obesity. These results are shown in <xref ref-type="table" rid="pone.0133505.t001">Table 1</xref>.</p>
<table-wrap id="pone.0133505.t001" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0133505.t001</object-id>
<label>Table 1</label>
<caption>
<title>Top Ten Foods Most Negatively &amp; Positively Correlated With Obesity Rate.</title>
</caption>
<alternatives>
<graphic id="pone.0133505.t001g" mimetype="image" xlink:type="simple" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.t001"/>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
</colgroup>
<thead>
<tr>
<th align="left" rowspan="1" colspan="1"><italic>Negative</italic></th>
<th align="center" rowspan="1" colspan="1"/>
<th align="left" rowspan="1" colspan="1"/>
<th align="right" rowspan="1" colspan="1"/>
<th align="left" rowspan="1" colspan="1"><italic>Positive</italic></th>
<th align="center" rowspan="1" colspan="1"/>
<th align="left" rowspan="1" colspan="1"/>
<th align="right" rowspan="1" colspan="1"/>
</tr>
<tr>
<th align="left" rowspan="1" colspan="1">Food</th>
<th align="center" rowspan="1" colspan="1"><italic>r</italic></th>
<th align="left" rowspan="1" colspan="1"><italic>p</italic>-value</th>
<th align="right" rowspan="1" colspan="1">NRF</th>
<th align="left" rowspan="1" colspan="1">Food</th>
<th align="center" rowspan="1" colspan="1"><italic>r</italic></th>
<th align="left" rowspan="1" colspan="1"><italic>p</italic>-value</th>
<th align="right" rowspan="1" colspan="1">NRF</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">wine</td>
<td align="char" char="." rowspan="1" colspan="1">-.407</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">10.0</td>
<td align="left" rowspan="1" colspan="1">chicken nuggets</td>
<td align="char" char="." rowspan="1" colspan="1">.207</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.01</td>
<td align="char" char="." rowspan="1" colspan="1">5.9</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">coffee</td>
<td align="char" char="." rowspan="1" colspan="1">-.372</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">4.5</td>
<td align="left" rowspan="1" colspan="1">ham</td>
<td align="char" char="." rowspan="1" colspan="1">.174</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.01</td>
<td align="char" char="." rowspan="1" colspan="1">-6.4</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">banana</td>
<td align="char" char="." rowspan="1" colspan="1">-.325</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">51.8</td>
<td align="left" rowspan="1" colspan="1">french fries</td>
<td align="char" char="." rowspan="1" colspan="1">.165</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.05</td>
<td align="char" char="." rowspan="1" colspan="1">-15.2</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">espresso</td>
<td align="char" char="." rowspan="1" colspan="1">-.314</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">3.8</td>
<td align="left" rowspan="1" colspan="1">chicken wings</td>
<td align="char" char="." rowspan="1" colspan="1">.145</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.05</td>
<td align="char" char="." rowspan="1" colspan="1">6.8</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">croissant</td>
<td align="char" char="." rowspan="1" colspan="1">-.285</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">-9.1</td>
<td align="left" rowspan="1" colspan="1">sausage</td>
<td align="char" char="." rowspan="1" colspan="1">.129</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &gt;.05</td>
<td align="char" char="." rowspan="1" colspan="1">-19.3</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">apple</td>
<td align="char" char="." rowspan="1" colspan="1">-.282</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">46.7</td>
<td align="left" rowspan="1" colspan="1">biscuit</td>
<td align="char" char="." rowspan="1" colspan="1">.113</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &gt;.05</td>
<td align="char" char="." rowspan="1" colspan="1">0.2</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">salmon</td>
<td align="char" char="." rowspan="1" colspan="1">-.274</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">36.0</td>
<td align="left" rowspan="1" colspan="1">collards</td>
<td align="char" char="." rowspan="1" colspan="1">.097</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &gt;.05</td>
<td align="char" char="." rowspan="1" colspan="1">392.5</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">quinoa</td>
<td align="char" char="." rowspan="1" colspan="1">-.268</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">31.8</td>
<td align="left" rowspan="1" colspan="1">bbq sauce</td>
<td align="char" char="." rowspan="1" colspan="1">.092</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &gt;.05</td>
<td align="char" char="." rowspan="1" colspan="1">-2.5</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">brie</td>
<td align="char" char="." rowspan="1" colspan="1">-.265</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">-8.5</td>
<td align="left" rowspan="1" colspan="1">fried chicken</td>
<td align="char" char="." rowspan="1" colspan="1">.088</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &gt;.05</td>
<td align="char" char="." rowspan="1" colspan="1">8.9</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">macaroon</td>
<td align="char" char="." rowspan="1" colspan="1">-.261</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="char" char="." rowspan="1" colspan="1">-8.4</td>
<td align="left" rowspan="1" colspan="1">gravy</td>
<td align="char" char="." rowspan="1" colspan="1">.084</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &gt;.05</td>
<td align="char" char="." rowspan="1" colspan="1">-4.2</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
<p>
<xref ref-type="table" rid="pone.0133505.t001">Table 1</xref> offers several insights into the set of tweets that discuss food. The first is that areas with lower obesity rates do not exclusively discuss foods that are nutritionally dense. Similarly, areas with high obesity rates discuss a mix of nutritionally dense and non-nutritionally dense foods. Specifically, both lists contain multiple foods with positive and negative NRF Index values, and the food with the highest NRF Index value (collards) is correlated with high obesity rates.</p>
<p>It is important to note that our nutrient density metric ignores the quantity and preparation of the food discussed in the tweet. These limitations could explain the lack of a significant relationship between the nutrient density of the foods people discuss in tweets and obesity rate. However, the correlation coefficients and <italic>p</italic>-values in <xref ref-type="table" rid="pone.0133505.t001">Table 1</xref> reveal that tweets discussing food, regardless of its nutritional density, tend to be more strongly negatively than positively correlated with obesity rate. The absolute value of the correlation coefficient of the food tenth most negatively correlated with obesity rate (macaroon, <italic>r</italic> = -.261) is ∼ 25% larger than the correlation coefficient of the food most positively correlated with obesity rate (chicken nuggets, <italic>r</italic> = .207). The <italic>p</italic>-values in <xref ref-type="table" rid="pone.0133505.t001">Table 1</xref> also reflect this trend. The correlations for all ten foods negatively correlated with obesity rate are statistically significant (<italic>p</italic> &lt;.05), while only the top four of the foods positively correlated with obesity rate are statistically significant.</p>
<p>Given these two observations, we explore the data to see if the frequency with which individuals tweet about food, regardless of its nutritional density, is correlated with obesity. We use the same Twitter data as in our previous analysis. However, in this analysis we measure the ratio of <italic>Food Tweets</italic> to the total number of tweets. This metric, <italic>FT</italic>%, is shown in <xref ref-type="disp-formula" rid="pone.0133505.e003">Eq 2</xref>. The Spearman correlation between <italic>FT</italic>% and obesity rate over all MSAs is shown in <xref ref-type="fig" rid="pone.0133505.g003">Fig 3</xref>.</p>
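<p>The kind of computation described here can be sketched in Python as follows. This is a hedged illustration, not the code used in this study: it assumes <italic>FT</italic>% is the share of food tweets expressed as a percentage, it computes the Spearman coefficient as the Pearson correlation of average ranks, and the four per-MSA values are invented toy numbers chosen only to make the example run.</p>
<preformat preformat-type="code"><![CDATA[
```python
def food_tweet_percent(food_tweets, total_tweets):
    """Assumed form of FT%: food tweets as a percentage of all tweets."""
    return 100.0 * food_tweets / total_tweets


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values for four hypothetical MSAs (not data from this study).
tweet_counts = [(120, 1000), (90, 1000), (60, 1000), (30, 1000)]
obesity_rate = [18.0, 22.0, 27.0, 31.0]

ft_percent = [food_tweet_percent(food, total) for food, total in tweet_counts]
print(spearman(ft_percent, obesity_rate))
```
]]></preformat>
<p>With real data, each list would hold one value per MSA. A statistics library routine such as <monospace>scipy.stats.spearmanr</monospace> serves the same purpose and also reports a <italic>p</italic>-value.</p>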
<fig id="pone.0133505.g003" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0133505.g003</object-id>
<label>Fig 3</label>
<caption>
<title>Correlation of <italic>FT</italic>% and obesity rate over all MSAs for 2012 &amp; 2013.</title>
</caption>
<graphic mimetype="image" xlink:type="simple" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.g003"/>
</fig>
<p>
<xref ref-type="fig" rid="pone.0133505.g003">Fig 3</xref> shows that the frequency with which people discuss foods in tweets generally decreases as obesity rate increases. For example, San Francisco, CA is one of the MSAs with the highest <italic>FT</italic>% and is among the ten MSAs with the lowest obesity rate. Similarly, several of the twenty MSAs with the highest obesity rates (Flint, MI; Mobile, AL; Rockford, IL) are among the bottom twenty MSAs in terms of <italic>FT</italic>%. However, the negative correlation between <italic>FT</italic>% and obesity rate is not as strong as the negative correlation between <italic>h</italic><sub><italic>avg</italic></sub> and obesity rate. Furthermore, the reason for the negative correlation between <italic>FT</italic>% and obesity rate is not immediately obvious. There is no established body of evidence showing that the more people discuss food the less obese they are.</p>
<p>To examine our data further we explore our final measure of the diet of an MSA: <italic>Produce %</italic> (<italic>Prod</italic>%). Recall that <italic>Prod</italic>% combines the nutritional aspects of <italic>nd</italic><sub><italic>avg</italic></sub> with the coarse granularity of <italic>FT</italic>%. It reflects the percentage of total tweets that discuss at least one of the foods listed in either the <italic>Fruits and Fruit Juices</italic> or <italic>Vegetable and Vegetable Products</italic> sections of the USDANDB. The Twitter data we use for this analysis include more than one million tweets from 2012–2013 mentioning more than 150 different fruits, vegetables or fruit/vegetable related products. The Spearman correlation between <italic>Prod</italic>% and obesity rate over all MSAs is shown in <xref ref-type="fig" rid="pone.0133505.g004">Fig 4</xref>.</p>
<fig id="pone.0133505.g004" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0133505.g004</object-id>
<label>Fig 4</label>
<caption>
<title>Correlation of <italic>Prod</italic>% and obesity over all MSAs for 2012 &amp; 2013.</title>
</caption>
<graphic mimetype="image" xlink:type="simple" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.g004"/>
</fig>
<p>
<xref ref-type="fig" rid="pone.0133505.g004">Fig 4</xref> shows that the <italic>Prod</italic>% metric reconciles the trends we saw in our previous explorations with the measures <italic>nd</italic><sub><italic>avg</italic></sub> and <italic>FT</italic>%. The frequency with which people tweet about fruits, vegetables or related products increases as obesity decreases.</p>
<p>Intuitively this makes sense. Fruits and vegetables are some of the highest scoring items on the NRF Index, so eating them regularly should decrease the obesity rate. The previous measure, <italic>nd</italic><sub><italic>avg</italic></sub>, attempted to account for this but over-penalized tweeters for mentioning average and below-average foods on the NRF Index. The <italic>FT</italic>% metric offered a much coarser level of granularity but did not consider the nutritional density of the foods being discussed in a tweet at all. By including nutritional density at a coarse level of granularity we are able to reveal a correlation with obesity rate (<italic>r</italic> = -0.344) that is similar in magnitude to the correlation between <italic>h</italic><sub><italic>avg</italic></sub> and obesity rate. Next, we investigate the discussion of physical activity on Twitter and its relationship to the obesity rate in MSAs.</p>
</sec>
<sec id="sec011">
<title>Physical Activity Level and Obesity Rate</title>
<p>Along with happiness and diet, research has shown that the physical activity level of individuals affects obesity [<xref ref-type="bibr" rid="pone.0133505.ref021">21</xref>–<xref ref-type="bibr" rid="pone.0133505.ref023">23</xref>]. However, none of our previously explored measures (<italic>h</italic><sub><italic>avg</italic></sub>, <italic>nd</italic><sub><italic>avg</italic></sub>, <italic>FT</italic>% and <italic>Prod</italic>%) account for discussions of physical activities within tweets. As a result, we explore two different measures of discussions of physical activity within our Twitter data set.</p>
<p>Our first physical activity measure, <italic>Physical Activity %</italic> (<italic>PA</italic>%), is the ratio of physical activity related tweets to the total number of tweets. Our second measure weights physical activities according to the intensity levels published in the guidelines of the ACSM and CDC. These two measures are shown in Eqs <xref ref-type="disp-formula" rid="pone.0133505.e007">5</xref> and <xref ref-type="disp-formula" rid="pone.0133505.e008">6</xref>. The Spearman correlations between <italic>PA</italic>% and obesity rate and between <italic>pa</italic><sub><italic>weighted</italic></sub> and obesity rate in all MSAs over 2012 and 2013 are shown in <xref ref-type="fig" rid="pone.0133505.g005">Fig 5(a) and 5(b)</xref>.</p>
<fig id="pone.0133505.g005" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0133505.g005</object-id>
<label>Fig 5</label>
<caption>
<title>Correlation of obesity rate and (a) <italic>PA</italic>% and (b) <italic>pa</italic><sub><italic>weighted</italic></sub> over all MSAs in 2012 &amp; 2013.</title>
</caption>
<graphic mimetype="image" xlink:type="simple" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.g005"/>
</fig>
<p>The Twitter data we use for this analysis include more than three million tweets from 2012 and 2013 mentioning more than eighty of the physical activities listed by the ACSM and CDC. Almost two million tweets discuss forty-eight different activities of moderate intensity and more than one million tweets discuss thirty-six different activities of strenuous intensity.</p>
<p>The <italic>pa</italic><sub><italic>weighted</italic></sub> values of the tweets in our data set vary. The minimum is zero, which reflects a tweet that does not discuss any physical activities from the list published by the ACSM and CDC. The maximum <italic>pa</italic><sub><italic>weighted</italic></sub> observed in our data set is 24.5. However, over 99% of the tweets in our data set have <italic>pa</italic><sub><italic>weighted</italic></sub> values of either: 0, 3.5 or 7.</p>
<p>
<xref ref-type="fig" rid="pone.0133505.g005">Fig 5</xref> shows that there is a statistically significant negative correlation between both <italic>PA</italic>% and <italic>pa</italic><sub><italic>weighted</italic></sub> and the obesity rate in MSAs. However, the relationship between <italic>PA</italic>% and obesity rate is stronger (<italic>r</italic> = -0.330) than the relationship between <italic>pa</italic><sub><italic>weighted</italic></sub> (<italic>r</italic> = -0.190) and obesity rate. This result may seem unexpected. The <italic>pa</italic><sub><italic>weighted</italic></sub> metric offers the capability to combine the calories burned from multiple activities based on their intensity level. Given these additional capabilities one might expect it to correlate better with obesity rate than the basic <italic>PA</italic>% metric. To gather additional insight we calculate the activities most positively and negatively correlated with obesity rate in <xref ref-type="table" rid="pone.0133505.t002">Table 2</xref>. <xref ref-type="table" rid="pone.0133505.t002">Table 2</xref> only includes five activities in each column because there are so few physical activities that have a positive statistically significant correlation with obesity rate.</p>
<table-wrap id="pone.0133505.t002" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0133505.t002</object-id>
<label>Table 2</label>
<caption>
<title>Top Five Physical Activities Most Negatively &amp; Positively Correlated With Obesity Rate.</title>
</caption>
<alternatives>
<graphic id="pone.0133505.t002g" mimetype="image" xlink:type="simple" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.t002"/>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
<col align="left" valign="top" span="1"/>
</colgroup>
<thead>
<tr>
<th align="left" rowspan="1" colspan="1"><italic>Negative</italic></th>
<th align="center" rowspan="1" colspan="1"/>
<th align="left" rowspan="1" colspan="1"/>
<th align="right" rowspan="1" colspan="1"/>
<th align="left" rowspan="1" colspan="1"><italic>Positive</italic></th>
<th align="center" rowspan="1" colspan="1"/>
<th align="left" rowspan="1" colspan="1"/>
<th align="right" rowspan="1" colspan="1"/>
</tr>
<tr>
<th align="left" rowspan="1" colspan="1">Activity</th>
<th align="center" rowspan="1" colspan="1"><italic>r</italic></th>
<th align="left" rowspan="1" colspan="1"><italic>p</italic>-value</th>
<th align="right" rowspan="1" colspan="1">Intensity</th>
<th align="left" rowspan="1" colspan="1">Activity</th>
<th align="center" rowspan="1" colspan="1"><italic>r</italic></th>
<th align="left" rowspan="1" colspan="1"><italic>p</italic>-value</th>
<th align="right" rowspan="1" colspan="1">Intensity</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">golf</td>
<td align="char" char="." rowspan="1" colspan="1">-.327</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="right" rowspan="1" colspan="1">moderate</td>
<td align="left" rowspan="1" colspan="1">basketball</td>
<td align="char" char="." rowspan="1" colspan="1">.218</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.01</td>
<td align="right" rowspan="1" colspan="1">strenuous</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">yoga</td>
<td align="char" char="." rowspan="1" colspan="1">-.318</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="right" rowspan="1" colspan="1">moderate</td>
<td align="left" rowspan="1" colspan="1">hunting</td>
<td align="char" char="." rowspan="1" colspan="1">.182</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.01</td>
<td align="right" rowspan="1" colspan="1">moderate</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">hiking</td>
<td align="char" char="." rowspan="1" colspan="1">-.273</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="right" rowspan="1" colspan="1">moderate</td>
<td align="left" rowspan="1" colspan="1">football</td>
<td align="char" char="." rowspan="1" colspan="1">.176</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.05</td>
<td align="right" rowspan="1" colspan="1">strenuous</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">racquetball</td>
<td align="char" char="." rowspan="1" colspan="1">-.246</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.001</td>
<td align="right" rowspan="1" colspan="1">strenuous</td>
<td align="left" rowspan="1" colspan="1">dancing</td>
<td align="char" char="." rowspan="1" colspan="1">.151</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &gt;.05</td>
<td align="right" rowspan="1" colspan="1">moderate</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">lacrosse</td>
<td align="char" char="." rowspan="1" colspan="1">-.222</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &lt;.01</td>
<td align="right" rowspan="1" colspan="1">strenuous</td>
<td align="left" rowspan="1" colspan="1">coaching</td>
<td align="char" char="." rowspan="1" colspan="1">.128</td>
<td align="left" rowspan="1" colspan="1"><italic>p</italic> &gt;.05</td>
<td align="right" rowspan="1" colspan="1">moderate</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
<p>
<xref ref-type="table" rid="pone.0133505.t002">Table 2</xref> shows that areas with low obesity and areas with high obesity both engage in Twitter discussions of a mixture of moderately and strenuously intense activities. Both lists include three moderately intense activities and two strenuously intense activities. However, <xref ref-type="table" rid="pone.0133505.t002">Table 2</xref> also shows that areas with lower obesity rates simply tweet more about physical activities than areas with high obesity rates. The absolute value of the correlation coefficient for the fifth most negatively correlated activity is higher than the correlation coefficient for the activity most positively correlated with obesity rate.</p>
<p>It is important to note that our physical activity measures do not distinguish whether an individual’s discussion of an activity reflects physically engaging in it or merely witnessing it in some manner. The inability to make this distinction could explain the lack of a stronger relationship between the intensity levels of physical activities and obesity rate.</p>
<p>However, these insights do reveal similarities between the pairs of measures: (1) <italic>nd</italic><sub><italic>avg</italic></sub> and <italic>Prod</italic>% and (2) <italic>pa</italic><sub><italic>weighted</italic></sub> and <italic>PA</italic>%. In both cases, adding too much detail to the measure derived from tweets diluted the relationship between the quantities of interest. This is a valuable lesson learned. Given the complexity of Mitchell et al.’s happiness metric, <italic>h</italic><sub><italic>avg</italic></sub>, we assumed we would need measures of discussions of food and physical activities with a similar structure. However, this is not the case. The coarser measures <italic>Prod</italic>% and <italic>PA</italic>% had a stronger relationship to obesity rate than the nuanced measures <italic>nd</italic><sub><italic>avg</italic></sub> and <italic>pa</italic><sub><italic>weighted</italic></sub>. Next, we explore the extent to which these measures provide different insight about the obesity rate of an MSA and evaluate the extent to which each correlates with an MSA-level survey measure of a similar variable.</p>
</sec>
<sec id="sec012">
<title>Evaluation of Measures</title>
<p>The results we have presented thus far demonstrate that three measures (<italic>h</italic><sub><italic>avg</italic></sub>, <italic>Prod</italic>% and <italic>PA</italic>%) which can be obtained from geo-tagged tweets have a statistically significant negative correlation with the obesity rate of an MSA and that each correlation is on the order of -0.30. However, we have not presented any results which show that: (1) the three measures (<italic>h</italic><sub><italic>avg</italic></sub>, <italic>Prod</italic>% and <italic>PA</italic>%) have unique relationships with the obesity rate of an MSA and (2) the measures actually quantify the happiness, diet and physical activity level of an MSA.</p>
<p>We address both of these questions by computing the correlations among seven variables. Three of the seven variables are the measures of happiness, diet and physical activity that can be gleaned from Twitter discussions within an MSA and are most correlated with obesity rate: <italic>h</italic><sub><italic>avg</italic></sub>, <italic>Prod</italic>% and <italic>PA</italic>%. The other four variables reflect MSA-level data collected by the GHWS. These variables are the: (1) obesity rate of an MSA, (2) percentage of individuals in an MSA who report that they eat a healthy diet, (3) percentage of individuals in an MSA who report that they exercise frequently and (4) Well-Being Index of an MSA. The Well-Being Index is computed by aggregating the responses from participants to five statements. Each participant rates their agreement with each statement on a scale from 0 (very strong disagreement) to 10 (very strong agreement). The statements are [<xref ref-type="bibr" rid="pone.0133505.ref007">7</xref>]:
<list list-type="order"><list-item><p>I am satisfied with my present life situation and anticipated life situation.</p></list-item> <list-item><p>My daily feelings and mental state are healthy.</p></list-item> <list-item><p>I have the physical ability to live a full life.</p></list-item> <list-item><p>The behaviors I engage in positively affect my physical health.</p></list-item> <list-item><p>Within my community I feel safe, satisfied and optimistic.</p></list-item></list></p>
<p>
<xref ref-type="fig" rid="pone.0133505.g006">Fig 6</xref> visualizes the lower triangle of the matrix of Spearman correlations among the seven variables. Blue boxes in <xref ref-type="fig" rid="pone.0133505.g006">Fig 6</xref> reflect a positive correlation and red boxes reflect a negative correlation. These data show that each of the measures we computed from Twitter discussions within a MSA (<italic>h</italic><sub><italic>avg</italic></sub>, <italic>Prod</italic>% and <italic>PA</italic>%) is more strongly correlated with the obesity rate of a MSA than with either of the other measures computed from Twitter data. This provides evidence that each of the three measures reflects a different factor that is correlated with the obesity rate of a MSA. In other words, these three measures are not simply different methods of quantifying the same variable.</p>
<fig id="pone.0133505.g006" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0133505.g006</object-id>
<label>Fig 6</label>
<caption>
<title>Correlation among four MSA-level measures and three Twitter measures of MSAs.</title>
<p>Blue boxes reflect a positive correlation; red boxes reflect a negative correlation.</p>
</caption>
<graphic mimetype="image" xlink:type="simple" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.g006"/>
</fig>
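<p>The comparison summarized in <xref ref-type="fig" rid="pone.0133505.g006">Fig 6</xref> can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names (<monospace>ranks</monospace>, <monospace>spearman</monospace>, <monospace>lower_triangle</monospace>, <monospace>distinct_measures</monospace>) and the six-MSA data set are hypothetical and chosen only for illustration. The sketch computes Spearman correlations as Pearson correlations of average ranks, stores each pair of variables once (the lower triangle), and then checks whether each Twitter measure is more strongly correlated with obesity rate than with the other Twitter measures.</p>
<preformat preformat-type="code">
```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def lower_triangle(columns):
    """Map (row_name, col_name) to Spearman rho for each pair below the diagonal."""
    names = list(columns)
    return {
        (names[i], names[j]): spearman(columns[names[i]], columns[names[j]])
        for i in range(len(names))
        for j in range(i)
    }


def distinct_measures(matrix, target, measures):
    """For each measure, report whether |rho| with the target exceeds
    |rho| with every other measure."""
    def rho(a, b):
        # Each pair is stored once, so look it up in either orientation.
        return matrix[(a, b)] if (a, b) in matrix else matrix[(b, a)]

    return {
        m: all(abs(rho(m, target)) > abs(rho(m, other))
               for other in measures if other != m)
        for m in measures
    }


# Hypothetical values for six MSAs; NOT taken from the study's data sets.
toy = {
    "obesity": [31.0, 24.5, 28.2, 35.1, 22.3, 29.9],
    "h_avg": [5.98, 6.05, 6.01, 5.93, 6.10, 6.03],
    "prod_pct": [0.8, 1.4, 0.9, 0.7, 1.2, 1.5],
    "pa_pct": [2.1, 2.0, 3.2, 1.8, 2.9, 2.5],
}

matrix = lower_triangle(toy)
for pair, value in sorted(matrix.items()):
    print(pair, round(value, 3))
print(distinct_measures(matrix, "obesity", ["h_avg", "prod_pct", "pa_pct"]))
```
</preformat>
<p>With all seven variables as columns, <monospace>lower_triangle</monospace> would return 21 coefficients, one per box in the lower triangle. The sketch does not compute significance levels.</p>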
<p>Furthermore, each of the three measures gleaned from our Twitter corpus is more strongly correlated with the MSA-level GHWS measure of the corresponding variable than with any other variable. To help elucidate this trend, we have underlined the correlation coefficients of the variables with the strongest correlation to <italic>happiness</italic>, <italic>Prod</italic>% and <italic>PA</italic>%. While this trend does not completely rule out the existence of confounders within our Twitter-level measures, it provides evidence that <italic>h</italic><sub><italic>avg</italic></sub>, <italic>Prod</italic>% and <italic>PA</italic>% actually reflect the level of happiness/well-being, diet/healthy-eating and physical activity/exercise within a MSA, as opposed to three completely unrelated variables. Next, we review related work, discuss the validity and limitations of our results and provide directions for future work.</p>
</sec>
</sec>
<sec id="sec013" sec-type="conclusions">
<title>Discussion</title>
<p>We are not the first researchers to explore modeling human behavior with content from Twitter. Emotions have been accurately captured at different levels of granularity from tweets by using hashtags [<xref ref-type="bibr" rid="pone.0133505.ref030">30</xref>] and sentiment analysis [<xref ref-type="bibr" rid="pone.0133505.ref031">31</xref>, <xref ref-type="bibr" rid="pone.0133505.ref032">32</xref>]. Given these classification capabilities, other researchers have used Twitter data to explore the emotional states individuals go through in a 24-hour period [<xref ref-type="bibr" rid="pone.0133505.ref033">33</xref>] and while watching sporting events [<xref ref-type="bibr" rid="pone.0133505.ref034">34</xref>].</p>
<p>Tweets have also been used to model consumer confidence [<xref ref-type="bibr" rid="pone.0133505.ref035">35</xref>] and identify major news events that cause breaking points in public opinion [<xref ref-type="bibr" rid="pone.0133505.ref036">36</xref>]. They have served as a platform to explore the unique characteristics of astrophysicists [<xref ref-type="bibr" rid="pone.0133505.ref037">37</xref>] and been analyzed to characterize varieties of the Spanish dialect on a global scale [<xref ref-type="bibr" rid="pone.0133505.ref038">38</xref>]. However, the two studies most related to our research are Broniatowski et al.’s work on modeling the spread of influenza through tweets [<xref ref-type="bibr" rid="pone.0133505.ref039">39</xref>] and Mitchell et al.’s exploration of the relationship between the happiness of a tweet and its geographic origin [<xref ref-type="bibr" rid="pone.0133505.ref012">12</xref>].</p>
<p>Since we have already reviewed and validated Mitchell et al.’s work, we focus only on Broniatowski et al. here. Broniatowski et al. identified measures that distinguish tweets relevant to influenza from other tweets. In this paper we adopt this strategy to identify measures related to the variation in the obesity rate of MSAs from 2012–2013.</p>
<p>We have identified three measures, related to happiness, diet and physical activity, which can be gleaned from Twitter content. Each of these measures has a statistically significant negative correlation with obesity on the order of -0.30. Furthermore, we have provided evidence that these measures reflect different variables associated with obesity and that these variables actually reflect the happiness, diet and physical activity levels of MSAs. Ultimately, this work has furthered the research effort in understanding obesity by providing a new path, through social media data, for the development of population-scale measures of factors related to obesity.</p>
<p>Despite these results, internal and external validity threats affect our study. Threats to internal validity arise when factors affect the dependent variables without the evaluators’ knowledge. It is possible that flaws in the implementation of our metrics affected the results of the evaluation. However, the algorithms we used to compute the metrics passed several internal code reviews, and the strength of the relationship between our implementation of the happiness metric, <italic>h</italic><sub><italic>avg</italic></sub>, and the obesity rate in MSAs is similar to previously published results [<xref ref-type="bibr" rid="pone.0133505.ref012">12</xref>]. Threats to external validity occur when the results of the evaluation cannot be generalized. Although the evaluation was performed for two years of data over 189 MSAs, the results cannot be generalized to: (1) other urban areas, (2) other years or (3) other Twitter data sets.</p>
<p>Furthermore, there are issues that must be addressed regarding how well a geo-tagged Twitter data set can represent the obesity rate of a population. Only 15% of online adults regularly use Twitter, and 18–29 year-olds and minorities tend to be more highly represented on Twitter than in the general population [<xref ref-type="bibr" rid="pone.0133505.ref040">40</xref>]. In addition, 95% of Twitter users never geo-tag a single tweet and only ∼1% of users geo-tag the majority of the tweets they post. The extent to which individual users are represented in our Twitter corpus is also biased: very passive users (&lt; 50 tweets per year) and very active users (&gt; 1000 tweets per year) geo-tag a smaller percentage of their tweets than moderate users (50–1000 tweets per year) [<xref ref-type="bibr" rid="pone.0133505.ref040">40</xref>]. Finally, we collected only a random sample of all tweets during 2012–2013. Ultimately, these limitations mean that the data set which informed our study is a non-uniform subsample of statements made by a non-representative portion of MSA populations.</p>
<p>Even with these limitations and validity threats we have only scratched the surface of what is possible using social media datasets. In particular, Tables <xref ref-type="table" rid="pone.0133505.t001">1</xref> and <xref ref-type="table" rid="pone.0133505.t002">2</xref> could be very illuminating. One can observe that the top foods and physical activities positively (espresso, yoga) and negatively (french fries, hunting) correlated with obesity rate may have social and cultural underpinnings (i.e. income and education levels).</p>
<p>This would not be unexpected. Recall that previous work showed that the happiness of a MSA, which correlates with our diet and physical activity measures, has statistically significant positive correlations with: (a) the percentage of households with median income levels and (b) the percentage of individuals living in an area who have obtained a bachelor’s degree. Happiness also has a statistically significant negative correlation with the percentage of families living below the poverty line. In future work, we plan to use the census data for 2012 to investigate how different demographics across urban areas are correlated with our measures of diet (<italic>Prod</italic>%) and physical activity level (<italic>PA</italic>%).</p>
<p>Additionally, we have not examined whether these methods have any predictive power. Future work will examine how observed changes in the measures that can be gleaned from Twitter data predict changes in the obesity rate of MSAs, using content from Twitter and the GHWS data collected in 2014 and 2015.</p>
</sec>
<sec id="sec014">
<title>Supporting Information</title>
<supplementary-material id="pone.0133505.s001" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s001" mimetype="text/csv" xlink:type="simple">
<label>S1 Dataset</label>
<caption>
<title>Dataset for <italic>h</italic><sub><italic>avg</italic></sub> over all MSAs for 2012 &amp; 2013.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
<supplementary-material id="pone.0133505.s002" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s002" mimetype="text/csv" xlink:type="simple">
<label>S2 Dataset</label>
<caption>
<title>Dataset for <italic>nd</italic><sub><italic>avg</italic></sub> over all MSAs for 2012 &amp; 2013.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
<supplementary-material id="pone.0133505.s003" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s003" mimetype="text/csv" xlink:type="simple">
<label>S3 Dataset</label>
<caption>
<title>Dataset for <italic>FT</italic>% over all MSAs for 2012 &amp; 2013.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
<supplementary-material id="pone.0133505.s004" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s004" mimetype="text/csv" xlink:type="simple">
<label>S4 Dataset</label>
<caption>
<title>Dataset for <italic>Prod</italic>% over all MSAs for 2012 &amp; 2013.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
<supplementary-material id="pone.0133505.s005" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s005" mimetype="text/csv" xlink:type="simple">
<label>S5 Dataset</label>
<caption>
<title>Dataset for <italic>pa</italic><sub><italic>weighted</italic></sub> over all MSAs in 2012 &amp; 2013.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
<supplementary-material id="pone.0133505.s006" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s006" mimetype="text/csv" xlink:type="simple">
<label>S6 Dataset</label>
<caption>
<title>Dataset for <italic>PA</italic>% over all MSAs in 2012 &amp; 2013.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
<supplementary-material id="pone.0133505.s007" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s007" mimetype="text/csv" xlink:type="simple">
<label>S1 Table</label>
<caption>
<title>Foods Most Negatively Correlated With Obesity Rate.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
<supplementary-material id="pone.0133505.s008" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s008" mimetype="text/csv" xlink:type="simple">
<label>S2 Table</label>
<caption>
<title>Foods Most Positively Correlated With Obesity Rate.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
<supplementary-material id="pone.0133505.s009" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s009" mimetype="text/csv" xlink:type="simple">
<label>S3 Table</label>
<caption>
<title>Physical Activities Most Negatively Correlated With Obesity Rate.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
<supplementary-material id="pone.0133505.s010" position="float" xlink:href="info:doi/10.1371/journal.pone.0133505.s010" mimetype="text/csv" xlink:type="simple">
<label>S4 Table</label>
<caption>
<title>Physical Activities Most Positively Correlated With Obesity Rate.</title>
<p>(CSV)</p>
</caption>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>The authors gratefully acknowledge the support of their colleagues at the Virginia Modeling, Analysis and Simulation Center at Old Dominion University.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="pone.0133505.ref001">
<label>1</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Tsai</surname> <given-names>AG</given-names></name>, <name name-style="western"><surname>Williamson</surname> <given-names>DF</given-names></name>, <name name-style="western"><surname>Glick</surname> <given-names>HA</given-names></name>. <article-title>Direct medical cost of overweight and obesity in the USA: a quantitative systematic review</article-title>. <source>Obesity Reviews</source>. <year>2011</year>;<volume>12</volume>(<issue>1</issue>):<fpage>50</fpage>–<lpage>61</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1111/j.1467-789X.2009.00708.x" xlink:type="simple">10.1111/j.1467-789X.2009.00708.x</ext-link></comment> <object-id pub-id-type="pmid">20059703</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref002">
<label>2</label>
<mixed-citation xlink:type="simple" publication-type="book">
<name name-style="western"><surname>Ogden</surname> <given-names>CL</given-names></name>, <collab xlink:type="simple">National Center for Health Statistics (US)</collab>, <etal>et al</etal>. <source>Prevalence of obesity in the United States, 2009–2010</source>. <publisher-name>US Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics</publisher-name>; <year>2012</year>.</mixed-citation>
</ref>
<ref id="pone.0133505.ref003">
<label>3</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Cawley</surname> <given-names>J</given-names></name>, <name name-style="western"><surname>Meyerhoefer</surname> <given-names>C</given-names></name>. <article-title>The medical care costs of obesity: an instrumental variables approach</article-title>. <source>Journal of health economics</source>. <year>2012</year>;<volume>31</volume>(<issue>1</issue>):<fpage>219</fpage>–<lpage>230</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.jhealeco.2011.10.003" xlink:type="simple">10.1016/j.jhealeco.2011.10.003</ext-link></comment> <object-id pub-id-type="pmid">22094013</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref004">
<label>4</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Trogdon</surname> <given-names>JG</given-names></name>, <name name-style="western"><surname>Finkelstein</surname> <given-names>EA</given-names></name>, <name name-style="western"><surname>Feagan</surname> <given-names>CW</given-names></name>, <name name-style="western"><surname>Cohen</surname> <given-names>JW</given-names></name>. <article-title>State-and Payer-Specific Estimates of Annual Medical Expenditures Attributable to Obesity</article-title>. <source>Obesity</source>. <year>2012</year>;<volume>20</volume>(<issue>1</issue>):<fpage>214</fpage>–<lpage>220</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/oby.2011.169" xlink:type="simple">10.1038/oby.2011.169</ext-link></comment> <object-id pub-id-type="pmid">21681222</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref005">
<label>5</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Finkelstein</surname> <given-names>EA</given-names></name>, <name name-style="western"><surname>Khavjou</surname> <given-names>OA</given-names></name>, <name name-style="western"><surname>Thompson</surname> <given-names>H</given-names></name>, <name name-style="western"><surname>Trogdon</surname> <given-names>JG</given-names></name>, <name name-style="western"><surname>Pan</surname> <given-names>L</given-names></name>, <name name-style="western"><surname>Sherry</surname> <given-names>B</given-names></name>, <etal>et al</etal>. <article-title>Obesity and severe obesity forecasts through 2030</article-title>. <source>American journal of preventive medicine</source>. <year>2012</year>;<volume>42</volume>(<issue>6</issue>):<fpage>563</fpage>–<lpage>570</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.amepre.2011.10.026" xlink:type="simple">10.1016/j.amepre.2011.10.026</ext-link></comment> <object-id pub-id-type="pmid">22608371</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref006">
<label>6</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Shah</surname> <given-names>NR</given-names></name>, <name name-style="western"><surname>Braverman</surname> <given-names>ER</given-names></name>. <article-title>Measuring adiposity in patients: the utility of body mass index (BMI), percent body fat, and leptin</article-title>. <source>PLoS One</source>. <year>2012</year>;<volume>7</volume>(<issue>4</issue>):<fpage>e33308</fpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0033308" xlink:type="simple">10.1371/journal.pone.0033308</ext-link></comment> <object-id pub-id-type="pmid">22485140</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref007">
<label>7</label>
<mixed-citation xlink:type="simple" publication-type="other">Gallup-Healthways Well-Being Index 2011–2014. Accessed: 2014-11-24. <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="http://info.healthways.com/wellbeingindex">http://info.healthways.com/wellbeingindex</ext-link>.</mixed-citation>
</ref>
<ref id="pone.0133505.ref008">
<label>8</label>
<mixed-citation xlink:type="simple" publication-type="other">Huberman BA, Romero DM, Wu F. Social networks that matter: Twitter under the microscope. Available at SSRN 1313405. 2008.</mixed-citation>
</ref>
<ref id="pone.0133505.ref009">
<label>9</label>
<mixed-citation xlink:type="simple" publication-type="other">Amazon Mechanical Turk. Best Practices Guide. Amazon Web Services; 2011.</mixed-citation>
</ref>
<ref id="pone.0133505.ref010">
<label>10</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Dodds</surname> <given-names>PS</given-names></name>, <name name-style="western"><surname>Danforth</surname> <given-names>CM</given-names></name>. <article-title>Measuring the happiness of large-scale written expression: Songs, blogs, and presidents</article-title>. <source>Journal of Happiness Studies</source>. <year>2010</year>;<volume>11</volume>(<issue>4</issue>):<fpage>441</fpage>–<lpage>456</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1007/s10902-009-9150-9" xlink:type="simple">10.1007/s10902-009-9150-9</ext-link></comment></mixed-citation>
</ref>
<ref id="pone.0133505.ref011">
<label>11</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Dodds</surname> <given-names>PS</given-names></name>, <name name-style="western"><surname>Harris</surname> <given-names>KD</given-names></name>, <name name-style="western"><surname>Kloumann</surname> <given-names>IM</given-names></name>, <name name-style="western"><surname>Bliss</surname> <given-names>CA</given-names></name>, <name name-style="western"><surname>Danforth</surname> <given-names>CM</given-names></name>. <article-title>Temporal patterns of happiness and information in a global social network: Hedonometrics and Twitter</article-title>. <source>PloS one</source>. <year>2011</year>;<volume>6</volume>(<issue>12</issue>):<fpage>e26752</fpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0026752" xlink:type="simple">10.1371/journal.pone.0026752</ext-link></comment> <object-id pub-id-type="pmid">22163266</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref012">
<label>12</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Mitchell</surname> <given-names>L</given-names></name>, <name name-style="western"><surname>Frank</surname> <given-names>MR</given-names></name>, <name name-style="western"><surname>Harris</surname> <given-names>KD</given-names></name>, <name name-style="western"><surname>Dodds</surname> <given-names>PS</given-names></name>, <name name-style="western"><surname>Danforth</surname> <given-names>CM</given-names></name>. <article-title>The geography of happiness: Connecting Twitter sentiment and expression, demographics, and objective characteristics of place</article-title>. <source>PloS one</source>. <year>2013</year>;<volume>8</volume>(<issue>5</issue>):<fpage>e64417</fpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0064417" xlink:type="simple">10.1371/journal.pone.0064417</ext-link></comment> <object-id pub-id-type="pmid">23734200</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref013">
<label>13</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Carwile</surname> <given-names>JL</given-names></name>, <name name-style="western"><surname>Michels</surname> <given-names>KB</given-names></name>. <article-title>Urinary bisphenol A and obesity: NHANES 2003–2006</article-title>. <source>Environmental research</source>. <year>2011</year>;<volume>111</volume>(<issue>6</issue>):<fpage>825</fpage>–<lpage>830</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.envres.2011.05.014" xlink:type="simple">10.1016/j.envres.2011.05.014</ext-link></comment> <object-id pub-id-type="pmid">21676388</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref014">
<label>14</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<collab xlink:type="simple">National Center for Health Statistics (US)</collab>. <article-title>Plan and operation of the third National Health and Nutrition Examination Survey, 1988–94</article-title>. <volume>32</volume>. <source>Natl Ctr for Health Statistics</source>; <year>1994</year>.</mixed-citation>
</ref>
<ref id="pone.0133505.ref015">
<label>15</label>
<mixed-citation xlink:type="simple" publication-type="book">
<collab xlink:type="simple">US Department of Agriculture</collab>. <article-title>Food and Nutrient Database for Dietary Studies, 11.0</article-title>. <source>Agricultural Research Service</source>, <publisher-name>Food Surveys Research Group</publisher-name>, <publisher-loc>Beltsville, MD</publisher-loc>. <year>2014</year>.</mixed-citation>
</ref>
<ref id="pone.0133505.ref016">
<label>16</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Fulgoni</surname> <given-names>VL</given-names></name>, <name name-style="western"><surname>Keast</surname> <given-names>DR</given-names></name>, <name name-style="western"><surname>Drewnowski</surname> <given-names>A</given-names></name>. <article-title>Development and validation of the nutrient-rich foods index: a tool to measure nutritional quality of foods</article-title>. <source>The Journal of nutrition</source>. <year>2009</year>;<volume>139</volume>(<issue>8</issue>):<fpage>1549</fpage>–<lpage>1554</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.3945/jn.108.101360" xlink:type="simple">10.3945/jn.108.101360</ext-link></comment> <object-id pub-id-type="pmid">19549759</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref017">
<label>17</label>
<mixed-citation xlink:type="simple" publication-type="book">
<name name-style="western"><surname>Variyam</surname> <given-names>JN</given-names></name>, <name name-style="western"><surname>Blaylock</surname> <given-names>JR</given-names></name>, <name name-style="western"><surname>Smallwood</surname> <given-names>DM</given-names></name>, <name name-style="western"><surname>Basiotis</surname> <given-names>PP</given-names></name>. <source>USDA’s Healthy Eating Index and nutrition information</source>. <publisher-name>United States Department of Agriculture, Economic Research Service</publisher-name>; <year>1998</year>.</mixed-citation>
</ref>
<ref id="pone.0133505.ref018">
<label>18</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Fuhrman</surname> <given-names>J</given-names></name>, <name name-style="western"><surname>Sarter</surname> <given-names>B</given-names></name>, <name name-style="western"><surname>Glaser</surname> <given-names>D</given-names></name>, <name name-style="western"><surname>Acocella</surname> <given-names>S</given-names></name>. <article-title>Changing perceptions of hunger on a high nutrient density diet</article-title>. <source>Nutrition journal</source>. <year>2010</year>;<volume>9</volume>(<issue>1</issue>):<fpage>393</fpage>–<lpage>399</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/1475-2891-9-51" xlink:type="simple">10.1186/1475-2891-9-51</ext-link></comment></mixed-citation>
</ref>
<ref id="pone.0133505.ref019">
<label>19</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Drewnowski</surname> <given-names>A</given-names></name>. <article-title>Obesity and the food environment: dietary energy density and diet costs</article-title>. <source>American journal of preventive medicine</source>. <year>2004</year>;<volume>27</volume>(<issue>3</issue>):<fpage>154</fpage>–<lpage>162</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.amepre.2004.06.011" xlink:type="simple">10.1016/j.amepre.2004.06.011</ext-link></comment> <object-id pub-id-type="pmid">15450626</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref020">
<label>20</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Guenther</surname> <given-names>PM</given-names></name>, <name name-style="western"><surname>Reedy</surname> <given-names>J</given-names></name>, <name name-style="western"><surname>Krebs-Smith</surname> <given-names>SM</given-names></name>. <article-title>Development of the healthy eating index-2005</article-title>. <source>Journal of the American Dietetic Association</source>. <year>2008</year>;<volume>108</volume>(<issue>11</issue>):<fpage>1896</fpage>–<lpage>1901</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.jada.2008.08.016" xlink:type="simple">10.1016/j.jada.2008.08.016</ext-link></comment> <object-id pub-id-type="pmid">18954580</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref021">
<label>21</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Wing</surname> <given-names>RR</given-names></name>. <article-title>Physical activity in the treatment of the adulthood overweight and obesity: current evidence and research issues</article-title>. <source>Medicine and science in sports and exercise</source>. <year>1999</year>;<volume>31</volume>(<issue>11 Suppl</issue>):<fpage>S547</fpage>–<lpage>52</lpage>. <object-id pub-id-type="pmid">10593526</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref022">
<label>22</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Ross</surname> <given-names>R</given-names></name>, <name name-style="western"><surname>Janssen</surname> <given-names>I</given-names></name>, <name name-style="western"><surname>Dawson</surname> <given-names>J</given-names></name>, <name name-style="western"><surname>Kungl</surname> <given-names>AM</given-names></name>, <name name-style="western"><surname>Kuk</surname> <given-names>JL</given-names></name>, <name name-style="western"><surname>Wong</surname> <given-names>SL</given-names></name>, <etal>et al</etal>. <article-title>Exercise-induced reduction in obesity and insulin resistance in women: a randomized controlled trial</article-title>. <source>Obesity research</source>. <year>2004</year>;<volume>12</volume>(<issue>5</issue>):<fpage>789</fpage>–<lpage>798</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/oby.2004.95" xlink:type="simple">10.1038/oby.2004.95</ext-link></comment> <object-id pub-id-type="pmid">15166299</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref023">
<label>23</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Weltman</surname> <given-names>A</given-names></name>, <name name-style="western"><surname>Weltman</surname> <given-names>JY</given-names></name>, <name name-style="western"><surname>Watson Winfield</surname> <given-names>DD</given-names></name>, <name name-style="western"><surname>Frick</surname> <given-names>K</given-names></name>, <name name-style="western"><surname>Patrie</surname> <given-names>J</given-names></name>, <name name-style="western"><surname>Kok</surname> <given-names>P</given-names></name>, <etal>et al</etal>. <article-title>Effects of continuous versus intermittent exercise, obesity, and gender on growth hormone secretion</article-title>. <source>The Journal of Clinical Endocrinology &amp; Metabolism</source>. <year>2008</year>;<volume>93</volume>(<issue>12</issue>):<fpage>4711</fpage>–<lpage>4720</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1210/jc.2008-0998" xlink:type="simple">10.1210/jc.2008-0998</ext-link></comment></mixed-citation>
</ref>
<ref id="pone.0133505.ref024">
<label>24</label>
<mixed-citation xlink:type="simple" publication-type="other">US Department of Health and Human Services. Physical activity and health: a report of the Surgeon General. DIANE Publishing; 1996.</mixed-citation>
</ref>
<ref id="pone.0133505.ref025">
<label>25</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Haskell</surname> <given-names>WL</given-names></name>, <name name-style="western"><surname>Lee</surname> <given-names>IM</given-names></name>, <name name-style="western"><surname>Pate</surname> <given-names>RR</given-names></name>, <name name-style="western"><surname>Powell</surname> <given-names>KE</given-names></name>, <name name-style="western"><surname>Blair</surname> <given-names>SN</given-names></name>, <name name-style="western"><surname>Franklin</surname> <given-names>BA</given-names></name>, <etal>et al</etal>. <article-title>Physical activity and public health: updated recommendation for adults from the American College of Sports Medicine and the American Heart Association</article-title>. <source>Circulation</source>. <year>2007</year>;<volume>116</volume>(<issue>9</issue>):<fpage>1081</fpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1161/CIRCULATIONAHA.107.185649" xlink:type="simple">10.1161/CIRCULATIONAHA.107.185649</ext-link></comment> <object-id pub-id-type="pmid">17671237</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref026">
<label>26</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Arriaza Jones</surname> <given-names>D</given-names></name>, <name name-style="western"><surname>Ainsworth</surname> <given-names>BE</given-names></name>, <name name-style="western"><surname>Croft</surname> <given-names>JB</given-names></name>, <name name-style="western"><surname>Macera</surname> <given-names>CA</given-names></name>, <name name-style="western"><surname>Lloyd</surname> <given-names>EE</given-names></name>, <name name-style="western"><surname>Yusuf</surname> <given-names>HR</given-names></name>. <article-title>Moderate leisure-time physical activity: who is meeting the public health recommendations? A national cross-sectional study</article-title>. <source>Archives of Family Medicine</source>. <year>1998</year>;<volume>7</volume>(<issue>3</issue>):<fpage>285</fpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1001/archfami.7.3.285" xlink:type="simple">10.1001/archfami.7.3.285</ext-link></comment></mixed-citation>
</ref>
<ref id="pone.0133505.ref027">
<label>27</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Weyer</surname> <given-names>C</given-names></name>, <name name-style="western"><surname>Linkeschowa</surname> <given-names>R</given-names></name>, <name name-style="western"><surname>Heise</surname> <given-names>T</given-names></name>, <name name-style="western"><surname>Giesen</surname> <given-names>H</given-names></name>, <name name-style="western"><surname>Spraul</surname> <given-names>M</given-names></name>. <article-title>Implications of the traditional and the new ACSM physical activity recommendations on weight reduction in dietary treated obese subjects</article-title>. <source>International journal of obesity and related metabolic disorders: journal of the International Association for the Study of Obesity</source>. <year>1998</year>;<volume>22</volume>(<issue>11</issue>):<fpage>1071</fpage>–<lpage>1078</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/sj.ijo.0800728" xlink:type="simple">10.1038/sj.ijo.0800728</ext-link></comment></mixed-citation>
</ref>
<ref id="pone.0133505.ref028">
<label>28</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Togo</surname> <given-names>P</given-names></name>, <name name-style="western"><surname>Osler</surname> <given-names>M</given-names></name>, <name name-style="western"><surname>Sørensen</surname> <given-names>T</given-names></name>, <name name-style="western"><surname>Heitmann</surname> <given-names>B</given-names></name>. <article-title>Food intake patterns and body mass index in observational studies</article-title>. <source>International journal of obesity and related metabolic disorders: journal of the International Association for the Study of Obesity</source>. <year>2001</year>;<volume>25</volume>(<issue>12</issue>):<fpage>1741</fpage>–<lpage>1751</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/sj.ijo.0801819" xlink:type="simple">10.1038/sj.ijo.0801819</ext-link></comment></mixed-citation>
</ref>
<ref id="pone.0133505.ref029">
<label>29</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Newby</surname> <given-names>PK</given-names></name>, <name name-style="western"><surname>Muller</surname> <given-names>D</given-names></name>, <name name-style="western"><surname>Hallfrisch</surname> <given-names>J</given-names></name>, <name name-style="western"><surname>Qiao</surname> <given-names>N</given-names></name>, <name name-style="western"><surname>Andres</surname> <given-names>R</given-names></name>, <name name-style="western"><surname>Tucker</surname> <given-names>KL</given-names></name>. <article-title>Dietary patterns and changes in body mass index and waist circumference in adults</article-title>. <source>The American journal of clinical nutrition</source>. <year>2003</year>;<volume>77</volume>(<issue>6</issue>):<fpage>1417</fpage>–<lpage>1425</lpage>. <object-id pub-id-type="pmid">12791618</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref030">
<label>30</label>
<mixed-citation xlink:type="simple" publication-type="other">Mohammad SM, Kiritchenko S. Using hashtags to capture fine emotion categories from tweets. Computational Intelligence. 2014.</mixed-citation>
</ref>
<ref id="pone.0133505.ref031">
<label>31</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Mohammad</surname> <given-names>SM</given-names></name>, <name name-style="western"><surname>Kiritchenko</surname> <given-names>S</given-names></name>, <name name-style="western"><surname>Zhu</surname> <given-names>X</given-names></name>. <article-title>NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets</article-title>. <source>arXiv preprint</source> <volume>arXiv</volume>:<fpage>1308.6242</fpage>. <year>2013</year>.</mixed-citation>
</ref>
<ref id="pone.0133505.ref032">
<label>32</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Balabantaray</surname> <given-names>R</given-names></name>, <name name-style="western"><surname>Mohammad</surname> <given-names>M</given-names></name>, <name name-style="western"><surname>Sharma</surname> <given-names>N</given-names></name>. <article-title>Multi-class Twitter emotion classification: A new approach</article-title>. <source>International Journal of Applied Information Systems</source>. <year>2012</year>;<volume>4</volume>(<issue>1</issue>):<fpage>48</fpage>–<lpage>53</lpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5120/ijais12-450651" xlink:type="simple">10.5120/ijais12-450651</ext-link></comment></mixed-citation>
</ref>
<ref id="pone.0133505.ref033">
<label>33</label>
<mixed-citation xlink:type="simple" publication-type="other">Bollen J, Mao H, Pepe A. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. In: ICWSM; 2011.</mixed-citation>
</ref>
<ref id="pone.0133505.ref034">
<label>34</label>
<mixed-citation xlink:type="simple" publication-type="other">Sintsova V, Musat CC, Pu Faltings P. Fine-grained emotion recognition in Olympic tweets based on human computation. In: 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. EPFL-CONF-197185; 2013.</mixed-citation>
</ref>
<ref id="pone.0133505.ref035">
<label>35</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>O’Connor</surname> <given-names>B</given-names></name>, <name name-style="western"><surname>Balasubramanyan</surname> <given-names>R</given-names></name>, <name name-style="western"><surname>Routledge</surname> <given-names>BR</given-names></name>, <name name-style="western"><surname>Smith</surname> <given-names>NA</given-names></name>. <article-title>From tweets to polls: Linking text sentiment to public opinion time series</article-title>. <source>ICWSM</source>. <year>2010</year>;<volume>11</volume>:<fpage>122</fpage>–<lpage>129</lpage>.</mixed-citation>
</ref>
<ref id="pone.0133505.ref036">
<label>36</label>
<mixed-citation xlink:type="simple" publication-type="book">
<name name-style="western"><surname>Akcora</surname> <given-names>CG</given-names></name>, <name name-style="western"><surname>Bayir</surname> <given-names>MA</given-names></name>, <name name-style="western"><surname>Demirbas</surname> <given-names>M</given-names></name>, <name name-style="western"><surname>Ferhatosmanoglu</surname> <given-names>H</given-names></name>. <chapter-title>Identifying breakpoints in public opinion</chapter-title>. In: <source>Proceedings of the First Workshop on Social Media Analytics</source>. <publisher-name>ACM</publisher-name>; <year>2010</year>. p. <fpage>62</fpage>–<lpage>66</lpage>.</mixed-citation>
</ref>
<ref id="pone.0133505.ref037">
<label>37</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Holmberg</surname> <given-names>K</given-names></name>, <name name-style="western"><surname>Bowman</surname> <given-names>TD</given-names></name>, <name name-style="western"><surname>Haustein</surname> <given-names>S</given-names></name>, <name name-style="western"><surname>Peters</surname> <given-names>I</given-names></name>. <article-title>Astrophysicists’ Conversational Connections on Twitter</article-title>. <source>PLoS ONE</source>. <year>2014</year>;<volume>9</volume>(<issue>8</issue>):<fpage>e106086</fpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0106086" xlink:type="simple">10.1371/journal.pone.0106086</ext-link></comment> <object-id pub-id-type="pmid">25153196</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref038">
<label>38</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Gonçalves</surname> <given-names>B</given-names></name>, <name name-style="western"><surname>Sánchez</surname> <given-names>D</given-names></name>. <article-title>Crowdsourcing Dialect Characterization through Twitter</article-title>. <source>PLoS ONE</source>. <year>2014</year>;<volume>9</volume>(<issue>11</issue>):<fpage>e112074</fpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0112074" xlink:type="simple">10.1371/journal.pone.0112074</ext-link></comment> <object-id pub-id-type="pmid">25409174</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref039">
<label>39</label>
<mixed-citation xlink:type="simple" publication-type="journal">
<name name-style="western"><surname>Broniatowski</surname> <given-names>DA</given-names></name>, <name name-style="western"><surname>Paul</surname> <given-names>MJ</given-names></name>, <name name-style="western"><surname>Dredze</surname> <given-names>M</given-names></name>. <article-title>National and local influenza surveillance through Twitter: An analysis of the 2012–2013 influenza epidemic</article-title>. <source>PLoS ONE</source>. <year>2013</year>;<volume>8</volume>(<issue>12</issue>):<fpage>e83672</fpage>. <comment>doi: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0083672" xlink:type="simple">10.1371/journal.pone.0083672</ext-link></comment> <object-id pub-id-type="pmid">24349542</object-id></mixed-citation>
</ref>
<ref id="pone.0133505.ref040">
<label>40</label>
<mixed-citation xlink:type="simple" publication-type="other">Smith A, Brenner J. Twitter use 2012. Pew Internet &amp; American Life Project. 2012; p. 4.</mixed-citation>
</ref>
</ref-list>
</back>
</article>