Track mosquitoes with your Smartphone

During my first summer in New York City, mosquitoes bit me mercilessly. Eventually I learned which parks and gardens to avoid, but I was struck by the large geographic variation in mosquito prevalence. Curious whether the city made any mosquito data publicly available, I learned that the city had only 52 permanent mosquito traps, roughly one per 6 square miles (for reference, the Upper West Side, Upper East Side, and Central Park combined take up less than 5 square miles). Moreover, the majority of the trapping locations in NYC are within parks. Vector control officials lack either the resources or the legal authority to trap mosquitoes in many locations, particularly homeowners’ backyards.
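As a quick sanity check on the trap density quoted above, here is a minimal sketch. The land area of roughly 300 square miles is my own approximate figure for the five boroughs, not a number from the original data, so treat the result as a rough estimate.

```python
# Rough check of the "one trap per ~6 square miles" figure.
# ASSUMPTION: NYC land area is approximated as 300 square miles.
NYC_LAND_AREA_SQ_MI = 300
PERMANENT_TRAPS = 52  # trap count quoted in the text

area_per_trap = NYC_LAND_AREA_SQ_MI / PERMANENT_TRAPS
print(f"About {area_per_trap:.1f} square miles per trap")
```

Under that assumption, the result lands close to the "roughly one per 6 square miles" quoted in the text.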

The annoyance of mosquito bites, combined with the serious threat they pose as vectors of infectious diseases, led me to wonder what could be done. Mosquito control requires active citizen engagement to monitor standing water and open containers and to remove litter. I remembered reading about the brigadistas in Nicaragua, a public health group and campaign that taught community leaders to work with their neighbors to eliminate mosquito breeding habitat (the brigadistas produced graphics, such as the one below). When the Earth Institute of Columbia University put out a request for proposals for smartphone applications to engage citizen scientists, I quickly pulled together my ideas for an app to report mosquito activity, and applied with Professor Jeffrey Shaman and PhD student Eliza Little. Once funded, Eliza tapped the talented Matt Brennan to implement our vision.

Above: Graphics from anti-malaria campaign: “These are breeding sites of mosquitoes” [1].

Introducing BiteBytes

We created a smartphone application called BiteBytes as a tool for citizens to report mosquito activity. Community participation through the smartphone app can help supplement expensive mosquito trapping. Crowd-sourced mosquito identification can be an additional source of information for monitoring and forecasting mosquito abundance anywhere that user engagement is high. Data generated from the app would allow cities to target key areas to reduce mosquito abundance, and thus limit the potential for the spread of mosquito-borne diseases. In addition to gathering data and enhancing the city mosquito-monitoring network, the app can be used to educate the public on mosquitoes, the diseases mosquitoes transmit, and mosquito habitat control. We hope people will feel empowered by reporting mosquitoes and contributing to data collection in their city or county.

How you can use BiteBytes

You can now download BiteBytes from iTunes or Google Play. On the first screen, you can select the mosquito that most resembles the one bothering you (or select unknown). Next, you can add additional information. There will be a heat map showing all reporting activity, along with information about mosquitoes in the northeastern U.S. and how to control them. Reports from the app will also be available on the BiteBytes website. The more mosquitoes you report, the more status you accrue within BiteBytes.

Above: Screenshots of BiteBytes beta version. This heat map does not reflect actual mosquito prevalence! Please visit the BiteBytes website to view reported mosquito activity.

Hopes for BiteBytes

User-generated electronic data streams are increasingly incorporated into models of infectious diseases. Many user-generated data streams of infectious diseases are passive (through Twitter posts or Wikipedia searches); here citizens will need to download an app onto their smartphone and actively use it. Citizen monitoring for mosquitoes can be successfully implemented, however: other mobile application reporting systems have been used in Nicaragua and Brazil [2], Europe [3], and Canada [4].

With public awareness of mosquito vectors and participation to decrease the number of these vectors, we can counter the public health risks of mosquito-borne diseases, while improving residents’ quality of life and use of parks. Given the heightened awareness of mosquito-borne diseases due to the Zika outbreak, public uptake is likely to be high. Eliza, Matt, Jeff, and I are excited to use the BiteBytes platform to engage with end users, both vector control officials and the public, on issues regarding surveillance, risk of infectious disease, and education!


  1. Garfield RM, Vermund SH. Health education and community participation in mass drug administration for malaria in Nicaragua. Soc Sci Med. 1986;22: 869–877. doi:10.1016/0277-9536(86)90241-8
  2. Coloma J, Suazo H, Harris E, Holston J. Dengue chat: A novel web and cellphone application promotes community-based mosquito vector control. Ann Glob Heal. Elsevier; 2016;82: 451. doi:10.1016/j.aogh.2016.04.244
  3. European Centre for Disease Prevention and Control. Guidelines for the surveillance of native mosquitoes in Europe. Technical report. 2014. Accessed 4/26/2016.
  4. McLeod B. Experimental Social Media Marketing. 2013. YouTube video. Accessed 4/26/2016.


Smart Cities Hackathon

Last month, I participated in the Smart Cities Hackathon, organized by the group Women in Machine Learning and Data Science (WiMLDS). Our small group consisted of Shivani Trehan, a statistician with a background in media studies, along with Ruthie Birger and me, both postdocs in computational biology and infectious diseases. We decided to look into a possible link between green spaces and respiratory health. Out of more than eight publicly available data sets made readily accessible by WiMLDS, we selected NYC tree cover, 311 complaints, and respiratory health outcomes. Although we all expected to find a correlation between green space and the hospital admission data, I expected green space to correlate with more pollen-induced asthma hospital admissions, while Ruthie and Shivani expected green space to be protective against negative respiratory health outcomes.

Our process

The main challenge of the day, as in so many academic projects, turned out to be accessing and wrangling the data. New York state hospital admission data is downloadable through SPARCS (Statewide Planning and Research Cooperative System) in very large files, and the data can also be accessed through the cloud using Google’s command line API (application programming interface). Ruthie helped me set up Python and Jupyter notebook to use the Google API, and then got to work on the 311 complaint data set. She selected an impressive list of complaints that could be related to respiratory health, and then shortened her list to complaints of air quality, asbestos, general construction/plumbing, mold, rodent, smoking, and unsanitary pigeon condition.
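To make the complaint shortlisting step concrete, here is a minimal sketch of that kind of filter. It is not the code Ruthie wrote. The column name "Complaint Type" and the sample rows are assumptions for illustration, and a real 311 export may label its categories differently.

```python
import csv
import io
from collections import Counter

# Complaint categories named in the text, lowercased for matching.
RESPIRATORY_COMPLAINTS = {
    "air quality", "asbestos", "general construction/plumbing",
    "mold", "rodent", "smoking", "unsanitary pigeon condition",
}

def filter_complaints(rows, column="Complaint Type"):
    """Keep rows whose complaint type is on the shortlist (case-insensitive).

    ASSUMPTION: the column is called "Complaint Type"; adjust for the real file.
    """
    return [r for r in rows if r[column].strip().lower() in RESPIRATORY_COMPLAINTS]

# Made-up sample rows standing in for a 311 export.
SAMPLE_311 = """Complaint Type,Latitude,Longitude
Mold,40.71,-73.95
Noise - Residential,40.73,-73.99
Air Quality,40.80,-73.94
Rodent,40.68,-73.91
Mold,40.85,-73.90
"""

complaints = list(csv.DictReader(io.StringIO(SAMPLE_311)))
kept = filter_complaints(complaints)
complaint_counts = Counter(r["Complaint Type"] for r in kept)
print(complaint_counts)
```

Matching on lowercased names keeps the filter tolerant of inconsistent capitalization, though it would still miss categories whose names are spelled differently in the source file.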

With her interest in media, Shivani wanted to look into NYT articles to quantify and characterize coverage of potential respiratory hazards, such as smog or pollen. The NYT query form had restricted access, so as the day wore on, Shivani focused instead on getting the tree count data. We used tree cover from the TreesCount! 2015 Street Tree Census, which was conducted by volunteers and staff, and organized by NYC Parks & Recreation and partner organizations.

I managed to create a subset of the 2013 SPARCS data that included only NYC counties and health codes related to respiratory health (e.g. asthma, bronchitis, and influenza). I wanted to build a time-series model from the 311 complaints and tree cover to predict hospital admissions, but SPARCS only provides the year of patient admission. Since our model would therefore be primarily spatial, I next looked up the latitude and longitude of the 56 hospital locations in NYC counties online.
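The subsetting step could look something like the sketch below. This is an illustration and not the notebook code from the day. The column names ("Hospital County", "Facility Name", "CCS Diagnosis Description"), the county labels, and the sample rows are all assumptions that should be checked against the actual SPARCS file.

```python
import csv
import io
from collections import Counter

# ASSUMPTION: county labels as they might appear in the file; Manhattan may be
# listed as either "Manhattan" or "New York", so both are accepted here.
NYC_COUNTIES = {"bronx", "kings", "manhattan", "new york", "queens", "richmond"}
# Respiratory conditions named in the text.
RESPIRATORY_KEYWORDS = ("asthma", "bronchitis", "influenza")

def is_nyc_respiratory(row):
    """True for an NYC admission whose diagnosis mentions a respiratory keyword."""
    county = row["Hospital County"].strip().lower()
    diagnosis = row["CCS Diagnosis Description"].lower()
    return county in NYC_COUNTIES and any(k in diagnosis for k in RESPIRATORY_KEYWORDS)

# Made-up sample rows standing in for the discharge file.
SAMPLE_SPARCS = """Hospital County,Facility Name,CCS Diagnosis Description
Kings,Hospital A,Asthma
Albany,Hospital B,Asthma
Queens,Hospital C,Acute bronchitis
Bronx,Hospital D,Fracture of lower limb
Kings,Hospital A,Influenza
"""

subset = [r for r in csv.DictReader(io.StringIO(SAMPLE_SPARCS)) if is_nyc_respiratory(r)]
admissions_per_hospital = Counter(r["Facility Name"] for r in subset)
print(admissions_per_hospital)
```

Counting admissions per facility gives one number per hospital, which is the form needed to join against a table of hospital coordinates for a spatial comparison.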


After a tour through the Carto map visualization website, we simply dragged our data into the Carto interface to create some really nifty maps. First Ruthie and Shivani overlaid the Tree Count data and the geo-located 311 sanitary complaints. The video shows that 311 complaints are clustered in areas without many trees.

The video seems to support Ruthie and Shivani’s hypothesis that green spaces foster a healthy environment! On the one hand, green spaces are likely to be more desirable, and as a result of socio-economic factors, have more resources and fewer sanitary problems. On the other hand, other WiMLDS groups looked into 311 complaints and census data and suggested that wealthier NYC residents complain more, which makes our correlation between green space and lower numbers of 311 sanitary complaints more remarkable, and less contingent on socio-economic factors. We’d hoped to test our hypothesis by adding census data variables like neighborhood income to our spatial model, but we ran out of time.

Meanwhile, I overlaid the hospital admissions data on the tree cover and 311 complaints. The map was too busy with the 311 complaints included, but there also seems to be a correlation between green space and respiratory hospital admissions. We weren’t sure if we could use Carto to conduct spatial correlations of the tree count, 311 complaints, and hospital admissions data, but we would be interested in such correlations given more time.
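One simple way such a spatial correlation could be approached outside Carto is sketched below: count the trees within a fixed radius of each hospital, then compute a Pearson correlation between those counts and the admission counts. This is not an analysis we ran. The coordinates, admission numbers, and 1 km radius are invented for illustration, so the printed correlation says nothing about the real data.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def count_within(center, points, radius_km):
    """Number of points lying within radius_km of center."""
    return sum(
        1 for lat, lon in points
        if haversine_km(center[0], center[1], lat, lon) <= radius_km
    )

def pearson(xs, ys):
    """Pearson correlation coefficient; undefined if either list is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Made-up data: (latitude, longitude, respiratory admissions) per hospital.
hospitals = [(40.70, -74.00, 10), (40.80, -73.95, 20), (40.65, -73.90, 30)]
# Made-up street tree locations.
trees = [(40.701, -74.001), (40.699, -73.999), (40.702, -74.000), (40.801, -73.951)]

tree_counts = [count_within((lat, lon), trees, radius_km=1.0) for lat, lon, _ in hospitals]
admissions = [n for _, _, n in hospitals]
r = pearson(tree_counts, admissions)
print(tree_counts, round(r, 2))
```

A real analysis would also need to account for hospital catchment areas and confounders such as neighborhood income, which a single correlation coefficient cannot capture.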

Trees + hospital admissions

All of us learned a lot over the day. We used Carto for the first time, I used Jupyter and Google’s API for the first time, and Shivani got very close to being able to access the NYT database! It was an intense day of concentration that was a reminder of what we can accomplish when working together without any distractions. Although I had to explain what a hackathon is to friends and family who may have worried that I was doing something illegal, I would love to participate in the next WiMLDS hackathon!