Reimagining NYC neighborhood boundaries using open data and machine learning
By Data Clinic
R Shiny App: https://data-clinic.shinyapps.io/newerhoods/
RStudio Cloud project: https://rstudio.cloud/project/235657
New York City's (NYC's) neighborhoods are a driving force in the lives of New Yorkers: residents' identities are closely intertwined with where they live, and neighborhoods are a source of pride. However, the history and evolution of NYC's neighborhoods don't follow the rigid, cold lines of statistical and administrative boundaries. Instead, the neighborhoods we live and work in are the result of a more organic confluence of factors.
Data Clinic developed NewerHoods with the goal of helping individuals and organizations better advocate for their communities by enabling them to tailor insights to their specific needs. NewerHoods is an interactive web app that uses open data to generate localized features at the census-tract level and machine learning to group tracts into homogeneous clusters. Users can select characteristics of interest (currently open data on housing, crime, and 311 complaints), visualize NewerHood clusters on an interactive map, find similar neighborhoods, and compare them against existing administrative boundaries. The tool is designed to enable users without in-depth data expertise to compare these redefined neighborhoods and incorporate them into their work and life.
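To make the clustering step concrete, here is a minimal sketch in Python of grouping census tracts by standardized features. It is an illustration under stated assumptions, not the NewerHoods implementation: the post does not name the algorithm, so this uses plain agglomerative clustering on centroid distance, ignores spatial contiguity, and the tract IDs, feature values, and function names are all made up.

```python
from math import sqrt


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0  # guard constant columns
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def cluster_tracts(features, k):
    """Repeatedly merge the two clusters with the closest centroids until k remain."""
    ids = list(features)
    scaled = dict(zip(ids, standardize([features[i] for i in ids])))
    clusters = [[i] for i in ids]  # start with one cluster per tract

    def centroid(members):
        points = [scaled[m] for m in members]
        return [sum(c) / len(c) for c in zip(*points)]

    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    # Map each tract ID to the index of the cluster it ended up in.
    return {tract: label for label, members in enumerate(clusters) for tract in members}


# Hypothetical tracts: [median sale price, crime rate, 311 complaint rate]
tract_features = {
    "tract_A": [1_200_000, 12.0, 30.0],
    "tract_B": [1_150_000, 11.0, 28.0],
    "tract_C": [450_000, 35.0, 80.0],
    "tract_D": [480_000, 33.0, 85.0],
}
labels = cluster_tracts(tract_features, k=2)
```

Standardizing first keeps a large-valued feature such as sale price from dominating the distance. A production version would likely also restrict merges to neighboring tracts so that each cluster forms a contiguous area on the map.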
As one example use case, we analyze clusters based on real-estate sale prices across different time periods to gain insights into market trends. In the figure below, you'll notice that the left map (showing average sales from 2013-17) highlights the Upper East Side as one (blue) cluster within the orange circle. However, when visualizing sale prices for 2017 only (right map), the neighborhood of Yorkville becomes a distinct cluster. This is likely due to the arrival of the Q train in early 2017, which led to a massive boom in real estate for Yorkville. Our tool reclassified this area as its own NewerHood, adapting to localized changes and improving our contextual and temporal understanding of neighborhoods.
Another potential use case of NewerHoods involves comparing clusters to existing administrative boundaries. Specifically, we looked at rates of violations, the least serious category of offense (e.g., trespassing, disorderly conduct, and jaywalking), and compared the resulting clusters to police precincts. We found that NewerHood cluster boundaries often coincide with precinct boundaries (thick white borders), as depicted in the figure below. This suggests that individual precinct behavior may influence violation rates: is the likelihood of being ticketed for jaywalking dependent on which side of a precinct boundary you are on?
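One way to put a number on how closely cluster boundaries track precinct boundaries is sketched below in Python. It is only an illustration: the post describes a visual comparison, and the metric, function name, tract IDs, and precinct labels here are all assumptions. For every pair of adjacent tracts that fall in different clusters, it checks whether the pair also straddles a precinct line.

```python
def boundary_overlap(adjacent_pairs, cluster_of, precinct_of):
    """Share of cluster boundaries that also lie on a precinct boundary."""
    # A cluster boundary is any adjacent pair of tracts in different clusters.
    cluster_edges = [(a, b) for a, b in adjacent_pairs if cluster_of[a] != cluster_of[b]]
    if not cluster_edges:
        return 0.0
    shared = sum(1 for a, b in cluster_edges if precinct_of[a] != precinct_of[b])
    return shared / len(cluster_edges)


# Hypothetical data: four tracts in a row, each adjacent to the next.
adjacent_pairs = [("t1", "t2"), ("t2", "t3"), ("t3", "t4")]
cluster_of = {"t1": 0, "t2": 0, "t3": 1, "t4": 2}
precinct_of = {"t1": "P1", "t2": "P1", "t3": "P2", "t4": "P2"}

overlap = boundary_overlap(adjacent_pairs, cluster_of, precinct_of)
print(overlap)  # 0.5: one of the two cluster boundaries falls on a precinct line
```

A value near 1 would mean that nearly every cluster boundary coincides with a precinct boundary. Interpreting it would still require a baseline, such as the same measure computed for clusters built from a different set of features.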