The Connected States of America maps communities

by: Alexandre Gerber, Deirdre W. Paul, James R. Rowland, and Christopher A. Rath, Sun Jul 03 10:22:00 EDT 2011
Regional communities are based on strong communication ties derived from communications data.   


Borders that define cities, states, and nations are products of politics, culture, and geography. Once drawn, they rarely change even as the world transforms around them. Over time, do borders remain relevant to the way people actually interact and communicate?


The data and data analysis

In a project designed to understand the intersection of people’s self-formed communities with administrative borders, researchers at AT&T Labs - Research, IBM Research, and MIT SENSEable City Laboratory collaborated on mapping communities based on the strength of communication ties among members.

Using millions of anonymized records of cell phone data, researchers were able to map the communities that people form themselves through personal interactions. The cell phone data included both calls and texts and was collected over a single month from residential and business users.

All communication data was aggregated by county, with researchers looking at the home counties of the caller and recipient involved in each communication; home counties were determined by the caller’s and recipient’s most frequently used cell tower, which was assumed to be near their residence (see sidebar). The actual locations of the caller and recipient were used in a later stage of the project.
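The home-county assignment and county-level aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' code: the function names, the toy tower log, and the choice to treat each county pair as unordered are all hypothetical.

```python
from collections import Counter

def home_county(tower_uses, tower_to_county):
    """Return the county of the user's most frequently used tower.

    Ties are broken by first appearance in the log (an arbitrary choice).
    """
    most_used_tower, _ = Counter(tower_uses).most_common(1)[0]
    return tower_to_county[most_used_tower]

def aggregate_by_county(records, user_home):
    """Count communications between unordered pairs of home counties."""
    totals = Counter()
    for caller, recipient in records:
        pair = tuple(sorted((user_home[caller], user_home[recipient])))
        totals[pair] += 1
    return totals

# Hypothetical toy data: three anonymized users and three towers.
tower_to_county = {"t1": "county_a", "t2": "county_a", "t3": "county_b"}
tower_log = {
    "u1": ["t1", "t1", "t3"],
    "u2": ["t3", "t3", "t2"],
    "u3": ["t2", "t2", "t2"],
}
user_home = {u: home_county(uses, tower_to_county)
             for u, uses in tower_log.items()}

# Each record is one call or text, reduced to (caller, recipient).
records = [("u1", "u2"), ("u2", "u1"), ("u1", "u3"), ("u2", "u3")]
county_links = aggregate_by_county(records, user_home)
```

After aggregation only county-pair totals remain, which matches the article's point that the analysis concerns overall patterns rather than individuals.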

No personally identifiable information was used. Researchers were interested in numbers and overall patterns, not individuals; they simply wanted to know which counties communicated most closely.

Only anonymized communications between two AT&T customers were considered to ensure complete location data for both ends of the calls and texts. Counties with insufficient data were excluded from the study.

Researchers applied a modularity algorithm to the anonymized communication data to find strong county-to-county links, normalized for population. Counties with strong ties were assigned a similar color. A change in color between neighboring counties indicates a boundary, where the strength of communication ties falls off.
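To give a sense of what a modularity-based method optimizes, the sketch below computes the standard Newman modularity score Q for a given grouping of counties on a weighted, undirected graph. It is a minimal illustration, not the algorithm used in the study: the article does not say which modularity algorithm was applied or how the population normalization was done, so that step is left out, and the county names and weights are invented.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of a partition of a weighted, undirected graph.

    edges: dict mapping (county_a, county_b) -> link weight,
           with each unordered pair listed once.
    community: dict mapping county -> community label.
    """
    total = sum(edges.values())      # total link weight m
    internal = defaultdict(float)    # weight of links inside each community
    degree = defaultdict(float)      # summed weighted degree per community
    for (a, b), w in edges.items():
        degree[community[a]] += w
        degree[community[b]] += w
        if community[a] == community[b]:
            internal[community[a]] += w
    # Q = sum over communities of (internal share - expected share).
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2
               for c in degree)

# Hypothetical toy graph: two tightly linked triples joined by one weak link.
edges = {
    ("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 10,
    ("D", "E"): 10, ("E", "F"): 10, ("D", "F"): 10,
    ("C", "D"): 1,
}
two_groups = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_group = {county: 0 for county in "ABCDEF"}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

In this toy graph the two-group partition scores higher than lumping all six counties together, which is the sense in which a higher Q indicates a better-separated set of communities. A modularity algorithm searches over partitions for one with a high score.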



Comparing communities and states

Once the counties were organized into communities according to communication patterns, researchers compared the map of communities with established state borders. This map shows communities drawn from call data. 

 Communities formed from call data are compared to state boundaries.


From a high level, several phenomena are immediately apparent:

States with splits: California, Illinois, and New Jersey all showed north-south splits, while Pennsylvania showed an east-west divide.


States that seamlessly merge with neighboring states: Louisiana-Mississippi, Alabama-Georgia, New England.

States that retain their official borders to a remarkable degree: Texas, Ohio, and Michigan.

The pull of cities

Large cities often pulled in counties from across state lines, sometimes splitting states in the process. This accounts for the north-south divide in New Jersey, where northern counties gravitate toward New York City and southern ones toward Philadelphia, and for the split in Wisconsin, where two large cities pull from opposite directions: Chicago from the southeast, and Minneapolis/St. Paul from the north and west.


St. Louis draws counties from southern Illinois, while Chicago exerts a strong pull on northern Illinois counties. How strong depends on the type of communication.


Other views

St. Louis’s area of influence diminishes when the community map is drawn from texting data only. Of the Illinois counties that align with St. Louis when only calls are considered, one third (19 in all) exhibit a tighter relationship with the rest of Illinois when only texts are considered. What can be inferred when communities form differently for texts than for calls?

 St. Louis and Illinois counties. Left: calls only. Right: texts only.
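A comparison of this kind can be expressed as a set difference between two community assignments. The sketch below is hypothetical throughout: the county identifiers, community labels, and function name are made up for illustration and do not reflect the study's data. Because community labels from two separate runs are not directly comparable, it compares counties by whether they share a community with an anchor city.

```python
def aligned_with(partition, anchor):
    """Return the places sharing a community with `anchor`, excluding it."""
    label = partition[anchor]
    return {place for place, lab in partition.items()
            if lab == label and place != anchor}

# Hypothetical community assignments from calls only and from texts only.
calls = {"anchor_city": "A", "county_1": "A", "county_2": "A",
         "county_3": "A", "county_4": "B"}
texts = {"anchor_city": "X", "county_1": "X", "county_2": "X",
         "county_3": "Y", "county_4": "Y"}

with_anchor_calls = aligned_with(calls, "anchor_city")
with_anchor_texts = aligned_with(texts, "anchor_city")

# Counties grouped with the anchor for calls but not for texts.
switched = with_anchor_calls - with_anchor_texts
share_switched = len(switched) / len(with_anchor_calls)
```

Here one of the three counties grouped with the anchor city under calls leaves that community under texts, so `share_switched` plays the role of the "one third" figure in the text.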


Comparing the two maps from the national perspective shows that many of the merged states that result from call data no longer exist in the communities formed from texting data (Georgia and Alabama, Kentucky and Tennessee, Oklahoma and Arkansas).

Separate maps were drawn for anonymized calls (left) and for anonymized texts (right) to evaluate how their respective communities differed.


Texas shows remarkable consistency for both calls and texts, even though the state's large cities (Dallas, Houston, Austin, San Antonio) could potentially form their own communities.


Questions for demographers

These examples illustrate some of the insights that can be inferred from anonymized and aggregated communication patterns. The data can be examined endlessly by region, by community, or by type of communication. For example, will a map based on business communications show what role businesses play in pulling together communities?

Other information may also one day be considered, including the length of a call. Do longer calls suggest closer personal and family ties, or are constant, short calls more indicative of close family and friend relationships?

Especially useful will be comparing the aggregate cell phone data with census data and with studies previously done in the areas of commuting behavior, urban living, and community planning. In examining communication patterns for New Jersey counties, researchers found that the north-south split, which was consistent between call and texting data, differed on the map drawn from mobility data, a pattern that could be explained by commuting and migration data for New Jersey.

But other findings from this research project may challenge conventional thinking that is based on previous studies, and it’s in resolving these cases that demographers, sociologists, statisticians, and other experts will learn the most.


 About the project

Researchers from AT&T Labs - Research (Alexandre Gerber, DeDe Paul, James Rowland, Christopher Rath) along with researchers at MIT SENSEable City Laboratory and IBM Research examined anonymous connections from AT&T cell phone networks across the US and analyzed how these aggregated county-to-county connections determine regional boundaries.

This research was originally highlighted in the global edition of TIME Magazine on the 11th of April in the series on Intelligent Cities, funded by the Rockefeller Foundation and in partnership with the National Building Museum, IBM, and TIME. The research was also featured in the Opinion section of the New York Times on July 3, 2011.



Home-based, actual, and mobility communities

With today’s highly mobile communications, researchers could look at a single communication from three location perspectives: home-based, actual, and mobility.