Deriving location intelligence insights using SafeGraph

Hong Kong Night Traffic | PxHere

SafeGraph provides unique data on visitation patterns and geometries. Let's leverage this data to derive research insights. On its website, SafeGraph says it has more than 11 million records for Points of Interest (POI), including data on location, brands, and more. According to the documentation, there are three main kinds of data sets: Places, Geometry, and Patterns (a Spend data set was recently introduced as well). In this article, I'm going to walk through a location-based use case of SafeGraph data that combines the three main types of data.

SafeGraph API

First, I'm going to imagine I'm the owner of a high-traffic store who needs more granular information on visitors. This narrows my options to retail stores, grocery stores, and other places that people frequent (even during the pandemic). I chose our family's favorite grocery store, Great Wall Supermarket (GW Supermarket), located in the Duluth area, less than an hour from Atlanta. My wife is Chinese and we love Chinese food. Plus, I've made some friends with the people at the food counters and the checkout, as they are happy to talk with someone who does not look Chinese but can speak a little Mandarin!

GW Supermarket is quite famous locally for its size and the diversity of its food products. I've heard stories of people who often drive to GW Supermarket from nearby states, store food in large ice coolers, and immediately head back home. Let's see if SafeGraph can give any location insights on GW Supermarket visitations.

We can use the SafeGraph core API to find the placekey of GW Supermarket, which we need in order to query patterns information.
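The original query is not reproduced here, so what follows is only a rough sketch of what a placekey lookup request might look like. It assumes SafeGraph's API is a GraphQL endpoint with a `lookup` query that accepts a name and location; the URL, the query shape, and the field names are my assumptions and should be checked against the current SafeGraph documentation. The helper `build_lookup_payload` is hypothetical, and the sketch only builds the request body without sending it.

```python
import json

# Assumed endpoint, not verified here; check SafeGraph's API docs before use.
SAFEGRAPH_GRAPHQL_URL = "https://api.safegraph.com/v2/graphql"


def build_lookup_payload(location_name, city, region, iso_country_code="US"):
    """Build a GraphQL request body asking for a place's placekey.

    The `lookup` query and the field names below are assumptions.
    """
    fields = [
        ("location_name", location_name),
        ("city", city),
        ("region", region),
        ("iso_country_code", iso_country_code),
    ]
    # json.dumps produces double-quoted, escaped string literals,
    # which are also valid GraphQL string literals.
    args = ", ".join(f"{key}: {json.dumps(value)}" for key, value in fields)
    query = (
        "query { lookup(query: {" + args + "}) "
        "{ placekey safegraph_core { location_name street_address city region } } }"
    )
    return {"query": query}


payload = build_lookup_payload("Great Wall Supermarket", "Duluth", "GA")
print(json.dumps(payload, indent=2))
```

In practice this body would be sent as a JSON POST together with an API key, and a street address may also be needed to pin down a single store.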

And success! We now have the GW Supermarket placekey!

GW placekey from SafeGraph | Skanda Vivek

Now, we can use this placekey to query the relevant information using the SafeGraph core and monthly patterns API:

The most popular days are Saturday and Sunday, and the least popular are Monday and Tuesday. From personal experience with grocery shopping, this matches my intuition: I'd rather shop later in the week.

Visitations to GW Supermarket by day in December 2021 | Skanda Vivek
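As a minimal sketch of how a ranking like this can be read off the patterns record, the snippet below assumes the `popularity_by_day` column arrives as a JSON string mapping day names to visit counts. The numbers are illustrative placeholders, not the actual GW Supermarket values, and `rank_days` is a hypothetical helper.

```python
import json

# Illustrative values only, not the real GW Supermarket numbers.
popularity_by_day_raw = (
    '{"Monday": 310, "Tuesday": 295, "Wednesday": 340, "Thursday": 360, '
    '"Friday": 420, "Saturday": 610, "Sunday": 580}'
)


def rank_days(raw):
    """Return (day, visits) pairs sorted from busiest to quietest."""
    counts = json.loads(raw) if isinstance(raw, str) else dict(raw)
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


ranked = rank_days(popularity_by_day_raw)
busiest = [day for day, _ in ranked[:2]]
quietest = sorted(day for day, _ in ranked[-2:])

print("Busiest days:", busiest)
print("Quietest days:", quietest)
```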

Visitations by hour are interesting. It looks like the peak hours are from 11 AM to 5 PM. I usually visit GW at either 10 AM or 2 PM but haven't really paid attention to which times are more crowded (although I have noticed that GW is sometimes much more crowded than at other times). I will keep this in mind the next time I visit!

Visitations to GW Supermarket by hour in December 2021 | Skanda Vivek
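One simple way to pull out a peak window, sketched here under the assumption that `popularity_by_hour` is a JSON array of 24 counts (index 0 is midnight to 1 AM, local time), is to keep the hours that reach some fraction of the busiest hour. The 85% cutoff and the sample numbers are arbitrary choices for illustration, and `peak_hours` is a hypothetical helper.

```python
import json

# Illustrative values only: 24 hourly counts, index 0 = midnight to 1 AM.
popularity_by_hour_raw = (
    "[2, 1, 1, 0, 0, 1, 3, 10, 40, 90, 150, 210, "
    "230, 240, 235, 225, 215, 160, 110, 70, 30, 10, 5, 3]"
)


def peak_hours(raw, fraction=0.85):
    """Return the hours (0-23) whose count is at least `fraction` of the max."""
    counts = json.loads(raw) if isinstance(raw, str) else list(raw)
    threshold = fraction * max(counts)
    return [hour for hour, visits in enumerate(counts) if visits >= threshold]


peaks = peak_hours(popularity_by_hour_raw)
print("Peak hours (start of each hour, 24h clock):", peaks)
```

With these placeholder numbers, the peak hours start at 11 and end with the hour starting at 16, which corresponds to an 11 AM to 5 PM window.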

The graph below shows that visitors to GW were also more likely to go to Walmart, Costco, and McDonald's. For a bit of location context: Walmart is right opposite GW, and the closest Costco is 1.5 miles away. Costcos are few and far between, so this makes a lot of sense.

Other brands that the visitors to GW went to in December 2021 | Skanda Vivek
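A brand ranking like this can be sketched from a related-brands column, assuming it is a JSON dictionary that maps brand names to a score or share of visitors. I am assuming a column along the lines of `related_same_month_brand`; the exact column used in the original analysis is not shown, and the values below are placeholders. `top_brands` is a hypothetical helper.

```python
import json

# Illustrative values only; the brand names come from the article,
# the numbers and the "Other Brand" entries are placeholders.
related_brands_raw = json.dumps(
    {
        "Other Brand A": 12,
        "Costco": 31,
        "Walmart": 48,
        "Other Brand B": 9,
        "McDonald's": 27,
    }
)


def top_brands(raw, n=3):
    """Return the n brands with the highest score, highest first."""
    scores = json.loads(raw) if isinstance(raw, str) else dict(raw)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


top3 = top_brands(related_brands_raw)
for brand, score in top3:
    print(f"{brand}: {score}")
```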

SafeGraph Geometries

I've shown you the Places and Patterns functionalities of SafeGraph. But SafeGraph has another powerful feature, Geometries, which is useful for understanding and visualizing geospatial trends. Specifically, the question I'm interested in is: where do the visitors to GW primarily come from? Are they local? Do they come from out of state? The dataframe corresponding to GW visitation data consists of a single row.

Some columns contain dictionaries that can be used to infer where visitors came from. In particular, the 'visitor_daytime_cbgs' column maps each primary daytime census block group (CBG) ID to the number of visitors from it. However, a CBG ID on its own does not tell us anything about the location. So how do we get the locations and geometries of the CBGs?

For this, I turn to census data provided by SafeGraph (it is free to download!). The CBG geometries data is in GeoJSON format and can be read with geopandas:

CBG geometries data from SafeGraph | Skanda Vivek
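To make the join between visitor counts and geometries concrete, here is a small sketch that uses only the standard library instead of geopandas, so it can run without the full census download. It assumes each GeoJSON feature stores its CBG ID in a `CensusBlockGroup` property; that property name, the CBG IDs, the coordinates, and the counts are all placeholders, and `attach_visitors` is a hypothetical helper.

```python
import json

# Placeholder 'visitor_daytime_cbgs' value: CBG ID -> visitor count.
visitor_daytime_cbgs_raw = (
    '{"131350502171": 45, "131210114252": 12, "470650104351": 4}'
)


def square(lon, lat, size=0.01):
    """Build a tiny square polygon, standing in for a real CBG shape."""
    ring = [
        [lon, lat], [lon + size, lat], [lon + size, lat + size],
        [lon, lat + size], [lon, lat],
    ]
    return {"type": "Polygon", "coordinates": [ring]}


# Tiny stand-in for the CBG geometries file (property name is an assumption).
cbg_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": square(-84.15, 33.95),
         "properties": {"CensusBlockGroup": "131350502171"}},
        {"type": "Feature", "geometry": square(-84.40, 33.75),
         "properties": {"CensusBlockGroup": "131210114252"}},
        {"type": "Feature", "geometry": square(-84.00, 34.10),
         "properties": {"CensusBlockGroup": "131350501001"}},  # no visitors
    ],
}


def attach_visitors(collection, raw_counts, id_property="CensusBlockGroup"):
    """Keep only features with visitors and add a 'visitors' property."""
    counts = json.loads(raw_counts) if isinstance(raw_counts, str) else dict(raw_counts)
    matched = []
    for feature in collection["features"]:
        cbg = str(feature["properties"].get(id_property))
        if cbg in counts:
            props = dict(feature["properties"], visitors=counts[cbg])
            matched.append({**feature, "properties": props})
    return {"type": "FeatureCollection", "features": matched}


enriched = attach_visitors(cbg_geojson, visitor_daytime_cbgs_raw)
geojson_text = json.dumps(enriched)  # could be written to a .geojson file
print(len(enriched["features"]), "CBGs with visitors and a geometry")
```

CBGs that appear in the counts but not in the geometry file (the third ID above) are silently dropped in this sketch, which is worth checking for in a real analysis.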

Finally, I save this as GeoJSON and use kepler.gl (see my recent article about kepler.gl) to visualize a choropleth map showing visitations by CBG, as well as the location of GW (large yellow dot).

GW visitations from primary daytime CBGs in December 2021 | Skanda Vivek

Zooming out, you can see there are visitors from CBGs outside Georgia, including Tennessee and North Carolina! It does look like the stories of people driving hours to shop at GW might have some truth to them!

GW visitations from primary daytime CBGs in December 2021 (zoomed out) | Skanda Vivek
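The out-of-state share can also be checked without a map, because the first two digits of a 12-digit CBG ID are the state FIPS code (13 for Georgia, 37 for North Carolina, 47 for Tennessee). The sketch below sums visitors by state; the CBG IDs and counts are placeholders, not the actual GW data, and `visitors_by_state` is a hypothetical helper.

```python
import json
from collections import Counter

# State FIPS codes for the three states mentioned in the article.
STATE_FIPS = {"13": "Georgia", "37": "North Carolina", "47": "Tennessee"}

# Placeholder 'visitor_daytime_cbgs' value: CBG ID -> visitor count.
visitor_daytime_cbgs_raw = (
    '{"131350502171": 45, "131210114252": 12, '
    '"470650104351": 4, "371190058232": 4}'
)


def visitors_by_state(raw):
    """Sum visitor counts by the state encoded in each CBG ID."""
    counts = json.loads(raw) if isinstance(raw, str) else dict(raw)
    totals = Counter()
    for cbg, visitors in counts.items():
        # Pad to 12 digits in case a leading zero was lost along the way.
        state_code = str(cbg).zfill(12)[:2]
        totals[STATE_FIPS.get(state_code, f"FIPS {state_code}")] += visitors
    return dict(totals)


by_state = visitors_by_state(visitor_daytime_cbgs_raw)
out_of_state = sum(v for state, v in by_state.items() if state != "Georgia")
total = sum(by_state.values())

print(by_state)
print(f"Out-of-state share: {out_of_state / total:.1%}")
```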

Takeaways

SafeGraph data gives quite a few location insights, and the API is very friendly. There aren't many tutorials walking through use cases, but this could be because the company is relatively new. SafeGraph is quite tempting for companies that want to boost location insights without investing in intensive architectures to track store visitations.

The data is aggregated, which can be good or bad depending on what you need. The good part is that SafeGraph has already done some (pretty thoughtful, actually!) preliminary analyses, so you can use the formatted data to generate insights. In addition, this aggregation removes certain privacy concerns. The bad part is that you might miss some granular insights into visitations that could be useful.

Overall, SafeGraph is a great tool for providing location insights. Many recent studies have used SafeGraph data to answer important research questions, especially those related to people's movement patterns during the COVID-19 pandemic. I'm sure the number of researchers and companies using SafeGraph for location intelligence will only grow in the coming years!


Thanks for reading!
