Learn how AirSage leverages human movement data to provide best-in-class location insights.

About AirSage

About AirSage Data

How do you translate raw data from different sources into meaningful insights?

One of the biggest challenges in geospatial big-data analytics is translating results generated from a varying sample of mobile devices into insights about the full population. AirSage has developed efficient extrapolation methodologies to do so. These work by maximizing and validating the correlation with independent sources such as updated census data, high-quality traffic counts, and attendance reports.
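As a rough illustration of what extrapolating from a device sample to a population can look like, the sketch below computes a per-zone expansion factor (census residents divided by sampled devices living in that zone) and uses it to scale observed devices up to people. This is not AirSage's methodology, which is proprietary; the function names, zone labels, and numbers are hypothetical.

```python
def expansion_factors(census_population, sampled_devices):
    """Return census residents per sampled device for each home zone."""
    return {
        zone: census_population[zone] / count
        for zone, count in sampled_devices.items()
        if count > 0 and zone in census_population
    }


def estimate_visitors(observed_by_home_zone, factors):
    """Scale observed devices, grouped by home zone, up to estimated people."""
    return sum(
        observed * factors[zone]
        for zone, observed in observed_by_home_zone.items()
        if zone in factors
    )


# Hypothetical inputs: census residents and sampled devices per home zone.
census = {"zone_a": 10_000, "zone_b": 4_000}
devices = {"zone_a": 500, "zone_b": 100}
factors = expansion_factors(census, devices)

# Hypothetical observation: 30 devices from zone_a and 5 from zone_b at a venue.
estimate = estimate_visitors({"zone_a": 30, "zone_b": 5}, factors)
```

In practice, factors like these would be checked against independent sources such as traffic counts or attendance reports, as described above.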

How does AirSage clean and pre-process its data?

Sourced data is normalized and archived in AirSage’s Big Data system in a secure and accessible format. Irrespective of the final use case, proprietary pre-processing is run on the data. This includes some unique features such as:

  • Accurate point classification: every user location is classified as representing a person who is either in motion or stationary. This is critical, for example, when counting visits to a location and differentiating between people who just passed by and those who actually spent time there. In a recent comparison of the AirSage point classification with an independent source, we found that the AirSage data was more than 99.9% accurate.
  • Home/Work assignment: several use cases depend on a high-quality assignment of the locations where mobile device holders live and work. These assignments need to cope with unusual cases, such as people relocating to a vacation home for a few weeks or even months, people who regularly work night shifts, etc.
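The moving-versus-stationary distinction in the first bullet can be illustrated with a simple speed test between consecutive pings. This is only a sketch under stated assumptions, not AirSage's proprietary classifier: the function names, the (timestamp, latitude, longitude) tuple layout, and the 1 m/s threshold are all hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6_371_000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def classify_points(pings, speed_threshold_mps=1.0):
    """Label time-sorted (timestamp_s, lat, lon) pings as 'moving' or 'stationary'."""
    labels = []
    for prev, curr in zip(pings, pings[1:]):
        dt = curr[0] - prev[0]
        dist = haversine_m(prev[1], prev[2], curr[1], curr[2])
        speed = dist / dt if dt > 0 else 0.0
        labels.append("moving" if speed > speed_threshold_mps else "stationary")
    # The first ping has no predecessor, so it borrows the first segment's label.
    return [labels[0]] + labels if labels else ["stationary"] * len(pings)


# Hypothetical trace: one minute in place, then heading north.
trace = [(0, 40.00, -75.0), (60, 40.00, -75.0), (120, 40.01, -75.0), (180, 40.02, -75.0)]
point_labels = classify_points(trace)
```

A visitation count would then keep only the stationary pings at the location of interest, so that pass-by traffic is not counted as a visit.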

How does AirSage collect location data?

AirSage supports the ingestion of data from multiple data providers (publishers, first-party data providers, and aggregators) and has also evaluated other providers that we don’t support.

In our experience, the data we currently use comes from one of the largest panels available, with enough high-quality devices to let us select a large sample of consistently high quality. That, in turn, lets us adjust for factors such as variable sample sizes.

We select our sample using an abstract per-device monthly metric that measures both the visibility and the mobility of each device, to ensure that we have a sample of devices that behave consistently.

Our metric was defined by staff who had also worked with telecom data, which offered better visibility than app data.

This is a key differentiator between us and competitors. Much of this is IP and, therefore, cannot be expanded upon.

How does AirSage ensure high-quality data sourcing?

With more than a decade of experience sourcing various types of anonymous location data (carrier data, connected car data, fleet data, smartphone data, and more), including five years specifically sourcing app data, AirSage has developed a unique skill in sourcing the best available data and building an optimal data panel.

Nearly all data available in the open market for large-scale sourcing has been evaluated and considered by AirSage for inclusion in its panel. Each candidate passes through a thorough and efficient evaluation process that reveals its data volume, coverage, uniqueness, and multiple other quality metrics, all relevant to the AirSage analytics use cases.

Data chosen to enter the panel goes through a similar ongoing evaluation to make sure that the highest quality standards are maintained over time. Data feeds that fail to maintain these standards are removed from the panel.

In what sort of ways does AirSage normalize the data and manage noise?

AirSage cleanses the data we use on ingest. We apply point types to sightings, along with various other metadata that our individual products need for processing. Further, unlike some other providers, we don’t use bid-stream data.

In which formats can I receive the data?

We provide our output as CSV files for maximum compatibility with our customers’ systems.

About Data Visualization

Can I use AirSage data with Kepler or Superset?

Our customers can convert our output into their own GeoJSON datasets to use it with Kepler and Superset. Both also support importing CSV data into the database to which they are connected.
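A conversion of this kind can be sketched with Python's standard library alone. The column names used here (location_id, latitude, longitude, total_devices) are hypothetical placeholders, not the actual AirSage output schema, and the sketch assumes each row carries a point coordinate.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, lat_col="latitude", lon_col="longitude"):
    """Turn CSV rows with coordinate columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lat = float(row.pop(lat_col))
        lon = float(row.pop(lon_col))
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered longitude first, then latitude.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            # Remaining columns become feature properties (kept as strings).
            "properties": row,
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical one-row sample, not real AirSage output.
sample_csv = "location_id,latitude,longitude,total_devices\nA1,33.749,-84.388,120\n"
collection = csv_to_geojson(sample_csv)
geojson_text = json.dumps(collection, indent=2)
```

The resulting text can be saved with a .geojson extension and loaded into a visualization tool that accepts GeoJSON.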

How can I visualize the data?

Our data can easily be imported as attribute data and joined with standard Census shapefiles, so you can use your preferred GIS suite.
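Conceptually, such an attribute join is a key lookup between the CSV rows and the boundary records. The sketch below shows the idea with plain dictionaries; it assumes the CSV carries a Census identifier column named geoid and that the boundaries have already been read from a shapefile into a list of records. Both the column name and the sample values are hypothetical.

```python
import csv
import io


def join_attributes(boundaries, csv_text, key="geoid"):
    """Attach CSV columns to the boundary records that share the same key."""
    attributes = {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}
    joined = []
    for boundary in boundaries:
        match = attributes.get(boundary[key])
        if match is not None:  # inner join: boundaries without data are dropped
            joined.append({**boundary, **match})
    return joined


# Identifiers are kept as strings so that leading zeros are preserved.
boundaries = [
    {"geoid": "01001020100", "geometry": "POLYGON ((0 0, 1 0, 1 1, 0 0))"},
    {"geoid": "01001020200", "geometry": "POLYGON ((1 0, 2 0, 2 1, 1 0))"},
]
attribute_csv = "geoid,total_devices\n01001020100,120\n"
joined = join_attributes(boundaries, attribute_csv)
```

Most GIS suites offer the same operation as a table join on a shared field.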

Which GIS formats does AirSage support by default?

We can accept shapefiles, GeoJSON, and delimited text files with geometries in WKT or hex-encoded WKB.
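For readers unfamiliar with the two text encodings, the sketch below shows how the same 2D point looks as WKT and as hex-encoded WKB, using only Python's struct module. It handles plain 2D Point geometries only, and the helper names are hypothetical; real files usually contain polygons and are better handled by a GIS library.

```python
import struct


def point_to_wkt(x, y):
    """Render a 2D point as WKT text."""
    return f"POINT ({x} {y})"


def point_to_hex_wkb(x, y):
    """Encode a 2D point as little-endian WKB, then as uppercase hex."""
    # Layout: 1 byte-order byte (1 = little-endian), uint32 type (1 = Point), two doubles.
    return struct.pack("<BIdd", 1, 1, x, y).hex().upper()


def point_from_hex_wkb(hex_wkb):
    """Decode hex-encoded WKB for a 2D point back into (x, y)."""
    raw = bytes.fromhex(hex_wkb)
    order = "<" if raw[0] == 1 else ">"
    (geom_type,) = struct.unpack(order + "I", raw[1:5])
    if geom_type != 1:
        raise ValueError("only 2D Point geometries are handled in this sketch")
    return struct.unpack(order + "dd", raw[5:21])


wkt = point_to_wkt(1.0, 2.0)
hex_wkb = point_to_hex_wkb(1.0, 2.0)
decoded = point_from_hex_wkb(hex_wkb)
```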

About Location Insights

How does AirSage address biases?

We control sample bias by maintaining a diverse data panel that better represents the whole population. Our data panel includes tens of millions of unique devices and draws on apps from every category. After receiving the aggregated data, we apply an accuracy metric and a device quality score to exclude noise. Some of the things we take into consideration:

  • How often the device moves around.
  • How frequently devices are seen.
  • The duration of each sighting.
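The three considerations above can be pictured as a simple per-device filter. The actual AirSage accuracy metric and device quality score are proprietary, so everything in this sketch is an assumption made for illustration: the DeviceMonth fields, the passes_quality helper, and the threshold values.

```python
from dataclasses import dataclass


@dataclass
class DeviceMonth:
    """Hypothetical monthly summary for one device."""
    device_id: str
    days_seen: int                # how frequently the device is seen
    distinct_places: int          # how often the device moves around
    mean_sighting_minutes: float  # typical duration of each sighting


def passes_quality(device, min_days=10, min_places=2, min_minutes=1.0):
    """Keep devices that are seen often, move around, and produce lasting sightings."""
    return (
        device.days_seen >= min_days
        and device.distinct_places >= min_places
        and device.mean_sighting_minutes >= min_minutes
    )


panel = [
    DeviceMonth("dev-1", days_seen=25, distinct_places=8, mean_sighting_minutes=12.0),
    DeviceMonth("dev-2", days_seen=2, distinct_places=1, mean_sighting_minutes=0.5),
    DeviceMonth("dev-3", days_seen=20, distinct_places=1, mean_sighting_minutes=30.0),
]
kept = [d.device_id for d in panel if passes_quality(d)]
```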

Some other known biases are hard to avoid. One is age bias when looking at usage during particular times of the day, since waking and sleeping hours typically vary with age. Another is vacation bias: activity may increase when a person is on vacation compared with regular daily routines. A third is income bias, where more affluent areas may show more devices (e.g., people from affluent areas may have more than one device).

How does AirSage apply “point types” to “sightings”?

We classify each sighting by the behavior it reflects: for example, whether the device is at home or work, moving through the reported point, or stationary at a location.

How does AirSage associate or link the unique device if said device has several apps sending their location?

This is not an issue for us. AirSage uses the mobile advertiser ID to uniquely identify devices. AirSage’s data is coalesced at the device level, so we do not distinguish between different apps or SDKs.

How does AirSage measure “home” and “work” locations?

We also like to think of “home” and “work” locations as “evening” and “daytime” locations, respectively. These locations are based on where devices ping the most during the late evening and during the daytime.
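The idea of picking the place where a device pings most often in a given time window can be sketched as follows. The hour windows, the (hour, place) tuple layout, and the place labels are hypothetical, and this simple version does not handle special cases such as night-shift workers or extended stays at a vacation home.

```python
from collections import Counter

# Hypothetical hour windows (local time, 24-hour clock).
DAYTIME_HOURS = range(9, 17)
EVENING_HOURS = range(21, 24)


def most_common_place(pings, hours):
    """Return the place with the most pings in the given hours, or None."""
    counts = Counter(place for hour, place in pings if hour in hours)
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical pings for one device: (hour of day, place identifier).
device_pings = [
    (10, "cell_office"), (11, "cell_office"), (14, "cell_cafe"),
    (21, "cell_gym"), (22, "cell_home"), (23, "cell_home"),
]
work = most_common_place(device_pings, DAYTIME_HOURS)
home = most_common_place(device_pings, EVENING_HOURS)
```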

What is the difference between “Total Sightings” vs. “Total Devices” in TLA/Properties output?

Total Devices counts distinct devices present at the location of interest during the reporting period. Total Sightings counts the total number of individual records produced by all devices present at the location of interest during the reporting period.
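The difference comes down to counting distinct devices versus counting all records, as in this minimal sketch. The record layout and device identifiers are hypothetical.

```python
def summarize(records):
    """Count distinct devices and total records for one location and period."""
    return {
        "total_devices": len({device_id for device_id, _ in records}),
        "total_sightings": len(records),
    }


# Hypothetical records at one location: (device identifier, timestamp).
records = [
    ("dev-1", "2022-03-01T09:00"),
    ("dev-1", "2022-03-01T09:05"),
    ("dev-2", "2022-03-01T10:30"),
]
summary = summarize(records)
```

Here two devices produced three records, so Total Devices is 2 and Total Sightings is 3.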

© 2022 AirSage Inc. All rights reserved.