Q+A: The Use and Misuse of Geolocation App Data in Research – Do the Risks Outweigh the Rewards?

In today’s world we have access to more data than ever. With the rise of digital technologies, we have unprecedented amounts of information about how people live, behave and move through the world.

This data is incredibly valuable and has been a driving force of innovation across commercial, research and policy settings.

However, even when this data is used for “good” or socially positive purposes, there are still many ethical questions around how, and for what purposes, this data should be used.

In this article we talk to Pascarn Dickinson, who explored these questions in research our team conducted in partnership with Covid-19 Modelling Aotearoa (CMA). This research aimed to understand whether the public health benefits of using geolocation app data for analysis during the Covid-19 pandemic justified the collection and use of this data.

In our chat, Pascarn walks us through the findings of this research and explains the different factors researchers should consider when weighing up the risks and benefits of using geolocation app data.

Note: Since this Q+A Pascarn has moved on to a new role at Huber Social – thank you for all your mahi Pascarn and all the best!


NC: Hi Pascarn, thanks for chatting with us! Would you be able to give us a little bit of background about this research you worked on?

PD: Sure thing! Early last year I was asked by Covid-19 Modelling Aotearoa to do some research into the ethical and technical issues associated with geolocation app data – this was largely to support their decision-making about whether to use that type of data in their future research work.

The project on my end started out as a literature review, and then morphed a little into a literature ‘analysis’ – basically a review, but with some extra interpretation, discussion, and assessment mahi. It culminated in a publication titled The Use and Misuse of Geolocation Mobility Data: A literature analysis and recommendations for ethical research.

NC: What exactly is geolocation app data?

PD: Good question. Geolocation app data generally takes the form of individual-level information on people, including their geographic coordinates over time; think of Google or Apple Maps showing a pathway of where you have travelled in the last few months and you get the idea. The ‘app’ part is because such information is sourced from your mobile phone apps, usually by tapping into your location services when serving you an ad.

The information on your geographic coordinates is often accompanied by some ‘inferred’ personal information about you; things like your gender, age group, occupation, or even things like your personality traits. Each geolocation app dataset contains many location observations on each individual in it, alongside this inferred information. Such datasets often contain information on many thousands or millions of individuals.

The network of sharing and distribution of geolocation app data is pretty nebulous. The companies that serve the advertisements (and get the location data from you) on-sell the data to other organisations (often called ‘data brokers’), who then tidy or package up the data to make it more useable for others further down the line. If you’re interested in learning more about that part of the data chain, as well as about the data brokers themselves, the Irish Council for Civil Liberties did some interesting reports a couple of years ago on the ‘Real-Time Bidding’ markets that a lot of data brokers buy their geolocation app data from.

NC: What are some common applications for geolocation app data?

PD: This is an interesting one. It is worth noting there are some technical concerns with the data that aren’t always discussed – I’ll set those aside here, but more information is in the full report.

Like a lot of technology, geolocation app data can be used for social good or social bad purposes. On the good (or at least innocuous) end, it might be used to help researchers understand population movements to help with pandemic or traffic modelling, or to better plan how to deliver services to people.

Unfortunately, there are many examples of it being used for things I would consider not socially good: planning for military operations in other countries, aggressively targeting anti-abortion ads at people who visit abortion clinics, publicly and maliciously outing the sexuality of others, blackmail of those in positions of authority, and probably a whole bunch of stuff that the general public never hears about. The Literature Analysis cites a bunch of interesting investigative journalism pieces in this space, and the Irish Council for Civil Liberties reports I mentioned earlier also cover some pretty chilling examples.

NC: What privacy issues or ethical concerns are there around the use of geolocation app data?

PD: Lots, unfortunately – many more than we have time for here. You’re probably already starting to understand what these might be based on my answers to previous questions.

From a privacy perspective, the data is deeply personal – relating to individuals, where they have been, who they are, what they do, where they live, and more. It also has implications for group privacy: geolocation data shared relating to you will also likely relate to others you work, live, or interact with. The data represents a huge opportunity for blackmail or misuse, and is currently barely regulated or monitored at all. Moreover, such data is sometimes collected without any consent, and is nearly always collected without truly informed consent; you’re often signing away permission for others to take this data when you accept the Terms and Conditions on a website or the End-User License Agreement (EULA) on an app or game.

That lack of consent raises some ethical issues as well. There is a whole raft of considerations around the potential inequitable impacts of the collection and use of geolocation app data for different groups. One important ethical consideration for researchers looking to use geolocation app data for public good purposes is function creep. Many researchers may think the ethical concerns are less relevant if you plan on using the data for the public good. However, even if you’re using the data for ‘good’ purposes, your use of the data also serves to legitimise it. That makes it easier for someone else to justify the use of the data for their slightly less ‘good’ purposes, which justifies it for some other even less good use – and so on and so forth.

There are tangible examples of this exact thing happening in the past – like where some governments promised they would only collect geolocation information to help address Covid-19 over the pandemic, until, a while afterwards, they suddenly decided to use that geolocation data for police work as well.

Honestly, my learning during this research project has meant that I just leave my location services off on my phone 99% of the time. It’s not foolproof and I sometimes turn it back on for a few minutes when I need it, but overall I’ve noticed no difference in my quality of life and I feel a lot better not having that data about me out there.

NC: Can you describe the tension that exists between privacy and utility when using geolocation app data?

PD: I describe this as the tension between privacy and potency, mainly because I like the alliteration and I think it makes it more memorable. Anyway, yes, I can describe it.

Basically I think of most datasets about individuals as being on a spectrum. At one end, you have your very private datasets, which aggregate individuals up to large scales, obscure any potentially personally identifiable information, and protect access to the initial individual-level information very closely. Due to its privacy, this data is less potent – you can do a bit less with it, but it is still helpful for a whole bunch of work.

At the other end, you have the very potent raw information on individuals that allows for easy identification of them personally, accompanied by relatively few protections regarding access to the data. The potential uses of this type of data are vast (assuming the data is accurate), but the potential for privacy breaches and misuse are also very high.

Geolocation app datasets could exist at many points on this spectrum, but many of the commercially available ones are much closer to the potency end, albeit with a cost barrier preventing totally open access to the data.
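To make the privacy end of that spectrum concrete, here is a minimal sketch of the kind of aggregation Pascarn describes: snapping individual coordinates to a coarse grid and releasing only cell-level counts, suppressing any cell visited by fewer than k distinct people (a simple k-anonymity-style threshold). All names, coordinates, and parameters here are illustrative, not taken from the report.

```python
from collections import defaultdict

def aggregate_points(records, cell_size=0.1, k=5):
    """Aggregate (user_id, lat, lon) records into coarse grid cells,
    releasing counts only for cells seen by at least k distinct people."""
    cells = defaultdict(set)
    for user_id, lat, lon in records:
        # Snap coordinates to a coarse grid (~11 km at 0.1 degrees).
        cell = (round(lat // cell_size * cell_size, 6),
                round(lon // cell_size * cell_size, 6))
        cells[cell].add(user_id)
    # Suppress small cells: they are the re-identification risk.
    return {cell: len(users) for cell, users in cells.items() if len(users) >= k}

# Five people near one city centre, one person somewhere else.
records = [
    ("a", -36.85, 174.76), ("b", -36.86, 174.77), ("c", -36.84, 174.75),
    ("d", -36.85, 174.76), ("e", -36.86, 174.74), ("f", -41.29, 174.78),
]
print(aggregate_points(records, cell_size=0.1, k=5))
```

The trade-off Pascarn describes is visible directly in the two parameters: a larger `cell_size` and higher `k` push the output toward the private-but-less-potent end, while shrinking them recovers potency at the cost of re-identification risk.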

NC: What considerations are there for those looking to use geolocation app data in a safe and ethical way?

PD: To be honest, I’d point you towards the full report for more discussion of this – it’s pretty tricky to summarise! There are a fair few things to consider and steps to take to make sure that your work with the data is as safe and ethical as possible, and to some extent the data ‘means’ may be more justified when you’re aiming for particularly great public good ‘ends’.

In short though, I’d generally recommend researchers not use this individual-level data unless there are clear and strong privacy protections in place, there is a very strong public good justification for doing so, and there is a feasible pathway to positive impact. Regardless of the type of data, I’d always recommend that people use the least invasive data to do their analysis when that is an option.

Probably the biggest consideration from my perspective though is about what’s next for this type of data, rather than what now.

The data is out there, being collected and sold – so what do we want to do about it? Engaging with the data when we know that the industry is highly problematic sends a signal of support, or at least apathy, towards those doing harm, and so perpetuates it. Disengaging entirely is a better alternative, but isn’t going to help address the problem in the long run when others are still engaging with the industry.

My feeling is that what we need next is more effective monitoring, regulation, and awareness of this type of data, especially in Aotearoa – I’d love to see the Office of the Privacy Commissioner do some more work specifically in this space. In working towards this greater scrutiny, we can hopefully identify and uplift those in the industry that are genuinely operating to high ethical standards, and can help push for positive change while we’re at it.


Thank you for your time Pascarn! And a big thank you to Covid-19 Modelling Aotearoa for commissioning this mahi.

If you would like to know more about this mahi, or how to strengthen your organisation’s data practices from an ethical perspective, reach out to us at hello@nicholsonconsulting.co.nz
