Building a Blockchain Protocol for Market Research

This is the first in a series of posts about building a blockchain protocol for market research. The series is aimed at people in the business of research, but many of the interesting and difficult topics center on markets, incentives, and trust, and are likely to interest anyone thinking about applying blockchain technology to an industry.

First, an introduction and a disclaimer: I’m John Martin and I’m one of the founders of Measure. At Measure, we’re building a blockchain protocol for market research. As such, we are incredibly biased on this topic! However, we’ve thought and argued long and hard about all aspects of what we’re building and are pretty convinced that we’ve arrived at justifiable conclusions. Regardless, these posts are an attempt to show our work so you don’t have to take our word for it.

In this series we’re going to cover a lot of ground, including, among other things, consumer privacy, data quality and validation, the role of intermediaries, sampling rigor, and pricing models. For the remainder of this first post we’ll set the context and make some broad recommendations for thinking about blockchain technology.

The Opportunity: Facilitating Data Collection

There are likely several ways in which blockchain technology can be applied to market research. We are focused on just one of them: the facilitation of data collection from individuals. Importantly, this does not include the design, programming, or hosting of the survey instrument itself. Instead, it is concerned specifically with getting appropriately qualified individuals to the front door of a survey and incentivizing them to make a good faith effort to answer questions and contribute data.

This is, essentially, the core service performed today by survey panels and other sample providers.

We can break down this problem into four specific tasks:

a. Recruiting - finding new respondents and signing them up
b. Profiling - recording and validating the profile of respondents
c. Sampling - inviting and routing qualified respondents into studies
d. Payment - paying respondents for participation

Recruiting, for our purposes, is outside of the system. It’s the act of bringing people into the system. And so we’re left with profiling, sampling, and payments. It is this set of tasks that we believe is particularly well suited to a blockchain protocol.

Taken as a whole, this is an economic coordination problem. There are hundreds or thousands of buyers (researchers) that want to purchase goods (data and opinions) from millions of sellers (consumers). However, each request from a buyer can only be satisfied by a very particular subset of sellers, e.g. a study on automotive brand loyalty may be interested only in Floridian women who drive Subarus.
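To make the matching side of this problem concrete, here is a minimal sketch of a request being checked against respondent profiles. The attribute names, the `Profile` type, and the exact-match rule are illustrative assumptions, not part of any Measure specification.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    # Hypothetical respondent record; the field names are assumptions.
    respondent_id: str
    attributes: dict


def qualifies(profile, criteria):
    """Return True if the profile matches every attribute the study requires."""
    return all(profile.attributes.get(key) == value
               for key, value in criteria.items())


profiles = [
    Profile("r1", {"state": "FL", "gender": "female", "vehicle_make": "Subaru"}),
    Profile("r2", {"state": "FL", "gender": "male", "vehicle_make": "Subaru"}),
    Profile("r3", {"state": "GA", "gender": "female", "vehicle_make": "Subaru"}),
]

# The automotive brand loyalty study from the example above.
criteria = {"state": "FL", "gender": "female", "vehicle_make": "Subaru"}

qualified = [p.respondent_id for p in profiles if qualifies(p, criteria)]
```

Only one of the three profiles satisfies the request, which is the point: each study narrows millions of potential sellers down to a small qualified subset.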

There are two things that are probably true today:

- The total amount of money that researchers and their clients spend on sample is enough to adequately incentivize consumers to complete all of the research those buyers need.

- Many studies are hard to fill and data quality is questionable (compare the typical sample requirements for a tracking study to a one-off study to understand real opinions on this).

This is not really surprising. It’s a genuinely difficult task to find the right consumer at the right time with the right proposition. Over time, the industry has evolved to accommodate this through a patchwork of publishers, panels, and routers. This, however, gets expensive. And the more expensive it gets, the less money there is to pay respondents. Further, each time a new service layer gets added we introduce opacity and uncertainty into the process. In many markets, opacity and uncertainty in the supply chain are harmless. When I buy a pair of sneakers, if they look and feel and fit like the sneakers I wanted then the supply chain that brought them to me is unimportant. (Well, beyond labour laws and working conditions in developing nations, that is.) In market research and other data-centric industries, however, the provenance of the product is the product. The less transparency and certainty we have around the supply chain, the less useful and valuable the data.

As such, the particular type of coordination problem we have here is one where transparency is primary. It turns out — as Gladwell would say — blockchains have a set of properties that make them particularly good at solving these types of problems.

Requirements and Goals

[Figure: An early whiteboard sketch of the system]

We want a system that allows for the following:

- Consumers can contribute data through surveys or by connecting data sources such as health and location data.

- Researchers and other data buyers can make requests for particular types of data from particular types of consumers.

- A bunch of software can facilitate transactions in a way that protects the privacy of consumers, incentivizes veracity, and guarantees non-privacy-destructive transparency.

When these tasks are executed today there are particular ways in which consumers and researchers place trust in the combination of intermediaries — routers, panels, and publishers — that orchestrate the transaction. Consumers trust intermediaries to keep their profile data safe and to pay them appropriately for their participation. Researchers trust intermediaries to select a randomized set of consumers who conform to the demographic and behavioral profile they have specified. Importantly, the behavioral profile of consumers encompasses behaviors related to the research process itself — frequency of participating in research, tenure on a panel, etc. (The ESOMAR 28 enumerates many of these things.)

For a blockchain-based alternative to be useful it needs to be able to make similar assurances. Specifically, the goal is that consumers and researchers can “trust the protocol” in the same way that they trust their sample provider today. This does not mean that the need for intermediaries disappears — in fact in our design the protocol makes explicit affordances for them — it simply means that consumers and researchers, whether transacting directly or via an intermediary, can be satisfied that commitments around such things as privacy, compensation, and sampling rigor are upheld without needing to trust any particular entity.

As we’ll show in subsequent posts, the implementation of these assurances tends to be varied and idiosyncratic. Sometimes the transparency gained by putting transaction records on a blockchain is sufficient. This is true for such things as frequency of participation and tenure on the network. Other times it entails obviating the requirement in the first place. How do we provide assurances to consumers that their profile data is kept private? If we remove the need to send that data to intermediaries in the first place then concerns around privacy fall away.
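As a rough illustration of the first case, frequency of participation and tenure can be derived directly from a public list of completion records. The record shape (a respondent identifier plus a completion date) and the definition of tenure as days since the first recorded completion are assumptions made for this sketch.

```python
from collections import defaultdict
from datetime import date


def participation_summary(records, as_of):
    """Summarize completions and tenure per respondent from ledger-style records.

    records: iterable of (respondent_id, completed_on) pairs (assumed shape).
    as_of:   the date on which tenure is measured.
    """
    dates_by_respondent = defaultdict(list)
    for respondent_id, completed_on in records:
        dates_by_respondent[respondent_id].append(completed_on)

    return {
        respondent_id: {
            "completions": len(dates),
            # Tenure is counted from the earliest recorded completion.
            "tenure_days": (as_of - min(dates)).days,
        }
        for respondent_id, dates in dates_by_respondent.items()
    }


records = [
    ("r1", date(2018, 1, 10)),
    ("r1", date(2018, 3, 1)),
    ("r2", date(2018, 2, 20)),
]

summary = participation_summary(records, as_of=date(2018, 4, 1))
```

Because the inputs are the transaction records themselves, anyone with read access can recompute these figures rather than taking a provider's word for them.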

To be clear, our intention is not simply to replicate the status quo on a blockchain; it is to improve on it in meaningful ways. Beyond the basic plumbing of profiling, sampling, and payments, our goals for the system include:

- strong, objectively verifiable sampling guarantees for researchers;
- strong, objectively verifiable privacy guarantees for respondents;
- increased data quality via profile verification, reputation tracking, and economic incentives;
- reduced redundant questioning and increased data reuse via shared demographic taxonomies;
- reduced costs for researchers; and
- more, and more equitable, compensation for respondents.

This is, without doubt, a tall order and blockchain technology, in general, is no panacea. However, this particular use case is so well suited to the strengths (and weaknesses) of the blockchain that we think a step change in price, quality, and participation is eminently possible.

Blockchain Isn’t Magic, Be Skeptical

Having said that, the blockchain isn’t magic. Whether you’re looking at Measure or any other blockchain project, you would be right to default to skepticism. It’s very easy to use words like privacy and transparency but cashing those concepts out in the real world in a useful way is another matter entirely.

The blockchain, in a way, is just a very slow, shared database that can’t be modified. When we see something written to a blockchain, the only things we know for sure are who wrote it and at what time. We know nothing, inherently, about the veracity of what was written. For each claim of transparency that we hear from a blockchain project we need to ask: what makes this disclosure accurate? If I write to a blockchain that I have successfully selected a randomized group of respondents for your study, is there enough supporting data on the blockchain to verify that claim? Is the supporting data itself reliable? Alternatively, is there a third party entity who can confirm the claim? If so, are they reliable?
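One way a sampling claim could be made checkable is to make the selection deterministic given public inputs, so that anyone can re-run it. The sketch below ranks eligible respondents by a SHA-256 hash of a public seed and their identifier and takes the first n. This is an illustrative technique, not a description of Measure's design; the seed value and identifiers are placeholders, and the check is only as good as the seed and the eligible list, which is exactly the reliability question raised above.

```python
import hashlib


def select_sample(seed, eligible_ids, n):
    """Deterministically pick n respondents from the eligible set.

    Each id is ranked by SHA-256(seed + ":" + id); the n lowest ranks win.
    The result does not depend on the order in which ids are supplied.
    """
    def rank(respondent_id):
        payload = f"{seed}:{respondent_id}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    return sorted(set(eligible_ids), key=rank)[:n]


def verify_sample(seed, eligible_ids, n, claimed_sample):
    """Re-run the selection and compare it with the published claim."""
    return select_sample(seed, eligible_ids, n) == list(claimed_sample)


eligible = ["r1", "r2", "r3", "r4", "r5", "r6"]
seed = "public-seed-placeholder"  # hypothetical, e.g. a value fixed before the study

sample = select_sample(seed, eligible, 3)
claim_holds = verify_sample(seed, eligible, 3, sample)
```

A verifier who trusts the seed and the eligible list can confirm the claim without trusting whoever ran the selection. If either input can be manipulated, the check proves very little.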

This is not to say that these things are not solvable. We wouldn’t be writing this were that the case. Just that — to butcher a phrase — on the blockchain, sometimes even simple claims require extraordinary evidence.

Blockchain Is a Long Game

Finally, I want to ward off a common first reaction to blockchain projects. We are very clear-eyed about how early we are in the evolution and adoption of this technology: most people don’t own cryptocurrency; acquiring it today is cumbersome and complicated; the cryptocurrency markets are volatile and unpredictable; and public blockchains are weird, slow, and expensive. This is all true.

What’s also true is that there is an army of intensely motivated people working really hard on all aspects of these problems — from software engineers to user interface designers to legislators. What’s unthinkable today can be commonplace in a short span of time. Bitcoin, the cryptocurrency that gave birth to the idea of blockchains, is just a decade old. It may not be next quarter or even next year, but mainstream adoption of blockchain technology is coming.

And, frankly, we may not even know it when we see it. It’s increasingly possible to build a blockchain-powered solution without exposing the blood and guts of blockchains and cryptocurrencies to end users. In fact, that is our plan. Neither consumers nor researchers need know anything about blockchains or cryptocurrencies to participate fully in Measure. As the ecosystem evolves and awareness of the underlying technology grows, users will be able to take advantage of the self-sovereign nature of blockchain networks and move currency and data around without requiring the permission or participation of any commercial entity. However, none of this need detract from or complicate everyday mainstream usage.

Well, that’s an introduction and then some. We’ll try to keep things a little more bite-sized in the future.

In the next post we’ll look at the ethical and economic considerations around the collection and usage of person-based data. After that, we’ll dive into the grubby details of designing a protocol.

Comments and questions are always welcome. Drop them below or find us on Twitter at @johnm or @measureprotocol.

Original article published on Medium