Realigning Online Incentives

Originally published on Hacker Noon. The original had room for improvement,[1] so I added a bunch of evidence and revised my points. The initial version was published with help from Nir Kabessa and Professor Dan Rubenstein.

When Larry Page and Sergey Brin received funding for Google in 1998, they had to figure out how their academic project could make money. Initially the two were against advertising, worried it would incentivize displaying paid results over quality searches. In a 1998 paper they even argued, “we believe the issue of advertising causes enough mixed incentives that it is crucial to have a competitive search engine that is transparent and in the academic realm.” But later that year, Page and Brin decided to experiment with small ads on their platform.

Fast forward to 2019 and Google holds over a third of the digital advertising market, followed by Facebook with over 20%. But there is a fundamental problem with the advertising business model: the mixed incentives that Page and Brin identified when starting Google. Users want useful services and quality content, but advertising platforms want users to view ads. This misalignment shows up seemingly everywhere online, but many projects are now trying to build solutions that realign incentives around user data and content creation.

Attention and Data Extraction

Platforms like Facebook, YouTube, and Instagram want to maximize ad revenue, so they try to maximize user attention and user data: more attention so their ad space is worth more, and more data to improve the machine learning algorithms that generate more attention. But while more attention does lead to more ad revenue, it does not mean that users like the content shown in their algorithmic news feeds and recommendation sidebars. These machine learning algorithms have revealed that humans are naturally distracted by sensationalist and provocative content. Mark Zuckerberg acknowledged this problem on Facebook (after activists like Tristan Harris started raising awareness about it); referring to this content, Zuckerberg wrote, “people will engage with it more on average — even when they tell us afterwards they don’t like the content." Zuck recognizes that people do not necessarily like the content that grabs their attention. The fundamental issue with these algorithms is that they rate content using proxy metrics, most notably time spent viewing it. They are great at maximizing those metrics, yet they do so at a cost to the user experience.

Additionally, the data these sites hoard makes them prime targets for hacks. All Facebook data is hosted on Facebook's servers, which is what made it so easy for Cambridge Analytica to access millions of users' data, or for a single hacker to access over 100 million credit cards. These events are rare but effectively unpredictable. Over time the security of these companies should improve, making hacks rarer and rarer, but the amount of data these organizations stockpile will also grow. As a result, the hacks that do happen will get bigger and bigger.

These hacks should be seen as white swan events: events that are inevitable but blamed on poor human judgment. They have been, and will continue to be, blamed on small flaws that were overlooked. Instead, we should recognize the inherent fragility of storing everyone's data together, a design flaw big tech has so far failed to address. And why should Facebook, Google, and the other big tech companies store data more robustly? For them, the current situation is nearly perfect: they alone capture the upside of user data in their algorithms, while bearing little of the downside risk of a breach. When a hack happens, it is the users' data, not the companies', that is leaked.

Building Decentralized Alternatives

Mistrust in Facebook, and in social media generally, grew sharply after the Cambridge Analytica scandal. Many joined the 'Delete Facebook' movement, but because users had no viable alternatives, it was largely unsuccessful. Now many are calling to break up big tech companies, including Facebook co-founder Chris Hughes. Breaking up big tech may create more competition in the short term, but it is unlikely to touch the underlying problem: the resulting platforms would still maximize user data and attention to increase the value of their advertising space.

Instead, the solution should be to realign incentives online by creating decentralized alternatives to traditional networks. Why do these alternatives (likely) need to be decentralized? As Chris Dixon notes in his article ‘Why Decentralization Matters,’ a fundamental issue with centralized platforms is that once they saturate their market, the only way for them to keep growing is to extract more from their existing users. Centralized platforms inevitably become zero-sum, while decentralized protocols remain positive-sum: anyone can keep building services on top of them without fear that their access will be revoked, and anyone can create content without fear that it will be censored. This is why realigning online incentives likely requires decentralized protocols.

Data

Marc Andreessen is among many who believe that blockchain technology could help realign incentives between platforms and users. For one, blockchains allow for digital ownership of data. Instead of letting centralized platforms control their data, users could control and store it themselves. Rather than hacking a single server that holds data for millions of users, hackers would need to individually compromise millions of people to get the same amount of data, which is a much more robust way to store it. Several projects are building these decentralized storage lockers; 3Box, Solid[2], and Blockstack are three of the biggest that you can use right now (with minimal functionality).
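To make this concrete, here is a minimal sketch of the idea, assuming only that data is encrypted on the user's device with a key the user alone holds. The record shape and function names are my own illustration, not the actual API of 3Box, Solid, or Blockstack:

```typescript
import { randomBytes, createCipheriv, createDecipheriv } from "crypto";

// The user's key lives on their device; the storage provider never sees it.
const userKey = randomBytes(32);

interface EncryptedRecord {
  iv: Buffer;
  data: Buffer;
  tag: Buffer;
}

// Encrypt a piece of profile data before it ever leaves the user's device.
function encryptRecord(plaintext: string): EncryptedRecord {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", userKey, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, data, tag: cipher.getAuthTag() };
}

// Only the key holder can read the record back out of the locker.
function decryptRecord(record: EncryptedRecord): string {
  const decipher = createDecipheriv("aes-256-gcm", userKey, record.iv);
  decipher.setAuthTag(record.tag);
  return Buffer.concat([decipher.update(record.data), decipher.final()]).toString("utf8");
}

// The "locker" (a Solid pod, Blockstack storage, or any dumb host) only
// ever stores this opaque blob.
const blob = encryptRecord(JSON.stringify({ interests: ["hiking"] }));
console.log(decryptRecord(blob)); // {"interests":["hiking"]}
```

The point is architectural: the storage locker only ever holds ciphertext, so breaching it yields nothing useful without each individual user's key.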

Blockstack founder Muneeb Ali compares the shift from centralized to personal storage to the shift from mainframes to desktop computers during the 1980s and 1990s. Like the shift to desktops, the shift to personal data likely won't seem like a big deal up front, but it should create new opportunities for users and developers. With user-controlled data, multiple applications could use the same user data in ways that were not previously possible online; imagine if Facebook opened up its backend so that developers could create alternative user experiences on top of the same data. Eventually, users might even be able to sell their data directly to companies that want it, essentially giving users a form of universal basic income.

Monetization

The next key to realigning incentives is letting individual users be rewarded for the data and the content they create online. As Jaron Lanier and E. Glen Weyl discuss in A Blueprint for a Better Digital Society, people's data is enormously valuable to the machine learning algorithms at Facebook, Google, and the other big tech companies. People should be able to sell their data instead of handing it over in exchange for free services. But because individuals alone have no negotiating power, Lanier and Weyl propose an abstract organization called a mediator of individual data (MID). They do not propose any implementation of MIDs, but they do give a few guidelines:

  1. No incentives to abuse member privacy and trust.
  2. Medium-sized. If they are too small, they won't have much bargaining power. If they are too large, they will just become the powerful organizations they were meant to stop.
  3. Data should not be permanently sold; it should only be licensed for defined purposes (see the sketch after this list).
  4. Their benefits should be shared among members.
  5. Individuals need to be able to understand the terms and conditions of the MIDs they join.
  6. Longevity. MIDs cannot realistically last forever, but they should be designed to last longer than a lifetime.
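
Guideline 3 in particular maps naturally onto a data structure. Here is one hypothetical shape such a license could take; the fields are my own invention, since Lanier and Weyl deliberately stay abstract:

```typescript
// Hypothetical shape of a MID data license: never an outright sale,
// only a named purpose for a fixed term.
interface DataLicense {
  memberIds: string[];        // whose data is covered
  licensee: string;           // e.g. a firm training a recommender system
  purpose: string;            // the defined use the members agreed to
  expiresAt: Date;            // the license lapses; ownership never transfers
  memberRevenueShare: number; // fraction of fees paid back to members (guideline 4)
}

// A use is permitted only for the agreed purpose and only until expiry.
function isUseAllowed(license: DataLicense, requestedPurpose: string, now = new Date()): boolean {
  return requestedPurpose === license.purpose && now < license.expiresAt;
}
```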

It will be a challenge to combine personal ownership of data with Lanier and Weyl's idea of a MID. And a successful solution for directly monetizing user data may neither preserve personal ownership of data nor satisfy Lanier and Weyl's constraints.

In addition to rewarding individuals for their data, content creators should be rewarded for the value they bring to social media platforms. Many projects aim to reward users for their content, including Steemit, DLive, Voice, and SocialX. In many of these protocols, rewards are distributed based on users' votes, so for the distribution to be fair, the votes need to accurately reflect what everyone thinks. Unfortunately, there is no consensus on subjective opinions. If there were, there would be some election system that left everyone happy with its outcome; imagine if everyone was happy with the result of a presidential election. Arrow's impossibility theorem rules out even a weaker version of this: no ranked voting system can satisfy a small set of basic fairness criteria at once. Inevitably, some people will not agree with the outcome of a subjective vote. Stated simply, there is no one-size-fits-all subjective voting system.
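To see what "rewards distributed based on users' votes" means mechanically, here is a stripped-down sketch of a vote-weighted reward pool (my own simplification, not Steemit's or anyone else's actual payout algorithm). Note that weighting votes this way, or any other way, already encodes one contestable notion of what the community values:

```typescript
type Vote = { postId: string; weight: number };

// Split a fixed reward pool among posts in proportion to the total
// voting weight each one received.
function splitRewardPool(pool: number, votes: Vote[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const v of votes) {
    totals.set(v.postId, (totals.get(v.postId) ?? 0) + v.weight);
  }
  const grandTotal = [...totals.values()].reduce((a, b) => a + b, 0);
  const rewards = new Map<string, number>();
  if (grandTotal === 0) return rewards; // no votes, no payouts
  for (const [postId, total] of totals) {
    rewards.set(postId, pool * (total / grandTotal));
  }
  return rewards;
}

// Three equal votes: Alice's post earns 2/3 of the pool.
console.log(splitRewardPool(300, [
  { postId: "alice-1", weight: 1 },
  { postId: "alice-1", weight: 1 },
  { postId: "bob-1", weight: 1 },
])); // Map { 'alice-1' => 200, 'bob-1' => 100 }
```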

A related problem with subjective voting systems: how can you stop identities from colluding? First you would need to detect collusion, but how can you tell a vote for a friend that is collusion apart from one that is not? Say Bob is friends with Alice offline, so he votes on most of her content. Maybe Bob votes for Alice because he thinks her content is awesome, or maybe he votes for her content because they are friends offline and Alice votes for his in return. Because we cannot see into Bob's mind, it is impossible to know whether he truly likes Alice's content or just likes Alice.[3] Vitalik Buterin has also written a great piece on collusion that covers this issue with subjective voting protocols in far more depth.
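The obvious detection heuristic, flagging pairs of accounts that reliably vote for each other, illustrates the dead end. A toy version (hypothetical, not any protocol's real detector):

```typescript
type VoteEdge = { voter: string; author: string };

// Flag pairs of accounts that vote for each other at least `threshold`
// times in each direction.
function reciprocalPairs(votes: VoteEdge[], threshold: number): [string, string][] {
  const counts = new Map<string, number>();
  for (const { voter, author } of votes) {
    const key = `${voter}->${author}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  const flagged: [string, string][] = [];
  for (const [key, n] of counts) {
    const [a, b] = key.split("->");
    const back = counts.get(`${b}->${a}`) ?? 0;
    if (a < b && n >= threshold && back >= threshold) {
      flagged.push([a, b]); // report each pair once
    }
  }
  return flagged;
}
```

The flagged list will contain genuine mutual fans and colluders alike; the voting graph alone cannot separate them.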

More practical than subjective reward protocols is decentralized access control, and Unlock Protocol is building such a system. If anyone can easily set up a paywall around their content, then more content creators will be able to charge for premium content. While letting anyone easily set up a paywall is a step in the right direction, it does not seem like a suitable substitute for social media platforms, because there is no social layer: it is just one content creator monetizing their work in isolation. I believe next-generation social media should ideally let a whole community of creators both support each other and make money from their content.
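At its core, this kind of access control is a simple membership check. Here is a toy version of the pattern (not Unlock Protocol's actual API; the lock and key names are just illustrative):

```typescript
// A "lock" guards some premium content; a "key" is what a reader buys.
interface Key {
  owner: string;   // the reader's address
  lockId: string;  // which lock this key opens
  expiresAt: Date; // keys can be time-limited
}

// Any frontend can run this check; no central platform sits in the middle.
function hasAccess(keys: Key[], reader: string, lockId: string, now = new Date()): boolean {
  return keys.some(k => k.owner === reader && k.lockId === lockId && now < k.expiresAt);
}
```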

The Future of Web3

Decentralized storage and monetization protocols have the potential to solve the problems with today's centralized internet companies. They also have the potential to increase the rate of innovation online. As previously stated, centralized platforms inevitably become zero-sum, while decentralized protocols remain positive-sum: developers can keep building on decentralized systems indefinitely, which their centralized alternatives cannot guarantee.

If decentralized platforms do outcompete their centralized alternatives, a good parallel may be how desktop computers disrupted the computer industry in the 1980s. As previously stated, Muneeb Ali compares digital ownership of data to desktop computers, and Marc Andreessen has similarly compared the development of Bitcoin to that of desktops. When everyone was using mainframes in the 1960s and 1970s, IBM was the biggest computer company. Then IBM fell behind when it was slow to move into desktops: building them would have taken sizeable time and money, and it would not have made sense for IBM to lose money in the short term on an unproven product. Then Apple came along and built one of the first successful desktops in 1976. Desktops ultimately succeeded by giving users more control, much as decentralized networks could give users control over their data and content. But unlike Apple, which captures the value of the computers it sells, the value captured by decentralized systems will be, well, decentralized. Successful decentralized systems will benefit everyone in the network, not just the company that created them.


  1. It's really bad. ↩︎

  2. Solid doesn't actually use a blockchain, but it's building the same kind of technology, so I mention it. ↩︎

  3. It’s even impossible to prove our own motives. Maybe with a better understanding of the brain we will someday be able to analyze all the information someone uses to make a decision, but as Kevin Simler and Robin Hanson explain in The Elephant in the Brain, consciousness often only rationalizes what we do. ↩︎