Data Thesis

Overcoming Data Homogeny

For as much value as our Web 2 overlords extract from data, they undoubtedly leave much more on the table. Stringent regulation aimed at safeguarding personal privacy allows companies to do less and less with your data. And with even the pro-business right wing staunchly against the biggest businesses of our age, expect corporate data moats to get weaker. The *making the world a better place* ethos of 2015 tech is truly dead, for better or worse.

We think better, because giving people the ability to monetize their own data could drive the biggest social transformation since the New Deal, as cryptoeconomic platforms allow people to unlock hundreds of billions of dollars by enriching their own data with context. The Web 3 revolution displacing significant chunks of Web 2 will allow common people to capture the reward for the value their data adds.

History shows that corporate hegemony will extract every drop out of common people that it can, labor being the clearest example. Even fair pay for work is a recent, Western development: for hundreds of years, serfs and slaves were denied the reward for their labor. These contributions became harder to ignore when combined with technological progress (agricultural technology, cars, and automation) that amplifies a worker’s contribution and makes it measurable. Physical labor is the clearest human value-add to quantify: Americans make on average ~$42k for it.

If the common person could distribute and contextualize their own data, they’d supplement their income significantly. But how much is that data worth? One simple approach is to look at the advertising revenue generated by America’s biggest tech companies – they capture the lion’s share of our personal data across every aspect of our physical and digital lives.

Ad revenue is the toll big tech charges for access to its captive consumer base’s data. The toll is variable and changes depending on how profitable the tech behemoth’s customer is. For small businesses, this ad spend is their customer acquisition cost (CAC), and businesses that buy ads must make sure they’re earning enough lifetime value (LTV) from their customers to justify their CAC. By enriching data, Web 2 behemoths can price discriminate more effectively, selling ads to the highest bidder vs. selling generic ads. Big tech companies are obsessed with finding roundabout ways of adding context to their generic data, especially today, given that cookies (small browser-stored identifiers that track you across the internet) are being phased out.
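The LTV-versus-CAC test an advertiser runs before paying the toll can be sketched in a few lines. The function name and the dollar figures here are hypothetical, chosen only to show why a better-targeted (and more expensive) ad can still be the better buy:

```python
def ad_spend_is_justified(ltv: float, cac: float) -> bool:
    """Return True when a customer's expected lifetime value covers
    the cost of acquiring them."""
    return ltv >= cac


# Hypothetical figures for illustration only, not taken from any real advertiser.
generic_ad = {"cac": 50.0, "ltv": 40.0}     # cheap toll, poorly matched customer
targeted_ad = {"cac": 80.0, "ltv": 200.0}   # higher toll, well matched customer

for name, ad in (("generic", generic_ad), ("targeted", targeted_ad)):
    verdict = "worth it" if ad_spend_is_justified(ad["ltv"], ad["cac"]) else "not worth it"
    print(f"{name} ad: CAC ${ad['cac']:.0f}, LTV ${ad['ltv']:.0f} -> {verdict}")
```

Under these assumed numbers the platform can charge the targeted advertiser a higher toll and still leave them ahead, which is the price discrimination the paragraph describes.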

Without rich context, advertising is the easiest way to monetize this partial data and it’s the only standardized system today with a clear ROI. Let’s see how much these companies earn off their users:

We can comfortably make the blanket assumption that Facebook’s numbers for daily active users for the US (195mm) are representative of the US population taking part in the data economy covered by the companies above. That’s ~60% of Americans, which makes sense when you consider ~50mm Americans are under 11 years old and ~80mm are over 60:

Doing the math, each US user is monetized to the tune of ~$1,000 by six companies, with advertising the only monetization avenue. Tech behemoths today are creating $200bn of US advertising revenue off 200mm users, using frameworks requiring them to be extremely sneaky in capturing your data, and severely limited in how much they can enrich that data with context.
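For readers who want to check the back-of-the-envelope math, here is a minimal sketch. The revenue and user figures are the rounded ones quoted in the text; the US population of roughly 330mm is an assumption added here and is not stated in the article:

```python
US_AD_REVENUE_USD = 200e9      # ~$200bn of US advertising revenue (from the text)
US_USERS_ROUNDED = 200e6       # ~200mm users (the text's rounded figure)
FACEBOOK_US_DAU = 195e6        # 195mm daily active users (from the text)
US_POPULATION = 330e6          # assumption: approximate US population

revenue_per_user = US_AD_REVENUE_USD / US_USERS_ROUNDED
share_of_population = FACEBOOK_US_DAU / US_POPULATION

print(f"Ad revenue per user: ${revenue_per_user:,.0f}")        # $1,000
print(f"Share of US population: {share_of_population:.0%}")    # 59%
```

With these inputs the per-user figure comes out at $1,000 and the user base at about 59% of the population, consistent with the ~$1,000 and ~60% quoted above.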

The scale achieved by the digital advertising industry is incredible; however, growth expectations are tempering in the face of increasing barriers to data capture, led by the EU but likely to eventually leak towards the US. Headwinds like these, as well as the increasingly privacy-focused stance of Apple, have led to core changes in the direction many of these companies are taking, led by Meta’s radical pivot towards the digital asset world.

So then what of Facebook’s metaverse? Well, in a world where everything you see/touch/interact with is optimized for you, platforms can price discriminate massively because they can direct traffic to the highest value advertiser and charge a massive toll.

Crypto’s vision is a public metaverse (i.e., one that doesn’t need to earn excess rents). Instead of *price discriminating for toll booth access*, the applicable mental model is *entropy in a chemical system*. The more composable and recent data is, the more chemical reactions it can spark (i.e., interactions where insight from high-context data allows 1+1 = 3). We provide further examples of this below.

So how do you enrich data? Provide context. There are three simple ways you can empower data:

  • Proximity: Data is more valuable in the hands of someone who values it more. Your healthcare data is extremely valuable to your doctor, moderately valuable to an advertiser, and completely useless to your plumber. By getting data into the hands of the person able to make the most out of it, you enrich the data’s monetary value.
  • Composability: Data that can be contributed neatly as a piece of larger supersets, and easily leveraged by others to derive key insights, is very valuable. This underlies many of the DataDAO approaches taking off recently – millions of sets of Amazon, Facebook, and Google data can be aggregated to: predict the onset of severe diseases, power massive hedge funds to invest behind consumer behavior, or even understand the development of society’s entertainment preferences. Supersetting data breeds insights that many will pay a premium for.
  • Recency: Data is more valuable the more recent it is, and the fewer people have seen it. This is more obvious for certain forms of data than others: a sales lead for a paper salesman in Scranton is far more valuable ten seconds after it’s delivered than ten days after, an inflation reading is more valuable to JP Morgan’s fixed income department if they see it before anyone else, and people are happier to get a warning of a meteor headed for their city sooner rather than later. There’s a premium for getting relevant data to people who need it faster.

Aided by self-empowering technology, we can give data appropriate context by getting it into the right hands, at the right time, in the right format. Many Web 3 platforms are working towards this end goal, and while approaches differ radically, they center around the theme of empowering people to capture their own data in a composable and immediate format, and eventually connecting them with a proximate user of that data. Two contrasting blockchain-native approaches that may unlock this trapped value at scale are Machine Networks and Data Funds.

MachineFi is a good example of a Machine Network. MachineFi is a developer platform built on the IoTeX blockchain that empowers the creation of autonomous machine networks. These are networks of connected IoT devices (think Fitbits, sensors, phones) that relay real-time data about their subject, which is captured on IoTeX’s secure blockchain. Developers can spin up different applications to reward participants for their data in real time. As a simple example, Aetna could use MachineFi to spin up a network that pays people for contributing their health data from their Fitbits: with context-rich data, Aetna can better price healthcare across all patients, but can also offer you better pricing individually if your on-chain data is linked to an identity layer. Here, 1+1=3. The result is more dollars in your pocket from providing data to the most proximate counterparty and savings derived from analysis of composable data. MachineFi traffics in real-world data and monetizes that data for users.

Delphia is an investment fund with an attached DAO that rewards users directly, in tokens, for contributing their Amazon, Facebook, Google, and other data to the DAO. That’s immediate value to the user, and it is compounded by Delphia’s strategy, which uses that data to power a robo-advisor that invests with the benefit of that data, with some winnings distributed to the providers of the data. Delphia benefits from the composable nature of the data provided (they can superset Amazon data for unique insights) and from the recency of that data (real-time data allows enhanced portfolio management capabilities). Again, 1+1=3. Delphia empowers users to make more out of data already being collected and unlocks significant additional value through providing context.

In both cases, the revolution is enabled by blockchains, which allow new forms of coordination and automation. Incentive systems embedded into platforms reward contributed data in real time, increasing the recency of data. Combined with machines and standardization mechanisms, automated data collection creates similar-looking datasets that enable composability. Finally, the organization of standardized data in a single, easy-to-access place allows people to get that data into the hands of its most proximate user. This system has significantly more to contribute to enriching data than advertiser cookies sneakily placed on the internet.

Data is money, and money is power, so finding ways to give people ownership of their own data could be one of the great equalizers of the 21st century. An illustrative 10x increase in the value of data through providing detailed context (vs. advertising) would be a game-changer for the average American. $10,000 of annual income from data sales could be the new basis for a universal basic income and would drag tens of millions of Americans back above the poverty line. You could drive even more radical insights: the value of context-rich data could imply that, if hardware costs continue to deflate, it may make policy sense to subsidize buying smartphones/sensors for homeless people. That insight then empowers governments to spend aggressively to build smart cities: data collection can juice the ROI of national infrastructure projects significantly, creating a virtuous cycle of investment driven by the insight provided by new data capabilities.
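The illustrative 10x scenario is simple multiplication, sketched below. The per-user figure and the multiplier come from the text; the aggregate total is an extrapolation that assumes, purely for illustration, that the same ~200mm users all capture the full uplift:

```python
AD_VALUE_PER_USER_USD = 1_000   # ~$1,000 per US user today (from the text)
CONTEXT_MULTIPLIER = 10         # the text's illustrative 10x uplift
US_USERS = 200_000_000          # ~200mm users (from the text)

data_income_per_user = AD_VALUE_PER_USER_USD * CONTEXT_MULTIPLIER
# Assumption: every user realizes the full uplift, which is an upper-bound simplification.
aggregate_value = data_income_per_user * US_USERS

print(f"Annual data income per user: ${data_income_per_user:,}")   # $10,000
print(f"Aggregate value: ${aggregate_value / 1e12:.1f}tn")          # $2.0tn
```

The per-user result matches the $10,000 quoted above; the aggregate is only as good as the 10x assumption behind it.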

Investing in businesses that increase the context around data has the rare quality of benefitting from increasing returns to scale, and we believe the adoption and normalization of cryptoeconomic incentives will unlock a whole new breed of businesses that get better as they get bigger. As the world scours for crypto use cases with real-world utility, expect data-empowering use cases to take a central role.
