An idea of decentralized search for web3
Web3 [GAV] is coming. It can shift the existing web's ubiquitous client-server architecture to truly peer-to-peer interactions based on stateless IPFS and a variety of stateful consensus computers such as Ethereum.
In the following series of posts, I would love to discuss the opportunities behind a consensus search engine and the challenges of crawling, indexing, and evaluation for the next-generation web, and eventually propose a consensus computer implementation.
But first, let us start the discussion with the disadvantages of general-purpose search engines:
No Transparency
Nobody outside (and probably inside) of Google understands how the ranking really works. This creates a market for black-hat and white-hat SEO. The truth is that if, for example, Google disclosed the complete details of its ranking algorithm, it would be easy for adversaries to game organic search results, which would kill both the quality of results and ad revenue streams. Google's approach to Sybil resistance is, in fact, KYC and captcha. With modern technologies this problem can be addressed using an accountable consensus computer with properly designed economic incentives.
No Access
Currently, all search engines are centralized. Nobody outside is able to add to the index or participate in improving the quality of search results. Google itself, however, internally uses a workforce of search evaluators. It is our belief that a user-generated search engine could have a higher quality of results, as has been the story with almost every website in existence.
Broken Incentives
The vast majority of the contribution to search quality is made by users. When a user searches for something, she extends the semantic core. When a user clicks on search results, she trains a model. This creates an opportunity to continuously improve the ranking model at the expense of users. Search engines then sell users to advertisers, harming the user experience, and acquire revenue streams that are not returned to users at all. This simple loop created Alphabet's ~$780 billion capitalization (~$100 per capita on Earth) in 20 years. We need to change that.
Central Control
Google has become too powerful. It is scary to imagine a future where everything about everybody is known and controlled by a closed AI corporation. Imagine a world where (1) only one country exists, (2) nobody can control its government, and (3) everybody must obey the government's decisions without any explanation. There should be an open, transparent, and accessible alternative with decentralized control, built on the principles of a modern distributed, interplanetary, content-addressable cyberspace [HUAN] and DAO-like governance [RALF].
Annoying Ads
Separation of organic and ad search results is unnecessary. In fact, all organic ranking decisions are made by the search authority. But for paid search Google uses a free-market solution to determine a fair ad price for every word in its gigantic semantic core. Historically, free-market solutions have proven to be more efficient in virtually every area of decision making. Why not use the same principle for the ranking itself, disintermediating annoying ads? Let us imagine that (1) every link can be submitted by anybody, (2) some rank is computed based on these actions, and (3) those who predict which content will have a higher rank in the future benefit more. This non-zero-sum game is significantly more Sybil-resistant, hence that is where we are heading.
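To make the three steps a bit more tangible, here is a toy sketch in Python. Everything in it is an assumption made for illustration, not a proposed design: the `LinkMarket` class, the rule that rank equals total stake, and the payout weighting that favors early backers are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LinkMarket:
    """Toy market for a single search query (hypothetical, for illustration only)."""

    # link -> list of (agent, stake) in the order the stakes were submitted
    stakes: dict = field(default_factory=lambda: defaultdict(list))

    def submit(self, agent: str, link: str, stake: float) -> None:
        """Step 1: anybody can submit a link by putting some stake behind it."""
        if stake <= 0:
            raise ValueError("stake must be positive")
        self.stakes[link].append((agent, stake))

    def rank(self) -> list:
        """Step 2: rank links by total stake, highest first (assumed rule)."""
        totals = {
            link: sum(stake for _, stake in entries)
            for link, entries in self.stakes.items()
        }
        return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    def payouts(self, reward_pool: float) -> dict:
        """Step 3: share a reward pool so that earlier, correct predictions earn more.

        Assumed weighting: stake * (final total on the link / total on the
        link right after this stake was added).
        """
        weights = defaultdict(float)
        for entries in self.stakes.values():
            final_total = sum(stake for _, stake in entries)
            running = 0.0
            for agent, stake in entries:
                running += stake
                weights[agent] += stake * final_total / running
        total_weight = sum(weights.values())
        return {a: reward_pool * w / total_weight for a, w in weights.items()}


market = LinkMarket()
market.submit("alice", "link-a", 1.0)  # alice backs link-a first
market.submit("bob", "link-a", 1.0)    # bob joins later
market.submit("carol", "link-b", 1.0)
ranking = market.rank()
rewards = market.payouts(100.0)
```

In this run `link-a` outranks `link-b`, and alice, who backed the winning link before anyone else, receives a larger share of the pool than bob or carol. A real system would also need to price stake and resist collusion, which this sketch ignores.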
One-Way Trust
Everybody is used to trusting Google, Baidu, and Yandex. But Google, Baidu, and Yandex don't trust users. You cannot know what happens inside Google, Baidu, and Yandex because they don't trust us. But they know everything that happens inside us. We want to establish a system where trust is bidirectional between the search engine and its agents, because ownership of the search engine is distributed across all its agents, on whose actions all ranking decisions are based.
Zero Privacy
All search engines will answer you only if they explicitly know how to map your device to your real identity or to a pseudo-identity tracked by the real-time bidding ecosystem. Otherwise, you have to prove that you are not a robot every time you search. That harms our privacy. That harms our experience. That basically sucks. Real privacy is very expensive in consensus computers at this stage of development. But we need to find workarounds.
Censorship
Content is censored.
Online Only
It is worth noting that you cannot search offline even if the necessary information is stored next door. If we are cut off from the wire or the backbone, we are powerless. Global offline search is not a feature that can be easily deployed even by a multi-billion-dollar corporation. This goal is nearly impossible to achieve with a centralized architecture using the TCP/IP-DNS-HTTP stack. Only content-addressable distributed systems can solve this fundamental problem for the next-generation Internet. This future is not about gatekeepers in the form of ISPs but about mesh networking and peer-to-peer communications built with interplanetary scale in mind.
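Why content addressing helps here can be shown with a minimal Python sketch. It is a simplification under stated assumptions: the address is a plain SHA-256 hex digest rather than the multihash-based identifiers IPFS actually uses, and `Peer`, `address_of`, and `fetch` are hypothetical names invented for this example.

```python
import hashlib
from typing import Iterable, Optional


def address_of(content: bytes) -> str:
    """Derive the address from the content itself (simplified: SHA-256 hex)."""
    return hashlib.sha256(content).hexdigest()


class Peer:
    """A node that stores blocks keyed by their content address."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks = {}

    def put(self, content: bytes) -> str:
        addr = address_of(content)
        self.blocks[addr] = content
        return addr

    def get(self, addr: str) -> Optional[bytes]:
        return self.blocks.get(addr)


def fetch(addr: str, peers: Iterable[Peer]) -> Optional[bytes]:
    """Ask whichever peers are reachable; accept only content matching the address."""
    for peer in peers:
        content = peer.get(addr)
        if content is not None and address_of(content) == addr:
            return content  # verified locally, no trusted server or DNS needed
    return None


# The neighbor next door holds the data; the backbone is not involved at all.
me, neighbor, liar = Peer("me"), Peer("neighbor"), Peer("liar")
addr = neighbor.put(b"the information stored next door")
liar.blocks[addr] = b"tampered copy"  # a dishonest peer cannot pass verification
found = fetch(addr, [me, liar, neighbor])
```

Because the address is derived from the content, it does not matter which peer answers or how the request reached it: any copy that hashes to the requested address is as good as any other, which is what makes search over a local mesh conceivable.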
Weak Security
What happens if my Google account is blocked tomorrow? Do we have anything to prevent this? Do we have the necessary level of assurance, one that guarantees our security based on math and not on complicated legal terms? All the technical solutions are here, but solving this important issue needs some effort from everybody.
Because security and privacy are the foundation for life, liberty, and property.
Principles of a Better Google
That is a pretty huge number of problems to fix. Thus we declare the principles of a general-purpose decentralized and distributed search engine for the upcoming age:
- Privacy and Security. That's it.
- Ubiquitous Ownership and Access. Everybody should have the right to possess a piece of it.
- Mesh-Network Future-Proof. It should work in every connected environment.
- Interplanetary Scale. It should work on Earth and on Mars.
- Tolerant. In the era of machine learning, it should work for any kind of thinking beasts.
- Open and Accessible. Everybody should be able to contribute to the quality of search results.
- Blockchain Agnostic. The foundations of its design should not rely on any particular protocol or stack but rather be explicitly derived from the nature of information itself.
- Beautiful. The economic model should not harm the user experience.
- Transparent and Trustworthy. Every piece of its reasoning and behavior must be auditable by everybody.
- No Single Point of Failure. Nobody should hold a single key that can modify or change it.
- Sybil-Attack Resistant. This resistance should be derived from the properties of a free market and not from some single authority.
- Intelligent. It should answer natural questions with easy-to-read and provable answers, no matter whether text, media, or natural numbers are involved in the answer.
References
- [GAV] - Gavin Wood, ĐApps: What Web 3.0 Looks Like
- [HUAN] - Juan Benet, IPFS - Content Addressed, Versioned, P2P File System
- [RALF] - Ralph Merkle, DAOs, Democracy and Governance
Feedback
Help us test cyb, our web3 browser.
Follow the cyber•Congress blog to get updates on the whole web3 movement.
Signature
This content is signed by xhipster.eth on November 30, 2018
Updates
- Fixed typos and grammar
Are there any projects doing this, or are you yourself working on something of this nature? Great read btw, your content really stands out.
Thanks
Yep, couple of years