Web of Trust

This is a relatively long text. No cool pictures. The only reason you are considering reading it, is because you trust the person that gave you the link not to feed you trash.

That person went through a similar story.

And here we are - me, the author, and you, somebody who trusts me (possibly through a few hops) with his most valuable asset - his time.

The Idea

I think we desperately need of a working, decentralized, web of trust. We are living globally now, and much too often your trust is based on majority's opinion. Which was a reasonable heuristic in a real world. If most people in your town say that John is trustworthy, then he probably is. On the Internet though, those "most people" can be John too.

Before I continue let me clarify what I mean by the Web of Trust, even though you probably understand it already. The concept is not new at all.

We have some friends. We trust them. Some of them more, some less. Those friends have friends, which we don't necessarily know. But if my best friend tells me that he really trusts person X, I trust X a lot, even though I don't know him. And of course X also has some friends. Thanks to the small world effect only a few hops separate you from any person on the planet.

Ain't that fantastic? If you only could query that trust, you could asses your trust to any person on the planet. This includes politicians, car mechanics, the pope, that random guy from ebay, and the dude who tried to sell you some weed. Anybody.

The most crucial thing about this, is that this depends on people who you trust. It's not global. It's different for everybody.

This trust can be span multiple layers. Or it can be multidimensional. I can trust that Bob is an insanely good coder. But I'm not sure you can safely lend him money.

So you would be able to know if this random dude is a good coder. That is assuming you know some great coders yourself and you trust their assessment.

But let me repeat one more time, because that's the most important thing of it all - how good he is, is still your opinion, based on who do you trust. It's not absolute, it's not global.

You could get that info instantly though.

We've tried tons of different kind of ratings globally. It never works. Any global rating can be gamed. And even if it wasn't (but it will be), and if all participants were real human beings (won't be the case), it would still just be opinion of the majority. For many use cases, you can do much better than that.

This is the idea. Let me give you a few examples on why do I think that this could be as important as the Internet itself.

Examples

Software - you can't read all the open source code to verify what you are running. It always drops down to trust.
Product reviews. We have tons of reviews online. And we kind of learned already how to read them. "well that's clearly fake, scroll", but marketing teams are getting better, there are quite smart people working there. For some technical products, we've learned that we should listen to some guy who reviews them, because we think that at this point he probably has more incentive providing trustworthy reviews than trying to push some bad product for money. This doesn't work for all kinds of products though, because sometimes we don't know who's the expert and who can we trust (maybe one of our friends knows?). Plus reviews from many long term users have some advantages over somebody who only spent maybe a day with the product. You could get a long list of reviews, just like you get on Amazon, but you could sort them by how much you trust the author.
Pretty much any recommendations. Restaurants, renovation teams, lawyers, where to stay during your holidays, pubs. That's always how you get the good stuff - you don't check online review, you ask trusted people for recommendations. It's the same thing, just automated.
News. Well that's a whole new level of mess. Quite a few levels actually. There are political news, news that are news but you may not consider them to be newsworthy at all. Technology news. Domain specific news. News is one of the main avenues to get your attention now that you disabled your ads. We try to patch up our lame substition of web of trust - we choose some people who we think are trustworthy and follow them. Or have some niche news sources that were not gamed that much yet. In both cases it's still pretty centralized. It's still the case that a link from a friend is a valuable resource (if you choose your friends wisely that is).
Science. Trusted peer review. Without having to resort to abstract trust entities like journals.
Voting. Elections. Politicians. That's where lack of any trust connection shines. I won't even dive into that rabbit hole. But wouldn't that be pretty if I could vote for somebody I can trust. Really trust.
Is that long article really worth reading? It's currently solved by choosing your sources wisely, but WoT enables a random individual to publish an article that you may consider worth reading.
Have you heard about deep fakes? It's already happening that you cannot trust voice or video recording of somebody. It's gonna happen, nobody will be able to tell. Maybe some specialized AI, until the AI which generates it fixes that bug. How can you trust what you see if you were not there? You can trust it somewhat if your friend A was there, and guy C who is a very good friend of your friend B. In other words, videos, just like a text dialogue, became a thing which you cannot verify. You need to know who to trust.
Starting a coding project, hiring somebody, or making sure you want to work for somebody.
Dependencies - your project can only be as trustworthy and stable as its dependencies (simplification)
Investing, other recommendations, finding somebody to spend time with and what not. It could influence your social life and interaction with starngers, if you would want to.
Better startups funding.

Marketing is everything

That's how the world works today. You even have to be good at marketing to write a popular open source software.

Nice landing pages get VC funding. Developers judge projects by nice looking docs. Security vulnerabilities get logos. There's a reason for that. You have so many choices and you have to base your decision on something. So you learn to use heuristics based on appearance, without much digging. It's understandable. You don't have a time to research background of developers of each library that you are considering to use in your project.

You could make use of what your friends already know. And friends of their friends. It really seems bizarre that we don't do that yet. We have some partial solutions, which I'm gonna explore later. But these solutions are based on the majority's opinion. The majority is driven by marketing, not research.

You

Now if like the idea of Web of Trust, there are plenty of things to consider. It's not easy at all to do it right.

Let me tell you why you're here. I don't think I know how to do it right. I've been thinking about it for some time and I have some ideas. But I'm only smart enough to realize how foolish would I be, if I thought I could do it on my own.

I'm willing to go all-in on this project. It is not a for profit project. I don't think any money can be made if it's done right. I also don't really care if I do it or somebody else (as long as it is truly decentralized). I just want it to exist.

If you got this far, do let me know what you think. Any kind of feedback helps.

The plan

Architecture

We need to decide how the thing would work first. I'm pretty sure that when we get down to the implementation part, Web of Trust is not a software, but a protocol. Obviously we need some decent software to make use of that protocol, and UX in that software will be just as important for the project success as the core architecture is.

Different ways of achieving the goal must be considered. Privacy and security are very important. I'm afraid some trade offs may be necessary. Finding the sweet spot is not easy.

This is where we are now, finding out how it could work. Making sure that's the way we want it to work. That's where I really need your help.

Implementation

Then we need to implement something based on that idea. After deciding the exact protocol. Frankly, this seems like a trivial problem compared to the first one. Making sure it's easy and friendly to use, that's a bit harder.

Deploy

Then we have the network effect problem. The thing is only useful if some people use it and they will only use it if it's useful. We could start with some niche. We could try to bootstrap using some centralized site with similar goals. It's not an easy problem but it's doable, especially if the first part is done right. Doing the first part right would include getting enough people that are excited about this idea working together, which should help.

Some additional difficulties come from it being decentralized. It wouldn't be the first decentralized thing, so we can handle that, the hard part is when there's some disagreement between developers or users about what it all should look like. That's why the first part is very important. There are going to be many use cases. If successful, majority of users will look very different than the small group that currently may be interested in the project. But we need these users.

Basic idea

You assign weights to people you trust.
If you want to know how much can you trust somebody unknown, just ask all your friends. Do a weighted average* on the results using their weights.
If somebody asks you about your trust to somebody, return the weight if you have it assigned, otherwise just repeat the procedure above.

Weights

Why having weights at all? Wouldn't it be easier to just have a binary trust? If we had binary trust, then practically, through some longer or shorter chain you trust anybody in the world. You could analyze trust paths connecting you to extract some more information and be presented with non-binary result. From the user side perspective binary trust is very appealing. There's a reason Facebook doesn't have star rating system. Decisions are hard. And assigning weights is a burden. I'm still open to the binary trust idea (i.e. simple connections), but so far I much prefer the idea of weights as a normalized number (e.g. between 0 and 1).

Ignoring privacy

Forgetting about the privacy for a moment, you could just have this graph out in the open (could be on some blockchain, or even DHT should be enough, irrelevant for now). This way you see all weights assigned by all people and you can query it any way you like. The huge plus is that it's up to you to decide what algorithm to use to compute how much you trust a given node.

Butt privacy

Especially when weights are not binary, you don't necessarily want to let Bob know that his assigned trust is only 0.3 compared to 0.8 for Alice. You don't even want to let anybody know to which people you assigned any weights.

The problem

And that's the main problem I'm struggling with. Of course, graph does not need to be out in the open. Treat me as a node, you ask me what trust do I assign to the person X, I give you the result. You don't know if that's what I assigned, or if that's what my network returned. Especially if we add some rounding or noise.

But there are two issues. First, this information, how much trust my network returns for somebody, is already some privacy leak that some people may not like. A possible solution would be that I need to allow you to query me. Optionally add some limits and ability to monitor what peers are querying me. This solution, even though it's not even complete, already has a big drawback. I cannot add Elon Musk to my trust network. If you read a book on programming written by somebody, you may want to assign some trust regarding programming knowledge to her. But with this limitation, you can only follow people who allow you to.

The second issue is, trying to exploit the system. Imagine a small graph. You don't know how nodes are connected but you can query them asking about trust. You could then do some reasoning about weights that are assigned. Especially if you would have some information about possible nodes connections from some other source. This should be harder in a big network, but it seems that it's possible that you could get some meaningful probabilities.

Plus there's the timing leak. If I can query everybody fast enough, and I have some information that you may have met X, then I can observe how your response to your trust to X changes over time.

There are some mitigations, you can introduce noise, both static and one that fluctuates your weights over time (when calculating response). But I'm not able to tell if that is going to be enough. I'd love my weights to be completely private. Clearly that won't be the case. I'm unable to evaluate the severity of this issue. I'm sure there are some papers which could help. Send papers.

You may think that the worst case scenario here, is that Bob, after building a complex system and querying everybody on the network, learns that you don't trust him all that much. But it's worse than that. If there will be some leaks, somebody will build a service that explores them and Bob will use that service.

All in all, just knowing that you return low trust for me based on your network, especially if we may have similar networks and your response stands out from others, is already something that can be a problem.

This above, this is the main issue. We need brains. Any pointer, reference to some paper or any feedback helps. If you think it would be nice to have this Web of Trust, then a simple e-mail or post with just a few words can help push it forward.

Maybe limiting who can query you is necessary.

I caught myself thinking that if somebody asks for Bob, and I have a weight assigned to Bob, then I can just ignore that weight and return whatever response my network would have returned if I haven't had that weight assigned. Unfortunately, if everybody does that, then nobody knows anything about Bob.

Aggregate function

When I want to learn how much to trust X, which I don't know, I ask my trusted peers, then they give me their results. Now I'd like to put it together into a single number. I mentioned weighted average. But it has its problems.

If I add a new trusted peer, who I trust a lot, but X is far from his network, then my trust evaluation of X suddenly drops. Not sure if that's right.

So maybe we should just use the highest result that we've got, multiplied by the weight? Clearly I'm missing out on a lot of information here. If 3 of my peers return high trust to somebody then I think I should trust it more than if only one did.

Sum seems to do the justice (each result multiplied by its weight, then summed). The problem is, I should be able to get a normalized value, only then it can be used by people who query me. Just doing sum/max_sum seems naive. I don't think that trusting peers who don't trust anybody should influence my results. But I'm not completely sure about that.

A problem, that is waiting for the solution. One of many.

TODO

More stuff to come including:

reinventing the wheel
oh no we assign numbers to people
ain't nobody has time for that
fixing network failures
wallet of identities & privacy
security ponderings
trusting virtual entities
layers and multidimensionality
pushing data down the network (advanced)

How can I help

Think about it. Provide some feedback. Get some more people to think about that problem. Even in science projects everything drops down to marketing nowadays. So promoting the idea is just as helpful as providing insights.

Contact

I'm at wot@comboy.pl. If I get a few e-mails then maybe setting up a mailing list can be productive. Or maybe r/web_of_trust, or some github or slack, whatever works for you, just let me know.

Drop me an empty e-mail to get a one time notification once there's some progress on the project.