[hashcash] Re: Reputation proxy

Simon Josefsson wrote:


Have you thought more about how submission into a reputation proxy would
work?  That part is a bit sketchy to me, and the main problem seems to
be how to avoid submissions from spammers.  Seed it with
known-good-addresses, and then add new addresses based on who they
communicate with?  There are also some serious privacy issues here.

I do like the general idea of a reputation proxy though.  It can be used
for a lot of things beyond email, but perhaps the many small details
make it difficult to re-use one reputation proxy for another purpose.

Sorry, I had a brain belch when I was writing. I should have said proof of work as a proxy for the value of reputation, in that a larger stamp confers more reputation points on someone than a small stamp or no stamp at all. My reputation idea will probably work something like this. Believe me, I am very open to hearing about flaws and about simple ways of fixing them.

In this context, reputation is a historical record of any given IP address. It is aggregated into one of four or five bands. The current thinking says five bands, ranging from very bad to very good with neutral in the middle. Reputation is modeled on the human experience in that you earn a good reputation slowly but can lose it very quickly.
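
A minimal sketch of the banding idea, assuming reputation is kept as a single numeric score per IP address. The score range, the band thresholds, and the loss multiplier are all hypothetical values picked for illustration; nothing in the proposal fixes them yet.

```python
from enum import IntEnum


class Band(IntEnum):
    VERY_BAD = 0
    BAD = 1
    NEUTRAL = 2
    GOOD = 3
    VERY_GOOD = 4


# Hypothetical constants, chosen only to make the example concrete.
SCORE_MIN, SCORE_MAX = -100.0, 100.0
LOSS_MULTIPLIER = 5.0  # reputation is lost faster than it is earned


def adjust(score: float, delta: float) -> float:
    """Apply a reputation change; negative changes are amplified."""
    if delta < 0:
        delta *= LOSS_MULTIPLIER
    return max(SCORE_MIN, min(SCORE_MAX, score + delta))


def band_for(score: float) -> Band:
    """Aggregate a raw score into one of the five bands."""
    if score <= -60:
        return Band.VERY_BAD
    if score <= -20:
        return Band.BAD
    if score < 20:
        return Band.NEUTRAL
    if score < 60:
        return Band.GOOD
    return Band.VERY_GOOD
```

With these made-up numbers, a source sitting at 25 (good) that takes a single -10 hit drops to -25, which lands it in the bad band, while it would take many +1 boosts to climb back.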

Reputation is driven by the messages received from an IP address. Automatic processes such as content filters give a small negative or positive boost to a reputation. Proof-of-work puzzles add a medium positive boost to reputation. A human, sorting through the "unsure" collection, gives a strong positive or negative boost to reputation.
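
To make the relative weights concrete, here is a rough sketch of the three reputation sources. The magnitudes (1, 5, 20) are assumptions; the only thing taken from the description is the ordering small < medium < strong and the fact that a stamp only ever counts in the positive direction.

```python
from enum import Enum


class Source(Enum):
    CONTENT_FILTER = "content filter"   # automatic, small boost either way
    STAMP = "proof-of-work stamp"       # medium boost, positive only
    HUMAN = "human sorting"             # strong boost either way


# Hypothetical magnitudes; only their ordering comes from the text.
MAGNITUDE = {
    Source.CONTENT_FILTER: 1,
    Source.STAMP: 5,
    Source.HUMAN: 20,
}


def boost(source: Source, positive: bool = True) -> int:
    """Return the signed reputation change for one event."""
    if source is Source.STAMP and not positive:
        raise ValueError("a stamp only gives a positive boost")
    magnitude = MAGNITUDE[source]
    return magnitude if positive else -magnitude


def total_boost(events) -> int:
    """Sum the boosts for an iterable of (source, positive) pairs."""
    return sum(boost(source, positive) for source, positive in events)
```

For example, a stamped message that the content filter scored as spam would net 5 - 1 = 4 under these numbers, and one human "this is spam" decision would outweigh four stamps.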

Reputation also modifies the filter configuration. By itself, neither a content filter nor a proof-of-work stamp is sufficient. The filter is inadequate because of false positives and negatives; the proof-of-work stamp is insufficient because a stamp large enough to defeat the super-enormous botnet hypothesis consumes far more resources than any good mail site should spend.

I've tried a bunch of ways to describe the filter configurations, so please bear with me as I try one more. Hopefully this will work.

As I said above, reputation chooses which filter configuration is used to analyze the mail message. There is no regular pattern. Each configuration is admittedly subjective according to my biases, but I believe each configuration is defensible. This is not to say that other configurations wouldn't be just as valid, and I would welcome a discussion of other configurations.

Configuration: very bad; the filter configuration is a serial stamp, then content filter. That is to say, a message must pass both the stamp challenge and the content filter in order to be delivered. Messages from a very bad reputation source will never be delivered to an inbox. Once a message has passed the stamp/filter hurdle, a message that would have been delivered to the inbox or the spam trap is delivered to the spam trap. Messages that score for delivery to the dumpster are delivered to the dumpster. The presence of the stamp does not modify reputation in either direction. Once a site is in the very bad state, the only reputation modifier is the spam trap processing, i.e. messages trained as good improve reputation.

Configuration: bad; the filter configuration is a mandatory content filter with an optional stamp. As with the very bad configuration, messages are delivered only to the dumpster and spam trap destinations, using the same rules. If the stamp is present, the message will be delivered to the spam trap destination.

Configuration: neutral; mandatory content filter with optional stamp.
As messages come to the system, they're operated on by a variety of filters and user-driven processes. The presence of a stamp may or may not change message delivery from dumpster to spam trap. Stamps at this reputation level do not enable direct-to-inbox delivery. They do, however, improve reputation at a faster rate than simple message exposure will.

Configuration: good and very good; delivery requires a stamp or the content filter. Stamped messages are delivered directly to the inbox; content-filtered messages are delivered to the inbox or the spam trap. Dumpster-scored messages are placed in the spam trap. The only difference between good and very good reputation is the stamp size.
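
The configurations above can be sketched as a single routing function. This is only my reading of the rules, not a finished design: the names are hypothetical, the content filter is assumed to return one of the three destinations as its verdict, an unstamped message from a very bad source is assumed to go to the dumpster, and at neutral the stamp is assumed not to change routing (that point is still open).

```python
from enum import Enum


class Band(Enum):
    VERY_BAD = "very bad"
    BAD = "bad"
    NEUTRAL = "neutral"
    GOOD = "good"
    VERY_GOOD = "very good"


class Dest(Enum):
    INBOX = "inbox"
    SPAM_TRAP = "spam trap"
    DUMPSTER = "dumpster"


def cap_at_spam_trap(verdict: Dest) -> Dest:
    """Inbox and spam trap verdicts both become spam trap."""
    return Dest.DUMPSTER if verdict is Dest.DUMPSTER else Dest.SPAM_TRAP


def route(band: Band, verdict: Dest, stamped: bool) -> Dest:
    """Pick a destination from reputation band, filter verdict, and stamp."""
    if band is Band.VERY_BAD:
        # Stamp AND filter are both required; never reaches the inbox.
        if not stamped:
            return Dest.DUMPSTER  # assumption: no stamp means no delivery
        return cap_at_spam_trap(verdict)
    if band is Band.BAD:
        # Filter is mandatory; a stamp rescues the message from the dumpster.
        if stamped:
            return Dest.SPAM_TRAP
        return cap_at_spam_trap(verdict)
    if band is Band.NEUTRAL:
        # Plain filtering; assumption: the stamp only affects reputation here.
        return verdict
    # Good and very good: a stamp goes straight to the inbox,
    # and the dumpster range is eliminated for filtered mail.
    if stamped:
        return Dest.INBOX
    return Dest.SPAM_TRAP if verdict is Dest.DUMPSTER else verdict
```

For instance, route(Band.BAD, Dest.DUMPSTER, stamped=True) gives the spam trap, while route(Band.GOOD, Dest.DUMPSTER, stamped=False) also gives the spam trap rather than the dumpster. Stamp size checking for good versus very good is left out here.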

My rationale for these filters is admittedly subjective. The very bad configuration has two barriers to cross because the very bad category is effectively a blacklist. I believe it is highly unethical not to have any way for a legitimate user to get through and scream for help or correction, which is why it uses the content filter plus stamp. This also makes it damn near impossible for a spammer to get through. A good user is more likely to have a simple message that will be recognized as good, whereas a spammer is more likely to have a message recognized as spam and therefore fail one of the tests.

The bad category is my way of saying "I expect your message will be trash but, if you generate a stamp, at least it won't be thrown in the dumpster." The filter-only path is a concession to backwards compatibility, but I've eliminated the inbox delivery range of messages because anything with a bad reputation doesn't deserve direct delivery.

The neutral category is simply filters as we know them today with ham, spam, and mystery meat categories (inbox, dumpster, spam trap). The only thing a stamp gives you is faster movement from neutral to good reputation.

The good and very good categories are simply a safe place to enable direct delivery of messages in the presence of a stamp. They are also a safe place to eliminate the dumpster range for messages and deliver to either the inbox or the spam trap.


I'm afraid this is all I have time for today. I look forward to your comments and questions. I know I've left some important stuff out but I will deal with that in a later message.

---eric


--
Speech-recognition in use.  It makes mistakes, I correct some.
