Gmane

Filtering

Gmane is an archival site, so it would be inappropriate for it to filter its contents aggressively. Gmane is also an interface for conveniently reading mailing lists, which means that it might be nice if it were to filter noise.

Apart from the protocol conversions from mail to news, Gmane strips all mailing list tags from the subject headers. They usually look like [some-mailing-list]. Gmane also removes the standard mailing list trailers that usually appear at the bottom of the message bodies and say how to unsubscribe from the mailing list.
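The two clean-up steps above can be illustrated with a minimal sketch. This is not Gmane's actual code: the function names strip_list_tag and strip_trailer are hypothetical, and the sketch assumes that the tag and the trailer text for a given list are already known.

```python
import re


def strip_list_tag(subject: str, tag: str) -> str:
    """Remove a known "[tag]" marker from a subject line.

    Assumption: the tag text (for example "some-mailing-list") is known
    per list, so other bracketed text such as "[PATCH]" is left alone.
    """
    pattern = re.compile(r"\s*\[" + re.escape(tag) + r"\]\s*")
    cleaned = pattern.sub(" ", subject)
    # Collapse any runs of whitespace left behind by the removal.
    return " ".join(cleaned.split())


def strip_trailer(body: str, trailer: str) -> str:
    """Remove a known trailer if the message body ends with it.

    Assumption: the trailer is matched literally, ignoring only
    leading and trailing whitespace.
    """
    body_stripped = body.rstrip()
    trailer_stripped = trailer.strip()
    if trailer_stripped and body_stripped.endswith(trailer_stripped):
        return body_stripped[: -len(trailer_stripped)].rstrip() + "\n"
    return body
```

Matching only a known tag and a known trailer keeps the filtering conservative, which fits an archive that should not alter message contents more than necessary.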

No further filtering of the contents is done.

Articles that have subjects reading just subscribe or unsubscribe are redirected to the gmane.junk group.
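As a rough sketch of this routing rule (again, not Gmane's real implementation), the check could look like the following. The function name target_group is hypothetical, and ignoring case and surrounding whitespace is an assumption; the text only says the subject reads "just" subscribe or unsubscribe.

```python
JUNK_GROUP = "gmane.junk"
JUNK_SUBJECTS = {"subscribe", "unsubscribe"}


def target_group(subject: str, default_group: str) -> str:
    """Pick the group an article should be posted to.

    Assumption: the comparison ignores case and surrounding whitespace.
    """
    if subject.strip().lower() in JUNK_SUBJECTS:
        return JUNK_GROUP
    return default_group
```

A subject such as "How do I unsubscribe?" contains more than the bare word, so it stays in the list's normal group.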

Spam

All messages are run through SpamAssassin to determine whether they are spam. Messages that SpamAssassin classifies as spam are cross-posted to the gmane.spam.detected group. If you're reading the Gmane groups through a news reader, you can instruct it to kill/score articles based on the Xref header. Just kill all messages whose Xref header matches gmane.spam.detected.
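For readers who process articles with their own scripts rather than a news reader's kill file, the same Xref test can be sketched as below. The function name is_flagged_spam is hypothetical, and the sketch assumes the usual Xref layout of a server name followed by group:number pairs.

```python
from email import message_from_string

SPAM_GROUP = "gmane.spam.detected"


def is_flagged_spam(raw_article: str) -> bool:
    """Return True if the article's Xref header lists the spam group.

    Assumption: the Xref value looks like
    "server group:number group:number ...".
    """
    msg = message_from_string(raw_article)
    xref = msg.get("Xref", "")
    # Skip the first token (the server name); the rest are locations.
    locations = xref.split()[1:]
    return any(loc.split(":", 1)[0] == SPAM_GROUP for loc in locations)
```

Comparing whole group names, rather than searching the header for a substring, avoids false matches against groups whose names merely contain the same text.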

The web interface won't display messages that are cross-posted to this group.

Gmane adds headers to all messages that contain a URL for reporting back to Gmane whether a message is spam (or not spam, if it has already been marked as such). These reports are then reviewed by the Gmane administrators, who approve or reject them.

Taken together, this gives Gmane a mechanism for filtering the obvious spam automatically, while the borderline cases are handled by a collaborative filtering scheme. As time goes on and more people use Gmane (and press the spam reporting button), the archive will grow progressively more free of spam.