Gmane

Cross-posting

Gmane cross-posts articles that are Cc'd to several mailing lists. If you read several related mailing lists, you'll often find the same messages appearing on these lists. It's annoying having to read the same message more than once, which is why most news readers make sure that you only see cross-posted messages once. Gmane wants to take advantage of this by cross-posting messages between newsgroups when they are Cc'd to several mailing lists.

This web page discusses various approaches towards achieving this. But first, a brief overview of how Gmane funnels mailing list messages into the news spool.

Each individual mailing list is subscribed as a different mail box on m.gmane.org. For instance, the mailing list milk@antidote.org might have gcm-milk@m.gmane.org as its Gmane subscriber address, and all messages from that list will arrive at that address. This means that Gmane knows with 100% certainty which newsgroup the messages are meant for -- no guessing or heuristics are necessary.
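As a rough illustration of why no guessing is needed, the lookup can be pictured as a plain table keyed on the delivery address. This is only a sketch: the table contents, the group name gmane.culture.milk and the function name are made up for the example and are not taken from the Gmane scripts.

```python
# Hypothetical table: one Gmane subscriber mail box per mailing list.
SUBSCRIBER_TO_GROUP = {
    "gcm-milk@m.gmane.org": "gmane.culture.milk",
}


def group_for_delivery(envelope_recipient: str) -> str:
    """Return the newsgroup for the mail box a message was delivered to."""
    # The delivery address alone identifies the list; no header parsing needed.
    return SUBSCRIBER_TO_GROUP[envelope_recipient.strip().lower()]
```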

But how should Gmane then cross-post messages?

Retroactively cross-posting

If Gmane first receives a message for gmane.culture.milk.skimmed and then the same message arrives at gmane.culture.milk.whole, Gmane could then retroactively alter the first message to be a cross-posted article. That's entirely possible by going behind INN's back and forcibly altering the messages, but people who had already read the message in gmane.culture.milk.skimmed wouldn't know that it was supposed to be cross-posted, and when entering gmane.culture.milk.whole later, would then be presented with the same message again.

It would also be impossible to feed these retro-cross-posts to other news servers. It would also be somewhat computationally intensive.

A variation on this theme would be to cancel the first message, and then post a new, cross-posted article. Again, this would mean that people who read the groups often would likely be presented with the messages more than once. It would also break threading.

So this method doesn't sound very promising.

Queuing

Gmane could queue up all incoming messages for, say, five minutes, and see whether the same message crops up in the queue more than once. If so, it could cross-post the message between all the constituent groups.

This would be quite simple to implement, and would catch most cross-posted messages. However, the Internet being what it is, and mailing list software being what it is, messages going through different mailing lists have quite divergent round trip times. The longer the queuing period, the more cross-posted messages would be caught. On the other hand, the longer the period, the less useful Gmane is as a reading mechanism for mailing lists. It sucks having to wait an extra five or ten minutes for messages to show up.
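A minimal sketch of such a queue might look like the following. It assumes that "the same message" is recognised by its Message-ID, which the text above does not spell out, and the class and method names are invented for the example. Times are passed in explicitly so the behaviour is easy to follow.

```python
from collections import defaultdict

HOLD_SECONDS = 300  # the "say, five minutes" window


class CrossPostQueue:
    """Hold each message for a fixed window and collect the groups it shows up in."""

    def __init__(self, hold_seconds=HOLD_SECONDS):
        self.hold_seconds = hold_seconds
        self.first_seen = {}             # Message-ID -> arrival time of first copy
        self.groups = defaultdict(list)  # Message-ID -> groups seen so far

    def add(self, message_id, group, now):
        # Only the first copy starts the clock; later copies just add groups.
        self.first_seen.setdefault(message_id, now)
        if group not in self.groups[message_id]:
            self.groups[message_id].append(group)

    def release(self, now):
        """Return {message_id: groups} for messages held for the full window."""
        due = [m for m, t in self.first_seen.items()
               if now - t >= self.hold_seconds]
        ready = {}
        for message_id in due:
            ready[message_id] = self.groups.pop(message_id)
            del self.first_seen[message_id]
        return ready
```

A copy that arrives after its message has been released is treated as a brand new message, which is exactly the case where a short window misses a cross-post.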

Still, it's a much better solution than the first alternative.

Examining The Headers

Gmane knows all the mailing list addresses, so why can't it just look at the To and Cc headers and cross-post the messages to the newsgroups implicitly named there? This is the solution used by Gmane.

This, however, is not fool-proof. Messages that are Bcc'd to lists, or are hosted on mailing list machines that have many names, won't be properly detected by this heuristic. There is no way to say that one mail box is equivalent to another without actually sending a mail to both and seeing where they end up.

Still, it doesn't have the problems that the second alternative has.

The Algorithm

Here's how the Gmane mail-to-news script does cross-posting, in detail.

  1. All mail addresses from the To and Cc headers are gathered.

  2. The addresses are checked against the list of mailing list addresses Gmane knows about.

  3. A mailing list is deemed to match if the parts before the at sign match, as well as the last two elements of the domain name. So milk@mail.antidote.org and milk@antidote.org are deemed to be the same list.

  4. The message is then posted as a cross-posting between all matching groups.

  5. If the news server says that the message has already been posted, it's queried as to which groups it has been posted to. If the current newsgroup isn't among these, the message is given a new Message-ID and posted anyway.
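The steps above can be sketched roughly as follows. This is not the actual Gmane script: the function names and the known_lists table are invented for illustration, and always including the group the message was delivered for is an assumption. Talking to the news server is left out; step 5 is reduced to the decision it has to make.

```python
from email.utils import getaddresses


def match_key(address):
    """Step 3: local part plus the last two domain elements, lower-cased."""
    local, _, domain = address.lower().rpartition("@")
    return local, ".".join(domain.split(".")[-2:])


def crosspost_groups(to_header, cc_header, known_lists, current_group):
    """Steps 1-4: the newsgroups to cross-post to, current group first.

    known_lists is a hypothetical {list address: newsgroup} table.
    """
    by_key = {match_key(addr): group for addr, group in known_lists.items()}
    groups = [current_group]  # assumption: the delivering group is always included
    headers = [h for h in (to_header, cc_header) if h]  # skip missing headers
    for _name, addr in getaddresses(headers):
        group = by_key.get(match_key(addr))
        if group and group not in groups:
            groups.append(group)
    return groups


def needs_repost(groups_already_posted_to, current_group):
    """Step 5: repost under a new Message-ID only if this group was missed."""
    return current_group not in groups_already_posted_to
```

For example, a message delivered for gmane.culture.milk.skimmed with milk@mail.antidote.org in its To header would also be posted to whichever group the table lists for milk@antidote.org.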

News To Mail

How about the opposite transform? When cross-posting to several groups, the posted article is sent to each of the participating mailing lists. However, the user has to authorize themselves for each individual mailing list, just as if they were posting distinct messages to several mailing lists.
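A loose sketch of that per-list check, assuming the article's Newsgroups header is a comma-separated list of group names. The group_to_list table, the set of already authorized lists and the function name are all hypothetical; how authorization itself works is not covered here.

```python
def plan_outgoing_mail(newsgroups_header, group_to_list, authorized_lists):
    """Split target lists into those to mail now and those awaiting authorization.

    group_to_list:    hypothetical {newsgroup: mailing list address} table
    authorized_lists: list addresses this poster has already been authorized for
    """
    send_now, needs_authorization = [], []
    for group in newsgroups_header.split(","):
        list_address = group_to_list.get(group.strip())
        if list_address is None:
            continue  # not a group backed by a mailing list we know about
        # Authorization is tracked per list, so each list is checked on its own.
        if list_address in authorized_lists:
            send_now.append(list_address)
        else:
            needs_authorization.append(list_address)
    return send_now, needs_authorization
```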