[SGVLUG] Anyone know about Postfix on Linux?

Thu Sep 6 18:58:21 PDT 2007

> -----Original Message----- Of Gareth Greenaway
> Quick question, but could you be a little more specific about 
> the mail sending being slow?  One thing that I would look at 
> right off the bat is your name resolution settings.
> 
> On Wed, Sep 05, 2007 at 11:34:26AM -0700, Stan Schwarz wrote:
> > I run the Earthquake Notification System here ...
> > I'm trying to achieve faster-than-a-speeding-spammer mail 
> > sends, since everyone wants to know about earthquakes right after
> > they happen.

Hmmm... I'll admit right up front I'm not a postfix guru [or even much
of an e-mail guru] but I know enough to be dangerous, so I'll toss out
some stream-of-consciousness ideas that might make some sense...

Gareth's comment brings to mind the thought of pre-sorting your
addresses: I presume your database of "people who want to know about an
event" grows slowly over time, with changes being far less common [and
deletions even more so!] EXCEPT when there is a major event.  Even then,
the new folks all sign up AFTER the event, so we can ignore them for the
moment.
Anyway, pre-sorting: for every address you have, maintain a two-field
table: "address" and "primary MX record", with an index on the MX
record.  [Actually, come to think of it, you might be able to get away
with the "@wherever" part of the address only.]  The idea is to feed
into the queue all addresses destined for the same MX at the same time.
I don't know offhand if there is a way to tell postfix "don't bother
with a lookup, I already know it should go /here/", but if not, by
batching them all for the same MX [and, secondarily, by the same
@wherever], your DNS queries should essentially be "cached" for the bulk
of the lookups needed.
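
To make the pre-sorting idea concrete, here is a rough sketch in Python.
It assumes you already have some domain-to-MX mapping on hand [the
"mx_for_domain" dict below is hypothetical, as are the function names];
it does no DNS lookups itself, it only groups the addresses so everything
for one MX comes out together.

```python
from collections import defaultdict

def group_by_domain(addresses):
    """Group recipient addresses by the part after '@' (lower-cased)."""
    groups = defaultdict(list)
    for addr in addresses:
        local, sep, domain = addr.rpartition("@")
        if not sep or not local or not domain:
            continue  # skip anything that doesn't look like local@domain
        groups[domain.lower()].append(addr)
    return dict(groups)

def batches_by_mx(addresses, mx_for_domain):
    """Return sorted (mx, [addresses]) pairs, one pair per MX host.

    mx_for_domain is an assumed, pre-computed {domain: primary_mx} table;
    domains missing from it are simply batched under their own name.
    """
    by_mx = defaultdict(list)
    for domain, addrs in sorted(group_by_domain(addresses).items()):
        by_mx[mx_for_domain.get(domain, domain)].extend(addrs)
    return sorted(by_mx.items())
```

Feeding the queue in the order this returns keeps all recipients behind
one MX adjacent, which is the whole point of the exercise.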

Of course, MX records can and will change, so a "background"
thread/job/process can update the table during
off-peak/whenever-there-ISN'T-an-earthquake hours [yes, I'm being
funny... ;) ]

I don't know if you could trap on it, but if you detect that the primary
MX isn't responding, you can essentially skip [large?] blocks of e-mails
during the first pass of notifications.  I can think of two reasons why
this may occur: (1) the machine the MX record points to is physically
located "near" the event, and thus is "temporarily offline..." [which,
if you have a good IP-to-geo mapping, you could avoid even TRYING
machines you know to be at the epicenter...(*)] or (2) you're
overloading the machine / they think you ARE a spammer / it was just
changed after your last background scan; i.e., any miscellaneous reason
not related to physical hardware failure.  This may be harder to deal
with, but chances are good that any "secondary" MX record will be the
same for any given primary MX record.  Here is where sorting/indexing by
"@wherever" makes sense: for a given domain, the secondary MX WILL be
the same for "everyone@wherever", but could change for
"anybody@across_the_street".

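The "skip whole blocks on the first pass" idea could look something like
the sketch below.  It assumes the recipients are already batched as
(mx_host, [addresses]) pairs, which is a shape I made up for
illustration, and it uses a plain TCP connect to port 25 as a crude
"is it answering?" test.  Whether a probe like that is acceptable to the
receiving side is a separate question.

```python
import socket

def mx_reachable(host, port=25, timeout=3.0):
    """Crude probe: can we open a TCP connection to the MX at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, timed out, and DNS failures
        return False

def split_first_pass(batches, probe=mx_reachable):
    """Split (mx, addresses) batches into send-now and retry-later lists."""
    send_now, deferred = [], []
    for mx, addrs in batches:
        (send_now if probe(mx) else deferred).append((mx, addrs))
    return send_now, deferred
```

The deferred list is what you would retry later, possibly against the
secondary MX for that domain.
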
(*) of course, this raises an interesting dilemma: should you send
messages to folks you know are physically located near an event first or
last?  By their proximity alone, they will be the ones most interested
in "how big was THAT?", but if the event is big enough, "checking
e-mail" might be pretty low on their priority list.  Of course, if the
target machine is "offline" due to proximity, sending to those addresses
would likely time out, thus delaying everyone else in the queue.  As
you get further away, the odds that a message is sent and actually
received should increase, but at the same time the interest level
decreases [I'd venture that the only people in Europe who /really/ care
about an earthquake in California are those with friends/relatives or
"business interests" in California.]  Of course, knowing that a given
recipient is/should be "sufficiently far away" from an event that
they should be notified quickly won't do any good if that person's MX
server IS in the affected area -- you might have to contact Solomon for
a "rather interesting query" to find and prioritize addresses to send to
based on known location of the epicenter, MX, and presumed/stated
location of the recipient.
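
For what it's worth, one possible shape for that "rather interesting
query", sketched in Python instead of SQL.  Everything here is an
assumption on my part: the record layout ["loc" and "mx_loc" as
(lat, lon) pairs], the 50 km "affected" radius, and above all the policy
itself [nearest recipients first, but anyone whose MX sits inside the
affected radius goes to the back of the line].  The first-or-last
question above is still open; this just picks one answer.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def prioritize(recipients, epicenter, affected_km=50.0):
    """Sort recipients: MX outside the affected radius first, then by
    the recipient's own distance from the epicenter (nearest first)."""
    def key(rec):
        mx_probably_down = haversine_km(*rec["mx_loc"], *epicenter) <= affected_km
        return (mx_probably_down, haversine_km(*rec["loc"], *epicenter))
    return sorted(recipients, key=key)
```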

=====

You mentioned flow_delay -- In addition to pre-sorting, you might
consider partitioning the data and building multiple VIRTUAL postfix
"servers" on a given machine. [or send them to multiple physical
machines...]  Each postfix process should be "unaware" of any other
postfix processes running on the same box.  So for instance, process "A"
can send a message and enter its flow_delay period, then process "B"
can send a message, followed by process "C", "D", and so on until
process "A" wakes up again to send the next message.
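
If you go the multiple-instance route, you need some way to deal the work
out.  A minimal sketch, assuming the recipients are already batched as
(mx_host, [addresses]) pairs [a made-up shape] and that each instance
just gets its own list: hand each batch, biggest first, to whichever
instance currently has the least to do.  This says nothing about how the
instances themselves are configured.

```python
import heapq

def assign_batches(batches, n_instances):
    """Greedy split of (mx, addresses) batches across n_instances.

    Returns a list of n_instances lists; each batch stays whole so
    recipients behind one MX are handled by a single instance.
    """
    if n_instances < 1:
        raise ValueError("need at least one instance")
    heap = [(0, i) for i in range(n_instances)]  # (current load, instance)
    plan = [[] for _ in range(n_instances)]
    for mx, addrs in sorted(batches, key=lambda b: len(b[1]), reverse=True):
        load, i = heapq.heappop(heap)  # least-loaded instance so far
        plan[i].append((mx, addrs))
        heapq.heappush(heap, (load + len(addrs), i))
    return plan
```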

=====

Actually, partitioning the data for processing by multiple servers may
be your best bet -- your "actual" primary server that sends out
notifications would [essentially] "broadcast" the event info to a few
dozen machines.  Each machine, in turn, handles one specific
[geographically or logically defined] area.  These could even go to a
third [then fourth...] level machine for even more distributed
processing.  At the risk of ACTUALLY doing "the worst thing that a
spammer really does..." you could even distribute an application to
"willing participating recipients" who will, in turn, allow their
machine to "be a zombie" and further replicate the e-mails  [of course,
you'd have to secure the application somehow to prevent misuse by said
spammers...  Perhaps by authenticating the source of the message via a
PGP signature?]
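
As a stand-in for "authenticate the source before relaying", here is a
tiny sketch using an HMAC from the Python standard library.  To be clear,
this is NOT a public-key signature: it assumes a shared secret key,
which means any relay holding the key could forge messages too, so a real
deployment would want a proper public-key signature instead.  It only
shows the check-before-you-relay step.

```python
import hashlib
import hmac

def sign(message: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the message under a shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(message: bytes, key: bytes, tag: str) -> bool:
    """True only if the tag matches; uses a constant-time comparison."""
    return hmac.compare_digest(sign(message, key), tag)
```

A relay would call verify() on every incoming notification and drop
anything that fails, instead of passing it along.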