Copy link to clipboard
Copied
Users were complaining that 1000's of emails were not being sent via our CF service. I checked the spool directory and it was empty, as was Undelivr. Our own application logs showed the cfmail routine had been called but full mail logging via the CF Administrator showed nothing had been sent. So where on earth have the emails gone?
Through frustration we just did an emergency reboot of the affected server from the cluster and as soon as it rebooted 1000's of emails appeared in the spool directory. This included every email going back since the problem manifested.
Where does CF store its emails before it puts them into the spool dir? Surely it must have written them to disk someone else that isn't documented? How else could it generate all of the missing emails after a reboot without any prompting? Maybe Java itself had a problem and has its own mail spool dir that then feeds into the CF spool dir?
Any answers would be useful in case this problem ocurrs again.
Thank you.
Gary.
Copy link to clipboard
Copied
Where does CF store its emails before it puts them into the spool dir? Surely it must have written them to disk someone else that isn't documented? How else could it generate all of the missing emails after a reboot without any prompting? Maybe Java itself had a problem and has its own mail spool dir that then feeds into the CF spool dir?
If CF is set to spool (as per CFAdmin setting), then it should indeed write the messages to the spool dir before sending 'em. Is that setting set? As far as I know, if it's not set to spool, then it sends the messages straight away.
It sounds like you might have all the various mail logging options set (are they set to debug?).
Is anything being logged on the mail server side of things which might indicate a problem?
Was it just the CF service you restarted, or was it the whole box that also possibly had the SMTP server on it too? IE: did the mail definitely get jammed up on the CF side of things?
Are you absolutely certain that whatever process generates the emails had actually run? Was it from a scheduled task or is there some logic in your code that might kick in when the application is in a certain state or anything like that?
Sounds a bit odd.
--
Adam
Copy link to clipboard
Copied
Hi Adam. Thanks for the reply.
Our CF servers are set to spool to disk. Logging was off when we first got wind of the problem so we activated "debug" mail logging and also the option to log all sent emails. We then watched users choose our app's function to send more emails but nothing was added to these log files. The SMTP service had nothing in its queue or badmail directory.
Our email creation routine is in an asynchronous gateway event and one line of code in there logs how many emails it has generated. It logged about 2000 more emails since we started looking into the problem so we knew the code around cfmail was running correctly, but it was weird that CF's mail logging recorded absolutely nothing and nothing was added to the spool directory. So what was cfmail doing?
Through frustration of realising the problem was out of our hands since our code was logging that it was calling cfmail and CF logged nothing, we did an emergency reboot of the server (the physical box which also has an SMTP service). As soon as Windows booted up again the empty spool folder was flooded with 1000's of files. Where the frak did they come from? This is why I suspect the Java engine had created and stored the mail files but got stuck and hadn't handed them over to CF, hence the lack of CF logging any mail activity. What other explanation could there be? Crazy!
Copy link to clipboard
Copied
OK, well: all things considered, if I was you I'd be raising a support call with Adobe at this point.
You've certainly troubleshot (troubleshooted?) along the same lines as I would, and I'm drawing a blank as to what else to try.
--
Adam
Copy link to clipboard
Copied
A sudden thought.
If it happens again, test sending an email with <cfmail> manually, and watch what happens. If it vanishes as well, then send another email with a specific UUID in it, or some other unique string, and then do a full search of the file system in the CF dir and then the JVM's dir, and temp dirs and what not looking for files containing that string. If the mail is being send after a restart, then they MUST be getting written to disk somewhere. If you can find out where they're being written, it might cast some light on why they are getting jammed where they are.
Oh, and what's your spooling interval? It's not like every one million years or something, is it? But even then, one would expect the email to be in the spool dir, waiting. Hmmm.
Weird.
--
Adam
Copy link to clipboard
Copied
We then watched users choose our app's function to send more emails but nothing was added to these log files. The SMTP service had nothing in its queue or badmail directory.
Our email creation routine is in an asynchronous gateway event.
The asynchronous gateway is a thread that runs aside from Coldfusion's main flow of execution. I therefore expect the gateway's page-context to be different from that of an ordinary CFM page or function call. For example, you usually have to specify any communication with other parts of Coldfusion, such as logging, explicitly within the page that uses the gateway.
...it was weird that CF's mail logging recorded absolutely nothing and nothing was added to the spool directory. So what was cfmail doing?
The connection to the mail server might have failed temporarily. Other frequently reported causes are faulty attachments, the presence of zero-byte files in the spool or a known cfmail bug.
Coldfusion very likely stored the undelivered mail in a folder somewhere on disk. Coldfusion would normally place a lock on such mail files, with the intention of sending them when connection resumes.
That's what you would normally expect when you send mail from an ordinary CFM page, for example. It just might be that, in the gateway context, Coldfusion failed to release the lock or to resume the connection.
Copy link to clipboard
Copied
Hi BKBK. CF would alway log email activity from cfmail (if switched on) within an asynchronous gateway event under normal circumstances. No further coding is required to have it log into mail.log and mailsent.log. Errors that do occur from asynchronous events are logged to eventgateway.log and I did check that log.
Some of the emails didn't have attachments and we had previously patched the email attachment bug.
Copy link to clipboard
Copied
CF would alway log email activity from cfmail (if switched on) within an asynchronous gateway event under normal circumstances. No further coding is required to have it log into mail.log and mailsent.log. Errors that do occur from asynchronous events are logged to eventgateway.log and I did check that log.
One question is, what about an error that is caused neither by cfmail nor by the gateway? I would say, arguably the most useful debugging tag in gateway code is cflog. Whatever gremlin it is, would be caught by the net <cftry><cfcatch><cflog></cfcatch></cftry>
Copy link to clipboard
Copied
You're right and I am using cflog in the gateway event script - it's permanently logging data. The cflog line is actually at the end of the script outside of any loops, so if an error was thrown then cflog wouldn't have run. Come to think of it, if any error was thrown then all of those emails that were generated and sent after the reboot wouldn't have happened.
It's a mystery.
Copy link to clipboard
Copied
Still scratching my beard. Take a look at the default temporary directory \runtime\servers\coldfusion\SERVER-INF\temp\ to see whether Coldfusion left any evidence there.
Copy link to clipboard
Copied
Hmm, maybe I should grow a beard to help me think. 🙂
We run a multi-instance installation. The master installation has a file called fileStore which is 0 bytes dated around the time of the problem at this location:
\jrun4\servers\cfusion\SERVER-INF\temp\JRunSQLInvoker
All other files in temp were a year old.
Looking in the temp directory for the server instance that runs the website I see a load of files. The nearest date match is from a dir called JINIClusterLog1254097290328. There are 3 small files within.
Logfile.531 goes on about com.sun.jini.reggie.RegistrarImpl$ServiceLeaseRenewedLogObj. I didn't understand what it's talking about so I searched for 'mail' and 'smtp' but it didn't find anything.
Snapshot.531 was bigger (15kb) containing even more rubbish and strange chars but searching it for 'mail' and 'smtp' also failed. Is there anything else I should look to find in the file? I'm a bit reluctant to upload them onto a public website in case they contain specifics to the server that could help a hacker.
Thanks.