Outgoing mail failures cropping up in AMS

Post by EKjellquist » Tue Aug 30, 2022 4:17 pm

We've been seeing a lot of outgoing mail failures lately (within the past week or so) and I'm trying to get to the bottom of it. This is on the latest AMS 5.0.2; the server has been restarted, has not had any major config changes recently, and is otherwise working ok.

We have everything set for TLS 1.2, which since AMS 5.x has been ok for outgoing mail (there was a bug somewhere in 4.x that allowed us to update that). However, I'm seeing a handful of recent failures that lead me to believe that other mail servers have begun dropping support for something. On outgoing mails in these cases we get undeliverable warnings and then failures with:

Connection to the destination SMTP was closed.

As of this writing, going by the MX servers of the affected domains, the issue seems to be limited to mimecast.com / ppe-hosted.com / mailspamprotection.com and a few others. Overall this is something like 10-15% of our outgoing mail. What happens is that the normal EHLO from AMS occurs, the server responds that STARTTLS is ok, the 2nd EHLO from AMS occurs, and then it just seems to time out after 5 mins. Below is a sample AMS log from a test against checkTLS when this occurs (our domain is anonymized):

***********************************
Tue, 30 Aug 2022 10:49:26 -> Success: Action=[Process Mail], Details=[3 KB: Start transfer.]
Tue, 30 Aug 2022 10:49:26 -> Success: Action=[Detect DNS's], Details=[Found 2 entries.]
Tue, 30 Aug 2022 10:49:26 -> Success: Action=[MX Lookup], Details=[DNS=Using automatically detected DNS's, Domain=TestSender.CheckTLS.com: Found 1 records]
Tue, 30 Aug 2022 10:49:26 -> Success: Action=[SMTP Transfer], Details=[Domain=TestSender.CheckTLS.com, Host=ts11-do.CheckTLS.com:25, IP=165.227.190.238: Connection accepted.]
Tue, 30 Aug 2022 10:49:26 -> ***DEBUG*** -> Success: Action=[Recv Response], Details=[IP=165.227.190.238: 220 ts11-do.checktls.com ESMTP TestSender Tue, 30 Aug 2022 10:49:26 -0400]
Tue, 30 Aug 2022 10:49:26 -> ***DEBUG*** -> Success: Action=[Send Command], Details=[IP=165.227.190.238: EHLO mail.domain.com]
Tue, 30 Aug 2022 10:49:26 -> ***DEBUG*** -> Success: Action=[Recv Response], Details=[IP=165.227.190.238: 250-ts11-do.checktls.com Hello domain.com [1.2.3.4], pleased to meet you]
Tue, 30 Aug 2022 10:49:26 -> ***DEBUG*** -> Success: Action=[Recv Response], Details=[IP=165.227.190.238: 250-ENHANCEDSTATUSCODES]
Tue, 30 Aug 2022 10:49:26 -> ***DEBUG*** -> Success: Action=[Recv Response], Details=[IP=165.227.190.238: 250-8BITMIME]
Tue, 30 Aug 2022 10:49:26 -> ***DEBUG*** -> Success: Action=[Recv Response], Details=[IP=165.227.190.238: 250-STARTTLS]
Tue, 30 Aug 2022 10:49:26 -> ***DEBUG*** -> Success: Action=[Recv Response], Details=[IP=165.227.190.238: 250 HELP]
Tue, 30 Aug 2022 10:49:26 -> ***DEBUG*** -> Success: Action=[Send Command], Details=[IP=165.227.190.238: STARTTLS]
Tue, 30 Aug 2022 10:49:26 -> ***DEBUG*** -> Success: Action=[Recv Response], Details=[IP=165.227.190.238: 220 Ready to start TLS]
Tue, 30 Aug 2022 10:49:26 -> Success: Action=[SMTP Transfer], Details=[Domain=TestSender.CheckTLS.com, Host=ts11-do.CheckTLS.com:25, IP=165.227.190.238: Starting TLS.]
Tue, 30 Aug 2022 10:49:27 -> Success: Action=[SMTP Transfer], Details=[Domain=TestSender.CheckTLS.com, Host=ts11-do.CheckTLS.com:25, IP=165.227.190.238: TLS started.]
Tue, 30 Aug 2022 10:49:27 -> ***DEBUG*** -> Success: Action=[Send Command], Details=[IP=165.227.190.238: EHLO mail.domain.com]
.
.
.
Tue, 30 Aug 2022 10:54:27 -> Failed: Action=[SMTP Transfer], Details=[Domain=TestSender.CheckTLS.com, Host=ts11-do.CheckTLS.com:25, IP=165.227.190.238: Connection closed unexpectedly or forced shutdown.]
***********************************
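For anyone wanting to pull this pattern out of a longer log, here is a rough Python sketch (not part of AMS) that measures how long a session sat idle between the last command sent and the failure line. It assumes log lines shaped exactly like the sample above; the function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
# "Tue, 30 Aug 2022 10:49:27 -> ***DEBUG*** -> Success: Action=[Send Command], Details=[...]"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}, \d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}) -> "
    r"(?:\*\*\*DEBUG\*\*\* -> )?"
    r"(?P<status>Success|Failed): Action=\[(?P<action>[^\]]+)\], "
    r"Details=\[(?P<details>.*)\]$"
)


def parse_line(line):
    """Return (timestamp, status, action, details), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m["ts"], "%a, %d %b %Y %H:%M:%S")
    return ts, m["status"], m["action"], m["details"]


def stall_before_failure(lines):
    """Return (seconds, last_command_details) for the first failure, or None.

    'seconds' is the gap between the last 'Send Command' line and the
    first 'Failed' line that follows it.
    """
    last_send = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip separators, "." filler lines, blank lines
        ts, status, action, details = parsed
        if action == "Send Command":
            last_send = (ts, details)
        elif status == "Failed" and last_send is not None:
            return (ts - last_send[0]).total_seconds(), last_send[1]
    return None
```

Fed the sample above, the last command sent is the 2nd EHLO at 10:49:27 and the failure is logged at 10:54:27, so the reported stall is the 5 minute timeout.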

On successful outgoing mails, the receiving server responds to the 2nd EHLO as expected and it goes through fine. What I don't know is WHY this handful of outgoing mails is failing (I can send from, say, Gmail to any of the affected domains fine). There are no specific refusal error codes for this particular problem; it's just that the handshake seems to fail. If I had to guess, I'd say the likeliest culprit is the aging OpenSSL 1.0.2L version AMS is still using: it would be limited to ciphers that over time get deprecated in the newer versions receiving servers use. Even though we're using TLS 1.2, the available cipher list depends entirely on the OpenSSL version. Going back in the logs further than just the past week or so, I do see more failures similar to this one, but I can't find any notice from the MX servers we're seeing the failures on along the lines of 'as of X date we no longer support TLS 1.2 connections using Y or Z ciphers / protocols'.

So ultimately I guess I'm asking for two things: (1) we need support for OpenSSL 1.1.1 anyway, as that's the current standard pretty much across the board (currently 1.1.1q), and (2) could the debug log for outgoing mail be modified to report the cipher used in the TLS handshake? That may be helpful in pinning these sorts of issues down. I don't KNOW that this is the issue though, because we can receive emails from all of these domains fine; that tells me they have some receiving SMTP requirement that we're either not meeting or not equipped to meet...
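Until the AMS log can show it, the negotiated cipher can at least be checked from another machine. Below is a minimal Python sketch using the standard smtplib and ssl modules. It reports what Python's own OpenSSL build negotiates, not what AMS would negotiate, so treat it only as a comparison point. The function names and the default HELO name are placeholders.

```python
import smtplib
import ssl


def describe_cipher(cipher):
    """Format the (name, protocol, secret_bits) tuple returned by SSLSocket.cipher()."""
    if cipher is None:
        return "no TLS session"
    name, protocol, bits = cipher
    return f"{protocol} {name} ({bits} bits)"


def probe_starttls(host, port=25, helo_name="mail.example.com", timeout=30, context=None):
    """Run EHLO, STARTTLS, EHLO against host and report the negotiated cipher.

    helo_name is a placeholder; use the name your own server announces.
    The default context verifies the server certificate, so a certificate
    problem will raise here before any cipher is reported.
    """
    context = context or ssl.create_default_context()
    with smtplib.SMTP(host, port, local_hostname=helo_name, timeout=timeout) as smtp:
        smtp.ehlo()  # 1st EHLO: discover extensions
        if not smtp.has_extn("starttls"):
            return "STARTTLS not offered"
        smtp.starttls(context=context)  # TLS handshake happens here
        negotiated = describe_cipher(smtp.sock.cipher())
        code, _ = smtp.ehlo()  # 2nd EHLO, now inside the TLS session
        return f"{negotiated}; second EHLO reply code {code}"


# Example call (needs outbound access to port 25):
# print(probe_starttls("ts11-do.CheckTLS.com"))
```

If the 2nd EHLO stalls the way it does in the log, the call raises a timeout or disconnect error after `timeout` seconds rather than returning.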

This is a major issue for us that's probably not going to get better (it doesn't look like, say, a configuration error within mimecast.com or other host domains); we can use alternate personal emails for now, but this becomes a security issue the longer it lasts...

There seem to have been continued updates to OpenSSL 1.0.2 after 1/1/2020, but they're only available to 'Premium level support' customers and cost $50k+ (https://www.openssl.org/support/contracts.html#premium). They seem to be up to 1.0.2zf, but I'm guessing we'd get 1.1.1 support before Code Crafters pays THAT kind of $$?
EKjellquist
 
Posts: 80
Joined: Tue Sep 09, 2014 10:40 pm

Re: Outgoing mail failures cropping up in AMS

Post by Code Crafters » Fri Sep 02, 2022 1:28 pm

The first EHLO is used to check if STARTTLS is supported by the receiving mail server. Then once TLS is started, the second EHLO should work fine too. As you say, there is no reason given for the delayed rejection of the connection after that. Giving no reason is more secure but not very helpful, and the delay is there to help stop brute force attacks that reconnect straight after. It is possible that TLS version 1.2 or the ciphers used are too weak, but there could be other reasons too. Can you tell us if these domains ever accept emails from your Ability Mail Server? If they always reject, it could be the TLS version or just that they don't like your IP (RBLs, dynamic / ISP assigned IP etc.). If they reject only sometimes, it's more likely that you are sending too many emails or something else.
Code Crafters
 
Posts: 906
Joined: Mon Sep 10, 2007 2:35 pm

Re: Outgoing mail failures cropping up in AMS

Post by EKjellquist » Fri Sep 02, 2022 1:56 pm

I thought at first that it might be delays, but we have resends that go through 2 hours and all of these affected mails are consistently failing throughout the warning periods. We've had a few domains that used transaction delays before, but they have 100% sent failure messages to that effect; these are just 100% failing with the 'closed connection' issue. Our IP is not on any blacklists. We're also not sending too many emails; these are mostly from our purchasing manager to their contacts outside the company (our own 'max mails per day' is not enabled). All of these domains were accepting mails fine right up until recently; other unaffected domains will still send/receive mail fine from us. These domains definitely do accept TLS 1.2 according to CheckTLS.com.

For example, when I test to my domain, by default checktls.com tries to connect with TLS 1.2 and AES256-GCM-SHA384, which does not have perfect forward secrecy and runs into the 'DH key too small' situation, but it works ok. I try the same test to one of the affected recipient domains, specifying the same, and it also appears to work. So without other evidence, I initially thought that the recipient domain(s) needed to either whitelist us or otherwise make some firewall exception. But they assure me they've already done this.

By default, checkTLS reports all of these domains as using TLS 1.3 and TLS_AES_256_GCM_SHA384, with either Curve P-256 DHE(256 bits) / Curve X25519 DHE(253 bits) / etc. and PFS. While they do still accept TLS 1.2 connections, over time older ciphers are often rejected because they're weak / outdated. I think what's happening here is that, because AMS is still using 1.0.2L (mid-2017), while it's TLS 1.2-capable, it only offers the ciphers that were available at that time in order to make the connection; I think these domains have disabled at least some of the ciphers AMS is capable of, and outgoing mails are failing because on the 2nd EHLO, it's trying to handshake with a cipher they either don't accept or can't / won't use. It explains why we can 100% RECEIVE email from all of these domains because their servers must connect to us using TLS 1.2 with AMS' supported cipher list, of which AES256-GCM-SHA384 and the like are still somewhat in use.
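One way to test this theory from a separate machine is to pin a client to TLS 1.2 and a single older cipher and see whether the affected MX still completes the handshake. The sketch below only builds such a context with Python's ssl module; the function names are hypothetical, and the cipher actually available depends on the local OpenSSL build, not on AMS.

```python
import ssl


def tls12_only_context(cipher_string="AES256-GCM-SHA384", verify=True):
    """Build a client context limited to TLS 1.2 and the given OpenSSL cipher string."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2  # rules out TLS 1.3
    ctx.set_ciphers(cipher_string)  # raises ssl.SSLError if nothing matches
    if not verify:
        # Diagnostic use only: skips certificate checks entirely.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def enabled_tls12_ciphers(ctx):
    """List the enabled cipher names, leaving out the TLS 1.3 suites that
    set_ciphers() cannot remove from the list."""
    return [c["name"] for c in ctx.get_ciphers() if c["protocol"] != "TLSv1.3"]
```

A context like this can be passed to `smtplib.SMTP.starttls(context=...)`. If the handshake succeeds with AES256-GCM-SHA384 but fails from AMS, the cipher list alone is probably not the whole story.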

So ultimately, even if OpenSSL 1.1.1 support is a ways off, if we could at least get 1.0.2u support (which is the last public release of 1.0.2), that might be helpful at least in the interim.

https://www.orangecountycomputer.com/20 ... d-working/
EKjellquist
 
Posts: 80
Joined: Tue Sep 09, 2014 10:40 pm

Re: Outgoing mail failures cropping up in AMS

Post by Code Crafters » Fri Sep 02, 2022 2:23 pm

I agree that we do need to update to OpenSSL 1.1.x, but as you know this requires quite a bit of change on our side as it's quite different. We have previously updated to newer versions of 1.0.x but had compatibility issues. You're welcome to try swapping out the OpenSSL DLLs yourself for the latest versions, as this isn't hard to do. Just keep the originals. We'll try to do an update with later versions too if we find them to be stable enough, although looking at our usual source for these, they no longer supply anything older than 1.1.x, and they also supply 3.0.x, which is now the latest experimental version. We will try to get 1.1.x implemented as soon as we can too, of course.
Code Crafters
 
Posts: 906
Joined: Mon Sep 10, 2007 2:35 pm

Re: Outgoing mail failures cropping up in AMS

Post by EKjellquist » Wed Sep 07, 2022 9:02 pm

This may shed some light also -> https://security.stackexchange.com/questions/174935/how-to-support-forward-secrecy-in-openssl

One thing that I always get dinged on in PCI is the lack of support for Perfect Forward Secrecy in AMS, though apparently the ECDHE / DH cipher suites needed for it in TLS 1.2 are available in 1.0.2L. Is it possible they're just not turned on server-side? Does AMS / AFS specify which ciphers it uses for its OpenSSL versions, or can it use any that are available? It looks like it could be a variable-specifying thing...
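As a rough illustration of the difference, OpenSSL cipher names can be split by key exchange: the forward-secret TLS 1.2 suites start with ECDHE- or DHE-, and TLS 1.3 suites (TLS_ prefix) always use ephemeral keys. The Python sketch below sorts names this way. It lists what the local Python's OpenSSL enables, not what AMS enables, and the helper names are invented for the example.

```python
import ssl


def offers_forward_secrecy(cipher_name):
    """True for OpenSSL cipher names that use an ephemeral key exchange."""
    # "ECDH-" and "DH-" (static key exchange) deliberately do not match.
    return cipher_name.startswith(("ECDHE-", "DHE-", "TLS_"))


def split_by_forward_secrecy(names):
    """Return (pfs_names, non_pfs_names), preserving the input order."""
    pfs = [n for n in names if offers_forward_secrecy(n)]
    non_pfs = [n for n in names if not offers_forward_secrecy(n)]
    return pfs, non_pfs


def local_cipher_names():
    """Cipher names enabled in this Python's default client context."""
    return [c["name"] for c in ssl.create_default_context().get_ciphers()]


# Example:
# pfs, non_pfs = split_by_forward_secrecy(local_cipher_names())
```

By this rule AES256-GCM-SHA384 lands in the non-PFS group, while ECDHE-RSA-AES256-GCM-SHA384 uses the same bulk cipher with an ephemeral key exchange.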

I tried this in version 4.x of AMS before we upgraded to no avail, BUT in 5.0.2 swapping out the libraries with 1.0.2u DOES actually seem to work. The bad news is it doesn't solve this particular 'Connection to the destination SMTP was closed' problem, but IMAP/SMTP/outgoing mails seem to be fine. I used libeay32.dll and ssleay32.dll from openssl-1.0.2u-i386-win32.zip from https://indy.fulgan.com/SSL/?C=M;O=D and swapped them with the ones in the AMS base install directory. Hypothetically this resolves everything from https://www.openssl.org/news/vulnerabilities-1.0.2.html#y2019 down to 1.0.2L, but hey, it's something.
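If anyone wants to script the swap, here is a small Python sketch that backs up the two original DLLs before copying in the replacements. The directory arguments are placeholders for wherever AMS is installed and wherever the 1.0.2u zip was extracted. I'm assuming the AMS service should be stopped first, since the DLLs are in use while it runs.

```python
import shutil
from pathlib import Path

DLL_NAMES = ("libeay32.dll", "ssleay32.dll")


def swap_openssl_dlls(install_dir, new_dll_dir, backup_dir):
    """Back up the installed OpenSSL DLLs, then copy the new ones over them.

    All three directories are supplied by the caller; nothing here knows
    where AMS actually lives. Returns the list of backup file paths.
    """
    install_dir, new_dll_dir, backup_dir = (
        Path(install_dir), Path(new_dll_dir), Path(backup_dir)
    )
    # Check every replacement exists before touching the install directory.
    missing = [n for n in DLL_NAMES if not (new_dll_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing replacement DLLs: {missing}")

    backup_dir.mkdir(parents=True, exist_ok=True)
    for name in DLL_NAMES:
        backup = backup_dir / name
        if not backup.exists():
            # Keep the first backup so a repeated run cannot overwrite
            # the true originals with already-swapped files.
            shutil.copy2(install_dir / name, backup)
        shutil.copy2(new_dll_dir / name, install_dir / name)
    return [backup_dir / n for n in DLL_NAMES]
```

Rolling back is just copying the files in the backup directory over the install directory again.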
EKjellquist
 
Posts: 80
Joined: Tue Sep 09, 2014 10:40 pm

Re: Outgoing mail failures cropping up in AMS

Post by Code Crafters » Tue Sep 13, 2022 8:54 am

Thanks for the info. We'll put out an update with these OpenSSL 1.0.2u libraries too if they seem to be stable.
Code Crafters
 
Posts: 906
Joined: Mon Sep 10, 2007 2:35 pm

Re: Outgoing mail failures cropping up in AMS

Post by EKjellquist » Mon Sep 19, 2022 3:35 pm

So as it turns out, after applying the September 2022 Microsoft updates (we primarily use AMS on Windows Server) the 'Connection to the destination SMTP was closed' issues went away. The frustrating thing is I'm not certain what Windows / .NET issue could have been introduced in the August 2022 updates (as that's when this behavior appears to have started), but currently it appears to be fine, with OpenSSL 1.0.2u still in use.

So I guess we can chalk this one up to some MS issue from last month. If anyone else happens to be troubleshooting it: reboots / OpenSSL version changes didn't fix anything, but applying the 2022-09 cumulative Windows Server update (KB5017315) and the update for .NET Framework 3.5, 4.7.2 and 4.8 (KB5017528) somehow did.
EKjellquist
 
Posts: 80
Joined: Tue Sep 09, 2014 10:40 pm

Re: Outgoing mail failures cropping up in AMS

Post by Code Crafters » Tue Sep 20, 2022 8:39 am

Ability Mail Server and OpenSSL are written in C++, so they don't use .NET. They use a version of the Microsoft C++ redistributable, but that's shipped with the product. Not sure what could have caused the issue, but glad that it's resolved and that the newer version of OpenSSL still seems stable. We've been running this on our server for the last week or so for further testing and will do a public update with this version included, as mentioned previously.
Code Crafters
 
Posts: 906
Joined: Mon Sep 10, 2007 2:35 pm

