
CF2018 W2K16 IIS10 giving 500 errors with increasing frequency on one site

Explorer, Jul 03, 2022

Windows 2016 server, IIS 10 and Adobe ColdFusion 2018 using Tomcat. The server has about 8 sites on it and one site is getting intermittent 500 errors. All other sites remain up and responsive. Failed Request Tracing shows "Filter Error - Incorrect Function 0x1"

 

The site will come back up if you recycle the app pool for that site, or restart that site in IIS. I have the app pool set to recycle every 10 minutes.

 

I've seen some vague references suggesting that a URL rewrite rule might be setting something that Tomcat doesn't understand, but I could never find anything concrete related to that. If I go through our rewrite rules one by one, each one works fine individually.

 

At this point I'm not sure what to even do with it. In March it was doing it frequently (once per day or more) and we recreated the site (new web root, new IIS config, etc.), but the same thing happened. Finally I noticed that the server had a pending restart from updates, and once we did that it was fine until about 2 weeks ago. Other than standard code updates to the site, nothing has changed with the IIS or Tomcat configuration.

 

Within the last 24 hours I've rebooted the server 3 times and, per a suggestion, commented out the heartbeat_interval in workers.properties. No change: it still goes down with 500 errors several times per day. I've attached an image of the isapi_redirect.log for one incident, as well as the error block from the Failed Request Tracing.

Attachments: 500.PNG, frt.PNG

TOPICS
Connector , Server administration
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert, Jul 04, 2022

Can you identify any specific request(s)/page(s) causing the issue? For example, static pages (html, images, and so on)?

 

Why have you increased the maximum number of connections from 500 to 5000? That raises questions about your other settings. So could you please share the contents of D:\ColdFusion2018\config\wsconfig\1\workers.properties

  

Explorer, Jul 06, 2022

No - they're not related to a specific page (or in some cases any page at all - i.e. images .png etc.)

worker.list=cfusion
#heartbeat_interval=30
heartbeat_limit=90
worker.cfusion.type=ajp13
worker.cfusion.host=localhost
worker.cfusion.port=8018
worker.cfusion.connection_pool_size=5000
worker.cfusion.connection_pool_timeout=60
worker.cfusion.max_reuse_connections=225

connection_pool_size was calculated as the number of sites on the server * max_reuse_connections.

I don't think this is a CF issue per se, but rather IIS and tomcat with the possibility that web.config is involved.

Adobe Employee, Jul 06, 2022

Can you please let us know the maxThreads value that is set for the ColdFusion AJP port?

You need to increase connection_pool_size and the maxThreads value accordingly. You can refer to the document below for more information:

https://coldfusion.adobe.com/2018/07/connector-tuning/

 

 

Community Expert, Jul 06, 2022

As I suspected, the connection_pool_size of 5000 is not justified. You have just 1 worker/site, so you don't have to multiply anything. Also, some settings are missing, for example, secret.

 

I would suggest you follow these steps:

 

(1)

Store the current workers.properties as a backup. Then create a new workers.properties file whose content is:

 

heartbeat_interval=30
heartbeat_limit=90

worker.list=cfusion

worker.cfusion.type=ajp13
worker.cfusion.host=localhost
worker.cfusion.port=8018
worker.cfusion.heartbeat_servlet_path=/__cf_connector_heartbeat__
worker.cfusion.connection_pool_timeout=60
worker.cfusion.connection_pool_size=500
worker.cfusion.max_reuse_connections=500

# Use your own value of monitoringsecret
worker.cfusion.monitoringsecret=49162e5a-b1db-4d25-a560-d8056c02016d
# Use your own value of secret
worker.cfusion.secret=9661aabd-ab6f-42ad-8cd8-b371b366b641

 

You will have noticed that connection_pool_size = max_reuse_connections. That is the optimal situation. The value 500 is sufficient for most sites. If your site has heavy traffic, then you may raise the value to 1000. But only where necessary. See, for example, https://www.petefreitag.com/item/871.cfm 

 

(2)

Edit the connector element in \cfusion\runtime\conf\server.xml so that 

worker connection_pool_timeout (seconds) matches connectionTimeout (milliseconds) in server.xml;

worker connection_pool_size matches maxThreads in server.xml;

secret in workers.properties matches the secret attribute of the connector element in server.xml.

 

The connector element should then look like:

 

<Connector protocol="AJP/1.3" port="8018" redirectPort="8445" secret="9661aabd-ab6f-42ad-8cd8-b371b366b641" maxThreads="500" connectionTimeout="60000" tomcatAuthentication="false" address="127.0.0.1" allowedRequestAttributesPattern=".*"/>
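The three matching rules in step (2) can be checked mechanically. What follows is only a minimal Python sketch, not an Adobe tool: the helper names `parse_properties` and `check_connector` are hypothetical, the sample values are modeled on the steps above, and `example-secret` is a placeholder. It assumes the properties file uses plain key=value lines.

```python
import xml.etree.ElementTree as ET

# Placeholder inputs for illustration; substitute the contents of your own files.
WORKERS_PROPERTIES = """\
# Use your own value of secret
worker.cfusion.port=8018
worker.cfusion.connection_pool_timeout=60
worker.cfusion.connection_pool_size=500
worker.cfusion.secret=example-secret
"""

CONNECTOR_XML = (
    '<Connector protocol="AJP/1.3" port="8018" secret="example-secret" '
    'maxThreads="500" connectionTimeout="60000" />'
)


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_connector(props, connector_attrs, worker="cfusion"):
    """Return a list of mismatch descriptions (empty when everything agrees)."""
    prefix = f"worker.{worker}."
    problems = []
    # connection_pool_timeout is in seconds; connectionTimeout is in milliseconds.
    timeout_ms = int(props[prefix + "connection_pool_timeout"]) * 1000
    if timeout_ms != int(connector_attrs["connectionTimeout"]):
        problems.append("connection_pool_timeout does not match connectionTimeout")
    if int(props[prefix + "connection_pool_size"]) != int(connector_attrs["maxThreads"]):
        problems.append("connection_pool_size does not match maxThreads")
    if props.get(prefix + "secret") != connector_attrs.get("secret"):
        problems.append("secret does not match")
    return problems


if __name__ == "__main__":
    props = parse_properties(WORKERS_PROPERTIES)
    attrs = ET.fromstring(CONNECTOR_XML).attrib
    print(check_connector(props, attrs) or "consistent")
```

An empty list means the worker and the connector element agree on all three points; any string in the list names the pair that needs attention.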

 

(3)

Add the flag

 

-Djava.net.preferIPv4Stack=true

 

to the java.args setting in /cfusion/bin/jvm.config

 

(4)

Restart the cfusion instance.

Community Expert, Jul 08, 2022

BKBK, you say to Steve: "As I suspected, the connection_pool_size of 5000 is not justified. You have just 1 worker/site, so you don't have to multiply anything."

 

You say he has "just 1 worker/site". I think you are saying he has just 1 worker (in the config for this connector), and that is true. But he didn't indicate in his note here how many SITES he has using that connector. I know from prior conversations that he does have many sites.

 

Even without knowing that, the indication of his having connection_pool_size (CPS) at multiples of max_reuse_connections (MRU) would be what's to be expected if one has multiple sites using a connector. 

 

For instance a CPS value of 5000 and an MRU 250 would be suited to when 20 sites are using a given connector. (Yes, he has 225, which by the math would be appropriate for  about 22 sites, but it's best for the ratio to be a whole number rather than a fractional one.) And this is the case when folks use the "all sites" option, for a given connector.

 

Now, of course, an argument could be made that one could instead have a connector per site. And since CF2016 there is indeed even the option in the wsconfig UI to choose "all-individually" (rather than "all"), in which case the tool WOULD create one connector for EACH site. That's certainly an option Steve could consider. (As an interesting aside, that causes CF to set up each connector with a CPS of 500 and an MRU of 250, which is the default for any connector unless you change the values. That's suited to a connector used by up to 2 sites. As you note, it's not quite optimal for a connector used for ONE site, where the MRU and the CPS ought to just be the same.)

 

For anyone who may want to hear more about this notion of the ratio of CPS to MRU, I'll point out a different resource than BKBK did. Some folks will remember that when this new Tomcat web server connector came out with CF10, Adobe did blog posts then (in 2012) and another in 2014 with CF11, discussing things in far more detail:

 

https://coldfusion.adobe.com/2014/05/coldfusion-11-iis-connector-tuning/

 

Even as valuable was the tremendous back and forth of comments that followed over the next couple of years...though sadly about half of them were "lost" when Adobe moved their blog from the old blogs.coldfusion.com to the new coldfusion.adobe.com portal. You may notice this when folks seem to reply to a comment you can't find.

 

Anyway, I realize it's quite old, and it's too bad that no one from Adobe ever did a more updated version, perhaps taking into account all the questions, answers, and discoveries in the comments. As for the link BKBK shared, it's indeed a recent doc page, though it's focused more on using the PMT and its connector auto-tuning feature and doesn't really discuss some of these details that the 2014 blog post did, like its indication that the ratio of CPS / MRU should equal or be less than the number of sites.

 

Finally, here is the key quote on this point from that blog post: "Tune the entry for max_reuse_connections to appropriate value based on number of site. Optimal value is connection_pool_size / {no of site}"
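To make the quoted rule concrete, here is a small Python sketch of the arithmetic. The function names are hypothetical, and rounding down with floor division is an assumption of this sketch; the quoted post does not say how to round.

```python
def max_reuse_connections(connection_pool_size, site_count):
    """Quoted rule: max_reuse_connections = connection_pool_size / number of sites."""
    if site_count < 1:
        raise ValueError("site_count must be at least 1")
    # Floor division is an assumption; the blog post does not specify rounding.
    return connection_pool_size // site_count


def implied_site_count(connection_pool_size, max_reuse):
    """Ratio of CPS to MRU, i.e. how many sites a given pair of values implies."""
    return connection_pool_size / max_reuse


if __name__ == "__main__":
    print(max_reuse_connections(5000, 20))          # 250
    print(max_reuse_connections(500, 1))            # 500
    print(round(implied_site_count(5000, 225), 2))  # 22.22
```

The last line shows why 5000 and 225 give a fractional ratio, while 5000 and 250 divide evenly into 20 sites.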

 

It then goes on to show examples for different scenarios, including what the server.xml maxthreads should be for different combinations. (And sometimes the math they use is inconsistent with their very words in the post. That was clearly mistaken. And these are among the things that some complained about in comments, and the confusion and excessive number of comments to wade through did diminish the value of the post for many.)

 

Anyway, all that said, given the nature of the errors Steve is getting (the "incorrect function"), I wouldn't have been inclined to think his issue was with these settings.

 

As for whether the JVM arg proposed will help, again my understanding is that it solved a different problem. But let's see what Steve may have to say or may find.

 

(And Steve, you COULD at least consider going with the "one connector per site" approach. You'd do that by removing the one connector for "all" and then adding it back using "all individually", which will create 20 connectors for 20 sites. Just beware that this would then give you 20 connectors each with 500 CPS, so one might argue that the maxthreads in server.xml should change to 10,000. Since the MRU is 250 by default, perhaps 5000 would be enough. I would instead change the CPS and MRU to be 250 each. And then your current maxthreads of 5000 would be correct, again according to what Adobe says in both resources. 🙂)
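The maxthreads figures above are simple products. As a rough Python sketch, assuming maxthreads is sized to the sum of the chosen per-connector value across all connectors (the helper name is hypothetical):

```python
def total_threads(connector_count, per_connector_value):
    """Sum a per-connector setting (CPS or MRU) across identical connectors."""
    return connector_count * per_connector_value


if __name__ == "__main__":
    print(total_threads(20, 500))  # 10000: 20 connectors at the default CPS of 500
    print(total_threads(20, 250))  # 5000: 20 connectors at 250 each
```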

 

(PS Note that even the doc on autotuning acknowledges that while that feature of the PMT can "tune" the connector CPS values, it does NOT ever "tune" the maxthreads in server.xml, leaving one to do that themselves. That's always seemed unfortunate to me, and it diminished the value of the auto-tuning, if indeed one bothered to get the PMT set up in the first place.)


/Charlie (troubleshooter, carehart.org)
Explorer, Jul 11, 2022

The issue seems to be one/all or a combination of these rules in web.config.  We commented these out on Friday and had zero 500 errors over the weekend.

<rewrite>
            <rules>
                <rule name="newsroom-join" stopProcessing="true">
                    <match url="^newsroom\/join$|^newsroom\/join\/(freelanceGuidelines|studentGuidelines|survey|agreements)$" />
                    <action type="Rewrite" url="index.cfm?newsroom=1&amp;join={R:1}" />
                </rule>
                <rule name="newsroom-landing" stopProcessing="true">
                    <match url="^newsroom$|^newsroom\/(articles|authors|categories|tags|news|mcags|join)$" />
                    <action type="Rewrite" url="index.cfm?newsroom=1&amp;class={R:1}&amp;pkid=&amp;alias=" />
                </rule>
                <rule name="newsroom-detail" stopProcessing="true">
                    <match url="^newsroom\/([_0-9a-z-]+)\/([_0-9]+)\/([_0-9a-z-]+)$" />
                    <action type="Rewrite" url="index.cfm?newsroom=1&amp;class={R:1}&amp;pkid={R:2}&amp;alias={R:3}" />
                </rule>
            </rules>
        </rewrite>
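For anyone who wants to exercise these three patterns outside IIS, here is a hedged Python sketch. It uses Python's `re` module, which is not the engine IIS URL Rewrite uses, and it assumes case-insensitive matching against the path without a leading slash. The helper name and sample paths are made up for illustration. It does not explain the 500 errors; it only shows which rule wins and what the back-references would hold.

```python
import re

# Patterns copied from the rules above, in the same order.
RULES = [
    ("newsroom-join",
     r"^newsroom\/join$|^newsroom\/join\/(freelanceGuidelines|studentGuidelines|survey|agreements)$"),
    ("newsroom-landing",
     r"^newsroom$|^newsroom\/(articles|authors|categories|tags|news|mcags|join)$"),
    ("newsroom-detail",
     r"^newsroom\/([_0-9a-z-]+)\/([_0-9]+)\/([_0-9a-z-]+)$"),
]


def first_match(path):
    """Return (rule name, captured groups) for the first matching rule, else None."""
    for name, pattern in RULES:
        match = re.search(pattern, path, re.IGNORECASE)
        if match:
            return name, match.groups()
    return None


if __name__ == "__main__":
    for path in ("newsroom", "newsroom/join", "newsroom/join/survey",
                 "newsroom/articles", "newsroom/articles/12/some-title", "about"):
        print(path, "->", first_match(path))
```

One thing the sketch makes visible: for `newsroom` and `newsroom/join`, the alternative that matches has no capture group, so group 1 (the value {R:1} refers to) is unset for those requests.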

 

Community Expert, Jul 11, 2022

Ah, OK. 

What happens when you change every occurrence of 

&amp;

to

&
Explorer, Jul 11, 2022

As soon as any page is loaded (i.e. not just one that would hit those rules):

500.19 with 0x8007000d - Configuration file is not well-formed XML

Community Expert, Jul 11, 2022

I don't think you can use unescaped ampersands (&) in XML. Those are metacharacters used to initiate escape sequences like &amp;
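A quick way to see this outside IIS is to feed both forms to any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the sample attribute values and the helper name are illustrative only.

```python
import xml.etree.ElementTree as ET

# Same element twice: once with the ampersand escaped, once with it raw.
ESCAPED = '<action type="Rewrite" url="index.cfm?newsroom=1&amp;join=survey" />'
RAW = '<action type="Rewrite" url="index.cfm?newsroom=1&join=survey" />'


def parse_url_attribute(xml_text):
    """Return the decoded url attribute, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(xml_text).get("url")
    except ET.ParseError:
        return None


if __name__ == "__main__":
    print(parse_url_attribute(ESCAPED))  # index.cfm?newsroom=1&join=survey
    print(parse_url_attribute(RAW))      # None
```

The escaped form parses, and the parser hands back a plain & in the attribute value; the raw form is rejected as not well-formed, which is consistent with the 500.19 reported above.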

 

Dave Watts, Eidolon LLC 

Community Expert, Jul 11, 2022

Oops. Thanks, Dave. I see it just now.

 

By 

 

&

 

I  meant the URL-encoded character, which is:

 

%26

 

 

Explorer, Jul 11, 2022

Rewrite just ignores it and doesn't convert it to an actual & for the query string.

Community Expert, Jul 11, 2022

OK, thanks. 

In any case, do you mean that you didn't get any errors when you used %26? 

Explorer, Jul 11, 2022

No errors, but the action defined in the query string never happens.

Community Expert, Jul 11, 2022

That would suggest that the config file has an error.

Try rewriting the rules using condition and action elements.

Explorer, Jul 11, 2022

Here's an example:

<action type="Redirect" url="index.cfm?p=register/enroll&amp;bpid=2006C3" />

That works fine (the 500s were never consistent, i.e. we tried all of the rules in web.config and they'd all work, but eventually the 500 would be thrown).

<action type="Redirect" url="index.cfm?p=register/enroll%26bpid=2006C3" />

That does the first part: it redirects to register/enroll, but (I guess) unless we add urldecode(), CF isn't going to see %26 as an &.
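The difference between the two redirect targets can be shown with standard query-string parsing. This is a Python sketch using `urllib.parse`, not ColdFusion's own URL handling, which may differ in detail; the helper name is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def query_params(url):
    """Split the query string on literal & and decode each name and value."""
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    # A literal & separates two parameters.
    print(query_params("index.cfm?p=register/enroll&bpid=2006C3"))
    # %26 is decoded only after splitting, so it stays inside the value of p.
    print(query_params("index.cfm?p=register/enroll%26bpid=2006C3"))
```

With the literal &, the parser sees `p` and `bpid` as separate parameters. With %26, it sees a single parameter `p` whose value contains an ampersand, so `bpid` never exists as its own parameter.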
 
