Highlighted

CFSpreadsheet: Simultaneous reads causing blocking situation

New Here ,
Dec 13, 2018

Copy link to clipboard

Copied

I have an application that allows users to upload multi-worksheet Excel files.  I use CFSpreadsheet to parse these files and import the data.  The processing can take anywhere from a minute to several minutes.  The first worksheet contains data that dictates which additional worksheets require processing during the import.

While doing some load testing, I noticed that if two users upload files, the second user has to wait for the first user process to finish before it works.  I have added CFLOG statements throughout my application, and done some simplified test scenarios (where I just read one worksheet and dump the result) that prove to me there is a blocking situation.  Has anyone else seen this situation?  Is there a way around this?

Views

599

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more

CFSpreadsheet: Simultaneous reads causing blocking situation

New Here ,
Dec 13, 2018

Copy link to clipboard

Copied

I have an application that allows users to upload multi-worksheet Excel files.  I use CFSpreadsheet to parse these files and import the data.  The processing can take anywhere from a minute to several minutes.  The first worksheet contains data that dictates which additional worksheets require processing during the import.

While doing some load testing, I noticed that if two users upload files, the second user has to wait for the first user process to finish before it works.  I have added CFLOG statements throughout my application, and done some simplified test scenarios (where I just read one worksheet and dump the result) that prove to me there is a blocking situation.  Has anyone else seen this situation?  Is there a way around this?

Views

600

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Dec 13, 2018 0
Advocate ,
Dec 14, 2018

Copy link to clipboard

Copied

Can you post a simplified but complete test case that demonstrates the problem?

Also provide your versions of ColdFusion and JAVA.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 14, 2018 0
New Here ,
Dec 14, 2018

Copy link to clipboard

Copied

Thanks for the response.

CF Version: 11,0,08,298512

Java Version: 1.8.0_25

I have also tried this on a different server running CF2016, and have witnessed the same behavior.

I have created two test files in the same directory.  I have two XLSX files that each have 2 worksheets (so I can read from the second worksheet specifically).  These files are ~300k in size and reside in the same directory as the CFM files.  The second worksheets each have ~2000 rows x 10 columns of data.

The code of both CFM files is very similar to this (only changing the values of the "thisExecution" and "thisFile" variables.

I tail the application.log as load both of these from the same computer.  (separate computers also present the same behavior).

The code:  (for some reason isn't showing up - I have tried editing a few times)

<CFSETTING requesttimeout="6000" >

<CFSET private.thisExecution = "Test1" >

<CFSET private.thisFile = "testfile1.xlsx" >

<CFOUTPUT>#private.thisExecution#: #private.thisFile#: #private.dataType#</CFOUTPUT>

<CFLOG text="#private.thisExecution#: Start: Reading File #private.thisFile#" >

<CFSPREADSHEET action="read" src="#expandpath('#private.thisFile#')#" sheet="2" query="getData" headerrow="1" excludeheaderrow="true" >

<CFLOG text="#private.thisExecution#: End: Reading File #private.thisFile#" >

<CFLOG text="#private.thisExecution#: Sleep: 5 seconds" >

<CFSCRIPT>sleep(5000);</CFSCRIPT>

<CFLOG text="#private.thisExecution#: Start: Dump getData" >

<CFDUMP var="#getData#" >

<CFLOG text="#private.thisExecution#: End Dump getData" >

Picture of the code: (since I can't get it to output in this forum)

The log output:

"Information","ajp-bio-8014-exec-104","12/14/18","13:04:47",,"Test1: Start: Reading File ABC SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-87","12/14/18","13:04:48",,"Test2: Start: Reading File ALSIDE SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-104","12/14/18","13:04:58",,"Test1: End: Reading File ABC SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-104","12/14/18","13:04:58",,"Test1: Sleep: 5 seconds"

"Information","ajp-bio-8014-exec-104","12/14/18","13:05:03",,"Test1: Start: Dump getData"

"Information","ajp-bio-8014-exec-104","12/14/18","13:05:45",,"Test1: End: Dump getData"

"Information","ajp-bio-8014-exec-87","12/14/18","13:05:52",,"Test2: End: Reading File ALSIDE SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-87","12/14/18","13:05:52",,"Test2: Sleep: 5 seconds"

"Information","ajp-bio-8014-exec-87","12/14/18","13:05:57",,"Test2: Start: Dump getData"

"Information","ajp-bio-8014-exec-87","12/14/18","13:06:13",,"Test2: End: Dump getData"

They both start reading the files at around the same time.  But script 2 waits until script 1 has finished working with the spreadsheet before it continues.

I am thinking CFSpreadsheet is not thread safe.  Is there a way to explicitly kill the process after it is done reading the worksheet in the file?  Since it creates a query object with the results of the read, I could duplicate the object, kill CFSpreadsheet and free it up for the next instance to use.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 14, 2018 0
Advocate ,
Dec 14, 2018

Copy link to clipboard

Copied

I won't have time to test this in my environment this week, but as a quick test, comment out the cfspreadsheet line and give GetData an arbitrary value so that the script completes without error.

If the same sequence of events happens, then cfspreadhseet is unlikely to be the cause. If the behavior is different, then there is a possibility that there is a problem.

As another test, you could replace cfspreadsheet with a binary read of the file and see if it affects the order of processing.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 14, 2018 0
New Here ,
Dec 14, 2018

Copy link to clipboard

Copied

Thanks for the suggestions.  I have already done a bunch of different tests to debug this, noteably, using a query with a similar amount of data from the database.  It doesn't behave the same way.  What I have provided here is the simplest way I have been able to reproduce and visualize the issue.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 14, 2018 0
Adobe Community Professional ,
Dec 14, 2018

Copy link to clipboard

Copied

Erik, you refer to this as the tag being "not thread safe". Instead, it seems the opposite: it's single-threaded, which is the ULTIMATE in thread safety (only one thread at a time could run it). 🙂

And there may be a good explanation for that: are you running CF Standard (as opposed to Enterprise, Trial, or Developer)? If so, note that one of the things "held back" in Standard is that various tags/functions/script statements are in fact single-threaded. The spreadsheet ones (and indeed all those creating Office-style documents), as well as the PDF ones, and so on.

If you could run your test on a trial version of CF (which runs as Enterprise), and it worked as expected, at least you'd have your answer. Let us know what you find.

/Charlie (server troubleshooter, carehart.org)

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 14, 2018 0
New Here ,
Dec 14, 2018

Copy link to clipboard

Copied

You are correct.  I didn't mean to put the word "NOT" there.  That was from previous edits.

I will test this out on a developer edition, as you are correct, we are running CF Standard.

Do you know where in Adobe documentation it explicitly says that these functions are limited to single-threads?

I found this document: Buying guide | Adobe ColdFusion Family  but don't see an explanation of what "Restricted Functionality" means.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 14, 2018 0
Adobe Community Professional ,
Dec 15, 2018

Copy link to clipboard

Copied

That is indeed the only resource  and that phrase is explained only in a footnote on the page.

/Charlie (server troubleshooter, carehart.org)

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 15, 2018 0
Adobe Community Professional ,
Dec 16, 2018

Copy link to clipboard

Copied

Hi erikpinpointdigital​, the execution of your 2 CFM files was not single-threaded. In fact, the threads that ran the pages were, respectively,  ajp-bio-8014-exec-104 and ajp-bio-8014-exec-87.

Having said this, I can still imagine a situation where blocking occurs, just as you observed. The situation is when the 2 CFM files share the same value for the src or query attributes in <cfspreadsheet>. There might then be contention when 2 threads attempt to access the same file or the same query object.

If so, then the solution is obvious. Make sure the values of the attributes src and query in the first CFM file differ from the respective values in the second CFM file.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 16, 2018 0
New Here ,
Dec 17, 2018

Copy link to clipboard

Copied

BKBK​ thanks.  I had a hunch that might be an issue, and as such did indeed make sure it was reading two different files and using two different query attributes.  I have even gone so far as to try copying the queries from the spreadsheet read into new variables and deleting the original queries.  This really has me stumped.  In 20+ years of CF development, this is the first time I haven't been able to find a solution or work around for something that puzzles me.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 17, 2018 0
Adobe Community Professional ,
Dec 17, 2018

Copy link to clipboard

Copied

A sleep-time of 5 seconds is too short to enable us to draw any conclusions. Use 60 seconds, for example. I did.

I placed two copies of an Excel file on my desktop. I then opened page1.cfm in the browser and, within several seconds, page2.cfm.

page1.cfm

<cfsetting requesttimeout="66000">

<cfset thisExecution="Execution1">

<cfset thisFile="file1">

<cflog text="#thisExecution#. Start reading file: #thisFile#" >

<cfspreadsheet action="read" src="C:\Users\BKBK\Desktop\shipType.xlsx" query="shipData" sheet="2" headerrow="1" excludeheaderrow="true">

<cflog text="#thisExecution#. End reading file: #thisFile#" >

<cflog text="#thisExecution#. Start sleep" >

<cfset sleep(60000)>

<cflog text="#thisExecution#. End sleep / start dump in #thisExecution#" >

<cfdump var="#shipData#" >

<cflog text="#thisExecution#: end dump in #thisExecution#" >

page2.cfm

<cfsetting requesttimeout="6000">

<cfset thisExecution="Execution2">

<cfset thisFile="file2">

<cflog text="#thisExecution#. Start reading file: #thisFile#" >

<cfspreadsheet action="read" src="C:\Users\BKBK\Desktop\shipType2.xlsx" query="shipData2" sheet="2" headerrow="1" excludeheaderrow="true">

<cflog text="#thisExecution#. End reading file: #thisFile#" >

<cflog text="#thisExecution#: start dump in #thisExecution#" >

<cfdump var="#shipData2#" >

<cflog text="#thisExecution#: end dump in #thisExecution#" >

The result was as expected:

"Information","http-nio-8500-exec-2","12/17/18","21:37:16","FORUM","Execution1. Start reading file: file1"

"Information","http-nio-8500-exec-2","12/17/18","21:37:17","FORUM","Execution1. End reading file: file1"

"Information","http-nio-8500-exec-2","12/17/18","21:37:17","FORUM","Execution1. Start sleep"

"Information","http-nio-8500-exec-10","12/17/18","21:37:28","FORUM","Execution2. Start reading file: file2"

"Information","http-nio-8500-exec-10","12/17/18","21:37:28","FORUM","Execution2. End reading file: file2"

"Information","http-nio-8500-exec-10","12/17/18","21:37:28","FORUM","Execution2: start dump in Execution2"

"Information","http-nio-8500-exec-10","12/17/18","21:37:32","FORUM","Execution2: end dump in Execution2"

"Information","http-nio-8500-exec-2","12/17/18","21:38:17","FORUM","Execution1. End sleep / start dump in Execution1"

"Information","http-nio-8500-exec-2","12/17/18","21:38:20","FORUM","Execution1: end dump in Execution1"

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 17, 2018 2
New Here ,
Dec 17, 2018

Copy link to clipboard

Copied

Thank you for the continued effort with helping try to figure this out.

I increased my sleep to 60 seconds.

You can see that the behavior between our environments is different.  Also, where is your "Start Dump in Execution1" ?

"Information","ajp-bio-8014-exec-156","12/17/18","16:50:22",,"Test1: Start: Reading File ABC SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-155","12/17/18","16:50:24",,"Test2: Start: Reading File ALSIDE SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-156","12/17/18","16:50:38",,"Test1: End: Reading File ABC SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-156","12/17/18","16:50:38",,"Test1: Sleep: 60 seconds"

"Information","ajp-bio-8014-exec-156","12/17/18","16:51:38",,"Test1: Start: Dump getData1"

"Information","ajp-bio-8014-exec-156","12/17/18","16:52:02",,"Test1: End: Dump getData1"

"Information","ajp-bio-8014-exec-155","12/17/18","16:52:03",,"Test2: End: Reading File ALSIDE SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-155","12/17/18","16:52:03",,"Test2: Sleep: 60 seconds"

"Information","ajp-bio-8014-exec-155","12/17/18","16:53:03",,"Test2: Start: Dump getData2"

"Information","ajp-bio-8014-exec-155","12/17/18","16:53:24",,"Test2: End: Dump getData2"

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 17, 2018 0
Adobe Community Professional ,
Dec 17, 2018

Copy link to clipboard

Copied

erikpinpointdigital  wrote

...where is your "Start Dump in Execution1" ?

"Information","http-nio-8500-exec-2","12/17/18","21:38:17","FORUM","Execution1. End sleep / start dump in Execution1

erikpinpointdigital  wrote

I increased my sleep to 60 seconds.

You can see that the behavior between our environments is different.

I think the difference so far is in what we are testing. I am using just one sleep(), you are using two.

My test was to see if the first request blocked the second. If it did the second request would have had to wait 60 seconds before completing. It didn't.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 17, 2018 0
Adobe Community Professional ,
Dec 18, 2018

Copy link to clipboard

Copied

Guys, can you both stop and clarify whether your testing is each on cf Standard or on Enterprise/trial/developer?

If it's on Standard, then no sleeping will help. They will not be able to run concurrently. Again it's CF that is imposing the single threading.

And BK, while you're right that you would indeed SEE both threads listed in a thread dump as "running" an ajp thread at the same time, the proof that it's being single-threaded by CF Standard is that the thread that is held up will be in a method referring to this class:

coldfusion.featurerouter.edition.StandardServerEdition

So can you both start with confirming the edition you are running? And Erik, if you are Standard, and you're seeing a stack trace or thread dump at the time, can you tell us if the request hanging shows being in this class at that moment?

/Charlie (server troubleshooter, carehart.org)

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 18, 2018 1
Adobe Community Professional ,
Dec 20, 2018

Copy link to clipboard

Copied

Hi Charlie,

I am running the Developer Edition.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 20, 2018 0
Adobe Community Professional ,
Dec 20, 2018

Copy link to clipboard

Copied

And Erik?

/Charlie (server troubleshooter, carehart.org)

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 20, 2018 0
New Here ,
Dec 20, 2018

Copy link to clipboard

Copied

I am running Standard in a production environment.  Client can't justify the expense of moving to Enterprise for this solution, so we have discussed going a different route, and creating a processing queue for files that are uploaded by the customers.  It was nice to be able to do it realtime, as I had programmed a "console" that showed them the progress (and any potential errors) of their file as I was parsing it.  But it is what it is, I suppose.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 20, 2018 0
Adobe Community Professional ,
Dec 20, 2018

Copy link to clipboard

Copied

The code you have shown may be run simultaneously on separate CFM pages in ColdFusion 11 Standard. With suitable changes, such as in variable names, of course. If that were not so, then websites built with ColdFusion 11 Standard would only handle one visitor at a time.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 20, 2018 0
Adobe Community Professional ,
Dec 30, 2018

Copy link to clipboard

Copied

Well, bkbk, again the request WILL be single-threaded while running the cfspreadsheet. No changing of var names will alter that.

It's a fundamental limitation of cf standard (that some tags are indeed single-threaded) which explains Erik's original problem posted here.

No, it's not that ALL "websites built with ColdFusion Standard would only handle one visitor at a time", but yes it's that any using such an EFR-limited tag /feature will run only when no other request is using that same tag.

So one either accepts that limitation (and the delay it can cause) , or they manage the work (like Erik is proposing), or they move to CF Enterprise (or Lucee, assuming it supports the desired tag/feature).

Or instead of switching their Cf engine over ENTIRELY, one  could implement such a alternate engine (even on the same machine as CF) and call it via cfhttp or REST, to have that engine run that tag/feature and return the result to CF.

(It may be possible to find a CF Enterprise license for an old CF version on the secondary market. Since it need not be open to access to other than such inter-process communications, the negative of using that old version beyond its end-of-life even would seemingly be reduced.)

Just offering ideas for someone feeling in a bind. 

/Charlie (server troubleshooter, carehart.org)

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 30, 2018 0
BKBK LATEST
Adobe Community Professional ,
Dec 30, 2018

Copy link to clipboard

Copied

Hi Charlie,

On re-reading my replies I realize they might give the impression that I disagree with your point. Sorry.

With "changing of var names" and "websites built with ColdFusion Standard would only handle one visitor at a time" I meant something else. Namely, the independence of the server's responses to separate CFM requests. Best-practice in encapsulation and client-server architecture require that the client know nothing about how the server handles in-coming requests. Further explanation follows in a moment.

But first, I agree with your point about ColdFusion's Enterprise Feature Router (EFR). In fact, given its importance. I shall repeat it here. Lest there be any confusion.

The EFR restricts ColdFusion Standard Edition to one, single thread when processing a function or tag in any of the following categories:

Asynchronous CFML Gateway

Data Management Service

Data Services Messaging

Directory Watcher Gateway

Enterprise IM Gateways (Sametime and XMPP)

Gateway Architecture

Interface For Publish/Subscribe Using Web Socket

JMS Gateway

Microsoft Office 2010 Excel Support

Online Presentation Generation

PDF Document Manipulation

PDF File Generation From Html

PDF Form Processing

Server-Side Printing

SMS Gateway

Structured Reporting

TCP/IP Socket Gateway

There is no such restriction in ColdFusion Enterprise Edition. There these functions and tags each enjoy multi-threaded access and processing.

Now, to the point I wish to make. The suggestion to use distinct names for variables is to avoid them being overwritten. In spite of the EFR restriction in ColdFusion Standard, multiple CFM requests may access an EFR-restricted function or tag, such as <cfspreadsheet>, simultaneously. ColdFusion Standard does not queue or block the requests. It queues or blocks just that part of the request that invokes the function or tag.

The remaining business of the CFM request is not queued or blocked. The moment ColdFusion finishes processing the EFR-restricted function or tag it hands processing back to the respective request threads. Typically, the remaining business may consume the most execution time in the request-thread. But that wont be because of the EFR restriction.

To conclude, in spite of the EFR restriction, ColdFusion Standard has been optimized to give the semblance of multi-threading.

Likes

Translate

Translate

Report

Report
Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Reply
Loading...
Dec 30, 2018 0