Skip to main content
erikpinpointdigital
Participating Frequently
December 13, 2018
Question

CFSpreadsheet: Simultaneous reads causing blocking situation

  • December 13, 2018
  • 2 replies
  • 2606 views

I have an application that allows users to upload multi-worksheet Excel files.  I use CFSpreadsheet to parse these files and import the data.  The processing can take anywhere from a minute to several minutes.  The first worksheet contains data that dictates which additional worksheets require processing during the import.

While doing some load testing, I noticed that if two users upload files, the second user has to wait for the first user process to finish before it works.  I have added CFLOG statements throughout my application, and done some simplified test scenarios (where I just read one worksheet and dump the result) that prove to me there is a blocking situation.  Has anyone else seen this situation?  Is there a way around this?

This topic has been closed for replies.

2 replies

BKBK
Community Expert
Community Expert
December 16, 2018

Hi erikpinpointdigital​, the execution of your 2 CFM files was not single-threaded. In fact, the threads that ran the pages were, respectively,  ajp-bio-8014-exec-104 and ajp-bio-8014-exec-87.

Having said this, I can still imagine a situation where blocking occurs, just as you observed. The situation is when the 2 CFM files share the same value for the src or query attributes in <cfspreadsheet>. There might then be contention when 2 threads attempt to access the same file or the same query object.

If so, then the solution is obvious. Make sure the values of the attributes src and query in the first CFM file differ from the respective values in the second CFM file.

erikpinpointdigital
Participating Frequently
December 17, 2018

BKBK​ thanks.  I had a hunch that might be an issue, and as such did indeed make sure it was reading two different files and using two different query attributes.  I have even gone so far as to try copying the queries from the spreadsheet read into new variables and deleting the original queries.  This really has me stumped.  In 20+ years of CF development, this is the first time I haven't been able to find a solution or work around for something that puzzles me.

BKBK
Community Expert
Community Expert
December 17, 2018

A sleep-time of 5 seconds is too short to enable us to draw any conclusions. Use 60 seconds, for example. I did.

I placed two copies of an Excel file on my desktop. I then opened page1.cfm in the browser and, within several seconds, page2.cfm.

page1.cfm

<cfsetting requesttimeout="66000">

<cfset thisExecution="Execution1">

<cfset thisFile="file1">

<cflog text="#thisExecution#. Start reading file: #thisFile#" >

<cfspreadsheet action="read" src="C:\Users\BKBK\Desktop\shipType.xlsx" query="shipData" sheet="2" headerrow="1" excludeheaderrow="true">

<cflog text="#thisExecution#. End reading file: #thisFile#" >

<cflog text="#thisExecution#. Start sleep" >

<cfset sleep(60000)>

<cflog text="#thisExecution#. End sleep / start dump in #thisExecution#" >

<cfdump var="#shipData#" >

<cflog text="#thisExecution#: end dump in #thisExecution#" >

page2.cfm

<cfsetting requesttimeout="6000">

<cfset thisExecution="Execution2">

<cfset thisFile="file2">

<cflog text="#thisExecution#. Start reading file: #thisFile#" >

<cfspreadsheet action="read" src="C:\Users\BKBK\Desktop\shipType2.xlsx" query="shipData2" sheet="2" headerrow="1" excludeheaderrow="true">

<cflog text="#thisExecution#. End reading file: #thisFile#" >

<cflog text="#thisExecution#: start dump in #thisExecution#" >

<cfdump var="#shipData2#" >

<cflog text="#thisExecution#: end dump in #thisExecution#" >

The result was as expected:

"Information","http-nio-8500-exec-2","12/17/18","21:37:16","FORUM","Execution1. Start reading file: file1"

"Information","http-nio-8500-exec-2","12/17/18","21:37:17","FORUM","Execution1. End reading file: file1"

"Information","http-nio-8500-exec-2","12/17/18","21:37:17","FORUM","Execution1. Start sleep"

"Information","http-nio-8500-exec-10","12/17/18","21:37:28","FORUM","Execution2. Start reading file: file2"

"Information","http-nio-8500-exec-10","12/17/18","21:37:28","FORUM","Execution2. End reading file: file2"

"Information","http-nio-8500-exec-10","12/17/18","21:37:28","FORUM","Execution2: start dump in Execution2"

"Information","http-nio-8500-exec-10","12/17/18","21:37:32","FORUM","Execution2: end dump in Execution2"

"Information","http-nio-8500-exec-2","12/17/18","21:38:17","FORUM","Execution1. End sleep / start dump in Execution1"

"Information","http-nio-8500-exec-2","12/17/18","21:38:20","FORUM","Execution1: end dump in Execution1"

EddieLotter
Inspiring
December 14, 2018

Can you post a simplified but complete test case that demonstrates the problem?

Also provide your versions of ColdFusion and JAVA.

erikpinpointdigital
Participating Frequently
December 14, 2018

Thanks for the response.

CF Version: 11,0,08,298512

Java Version: 1.8.0_25

I have also tried this on a different server running CF2016, and have witnessed the same behavior.

I have created two test files in the same directory.  I have two XLSX files that each have 2 worksheets (so I can read from the second worksheet specifically).  These files are ~300k in size and reside in the same directory as the CFM files.  The second worksheets each have ~2000 rows x 10 columns of data.

The code of both CFM files is very similar to this (only changing the values of the "thisExecution" and "thisFile" variables.

I tail the application.log as load both of these from the same computer.  (separate computers also present the same behavior).

The code:  (for some reason isn't showing up - I have tried editing a few times)

<CFSETTING requesttimeout="6000" >

<CFSET private.thisExecution = "Test1" >

<CFSET private.thisFile = "testfile1.xlsx" >

<CFOUTPUT>#private.thisExecution#: #private.thisFile#: #private.dataType#</CFOUTPUT>

<CFLOG text="#private.thisExecution#: Start: Reading File #private.thisFile#" >

<CFSPREADSHEET action="read" src="#expandpath('#private.thisFile#')#" sheet="2" query="getData" headerrow="1" excludeheaderrow="true" >

<CFLOG text="#private.thisExecution#: End: Reading File #private.thisFile#" >

<CFLOG text="#private.thisExecution#: Sleep: 5 seconds" >

<CFSCRIPT>sleep(5000);</CFSCRIPT>

<CFLOG text="#private.thisExecution#: Start: Dump getData" >

<CFDUMP var="#getData#" >

<CFLOG text="#private.thisExecution#: End Dump getData" >

Picture of the code: (since I can't get it to output in this forum)

The log output:

"Information","ajp-bio-8014-exec-104","12/14/18","13:04:47",,"Test1: Start: Reading File ABC SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-87","12/14/18","13:04:48",,"Test2: Start: Reading File ALSIDE SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-104","12/14/18","13:04:58",,"Test1: End: Reading File ABC SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-104","12/14/18","13:04:58",,"Test1: Sleep: 5 seconds"

"Information","ajp-bio-8014-exec-104","12/14/18","13:05:03",,"Test1: Start: Dump getData"

"Information","ajp-bio-8014-exec-104","12/14/18","13:05:45",,"Test1: End: Dump getData"

"Information","ajp-bio-8014-exec-87","12/14/18","13:05:52",,"Test2: End: Reading File ALSIDE SUPPLY ROOFING.xlsx"

"Information","ajp-bio-8014-exec-87","12/14/18","13:05:52",,"Test2: Sleep: 5 seconds"

"Information","ajp-bio-8014-exec-87","12/14/18","13:05:57",,"Test2: Start: Dump getData"

"Information","ajp-bio-8014-exec-87","12/14/18","13:06:13",,"Test2: End: Dump getData"

They both start reading the files at around the same time.  But script 2 waits until script 1 has finished working with the spreadsheet before it continues.

I am thinking CFSpreadsheet is not thread safe.  Is there a way to explicitly kill the process after it is done reading the worksheet in the file?  Since it creates a query object with the results of the read, I could duplicate the object, kill CFSpreadsheet and free it up for the next instance to use.

EddieLotter
Inspiring
December 14, 2018

I won't have time to test this in my environment this week, but as a quick test, comment out the cfspreadsheet line and give GetData an arbitrary value so that the script completes without error.

If the same sequence of events happens, then cfspreadhseet is unlikely to be the cause. If the behavior is different, then there is a possibility that there is a problem.

As another test, you could replace cfspreadsheet with a binary read of the file and see if it affects the order of processing.