
How to Retrieve All Files from S3 Bucket When There Are More Than 1000 Files Using ColdFusion?

Community Beginner, Apr 10, 2025

Hello,

I am currently working on an S3 integration using ColdFusion, and I'm facing an issue when trying to retrieve all the files from an S3 bucket. I am using the listAll() method, but it seems that it only returns a maximum of 1000 files, and I need to handle cases where there are more than 1000 files in the bucket.

I understand that Amazon S3 uses pagination for listing objects, and I need to handle the NextContinuationToken to paginate through the results, but I'm unsure how to implement this correctly in ColdFusion.

Has anyone experienced this issue or can offer advice on how to paginate through the list of objects in S3 using ColdFusion to retrieve more than 1000 files?

Here is the code I'm currently using:
<cfset allObjects = bucket.listAll()>

Can someone please guide me on how to modify this to handle pagination when there are more than 1000 objects in the bucket?

Thanks in advance!

Community Expert, Apr 11, 2025

Let's assume you start off with an array. If your result is not an array, convert it into one.

You could then use the ColdFusion function arraySlice for pagination. The following demo is a fully worked-out example.

<cfset objectsArray=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q']>

<cfset numberOfObjects = arrayLen(objectsArray)>
<cfset numberOfObjectsPerPage = 4>
<cfset numberOfPages = ceiling(numberOfObjects/numberOfObjectsPerPage)>

<p>
	<cfoutput>
		List: #arrayToList(objectsArray)# <br>
		Number of objects: #numberOfObjects# <br>
		Number of objects per page: #numberOfObjectsPerPage# <br>
		Number of pages: #numberOfPages# <br>
	</cfoutput>
</p>

<cfset page = arrayNew(1)>

<cfif numberOfObjects GT 0>
	<cfloop index="np" from="1" to="#numberOfPages#">
		<cfif np LT numberOfPages>
			<!--- Pages that are 'full'. Each contains the max number of objects allowed per page --->
			<cfset page[np] = arraySlice(objectsArray,1+(np-1)*numberOfObjectsPerPage,numberOfObjectsPerPage)>
		<cfelse>
			<!--- Relevant when the last page has fewer than the max number of objects allowed per page --->
			<cfset page[np] = arraySlice(objectsArray,1+(np-1)*numberOfObjectsPerPage,numberOfObjects-(np-1)*numberOfObjectsPerPage)>
		</cfif>
	</cfloop>
	
	<cfloop index="np" from="1" to="#numberOfPages#">
		<p>
			<cfdump var="#page[np]#" label="Page[#np#]">
		</p>
	</cfloop>
</cfif>

Just plug in your own values for objectsArray and numberOfObjectsPerPage in the above code, and you're done.
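
As a compact variation on the demo above, the two branches can be folded into one by computing each slice length with min(). This is only a sketch: the names perPage, pages, startIndex and sliceLength are illustrative and do not come from the original demo.

```cfml
<cfscript>
    objectsArray = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q'];
    perPage = 4;   // illustrative page size
    pages = [];    // each element will be one page (a sub-array)

    // startIndex is the 1-based position of the first element of each page
    for (startIndex = 1; startIndex <= arrayLen(objectsArray); startIndex += perPage) {
        // The last page may be short, so never ask arraySlice for more than remains
        sliceLength = min(perPage, arrayLen(objectsArray) - startIndex + 1);
        arrayAppend(pages, arraySlice(objectsArray, startIndex, sliceLength));
    }

    writeDump(var=pages, label="pages");
</cfscript>
```

With the 17-element array and a page size of 4, this should produce five pages, the last one holding only 'q'.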

Community Beginner, Apr 11, 2025

Here is a somewhat crude example. It collects each batch returned by listAll() into a single array. The key is options.marker, which records the last object returned so that the next call knows where to start.


<cfscript>
    storage_bucket = "Your Bucket Name";
    // Optional: restrict the listing to keys under this prefix
    prefix = {
        "prefix": "PREFIX/NAME/IF/YOU/ONLY/NEEDED/Sub/Objects/"
    };

    allObjects = []; // Array to store all objects
    marker = ""; // Key of the last object seen so far
    maxObjects = 1000; // AWS S3 default limit per listing request

    do {
        // Prepare the options for the listAll method
        options = {
            "prefix": prefix.prefix
        };

        // Add the marker only if it's not blank
        if (len(marker) > 0) {
            options.marker = marker;
        }

        // Fetch the next batch; s3Obj is assumed to be an S3 service object created earlier
        bucketList = s3Obj.bucket(storage_bucket, false).listAll(options);

        bucketList = bucketList['response'];

        // Append the objects of the current batch to allObjects.
        // The third argument (true) merges the elements instead of nesting the batch as one element.
        arrayAppend(allObjects, bucketList, true);

        // Update the marker to the key of the last object in the current batch
        if (arrayLen(bucketList) > 0) {
            marker = bucketList[arrayLen(bucketList)].key;
        }

    } while (arrayLen(bucketList) == maxObjects); // A full batch of 1000 means there may be more

    // Output all objects
    writeDump(var=allObjects, abort=true);
</cfscript>
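
To try the marker loop without touching a real bucket, the sketch below replaces the S3 call with a stub. It is only an illustration under stated assumptions: fakeListPage, fakeKeys and pageSize are hypothetical names and are not part of the ColdFusion S3 API. The stub simply returns up to pageSize keys that sort after the marker, which is the behaviour the loop relies on.

```cfml
<cfscript>
    // Hypothetical stand-in for a paged listing call (not a ColdFusion S3 function).
    // Returns up to pageSize entries whose key sorts after the given marker.
    function fakeListPage(required array fakeKeys, required string marker, required numeric pageSize) {
        var batch = [];
        for (var k in arguments.fakeKeys) {
            if (compare(k, arguments.marker) > 0 && arrayLen(batch) < arguments.pageSize) {
                arrayAppend(batch, { "key": k });
            }
        }
        return batch;
    }

    fakeKeys = ["a.txt", "b.txt", "c.txt", "d.txt", "e.txt"]; // assumed to be in sorted order
    pageSize = 2;     // small page size so the loop runs several times
    marker = "";      // empty marker means "start from the beginning"
    allObjects = [];

    do {
        batch = fakeListPage(fakeKeys, marker, pageSize);

        // Passing true as the third argument merges the batch's elements into allObjects
        arrayAppend(allObjects, batch, true);

        // Remember the last key so the next call resumes after it
        if (arrayLen(batch) > 0) {
            marker = batch[arrayLen(batch)].key;
        }
    } while (arrayLen(batch) == pageSize); // a short batch means nothing is left

    writeDump(var=allObjects, label="allObjects");
</cfscript>
```

With five keys and a page size of 2, the loop should run three times and leave five entries in allObjects. If the total were an exact multiple of the page size, there would be one extra call that returns an empty batch and ends the loop.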

 

Community Expert, Apr 12, 2025

<!--- Function to paginate an array --->
<cffunction name="paginateArray" returntype="struct">
	
	<cfargument name="inputArray" type="array" required="yes" hint="Array that consists of a list to be paginated. That is, an array you want to be split up into sub-arrays.">	
	<cfargument name="inputNumberPerPage" type="numeric" required="yes" hint="The maximum number of objects per page. That is, the maximum number of elements per sub-array.">

	<!--- A page = a sub-array of the input array --->
	<cfset var page = arrayNew(1)>
	
	<!--- The data to be returned will be assembled in a struct --->
	<cfset var returnData = structNew()>

	<cfset var numberOfObjects = arrayLen(arguments.inputArray)>
	<cfset var numberOfObjectsPerPage = arguments.inputNumberPerPage>
	<cfset var numberOfPages = ceiling(numberOfObjects/numberOfObjectsPerPage)>
	
	<!--- Initialize page counter --->
	<cfset var np = 0>
	
	<cfif numberOfObjects GT 0>
		<cfloop index="np" from="1" to="#numberOfPages#">
			<cfif np LT numberOfPages>
				<!--- Pages that are 'full'. That is, pages each of which contains the max number of objects per page --->
				<cfset page[np] = arraySlice(arguments.inputArray,1+(np-1)*numberOfObjectsPerPage,numberOfObjectsPerPage)>
			<cfelse>
				<!--- This line handles the last page separately. That is because the last page may contain fewer than the max number of objects allowed per page --->
				<cfset page[np] = arraySlice(arguments.inputArray,1+(np-1)*numberOfObjectsPerPage,numberOfObjects-(np-1)*numberOfObjectsPerPage)>
			</cfif>
		</cfloop>		
	</cfif>
	
	<!--- Assemble return data --->
	<cfset returnData = { "numberOfObjects":numberOfObjects,
						  "numberOfObjectsPerPage":numberOfObjectsPerPage,
						  "numberOfPages":numberOfPages,
						  "page":page }>
						  
	<cfreturn returnData>
	
</cffunction>

 

<!--- Test run --->
<cfset testArray=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q']>
<cfset numberOfObjectsPerPage=4>

<cfset paginationResult = paginateArray(testArray,numberOfObjectsPerPage)>
<p>
	<cfoutput>
		Number of objects: #paginationResult.numberOfObjects# <br>
		Number of objects per page: #paginationResult.numberOfObjectsPerPage# <br>
		Number of pages: #paginationResult.numberOfPages# <br>
	</cfoutput>
</p>
	
<cfloop index="n" from="1" to="#paginationResult.numberOfPages#">
	<p>
		<cfdump var="#paginationResult.page[n]#" label="paginationResult.page[#n#]">
	</p>
</cfloop>
		