CF2023 Collection Issues
Well folks, while I'm waiting to resolve the HTML data field validation bug, I started working on collections.
In CF 4.51 I have 6 collections. 3 of them are:
Basically, these are populated with .txt files which were extracted from resumes in the following formats: .doc, .docx, .html, and .txt. There are roughly 20,000 files in each of them.
I am only able to index the folder ending in ho, and the resulting collection only contains 8,396 documents, whereas the folder contains 19,385 .txt files. I presume the CF2023 collection should contain 19,385 documents?
The other two folders bomb out when I try to index them resulting in zero documents being populated in the corresponding collections. I have checked the source folders and they only contain .txt files. Thus, I have no clue as to why the processing fails.
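One way to double-check the "only .txt files" assumption would be to audit a source folder outside of ColdFusion. The following is a minimal Python sketch, not anything CF2023 provides: the function name audit_folder and the three checks (empty files, files containing NUL bytes, files that are not valid UTF-8) are my own assumptions about what might make an indexer skip or choke on a file. I do not know that any of them is the actual cause here.

```python
from collections import Counter
from pathlib import Path


def audit_folder(folder):
    """Summarize files under `folder` that an indexer might skip or fail on.

    The checks are guesses (assumptions), not documented CF2023 behavior.
    """
    root = Path(folder)
    report = {
        "total": 0,                 # every regular file found
        "by_extension": Counter(),  # e.g. {".txt": 19385}
        "empty": [],                # zero-length or whitespace-only files
        "has_nul": [],              # NUL bytes suggest binary content
        "not_utf8": [],             # bytes that do not decode as UTF-8
    }
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        name = path.relative_to(root).as_posix()
        report["total"] += 1
        report["by_extension"][path.suffix.lower()] += 1

        data = path.read_bytes()
        if not data.strip():
            report["empty"].append(name)
        elif b"\x00" in data:
            report["has_nul"].append(name)
        else:
            try:
                data.decode("utf-8")
            except UnicodeDecodeError:
                report["not_utf8"].append(name)
    return report
```

Calling audit_folder on one of the failing folders and comparing report["total"] and the three lists against the document count in the collection might show whether the missing documents line up with a particular kind of file. Text extracted from .doc resumes could well be in a Windows code page rather than UTF-8, so a long not_utf8 list would be a hint worth following, not proof.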
It would require far fewer code changes if we were able to use our current collection scheme and simply add some additional code to accommodate .pdf resumes.
As an alternative approach, I tried creating a collection from a folder that contained 180 of the 60,040 raw resumes in all four of the aforementioned formats, and it did create 180 documents in the collection.
I am concerned that only 1 collection containing 60,040 documents would process too slowly. I would appreciate any opinions on this concern.
Thanks in advance for any help!
