Solr error when indexing a recordset of 500k records
Hello,
I am using CF 9.0.1 and am having a lot of trouble indexing large recordsets. In this instance, I am trying to index a collection of nearly 500,000 records, but at around the 400,000 mark Solr returns the following error:
org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing request can not be repeated.
Dec 21, 2011 12:56:16 AM org.apache.solr.core.SolrCore execute
INFO: [candsearch_e14] webapp=/solr path=/update params={waitSearcher=false&commit=true&wt=javabin&waitFlush=false&version=1} status=500 QTime=65187
Dec 21, 2011 12:56:16 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: [was class org.mortbay.jetty.EofException] null
This happens every time I try to index this particular collection. It never happens with smaller collections of just a few hundred or a few thousand records.
To try to improve performance, I have changed the JVM arguments in Solr.lax to the following:
lax.nl.java.option.additional=-server -Xms1024m -Xmx1024m -XX:MaxNewSize=256m -XX:MaxPermSize=256m -XX:+ScavengeBeforeFullGC -XX:-UseParallelGC -DSTOP.PORT=8079 -DSTOP.KEY=cfstop -Dsolr.solr.home=multicore
I have also changed the mergeFactor in solrconfig.xml to 25 to speed up the indexing process. However, I have tried several different values for both mergeFactor and the JVM settings, and none of them makes any difference to the error above.
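For reference, the mergeFactor change amounts to roughly the fragment below. This is only a sketch, not a copy of my file: in the solrconfig.xml layouts I am aware of for older Solr releases, the element sits inside indexDefaults, and mainIndex can carry its own copy, so the exact placement may differ between versions. Only the value 25 is taken from my setup.

```xml
<!-- Sketch of the relevant part of solrconfig.xml; surrounding elements are assumed -->
<indexDefaults>
  <!-- Raised to 25 to favour indexing speed over search speed -->
  <mergeFactor>25</mergeFactor>
</indexDefaults>
```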
Has anyone experienced this error before? Does anyone have any idea what it means? I am totally out of ideas and would appreciate any help.
