Wednesday 01 October 2008 1:36:23 am
Hi there, I'm trying to index the content of files (txt, pdf) using eZFind + Solr and am having trouble with it.
eZPublish 4.0.1
eZFind 1.0.0 beta2
Linux Debian 4
First of all, I installed the eZPublish and eZFind packages as recommended. When I create a new media/file and upload a file (txt or pdf), indexing works perfectly and I can run searches (I can also find my words in the database table ezkeyword).
Because I found the basic search a bit "light", I decided to test with Solr, and now everything goes wrong. I'm pretty sure it is installed correctly, because I can search for article content or file summaries in the Solr admin search. But I get nothing for the content of the uploaded file itself.
What am I doing wrong? Is there a trick?
Having a look at the Solr guide (http://wiki.apache.org/solr/), I found this: "Solr has an extensible DocumentHandler architecture that allows you to feed it XML and CSV documents. There is now a patch file available as part of SOLR-284 that adds support for parsing rich binary formats." Do we have to patch the provided Solr? Would anyone be so kind as to help?
Thanks a lot.
Laurence