Search Results
7 total results found
Lucene
Lucene – Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. http://lucene.apache.org/java/docs/
Compiling Lucene with GCJ
Background: Lucene is an open-source search library written in Java. GCJ is a Java-to-native-executable compiler. As shown in a Linux Journal article, using gcj is similar to using gcc. Versions Used: As of 1/26/08, the latest version of Lucene is 2.3.0. It needs Jav...
Free/open source information retrieval libraries
What are they and why use one? Information retrieval libraries are software libraries that provide functionality for searching within databases and the documents within them. In particular, this often refers to searching text documents for combinations of words an...
Lucene spans
Introduction: In Lucene, a span is a triple (i.e. 3-tuple) of <document number, start position, end position>. Document numbers start from zero. The positions are term positions, not character positions, and start from zero (i.e. the first token of a field...
Lucene term documents and term positions
Introduction. Term documents: For each term T, there are (doc frequency of the term) tuples of <doc ID, freq of T in this doc>. This information is stored in the .frq file and accessible via the TermDocs interface. Term positions: For each term T, there are (doc...
Pure negation query in lucene
In many information-retrieval systems, you can use queries like “term1 AND (NOT term2)” but you cannot use queries like “NOT term2” on their own (e.g. to get only documents that do not contain term2). At least, the system returns no results even if some documents...
Why are Lucene's stored fields so slow to access
Problem: I have a Lucene index that has some large fields (about 50 KB each) and some small fields (about 50 bytes each). I need to access (iterate over) one of the small fields for, say, 1/10 of the documents. For some reason, this operation is very slow, unreasonably...