Lucene term documents and term positions

Introduction

Term documents

For each term T, the index stores one <doc ID, frequency of T in that doc> tuple per document that contains T, so the number of tuples equals the document frequency of T.

This information is stored in the .frq file and accessible via the TermDocs interface.

Term positions

For each term T, there is likewise one tuple per document that contains T: <doc ID, frequency of T in that doc, positions of T in that doc>. The number of positions in each tuple equals the term frequency of T in that document.

The positions are stored in the .prx file and are accessible via the TermPositions interface, which extends TermDocs.
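
To make the two tuple shapes concrete, here is a minimal in-memory sketch in plain Java. It is a toy model only: the class ToyPostings and all of its methods are hypothetical names invented for illustration, it tokenizes on whitespace, and it says nothing about how Lucene actually lays out its .frq and .prx files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the tuples described above; NOT Lucene's on-disk format.
class ToyPostings {
    // term -> (doc ID -> positions of the term in that doc), doc IDs kept sorted
    private final Map<String, TreeMap<Integer, List<Integer>>> index = new TreeMap<>();

    // Split on whitespace and record the position of every token.
    void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase().trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            if (tokens[pos].isEmpty()) continue; // empty input yields one empty token
            index.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                 .computeIfAbsent(docId, d -> new ArrayList<>())
                 .add(pos);
        }
    }

    // Doc frequency: number of documents that contain the term.
    int docFreq(String term) {
        return postings(term).size();
    }

    // Term frequency: number of occurrences of the term in one document.
    int freq(String term, int docId) {
        return positions(term, docId).size();
    }

    // Positions of the term in one document (empty list if absent).
    List<Integer> positions(String term, int docId) {
        List<Integer> p = postings(term).get(docId);
        return p == null ? new ArrayList<Integer>() : p;
    }

    // Doc IDs that contain the term, in ascending order.
    List<Integer> docs(String term) {
        return new ArrayList<>(postings(term).keySet());
    }

    private TreeMap<Integer, List<Integer>> postings(String term) {
        TreeMap<Integer, List<Integer>> p = index.get(term);
        return p == null ? new TreeMap<Integer, List<Integer>>() : p;
    }

    public static void main(String[] args) {
        ToyPostings idx = new ToyPostings();
        idx.addDocument(0, "southern california is in southern usa");
        idx.addDocument(1, "california is southern");
        // One <doc ID, freq, positions> tuple per document containing the term.
        for (int doc : idx.docs("southern")) {
            System.out.println("doc=" + doc
                    + " freq=" + idx.freq("southern", doc)
                    + " positions=" + idx.positions("southern", doc));
        }
    }
}
```

In this model the "term documents" view is just the doc ID and the size of the position list, while the "term positions" view also exposes the list itself.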

Using the information

Via Query

If you use the Query classes, they read and use term documents and term positions for you, so you do not have to deal with them directly. Non-span queries other than phrase queries use term documents only. Phrase queries use term documents plus term positions to verify that the terms in a matching document actually appear, say, right next to each other and in order. This makes phrase queries slower than term queries; for example, searching for the phrase “southern california” should be slower than searching for the two required words “southern” and “california”.
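
The extra work a phrase query does can be sketched in plain Java. This is an illustration of the idea only, not Lucene's query code: ToyPhraseMatch and its methods are hypothetical, and each term's postings are modelled as a map from doc ID to a sorted array of positions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy comparison of a required-words match and a two-term phrase match.
class ToyPhraseMatch {
    // Required-words match: only the doc IDs of each term are consulted.
    static List<Integer> bothTerms(Map<Integer, int[]> first, Map<Integer, int[]> second) {
        List<Integer> hits = new ArrayList<>();
        for (Integer doc : first.keySet()) {
            if (second.containsKey(doc)) hits.add(doc);
        }
        return hits;
    }

    // Phrase match: same candidates, plus a position check per candidate document.
    static List<Integer> phrase(Map<Integer, int[]> first, Map<Integer, int[]> second) {
        List<Integer> hits = new ArrayList<>();
        for (Integer doc : bothTerms(first, second)) {
            if (adjacent(first.get(doc), second.get(doc))) hits.add(doc);
        }
        return hits;
    }

    // True if some position p in a has p + 1 in b. Both arrays must be sorted ascending.
    static boolean adjacent(int[] a, int[] b) {
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            int want = a[i] + 1;
            if (b[j] == want) return true;
            if (b[j] < want) j++; else i++;
        }
        return false;
    }

    public static void main(String[] args) {
        // doc 0: "southern california is in southern usa"
        // doc 1: "california is southern"
        Map<Integer, int[]> southern = new TreeMap<>();
        southern.put(0, new int[] {0, 4});
        southern.put(1, new int[] {2});
        Map<Integer, int[]> california = new TreeMap<>();
        california.put(0, new int[] {1});
        california.put(1, new int[] {0});

        System.out.println("both words: " + bothTerms(southern, california));
        System.out.println("phrase:     " + phrase(southern, california));
    }
}
```

Both documents contain both words, but only doc 0 has “california” directly after “southern”, which is the check that requires reading positions.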

The spans API sits at a lower level than queries, but the Spans class is still higher-level than using TermPositions directly.

Via the interfaces

Note that the document number (returned by doc()) and the frequency (returned by freq()) of a TermDocs object are undefined until next() has been called for the first time. The API reference does not make this clear, but I found it out by experiment.
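
The safe pattern is therefore to call next() in the loop condition and read doc() and freq() only inside the loop body. The sketch below shows that pattern against a hypothetical stand-in class, ToyTermDocs, which only borrows the method names next(), doc() and freq(); it is not Lucene code. Throwing an exception when the cursor is unpositioned is a choice made for this sketch, and a real TermDocs may simply return a meaningless value instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mimics the next()/doc()/freq() cursor contract.
class ToyTermDocs {
    private final int[] docs;
    private final int[] freqs;
    private int cursor = -1; // -1 means next() has not been called yet

    ToyTermDocs(int[] docs, int[] freqs) {
        if (docs.length != freqs.length) {
            throw new IllegalArgumentException("docs and freqs must have the same length");
        }
        this.docs = docs;
        this.freqs = freqs;
    }

    // Advance to the next tuple; returns false once the tuples are exhausted.
    boolean next() {
        if (cursor < docs.length) cursor++;
        return cursor < docs.length;
    }

    int doc() {
        checkPositioned();
        return docs[cursor];
    }

    int freq() {
        checkPositioned();
        return freqs[cursor];
    }

    // This sketch fails loudly; an undefined value is the other possible outcome.
    private void checkPositioned() {
        if (cursor < 0 || cursor >= docs.length) {
            throw new IllegalStateException("cursor is not positioned on a tuple");
        }
    }

    // The idiom: next() in the loop condition, doc()/freq() only in the body.
    static List<String> readAll(ToyTermDocs termDocs) {
        List<String> out = new ArrayList<>();
        while (termDocs.next()) {
            out.add(termDocs.doc() + ":" + termDocs.freq());
        }
        return out;
    }

    public static void main(String[] args) {
        ToyTermDocs termDocs = new ToyTermDocs(new int[] {3, 7}, new int[] {2, 1});
        for (String tuple : readAll(termDocs)) {
            System.out.println(tuple);
        }
    }
}
```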
