This module contains objects that query the search index. These query objects are composable to form complex query trees.
See also whoosh.qparser which contains code for parsing user queries into query objects.
The following abstract base classes are subclassed to create the “real” query operations.
Abstract base class for all queries.
Note that this base class implements __or__, __and__, and __sub__ to allow slightly more convenient composition of query objects:
>>> Term("content", u"a") | Term("content", u"b")
Or([Term("content", u"a"), Term("content", u"b")])
>>> Term("content", u"a") & Term("content", u"b")
And([Term("content", u"a"), Term("content", u"b")])
>>> Term("content", u"a") - Term("content", u"b")
And([Term("content", u"a"), Not(Term("content", u"b"))])
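The operator shortcuts above can be pictured with a small stand-alone model. This is only a sketch of the operator-overloading pattern, not Whoosh's actual implementation: the classes below are simplified stand-ins that share names with the real ones, and the `Compound` helper is hypothetical.

```python
class Query:
    """Toy base class: each operator builds a tree node instead of a value."""

    def __or__(self, other):
        return Or([self, other])

    def __and__(self, other):
        return And([self, other])

    def __sub__(self, other):
        # "a - b" is modelled as "a AND NOT b"
        return And([self, Not(other)])


class Term(Query):
    def __init__(self, fieldname, text):
        self.fieldname, self.text = fieldname, text

    def __repr__(self):
        return "Term(%r, %r)" % (self.fieldname, self.text)


class Compound(Query):
    # Hypothetical helper shared by And and Or in this sketch
    def __init__(self, subqueries):
        self.subqueries = list(subqueries)

    def __repr__(self):
        return "%s(%r)" % (type(self).__name__, self.subqueries)


class And(Compound):
    pass


class Or(Compound):
    pass


class Not(Query):
    def __init__(self, query):
        self.query = query

    def __repr__(self):
        return "Not(%r)" % (self.query,)


a = Term("content", "a")
b = Term("content", "b")
print(a | b)  # Or([Term('content', 'a'), Term('content', 'b')])
print(a - b)  # And([Term('content', 'a'), Not(Term('content', 'b'))])
```

Because every operator returns another query object, the shortcuts can be chained freely, for example `(a | b) - Term("content", "c")`.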
Returns a set of all terms in this query tree.
This method simply operates on the query itself, without reference to an index (unlike existing_terms()), so it will not add terms that require an index to compute, such as Prefix and Wildcard.
>>> q = And([Term("content", u"render"), Term("path", u"/a/b")])
>>> q.all_terms()
set([("content", u"render"), ("path", u"/a/b")])
Parameters: phrases – Whether to add words found in Phrase queries.
Return type: set
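To illustrate why no index is needed here, the following sketch walks a query tree and collects only the terms that are written out literally. It is a toy model, not the library's code: queries are represented as plain tuples, and the node kinds `"term"`, `"prefix"`, `"and"` and `"or"` are assumptions made for this example.

```python
def all_terms(node, termset=None):
    """Collect (fieldname, text) pairs that appear literally in the tree."""
    if termset is None:
        termset = set()
    kind = node[0]
    if kind == "term":
        termset.add((node[1], node[2]))
    elif kind in ("and", "or"):
        for sub in node[1]:
            all_terms(sub, termset)
    # "prefix" nodes are skipped on purpose: expanding them needs an index.
    return termset


q = ("and", [("term", "content", "render"),
             ("term", "path", "/a/b"),
             ("prefix", "path", "/a/")])
terms = all_terms(q)
print(sorted(terms))  # [('content', 'render'), ('path', '/a/b')]
```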
Returns an iterator of (docnum, score) pairs matching this query. This is a convenience method for when you don’t need a QueryScorer (i.e. you don’t need to use skip_to).
>>> list(my_query.doc_scores(ixreader))
[(10, 0.73), (34, 2.54), (78, 0.05), (103, 12.84)]
Returns an iterator of docnums matching this query.
>>> searcher = my_index.searcher()
>>> list(my_query.docs(searcher))
[10, 34, 78, 103]
Returns a set of all terms in this query tree that exist in the index represented by the given ixreader.
This method references the IndexReader to expand Prefix and Wildcard queries, and only adds terms that actually exist in the index (unless reverse=True).
>>> ixreader = my_index.reader()
>>> q = And([Or([Term("content", u"render"),
... Term("content", u"rendering")]),
... Prefix("path", u"/a/")])
>>> q.existing_terms(ixreader)
set([("content", u"render"), ("path", u"/a/b"), ("path", u"/a/c")])
Return type: set
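The difference from all_terms() can be sketched with a toy model in which the "index" is just a set of (fieldname, text) pairs. Everything here is an assumption for illustration: the tuple-based query representation, the `index_terms` set, and the function itself are not part of Whoosh.

```python
def existing_terms(node, index_terms, termset=None):
    """Collect only terms present in index_terms, expanding prefixes."""
    if termset is None:
        termset = set()
    kind = node[0]
    if kind == "term":
        if (node[1], node[2]) in index_terms:
            termset.add((node[1], node[2]))
    elif kind == "prefix":
        # Expansion is only possible because the index terms are available.
        for fieldname, text in index_terms:
            if fieldname == node[1] and text.startswith(node[2]):
                termset.add((fieldname, text))
    elif kind in ("and", "or"):
        for sub in node[1]:
            existing_terms(sub, index_terms, termset)
    return termset


# Hypothetical index contents
index_terms = {("content", "render"), ("path", "/a/b"),
               ("path", "/a/c"), ("path", "/b/a")}
q = ("and", [("or", [("term", "content", "render"),
                     ("term", "content", "rendering")]),
             ("prefix", "path", "/a/")])
found = existing_terms(q, index_terms)
print(sorted(found))
# [('content', 'render'), ('path', '/a/b'), ('path', '/a/c')]
```

Note that `("content", "rendering")` is dropped because it does not occur in the toy index.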
Returns a recursively “normalized” form of this query. The normalized form removes redundancy and empty queries. This is called automatically on query trees created by the query parser, but you may want to call it yourself if you’re writing your own parser or building your own queries.
>>> q = And([And([Term("f", u"a"),
... Term("f", u"b")]),
... Term("f", u"c"), Or([])])
>>> q.normalize()
And([Term("f", u"a"), Term("f", u"b"), Term("f", u"c")])
Note that this returns a new, normalized query. It does not modify the original query “in place”.
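As a rough picture of what normalization involves, the sketch below flattens nested compound queries of the same kind and drops empty ones, returning a new tree. It is a toy model on tuple-based queries, not the library's algorithm; collapsing a compound with a single remaining child is an extra assumption of this sketch.

```python
def normalize(node):
    """Return a simplified copy of a tuple-based query tree (or None if empty)."""
    kind = node[0]
    if kind not in ("and", "or"):
        return node  # leaf queries are returned unchanged
    flat = []
    for sub in node[1]:
        sub = normalize(sub)
        if sub is None:
            continue              # drop empty subqueries
        if sub[0] == kind:
            flat.extend(sub[1])   # merge And-in-And / Or-in-Or
        else:
            flat.append(sub)
    if not flat:
        return None
    if len(flat) == 1:
        return flat[0]            # assumption: unwrap single-child compounds
    return (kind, flat)


q = ("and", [("and", [("term", "f", "a"), ("term", "f", "b")]),
             ("term", "f", "c"),
             ("or", [])])
normalized = normalize(q)
print(normalized)
# ('and', [('term', 'f', 'a'), ('term', 'f', 'b'), ('term', 'f', 'c')])
```

The original `q` still contains the nested `and` and the empty `or`, mirroring the "new query, not in place" behaviour described above.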
Returns a copy of this query with oldtext replaced by newtext (if oldtext was anywhere in this query).
Note that this returns a new query with the given text replaced. It does not modify the original query “in place”.
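A minimal sketch of this copy-on-replace behaviour, again on a toy tuple-based query tree rather than real Whoosh objects (the function and its two-argument signature are assumptions for illustration):

```python
def replace(node, oldtext, newtext):
    """Return a copy of the tree with oldtext swapped for newtext in term nodes."""
    kind = node[0]
    if kind == "term":
        if node[2] == oldtext:
            return ("term", node[1], newtext)
        return node
    if kind in ("and", "or"):
        # Build a new list so the original tree is left untouched.
        return (kind, [replace(sub, oldtext, newtext) for sub in node[1]])
    return node


q = ("or", [("term", "content", "render"), ("term", "content", "shade")])
q2 = replace(q, "render", "draw")
print(q2)  # ('or', [('term', 'content', 'draw'), ('term', 'content', 'shade')])
print(q)   # ('or', [('term', 'content', 'render'), ('term', 'content', 'shade')])
```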
Returns a QueryScorer object you can use to retrieve documents and scores matching this query.
Return type: whoosh.postings.QueryScorer
Matches documents containing the given term (fieldname+text pair).
>>> Term("content", u"render")
Matches documents containing words similar to the given term.
Matches documents containing a given phrase.
Matches documents that match ALL of the subqueries.
>>> And([Term("content", u"render"),
... Term("content", u"shade"),
... Not(Term("content", u"texture"))])
>>> # You can also do this
>>> Term("content", u"render") & Term("content", u"shade")
Matches documents that match ANY of the subqueries.
>>> Or([Term("content", u"render"),
... And([Term("content", u"shade"), Term("content", u"texture")]),
... Not(Term("content", u"network"))])
>>> # You can also do this
>>> Term("content", u"render") | Term("content", u"shade")
Excludes any documents that match the subquery.
>>> # Match documents that contain 'render' but not 'texture'
>>> And([Term("content", u"render"),
... Not(Term("content", u"texture"))])
>>> # You can also do this
>>> Term("content", u"render") - Term("content", u"texture")
Matches documents that contain any terms that start with the given text.
>>> # Match documents containing words starting with 'comp'
>>> Prefix("content", u"comp")
Matches documents that contain any terms that match a wildcard expression.
>>> Wildcard("content", u"in*f?x")
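To give a feel for how a wildcard expression selects terms, the sketch below filters a list of terms with the standard library's `fnmatch` module. It assumes the conventional glob meaning of `*` (any run of characters) and `?` (exactly one character); the term list is made up, and `fnmatch` additionally treats `[...]` specially, which may not match the library's behaviour.

```python
from fnmatch import fnmatchcase

# Hypothetical terms indexed in one field
field_terms = ["index", "infix", "inbuffox", "influx", "prefix"]


def expand_wildcard(pattern, terms):
    """Return the terms that match a glob-style pattern, case-sensitively."""
    return [t for t in terms if fnmatchcase(t, pattern)]


matches = expand_wildcard("in*f?x", field_terms)
print(matches)  # ['infix', 'inbuffox']
```

"influx" is rejected because the pattern requires an "f", then exactly one character, then a final "x".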
Matches documents containing any terms in a given range.
>>> # Match documents where the indexed "id" field is greater than or equal
>>> # to 'apple' and less than or equal to 'pear'.
>>> TermRange("id", u"apple", u"pear")
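The inclusive range in the example can be illustrated on a sorted list of terms using `bisect`. This is a sketch of the idea only; the term list is invented and nothing here reflects how the library stores or scans its terms.

```python
from bisect import bisect_left, bisect_right


def terms_in_range(sorted_terms, start, end):
    """Return terms t with start <= t <= end from an already sorted list."""
    lo = bisect_left(sorted_terms, start)   # first position with term >= start
    hi = bisect_right(sorted_terms, end)    # first position with term > end
    return sorted_terms[lo:hi]


# Hypothetical values of an "id" field
ids = sorted(["ant", "apple", "banana", "kiwi", "pear", "plum"])
selected = terms_in_range(ids, "apple", "pear")
print(selected)  # ['apple', 'banana', 'kiwi', 'pear']
```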
These binary operators are not generally created by the query parser in whoosh.qparser. Unless you specifically need these operations, you should use the normal query classes instead.
Binary query returns results from the first query that also appear in the second query, but only uses the scores from the first query. This lets you filter results without affecting scores.
Binary query takes results from the first query. If and only if the same document also appears in the results from the second query, the score from the second query will be added to the score from the first query.
Binary boolean query of the form ‘a ANDNOT b’, where documents that match b are removed from the matches for a.
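The three binary operators described in this section differ only in how they combine two result sets. The sketch below models each result set as a dict mapping docnum to score; the function names and the sample scores are illustrative assumptions, not the library's classes or output.

```python
def require(first, second):
    """Keep docs from first that also appear in second; scores from first only."""
    return {doc: score for doc, score in first.items() if doc in second}


def and_maybe(first, second):
    """Keep all docs from first; add second's score where the doc also matches."""
    return {doc: score + second.get(doc, 0.0) for doc, score in first.items()}


def and_not(first, second):
    """Keep docs from first that do not appear in second."""
    return {doc: score for doc, score in first.items() if doc not in second}


# Hypothetical results: docnum -> score
first = {10: 1.0, 34: 2.5, 78: 0.5}
second = {34: 4.0, 103: 3.0}

print(require(first, second))    # {34: 2.5}
print(and_maybe(first, second))  # {10: 1.0, 34: 6.5, 78: 0.5}
print(and_not(first, second))    # {10: 1.0, 78: 0.5}
```

In all three cases document 103 never appears, because it only matches the second query.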