Description
The case of text is preserved in the Lucene.Net document's stored field because the StandardAnalyzer only lowercases the text during the indexing (analysis) process, not the original value that is stored.
Here is a breakdown of how Lucene and the StandardAnalyzer handle text:
Indexing (Analysis): When you use the StandardAnalyzer for indexing a field (e.g., a TextField), it processes the text through a series of steps (tokenization and filters) which include a LowerCaseFilter. The terms that are added to the search index are, therefore, all lowercase, making searches case-insensitive.
Storage: The StandardAnalyzer and the analysis process in general do not modify the original raw string value of the field if it is configured to be stored (using Field.Store.YES or equivalent options). The original value is stored in the index in its exact, original form for retrieval and display purposes.
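Searching: When your query text is run through the same StandardAnalyzer (for example, by a QueryParser, which is standard practice), the query terms are also lowercased before the index is searched, so they match the lowercased terms in the index. A query built directly from a term, such as a TermQuery, bypasses the analyzer, so its text is not lowercased automatically.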
In essence, Lucene separates the indexed data (used for searching) from the stored data (used for displaying the original content), allowing the original casing to be preserved in the document while still enabling case-insensitive search.
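The split between indexed terms and stored values can be illustrated with a toy model. This is a minimal sketch in plain Python and does not use Lucene.Net's API: `analyze`, `ToyIndex`, `postings`, and `stored` are hypothetical names, and the regex tokenizer is only a rough stand-in for what StandardAnalyzer does.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: split into word tokens, then lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


class ToyIndex:
    """Toy model (not Lucene): keeps search terms and stored values apart."""

    def __init__(self):
        self.postings = {}  # lowercased term -> set of document ids
        self.stored = {}    # document id -> original, unmodified text

    def add(self, doc_id, text):
        # The stored value keeps its original casing.
        self.stored[doc_id] = text
        # Only the analyzed (lowercased) terms go into the search structure.
        for term in analyze(text):
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        # The query goes through the same analysis as the indexed text.
        terms = analyze(query)
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.stored[doc_id] for doc_id in sorted(ids)]


index = ToyIndex()
index.add(1, "Quick Brown Fox")

print(sorted(index.postings))  # ['brown', 'fox', 'quick']
print(index.search("QUICK"))   # ['Quick Brown Fox']
```

The search structure only ever sees lowercase terms, while the value returned to the caller is the original string.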
If you were to use a StringField instead of a TextField with an Analyzer, the field would be indexed "as is" (without analysis), and searches would then be case-sensitive unless you manually lowercased all input.
So, we need to convert query parameters to lowercase before they are matched against the index.
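The effect of skipping analysis can be sketched the same way. The example below is a toy model in plain Python, not Lucene.Net code: `build_exact_index` and `lookup` are hypothetical helpers that treat each whole value as a single term, roughly the way an un-analyzed field behaves.

```python
def build_exact_index(values):
    """Map each value, unchanged, to the ids of the documents that hold it."""
    index = {}
    for doc_id, value in enumerate(values):
        index.setdefault(value, set()).add(doc_id)
    return index


def lookup(index, term):
    """Exact-match lookup: no analysis is applied to the term."""
    return sorted(index.get(term, set()))


raw_values = ["Alice", "BOB"]

# Indexed "as is": lookups are case-sensitive.
as_is = build_exact_index(raw_values)
print(lookup(as_is, "alice"))  # []
print(lookup(as_is, "Alice"))  # [0]

# Lowercasing both the indexed values and the query parameter
# makes the same lookup case-insensitive.
lowered = build_exact_index(v.lower() for v in raw_values)
print(lookup(lowered, "ALICE".lower()))  # [0]
```

The key point is that the normalization has to be applied on both sides: lowercasing only the query, or only the indexed values, still produces misses.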