Analyzers
Analyzers convert text into searchable tokens through a three-stage pipeline: character filters, a tokenizer, and token filters.
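The three stages can be sketched in plain Python. This is an illustration of the pipeline idea only, not this library's API: `analyze`, `strip_html`, `word_tokenizer`, and `lowercase` are hypothetical names introduced for the example.

```python
import re
from typing import Callable, List

# Hypothetical type aliases for the three pipeline stages.
CharFilter = Callable[[str], str]                # rewrites the raw text
Tokenizer = Callable[[str], List[str]]           # splits text into tokens
TokenFilter = Callable[[List[str]], List[str]]   # rewrites the token list


def analyze(text: str,
            char_filters: List[CharFilter],
            tokenizer: Tokenizer,
            token_filters: List[TokenFilter]) -> List[str]:
    """Run text through character filters, then a tokenizer, then token filters."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


def strip_html(text: str) -> str:
    """Character filter: replace anything that looks like a tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def word_tokenizer(text: str) -> List[str]:
    """Tokenizer: runs of word characters."""
    return re.findall(r"\w+", text)


def lowercase(tokens: List[str]) -> List[str]:
    """Token filter: lowercase every token."""
    return [t.lower() for t in tokens]


tokens = analyze("<b>Hello</b> World", [strip_html], word_tokenizer, [lowercase])
print(tokens)  # ['hello', 'world']
```

The order matters: character filters see the raw string, while token filters only see what the tokenizer produced.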
Available Analyzers
Analyzer
Base class for building custom analyzers from a chosen set of character filters, a tokenizer, and token filters.
StandardAnalyzer
English text analyzer with word tokenization, lowercase normalization, and optional stop word filtering.
Best for: English text, Western languages, general text search
SimpleAnalyzer
Letter-based tokenization with automatic lowercasing.
Best for: Simple text tokenization, case-insensitive search without stop words
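As a rough sketch of what letter-based tokenization with lowercasing could look like, the function below splits at every non-letter character. That reading of "letter-based" is an assumption, and `simple_analyze` is a hypothetical name, not part of this library.

```python
import re
from typing import List

# [^\W\d_] matches any Unicode letter: a word character that is
# neither a digit nor an underscore.
_LETTERS = re.compile(r"[^\W\d_]+")


def simple_analyze(text: str) -> List[str]:
    """Keep maximal runs of letters and lowercase them."""
    return [run.lower() for run in _LETTERS.findall(text)]


print(simple_analyze("Wi-Fi 6E Router"))  # ['wi', 'fi', 'e', 'router']
```

Under this assumption digits and punctuation never reach the index, which is why a term like "6E" is reduced to "e".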
WhitespaceAnalyzer
Splits text on whitespace characters.
Best for: Preserving punctuation and special characters, pre-tokenized input
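A minimal sketch of whitespace-only splitting, to show how punctuation survives inside tokens. `whitespace_analyze` is a hypothetical stand-in; the sketch also preserves case, which the description above does not confirm either way.

```python
from typing import List


def whitespace_analyze(text: str) -> List[str]:
    """Split on runs of whitespace; leave every other character untouched."""
    return text.split()


print(whitespace_analyze("C++ and  node.js, v2.0!"))
# ['C++', 'and', 'node.js,', 'v2.0!']
```

Note that the trailing comma and exclamation mark stay attached to their tokens, so queries have to be analyzed the same way to match.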
KeywordAnalyzer
Treats the entire input as a single token for exact matching.
Best for: IDs and identifiers, categories and tags, exact string matching
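To illustrate why a single-token analyzer gives exact matching, here is a small sketch with a toy inverted index. `keyword_analyze` and `build_index` are hypothetical helpers written for this example only.

```python
from collections import defaultdict
from typing import Dict, List, Set


def keyword_analyze(text: str) -> List[str]:
    """Emit the whole input, unchanged, as one token."""
    return [text]


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Toy inverted index: token -> set of document ids."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, value in docs.items():
        for token in keyword_analyze(value):
            index[token].add(doc_id)
    return dict(index)


index = build_index({1: "SKU-1234 A", 2: "SKU-1234 B"})
print(index.get("SKU-1234 A"))  # {1}
print(index.get("SKU-1234"))    # None: a partial value is not a token
```

Because the value is never split, only a query for the complete string finds the document.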
StopAnalyzer
Letter-based tokenization with lowercasing and stop word filtering.
Best for: English text with stop word removal, reducing index size
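The effect of stop word filtering can be sketched as follows. The stop list here is a tiny illustrative set, not the list this library ships, and `stop_analyze` is a hypothetical name.

```python
import re
from typing import FrozenSet, List

# Illustrative stop list only; a real list would be much longer.
STOP_WORDS: FrozenSet[str] = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "is"}
)


def stop_analyze(text: str, stop_words: FrozenSet[str] = STOP_WORDS) -> List[str]:
    """Letter-based tokens, lowercased, with stop words dropped."""
    tokens = [run.lower() for run in re.findall(r"[^\W\d_]+", text)]
    return [t for t in tokens if t not in stop_words]


print(stop_analyze("The Lord of the Rings"))  # ['lord', 'rings']
```

Three of the five words are discarded here, which is where the index size reduction comes from. The trade-off is that phrases made mostly of stop words become hard to search for.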
PatternAnalyzer
Regex-based tokenization with optional lowercasing and stop word filtering.
Best for: Custom tokenization patterns, domain-specific text formats
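A sketch of regex-driven tokenization for a domain-specific format. It assumes the pattern describes the separators between tokens rather than the tokens themselves; the description above does not say which convention applies. `pattern_analyze` and its parameters are hypothetical.

```python
import re
from typing import FrozenSet, List


def pattern_analyze(text: str,
                    pattern: str = r"\W+",
                    lowercase: bool = True,
                    stop_words: FrozenSet[str] = frozenset()) -> List[str]:
    """Split text wherever `pattern` matches, then optionally lowercase and filter."""
    tokens = [t for t in re.split(pattern, text) if t]  # drop empty pieces
    if lowercase:
        tokens = [t.lower() for t in tokens]
    return [t for t in tokens if t not in stop_words]


# Split a delimited log field on commas and semicolons only,
# so underscores and hyphens stay inside tokens.
print(pattern_analyze("ERROR,disk_full;NODE-7", pattern=r"[,;]"))
# ['error', 'disk_full', 'node-7']
```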
EnglishAnalyzer
Optimized analyzer for English text with stemming and stop word filtering.
Best for: English text search with stemming, handling English word variations
FrenchAnalyzer
Optimized analyzer for French text with elision and stemming support.
Best for: French text search, handling French elisions
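Elision handling can be sketched as stripping a leading contracted article or pronoun (such as l', d', or qu') from each token. The prefix list below is illustrative rather than the one this library uses, `strip_elision` and `french_tokens` are hypothetical names, and stemming is left out of the sketch.

```python
import re
from typing import List

# Illustrative set of elided prefixes; a real list may differ.
ELISIONS = ("l", "d", "j", "m", "t", "s", "n", "c", "qu")

# A prefix followed by a straight or typographic apostrophe at token start.
_ELISION_RE = re.compile(r"^(?:" + "|".join(ELISIONS) + r")['’]", re.IGNORECASE)

# Letter runs, allowing apostrophes inside a word.
_WORD_RE = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def strip_elision(token: str) -> str:
    """Remove one leading elided prefix, if present."""
    return _ELISION_RE.sub("", token)


def french_tokens(text: str) -> List[str]:
    """Lowercase, tokenize, and strip elisions."""
    return [strip_elision(t) for t in _WORD_RE.findall(text.lower())]


print(french_tokens("L'avion d'aujourd'hui"))  # ['avion', "aujourd'hui"]
```

With the prefix removed, "l'avion" and "avion" produce the same token, so a search for either finds both. The apostrophe inside "aujourd'hui" is kept because only a prefix at the start of the token is stripped.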
SpanishAnalyzer
Optimized analyzer for Spanish text with stemming support.
Best for: Spanish text search, Spanish word stemming