Couchbase’s Full-Text Search (FTS) indexes don’t work like traditional database indexes: they are maintained by a separate, distributed Search Service that streams document changes from the Data Service of your Couchbase cluster.
Let’s see it in action. Imagine we have a bucket named travel-sample with documents like this:
{
  "type": "hotel",
  "name": "Grand Hyatt Dubai",
  "city": "Dubai",
  "country": "AE",
  "description": "Located in the heart of Dubai's business district, the Grand Hyatt Dubai offers luxurious rooms and suites..."
}
We want to search for hotels by keywords in their descriptions. First, we need to create an FTS index. In the Couchbase UI, navigate to "Search" -> "Indexes" and click "Create Index."
Index Name: hotel-description-idx
Bucket: travel-sample
Scope: _default
Collection: hotel (assuming you’re using collections)
Mappings:
{
  "type": "hotel",
  "name": "keyword",
  "city": "keyword",
  "country": "keyword",
  "description": {
    "analyzer": "standard",
    "type": "text"
  }
}
This simplified mapping (the index definition JSON that the UI generates is more verbose) tells Couchbase to treat the description field as text and to run it through the standard analyzer, which splits text into words, lowercases them, and removes common English stop words. The name, city, and country fields use keyword, which indexes each value as a single unmodified token: good for exact matches and facets, but not for full-text searching.
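To make the difference between the two field treatments concrete, here is a toy Python sketch of what a standard-style analyzer and a keyword-style analyzer emit for the same kind of input. The function names and the tiny stop-word list are illustrative assumptions, not Couchbase’s implementation.

```python
import re

# Tiny illustrative stop-word list (an assumption); real analyzers ship a longer one.
STOP_WORDS = {"the", "in", "of", "and", "a", "an"}

def standard_like_analyze(text):
    """Split on non-alphanumeric characters, lowercase, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def keyword_like_analyze(text):
    """Emit the whole field value as a single, unmodified token."""
    return [text]

print(standard_like_analyze("The Grand Hyatt Dubai offers luxurious rooms and suites"))
print(keyword_like_analyze("Grand Hyatt Dubai"))
```

The first call produces one searchable token per word, while the second keeps "Grand Hyatt Dubai" intact, which is why a keyword field only matches when the whole value is supplied.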
Once the index is created and has had a chance to build (you can monitor its status in the UI), you can run a search query using the SEARCH() function in N1QL:
SELECT META(h).id, h.name, h.city, h.description
FROM `travel-sample`._default.hotel AS h
WHERE SEARCH(h.description, 'luxury', {"index": "hotel-description-idx"})
LIMIT 10;
This query uses the hotel-description-idx index to find documents whose description field contains the term "luxury". The SEARCH function takes a keyspace alias or field path as its first argument and the search query as its second. The optional third argument is an options object, which is where the index name goes; it is passed as a string, not as a bare identifier.
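The same search can also be sent straight to the Search Service over its REST API, which is handy for checking an index outside of N1QL. The sketch below only builds the request and does not send it. It assumes a node on localhost, the default Search port 8094, and the /api/index/{name}/query endpoint; authentication is omitted, the helper name is hypothetical, and scoped indexes on newer releases may use a different path.

```python
import json
import urllib.request

def build_search_request(host, index_name, term, field, size=10):
    """Build (but do not send) a POST request for the Search query endpoint."""
    url = f"http://{host}:8094/api/index/{index_name}/query"
    body = {
        "query": {"match": term, "field": field},  # match query against one field
        "size": size,                               # maximum number of hits
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_search_request("localhost", "hotel-description-idx", "luxury", "description")
print(req.full_url)
print(req.data.decode("utf-8"))
```

To actually run it, you would add credentials and pass the request to urllib.request.urlopen.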
The magic behind FTS is its decoupled nature. The Search Service can run on its own nodes, independently of the Data Service. It subscribes to the bucket’s mutation stream over DCP (Database Change Protocol). When data changes in the bucket, the Search Service picks up these changes, processes them according to the index definition (tokenizing, stemming, etc.), and updates its own distributed index, which is partitioned across the Search nodes. When the two services run on separate nodes, heavy FTS queries have far less impact on the performance of your primary data operations.
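A toy model helps show what "consuming a mutation stream" means in practice. The sketch below keeps a tiny inverted index up to date from a list of (key, document) mutations, where None stands for a deletion. The class and its methods are hypothetical, and the tokenization is plain whitespace splitting; it illustrates the idea only, not how the Search Service is implemented.

```python
from collections import defaultdict

class ToyIndexer:
    """Consumes (key, doc-or-None) mutations and maintains an inverted index."""

    def __init__(self, field):
        self.field = field
        self.postings = defaultdict(set)  # term -> set of document keys
        self.doc_terms = {}               # key -> terms currently indexed for it

    def apply(self, key, doc):
        # Remove the document's old terms first so updates leave no stale entries.
        for term in self.doc_terms.pop(key, set()):
            self.postings[term].discard(key)
        if doc is None:                   # None models a deletion
            return
        terms = set(doc.get(self.field, "").lower().split())
        self.doc_terms[key] = terms
        for term in terms:
            self.postings[term].add(key)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))

indexer = ToyIndexer("description")
mutations = [
    ("hotel_1", {"description": "luxury rooms in Dubai"}),
    ("hotel_2", {"description": "budget rooms near the airport"}),
    ("hotel_1", {"description": "renovated rooms in Dubai"}),  # update
    ("hotel_2", None),                                          # delete
]
for key, doc in mutations:
    indexer.apply(key, doc)

print(indexer.search("rooms"))
print(indexer.search("luxury"))
```

Because the indexer only ever sees the stream, the writer never waits for indexing to finish, which is also why a search index can briefly lag behind the data.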
The analyzer in the mapping is crucial. The standard analyzer is a good default, but Couchbase supports others, such as simple, keyword, web, language-specific analyzers like en (English), and custom analyzers. For instance, if you wanted to treat "running" and "runs" as the same word, you’d use an analyzer with stemming, like en.
{
  "type": "hotel",
  "description": {
    "analyzer": "en",
    "type": "text"
  }
}
And your query would keep the same shape:
SELECT META(h).id, h.name, h.city, h.description
FROM `travel-sample`._default.hotel AS h
WHERE SEARCH(h.description, 'run', {"index": "hotel-description-idx"})
LIMIT 10;
This would now match documents containing "run," "running," or "runs." Irregular forms such as "ran" are not reduced to "run" by a suffix-stripping stemmer, so they would not match.
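To see why stemming widens a match, here is a deliberately crude suffix-stripping stemmer in Python. It is a stand-in written for this article, not the stemmer Couchbase ships, and the rules are assumptions chosen to keep the example short. Stemmers of this kind reduce regular inflections such as "running" and "runs" to a common stem, while irregular forms such as "ran" are left unchanged.

```python
def toy_stem(word):
    """Strip a few common English suffixes; a crude stand-in for a real stemmer."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Collapse a doubled final consonant: "runn" -> "run".
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    return word

for w in ("running", "runs", "run", "ran"):
    print(w, "->", toy_stem(w))
```

The index stores the stems rather than the original words, and the query text goes through the same analyzer, so both sides meet at the same stem.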
The type field in the mapping is also important. Setting it to hotel ensures that only documents with {"type": "hotel"} are indexed, preventing irrelevant documents from cluttering your search index. If you omit type, or the type field is not present in your documents, the index falls back to its default mapping and, if that mapping is enabled, indexes all documents in the specified bucket/scope/collection.
The most surprising thing is how FTS handles term weighting and relevance scoring. By default, Couchbase computes a TF-IDF-based relevance score for each search result (Okapi BM25 is a closely related formula, available as an option in some newer releases). N1QL exposes that score through the SEARCH_SCORE() function, so you can sort on it. This means you don’t just get hits; you get hits ranked by how relevant they are to your query, based on term frequency, inverse document frequency, and field length.
SELECT META(h).id, h.name, h.city, h.description, SEARCH_SCORE(h) AS relevance
FROM `travel-sample`._default.hotel AS h
WHERE SEARCH(h.description, 'luxury hotel', {"index": "hotel-description-idx"})
ORDER BY relevance DESC
LIMIT 5;
This query not only finds hotels whose descriptions match "luxury" or "hotel" but also orders them by how well they match, with the most relevant appearing first.
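For intuition about where such a score comes from, the following Python sketch ranks three made-up documents with a simplified TF-IDF-style formula built from the ingredients named above: term frequency, inverse document frequency, and document length. The exact formula Couchbase uses differs; the function, the sample data, and the weighting here are assumptions for illustration.

```python
import math
from collections import Counter

def tfidf_rank(docs, query):
    """Rank docs (id -> text) against a whitespace-tokenized query using TF-IDF."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for t in tokenized.values() if term in t)
            if df == 0 or counts[term] == 0:
                continue
            tf = counts[term] / len(tokens)   # length-normalized term frequency
            idf = math.log(1 + n_docs / df)   # rarer terms weigh more
            score += tf * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

docs = {
    "hotel_1": "luxury hotel with luxury suites",
    "hotel_2": "budget hotel near the airport",
    "hotel_3": "quiet guesthouse by the sea",
}
for doc_id, score in tfidf_rank(docs, "luxury hotel"):
    print(doc_id, round(score, 3))
```

Here hotel_1 ranks first because it contains both terms and the rarer one twice, hotel_2 matches only the common term, and hotel_3 does not match at all.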
When you’re done with an index or need to recreate it, simply select it in the UI and click "Drop Index."
The next hurdle you’ll face is implementing more complex search features like fuzzy matching, phrase searching, or creating custom analyzers for multilingual support.