Mango queries let you search your CouchDB documents using a JSON-based query language, and indexes are crucial for making those queries fast.
Let’s see it in action. Imagine you have a collection of user documents in CouchDB, each with a name, email, and city.
{
  "_id": "user-1",
  "type": "user",
  "name": "Alice Smith",
  "email": "alice.smith@example.com",
  "city": "New York"
}
{
  "_id": "user-2",
  "type": "user",
  "name": "Bob Johnson",
  "email": "bob.j@example.com",
  "city": "London"
}
{
  "_id": "user-3",
  "type": "user",
  "name": "Alice Wonderland",
  "email": "alice.w@example.com",
  "city": "New York"
}
You want to find all users in "New York". A Mango query to do this would look like:
POST /mydb/_find
{
  "selector": {
    "city": "New York"
  }
}
This selector is the heart of your query. It specifies the conditions that documents must meet. Here, we’re simply saying "find documents where the city field is exactly New York." CouchDB will return the matching documents:
{
  "docs": [
    {
      "_id": "user-1",
      "type": "user",
      "name": "Alice Smith",
      "email": "alice.smith@example.com",
      "city": "New York"
    },
    {
      "_id": "user-3",
      "type": "user",
      "name": "Alice Wonderland",
      "email": "alice.w@example.com",
      "city": "New York"
    }
  ],
  "bookmark": "..."
}
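From application code, the same request can be issued with any HTTP client. Below is a minimal sketch using only Python's standard library. The server address `http://localhost:5984`, the database name `mydb`, and the helper names `build_find_request` and `find` are assumptions for illustration, and the sketch assumes the server accepts unauthenticated requests, which many deployments do not.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust for your deployment.
BASE_URL = "http://localhost:5984"
DB_NAME = "mydb"


def build_find_request(selector, base_url=BASE_URL, db=DB_NAME):
    """Build (but do not send) a POST request for the _find endpoint."""
    body = json.dumps({"selector": selector}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{db}/_find",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def find(selector):
    """Send the query and return the list of matching documents."""
    with urllib.request.urlopen(build_find_request(selector)) as resp:
        return json.load(resp)["docs"]


# Building the request needs no running server; find() does.
request = build_find_request({"city": "New York"})
```

Calling `find({"city": "New York"})` against a database holding the three sample users would be expected to return the two New York documents.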
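But what if you need the user named "Alice Smith" who lives in "New York"? Equality in a selector is exact, so "name": "Alice" on its own would not match "Alice Smith"; you have to supply the full value. You can combine conditions:
POST /mydb/_find
{
  "selector": {
    "name": "Alice Smith",
    "city": "New York"
  }
}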
This query will only return documents where both conditions are true. Mango also supports more complex logic like $or, $and, $not, and range queries with $gt, $lt, $gte, $lte. For instance, to find users whose names start with "A":
POST /mydb/_find
{
  "selector": {
    "name": {
      "$gt": "A",
      "$lt": "B"
    }
  }
}
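This works because CouchDB compares strings by sort order. The query matches every name that sorts after "A" and before "B", which in practice means names starting with "A". A name that is exactly "A" is excluded by $gt; use $gte to include it. CouchDB's collation rules also decide where lowercase and accented letters fall, so test the range against representative data.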
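To make the selector semantics concrete, here is a small local model of how equality and range conditions combine. It is only an illustration, not CouchDB's implementation: the `matches` function is hypothetical, it covers just the operators shown, and it uses Python's code-point string comparison, which can differ from CouchDB's collation for mixed-case or non-ASCII data.

```python
import operator

# Range and equality operators mapped to Python comparisons.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$eq": operator.eq,
}


def matches(doc, selector):
    """Return True if doc satisfies every condition (implicit AND)."""
    for field, condition in selector.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            # e.g. {"$gt": "A", "$lt": "B"}: every operator must hold
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # A bare value means exact equality
            return False
    return True


users = [
    {"_id": "user-1", "name": "Alice Smith", "city": "New York"},
    {"_id": "user-2", "name": "Bob Johnson", "city": "London"},
    {"_id": "user-3", "name": "Alice Wonderland", "city": "New York"},
]

starts_with_a = {"name": {"$gt": "A", "$lt": "B"}}
a_ids = [u["_id"] for u in users if matches(u, starts_with_a)]

combined = {"city": "New York", "name": {"$gt": "Alice T"}}
combined_ids = [u["_id"] for u in users if matches(u, combined)]
```

Here `a_ids` holds the two Alice documents, and `combined_ids` holds only `user-3`, because both conditions must be true at once.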
Now, about those indexes. Without a usable index, CouchDB falls back to scanning the whole database for every _find request: it reads every document and checks it against your selector. The cost grows with the size of the database, so queries get slower as data accumulates. An index lets CouchDB go straight to the matching entries, dramatically speeding up queries.
To create an index for our "city" query, you’d use the _index endpoint:
POST /mydb/_index
{
  "index": {
    "fields": ["city"]
  },
  "ddoc": "index-users-by-city",
  "name": "users-by-city",
  "type": "json"
}
The fields array specifies which document fields to index. ddoc names the design document that stores the index, and name identifies the index within it. type: "json" indicates a JSON index, which is the default for Mango. Once this index is created, the _find query for city: "New York" can use it, making it much faster.
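If you create indexes from a setup script, it can help to build the request body in one place. The sketch below is one way to do that; the helper name `json_index_body` is hypothetical, and the function only constructs the dictionary to send to the _index endpoint, it does not send it.

```python
import json


def json_index_body(fields, ddoc, name):
    """Return the request body for POST /{db}/_index as a dict."""
    if not fields:
        raise ValueError("an index needs at least one field")
    return {
        "index": {"fields": list(fields)},
        "ddoc": ddoc,
        "name": name,
        "type": "json",
    }


city_index = json_index_body(["city"], "index-users-by-city", "users-by-city")
payload = json.dumps(city_index)  # ready to use as the POST body
```

When the request is sent, the response's result field should read "created" for a new index or "exists" if an identical one is already defined.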
For compound queries, like finding users by name and city, you create a compound index:
POST /mydb/_index
{
  "index": {
    "fields": ["name", "city"]
  },
  "ddoc": "index-users-by-name-city",
  "name": "users-by-name-city",
  "type": "json"
}
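The order of fields in the fields array matters for compound indexes. Entries are sorted by name first and then by city, so an index on ["name", "city"] works best for queries that filter on both name and city. It cannot serve a query that filters only on city, because the matching entries are scattered throughout the index; that query needs a dedicated ["city"] index. A name-only query is less clear-cut: documents that lack any of the indexed fields are left out of a JSON index, so CouchDB generally expects the selector to reference every field in the index before choosing it. Check the query plan rather than assuming that a prefix of the fields is enough.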
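The effect of field order can be seen with a toy model of a compound index: a sorted list of (name, city) keys. This is only an analogy for how sorted keys behave, not CouchDB's storage format, and the function names are made up for the example.

```python
from bisect import bisect_left, bisect_right

users = [
    {"_id": "user-1", "name": "Alice Smith", "city": "New York"},
    {"_id": "user-2", "name": "Bob Johnson", "city": "London"},
    {"_id": "user-3", "name": "Alice Wonderland", "city": "New York"},
]

# Toy stand-in for an index on ["name", "city"]: keys sorted by name first,
# then by city, each paired with the document id.
index = sorted(((u["name"], u["city"]), u["_id"]) for u in users)
keys = [key for key, _ in index]


def lookup_by_name(name):
    """Binary-search the contiguous run of keys whose first field is name."""
    lo = bisect_left(keys, (name,))
    hi = bisect_right(keys, (name, "\U0010ffff"))
    return [doc_id for _, doc_id in index[lo:hi]]


def lookup_by_city(city):
    """No shortcut: matching keys are scattered, so every entry is checked."""
    return [doc_id for (_, c), doc_id in index if c == city]


by_name = lookup_by_name("Alice Smith")
by_city = lookup_by_city("New York")
```

The name lookup touches only a narrow slice of the sorted keys, while the city lookup has to walk the whole list, which is the situation a dedicated ["city"] index avoids.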
When CouchDB executes a _find query, it checks which of its available indexes can serve the query’s selector. If several qualify, it does not run a cost-based optimizer; it relies on simple rules, such as preferring the usable index with the fewest fields and breaking ties by index name. You can override the choice by naming an index in the request’s use_index field.
The _explain endpoint accepts the same request body as _find and is invaluable for understanding how CouchDB plans to execute your query and whether it’s using an index:
POST /mydb/_explain
{
  "selector": {
    "city": "New York"
  }
}
The output names the index that will be used. If no suitable index exists, it reports the built-in _all_docs index (type "special"), which means a full scan is occurring.
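A script can check the plan automatically, for example in a test that guards against accidental full scans. The sketch below assumes the explain output carries an index object with type, ddoc, and name fields; the `describe_plan` helper and the two sample responses are hand-written, abridged illustrations rather than captured server output.

```python
def describe_plan(explain_response):
    """Summarize which index a query plan reports."""
    index = explain_response.get("index", {})
    if index.get("type") == "special":
        # The built-in _all_docs index means every document is scanned.
        return "full scan"
    return f"index {index.get('name')} in {index.get('ddoc')}"


# Abridged, hand-written examples of the relevant part of a response.
indexed_plan = {
    "index": {
        "ddoc": "_design/index-users-by-city",
        "name": "users-by-city",
        "type": "json",
    }
}
fallback_plan = {
    "index": {"ddoc": None, "name": "_all_docs", "type": "special"}
}

indexed_summary = describe_plan(indexed_plan)
fallback_summary = describe_plan(fallback_plan)
```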
A subtle but powerful aspect of Mango is the ability to index on nested fields. If you had documents like this:
{
  "_id": "order-1",
  "type": "order",
  "customer": {
    "name": "Alice Smith",
    "address": {
      "city": "New York"
    }
  }
}
You can create an index on the nested city field like this:
POST /mydb/_index
{
  "index": {
    "fields": ["customer.address.city"]
  },
  "ddoc": "index-orders-by-customer-city",
  "name": "orders-by-customer-city",
  "type": "json"
}
And then query it directly:
POST /mydb/_find
{
  "selector": {
    "customer.address.city": "New York"
  }
}
This makes querying deeply nested data straightforward and performant, as long as the corresponding index exists.
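The dotted path is resolved one level at a time. The following sketch mimics that lookup locally to show what "customer.address.city" refers to. The `get_path` helper and `MISSING` sentinel are hypothetical, and the sketch ignores details such as field names that themselves contain dots.

```python
MISSING = object()  # sentinel: the path does not exist in the document


def get_path(doc, path):
    """Follow a dotted path through nested objects, one key per segment."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return MISSING
        current = current[part]
    return current


order = {
    "_id": "order-1",
    "type": "order",
    "customer": {
        "name": "Alice Smith",
        "address": {"city": "New York"},
    },
}

city = get_path(order, "customer.address.city")
phone = get_path(order, "customer.phone")
```

Here `city` is "New York", while `phone` is the `MISSING` sentinel because the document has no such field.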
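The next step after mastering basic Mango queries and indexes is exploring the other _find options, such as fields, sort, limit, and bookmark for paging through results. Note that _find has no group or reduce options; for aggregation, CouchDB relies on MapReduce views.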