Stop Making This Common Mistake When Counting Documents in Your Firestore Collections

Introduction

Firestore is a NoSQL document database that developers widely use to store and retrieve data for web and mobile applications. One of its fundamental concepts is the collection, a group of related documents that can be queried and indexed together. For use cases such as showing the number of messages, notifications, or posts, many developers count the documents in a collection in a way that causes performance problems and unnecessary cost. In this article, we will explore why this happens and how to avoid it by understanding Firestore's underlying data model. By the end, you will know how to count the documents in a Firestore collection properly and keep your queries efficient.

Understanding Firestore's data model

Firestore stores data as documents, which are organized into collections. Each document contains a set of key-value pairs, where the keys are strings and the values can be of various types, such as strings, numbers, booleans, arrays, or nested objects. The documents within a collection can have different fields and structures, unlike traditional relational databases where all rows in a table have the same columns.

Firestore collections are groupings of related documents that can be queried and indexed together. Each collection has a unique name and can contain any number of documents, ranging from zero to millions. Firestore collections can also have subcollections, which are nested collections within a document. Subcollections can have their own documents, which can also have their own subcollections, forming a hierarchical data structure.
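To make the hierarchy concrete, here is a minimal sketch in plain Dart. It relies on the fact that a Firestore path alternates between collection and document segments, so a path with an odd number of segments ends on a collection. The helper name isCollectionPath and the example paths are hypothetical and are not part of any Firebase SDK.

```dart
/// Returns true when [path] points at a collection and false when it
/// points at a document. Firestore paths alternate collection/document
/// segments, so an odd number of segments ends on a collection.
/// Hypothetical helper for illustration; not part of the Firebase SDK.
bool isCollectionPath(String path) {
  final segments = path.split('/').where((s) => s.isNotEmpty).toList();
  if (segments.isEmpty) {
    throw ArgumentError('Path must contain at least one segment');
  }
  return segments.length.isOdd;
}

void main() {
  // Hypothetical paths, used only for illustration.
  print(isCollectionPath('users')); // true: top-level collection
  print(isCollectionPath('users/alice')); // false: document
  print(isCollectionPath('users/alice/messages')); // true: subcollection
  print(isCollectionPath('users/alice/messages/msg1')); // false: document
}
```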

Firestore indexes the data in each document and collection to enable efficient querying and sorting. By default, Firestore creates an automatic index for each field in a document, allowing queries to filter, order, and limit the results based on specific fields. Firestore also allows developers to create composite indexes that include multiple fields or filter conditions, improving the performance of complex queries.

Compared to traditional relational databases, Firestore's data model is more flexible and scalable, as it allows for dynamic schema changes and distributed data storage. However, it also requires a different approach to querying and modeling data: Firestore has no joins, so related data often has to be denormalized or fetched with several queries, which can pose challenges for complex applications. Understanding Firestore's data model is crucial for structuring collections and documents properly and for keeping queries efficient.

The problem with counting documents in a Firestore collection

Counting the documents in a Firestore collection seems like a simple task, but it can cause performance problems and unnecessary cost if done incorrectly. The common approach is to call get() to retrieve every document in the collection and then read the size of the resulting query snapshot. For example, in Dart:

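import 'package:cloud_firestore/cloud_firestore.dart';

Future<int> getCollectionSize() async {
  try {
    final query = FirebaseFirestore.instance.collection("collection_name");
    // Downloads every document in the collection just to count them.
    final collectionSnapshot = await query.get();

    return collectionSnapshot.docs.length;
  } catch (error) {
    print(error);
    // Rethrow so the caller still receives an error instead of no value.
    rethrow;
  }
}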

This approach does work: if the collection_name collection holds 100 documents, getCollectionSize() resolves to 100, which is what we expect. The problem is what happens along the way. The call to get() downloads the data of every document in the collection into the collectionSnapshot variable, which hurts performance as the product scales. Worse, Firebase bills one document read for every document returned, so counting a large collection this way produces a large and unnecessary bill.

The correct way to count documents in a Firestore collection

We can avoid downloading every document just to learn how many there are. It pays to know the Firebase SDKs and their APIs, since each use case calls for a particular tool, and for this one Firestore already ships with what is called an aggregate query.

An aggregate query is a Firestore query that computes a summary value on the server instead of returning the documents themselves. In our case we only need the number of documents in the collection, not their data, so the count() aggregate query is exactly what we need.

The count() aggregate query counts the documents in a collection (or the documents matching a query) on Firebase's servers and returns only what you need: the number of documents.

In Dart, the getCollectionSize() method from the previous snippet becomes:

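Future<int> getCollectionSize() async {
  try {
    final aggregateQuery =
        FirebaseFirestore.instance.collection("collection_name").count();
    // The server counts the documents; no document data is downloaded.
    final aggregateSnapshot = await aggregateQuery.get();

    // count is nullable in recent cloud_firestore versions.
    return aggregateSnapshot.count ?? 0;
  } catch (error) {
    print(error);
    // Rethrow so the caller still receives an error instead of no value.
    rethrow;
  }
}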

See the difference? Instead of calling get() to read every document in the collection and return the length of the result, we use the count() aggregate query, which asks the server for the number directly. It is faster, and it reduces Firebase usage costs because no document data is downloaded just to be counted.
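The same pattern should also work on a filtered query, which covers use cases such as the notification counter mentioned in the introduction. The following is a minimal sketch, assuming a cloud_firestore version that provides count() and an already initialized Firebase app. The "notifications" collection, the "read" field, and the function name are hypothetical.

```dart
import 'package:cloud_firestore/cloud_firestore.dart';

/// Counts unread notifications without downloading them.
/// The "notifications" collection and the "read" field are hypothetical.
Future<int> getUnreadNotificationCount() async {
  final aggregateSnapshot = await FirebaseFirestore.instance
      .collection('notifications')
      .where('read', isEqualTo: false)
      .count()
      .get();

  // count is nullable in recent cloud_firestore versions.
  return aggregateSnapshot.count ?? 0;
}
```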

Both implementations of getCollectionSize() return the same result, but they differ in speed, in billing, and in how well they hold up as the product scales.
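The billing difference can be estimated with a little arithmetic. The sketch below assumes Firestore's published pricing rules at the time of writing: get() is billed one read per document returned, count() is billed one read per batch of up to 1,000 index entries matched, and every query costs at least one read. Check the current pricing page before relying on these numbers. Both function names are hypothetical.

```dart
/// Reads billed when counting with get(): one per document returned,
/// with a minimum of one read per query (assumed pricing rule).
int readsForGet(int documentCount) => documentCount < 1 ? 1 : documentCount;

/// Reads billed for a count() aggregation, ASSUMING one read per batch of
/// up to 1,000 index entries matched, with a minimum of one read.
int readsForCount(int documentCount, {int entriesPerRead = 1000}) {
  if (documentCount < 1) return 1;
  // Integer ceiling division.
  return (documentCount + entriesPerRead - 1) ~/ entriesPerRead;
}

void main() {
  for (final n in [100, 1000, 250000]) {
    print('$n documents: get() = ${readsForGet(n)} reads, '
        'count() = ${readsForCount(n)} reads');
  }
}
```

Under these assumptions, counting 250,000 documents costs 250,000 reads with get() and 250 reads with count().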

Conclusion

Firestore is a powerful NoSQL document database, but knowing how to use it efficiently is what makes a developer valuable to the business behind a product. Counting the documents in a collection looks trivial, yet choosing an aggregate query over a regular query can make a real difference in both speed and cost.