These summary tables are key because:
They are fast, because queries read pre-aggregated data.
They are not sampled, so the data represents 100% of what was collected.
They keep the complete history of data, regardless of how much time has passed.
However, this approach is not perfect, and it comes with serious limitations.
One of the main limitations of the aggregate system is cardinality, which refers to the number of unique values in a dimension; each summary table can only hold a limited number of rows. If the dimensions you’re querying have too many unique values, you could run into the dreaded “(other)” row, which groups together the less common values so that they no longer appear individually in the report.
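To make the cardinality problem concrete, here is a minimal sketch of how a row-limited summary table could fold its long tail into a single row. It is a toy model, not GA4's actual implementation: the function name `build_summary_table`, the `row_limit` parameter, and the tiny limit of 3 rows are all assumptions chosen for illustration.

```python
from collections import Counter

OTHER_LABEL = "(other)"  # label GA4 shows for the grouped row


def build_summary_table(events, dimension, row_limit):
    """Count events per dimension value, keeping at most `row_limit` rows.

    Hypothetical helper: the most frequent values keep their own row and
    everything else is folded into one "(other)" row.
    """
    counts = Counter(event[dimension] for event in events)
    if len(counts) <= row_limit:
        return dict(counts)  # low cardinality: nothing is grouped
    top = counts.most_common(row_limit - 1)  # reserve one row for "(other)"
    table = dict(top)
    table[OTHER_LABEL] = sum(counts.values()) - sum(n for _, n in top)
    return table


# Five distinct pages, but the toy table only has room for three rows.
events = (
    [{"page": "/home"}] * 5
    + [{"page": "/pricing"}] * 3
    + [{"page": "/blog/a"}, {"page": "/blog/b"}, {"page": "/blog/c"}]
)
summary = build_summary_table(events, "page", row_limit=3)
print(summary)  # {'/home': 5, '/pricing': 3, '(other)': 3}
```

The totals still add up, but the three blog pages can no longer be told apart, which is exactly what happens to low-volume values in a high-cardinality report.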
Because the data is pre-aggregated, we can’t cross-reference some dimensions: no summary table contains them together. This has become increasingly noticeable as GA4 has evolved, and the reason is system efficiency. The more limits, the fewer tables; the fewer tables, the faster everything works.
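The reason two separate summary tables cannot be combined after the fact can be shown with a small sketch. The two datasets below are invented for illustration: they produce identical per-country and per-device tables, yet their country-by-device breakdowns differ, so the combined view cannot be derived from the individual tables.

```python
from collections import Counter


def aggregate(events, dimension):
    """Build a one-dimension summary table (value -> event count)."""
    return Counter(event[dimension] for event in events)


def cross(events, dim_a, dim_b):
    """Build a two-dimension table, which needs the raw events."""
    return Counter((event[dim_a], event[dim_b]) for event in events)


# Two hypothetical properties with different underlying behavior.
dataset_a = [
    {"country": "ES", "device": "mobile"},
    {"country": "ES", "device": "mobile"},
    {"country": "US", "device": "desktop"},
    {"country": "US", "device": "desktop"},
]
dataset_b = [
    {"country": "ES", "device": "mobile"},
    {"country": "ES", "device": "desktop"},
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
]

same_by_country = aggregate(dataset_a, "country") == aggregate(dataset_b, "country")
same_by_device = aggregate(dataset_a, "device") == aggregate(dataset_b, "device")
same_cross = cross(dataset_a, "country", "device") == cross(dataset_b, "country", "device")

print(same_by_country, same_by_device, same_cross)  # True True False
```

Unless a table was built with both dimensions from the start, the combination simply does not exist in the aggregate system.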
The final limitation of this system is filtering: the aggregate system, as implemented in GA4, does not allow complex filters. We cannot create segments or sequences with it.
Now that we understand how this GA4 query engine works, we need to see where we’ll find it in GA4. This system can be found in:
Mainly in standard reports. Almost always, when we open a report from the GA4 menu, we’re using the aggregate system.
In the query API. The popular GA4 API (the Google Analytics Data API) uses the aggregate system and therefore returns results similar to those of the standard reports in the GA4 report library.
And therefore, in all products connected to GA4: Looker Studio, Google Sheets, Make, and code integrations. Everything that pulls data from GA4 does so mostly through this system.
The granular system: depth and precision, but with sampling
This system uses the same underlying data and technology as the aggregate system, but instead of aggregating so heavily, it stores the data in much more detail. It still uses cubes, dimensions, and technical tricks to speed up queries, but the granular system prioritizes drilling down into the details. Here, data is stored at a much finer grain, allowing deeper insight into events and sessions.
Among its advantages, we find:
It is not affected by cardinality, allowing for greater reporting accuracy, even with problematic dimensions.
It makes cross-referencing data easier, and even supports pivot tables.
It offers advanced segmentation, such as sequences and session-level or user-level filters.
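To illustrate why sequences need event-level detail, here is a minimal sketch of a sequence segment evaluated over raw events. The field names (`user_id`, `timestamp`, `name`) and the function `users_matching_sequence` are assumptions made for this example, not GA4's schema or API, and the sketch assumes a non-empty list of steps.

```python
from collections import defaultdict


def users_matching_sequence(events, steps):
    """Return the ids of users whose events contain `steps` in order.

    Steps do not have to be adjacent: other events may happen in between.
    Assumes `steps` is a non-empty list of event names.
    """
    # Rebuild each user's timeline from the raw events.
    by_user = defaultdict(list)
    for event in sorted(events, key=lambda e: e["timestamp"]):
        by_user[event["user_id"]].append(event["name"])

    matched = set()
    for user_id, names in by_user.items():
        position = 0  # index of the next step we are waiting for
        for name in names:
            if name == steps[position]:
                position += 1
                if position == len(steps):
                    matched.add(user_id)
                    break
    return matched


events = [
    {"user_id": "u1", "timestamp": 1, "name": "view_item"},
    {"user_id": "u1", "timestamp": 2, "name": "add_to_cart"},
    {"user_id": "u1", "timestamp": 3, "name": "purchase"},
    {"user_id": "u2", "timestamp": 1, "name": "purchase"},
    {"user_id": "u2", "timestamp": 2, "name": "view_item"},
    {"user_id": "u3", "timestamp": 1, "name": "view_item"},
]
segment = users_matching_sequence(events, ["view_item", "purchase"])
print(segment)  # {'u1'}
```

A summary table that only stores counts per event name would show two purchases and three item views, with no way to tell which user did them in which order.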