fbpx

Analytics July 6, 2024

Understanding GA4 data in BigQuery

Writen by admin

comments 0

One of the most impressive features of ga4 is its native integration to bigquery. You can export your ga4 data to bigquery by simply linking your bigquery project in the ga4 interface. This feature wasn’t available in universal analytics.

Using bigquery to explore your ga4 data would provide us with new exploration capabilities which aren’t available in the ga4 interface. Here are few reasons to use bigquery for your GA4 data:
->To connect your marketing data with other data sources
->To store your historical data as GA4 data retention is limited to 14 months
->To use advanced segmentation & data insights for hyper personalization and mapping customer journey
->For machine learning & AI integration

But to use bigquery, we first need to understand the ga4 data schema. The GA4 data in bigquery is event based, i.e. every single row in a bigquery is associated with an event name. Below screenshot will give you an idea of what event based schema ga4 uses:

GA4 bigquery event based schema example

Here in the above image we can see the first row is associated with the “session_start” event and we have nested fields to get more information about the event i.e. session_start. Event_params is a nested field which contains the details about that specific event. Below screenshot shows you the schema of ga4 data in bigquery:

BG ga4 schema

GA4 data fields in BigQuery

Here are few of the important fields that are used frequently in bigquery for your GA4 data:

Event Fields:
->event_date
->event_timestamp
->event_name
->event_params (event parameters like page_location, ga_session_id, page_title & more)

User Fields:
->user_pseudo_id (device_id or client_id generated by ga4 to identify users)
->user_id (unique user identifier for logged in users)
->user _properties (emails, phone number or other user details collected by user ga4)
->analytics_storage & ads_storage (user consent generated by your website consent banner)

Device Fields:
->platform (ios, android or web)
->device.category (mobile, desktop or tablet)
->device.operating_system
->device.web_info.browser

->Geo Fields:
->geo.country
->geo.city

Traffic Source:
This field shows the first source a user arrived from.
->traffic_source.name
->traffic_source.source
->traffic_source.medium

Traffic Source collected by using UTM parameter in your urls:
This field gives session-scoped traffic source information, letting you attribute sessions and their events to the most recent traffic source. This offers a more granular view of user behaviour and traffic source effectiveness.

->collected_traffic_source.manual_campaign_id
->collected_traffic_source.manual_campaign_name
->collected_traffic_source.manual_source
->collected_traffic_source.manual_medium
->collected_traffic_source.manual_term
->collected_traffic_source.manual_content
->collected_traffic_source.gclid

E-commerce fields:
->Items.item_id
->Items.item_name
->Items.item_category
->Items.item_price
->ecommerce.transcation_id
->ecommerce.purchase_revenue
->And more

In bigquery we have to use sql to create tables to get insights from the data. We don’t have any built in dashboards like in GA4, data is all raw and we need to query our raw data.

Below is the sample query to get event report:

select event_name,
count(event_name) as event_count,
count(distinct user_pseudo_id) as total_users,
round(count(event_name)/count(distinct user_pseudo_id),1) as event_count_per_user

FROM cbga4-410110.analytics_381112027.events_*
where TABLESUFFIX BETWEEN '20240103' AND '20240129'
group by 1
order by 2 desc


Result of the above Query

query result

We export tables from bigquery directly to looker studio and build reports as per our needs. Here are few resources where you can learn more about bigquery & GA4:

Crabs Marketing

Helping Businesses with power of data to drive growth

Location:

London, UK
Company Number: 14986833

Connect With Us: