Google Ads API Report Fetcher uses GAQL syntax with some extended capabilities.
This is how a generic query might look like:
SELECT
ad_group.id,
ad_group.name
FROM ad_group
When running this query and saving the results we get pretty long and unreadable
column names - ad_group.id
and ad_group.name
.
Things might be more complicated if you want to extract and save such objects as unselectable elements, complex messages and resource names.
In order to simplify data extraction and processing when querying data from Ads API we introduce additional syntax (see an example below):
SELECT
resource.attribute AS column_name_1,
resource.attribute:nested.resource AS column_name_3
resource.attribute~1 AS column_name_4
FROM resource
- Aliases (
AS column_name
) - Nested resources (
:nested.resource.name
) - Resource indices (
~position
) - Virtual columns (
metric.name / metric.name_2 AS alias
)
Alias is used to give a descriptive name to a metric or attribute fetched from
API when saving data. So instead of column name
campaign.app_campaign_setting.bidding_strategy_goal_type
you may use something
more user friendly, like bidding_type
.
Aliases are specified using AS
keyword as shown below:
SELECT
campaign.app_campaign_setting.bidding_strategy_goal_type AS bidding_type
FROM campaign
If you don't specify an alias it will be generated as full column name where "." replaced with "_".
In NodeJS version autogenerated aliases won't contains current resource name (for example while selecting "campaign.app_campaign_setting.bidding_strategy_goal_type" from campaign there will be a column "app_campaign_setting_bidding_strategy_goal_type")
Some fields return structs, and if you want to get a nested attribute scalar value you can use nested resource selectors.
One particular example is working with change_event
- change_event.new_resource
consists of various changes made to an entity and looks something like that:
new_resource {
campaign {
target_cpa {
target_cpa_micros: 1000000
}
}
}
In order to extract a particular element (i.e., final value for target_cpa_micros
)
we use the :
syntax - change_event.new_resource:campaign.target_cpa.target_cpas_micros
:
SELECT
change_event.old_resource:campaign.target_cpa.target_cpa_micros AS old_target_cpa,
change_event.new_resource:campaign.target_cpa.target_cpa_micros AS new_target_cpa
FROM change_event
Resource indices are used to extract a particular element from data type
RESOURCE_NAME. I.e., if we want to get resource name for campaign_audience_view.resource_name
and save it somewhere, the saved result will contain a string customers/{customer_id}/campaignAudienceViews/{campaign_id}~{criterion_id}.
Usually we want to get only the last element from (criterion_id
) and
it can be extracted with ~N
syntax where N is a position of an element you want to extract
(indexing is starting from 0).
If the resource you're selecting looks like this customers/111/campaignAudienceViews/222~333
you can specify campaign_audience_view.resource_name~1
to extract the second element (333
).
If you specify campaign_audience_view.resource_name~0
you'll get '222' (the last resource id before ~).
SELECT
campaign_audience_view.resource_name~1 AS criterion_id
FROM campaign_audience_view
Virtual columns allow to specify in GAQL query some fields or expressions that are not present in Google Ads API.
SELECT
1 AS counter,
metrics.clicks / metrics.impressions AS ctr,
metrics.cost_micros * 1e6 AS cost,
campaign.app_campaign_setting.bidding_strategy_goal_type AS bidding_type
FROM campaign
Virtual columns can contain constants (i.e. 1 AS counter
will add new column counter
filled with 1
) or expressions.
Expressions can contain field selectors, constants and any arithmetics operations with them.
For example metrics.clicks / metrics.impressions AS ctr
will calculate metrics.clicks / metrics.impressions
for each GoogleAdsRow and store the results in a new column ctr
.
For this the fields metrics.clicks
and metrics.impressions
will be fetched implicitly.
Or for example campaign.target_cpa.target_cpa_micros / 1000000 AS target_cpa
expression will fetch campaign.target_cpa.target_cpa_micros
field but return the result of its division by 1000000.
The query parser parses a query and remove all columns which are not simple field accessors (i.e. contains anything except field names). For constants columns they will be re-added into result after executing the query. For more complex columns with expressions (i.e. some operations with fields) the result will evaluated using the response from Ads API (GoogleAdsRow).
NodeJS versions:
For constants you can also use same syntax as for macros - ${..}
. For example, a query SELECT '${today()}' as date
returns a column 'date' with values like '2023-02-11'.
For virtual columns expressions you can use same syntax as inside ${}
with all operators from MathJS library but operating any of fields available for a current resource.
There can be a confusion between two types of expressions - in ${}
blocks and virtual columns.
First of all virtual columns are supported by both versions (Python/NodeJS) while ${}
-expressions are supported by NodeJS only.
Expressions in ${}
blocks are parsed in the very beginning of query parsing, before submitting a query to Ads API. They are more like templates.
You queries can contain macros. Macro is just a substitution in script text, i.e.
SELECT
campaign.id AS campaign_id,
metrics.clicks AS clicks
FROM campaign
WHERE
segments.date BETWEEN "{start_date}" AND "{end_date}"
When this query is executed it's expected that two macros --macros.start_date=...
and --macros.end_date=...
are supplied to gaarf
.
Macros can be used not only in WHERE statements as in the example above but also in the SELECT statement. In that case this macros will be expanded and then treated as a virtual column.
SELECT
"{current_date}" AS date,
campaign.id AS campaign_id,
campaign_budget.budget_amount_micros AS budget
FROM campaign
This will return all campaign budgets and attach current date (i.e. 2023-06-01) as a date column in the output.
gaarf
by default has several common macros:
date_iso
- current date in YYYYMMDD format (i.e. 19700101)yesterday_iso
- previous day date in YYYY-MM-DD format (i.e. 19700101)current_date
- current_date in YYYY-MM-DD format (i.e. 1970-01-01)current_datetime
- current datetime in YYYY-MM-DD HH:mm-ss format (i.e. 1970-01-01 00:00:00)
Your SQL scripts can be templates using a template engine: Jinja for Python and Nunjucks for NodeJS. Inside templates you can use appropriate syntax and control structures of a template engine (Jinja/Nunjucks). They are mostly compatible but please consult the documentations if you migrate between platforms (Python <-> NodeJS).
SELECT
customer_id AS
{% if level == "0" %}
root_account_id
{% else %}
leaf_account_id
{% endif %}
FROM dataset1.table1
WHERE name LIKE @name
When this query is executed it's expected to have template --template.level=...
is supplied to gaarf
.
This will create a column named either root_account_id
since the specified level is 0.
Template are great when you need to create multiple column based on condition:
SELECT
{% for day in cohort_days %}
SUM(GetCohort(lag_data.installs, {{day}})) AS installs_{{day}}_day,
{% endfor %}
FROM asset_performance
When this query is executed it's expected to have template --template.cohort_days=0,1,3,7
is supplied to gaarf
.
Please note that all values passed through CLI arguments are strings. But there's a special case - a value containing "," - it's converted to an array.
It will create 4 columns (named installs_0_day
, installs_1_day
, etc).
Note: currently expressions are supported only in NodeJS version.
Your queries can contain expressions. The syntax for expressions is ${expression}
.
They will be executed right after macros substitution. So macros can contain expressions inside.
Both expressions and macros deal with query text before submitting it for execution.
Inside expression block we can do anything that the MathJS library supports - see https://mathjs.org/docs/index.html,
plus work with date and time. It's all sort of arithmetic operations, strings and dates manipulations.
One typical use-case - evaluate date/time expressions to get dynamic date conditions in queries. These are when you don't provide a specific date but evaluate it right in the query. For example, applying a condition for date range for last month, which can be expressed as a range from today minus 1 month to today (or yesterday):
WHERE start_date >= '${today()-period('P1M')}' AND end_date <= '${today()}'
will be evaluated to:
WHERE start_date >= '2022-06-20 AND end_date <= '2022-07-20'
if today is 2022 July 20th.
Also you can use expressions for making table names dynamic (in BQ scripts), e.g.
CREATE OR REPLACE TABLE `{bq_dataset}_bq.assetssnapshots_${format(yesterday(),'yyyyMMdd')}` AS
Supported functions:
datetime
- factory function to create a DateTime object, by default in ISO format (datetime('2022-12-31T23:59:59')
) or in a specified format in the second argument (datetime('12/31/2022 23:59','M/d/yyyy hh:mm')
)date
- factory function to create a Date object, supported formats:date(2022,12,31)
,date('2022-12-31')
,date('12/31/2022','M/d/yyyy')
duration
- returns a Duration object for a string in ISO_8601 format (PnYnMnDTnHnMnS)period
- returns a Period object for a string in ISO_8601 format (PnYnMnD)today
- returns a Date object for today dateyesterday
- returns a Date object for yesterday datetomorrow
- returns a Date object for tomorrow datenow
- returns a DateTime object for current timestamp (date and time)format
- formats Date or DateTime using a provided format, e.g.${format(date('2022-07-01'), 'yyyyMMdd')}
returns '20220701'
Please note functions without arguments still should called with brackets (e.g. today()
)
For dates and datetimes the following operations are supported:
- add or subtract Date and Period, e.g.
today()-period('P1D')
- subtract 1 day from today (i.e. yesterday) - add or subtract DateTime and Duration, e.g.
now()-duration('PT12H')
- subtract 12 hours from the current datetime - for both Date and DateTime add or subtract a number meaning it's a number of days, e.g.
today()-1
- subtract two Dates to get a Period, e.g.
tomorrow()-today()
- subtract today from tomorrow and get 1 day, i.e. 'P1D' - subtract two DateTimes to get a Duration - similar to subtracting dates but get a duration, i.e. a period with time (e.g. PT10H for 10 hours)
By default all dates will be parsed and converted from/to strings in [ISO format]((https://en.wikipedia.org/wiki/ISO_8601)
(yyyy-mm-dd for dates and yyyy-mm-ddThh:mm:ss.SSS for datetimes).
But additionally you can specify a format explicitly (for parsing with datetime
and date
function and formatting with format
function)
using standard Java Date and Time Patterns:
- G Era designator
- y Year
- Y Week year
- M Month in year (1-based)
- w Week in year
- W Week in month
- D Day in year
- d Day in month
- F Day of week in month
- E Day name in week (e.g. Tuesday)
- u Day number of week (1 = Monday, ..., 7 = Sunday)
- a Am/pm marker
- H Hour in day (0-23)
- k Hour in day (1-24)
- K Hour in am/pm (0-11)
- h Hour in am/pm (1-12)
- m Minute in hour
- s Second in minute
- S Millisecond
- z Time zone - General time zone (e.g. Pacific Standard Time; PST; GMT-08:00)
- Z Time zone - RFC 822 time zone (e.g. -0800)
- X Time zone - ISO 8601 time zone (e.g. -08; -0800; -08:00)
Examples:
${today() - period('P2D')}
output: today minus 2 days, e.g. '2022-07-19' if today is 2022-07-21
${today()+1}
output: today plus 1 days, e.g. '2022-07-22' if today is 2022-07-21
${date(2022,7,20).plusMonths(1)}
output: "2022-08-20"
NodeJS version support in-place functions in JavaScript. The functions support consists of two parts:
- function execution - it's suffix
:$func
at any select field (where func is a funciton name) - function definition - a block below the main query starting with
FUNCTIONS
after which a normal JS code with a function definition follows
Example:
SELECT
campaign.id AS campaign_id,
campaign_criterion.ad_schedule.day_of_week:$format AS ad_schedule_day_of_week
FROM campaign_criterion
FUNCTIONS
function format(val) {
let days = ['пн', 'вт', 'ср', 'чт', 'пт', 'сб', 'вс'];
if (!val) return '';
return val === 8 ? days[6] : days[val-2];
}
Here we defined a function format
(converts a numeric week day into a localized day name) and then called it for campaign_criterion.ad_schedule.day_of_week
column.
So functions always accept only one parameter - a field value.
Google Ads API Report Fetcher can also works with built-in queries, which use the following syntax:
SELECT * FROM builtin.builtin_query_name
It expacts to provide a built-in query name when selecting from special builtin
namespace.
Currently the following queries are available:
ocid_mapping
- returnaccount_id
andocid
from each child account under MCC;ocid
can be used to build links to Google Ads UI.
You can use normal sql type parameters with sql
argument (NodeJS only):
SELECT *
FROM {dst_dataset}.{table-src}
WHERE name LIKE @name
and to execute:
gaarf-bq --macro.table-src=table1 --macro.dst_dataset=dataset1 --sql.name='myname%'
it will create a parameterized query to run in BQ:
SELECT *
FROM dataset1.table1
WHERE name LIKE @name