Google Search Console

Install

See the Install guide for the full setup, including Windows PowerShell.

curl -fsSL https://install.skippr.io/install.sh | sh

Installing Skippr means accepting the Skippr EULA.

Syncs Search Analytics daily facts (clicks, impressions, CTR, position) from the Google Search Console API. Each namespace is one bronze grain with replace-by-date landing semantics.

Complements SEO crawl and PageSpeed — join in the warehouse on page URL + date.
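
As a rough illustration of the join on page URL + date, the sketch below uses an in-memory SQLite database in place of the warehouse. The table names (gsc_page_daily, seo_crawl_daily) and the status_code column are hypothetical; only the join key comes from the text above.

```python
import sqlite3

# Hypothetical tables standing in for warehouse tables; names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gsc_page_daily (date TEXT, page TEXT, clicks INTEGER, impressions INTEGER);
CREATE TABLE seo_crawl_daily (date TEXT, page TEXT, status_code INTEGER);
INSERT INTO gsc_page_daily VALUES ('2024-01-01', 'https://example.com/', 12, 340);
INSERT INTO gsc_page_daily VALUES ('2024-01-01', 'https://example.com/pricing', 3, 90);
INSERT INTO seo_crawl_daily VALUES ('2024-01-01', 'https://example.com/', 200);
""")

# Join search metrics to crawl results on page URL + date.
# LEFT JOIN keeps pages that have search data but no crawl row for that day.
joined = conn.execute("""
SELECT g.date, g.page, g.clicks, g.impressions, c.status_code
FROM gsc_page_daily AS g
LEFT JOIN seo_crawl_daily AS c
  ON c.page = g.page AND c.date = g.date
ORDER BY g.page
""").fetchall()

for row in joined:
    print(row)
```

The same ON clause should carry over to the warehouse SQL dialect, provided both sides store the page URL in the same normalized form.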

How it works

  1. Authenticates with bearer token, OAuth2 refresh, or service account (webmasters.readonly).
  2. For each selected stream, calls searchAnalytics.query with dataState: final (default).
  3. Paginates with startRow until a page returns fewer than rowLimit rows (max 25,000 rows per page).
  4. Skips the trailing processing_lag_days (default 3) for stable final data.
  5. Re-syncs lookback_days (default 3) of mature dates; replace_partition on date overwrites revised metrics.
  6. Optional search_appearance_daily is skipped with a warning when the API returns 400.
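
Steps 2 and 3 can be sketched as follows. This is a minimal illustration, not the plugin's code: fetch_page is a hypothetical callable that POSTs the body to searchAnalytics.query and returns the decoded JSON, and the body field names follow the public Search Console API request format.

```python
from typing import Callable, Dict, Iterator, List

ROW_LIMIT = 25_000  # API maximum per page, per step 3


def build_query(date: str, dimensions: List[str], start_row: int,
                row_limit: int = ROW_LIMIT) -> Dict:
    """Request body for one searchAnalytics.query page covering a single day."""
    return {
        "startDate": date,
        "endDate": date,
        "dimensions": dimensions,
        "type": "web",          # corresponds to search_type in the config
        "dataState": "final",   # corresponds to data_state in the config
        "rowLimit": row_limit,
        "startRow": start_row,
    }


def fetch_all_rows(fetch_page: Callable[[Dict], Dict], date: str,
                   dimensions: List[str],
                   row_limit: int = ROW_LIMIT) -> Iterator[Dict]:
    """Yield every row for `date`, advancing startRow until a short page arrives."""
    start_row = 0
    while True:
        body = build_query(date, dimensions, start_row, row_limit)
        # The API omits "rows" entirely when a page is empty.
        rows = fetch_page(body).get("rows", [])
        yield from rows
        if len(rows) < row_limit:
            break
        start_row += len(rows)
```

Injecting fetch_page keeps the loop testable without network access; a real implementation would also need to handle rate limits and HTTP errors.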

Configuration

Engine skippr.yml (runtime plugin):

```yaml
pipelines:
  gsc_seo:
    data_source: data_sources.gsc
    data_sink: data_sinks.athena

data_sources:
  gsc:
    GoogleSearchConsole:
      site_url: "https://example.com/"
      start_date: "2024-01-01"
      stream_profile: full
      lookback_days: 7
      processing_lag_days: 3
      window_in_days: 1
      search_type: web
      data_state: final
      access_token: ${GSC_ACCESS_TOKEN}
```

Public skippr.yaml (via skippr connect):

```yaml
source:
  kind: google_search_console
  site_url: "https://example.com/"
  start_date: "2024-01-01"
  stream_profile: standard
  lookback_days: 7
  access_token: ${GSC_ACCESS_TOKEN}
```

| Field | Default | Description |
| --- | --- | --- |
| site_url | (required) | https://example.com/ (trailing slash) or sc-domain:example.com |
| start_date | (required) | First date to sync (YYYY-MM-DD) |
| end_date | | Last date; omit to sync through yesterday minus lag |
| stream_profile | full | minimal (1), standard (5), or full (9 namespaces) |
| streams | profile set | Comma-separated namespaces; overrides profile |
| lookback_days | 3 | Mature days to re-fetch each run |
| processing_lag_days | 3 | Skip syncing the last N calendar days |
| window_in_days | 1 | Days per API date-range chunk |
| search_type | web | Search type filter for the API |
| data_state | final | final (stable) or all (includes fresh incomplete days) |
| row_limit | 25000 | Max rows per API page |
| url_inspection_enabled | false | Enable URL Inspection API |
| url_list | [] | URLs to inspect (max 20 per run) |
| Auth fields | | access_token, OAuth refresh, or service account |
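
To make the interaction between start_date, lookback_days, processing_lag_days, and window_in_days concrete, here is one plausible reading as a sketch. The function name sync_windows and the last_synced cursor are hypothetical, and the exact boundary arithmetic ("yesterday minus lag") is an assumption rather than the plugin's confirmed behavior.

```python
from datetime import date, timedelta
from typing import Iterator, Optional, Tuple


def sync_windows(today: date, start_date: date,
                 last_synced: Optional[date] = None,
                 lookback_days: int = 3,
                 processing_lag_days: int = 3,
                 window_in_days: int = 1) -> Iterator[Tuple[date, date]]:
    """Yield inclusive (start, end) date ranges to request in one run."""
    # Assumption: the last date synced is yesterday minus the processing lag.
    end = today - timedelta(days=1 + processing_lag_days)
    if last_synced is None:
        first = start_date  # first run: backfill from start_date
    else:
        # Re-fetch the most recent lookback_days already-synced dates.
        first = max(start_date, last_synced - timedelta(days=lookback_days - 1))
    cursor = first
    while cursor <= end:
        window_end = min(cursor + timedelta(days=window_in_days - 1), end)
        yield cursor, window_end
        cursor = window_end + timedelta(days=1)
```

Under these assumptions, a run on 2024-03-10 with the defaults and a cursor at 2024-03-05 would re-fetch 2024-03-03 through 2024-03-05 and then sync 2024-03-06 as the newest mature date.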

Stream profiles

| Profile | Namespaces | Use case |
| --- | --- | --- |
| full | 9 | Production (includes page_query_daily, sitemaps, run summary) |
| standard | 5 | Core breakdowns without high-cardinality page×query |
| minimal | 1 | Dev/CI / discover sampling (site_daily only) |

Bronze catalog (stream_profile: full)

| Namespace | Grain | Notes |
| --- | --- | --- |
| google_search_console.site_daily | date | Site totals |
| google_search_console.query_daily | date, query | |
| google_search_console.page_daily | date, page | |
| google_search_console.device_daily | date, device | |
| google_search_console.country_daily | date, country | |
| google_search_console.page_query_daily | date, page, query | High cardinality |
| google_search_console.search_appearance_daily | date, searchAppearance | Optional (skipped on 400) |
| google_search_console.sitemap_daily | date, path | Sitemap list snapshot |
| google_search_console.site_run_daily | date | Rows synced / API errors per run |

Config-gated: google_search_console.url_inspection_daily when url_inspection_enabled: true.

Metrics (all Search Analytics streams): clicks, impressions, ctr, position.

Landing semantics

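Each namespace lands with replace-by-date semantics: rows for a re-synced date overwrite the rows previously landed for that date (replace_partition on date). See Source landing semantics. Pair with the Athena sink.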
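
The replace-by-date behavior can be pictured with a small in-memory sketch. The dict-of-lists "table" and the function name replace_partitions are hypothetical stand-ins for the sink; the point is only that incoming dates overwrite whole partitions instead of appending.

```python
from typing import Dict, List

Row = Dict[str, object]


def replace_partitions(table: Dict[str, List[Row]], incoming: List[Row]) -> None:
    """Overwrite every date partition present in `incoming`; leave others untouched."""
    fresh: Dict[str, List[Row]] = {}
    for row in incoming:
        fresh.setdefault(str(row["date"]), []).append(row)
    # Each key in `fresh` replaces the whole partition for that date.
    table.update(fresh)


table = {
    "2024-01-01": [{"date": "2024-01-01", "clicks": 10}],
    "2024-01-02": [{"date": "2024-01-02", "clicks": 4}],
}
# A lookback re-sync returns revised metrics for 2024-01-02 only.
replace_partitions(table, [{"date": "2024-01-02", "clicks": 6}])
```

Because the whole partition is swapped, re-running the same dates stays idempotent and never duplicates rows.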

Authentication

Scope: https://www.googleapis.com/auth/webmasters.readonly.

Bearer access token (GSC_ACCESS_TOKEN)

Short-lived OAuth 2.0 access token. Set in the environment or skippr.yml:

```bash
export GSC_ACCESS_TOKEN="ya29...."
```

OAuth refresh

```yaml
oauth_token_url: https://oauth2.googleapis.com/token
oauth_client_id: ${GSC_OAUTH_CLIENT_ID}
oauth_client_secret: ${GSC_OAUTH_CLIENT_SECRET}
oauth_refresh_token: ${GSC_OAUTH_REFRESH_TOKEN}
```
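
For context, these four fields map onto a standard OAuth 2.0 refresh-token grant. The sketch below builds that request with the Python standard library; the helper names are hypothetical, and it shows the general protocol rather than the plugin's internals.

```python
import json
import urllib.parse
import urllib.request


def build_refresh_request(token_url: str, client_id: str,
                          client_secret: str,
                          refresh_token: str) -> urllib.request.Request:
    """Build the form-encoded POST that exchanges a refresh token for an access token."""
    form = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def parse_access_token(response_body: bytes) -> str:
    """Extract the short-lived bearer token from the token endpoint's JSON response."""
    return json.loads(response_body)["access_token"]
```

Sending the request with urllib.request.urlopen and passing the response body to parse_access_token would yield the same kind of short-lived token as GSC_ACCESS_TOKEN.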

Service account

Add the service account email as a user on the property in Search Console, then:

```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```

Or set service_account_json_path in config.

CLI

```bash
skippr connect source google-search-console \
  --site-url "https://example.com/" \
  --start-date 2024-01-01 \
  --stream-profile standard \
  --lookback-days 7 \
  --access-token "${GSC_ACCESS_TOKEN}"
```

On sync, the plugin validates site_url via sites.list when credentials are available.
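
How the plugin performs this check is not documented here; the sketch below shows one way such a validation could work against a decoded sites.list response. The function name is hypothetical, while the siteEntry, siteUrl, and permissionLevel fields follow the public Search Console API response format.

```python
from typing import Dict


def site_is_accessible(site_url: str, sites_list_response: Dict) -> bool:
    """Return True if `site_url` appears in sites.list with a usable permission level."""
    for entry in sites_list_response.get("siteEntry", []):
        if entry.get("siteUrl") == site_url:
            # Unverified users are listed but cannot read Search Analytics data.
            return entry.get("permissionLevel") != "siteUnverifiedUser"
    return False


response = {
    "siteEntry": [
        {"siteUrl": "https://example.com/", "permissionLevel": "siteOwner"},
        {"siteUrl": "sc-domain:example.org", "permissionLevel": "siteUnverifiedUser"},
    ]
}
print(site_is_accessible("https://example.com/", response))
```

The comparison is an exact string match, which is why the trailing slash on URL-prefix properties and the sc-domain: prefix on domain properties matter in site_url.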

Development fixtures

SKIPPR_GOOGLE_SEARCH_CONSOLE_FIXTURE_DIR — JSON under plugins/data_source/google_search_console/tests/fixtures/.

See Environment variables.