Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

BigQuery DataTransfer: Error in scheduling runs

See original GitHub issue

I’m trying to use Python’s API for BigQuery DataTransfer, but I’m getting RPC errors. I’m not sure if there’s a problem with the API or a general configuration problem with the script.

google-cloud-python version (output of pip show google-cloud):

Name: google-cloud
Version: 0.32.0
Summary: API Client library for Google Cloud
Home-page: https://github.com/GoogleCloudPlatform/google-cloud-python
Author: Google Cloud Platform
Author-email: googleapis-publisher@google.com
License: Apache 2.0
Location: /home/ubuntu/python3_virtualenv/python3_env/lib/python3.6/site-packages
Requires: google-cloud-resource-manager, google-cloud-language, google-cloud-storage, google-cloud-trace, google-cloud-datastore, google-cloud-pubsub, google-cloud-core, google-cloud-speech, google-cloud-spanner, google-cloud-translate, google-cloud-vision, google-cloud-videointelligence, google-cloud-error-reporting, google-cloud-bigquery, google-cloud-firestore, google-cloud-bigquery-datatransfer, google-cloud-dns, google-cloud-bigtable, google-cloud-container, google-cloud-monitoring, google-cloud-logging, google-api-core, google-cloud-runtimeconfig

Example Code:

from google.cloud import bigquery_datatransfer
from google.protobuf import timestamp_pb2

client = bigquery_datatransfer.DataTransferServiceClient()

# Backfill window, as protobuf Timestamps (seconds since the Unix epoch).
timestamp_start = timestamp_pb2.Timestamp()
timestamp_start.FromSeconds(1524022447)

timestamp_end = timestamp_pb2.Timestamp()
timestamp_end.FromSeconds(1524133447)

# Look up the transfer config by its resource name, then schedule runs for it.
client.schedule_transfer_runs(client.get_transfer_config("projects/<PROJECT_ID>/locations/us/transferConfigs/<TRANSFER_ID>").name,
                              start_time=timestamp_start,
                              end_time=timestamp_end)
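
For reference, the same protobuf Timestamps can also be built from datetime objects, which makes the backfill window easier to read. A small equivalent sketch (naive datetimes are treated as UTC by FromDatetime):

import datetime

from google.protobuf import timestamp_pb2

# Equivalent to FromSeconds(1524022447) and FromSeconds(1524133447) above.
timestamp_start = timestamp_pb2.Timestamp()
timestamp_start.FromDatetime(datetime.datetime(2018, 4, 18, 3, 34, 7))

timestamp_end = timestamp_pb2.Timestamp()
timestamp_end.FromDatetime(datetime.datetime(2018, 4, 19, 10, 24, 7))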

Example Error:

---------------------------------------------------------------------------
_Rendezvous                               Traceback (most recent call last)
~/python3_virtualenv/python3_env/lib/python3.6/site-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     53         try:
---> 54             return callable_(*args, **kwargs)
     55         except grpc.RpcError as exc:

~/python3_virtualenv/python3_env/lib/python3.6/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials)
    499         state, call, = self._blocking(request, timeout, metadata, credentials)
--> 500         return _end_unary_response_blocking(state, call, False, None)
    501 

~/python3_virtualenv/python3_env/lib/python3.6/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    433     else:
--> 434         raise _Rendezvous(state, None, None, deadline)
    435 

_Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.INVALID_ARGUMENT, Request contains an invalid argument.)>

The above exception was the direct cause of the following exception:

InvalidArgument                           Traceback (most recent call last)
<ipython-input-40-13d74c5cc600> in <module>()
      7 client.schedule_transfer_runs(client.get_transfer_config("projects/967176960612/locations/us/transferConfigs/5aa1c6a1-0000-252e-b3b7-f403043605f4").name,
      8                               start_time=timestamp_start,
----> 9                               end_time=timestamp_end)

~/python3_virtualenv/python3_env/lib/python3.6/site-packages/google/cloud/bigquery_datatransfer_v1/gapic/data_transfer_service_client.py in schedule_transfer_runs(self, parent, start_time, end_time, retry, timeout)
    711             parent=parent, start_time=start_time, end_time=end_time)
    712         return self._schedule_transfer_runs(
--> 713             request, retry=retry, timeout=timeout)
    714 
    715     def get_transfer_run(self,

~/python3_virtualenv/python3_env/lib/python3.6/site-packages/google/api_core/gapic_v1/method.py in __call__(self, *args, **kwargs)
    137             kwargs['metadata'] = metadata
    138 
--> 139         return wrapped_func(*args, **kwargs)
    140 
    141 

~/python3_virtualenv/python3_env/lib/python3.6/site-packages/google/api_core/retry.py in retry_wrapped_func(*args, **kwargs)
    258                 sleep_generator,
    259                 self._deadline,
--> 260                 on_error=on_error,
    261             )
    262 

~/python3_virtualenv/python3_env/lib/python3.6/site-packages/google/api_core/retry.py in retry_target(target, predicate, sleep_generator, deadline, on_error)
    175     for sleep in sleep_generator:
    176         try:
--> 177             return target()
    178 
    179         # pylint: disable=broad-except

~/python3_virtualenv/python3_env/lib/python3.6/site-packages/google/api_core/timeout.py in func_with_timeout(*args, **kwargs)
    204             """Wrapped function that adds timeout."""
    205             kwargs['timeout'] = next(timeouts)
--> 206             return func(*args, **kwargs)
    207 
    208         return func_with_timeout

~/python3_virtualenv/python3_env/lib/python3.6/site-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     54             return callable_(*args, **kwargs)
     55         except grpc.RpcError as exc:
---> 56             six.raise_from(exceptions.from_grpc_error(exc), exc)
     57 
     58     return error_remapped_callable

~/python3_virtualenv/python3_env/lib/python3.6/site-packages/six.py in raise_from(value, from_value)

InvalidArgument: 400 Request contains an invalid argument.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

2 reactions
mxlei01 commented, Jun 7, 2018

@tswast I was able to create the transfer in the UI.

But I actually figured out that it was not an API issue; it was an authentication issue.

I'll post my solution here in case anyone runs into the same issue with data transfer.

  1. I created a service account for all my Google APIs.
  2. The service account does not have permission to access AdWords, because in order to gain access, the service account needs at least read-only access.
  3. But you cannot add the service account to AdWords: when you add an account to AdWords, AdWords sends a confirmation email to that account's address, which you can't read for a service account.
  4. The permissions issue does not get surfaced as such by Google's servers; instead it is returned as INVALID_ARGUMENT, which is incorrect. If you authenticate as the service account in gcloud alpha cloud-shell ssh and try the bq command: bq mk --transfer_config --project_id=PROJECT_ID --target_dataset=temp --display_name=temp_tr --params='{"customer_id": "XXX-XXX-XXXX", "exclude_removed_items": "false"}' --data_source=adwords, then it complains that adwords is an unknown data source (see the sketch after this list).
  5. But if you run the same command under a user account that has access to both the BigQuery and AdWords accounts, it succeeds.
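
On the Python side, the misleading status can at least be caught explicitly. A minimal sketch (the resource name is a placeholder; the hint in the comment reflects the diagnosis above, not anything the error message itself says):

from google.api_core.exceptions import InvalidArgument
from google.cloud import bigquery_datatransfer
from google.protobuf import timestamp_pb2

client = bigquery_datatransfer.DataTransferServiceClient()

timestamp_start = timestamp_pb2.Timestamp()
timestamp_start.FromSeconds(1524022447)
timestamp_end = timestamp_pb2.Timestamp()
timestamp_end.FromSeconds(1524133447)

try:
    client.schedule_transfer_runs(
        "projects/<PROJECT_ID>/locations/us/transferConfigs/<TRANSFER_ID>",
        start_time=timestamp_start,
        end_time=timestamp_end)
except InvalidArgument as exc:
    # As described above, a 400 INVALID_ARGUMENT here may really be a
    # permissions problem: the authenticated identity (e.g. a service
    # account) may lack access to the transfer's data source.
    print("Scheduling failed: {}. Check data-source permissions.".format(exc))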

The workaround I used was to authenticate as a user (as opposed to a service account) using:

https://console.developers.google.com/apis/credentials/oauthclient https://developers.google.com/oauthplayground/

On the API credentials page, create an OAuth2 client ID and secret, and enter them into the OAuth playground. This gives you a refresh token, which you can send to Google's authentication server to obtain an access token, for example:

To get an access token (they expire hourly):

import requests
import json

# client_id, client_secret, and refresh_token come from the OAuth2 client
# and the OAuth playground steps above.
request = requests.post(f"https://www.googleapis.com/oauth2/v4/token?client_id={client_id}&client_secret={client_secret}&refresh_token={refresh_token}&grant_type=refresh_token")
access_token = json.loads(request.text)["access_token"]

Then create the BigQuery Data Transfer client using the access token:

from google.cloud import bigquery_datatransfer
from google.oauth2.credentials import Credentials
client = bigquery_datatransfer.DataTransferServiceClient(credentials=Credentials(token=access_token))
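
Since these access tokens expire after an hour, a variant worth considering (a sketch, not from the original thread) is to hand the refresh token to google-auth directly, so that it fetches and refreshes access tokens for you:

from google.cloud import bigquery_datatransfer
from google.oauth2.credentials import Credentials

# client_id, client_secret, and refresh_token are the same placeholders
# as in the token-fetching snippet above.
credentials = Credentials(
    token=None,  # no access token yet; google-auth fetches one on demand
    refresh_token=refresh_token,
    token_uri="https://oauth2.googleapis.com/token",
    client_id=client_id,
    client_secret=client_secret)
client = bigquery_datatransfer.DataTransferServiceClient(credentials=credentials)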

Then you can successfully initiate a transfer:

project_id = "blabla-land-18304"
parent = client.project_path(project_id)
config = bigquery_datatransfer.types.TransferConfig()
config.destination_dataset_id = "temp"
config.display_name = "temp display"
config.data_source_id = "adwords"
config.schedule = "every 24 hours"
config.data_refresh_window_days = 7
config.disabled = False
config.params["customer_id"] = "XXX-XXX-XXXX"
config.params["exclude_removed_items"] = False
config.params["exclude_inactive_accounts"] = False
client.create_transfer_config(parent=parent, transfer_config=config)
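
Closing the loop on the original question: with the user credentials in place, schedule_transfer_runs should also succeed. A minimal sketch, assuming the result of create_transfer_config above is captured as transfer_config:

from google.protobuf import timestamp_pb2

transfer_config = client.create_transfer_config(parent=parent, transfer_config=config)

# Backfill window, as in the original report.
timestamp_start = timestamp_pb2.Timestamp()
timestamp_start.FromSeconds(1524022447)
timestamp_end = timestamp_pb2.Timestamp()
timestamp_end.FromSeconds(1524133447)

client.schedule_transfer_runs(transfer_config.name,
                              start_time=timestamp_start,
                              end_time=timestamp_end)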
0 reactions
tswast commented, Jun 7, 2018

Thanks for sharing! Sounds like it’ll be important to have user-authentication samples for BQ-DTS.

Read more comments on GitHub >

Top Results From Across the Web

Troubleshoot transfer configurations | BigQuery - Google Cloud
If you use the default scheduling values, the first transfer run starts immediately after the transfer is created, but it fails because your...
Read more >
BigQuery Data Transfer Service - Scheduler Not Working
1 Answer 1 ... It turns out that the run time doesn't get updated to the new run time, until after the job...
Read more >
How to Use the BigQuery Data Transfer Service?
If you're missing any data, you can schedule a backfill. Go back to the Transfer Details window and click the Schedule Backfill button...
Read more >
BigQuery Data Transfer Service for Google Ad Manager ...
Summary: BigQuery Data Transfer Service for Google Ad Manager transfer runs failing with error Description: We are experiencing an issue ...
Read more >
How to use Backfill: the Time Machine for Scheduled Queries ...
The solution to the problem above is explained in the Scheduling queries page of the BigQuery documentation. BigQuery provides the @run_time and @run_date ......
Read more >
