Google Cloud has made the Datastream for BigQuery tool generally available. This allows developers to stream data updates from sources into BigQuery in near real-time.
The tool had been in beta since September 2020. Datastream for BigQuery replicates updates from source databases into BigQuery tables in near real-time.
This form of data integration eliminates the need to build data pipelines or program ETL and ELT processes. This makes data integration in BigQuery faster and more efficient.
Advantages
According to Google Cloud, benefits of the solution include real-time insights in BigQuery and serverless ETL and ELT pipelines that scale automatically and require no setup or management.
Datastream for BigQuery also tolerates changes to source schemas. It handles schema drift by automatically replicating new columns and tables from the source to the BigQuery environment. To do this, the solution uses BigQuery's new change data capture (CDC) capability and the Storage Write API's UPSERT feature.
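To make the idea of CDC with UPSERT and schema drift more concrete, here is a minimal in-memory sketch in Python. It does not call Datastream or the BigQuery Storage Write API. The event format (`op`, `key`, `row`) and the function names `apply_change` and `replay` are assumptions made up for illustration. It only shows the general pattern: change events are keyed by primary key, an UPSERT inserts or replaces a row, a DELETE removes it, and columns that appear later are added to the destination schema.

```python
from typing import Any

# Hypothetical change events, in the order they were captured at the source.
EVENTS = [
    {"op": "UPSERT", "key": 1, "row": {"id": 1, "name": "Ada"}},
    {"op": "UPSERT", "key": 2, "row": {"id": 2, "name": "Bob"}},
    {"op": "UPSERT", "key": 3, "row": {"id": 3, "name": "Cy"}},
    # The source gained an "email" column: schema drift.
    {"op": "UPSERT", "key": 1,
     "row": {"id": 1, "name": "Ada L.", "email": "ada@example.com"}},
    {"op": "DELETE", "key": 3},
]


def apply_change(table: dict, columns: list, event: dict) -> None:
    """Apply one change event to an in-memory table keyed by primary key."""
    key = event["key"]
    if event["op"] == "DELETE":
        table.pop(key, None)  # deleting a missing row is a no-op
        return
    # Schema drift: register any column the destination has not seen yet.
    for name in event["row"]:
        if name not in columns:
            columns.append(name)
    # UPSERT: insert the row, or replace the existing row with the same key.
    table[key] = dict(event["row"])


def replay(events: list) -> tuple:
    """Replay events in order and return (columns, rows sorted by key)."""
    table: dict = {}
    columns: list = []
    for event in events:
        apply_change(table, columns, event)
    # Rows written before a column existed get None for that column.
    rows = [{c: table[k].get(c) for c in columns} for k in sorted(table)]
    return columns, rows
```

Replaying `EVENTS` leaves two rows: the updated row for key 1 and the untouched row for key 2, whose `email` value is `None` because the column did not exist when that row was written.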
Minimal adjustments
Users only need to configure the source database, connection type, and destination in BigQuery. Supported sources include MySQL, PostgreSQL, AlloyDB, and Oracle.
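The three settings mentioned above can be pictured as a small configuration record. The Python sketch below is an illustration only: `StreamSettings`, its field names, and the example values are hypothetical and are not Datastream's actual API or configuration format. The list of source types is taken from the article.

```python
from dataclasses import dataclass

# Source types named in the article; the lowercase spellings are an assumption.
SUPPORTED_SOURCES = {"mysql", "postgresql", "alloydb", "oracle"}


@dataclass(frozen=True)
class StreamSettings:
    """Hypothetical container for the three settings a user supplies."""
    source_type: str          # which kind of database is replicated
    connection_type: str      # how the service reaches the source
    destination_dataset: str  # where the data lands in BigQuery

    def __post_init__(self) -> None:
        if self.source_type.lower() not in SUPPORTED_SOURCES:
            raise ValueError(f"unsupported source: {self.source_type!r}")
        if not self.connection_type or not self.destination_dataset:
            raise ValueError("connection type and destination are required")


# Example with made-up values.
settings = StreamSettings("PostgreSQL", "private", "analytics")
```

Validating at construction time means an unsupported source is rejected before any replication would be attempted.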