Easy CSV importing into Cloud Bigtable

By mullaned2002

April 12, 2022


If you’re trying to learn a new database, you’ll want to kick the tires by loading in some data and maybe doing a query or two. Cloud Bigtable is a powerful database service for scale and throughput, and it is quite flexible in how you store data because of its NoSQL nature. Because Bigtable works at scale, the tools that you use to read and write to it tend to be great for large datasets, but not so much for just trying it out for half an hour. A few years ago, I tried to tackle this by putting together a tutorial on importing CSV files using a Dataflow job, but that requires spinning up several VMs, which can take some time.

Here on the Bigtable team, we saw that the CSV import tutorial was a really popular example despite the need to create VMs, and we heard feedback from people wanting a faster way to dive in. So now we are excited to launch a CSV importer for Bigtable in the cbt CLI tool. The new importer takes a local file and then uses the Go client library to quickly import the data without the need to spin up any VMs or build any code.

Installation

If you already have the gcloud CLI with the cbt tool installed, make sure it is up to date by running gcloud components update. Otherwise, install the gcloud CLI, which provides the cbt tool as a component.

If you’re unable to install the tools on your machine, you can also access them via Cloud Shell in the Google Cloud console.
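As a quick sanity check before going further, the sketch below reports whether gcloud and cbt are on your PATH. The have_tool helper is a name made up for this example; it only wraps the standard command -v lookup and does not install or update anything.

```shell
#!/usr/bin/env bash
# have_tool is a hypothetical helper for this sketch: it succeeds when the
# named command can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

for tool in gcloud cbt; do
  if have_tool "$tool"; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: not found"
  fi
done

# If gcloud is present but cbt is not, the component can be added with:
#   gcloud components install cbt
```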

Importing data

I have a CSV file with some time series data in the public Bigtable bucket, so I’ll use that for the example. Feel free to download it yourself to try out the tool too. Note that these steps assume that you have already created a Google Cloud project and a Cloud Bigtable instance.

    gsutil cp gs://cloud-bigtable-public-datasets/csv-import-blog-data.csv .
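The cbt commands in this walkthrough do not name a project or instance, so cbt has to get them from somewhere else. One common way is a ~/.cbtrc file with project and instance entries. The sketch below only prints that content; cbtrc_contents is a helper invented for this example, and my-project and my-instance are placeholder IDs to replace with your own.

```shell
#!/usr/bin/env bash
# cbtrc_contents is a hypothetical helper: it prints the two settings cbt
# looks for in ~/.cbtrc, given a project ID and an instance ID.
cbtrc_contents() {
  printf 'project = %s\ninstance = %s\n' "$1" "$2"
}

# Placeholder IDs (assumptions for illustration only).
cbtrc_contents "my-project" "my-instance"

# To actually save the settings, redirect the output yourself:
#   cbtrc_contents "my-project" "my-instance" > ~/.cbtrc
```

Printing instead of writing keeps the sketch from overwriting an existing configuration file by accident.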

You need to have a table ready for the import, so use this command to create one:

    cbt createtable mobile-time-series families="cell_data"

Then, to import the data, use the new cbt import command:

    cbt import mobile-time-series csv-import-blog-data.csv column-family=cell_data

You will see some output indicating that the data is being imported. After it’s done, you can use cbt to read rows back from your table:

    cbt read mobile-time-series

If you were following along, be sure to delete the table once you’re done with it.

    cbt deletetable mobile-time-series
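The steps in this section can be strung together in one script. The version below is a sketch, not an official tool: the run wrapper and the DRY_RUN switch are inventions of this example, and by default the script only prints each command so you can review it first. It also adds count=5 to cbt read so that only a handful of rows come back.

```shell
#!/usr/bin/env bash
set -e

TABLE="mobile-time-series"
FAMILY="cell_data"
CSV="csv-import-blog-data.csv"
# DRY_RUN is a script-local switch (an assumption of this sketch, not a cbt
# feature). Leave it at 1 to print commands; set it to 0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command, then run it unless we are in dry-run mode.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

walkthrough() {
  run gsutil cp "gs://cloud-bigtable-public-datasets/$CSV" .
  run cbt createtable "$TABLE" "families=$FAMILY"
  run cbt import "$TABLE" "$CSV" "column-family=$FAMILY"
  run cbt read "$TABLE" count=5
  run cbt deletetable "$TABLE"
}

walkthrough
```

Running it with DRY_RUN=0 executes the same five commands in order and stops at the first failure because of set -e.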

CSV format

[Figure: CSV file without column families]

[Figure: CSV file with column families]

The CSV file uses one header row that lists the column qualifiers, with the first cell left blank because the first column holds the row key. You can add an additional header row above it for the column families; if you do, remove the column-family argument from the import command.
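To make the two layouts concrete, the sketch below writes a tiny sample file for each into a temporary directory. The row keys, qualifiers, and values are invented for illustration and are not taken from the blog's dataset. In the second file the family name appears once and the cell to its right is left empty, which follows my reading of the cbt help text that a family applies to the columns on its right until another family is named; treat that detail as an assumption and check cbt help import.

```shell
#!/usr/bin/env bash
# Sample files only: row keys, qualifiers, and values are made up.
outdir="$(mktemp -d)"

# Layout 1: a single header row of column qualifiers. The first cell is
# empty because the first column holds the row key.
cat > "$outdir/without-families.csv" <<'EOF'
,os_build,data_plan
device-1,build-a,plan-1
device-2,build-b,plan-2
EOF

# Layout 2: an extra first header row naming the column family, followed by
# the same qualifier row and data rows.
cat > "$outdir/with-families.csv" <<'EOF'
,cell_data,
,os_build,data_plan
device-1,build-a,plan-1
device-2,build-b,plan-2
EOF

echo "Wrote sample files to $outdir"
```

The first file would be imported with column-family=cell_data on the command line, while the second carries the family in its own header row and would be imported without that argument.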

I hope this tool helps you get comfortable with Bigtable and lets you experiment with it more easily. Get started with Bigtable and the cbt command line with the Quickstart guide.
