Google Cloud Storage
Google Cloud Storage (GCS) is an enterprise-grade object storage service. Unlike traditional file storage (like your computer's folders) or block storage (like a hard drive for a virtual machine), object storage treats every file as a discrete "object" bundled with metadata and a unique identifier.
Think of it as a massive, high-tech warehouse where you can store an unlimited amount of data—from small text files to 5 TB video files—and retrieve them instantly from anywhere in the world.
Core Concepts: How it’s Organized
To understand GCS, you need to know three primary levels of organization:
Buckets: These are the basic containers that hold your data. Every bucket has a globally unique name and is assigned to a specific geographic location (like us-east1 or europe-west2).
Objects: This is the actual data you upload (photos, logs, backups). Each object consists of the data itself, a unique key (name), and metadata (e.g., content type, creation date).
Projects: All buckets live inside a Google Cloud Project, which handles billing and permissions.
Note: Objects in GCS are immutable. You cannot "edit" a file in place. If you change a file and re-upload it, GCS either overwrites the old one or creates a new version (if versioning is enabled).
Storage Classes (Cost vs. Access)
Google offers four main storage classes. They all offer the same high durability and low latency, but they differ in price based on how often you plan to access the data.
Key Features
Durability & Availability: GCS is designed for 99.999999999% (11 nines) annual durability. This means that if you store 10 million objects, you can statistically expect to lose one object every 10,000 years.
Object Lifecycle Management: You can set rules to save money automatically. For example: "Move objects to Archive storage 30 days after they were created." (Lifecycle conditions are based on object age and properties, not on when an object was last accessed.)
Consistency: GCS provides strong global consistency. As soon as an upload is successful, any subsequent request to read that object will return the new version.
Security: Data is encrypted by default "at rest" (on the disk) and "in transit" (moving over the internet). You can use Google-managed keys or your own.
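Lifecycle rules like the one above are defined in a small JSON policy and applied with `gsutil lifecycle set` (or the equivalent `gcloud storage` command). A minimal sketch, using the bucket name from this walkthrough; the 30- and 365-day thresholds are just illustrative:

```json
{
  "rule": [
    {
      "action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
      "condition": {"age": 30}
    },
    {
      "action": {"type": "Delete"},
      "condition": {"age": 365}
    }
  ]
}
```

Saved as `lifecycle.json`, this would be applied with `gsutil lifecycle set lifecycle.json gs://cloud-storage-exam/`: objects move to Archive 30 days after creation and are deleted after a year.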
How to Access Your Data
There are four primary ways to interact with GCS:
Google Cloud Console: A web-based GUI for point-and-click management.
gsutil / gcloud storage: Command-line tools for power users and scripting.
Client Libraries: SDKs for languages like Python, Java, Go, and Node.js to integrate storage into your apps.
REST API: Standard HTTP requests (JSON or XML) for custom integrations.
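To illustrate the REST API option: every object is addressable by a predictable JSON API URL, but because object names may contain slashes, the name must be URL-encoded first. A small sketch (the bucket and object names here are hypothetical):

```python
from urllib.parse import quote

def object_metadata_url(bucket: str, object_name: str) -> str:
    """Build the JSON API URL for an object's metadata.

    Object names can contain '/', so they must be percent-encoded;
    safe="" forces '/' to become %2F, as the JSON API requires.
    """
    return (f"https://storage.googleapis.com/storage/v1"
            f"/b/{bucket}/o/{quote(object_name, safe='')}")

print(object_metadata_url("cloud-storage-exam", "logs/2025/app.log"))
```

A GET request to that URL (with an OAuth token) returns the object's metadata; appending `?alt=media` downloads the data itself.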
Common Use Cases
Hosting Static Websites: You can point a domain to a bucket to host HTML, CSS, and images.
Data Lakes: Storing massive amounts of raw data for BigQuery or AI/Machine Learning processing.
Backup and Recovery: Storing system images or database dumps safely off-site.
Content Delivery: Serving high-res videos or images to users globally via Google’s edge network.
The sections below recap the fundamental concepts and architecture of cloud object storage (such as Google Cloud Storage or AWS S3), which functions as a global, scalable system for storing data as "objects" rather than traditional files.
## Core Concepts: Buckets and Objects
Cloud Storage uses a flat hierarchy consisting of two primary components:
Buckets: These are the top-level containers for your data. Every bucket must have a globally unique name across the entire service provider. Think of them as massive, virtual "disks."
Objects: These are the actual files (up to 5TB each) stored inside buckets. Each object has a unique name within its specific bucket.
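Because the hierarchy is flat, there are no real directories: the "folders" you see in the console are just a naming convention over object keys containing slashes. This sketch simulates how a delimiter-based listing groups flat keys into pseudo-folders (the keys are made up for illustration):

```python
def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Group flat object keys the way a delimiter-based listing does:
    keys directly under `prefix` come back as objects, while deeper
    keys collapse into pseudo-'folder' prefixes."""
    objects, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Deeper key: report only its next path segment as a "folder"
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(folders)

keys = ["index.html", "img/logo.png", "img/icons/home.svg"]
print(list_with_delimiter(keys))          # index.html plus the img/ prefix
print(list_with_delimiter(keys, "img/"))  # contents "inside" img/
```

The real service does the same trick server-side: `img/logo.png` is a single flat key, and `img/` only appears to be a folder when you list with a delimiter.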
## Data Availability and Location
Strategic placement of your data affects both cost and performance:
Regional: Data stays in one specific geographic area to minimize latency (the delay in data transfer) for local users or VMs.
Multi-regional: Data is replicated across several regions. This ensures high availability and places data closer to a global customer base.
Durability: Data is automatically spread across multiple physical disks to prevent loss.
## Key Features & Mechanics
Storage as an API: Unlike a standard hard drive, you move files in and out using an API (Application Programming Interface).
Key-Value Model: Technically, Cloud Storage acts as a key-value store optimized for large values.
Lifecycle & Access: You can configure how long data is kept (lifecycle) and who is allowed to see it (access control).
Cost Factor: While multi-regional data is more resilient, reading data across different regions usually incurs "network transfer fees."
## How to Start
Navigate to the Cloud Storage section in your cloud console.
Create a bucket first (remembering the name must be unique worldwide).
Upload your objects into that bucket.
Build a Cloud Storage Bucket
Type cloud storage in the search box, then click Cloud Storage.
Click Create bucket.
I used the globally unique name cloud-storage-exam.
Select Region, then hit Continue.
(No need for Multi-region or Dual-region; choosing Region keeps costs lower.)
In Google Cloud Storage (GCS), the Location Type determines how your data is replicated across geographic areas. This choice is critical because it directly impacts your data's availability, latency, and cost.
Region Storage - High-performance data processing and lower costs.
Your data is stored within a single geographic region.
Performance: lowest latency and highest throughput because your compute resources can be "co-located" in the same region as your data.
Cost: Generally the cheapest storage option. There are no replication fees.
Risk: If an entire region goes offline (due to a major natural disaster or large-scale outage), your data is inaccessible until the region is restored.
Dual-Regional Storage - Mission-critical apps requiring high availability and low latency.
Data is automatically replicated across two specific regions within the same continent.
Availability: Very high. If one region goes down, Google automatically fails over to the second region.
Performance: You get "Region-like" performance if your compute is in either of the two chosen regions.
Cost: The most expensive option. You pay for the storage in both regions and a replication fee for data written to the bucket.
Multi-Regional Storage - Serving content to users across a large area
Data is spread across at least two regions within a large geographic area (e.g., the entire United States or the European Union).
Availability: Higher than a single region.
Performance: Great for "serving" data to users scattered across a continent, serving the data from the location closest to the user.
Cost: Mid-range. It is more expensive than Regional but often cheaper than Dual-Regional.
Next, choose how to store your data.
Select Standard. Hit Continue
Different storage classes are designed to help you balance at-rest storage costs against data access costs.
Standard Storage - Data that you access frequently.
Highest storage price but no retrieval fees.
Latency: Milliseconds (instant).
Nearline Storage - Low-cost option for data you expect to access less than once a month.
Coldline Storage - Designed for data you plan to access at most once every 90 days. Disaster recovery planning and older backups.
Archive Storage - The cheapest storage class for data you access less than once a year. Regulatory compliance, legal archives.
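The trade-off between the classes is storage price versus retrieval price. The numbers below are illustrative assumptions (not current list prices; check the GCS pricing page), but the arithmetic shows why "coldest is cheapest" only holds if you rarely read the data:

```python
# Illustrative (NOT current) prices, roughly in line with the
# storage-class trade-off: colder classes store cheaper but
# charge more per GB retrieved.
CLASSES = {
    #            storage $/GB/month   retrieval $/GB
    "STANDARD": (0.020,               0.00),
    "NEARLINE": (0.010,               0.01),
    "COLDLINE": (0.004,               0.02),
    "ARCHIVE":  (0.0012,              0.05),
}

def monthly_cost(cls, stored_gb, read_gb):
    """Total monthly cost: at-rest storage plus retrieval fees."""
    store, retrieve = CLASSES[cls]
    return stored_gb * store + read_gb * retrieve

# 1 TB of backups: read in full monthly vs. almost never touched
for cls in CLASSES:
    print(f"{cls:9} read-monthly: ${monthly_cost(cls, 1024, 1024):7.2f}"
          f"   rarely-read: ${monthly_cost(cls, 1024, 0):7.2f}")
```

At these assumed rates, Archive is far cheaper for untouched backups, but if you read the full terabyte every month its retrieval fees make it the most expensive choice.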
Uncheck Enforce public access prevention (this enables internet access, e.g., for website hosting).
Can leave the rest of selections with default values.
Then hit Create
Cloud Storage currently has a separate command-line tool called gsutil, which allows you to list and manage Cloud Storage buckets.
Open cloud shell
gsutil is a Python application that lets you access Cloud Storage from the command line.
Welcome to Cloud Shell! Type "help" to get started, or type "gemini" to try prompting with Gemini CLI.
Your Cloud Platform project in this session is set to cloud-project-examples.
Use `gcloud config set project [PROJECT_ID]` to change to a different project.
john_iacovacci1@cloudshell:~ (cloud-project-examples)$ gsutil ls
gs://cloud-storage-exam/
john_iacovacci1@cloudshell:~ (cloud-project-examples)$
Now upload a simple text file with gsutil
Create a file using the Linux echo command and standard output redirection.
john_iacovacci1@cloudshell:~ (cloud-project-examples)$ mkdir cloudstore
john_iacovacci1@cloudshell:~ (cloud-project-examples)$ cd cloudstore
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ echo "This is my first file!" > my_first_file.txt
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ cat my_first_file.txt
This is my first file!
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ ls -lt my_first_file.txt
-rw-rw-r-- 1 john_iacovacci1 john_iacovacci1 23 Jan 21 21:02 my_first_file.txt
Use the gsutil cp command to copy this file to the Cloud Storage bucket.
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ gsutil cp my_first_file.txt gs://cloud-storage-exam/
Copying file://my_first_file.txt [Content-Type=text/plain]...
/ [1 files][ 23.0 B/ 23.0 B]
Operation completed over 1 objects/23.0 B.
Use the gsutil ls command to find the file in the bucket.
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ gsutil ls gs://cloud-storage-exam/my_first_file.txt
gs://cloud-storage-exam/my_first_file.txt
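A few other gsutil commands are handy once the object is in place. These are standard gsutil subcommands, shown here against the bucket created earlier in this walkthrough:

```
# Print the object's contents without downloading it
gsutil cat gs://cloud-storage-exam/my_first_file.txt

# Show metadata (size, content type, storage class)
gsutil stat gs://cloud-storage-exam/my_first_file.txt

# Copy it back out of the bucket, then delete the object
gsutil cp gs://cloud-storage-exam/my_first_file.txt ./copy_of_file.txt
gsutil rm gs://cloud-storage-exam/my_first_file.txt
```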
Now when I go back to my storage bucket I can see the file
Click into bucket name
The file (called an object in this context) made its way into your newly created bucket.
Access control
Core Access Control Methods
Default Security: By default, all newly created resources are "locked down," meaning they are only accessible to individuals who have explicit access to your specific project.
Access Control Lists (ACLs): This mechanism allows for "fine-grained" control. You can define specific permissions for individual buckets and the objects within them, rather than just applying broad project-wide rules.
Service Accounts: You can interact with and manage data by authorizing a service account—a special type of Google account intended for applications or virtual machines rather than individual people.
Management and Roles
Permission Roles: Cloud Storage utilizes specific roles that define what a user can or cannot do (e.g., viewing vs. modifying data).
Console Management: You can manage these permissions directly through the Google Cloud Console. For buckets, this is typically accessed via the three-dot action menu on the far right of the bucket list.
Edit bucket permissions
Control access to your objects by assigning these roles to different actors. To set the bucket for public access:
Click on Permissions
Then we GRANT ACCESS
For Add principals, we enter allUsers
For Assign roles, we select Cloud Storage
Then Storage Object Viewer
Next we ALLOW PUBLIC ACCESS
Hit Allow Public Access
The bucket will then state that public access is allowed.
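The same grant can be made from Cloud Shell instead of the console, using gsutil's iam subcommand (bucket name from this walkthrough):

```
# Grant the Storage Object Viewer role to allUsers,
# making every object in the bucket publicly readable
gsutil iam ch allUsers:objectViewer gs://cloud-storage-exam
```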