UCONN

Google Cloud Storage

Google Cloud Storage (GCS) is an enterprise-grade object storage service. Unlike traditional file storage (like your computer's folders) or block storage (like a hard drive for a virtual machine), object storage treats every file as a discrete "object" bundled with metadata and a unique identifier.

Think of it as a massive, high-tech warehouse where you can store an unlimited amount of data—from small text files to 5 TB video files—and retrieve them instantly from anywhere in the world.


Core Concepts: How it’s Organized

To understand GCS, you need to know three primary levels of organization:

  • Buckets: These are the basic containers that hold your data. Every bucket has a globally unique name and is assigned to a specific geographic location (like us-east1 or europe-west2).

  • Objects: This is the actual data you upload (photos, logs, backups). Each object consists of the data itself, a unique key (name), and metadata (e.g., content type, creation date).

  • Projects: All buckets live inside a Google Cloud Project, which handles billing and permissions.

Note: Objects in GCS are immutable. You cannot "edit" a file in place. If you change a file and re-upload it, GCS either overwrites the old one or creates a new version (if versioning is enabled).


Storage Classes (Cost vs. Access)

Google offers four main storage classes. They all offer the same high durability and low latency, but they differ in price based on how often you plan to access the data.

| Class | Ideal Use Case | Minimum Duration | Retrieval Fee |
| --- | --- | --- | --- |
| Standard | "Hot" data: Web content, streaming, active app data. | None | None |
| Nearline | "Warm" data: Monthly backups, old documents. | 30 days | Low |
| Coldline | "Cold" data: Disaster recovery, quarterly archives. | 90 days | Medium |
| Archive | "Frozen" data: Long-term regulatory/legal storage. | 365 days | High |
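A consequence of the minimum durations above: deleting or rewriting an object early is still billed as if it had been stored for the full minimum. A small sketch of that rule (the class names are real GCS storage classes; the function itself is illustrative):

```python
# Minimum storage duration (days) for each GCS storage class.
MIN_DURATION = {"STANDARD": 0, "NEARLINE": 30, "COLDLINE": 90, "ARCHIVE": 365}

def billable_days(storage_class: str, days_stored: int) -> int:
    """Early deletion is billed as if the object lived the full minimum."""
    return max(days_stored, MIN_DURATION[storage_class])

print(billable_days("STANDARD", 5))   # 5
print(billable_days("COLDLINE", 5))   # 90 - early deletion still bills 90 days
```

This is why Coldline and Archive only pay off for data you genuinely leave in place.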


Key Features

  • Durability & Availability: GCS is designed for 99.999999999% (11 nines) annual durability. If you stored 10 million objects, you could expect to lose one, on average, every 10,000 years.

  • Object Lifecycle Management: You can set rules to save money automatically. For example: "Move objects to Archive storage if they haven't been accessed in 30 days."

  • Consistency: GCS provides strong global consistency. As soon as an upload is successful, any subsequent request to read that object will return the new version.

  • Security: Data is encrypted by default "at rest" (on the disk) and "in transit" (moving over the internet). You can use Google-managed keys or your own.
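The lifecycle example above can be written as a policy document. Note that GCS lifecycle conditions are based on object age (days since creation) rather than last access, so `age` is the closest match; this sketch just builds and prints the JSON:

```python
import json

# Sketch of a lifecycle policy: move objects to Archive 30 days after creation.
# (GCS lifecycle conditions use object age, not a last-accessed timestamp.)
lifecycle = {
    "rule": [
        {
            "action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
            "condition": {"age": 30},
        }
    ]
}

# Written to a file, this could be applied with:
#   gsutil lifecycle set lifecycle.json gs://YOUR_BUCKET/
print(json.dumps(lifecycle, indent=2))
```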

How to Access Your Data

There are four primary ways to interact with GCS:

  1. Google Cloud Console: A web-based GUI for point-and-click management.

  2. gsutil / gcloud storage: Command-line tools for power users and scripting.

  3. Client Libraries: SDKs for languages like Python, Java, Go, and Node.js to integrate storage into your apps.

  4. REST API: Standard HTTP requests (JSON or XML) for custom integrations.
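All four access paths address the same objects; only the addressing scheme differs. A minimal sketch of the two most common URI forms (bucket and object names here are illustrative):

```python
from urllib.parse import quote

def gs_uri(bucket: str, obj: str) -> str:
    """gsutil-style URI used by the command-line tools."""
    return f"gs://{bucket}/{obj}"

def public_url(bucket: str, obj: str) -> str:
    """HTTPS form used to fetch publicly readable objects over the REST API."""
    return f"https://storage.googleapis.com/{bucket}/{quote(obj)}"

print(gs_uri("cloud-storage-exam", "my_first_file.txt"))
# gs://cloud-storage-exam/my_first_file.txt
print(public_url("cloud-storage-exam", "my first file.txt"))
# https://storage.googleapis.com/cloud-storage-exam/my%20first%20file.txt
```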

Common Use Cases

  • Hosting Static Websites: You can point a domain to a bucket to host HTML, CSS, and images.

  • Data Lakes: Storing massive amounts of raw data for BigQuery or AI/Machine Learning processing.

  • Backup and Recovery: Storing system images or database dumps safely off-site.

  • Content Delivery: Serving high-res videos or images to users globally via Google’s edge network.


The following reviews the fundamental concepts and architecture of Cloud Storage (such as Google Cloud Storage or AWS S3), which functions as a global, scalable system for storing data as "objects" rather than traditional files.

## Core Concepts: Buckets and Objects

Cloud Storage uses a flat hierarchy consisting of two primary components:

  • Buckets: These are the top-level containers for your data. Every bucket must have a globally unique name across the entire service provider. Think of them as massive, virtual "disks."

  • Objects: These are the actual files (up to 5TB each) stored inside buckets. Each object has a unique name within its specific bucket.

## Data Availability and Location

Strategic placement of your data affects both cost and performance:

  • Regional: Data stays in one specific geographic area to minimize latency (the delay in data transfer) for local users or VMs.

  • Multi-regional: Data is replicated across several regions. This ensures high availability and places data closer to a global customer base.

  • Durability: Data is automatically spread across multiple physical disks to prevent loss.

## Key Features & Mechanics

  • Storage as an API: Unlike a standard hard drive, you move files in and out using an API (Application Programming Interface).

  • Key-Value Model: Technically, Cloud Storage acts as a key-value store optimized for large values.

  • Lifecycle & Access: You can configure how long data is kept (lifecycle) and who is allowed to see it (access control).

  • Cost Factor: While multi-regional data is more resilient, reading data across different regions usually incurs "network transfer fees."
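The flat key-value model above can be sketched in a few lines; "folders" are just a naming convention over keys:

```python
# A bucket is conceptually a flat mapping from object name (key) to bytes (value).
bucket: dict[str, bytes] = {}

# Slashes in names only *look* like folders; the namespace is flat.
bucket["logs/2025/app.log"] = b"started"
bucket["logs/2025/db.log"] = b"connected"
bucket["readme.txt"] = b"hello"

# "Listing a folder" is really a prefix filter over keys.
logs = [k for k in bucket if k.startswith("logs/2025/")]
print(sorted(logs))
```

This is also why there is no "rename folder" operation: every object under the prefix must be copied to a new key.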

## How to Start

  1. Navigate to the Cloud Storage section in your cloud console.

  2. Create a bucket first (remembering the name must be unique worldwide).

  3. Upload your objects into that bucket.
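Step 2 hinges on picking a valid, globally unique name. A rough validator covering a subset of the naming rules (length and character checks only; it does not check global uniqueness or every GCS rule, such as the ban on IP-address-like names):

```python
import re

def looks_valid_bucket_name(name: str) -> bool:
    """Checks a subset of GCS bucket naming rules (not exhaustive):
    3-63 characters, lowercase letters/digits/dash/underscore/dot,
    must start and end with a letter or digit."""
    if not 3 <= len(name) <= 63:
        return False
    return re.fullmatch(r"[a-z0-9][a-z0-9._-]*[a-z0-9]", name) is not None

print(looks_valid_bucket_name("cloud-storage-exam"))  # True
print(looks_valid_bucket_name("Cloud_Storage"))       # False (uppercase)
```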


Build a Cloud Storage Bucket



Type cloud storage in the search box then click into Cloud Storage



Click into Create bucket


I used the globally unique name cloud-storage-exam



Select Region - Hit Continue


No need for Multi or Dual; choosing Region keeps costs lower.


In Google Cloud Storage (GCS), the Location Type determines how your data is replicated across geographic areas. This choice is critical because it directly impacts your data's availability, latency, and cost.

Region Storage - High-performance data processing and lower costs.

Your data is stored within a single geographic region.

Performance: Lowest latency and highest throughput, because your compute resources can be "co-located" in the same region as your data.

Cost: Generally the cheapest storage option. There are no replication fees.

Risk: If an entire region goes offline (due to a major natural disaster or large-scale outage), your data is inaccessible until the region is restored.


Dual-Regional Storage -  Mission-critical apps requiring high availability and low latency.

Data is automatically replicated across two specific regions within the same continent.

Availability: Very high. If one region goes down, Google automatically fails over to the second region. 

Performance: You get "Region-like" performance if your compute is in either of the two chosen regions.

Cost: The most expensive option. You pay for the storage in both regions and a replication fee for data written to the bucket.


Multi-Regional Storage - Serving content to users across a large area 

Data is spread across at least two regions within a large geographic area (e.g., the entire United States or the European Union).

Availability: Higher than a single region.

Performance: Great for "serving" data to users scattered across a continent, serving the data from the location closest to the user.

Cost: Mid-range. It is more expensive than Regional but often cheaper than Dual-Regional.
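The trade-offs above can be condensed into a toy decision helper (illustrative only, not official guidance):

```python
def pick_location_type(needs_failover: bool, continent_wide_users: bool) -> str:
    """Toy helper mirroring the location-type trade-offs described above."""
    if needs_failover:
        return "dual-region"     # automatic failover between two named regions
    if continent_wide_users:
        return "multi-region"    # serve users spread across a large geographic area
    return "region"              # cheapest; lowest latency for co-located compute

print(pick_location_type(False, False))  # region
```

For this walkthrough, a single region is the right answer: no failover requirement, no continent-wide audience, lowest cost.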


Next choose How to Store your data


Select Standard. Hit Continue


Different storage classes are designed to help you balance at-rest storage costs against data access costs.

Standard Storage - Data that you access frequently.

Highest storage price but no retrieval fees.

Latency: Milliseconds (instant).

Nearline Storage - Low-cost option for data you expect to access less than once a month.

Coldline Storage - Designed for data you plan to access at most once every 90 days.  Disaster recovery planning and older backups.

Archive Storage - The cheapest storage class for data you access less than once a year. Regulatory compliance, legal archives.


Uncheck Enforce public access prevention (this allows internet access, e.g. for hosting a public website).


You can leave the rest of the selections with their default values.


Then hit Create

Cloud Storage currently has a separate command-line tool called gsutil. 


Allows listing of cloud storage buckets



Open cloud shell


gsutil is a Python application that lets you access Cloud Storage from the command line.


```
Welcome to Cloud Shell! Type "help" to get started, or type "gemini" to try prompting with Gemini CLI.
Your Cloud Platform project in this session is set to cloud-project-examples.
Use `gcloud config set project [PROJECT_ID]` to change to a different project.
john_iacovacci1@cloudshell:~ (cloud-project-examples)$ gsutil ls
gs://cloud-storage-exam/
```



Now upload a simple text file with gsutil


Create a file by using the Linux echo command and standard output redirection.


```
john_iacovacci1@cloudshell:~ (cloud-project-examples)$ mkdir cloudstore
john_iacovacci1@cloudshell:~ (cloud-project-examples)$ cd cloudstore
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ echo "This is my first file!" > my_first_file.txt
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ cat my_first_file.txt
This is my first file!
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ ls -lt my_first_file.txt
-rw-rw-r-- 1 john_iacovacci1 john_iacovacci1 23 Jan 21 21:02 my_first_file.txt
```


Use the gsutil cp command to copy this file to the cloud storage bucket.

```
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ gsutil cp my_first_file.txt gs://cloud-storage-exam/
Copying file://my_first_file.txt [Content-Type=text/plain]...
/ [1 files][   23.0 B/   23.0 B]
Operation completed over 1 objects/23.0 B.
```



Use the gsutil ls command to find the file on the bucket

```
john_iacovacci1@cloudshell:~/cloudstore (cloud-project-examples)$ gsutil ls gs://cloud-storage-exam/my_first_file.txt
gs://cloud-storage-exam/my_first_file.txt
```


Now when I go back to my storage bucket I can see the file



Click into bucket name




The file (called an object in this context) made its way into your newly created bucket.

Access control

Core Access Control Methods

  • Default Security: By default, all newly created resources are "locked down," meaning they are only accessible to individuals who have explicit access to your specific project.

  • Access Control Lists (ACLs): This mechanism allows for "fine-grained" control. You can define specific permissions for individual buckets and the objects within them, rather than just applying broad project-wide rules.

  • Service Accounts: You can interact with and manage data by authorizing a service account—a special type of Google account intended for applications or virtual machines rather than individual people.
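The access-control ideas above can be sketched as a minimal model of IAM-style role bindings (the role names are real Cloud Storage roles; the member email and the check itself are simplified illustrations):

```python
# Minimal model of IAM-style bindings on a bucket (illustrative only).
bindings = [
    {"role": "roles/storage.objectViewer", "members": ["allUsers"]},
    {"role": "roles/storage.admin", "members": ["user:john@example.com"]},
]

def is_public(bindings: list) -> bool:
    """A bucket is effectively public if allUsers holds one of the
    read-capable object roles considered here (simplified check)."""
    readable = {"roles/storage.objectViewer", "roles/storage.objectAdmin"}
    return any(
        b["role"] in readable and "allUsers" in b["members"] for b in bindings
    )

print(is_public(bindings))  # True
```

Granting `roles/storage.objectViewer` to the special principal `allUsers` is exactly what the console steps below do.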

Management and Roles

  • Permission Roles: Cloud Storage utilizes specific roles that define what a user can or cannot do (e.g., viewing vs. modifying data).

  • Console Management: You can manage these permissions directly through the Google Cloud Console. For buckets, this is typically accessed via the three-dot action menu on the far right of the bucket list.


Edit bucket permissions.


Control access to your objects by assigning these roles to different actors.


Edit bucket permissions to set the bucket for public access.





Click on Permissions



Then we GRANT ACCESS


For Add principals we enter allUsers


For Assigned roles we select Cloud Storage




Then Storage Object Viewer


Next we ALLOW PUBLIC ACCESS



Hit Allow Public Access


The bucket will then state that public access is allowed.


