Chapter 8. Cloud Storage: Object Storage
What is Object Storage?
What is Cloud Storage?
Interacting with Cloud Storage
Access control and lifecycle configuration
Globally unique name
Deciding whether Cloud Storage is a good fit
An application that involves storing an image.
Reduce complexity of the underlying disks and data centers
Provide an API (Application Programming Interface) for moving files.
8.1. Concepts
Key-value storage for large values with automatic replication and caching around the world.
8.1.1. Buckets and objects
A bucket is a container that stores your data.
The bucket has a globally unique name.
It also has a geographical location and a storage class.
Think of buckets as “disks,” except that each “disk” is extraordinarily large.
Data is replicated and spread across many physical disks to maintain high levels of durability and availability.
Each file in the bucket must not be larger than 5 terabytes.
Objects are the files that you put inside a bucket.
Each object has a unique name inside the bucket, as files do on a typical file system.
Locations
Buckets can have locations.
Buckets exist either at the regional level or spread across multiple regions.
If you are concerned about latency, you might want to choose a specific region.
In a multiregional bucket, data is replicated across regions so that a copy sits closer to where you or your customers are.
VMs can only exist in a single place, but data can be copied and live in multiple places simultaneously.
Latency is the delay between a user's action and a web application's response to that action.
In networking terms, it is the total round-trip time it takes for a data packet to travel.
Reading data across regions means paying a premium in cross-region network transfer fees.
8.2. Storing data in Cloud Storage
First you have to create a bucket.
Bucket names need to be globally unique.
In the upper-left corner of the Cloud Console, select Cloud Storage, then Buckets.
Once on the Buckets screen, hit + Create.
Enter a globally unique name (I used uconnstamford).
Choose Region (keeps costs lower)
Choose storage class (Standard is fine)
Uncheck Enforce public access prevention (enables internet access for web site)
Then hit Create
Cloud Storage currently has a separate command-line tool called gsutil.
It allows listing of Cloud Storage buckets.
Open Cloud Shell.
gsutil is a Python application that lets you access Cloud Storage from the command line.
john_iacovacci1@cloudshell:~ (uconn-engr)$ gsutil ls
gs://carsforsale/
gs://dollarsforstuff/
gs://gcf-sources-341912105106-us-central1/
gs://gcf-sources-341912105106-us-east1/
gs://gcf-v2-sources-341912105106-us-central1/
gs://gcf-v2-uploads-341912105106-us-central1/
gs://johnspizza/
gs://knickfan/
gs://my-cloud-functions-jai/
gs://nyknicks/
Now upload a simple text file with gsutil.
Create a file by using the Linux echo command and standard output redirection.
echo "This is my first file!" > my_first_file.txt
cat my_first_file.txt
john_iacovacci1@cloudshell:~ (uconn-engr)$ ls -lt my_first_file.txt
-rw-r--r-- 1 john_iacovacci1 john_iacovacci1 23 Aug 17 15:21 my_first_file.txt
john_iacovacci1@cloudshell:~ (uconn-engr)$
john_iacovacci1@cloudshell:~ (uconn-engr)$ cat my_first_file.txt
This is my first file!
Use the gsutil cp command to copy this file to the Cloud Storage bucket.
gsutil cp my_first_file.txt gs://uconnstamford/
john_iacovacci1@cloudshell:~ (uconn-engr)$ gsutil cp my_first_file.txt gs://uconnstamford/
Copying file://my_first_file.txt [Content-Type=text/plain]...
/ [1 files][ 23.0 B/ 23.0 B]
Use the gsutil ls command to find the file in the bucket.
john_iacovacci1@cloudshell:~ (uconn-engr)$ gsutil ls gs://uconnstamford/my_first_file.txt
gs://uconnstamford/my_first_file.txt
Now when I go back to my storage bucket I can see the file
The file (called an object in this context) made its way into your newly created bucket.
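The same upload can also be sketched in Python. This is a minimal sketch, assuming the google-cloud-storage client library is installed and credentials are already configured (as they are in Cloud Shell). The helper name upload_text is hypothetical, and the bucket and object names are the ones used in the walkthrough above.

```python
def upload_text(client, bucket_name, object_name, text):
    """Upload a string as a text/plain object and return its gs:// URL.

    `client` is expected to be a google.cloud.storage.Client (assumption:
    the google-cloud-storage package is installed and authenticated).
    """
    bucket = client.bucket(bucket_name)  # reference to an existing bucket
    blob = bucket.blob(object_name)      # reference to the object to write
    blob.upload_from_string(text, content_type="text/plain")
    return f"gs://{bucket_name}/{object_name}"


# Example usage (not run here because it needs real credentials):
#   from google.cloud import storage
#   url = upload_text(storage.Client(), "uconnstamford",
#                     "my_first_file.txt", "This is my first file!\n")
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before pointing it at a real bucket.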
8.3. Choosing the right storage class
Cloud Storage offers different types of buckets that you can configure.
Storage classes come with different performance characteristics.
Different use cases require different features.
8.3.1. Multiregional storage
Multiregional storage is the class that fits the needs of most applications.
It is the most expensive class, and it replicates data across regions.
8.3.2. Regional storage
Replicates the data across different zones inside a single region.
Lower availability, and higher latency to destinations far away from the region.
8.3.3. Nearline storage
Lower availability and higher latency.
For data you don’t need all that often.
You may have to wait a bit for the download to start.
8.3.4. Coldline storage
Extreme end of the data-archival spectrum.
Transaction logs for the past year are a better fit for Coldline storage.
Database backups monthly.
8.4. Access control
How to control who’s able to access or modify the data after it’s stored.
8.4.1. Limiting access with ACLs
Interacting with your data while authorized as a service account.
Everything you create is locked down to be accessible by only those people who have access to your project.
Cloud Storage allows fine-grained access control of your buckets and objects through a security mechanism called Access Control Lists (ACLs).
Table 8.2. Description of roles for Cloud Storage
You can edit the ACL for your bucket in the Cloud Console by clicking the vertical three-dot button on the far right in your list of buckets and selecting Edit bucket permissions.
Control access to your objects by assigning these roles to different actors.
Use Edit bucket permissions to set the bucket for public access.
Then we click GRANT ACCESS.
For Add principals we enter allUsers.
For Assigned roles we select Cloud Storage,
then Storage Object Viewer.
Next we click ALLOW PUBLIC ACCESS.
The bucket will then state that public access is allowed.
When you add access for a specific user, that user logs in with Google’s traditional login.
In addition to adding access to individuals, Cloud Storage also allows you to control access based on a few other things:
User allUsers - refers to anyone; the data is readable by anyone who asks for it.
User allAuthenticatedUsers - anyone logged in with their Google account.
Groups - all members of a specific Google Group.
Domains - Google Apps managed domain name.
Default object ACLs
ACLs can be set on new objects in the form of a bucket’s default object ACLs.
Define the ACL at the bucket level, and it applies to objects when they’re created.
ACL best practices
When in doubt, give the minimum access possible.
The Owner permission is powerful, so be careful with it.
Allowing access to the public is a big deal, so do it sparingly.
Default ACLs happen automatically, so choose sensible defaults.
8.5. Object versions
Cloud Storage has the ability to turn on versioning, where you can have objects with multiple revisions over time.
8.8.2. Data archival
Cloud Storage can be a cost-effective way to archive your data.
Examples: access logs, processed data, or files converted from DVDs.
Given that archived data is much less frequently accessed, the Nearline and Coldline storage classes are ideal options.
Logs are usually text files that a running process appends to over time and cycles to a new filename every so often. The goal is to move those files off of your machine’s storage and into a Cloud Storage bucket. Typically, your logging system packages ...
8.9. Understanding pricing
Cloud Storage pricing is broken into several different components:
Amount of data stored
Amount of data transferred
Number of operations executed
Amount of data retrieved (in addition to served)
30-day (or 90-day) minimum storage
8.9.1. Amount of data stored
Cloud Storage charges you based on the amount of data you keep in your bucket, measured in gigabytes per month and prorated by how long the object was stored. If you store an object for 15 out of 30 days, your bill for a single 2 GB object will be 2 (GB) * 0.026 (USD) * 15/30 (months), which is 2.6 cents.
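The prorated calculation above can be written as a small function. This is only a sketch: the function name is hypothetical, the $0.026 per GB per month rate is the one quoted in these notes, and actual prices vary by location and storage class.

```python
def storage_cost_usd(size_gb, price_per_gb_month, days_stored, days_in_month=30):
    """Prorated storage cost: size * monthly price * fraction of the month stored."""
    return size_gb * price_per_gb_month * (days_stored / days_in_month)


# The example from the text: a 2 GB object stored for 15 of 30 days
# at an assumed price of $0.026 per GB per month.
cost = storage_cost_usd(2, 0.026, 15)
print(f"${cost:.3f}")  # prints $0.026, i.e. 2.6 cents
```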
8.9.2. Amount of data transferred
You are also charged for sending that data to customers or to yourself.
Sometimes called network egress, which refers to the amount of data being sent out of Google’s network.
Network costs will vary depending on where you are in the world.
8.9.3. Number of operations executed
Cloud Storage charges for a certain subset of operations you might perform on your buckets or objects. The non-free operations fall into two classes: a “cheap” class (for example, getting a single object) costing 1 cent for every 10,000 operations, and an “expensive” class (for example, updating an object’s metadata), which costs 10 cents for every 10,000 operations.
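To get a feel for how operation charges add up, here is a rough sketch using the two rates quoted above. The function name and the workload numbers are hypothetical, chosen only for illustration.

```python
# Rates from the text, in cents per 10,000 operations.
CENTS_PER_10K = {"cheap": 1, "expensive": 10}


def operations_cost_usd(cheap_ops, expensive_ops):
    """Total operation charges in USD for the given operation counts."""
    cents = (cheap_ops * CENTS_PER_10K["cheap"]
             + expensive_ops * CENTS_PER_10K["expensive"]) / 10_000
    return cents / 100


# Hypothetical workload: 1,000,000 object reads and 50,000 metadata updates.
print(operations_cost_usd(1_000_000, 50_000))  # prints 1.5
```

Even a million reads cost about a dollar at these rates, so operation charges usually matter only for very chatty workloads.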
8.9.4. Nearline and Coldline pricing
The Nearline and Coldline storage classes have cheaper data storage costs. Nearline and Coldline also include an extra cost for data retrieval, which is currently $0.01 per GB retrieved on Nearline and $0.05 per GB for Coldline.
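As a quick illustration of the retrieval charge, the sketch below applies the two per-GB rates quoted above to a hypothetical 500 GB restore. The names are made up for this example, and the rates are the ones stated in these notes rather than current published prices.

```python
# Retrieval rates from the text, in USD per GB retrieved.
RETRIEVAL_USD_PER_GB = {"nearline": 0.01, "coldline": 0.05}


def retrieval_cost_usd(storage_class, gb_retrieved):
    """Extra retrieval charge for reading data back from an archival class."""
    return RETRIEVAL_USD_PER_GB[storage_class] * gb_retrieved


# Hypothetical restore of 500 GB from each class.
for cls in ("nearline", "coldline"):
    print(cls, f"${retrieval_cost_usd(cls, 500):.2f}")
# prints:
# nearline $5.00
# coldline $25.00
```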
8.10. When should I use Cloud Storage?
The question is usually how Cloud Storage complements your other storage systems rather than whether Cloud Storage is a good fit at all.
8.10.1. Structure
Cloud Storage is by definition an unstructured storage system and is, therefore, meant to be used purely as a key-value storage system with no ability to handle any queries besides “give me the object at this key.”
8.10.3. Durability
Durability is an aspect where Cloud Storage is strong, offering a 99.999999999% durability guarantee (that’s 11 nines).
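To put 11 nines in perspective, the sketch below estimates how many objects you would expect to lose. It assumes the figure is read as a per-object, per-year probability, which is an interpretation for illustration and not something these notes state. The function name is hypothetical.

```python
from fractions import Fraction

# 99.999999999% durability leaves a loss probability of 1 in 10**11
# (assumption: interpreted per object, per year).
LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_objects_lost(object_count):
    """Expected number of objects lost per year under the assumption above."""
    return float(object_count * LOSS_PROBABILITY)


# Hypothetical bucket holding one billion objects.
print(expected_objects_lost(1_000_000_000))  # prints 0.01
```

Under that reading, a billion stored objects would lose about one object every hundred years on average.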
8.10.4. Speed (latency)
Multiregional storage is sufficiently fast to bring the latency (measured as time to the first byte) into the milliseconds.
If you need the speed, it’s there. If you don’t, you can save some money.
8.10.5. Throughput
Cloud Storage is optimized for throughput.
Google automatically manages capacity on a global scale to make sure that you never get stuck in need of a faster download.
8.10.6. Overall
Rather than going through the typical storage needs of each application and seeing how this service stacks up, this section focuses on the ways that each application can use Cloud Storage and how good a fit it is.
Summary
Google Cloud Storage is an object storage system that allows you to store arbitrary chunks of bytes (objects) without worrying about disk drives, replication, and so on.
Cloud Storage offers several storage classes, each with its own trade-offs (for example, lower cost for lower availability).
Although Cloud Storage is mainly about storing chunks of data, it also provides extra features like automatic deletion for old data (lifecycle management), storing multiple versions of data, advanced access control (using ACLs), and notification of changes to objects and buckets.
Unlike other storage systems you’ve learned about, Cloud Storage complements the others and, as a result, is typically used in addition to those rather than instead of them.