Notes for my Cloud certifications.
Regions : AWS Geographical regions like US East, US West, EU Central etc
Availability Zones : Distinct data centres that host the physical compute and other resources for AWS (AWS ensures a minimum of 2 AZs per region). They are physically separated from each other, but a large geographical calamity could still impact/disrupt services in more than one of them at once
Edge Locations : Each region is further served by many edge locations, which serve cached data to nearby users for frequently accessed content. Edge locations can also be used to write data (e.g. in S3 Transfer Acceleration)
As of today there are approximately 15 regions, 45 AZs and 100+ edge locations fronted by CloudFront. Some edge locations may also be operated/managed by the AWS partner network
Fun fact : In Route53, ‘Route’ comes from Route 66, one of the original highways of the US highway system, and 53 from port 53, the port used by DNS in computer networking
Route53 is used for resolving DNS names to IP addresses and for registering domain names
There are different types of records used in the DNS system:
SOA records
A records
CNAME records
MX records
PTR records
Alias records
NS records
A request from the browser first goes to the top-level domain servers (.com, .au, .gov etc.); from there it is forwarded to the authoritative Name Servers (NS), which look up the A record and answer the request with the respective IP address, which the browser can then use to initiate a TCP connection.
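A tiny stdlib sketch of that final step (‘example.com’ is a placeholder domain): ask the system resolver for the hostname’s address and get back an IP the browser would connect to.

```python
import socket

# Resolve the hostname via the system's configured DNS resolver chain.
ip_address = socket.gethostbyname("example.com")
print(ip_address)  # an A-record address, usable for a TCP connection
```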
When given a choice between an Alias record and a CNAME record, the Alias record usually offers more benefits: it works at the zone apex, maps directly to AWS resources like ELBs and CloudFront distributions, and Route53 does not charge for Alias queries to AWS resources.
When one domain like m.acm.org has to be routed to mobile.acm.org, always use a CNAME record to delegate resolution to another domain
Different types of resolution policies supported by Route53 are:
Simple Routing Policy : You can provide multiple IP addresses here, but you cannot associate health checks with them
Weighted Routing Policy : Routes traffic based on weights assigned to different IP addresses (e.g. 70% of traffic to IP X and the other 30% to IP Y), which is then proportionally distributed (sounds similar to canary deployments?). See the sketch after this list.
Latency Based Routing Policy : Based on response latency from the user’s location; using this information, requests can be routed to whichever server offers the lowest latency.
Failover Based Routing Policy : One active, One passive — If Active health check starts failing, traffic routes to Passive Setup
Geographical Routing Policy : Here we can define that users in Europe should be routed to the Europe server only (this is different from latency based routing and is more hard-coded, per se)
Geographical Proximity Routing Policy : Requires Route53 Traffic Flow rules to define a complex routing policy, plus a bias value to override the decisions taken by Route53. It allows multiple configurations and controls domain resolution to IPs at a very granular level. In practice it is rarely used
Multivalue Answer Routing Policy : Same as the simple routing policy with multiple IP addresses; the only difference is that health checks can be associated
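A minimal boto3 sketch of the weighted policy (hosted zone ID, record name and IPs are placeholders): two records share a name/type and split lookups roughly 70/30.

```python
import boto3

route53 = boto3.client("route53")

def weighted_record(ip, identifier, weight):
    # Records sharing a name/type are told apart by SetIdentifier;
    # traffic share is weight divided by the sum of all weights.
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "app.example.com",
            "Type": "A",
            "SetIdentifier": identifier,
            "Weight": weight,
            "TTL": 60,
            "ResourceRecords": [{"Value": ip}],
        },
    }

route53.change_resource_record_sets(
    HostedZoneId="ZPLACEHOLDER",  # placeholder hosted zone
    ChangeBatch={"Changes": [
        weighted_record("203.0.113.10", "primary", 70),
        weighted_record("203.0.113.20", "canary", 30),
    ]},
)
```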
Users
Groups
Roles (Who you are?)
Policies (What you can do?)
By default, new users have:
only access_key_id and secret_access_key, but no console access
no permissions, until they are made a member of a group or assigned some roles/policies
IAM, like other AWS services, is eventually consistent, as this data is replicated across multiple servers. IAM is a global service (not scoped per region). Mainly two services can allow access without authentication/authorization in AWS: STS and S3
Following terms are used in context of IAM:
User : Can be an IAM user or an application accessing AWS resources
Group : A group of users who can collectively be permitted some actions
Role : Something that can be assumed by an entity like a user or a service (this is similar to authentication; see the sketch after this list)
Policy : Defines the permissions you have on the resources you want to access (this is similar to authorization)
Request Context : The request object AWS receives when somebody tries to access or take an action on an AWS resource. It includes the source IP, the resources being accessed, the actions being taken on them, the time of day the request originated etc.
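A small sketch of role assumption (the role ARN is a placeholder): STS hands back temporary credentials for the role, and those credentials then govern what the caller can do.

```python
import boto3

sts = boto3.client("sts")
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/ExampleRole",  # placeholder role
    RoleSessionName="demo-session",
)
# Temporary AccessKeyId / SecretAccessKey / SessionToken for the role.
creds = resp["Credentials"]
```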
Policies can be managed in two ways:
Managed policies : AWS Managed & Customer Managed
Inline Policies
Policies can further be of two types:
i) Identity based policies : These are attached directly to identities like users/groups etc. They can be managed or inline policies.
ii) Resource based policies : These are inline policies applied directly on the resource that has to be accessed from the same or other accounts. This is mainly used for cross-account resource access
Policy versioning : Customer managed policies can have at most 5 versions at a single point of time. This is useful when a change to a policy breaks something: you can quickly set a previously working version back as the default (see the sketch below)
IAM Roles are generally preferred over resource based policies, which are not extendable to other entities
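A hedged boto3 sketch of that rollback (the policy ARN, document and version ID are placeholders): publish a new default version, then promote an older one if the change misbehaves.

```python
import json
import boto3

iam = boto3.client("iam")
policy_arn = "arn:aws:iam::123456789012:policy/ExamplePolicy"  # placeholder

# Publish a new version and make it the default in one call.
iam.create_policy_version(
    PolicyArn=policy_arn,
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::example-bucket",
        }],
    }),
    SetAsDefault=True,
)

# Roll back: point the default at a previously working version.
iam.set_default_policy_version(PolicyArn=policy_arn, VersionId="v1")
```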
EC2:
EBS:
Placement Groups
EC2 Instance Launch Types
- On Demand Instances: short workload, predictable pricing
- Reserved: (MINIMUM 1 year)
- Reserved Instances: long workloads
- Convertible Reserved Instances: long workloads with flexible instances
- Scheduled Reserved Instances: example – every Thursday between 3 and 6 pm
- Spot Instances: short workloads, for cheap, can lose instances (less reliable); see the sketch after this list
- Dedicated Instances: no other customers will share your hardware
- Dedicated Hosts: book an entire physical server, control instance placement
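An illustrative boto3 sketch of a Spot request (the AMI ID is a placeholder): the market options are the only difference from an On Demand launch, and AWS may reclaim the capacity at any time.

```python
import boto3

ec2 = boto3.client("ec2")
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {"SpotInstanceType": "one-time"},
    },
)
```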
Databases are mainly of two types:
Relational : Conventional relational databases to store data (RDS)
Non-relational : DynamoDB, similar to MongoDB; tables contain items, which are basically JSON objects
Relational Database Engines supported by AWS are: (POMMMA)
PostgreSQL
Oracle
MariaDB
MySQL
MS SQL
Aurora
Processing types supported by RDS:
OLTP (Online Transaction Processing)
OLAP (Online Analytic Processing) : Works on large amounts of data and derives analytics out of it. Redshift is the Amazon offering for OLAP requirements
RDS can support multi-az setup and read replicas
Each read replica has its own DNS end point
Read replicas can be promoted to be their own databases — this breaks the replication though
You can have a read replica in another region (see the sketch after this list)
Read replicas ONLY work if backups are turned ON
Two types of backups are possible:
Automated backups — done during planned maintenance windows
Snapshots — Done manually to save state of RDS
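A sketch of a cross-region read replica (identifiers are placeholders; note from above that backups must be turned on for the source):

```python
import boto3

# Client in the region where the replica should live.
rds = boto3.client("rds", region_name="eu-west-1")
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="mydb-replica",
    # Cross-region replicas reference the source instance by ARN.
    SourceDBInstanceIdentifier="arn:aws:rds:us-east-1:123456789012:db:mydb",
)
```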
NoSQL solution from Amazon
Cluster is spread across 3 different segregated data centres / AZs
Eventual read consistency, with replication typically completing within 1s
Incoming data transfer IS NOT charged if in a single region. If you cross regions, you will be charged at both ends of the transfer
DynamoDB supports the concept of streams, where any modification to an existing record in the table is written out to a data stream that can be processed by compute capabilities like AWS Lambda. Lambda can then take decisions based on that event stream or send an SNS notification instead
DynamoDB streams can also be configured to send out two copies of state (previous/current) along with the primary key attributes, to reflect the actual change that happened to the table data
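A hypothetical Lambda handler for a stream configured with the NEW_AND_OLD_IMAGES view type, showing where the previous/current copies and the key attributes land in the event:

```python
def handler(event, context):
    for record in event["Records"]:
        keys = record["dynamodb"]["Keys"]               # primary key attributes
        old_image = record["dynamodb"].get("OldImage")  # state before the change
        new_image = record["dynamodb"].get("NewImage")  # state after the change
        if record["eventName"] == "MODIFY":
            print(f"Item {keys} changed from {old_image} to {new_image}")
```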
DynamoDB supports DAX, to cache responses and improve read latency from milliseconds to microseconds
1/10th of the cost of other data warehousing solutions
Helps with OLAP requirement — to derive analytics out of data
Automated backups are by default done every day
Maximum retention period, like RDS, is 35 days
Leader node hours are not charged, only compute node hours are charged
Redshift can currently run only in one AZ -> For the same reason, Redshift offers asynchronous backup replication to S3 in another region for Disaster Recovery (DR)
It is used for business intelligence use-cases
Cross region replication can be set up
Redshift additionally supports an Enhanced VPC Routing feature, where all COPY and UNLOAD requests between your cluster and data repositories are routed through the VPC, thus gaining the benefits of Security Groups, NACLs, VPC Endpoints etc.
If enhanced VPC routing is not enabled, the Redshift cluster routes all such traffic through the internet
Redshift Spectrum allows you to execute queries on files stored directly in S3
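A hedged sketch (cluster details and schema are placeholders) using the third-party redshift_connector driver to query an external Spectrum table whose files live in S3:

```python
import redshift_connector

conn = redshift_connector.connect(
    host="mycluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
    database="analytics",
    user="admin",
    password="secret",
)
cur = conn.cursor()
# 'spectrum_schema.events' is assumed to be an external schema whose
# tables map to files stored in S3.
cur.execute("SELECT count(*) FROM spectrum_schema.events")
print(cur.fetchone())
```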
A MySQL/PostgreSQL-compatible database engine that AWS built from scratch
By default stores 2 copies of data in each Availability Zone, with a minimum of 3 availability zones (6 copies of data are hence stored at the minimum)
Compute resources can scale up to 32 vCPU cores and 244GB of RAM
Starts with 10GB of storage but scales up to 64TB automatically based on requirement, while other RDS engines can grow to a maximum of 16TB
Aurora can automatically handle loss of 2 copies of data without affecting write capability and 3 copies of data without affecting read availability
Storage for Aurora is self healing -> Data blocks are continuously checked for errors and fixed
Aurora automated backups or snapshots do not affect the performance of running clusters
Aurora Snapshots can be shared with other AWS accounts
Aurora read replicas can be of two types:- MySQL Read Replicas (Maximum 5) and Aurora Read Replicas (maximum 15)
Automated failover is supported for Aurora read replicas but not for MySQL read replicas
When you create an Aurora Read Replica from a MySQL RDS instance, AWS basically creates a new Aurora DB Cluster (with read/write capability) which is asynchronously synced with the main DB instance.
Two types of endpoints are supported:
i) Reader Endpoint : Load balances traffic across all read replicas
ii) Cluster Endpoint : Routes write queries to active master
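A sketch of the endpoint split (hostnames and credentials are placeholders; uses the third-party pymysql driver): writes go to the cluster endpoint, reads to the reader endpoint, which load balances across replicas.

```python
import pymysql

writer = pymysql.connect(
    host="mycluster.cluster-abc123.us-east-1.rds.amazonaws.com",  # cluster endpoint
    user="admin", password="secret", database="app",
)
reader = pymysql.connect(
    host="mycluster.cluster-ro-abc123.us-east-1.rds.amazonaws.com",  # reader endpoint
    user="admin", password="secret", database="app",
)

with writer.cursor() as cur:
    cur.execute("INSERT INTO events (name) VALUES (%s)", ("signup",))
writer.commit()

with reader.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM events")
    print(cur.fetchone())
```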
In memory cache store for speeding up an application so that data fetch queries can be reduced
For Redis AUTH, the user needs to enable in-transit encryption
Two types of engines are available:
i) Memcached
Multithreaded, NOT multi-az and useful for simple cache offloading
ii) Redis
Single threaded, Multi-AZ, backups are possible, and it supports operations useful for business use-cases like MIN, MAX, AVG etc.
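A sketch with the redis-py client (endpoint and token are placeholders): because Redis AUTH requires in-transit encryption, the client connects with TLS and passes the auth token as the password.

```python
import redis

r = redis.Redis(
    host="my-cluster.abc123.use1.cache.amazonaws.com",  # placeholder endpoint
    port=6379,
    ssl=True,                  # in-transit encryption
    password="my-auth-token",  # the Redis AUTH token
)
r.set("greeting", "hello")
print(r.get("greeting"))
```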
ELBs by default come up in the background in all AZs, and they also dynamically scale up and down based on the traffic
A full DNS lookup will often reveal all the ELB nodes that AWS is currently using to handle incoming requests
Load balancers are basically of three types:
Application Load Balancer (ALB) : Can route traffic based on layer 7 interaction. Works on the HTTP/HTTPS layer and can be used for intelligent routing based on application needs (headers, query parameters, source IP etc.)
Network Load Balancer (NLB) : Used for scenarios where pretty heavy workloads (millions of requests) have to be routed/managed
Classic Load Balancer (CLB) : Deprecated now, but was used for basic HTTP/TCP routing
If ASG is terminated, all instances associated as part of it will also be terminated
Launch Configurations (LCs) describe the configuration of the individual EC2 machines, i.e. instance type, security group configuration, root volume configuration, tags etc., whereas Auto Scaling Groups use LCs to spin up new instances and scale the number of EC2 instances up/down based on pre-defined policies
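A sketch of that split in boto3 (all names/IDs are placeholders): the LC describes one machine, the ASG keeps the fleet within bounds.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Per-instance configuration.
autoscaling.create_launch_configuration(
    LaunchConfigurationName="web-lc",
    ImageId="ami-0123456789abcdef0",          # placeholder AMI
    InstanceType="t3.micro",
    SecurityGroups=["sg-0123456789abcdef0"],  # placeholder security group
)

# Fleet-level scaling bounds built on top of the LC.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchConfigurationName="web-lc",
    MinSize=2,
    MaxSize=6,
    AvailabilityZones=["us-east-1a", "us-east-1b"],
)
```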
Egress only gateways allow IPv6 based traffic from instances within the VPC to reach the internet, while at the same time denying the internet access to those instances
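A one-call sketch (the VPC ID is a placeholder) of creating such a gateway:

```python
import boto3

ec2 = boto3.client("ec2")
# Outbound-only internet access for the VPC's IPv6 instances.
ec2.create_egress_only_internet_gateway(VpcId="vpc-0123456789abcdef0")
```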
Amazon FSx is a file system offering from AWS. It is offered in two variants:
FSx for Lustre : Basically a high performance file system that can be used for compute intensive workloads offering high data throughput. Users can additionally configure the throughput irrespective of the data storage size of the file system (unlike EFS)
FSx for Windows File Server : Frequently used as file storage for Windows systems as it offers SMB protocol support. Additionally, it offers integrations with other storage services like S3, where data can be temporarily copied from S3 to FSx for high-throughput filesystem access; the result can be copied back to S3 after the computations are completed.
Payment model is pay-as-you-go
AWS WAF is a managed service designed to protect public facing web applications from unintended/unsafe traffic
WAF provides readymade integrations with CloudFront, Application Load Balancer (ALB) and API Gateway
With these integrations, whenever any of these services receives a request, it forwards it to WAF for validation. Only if WAF allows it is the request routed further by CloudFront, ALB or API Gateway to the back-end machine that needs to process it
WAF offers many managed rules (based on industry best practices like OWASP top 10 vulnerabilities, SQL injection etc.)
As a customer, we can define our custom conditions or use these managed rules to provide security for our application
Custom rate-based rules for throttling (e.g. IP ‘123.x.x.x’ can only trigger 4000 requests in a given window) can also be defined at the WAF layer, and custom error messages/pages can then be configured in services like CloudFront and returned to the end-user. All this happens without the blocked traffic ever reaching the real back-end systems.
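A hedged WAFv2 sketch of such a rule (all names are placeholders; note WAFv2 counts requests per IP over a rolling window rather than per second):

```python
import boto3

wafv2 = boto3.client("wafv2")
wafv2.create_web_acl(
    Name="example-acl",
    Scope="REGIONAL",  # "CLOUDFRONT" when attached to a distribution
    DefaultAction={"Allow": {}},
    VisibilityConfig={
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "exampleAcl",
    },
    Rules=[{
        "Name": "rate-limit-per-ip",
        "Priority": 1,
        # Block any single IP exceeding the limit within the window.
        "Statement": {"RateBasedStatement": {"Limit": 4000, "AggregateKeyType": "IP"}},
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "rateLimitPerIp",
        },
    }],
)
```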
Read after write consistency for new PUT objects (Newly uploaded objects are guaranteed to be read immediately without any stale state or problems)
Eventual consistency for overwrite PUTs and DELETEs (Modifications / deletions will eventually reflect latest state — there could be a delay of some seconds)
S3 offers various storage tiers that help control cost, availability and durability of the data
Encryption at rest is achieved in two ways
Server Side encryption (can be further managed by AWS in three ways; see the sketch after this list)
i) Keys managed by S3 service for encryption (SSE-S3)
ii) Keys provisioned by user in KMS (SSE-KMS)
iii) User/Customer provided encryption keys can also be used (SSE-C)
Client Side encryption : The client manages the encryption/decryption and uploads only the encrypted data
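A sketch of the two server-side options in boto3 (bucket, keys and KMS ARN are placeholders):

```python
import boto3

s3 = boto3.client("s3")

# SSE-S3: keys managed by the S3 service.
s3.put_object(
    Bucket="example-bucket", Key="a.txt", Body=b"data",
    ServerSideEncryption="AES256",
)

# SSE-KMS: a key the user provisioned in KMS.
s3.put_object(
    Bucket="example-bucket", Key="b.txt", Body=b"data",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="arn:aws:kms:us-east-1:123456789012:key/placeholder",
)
```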
This is used to speed up large data uploads to S3. With this, the user uploads the data to the nearest edge location, and S3 then ensures that the data is replicated to the actual bucket for final storage. For the edge location -> S3 leg, AWS uses its backbone network, which is considerably faster than the usual internet
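A sketch of enabling and using acceleration (bucket and file names are placeholders): flip the bucket setting, then upload through the accelerate (edge) endpoint.

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")
s3.put_bucket_accelerate_configuration(
    Bucket="example-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# A client configured to send uploads via the accelerate endpoint.
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("big-file.bin", "example-bucket", "big-file.bin")
```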
Virtual / Physical appliance that sits in your data centre and replicates data to S3
File Gateway : Plain files, replicated to S3
Volume Gateway : There are two types of Volume Gateways: Stored Volumes, where the complete dataset stays on premises and is asynchronously backed up to S3, and Cached Volumes, where the complete dataset lives on S3 and only the most recently accessed data is cached on premises
Gateway Virtual Tape library
Standard SQS Queue : This is the standard processing model for SQS service
FIFO SQS Queue : Here messages are delivered exactly once and also arrive in order. A maximum throughput of 300 transactions per second is supported (without batching). See the sketch below.
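A sketch of sending to a FIFO queue (the queue URL is a placeholder): the group ID scopes ordering, the deduplication ID suppresses repeats within the deduplication window.

```python
import boto3

sqs = boto3.client("sqs")
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo",
    MessageBody='{"order_id": 42}',
    MessageGroupId="customer-7",           # ordering is preserved per group
    MessageDeduplicationId="order-42-v1",  # repeats with this ID are dropped
)
```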
This makes more sense when manual intervention or a task oriented workflow is needed, in contrast to a message oriented workflow with SQS
It works with the following components
i) Workflow Starters : Something like web application which triggers a workflow
ii) Deciders : Which decide that a particular workflow task has to be executed
iii) Activity Executors : They execute the real business logic defined in the workflow
API Gateway is an entry-point for various types of resources acting as a front door entry mechanism with support for:
API Gateway uses the following things to realise an API that can be exposed to the end-user
API Gateway supports throttling API requests at the global or per-API level, and also supports caching by provisioning a fixed amount of storage for the cache. With caching enabled you can avoid passing redundant calls on to the backend systems. See the sketch below.
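A hedged sketch of stage-level throttling (API ID and stage are placeholders; the ‘/*/*/…’ paths target all methods’ settings):

```python
import boto3

apigw = boto3.client("apigateway")
apigw.update_stage(
    restApiId="abc123",  # placeholder REST API id
    stageName="prod",
    patchOperations=[
        {"op": "replace", "path": "/*/*/throttling/rateLimit", "value": "1000"},
        {"op": "replace", "path": "/*/*/throttling/burstLimit", "value": "2000"},
    ],
)
```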
Kinesis offers three different types of services: Streams, Analytics and Firehose
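A producer-side sketch for a Kinesis Data Stream (the stream name is a placeholder): the partition key decides which shard receives the record.

```python
import json
import boto3

kinesis = boto3.client("kinesis")
kinesis.put_record(
    StreamName="clickstream",  # placeholder stream
    Data=json.dumps({"page": "/home"}).encode(),
    PartitionKey="user-123",   # same key -> same shard, preserving order
)
```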
AWS Cognito builds upon two concepts:
When you build a mobile app, for example, you cannot distribute AWS credentials along with the application code. When the application needs to access any AWS resource, it can instead obtain a temporary AWS token which maps to a particular role, and use that temporary token to access the specified resource. This avoids bundling any secure credential directly with the source code. For fetching an auth token, the app first authenticates the user against Google, Facebook, Amazon etc. or any other provider which supports OIDC (OpenID Connect).
Mobile App User -> Logs in to Amazon, Facebook, Microsoft etc. -> Authenticates -> Mobile app gets a secure token and exchanges it with AWS for a temporary access token mapped to a role
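A sketch of that exchange with boto3 (the identity pool ID and token are placeholders): a provider-issued token is traded for temporary AWS credentials mapped to a role.

```python
import boto3

cognito = boto3.client("cognito-identity")
token = "provider-issued-oidc-token"  # placeholder, e.g. from Google sign-in

identity = cognito.get_id(
    IdentityPoolId="us-east-1:placeholder-pool-id",
    Logins={"accounts.google.com": token},
)
creds = cognito.get_credentials_for_identity(
    IdentityId=identity["IdentityId"],
    Logins={"accounts.google.com": token},
)
# creds["Credentials"] holds AccessKeyId, SecretKey and SessionToken.
```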
It helps to monitor and investigate the state of security of systems by scanning networks and configurations
Custom monitoring scripts written in Perl, Ruby etc. are available to be installed on EC2 instances. The same CloudWatch agent can be used to ship logs as well as additional monitoring data like memory utilization to CloudWatch. Metrics like MemoryUtilization, per-core CPU usage and disk space utilization are not available out of the box with the default CloudWatch capabilities. See the sketch below.
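A sketch of publishing one of those custom metrics (namespace and value are illustrative):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_data(
    Namespace="Custom/System",
    MetricData=[{
        "MetricName": "MemoryUtilization",  # not collected by default
        "Value": 72.5,
        "Unit": "Percent",
    }],
)
```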
It is a messaging broker with support for a large number of protocols and standards, and is usually the better choice when migrating existing message-broker workloads to the cloud. When building new applications that depend on messaging capabilities, we can instead use Amazon SQS, which is highly scalable.
Amazon SQS on the other hand is similar but does not support a large number of APIs and protocols.