Learning to be a code alchemist, one experiment at a time.

AWS Compute Week, Day 1: Introduction to EC2


Introduction to EC2/AWS

Andres Sivla, Sr. Technical Account Manager, AWS Enterprise Support

Topics: AWS concepts, instances, storage, VPC/load balancers, monitoring, security, deployment

16 regions / 42 availability zones, isolated from each other
EC2 = Elastic Compute Cloud: elastic virtual servers (host server -> hypervisor -> guest users)
Different instance types (Intel Xeon) are optimized for different tasks: core count, memory size, storage size and type, network performance, etc.
Supports many different OSes

Storage
File: EFS (NFS)
Block: EBS (virtual SAN storage) and EC2 Instance Store
Object: S3 and Glacier

Virtual Private Cloud
Gives customers the power to design their own network: a logically isolated area within AWS with your own settings
Components: ENI, subnet, ACL, route table, internet gateway, virtual private gateway, Route 53 private hosted zone
Subnets: smaller networks within the VPC that can be public or private (public subnets can reach the internet at large)
ACL: network access control list; controls ports and the like
Route table: how to get to certain places on the network
Internet gateway: how to access the public internet
Virtual private gateway: VPN tunnel between your VPC and your office
Route 53 private hosted zone: DNS zone that resolves to internal IPs only
VPC wizard: helps you create your own VPC

Elastic Load Balancer
Timeout config, connection draining, cross-zone load balancing
Autoscaling: automatically directs traffic to different instances based on how busy each one is

CloudWatch
Lets you look at all metrics in detail (costs extra)
Analyze information that is provided over time

Security
Many different kinds: encryption, IAM, etc.
Access credentials and key pairs

Amazon Lightsail: easy deployment of a web app

Deployment: Amazon Machine Images (AMIs), which can be Amazon-backed, community-backed, or your own
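The subnet idea above can be sketched with Python's standard-library `ipaddress` module. The 10.0.0.0/16 VPC CIDR and the /24 subnet size below are illustrative assumptions, not AWS defaults:

```python
import ipaddress

# A hypothetical VPC CIDR block (you choose this when creating the VPC).
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC into /24 subnets, e.g. one public and one private.
subnets = list(vpc.subnets(new_prefix=24))
public_subnet, private_subnet = subnets[0], subnets[1]

print(public_subnet)   # 10.0.0.0/24
print(private_subnet)  # 10.0.1.0/24

# Every subnet lives inside the VPC's address space.
assert public_subnet.subnet_of(vpc)
```

Whether a subnet is "public" or "private" in AWS is not a property of the CIDR itself; it depends on whether its route table points at an internet gateway.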

Getting the most bang for your buck with #EC2 #Winning

Boyd McGeachie, Product Manager, EC2 Spot

Three purchasing models: on-demand, reserved, spot
On-demand: pay by the hour
Reserved: 1-3 year commitment (30-60 percent discount)
Spot: pay the market price for unused capacity, at a steep discount over on-demand
Increase elasticity; measure to monitor and improve cost
No upfront investment, pay less when you reserve, pay as you go, pay less as AWS grows
Low cost and flexible: good for develop/test and short-term/spiky workloads
Standard vs. convertible reserved instances are priced differently, but can be upgraded and changed
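The trade-off between the three models above is just arithmetic. The hourly rates below are made-up illustrative numbers, not real AWS prices (check the AWS pricing pages):

```python
# Hypothetical hourly rates -- illustrative only, not real AWS prices.
on_demand_rate = 0.10      # on-demand: pay by the hour
reserved_discount = 0.40   # reserved: roughly 30-60% off on-demand
spot_rate = 0.03           # spot: market price, steep discount

hours_per_month = 730

on_demand_cost = on_demand_rate * hours_per_month
reserved_cost = on_demand_rate * (1 - reserved_discount) * hours_per_month
spot_cost = spot_rate * hours_per_month

print(f"on-demand: ${on_demand_cost:.2f}/mo")  # on-demand: $73.00/mo
print(f"reserved:  ${reserved_cost:.2f}/mo")   # reserved:  $43.80/mo
print(f"spot:      ${spot_cost:.2f}/mo")       # spot:      $21.90/mo
```

The catch, of course, is that spot capacity can be reclaimed, so it suits interruptible workloads, while reserved pricing only pays off if the instance actually runs for the commitment period.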

AWS: IAM policy


Today at work, I wanted to set up an EC2 instance -> S3 bucket connection, while ensuring that the S3 bucket is as locked down as possible. I also wanted to allow only specific S3 commands from my EC2 instance.

I had to learn what an IAM policy is, how it works, and how to write my own.


What is an IAM policy?

A set of RULES that, under the correct conditions, define what ACTIONS the policy PRINCIPAL or holder can take to specified AWS RESOURCES.

This boils down to “Who can do what to which resources. When do we care?”


IAM roles allow you to delegate access with defined permissions to trusted entities without having to share long-term access keys.

Let's look at one I wrote:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": "arn:aws:s3:::neilcomputing",
      "Condition": {}
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectAcl",
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": "arn:aws:s3:::neilcomputing/*",
      "Condition": {}
    }
  ]
}

Version: There are only two versions, 2012-10-17 and 2008-10-17; use the newer one!

Statement: The important part: who does what to which resources, and when.

Statements contain:

Effect: Either Allow or Deny (the following actions)
Principal: Who is doing this? In the statement above I have no principal because I was writing a role to attach to various EC2 instances (i.e., whatever instance the role is attached to becomes the principal). Otherwise, specify the ARN.
Action: What actions are in this list? In my case, various List, Get, and Put operations
Resource: What does it perform these actions on?
Condition: This policy applies only if _ is true (mine is blank because I want it always on)
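Since a policy is just JSON, the fields above can be assembled programmatically. This sketch rebuilds a trimmed-down version of the policy from this post with Python dicts (the `neilcomputing` bucket name comes from the policy above; the `s3_statement` helper is my own):

```python
import json

def s3_statement(actions, resource):
    """One IAM statement: Effect + Action + Resource (+ empty Condition)."""
    return {"Effect": "Allow", "Action": actions, "Resource": resource, "Condition": {}}

policy = {
    "Version": "2012-10-17",
    "Statement": [
        # Bucket-level actions apply to the bucket ARN itself...
        s3_statement(["s3:ListBucket", "s3:GetBucketLocation"],
                     "arn:aws:s3:::neilcomputing"),
        # ...while object-level actions need the /* suffix.
        s3_statement(["s3:GetObject", "s3:PutObject"],
                     "arn:aws:s3:::neilcomputing/*"),
    ],
}

print(json.dumps(policy, indent=2))
```

The bucket-ARN vs. bucket-ARN/* split is the detail that trips people up most: ListBucket on `neilcomputing/*` simply doesn't match anything.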


To learn what an IAM role is, this blog post was incredibly helpful:

http://start.jcolemorrison.com/aws-iam-policies-in-a-nutshell/

AWS: No-AWS Knowledge Quick Interview Prep


AWS:

EC2 -> Cloud server
S3 -> Cloud storage; can host a static website
Glacier -> Archival storage (way cheaper)
AWS Elastic Beanstalk -> Web apps

DynamoDB -> NoSQL
RDS -> Relational DB
Redshift -> Petabyte-scale data warehouse solution

Hive + EMR on S3 -> Analyzing big data

Python SDK for AWS -> Boto3

Has an object-oriented API; also allows for low-level direct access
Provides access to just about everything
Client API -> low-level direct access (1:1 with the HTTP API)
Resource API -> provides resource objects and collections to access attributes and perform actions
Python 2 and 3 support
Has waiters to check for the ready status of spun-up instances before executing anything
Has service-specific features like automatic multipart transfers for S3 and simplified query conditions for Amazon DynamoDB
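The "waiter" idea mentioned above, poll until a resource reaches a ready state, can be sketched without touching AWS at all. The fake `describe` function below is a stand-in for a real API call like `describe_instances`; only the polling pattern is the point:

```python
import time

def wait_until(check, attempts=10, delay=0.01):
    """Poll `check` until it returns True, the way a boto3 waiter polls an API."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    raise TimeoutError("resource never became ready")

# A fake API: the "instance" reaches 'running' after a few polls.
states = iter(["pending", "pending", "running"])

def describe():  # hypothetical stand-in for e.g. ec2.describe_instances
    return next(states, "running") == "running"

assert wait_until(describe)
```

In real boto3 you would get a prebuilt waiter (e.g. `client.get_waiter("instance_running")`) instead of writing the loop yourself; this just shows why waiters exist.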

Chalice: Python serverless microframework for AWS

Quickly create and deploy applications that use Amazon API Gateway and AWS Lambda
Not yet production-ready; does not have access to the entire feature set

Cryptography Week 1: Intro


Principles of Modern Crypto:

Definitions: Define what security means in a particular context.

Giving a proper definition of what you want will help you figure out exactly what you need, and may reveal nuances that you might otherwise have missed.

Assumptions: Certain kinds of problems are assumed to be impossible to solve efficiently (e.g., P != NP); explicitly state what assumptions you have made.

Proofs of security: Propose a scheme and provide a mathematical proof of why it is secure.

Reasons "secure" schemes break in the real world: 1) the wrong definition of security was chosen, or 2) an assumption was wrong.

Kerckhoffs's principle: The encryption scheme should be public; only the keys should be private.


Private Key Encryption:

Three main algorithms:

Gen -> key generation: create a random k in K
Enc -> encryption algorithm: takes k and m (a message chosen from M) and outputs c (the ciphertext)
Dec -> decryption algorithm: takes c and k and outputs m

Correctness: Dec_k(Enc_k(m)) = m. Encryption can be randomized; decryption is deterministic. K and M are the key and message spaces; i.e., m is a message chosen from M, the set of all possible messages.
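The (Gen, Enc, Dec) interface above can be illustrated with the classic shift cipher over lowercase letters. This is a toy example, and famously insecure; it is only here to show the three algorithms and the correctness condition Dec_k(Enc_k(m)) = m:

```python
import random

ALPHABET_SIZE = 26

def gen():
    """Gen: pick a uniformly random key k from the key space K = {0..25}."""
    return random.randrange(ALPHABET_SIZE)

def enc(k, m):
    """Enc: shift each letter of m forward by k positions."""
    return "".join(chr((ord(c) - ord("a") + k) % ALPHABET_SIZE + ord("a")) for c in m)

def dec(k, c):
    """Dec: shift each letter of c back by k positions (deterministic)."""
    return "".join(chr((ord(ch) - ord("a") - k) % ALPHABET_SIZE + ord("a")) for ch in c)

k = gen()
m = "attackatdawn"
assert dec(k, enc(k, m)) == m  # correctness: Dec_k(Enc_k(m)) = m
```

With only 26 possible keys, an attacker can brute-force every key, which is exactly the kind of failure the formal definitions later in the course rule out.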


Perfect Secrecy:

Cryptographic definitions have two parts: 1) a threat model, meant to capture the real-world capabilities attackers are assumed to have, and 2) a security guarantee: what are we trying to prevent the attacker from doing?

Several threat models:

1) The attacker sees only ciphertext
2) The attacker gets plaintext/ciphertext pairs
3) The attacker chooses plaintexts and gets the corresponding ciphertexts, often with the ability to decrypt certain chosen ciphertexts as well

An encryption scheme is secure iff, regardless of any prior knowledge the attacker has of the plaintext, the ciphertext observed by the attacker leaks no additional information about the plaintext.
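The standard example of a scheme meeting this definition is the one-time pad: XOR the message with a uniformly random key as long as the message, so the ciphertext is itself uniform and leaks nothing. A minimal sketch:

```python
import secrets

def otp_enc(key, msg):
    """XOR each message byte with the corresponding key byte."""
    assert len(key) == len(msg), "the pad must be as long as the message"
    return bytes(k ^ b for k, b in zip(key, msg))

otp_dec = otp_enc  # XOR is its own inverse, so Dec is the same operation

msg = b"secret"
key = secrets.token_bytes(len(msg))  # Gen: uniform key as long as the message
ct = otp_enc(key, msg)
assert otp_dec(key, ct) == msg
```

The perfect secrecy only holds if the key is truly uniform, kept secret, and never reused; "two-time pad" reuse leaks the XOR of the two messages.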

Predicting Passenger Survival on the Titanic: part 2


Step 6) Does a Linear Regression work?

What if we tried to create a linear formula that looks at all the values and predicts whether a passenger was likely to survive? scikit-learn contains a linear regression class:

from sklearn.linear_model import LinearRegression

# Identify the predictors
predictors = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked"]

from sklearn.cross_validation import KFold

# Initialize our algorithm and the cross-validation folds
alg = LinearRegression()
kf = KFold(titanic.shape[0], n_folds=3, random_state=1)

predictions = []
for train, test in kf:
    # The predictors we're using to train the algorithm. Note how we only take the rows in the train folds.
    train_predictors = titanic[predictors].iloc[train, :]
    # The target we're using to train the algorithm.
    train_target = titanic["Survived"].iloc[train]
    # Train the algorithm using the predictors and target.
    alg.fit(train_predictors, train_target)
    # We can now make predictions on the test fold.
    test_predictions = alg.predict(titanic[predictors].iloc[test, :])
    predictions.append(test_predictions)

The predictions are in three separate numpy arrays. Concatenate them into one.

We concatenate them on axis 0, as they only have one axis.

import numpy

predictions = numpy.concatenate(predictions, axis=0)

Map predictions to outcomes (the only possible outcomes are 1 and 0):

predictions[predictions > .5] = 1
predictions[predictions <= .5] = 0
# Accuracy = the fraction of predictions that match the actual outcome
accuracy = len(predictions[predictions == titanic["Survived"]]) / len(predictions)

A linear regression using these predictors gives an accuracy of 78.3%, which is not all that great.

Step 7) Logistic Regression: output values between 0 and 1.

One good way to think of logistic regression is that it takes the output of a linear regression and maps it to a probability value between 0 and 1. The mapping is done with the logistic (sigmoid) function: passing any value through it yields a value between 0 and 1 by "squeezing" the extreme values. This is perfect for us, because we only care about two outcomes.
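The "squeezing" described above is the logistic (sigmoid) function sigma(x) = 1 / (1 + e^-x); the logit is its inverse. A quick sketch of the mapping:

```python
import math

def sigmoid(x):
    """Map any real number into (0, 1): large negatives -> ~0, large positives -> ~1."""
    return 1 / (1 + math.exp(-x))

print(sigmoid(0))    # 0.5 -- the decision boundary
print(sigmoid(5))    # ~0.993
print(sigmoid(-5))   # ~0.007
```

So a linear-regression-style score of 5 becomes a ~99% survival probability, and a score of -5 becomes ~1%, which is exactly the behavior we want when thresholding at 0.5.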

Sklearn has a class for logistic regression that we can use. We’ll also make things easier by using an sklearn helper function to do all of our cross validation and evaluation for us. This specific test run has an accuracy of 0.787878787879.

from sklearn import cross_validation
from sklearn.linear_model import LogisticRegression

# Initialize our algorithm
alg = LogisticRegression(random_state=1)

# Compute the accuracy score for all the cross-validation folds (much simpler than what we did before!)
scores = cross_validation.cross_val_score(alg, titanic[predictors], titanic["Survived"], cv=3)

# Take the mean of the scores (because we have one for each fold)
print(scores.mean())

Step 8) Decision Trees, Random Forests, and Gradient Boosting
Step 9) Ensembling
Step 10) Matching and Predicting on the Test Set