This repository was archived by the owner on Nov 23, 2017. It is now read-only.

2.0 #114 (Open)

Wants to merge 15 commits into base: branch-2.0.
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE
@@ -0,0 +1 @@
+spark-ec2 is no longer in active development. Please refer to the README.
1 change: 1 addition & 0 deletions .github/PULL_REQUEST_TEMPLATE
@@ -0,0 +1 @@
+spark-ec2 is no longer in active development. Please refer to the README.
14 changes: 8 additions & 6 deletions README.md
@@ -1,3 +1,5 @@
+_Please note: spark-ec2 is **no longer under active development** and the project has been archived. All the existing code, PRs and issues are still accessible but are now read-only. If you're looking for a similar tool that is under active development, we recommend you take a look at [Flintrock](https://github.com/nchammas/flintrock)._
+
# EC2 Cluster Setup for Apache Spark

`spark-ec2` allows you
@@ -52,8 +54,8 @@ identify machines belonging to each cluster in the Amazon EC2 Console.

```bash
export AWS_SECRET_ACCESS_KEY=AaBbCcDdEeFGgHhIiJjKkLlMmNnOoPpQqRrSsTtU
export AWS_ACCESS_KEY_ID=ABCDEFG1234567890123
./spark-ec2 --key-pair=awskey --identity-file=awskey.pem --region=us-west-1 --zone=us-west-1a launch my-spark-cluster
```

- After everything launches, check that the cluster scheduler is up and sees
@@ -65,7 +67,7 @@ following options are worth pointing out:

- `--instance-type=<instance-type>` can be used to specify an EC2
instance type to use. For now, the script only supports 64-bit instance
-types, and the default type is `m1.large` (which has 2 cores and 7.5 GB
+types, and the default type is `m3.large` (which has 2 cores and 7.5 GB
RAM). Refer to the Amazon pages about [EC2 instance
types](http://aws.amazon.com/ec2/instance-types) and [EC2
pricing](http://aws.amazon.com/ec2/#pricing) for information about other
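
(Aside: the hunk above updates the documented default instance type. A hypothetical launch command exercising this flag, reusing the key-pair and region from the examples above; the `m3.xlarge` choice is purely illustrative:)

```bash
./spark-ec2 --key-pair=awskey --identity-file=awskey.pem --region=us-west-1 \
  --zone=us-west-1a --instance-type=m3.xlarge launch my-spark-cluster
```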
@@ -110,8 +112,8 @@ permissions on your private key file, you can run `launch` with the

```bash
export AWS_SECRET_ACCESS_KEY=AaBbCcDdEeFGgHhIiJjKkLlMmNnOoPpQqRrSsTtU
export AWS_ACCESS_KEY_ID=ABCDEFG1234567890123
./spark-ec2 --key-pair=awskey --identity-file=awskey.pem --region=us-west-1 --zone=us-west-1a --vpc-id=vpc-a28d24c7 --subnet-id=subnet-4eb27b39 --spark-version=1.1.0 launch my-spark-cluster
```

## Running Applications
@@ -148,7 +150,7 @@ as JVM options. This file needs to be copied to **every machine** to reflect the
do this is to use a script we provide called `copy-dir`. First edit your `spark-env.sh` file on the master,
then run `~/spark-ec2/copy-dir /root/spark/conf` to RSYNC it to all the workers.
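
(Aside: a minimal sketch of the workflow described above, run on the master; the `SPARK_WORKER_MEMORY` setting is just an illustrative edit:)

```bash
# On the master: adjust a setting in spark-env.sh...
echo 'export SPARK_WORKER_MEMORY=4g' >> /root/spark/conf/spark-env.sh
# ...then rsync the whole conf directory to every worker.
~/spark-ec2/copy-dir /root/spark/conf
```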

-The [configuration guide](configuration.html) describes the available configuration options.
+The [configuration guide](http://spark.apache.org/docs/latest/configuration.html) describes the available configuration options.

## Terminating a Cluster

10 changes: 7 additions & 3 deletions spark_ec2.py
@@ -51,7 +51,7 @@
raw_input = input
xrange = range

-SPARK_EC2_VERSION = "1.6.0"
+SPARK_EC2_VERSION = "1.6.2"
SPARK_EC2_DIR = os.path.dirname(os.path.realpath(__file__))

VALID_SPARK_VERSIONS = set([
@@ -76,6 +76,8 @@
"1.5.1",
"1.5.2",
"1.6.0",
"1.6.1",
"1.6.2",
])

SPARK_TACHYON_MAP = {
@@ -94,14 +96,16 @@
"1.5.1": "0.7.1",
"1.5.2": "0.7.1",
"1.6.0": "0.8.2",
"1.6.1": "0.8.2",
"1.6.2": "0.8.2",
}

DEFAULT_SPARK_VERSION = SPARK_EC2_VERSION
DEFAULT_SPARK_GITHUB_REPO = "https://github.com/apache/spark"

# Default location to get the spark-ec2 scripts (and ami-list) from
DEFAULT_SPARK_EC2_GITHUB_REPO = "https://github.com/amplab/spark-ec2"
-DEFAULT_SPARK_EC2_BRANCH = "branch-1.5"
+DEFAULT_SPARK_EC2_BRANCH = "branch-1.6"


def setup_external_libs(libs):
@@ -192,7 +196,7 @@ def parse_args():
help="If you have multiple profiles (AWS or boto config), you can configure " +
"additional, named profiles by using this option (default: %default)")
parser.add_option(
"-t", "--instance-type", default="m1.large",
"-t", "--instance-type", default="m3.large",
help="Type of instance to launch (default: %default). " +
"WARNING: must be 64-bit; small instances won't work")
parser.add_option(
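
(Aside: a hedged sketch, not code from this PR, of how the `SPARK_TACHYON_MAP` entries added above are typically consulted; the upstream script defines a helper along these lines:)

```python
# Hypothetical illustration: resolve the Tachyon version bundled with a
# given Spark release, falling back to an empty string when none matches.
def get_tachyon_version(spark_version):
    return SPARK_TACHYON_MAP.get(spark_version, "")

# e.g. get_tachyon_version("1.6.2") -> "0.8.2"
```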