parsl.providers.AWSProvider

class parsl.providers.AWSProvider(image_id, key_name, init_blocks=1, min_blocks=0, max_blocks=10, nodes_per_block=1, parallelism=1, worker_init='', instance_type='t2.small', region='us-east-2', spot_max_bid=0, key_file=None, profile=None, iam_instance_profile_arn='', state_file=None, walltime='01:00:00', linger=False, launcher=SingleNodeLauncher(debug=True, fail_on_any=False))[source]

A provider for using Amazon Elastic Compute Cloud (EC2) resources.

One of three authentication methods is required: a key file, a profile, or environment variables. If neither key_file nor profile is set, the following environment variables must be set: AWS_ACCESS_KEY_ID (the access key for your AWS account), AWS_SECRET_ACCESS_KEY (the secret key for your AWS account), and, optionally, AWS_SESSION_TOKEN (the session key for your AWS account).

Parameters:
  • image_id (str) – Identification of the Amazon Machine Image (AMI).

  • worker_init (str) – String to append to the Userdata script executed in the cloudinit phase of instance initialization.

  • key_file (str) – Path to a JSON file that contains ‘AWSAccessKeyId’ and ‘AWSSecretKey’.

  • nodes_per_block (int) – Nodes to provision per block. This is always 1 for EC2. Default is 1.

  • profile (str) – Profile to be used from the standard AWS config file ~/.aws/config.

  • init_blocks (int) – Number of blocks to provision at the start of the run. Default is 1.

  • min_blocks (int) – Minimum number of blocks to maintain. Default is 0.

  • max_blocks (int) – Maximum number of blocks to maintain. Default is 10.

  • instance_type (str) – EC2 instance type. Instance types comprise varying combinations of CPU, memory, storage, and networking capacity. For more information on possible instance types, see the Amazon EC2 instance types documentation. Default is ‘t2.small’.

  • region (str) – Amazon Web Services (AWS) region in which to launch machines. Default is ‘us-east-2’.

  • key_name (str) – Name of the AWS private key (.pem file) that is usually generated on the console to allow SSH access to the EC2 instances. This is mostly used for debugging.

  • spot_max_bid (float) – Maximum bid price (if requesting spot market machines).

  • iam_instance_profile_arn (str) – Launch instance with a specific role.

  • state_file (str) – Path to the state file from a previous run to re-use.

  • walltime (str) – Walltime requested per block in HH:MM:SS. This option is not currently honored by this provider.

  • launcher (Launcher) – Launcher for this provider. With AWS, usually the default SingleNodeLauncher will be appropriate.

  • linger (bool) – When set to True, the workers will not halt. The user is responsible for shutting down the node.
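
A minimal configuration sketch using the parameters above. The AMI ID, key pair name, profile name, and executor label are placeholders, and Config and HighThroughputExecutor are Parsl classes that this section does not describe. The imports are deferred into the function on the assumption that boto3 (an optional dependency) is installed and that constructing the provider may contact AWS.

```python
def build_config():
    """Build a Parsl Config backed by AWSProvider (illustrative values only)."""
    # Deferred imports: AWSProvider needs the optional boto3 dependency.
    from parsl.config import Config
    from parsl.executors import HighThroughputExecutor
    from parsl.providers import AWSProvider

    provider = AWSProvider(
        image_id="ami-0123456789abcdef0",  # placeholder AMI ID
        key_name="my-keypair",             # placeholder key pair name
        profile="default",                 # placeholder profile in ~/.aws/config
        region="us-east-2",
        instance_type="t2.small",
        init_blocks=1,
        min_blocks=0,
        max_blocks=2,
    )
    return Config(
        executors=[HighThroughputExecutor(label="ec2_htex", provider=provider)]
    )
```

The returned object would typically be passed to parsl.load() once valid credentials and a real AMI ID are in place.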

__init__(image_id, key_name, init_blocks=1, min_blocks=0, max_blocks=10, nodes_per_block=1, parallelism=1, worker_init='', instance_type='t2.small', region='us-east-2', spot_max_bid=0, key_file=None, profile=None, iam_instance_profile_arn='', state_file=None, walltime='01:00:00', linger=False, launcher=SingleNodeLauncher(debug=True, fail_on_any=False))[source]

Methods

__init__(image_id, key_name[, init_blocks, ...])

cancel(job_ids)

Cancel the jobs specified by a list of job ids.

config_route_table(vpc, internet_gateway)

Configure route table for Virtual Private Cloud (VPC).

create_name_tag_spec(resource_type, name)

Create a new tag specification for a resource name.

create_session()

Create a session.

create_vpc()

Create and configure a VPC.

generate_aws_id()

Generate a new ID for AWS resources.

get_instance_state([instances])

Get states of all instances on EC2 which were started by this file.

goodbye()

initialize_boto_client()

Initialize the boto client.

read_state_file(state_file)

Read the state file, if it exists.

security_group(vpc, name)

Create and configure a new security group.

show_summary()

Print human readable summary of current AWS state to log and to console.

shut_down_instance([instances])

Shut down a list of instances, if provided.

spin_up_instance(command, job_name)

Start an instance in the VPC in the first available subnet.

status(job_ids)

Get the status of a list of jobs identified by their ids.

submit([command, tasks_per_node, job_name])

Submit the command onto a freshly instantiated AWS EC2 instance.

teardown()

Tear down the EC2 infrastructure.

write_state_file()

Save information that must persist to a file.

xstr(s)

Attributes

cores_per_node

Number of cores to provision per node.

label

Provides the label for this provider.

mem_per_node

Real memory to provision per node in GB.

status_polling_interval

Returns the interval, in seconds, at which the status method should be called.

cancel(job_ids)[source]

Cancel the jobs specified by a list of job ids.

Parameters:

job_ids (list of str) – List of job identifiers.

Returns:

Each entry in the list will contain False if the operation fails. Otherwise, the entry will be True.

Return type:

list of bool

config_route_table(vpc, internet_gateway)[source]

Configure route table for Virtual Private Cloud (VPC).

Parameters:
  • vpc (dict) – Representation of the VPC (created by create_vpc()).

  • internet_gateway (dict) – Representation of the internet gateway (created by create_vpc()).

create_name_tag_spec(resource_type, name)[source]

Create a new tag specification for a resource name.

Parameters:
  • resource_type (str) – The AWS resource type.

  • name (str) – The name to assign to the resource.

Returns:

A TagSpecifications record to be passed into the creation of a new AWS resource.

Return type:

record

create_session()[source]

Create a session.

First we look in self.key_file for a path to a JSON file with the credentials. The key file should have ‘AWSAccessKeyId’ and ‘AWSSecretKey’.

Next we look at self.profile for a profile name and try to use the Session call to automatically pick up the keys for the profile from the user default keys file ~/.aws/config.

Finally, boto3 will look for the keys in environment variables:

  • AWS_ACCESS_KEY_ID – The access key for your AWS account.

  • AWS_SECRET_ACCESS_KEY – The secret key for your AWS account.

  • AWS_SESSION_TOKEN – The session key for your AWS account. This is only needed when you are using temporary credentials. The AWS_SECURITY_TOKEN environment variable can also be used, but is only supported for backwards compatibility. AWS_SESSION_TOKEN is supported by multiple AWS SDKs besides Python.
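
The lookup order described above can be sketched as follows. This is an illustration of the documented precedence only, not the provider's actual code; the function name pick_credential_source and its return labels are hypothetical.

```python
import json
import os


def pick_credential_source(key_file=None, profile=None, environ=None):
    """Return which credential source would be used, in the documented order."""
    environ = os.environ if environ is None else environ

    # 1. A key file takes precedence when one is given.
    if key_file is not None:
        with open(key_file) as f:
            creds = json.load(f)
        missing = {"AWSAccessKeyId", "AWSSecretKey"} - set(creds)
        if missing:
            raise KeyError(f"key file is missing: {sorted(missing)}")
        return "key_file"

    # 2. Otherwise a named profile from ~/.aws/config is used.
    if profile is not None:
        return "profile"

    # 3. Otherwise fall back to environment variables.
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    if all(name in environ for name in required):
        return "environment"

    raise RuntimeError("no AWS credentials found")
```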

create_vpc()[source]

Create and configure a VPC.

We create a VPC with CIDR 10.0.0.0/16, which provides address space for roughly 65,000 instances (65,536 addresses).

We attach a subnet for each availability zone within the region specified in the config. Each subnet is given an IP range of the form 10.0.X.0/20, which is large enough for approximately 4,000 instances.

Security groups are configured in function security_group.
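
The address arithmetic behind these CIDR ranges can be checked with Python's standard ipaddress module. This is only a worked calculation, not part of the provider; note that AWS reserves a few addresses in each subnet, so usable capacity is slightly lower than the raw counts.

```python
import ipaddress

# The VPC range and its division into /20 subnets.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=20))

print(vpc.num_addresses)         # 65536 addresses in the /16
print(len(subnets))              # 16 possible /20 subnets
print(subnets[1])                # 10.0.16.0/20
print(subnets[0].num_addresses)  # 4096 addresses per /20
```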

generate_aws_id()[source]

Generate a new ID for AWS resources.

Returns:

An ID of the form ‘parsl.aws.123456.789’ for giving resources unique identifiers.

Return type:

str
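
An ID of this shape could be produced from a timestamp, as in the sketch below. That the ID is timestamp-based, and the helper name make_aws_id, are assumptions made for illustration; the source only documents the resulting form.

```python
import time


def make_aws_id(now=None):
    """Return an ID like 'parsl.aws.123456.789' (assumed timestamp-based)."""
    # Hypothetical helper: uses the current time unless one is supplied.
    now = time.time() if now is None else now
    return "parsl.aws.{:.3f}".format(now)


print(make_aws_id(123456.789))  # parsl.aws.123456.789
```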

get_instance_state(instances=None)[source]

Get states of all instances on EC2 which were started by this file.

goodbye()[source]

initialize_boto_client()[source]

Initialize the boto client.

property label[source]

Provides the label for this provider.

read_state_file(state_file)[source]

Read the state file, if it exists.

If this script has been run previously, resource IDs will have been written to a state file. On starting a run, the provider looks for a state file before creating new infrastructure. Information on VPCs, security groups, and subnets is saved, as well as running instances and their states.

AWS has a maximum number of VPCs per region per account, so we do not want to clutter users’ AWS accounts with security groups and VPCs that will be used only once.

security_group(vpc, name)[source]

Create and configure a new security group.

Allows all inbound ICMP, and all inbound TCP and UDP from within the VPC.

This security group is very open. It allows all incoming ping (ICMP) requests and all outgoing traffic on all ports. This can be restricted by changing the allowed port ranges.

Parameters:
  • vpc (VPC instance) – VPC in which to set up security group.

  • name (str) – Name tag for the newly created security group.

show_summary()[source]

Print human readable summary of current AWS state to log and to console.

shut_down_instance(instances=None)[source]

Shut down a list of instances, if provided.

If no instance is provided, the last instance started up will be shut down.

spin_up_instance(command, job_name)[source]

Start an instance in the VPC in the first available subnet.

Starting N instances when nodes_per_block > 1 is not supported; only one node is provisioned per block.

Parameters:
  • command (str) – Command string to execute on the node.

  • job_name (str) – Name associated with the instances.

status(job_ids)[source]

Get the status of a list of jobs identified by their ids.

Parameters:

job_ids (list of str) – Identifiers for the jobs.

Returns:

The status codes of the requested jobs.

Return type:

list of int

property status_polling_interval[source]

Returns the interval, in seconds, at which the status method should be called.

Returns:

The number of seconds to wait between calls to status().

submit(command='sleep 1', tasks_per_node=1, job_name='parsl.aws')[source]

Submit the command onto a freshly instantiated AWS EC2 instance.

Submit returns an ID that corresponds to the task that was just submitted.

Parameters:
  • command (str) – Command to be invoked on the remote side.

  • tasks_per_node (int (default=1)) – Number of command invocations to be launched per node.

  • job_name (str) – Prefix for the job name.

Returns:

If at capacity, None will be returned. Otherwise, the job identifier will be returned.

Return type:

None or str
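
Because submit() may return None when the provider is at capacity, callers should check the result before treating it as a job identifier. The sketch below shows one way to do that; StubProvider is a hypothetical stand-in that only mimics the documented return contract, and its capacity rule and ID format are invented for the example.

```python
class StubProvider:
    """Hypothetical stand-in mimicking the documented submit() contract."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.jobs = []

    def submit(self, command="sleep 1", tasks_per_node=1, job_name="parsl.aws"):
        if len(self.jobs) >= self.capacity:
            return None  # at capacity, as documented
        job_id = f"{job_name}.{len(self.jobs)}"
        self.jobs.append(job_id)
        return job_id


def submit_all(provider, commands):
    """Submit each command, separating accepted job IDs from rejected commands."""
    accepted, rejected = [], []
    for command in commands:
        job_id = provider.submit(command, 1, "parsl.aws")
        if job_id is None:
            rejected.append(command)
        else:
            accepted.append(job_id)
    return accepted, rejected


accepted, rejected = submit_all(StubProvider(capacity=2), ["a", "b", "c"])
print(accepted)  # ['parsl.aws.0', 'parsl.aws.1']
print(rejected)  # ['c']
```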

teardown()[source]

Tear down the EC2 infrastructure.

Terminate all EC2 instances, delete all subnets, delete security group, delete VPC, and reset all instance variables.

write_state_file()[source]

Save information that must persist to a file.

We do not want to create a new VPC and new identical security groups, so we save information about them in a file between runs.
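
The idea of persisting resource IDs between runs can be sketched as a simple round trip. The JSON format, the helper names write_state and read_state, and the dictionary keys are all assumptions made for illustration; the provider's actual file layout is not documented in this section.

```python
import json
import os
import tempfile


def write_state(path, state):
    """Save resource identifiers so that a later run can reuse them."""
    with open(path, "w") as f:
        json.dump(state, f, indent=4)


def read_state(path):
    """Return the saved state, or None when no state file exists yet."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


# Hypothetical keys and IDs, for demonstration only.
state = {"vpc_id": "vpc-0abc", "security_group_id": "sg-0def", "instances": []}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state.json")
    first = read_state(path)     # nothing saved yet
    write_state(path, state)
    restored = read_state(path)  # same content as `state`
```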

xstr(s)[source]