sheepdog Package

sheepdog Package

Sheepdog is a utility to run arbitary code on a GridEngine cluster and collect the results, typically by mapping a set of arguments to one function.

Documentation: http://sheepdog.readthedocs.org

Source code: https://github.com/adamgreig/sheepdog

PyPI: https://pypi.python.org/pypi/Sheepdog

Sheepdog is released under the MIT license, see the LICENSE file for details.

sheepdog.__init__.get_errors(request_id, dbfile)[source]

Fetch all the errors returned so-far for request_id.

sheepdog.__init__.get_results(request_id, dbfile, block=True, verbose=False)[source]

Fetch results for request_id. If block is true, wait until all the results are in. Otherwise, return just what has been received so far.

If verbose is true, print a status message every second with the current number of results.

Returns a list of (arg, result) tuples.

Where an error occured or no result has been submitted yet, result will be None.

sheepdog.__init__.map(f, args, config, ns=None, verbose=True)[source]

Submit f with each of args on GridEngine, wait until all the results are in, and return them in the same order as args. If an error occured for an arg, None is returned in that position. Call get_errors to get details on the errors that occured.

For details on config, see the documentation at: http://sheepdog.readthedocs.org/en/latest/configuration.html Or in docs/configuration.rst.

Optionally ns is a dict containing a namespace to execute the function in, which may itself contain additional functions.

If verbose is true, print out how many results are in so-far while waiting.

sheepdog.__init__.map_async(f, args, config, ns=None)[source]

Submit f with each of args on GridEngine, returning the (sheepdog-local) request ID.

For details on config, see the documentation at: http://sheepdog.readthedocs.org/en/latest/configuration.html Or in docs/configuration.rst.

Optionally ns is a dict containing a namespace to execute the function in, which may itself contain additional functions.

client Module

Sheepdog’s clientside code.

This code is typically only run on the worker, and this file is currently only used by pasting it into a job file (as workers don’t generally have sheepdog itself installed).

class sheepdog.client.Client(url, password, request_id, job_index)[source]

Find out what to do, do it, report back.

HTTP_RETRIES = 10
get_details()[source]

Retrieve the function to run and arguments to run with from the server.

go()[source]

Call get_details(), run(), submit_results(). Just for convenience.

run()[source]

Run the downloaded function, storing the result.

set_memlimit(fname='/home/docs/checkouts/readthedocs.org/user_builds/sheepdog/envs/latest/local/lib/python2.7/site-packages/Sheepdog-0.2.3-py2.7.egg/sheepdog/client.pyc')[source]
submit_results()[source]

deployment Module

Code for deploying code to servers and executing jobs on GridEngine.

class sheepdog.deployment.Deployer(host, port, user, keyfile=None)[source]

Connect to a remote SSH server, copy a file over, run qsub.

__init__ takes (host, port, user, keyfile) to specify which SSH server to connect to and how to connect to it.

deploy(jobfile, request_id, directory)[source]

Copy jobfile (a string of the file contents) to the connected remote host, placing it in directory with a filename containing request_id.

submit(request_id, directory)[source]

Submit a job to the GridEngine cluster on the connected remote host. Calls qsub with the job identified by request_id and directory.

job_file Module

Generate job files to send to the cluster.

The template is filled in with the job specifics and the formatted string is returned ready for deployment.

sheepdog.job_file.job_file(url, password, request_id, n_args, shell, grid_engine_opts)[source]

Format the template for a specific job, ready for deployment.

url is the URL (including port) that the workers should contact to fetch job information, including a trailing slash.

password is the HTTP Basic Auth password to use when talking to url.

request_id is the request ID workers should use to associate themselves with the correct request.

n_args is the number of jobs that will be queued in the array task, the same as the number of arguments being mapped by sheepdog.

shell is the path to the Python that will execute the job. Could be a system or user Python, so long as it meets the Sheepdog requirements. Is used for the -S option to GridEngine as well as the script shebang.

grid_engine_opts is a list of string arguments to Grid Engine to specify options such as resource requirements.

server Module

Sheepdog’s HTTP server endpoints.

The Server class sets up a server on another subprocess, ready to receive requests from workers. Uses Tornado if available, else falls back to the Flask debug web server.

class sheepdog.server.Server(port, password, dbfile)[source]

Run the HTTP server for workers to request arguments and return results.

Should be used via get_server(port, password, dbfile) to manage servers running globally.

__init__ creates and starts the HTTP server.

stop()[source]

Terminate the HTTP server.

sheepdog.server.check_auth(username, password)[source]
sheepdog.server.get_config(*args, **kwargs)[source]

Endpoint for workers to fetch their configuration before execution. Workers should specify request_id (integer) and job_index (integer) from their job file.

Returns a JSON object:

{“func”: (serialised function object),
“args”: (serialised arguments list)

}

with HTTP status 200 on success.

sheepdog.server.get_server(port, password, dbfile)[source]

Either start a new server or retrieve a reference to an existing server. Only one server may run per port. If the server currently running on that port has a different password or dbfile, a RuntimeError is raised.

If None is specified for port, a port is picked randomly and that server is the one referenced for None thereafter.

sheepdog.server.get_storage()[source]

Retrieve the request-local database connection, creating it if required.

sheepdog.server.report_error(*args, **kwargs)[source]

Endpoint for workers to report back errors in function execution. Workers should specify request_id (integer), job_index (integer) and error (an error string) HTTP POST parameters.

Returns the string “OK” and HTTP 200 on success.

sheepdog.server.requires_auth(f)[source]
sheepdog.server.run_server(port, password, dbfile)[source]

Start up the HTTP server. If Tornado is available it will be used, else fall back to the Flask debug server.

sheepdog.server.submit_result(*args, **kwargs)[source]

Endpoint for workers to submit results arising from successful function execution. Should specify request_id (integer), job_index (integer) and result (serialised result) HTTP POST parameters.

Returns the string “OK” and HTTP 200 on success.

storage Module

Interface to the storage backend.

Future plans involve porting most of those handwritten SQL to a sensible ORM.

class sheepdog.storage.Storage(dbfile)[source]

Manage persistence for requests and results.

Request functions and result objects are stored as binary blobs in the database, so any bytes object will be fine. They’ll be returned as they were sent.

__init__ creates a database connection.

dbfile is a file path for the sqlite file.

Use of “:memory:” is not advised as the web server runs in a separate process so will not share memory with the main interpreter process, making it rather difficult to retrieve results.

count_errors(request_id)[source]

Count the number of errors reported so far.

count_results(request_id)[source]

Count the number of results so far for the given request_id.

count_results_and_errors(request_id)[source]

Sum the result and error counts.

count_tasks(request_id)[source]

Count the total number of tasks for this request.

get_details(request_id, job_index)[source]

Get the target function, namespace and arguments for a given job.

get_errors(request_id)[source]

Fetch all errors for a given request_id.

Returns a list of (args, error) items in the order of the original args_list provided to new_request.

get_results(request_id)[source]

Fetch all results for a given request_id.

Returns a list of (args, result) items in the order of the original args_list provided to new_request.

Gaps are not filled in, so if results have not yet been submitted the corresponding arguments will not appear in this list and this list will be shorter than the length of args_list.

get_tasks_with_results(request_id)[source]

Fetch all tasks for a given request_id, including results for all tasks where results have come in already.

Returns a list of (args, result) items in the order of the original args_list provided to new_request, where result may be None.

initdb()[source]

Create the database structure if it doesn’t already exist.

new_request(serialised_function, serialised_namespace, args_list)[source]

Add a new request to the database.

serialised_function is some bytes object that should be given to workers to turn into the code to execute.

serialised_namespace is some bytes object that should be given to workers alongside the serialised function to provide helper variables and functions that the primary function requires.

args_list is a list, tuple or other iterable where each item is some bytes object that should be given to workers to run their target function with.

Returns the new request ID.

store_error(request_id, job_index, error)[source]

Store an error resulting from a computation.

store_result(request_id, job_index, result)[source]

Store a new result from a given request_id and job_index.