run_stacks Module
Functionality related to retrieving raw AT-TPC trace runs and balancing the load in multiprocessing
collect_runs(trace_path, run_min, run_max, runs_to_skip)
Make dict of runs with the size of the raw data file
Using the trace path and the configured run_min and run_max, build a dict of the run numbers to be processed, sorted by the size of the raw trace files. The dict key is the run number, and the associated value is the size of the raw trace file. Runs that do not exist or have no data are omitted from the dict.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
trace_path | Path | The path to AT-TPC trace data | required |
run_min | int | The first run, inclusive | required |
run_max | int | The last run, inclusive | required |
runs_to_skip | list[int] | List of run numbers to skip (if any) | required |
Returns:
Type | Description |
---|---|
dict[int, int] | A dictionary where the keys are run numbers and the values are the sizes of the associated raw trace files. The dict is sorted in descending order of raw trace file size. |
Source code in src/spyral/core/run_stacks.py
create_run_stacks(trace_path, run_min, run_max, n_stacks, runs_to_skip)
Create a set of runs to be processed for each stack in n_stacks.
Each stack is intended to be handed off to a single processor, so the goal is to balance the workload across all of the stacks. collect_runs is used to retrieve the runs sorted by size. The stacks are then filled in a snaking (back-and-forth) order, depositing one run in each stack on each pass. This seems to provide a fairly balanced load without too much effort.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
trace_path | Path | The path to AT-TPC traces | required |
run_min | int | The minimum run number, inclusive | required |
run_max | int | The maximum run number, inclusive | required |
n_stacks | int | The number of stacks; should equal the number of processors | required |
runs_to_skip | list[int] | List of run numbers to skip (if any) | required |
Returns:
Type | Description |
---|---|
tuple[list[list[int]], list[float]] | The stacks and their approximate load. Each stack is a list of ints, where each value is a run number for that stack to process. The first element of the tuple is the list of stacks; the second element is the percent load of each stack, estimated from the size of the trace data. |
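
To make the snaking order concrete, here is a minimal sketch of one way to distribute size-sorted runs across stacks. It is not the library's implementation: the function name `snake_stacks` is hypothetical, it takes an already collected run-to-size dict instead of a trace path, and it assumes the load is reported on a 0 to 100 scale.

```python
def snake_stacks(
    sized_runs: dict[int, int], n_stacks: int
) -> tuple[list[list[int]], list[float]]:
    stacks: list[list[int]] = [[] for _ in range(n_stacks)]
    loads = [0] * n_stacks

    # Largest runs first; sorted() is stable, so ties keep their input order
    ordered = sorted(sized_runs.items(), key=lambda item: item[1], reverse=True)

    # Each pass hands out at most one run per stack. Even passes go
    # left-to-right and odd passes go right-to-left, so the stack that got
    # the largest run of one pass gets the smallest run of the next.
    for start in range(0, len(ordered), n_stacks):
        chunk = ordered[start : start + n_stacks]
        forward = (start // n_stacks) % 2 == 0
        order = range(n_stacks) if forward else range(n_stacks - 1, -1, -1)
        for stack_index, (run, size) in zip(order, chunk):
            stacks[stack_index].append(run)
            loads[stack_index] += size

    total = sum(loads)
    # Assumed convention: percent of the total trace size, on a 0-100 scale
    percent = [100.0 * load / total if total else 0.0 for load in loads]
    return stacks, percent
```

For example, with run sizes `{1: 50, 2: 40, 3: 30, 4: 20, 5: 10, 6: 10}` and two stacks, this sketch yields the stacks `[[1, 4, 5], [2, 3, 6]]`, each carrying 50% of the total size. Snaking does not guarantee an optimal split, but it avoids the bias of always giving the first stack the largest run of every pass.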
Source code in src/spyral/core/run_stacks.py
form_run_string(run_number)
Make the run_* string
Parameters:
Name | Type | Description | Default |
---|---|---|---|
run_number | int | The run number | required |
Returns:
Type | Description |
---|---|
str | The run string |
get_size_path(path)
Get the size of a path item.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | Path | The path item to be inspected | required |
Returns:
Type | Description |
---|---|
int | The size of the item at the given path, or 0 if no item exists. |
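
A size lookup with this contract can be written in a few lines with pathlib. The sketch below is an assumption about how such a helper might look, not the module's actual code, and the name `size_or_zero` is hypothetical.

```python
from pathlib import Path


def size_or_zero(path: Path) -> int:
    try:
        # st_size is the size in bytes reported by the filesystem
        return path.stat().st_size
    except FileNotFoundError:
        # Nothing exists at this path, so report a size of 0
        return 0
```

Catching `FileNotFoundError` instead of checking `path.exists()` first avoids a race in which the file disappears between the check and the `stat` call. Note that for a directory, `st_size` describes the directory entry itself, not the total size of its contents.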