We will use Mermaid to create flowcharts for these algorithms. An overview of Mermaid syntax can be found here: https://mermaid.js.org/intro/. If you are using Visual Studio Code, please install the Markdown Preview Mermaid Support extension.
Before we do anything, we must import the data.py module and create an environment. An Environment object contains information about the kind of collider in which the data will be generated. It can be initialized with its constructor, and it has the following attributes:
env = Environment()
print(env.top_layer_lim) # a float representing the upper and lower limits of the top layer (default 100cm)
print(env.bottom_layer_lim) # a float representing the upper and lower limits of layer 0 (default 15cm)
print(env.layers) # an integer representing the number of layers, excluding layer 0 (default 5)
print(env.radii) # a float representing the distance between consecutive layers (default 5.0cm)
Once we have generated the environment, we can generate a dataset by constructing the DataSet object with the environment and the number of points per layer as its arguments.
data = DataSet(env, n_points = 150)
To access the data, we just need to call data.array. Plotting the dataset with its plot method should give a shape that approximately looks like an inverted symmetric trapezoid.
The DataSet object assumes that the data points are generated from a uniform distribution. Hence such data may not holistically represent the data actually generated by the collider. To use realistic versions of collider data, we can simply read in a data file given in .txt format using the readFile function in the reader.py module.
readFile reads the given data file line by line, where each line represents a spacepoint collection that can be used as input data for our algorithms. As every line represents tuples separated by commas, readFile stores the information in every tuple as an instance of the class SpacePoint and appends all of the SpacePoint objects into an instance of the class SpacePtCollection. In the end, the readFile function returns a list of SpacePtCollection objects.
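The exact tuple format inside the .txt files is not specified here, but the line-by-line parsing that readFile performs can be sketched in plain Python. This is an illustrative stand-in, not the actual reader.py implementation; the two-number tuple layout below is a hypothetical assumption.

```python
import re

def parse_line(line):
    """Split one file line into tuples of floats, one per spacepoint.

    Hypothetical format assumption: each line looks like
    "(0.5,1.0),(2.5,3.0)" -- parenthesized tuples separated by commas.
    """
    # grab the contents of each parenthesized group
    tuples = re.findall(r"\(([^)]*)\)", line)
    return [tuple(float(v) for v in t.split(",")) for t in tuples]

# one parsed line would then become one SpacePtCollection in the real code
points = parse_line("(0.5,1.0),(2.5,3.0),(4.0,5.5)")
```

Each parsed tuple corresponds to what readFile would wrap in a SpacePoint object.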
Then, one can use the converter function convertToDataset in the converter.py module to convert each SpacePtCollection object into an instance of the class WedgeData, which can be found in the wedgedata.py module. WedgeData is a subclass of DataSet, thus inheriting its attributes. Yet WedgeData differs from DataSet in that the number of points varies from layer to layer instead of being the same. Hence the attribute n_points is a list with one entry per layer, and array is a nested list of SpacePoint objects whose sublists may differ in length, so its shape may not match that of a regular DataSet object. Essentially, the convertToDataset function returns a WedgeData object that can be directly input into our algorithms.
Here is an example of how the readFile and convertToDataset functions can be used to read in a wedge data file and convert it for use with our algorithms:
from reader import readFile # Import readFile function
from converter import convertToDataset # Import convertToDataset function
wedges = readFile("Wedge_Data.txt") # readFile reads Wedge_Data.txt, stores into a list of SpacePtCollection objects
firstWedge = convertToDataset(wedges[0]) # Converts the first element in wedges into a WedgeData object
firstWedge.plot(show_lines = True) # Plots the converted first wedge data in a r vs. z plot
The file test_modules.py contains the function wedge_test, which generates the plots found in the LaTeX document. All that is required to generate the plots is running the function from a Python file that has the wedgeData text files in the same directory. The files should be named in the format wedgeData_{VERSION}_128.txt. Below is a list of the arguments of the function.
- lining (str, optional): solving method, default is solveS
- solve_at (int, optional): the z value(s) at which the patches are made
- z0 (num or list, optional): array of z0 values we are testing over, default is the range from -15 to 15 with 0.5 spacing
- n (int, optional): ppl (points per patch per layer), default is 16
- wedges (list, optional): which wedges to run, entered as a list of [starting wedge, ending wedge]; default is [0, 128]
- lines (int, optional): how many lines to test acceptance with at each z0 value
- savefig (bool, optional): set to True to save the figure
- v (str, optional): version of data; ensure the data file is in the directory as "wedgeData_{v}_128.txt"
In order to accommodate the different data structure and avoid confusion, I made a new module for solving with the wedgeData class, called wedgeCover. It operates exactly the same as the cover class. There are currently four solving methods: solveS, solveS_reverse, solveS_center2, and solveQ. The easiest way to solve for a cover and obtain a visualization is to use the solve method from wedgeCover.
The arguments for the solve method are:
- lining (str): solving method
- z0 (num or list): where on the z axis the patches are being solved with respect to. This can be a single number or a list.
- n (int): points per patch per layer
- nlines (int): number of lines used in visualization
- show (bool): set to False to prevent line generation for visualization; this should be done when patches are solved for at multiple z0's.
After solving, the cover can be visualized with the plot
method. An example of how to solve and visualize a cover follows.
env = Environment() #init environment
events = readFile('wedgeData_v3_128.txt', 128) #read file that has wedge data
wedge1 = convertToDataset(events[0]) #convert into wedgeData format
cover = wedgeCover(env, wedge1) #init wedgeCover class
cover.solve('solveS', z0 = 0, show = True) #solve for cover with respect to z0 = 0
cover.plot() #plots cover
A line can be characterized by two things: a starting point and a slope.
line = Line(env, start, slope)
where env is the embedding environment, start is the starting point of the line, and slope is its slope. Ultimately, the line object is described by its points attribute, which stores one point for each of the env.layers layers.
A LineGenerator object simply generates lines within an environment, each having a start value equal to what is inputted.
lg = LineGenerator(env, start)
We can either choose to generate n lines with equal spacing across the entire environment (like a grid over angles) by calling generateGridLines(n), or generate them according to a uniform distribution (over the angle space $\theta = \arctan(m)$) by calling generateRandomLines(n). Both return a Python list of Line objects.
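The distinction between the two generation modes is that both work in the angle space $\theta = \arctan(m)$ rather than in slope space. A minimal self-contained sketch (the function names and the angle range below are illustrative assumptions, not the actual LineGenerator API):

```python
import math
import random

def grid_slopes(n, theta_min=math.radians(10), theta_max=math.radians(170)):
    """n slopes whose angles are equally spaced -- a 'grid over angles'."""
    step = (theta_max - theta_min) / (n - 1)
    return [math.tan(theta_min + i * step) for i in range(n)]

def random_slopes(n, theta_min=math.radians(10), theta_max=math.radians(170)):
    """n slopes drawn uniformly in angle space (not uniform in slope space)."""
    return [math.tan(random.uniform(theta_min, theta_max)) for _ in range(n)]
```

Sampling uniformly in angle rather than in slope avoids over-representing near-vertical lines, since m = tan(theta) diverges as theta approaches 90 degrees.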
We define three things:
- A cover is a finite family of patches.
- A patch is a list of superpoints, with the 1st superpoint element representing the superpoint of the 1st layer.
- A superpoint is a collection of 16 consecutive points in one layer.
For further developers, I would recommend only adding extra attributes and methods to the Patch class. The cover itself should essentially be a collection of patches, no more, and a superpoint is something I've used for my personal algorithm.
A superpoint is characterized by the smallest interval of a layer that contains all 16 points. By abuse of notation, we can mathematically express it either as a list of 16 numbers or as the interval covering them. It has a contains(p) method, which returns a boolean describing whether the point p lies within that interval.
A patch is simply a collection of 5 superpoints. They should all be from different layers, but this check (for computational reasons) has not been implemented. Given superpoints sp1, ..., sp5, we can construct a Patch object by inputting a tuple of superpoint objects and the underlying environment.
# assume the sp1,...,sp5 have been initialized
sp_array = (sp1, sp2, sp3, sp4, sp5)
patch = Patch(env, sp_array)
This constructor will actually check that sp_array contains env.layers superpoints. I figured that this check is important, at least for now, since this is a common error. The contains method is very important, since it allows the user to determine if a patch contains a line.
patch.contains(line) # returns True if the patch contains the line
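Since a line is described by one point per layer, patch containment reduces to asking each layer's superpoint whether it contains the line's point in that layer. A minimal sketch with illustrative names (intervals stand in for superpoints):

```python
def patch_contains(superpoint_intervals, line_points):
    """superpoint_intervals: one (lo, hi) interval per layer;
    line_points: the line's coordinate in each layer, same order."""
    return all(lo <= p <= hi
               for (lo, hi), p in zip(superpoint_intervals, line_points))

# five layers, five intervals
intervals = [(0, 10), (0, 12), (2, 14), (3, 16), (4, 18)]
inside = patch_contains(intervals, [5, 6, 7, 8, 9])    # hits every layer
outside = patch_contains(intervals, [5, 6, 7, 8, 30])  # misses the top layer
```

A single layer missing its interval is enough for the patch to reject the line.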
Remember from before that a line can be characterized by the env.layers points at which it crosses the layers; the patch checks each of those points against the superpoint of the corresponding layer. A Cover object is constructed from an Environment object and a DataSet object, and it constructs an empty cover first, without any patches inside. It has 5 attributes:
cover = Cover(env, data)
cover.n_patches # number of patches = 0
cover.env # the embedding environment object
cover.data # the inputted DataSet object
cover.patches # an empty list of patches
cover.superPoints # a list of lists of superpoints
In the constructor, we do the following for each layer:
- We initialize a list representing the superpoints in that layer.
- A superpoint is constructed by taking the first 16 points and is added to this list.
- We take the next 16 points, starting from the last point of the previous superpoint, to make a second superpoint and add this to the list. Note that there is an overlap of one point between adjacent superpoints.
- We continue this until there are fewer than 16 points left for the final superpoint.
- The final superpoint is constructed by taking the last 16 points and is added to the list.
For example, with n_points = 150 this should create 10 superpoints for each layer, and if we have 5 layers, cover.superPoints should be a list of 5 lists, each containing 10 superpoints.
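The stride-of-15 construction (adjacent superpoints share one point, and the final superpoint is taken from the last 16 points) can be sketched as an index calculation. This is an illustrative sketch, not the repository's code:

```python
def superpoint_starts(n_points, size=16, overlap=1):
    """Start indices of each superpoint within one layer's sorted points."""
    stride = size - overlap  # 15: adjacent superpoints share one point
    starts = []
    s = 0
    while s + size <= n_points:  # full superpoints while 16 points remain
        starts.append(s)
        s += stride
    if starts and starts[-1] != n_points - size:
        starts.append(n_points - size)  # final superpoint: the last 16 points
    return starts

starts = superpoint_starts(150)  # 10 superpoints for the default 150 points
```

With n_points = 150 this gives start indices 0, 15, ..., 120 plus a final superpoint at index 134, i.e. the 10 superpoints per layer mentioned above.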
The solve method of the Cover object is where the main algorithm resides. Now that we have a list of lists of superpoints from initialization, the algorithm proceeds as follows:
- We construct a LineGenerator object and have it generate 100 equally spaced lines in the environment. This list of lines is generated "from left to right," meaning that the line residing in the leftmost portion is the first element of the list, and the rightmost is the last element.
- For the first line, we look at each layer i and find which superpoint in the ith list of cover.superPoints contains it. Afterwards, we have a collection of superpoints, one per layer, which are stored in the patch_ingredients variable.
- We construct a patch from the elements of patch_ingredients, which by construction is guaranteed to contain the line, and add it to cover.patches (incrementing cover.n_patches by 1).
- We move on to the next line and do the same. Note that for each line, we do not need to iterate through all the superpoints. Because we are going from left to right, if one line is contained in a given set of superpoints, the next line must be contained in superpoints that come at or after those. This saves a lot of computational time by design.
- We repeat the above steps for all 100 lines. To detect repeated patches, we unfortunately cannot use hash sets since patches are not hashable objects (implementing this would be very nice). Therefore, for every patch generated for a line, we just compare it with the latest patch stored in cover.patches. This one comparison is sufficient since, again, the patches are generated from left to right.
- After all repetitions, we have our desired cover stored in the cover.patches list.
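The left-to-right optimization and the single-comparison duplicate check can be sketched with a per-layer search pointer that only moves forward. Everything below is an illustrative reimplementation with simplified data (intervals for superpoints, index lists for patches), not the repository's code:

```python
def build_cover(layer_intervals, lines):
    """layer_intervals: per layer, a left-to-right list of (lo, hi) superpoint
    intervals; lines: per line, its coordinate in each layer, sorted left to
    right. Returns patches as lists of superpoint indices, one per layer."""
    n_layers = len(layer_intervals)
    ptr = [0] * n_layers  # per-layer search pointer, never moves backward
    patches = []
    for line in lines:
        ingredients = []
        for i in range(n_layers):
            # advance until this layer's superpoint contains the line's point
            lo, hi = layer_intervals[i][ptr[i]]
            while not (lo <= line[i] <= hi):
                ptr[i] += 1
                lo, hi = layer_intervals[i][ptr[i]]
            ingredients.append(ptr[i])
        # compare only with the latest patch to skip duplicates
        if not patches or patches[-1] != ingredients:
            patches.append(ingredients)
    return patches

# two layers, two superpoints each; three left-to-right lines
intervals = [[(0, 5), (4, 10)], [(0, 5), (4, 10)]]
cover = build_cover(intervals, [[1, 1], [2, 2], [6, 6]])
```

The middle line lands in the same superpoints as the first, so its patch is discarded by the single comparison; the pointers never rescan earlier superpoints.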
The solveS method for the Cover class generates a cover for a given set of data:
- The first patch is constructed from a superpoint of the first 16 points of each layer.
- The S_loop15 method is run, which is the main part of the algorithm. This generates all the other patches.
- The S_loop15 method loops through the rightmost point in each layer of the current patch. It takes each such point and 're-scales' it based on the layer's r-value; the layer whose re-scaled value is smallest is recorded, and its index is stored in min_index.
- Next, each layer is re-scaled to find the index of the point closest in value to that minimum. There are three cases:
  - The closest point is the leftmost point of a given layer, which happens for the first few patches due to the radius offset. In this case, the first 16 points form the superpoint for that layer.
  - There are not enough points left for 16 new points. The superpoint is then the rightmost 16 points in the layer.
  - Neither of the above. The superpoint starts at the closest index to ensure overlap, then picks the next 15 points.
- These superpoints are added to the list patch_ingredients.
- The algorithm checks to see if the new patch equals the last patch added. This happens when Case 2 is true for all five layers; once that happens, the algorithm terminates. Otherwise, S_repeated is run again.
After the algorithm terminates, the patches are stored in cover.patches.
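The three cases for choosing where a layer's superpoint starts can be condensed into a small index function. This is a simplified sketch of the case analysis described above, not the actual S_loop15 logic, and the exact overlap offset in Case 3 is an assumption:

```python
def pick_superpoint_start(closest_index, n_points, size=16):
    """Choose the start index of the next superpoint in one layer."""
    if closest_index <= 0:
        # Case 1: closest point is the layer's leftmost -> first 16 points
        return 0
    if closest_index + size > n_points:
        # Case 2: fewer than 16 points remain -> rightmost 16 points
        return n_points - size
    # Case 3: start at the closest index (overlap assumed), take next 15 points
    return closest_index
```

When Case 2 fires in every layer, the new patch equals the previous one and the loop terminates, as described above.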
Using the same logic, there are several methods that vary on this premise.
- solveS: Starts with the leftmost 16 points and works its way right to generate the cover using S_repeated.
- solveS_reverse: Starts with the rightmost 16 points and works its way left to generate the cover using S_repeated_reverse.
- solveS_center1: Starts with the center 16 points in each layer, then uses S_repeated on the right half of the data and S_repeated_reverse on the left half.
- solveS_center2: Starts with a superpoint containing the 8 points to the left and right of 0 in each layer, then uses S_repeated on the right half of the data and S_repeated_reverse on the left half.
- solveQ: Starts at Q1 and Q3 by z value and runs solveS_center2 starting at those points and ending at the center.
- Set up either a base environment or an Anaconda environment with the appropriate packages (NumPy, Matplotlib, etc.)
- The temp_image_dir directory needs to be added manually
- The wedgeData_v3_128.txt file needs to be found manually and added to ./python/data