Skip to content

Conversation

@MihirT906
Copy link

@MihirT906 MihirT906 commented Feb 16, 2023

Resolve #355

@CLAassistant
Copy link

CLAassistant commented Feb 16, 2023

CLA assistant check
All committers have signed the CLA.

Copy link
Collaborator

@sarahmish sarahmish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR @MihirT906!
Couple of small things need to be fixed (mainly docstrings)

Please remove the changes of the following files:

  • 2' (I think it was added by mistake)
  • Makefile
  • tests/readme_test/README.md
  • tests/readme_test/README_evaluate.md

Comment on lines 12 to 23
"""
The function checks if the dataset being used is valid i.e has a length greater than 0 and contains the dimension required
Args:
X (ndarray):
N-dimensional value sequence to iterate over
dim (int):
Integer indicating the dimension number for a multi-dimensional dataset
Returns:
ndarray:
Returns an nd array that contains a dataset with 2 columns ['timestamp', 'value']

"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of things to keep in mind when writing docstrings. The most important thing is to be consistent with the existing docstrings which should look something like this

"""Short description in a single line ending with a dot.

Longer description that can span across multiple lines. Longer
description that can span across multiple lines. Longer description
that can span across multiple lines.

Args:
    arg_name (arg_type):
        argument description.
    arg_name (arg_type):
        Argument description that spans across multiple lines. Argument
        description that spans across multiple lines. Argument description
        that spans across multiple lines.

Returns:
    return_type:
        description of the returned objects

Based on the current code, what needs to be fixed is:

  • the first sentence is always short
  • if the line is longer than 100 characters, break it over multiple lines
  • add space between each section (Args:, and Returns:)
  • lastly remove unnecessary white space (line 22)

Applying all these changes would look something like:

"""Validate data dimensions.

The function checks if the dataset being used is valid i.e has a length 
greater than 0 and contains the dimension required.

Args:
    X (ndarray): 
        N-dimensional value sequence to iterate over.
    dim (int):
        Integer indicating the dimension number for a multi-dimensional dataset.

Returns:
    ndarray:
        An array that contains a dataset with 2 columns ['timestamp', 'value'].
"""

This applies to all the docstrings


def diff_thres(X, thres = "0.1", op = ">"):
"""
The function detects anomalies that are flagged through moving standard deviation thresholding
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the description here is inaccurate


def thresholding(X, thres, op):
"""
The function detects anomalies that are flagged through moving standard deviation thresholding
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here, I think it needs to be updated

@@ -0,0 +1,31 @@
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed build_anomaly_intervals is not being used. Is there a reason for adding it to the PR? (similar comment applies to the function, if it is not part of the pipelines, we can remove it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add SOL Pipelines

3 participants