Commit e190913

Pushing Configs
0 parents  commit e190913

32 files changed, with 2590 additions and 0 deletions

.gitignore

+131
@@ -0,0 +1,131 @@
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
.idea/

README.md

+202
@@ -0,0 +1,202 @@
# Strelka Documentation

Welcome to the official documentation for Strelka, an advanced tool for automated malware analysis. This documentation aims to provide comprehensive insights into the functionality and usage of Strelka, facilitating ease of use and development.

## Table of Contents
- [Overview](#overview)
- [How Docs Work](#how-docs-work)
- [Running Docs Locally](#running-docs-locally)
- [Automated Pipeline](#automated-pipeline)
- [Documentation Format](#documentation-format)
- [Scanners](#scanners)
- [Scanner Class](#scanner-class)
- [Scanner Functions](#scanner-functions)
- [Features and Fields](#features-and-fields)
- [Backend Configuration](#backend-configuration)

## Overview

Strelka is designed for detailed malware analysis, providing robust scanning capabilities across various file types.
The project's documentation is automatically generated and updated through GitHub Actions to reflect the latest changes in the `strelka` repository.

## How Docs Work

Documentation for Strelka is automatically generated to ensure up-to-date information. Key sections include:

- **Strelka Scanners**: Discusses the core analysis components.

## Running Docs Locally

To set up and view the documentation locally, follow these steps:

1. **Install Poetry**

   Download and install Poetry, a tool for managing Python package dependencies.

   ```bash
   curl -sSL https://install.python-poetry.org | python3 -
   ```

2. **Clone the Strelka Repository**

   Obtain the latest version of the `strelka` code from its repository.

   ```bash
   git clone https://github.com/target/strelka
   ```

3. **Install Dependencies**

   Use Poetry to install the dependencies needed to build and serve the documentation locally.

   ```bash
   poetry install
   ```

4. **(Optional) Replace Scanners**

   If you need to develop or test documentation for specific scanners, modify the scanner in the `strelka/scanner` folder.

5. **Build the Documentation**

   Generate the latest version of the documentation by running the build script. This creates new `.md` files from the scanner code.

   ```bash
   python ./build_docs.py
   ```

6. **Start the Local MkDocs Server**

   Use Poetry to run the MkDocs server and view the documentation locally.

   ```bash
   poetry run mkdocs serve
   ```

7. **Access the Documentation**

   Open your web browser and go to `http://127.0.0.1:8000/target/strelka/` to view the local documentation.
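
Before working through the steps above, it can help to confirm that the command-line tools they rely on are installed. The following is a minimal sketch and is not part of the Strelka tooling: `REQUIRED_TOOLS` and `missing_tools` are hypothetical names, and the tool list is an assumption based only on the commands shown in this README.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumption: these are the external commands the local setup steps invoke.
REQUIRED_TOOLS = ("curl", "git", "poetry")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default the function uses `shutil.which` from the standard library.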

## Automated Pipeline

`strelka-docs` builds and publishes new documents to the `gh-pages` branch. This branch is hosted on [GitHub Pages](https://target.github.io/strelka-docs/).

### Strelka Documentation Update Process

1. **Pull Request** (`strelka` repo)
   - A user submits a PR, which is then reviewed for integration.

2. **Merge** (`strelka` repo)
   - The PR is approved and merged into the main branch.

3. **Build Trigger** (`strelka` repo)
   - The merge triggers the Vela pipeline, which builds Strelka and commits to the `strelka-docs` repo.

4. **Doc Build** (`strelka-docs` repo)
   - The `strelka-docs` pipeline generates documentation from the latest contents of the `strelka` repository.

5. **Publish** (`strelka-docs` repo)
   - Newly generated documentation is published and made available to users.

## Documentation Format

### Scanners

#### Scanner Class

Scanner classes are documented following the Google docstring guidelines, with these sections:

- **Description**: A concise overview of the scanner's purpose and functionality.
  - Includes **Scanner Type**: Collection or Malware
- **Attributes**: Details about the scanner's attributes that define its behavior. Usually found outside functions or inside `__init__`. (Can be None)
- **Other Parameters**: Details about the scanner's options. These are usually defined at the top of the `scan` method or in `backend.yml`.
- **Detection Use Cases**: Examples of potential use cases for the scanner, highlighting its detection capabilities.
- **Known Limitations**: Acknowledgment of any limitations or areas for improvement in the scanner's functionality. (Can be None)
- **Todo**: List of potential improvements and future implementations. (Can be None)
- **References**: List of references used to develop or describe the scanner. (Can be None)
- **Contributors**: List of users who have assisted in the development of the scanner.
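
The section list above lends itself to a simple automated check. The following is a minimal sketch, not part of the documented build: `REQUIRED_SECTIONS` and `missing_sections` are hypothetical names, and the sketch assumes every section must appear as a line in the docstring (even when its content is `None`) and that "To Do" is an accepted spelling of "Todo".

```python
import re

# Section names taken from the list above, mapped to the spellings accepted
# by this sketch. Accepting "To Do" for "Todo" is an assumption.
REQUIRED_SECTIONS = {
    "Scanner Type": ("Scanner Type",),
    "Attributes": ("Attributes",),
    "Other Parameters": ("Other Parameters",),
    "Detection Use Cases": ("Detection Use Cases",),
    "Known Limitations": ("Known Limitations",),
    "Todo": ("Todo", "To Do"),
    "References": ("References",),
    "Contributors": ("Contributors",),
}


def missing_sections(docstring: str) -> list[str]:
    """Return the required section names that do not start any docstring line."""
    # Normalize each line: drop indentation and any leading Markdown "#" marks.
    lines = [re.sub(r"^#+\s*", "", line.strip()) for line in docstring.splitlines()]
    missing = []
    for name, aliases in REQUIRED_SECTIONS.items():
        if not any(line.startswith(alias) for line in lines for alias in aliases):
            missing.append(name)
    return missing
```

A check like this could run before the docs are built, so that a scanner with an incomplete docstring is flagged early rather than producing a page with missing sections.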

##### Example of a Class-based Docstring

```
class ScanEmail(strelka.Scanner):
    """
    Extracts and analyzes metadata, attachments, and optionally generates thumbnails from email messages.

    This scanner processes email files to extract and analyze metadata, attachments, and optionally generates
    thumbnail images of the email content for a visual overview. It supports both plain text and HTML emails,
    including inline images.

    Scanner Type: Collection

    ## Options

    Attributes:
        None

    Other Parameters:
        create_thumbnail (bool): Indicates whether a thumbnail should be generated for the email content.
        thumbnail_header (bool): Indicates whether email header information should be included in the thumbnail.
        thumbnail_size (int): Specifies the dimensions for the generated thumbnail images.

    ## Detection Use Cases
    !!! info "Detection Use Cases"
        - **Document Extraction**
            - Extracts and analyzes documents, including attachments, from email messages for content review.
        - **Thumbnail Generation**
            - Optionally generates thumbnail images of email content for visual analysis, which can be useful for
              quickly identifying the content of emails.
        - **Email Header Analysis**
            - Analyzes email headers for potential indicators of malicious activity, such as suspicious sender addresses
              or subject lines.

    ## Known Limitations
    !!! warning "Known Limitations"
        - **Email Encoding and Complex Structures**
            - Limited support for certain email encodings or complex email structures.
        - **Thumbnail Accuracy**
            - Thumbnail generation may not accurately represent the email content in all cases,
              especially for emails with complex layouts or embedded content.
        - **Limited Output**
            - Content is limited to a set amount of characters to prevent excessive output.

    ## To Do
    !!! question "To Do"
        - **Improve Error Handling**:
            - Enhance error handling for edge cases and complex email structures.
        - **Enhance Support for Additional Email Encodings and Content Types**:
            - Expand support for various email encodings and content types to improve scanning accuracy.

    ## References
    !!! quote "References"
        - [Python Email Parsing Documentation](https://docs.python.org/3/library/email.html)
        - [WeasyPrint Documentation](https://doc.courtbouillon.org/weasyprint/stable/)
        - [PyMuPDF (fitz) Documentation](https://pymupdf.readthedocs.io/en/latest/)

    ## Contributors
    !!! example "Contributors"
        - [Josh Liburdi](https://github.com/jshlbrd)
        - [Paul Hutelmyer](https://github.com/phutelmyer)
        - [Ryan O'Horo](https://github.com/ryanohoro)

    """
```
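
To illustrate how a class docstring like the one above could be turned into a documentation page, here is a minimal sketch using the standard-library `ast` module. It is an assumption about one possible approach, not a description of what `build_docs.py` does; `class_docstrings` and `to_markdown` are hypothetical names.

```python
import ast


def class_docstrings(source: str) -> dict[str, str]:
    """Map each top-level class name in `source` to its cleaned docstring."""
    # ast.parse only parses the text, so base classes such as strelka.Scanner
    # do not need to be importable.
    tree = ast.parse(source)
    docs = {}
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            doc = ast.get_docstring(node)  # None when the class has no docstring
            if doc is not None:
                docs[node.name] = doc
    return docs


def to_markdown(name: str, docstring: str) -> str:
    """Render one page: a title line followed by the docstring body."""
    return f"# {name}\n\n{docstring}\n"
```

Because `ast.get_docstring` dedents the text and trims surrounding blank lines, Markdown headings and admonitions written inside the docstring start at column zero in the generated page.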

#### Scanner Functions

Function docstrings outline each function's purpose, arguments, and return values.

##### Example of a Function-based Docstring

```
"""
Performs the scan operation on batch file data, extracting and categorizing different types of tokens.

Args:
    data (bytes): The batch file data as a byte string.
    file (strelka.File): The file object to be scanned.
    options (dict): Options for customizing the scan. These options can dictate specific behaviors
        like which tokens to prioritize or ignore.
    expire_at (datetime): Expiration timestamp for the scan result. This is used to determine when
        the scan result should be considered stale or outdated.
"""
```
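
The `Args` layout shown above is regular enough to parse mechanically. The sketch below is a minimal illustration under stated assumptions: each argument is written as `name (type): description`, wrapped description lines follow directly, and a blank line ends the section. `ARG_LINE` and `parse_args_section` are hypothetical names and are not part of Strelka.

```python
import re

# Matches lines such as "data (bytes): The batch file data as a byte string."
ARG_LINE = re.compile(r"^(?P<name>\w+) \((?P<type>[^)]+)\): (?P<desc>.+)$")


def parse_args_section(docstring: str) -> dict[str, tuple[str, str]]:
    """Return {argument name: (type, description)} from a Google-style Args section."""
    args: dict[str, tuple[str, str]] = {}
    in_args = False
    current = None
    for raw in docstring.splitlines():
        line = raw.strip()
        if line == "Args:":
            in_args = True
            continue
        if not in_args:
            continue
        if not line:
            break  # a blank line ends the section in this sketch
        match = ARG_LINE.match(line)
        if match:
            current = match["name"]
            args[current] = (match["type"], match["desc"])
        elif current is not None:
            # Continuation line: append it to the previous argument's description.
            arg_type, desc = args[current]
            args[current] = (arg_type, f"{desc} {line}")
    return args
```

Returning a plain dictionary keeps the sketch small; a docs generator could render the result as a table of arguments, types, and descriptions.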

__init__.py

Whitespace-only changes.
