This program uses NuPIC to detect anomalies in streams of data. It runs as a Docker container, so to use it you only have to run the container with the proper configuration files (see Usage).
Currently we support the following streams for input:
- Pingdom: Fetch response time data from Pingdom and learn from it.
- Librato: Learn from any AWS EC2 metric (other arbitrary metrics will possibly work too).
- Dynamic: Push in any time-series data as JSON over HTTP.
More detailed information can be found in our Wiki.
We use configuration files to specify which stream to use, to provide credentials for the service APIs we may need to call, to list specific monitor IDs (if we don't want to run models for every monitor), and to control some parameters of the model.
We provide heavily commented templates for Pingdom and Librato in monitor/config_templates/, and a minimal configuration for running Pingdom monitors appears in the concrete example below.
An important point concerns the monitors section: it may be omitted from any configuration file, in which case the service starts a monitor for every available stream of that type. For example, for Pingdom, it starts a monitor for each check found under the given credentials.
If you are using the dynamic HTTP event input, you don't need a configuration file, as the configuration comes along with the data you push in.
With Docker installed, do:
docker run -d -v /HOST/PATH/TO/LOG_DIR:/var/log/docker/monitor -v /HOST/PATH/TO/CONFIG/FILES/:/CONTAINER/PATH/TO/CONFIG/FILES/ -p [PUBLIC_PORT]:5000 cloudwalk/monitor [-t SERVER_TOKEN] /CONTAINER/PATH/TO/CONFIG/FILES/config1.yaml /CONTAINER/PATH/TO/CONFIG/FILES/config2.yaml ...
As we must pass some configuration files to the container, we mount the host directory containing those files inside the container and pass the container's absolute paths to the configuration files as arguments.
We must pass at least one configuration file when starting the container and we can, optionally, pass an argument -t SERVER_TOKEN with a token to be used for access authentication of our API.
The other parameter we must specify is [PUBLIC_PORT], the host port mapped to port 5000 of the Go server.
The logs generated by Redis, Martini and the monitors are saved inside the container in the directory specified by the LOG_DIR environment variable, which defaults to LOG_DIR=/var/log/docker/monitor. That is why we can mount a volume onto it to access the logs from the host machine. An alternative to the -v flag is to mount the volume from another container, using the --volumes-from flag.
Just to be sure everything is clear, suppose you have written the following configuration file in /home/monitor/config/pingdom.yaml:
stream:
  source: pingdom
credentials:
  username: you@example.com                   # Put your e-mail here
  password: abcdefghijklmn                    # Put your password here
  appkey: xyzwxyzwxyzwxyzwxyzwxyzwyzwxyzwx    # Put your appkey here
Now you want to start the service on port 80, with access token goaway, saving log files to /home/monitor/logs.
The following command will do it:
docker run -d -v /home/monitor/logs:/var/log/docker/monitor -v /home/monitor/config:/etc/monitor -p 80:5000 cloudwalk/monitor -t goaway /etc/monitor/pingdom.yaml
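If you launch the container from a script, the same command can be assembled programmatically. The following is only a minimal sketch: the helper name build_docker_command and its parameters are hypothetical, and it uses nothing beyond the flags, image name and paths already shown in the command above.

```python
import shlex

IMAGE = "cloudwalk/monitor"
# Default LOG_DIR inside the container, as described in the usage notes.
CONTAINER_LOG_DIR = "/var/log/docker/monitor"


def build_docker_command(host_log_dir, host_config_dir, config_names,
                         public_port, token=None,
                         container_config_dir="/etc/monitor"):
    """Return the docker run argument list for the monitor container.

    Hypothetical helper: it only mirrors the command line from the README.
    """
    if not config_names:
        raise ValueError("at least one configuration file is required")
    cmd = [
        "docker", "run", "-d",
        "-v", f"{host_log_dir}:{CONTAINER_LOG_DIR}",
        "-v", f"{host_config_dir}:{container_config_dir}",
        "-p", f"{public_port}:5000",
        IMAGE,
    ]
    if token is not None:
        cmd += ["-t", token]  # optional access token for the API
    # Configuration files are passed by their path inside the container.
    cmd += [f"{container_config_dir}/{name}" for name in config_names]
    return cmd


if __name__ == "__main__":
    cmd = build_docker_command("/home/monitor/logs", "/home/monitor/config",
                               ["pingdom.yaml"], 80, token="goaway")
    print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) to actually start the container, assuming Docker is installed on the host.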
Now you can open your browser and access localhost. You may need to wait a little and refresh the page while the service gets the first data from the stream source.
If you really want to see some action while you wait, you can keep an eye on the logs:
tail -f /home/monitor/logs/run_monitor.log
If you start the container with the extra parameters -p 80:5000 -p 8080:8080 -e DYNAMIC=true, the service will listen on port 8080 for input data. It will not use any stream source and will work only with data supplied to it dynamically through its web server running on port 8080.
When running in DYNAMIC mode, the service creates a monitor instance as needed, whenever a new monitor ID comes in (this is done via HTTP requests). This allows you to pump in data from any event source at any pace. As this all runs in a single process, it is slightly more memory efficient.
If you run the command make rebuild, you will have an input endpoint listening on port 8080. The service should start on localhost, unless your Docker setup uses another host.
To push in data:
curl --data '{"check_id": "check_id_here", "time":1, "value":42}' http://localhost:8080
curl --data '{"check_id": "check_id_here", "time":2, "value":41}' http://localhost:8080
curl --data '{"check_id": "check_id_here", "time":3, "value":42}' http://localhost:8080
The first time the service sees a given check_id, a monitor instance is created. The time/value pairs form the time series used as input. The response is a JSON object saying whether the monitor is currently CRITICAL or OK.
You can of course access the client in the browser (on port 5000) to find out more information, including plots of the data and its anomaly score.
Note that there is no prediction or anomaly score for the first point; scores are computed from the second point onward. So after you run the previous three commands, you should see a plot with just two points.
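The curl calls above can also be issued from a program. The sketch below is an illustration only: encode_point and push_point are hypothetical helper names, and because the field names of the JSON response are not documented here, the response is returned as parsed rather than interpreted.

```python
import json
import urllib.request


def encode_point(check_id, time, value):
    """Serialize one data point with the same keys as the curl examples."""
    point = {"check_id": check_id, "time": time, "value": value}
    return json.dumps(point).encode("utf-8")


def push_point(url, check_id, time, value, timeout=10):
    """POST one point to the input endpoint and return the parsed JSON reply.

    Passing data= makes urllib send a POST, as curl --data does.
    The structure of the reply is an assumption left to the caller to inspect.
    """
    request = urllib.request.Request(url, data=encode_point(check_id, time, value))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, calling push_point("http://localhost:8080", "check_id_here", 1, 42) sends the first of the three points above, assuming the service is reachable at that address.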
If you want to pass in non-default options (e.g. resolution), add a config map:
"config": {"name": "yeah"}
to the data you are inputting. The parameters resolution, webhook, anomaly_threshold and likelihood_threshold are the most relevant ones. The defaults are generally fine. The unit, label and name parameters are used for display purposes.
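As a rough sketch of how the config map fits into the pushed data, the snippet below builds the JSON body with an optional config entry. The function name build_payload is hypothetical, the config map is assumed to sit at the top level next to check_id, time and value, and the accepted value types for each option are not specified here.

```python
import json


def build_payload(check_id, time, value, **options):
    """Build the JSON body for one data point.

    Keyword arguments (e.g. name, unit, label, resolution) are placed in the
    "config" map; when none are given the map is omitted so defaults apply.
    """
    payload = {"check_id": check_id, "time": time, "value": value}
    if options:
        payload["config"] = dict(options)
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_payload("check_id_here", 1, 42, name="yeah"))
    # prints {"check_id": "check_id_here", "time": 1, "value": 42, "config": {"name": "yeah"}}
```

The resulting string can be sent as the request body in the same way as the curl examples.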
The examples/ directory contains some helper scripts for trying out this feature.
When posting anomalies to Slack, we make use of the icon Analytics chart on a monitor screen made by Freepik from www.flaticon.com licensed under CC BY 3.0.
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Added some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request
OMG Monitor
Copyright (C) 2014 CloudWalk Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.