See also the github issues page.
that is, don't let it grow without bound, but instead, once it gets to a certain size, rename it and start a new file.
It might be easiest to invoke "multilog" from Dan Bernstein's daemontools; available on Ubuntu 10.04 in the "daemontools" package
... and fix the various lint-like warnings that come up
That's the big-log (which we mine for incubot-witticisms) and the userinfo db. I keep accidentally deleting them, and losing months of accumulated data; they should be on S3 or something.
One attractive idea for the big log: PostgreSQL running in Amazon's RDS. Why? 'cuz
- Amazon is unlikely to lose my data
- PostgreSQL has some sorta full-text search (see http://rachbelaid.com/postgres-full-text-search-is-good-enough/)
Another AWS-type of idea: use the simplest possible backing store, and combine it with Cloudsearch
For example, if someone says to him "you are clever" and he responds "I have a clever idea", there'll be messages like "incubot chose 'clever'" in the log output ... make those messages accessible via HTTP so that people can tell why he said what he said.
defverb #:master (censor objectionable_words) ...
would eliminate rows in the incubot db that contain those words.
Note that such a command would be insanely dangerous; if I accidentally typed "censor the", I'd lose most of the corpus, since most log entries contain the word 'the'.
In any case, deleting the word via a sqlite prompt is falling-down
easy: see README.censor-nasty-words
in this directory.
The original incubot does this. E.g., if we find
":offby1!n=user@pdpc/supporter/monthlybyte/offby1 PRIVMSG #scheme :rudybot: seen eli"
in the log, then we currently reduce that to
"rudybot: seen eli"
but we should really omit the leading nick and colon, yielding
"seen eli"
That's because it looks odd for the bot to utter something that's directed at someone who probably isn't present.
I just saw the bot respond with "\1ACTION does something or other\1", which was weird; it should have either said nothing, or else just "does something or other".
... to initially create parsed-log
, as to add individual utterances
to it.
For example, I'm pretty sure I omit utterances that are addressed to the bot, and begin with a command; i.e., when jordanb says "rudybot: quote", I don't add that, since it'd be dull.
And yet when log-parser creates parsed-log, it blithely adds everything.
Eliminate the "quotes" database, and just have "quote" find some random utternace by jordanb, and echo that. I'm not saving the speaker's name in the db, though; so it might not be possible to tell what came from him.
A super-simple hacky way would be for "quote" to simply turn around and do what "let's" would do.
<jordanb> Offby1 should make a lolbot.
<offby1> already did
<jordanb> Heh
<jordanb> Have it join. ^_^
<jordanb> rudybot: lol
<rudybot> jordanb: lol
<jordanb> rudybot: lolbot [16:19]
<rudybot> jordanb: Offby1 should make a lolbot.
<jordanb> O [16:21]
# offby1 makes a mental note to underweight recent utterances
ERC>
Imagine we've chosen the word "enhance", and we find the the DB these two strings: "Enhance your manhood with Presto Manhood Enlarger™" "I would have preferred to enhance that with soup"
Those strings are about the same length, so the current code will weight them about the same. But there's something to be said for giving the first one more weight, since when you read it, you see the word earlier ... those examples I just gave above don't really demonstrate the value, so let's take a real example:
2013-09-08T18:27:18Z incubot-server:incubot chose "enhance", which appears 117 times
2013-09-08T18:27:20Z => PRIVMSG #emacs :macrobat: how do I enable SLIME autocompletion in my lisp buffer? ideally using the auto-complete extension for a nice display of the completion options. I've installed ac-slime, but that only seems to enhance the auto-completion in my SLIME REPL, not the buffer in which I am editing my file.
That utterance is so long that you could die of boredom before getting to the bit that uses the word "enhance". That is why we should favor utterances where the word appears early.
This hopefully won't require a code change, but it'd be good if it didn't run as me, since if it did, it could conceivably read my Sekrit files.
I am not aware, off the top of my head, of an easy way to do this without a code change :-| Presumably some supervisor-like system could do this for me, but the one I'm currently using -- "upstart" (on Amazon Linux) doesn't let me specify a run-as UID, damn it. Maybe I can just wedge a "su" or "sudo" invocation into the upstart config file ... although I dimly recall that that doesn't work for some obscure reason.
If all else fails, I can perhaps get the program to invoke "setuid" and/or "setgid", although Racket doesn't provide functions for that; I'd have to use the FFI.
... with e.g., twb's quips