Nelson Brown | 11 May 2012 15:23

[Trac-dev] line numbers for code blocks and the BlogSpam module

Hi,
 
  What procedures do you guys have in place to un-spam yourself?  :)
 
  I was working on some analysis for a bitesized ticket, because I wanted to contribute to my first open source project.  But every time I tried to submit something it was caught in the spam filter.  I was only given one shot at the captchas, and I couldn't even tell if I put them in correctly or not...  And then my IP address was cached, so what should I do?
 
  Specifically, I was working on http://trac.edgewall.org/ticket/3275
 
  I tried to also add the following image as an attachment : http://i.imgur.com/lNBoB.jpg
 
  The text of my comment is below, and I kept getting the error.

Submission rejected as potential spam (StopForumSpam says this is spam (ip), BlogSpam says content is spam (Cached from stopforumspam.com))

 
===================================
 
 
So this is my first crack at contributing to an open source project.  I basically read the recommendation that you bite into a bitesized ticket and this was the first one I saw.  So what follows is some analysis, and I'd like to get a recommendation from anyone interested.
 
The current mechanism for parsing arguments to WikiProcessors is the parse_processor_args function in the trac.wiki.parser module.  The parsed arguments are currently stored in {{{self.args}}} as a dictionary of key-value pairs, written in the form {{{processorarg=processorargvalue}}}.  So utilizing the existing syntax for passing processor arguments, an example code block with annotated line numbers might look like this:
 
!{{{#!python annotations=lineno
import something
something.somethingelse()
}}}
 
This currently renders (with the arguments ignored) as:
 
{{{#!python annotations=lineno
import something
something.somethingelse()
}}}
 
But with the following change to source:trunk/trac/wiki/formatter.py@10919#L345
 
{{{
#!diff
--- formatter.py (revision 10865)
+++ formatter.py (working copy)
@@ -321,8 +321,19 @@
                                                     text)
 
     def _mimeview_processor(self, text):
-        return Mimeview(self.env).render(self.formatter.context,
-                                         self.name, text)
+        annotations = self.args['annotations'] \
+        if 'annotations' in self.args and \
+        self.args['annotations'] in ['lineno'] \
+        else None
+       
+       
+        if annotations:
+            return Mimeview(self.env).render(self.formatter.context,
+                                             self.name, text,
+                                             annotations=[annotations])
+        else:
+            return Mimeview(self.env).render(self.formatter.context,
+                                             self.name, text)
     # TODO: use convert('text/html') instead of render
 
     def process(self, text, in_paragraph=False):
}}}
 
It would render as follows (attached image):
(The spam bot prevented me from uploading the file, so here is a link on imgur)
 
[http://i.imgur.com/lNBoB.jpg An example of what code blocks with annotated line numbers can look like in a wiki block with some other content around it for context.]
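
The argument handling in the patch can also be pictured in isolation. The following is a minimal standalone sketch, not Trac code: parse_args is a hypothetical stand-in for parse_processor_args that only handles simple unquoted key=value pairs, and select_annotations is an invented helper name that mirrors the whitelist check from the diff.

```python
# Standalone sketch; neither function below is part of Trac.
ALLOWED_ANNOTATIONS = ('lineno',)  # same whitelist as in the diff above


def parse_args(header):
    """Turn a string such as 'annotations=lineno foo=bar' into a dict.

    Hypothetical simplification: whitespace-separated, unquoted pairs only.
    """
    args = {}
    for token in header.split():
        key, sep, value = token.partition('=')
        if sep and key:
            args[key] = value
    return args


def select_annotations(args):
    """Return a one-element list for a whitelisted annotation, else None."""
    value = (args or {}).get('annotations')
    if value in ALLOWED_ANNOTATIONS:
        return [value]
    return None


if __name__ == '__main__':
    print(select_annotations(parse_args('annotations=lineno')))
    print(select_annotations(parse_args('annotations=blame')))
    print(select_annotations(None))
```

With a helper of this shape, the processor would need only a single render call, passing the helper's result as the annotations keyword. This assumes that passing annotations=None behaves the same as omitting the argument, which would need to be checked against Mimeview.render.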
 
So the rendered result is nice, but there are some issues with it.  The anchors need to be provided with more context.  The line number annotator was designed for the display of single files, but here the anchor needs to encode not just the URL, but also the comment # / description # / code block # / etc.  The link to the eighth line in the fourth code block on the fifth comment might look something like this: #comment:5:codeblock:4:L8.  Or something shorter? #c5:cb4:L8
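
To make the shorter anchor form concrete, here is a small sketch of building and parsing it. The c<comment>:cb<block>:L<line> scheme is only the suggestion from the paragraph above, and make_anchor / parse_anchor are hypothetical names, not existing Trac functions.

```python
import re

# Hypothetical anchor scheme: c<comment>:cb<code block>:L<line>
ANCHOR_RE = re.compile(r'^c(\d+):cb(\d+):L(\d+)$')


def make_anchor(comment, codeblock, line):
    """Build the fragment identifier (without the leading '#')."""
    return 'c%d:cb%d:L%d' % (comment, codeblock, line)


def parse_anchor(anchor):
    """Return (comment, codeblock, line) as ints, or None if malformed."""
    match = ANCHOR_RE.match(anchor.lstrip('#'))
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


if __name__ == '__main__':
    print(make_anchor(5, 4, 8))
    print(parse_anchor('#c5:cb4:L8'))
    print(parse_anchor('#L8'))
```

The plain #L8 form used for single files is deliberately rejected by this parser, which shows where the annotator would need the extra context (comment and code block number) in order to generate a unique anchor.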
 
Supporting such anchors would mean analyzing the way arguments are passed around to annotator objects, and how to provide that contextual information.  That would increase the risk associated with the change, as it would be broader.  Also, with possible (?) changes to the WikiParser, I don't know if that is wise.  In my (limited) view, though, this is much less a parsing issue than a message passing issue.
 
I'd be happy to start researching passing that contextual information, and providing a patch with included tests.  Any feedback on that?

--
You received this message because you are subscribed to the Google Groups "Trac Development" group.
To post to this group, send email to trac-dev <at> googlegroups.com.
To unsubscribe from this group, send email to trac-dev+unsubscribe <at> googlegroups.com.
For more options, visit this group at http://groups.google.com/group/trac-dev?hl=en.
Remy Blank | 11 May 2012 23:16

Re: [Trac-dev] line numbers for code blocks and the BlogSpam module

Nelson Brown wrote:
>   I was working on some analysis for a bitesized ticket, because I
> wanted to contribute to my first open source project.  But every time I
> tried to submit something it was caught in the spam filter.  I was only
> given one shot at the captchas, and I couldn't even tell if I put them
> in correctly or not...  And then my IP address was cached, so what
> should I do?

For some reason both StopForumSpam and BlogSpam thought your message was
spam. Your captcha verified correctly, but it wasn't enough to counter
both filters (you had a total score of 0).

You should set your name and email address in the preferences of the
site, that will give you some more positive karma, and end up at 6 (the
threshold is 3).
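
As a rough illustration of the arithmetic: the totals (0 before, 6 after) and the threshold (3) come from the numbers above, while the per-filter point values and the "accept when the total reaches the threshold" rule are assumptions made up for this sketch, not the spam filter's real configuration.

```python
MIN_KARMA = 3  # threshold mentioned above


def total_karma(contributions):
    """Sum the karma points of all (name, points) contributions."""
    return sum(points for _name, points in contributions)


def is_accepted(contributions, min_karma=MIN_KARMA):
    """Assumed rule: accept when the total reaches the threshold."""
    return total_karma(contributions) >= min_karma


# Hypothetical per-filter values, chosen only so the totals match 0 and 6.
before = [('StopForumSpam', -5), ('BlogSpam', -5), ('captcha', 10)]
after = before + [('name and email set', 6)]

if __name__ == '__main__':
    print(total_karma(before), is_accepted(before))
    print(total_karma(after), is_accepted(after))
```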

>   I tried to also add the following image as an attachment :
> http://i.imgur.com/lNBoB.jpg

I'm not sure about that, I don't see a trace of the upload in the spam
monitoring. Dirk, any insights?

-- Remy

Remy Blank | 11 May 2012 23:17

Re: [Trac-dev] line numbers for code blocks and the BlogSpam module

Remy Blank wrote:
> You should set your name and email address in the preferences of the
> site, that will give you some more positive karma, and end up at 6 (the
> threshold is 3).

Oh, and if it still doesn't work, I'll post on your behalf. But that's
only a short-term solution.

-- Remy

Nelson Brown | 12 May 2012 02:43

Re: [Trac-dev] line numbers for code blocks and the BlogSpam module

Thank you, I'll try your suggestion about the preferences when I get home.  :)

-Nelson
On May 11, 2012 5:20 PM, "Remy Blank" <remy.blank <at> pobox.com> wrote:
>
> Remy Blank wrote:
> > You should set your name and email address in the preferences of the
> > site, that will give you some more positive karma, and end up at 6 (the
> > threshold is 3).
>
> Oh, and if it still doesn't work, I'll post on your behalf. But that's
> only a short-term solution.
>
> -- Remy
>

Dirk Stöcker | 12 May 2012 13:52

[Trac-dev] Re: line numbers for code blocks and the BlogSpam module

On Fri, 11 May 2012, Remy Blank wrote:

>>   I tried to also add the following image as an attachment :
>> http://i.imgur.com/lNBoB.jpg
>
> I'm not sure about that, I don't see a trace of the upload in the spam
> monitoring. Dirk, any insights?

As announced, I have not been taking care of TEO-Spam-Monitor for some time now. The
admin pages are unusable due to missing CSS file delivery, and the relevant
ticket gets ignored.

Sorry, but if I am to take care of it again, this issue has to be fixed first. Without
care, the detection ratio will slowly decrease over time.

Ciao
-- 
http://www.dstoecker.eu/ (PGP key available)


Remy Blank | 12 May 2012 14:19

Re: [Trac-dev] Re: line numbers for code blocks and the BlogSpam module

Dirk Stöcker wrote:
> As announced, I have not been taking care of TEO-Spam-Monitor for some time now. The
> admin pages are unusable due to missing CSS file delivery, and the relevant
> ticket gets ignored.
>
> Sorry, but if I am to take care of it again, this issue has to be fixed first. Without
> care, the detection ratio will slowly decrease over time.

Did you look recently? The admin pages have been fine for some time now.
But ok, I understand if you don't want to take care of it anymore. It
would be nice if you could at least teach someone how to manage the
filter, so that we can keep a good detection ratio.

-- Remy

Dirk Stöcker | 12 May 2012 18:25

[Trac-dev] Re: line numbers for code blocks and the BlogSpam module

On Sat, 12 May 2012, Remy Blank wrote:

> Did you look recently? The admin pages have been fine for some time now.

Don't know when I looked last. Yes. Works again.

Though the spambayes filter is missing now, and it is the most
important one. The DNS-based checks are missing as well. Maybe spambayes and
dns-python are not installed?

The additional check options I added seem to allow rather good
performance even under bad conditions. Good to know :-)

> But ok, I understand if you don't want to take care of it anymore. It
> would be nice if you could at least teach someone how to manage the
> filter, so that we can keep a good detection ratio.

Now that it is usable again, I'll have another look. Everything I know
is written down on the DOC page of the plugin. Basically, it is important
to keep the training up to date: first remove any >90% (negative) and any 0%
(positive) entries, as they don't help. The remaining ones MAY be trained as
ham/spam. Any wrong detection SHOULD be trained.

Ciao
-- 
http://www.dstoecker.eu/ (PGP key available)


Remy Blank | 12 May 2012 23:24

Re: [Trac-dev] Re: line numbers for code blocks and the BlogSpam module

Dirk Stöcker wrote:
> Though the spambayes filter is missing now, and it is the most
> important one. The DNS-based checks are missing as well. Maybe spambayes and
> dns-python are not installed?

Yes indeed, they were both missing after the server upgrade. I have
installed them, and they appeared in the filter settings. Thanks for the
heads-up!

-- Remy

Nelson Brown | 12 May 2012 15:50

Re: [Trac-dev] line numbers for code blocks and the BlogSpam module

Thank you very much, your suggestion about setting the preferences worked.  It took right away, and both the attachment and the comment were posted.  :)


-Nelson

On Fri, May 11, 2012 at 5:16 PM, Remy Blank <remy.blank <at> pobox.com> wrote:
Nelson Brown wrote:
>   I was working on some analysis for a bitesized ticket, because I
> wanted to contribute to my first open source project.  But every time I
> tried to submit something it was caught in the spam filter.  I was only
> given one shot at the captchas, and I couldn't even tell if I put them
> in correctly or not...  And then my IP address was cached, so what
> should I do?

For some reason both StopForumSpam and BlogSpam thought your message was
spam. Your captcha verified correctly, but it wasn't enough to counter
both filters (you had a total score of 0).

You should set your name and email address in the preferences of the
site, that will give you some more positive karma, and end up at 6 (the
threshold is 3).

>   I tried to also add the following image as an attachment :
> http://i.imgur.com/lNBoB.jpg

I'm not sure about that, I don't see a trace of the upload in the spam
monitoring. Dirk, any insights?

-- Remy


