11 May 2012 15:23
[Trac-dev] line numbers for code blocks and the BlogSpam module
Nelson Brown <brownnrl <at> gmail.com>
2012-05-11 13:23:44 GMT
Hi,
What procedures do you guys have in place to un-spam yourself? :)
I was working on some analysis for a bitesized ticket, because I wanted to contribute to my first open source project. But every time I tried to submit something it was caught in the spam filter. I was only given one shot at the captchas, and I couldn't even tell if I put them in correctly or not... And then my IP address was cached, so what should I do?
Specifically, I was working on http://trac.edgewall.org/ticket/3275
I tried to also add the following image as an attachment : http://i.imgur.com/lNBoB.jpg
The text of my comment is below, and I kept getting the error.
===================================
So this is my first crack at contributing to an open source project. I basically read the recommendation that you bite into a bitesized ticket and this was the first one I saw. So what follows is some analysis, and I'd like to get a recommendation from anyone interested.
The current mechanism for parsing arguments to WikiProcessors is the parse_processor_args function in the trac.wiki.parser module. The parsed arguments are currently stored in {{{self.args}}} as a dictionary of key/value pairs of the form {{{processorarg=processorargvalue}}}. So, using the existing syntax for passing processor arguments, an example code block with annotated line numbers might look like this:
!{{{#!python annotations=lineno
import something
import something
something.somethingelse()
}}}
}}}
This currently renders (with the arguments ignored) as:
{{{#!python annotations=lineno
import something
import something
something.somethingelse()
}}}
}}}
But with the following change to source:trunk/trac/wiki/formatter.py@10919#L345
{{{
#!diff
--- formatter.py	(revision 10865)
+++ formatter.py	(working copy)
@@ -321,8 +321,19 @@
                                                 text)
 
     def _mimeview_processor(self, text):
-        return Mimeview(self.env).render(self.formatter.context,
-                                         self.name, text)
+        annotations = self.args['annotations'] \
+                        if 'annotations' in self.args and \
+                        self.args['annotations'] in ['lineno'] \
+                        else None
+
+
+        if annotations:
+            return Mimeview(self.env).render(self.formatter.context,
+                                             self.name, text,
+                                             annotations=[annotations])
+        else:
+            return Mimeview(self.env).render(self.formatter.context,
+                                             self.name, text)
     # TODO: use convert('text/html') instead of render
 
     def process(self, text, in_paragraph=False):
}}}
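For anyone who wants to poke at the whitelist check in the patch above without a Trac checkout, here is a minimal standalone sketch of the same logic. The names select_annotations and ALLOWED_ANNOTATIONS are made up for illustration and are not part of Trac; the args parameter stands in for the processor's self.args dictionary. It also tolerates args being None, on the assumption (worth checking against the source) that a processor can be created without arguments.

```python
# Hypothetical standalone helper, not part of Trac itself.
# Only annotation names listed here are passed on to the renderer.
ALLOWED_ANNOTATIONS = ('lineno',)


def select_annotations(args):
    """Return a list of annotation names to pass to the renderer, or None.

    `args` stands in for the processor's self.args dictionary and may be
    None or empty when the code block has no arguments.
    """
    value = (args or {}).get('annotations')
    if value in ALLOWED_ANNOTATIONS:
        return [value]
    return None


print(select_annotations({'annotations': 'lineno'}))
print(select_annotations({'annotations': 'bogus'}))
print(select_annotations(None))
```

Returning None for anything outside the whitelist keeps the existing rendering path untouched for code blocks that do not ask for line numbers.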
It would render as follows (attached image):
(The spam bot prevented me from uploading the file, so here is a link on imgur)
[http://i.imgur.com/lNBoB.jpg An example of how code blocks with annotated line numbers can look in a wiki block, with some other content around it for context.]
So this is nice, but there are some issues with it. The anchors need to be provided with more context. The line number annotator was designed for the display of single files, but here the anchor needs to know not just the URL, but also the comment # / description # / code block # / etc. The link to the eighth line in the fourth code block on the fifth comment might look something like this: #comment:5:codeblock:4:L8. Or something shorter? #c5:cb4:L8
This would mean analyzing the way arguments are passed around to annotator objects, and how to provide that contextual information. That would make the change broader and therefore riskier. Also, with possible (?) changes to the WikiParser, I don't know if that is wise. In my (limited) view, though, this is much less a parsing issue and more of a message passing issue.
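To make the anchor idea a bit more concrete, here is a rough sketch of building and parsing the short form suggested above (c5:cb4:L8). The function names make_anchor and parse_anchor and the exact format are only assumptions for the sake of discussion; nothing like this exists in Trac today.

```python
import re

# Hypothetical anchor format: c<comment>:cb<code block>:L<line>
_ANCHOR_RE = re.compile(r'^c(\d+):cb(\d+):L(\d+)$')


def make_anchor(comment, codeblock, line):
    """Build an anchor such as 'c5:cb4:L8' from its three numbers."""
    return 'c%d:cb%d:L%d' % (comment, codeblock, line)


def parse_anchor(anchor):
    """Return (comment, codeblock, line) as integers, or None if the
    anchor does not match the hypothetical format. A leading '#' is
    accepted and ignored."""
    match = _ANCHOR_RE.match(anchor.lstrip('#'))
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


print(make_anchor(5, 4, 8))
print(parse_anchor('#c5:cb4:L8'))
```

A plain #L8 anchor, as used when displaying single files, deliberately does not match here, so the two schemes could coexist.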
I'd be happy to start researching passing that contextual information, and providing a patch with included tests. Any feedback on that?
> But ok, I understand if you don't want to take care of it anymore. It
> would be nice if you could at least teach someone how to manage the
> filter, so that we can keep a good detection ratio.
Now that it is usable again, I'll have another look. Everything I know
is written down on the DOC page of the plugin. Basically, it is important
to keep the training up to date. First remove any >90% (negative) and any 0%
(positive) entries, as they don't help. The remaining ones MAY be trained as
ham/spam. Any wrong detection SHOULD be trained.
Ciao
--