ropers | 1 Jul 2012 17:23

mojibake

Hi Nick, Bob, Henning & Ingo,
(Not sure which of you is best to talk to; henning was in the sigline @
www.openbsd.org/papers/, Bob tends to be all over the WWW schtuff,
Nick tends to wear the documentation hat, and this chiefly concerns
Ingo's presentation.
I've bcc'd you all to avoid swamping your mailboxen by default with
misc@ reply-to-alls.
I'm copying misc@ in on this however, to avoid any behind-the-back
hard feelings.)

There was a recent misc thread where andres.p complained thusly:

>>  http://www.openbsd.org/papers/bsdcan11-mandoc-openbsd.html
> that page is encoded iso 8859-1, doesn't state so anywhere, breaks
> with browsers configured to default to utf8 in the absence of encoding
> qualifiers.
cf. <http://marc.info/?l=openbsd-misc&m=134083965227817&w=2>

I sent a diff adding a charset=iso-8859-1 meta tag content-type
parameter, and people had all kinds of responses, mostly suggesting
that there was a much, much bigger problem than merely minor mojibake
gobbledygook in Ingo's presentation.

So I've now just gone through ALL the presentations on
http://www.openbsd.org/papers/ , and I've determined that the problem
is much, much smaller than it's cracked up to be in the misc thread.
This diff fixes things:

--- bsdcan11-mandoc-openbsd.html	2012-06-30 22:18:52.000000000 +0200
+++ bsdcan11-mandoc-openbsd.html.newentities	2012-06-30 22:34:58.000000000

Anthony J. Bentley | 1 Jul 2012 19:00

Re: mojibake

ropers writes:
> This diff fixes things:
> 
> --- bsdcan11-mandoc-openbsd.html	2012-06-30 22:18:52.000000000 +0200
> +++ bsdcan11-mandoc-openbsd.html.newentities	2012-06-30 22:34:58.000000000
> +0200
> @@ -13,7 +13,7 @@
> 
>  <p><a href="http://www.flickr.com/photos/tomkoadam/4778126822/"><img
>  src="http://farm5.static.flickr.com/4115/4778126822_555b453a1e.jpg"></a></p>
> -<p>Csikó - Foal. - Photo: Adam Tomkó @flickr (CC)</p>
> +<p>Csik&oacute; - Foal. - Photo: Adam Tomk&oacute; @flickr (CC)</p>
> 
>  <HR>
>  <P>Ingo Schwarze: Mandoc in OpenBSD - page 2: INTRO I -
> @@ -725,7 +725,7 @@
>  <HR>
>  <P>Ingo Schwarze: Mandoc in OpenBSD - page 22: RECURRING II -
>  BSDCan 2011, May 13, Ottawa</P>
> -<H1>Bogue déjà vue:</H1>
> +<H1>Bogue d&eacute;j&agrave; vue:</H1>
>  <H2>Collecting regression tests.</H2>
>  <UL>
>  <LI>Slow start in 2009:
> 
> That's it. That's all.

The advantage of using pure ASCII plus HTML escapes in a page is that it
displays the correct content regardless of declared character encoding.
The disadvantage is that it means adding escapes *everywhere*. Can you

Alexander Hall | 1 Jul 2012 20:14

Re: mojibake

On 07/01/12 19:00, Anthony J. Bentley wrote:
> ropers writes:
>> This diff fixes things:
>>
>> --- bsdcan11-mandoc-openbsd.html	2012-06-30 22:18:52.000000000 +0200
>> +++ bsdcan11-mandoc-openbsd.html.newentities	2012-06-30 22:34:58.000000000
>> +0200
>> @@ -13,7 +13,7 @@
>>
>>   <p><a href="http://www.flickr.com/photos/tomkoadam/4778126822/"><img
>>   src="http://farm5.static.flickr.com/4115/4778126822_555b453a1e.jpg"></a></p>
>> -<p>Csikó - Foal. - Photo: Adam Tomkó @flickr (CC)</p>
>> +<p>Csik&oacute; - Foal. - Photo: Adam Tomk&oacute; @flickr (CC)</p>
>>
>>   <HR>
>>   <P>Ingo Schwarze: Mandoc in OpenBSD - page 2: INTRO I -
>> @@ -725,7 +725,7 @@
>>   <HR>
>>   <P>Ingo Schwarze: Mandoc in OpenBSD - page 22: RECURRING II -
>>   BSDCan 2011, May 13, Ottawa</P>
>> -<H1>Bogue déjà vue:</H1>
>> +<H1>Bogue d&eacute;j&agrave; vue:</H1>
>>   <H2>Collecting regression tests.</H2>
>>   <UL>
>>   <LI>Slow start in 2009:
>>
>> That's it. That's all.
>
> The advantage of using pure ASCII plus HTML escapes in a page is that it
> displays the correct content regardless of declared character encoding.

Andres Perera | 1 Jul 2012 20:25

Re: mojibake

On Sun, Jul 1, 2012 at 12:30 PM, Anthony J. Bentley
<anthonyjbentley@gmail.com> wrote:
>> So again, the complaint was that there was mojibake gibberish in
>> Ingo's presentation, because the character encoding isn't specified
>> but defaults to UTF-8 in modern browsers, while the page is actually
>> iso-8859-1 encoded.
>
> Actually, "modern" browsers do not default to a particular encoding (in
> fact, this violates the HTML standard). Instead, they attempt to autodetect
> the charset. Sometimes this works, and sometimes it doesn't -- I've seen
> UTF-8 pages incorrectly detected as ISO-8859-1, and in particularly bad
> cases, vice versa.

i would consider firefox a modern browser, and it does not default to
autodetect. it defaults to iso-8859-1

however, the gui does not allow per html doctype default charset, so a
management configured browser would apply default charset to html1, 4,
... n

there should be no case where this is a problem. all pages should be
html 4 to avoid these silly exchanges. it would be nice if some sort
of style guide clearly stated "pages in www/ are html4, charset
explicitly set to iso-8859-1". in the absence of that, we have these
discussions. having a www/STYLE doc does not require committing to a
particular templating language so hopefully it's a realistic
short-term goal
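
To make the two failure modes discussed above concrete, here is a minimal Python sketch; it is not taken from any poster's message. It encodes the word "Csikó" from the quoted diff in both charsets and then decodes each byte string with the wrong one. The helper name `decode_lenient` is made up for illustration, and real browsers' detection heuristics are more involved than a single lenient decode.

```python
# Illustrative only: the sample word comes from the diff quoted in this thread.
text = "Csikó"

latin1_bytes = text.encode("iso-8859-1")  # b'Csik\xf3'
utf8_bytes = text.encode("utf-8")         # b'Csik\xc3\xb3'


def decode_lenient(data: bytes, charset: str) -> str:
    """Hypothetical helper: decode without raising, substituting U+FFFD
    for byte sequences that are invalid in the given charset."""
    return data.decode(charset, errors="replace")


# ISO-8859-1 bytes read as UTF-8: the lone 0xF3 byte is not valid UTF-8,
# so it is shown as the replacement character.
latin1_as_utf8 = decode_lenient(latin1_bytes, "utf-8")

# UTF-8 bytes read as ISO-8859-1: every byte maps to some character,
# so the two-byte sequence for "ó" turns into two unrelated characters.
utf8_as_latin1 = decode_lenient(utf8_bytes, "iso-8859-1")

if __name__ == "__main__":
    print(repr(latin1_as_utf8))
    print(repr(utf8_as_latin1))
```

The first case is what happens when an undeclared ISO-8859-1 page is read as UTF-8; the second is the reverse mistake, which never fails outright and so tends to go unnoticed until someone reads the text.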

Dave Anderson | 2 Jul 2012 02:53

Re: mojibake

On Sun, 1 Jul 2012, Anthony J. Bentley wrote:

>ropers writes:
>> This diff fixes things:
>>
>> --- bsdcan11-mandoc-openbsd.html	2012-06-30 22:18:52.000000000 +0200
>> +++ bsdcan11-mandoc-openbsd.html.newentities	2012-06-30 22:34:58.000000000
>> +0200
>> @@ -13,7 +13,7 @@
>>
>>  <p><a href="http://www.flickr.com/photos/tomkoadam/4778126822/"><img
>>  src="http://farm5.static.flickr.com/4115/4778126822_555b453a1e.jpg"></a></p>
>> -<p>Csikó - Foal. - Photo: Adam Tomkó @flickr (CC)</p>
>> +<p>Csik&oacute; - Foal. - Photo: Adam Tomk&oacute; @flickr (CC)</p>
>>
>>  <HR>
>>  <P>Ingo Schwarze: Mandoc in OpenBSD - page 2: INTRO I -
>> @@ -725,7 +725,7 @@
>>  <HR>
>>  <P>Ingo Schwarze: Mandoc in OpenBSD - page 22: RECURRING II -
>>  BSDCan 2011, May 13, Ottawa</P>
>> -<H1>Bogue déjà vue:</H1>
>> +<H1>Bogue d&eacute;j&agrave; vue:</H1>
>>  <H2>Collecting regression tests.</H2>
>>  <UL>
>>  <LI>Slow start in 2009:
>>
>> That's it. That's all.
>
>The advantage of using pure ASCII plus HTML escapes in a page is that it

Anthony J. Bentley | 2 Jul 2012 04:27

Re: mojibake

Dave Anderson writes:
> >So, in summary, the options are:
> >
> >Use HTML escapes everywhere. IMO, highly impractical.
> >
> >Use any encoding you wish, and set a meta tag when appropriate. This is
> >basically what we have now. (The front pages of /, /de/, /fr/ all use
> >ISO-8859-1; /cs/ uses UTF-8; /lt/ uses ISO-8859-13.)
> >
> >Use UTF-8 everywhere, and enforce this either with an HTTP header or
> >meta tags.
> 
> You missed one: use any encoding you wish, and configure the server to
> send the proper charset value in the real headers (by encoding the
> appropriate charset info in the file-name extension).

I was limiting the options to those that can be easily mirrored. All of
those are basically server-agnostic; yours is not. And I can't imagine a
situation when you'd ever want to do that anyway--sticking to one encoding
is much simpler and saner.

--
Anthony J. Bentley
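
The "HTML escapes everywhere" option from the summary above does not have to be done by hand. What follows is a minimal Python sketch, assuming HTML 4 named entities are wanted where they exist and numeric character references otherwise. `to_ascii_entities` is a hypothetical helper name, not anything used on www.openbsd.org.

```python
from html.entities import codepoint2name


def to_ascii_entities(text: str) -> str:
    """Return text with every non-ASCII character replaced by an HTML
    character reference. ASCII (including markup such as < and >) is
    passed through untouched."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                        # plain ASCII: keep as-is
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named HTML 4 entity
        else:
            out.append(f"&#{cp};")                # numeric fallback
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii_entities("<H1>Bogue déjà vue:</H1>"))
```

The output is pure ASCII, so it renders the same under any ASCII-compatible declared charset, at the cost of source text that is harder to read and edit.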

Tomas Bodzar | 2 Jul 2012 06:34

Re: mojibake

On Sun, Jul 1, 2012 at 7:00 PM, Anthony J. Bentley
<anthonyjbentley@gmail.com> wrote:
> ropers writes:
>> This diff fixes things:
>>
>> --- bsdcan11-mandoc-openbsd.html      2012-06-30 22:18:52.000000000 +0200
>> +++ bsdcan11-mandoc-openbsd.html.newentities  2012-06-30 22:34:58.000000000
>> +0200
>> @@ -13,7 +13,7 @@
>>
>>  <p><a href="http://www.flickr.com/photos/tomkoadam/4778126822/"><img
>>  src="http://farm5.static.flickr.com/4115/4778126822_555b453a1e.jpg"></a></p>
>> -<p>Csikó - Foal. - Photo: Adam Tomkó @flickr (CC)</p>
>> +<p>Csik&oacute; - Foal. - Photo: Adam Tomk&oacute; @flickr (CC)</p>
>>
>>  <HR>
>>  <P>Ingo Schwarze: Mandoc in OpenBSD - page 2: INTRO I -
>> @@ -725,7 +725,7 @@
>>  <HR>
>>  <P>Ingo Schwarze: Mandoc in OpenBSD - page 22: RECURRING II -
>>  BSDCan 2011, May 13, Ottawa</P>
>> -<H1>Bogue déjà vue:</H1>
>> +<H1>Bogue d&eacute;j&agrave; vue:</H1>
>>  <H2>Collecting regression tests.</H2>
>>  <UL>
>>  <LI>Slow start in 2009:
>>
>> That's it. That's all.
>
> The advantage of using pure ASCII plus HTML escapes in a page is that it

Ted Unangst | 2 Jul 2012 06:27

Re: mojibake

On Sun, Jul 01, 2012 at 17:23, ropers wrote:

>>>  http://www.openbsd.org/papers/bsdcan11-mandoc-openbsd.html
>> that page is encoded iso 8859-1, doesn't state so anywhere, breaks
>> with browsers configured to default to utf8 in the absence of encoding
>> qualifiers.

I think that complaint, as pointed out, is bogus.  Only broken
browsers are broken.

> +<p>Csik&oacute; - Foal. - Photo: Adam Tomk&oacute; @flickr (CC)</p>

gods, no.  html entities are the last thing I want to see.

> So again, the complaint was that there was mojibake gibberish in
> Ingo's presentation, because the character encoding isn't specified
> but defaults to UTF-8 in modern browsers, while the page is actually
> iso-8859-1 encoded.

Again, only broken browsers are broken.

I think things (encoding wise) are fine as they are.

Ingo Schwarze | 3 Jul 2012 02:48

Re: mojibake

Hi Ian,

ropers wrote on Sun, Jul 01, 2012 at 05:23:07PM +0200:

> (Not sure who of you best to talk to;

Regarding conference presentations, the author.
All conference presentations are different and
don't usually follow the www.openbsd.org site style,
if such a thing even exists.

> Andres Perera wrote:

>> http://www.openbsd.org/papers/bsdcan11-mandoc-openbsd.html
>> that page is encoded iso 8859-1,

It's purely English, hence intended as plain US-ASCII.

Regarding templates and whatnot - hell no, KISS.

> --- bsdcan11-mandoc-openbsd.html	2012-06-30 22:18:52.000000000 +0200
> +++ bsdcan11-mandoc-openbsd.html.newentities	2012-06-30 22:34:58.000000000
> +0200
> @@ -13,7 +13,7 @@
> 
>  <p><a href="http://www.flickr.com/photos/tomkoadam/4778126822/"><img
>  src="http://farm5.static.flickr.com/4115/4778126822_555b453a1e.jpg"></a></p>
> -<p>Csikó - Foal. - Photo: Adam Tomkó @flickr (CC)</p>
> +<p>Csik&oacute; - Foal. - Photo: Adam Tomk&oacute; @flickr (CC)</p>
> 

Ton Muller | 5 Jul 2012 15:12

PF and altq issues... need advice please.

I started experimenting with ALTQ and managed to put together a nice test config.
My box has 3 LAN interfaces, but at the moment I am only playing with one network.

I managed to get the traffic that comes from the internet to each machine
behaving the way I want it.

The global outbound speed to the internet is set to 256Kb/s,
while each machine is set to 1Mbit outbound.
But the outbound speed stays at 256Kb/s and not at the values I want, so my
question is: what is wrong with my config?

My current pf.conf is below.

######## START CONFIG ##########
#
ext_if  = "fxp0"
int0_if = "re0"
int2_if = "rl0"
int3_if = "rl1"
#
localnet0 ="192.168.0.0/24"
localnet2 ="192.168.2.0/24"
localnet3 ="192.168.3.0/24"
#
blockedport ="{21,25,53,80,110,119, 2128}"
openport    ="{ 21,25,110,8002,45631 }"
#
table <firewall> persist file "/etc/table/firewall.table"
#
# extern -> intern IF-0