Awk does not understand Unicode. The man page should warn about it.

Reference: /n/sources/patch/sorry/awk-utf
Date: Tue Feb 26 16:08:55 CET 2008
Signed-off-by: paurea@lsub.org

--- /sys/man/1/awk Tue Feb 26 16:02:12 2008
+++ /sys/man/1/awk Tue Feb 26 16:02:09 2008
@@ -546,3 +546,7 @@
 .br
 The scope rules for variables in functions are a botch;
 the syntax is worse.
+.br
+It does not behave well with UTF and reads one
+byte at a time.
+
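
Illustration only, not part of the patch: a minimal sketch of why reading one byte at a time goes wrong on UTF input. It is written in Python rather than awk, the function names are hypothetical, and it does not model awk's actual implementation. It only shows that a multi-byte character is seen as several separate "characters" by a byte-oriented reader.

```python
# Hypothetical helpers for illustration; nothing here comes from awk's source.

def byte_and_char_counts(text):
    """Return (number of UTF-8 bytes, number of Unicode code points)."""
    encoded = text.encode("utf-8")
    return len(encoded), len(text)


def split_bytewise(text):
    """Mimic a byte-at-a-time reader: every UTF-8 byte becomes its own unit."""
    return [bytes([b]) for b in text.encode("utf-8")]


if __name__ == "__main__":
    word = "a\u00f1o"  # "año": the middle character takes two bytes in UTF-8
    print(byte_and_char_counts(word))
    print(split_bytewise(word))
```

A byte-oriented tool reports a length of 4 for this three-character word and can split the two-byte character in the middle, which is the kind of misbehavior the added man page text warns about.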