オイラのような端くれが、こんな記事を書いていいのだろうか。。。
ともあれ、後学のためにも書かねばいかんと思いますので、キリキリ書いていきます。
それにしても、dankogaiさん作のこのモジュールがテストでコケるとは。。。
まず、現象の出た環境。
# cat /etc/redhat-release
CentOS release 5.3 (Final)
# uname -m
x86_64
# uname -v
#1 SMP Wed Apr 1 09:10:25 EDT 2009
# uname -r
2.6.18-128.1.6.el5
この様な環境でテストを通すと、こんな風になりました。
# perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for HTTP::Response::Encoding
# make
cp lib/HTTP/Response/Encoding.pm blib/lib/HTTP/Response/Encoding.pm
Manifying blib/man3/HTTP::Response::Encoding.3pm
# make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/00-load.t ....... 1/1 # Testing HTTP::Response::Encoding 0.05, Perl 5.008008, /usr/bin/perl
t/00-load.t ....... ok
t/01-file.t ....... 1/13
# Failed test '$res->charset eq 'EUC-JP''
# at t/01-file.t line 49.
# got: undef
# expected: 'EUC-JP'
# Failed test '$res->encoding eq 'euc-jp''
# at t/01-file.t line 51.
# got: undef
# expected: 'euc-jp'
# Failed test '$res->charset eq 'ISO-2022-JP''
# at t/01-file.t line 49.
# got: undef
# expected: 'ISO-2022-JP'
# Failed test '$res->encoding eq 'iso-2022-jp''
# at t/01-file.t line 51.
# got: undef
# expected: 'iso-2022-jp'
# Failed test '$res->charset eq 'Shift_JIS''
# at t/01-file.t line 49.
# got: undef
# expected: 'Shift_JIS'
# Failed test '$res->encoding eq 'shiftjis''
# at t/01-file.t line 51.
# got: undef
# expected: 'shiftjis'
# Failed test '$res->charset eq 'UTF-8''
# at t/01-file.t line 49.
# got: undef
# expected: 'UTF-8'
# Failed test '$res->encoding eq 'utf-8-strict''
# at t/01-file.t line 51.
# got: undef
# expected: 'utf-8-strict'
# Looks like you failed 8 tests of 13.
t/01-file.t ....... Dubious, test returned 8 (wstat 2048, 0x800)
Failed 8/13 subtests
t/boilerplate.t ... ok
t/pod-coverage.t .. ok
t/pod.t ........... ok
Test Summary Report
-------------------
t/01-file.t (Wstat: 2048 Tests: 13 Failed: 8)
Failed tests: 5-12
Non-zero exit status: 8
Files=5, Tests=19, 0 wallclock secs ( 0.02 usr 0.02 sys + 0.18 cusr 0.03 csys = 0.25 CPU)
Result: FAIL
Failed 1/5 test programs. 8/19 subtests failed.
make: *** [test_dynamic] エラー 255
取り敢えず、t/01-file.tを見てみます。
1 #!perl
2 #!perl -T
3
4 use strict;
5 use warnings;
6 use LWP::UserAgent;
7 use HTTP::Response::Encoding;
8 use File::Spec;
9 use Encode;
10 use Cwd;
11 use URI;
12 use Test::More tests => 13;
13
14 my $ua = LWP::UserAgent->new;
### 中略 ###
40 for my $charset (sort keys %charset){
41 my $uri = URI->new('file://');
42 $uri->path(File::Spec->catfile($cwd, "t", $filename{$charset}));
43 my $res;
44 {
45 local $^W = 0; # to quiet LWP::Protocol
46 $res = $ua->get($uri);
47 }
48 die unless $res->is_success;
49 is $res->charset, $charset, "\$res->charset eq '$charset'";
50 my $canon = find_encoding($charset)->name;
51 is $res->encoding, $canon, "\$res->encoding eq '$canon'";
52 }
### 以下略 ###
どうやらこのテストでは、LWP::UserAgentに'file://'なURIを食わせてgetさせています。そうして出てきたHTTP::Responseインスタンスの(HTTP::Response::Encodingで追加された)charsetメソッド及びencodingメソッドをキックして、出力を元にその動作正当性を検証しているようです。
では次に、lib/HTTP/Response/Encoding.pmを覗いてみます。
1 package HTTP::Response::Encoding;
2 use warnings;
3 use strict;
4 our $VERSION = sprintf "%d.%02d", q$Revision: 0.5 $ =~ /(\d+)/g;
5
6 sub HTTP::Response::charset {
7 my $self = shift;
8 return $self->{__charset} if exists $self->{__charset};
9 my $content_type = $self->headers->header('Content-Type');
10 return unless $content_type;
11 $content_type =~ /charset=([A-Za-z0-9_\-]+)/io;
12 $self->{__charset} = $1 || undef;
13 }
14
15 sub HTTP::Response::encoder {
16 require Encode;
17 my $self = shift;
18 return $self->{__encoder} if exists $self->{__encoder};
19 my $charset = $self->charset or return;
20 my $enc = Encode::find_encoding($charset);
21 $self->{__encoder} = $enc;
22 }
23
24 sub HTTP::Response::encoding {
25 my $enc = shift->encoder or return;
26 $enc->name;
27 }
### 以下略 ###
charsetメソッドは、HTTP::Response->headers->header('Content-Type')で得たcontent-typeヘッダを元に、11行目の正規表現にマッチした文字列を返しています。
encodingメソッドは、encoderメソッドが返すEncode::Encodingインスタンスのnameメソッドの結果を返しています。
つまり、
・encodingの動作はcharsetの動作に依存しており、
・charsetの動作はHTTP::Response::headersの動作に依存している
ことになります。従って、先のテストがこけたのは、LWP::UserAgentが("file://"なuriをgetした時の動作が)期待通りの動作をしていない為、という原因が考えられます。では、それを証明するためのテストを書いてみます。
# cat t/lwpua.t
use warnings;
use strict;
use Test::More tests => 7;
use FindBin;
use LWP::UserAgent;
use URI;
my @patterns = (
"http://www.yahoo.co.jp/",
"http://www.google.co.jp/",
"file://$FindBin::Bin/t-euc-jp.html",
"file://$FindBin::Bin/t-iso-2022-jp.html",
"file://$FindBin::Bin/t-shiftjis.html",
"file://$FindBin::Bin/t-utf-8.html"
);
my $ua = LWP::UserAgent->new;
my $res;
for ( @patterns ) {
$res = $ua->get( URI->new( $_ ) );
like(
$res->headers->header( 'Content-Type' ),
qr/^text\/html; charset=/,
$_. " content-type likes qr/^text\/html; charset=/"
);
}
$res = $ua->get( URI->new( "http://www.yahoo.co.jp/" ) );
open my $fh, ">", "$FindBin::Bin/yahoo.html" or die "could not open file";
print $fh $res->content;
close $fh;
$res = $ua->get( URI->new( "file://$FindBin::Bin/yahoo.html" ) );
like(
$res->headers->header( 'Content-Type' ),
qr/^text\/html; charset=/,
"file://$FindBin::Bin/yahoo.html content-type likes qr/^text\/html; charset=/"
);
unlink "$FindBin::Bin/yahoo.html";
上記テストの結果が以下の通りです。
# perl t/lwpua.t
1..7
ok 1 - http://www.yahoo.co.jp/ content-type likes qr/^text/html; charset=/
ok 2 - http://www.google.co.jp/ content-type likes qr/^text/html; charset=/
not ok 3 - file:///root/HTTP-Response-Encoding-0.05/t/t-euc-jp.html content-type likes qr/^text/html; charset=/
# Failed test 'file:///root/HTTP-Response-Encoding-0.05/t/t-euc-jp.html content-type likes qr/^text/html; charset=/'
# at t/lwpua.t line 22.
# 'text/html'
# doesn't match '(?-xism:^text/html; charset=)'
not ok 4 - file:///root/HTTP-Response-Encoding-0.05/t/t-iso-2022-jp.html content-type likes qr/^text/html; charset=/
# Failed test 'file:///root/HTTP-Response-Encoding-0.05/t/t-iso-2022-jp.html content-type likes qr/^text/html; charset=/'
# at t/lwpua.t line 22.
# 'text/html'
# doesn't match '(?-xism:^text/html; charset=)'
not ok 5 - file:///root/HTTP-Response-Encoding-0.05/t/t-shiftjis.html content-type likes qr/^text/html; charset=/
# Failed test 'file:///root/HTTP-Response-Encoding-0.05/t/t-shiftjis.html content-type likes qr/^text/html; charset=/'
# at t/lwpua.t line 22.
# 'text/html'
# doesn't match '(?-xism:^text/html; charset=)'
not ok 6 - file:///root/HTTP-Response-Encoding-0.05/t/t-utf-8.html content-type likes qr/^text/html; charset=/
# Failed test 'file:///root/HTTP-Response-Encoding-0.05/t/t-utf-8.html content-type likes qr/^text/html; charset=/'
# at t/lwpua.t line 22.
# 'text/html'
# doesn't match '(?-xism:^text/html; charset=)'
not ok 7 - file:///root/HTTP-Response-Encoding-0.05/t/yahoo.html content-type likes qr/^text/html; charset=/
# Failed test 'file:///root/HTTP-Response-Encoding-0.05/t/yahoo.html content-type likes qr/^text/html; charset=/'
# at t/lwpua.t line 31.
# 'text/html'
# doesn't match '(?-xism:^text/html; charset=)'
# Looks like you failed 5 tests of 7.
この結果から、LWP::UserAgent::headers->header( 'Content-Type' )の返り値は、"file://"なURIにおいて、"charset=ほげほげ"を含まないと言えそうです。
とりあえずオイラが言えるのは、t/01-file.tのテスト手法はこのままではまずい気がするってことと、現在のHTTP::Response::charsetは"file://"なURIに使わない方が良さそうだと言うことです。
ちなみにこの現象は、Task::Catalystをインストールしようとした際に起こりました。
2009-07-06 追記rtでいくつかのパッチが公開され始めています。
2009-09-14 追記7/28に0.06がリリースされていました。
テストもバッチリ通りました!
good job!
CPAN.pm: Going to build D/DA/DANKOGAI/HTTP-Response-Encoding-0.06.tar.gz
Checking if your kit is complete...
Looks good
Writing Makefile for HTTP::Response::Encoding
cp lib/HTTP/Response/Encoding.pm blib/lib/HTTP/Response/Encoding.pm
Manifying blib/man3/HTTP::Response::Encoding.3pm
/usr/bin/make -- OK
Running make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/00-load.t ....... 1/1 # Testing HTTP::Response::Encoding 0.06, Perl 5.008008, /usr/bin/perl
t/00-load.t ....... ok
t/01-file.t ....... ok
t/boilerplate.t ... ok
t/pod-coverage.t .. ok
t/pod.t ........... ok
All tests successful.
Files=5, Tests=19, 2 wallclock secs ( 0.04 usr 0.00 sys + 0.27 cusr 0.03 csys = 0.34 CPU)
Result: PASS
/usr/bin/make test -- OK
Running make install
Manifying blib/man3/HTTP::Response::Encoding.3pm
Installing /usr/local/share/perl/5.8.8/HTTP/Response/Encoding.pm
Installing /usr/local/man/man3/HTTP::Response::Encoding.3pm
Appending installation info to /usr/lib/perl/5.8/perllocal.pod
/usr/bin/make install -- OK