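My first idea was:
Read the whole file, split it on whitespace into an array, then loop over the array to build a hash. But if the article is very long, and there are several articles,
storing everything in an array first seems wasteful and inefficient. How can I improve this so that the hash is built directly while the file is being read, without creating a temporary array?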
text_in:
The U.N. Food and Agriculture Organization says it has less than half the funding it needs to help ensure food security in parts of South Sudan.
.......
(The rest is too long to post here; assume the text is well-formed.)
I want to build the following hash %Words:
(
The => 1,
U.N. => 1,
Food => 1,
...
)
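What I had before was:
my $content;
{
    # Slurp mode: with $/ undefined, <$IN1> returns the whole file at once.
    # $IN1 is assumed to be a filehandle already opened for reading.
    local $/ = undef;
    $content = <$IN1>;
    close($IN1);
    #print "$content\n";
}
# split ' ' splits on runs of whitespace, so no empty strings end up as keys
# (split /\s/ yields an empty field wherever two whitespace characters are adjacent).
my @words1 = split ' ', $content;
my %Words1 = map { $_ => 1 } @words1;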
Is it possible to skip the temporary array and build the hash directly? Would that be any faster?
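One possible way to do this, as a minimal sketch: read the filehandle one line at a time and assign each word straight into the hash, so neither the whole file nor a whole-file word array is held in memory. The helper name words_from_handle and the in-memory sample text are assumptions made for illustration only. split still returns a short temporary list for each line, and whether this is actually faster than the slurp version would have to be measured on real input.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): build a word-presence
# hash from an open filehandle, one line at a time.
sub words_from_handle {
    my ($fh) = @_;
    my %words;
    while (my $line = <$fh>) {
        # split ' ' splits on runs of whitespace and ignores leading
        # whitespace, so no empty-string keys are created.
        $words{$_} = 1 for split ' ', $line;
    }
    return \%words;
}

# Demo on an in-memory filehandle so the sketch runs without a real file.
my $sample = "The U.N. Food and Agriculture Organization says\n"
           . "it has less than half the funding\n";
open my $in, '<', \$sample or die "cannot open in-memory handle: $!";
my $words = words_from_handle($in);
close $in;

print join(", ", sort keys %$words), "\n";
```

With a real file, the same function can be called on a handle from open my $in, '<', $path. Keys are case-sensitive here, so "The" and "the" are stored as separate entries.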