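A Handy Use of the tr Command: Counting English Word Frequencies
As we know, the tr command can replace or delete characters. We often need to count how often each word occurs in an English text, and doing this the conventional way, tallying the words one by one with a counter, is tedious. Here tr helps: use it to replace the spaces that separate words with newlines, then use tr again to delete the periods, commas, and exclamation marks that trail some of the words. First, here is the file to be processed, this.txt: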
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
To list the 10 most frequent words in the text file above, you can use the following command:
[root@linux ~]# cat this.txt | tr ' ' '\n' | tr -d '[.,!]' | sort | uniq -c | sort -nr | head -10
     10 is
      8 better
      8 than
      5 to
      5 the
      3 of
      3 Although
      3 never
      3 be
      3 one
Very convenient indeed!
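To make the individual stages easier to follow, the pipeline can be wrapped in a small shell function. This is only a sketch under a few assumptions: the name top_words is a hypothetical helper that does not appear in the original command, the input arrives on standard input, and ties are broken alphabetically under the C locale so that the ordering is reproducible. POSIX and GNU tr treat the square brackets in '[.,!]' as literal characters, so the sketch writes the set as '.,!'.

```shell
#!/bin/sh
# top_words N: print the N most frequent words read from standard input.
# Hypothetical helper for illustration; not part of the original article.
top_words() {
    n="${1:-10}"                          # default to the top 10
    tr ' ' '\n' |                         # put each space-separated word on its own line
    tr -d '.,!' |                         # delete periods, commas, exclamation marks
    grep -v '^$' |                        # drop empty lines left by repeated spaces
    LC_ALL=C sort |                       # group identical words together
    uniq -c |                             # prefix each distinct word with its count
    awk '{ print $1, $2 }' |              # normalize to "count word" without padding
    LC_ALL=C sort -k1,1nr -k2,2 |         # highest count first, then word order
    head -n "$n"
}
```

With the function defined, top_words 10 < this.txt should give a listing comparable to the one above, although words with equal counts may appear in a different order. Adding a tr 'A-Z' 'a-z' stage at the front would fold case, so that "The" and "the" are counted as the same word.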