Below is the list of the 250 most common words, the percentage of the total word count that each one accounts for, and a summary of the statistics I mentioned above.
Number of books | 1,032 |
Combined file size | 440 MB |
Date | 29 September 1997 |
Total number of words | 6,615,271 |
Total number of distinct words | 103,590 |
Number of words that occur fewer than ten times | 78,332 |
Number of occurrences of the 250 most common words | 3,781,615 (57%) |
Number of occurrences of the 1000 most common words | 6,565,736 (99.25%) |
Word | Percent of Total | Number of Occurrences |
---|---|---|
the | 7.846512 | 519068 |
of | 4.460135 | 295050 |
and | 3.653471 | 241687 |
to | 2.556630 | 169128 |
in | 1.815451 | 120097 |
was | 1.161903 | 76863 |
that | 1.112607 | 73602 |
his | 1.079124 | 71387 |
he | 1.033669 | 68380 |
it | 0.872406 | 57712 |
with | 0.772803 | 51123 |
as | 0.737385 | 48780 |
by | 0.707575 | 46808 |
for | 0.666821 | 44112 |
is | 0.663329 | 43881 |
had | 0.622454 | 41177 |
but | 0.576575 | 38142 |
which | 0.538663 | 35634 |
on | 0.520341 | 34422 |
be | 0.506994 | 33539 |
at | 0.504938 | 33403 |
not | 0.499753 | 33060 |
they | 0.499345 | 33033 |
from | 0.494795 | 32732 |
were | 0.474372 | 31381 |
their | 0.472573 | 31262 |
this | 0.449097 | 29709 |
or | 0.400679 | 26506 |
have | 0.384247 | 25419 |
you | 0.380876 | 25196 |
her | 0.376991 | 24939 |
who | 0.363099 | 24020 |
all | 0.361406 | 23908 |
him | 0.359970 | 23813 |
an | 0.338807 | 22413 |
so | 0.326850 | 21622 |
are | 0.297705 | 19694 |
one | 0.293624 | 19424 |
she | 0.263753 | 17448 |
my | 0.257782 | 17053 |
them | 0.254396 | 16829 |
we | 0.251494 | 16637 |
been | 0.250602 | 16578 |
no | 0.242152 | 16019 |
me | 0.236710 | 15659 |
if | 0.235637 | 15588 |
said | 0.234185 | 15492 |
there | 0.229787 | 15201 |
when | 0.223619 | 14793 |
would | 0.221261 | 14637 |
more | 0.212901 | 14084 |
will | 0.181625 | 12015 |
some | 0.174747 | 11560 |
what | 0.173795 | 11497 |
into | 0.172102 | 11385 |
has | 0.167340 | 11070 |
could | 0.158754 | 10502 |
than | 0.158255 | 10469 |
out | 0.156547 | 10356 |
then | 0.153720 | 10169 |
up | 0.153418 | 10149 |
its | 0.150697 | 9969 |
man | 0.147553 | 9761 |
time | 0.145315 | 9613 |
now | 0.140281 | 9280 |
two | 0.139133 | 9204 |
upon | 0.139057 | 9199 |
these | 0.137984 | 9128 |
after | 0.136533 | 9032 |
footnote | 0.135414 | 8958 |
may | 0.135006 | 8931 |
only | 0.134855 | 8921 |
other | 0.133676 | 8843 |
see | 0.128007 | 8468 |
such | 0.123321 | 8158 |
do | 0.123245 | 8153 |
great | 0.120932 | 8000 |
very | 0.120086 | 7944 |
any | 0.120010 | 7939 |
your | 0.118302 | 7826 |
about | 0.114689 | 7587 |
made | 0.113495 | 7508 |
our | 0.112800 | 7462 |
well | 0.112724 | 7457 |
first | 0.112346 | 7432 |
most | 0.110351 | 7300 |
like | 0.110154 | 7287 |
before | 0.109187 | 7223 |
little | 0.108401 | 7171 |
himself | 0.105287 | 6965 |
over | 0.103760 | 6864 |
without | 0.102868 | 6805 |
own | 0.102808 | 6801 |
those | 0.101644 | 6724 |
good | 0.101266 | 6699 |
might | 0.101175 | 6693 |
men | 0.099361 | 6573 |
can | 0.099331 | 6571 |
should | 0.098817 | 6537 |
did | 0.098741 | 6532 |
where | 0.095824 | 6339 |
come | 0.095763 | 6335 |
people | 0.095627 | 6326 |
must | 0.093450 | 6182 |
us | 0.093057 | 6156 |
day | 0.088991 | 5887 |
long | 0.088825 | 5876 |
much | 0.088810 | 5875 |
down | 0.088311 | 5842 |
same | 0.087676 | 5800 |
mr | 0.083927 | 5552 |
never | 0.083579 | 5529 |
even | 0.083398 | 5517 |
old | 0.082204 | 5438 |
under | 0.081327 | 5380 |
through | 0.080828 | 5347 |
still | 0.080828 | 5347 |
while | 0.080314 | 5313 |
many | 0.080239 | 5308 |
know | 0.079876 | 5284 |
every | 0.079196 | 5239 |
life | 0.078591 | 5199 |
three | 0.077790 | 5146 |
how | 0.077759 | 5144 |
way | 0.077291 | 5113 |
years | 0.076384 | 5053 |
came | 0.076354 | 5051 |
king | 0.074963 | 4959 |
go | 0.073436 | 4858 |
being | 0.072318 | 4784 |
again | 0.070549 | 4667 |
here | 0.069067 | 4569 |
make | 0.068629 | 4540 |
back | 0.068115 | 4506 |
new | 0.067510 | 4466 |
against | 0.066437 | 4395 |
found | 0.065198 | 4313 |
yet | 0.065031 | 4302 |
say | 0.064230 | 4249 |
too | 0.064170 | 4245 |
last | 0.063157 | 4178 |
though | 0.063051 | 4171 |
head | 0.062613 | 4142 |
away | 0.061948 | 4098 |
right | 0.061131 | 4044 |
hand | 0.060693 | 4015 |
place | 0.060375 | 3994 |
god | 0.060209 | 3983 |
another | 0.059136 | 3912 |
shall | 0.059121 | 3911 |
country | 0.058894 | 3896 |
part | 0.058788 | 3889 |
far | 0.058667 | 3881 |
left | 0.057624 | 3812 |
eyes | 0.057534 | 3806 |
soon | 0.056838 | 3760 |
went | 0.055961 | 3702 |
take | 0.055856 | 3695 |
each | 0.055840 | 3694 |
just | 0.055311 | 3659 |
power | 0.055221 | 3653 |
name | 0.054858 | 3629 |
am | 0.054344 | 3595 |
death | 0.054147 | 3582 |
world | 0.053361 | 3530 |
nor | 0.053104 | 3513 |
mind | 0.053104 | 3513 |
once | 0.052923 | 3501 |
off | 0.052243 | 3456 |
among | 0.051699 | 3420 |
thought | 0.051275 | 3392 |
whom | 0.050731 | 3356 |
house | 0.050656 | 3351 |
get | 0.050625 | 3349 |
nothing | 0.050595 | 3347 |
between | 0.050459 | 3338 |
hundred | 0.050278 | 3326 |
think | 0.050096 | 3314 |
both | 0.048978 | 3240 |
young | 0.048887 | 3234 |
because | 0.048509 | 3209 |
saw | 0.048267 | 3193 |
ever | 0.048055 | 3179 |
let | 0.047980 | 3174 |
themselves | 0.047557 | 3146 |
emperor | 0.047345 | 3132 |
case | 0.046680 | 3088 |
work | 0.046241 | 3059 |
whose | 0.046121 | 3051 |
war | 0.046075 | 3048 |
took | 0.045939 | 3039 |
general | 0.045622 | 3018 |
city | 0.045607 | 3017 |
state | 0.045259 | 2994 |
side | 0.044821 | 2965 |
things | 0.044684 | 2956 |
always | 0.044533 | 2946 |
days | 0.043838 | 2900 |
thus | 0.043808 | 2898 |
face | 0.043233 | 2860 |
night | 0.042946 | 2841 |
less | 0.042931 | 2840 |
give | 0.042871 | 2836 |
asked | 0.042825 | 2833 |
body | 0.042538 | 2814 |
also | 0.042311 | 2799 |
seemed | 0.041843 | 2768 |
four | 0.041646 | 2755 |
non | 0.041631 | 2754 |
son | 0.041586 | 2751 |
whole | 0.041525 | 2747 |
called | 0.041193 | 2725 |
don't | 0.040875 | 2704 |
however | 0.040437 | 2675 |
love | 0.040316 | 2667 |
put | 0.040210 | 2660 |
thousand | 0.039893 | 2639 |
hands | 0.039877 | 2638 |
seen | 0.039817 | 2634 |
tell | 0.039530 | 2615 |
almost | 0.039227 | 2595 |
look | 0.039182 | 2592 |
father | 0.039122 | 2588 |
heart | 0.038850 | 2570 |
few | 0.038850 | 2570 |
got | 0.038653 | 2557 |
five | 0.038502 | 2547 |
nature | 0.038366 | 2538 |
find | 0.038184 | 2526 |
public | 0.038094 | 2520 |
going | 0.038063 | 2518 |
roman | 0.037912 | 2508 |
perhaps | 0.037716 | 2495 |
woman | 0.037610 | 2488 |
since | 0.037126 | 2456 |
having | 0.036748 | 2431 |
arms | 0.036718 | 2429 |
heard | 0.036703 | 2428 |
looked | 0.036582 | 2420 |
age | 0.036552 | 2418 |
gave | 0.036008 | 2382 |
why | 0.035992 | 2381 |
words | 0.035781 | 2367 |
light | 0.035524 | 2350 |
better | 0.035312 | 2336 |
end | 0.034934 | 2311 |
water | 0.034919 | 2310 |
twenty | 0.034435 | 2278 |
until | 0.034360 | 2273 |
others | 0.034330 | 2271 |
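
To show how the "Percent of Total" column relates to the other two figures, here is a minimal Python sketch. It assumes each percentage is simply the word's count divided by the total number of words, times 100, which is consistent with the numbers above. The tokenization rule (lowercased runs of letters, with internal apostrophes kept so that "don't" counts as one word) is my assumption; the original count may have split words differently. The function names `word_frequencies` and `percent_of_total` are hypothetical and chosen only for this illustration.

```python
import re
from collections import Counter


def percent_of_total(count, total):
    """Share of the corpus taken by one word, as a percentage."""
    return 100.0 * count / total


def word_frequencies(text, top_n=250):
    """Return (word, percent_of_total, count) rows for the top_n most common words.

    Assumption: a word is a run of ASCII letters, optionally with internal
    apostrophes, compared case-insensitively.
    """
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return [(w, percent_of_total(c, total), c) for w, c in counts.most_common(top_n)]


# Check the first table row against the summary figures:
# "the" occurs 519,068 times out of 6,615,271 words.
TOTAL_WORDS = 6_615_271
the_percent = percent_of_total(519_068, TOTAL_WORDS)
print(f"the: {the_percent:.4f}%")

# Small usage example on a toy sentence.
for word, pct, count in word_frequencies("The cat and the dog. The end.", top_n=3):
    print(word, round(pct, 2), count)
```

Running the same function over the combined text of all the books should produce a table with the same shape as the one above, although the exact counts depend on how words are split.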