11 Interlude: Most Frequent Words

11.1 Exercises

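  • Exercise 11.1: When we run the word-frequency program on a text, the most frequent words are usually short, uninteresting ones such as articles and prepositions. Change the program so that it ignores words with fewer than four letters.

I copied the contents of a README from GitHub as test input and solved the second exercise at the same time. The idea: after processing the source file, read the words to exclude and set each one's entry to nil, which removes it from the table of results.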

--- exercise11_1.lua

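local counter = {}

-- assert turns a failed io.open (nil plus message) into a clear error
local inputFile = assert(io.open("aseprite.txt", "r"))
local exceptionFile = assert(io.open("exceptions.txt", "r"))

-- Exercise 11.1: count only words of at least 4 characters
for line in inputFile:lines() do
    for word in string.gmatch(line, "%w+") do
        if #word >= 4 then
            counter[word] = (counter[word] or 0) + 1
        end
    end
end
inputFile:close()

-- Exercise 11.2: drop every word listed in the exceptions file.
-- Assigning nil removes the key; the match is case-sensitive.
for line in exceptionFile:lines() do
    for word in string.gmatch(line, "%w+") do
        counter[word] = nil
    end
end
exceptionFile:close()

-- pairs visits keys in no particular order, so the output is unsorted
for k, v in pairs(counter) do
    print(k, v)
end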
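
The program above prints its table in whatever order pairs happens to visit it. To see the most frequent words first, the keys can be copied into a list and sorted by count. The following is only a sketch of that idea, not part of the original solution: the helper name sortedWords, the small sample table, and the limit n are assumptions made for illustration.

```lua
-- Hypothetical helper: returns the words of `counter` sorted by
-- descending count, with ties broken alphabetically.
local function sortedWords(counter)
    local words = {}
    for word in pairs(counter) do
        words[#words + 1] = word
    end
    table.sort(words, function(a, b)
        if counter[a] ~= counter[b] then
            return counter[a] > counter[b]
        end
        return a < b
    end)
    return words
end

-- Small stand-in table; a real run would pass the full counter
local sample = { palette = 5, colors = 5, Aseprite = 11, that = 3 }
local n = 3 -- how many words to print (assumed value)
local words = sortedWords(sample)
for i = 1, math.min(n, #words) do
    print(words[i], sample[words[i]])
end
```

The alphabetical tie-break keeps the order stable between runs, which plain pairs does not guarantee.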
------------------------- aseprite.txt

Aseprite
Build Status Build status Discourse Community Discord Server

Introduction
Aseprite is a program to create animated sprites. Its main features are:

Sprites are composed of layers & frames as separated concepts.
Support for color profiles and different color modes: RGBA, Indexed (palettes up to 256 colors), Grayscale.
Animation facilities, with real-time preview and onion skinning.
Export/import animations to/from sprite sheets, GIF files, or sequence of PNG files (and FLC, FLI, JPG, BMP, PCX, TGA).
Multiple editors support.
Layer groups for organizing your work, and reference layers for rotoscoping.
Pixel-art specific tools like Pixel Perfect freehand mode, Shading ink, Custom Brushes, Outlines, Wide Pixels, etc.
Other special drawing tools like Pressure sensitivity, Symmetry Tool, Stroke and Fill selection, Gradients.
Tiled mode useful to draw patterns and textures.
Transform multiple frames/layers at the same time.
Lua scripting capabilities.
CLI - Command Line Interface to automatize tasks.
Quick Reference / Cheat Sheet keyboard shortcuts (customizable keys and mouse wheel).
Reopen closed files and recover data in case of crash.
Undo/Redo for every operation and support for non-linear undo.
More features & tips
Issues
There is a list of Known Issues (things to be fixed or that aren't yet implemented).

If you found a bug or have a new idea/feature for the program, you can report them.

Support
You can ask for help in:

Aseprite Community
Aseprite Discord Server
Official support: [email protected]
Social networks and community-driven places: Twitter, Facebook, YouTube, Instagram.
Authors
Igara Studio is developing Aseprite:

David Capello: Lead developer, fixing issues, new features, and user support.
Gaspar Capello: Developer, fixing issues and new features.
Credits
The default Aseprite theme was introduced in v0.8, created by:

Ilija Melentijevic
Aseprite includes color palettes created by:

Richard "DawnBringer" Fhager, 16 colors, 32 colors.
Arne Niklas Jansson, 16 colors, 32 colors.
ENDESGA Studios, EDG16 and EDG32, and other palettes.
Hyohnoo Games, mail24 palette.
Davit Masia, matriax8c palette.
Javier Guerrero, nyx8 palette.
Adigun A. Polack, AAP-64, AAP-Splendor128, SimpleJPC-16, and AAP-Micro12 palette.
PineTreePizza, Rosy-42 palette.
It tries to replicate some pixel-art algorithms:

RotSprite by Xenowhirl.
Pixel perfect drawing algorithm by Sébastien Bénard and Carduus.
Thanks to third-party open source projects, to contributors, and all the people who have contributed ideas, patches, bugs report, feature requests, donations, and help me to develop Aseprite.

License
This program is distributed under three different licenses:

Source code and official releases/binaries are distributed under our End-User License Agreement for Aseprite (EULA). Please check that there are modules/libraries in the source code that are distributed under the MIT license (e.g. laf, clip, undo, observable, ui, etc.).
You can request a special educational license in case you are a teacher in an educational institution and want to use Aseprite in your classroom (in-situ).
Steam releases are distributed under the terms of the Steam Subscriber Agreement.
You can get more information about Aseprite license in the FAQ.

------------------------------- exceptions.txt

Support
Introduction
Issues
Authors
Credits
License


------------------------------ Results

fixing  2
Micro12 1
Brushes 1
modules 1
Tiled   1
list    1
that    3
Lead    1
palette 5
releases    2
Custom  1
want    1
Transform   1
code    2
animations  1
data    1
Rosy    1
Gaspar  1
mode    2
mail24  1
main    1
SimpleJPC   1
some    1
Source  1
keyboard    1
color   3
nard    1
bastien 1
have    2
automatize  1
sheets  1
This    1
case    2
report  2
Redo    1
Xenowhirl   1
real    1
Games   1
layers  3
nyx8    1
modes   1
same    1
source  2
community   1
Jansson 1
Reference   1
Command 1
libraries   1
issues  2
Adigun  1
colors  5
Fhager  1
freehand    1
Shading 1
bugs    1
preview 1
Polack  1
skinning    1
licenses    1
palettes    3
features    4
Undo    1
wheel   1
Ilija   1
sequence    1
frames  2
recover 1
keys    1
There   1
draw    1
tips    1
sprite  1
undo    2
mouse   1
other   1
Twitter 1
multiple    1
drawing 2
status  1
fixed   1
EDG16   1
there   1
created 2
Reopen  1
more    1
User    1
Masia   1
aseprite    1
customizable    1
Wide    1
Other   1
Known   1
Outlines    1
ENDESGA 1
Multiple    1
Line    1
introduced  1
Status  1
capabilities    1
More    1
user    1
Splendor128 1
Discord 2
Gradients   1
EULA    1
RGBA    1
groups  1
Facebook    1
donations   1
clip    1
Steam   2
textures    1
developing  1
Cheat   1
Layer   1
Thanks  1
sprites 1
animated    1
default 1
Social  1
pixel   1
developer   1
concepts    1
tasks   1
projects    1
specific    1
party   1
RotSprite   1
theme   1
Animation   1
patterns    1
Aseprite    11
work    1
requests    1
replicate   1
Build   2
Studio  1
import  1
distributed 4
found   1
with    1
Davit   1
time    2
tools   2
Perfect 1
Fill    1
your    2
Grayscale   1
create  1
Sprites 1
linear  1
crash   1
Pixel   3
Export  1
reference   1
educational 2
things  1
PineTreePizza   1
perfect 1
Discourse   1
onion   1
facilities  1
develop 1
Capello 2
third   1
terms   1
useful  1
operation   1
Stroke  1
people  1
selection   1
aren    1
about   1
information 1
feature 2
program 3
Subscriber  1
Carduus 1
situ    1
support 5
profiles    1
Interface   1
classroom   1
Pixels  1
request 1
files   3
Quick   1
places  1
closed  1
institution 1
Developer   1
teacher 1
under   4
every   1
Guerrero    1
like    2
implemented 1
them    1
scripting   1
observable  1
license 3
sensitivity 1
Official    1
check   1
Please  1
Agreement   2
Instagram   1
idea    1
Richard 1
networks    1
official    1
algorithm   1
EDG32   1
Community   2
three   1
binaries    1
different   2
YouTube 1
Tool    1
editors 1
patches 1
ideas   1
contributed 1
open    1
contributors    1
DawnBringer 1
special 2
algorithms  1
tries   1
Javier  1
Arne    1
help    2
Hyohnoo 1
matriax8c   1
Indexed 1
separated   1
Server  2
Niklas  1
David   1
Studios 1
rotoscoping 1
includes    1
Igara   1
Sheet   1
Symmetry    1
Pressure    1
driven  1
Melentijevic    1
shortcuts   1
composed    1
from    1
organizing  1
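  • Exercise 11.2: Repeat the previous exercise, but besides ignoring words by length, the program should also read a list of words to ignore from a text file.

Already covered in the solution to the previous exercise.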
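
One thing the results show is that the exclusion is case-sensitive: exceptions.txt lists Support, yet support still appears with a count of 5. If case-insensitive matching is wanted, one possible variant lowercases every word and loads the ignore list into a set before counting. This is a sketch under that assumption, not the original solution. The helpers wordSet and countWords are hypothetical names, and the sketch works on strings instead of files so that it runs on its own.

```lua
-- Hypothetical helper: builds a set of lowercased words from a string
local function wordSet(text)
    local set = {}
    for word in string.gmatch(text, "%w+") do
        set[string.lower(word)] = true
    end
    return set
end

-- Hypothetical helper: counts words of at least minLength characters,
-- folding case and skipping anything found in the `ignored` set
local function countWords(text, ignored, minLength)
    local counter = {}
    for word in string.gmatch(text, "%w+") do
        local key = string.lower(word)
        if #key >= minLength and not ignored[key] then
            counter[key] = (counter[key] or 0) + 1
        end
    end
    return counter
end

-- Made-up input standing in for the two text files
local ignored = wordSet("Support\nIntroduction")
local text = "Support for color profiles. Official support. Pixel and pixel."
local counter = countWords(text, ignored, 4)
for word, count in pairs(counter) do
    print(word, count)
end
```

Note that folding case also merges entries such as Pixel and pixel, which changes the counts compared with the output above.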
