vrachnis/OpSysII

This is a MapReduce program for Hadoop that calculates the TF-IDF value
of every word in a set of input text files.

It was developed as part of a school project.

After running `make jar`, create the inverted index with:

    hadoop jar TfIdf.jar gr.upatras.ceid.romo.Index <input> <output> <title>
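
For readers unfamiliar with the term, here is a minimal in-memory sketch of what an inverted index is: a map from each word to the documents that contain it, with a count per document. The class `InvertedIndexSketch` and its `build` method are hypothetical names used for illustration. They are not taken from this repository, and the actual Hadoop job may structure or format its output differently.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the code in this repository.
class InvertedIndexSketch {
    // Input: document name -> list of words in that document.
    // Output: word -> (document name -> number of occurrences).
    static Map<String, Map<String, Integer>> build(Map<String, List<String>> docs) {
        Map<String, Map<String, Integer>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : docs.entrySet()) {
            String docName = entry.getKey();
            for (String word : entry.getValue()) {
                // Create the per-word map on first sight, then bump the count.
                index.computeIfAbsent(word, w -> new TreeMap<>())
                     .merge(docName, 1, Integer::sum);
            }
        }
        return index;
    }
}
```

In a MapReduce setting the inner loop roughly corresponds to the map phase (emit one record per word occurrence) and the `merge` call to the reduce phase (sum the records per word and document).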

To create the TF-IDF metrics, run:

    hadoop jar TfIdf.jar gr.upatras.ceid.romo.Tf <input> <output> <title>
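
As a reference for what the metric means, here is a small sketch of one common TF-IDF definition: term frequency (occurrences divided by document length) multiplied by the natural log of the total number of documents over the number of documents containing the word. Several variants of the formula exist, and this README does not say which one the job uses, so treat the formula as an assumption. `TfIdfSketch` and its methods are hypothetical names, not part of this repository.

```java
import java.util.List;

// Hypothetical illustration of one common TF-IDF variant;
// the Hadoop job in this repository may use a different formula.
class TfIdfSketch {
    // Term frequency: occurrences of the word divided by the document length.
    static double tf(List<String> doc, String word) {
        if (doc.isEmpty()) return 0.0;
        long count = doc.stream().filter(word::equals).count();
        return (double) count / doc.size();
    }

    // Inverse document frequency: ln(total docs / docs containing the word).
    // Returns 0 for a word that appears in no document, to avoid dividing by zero.
    static double idf(List<List<String>> docs, String word) {
        long containing = docs.stream().filter(d -> d.contains(word)).count();
        if (containing == 0) return 0.0;
        return Math.log((double) docs.size() / containing);
    }

    // TF-IDF of a word in one document, relative to the whole collection.
    static double tfIdf(List<List<String>> docs, List<String> doc, String word) {
        return tf(doc, word) * idf(docs, word);
    }
}
```

Under this definition, a word that appears in every document gets an IDF of 0, so its TF-IDF is 0 no matter how often it occurs.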

The number of reducers is hardcoded to 5, which suited my system.
You may want to change it in the source to suit your cluster.
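
If you would rather not edit the source each time, one option is to read the reducer count from an optional extra command-line argument and fall back to 5. The sketch below is only a suggestion: the fourth argument and the `ReducerCountSketch` class are hypothetical and do not exist in this repository. The resulting value would then be passed to whatever call the driver uses to set the reducer count (in Hadoop's `org.apache.hadoop.mapreduce.Job` API that is `setNumReduceTasks(int)`).

```java
// Hypothetical helper; the repository itself hardcodes the value.
class ReducerCountSketch {
    static final int DEFAULT_REDUCERS = 5;

    // Returns args[index] parsed as a positive integer,
    // or DEFAULT_REDUCERS if it is missing, non-numeric, or not positive.
    static int reducerCount(String[] args, int index) {
        if (index < 0 || index >= args.length) return DEFAULT_REDUCERS;
        try {
            int n = Integer.parseInt(args[index]);
            return n > 0 ? n : DEFAULT_REDUCERS;
        } catch (NumberFormatException e) {
            return DEFAULT_REDUCERS;
        }
    }
}
```

With `<input> <output> <title>` at positions 0 to 2, the driver would call `reducerCount(args, 3)`.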

About

Repo for the Operating Systems II project
