Cloud Project

Closed Posted Sep 6, 2015 Paid on delivery

Given a file system myFS on the network (say, NFS), the contents of each file in myFS are to be summarized as a set of keywords (KS).
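The brief does not say how KS is derived from a file. As a minimal sketch, assuming KS is simply the k most frequent non-stop-word tokens in the file's text, something like the following could work. The name `keyword_set`, the parameter `k`, and the tiny `STOP_WORDS` list are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are", "for", "on"}

def keyword_set(text, k=10):
    """Return up to k of the most frequent non-stop-word tokens as the file's KS."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 1)
    # Sort by descending count, then alphabetically, so ties break deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return frozenset(word for word, _ in ranked[:k])
```

Returning a `frozenset` keeps KS hashable and makes the later intersection between two files a plain set operation.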

The intersection of the KS of two files denotes a relation (FR) between them, weighted by the size of KS. Build a distributed, in-memory graph (GR) capturing FR that is fault-tolerant (parameterized by c failures, where c is small compared to the number of nodes). GR should handle queries of the following sort: all files related to a given file f, cliques with a minimum weight on each edge, the (transitive) closure of a file, etc.

Write a program using the map-reduce framework for the FSO problem. Evaluate the performance of the program for different numbers of nodes / processes in your system and for different data sizes. Plot runtime graphs for varying data sizes and number of [url removed, login to view] are not accessing a list of files. You are required to write a file system traversal routine given the root of the filesystem.
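The map-reduce shape of the graph construction can be sketched in plain Python, with no Hadoop dependency. This is an illustration under stated assumptions, not the required solution: it assumes the weight of an edge is the number of keywords the two files share, it runs in a single process, and it does not address distribution or tolerance of c failures. All function names here are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def map_phase(file_keywords):
    """Map: emit one (keyword, file) pair per keyword in each file's KS."""
    for path, ks in file_keywords.items():
        for kw in ks:
            yield kw, path

def shuffle(pairs):
    """Shuffle: group emitted values by key, as a map-reduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: each keyword shared by two files adds 1 to that edge's weight."""
    weights = defaultdict(int)
    for files in groups.values():
        for a, b in combinations(sorted(set(files)), 2):
            weights[(a, b)] += 1
    return dict(weights)

def build_graph(weights):
    """Turn weighted edges into an adjacency map: file -> {neighbour: weight}."""
    graph = defaultdict(dict)
    for (a, b), w in weights.items():
        graph[a][b] = w
        graph[b][a] = w
    return dict(graph)

def related(graph, f, min_weight=1):
    """All files directly related to f by an edge of at least min_weight."""
    return {g for g, w in graph.get(f, {}).items() if w >= min_weight}

def closure(graph, f):
    """Files reachable from f through any chain of relations (excluding f)."""
    seen, stack = {f}, [f]
    while stack:
        for g in graph.get(stack.pop(), {}):
            if g not in seen:
                seen.add(g)
                stack.append(g)
    return seen - {f}
```

Chaining the phases as `build_graph(reduce_phase(shuffle(map_phase(file_keywords))))` yields the adjacency map that `related` and `closure` query. Keying the map output by keyword means files are only compared when they share at least one keyword, which avoids comparing every pair of files directly.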

The file system is stored on the network, i.e. your traversal procedure must be a parallel program running on the nodes of the cluster, although it will start with a single process accessing the root of the file [url removed, login to view] program should distribute files if necessary.
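The traversal requirement can be sketched as a breadth-first walk in which a single coordinator starts at the root and hands each discovered directory to a worker. In this sketch, threads on one machine stand in for cluster nodes, which is an assumption made to keep the example self-contained; a real deployment would ship directory paths to remote processes instead. The names `list_directory` and `parallel_traverse` are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def list_directory(path):
    """List one directory; return (subdirectories, regular files)."""
    subdirs, files = [], []
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    files.append(entry.path)
    except OSError:
        pass  # unreadable directory: skip it rather than abort the traversal
    return subdirs, files

def parallel_traverse(root, workers=4):
    """Breadth-first traversal from root; each level is listed in parallel."""
    found, frontier = [], [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            next_frontier = []
            # Each worker lists one directory of the current level.
            for subdirs, files in pool.map(list_directory, frontier):
                next_frontier.extend(subdirs)
                found.extend(files)
            frontier = next_frontier
    return sorted(found)
```

Symbolic links are not followed, so a link cycle cannot make the walk loop forever. The returned file list is what would then be partitioned across nodes for keyword extraction.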

Hadoop

Project ID: #8413383

About the project

Remote project Active Nov 2, 2015