@arielmeragelman
Collaborator

AS A: Data analyst
I WANT TO: Optimize the current MapReduce execution
SO THAT: I can process the same or a larger amount of data in less time

Acceptance criteria:

Use another Hadoop-compatible technique covered in class

Reference documentation: https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html

Other interesting techniques: https://techvidvan.com/tutorials/mapreduce-job-optimization-techniques/


  • The code includes a test for Hadoop: it runs a word count, and it works.

  • The second commit includes a test to check how the input/output works.
    It takes the XML file included in the input folder and uses it as input, adds text to it in the map step, and passes it to the reducer, which also adds text. The test let me check how both files work and how to edit the data from the XML file.
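
For reviewers who want to picture the input/output check, here is a minimal sketch of what such a pass-through test could look like. It assumes the job runs through Hadoop Streaming with Python scripts, which this description does not confirm; the names `map_lines`, `reduce_lines`, `MAPPER_TAG`, and `REDUCER_TAG` are hypothetical and are not taken from the commits.

```python
import sys
from typing import Iterable, Iterator

# Hypothetical markers; the real test may append different text.
MAPPER_TAG = "[mapper]"
REDUCER_TAG = "[reducer]"


def map_lines(lines: Iterable[str]) -> Iterator[str]:
    """Append the mapper marker to every non-blank input line."""
    for line in lines:
        text = line.rstrip("\n")
        if text.strip():
            yield f"{text} {MAPPER_TAG}"


def reduce_lines(lines: Iterable[str]) -> Iterator[str]:
    """Append the reducer marker to every non-blank line from the map step."""
    for line in lines:
        text = line.rstrip("\n")
        if text.strip():
            yield f"{text} {REDUCER_TAG}"


STEPS = {"map": map_lines, "reduce": reduce_lines}

# Run as "script.py map" or "script.py reduce", reading stdin and writing stdout,
# which is the contract Hadoop Streaming expects from mapper and reducer programs.
if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1] in STEPS:
    for out in STEPS[sys.argv[1]](sys.stdin):
        print(out)
```

A line that reaches the final output carries both markers, which shows that it went through both steps. Because Hadoop sorts the map output before the reduce step, the lines of the XML file may come out in a different order than they went in.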

2 participants