Who carries the file to the slave node?
Let's first discuss the problem statement.
We have been given a task. As we know, Hadoop runs as a cluster of machines.
As you can see in the figure above, there is one Name Node (Master Node) with several Slave Nodes attached to it. There is also a Client Node connected to the Master Node. (By the way, a slave node or the name node can also act as the client, but here we have one dedicated machine acting as the Client Node.)
If the client uploads any data or file, who carries that file to the slave node? This is what we have to prove practically.
On AWS, I launched two instances: I configured the first as the master and the second as the client. Each of my teammates also launched an AWS instance and configured it as a slave. With that, the cluster was formed and we were ready to perform the task practically.
client-public-ip -> 13.235.83.58
master-public-ip -> 65.0.98.239
On the client instance I typed the Hadoop command to upload the data, but didn't run it yet.
The size of the data was 167 MB, large enough that its packets would be easy to spot in the capture.
Next, I asked my teammates to run the tcpdump command on port 50010 on their slave nodes (as we know, Hadoop transfers data through this port). They were now waiting for the data to arrive from the client side.
Then I ran the command I had typed on the client instance, which started the upload of the file.
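The exact commands are not reproduced in the text, so here is a minimal sketch of what the two of them could look like, assembled in Python so they can be printed and checked before being run. The file name `data.txt`, the target directory `/` and the interface `any` are assumptions for illustration, not the values we actually used; the port 50010 is the one mentioned above.

```python
import shlex

DATANODE_PORT = 50010  # data-transfer port named in this post


def tcpdump_command(port=DATANODE_PORT, interface="any"):
    """Capture command for a slave node.

    -n prints numeric addresses instead of host names,
    -i selects the interface ("any" is an assumption here).
    """
    return ["tcpdump", "-n", "-i", interface, "tcp", "port", str(port)]


def upload_command(local_path, hdfs_dir="/"):
    """Upload command for the client node (the path is a hypothetical example)."""
    return ["hadoop", "fs", "-put", local_path, hdfs_dir]


# Print the commands as they would be typed in a shell.
print(shlex.join(tcpdump_command()))
print(shlex.join(upload_command("data.txt")))
```

The order matters: tcpdump has to be listening on the slaves before the upload command is run on the client, otherwise the first packets are missed.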
Now let's look at the results from the three slaves.
Slave 1->
As you can see in the highlighted part, the client's public IP (13.235.83.58) is sending packets of length 60816 directly to the slave.
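Reading the capture by eye works for a screenshot, but the same check can be sketched in Python: sum the `length` field of every packet sent to port 50010, grouped by source IP. The sample lines below only imitate the usual tcpdump line layout; the timestamps, the slave address `10.0.0.5`, the client port and the sequence numbers are made up for illustration. Only the client IP and the packet length come from our capture.

```python
import re
from collections import Counter

# Matches the "IP src.port > dst.port: ... length N" part of a tcpdump line.
LINE_RE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):.*length (?P<length>\d+)"
)


def bytes_per_source(lines, port=50010):
    """Total payload bytes sent to the given port, keyed by source IP."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and int(match.group("dport")) == port:
            totals[match.group("src")] += int(match.group("length"))
    return totals


# Hypothetical sample lines (see the note above).
sample = [
    "10:00:00.000001 IP 13.235.83.58.40000 > 10.0.0.5.50010: "
    "Flags [.], seq 1:60817, ack 1, win 211, length 60816",
    "10:00:00.000002 IP 10.0.0.5.50010 > 13.235.83.58.40000: "
    "Flags [.], ack 60817, win 1000, length 0",
]

totals = bytes_per_source(sample)
print(dict(totals))
```

If the client really delivers the file itself, the client's IP should account for the data arriving on port 50010, while the master's IP (65.0.98.239) should not appear as a sender of data at all.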
Slave 2->
Slave 3->
Conclusion:
From this practical we learnt that the client first sends a request to the master node for the slaves' IP addresses. The master then provides the public IP addresses of the slaves to the client. Finally, the client uploads the file directly to the slaves at those addresses. The master never carries the data itself.
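The flow described in the conclusion can be sketched as a toy model in Python. This is not Hadoop's API: the classes `NameNode`, `DataNode` and `Client`, the address `slave-1` and the placement rule (pick the first registered slave) are all assumptions made to keep the example small. The point it shows is that the master only hands out an address, and the bytes go from the client to the slave.

```python
class NameNode:
    """Toy master: knows slave addresses, never touches file data."""

    def __init__(self, slave_addresses):
        self.slave_addresses = list(slave_addresses)
        self.requests = []  # file names the master was asked about

    def choose_slave(self, filename):
        # Toy placement rule (assumption): always the first registered slave.
        self.requests.append(filename)
        return self.slave_addresses[0]


class DataNode:
    """Toy slave: records who sent it data and how many bytes."""

    def __init__(self, address):
        self.address = address
        self.received = []  # (sender address, file name, byte count)

    def receive(self, sender, filename, data):
        self.received.append((sender, filename, len(data)))


class Client:
    def __init__(self, address, namenode, network):
        self.address = address
        self.namenode = namenode
        self.network = network  # address -> DataNode, stands in for the network

    def upload(self, filename, data):
        target = self.namenode.choose_slave(filename)  # step 1: ask the master
        self.network[target].receive(self.address, filename, data)  # step 2: send directly
        return target


slaves = {"slave-1": DataNode("slave-1"), "slave-2": DataNode("slave-2")}
master = NameNode(slaves.keys())
client = Client("13.235.83.58", master, slaves)

target = client.upload("data.txt", b"hello world")
print(target, slaves[target].received)
```

In the model, the sender recorded by the slave is the client's address, which matches what tcpdump showed on Slave 1.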