Thursday, June 14, 2007

Status Report for 06/07/07 to 06/13/07

Issues in using NWS
1. Hostnames
NWS’s memory service, which stores the data collected by NWS sensors, saves that data keyed by each sensor’s hostname. (A hostname is the system name returned by gethostname or ‘hostname --fqdn’.) This can be a problem when the master and the sensors are installed in different clusters and a sensor’s hostname is an internal hostname rather than a public domain name. For example, BigRed's hostnames are for internal use only and are not accessible from the Internet. In this case, NWS’s information extractor, nws_extract, cannot show correct data to a user. If the memory service is in the same cluster as the sensors, it works, but then no sensors outside that cluster can be used.
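
One quick way to spot this problem before starting a sensor is to check whether the node's hostname is fully qualified and resolves to a non-private address. The following is only a rough sketch: the function name looks_internal and the injectable resolve parameter are made up for illustration, and a public-looking address does not guarantee that a firewall lets traffic through.

```python
import ipaddress
import socket


def looks_internal(hostname, resolve=socket.gethostbyname):
    """Return True if the hostname is unlikely to be usable from outside the cluster.

    This is a heuristic, not an NWS feature: it only inspects the name and
    the address it resolves to.
    """
    if "." not in hostname:
        # A short name such as "node01" is not a fully qualified domain name.
        return True
    try:
        addr = ipaddress.ip_address(resolve(hostname))
    except (OSError, ValueError):
        # The name does not resolve (or resolves to something unparsable).
        return True
    # Private, loopback, and link-local ranges are not reachable from the Internet.
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

On a sensor node, looks_internal(socket.getfqdn()) gives a first hint about whether the name that NWS would record is usable from another cluster.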

2. Manager nodes and computing nodes
NWS’s bandwidth measurement may be incorrect depending on a cluster’s topology. For instance, NCSA’s TeraGrid cluster consists of four manager nodes, all of which have public hostnames and are therefore accessible from the Internet, and hundreds of computing nodes, whose hostnames are not known until a job is submitted. Measuring bandwidth before knowing which node will run a job can be difficult. To work around this, we need to know the cluster’s topology or management policy, which may mean contacting the help desk.

Thursday, June 07, 2007

Status Report for 05/31/07 to 06/06/07

NWS installation in TeraGrid

I've installed an NWS client on a few TeraGrid clusters and on my local machine to measure network bandwidth between them. To sum up, we need to run the following on a master node and a client node:

a. Master node
nws_nameserver
nws_memory
nws_sensor

b. Client node
nws_sensor

To start measuring, we need to submit the following command:
start_activity -f file_name name_server


To extract measured values, type the following:
nws_extract -N name_server -f time,measurement band from_node to_node
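
For repeated measurements it can be handy to call nws_extract from a script. Below is a small sketch that only wraps the command line shown above; the helper names build_extract_command and run_extract are hypothetical, and it assumes nws_extract is on the PATH and accepts exactly the arguments used in this post.

```python
import subprocess


def build_extract_command(name_server, source, destination,
                          fields=("time", "measurement"), resource="band"):
    """Build the nws_extract argument list in the form used in this post."""
    return ["nws_extract", "-N", name_server,
            "-f", ",".join(fields), resource, source, destination]


def run_extract(name_server, source, destination):
    """Run nws_extract and return its standard output as text."""
    cmd = build_extract_command(name_server, source, destination)
    # check=True raises CalledProcessError if nws_extract exits with an error.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```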


Here is some sample output for network bandwidth:

## Units: Megabits/second
## E.g.,
## 1181197165 (<= This is time) 6.148710000(<= This is bandwidth, Megabits/second)
## Network bandwidth between a.ufo.edu and b.ufo.edu
$ nws_extract -N nameserver.ufo.edu -f time,measurement,source,destination band a.ufo.edu b.ufo.edu
Time Measure Source Destination
1181197165 6.148710000 a.ufo.edu b.ufo.edu
1181197285 6.142370000
1181197405 6.135040000
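
To work with these numbers, the output can be parsed into (time, bandwidth) pairs. This is a minimal sketch that assumes the layout shown above: whitespace-separated columns with the time first and the measurement second, where anything that does not start with a number is a comment or header. The function name parse_bandwidth is made up for illustration, and other nws_extract field selections may need a different parser.

```python
from statistics import mean

SAMPLE = """\
## Units: Megabits/second
Time Measure Source Destination
1181197165 6.148710000 a.ufo.edu b.ufo.edu
1181197285 6.142370000
1181197405 6.135040000
"""


def parse_bandwidth(text):
    """Return a list of (timestamp, megabits_per_second) tuples."""
    samples = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            samples.append((int(parts[0]), float(parts[1])))
        except ValueError:
            # Comment lines, the header row, and shell prompts are skipped.
            continue
    return samples


samples = parse_bandwidth(SAMPLE)
# Seconds between consecutive measurements.
intervals = [b[0] - a[0] for a, b in zip(samples, samples[1:])]
average_mbps = mean(value for _, value in samples)
print(intervals, round(average_mbps, 5))
```

For the three samples above, the measurements are 120 seconds apart and the average is about 6.142 Megabits/second.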