Monday, August 31, 2009

Some definitions of Computer Networks

PACKET-SWITCHING: Packet-switched networks typically use a strategy called store-and-forward. Each node first receives a complete packet over some link, stores the packet in its internal memory, and then forwards the complete packet to the next node.

CIRCUIT-SWITCHING: It establishes a dedicated circuit across a sequence of links and then allows the source node to send a stream of bits across this circuit to the destination node.

CLOUD: Any type of network, e.g. point-to-point, multiple-access, or switched.

SEMANTIC GAP: Gap between what the application expects and what the underlying technology can provide.

GENERAL CLASSES OF FAILURES THAT NETWORK DESIGNERS HAVE TO WORRY ABOUT: First, bit-level errors (caused by electrical interference). Second, packet-level errors (caused by congestion). Third, link and node failures. If the network overcomes these three classes of failures, then it can provide reliability.

TWO NICE FEATURES OF LAYERING: First, it decomposes the problem of building a network into more manageable components. You can implement several layers, each of which solves one part of the problem. Second, it provides a more modular design. If you decide that you want to add some new service, you may only need to modify the functionality at one layer, reusing the function provided at all the other layers.

PROTOCOL: The abstract objects that make up the layers of a network system are called protocols. Each protocol defines two different interfaces: a service interface and a peer interface.

SERVICE INTERFACE: It defines the operations that local objects can perform on the protocol.

PEER INTERFACE: It defines the form and meaning of messages exchanged between protocol peers to implement the communication service.

OSI LAYERS:

Physical layer handles the transmission of raw bits over a communication link.

Data link layer then collects a stream of bits into a larger aggregate called a frame.

Network layer handles routing among nodes within a packet-switched network. At this layer the unit of data exchanged among nodes is typically called a packet.

Transport layer then implements a process-to-process communication channel. At this layer the unit of data exchanged is commonly called a message.

Session layer provides a name space for connection management, which is used to tie together the potentially different transport streams that are part of a single application.

Presentation layer is concerned with the format of data exchanged between peers.

Application layer is concerned with enabling the different applications to interoperate, using the services provided by the layers below.

DEMUX KEY (OR) DEMULTIPLEXING KEY: A protocol attaches a header to its message that contains an identifier recording the application to which the message belongs. We call this identifier the demux key.

INTERNETWORK OR INTERNET: A set of independent networks interconnected to form an internetwork.

BANDWIDTH: The number of bits that can be pushed onto the network per second.

LATENCY: The amount of time a message takes to travel from the source node to the destination node.

ADDRESS: A byte string that identifies a node.

SWITCH: Its main function is to store and forward packets.

ROUTER OR GATEWAY: A node that is connected to two or more networks is commonly called a router or gateway. It also stores and forwards packets as a switch does, but between different networks.

ROUTING: The process of systematically determining how to forward messages toward the destination node based on its address is called routing.

UNICAST: When a source node wants to send a message to a single destination node, it is called unicast.

BROADCAST: When a source node wants to send a message to all the nodes on the network, it is called broadcast.

MULTICAST: When a source node wants to send a message to some subset of the other nodes, but not all of them, it is called multicast.

MULTIPLEXING: It enables two or more transmission sources to share the same medium.

INTERLEAVING: The process of taking a group of bits from each input line for multiplexing is called interleaving.

FREQUENCY DIVISION MULTIPLEXING: Assignment of non-overlapping frequency ranges to each “user” or signal on a medium. Thus, all signals are transmitted at the same time, each using different frequencies.

A multiplexor accepts inputs and assigns frequencies to each device.

The multiplexor is attached to a high-speed communications line.

A corresponding multiplexor, or demultiplexor, is on the end of the high-speed line and separates the multiplexed signals.

SYNCHRONOUS TIME DIVISION MULTIPLEXING: The multiplexor accepts input from the attached devices in round-robin fashion and transmits the data in a never-ending pattern.

If one device generates data at a faster rate than other devices, then the multiplexor must either sample the incoming data stream from that device more often than it samples the other devices, or buffer the faster incoming stream.

If a device has nothing to transmit, the multiplexor must still insert a piece of data from that device into the multiplexed stream.

STATISTICAL MULTIPLEXING: A statistical multiplexor transmits only the data from active workstations.

If a workstation is not active, no space is wasted on the multiplexed stream.

A statistical multiplexor accepts the incoming data streams and creates a frame containing only the data to be transmitted.

To identify each piece of data, an address is included.

If the data is of variable size, a length is also included.

Creating blocks and grids in CUDA

CUDA was developed so that GPUs can perform general-purpose tasks that are traditionally performed by CPUs.

This program demonstrates how grids and blocks are created when a kernel is launched.

#include <stdio.h>
#include <cuda.h>

// Kernel that executes on the CUDA device
__global__ void square_array()
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    printf("idx %d blockIdx.x %d blockDim.x %d threadIdx.x %d\n", idx, blockIdx.x, blockDim.x, threadIdx.x);
}

// main routine that executes on the host
int main(void)
{
    int N = 9;          // length of the array
    int block_size = 4; // number of threads that fit in a block
    int n_blocks = N/block_size + (N%block_size == 0 ? 0 : 1); // number of blocks
    square_array <<< n_blocks, block_size >>> ();
    cudaThreadSynchronize(); // wait for the kernel to finish
    return 0;
}


If you execute the program, you will get the following output:
xxxxx@hpcc:~/prog$ ./test
idx 0 blockIdx.x 0 blockDim.x 4 threadIdx.x 0
idx 1 blockIdx.x 0 blockDim.x 4 threadIdx.x 1
idx 2 blockIdx.x 0 blockDim.x 4 threadIdx.x 2
idx 3 blockIdx.x 0 blockDim.x 4 threadIdx.x 3
idx 4 blockIdx.x 1 blockDim.x 4 threadIdx.x 0
idx 5 blockIdx.x 1 blockDim.x 4 threadIdx.x 1
idx 6 blockIdx.x 1 blockDim.x 4 threadIdx.x 2
idx 7 blockIdx.x 1 blockDim.x 4 threadIdx.x 3
idx 8 blockIdx.x 2 blockDim.x 4 threadIdx.x 0
idx 9 blockIdx.x 2 blockDim.x 4 threadIdx.x 1
idx 10 blockIdx.x 2 blockDim.x 4 threadIdx.x 2
idx 11 blockIdx.x 2 blockDim.x 4 threadIdx.x 3

int block_size = 4; (this is blockDim.x; each block contains 4 threads)

int n_blocks = N/block_size + (N%block_size == 0 ? 0 : 1);
This statement yields n_blocks = 3 (the number of blocks; in the output above the block index appears as blockIdx.x). Since 3 blocks of 4 threads launch 12 threads in total, idx runs up to 11 even though N is 9.

How to compile:

Write the program and save it with a ".cu" extension, e.g. xyz.cu.

Set up environment variables:
$export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/cuda/lib/

compile
$/home/cuda/bin/nvcc -deviceemu xyz.cu -o xyz

run
$./xyz