Protocols
Protocols define interactions, such as the way the programmer uses the network. They involve many elements, such as links,
switches, end-hosts and processes, and exist within a single layer of the network. A protocol is only used for one service; it defines
a service. Stacks of protocols are layered in order to complete a full operation; these layers are known as network layers.
Network Layers
OSI 7 Layer Model (A Theoretical Model)
The OSI 7 Layer model is a theoretical model, and isn't necessarily how the network is laid out. This model was developed to
be followed theoretically and consists of seven key layers, as follows.
Application
Presentation
Session
Transport
Network
Data link
Physical
Service model.
Global coordination (e.g. port 80 is a web service).
Minimise manual setup.
Minimise volume of information at any point (otherwise bottlenecks at nodes can occur).
Distribute information capture and management.
Extensibility.
Integration with all different systems (e.g. Windows/Mac/Linux).
Error detection.
Error recovery (reliability).
Scalability.
Circuit Switching
A fixed path (channel) through the network is set up with resources dedicated to the connection.
Each path through the network becomes dedicated to the first connection made until it's released.
Establishing a second link isn't possible because the resources aren't available.
This idea comes from the way telephones used to work.
Advantages: guaranteed performance.
Disadvantages: setup time, and the limitation of fixing an entire path through the network for only one connection.
Packet Switching
The data is broken up into discrete chunks and sent when the resources are available.
These pieces are normally of a fixed size.
All the bits in a piece are reserved for an end-to-end transfer.
The resource piece is idle if not used by the owning transfer.
Split up time.
At any time, the user gets all of the bandwidth.
The user gets bursts of connection time.
The bursts are so small you probably wouldn't notice the difference.
Advantages: constant speed, good for latency needs.
Units of Measurement
1 kbps is 1 ms per bit.
1 Mbps is 1 microsecond per bit.
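The per-bit times above follow from taking the reciprocal of the link rate; a quick sketch (the function name is mine):

```python
# Time per bit is simply the reciprocal of the link rate.
def time_per_bit_seconds(bits_per_second):
    return 1.0 / bits_per_second

print(time_per_bit_seconds(1_000))      # 1 kbps -> 0.001 s, i.e. 1 ms per bit
print(time_per_bit_seconds(1_000_000))  # 1 Mbps -> 1e-06 s, i.e. 1 microsecond per bit
```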
Network Applications
Examples include mail clients, web browsers, video games and there are many more!
A network application is an application that has parts running on different computers. They communicate over the network.
They run at the edge of the network.
Architecture
Client-Server
This is the architecture web browsers use.
There's a server and a client!
Server:
Always on. Has a permanent IP address so that it can always be found. There may be multiple IP addresses for a popular
site such as Google to improve performance.
Client:
Has a dynamic IP address. The client may not always be at the same address. Clients communicate through a server.
Advantages: It's easy to find the information because the server never changes.
Disadvantages: Not very scalable.
Peer-to-Peer (P2P)
No server. Instead, a collection of machines that change over time. Peers are intermittently connected and may have a
different address for each connection. Highly scalable.
Disadvantage: It's hard to manage because it can be hard to know where the information is.
Hybrid of Client-Server and P2P
Voice-over-IP and instant messaging use a central server that registers clients' IP addresses and then links clients together. The
server sets up a P2P connection.
End-Point Implementation of Architecture
Network applications run as processes on an operating system. The ends (one at each client) communicate by exchanging
messages. Messages are sent and received via a socket.
A socket is an abstraction. The socket sits between the process (application) and the transport service implementation. The
application can only use this socket (set up by the operating system) and can't change it. The operating system provides an
API (Application Program Interface) which allows a programmer of the application to make use of the socket through the
operating system.
One end sits in a waiting state for the connection (i.e. waiting for a message from the other system).
End-point identification
The IP address tells us which computer we're trying to connect to.
The port number tells us which application we want to give the information to. It is implemented as a 16-bit number.
e.g. HTTP server: port 80; mail server: port 25.
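The socket abstraction and (IP address, port) end-point identification described above can be sketched with Python's standard socket module. This is a loopback example, so the port is picked by the OS rather than being a well-known one like 80:

```python
# A server process waits on a port; a client identifies the end-point by
# the (IP address, 16-bit port number) pair and connects through a socket.
import socket
import threading

def serve_once(server_sock):
    conn, _addr = server_sock.accept()  # wait in a listening state
    data = conn.recv(1024)              # read the client's message
    conn.sendall(b"got: " + data)       # send a reply back
    conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))           # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]          # the 16-bit port number in use
t = threading.Thread(target=serve_once, args=(server,))
t.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))     # IP address + port = end-point
client.sendall(b"hello")
reply = client.recv(1024)
client.close()
t.join()
server.close()
print(reply)  # b'got: hello'
```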
Quality of Service Parameters
Data loss: we care for files or emails, but maybe not for video streaming.
Timing: we don't care for file transfers, but we do for VoIP.
Throughput: we care for video streaming to get a decent image.
Security: encryption etc. may be required.
File transfer, email, web and instant messaging are elastic applications: they just take advantage of the bandwidth that's there.
Video games are loss-tolerant, require a few kbps upwards of throughput and have a time sensitivity of a few seconds.
Internet Transport Service Models
TCP
Reliable
Can recover errors.
Has delays in recovering errors.
Example uses: E-mail, remote terminal access, web, file transfer, streaming media.
UDP
Unreliable.
No error recovery.
Example uses: Streaming multimedia, VoIP.
Application Protocols
RFCs (Requests for Comments) are used to define worldwide protocols (e.g. email).
Proprietary implementations: the application just decides (e.g. Skype).
Application Data
The application source and destination must make sure they know and have the same interpretation of the data.
The applications also need to know what encoding is being used.
Compression is a form of encoding that minimises the size on the cable.
Understanding Data
Implicit Typing: the application at each end has to know what format the data will come in.
Explicit Typing: the data has flags in it (typically 1 bit) which tell the application what's coming next. The application reads
flag/data/flag/data etc.
Data Conversion
The data may need to be converted, for example if the size of an int is different for different applications on different systems
(this is an application-layer level of conversion).
Heterogeneous systems: different operating systems working together may have to be allowed for. For example, one
messaging client may run on Mac OS X whilst another client may be running on Linux; the two have to be able to
communicate together, and this is an application-level issue.
Canonical approach: the same representation is used across the cable. The source converts to this representation if it needs to, and the
destination translates from this representation if it needs to. For example, all integers could be converted to, say, 16 bits before
being transmitted, and then converted back.
Some information can be sent at the start of the transmission that identifies what needs to be converted (for example how to
convert ints). This information only needs to be sent once.
Binary attachments for emails are converted into 7-bit ASCII values. This encoding takes a series of 3 bytes and converts it
into 4 ASCII characters. This is Base64 encoding.
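Python's standard library shows this encoding directly: three bytes of binary data become four ASCII characters.

```python
import base64

raw = bytes([0x01, 0x02, 0x03])   # a series of 3 bytes of binary data
encoded = base64.b64encode(raw)   # converted into 4 ASCII characters
print(encoded)       # b'AQID'
print(len(encoded))  # 4
```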
Application Extensibility
Communicating between different versions of the application can cause problems. You want version 2 to be compatible with
version 1. If this is possible then application extensibility is achieved.
Case Study: Telnet
Web caching
A cache keeps a copy of web items, which avoids re-fetching from the server.
Reduces the amount of connections to the server.
Very cheap compared to upgrading the network speed.
If the version on the server has changed, then the information in the cache is no longer valid.
The server can be asked if the version it has is different to the one in the cache.
Response header fields also control caching.
Electronic Mail
Mail servers:
Dedicated mail servers are used to hold all the users' messages.
There's an outgoing queue of messages.
Mail must be sent to the correct server, and must then go into the correct mailbox on that server.
The server often acts like a client in sorting these things out.
User agents:
Accesses the correct mailbox on the correct server.
There are a large number of sending protocols:
SMTP (Simple Mail Transfer Protocol)
Three phases of transfer:
1. Handshaking
2. Transfer of messages
3. Closure.
All messages must be 7-bit ASCII text.
It's about allowing your message to move backwards and forwards through the system (it's like the envelope).
The end of the message is signalled by a single full-stop on its own line (because a blank line may appear within the email message).
Basic email format:
Header lines (to, from, subject etc.)
These are followed by a blank line and body (the main message of the email and attachments via MIME).
MIME (Multipurpose Internet Mail Extensions):
Allows non-ASCII character sets and file attachments to be sent via the ASCII-encoded e-mail systems.
Additional header lines are defined in order to tell the client to interpret the data in the message in a different way.
Content types: discrete types (e.g. image/gif, text/plain); application types (the subtype follows application, e.g.
application/word, meaning that that application is responsible for interpreting this section); and the multipart type, which means
that the body can contain multiple types.
MIME Encodings:
Only 7-bit ASCII values can be transmitted, so a message that is a straight text message doesn't need any translation.
However, non-ASCII characters have to be translated into ASCII values and then translated back at the other end.
Base64 is the main encoder. As previously mentioned, this encoding takes groups of 3 bytes and translates them into 4 ASCII
characters.
Server access protocols for e-mail:
POP
The user agent communicates with the server and downloads all the messages.
So you can really only use one client, as the emails are being downloaded to this single client.
IMAP (Internet Mail Access Protocol)
More features but more complex.
The client manages the emails on the server, as opposed to getting a copy.
The messages stay on the server, so whatever client you use you always get the same state of your messages.
HTTP (e.g. Gmail, Hotmail etc.)
Probably still uses POP and IMAP underneath.
Web Address Name Lookup
Humans can easily remember names e.g. ebay. When you use a name, the numerical address is looked up.
Characterised as middleware, as name lookup is part of an application protocol. It's a service for users so they can just use
names.
Domain Name System (DNS) provides the mapping.
Resolution is the procedure that performs the mapping. The name server is an implementation of the resolution.
DNS
Hierarchical namespace for internet objects (e.g. .co.uk, .com, .ac.uk are different hierarchies).
Names have to be unique. But not worldwide unique, just within the hierarchy.
Decentralised: because there are a lot of name mappings out there, the looking up of these names has to be optimised.
It's a decentralised database (but not a real database).
Broadband uses the same copper wire but can now go up to 20 Mbps.
Fibre optic: much more bandwidth still.
Data networks such as ethernet, and backbones to ISPs.
Broadcast television is now converging to broadband multi-service networks. HD television shows are becoming available
over a broadband link.
Media is delay sensitive. When playing media from a remote source, you have two options: either you download the entire
media piece and play it back, or you ensure that enough of the data has been transferred before playing it back.
Media is loss tolerant. You don't notice small irregularities if the data received isn't exactly the same as what was sent, although
there is a threshold point beyond which the user starts to notice. There are methods of testing what this threshold of loss
toleration is (such as investigating the percentage of randomised pixel data in an image before the quality of the image
becomes unsatisfactory).
Networks That Can Allow Multimedia Access
Plain Old Telephone Service (POTS)
Phones are connected via a copper wire.
The circuits were switched in order to connect two ends and create a contiguous circuit from the source to the destination.
The bandwidth is also very limited and can only transmit around 4 kHz of sound (human voice and music require
approximately 20 kHz to capture the relevant information).
Cellular Mobile Phones
All digital.
Information can be compressed further/better.
The phone makes the audio-to-digital data conversion, compresses it and sends it to a Base Transceiver Station (BTS),
which then passes it via a standardised Abis Interface to a Base Station Controller (BSC). The Base Station Controller then
uses another standardised interface to send the information to a Mobile Switching Centre (MSC). SS7 (Signalling System 7) is
used to set up connections and tear them down. The information continues to move across the backbone of the phone network
using standard IP and networking.
2G phones can't use VoIP. GSM (the phone standard) uses a time-sharing system with eight slots per transmission
channel (each user gets 1/8th of the transmitter at a time, and only gets this time with the transmitter in bursts with long pauses
between).
3G uses a bandwidth dependent on what other users are doing, and you don't have to find a slot on the transmitter because
you can always connect. The bandwidth is greater, still not great, but good enough for the size of the screen you're using.
The Internet
Broadcast Television
A much more effective mechanism for broadcasting the same data to a wide range of devices. Much better than the internet,
because if the same number of people tried to get hold of the data via the internet, the servers would become overloaded.
How To Get Media
A simple approach to getting media
Audio or video is stored in a file.
Files are then transferred as HTTP objects embedded in TCP. The client then receives the data and passes it to the player.
This is not streamed; it's just getting a simple file! There's a long delay before you can play it back because the entire file has
to be downloaded to the client's local machine.
Streaming live multimedia
Client requests the media data stream.
Simple scheme: for every group of n chunks of data, send out n + 1 chunks. The additional chunk is the XOR of the original n
chunks. If data is lost then the XOR can be used to work out which bits should be there. Multiple packets can be XOR-ed
together one after the other to achieve this extra XOR packet for several packets. This only works when one packet is lost.
However, this mechanism adds to the play-out delay because the sender is sending out an extra packet every n packets.
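The XOR scheme above can be sketched in a few lines (fixed-size chunks are an assumption made for illustration): for n data chunks, one extra parity chunk is the XOR of all of them, and XOR-ing the survivors with the parity reconstructs a single lost chunk.

```python
# XOR-parity recovery: works only when exactly one chunk is lost,
# and assumes all chunks are the same length.
def xor_chunks(chunks):
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)

data = [b"AAAA", b"BBBB", b"CCCC"]   # n = 3 chunks
parity = xor_chunks(data)            # the extra (n + 1)th chunk

# Suppose the second chunk is lost in transit:
received = [data[0], data[2], parity]
recovered = xor_chunks(received)
print(recovered)  # b'BBBB'
```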
Another idea is to send two versions of the same media, a compressed lower quality and a higher quality. When the network
begins to struggle with load, then the lower quality stream can be switched to and vice versa.
Interleaving: the data is divided up and split into n packets. These packets are then mixed up to form a new set of packets,
interleaved with the information. The packets are then reassembled with the correct information in each at the receiving end. If
a packet is lost then only 1/n of the information is lost, spread across the data, so it's not as noticeable.
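A minimal sketch of interleaving (assuming n divides the data length): each packet carries every nth unit of the data rather than a contiguous run, so losing one packet removes scattered units instead of a block.

```python
# Column-wise interleaving: packet i carries bytes i, i+n, i+2n, ...
def interleave(data, n):
    return [data[i::n] for i in range(n)]

def deinterleave(packets):
    n = len(packets)
    out = bytearray(sum(len(p) for p in packets))
    for i, p in enumerate(packets):
        out[i::n] = p           # put each packet's bytes back in place
    return bytes(out)

data = b"ABCDEFGHIJKL"
packets = interleave(data, 4)
print(packets)                  # [b'AEI', b'BFJ', b'CGK', b'DHL']
print(deinterleave(packets))    # b'ABCDEFGHIJKL'
```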
Routers
Providing Quality of Service
Quality of service is about trying to find the best service over the resources you have. Packets are divided into different classes
and isolated. These classes are then allocated resources. Fixed, non-sharable bandwidth is allocated to the classes. At a
router, packets can arrive in any order; they go into a queue. If the queue becomes full then packets have to be dropped.
Particular classes of packets can have a higher priority for being dropped.
Scheduling Policies for Routers
Prioritising: assigning priorities using classes to different routes. Classes with a higher priority will be forwarded first. This may
not be fair on some classes.
Round robin: going round all the classes and forwarding a packet from each. However, if there's congestion then classes such
as voice data may arrive too late.
Weighted fair queuing: different classes of data coming in are divided into a different set of queues, so they each get a fixed
proportion of the bandwidth.
Policing Traffic
Traffic arrives in bursts. The aim of a policing mechanism is to limit the traffic to three set parameters:
1. Long-term average rate: the number of packets that can be sent per time unit.
2. Peak rate: the maximum number of packets that can be sent at one time, in packets per minute. This must support the long-term
average rate above.
3. Maximum burst size: the maximum number of consecutively sent packets.
A token bucket is used to throttle and limit the burst size and average rate. Tokens are added to the bucket periodically. In order
for a packet to pass through the router it must obtain a token from the bucket. This consequently means that if too little data is
provided then the token bucket will fill up with tokens, and if too much data is provided then
the token bucket will be emptied of tokens. If there are no tokens then the data has to wait for tokens to become available
before it can continue through the network. If there are lots of tokens then a burst of data that is received is just forwarded
through.
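The token-bucket mechanism above can be sketched with simulated time (names and the one-token-per-packet rule are my assumptions): tokens arrive at a steady rate up to a capacity, which bounds both the average rate and the maximum burst size.

```python
# Minimal token bucket: `rate` tokens per tick, at most `capacity` stored.
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate          # long-term average rate
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket

    def tick(self):
        # Periodic token arrival; the bucket never overflows.
        self.tokens = min(self.capacity, self.tokens + self.rate)

    def allow(self):
        # A packet passes only if it can obtain a token.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False              # otherwise it must wait for a token

bucket = TokenBucket(rate=1, capacity=3)
burst = [bucket.allow() for _ in range(5)]  # a burst of 5 packets at once
print(burst)         # [True, True, True, False, False] - burst capped at 3
bucket.tick()
print(bucket.allow())  # True - one token has since arrived
```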
Content Distribution Networks (CDNs)
Content Replication
Origin server with all the original data -> distributes the information to multiple systems spread around -> accessed by the user.
DNS can be used to replace (redirect) a query for a document to a more local query based on your current location. This
system can also be used to determine the data to be sent based on your location (such as local language).
The CDN creates a map indicating distances from leaf ISPs and CDN nodes. It picks the closest CDN node and redirects a user's query
accordingly. The traditional client-to-server approach is inefficient for mass downloads; the servers can't handle the vast quantity of users
demanding the data.
Peer-2-Peer
The client sends out a query which searches for the file on other machines recursively until one system replies that it has the
information. The same document can then be sent to multiple systems so there are multiple locations to get the data from.
The load is now distributed and much lower. There's no heavy load on one single server.
BitTorrent uses a swarm of machines (any machine currently connected to the BitTorrent facility) and a tracker. When you
request a document, the tracker asks the swarm for the information and gets a section of the data from each machine. The
information is now coming from different machines, so it's even more distributed. However, the upload speed of a user's client is
often slower than that of a dedicated server.
Stop-and-Wait
Explained using examples
1.
Sender: sends data.
Receiver: successfully receives the data.
Receiver: sends back an acknowledgement.
Sender: receives the acknowledgement.
2.
Sender: sends data.
The data is lost.
Sender: timeout expires.
Sender: sends data again.
Receiver: successfully receives the data.
Receiver: sends back an acknowledgement.
Sender: receives the acknowledgement.
3.
Sender: sends data.
Receiver: successfully receives the data.
Receiver: sends back an acknowledgement.
The acknowledgement control packet is lost.
Sender: timeout expires.
Sender: sends data again.
Receiver: successfully receives the data and may make a duplicate copy (not desirable).
Receiver: sends back an acknowledgement.
Sender: receives the acknowledgement.
Sequence numbers are used to prevent duplicates. Only one bit is needed if an acknowledgement is always required, which
alternates on each new data send.
Stop-and-wait is not very good because when data could be being transferred, the transmitter is instead just waiting for an
acknowledgement.
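The waste can be quantified: the sender transmits for one packet time, then idles for a round trip waiting for the acknowledgement. A sketch with illustrative numbers (assumed, not from the notes):

```python
# Stop-and-wait utilisation = transmission time / (transmission time + RTT).
def utilisation(packet_bits, rate_bps, rtt_s):
    t_trans = packet_bits / rate_bps      # time spent actually sending
    return t_trans / (t_trans + rtt_s)    # fraction of time the link is busy

# Assumed numbers: 8000-bit packet, 1 Mbps link, 30 ms round trip time.
u = utilisation(8000, 1_000_000, 0.030)
print(round(u, 3))  # 0.211 - the link sits idle most of the time
```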
So instead, sliding windows are used.
Sliding Windows
Multiple packets are sent, and only acknowledged every so often. The window size is the number of packets that can be sent
before the sender requires an acknowledgment. So if the window size is, say, 8, then every 8 packets the sender requires an
acknowledgement that lets it know that the previous 8 packets have been received, and the window then moves along.
If the receiver gets a packet that is out of order compared to the one it was expecting, then it can't acknowledge it because it hasn't
yet got the one before it. If the packet it was expecting arrives late, then the receiver can just acknowledge the most recent
in-order packet it has. If the packet was lost, then the sender's timeout will expire, and it'll send the whole window of packets
again, including the packets the receiver already has. This is referred to as go-back-N (GBN). However, it is not desirable
to have to resend everything within the window.
Instead, a NACK could be sent for the packet that was missing, which arrives before the sender's timeout expires. The sender
can then resend only the packet that was NACK-ed, which reduces the amount of retransmission required.
Or, a selective acknowledgement could be used. This means sending an acknowledgement for every packet received. When the
sender's timeout expires, the sender only resends the packets that it doesn't have an acknowledgement for.
Using sliding windows, 100% network utilisation can be achieved.
Sequence numbers are implemented using a fixed size integer. This number is minimised in order to minimise the header
overhead. A 3 bit sequence number could be used (which means the packets will count from 0 to 7 round and round). In this
example a window size of 8 would be ideal.
Issue: if an acknowledgment using selective acknowledgements is lost, then the sender could send duplicate packets to the
receiver and the receiver would treat them as new.
The solution is to use a maximum window size that is half the maximum sequence number, so that the first set of
acknowledgements uses a different set of numbers to the second set of acknowledgements. These sequence number ranges
then alternate in a binary fashion.
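The half-the-sequence-space rule can be checked with a small sketch: with k-bit sequence numbers there are 2**k values, and if the window is the full sequence space, the next window reuses exactly the same numbers, so a retransmitted old packet is indistinguishable from a new one.

```python
# Do two consecutive windows share any sequence numbers?
def windows_overlap(seq_bits, window):
    space = 2 ** seq_bits
    first = {i % space for i in range(window)}               # first window
    second = {i % space for i in range(window, 2 * window)}  # next window
    return bool(first & second)

print(windows_overlap(3, 8))  # True  - window 8 with 3-bit numbers: ambiguous
print(windows_overlap(3, 4))  # False - window 4 (half of 8): safe
```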
Transmission Control Protocol (TCP)
TCP is a Connection-Oriented Protocol
This means that there's a state that's kept at each end of the communication. TCP is a reliable protocol because it provides a
service that handles any errors that can occur, using error recovery techniques. TCP uses a finely tuned congestion control
mechanism: it reduces the amount of data being sent when it detects that the network is congested. The data is always
delivered in the order it's sent because an acknowledgement for each packet is required. TCP uses buffers to send and receive
the information before passing it to the application.
A connection identifier is used in the headers, made up of the source and destination IP addresses and port numbers. The application
passes the stream of bytes it wants sending to TCP, which puts it into packets and a send buffer. TCP can then choose how
much to send from the buffer depending on the network congestion. When they're received, the packets are put into a receive
buffer, from which the application can decide when it wants to read.
However, a key disadvantage is that if one end crashes, then the state at that end is lost. This means that there's no way to
perform a recovery; the connection will have to be set up again and everything will have to be sent again.
TCP is Reliable
TCP's reliability is achieved using a sliding window, go-back-N (cumulative ACKs), sequence numbers (for bytes, not
segments) and a single retransmission timer. Sequence numbers are used for every single byte rather than every segment.
This is because segmented data can cause problems with TCP's error recovery. If segmented data is used for transfer,
then the sender will send whatever segments are in its send buffer. If a retransmission is needed then there may now be more
information in the send buffer (because new data to be sent has been added after the data the receiver hasn't yet received).
The retransmission will send all the data in the buffer, and the receiver will get more than it was originally supposed to all at
once. So if the sequence number counts segments, then the numbers would be meaningless, as this segment of data is now
different to what it should have been (because it is effectively a combination of two segments). If each byte is numbered, then
the receiver can tell exactly what it's receiving, and duplicate information can be identified.
A TCP acknowledgement acknowledges by stating the byte sequence number that it next expects to receive; that's just how it
works!
Sequence numbers are 32 bits long. Only the sequence number of the first byte in the segment is needed; the other sequence
numbers in the segment are implicitly calculated from the first. Sequence numbers in each direction of the transmission are
independent of each other.
The value of the retransmission timeout can't be too small (retransmission will occur too often), nor too large (excessive delays
before a retransmission takes place). An appropriate value will relate to the round trip time. Because the round trip time varies
depending on the current level of traffic going through the network, an adaptable algorithm is needed. This algorithm can
determine the current round trip time and adjust the timeout accordingly. There are three main algorithms for this:
1. Basic algorithm: set the timeout to twice the round trip time (this gives enough margin). An average round trip time (and
thus timeout) is taken, which is updated every time a packet is received (because the round trip time can be calculated
every send-receive-acknowledge cycle). However, the problem is that duplicate packets being retransmitted can be received,
which will make the average less representative.
2. Karn/Partridge algorithm: only measures the round trip time for non-retransmitted segments, in order to work around the
issue outlined for the basic algorithm.
3. Jacobson/Karels algorithm: more suited to communications where the round trip time is more varied and an average isn't
appropriate. This algorithm takes into account the variation in the round-trip time (the jitter).
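The Jacobson/Karels idea can be sketched as two exponentially weighted averages, one of the RTT and one of its deviation, with the timeout set from both. The alpha/beta constants and the "mean plus four deviations" rule are the commonly quoted choices, assumed here rather than taken from the notes:

```python
# Jacobson/Karels-style RTT estimation (sketch).
class RttEstimator:
    def __init__(self, first_sample):
        self.srtt = first_sample         # smoothed RTT estimate
        self.rttvar = first_sample / 2   # smoothed deviation (the jitter)

    def update(self, sample, alpha=0.125, beta=0.25):
        # Update the deviation against the old mean, then the mean itself.
        self.rttvar = (1 - beta) * self.rttvar + beta * abs(sample - self.srtt)
        self.srtt = (1 - alpha) * self.srtt + alpha * sample

    def timeout(self):
        # Timeout = mean + 4 * deviation: jittery paths get longer timeouts.
        return self.srtt + 4 * self.rttvar

est = RttEstimator(first_sample=0.100)        # 100 ms first measurement
for sample in [0.100, 0.120, 0.300, 0.110]:   # a spike inflates the variance
    est.update(sample)
print(round(est.timeout(), 3))  # 0.365
```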
TCP Data Flow Control
The receiving buffer has a finite size. If the data is arriving more quickly than the application reads from it, then the buffer will fill
up. When the buffer becomes full, the receiver can't acknowledge any more data that it receives (because it can't store it
anywhere, so it'll just lose it). The sender's timeout will expire and a retransmission will take place. Because the receiver's
buffer may still not be empty, this loop will repeat and the transfer will waste time sending the same thing again and again until
the buffer is emptied.
A mechanism is required so that the receiver can control how much information the sender is sending. This mechanism is
known as flow control.
The sliding window size is not fixed. In the acknowledgment, the receiver lets the sender know how much data it's prepared to
receive. The window size can be set to 0 (so no more data is transmitted until the buffer is freed, at which point the receiver will
re-acknowledge the last byte with a non-0 window size). The flaw is that the re-acknowledgment packet could be lost, which
would cause a deadlock. The solution is that once a window size of 0 is set at the sender, a much longer timeout is used (such
as two minutes) before sending the next segment as normal.
Because the window size is expressed in a 16-bit value, the maximum window cannot exceed 64 KB. This means that for very
fast connections the network cannot be fully utilised, as the sender will be waiting for the acknowledgment after sending its
window. One solution is to use multiple TCP connections. Another solution would be to use a multiplier factor of, say, 10 for the
sequence number and window size, so that the window can go up to 640 KB.
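The limit above follows from the fact that at most one window of data can be in flight per round trip, so throughput is bounded by window / RTT. A sketch with an assumed 100 ms round trip:

```python
# One window of data per round trip bounds a single connection's throughput.
def max_throughput_bps(window_bytes, rtt_s):
    return window_bytes * 8 / rtt_s

# 64 KB window, 100 ms RTT (the RTT is an assumed illustrative figure).
bps = max_throughput_bps(64 * 1024, 0.100)
print(round(bps / 1_000_000, 2))  # 5.24 Mbps, however fast the link itself is
```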
TCP Connection Control
Setup is asymmetric, as one side is active and the other side is passive. The teardown is symmetric, as both sides must
perform a close.
A three-way handshake is used to establish a connection:
A sends a packet across to B, setting the SYN control flag and an initial sequence number.
B acknowledges this packet, sending its own initial sequence number by setting the SYN control flag.
A acknowledges B's sequence number; the connection is then established.
In the reality of networks, data can become lost and re-emerge a lot later. If a first TCP connection is set up and torn down
whilst information from this connection is still travelling through the network, and a new connection is set up using the same
addresses and port numbers, then when the old data is finally received it's treated as data for the new connection (if it's within the
window size expected). If the sequence number is not set to zero every time, then the sequence numbers in the window the
receiver is expecting will be different, so the old data will be thrown away when it's received, because it will have a sequence
number outside the window of sequence numbers that were expected. This is why TCP negotiates an initial sequence number.
TCP control is defined in a state transition diagram, which gives a structured method of laying out the protocol.
TCP Congestion Control
TCP implements an algorithm that attempts to detect whether congestion is occurring within the network. If an acknowledgment is
received, it determines that there was enough space for that segment to go through, so it increases the amount of data it
sends. When a segment gets lost, it determines that there must be congestion, so it decreases the rate at which it sends data.
The increase occurs slowly, whereas the decrease in the data rate occurs very sharply, so a graph showing the amount
of data TCP is transmitting would look like a sawtooth. When TCP first connects, the transfer would be very slow under the rules
of congestion control just outlined. So, a different approach is used when the connection is first established. As soon as the
sender receives an acknowledgement it doubles the rate at which it sends data (instead of the normal, slow increase) until a particular
threshold is met, at which point the sending rate increases using the standard slow increase.
TCP Fairness
TCP tries to divide the connection bandwidth between the number of TCP connections currently active. But, if an application
opens up multiple TCP connections, then that application will get an unfair share of the connection (TCP will have no
knowledge of this).
Inter-Networking
A collection of networks (each with their own address scheme, service model etc.) can be made to look like one huge, single
network. The Internet Protocol (IP) manages to achieve this.
Differences between physical networks:
Service model: connection-oriented or connectionless.
Network level protocols being used: IP, IPX, AppleTalk.
Addressing: flat (no structure to the address; you can't tell anything about the address from the bits) or hierarchical
(structured address).
Broadcasting and multicasting: whether or not they're supported.
Maximum packet size.
Quality of service: supported or not.
Error recovery: performed or not.
IP has to work around all these differences and create a uniform network where it doesn't matter what the individual attributes
of the linked networks are. IP introduces a secondary, universal, logical address space which maps physical addresses to logical
addresses. This gives every location a unique universal identifier across the planet.
Different sizes of physical packets can be made uniform by setting a packet size. The smallest maximum packet size depends on the path
being followed through the network, which could vary.
The packet formats can be different. The packets could be translated at each piece of technology, but this is not always
possible, so a new universal packet format may need to be introduced in order to encapsulate packets.
Broadcasting can be implemented by sending messages to every single computer on the network (multiple unicasts).
Service Model
This is provided to the transport layer and the other layers that exist above.
Internet Protocol (IP)
The only realistic option to achieve worldwide host-to-host delivery.
Runs on all hosts and routers in the network.
Service model
Connectionless: it places minimal demands on the underlying networks. Partly as a result of this minimal demand, it will
work on any network technology, and has worked on all technologies since it was developed in the 1970s.
Unreliable: no error recovery. Best-effort delivery.
A unique address is needed, with global coordination of these IP addresses. All the numbers are ultimately controlled by the
Internet Corporation for Assigned Names & Numbers (ICANN), which delegates blocks to Regional Internet Registries (RIRs).
Universal Packet Format (Datagram)
This has sufficient information to reach the destination and encapsulates the data from TCP/UDP etc. This is then sent across
the network via forwarding, which is a distributed sequence of decisions to make the next hop. The contents of the packet
header are as follows:
The first few bits contain the IP protocol version (typically v4 and v6 are in use at the moment).
There's a 32-bit source IP address.
There's a 32-bit destination IP address.
Options can hold a timestamp, record of the route taken, or a specified list of routers to take.
In order to stop an IP packet going round and round the network, the packet header also has a time-to-live which is
decremented each time it passes through a router. Once it gets down to 0, the packet is deleted and the source IP address is
notified.
There's information about the upper-layer protocol to use.
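The fields above can be sketched by unpacking the fixed 20-byte IPv4 header with Python's struct module (a minimal sketch; real headers may also carry the options field described above):

```python
import struct

def parse_ipv4_header(data: bytes) -> dict:
    """Parse the fixed 20-byte IPv4 header (a sketch; options, if present,
    follow the fixed part and are not handled here)."""
    version_ihl, tos, total_len, ident, flags_frag, ttl, proto, checksum, src, dst = \
        struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": version_ihl >> 4,           # 4 or 6
        "header_len": (version_ihl & 0x0F) * 4,
        "ttl": ttl,                            # decremented at each router
        "protocol": proto,                     # upper-layer protocol (6 = TCP, 17 = UDP)
        "src": ".".join(str(b) for b in src),  # 32-bit source address
        "dst": ".".join(str(b) for b in dst),  # 32-bit destination address
    }

# A minimal hand-built header: version 4, TTL 64, upper-layer protocol 6 (TCP)
hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
                  bytes([192, 168, 0, 1]), bytes([10, 0, 0, 1]))
print(parse_ipv4_header(hdr))
```

The `!` in the format string gives network (big-endian) byte order, which is how the header travels on the wire.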
Datagrams must be encapsulated into a physical frame to be transmitted across the physical link layer. The logical IP address
must be mapped to a physical (MAC) address. The MAC address isn't known in advance; instead, the Address Resolution
Protocol (ARP), associated with Ethernet/IEEE 802.3, is used. This is a broadcast message which asks for the physical address
corresponding to an IP address. The system with that IP address responds with its physical address. In order to reduce the
amount of traffic caused by this protocol, the responses are cached, and nodes along the network will also cache queries they
see.
If a packet is too large for a network it's about to enter, it is split up. There are fields in the header of the datagram that allow
the packets to be reassembled, which occurs at the final destination. The more a packet is split up, the more unreliable its
delivery becomes.
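The split-and-reassemble behaviour above can be sketched as follows (the function names and the offset representation are illustrative, not the real IP header encoding):

```python
def fragment(payload: bytes, mtu: int) -> list[tuple[int, bytes, bool]]:
    """Split a payload into (offset, chunk, more_fragments) pieces, as IP
    fragmentation does when a datagram exceeds the next network's MTU."""
    pieces = []
    for off in range(0, len(payload), mtu):
        chunk = payload[off:off + mtu]
        more = off + mtu < len(payload)   # "more fragments" flag
        pieces.append((off, chunk, more))
    return pieces

def reassemble(pieces) -> bytes:
    """Reassembly happens only at the final destination, using the offsets."""
    return b"".join(chunk for _, chunk, _ in sorted(pieces))

data = b"x" * 2500
frags = fragment(data, 1000)
print(len(frags))                  # 3 fragments for 2500 bytes with a 1000-byte MTU
print(reassemble(frags) == data)   # True
```

Because reassembly needs every fragment, losing any one of them loses the whole datagram, which is why heavy fragmentation makes delivery less reliable.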
Dynamic Host Configuration Protocol (DHCP) is used to configure a host with an IP address, so that information such as the
default router, netmask and DNS servers are known. It's a client-server protocol. When a client starts up, it broadcasts a
request for the DHCP configuration information. The server then sends back the details of the configuration. The DHCP server
is often part of the routing hub. Clients can use static or dynamic address assignment.
An organisation may want to restrict access to its network. This can be done by laying your own cable, although this can be
very expensive. Instead, a virtual private network can be implemented, which creates a secure tunnel between two ends. All
the data has to pass through this tunnel.
Network Address Translation (NAT) allows multiple computers to share a unique worldwide address. Local addresses, such as
those starting 192.168, are known as private addresses. Packets from these addresses that want to go to another network
have to go through a NAT box, which maps the address and port number to a worldwide address and vice versa. To the private
network, the NAT box looks like a router.
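A toy sketch of the NAT box's mapping table (the class name, addresses and port numbers are made up for illustration):

```python
import itertools

class NatBox:
    """Toy NAT: maps (private_ip, private_port) pairs to ports on a single
    worldwide address, and back again for replies."""
    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self.next_port = itertools.count(50000)  # arbitrary starting port
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, priv_ip: str, priv_port: int):
        """Rewrite an outgoing packet's source to the worldwide address."""
        key = (priv_ip, priv_port)
        if key not in self.out:
            port = next(self.next_port)
            self.out[key] = port
            self.back[port] = key
        return self.public_ip, self.out[key]

    def inbound(self, public_port: int):
        """Map a reply arriving on the public port back to the private host."""
        return self.back[public_port]

nat = NatBox("203.0.113.7")
pub = nat.outbound("192.168.0.2", 1234)
print(pub)                   # ('203.0.113.7', 50000)
print(nat.inbound(pub[1]))   # ('192.168.0.2', 1234)
```

The two dictionaries are the "and vice versa" of the text: one direction rewrites outgoing packets, the other routes replies back to the right private host.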
Internet Protocol Version 6 (IPv6)
The motivation to develop a new version of the Internet Protocol was to deal with the growth of the internet. The datagram
format had to be changed for the new version: for example, 32 bits for an address was determined to be too small to hold all
the network addresses one would want to use, so it was changed to 128 bits, which means that every atom on the planet can
have its own unique IP address. But who knows, maybe things in space will need IP addresses too; planning for the future!
Multicast is to be implemented, which means that computers will be able to broadcast beyond the local network. There will be a
method of specifying how far you want to broadcast.
The Link Layer I
Overview
The main aim of the link layer is to provide node-to-adjacent-node transfer of a datagram over a link.
Services required to achieve this main aim
Framing: encapsulation needs to be applied to a packet before it can be sent. Nothing inside a received packet's payload is
looked at.
Sharing: if a wire is to be shared, this needs to be supported by the link layer.
Addressing: one machine needs to know which other machine it's trying to communicate with.
Flow control: the amount of data that's being sent needs to be controllable. (This is despite higher-level protocols such as
UDP having no flow control.)
Error correction: forward error correction and other techniques are required at the other end in order to get around realistic
problems of the network. This gives the layers above the illusion of a reliable network.
1 bit in a million is the typical error rate for an Ethernet cable.
1 bit in a billion billion is the typical error rate in fibre optic cables.
Much more than 1 bit in a million is the typical error for wireless communications.
Full or half-duplex: whether information can only be sent in one direction at a time, or in both directions simultaneously.
Where is it implemented?
In every node, and in every adaptor such as a network interface card (NIC), e.g. an Ethernet card.
The link layer is not handled by the CPU (which handles all of the higher network levels). Instead, the link layer is handled by a
controller in the network card that passes information to the host bus, which then passes it via interrupts and registers to the
CPU.
Network adaptor sending:
Encapsulates the datagram in a frame, adding error checking bits, flow control and more.
Network adaptor receiving:
Checks for errors, flow control and other information in the frame. It extracts the information from the datagram and passes it to
the upper layer at the receiving side.
Packet Encapsulation
A header and a tail are added to the message, which are taken off when it's received.
Ethernet frame structure:
Header:
Preamble: allows the receiver to synchronise with the incoming signal. The preamble is set to 1,0,1,0, ... etc. a number of
times. The preamble is known, so errors in it can be recognised as possible flaws in the network.
Start-of-frame delimiter: lets the receiver know that the rest of the frame is about to follow; again, this is preset.
MAC address of the destination: 48 bits in size.
MAC address of the source: again, 48 bits in size.
EtherType or packet length: so you know how much encapsulated data you have.
Tail:
CRC32: a cyclic redundancy check calculated by the sender and checked by the receiver to see if the data has been received
correctly. It's not perfect: it's possible for a corrupted frame to pass the check and be passed up to the other layers despite
there being an error.
Inter-frame gap: a bit of space before the next frame comes along.
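The frame structure above (minus the preamble, start-of-frame delimiter and inter-frame gap, which are handled in hardware) can be sketched with Python's zlib.crc32; the MAC addresses and EtherType used here are illustrative:

```python
import struct
import zlib

def build_frame(dst_mac: bytes, src_mac: bytes, ethertype: int, payload: bytes) -> bytes:
    """Assemble a simplified Ethernet frame: destination MAC, source MAC,
    EtherType, payload, then the CRC32 the receiver will re-check."""
    header = dst_mac + src_mac + struct.pack("!H", ethertype)
    body = header + payload
    crc = struct.pack("<I", zlib.crc32(body))
    return body + crc

def check_frame(frame: bytes) -> bool:
    """Receiver side: recompute the CRC over the body and compare."""
    body, crc = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == crc

frame = build_frame(b"\xff" * 6,                       # broadcast destination
                    b"\x02\x00\x00\x00\x00\x01",       # made-up source MAC
                    0x0800,                            # EtherType for IPv4
                    b"hello")
print(check_frame(frame))   # True

# Flip one payload byte: CRC32 always catches a single-byte error
bad = frame[:15] + bytes([frame[15] ^ 0xFF]) + frame[16:]
print(check_frame(bad))     # False
```

The "not perfect" caveat in the text is about multi-bit errors: some combinations of corrupted bits can, rarely, produce the same CRC32 and slip through.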
Flow Control
Optional.
Most network cards do have flow control. The aim is to ensure that the receiver's buffer doesn't overflow.
Implemented via either:
Handshake: a wire can be set to high or low to indicate whether the receiver is ready or not. This can also be done using a
software-interpreted X-ON/X-OFF.
or
Open-loop: pre-reserve and negotiate some of the resources of the receiver in order to deal with the sender's data. A
protocol that handles this idea is Connection Admission Control (CAC).
or
Closed-loop: a method of reporting resource availability and resource needs, and sending data according to these reports.
In Ethernet, a message is broadcast to a special multicast address with a 16-bit time request for a pause in the data
transmission. Asynchronous Transfer Mode (ATM) has an Available Bit Rate (ABR) service, which guarantees a minimum bit
rate to a sender and allows the receiver to report back congestion, at which point the minimum bit rate is used.
Link Layer Addressing
IP Address:
32-bit.
Network-layer.
Used to get a datagram to a destination IP subnet.
Partially geographical.
MAC Address:
The MAC address is the link-layer address of your machine, which is essential for communicating with anything.
LAN. Physical. Ethernet.
48 bits (6 bytes) for most LANs: 3 bytes for the organisation identifier and 3 bytes for the NIC identifier.
Set into the Network Interface Card (NIC), but can be software-settable.
Expected to last until around the year 2100 before the same addresses will be used again.
The aim of the MAC address is to assist the link layer in transmitting the framed packet from one interface to another on a
physically-connected interface.
The broadcast address is: FF-FF-FF-FF-FF-FF (all 1s).
Administered by the IEEE: a manufacturer buys a portion of MAC address space in order to assure uniqueness.
MAC addresses are flat, which means that you can't tell anything about the sender or receiver (other than the manufacturer)
from the numbers. This is opposed to IP addresses, which are geographically allocated (there's no need for geographic
information with MAC addresses).
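The 3-byte/3-byte split and the all-1s broadcast address can be illustrated with small helpers (the helper names are made up, and the example MAC is fictional):

```python
def split_mac(mac: str) -> tuple[str, str]:
    """Split a 48-bit MAC into the IEEE-assigned organisation identifier
    (first 3 bytes) and the NIC identifier (last 3 bytes)."""
    parts = mac.upper().split("-")
    assert len(parts) == 6, "expected six hyphen-separated bytes"
    return "-".join(parts[:3]), "-".join(parts[3:])

def is_broadcast(mac: str) -> bool:
    """The broadcast address is all 1s: FF-FF-FF-FF-FF-FF."""
    return mac.upper() == "FF-FF-FF-FF-FF-FF"

print(split_mac("00-1A-2B-3C-4D-5E"))   # ('00-1A-2B', '3C-4D-5E')
print(is_broadcast("ff-ff-ff-ff-ff-ff"))  # True
```

Only the first half says anything about the device (its manufacturer); the second half is flat, as the text notes.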
Mapping IP to/from MAC Addresses
Address Resolution Protocol (ARP): an address resolution table is used. When the IP address of a destination machine is
known but its MAC address isn't (i.e. it's not in the lookup/mapping table), a new frame is broadcast detailing the sender's IP
and MAC address, and requesting the MAC address of the system with the IP address we want as the destination. The
destination responds with its MAC address, and nodes along the path and the original sender make a note of this mapping
within the ARP table as a cache for later. Entries typically stay in the table for 20 minutes. This cache is to ensure the network
isn't constantly filled with traffic requesting MAC addresses.
Hubs and Switches
Hubs allow the wires from a number of different machines to be joined together into an effectively analogue shared medium.
The aim is to allow multiple machines to share a connection. A frame is sent out to all systems on the hub. Nodes connected
to the hub can collide with each other and can listen to other machines' messages. When multiple messages get mixed up
(and no longer make sense!) then collision handling has to take place, which may mean stopping transmission and backing off
before retransmitting. Hubs are not very common any more.
Switches are store-and-forward devices. They are transparent (the hosts are unaware of them), plug-and-play, and
self-learning. Each wire is a separate collision domain; because they're separate, there are no collisions. The wires are also
full duplex. Switches can also buffer and queue the packets. Switches can do much more than hubs can, which is why they've
pretty much replaced hubs.
For each of the connections it provides, a switch has a network interface and a processor in order to look at the data coming
in, check CRCs, extract the datagrams and then send the data up the layers. If the data is to be forwarded, then it takes it up a
layer, moves it to the correct network interface, encapsulates it, and sends it down the wire.
A switch knows which destinations are reachable using a switch table. It fills the table with the MAC addresses it sees as the
source addresses of incoming frames, and floods a frame to all other ports when its destination isn't yet in the table. This is
very similar in spirit to how ARP works.
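A minimal sketch of a self-learning switch table (the class name and port numbers are illustrative; real switches also age entries out over time):

```python
class LearningSwitch:
    """Toy self-learning switch: learns which port each source MAC arrived
    on, and floods frames whose destination it hasn't seen yet."""
    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # mac -> port

    def receive(self, in_port, src_mac, dst_mac):
        self.table[src_mac] = in_port          # learn where src lives
        if dst_mac in self.table:
            return {self.table[dst_mac]}       # forward out exactly one port
        return self.ports - {in_port}          # unknown destination: flood

sw = LearningSwitch(ports=[1, 2, 3, 4])
print(sw.receive(1, "A", "B"))  # B unknown: flood to {2, 3, 4}
print(sw.receive(2, "B", "A"))  # A was learnt on port 1: {1}
```

After the first exchange, traffic between A and B stops being flooded at all, which is why each wire becomes its own collision-free domain.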
Switches can be connected together; the same principles apply, and the switches will talk to each other in exactly the same
way. Gigabit Ethernet is used for faster transmissions, and is often used for backbones connecting multiple standard Ethernet
switches as it has a higher bandwidth, but it is becoming more popular and mainstream for ordinary Ethernet installations too.
Switches vs. Routers
They both act in the same store-and-forward way, but routers maintain routing tables and implement routing algorithms, whilst
switches maintain switch tables, implement learning algorithms, and implement filtering.
Routers can do more complicated forwarding because they can access the higher network layers and look at IP addresses in
the datagram etc.
Switches, Routing and Local Area Network (LAN) Addresses
Multiple Access Sharing a Network
Two main types:
1. Point-to-Point
E.g. ADSL at home using a telephone wire has a single sender and a single receiver (the telephone exchange). Over the wire a
protocol is run (typically the Point-to-Point Protocol, or PPP, which sits on top of High-Level Data Link Control, or HDLC, which
is used to control the line).
2. Broadcast
A single shared medium, which is the broadcast channel. Every device can talk at the same time, so interference can occur.
Communications rely on the Signal to Noise and Interference Ratio (SNIR): in order to listen to and understand a signal, the
one you want to be listening to needs to be the loudest at the time. The more information that's being sent in a given time
period, the louder the sender has to be.
Collisions can occur if two transmissions overlap in time. When this occurs the resources become wasted, the transmission
becomes rubbish and has to be thrown away and re-transmitted, which takes up more time and resources.
The solution is to coordinate all the systems using Medium Access Control (MAC) so that they all get a chance to communicate.
A central controller (such as in 2G phones) can have a fixed schedule of when every system uses the channel. This means
that there's never a clash. There may be empty slots left, but these can be used for network control.
Three main types of medium access control:
1. Channel Partitioning
2. Random Access
3. Turn Taking
The ideal characteristics for a channel to have are:
One node with data should be able to use the whole bandwidth of the channel, so there's no competition for resources. In
practice, however, nodes are not capable of using the full channel.
The connection should be shared equally between all the nodes currently using it.
Decentralisation, so there's no central point of failure.
Simple and inexpensive.
Collisions and Channel Partitioning
Much cheaper than dedicated links. Some of the options currently available include:
Plain Old Telephone System (POTS): 56 kbps, limited by bandwidth.
Integrated Services Digital Network (ISDN): two POTS lines with 64 kbps channels allows for 128 kbps in total. This uses a
coder and a decoder to put voice over the digital link.
Digital Subscriber Line (DSL): marketed as broadband, with a speed of 1-32 Mbps and growing. There are a number of
usable channels, and the uplink/downlink split determines the data rate.
Cable TV (CATV): wide bandwidth, but shared between TV (6 MHz channels) and data.
Link Layer Bit Encoding
Signals, such as voltages, travel through the physical medium between hosts. So the bits have to be encoded into these
signals by the sender's network adaptor, and decoded by the receiver's network adaptor. The link layer uses a mapping
between 1 and a HIGH value (e.g. a high voltage) and 0 and a LOW value (e.g. a low voltage). It uses an idea known as
Non-Return to Zero (NRZ), which means that while the voltage is high a 1 is read every clock cycle, and while the voltage is
low a 0 is read every clock cycle. However, the problem with this is that the sender's and the receiver's clocks need to be
synchronised in order for the receiver to know when a new bit should be taken from the continuous voltage. The clock could
be broadcast on a different channel, but this is a waste of the network!
There's also an issue with the amount of voltage change required to signal a bit change. The receiver could learn where the
transitions are, but after a long series of 0s or 1s it can forget what the change should look like. So, the average of the signal
across time can be taken, and this average voltage used to distinguish between a 1 and a 0. All zeros cause the average to
drop towards zero (and a slight change would signal a one). All ones cause the average to rise towards 1 (and a slight change
would signal a zero).
Clock Recovery
The clock is not sent, as this is a waste of the network's resources.
Non-Return to Zero Inverted
A transition in the voltage level always encodes a binary 1, whereas a constant voltage across multiple clock cycles indicates
a 0 bit for each clock cycle. Multiple 1s then make for good clock recovery, as there'll be multiple transitions on the clock
edges. This is why an ideal synchronisation pattern is 111111. On the other hand, a whole sequence of 0s can mean the clock
gets lost.
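A sketch of NRZI encoding and decoding, following the rule above: a 1 is a transition, a 0 is no transition (the starting line level is an arbitrary convention here):

```python
def nrzi_encode(bits, start_level=0):
    """NRZI: emit a transition for each 1, hold the level for each 0."""
    levels, level = [], start_level
    for b in bits:
        if b == 1:
            level ^= 1   # transition encodes a 1
        levels.append(level)
    return levels

def nrzi_decode(levels, start_level=0):
    """Recover bits by comparing each level with the previous one."""
    bits, prev = [], start_level
    for level in levels:
        bits.append(1 if level != prev else 0)
        prev = level
    return bits

data = [1, 1, 1, 1, 0, 0, 1]
line = nrzi_encode(data)
print(line)                        # [1, 0, 1, 0, 0, 0, 1]: one transition per 1
print(nrzi_decode(line) == data)   # True
```

Notice that the run of four 1s produces four transitions (good for clock recovery), while the run of 0s produces none, which is exactly the weakness the text describes.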
Manchester Encoding
Used in IEEE 802.3 Ethernet.
The clock runs at twice the data rate. The sender sends the XOR of the clock (which is twice as fast as the data) and the data.
However, using this encoding, only 50% of the speed of the network can be used for data, as the signalling rate has to be
twice the data rate. But clock recovery is very easy, and strings of 0s and 1s can be dealt with quite easily.
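The XOR-of-clock-and-data idea can be sketched as follows (the polarity convention within each bit cell is illustrative; IEEE 802.3 fixes a specific one):

```python
def manchester_encode(bits):
    """Manchester: XOR each data bit with a clock running at twice the data
    rate, so every bit cell contains a mid-cell transition."""
    signal = []
    for b in bits:
        # clock within one bit cell: first half 0, second half 1
        signal += [b ^ 0, b ^ 1]
    return signal

def manchester_decode(signal):
    """The first half-cell carries the data bit under this convention."""
    return [first for first, second in zip(signal[::2], signal[1::2])]

data = [1, 0, 1, 1, 0]
line = manchester_encode(data)
print(line)                              # [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]
print(manchester_decode(line) == data)   # True
```

Every bit cell has a transition in the middle, so the receiver never loses the clock; the cost is the halved data rate the text mentions, since two signal levels are sent per data bit.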
4B/5B
Used in 100BASE-TX Cat 5 Ethernet.
More transitions, but without the same clock overhead.
The bits are broken down into groups of 4 bits. Each group of 4 bits is encoded as a 5-bit code. The 5-bit codes ensure
there's only ever:
At most one leading zero, and
At most two trailing zeros.
This means that there are only ever 3 zeros in a row before there's a transition to a 1. This encoding can then be sent over
Non-Return-to-Zero Inverted, so every 1 transmitted has a voltage transition (the more transitions the better!), and it has an
efficiency of 80%.
There are spare codes (as 5 bits give 32 codes and encoding 4 bits only takes up 16 of them). Some of these spare codes are
used for transmitting control. 11111 == idle (1s are used to maintain the transitions and clock synchronisation, so that when
data does come in there are no transition problems). 00000 == dead (no transitions, no transmission; it's dead because it's not
moving!). 00100 == halt (a control to stop the transmission). There are 6 more control signals, and 7 unused codes, which are
unused because they break the zeros rule (described above).
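The leading/trailing-zeros rule can be checked by enumerating all 32 five-bit codes; this verifies the rule itself rather than reproducing the full standard code table:

```python
def valid_code(code: str) -> bool:
    """The 4B/5B rule described above: at most one leading zero and at
    most two trailing zeros, so a run of zeros never exceeds three even
    across code boundaries."""
    return not code.startswith("00") and not code.endswith("000")

codes = [format(n, "05b") for n in range(32)]
usable = [c for c in codes if valid_code(c)]
print(len(usable))              # 21 of the 32 codes satisfy the rule
print("11111" in usable)        # True: the idle pattern obeys the rule
print(valid_code("00000"))      # False: "dead" deliberately breaks it
```

Twenty-one rule-obeying codes is comfortably more than the 16 needed for data, which is where the spare codes for control come from; control patterns like dead (00000) and halt (00100) break the rule on purpose so they can never be mistaken for data.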
Signals and Modulation
Whatever medium is being used, the signals are usually electromagnetic waves travelling at (close to) the speed of light.
Different materials have different refractive indices, which varies the speed of the signal in them. When transmitting a signal
down a copper wire, the velocity is roughly 2/3 of the speed of light in a vacuum. The velocity factor for a medium determines
how much the signal slows down when it passes through that medium.
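The 2/3-of-c figure gives a quick way to estimate propagation delay over a cable (a sketch; the exact velocity factor depends on the cable type):

```python
C = 299_792_458  # speed of light in a vacuum, m/s

def propagation_delay(length_m: float, velocity_factor: float) -> float:
    """Signal travel time over a medium where v = velocity_factor * c."""
    return length_m / (velocity_factor * C)

# Copper, with the ~2/3 velocity factor mentioned above, over a 100 m run:
delay = propagation_delay(100.0, 2 / 3)
print(f"{delay * 1e9:.0f} ns")   # about 500 ns
```

Delays like this are part of why the link layer needs inter-frame gaps and collision handling: a frame takes a measurable time to reach the far end of even a short cable.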