
Troubleshooting Riverbed WAN OPTIMIZATION

Authors Danny Mongrain


Version Number 1.6.0
Version Date 2015-03-10
Status Final
File Name Troubleshooting Riverbed WAN OPTIMIZATION.DOC
705 - WAN OPTIMIZATION 705 – Operational Guide

Revision History
Version Date By Comments
1.0 2013-06-20 Danny Mongrain Initial draft
1.1 2013-07-16 Danny Mongrain Added section Getting support from Riverbed TAC.
1.2 2013-07-16 Danny Mongrain Added section Software downgrade.
1.3 2014-02-21 Danny Mongrain Enforced the requirement to make product aware
when configuration is changed locally or if
passthrough rule must be kept for a while.
1.4 2014-02-21 Danny Mongrain Added Secure Peering section.
1.5 2014-05-30 Danny Mongrain Added No Logon Servers section
1.5.1 2014-06-19 Danny Mongrain Removed Troubleshooting HTTP problem (Rios 6.5)
1.5.2 2014-06-19 Danny Mongrain Added Scheduling a Reboot and Service restart
1.5.3 2014-10-01 Danny Mongrain Added Service Error
1.6.0 2015-01-12 Danny Mongrain Renamed CMC for SCC everywhere.
Updated screenshots following changes to GUI.
Corrected typos, etc.


Table of Contents

1 Component, Product Description and Owner
1.1 Component Description
1.2 Scope
1.3 Documentation
1.4 Prerequisites
1.5 Disclaimer
2 Identifying the client-side Steelhead (CSH)
3 Clearing an established connection
4 Troubleshooting an Optimized connection
5 Troubleshooting a passthrough connection
6 Troubleshooting HTTP problems
7 Troubleshooting a general failure
8 Packet capture
9 Investigating Admission control
10 How to clear configuration changed alarms
11 SSL Certificate expiring alarm
12 How to reconnect a WOC to the SCC
13 Rebuilding a faulty drive into a raid group
14 Investigating bandwidth usage
15 Getting support from Riverbed TAC
16 Software downgrade
17 Scheduling a reboot
18 Scheduling a service restart
19 Service error alarm


1 Component, Product Description and Owner

1.1 Component Description


Riverbed Steelheads are WAN optimization controllers (WOC) that accelerate TCP traffic.

1.2 Scope
This document contains information that can be useful when operating the Riverbed Steelheads
(the “how to”), including the SteelCentral Controller (SCC, formerly CMC) but excluding Steelhead Mobile.

The scope of this document is daily operational tasks and troubleshooting of common problems on both the
Steelheads and the SCC.

1.3 Documentation
All the vendor documentation for this product can be found on the Riverbed website:
http://support.riverbed.com. A username and password are required to get full access.

1.4 Prerequisites

Ensure that WOC is installed and configured according to best practices and Riverbed deployment
guides.

1.5 Disclaimer

This document is NOT an official Riverbed document. When in doubt, always adhere to Riverbed
documentation and follow instructions from Riverbed support. Use at your own risk.


2 Identifying the client-side Steelhead (CSH)


Follow this procedure to determine which appliance has the CSH role. Knowing which WOC is the CSH is
critical in most configuration and troubleshooting procedures.

Print Screen or Description Action


The CSH is the WOC at the same location (site) as the client, which is the system that initiates the TCP
connection (SYN) towards a server. If the client is in a Campus/MAN network, its WOC might be in the central
site that provides WAN connectivity for the MAN.

When in doubt, consult the network diagram of the location where the client is located.

Steelhead Mobile agents are CSH only; they cannot be SSH.


If you can’t determine which WOC is the CSH but you know
which one is the server-side Steelhead (SSH), connect to
the SSH and go to Report > Current connections. Filter
using the client IP and ALL connection type. Click on
the looking glass of any optimized connection with
your client as source IP. In the screen that opens,
look for Peer Appliance. That is the inpath IP of your
CSH (10.23.255.148 in this example). You don’t
know its name but you can always connect to its
inpath IP directly.

TIP: If the Peer Appliance IP is the same as the
source IP issuing the connection then the CSH is a
Steelhead Mobile agent. You might need to refer to
the 705 – WAN Optimization Mobile document
depending on what the problem is.


3 Clearing an established connection


Follow this procedure to clear an active connection so that a new connection is established under the new
WOC configuration. You’ll be asked to establish new connections in many configuration and
troubleshooting procedures.

Print Screen or Description Action


Inpath rules and other optimization techniques are applied to new connections only. Existing
connections must be closed and new ones opened to validate whether your new configuration works as intended.
Ask the client to reinitiate the TCP connection:
• CIFS: close the mapped drive or the ad-hoc CIFS share, wait a minute, then re-open it.
• MAPI: close Outlook and/or Lync.
  o If it doesn’t work, ask the user to confirm that the outlook.exe process is gone.
• HTTP: close the web browser. Wait a few seconds until all connections are gone before re-opening it.
• etc.

If the above doesn’t work, ask the user to log off and log back in; that should do the trick.
If the ‘client’ is a server, stopping and restarting the service is usually sufficient.

Connect to the CSH (section ‘Identifying the client-side Steelhead (CSH)’).

Go to Report > Current connections and filter using
the client IP and ALL connection type. Confirm that
the timestamp on the connection you expect to be new
is more recent than your last config change.

If the timestamp is older than your config change then
the connection was never closed, so your change is not
effective. If the user cannot close the connection on their
own you may attempt to reset it from the WOC
interface, but this doesn’t always work, depending
on the OS and software combination.

To reset it, click on the looking glass of your connection
and click on the Reset Connection button at the bottom.
You might have to do it multiple times. If the reset doesn’t
work then you’ll have to ask the user to log off or
reboot if the application cannot be made to close its TCP
sockets.


4 Troubleshooting an Optimized connection


Follow this procedure to verify if a WOC is causing any issue while optimizing a connection.

Print Screen or Description Action


Connect to the CSH (section ‘Identifying the client-side
Steelhead (CSH)’), go to Report > Current
connections and filter using the client IP and ALL
connection type.
Locate the TCP connection that is reported as having
problems. Confirm it’s the right one by looking at the
server and destination port (service port).

If the connection is not listed then you’re on the wrong
CSH.
If the connection is not optimized then the WOC
is not modifying the natural behavior of the connection.
Your problem is most likely elsewhere. If you want to
understand why your connection is not optimized go to
section ‘Troubleshooting a passthrough connection’.

If the connection is optimized then the WOC is
modifying the natural behavior of the connection, and
as such it could potentially be causing issues with it.
Continue with the next step.

The most efficient way to determine if the WOC is
causing the issue is to remove the WOC from the path.
To do so, configure a passthrough rule (bypass)
on the CSH. Doing so on the SSH is useless.

Go to Configure › Optimization › In-Path Rules.
Click Add a New In-Path rule. Fill in the info:
Type: Pass Through
Source subnet: the specific client source IP with a /32
mask.
Port: all.
Destination subnet: the specific server IP with a /32
mask.
Port: all (unless only a single specific destination port
must be bypassed).
Vlan: all.
Protocol: TCP
Cloud: doesn’t matter
Position: Start
Enable rule: check
Click Add.

Once the page reloads, confirm your rule is there at the
top.

Establish a new TCP connection (section ‘Clearing an established connection’).

It’s time to test again now that the WOC is no longer optimizing your connection. Ask the user if the problem
is gone.

If the problem is still the same then the WOC is not the cause.

Since your problem was not fixed by adding a passthrough inpath rule, this configuration must be removed.
The temporary inpath rule you created has also caused a configuration changed alarm on the SCC, as
configuration changes should normally be made in the SCC policies and then pushed to the WOCs.

Follow the steps in section ‘How to clear configuration changed alarms’ to get rid of your temporary rule and the
alarm in one step.

If the problem is gone then the WOC is involved in the problem (not necessarily the root cause of it).
Depending on what the exact problem is, you may have to do one or more of these:
• do packet captures (section ‘Packet capture’)
• apply a different optimization technique
• open a trouble ticket with Riverbed Support
• post a question on the Riverbed user forum splash.riverbed.com
• apply a permanent passthrough rule
• restart the service and/or the WOC (section ‘Troubleshooting a general failure’)
• upgrade the WOC as your problem might be a bug that got fixed
• If HTTP: apply server-specific HTTP settings (section ‘Troubleshooting HTTP problems’)


5 Troubleshooting a passthrough connection


Follow this procedure to verify why the WOC is not optimizing a specific connection.

Print Screen or Description Action


Connect to the CSH (section ‘Identifying the client-side Steelhead (CSH)’) and the server-side Steelhead
(SSH). Go to Report > Current connections and filter using IP or port and ALL connection type. Locate the
passthrough TCP connection that you are investigating. Confirm it’s the right one by looking at the server and
destination port (service port).
Steelheads group passthrough connections into two families: intentional passthroughs are considered perfectly
normal from a WOC perspective, while unintentional passthroughs are considered a problem.
The most typical passthrough reasons are explained here.
Inpath rule
(intentional passthrough)

This one means a passthrough inpath rule on the CSH
is responsible. The RiOS interface won’t tell you which
rule exactly; you have to figure this out on your own.
There are 3 typical situations:

1. The destination port is in one of the port labels
that are part of the 3 default RiOS passthrough rules:
Secure, Interactive or RBT-Proto. To verify
which ports are in these port labels go to
Configure › Networking › Port Labels. Do
not modify these ports, ever.

2. A specific IP and/or port rule is the cause,
potentially a generic passthrough rule at the
end.
Preexisting connection
(intentional passthrough)
This one means that the TCP connection was
established before the CSH could attempt an
auto-discovery process within the SYN packet. This is
typical when the service is restarted or the WOC is
rebooted.


Connection paused
(intentional passthrough)
This one usually means that admission control has
kicked in and is denying optimization to this connection
(section ‘Investigating Admission control’).

No Steelhead on path to server
(unintentional passthrough)

This one means there is a single WOC on the end-to-end
connection. The WOC added its information as
TCP options into the SYN packet (auto-discovery
process) but no other WOC has seen that SYN and
tried to establish an optimized connection.

This usually happens when there is a CSH WOC at the
client location and the connection is established
towards a server at a location without an SSH WOC.
SYN on WAN side
(unintentional passthrough)
This one means there is a single WOC on the end-to-end
connection. The WOC added its information as
TCP options into the SYN packet (auto-discovery
process) but no other WOC has seen that SYN and
tried to establish an optimized connection.

This usually happens when there is no CSH WOC at
the client location and the connection is established
towards a server at a location with an SSH WOC. The
SSH will be the first and only WOC but the SYN
(without TCP options from a CSH) is seen on its WAN
interface instead of a LAN interface. A Riverbed
Steelhead initiates auto-discovery only on LAN
interfaces but accepts auto-discovery answers on both.
In this case there was no CSH so the SSH is
effectively the first WOC, but the SYN comes in on a
WAN interface and optimization is denied.

Another scenario is when the LAN and WAN cables are
reversed on the CSH. The LAN clients send their
SYNs to the WAN interface and the CSH refuses to optimize them.
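
The passthrough reasons described in this section can be summarized as a small lookup table. The following Python sketch is only an illustration: the reason labels are the ones quoted above, while the names PASSTHROUGH_REASONS and classify_passthrough are hypothetical and are not part of RiOS.

```python
# Passthrough reasons covered in this section, with their family and a hint.
# Illustrative only: this is not a RiOS API, just a summary of the text above.
PASSTHROUGH_REASONS = {
    "Inpath rule": (
        "intentional",
        "A passthrough inpath rule on the CSH matched; check port labels and rules.",
    ),
    "Preexisting connection": (
        "intentional",
        "The connection existed before the CSH could attempt auto-discovery.",
    ),
    "Connection paused": (
        "intentional",
        "Admission control is probably denying optimization.",
    ),
    "No Steelhead on path to server": (
        "unintentional",
        "No SSH WOC answered the auto-discovery probe.",
    ),
    "SYN on WAN side": (
        "unintentional",
        "The SYN arrived on a WAN interface (no CSH, or LAN/WAN cables reversed).",
    ),
}


def classify_passthrough(reason):
    """Return (family, hint) for a passthrough reason label, ignoring case."""
    wanted = reason.strip().lower()
    for label, (family, hint) in PASSTHROUGH_REASONS.items():
        if label.lower() == wanted:
            return family, hint
    return "unknown", "Reason not covered in this section."
```

For example, classify_passthrough("SYN on WAN side") returns the "unintentional" family together with the hint about the WAN interface.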


6 Troubleshooting HTTP problems


Follow this procedure to verify if HTTP optimization on the WOC is causing a specific problem.

Print Screen or Description Action


Starting with RiOS 7, Riverbed introduced an automated per-host HTTP auto-configuration. The CSH
compiles and analyzes every HTTP connection. Once it has enough data at hand it decides which optimization
techniques to apply, per HTTP server.
There are rare situations where the auto-configuration causes issues such as web pages not opening,
authentication issues, etc. Your first step should be to diagnose the problem using the ‘Troubleshooting an
Optimized connection’ section. Follow up with these steps if a passthrough inpath rule clears the problem and
the destination (service port) is 80 (HTTP).
Start by connecting to the CSH (section ‘Identifying the client-side Steelhead (CSH)’).
Go to Configure > Optimization > HTTP.
Click on the web server having the issue in the list and
remove all optimization techniques. Click Apply and
Make Static.


Establish a new HTTP (TCP) connection (section ‘Clearing an established connection’).
It’s time to test again now that the WOC is optimizing your connection but with a blank HTTP configuration. Ask the
client if the problem is gone.

If the problem is still the same then the WOC is causing the issue even with a blank HTTP configuration. Contact
Riverbed Support.

If the problem is gone, go back to Configure > Optimization > HTTP and re-enable one technique at a
time from the original techniques that had been auto-configured. Establish a new connection and test after each
change. Keep going until you figure out exactly which optimization technique is causing the issue. Once you
know, keep this exception permanently.


7 Troubleshooting a general failure


Follow this procedure if there are numerous problems in a site, affecting many users and many
protocols.
Print Screen or Description Action
Start by looking at the health status of
the WOC; the problem might be listed there. Go to
Reports > Diagnostics > Alarm status and Reports
> Diagnostics > System details and check whether anything
reported as wrong could be related to your problem.
If most or all optimized TCP traffic is having severe
problems (CIFS, HTTP, MAPI, etc.) but unoptimized
traffic is OK (Telnet/SSH, RDP, anything to the internet,
IPT), then the WOC as a whole might be causing
general issues. Restarting or stopping its service might
help.
Beware, as this will disrupt all optimized traffic. Most
connections will end up unoptimized until they are
re-established, which can take hours or days depending on the
application. You should only restart or stop the service
if things are going really badly at a site.
Go to Configure > Maintenance > Services, and click
Restart. If the problem goes away for a while but
comes back a bit later, try doing a full service Stop. If
the problem is gone then the WOC was causing a
general failure. That is a very rare problem; contact
Riverbed Support.
Rebooting the Steelhead as a whole is not required as
the effect is the same as a service restart but it takes
longer to complete. Rebooting a WOC is only useful
for a RiOS upgrade. The same goes for powering off a
WOC, which is the same as a service stop except that you can’t
re-enable it from the network.

Never choose the Clear Data Store option
when changing the state of the service. This flushes
the data store cache and will reduce performance
significantly for many days. Use that option only if
instructed by Riverbed support.


8 Packet capture
Follow this procedure to conduct a packet capture (‘sniffing’ ‘trace’ ‘tcpdump’).

Print Screen or Description Action


Depending on what your problem is you might need to obtain a capture on the CSH, the SSH or both.
Go to Reports > Diagnostics > TCP Dumps. Click
Add a New TCP Dump.
Give it a meaningful name including a short problem
description, your name and the date.
Use the IP/port filters as required. Beware: if your filter
is too narrow you might not capture the origin of the
problem; if your filter is too wide your capture files will
be too big and finding the culprit will be difficult.
Apply the capture on the proper LAN interface(s). If
there are many, verify which one will see your traffic by
referring to the Visio diagram and the local routing/ARP table.
Because we use correct addressing, most WAN traffic
will be on Riverbed ports with the CSH and SSH as source
IPs, and the payload won’t be readable. That is
why a capture on a WAN interface is rarely useful.
Capture duration: as you wish, but I usually use ‘0’,
which means I’ll have to stop the capture myself when I
see fit. It’s up to you. Maximum capture size and
number of files to rotate define how much data you’ll
keep and in how many files; that is a safety net in case
your filter is too wide and the amount of traffic too high.
Click Add to start the capture.

If you configured an ongoing capture using ‘0’ as its
duration, select it and click Stop Selected Captures
when you’re done.

Your capture is ready to be downloaded. There will be
one capture file per interface selected. The name of the
WOC and the interface are automatically prefixed to the
file name.
Please delete all capture files when you’re done so that
disk space is not wasted on old files.


9 Investigating Admission control


Follow this procedure to investigate a WOC in Admission control.

Print Screen or Description Action


Admission control is a state in which a WOC refuses to accelerate new TCP connections. Warnings in the form
of alarms on the SCC are triggered at 85% of the maximum. New TCP connections are denied optimization
once 100% is reached.
There are various reasons for a WOC to be in Admission control, and figuring out which one applies can be easy
or quite difficult. The symptoms of the most common situations are shown here.
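
The two thresholds mentioned above (warning at 85%, optimization denied at 100%) can be expressed as a short check. This is a minimal Python sketch that assumes "the maximum" refers to the optimized connection limit of the appliance; the function name admission_state and the returned labels are hypothetical, not RiOS terminology.

```python
def admission_state(current_connections, max_connections):
    """Classify usage against the 85% warning and 100% admission control thresholds.

    Assumption: the limit is a connection count. Integer arithmetic is used so
    that the 85% boundary is not affected by floating-point rounding.
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    if current_connections >= max_connections:
        return "admission control"
    if current_connections * 100 >= max_connections * 85:
        return "warning"
    return "normal"
```

With a hypothetical limit of 1000 connections, 850 connections would already raise the warning while 849 would not.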
Is this a new problem?
On the SCC or the WOC, go to Reports › WAN
Optimization › Connection History. In the Group
section specify ‘Custom’. Pick your specific WOC.
Check the graph over 7-day, 30-day and 90-day
periods. Look for a sharp change in the trend. If there is one,
there is either a faulty system/protocol or a sudden
sharp increase in head count.

On the WOC, go to Reports > Networking > Current
connections. Filter with Established connections.
Click Update.
Sort the optimized connections by source IP,
destination IP, destination port. Look for a bulk of
connections having the same pattern.
Alternatively you can select the content of the
Source:Port and Destination:Port columns while
holding the CTRL key so that only those columns are
selected. Copy-paste to Notepad, save, open Excel,
open your file (filter with all files *.*), accept the format
warning, accept the default ‘delimited’ column format,
click Other and specify ‘:’ in the box, Next, Done. You
now have a much more powerful tool to sort, compile,
remove duplicates, etc.
Once you locate a suspect situation, do a reverse DNS
lookup and verify whether these are PCs and/or servers.
Investigate what the destination(s) is/are. What is the
service port used for? What does SMI tell you about
this server(s)? Is it normal to have this number of
concurrent TCP connections?
If there are more than 500 connections then the GUI
will not show you all of them. SSH to the WOC, type
‘show connections optimized’ and sort out the massive
output in Excel.
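
As an alternative to the Notepad/Excel steps, the pasted Source:Port and Destination:Port columns can be grouped with a few lines of Python. This is a rough sketch, not a Riverbed tool: it assumes each pasted line holds the two columns separated by whitespace, in that order, and the function names are hypothetical. Lines that don't match (headers, blanks) are skipped.

```python
from collections import Counter


def parse_endpoint(text):
    """Split 'IP:port' into (ip, port); raise ValueError if it doesn't fit."""
    ip, _, port = text.strip().rpartition(":")
    if not ip:
        raise ValueError("missing ':' separator")
    return ip, int(port)


def summarize_connections(lines):
    """Count connections per source IP, per destination IP/port and per pair."""
    by_source = Counter()
    by_destination = Counter()
    by_pair = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or incomplete line
        try:
            src_ip, _src_port = parse_endpoint(fields[0])
            dst_ip, dst_port = parse_endpoint(fields[1])
        except ValueError:
            continue  # header line or unexpected format
        by_source[src_ip] += 1
        by_destination[(dst_ip, dst_port)] += 1
        by_pair[(src_ip, dst_ip, dst_port)] += 1
    return by_source, by_destination, by_pair
```

Calling most_common(10) on each returned counter lists the busiest sources, destinations and client/server pairs, which maps to the "problematic source", "problematic destination" and "problematic client <> server" checks in this section.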


Problematic source?
If the same source IP (different source ports) is seen
with multiple connections it might be that a faulty
system (PC, server) is opening too many concurrent
connections.
Examples:
A legitimate TCP port scanner used by your Security
team. When it runs it opens thousands of connections,
which can cause WOCs to go into admission control.
Problematic destination?
If the same destination IP (same destination port) is
seen with multiple connections it might be just normal
(i.e. Exchange server, connected to by every single PC
in the site) or it might not be.
Examples:
An office had a WOC sized for its user count but there
was a local Exchange server used by remote offices.
All the remote PCs had multiple MAPI connections to
the server, causing admission control at the central
site. The WOC had to be upgraded because it is used
both as a client-side Steelhead and a server-side
Steelhead.
Problematic client <> server?
If you see lots of connections with same source and
destination IPs, and always the same destination port,
then a pair of systems is using a lot of TCP capacity.
This is quite common between a Read-only Domain
controller (RODC) and a normal DC. This is caused by
a RIOS bug that has yet to be fixed.
Example on the left: local RODC opens lots of TCP
sockets to a central site DC. All connections are on the
same port, they all look alike, they are all very small
(2KB). They never go away. The destination port is
not predictable; hence a passthrough rule cannot be
configured.
Other example: A faulty Outlook client was opening
600+ MAPI ports to Exchange. A new Outlook profile
on the PC fixed the issue.
Too many users (source IP)?
This one is quite common and easy to figure out. During local peak business hours, get ALL current connections
(optimized + passthrough). Sort by source IP and get the count of unique IPs. Remove the remote IPs from
the count; keep only local IPs. If there are more than 500 connections then the GUI will not show you all of
them. SSH to the WOC, type ‘show connections optimized’ and sort out the massive output in Excel.
If you conclude that there are simply too many legitimate users on site then a license or hardware upgrade may
be required.
Investigating admission control situations can be difficult as it requires experience, collaboration from other
teams, accurate documentation and a good dose of instinct.
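
The "too many users" check (unique source IPs, local ones only) can also be scripted. The sketch below uses Python's standard ipaddress module; the function name count_local_sources and the example subnets are assumptions for illustration, and the list of local subnets must come from your own site documentation.

```python
import ipaddress


def count_local_sources(source_ips, local_subnets):
    """Return the number of unique source IPs that belong to the local subnets."""
    networks = [ipaddress.ip_network(subnet) for subnet in local_subnets]
    unique_sources = {ipaddress.ip_address(ip) for ip in source_ips}
    local_sources = {
        ip for ip in unique_sources if any(ip in network for network in networks)
    }
    return len(local_sources)
```

For instance, with the local subnet 10.1.1.0/24, the sources 10.1.1.5 (seen twice), 10.1.1.6 and 10.9.9.9 give a count of 2 local users.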


10 How to clear configuration changed alarms


Follow this procedure if you have applied a temporary config change on a WOC and it’s not required
anymore. The change triggered an alarm on the SCC.

Print Screen or Description Action


Log on to the SCC, go to the Reports > Topology >
Appliance status page, click on the Appliances
Needing Attention pane. Your WOC should be there
in alarm: The configuration on appliance has been
changed.

Go to Manage > Topology > Appliances and select
the checkbox next to your WOC.

Click Appliance Operations at the top right of the
page, leave the default operation Push Policies, leave
all options unchecked, and click Push.
Wait 2 minutes then confirm on your WOC that the
temporary rule is gone.

Go back to the Reports > Topology > Appliance
status page. Your WOC should now have
Connected: Healthy status under the Appliances
pane (not in the Appliances Needing Attention pane).


11 SSL Certificate expiring alarm


SSL certificates may expire from time to time. This is usually not an issue besides an annoying alarm on
the SCC/CMC. Follow this procedure to fix the problem.

Print Screen or Description Action


Log on to the Steelhead reporting an ‘SSL
Certificates Expiring’ alarm in its alarm page. The alarm will
tell you that the issue is with a Certificate Authority
(CA).

Go to Configure › Optimization › Certificate
Authorities and sort all Authorities by Expiry date. If
any is about to expire it will be listed in orange (expiry
within 60 days) or red (already expired). That problem
is caused by Riverbed trying to cover all authorities,
including some tier-3 authorities rarely used in the
corporate world.
If the authority appears to be unknown and unused
internally you may safely remove it. If using the
SCC/CMC, don’t remove it on this local Steelhead, as
other Steelheads will have the same issue. Log on to
the SCC, go to Manage > Services > Policies, choose
the policy (or policies) used by your WOCs for their SSL
config and go to its Certificate Authorities (SSL)
page. Sort by Expiry date, locate the same faulty
authority, remove it, save. Push a policy update to all
WOCs.
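
The colour coding described above (orange within 60 days of expiry, red once expired) amounts to a simple date comparison. The Python sketch below only illustrates that rule; the function name expiry_colour is hypothetical, and the exact boundary handling (for example a certificate expiring today being shown in red) is an assumption, not confirmed GUI behavior.

```python
from datetime import date, timedelta


def expiry_colour(expires_on, today, warning_days=60):
    """Return 'red' if already expired, 'orange' if expiring soon, else 'none'."""
    if expires_on <= today:
        return "red"  # assumption: expiring today counts as expired
    if expires_on - today <= timedelta(days=warning_days):
        return "orange"
    return "none"
```

Taking this document's version date (2015-03-10) as "today", a CA expiring on 2015-04-01 would be orange and one that expired on 2015-01-01 would be red.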


12 How to reconnect a WOC to the SCC


Follow this procedure if a WOC is functional but somehow the SCC can’t see it.

Print Screen or Description Action


Sometimes following a long network outage the SCC will lose track of a WOC. On the SCC Topology >
Appliance status page it will be seen as Disconnected: unreachable address or Disconnected: invalid
username / password. A manual reconnect may help.
First you must confirm your WOC is reachable. Connect to it using its DNS name or Primary IP. Its Home
page will show CMC (or SCC): not managed instead of the usual CMC (or SCC): [your CMC hostname/IP].
Log on to the SCC, go to Manage > Topology >
Appliances, click on your WOC (not its checkbox), go
to the Appliance Utilities pane, and click Reconnect.
Wait 2 minutes then verify that your WOC Home page
shows it is managed by mc-qcmtl1-05-01.

Go to the SCC Topology > Appliance status page.
Your WOC should now be in a Connected status.
If this doesn’t work then there is a network problem
such as a firewall rule that’s blocking the SSH
connection from the SCC towards the WOC primary IP
(TCP 22).


13 Rebuilding a faulty drive into a raid group


Follow this procedure if an HDD requires a RAID rebuild. If the same drive has the same problem more
than once you should open a ticket with Riverbed support and request an RMA.

Print Screen or Description Action


Need screenshot of Raid alarm
Your WOC will be in ‘RAID disk [disk ID] Status
Degraded’ alarm, both locally and on the SCC.
A disk may only be added to a RAID group through CLI commands.
Connect to your WOC using SSH.
Type these commands:

• enable
• configure terminal
• show raid physical (double check the physical HD ID)
• raid swraid fail-disk [disk ID]
  o Disk [disk ID] failed
• show raid diagram
  o [ [your disk ID] : failed ] [ [all other disks]: online ]
• raid swraid add-disk [disk ID]
  o Disk [disk ID] added to the system
• show raid diagram
  o [ [all disks]: online ]
Confirm the alarm is gone both locally and on the SCC.


14 Investigating bandwidth usage


Follow this procedure to investigate bandwidth usage per protocol or host.

Print Screen or Description Action


Steelheads are powerful reporting tools and may be used to investigate traffic trends, top talkers, etc.
You must always understand the local topology when analyzing traffic stats:
• If the WOC is getting the WAN packets by WCCP then you must analyze the WCCP ACL, as it will most
likely exclude a lot of traffic, which won’t show in your report because the WOC never sees those
packets.
• If the WOC is physically inpath then it sees everything, including internet-bound traffic. This traffic will be
included in the stats in the passthrough category.
• If there is a DMZ at the site, routed on the firewall off the WOC WAN port, the WOC will most likely see
the LAN-to-DMZ traffic. It won’t optimize it but this traffic will show in your report.
• Reports on passthrough traffic don’t differentiate between internet traffic and corporate traffic
that couldn’t be optimized (i.e. no remote WOC, traffic to local DMZ, etc.).
All these are bundled together per TCP port.
• If the WOC uses Hardware passthrough (HAP), ignored packets won’t show in the reports.

The same stats available on individual WOCs are also on the SCC. Generally speaking the SCC is better for
long-term local trend reporting or aggregated country/regional/global stats, while local WOCs are better for
short-term, local stats.
The SCC aggregates stats of current WOCs only. If a WOC is removed its stats go away with it. If a WOC
is moved to a different location its historical stats move with it. This may invalidate some reports.
The direction (Bi-Directional, WAN-to-LAN or LAN-to-WAN) only applies to the individual packets without
regard to the location of the client and server. A local client that downloads from a remote server will look
exactly the same as a remote client that uploads to a local server.
TCP 8779 (SMB2) actually uses TCP 445 on the wire (i.e. in current connections and TCP dumps). It is reported
on its own port just to separate it from its predecessor SMB1 (CIFS).
LAN statistics represent packets to and from the client and server as they see them (and so do the LAN
switches). WAN statistics represent the same packets after they were optimized by the WOCs. They were
either pre-cached (only the indexes were sent), compressed or removed (optimization of protocol chattiness).


Per protocol investigation (aggregate)
On the WOC, go to Reports › Networking › Traffic
Summary.
This report gives the sum of traffic per protocol, the
reduction % (caching and compression combined) and
the weight of the protocol compared to all traffic at the
site (using pre-optimized ‘LAN’ stats).
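
To make the two figures in this report concrete, here is a small Python sketch. It assumes the usual definitions: reduction % is the share of LAN bytes that did not have to cross the WAN, and the weight of a protocol is its share of all LAN bytes. These formulas and the function names are assumptions for illustration; the report may compute its values differently.

```python
def reduction_percent(lan_bytes, wan_bytes):
    """Percentage of LAN bytes saved on the WAN: (LAN - WAN) / LAN * 100."""
    if lan_bytes <= 0:
        return 0.0
    return (lan_bytes - wan_bytes) * 100.0 / lan_bytes


def protocol_weights(lan_bytes_by_protocol):
    """Share of each protocol in the total pre-optimized (LAN) traffic, in %."""
    total = sum(lan_bytes_by_protocol.values())
    if total <= 0:
        return {protocol: 0.0 for protocol in lan_bytes_by_protocol}
    return {
        protocol: lan_bytes * 100.0 / total
        for protocol, lan_bytes in lan_bytes_by_protocol.items()
    }
```

For example, 1000 bytes on the LAN side reduced to 250 bytes on the WAN side is a 75% reduction under this definition.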

Per protocol investigation (throughput)
On the WOC, go to Reports › Optimization ›
Optimized Throughput.
This report gives the throughput usage per second. It
is recommended that you uncheck ‘LAN Peak’ and
‘WAN Peak’, as these are mostly statistical
distortions that shouldn’t be considered. The 95th percentile
is closer to the real ‘peak’ usage from a
business perspective.
You may filter by protocol (port) and adapt the time
frame to your needs.
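
To see why the 95th percentile is a better indicator than the raw peak, consider the sketch below. It uses the nearest-rank method, which is an assumption (this guide does not state how the WOC computes its percentile), and the throughput samples are made up.

```python
def percentile_95(samples):
    """Nearest-rank 95th percentile: the value that 95% of samples do not exceed."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), 1-based rank
    return ordered[rank - 1]


# Hypothetical throughput samples in Mbps: steady load with one short burst.
samples = [10] * 19 + [90]
print(max(samples))            # 90 (the 'peak', caused by a single sample)
print(percentile_95(samples))  # 10 (what the link carries 95% of the time)
```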

Per protocol investigation (data reduction)
On the WOC, go to Reports › Optimization ›
Bandwidth Optimization.
This report gives the data reduction % per second.
You may filter by protocol (port) and adapt the time
frame to your needs.

Per host investigation (live)
On the WOC, go to Reports › Networking › Current
Connections. Filter with ALL optimized.
This report gives you per-connection statistics. The
connections must exist (active or idle) for stats to be
displayed. Connections are removed if a TCP FIN or
RST is seen, or if the WOC service stops.
You may filter by source or destination, IP or port, or
protocol name (e.g. MAPI uses different ports) by using
the search field. You may sort any column as you wish.
There is no export tool for the Current Connections table. To export manually, select the content of the columns
you need while holding the CTRL key so that only those columns are selected. Copy and paste into Notepad, save,
open Excel, open your file (filter with All Files *.*), accept the format warning, accept the default ‘Delimited’
column format, click Other and specify ‘:’ in the box, then click Next and Done. You now have a much more powerful tool to sort,
compile, remove duplicates, etc.
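
The same split can also be scripted instead of done in Excel. The sketch below assumes the pasted text is tab-delimited with IP:port pairs, which is a guess about what the browser copies; the function name and sample rows are hypothetical. Like the ‘:’ delimiter in Excel, it would also split any other field that contains a colon.

```python
import csv
import io
import re


def pasted_table_to_csv(pasted_text: str) -> str:
    """Split each pasted line on tabs and colons and return CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in pasted_text.splitlines():
        if not line.strip():
            continue  # skip blank lines left over from the copy-paste
        writer.writerow(field.strip() for field in re.split(r"[\t:]", line))
    return out.getvalue()


# Hypothetical pasted rows: source, destination, reduction %.
sample = "10.1.1.20:51544\t10.2.2.10:445\t87%\n10.1.1.21:51550\t10.2.2.11:80\t42%\n"
print(pasted_table_to_csv(sample))
```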
Per host investigation (live)
On the WOC, go to Reports › Networking › Top
Talkers.
This report is a lightweight NetFlow-style reporting tool.
It gives you stats bundled either per source
(Sender), per destination (Receiver), combined
source+destination (Host), per TCP port (Application
ports) or per connection (Conversation).
The period is either last hour, last day, or All (two days).
Warning: Passthrough is both internet-bound traffic and
internal corporate traffic that couldn’t be optimized (no
remote WOC, local DMZ, etc.).


15 Getting support from Riverbed TAC


Follow this procedure if you need to open a support ticket with the Riverbed TAC.

Print Screen or Description Action


If applicable, do a packet capture on the CSH and SSH (see section ‘Packet capture’).
If applicable, get screenshots of the problem as it is seen by the user.
Once the problem has been reproduced, go to Reports ›
Diagnostics › System Dumps. Choose ‘Include
Statistics’ and ‘Include All Logs’ then click Generate
System Dump. The dumps will be ready in a few
minutes.

Download the LAN and WAN TCP dumps from both the CSH and SSH (4 TCP dumps in total) taken while the
problem occurs. Do the same with a passthrough rule in place if it clears the problem (4 more TCP dumps). Download
the System Dumps. Name all files explicitly so that the TAC engineer will know which is CSH, which is SSH,
which is optimized (not working) and which is passthrough (working). Wrap all of these into a single ZIP file, and
include any other files you might need such as screenshots, Visio diagrams, etc.
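
One possible way to keep the eight captures clearly named and bundled is sketched below. The naming pattern, the .pcap extension and the ZIP file name are illustrative assumptions, not a Riverbed requirement; the demo uses empty placeholder files in place of real captures.

```python
import itertools
import tempfile
import zipfile
from pathlib import Path

# Per the procedure: optimized captures show the problem, passthrough ones work.
STATUS = {"optimized": "not-working", "passthrough": "working"}


def capture_names():
    """Return 8 explicit names: CSH/SSH x LAN/WAN x optimized/passthrough."""
    return [
        f"{steelhead}_{side}_{mode}_{STATUS[mode]}.pcap"
        for steelhead, side, mode in itertools.product(
            ("CSH", "SSH"), ("LAN", "WAN"), ("optimized", "passthrough")
        )
    ]


def bundle(folder, zip_path):
    """Zip every file found in folder into one archive and list its contents."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(Path(folder).iterdir()):
            if path.is_file():
                archive.write(path, arcname=path.name)
        return archive.namelist()


if __name__ == "__main__":
    # Demo with empty placeholder files standing in for the real captures.
    with tempfile.TemporaryDirectory() as work, tempfile.TemporaryDirectory() as out:
        for name in capture_names():
            (Path(work) / name).touch()
        print(bundle(work, Path(out) / "tac_case_files.zip"))
```

System dumps, screenshots and diagrams dropped into the same folder would be picked up by the same bundle call.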
Log in to https://support.riverbed.com. You’ll need an individual account to get in. If you don’t have one, go
ahead and create one; it will be helpful and only takes 2 minutes.
Once you’re in, go to My Riverbed (top right) then
Cases and RMAs.

Click Submit a Case Online

Fill in the necessary information.

Please use a precise yet short description in the
subject field as it cannot be changed afterward.
‘Connection not working’ or ‘WOC issue’ is too vague.
Use ‘HTTP timeout after RiOS 8.5 upgrade’ or
‘Steelhead won’t boot after reload’ instead.

Priority: Should be P3 if you have a workaround
(passthrough rule until the problem is fixed) or P2 if users
are affected by the problem. P1 should be used very rarely,
as it means the company operations as a whole
are severely degraded or stopped due to this problem.

Use the Steelhead serial # in the Product identified
field. To get the serial go to Support.

Attach the ZIP file you created in the previous step only if
it’s smaller than 50 MB.

Submit the ticket. Note the case ticket #.

If your ZIP was too big to be uploaded through the web
form, connect to ftp.riverbed.com using anonymous as
the user and your email address as the password.
Rename your ZIP as [case ticket #].zip and upload it to
the Incoming folder.
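
The size check and the FTP upload described above could be scripted along these lines. This is a sketch under assumptions: the helper names are hypothetical, ‘50 MB’ is interpreted as 50 × 1024 × 1024 bytes, and the FTP host is passed in by the caller rather than hard-coded. The upload function is defined but not called here.

```python
from ftplib import FTP

WEB_FORM_LIMIT_BYTES = 50 * 1024 * 1024  # "50 MB" limit; binary megabytes assumed


def fits_web_form(size_bytes: int) -> bool:
    """True if the ZIP is small enough to attach in the web form."""
    return size_bytes < WEB_FORM_LIMIT_BYTES


def remote_name(case_number: str) -> str:
    """File name expected on the FTP server: [case ticket #].zip."""
    return f"{case_number}.zip"


def upload_to_tac(host: str, email: str, case_number: str, zip_path: str) -> None:
    """Anonymous FTP upload of the ZIP into the Incoming folder."""
    with FTP(host) as ftp:
        ftp.login(user="anonymous", passwd=email)  # email address as password
        ftp.cwd("Incoming")
        with open(zip_path, "rb") as handle:
            ftp.storbinary(f"STOR {remote_name(case_number)}", handle)


# Hypothetical case number and a 60 MB archive.
print(fits_web_form(60 * 1024 * 1024))  # False, so the FTP route is needed
print(remote_name("123456"))            # 123456.zip
```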

A Riverbed support engineer will eventually contact
you, usually by email but sometimes directly by phone.
If you need faster service, dial 1.888.782.3822, provide
your case ticket # and ask to get hold of your engineer
ASAP.


16 Software downgrade
Follow this procedure if you need to downgrade a recently upgraded WOC because a new
problem was noticed after the upgrade.

Print Screen or Description Action


Once you have determined that the problem is WOC-related
and that it follows a very recent version upgrade, go
to Configure › Maintenance › Software Upgrade.

Click Switch to Backup Version.

Go to Configure › Maintenance › Reboot/Shut Down.
Click Reboot.

Be careful not to click Shut Down, as it is only 3 pixels
away. The WOC will not ask you to confirm. A full shut
down requires a local intervention to power it back on.

Wait a few minutes for the WOC to reboot. Log back in and confirm it is running its previous version. Verify that
the problem is gone.


17 Scheduling a reboot
Follow this procedure if a WOC requires a reboot and you need this to happen outside business hours.

Print Screen or Description Action


Connect to the WOC over HTTP(S), then go to
Configure › Maintenance › Reboot/Shut Down.
Do not check Clear Data Store unless you have
a very good reason to do so.
Click Schedule Later and enter a date/time.
Click Reboot (do not click Shut Down, or else you’ll
need a local contact to power it back on).

Alternatively, you can do the same from the SCC,
where one or many WOCs can be instructed to do the
same.
Log on to the SCC. Go to Manage › Appliances
Operations › Reboot Job.
Click Launch a new reboot job…
On the Welcome screen, click Select the appliances in
the bottom right corner.
Select one or many appliances, using the filter or
browsing through the list.
When done, click Configure settings.

Give a name to the job if you want (optional).
Select Schedule the reboot and enter a date/time.
Select switch partition only if you want the WOC(s) to
reload using their alternate RiOS image.
Click Summary.
The next page is a summary of your reboot job. If
you’re satisfied, click Reboot. If you want to make
changes, click Back.


18 Scheduling a service restart


Follow this procedure if a WOC needs its service restarted and you need this to happen outside
business hours.

Print Screen or Description Action


You cannot schedule a service restart on a WOC itself; it
can only be done from the SCC.
Log on to the SCC and go to Manage › Appliances ›
Appliances.
Click the checkbox that precedes each WOC to be
restarted.
Click Appliance Operations.
Select the operation: Start/Stop Services.
Change Service Actions to Restart.
Do not check Clear Data Store unless you have
a very good reason to do so.
Click Schedule Later and enter a date/time.


19 Service error alarm


While pushing configuration policies from the SCC (CMC) to a WOC, an SSH (server-side Steelhead)
might trigger an alarm like this one:

The optimization service has encountered a non-fatal error condition.

The optimization service is still running but you may want to review the
appliance logs for more information.

This alarm will stay triggered until you manually reset it or the optimization
service is restarted. To reset this alarm without restarting the service, you
can use the CLI command "service error reset" or visit the 'Alarm Status' page
under 'Reports' in the Web Management Console.

Print Screen or Description Action


Connect to the WOC CLI interface by SSH.
Type:
 enable
 service error reset

Exit the CLI interface. Wait a minute and confirm that
the WOC is healthy in the SCC.
