Library Manager : Responsible for media mounts and dismounts; maintains the library inventory and runs library audits.
Library Client : Owns the volumes and requests mounts from the Library Manager (LM).
Storage Agent : A reduced version of the TSM server, used for LAN-free transfer of data between the client and TSM storage pools.

Connectivity :
Library device : Must be connected to the Library Manager
Drive device : Must be connected to both the Library Manager and the Library Client
Lanfree Backup Components
Library [Robotic Arm and Storage Slots] : Def Library
Drives                                  : Def Devclass , Def Drive
Media                                   : Holds the data
HBA                                     :
Device Driver                           :
Switches                                :
Zoning                                  :
BA/TDP Client                           : Client to do backups
Storage Agent                           :
Library Manager                         :
Library Client                          :

TSM Server Configurations


1. Server-to-server communication
- Set the server communication addresses and test connectivity
- Servers must be defined to each other
- SAN media is used for data movement
- TCP/IP is used as the communication protocol
- The library must be shared between the Library Manager and the Library Clients
2. Library definitions at the Library Manager
Library definitions at the Library Client
Paths to the library [ Library Manager only ]
Paths to drives [ Library Manager + all Library Clients that will use the drives ]
Path definitions from the Library Manager to its drives should always be online.
Path definitions from a Library Client to drives may be offline because of connectivity issues.
2.1 Identifying Offline paths marked by system_Console
libraryManager:run rjpst
3. Validating the LAN-free configuration from the TSM server
Validate Lanfree node_name Storage_Agent
- Checks all storage pools allocated for backups to a client, taken from the Mgmtclass destination parameter
- Checks for each stgpool whether the library is shared
- If the library is shared, checks whether paths from the STA to the drives are defined
- If paths are defined, checks whether the paths are online
- Also checks whether the STA is pingable from the TSM server
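The order of the checks above can be sketched in a few lines of Python. This is only an illustration of the decision sequence, not how the TSM server implements Validate Lanfree; the function name `validate_lanfree` and the dictionary layout (`name`, `shared`, `paths`) are assumptions made for the example.

```python
def validate_lanfree(stgpools, sta_pingable):
    """Walk the checks in the order listed above for one node / STA pair.

    stgpools: list of dicts, one per destination storage pool, e.g.
        {"name": "TAPEPOOL", "shared": True, "paths": {"DRV1": "online"}}
    This layout is hypothetical; real data would come from the TSM server.
    """
    results = []
    for pool in stgpools:
        if not pool["shared"]:
            results.append((pool["name"], "library not shared"))
        elif not pool["paths"]:
            results.append((pool["name"], "no paths from STA to drives"))
        elif not any(state == "online" for state in pool["paths"].values()):
            results.append((pool["name"], "all STA paths offline"))
        else:
            results.append((pool["name"], "LAN-free capable"))
    if not sta_pingable:
        results.append(("STA", "storage agent not reachable"))
    return results
```

A pool is reported as LAN-free capable only if every earlier check passes, which mirrors the way each check in the list depends on the previous one.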
4. Client Configurations
- Datareadpath : any|lan|lanfree
- Datawritepath : any|lan|lanfree
- Maxnummp : maximum number of drives (mount points) that a node can use

Client Configurations
1.Drive Availability
Drive Definitions
WINDOWS
- Windows Device Manager [ Windows only ]
- You should be able to see the ult3580 drives or physical drives
- Use the SANsurfer or HBAnyware utility to get the WWPN details
- TSMinstallDir\Storage Agent\tsmdlst /detail [ Windows only ]
- You should be able to see the drive serial number and WWPN; if not, there is a connectivity issue or a driver issue
- Drive test
- TSMinstalldir\device\mttest --> set special device file --> 37 ---> should not give any error.
HPUNIX
- ioscan -fnkCtape | grep -iE 'IBM|^Class' [ HP Unix ]
- For every ULT3580 tape device, test the drive for operability
- tapeutil [ open device: 1 - /dev/rmt/rmtx - open in read-only mode: 2 - query serial number: 3 ]
- If you can see the serial number, the drive is operational
- ioscan -fnkCunknown
- Drives sometimes show up in the unknown device category; this can be a connectivity issue or a device driver issue
- ioscan -fnkCfc
- Used to identify which HBA is being used for the drives
- fcmsutil : the argument is the driver device file from the ioscan output, here /dev/td1 [ FC status ]
LINUX
- cat /proc/scsi/IBMtape [ shows the serial number; this uses the device driver to read the serial number from the device ]
- ls -lrt /dev/IBMtape* [ shows device special names ]
- cat /proc/scsi/qla*/[1-9] [ lists FC adapter ports ]
- /opt/hp/hp_fibreutils/hp_rescan -a [ -a: all devices on QLogic, -l: a specific HBA ] [ removes the added LUNs and then adds them again; do not use unless required ]
SUN SOLARIS
- ls -alrt /dev/rmt | grep IBM [ shows device special names ]
- /opt/IBMtape/tapelist -l [ gives you the serial number ]
- /opt/IBMtape/tapeutil [ drive test ]
- tapeutil [ open device - /dev/rmtx - open in read-only mode: 2 - query serial number ]
- If you can see the serial number, the drive is operational
- ls -l /dev/fc/* [ lists adapters and ports ]
- ls -l /dev/fc/fcp*
To check WWN specifics: `luxadm -e dump_map /devices/pci@8,600000/SUNW,qlc@1/fp@0,0:devctl`

AIX
- lsdev -Cc tape
- lscfg -vl deviceName [ Serial Number ]
- lsdev -Cc adapter [ list all fc adapters ]
- lscfg -vl fc*|rmt*
- grep -p [ AIX grep option: displays the whole paragraph that contains the match ]
- IBMtapeutil -f /dev/IBMtape2 inquiry 80

2. Device Driver
- If the device driver software is not running, the device will not work at all
WINDOWS
- After installation the driver starts automatically as part of the OS drivers
HpUnix
- /usr/sbin/swlist | grep atdd [ you should be able to see the atdd device driver ]
- The kernel loads atdd at the end of the boot process, so you can claim and unclaim devices as long as the kernel has loaded atdd.
Linux
- lin_tape [ Linux 2.6 and above ]
IBMtape has been replaced by lin_tape, which can be found here
<ftp://ftp.software.ibm.com/storage/devdrvr/Linux/RHEL4/Latest/>
lin_tape is an open source driver, but it is essentially the same driver, as it shares most of its code with IBMtape. Even the kernel module it installs is still called IBMtape.ko.
Checking driver status
- /usr/bin/lin_taped status [ for lin_taped ]
- /usr/bin/IBMtaped status [ for 2.4 and lower ]
- /var/log/lin_tape.errorlog [ error log ]
Sun Solaris
- IBM Tape Device Driver, loaded as part of system initialization
AIX
- IBM Tape Device Driver, loaded as part of system initialization
3. TSM Configurations
- dsm.sys LAN-free options
- enablelanfree : if commented out or set to no, LAN-free is not enabled
- lanfreetcpserveraddress : hostname/IP of the storage agent
- lanfreetcpport : port of the storage agent
- lanfreecommmethod : tcpip/sharedmem/namedpipe
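To check these options without reading dsm.sys by eye, a small parser along the following lines may help. It is a minimal sketch: the function names `lanfree_options` and `lanfree_enabled` are made up for this example, and it assumes option names are spelled out in full (abbreviated option names are not handled) and that commented lines start with `*`.

```python
def lanfree_options(dsm_sys_text, servername):
    """Return the LAN-free options set in one server stanza of dsm.sys."""
    wanted = {"enablelanfree", "lanfreetcpserveraddress",
              "lanfreetcpport", "lanfreecommmethod"}
    found = {}
    in_stanza = False
    for raw in dsm_sys_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):   # blank or commented-out line
            continue
        parts = line.split(None, 1)
        key = parts[0].lower()
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "servername":                # a new stanza starts here
            in_stanza = value.lower() == servername.lower()
        elif in_stanza and key in wanted:
            found[key] = value
    return found


def lanfree_enabled(options):
    """LAN-free counts as enabled only when enablelanfree is present and 'yes'."""
    return options.get("enablelanfree", "no").lower() == "yes"
```

A commented-out `enablelanfree` line is skipped, so the stanza is reported as not enabled, which matches the rule above.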
4. Storage Agent
- Usually installed on the client that needs to do LAN-free backups
- Installed at
Unix - /opt/tivoli/tsm/Storage*/bin
AIX - /usr/tivoli/tsm/Storage*/bin
Win - C:\progra*\tivoli\tsm\Storage*
- dsmsta.opt [ devconfig ]
- devconfig.out
- setstorageserver myname=abc mypa=secret myhla=hostname servername=node_reg_instance serverpa=secret hla=ip lla=port
- commmethod
sharedmem -- shmp
namedpipe -- pipename [ Windows only ]
tcpip -- tcpport
5. Starting Stopping Storage Agent
- Linux :
/opt/tivoli/tsm/StorageAgent/bin/dsmsta.rc stop
/opt/tivoli/tsm/StorageAgent/bin/dsmsta.rc start
If the above does not work, kill the process, remove the /opt/tivoli/tsm/StorageAgent/bin/dsmserv.lock file and start it using the above utility.
Verify that the Storage Agent stopped/started [ wait about 5 seconds to allow for the stop/start ]
ps -ef | grep dsmsta
- HPUnix/AIX
- Kill the process to stop it
- Remove the /opt/tivoli/tsm/StorageAgent/bin/dsmserv.lock file
- Start the process with
./dsmsta > /dev/null &
Verify that the Storage Agent stopped/started [ wait about 5 seconds to allow for the stop/start ]
ps -ef | grep dsmsta
- Windows
- Go to the Services panel and stop the Storage Agent service
- Go to the Services panel and start the Storage Agent service
- Sun Solaris
/opt/tivoli/tsm/StorageAgent/bin/dsmsta.rc stop
/opt/tivoli/tsm/StorageAgent/bin/dsmsta.rc start
If the above does not work, kill the process, remove the /opt/tivoli/tsm/StorageAgent/bin/dsmserv.lock file and start it using the above utility.
Verify that the Storage Agent stopped/started [ wait about 5 seconds to allow for the stop/start ]
ps -ef | grep dsmsta
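The `ps -ef | grep dsmsta` verification can also be scripted. The sketch below takes the text output of `ps -ef` and returns the PIDs of dsmsta processes, ignoring the `grep` line itself. The function name `dsmsta_pids` is hypothetical, and the only assumption about the `ps -ef` layout is that the PID is the second column.

```python
def dsmsta_pids(ps_output):
    """Return the PIDs of dsmsta processes found in `ps -ef` output text."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header and anything without a numeric PID in column 2.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        # Skip the "grep dsmsta" process that the check itself creates.
        if "grep" in fields:
            continue
        # Match dsmsta, ./dsmsta or /full/path/dsmsta, but not dsmsta.rc.
        if any(f.rsplit("/", 1)[-1] == "dsmsta" for f in fields):
            pids.append(int(fields[1]))
    return pids
```

On a live system the input could come from `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout`; an empty result after a start, or a non-empty one after a stop, means the Storage Agent did not change state.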
6. Testing Connectivity Between Client Node and Storage Agent
open dsmc -se=stanza
q sess : you should see a storage agent session
Oracle clients
open dsmc -se=oraclestanza
The password is the hostname if the hostname is longer than 8 characters; otherwise the password is the hostname padded with digits [ hostname01... or hostname12... ] to make it 8 characters long
q sess : you should see a storage agent session
7. RMT Mismatch
- The device special name of the drive changes
- Every drive is uniquely identified by its serial number; a drive may have many device special names but only one serial number
7.1 Identifying the path definitions on the Library Manager
lm:q path storageagent
- You will get paths for physical drives if they are defined
- You will get paths for VTL drives if they are defined
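Because the serial number is the stable identifier, an RMT mismatch check comes down to comparing two serial-to-device mappings: the one in the path definitions and the one the operating system currently reports. A minimal sketch follows, assuming both mappings have already been collected by hand or by script; the function name and dictionary layout are made up for the example.

```python
def find_rmt_mismatches(defined_paths, os_devices):
    """Compare path definitions with what the OS currently reports.

    defined_paths: {drive_serial: device name used in the path definition}
    os_devices:    {drive_serial: device name currently seen by the OS}
    Returns {serial: (defined_device, actual_device_or_None)} for each drive
    whose path definition no longer matches the OS view.
    """
    mismatches = {}
    for serial, defined in defined_paths.items():
        actual = os_devices.get(serial)   # None: drive not visible to the OS
        if actual != defined:
            mismatches[serial] = (defined, actual)
    return mismatches
```

A result such as `{"1310012345": ("/dev/rmt1", "/dev/rmt3")}` would mean the path for that drive should be updated to the new device special name; a `None` on the right-hand side points to a connectivity or driver problem instead.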
RMAN Backups :
- Oracle PPT
Common Oracle backup issues
- Storage Agent down
- Maximum mount points exceeded
- Server detected internal error
- Device problem
- Device driver problem
- RMT mismatch issue
- Session goes into MediaW state
- Slow backups, timeouts
- Drive I/O errors at TSM, VTL issues
- Query for max channels
- Query for current backup stats
- Lanfree or Lanbased

Queries
Max Channels
- q node oraclenode f=d
- lm:q path storageagent
- The number of online drives must be at least maxnummp
- If maxnummp=3 with 2 drives online and 1 offline, bring the offline drive online
- If maxnummp=3 with 3 drives online and 1 offline, no action is required
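The rule above can be written as a short helper. This is a sketch only: the function name `drives_to_bring_online` and the `{drive: state}` dictionary are assumptions for the example, with the states taken from the `q path` output by hand.

```python
def drives_to_bring_online(maxnummp, path_states):
    """Return the offline drives needed to cover maxnummp.

    path_states: {drive_name: "online" or "offline"} for the storage agent paths.
    """
    online = sum(1 for state in path_states.values() if state == "online")
    offline = [drive for drive, state in path_states.items() if state != "online"]
    shortfall = max(0, maxnummp - online)
    return offline[:shortfall]
```

With maxnummp=3 and two of three drives online the helper returns the one offline drive; with three drives already online it returns an empty list, so no action is needed.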
Current Backups Stats
- Get a rough start time from the Oracle DBA
- q act orig=client node=oranode begint=-timeestimate
- You will see the completed transactions; these tell you the bytes backed up and the database name
- Tell the DBA the last transaction time
- If the backup is LAN-free, tell the DBA the number of active sessions
- Go to the Library Manager and run
- storageagent:q sess
- Look for client sessions and check the session states
Lanfree or Lanbased Backups
- q node oraclenode f=d
- If datawritepath=lanfree then it is definitely LAN-free; if datawritepath=any, check the client configuration:
- Log in to the client
- cd /opt/tivoli/tsm/client/oracle/bin*
- cat tdpo.opt : note down the dsmi_orc_config value
- cat the opt file named by that parameter
- Look for the servername in the dsm.sys file and check for the lanfree parameter
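The decision described above combines one server-side value (datawritepath from `q node ... f=d`) with one client-side value (enablelanfree from dsm.sys). A rough sketch of that logic, with a hypothetical function name and return labels chosen only for this example:

```python
def backup_data_path(datawritepath, enablelanfree):
    """Classify a node's backup path from datawritepath and enablelanfree.

    Returns "lanfree", "lanfree-or-lan" or "lan" (labels invented for this sketch).
    """
    path = datawritepath.lower()
    if path == "lanfree":
        return "lanfree"              # server allows LAN-free writes only
    if path == "any" and enablelanfree.lower() == "yes":
        return "lanfree-or-lan"       # client tries LAN-free, LAN remains allowed
    return "lan"
```

For example, `backup_data_path("any", "no")` gives `"lan"`, which is the case where the node is allowed to go LAN-free but the client is not configured for it.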
Problem Determination
1. Backups fail with error
ANS0350E The current client configuration does not comply with the value of the DATAWRITEPATH or DATAREADPATH server option for this node.
- The Storage Agent is down or not responding; recycle the storage agent
2. Backups fail with maximum mount points exceeded
- Ask the DBA how many channels are in use and the maximum number allowed
- Ask the DBA to run backups with an appropriate number of channels
3. Backups fail with Unable to allocate device
- Check the paths and bring them online; also check for an RMT mismatch
4. Server detected internal error
- Recycle the storage agent
- Check that the device driver is running
- Ask the VTL team to check for low light
- Check for I/O errors at the TSM server and Library Manager
5. Backup sessions in MediaW state in the storage agent
- Check for an RMT mismatch
- Check that the device driver is running
Sample Errors
dsmerror.log and rman Logs
10/25/08 23:50:18 ANS0278S The transaction will be aborted.
10/25/08 23:50:18 ANS0278S The transaction will be aborted.
10/25/08 23:50:18 ANS0326E This node has exceeded its maximum number of mount points.
10/25/08 23:50:18 ANS0326E This node has exceeded its maximum number of mount points.
08:19:08 ANS0278S The transaction will be aborted.
08:19:08 ANS0278S The transaction will be aborted.
02/23/09 ANS1315W Unexpected retry request. The server found an error while writing the data

02/23/09 21:02:10 Error -50 sending request


02/23/09 21:02:11 ANS1235E An unknown system error has occurred from which TSM cannot recover. ---> Communication issues between [sta and client] or [sta and tsm] or [client and tsm]; recycling sta and retrying usually works
02/23/09 21:02:11 ANS1235E An unknown system error has occurred from which TSM cannot recover.
02/23/09 22:38:49 ANS1301E Server detected system error -------> Communication issues between [sta and client] or [sta and tsm] or [client and tsm]; recycling sta and retrying usually works
02/23/09 22:38:49 ANS1301E Server detected system error
02/23/09 01:40:05 ANS4994S TDP Oracle HP ANU0599 TDP for Oracle: (5536): =>(sd1n0v2_ORACLE_BKUP) ANU2602E The object /adsmorc//c-694282618-20090223-00 was
02/23/09 06:13:18 ANS0278S The transaction will be aborted.
02/23/09 06:13:18 ANS0278S The transaction will be aborted.
02/23/09 06:13:18 ANS1315W Unexpected retry request. The server found an error while writing the data.
02/23/09 06:13:18 ANS1315W Unexpected retry request. The server found an error while writing the data.
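When dsmerror.log is long, counting the message codes gives a quicker picture than reading it line by line. The sketch below assumes log lines of the form `MM/DD/YY HH:MM:SS ANSnnnnX text`, as in the samples here; the function name `summarize` is made up, and the hints in `HINTS` are only the ones already given in these notes for those three codes.

```python
import re
from collections import Counter

# date, time, message code, message text
LINE = re.compile(
    r"^(\d{2}/\d{2}/\d{2})\s+(\d{2}:\d{2}:\d{2})\s+(ANS\d{4}[A-Z])\s+(.*)$"
)

# Hints copied from the notes in this document, not an official mapping.
HINTS = {
    "ANS0326E": "maximum mount points exceeded: check channels against maxnummp",
    "ANS1235E": "communication issue: recycle the storage agent and retry",
    "ANS1301E": "communication issue: recycle the storage agent and retry",
}


def summarize(log_text):
    """Count ANS message codes and attach a hint where one is known."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            counts[match.group(3)] += 1
    return [(code, n, HINTS.get(code, "")) for code, n in counts.most_common()]
```

Lines that lack the date or time prefix (for example wrapped continuation lines) are ignored rather than guessed at.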

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_SBT_TAPE_2 channel at 01/17/2009 14:01:46
ORA-27192: skgfcls: sbtclose2 returned error - failed to close file
ORA-19511: Error received from media manager layer, error text:
ANS1235E (RC-72) An unknown system error has occurred from which TSM cannot recover.

Backup failed because the STA was not running; lanfreetcpserveraddress was also specified. Needs further observation.
------------------------TSM Act Log :

Checking Logs at Client


Oracle config files : /opt/tivoli/tsm/client/oracle/bin*/
cat tdpo.opt : check for the errorlog path
tail -100 tdpo.error
cd /var/opt/oracle/logs
dbname_backup_status : contains the backup status -----> check this for backup status
Dbname_rman_oracle_bkup.timestamp : contains RMAN logs -----> check for Oracle RMAN logs
TSM Actlog
- q act orig=client node=oranode begint=-estimatedstarttime se=error|unable|fail|terminate
- q act se=error begint=-estimatedstarttime
- librarymanager: q act se=error begint=-estimatedstarttime
- q act se=unable begint=-estimatedstarttime
- q act se=fail begint=-estimatedstarttime
- q act se=terminate begint=-estimatedstarttime
- q act orig=server server=sta begint=-? se=error|unable|fail|terminate|severed|abort|conne
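Since the search terms above are run as separate queries, the command strings can be generated rather than typed one by one. A small sketch, assuming the same abbreviated parameters used in these notes; the function name `actlog_queries` and its arguments are hypothetical, and the strings still have to be pasted into an administrative session.

```python
def actlog_queries(begin_time, terms=("error", "unable", "fail", "terminate"), node=None):
    """Build one `q act` command string per search term.

    begin_time: value for begint, e.g. "-02:00" for two hours ago.
    node:       optional client node name; adds orig=client node=<name>.
    """
    base = "q act"
    if node:
        base += f" orig=client node={node}"
    return [f"{base} begint={begin_time} se={term}" for term in terms]
```

For example, `actlog_queries("-02:00", node="oranode")` yields four commands, the first being `q act orig=client node=oranode begint=-02:00 se=error`.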
