Resin Health System

Beyond Java Monitoring and Server Monitoring

Health Checks, Watchdog and Snapshot Report


Caucho Home | Contact Us | Caucho Blog | Wiki | Application Server / Web Server Copyright (c) 1998-2012 Caucho Technology, Inc. All rights reserved. caucho , resin and quercus are registered trademarks of Caucho Technology, Inc.

Java EE Certified

Gartner names Caucho in "Cool Vendors in Platform and Integration Middleware"

Resin Health System (RHS) Overview

Resin Health System (RHS) goes beyond just monitoring server and JVM
Can respond to conditions with actions
Actions can remediate problems
If server is about to go down due to bug, denial of service, or spike
  RHS triggers diagnostics, then restarts
  Resin Application Server keeps running

RHS : Reliability and System Transparency

RHS born from need

Idea for RHS came from doing Resin support
  Thread lock? Can you do a thread dump when you see the problem?
  Running out of memory? Can you do a heap dump?
  How is your machine configured?
  What version?
  What OS?

RHS By Engineers for Engineers

Major features of Resin Health System (RHS)

Ability to respond to problems
  Detect JVM and OS issues
  Avoid zombie processes
  Restarts Resin if there are major problems
Internal monitoring
  Resin internal watchdog thread
  Watches internal meters for problems
  Periodic thread
External monitoring
  Resin watchdog process
  Uses process control, socket connection and periodic ping to determine uptime status
Advanced reporting (PDF)
  Post-mortem analysis
  Thread dump / log dump
  Meters and graphs
  Heap dump

RHS Tracks Metrics

Metrics are things like Available Memory, Number of Requests Per Minute, Garbage Collection Time, CPU Load, etc.
Metrics can be graphed
Tracks historical data for trends
  Can determine anomalies
  Can determine trends
  Can compare current data with baseline data

Visualization

You can view data that Health System collects
  Resin Web Admin
  Watchdog Report
    Post mortem PDF Report
  Snapshot Report
    PDF Report you can generate anytime
    Trigger: CLI, REST, through Web Admin

RHS and Web Admin

RHS: Health Checks

RHS is highly configurable
Similar to Resin's "URL Rewrite" rules
Rules are configurable
  checks, conditions, actions
Internal Watchdog runs periodic checks

Watchdog process

Lightweight process: used to stop and start Resin instances
Can restart an instance on a Java monitoring / server monitoring / health issue
Parent process of Resin Server
Opens socket to Resin Server
Sends "are-you-alive?" ping
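
The ping-and-restart idea above can be sketched roughly as follows. This is an illustration only, not Resin's watchdog code: the class `WatchdogSketch`, its methods, and the miss threshold are all hypothetical, and a plain TCP connect stands in for whatever protocol the real watchdog uses on its socket.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: names and thresholds are illustrative, not Resin's API.
class WatchdogSketch {
    private final BooleanSupplier ping;  // answers "are you alive?"
    private final Runnable restart;      // what to do when the child looks dead
    private final int maxMisses;         // consecutive failed pings tolerated
    private int misses = 0;

    WatchdogSketch(BooleanSupplier ping, Runnable restart, int maxMisses) {
        this.ping = ping;
        this.restart = restart;
        this.maxMisses = maxMisses;
    }

    // A ping that succeeds if a TCP connection opens within the timeout.
    static BooleanSupplier tcpPing(String host, int port, int timeoutMs) {
        return () -> {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(host, port), timeoutMs);
                return true;
            } catch (IOException e) {
                return false;
            }
        };
    }

    // One supervision step; returns true if a restart was triggered.
    boolean checkOnce() {
        if (ping.getAsBoolean()) {
            misses = 0;
            return false;
        }
        misses++;
        if (misses >= maxMisses) {
            misses = 0;
            restart.run();
            return true;
        }
        return false;
    }
}
```

Keeping the ping behind a `BooleanSupplier` is a design choice of this sketch: the restart logic can be exercised without a live server, and `tcpPing` can be swapped in when one is available.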

Watchdog Non Stop Mode

Resin is resilient
If a Denial of Service, unexpected spike, or bug knocks down the JVM, Resin restarts
Beyond that, Resin can detect critical problems and run critical diagnostics so DevOps and developers can get to the root of the problem
Resin has long been a product of choice for embedded devices, network appliances and large deployments
Non Stop mode makes Resin perfect for cloud deployments

Resin Watch-Dog

[Diagrams: Watchdog Process and Resin, in frames labeled "Starting Resin" and "Non-Stop Up State", connected by "Process Ownership" and "TCP Link".]

Internal Watchdog Thread Inside of Resin

[Diagram: Watchdog Process; Resin Process containing the Resin Health System Watchdog Thread.]

Internal Watchdog Health Thread

Runs inside of Resin Server
Runs periodically
  Collects data
  Collects baseline data
Executes series of checks
Rechecks failed conditions
Performs actions when conditions are critical or fatal
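
One pass of such a health thread might look like the sketch below. It is a simplified illustration under stated assumptions: `CheckStatus`, `HealthCycleSketch`, and the recheck limit are hypothetical names, not Resin classes, and only the "critical or fatal" wording comes from the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical status levels, ordered from best to worst.
enum CheckStatus { OK, WARNING, CRITICAL, FATAL }

// Hypothetical sketch of one pass of a health thread; not Resin's classes.
class HealthCycleSketch {
    private final List<Supplier<CheckStatus>> checks = new ArrayList<>();
    private final List<Runnable> actions = new ArrayList<>();

    void addCheck(Supplier<CheckStatus> check) { checks.add(check); }
    void addAction(Runnable action) { actions.add(action); }

    // Runs every check and returns the worst status seen.
    CheckStatus runChecks() {
        CheckStatus worst = CheckStatus.OK;
        for (Supplier<CheckStatus> check : checks) {
            CheckStatus s = check.get();
            if (s.compareTo(worst) > 0) worst = s;
        }
        return worst;
    }

    // One cycle: check, recheck up to maxRechecks times while not OK,
    // and run the actions only if the result is still critical or fatal.
    CheckStatus cycle(int maxRechecks) {
        CheckStatus status = runChecks();
        for (int i = 0; i < maxRechecks && status != CheckStatus.OK; i++) {
            status = runChecks();
        }
        if (status.compareTo(CheckStatus.CRITICAL) >= 0) {
            for (Runnable action : actions) action.run();
        }
        return status;
    }
}
```

Rechecking before acting means a transient blip that clears on the next look does not trigger a dump or restart.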

Resin Java CDI / CanDI and Resin Conf based

RHS configuration extends Resin configuration file resin.xml
RHS uses CanDI (Resin's Java CDI)
  to create and update Java objects
  XML tags exactly match either a Java class or a Java property
  Use HealthSystem JavaDoc
  Use JavaDoc of the various checks and actions
CanDI means classes and config are in JavaDocs

Java Doc / XML conf of RHS

Startup delay: wait for baseline data before recording
Period: how often to check metrics
Recheck period: once some level has been crossed, how often RHS should recheck to see if things are better
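
How the three timing settings interact can be sketched as a small scheduling rule. The class `HealthTimingSketch` and its method are hypothetical and only model the descriptions above; the actual property names and defaults should be taken from the HealthSystem JavaDoc.

```java
import java.time.Duration;

// Hypothetical holder for the three timing settings described above.
class HealthTimingSketch {
    final Duration startupDelay;
    final Duration period;
    final Duration recheckPeriod;

    HealthTimingSketch(Duration startupDelay, Duration period, Duration recheckPeriod) {
        this.startupDelay = startupDelay;
        this.period = period;
        this.recheckPeriod = recheckPeriod;
    }

    // Delay before the next check: wait out the startup delay first,
    // then use the recheck period while a check is failing,
    // and the normal period otherwise.
    Duration nextDelay(Duration uptime, boolean lastCheckFailed) {
        if (uptime.compareTo(startupDelay) < 0) {
            return startupDelay.minus(uptime);
        }
        return lastCheckFailed ? recheckPeriod : period;
    }
}
```

The recheck period is typically set shorter than the normal period so that a failing condition is confirmed or cleared quickly; that relationship is an assumption of this sketch, not something the slide states.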

Types of Health Checks

Health Checks produce Status

Resin Checks and Responds

Health System Actions

Actions based on condition

Actions can be grouped
If in critical state for two minutes, perform group of actions
  Dump JMX values, Dump Threads, Dump Heap, CPU Profile, Restart
If actions take longer than 10 minutes, restart
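
The "critical for two minutes, then run the group" rule can be sketched as a small state tracker. Everything here is hypothetical (`ActionGroupSketch`, the hold time parameter, the millisecond clock passed in by the caller), and the separate rule about restarting when the actions themselves run too long is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: runs a group of actions once a critical state
// has persisted for a hold time. Times are plain milliseconds so the
// caller controls the clock.
class ActionGroupSketch {
    private final long holdMillis;
    private final List<Runnable> actions = new ArrayList<>();
    private long criticalSince = -1;  // -1 means "not currently critical"
    private boolean fired = false;

    ActionGroupSketch(long holdMillis) {
        this.holdMillis = holdMillis;
    }

    void add(Runnable action) { actions.add(action); }

    // Report the current state; returns true if the group ran on this call.
    boolean update(boolean critical, long nowMillis) {
        if (!critical) {
            criticalSince = -1;
            fired = false;
            return false;
        }
        if (criticalSince < 0) criticalSince = nowMillis;
        if (!fired && nowMillis - criticalSince >= holdMillis) {
            fired = true;
            for (Runnable a : actions) a.run();  // run in the order added
            return true;
        }
        return false;
    }
}
```

Adding the diagnostic dumps first and the restart last preserves the evidence before the process goes away, which matches the order listed on the slide.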

Collect data needed to diagnose

When something goes wrong
  Denial of Service Attack
  Application Bug
  Unexpected Spike
RHS collects metrics you need to diagnose problem
Without collection, you are flying blind

Actions better than just watching

Watchdog report (PDF)

Post Mortem Report
  Environment Info
  Server Metrics
  JVM Metrics
  Thread Dump
  Heap Dump
  Metrics Graph

Environment Data

Collect critical information about environment
  When, what OS, what version of Resin
  How did Resin start up
  And much more

Health Status

Status of Health Checks in Report

Recent Errors and Warnings

Anomalies

Health Checking stores baseline
Anomalies are configurable triggers based on large changes from expected baseline
Anomaly detection is configurable and can trigger actions
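
A baseline comparison of this kind can be sketched as follows. The slide does not say how Resin measures a "large change", so this example swaps in a common stand-in: a sample is flagged when it lies more than `k` standard deviations from the running mean. The class `AnomalySketch` and both parameters are hypothetical.

```java
// Hypothetical sketch: flags a sample that deviates from a running
// baseline by more than k standard deviations. The baseline mean and
// variance are maintained with Welford's online algorithm.
class AnomalySketch {
    private final double k;
    private final int minSamples;  // samples required before flagging anything
    private long n = 0;
    private double mean = 0.0;
    private double m2 = 0.0;       // sum of squared deviations from the mean

    AnomalySketch(double k, int minSamples) {
        if (minSamples < 2) {
            throw new IllegalArgumentException("minSamples must be at least 2");
        }
        this.k = k;
        this.minSamples = minSamples;
    }

    // Returns true if the sample is anomalous relative to the baseline
    // collected so far, then folds the sample into the baseline.
    boolean observe(double x) {
        boolean anomaly = false;
        if (n >= minSamples) {
            double std = Math.sqrt(m2 / (n - 1));
            anomaly = Math.abs(x - mean) > k * std;
        }
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
        return anomaly;
    }
}
```

Requiring a minimum number of samples before flagging plays the same role as waiting for baseline data: with too little history, every value would look unusual.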
Understanding Anomaly Detection

Types of Metric Graphs in Report

Cluster Status
Request Count
Request Time
HTTP Request Errors
Log Warnings
Threads
CPU Usage
Database Connection Active
Database Query Time
NetStat
JVM Memory
  Heap Used
  Tenured Used
  PermGen Used
GC Time

Sample Graphs Memory and GC Time

Sample Metric Graphs Request

GC and Memory Metrics

Heap Dump

Heap dump is critical for tracking down memory leaks
Also generates an hprof file, which can be analyzed by many third-party tools
CPU Profile / Thread Dumps

Critical for debugging thread deadlock issues

Snapshot report

Reports same type of data as the watchdog report
Watchdog report is a postmortem analysis
Snapshots can be taken whenever you like
  e.g., during a stress test
  Trigger via REST, CLI and Web Admin

Conclusion

More Background Info About Health System

Resin Health System: Java Monitoring and Server Monitoring built into Resin Application Server
Resin Health System: Current and Into the Future
Resin Application Server Fulfills Vision of Cloud Computing
Resin Health System Enhancements

More Info

Caucho Technology | Home Page
Resin | Application Server
Resin | Java EE Web Profile Application Server
Resin - Cloud Support | 3G - Java Clustering
Resin | Java CDI | Dependency Injection / IoC
Resin - Health System | Java Monitoring and Server Monitoring
Download Resin | Application Server
Watch Resin | Application Server Featured Video

Resin Java Application Server
