
RESEARCH STATEMENT

My research lies at the intersection of distributed systems, data management, and network computing. I am interested in the performance, scalability, reliability and security of large-scale, data-intensive systems, with a special focus on Cloud computing, system virtualization and service computing. My research approach is characterized by the innovative use of cross-field ideas, large-scale design and deployment, and collaboration. First, my research aims at designing, implementing, deploying and measuring real large-scale distributed systems to meet the demanding scalability requirements of the Cloud computing era, and to reveal important hidden problems as well as opportunities for new functionality and optimization. Second, my research creatively applies ideas from multiple domains to solve complex problems. Third, my research benefits from collaboration with people of diverse expertise and experience: professors, industrial researchers and engineers, and peer students of diverse backgrounds. By leveraging the domain knowledge and experience of my collaborators, I conduct research on problems of real-world relevance with clear research goals and visions, and develop solutions applicable to a broader range of applications. In the remainder of this research statement, I give an overview of my dissertation research, which demonstrates my research strategies, and then describe my future research plan and directions.

Dissertation Research

My dissertation research tackles the emerging research theme of providing advanced monitoring functionality as Cloud services, to help users manage the Cloud and harness its power. FORMCEPT aims to take up the challenge of managing the incoming flow of unstructured content, irrespective of its source. The solution will be built on top of a distributed file system and message queues. It will handle the incoming flow of data, the storage of data, and retrieval on demand. This project deals with building a data collection framework associated with FORMCEPT. The framework currently focuses only on the social network Twitter. It collects data by searching for a specified keyword. The resulting data is stored in HBase, a distributed database built on top of Hadoop's distributed file system. Each record consists of the tweet message, the image of the author, the date the message was created, and the user id. Each search returns hundreds of tweets. The controlled movement of data is handled by a message queue using ActiveMQ. The collected data will be analyzed, and the analysis will be presented using graphs. The last part of the project will be a search interface on top of the Hadoop-based store, through which the data can be retrieved from the database.

Scope and Objectives

One result of the Internet's rapid growth has been a huge increase in the amount of information generated and shared by organizations in almost every industry and sector. This growth has created an equally huge need for tools that can be used to manage what we call unstructured data. The challenge of managing unstructured data represents perhaps the largest data management opportunity for our community since managing relational data. This project deals with a data collection framework, which includes storing unstructured data by means of a structured framework.

Objectives

1. Install and configure Hadoop and HBase on a working node.
2. Study the Hadoop/HBase API and write several HBase test programs to demonstrate functionality.
3. Write several HBase programs to test the performance of HBase under a variety of conditions not covered by HBase's own Performance Evaluation test.
4. Fetch the content, both structured and unstructured.
5. Store and search the content.
6. Analyse the data and represent the results using graphs.
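
The collection pipeline described above (search by keyword, move records through a message queue, store them in a row-keyed table, retrieve on demand) can be illustrated with a small sketch. This is not the project's implementation: an in-memory `queue.Queue` stands in for ActiveMQ and a plain dictionary stands in for an HBase table, so the example runs without a cluster. The names `Tweet`, `row_key`, `collect`, `drain_to_store` and `search`, the `|` key separator, and the row-key layout are all assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from queue import Queue


@dataclass(frozen=True)
class Tweet:
    """One search hit, with the four fields named in the project description."""
    user_id: str
    message: str
    author_image_url: str
    created_at: str  # ISO 8601 text, so string order matches time order


def row_key(keyword: str, tweet: Tweet) -> str:
    # Hypothetical key layout: keyword first, so that all hits for one
    # search sit next to each other and can be read with a prefix scan.
    return f"{keyword.lower()}|{tweet.created_at}|{tweet.user_id}"


def collect(keyword: str, search_results, queue: Queue) -> None:
    """Producer: push every search hit for a keyword onto the queue."""
    for tweet in search_results:
        queue.put((keyword, tweet))


def drain_to_store(queue: Queue, table: dict) -> int:
    """Consumer: move queued records into the table; return how many were stored."""
    stored = 0
    while not queue.empty():
        keyword, tweet = queue.get()
        table[row_key(keyword, tweet)] = asdict(tweet)
        stored += 1
    return stored


def search(table: dict, keyword: str) -> list:
    """Retrieve all records for a keyword, oldest first (emulates a prefix scan)."""
    prefix = f"{keyword.lower()}|"
    return [table[key] for key in sorted(table) if key.startswith(prefix)]


if __name__ == "__main__":
    queue, table = Queue(), {}
    hits = [Tweet("u1", "Trying out HBase", "http://example.invalid/u1.png",
                  "2012-01-01T09:00:00")]
    collect("HBase", hits, queue)
    print(drain_to_store(queue, table), "record(s) stored")
    print(search(table, "hbase"))
```

Putting the keyword at the front of the row key is a design choice of this sketch: a store that keeps rows sorted by key can then answer a keyword search by reading one contiguous range instead of scanning everything.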
