
CHAPTER 1

INTRODUCTION

1.1. PROJECT OVERVIEW

A web crawler forms an integral part of any search engine. The basic task of a crawler is to
fetch pages, parse them to get more URLs, and then fetch these URLs to get even more
URLs. In this process the crawler can also log these pages or perform other operations on
the fetched pages, according to the requirements of the search engine. Most of these auxiliary
tasks are orthogonal to the design of the crawler itself.

The explosive growth of the web has rendered the simple task of crawling the web
non-trivial. With this rapid increase in the search space, crawling the web is becoming more
difficult day by day. But all is not lost: newer computational models are being introduced to
make resource-intensive tasks more manageable.

The price of computing keeps falling. It has now become very economical to use several
cheap computation units in a distributed fashion to achieve high throughput. The challenge
in using such a distributed model is to distribute the computation tasks efficiently, avoiding
overheads for synchronization and maintenance of consistency.

Scalability is also an important issue for such a model to be usable.

In this project, the architecture of a scalable, distributed web crawler has been proposed
and implemented. It has been designed to make use of cheap resources and tries to remove
some of the bottlenecks of present crawlers in a novel way. For the sake of simplicity and
focus, we worked only on the crawling part of the crawler, logging only the URLs. Other
functions can easily be integrated into the design.

1.2. OBJECTIVES OF THE PROJECT

The objective of the project is to allow the user to store a website and its links on his
system and analyze the result. The project will also help in finding the broken links in any
website. Our main objectives during the development of the project were:

Increased resource utilization (by multithreaded programming to increase concurrency).
Effective distribution of crawling tasks with no central bottleneck.
Easy portability.
Limiting the request load for all the web servers.
Configurability of the crawling tasks.

Besides catering to these capabilities, our design also includes a probabilistic hybrid search
model. This is done using a probabilistic hybrid of stack and queue ADTs (Abstract Data
Types) for maintaining the pending URL lists. Details of the probabilistic hybrid model are
presented later in the report. This distributed crawler is a peer-to-peer distributed
crawler, with no central entity.

By using a distributed crawling model we have overcome bottlenecks such as:

Network throughput
Processing capabilities
Database capabilities
Storage capabilities

The database capability bottleneck is avoided by dividing the URL space into disjoint sets,
each of which is handled by a separate crawler. Each crawler parses and logs only the
URLs that lie in its URL space subset, and forwards the rest of the URLs to the corresponding
crawler entity. Each crawler has prior knowledge of the lookup table relating each
URL subset to an [IP:PORT] combination, which identifies all the crawler threads.

CHAPTER 2

LITERATURE REVIEW

The crawler system consists of a number of crawler entities, which run on distributed sites
and interact in peer-to-peer fashion. Each crawler entity knows its own URL subset, as well
as the mapping from each URL subset to the network address of the corresponding peer
crawler entity. Whenever a crawler entity encounters a URL from a different URL subset,
the URL is forwarded to the appropriate peer crawler entity based on the URL subset to
crawler entity lookup. Each crawler entity maintains its own database, which stores only the
URLs from the URL subset assigned to that entity. The databases are disjoint and can be
combined offline when the crawling task is complete.

CRAWLER ENTITY

Each crawler entity consists of several crawler threads, a URL handling thread, a URL
packet dispatcher thread and a URL packet receiver thread. The URL set assigned to each
crawler entity is further divided into subsets, one for each crawler thread. Each crawler
thread has its own pending URL list. Each thread picks up an element from its pending URL
list, generates an HTTP fetch request, gets the page, parses the page to extract any URLs
in it and finally puts them in the job pending queue of the URL handling thread.
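
The parsing step of a crawler thread can be sketched as follows. This is only an illustration under stated assumptions: the class name LinkExtractor and the regular expression are hypothetical and are not taken from the project's source, and a regular expression is a simplification that a real HTML parser would handle more robustly.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts absolute http/https links from a fetched page. */
final class LinkExtractor {
    // Matches href="..." or href='...' and captures the target up to a quote or fragment.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    static List<String> extract(String baseUrl, String html) {
        URI base = URI.create(baseUrl);
        List<String> links = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            try {
                // Resolve relative links against the URL of the page they were found on.
                URI resolved = base.resolve(matcher.group(1).trim());
                String scheme = resolved.getScheme();
                if ("http".equals(scheme) || "https".equals(scheme)) {
                    links.add(resolved.toString());
                }
            } catch (IllegalArgumentException e) {
                // Malformed link: skip it and continue with the rest of the page.
            }
        }
        return links;
    }
}
```

Each returned URL would then be placed on the job queue of the URL handling thread.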

During initialization, the URL handling thread reads the hash to [IP:PORT] mapping. It also
has a job queue. This thread gets a URL from the job queue and checks whether the URL
belongs to the URL set corresponding to the crawler entity. It does so based on the last few
bits of the hash of the domain name in the URL, in conjunction with the hash to [IP:PORT]
mapping.
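
A minimal sketch of this ownership test is given below. The report does not say which hash function is used, so CRC32 is an assumption here, as are the class name UrlPartitioner and the restriction to a power-of-two number of table entries (which lets the "last few bits" be taken with a bit mask).

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.CRC32;

/** Hypothetical helper: maps a URL to the [IP:PORT] entry of the crawler entity that owns it. */
final class UrlPartitioner {
    private final String[] peers; // index = low bits of the domain hash, value = "IP:PORT"
    private final int mask;

    UrlPartitioner(String[] peers) {
        if (peers.length == 0 || Integer.bitCount(peers.length) != 1) {
            throw new IllegalArgumentException("table size must be a power of two");
        }
        this.peers = peers.clone();
        this.mask = peers.length - 1;
    }

    static String domainOf(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("no host in " + url);
        }
        return host.toLowerCase(Locale.ROOT);
    }

    /** Last few bits of the hash of the domain name (CRC32 is an assumed choice). */
    int bucketOf(String url) {
        CRC32 crc = new CRC32();
        crc.update(domainOf(url).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() & mask);
    }

    String ownerOf(String url) {
        return peers[bucketOf(url)];
    }
}
```

A URL handling thread would compare ownerOf(url) with its own address: on a match the URL is processed locally, otherwise it goes to the dispatcher queue.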

If the URL belongs to another entity, the thread puts the URL on the dispatcher queue and
gets a new URL from its job queue. If the URL belongs to its own set, it first checks the
URL-seen cache; if the URL is not found there, it queries the URL database to check whether
the URL has been seen, and puts the URL in the URL database. It then puts the URL into the
pending URL list of one of the crawler threads.
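
The two-step "seen" test can be sketched as below. This is an assumption-laden illustration: an in-memory set stands in for the MySQL URL database, and an exact LRU cache built on LinkedHashMap is swapped in for the project's own approximate LRUCache class. The names UrlSeenFilter and markIfNew are hypothetical.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: URL-seen cache in front of a (simulated) URL database. */
final class UrlSeenFilter {
    private final Map<String, Boolean> cache;
    private final Set<String> database = new HashSet<>(); // stand-in for the URL table

    UrlSeenFilter(final int cacheCapacity) {
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > cacheCapacity;
            }
        };
    }

    /** Records the URL and returns true only if it had never been seen before. */
    synchronized boolean markIfNew(String url) {
        if (cache.get(url) != null) {
            return false; // cache hit: already seen, no database query needed
        }
        boolean isNew = database.add(url); // Set.add returns false if already stored
        cache.put(url, Boolean.TRUE);
        return isNew;
    }
}
```

Only URLs for which markIfNew returns true would be passed on to a crawler thread's pending list.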

URLs are assigned to a crawler thread based on domain names. Each domain name is
serviced by only one thread; hence only one connection is maintained with any given
server. This makes sure that the crawler doesn't overload a slow server.

A different hash is used for distributing jobs among the crawler threads than for
determining the URL subset. The objective is to isolate the two operations, so that
there is no correlation between a crawler entity and the thread to which a domain is
assigned within it; this balances the load evenly among the threads.

The decision to divide the URL space on the basis of domain names was based on the
observation that a lot of pages on the web tend to have links to pages in the same domain.
Hence, if all URLs with a particular domain name lie in the same URL space, these URLs
need not be forwarded to other crawler entities. Thus this scheme provides an effective
strategy to divide the crawl task between the different peer-to-peer nodes of this distributed
system. We validate this argument in our experiments described in Section 7.

The URL dispatcher thread communicates the URLs to the corresponding crawler entity. A
URL receiver thread collects the URLs received from other crawler entities, i.e. those sent
by the dispatcher threads of those crawler entities, and puts them on the job queue of the
URL handling thread.
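
The thread-level assignment can be sketched as follows. The report only states that a different hash is used at this level; the mixing of String.hashCode shown here, and the names DomainRouter, threadFor and pendingListOf, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical helper: routes every URL of a domain to the same crawler thread. */
final class DomainRouter {
    private final List<Queue<String>> pendingLists = new ArrayList<>();

    DomainRouter(int threadCount) {
        for (int i = 0; i < threadCount; i++) {
            pendingLists.add(new ConcurrentLinkedQueue<>());
        }
    }

    /** Thread index for a domain; assumed to be unrelated to the entity-level hash. */
    int threadFor(String domain) {
        int mixed = domain.hashCode() * 0x9E3779B9; // multiplicative mixing
        return Math.floorMod(mixed >>> 16, pendingLists.size());
    }

    void enqueue(String domain, String url) {
        pendingLists.get(threadFor(domain)).add(url);
    }

    Queue<String> pendingListOf(int thread) {
        return pendingLists.get(thread);
    }
}
```

Because the thread index depends only on the domain name, all pages of one server end up in one pending list, which is what keeps the crawler to a single connection per server.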

THE IMPLEMENTATION

The system was implemented on the Java platform for portability reasons. MySQL was used
for the URL database. Even though Java is less efficient than languages that compile to
native machine code, and none of the team members were proficient with it, we selected
Java for this prototype. The reasons behind this decision were to keep the software
architecture modular, to make the system portable, and to deal with the complexity of
such a system. In retrospect this turned out to be a good decision, as we might not have been
able to complete this project in time had we implemented it in another language such as C.

The comprehensive libraries provided with Java allowed us to concentrate our efforts on the
design of the system and the software architecture. A Java class was written for each of the
components of the system (i.e. the different kinds of threads, database, synchronized job
queues, caches etc.). First we wrote generic classes for infrastructure components of the
system such as synchronized job queues and caches. The LRUCache class implements an
approximate LRU cache based on a hash table with overlapping buckets. The JobQueue class
implements a generic synchronized job queue with an option for a probabilistic hybrid of
stack and queue ADTs.
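
One way to read the "probabilistic hybrid of stack and queue" is sketched below: on each removal the queue behaves like a stack with some probability and like a FIFO queue otherwise. This is an interpretation, not the project's JobQueue source; the class name HybridJobQueue and the parameter stackProbability are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

/** Hypothetical sketch of a synchronized job queue mixing LIFO and FIFO removal. */
final class HybridJobQueue<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private final double stackProbability; // 0.0 = pure queue, 1.0 = pure stack
    private final Random random;

    HybridJobQueue(double stackProbability, Random random) {
        this.stackProbability = stackProbability;
        this.random = random;
    }

    synchronized void put(T item) {
        items.addLast(item);
        notifyAll(); // wake up any thread blocked in take()
    }

    /** Blocks until a job is available. */
    synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        return pick();
    }

    /** Non-blocking variant: returns null when the queue is empty. */
    synchronized T poll() {
        return items.isEmpty() ? null : pick();
    }

    private T pick() {
        // Newest item (stack behaviour) with the given probability, else the oldest.
        return random.nextDouble() < stackProbability
                ? items.removeLast()
                : items.removeFirst();
    }
}
```

With a probability of 0.0 the pending list is crawled breadth-first, with 1.0 depth-first, and values in between blend the two orders.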

The main Crawler class performs the initialization by reading the configuration files,
spawning the various threads accordingly and initializing the various job queues. It then
behaves as the Handler Thread. A class named CrawlerThread performs the operation of the
Crawler Thread. This thread simply gets a URL from its job queue and passes this URL to the
URLlist class. The URLlist class then spawns a new thread that fetches the page, parses it for
URL links and returns the list of these URLs to the CrawlerThread.

In Java the URL fetch operation is not guaranteed to return, and in the case of a malicious
web server the whole thread can hang, waiting for the operation to complete. This is the
reason why the URLlist class spawns a new thread every time it fetches a URL. The thread is
given a time-out; if the URL fetch operation isn't completed in time, the thread is stopped
after the time-out and normal operation resumes. Spawning a new thread to fetch each page
does add overhead to the operation, but it is essential for the robustness of the system.
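
The time-out mechanism can be sketched with the standard java.util.concurrent classes. This is not the project's URLlist class: the name TimedFetch and the idea of returning a fallback value are assumptions, and an ExecutorService with Future.get is used in place of hand-managed threads.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a fetch on its own thread and gives up after a time-out. */
final class TimedFetch {
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        // A daemon thread cannot keep the JVM alive if the fetch never returns.
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable);
            thread.setDaemon(true);
            return thread;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker; the caller resumes at once
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the fetch itself failed
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A crawler thread could wrap its page download in such a call and treat the fallback value as "page not available". On current Java versions, setting connect and read time-outs on the URLConnection is an additional safeguard.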
The Sender and Receiver classes implement the Sender and Receiver threads respectively.
The Receiver class opens a UDP socket at a pre-determined port and waits for any packet.
The Sender class transmits the URLs via UDP packets to the appropriate remote node.
Besides the classes that form the system architecture described before, we added a Probe
Thread and a Measurement class to the system.

The relevant classes report the appropriate measurements to the Measurement class, and the
Probe Thread asks the Measurement class to output the measurements at configurable
periodic time intervals.
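
The exchange between the Sender and Receiver classes described above can be sketched with the standard DatagramSocket API. The one-URL-per-packet format, the 2048-byte buffer and the class name UrlDatagrams are assumptions; the report does not describe the packet layout.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: sends and receives a single URL per UDP packet. */
final class UrlDatagrams {
    static void send(DatagramSocket socket, String url, InetAddress host, int port)
            throws IOException {
        byte[] payload = url.getBytes(StandardCharsets.UTF_8);
        socket.send(new DatagramPacket(payload, payload.length, host, port));
    }

    static String receive(DatagramSocket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis); // do not block forever on a silent peer
        byte[] buffer = new byte[2048];     // assumed upper bound on URL length
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                StandardCharsets.UTF_8);
    }
}
```

UDP gives no delivery guarantee, so a URL packet can be lost; for a crawler this only means that one link may go unvisited unless it is discovered again from another page.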

In this project a group of computers is used to implement the distributed crawler. Every
node in the group has a maximum capacity for storing sites. To crawl a site, the user
selects the IP address of the target machine and a shared location. On clicking the search
button, the content is downloaded to the remote machine. The user also has the choice of
saving the files to a local drive if the remote computer is not available.

CHAPTER 3

SYSTEM ANALYSIS

3.1 IDENTIFICATION OF NEEDS

Information Retrieval is the area of computer science concerned with retrieving information
about a subject from a collection of data objects. This is not the same as Data Retrieval,
which in the context of documents consists mainly of determining which documents of a
collection contain the keywords of a user query. Information Retrieval deals with satisfying a
user need. Although an important body of Information Retrieval techniques was published
before the invention of the World Wide Web, there are unique characteristics of the Web
that made them unsuitable or insufficient.

The low cost of publishing in the "open Web" is a key part of its success, but implies that
searching for information on the Web will always be inherently more difficult than searching
for information in traditional, closed repositories.

The typical design of search engines is a "cascade", in which a Web crawler creates a
collection which is indexed and searched. Most designs of search engines consider the
Web crawler as just a first stage in Web search, with little feedback from the ranking
algorithms to the crawling process. This is a cascade model, in which operations are executed
in strict order: first crawling, then indexing, and then searching. An aim of this approach is to
provide the crawler with access to all the information about the collection to guide the
crawling process effectively. This can be taken one step further, as there are tools available
for dealing with all the possible interactions between the modules of a search engine.

3.2 PRELIMINARY INVESTIGATION

Requirement Determination is the heart of System Analysis, aimed at acquiring a detailed
description of all important areas of the business that is under investigation. So the whole
business process is studied. Many fact-finding techniques are available for requirements
determination. Some of them are given below:

Existing documentation, forms, file and records
Research and site visits
Observation of the work environment
Questionnaires
Interviews and group work sessions.

Of the above mentioned techniques, the following are used for requirements determination
in the project Web Crawler using Distributed Links:

Existing documentation, forms, file and records
Research and site visits
Questionnaires

3.2.1 Existing Documentation, Forms, File and Records: Existing information is
absolutely essential for an organization. These documents provide information about the
present happenings in the organization. They help in knowing what happens right now,
and a better system can be developed only after a correct understanding of the current
system. So the following documents were collected for the study of the current system.

Website crawling procedures
Format of topics
Available protocols for storing the information
List of common questions asked by the users

The above documents provide information about the forms and the reports to be built
and the type of information to be stored.

3.2.2 Research and Site Visits: This is also a fact-finding technique; it means studying
the application and the problem area. In this project many industries have been visited
to find answers to some common questions, such as:

What is the working of Google?
What are the various algorithms available for crawling?
Performance evaluation of the various algorithms.

3.2.3 Questionnaires: A questionnaire is a document prepared for a special purpose that
allows the analysts to collect information and opinions from a number of respondents.

It contains a list of questions. The questionnaires are distributed to the selected
respondents, who answer the questions in their own time and return the questionnaires to
the analysts with the answers. The analysts can then analyze the responses and reach
conclusions.
Sample questions used in the Web Crawler using Distributed Links:

What are the services provided by the organization?
What is the specific data of the company?
How are the records maintained by users?
How are the tests conducted?
What are some of the sample layouts of the reports?
What are the outputs of the system?

3.2.4 Personal Interviews: There are always two roles in a personal interview. The
analyst is the interviewer, who is responsible for organizing and conducting the
interview. The other role is the interviewee, who is the end-user, the manager or the
decision maker. The interviewee is asked a number of questions by the interviewer.
In this project, interviews of the company head, department heads and employees were
conducted to ascertain their expectations of the system.

3.3 FEASIBILITY STUDY:

Feasibility Study is an important part of the Preliminary Investigation because only feasible
projects go to development stages. A very basic feasibility study for the current project is
given below:

3.3.1 Technical Feasibility: Technical feasibility raises questions like: can the work be
done with the current equipment? What software technology is required, and is it
possible to develop the system with it?

This project fully supports Windows XP/2000 but lacks support for Windows 98 and
lower versions. The front end and back end tools for the development of this project
are also available. In this project Swing and Servlets have been used as the front end
while MySQL is used as the back end. Both are easily available.
Thus it can be concluded that the project is technically feasible.

3.3.2 Economic Feasibility: It deals with the economic impact of the system on the
environment in which it is used, i.e., the benefits of creating the system.
This project will save the precious time spent recording the same data again and
again. The software is also designed to reduce the time and cost of calculating
critical data. The security provided by the software is an additional benefit.
Thus it can be concluded that the project is economically feasible.

3.3.3 Operational Feasibility: It deals with the user friendliness of the system, i.e., will
the system be used if it is developed and implemented? Or will there be resistance
from the users?
In this project care has been taken to make the system highly user friendly, so that a
person having only a little knowledge of English can handle it. Online help as well as
special help programs, which help in training the user, are also built in.
Thus the project is operationally feasible.

3.3.4 Legal Feasibility: This type of feasibility evaluates whether our project breaks any
law or not. According to the analysis, this project doesn't break any laws. So, it is
legally feasible.

CHAPTER 4

SOFTWARE SPECIFICATION

SOFTWARES USED

There were many technologies available for the development of the project: for example,
for front-end development, Visual Basic 6, PowerBuilder, X-Windows, Visual Basic.NET,
Oracle Developer 2000, VC++ and JBuilder, and for the back end, Oracle, Ingres, Sybase,
SQL Plus, MySQL etc. Among these technologies, Swing and Servlets were selected as the
front end tools and MySQL as the back end, for the following reasons.

4.1 REASONS FOR THE SELECTION OF SWING & SERVLET

Swing and Servlets are Java technologies originally developed by Sun Microsystems:
Swing is a GUI toolkit for desktop user interfaces, and Servlets are a server-side
API for handling web requests. Together with the Java language they make it possible
to develop sophisticated applications quickly. Java is object oriented: apart from
the primitive types, everything in a Java program is an object.

Servlets generate HTML from Java code, which allows the developer to build web
front ends efficiently and effectively. Apart from this, Java is platform
independent and can run on any machine that has a Java Virtual Machine.

Servlets can also answer AJAX requests, which enables a web page to be refreshed
partially instead of being reloaded as a whole. Swing, for its part, offers a set of
pre-defined controls for building the desktop interface. Thus Java enables the
programmer to build efficient front ends.

Java is an object oriented programming language, so it allows the project to model
real world entities using features like classes, objects, encapsulation, abstraction
and inheritance. This allows a programmer to build a more robust and scalable
application.

Servlet-based front ends can use HTML, CSS and JavaScript, and Java provides a set of
pre-defined classes in the form of JDBC that can be used to access and update databases.

4.2 REASONS FOR THE SELECTION OF MySQL

MySQL is one of the most widely used back end tools for developing application software.
It is gaining popularity because it supports the following:

Updating the database.
Retrieving information from the database.
Accepting query language statements.
Enforcing security specifications.
Managing data sharing.
Optimizing queries.
Managing system catalogs.

MySQL provides the following advantages for both clients and servers:

Client Advantages:

Easy to use.
Supports multiple hardware platforms.
Supports multiple software applications
Familiar to the user

Server Advantages:

Reliable
Concurrent
Sophisticated locking
Fault tolerant

That is why MySQL was selected as the back end tool.

Apart from the above mentioned reasons, relevant experience with Swing, Servlets
and MySQL Server led us to select them as the front end and back end tools for
developing the project.

CHAPTER 5

SYSTEM SPECIFICATION

5.1. HARDWARE REQUIREMENTS

The project Web Crawler requires the following hardware for its successful implementation.

HARDWARE

Processor : Dual Core or above
RAM : 512 MB or above
Hard Disk : 10 GB or above
Monitor : TFT or LCD
Internet Connection : Broadband Connection

5.2. SOFTWARE REQUIREMENTS

The project Web Crawler System requires the following software for its successful
implementation.

SOFTWARE

Operating System : Windows 7 or above
Programming Language : Java, Java Swing
IDE : NetBeans IDE 7.2
Database : MySQL
Connector : MySQL Java Connector

CHAPTER 6

PROJECT DESCRIPTION

SOFTWARE REQUIREMENT SPECIFICATION

Based on the System Analysis described in last few pages a complete Software Requirement
Specification can be prepared which is described below:

6.1 INTRODUCTION

Purpose: The purpose of the software is to support users in storing the web pages of a
website by performing a series of crawl operations up to a given level. These pages
are stored in a folder, and the user can refer to them for further study.
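
Crawling "up to the given level" can be sketched as a breadth-first traversal that stops after a fixed number of link hops. This is an assumption-based illustration: the seed page is taken to be level 0, the function linksOf stands in for fetching and parsing a page, and the class name DepthLimitedCrawl is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch: visits pages level by level, never deeper than maxLevel. */
final class DepthLimitedCrawl {
    static List<String> crawl(String seed, int maxLevel,
                              Function<String, List<String>> linksOf) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(seed);
        seen.add(seed);
        for (int level = 0; level <= maxLevel && !frontier.isEmpty(); level++) {
            Deque<String> next = new ArrayDeque<>();
            for (String url : frontier) {
                visited.add(url); // here the real crawler would store the page
                for (String link : linksOf.apply(url)) {
                    if (seen.add(link)) { // true only the first time a link is met
                        next.add(link);
                    }
                }
            }
            frontier = next; // pages one hop further away
        }
        return visited;
    }
}
```

With maxLevel set to 1, for example, the seed page and the pages it links to directly are stored, and nothing beyond them.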

Scope: The software would be of great importance for a company. Although the
software is specially designed for companies, it could also be used by any
organization or institute to provide offline study of webpages.

Benefits: The project will automatically navigate through the pages, generate records,
save a lot of bandwidth, and allow offline study of webpages.

6.2 OVERALL DESCRIPTION

Product Description: The product is named Web Crawler. The system is developed
using technologies such as Servlets, AWT, Swing and MySQL.

Product Functioning: The client will be able to store frequently visited webpages on
his local hard disk. The raw data is then verified, and finally a set of operations is
performed. For example, for the user database a new user can be added, an existing
user can be removed, or the password can be changed.

Functions of the Project: There are six major functions of the software:

a) User Verification
b) Upload Raw Data
c) Validate Data
d) Use Validated Data
e) Take Input From The User
f) Save Data Again

Users of the product: There will be four major users of the software:

a) Owner of the company
b) Course Administrators
c) Students
d) Employees working in a company

6.3 SPECIFIC REQUIREMENTS

Interface Requirements: The interface requirements include an easy to follow
interface, very few graphics, relevant error messages, proper linking of forms,
proper validation etc.

Hardware Requirements: The hardware requirements for the project:

a) Pentium-IV or higher Processor
b) 40 GB Hard disk
c) 512 MB RAM
d) Printer
e) Color Display Monitor

Software Requirements: The software requirements for the project:

a) Linux or Windows XP Service Pack 2 or above
b) NetBeans 7 or above
c) MySQL 5.1 or above
d) MS Office

Logical Database Requirements: The following information is to be stored in the
databases.

a) The user data
b) Login Data
c) Website Data
d) Crawled Page Data

6.4. APPENDICES

a) Software Engineering Paradigm Applied
b) Context Free Diagrams
c) E-R Diagrams
d) Data Flow Diagrams
e) Data Dictionary
f) Diagrams Of The Tables Relationship

CHAPTER 7

PROJECT DESCRIPTION

7.1 CONTEXT FREE DIAGRAM

0TH LEVEL DIAGRAM

The context free diagram shows the working of the whole system as a single process. The
DFD for the Web Crawler is given below.

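[Diagram: the Web Crawler is the single central process. Admin sends Request to the Web
Crawler and receives Response. User sends Distributed Request to the Web Crawler and
receives Response. The Web Crawler sends Request to Web Sites and receives Response.]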

Fig 4.1- 0 Level Data Flow Diagram

REPORTS GENERATED FROM ABOVE SOFTWARE

a. USERS DETAILS
b. LOGIN DETAILS
c. WEB DETAILS
d. PAGES INFO

7.2. ER DIAGRAM:

[ER diagram: entities Admin, Web Site, Pages and Computer; relationships Manages and
Selects; Admin to Web Site is marked 1 to *. Attributes shown: User_id, Password, Name,
Web Name, URL, Add Date, Page ID, Page Info, Location and address.]

7.3. DATA FLOW DIAGRAM

7.4. DATABASE DESIGN

A database is a collection of related tables, and it is the heart of any software because it
stores the most critical part, the data about the system. So proper planning needs to be done
to ensure the design of an effective database. An effective database design includes:

Normalized Tables
Data Dictionary
Constraints

7.4.1. NORMALIZED TABLES

Web Crawler Project will contain the following tables:

SNo.  Table Name  Description
1     admin       Stores the information about admin username and password
2     Sitesinfo   Will store the information of visited websites
3     Pagesinfo   Will store the information about crawled pages

NOTE:

All the tables are normalized up to 3NF
Tables are stored in movie onlinecourse database
All the tables are created in MySQL Server
MySQL command based utility is used to create tables.

CHAPTER 8

SNAPSHOTS

SNAPSHOT 1:

SNAPSHOT 2:

SNAPSHOT 3:

SNAPSHOT 4:

SNAPSHOT 5:

SNAPSHOT 6:

SNAPSHOT 7:

CHAPTER 9

CODING

CODING OF ADMIN_LOGIN.JAVA

package coding;
import javax.swing.JOptionPane;
import java.sql.*;

public class admin_login extends javax.swing.JFrame {


/** Creates new form Manage_Nodes */
public admin_login() {
initComponents();
}

/** This method is called from within the constructor to


* initialize the form.
* WARNING: Do NOT modify this code. The content of this method is
* always regenerated by the Form Editor.
*/
@SuppressWarnings("unchecked")
// <editor-fold defaultstate="collapsed" desc="Generated Code">
private void initComponents() {
jLabel1 = new javax.swing.JLabel();
jLabel2 = new javax.swing.JLabel();
jTextField1 = new javax.swing.JTextField();
jLabel3 = new javax.swing.JLabel();
jPasswordField1 = new javax.swing.JPasswordField();
jButton1 = new javax.swing.JButton();
jButton2 = new javax.swing.JButton();
jLabel4 = new javax.swing.JLabel();
jLabel5 = new javax.swing.JLabel();
jLabel6 = new javax.swing.JLabel();

setDefaultCloseOperation(javax.swing.WindowConstants.EXIT_ON_CLOSE);
addWindowListener(new java.awt.event.WindowAdapter() {
public void windowOpened(java.awt.event.WindowEvent evt) {
formWindowOpened(evt);
}
});

jLabel1.setFont(new java.awt.Font("Tahoma", 1, 18));


jLabel1.setText("Admin Login");
jLabel2.setFont(new java.awt.Font("Tahoma", 1, 12));
jLabel2.setText("Username");
jLabel3.setFont(new java.awt.Font("Tahoma", 1, 12));
jLabel3.setText("Password");
jButton1.setText("Login");
jButton1.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jButton1ActionPerformed(evt);
}
});

jButton2.setText("Exit");
jLabel4.setForeground(new java.awt.Color(255, 0, 0));
jLabel4.setText("*");
jLabel5.setForeground(new java.awt.Color(255, 0, 0));
jLabel5.setText("*");
jLabel6.setFont(new java.awt.Font("Tahoma", 1, 24)); // NOI18N
jLabel6.setForeground(new java.awt.Color(255, 0, 51));
jLabel6.setText("Distributed Web Crawler");
javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());
getContentPane().setLayout(layout);
layout.setHorizontalGroup(
layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(javax.swing.GroupLayout.Alignment.TRAILING,
layout.createSequentialGroup()
.addContainerGap(134, Short.MAX_VALUE)
.addComponent(jButton1, javax.swing.GroupLayout.PREFERRED_SIZE, 90,
javax.swing.GroupLayout.PREFERRED_SIZE)
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
.addComponent(jButton2, javax.swing.GroupLayout.PREFERRED_SIZE, 90,
javax.swing.GroupLayout.PREFERRED_SIZE)
.addGap(145, 145, 145))
.addGroup(javax.swing.GroupLayout.Alignment.TRAILING,
layout.createSequentialGroup()
.addContainerGap(185, Short.MAX_VALUE)
.addComponent(jLabel1)
.addGap(189, 189, 189))
.addGroup(layout.createSequentialGroup()
.addGap(79, 79, 79)

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(layout.createSequentialGroup()

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(jLabel2)
.addComponent(jLabel3))
.addGap(57, 57, 57)

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING,
false)
.addComponent(jPasswordField1)
.addComponent(jTextField1,
javax.swing.GroupLayout.PREFERRED_SIZE, 164,
javax.swing.GroupLayout.PREFERRED_SIZE))

.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)

.addComponent(jLabel5, javax.swing.GroupLayout.PREFERRED_SIZE,
18, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jLabel4)))
.addComponent(jLabel6))
.addContainerGap(82, Short.MAX_VALUE))
);
layout.setVerticalGroup(
layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(javax.swing.GroupLayout.Alignment.TRAILING,
layout.createSequentialGroup()
.addContainerGap(29, Short.MAX_VALUE)
.addComponent(jLabel6)
.addGap(18, 18, 18)
.addComponent(jLabel1)
.addGap(18, 18, 18)

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jLabel2)
.addComponent(jTextField1, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jLabel4))
.addGap(18, 18, 18)

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING)
.addComponent(jLabel3)

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jPasswordField1,
javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE,
javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jLabel5)))
.addGap(28, 28, 28)

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jButton1)
.addComponent(jButton2))
.addGap(25, 25, 25))
);

pack();
}// </editor-fold>

private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {
    // Check that both fields are filled in before querying the database.
    int flag = 0;
    String str = "";
    if (jTextField1.getText().equals("")) {
        flag = 1;
        jLabel3.setVisible(true);
        str = "Username";
    }
    if (jPasswordField1.getText().equals("")) {
        flag = 1;
        str = str + " Password";
        str = str.trim();
    }

    if (flag == 0) {
        try {
            DataBaseInfo data = new DataBaseInfo();
            // Parameterized query: the user input is bound, not concatenated into the SQL.
            PreparedStatement stmt = data.conn.prepareStatement(
                    "select * from admin where username=? and password=?",
                    ResultSet.TYPE_SCROLL_INSENSITIVE,
                    ResultSet.CONCUR_UPDATABLE);
            stmt.setString(1, jTextField1.getText());
            stmt.setString(2, jPasswordField1.getText());
            ResultSet rs = stmt.executeQuery();

            if (rs.next()) {
                // Login succeeded: keep the admin's settings and open the main window.
                DataBaseInfo.un = rs.getString(1);
                DataBaseInfo.pwd = rs.getString(2);
                DataBaseInfo.localadd = rs.getString(3);
                DataBaseInfo.usedistributed = rs.getString(4);
                Manage_Computers obj = new Manage_Computers();
                obj.setVisible(true);
                this.dispose();
            } else {
                JOptionPane.showMessageDialog(this, "Invalid username or password !!");
            }
        } catch (Exception ex) {
            JOptionPane.showMessageDialog(this, ex);
        }
    } else {
        str = str + " can't be empty";
        JOptionPane.showMessageDialog(this, str);
    }
}

private void formWindowOpened(java.awt.event.WindowEvent evt) {


// TODO add your handling code here:
this.setLocationRelativeTo(null);
}
/**
* @param args the command line arguments
*/
public static void main(String args[]) {
/* Set the Nimbus look and feel */
//<editor-fold defaultstate="collapsed" desc=" Look and feel setting code (optional) ">
/* If Nimbus (introduced in Java SE 6) is not available, stay with the default look and
feel.
* For details see
http://download.oracle.com/javase/tutorial/uiswing/lookandfeel/plaf.html
*/
try {
for (javax.swing.UIManager.LookAndFeelInfo info :
javax.swing.UIManager.getInstalledLookAndFeels()) {
if ("Nimbus".equals(info.getName())) {
javax.swing.UIManager.setLookAndFeel(info.getClassName());
break;
}
}
} catch (ClassNotFoundException ex) {

java.util.logging.Logger.getLogger(admin_login.class.getName()).log(java.util.logging.Level
.SEVERE, null, ex);
} catch (InstantiationException ex) {

java.util.logging.Logger.getLogger(admin_login.class.getName()).log(java.util.logging.Level
.SEVERE, null, ex);
} catch (IllegalAccessException ex) {

java.util.logging.Logger.getLogger(admin_login.class.getName()).log(java.util.logging.Level
.SEVERE, null, ex);
} catch (javax.swing.UnsupportedLookAndFeelException ex) {

java.util.logging.Logger.getLogger(admin_login.class.getName()).log(java.util.logging.Level
.SEVERE, null, ex);
}
//</editor-fold>

/* Create and display the form */


java.awt.EventQueue.invokeLater(new Runnable() {
public void run() {
new admin_login().setVisible(true);
}
});
}
// Variables declaration - do not modify
private javax.swing.JButton jButton1;
private javax.swing.JButton jButton2;
private javax.swing.JLabel jLabel1;
private javax.swing.JLabel jLabel2;
private javax.swing.JLabel jLabel3;
private javax.swing.JLabel jLabel4;
private javax.swing.JLabel jLabel5;
private javax.swing.JLabel jLabel6;
private javax.swing.JPasswordField jPasswordField1;
private javax.swing.JTextField jTextField1;
// End of variables declaration
}

CODING OF MANAGE_COMPUTERS.JAVA

/*
* To change this template, choose Tools | Templates
* and open the template in the editor.
*/

/*
* Manage_Computers.java
*
* Created on Feb 11, 2015, 11:49:21 AM
*/
package coding;

import javax.swing.JOptionPane;

import java.sql.*;

/**
*
* @author DSOFT
*/
public class Manage_Computers extends javax.swing.JFrame {

/** Creates new form Manage_Computers */
public Manage_Computers() {
initComponents();
}

/** This method is called from within the constructor to
* initialize the form.
* WARNING: Do NOT modify this code. The content of this method is
* always regenerated by the Form Editor.
*/
@SuppressWarnings("unchecked")
// <editor-fold defaultstate="collapsed" desc="Generated Code">
private void initComponents() {

jLabel1 = new javax.swing.JLabel();


jLabel6 = new javax.swing.JLabel();
jTabbedPane1 = new javax.swing.JTabbedPane();
jPanel1 = new javax.swing.JPanel();
jLabel3 = new javax.swing.JLabel();
jTextField1 = new javax.swing.JTextField();
jLabel4 = new javax.swing.JLabel();
jTextField2 = new javax.swing.JTextField();
jLabel9 = new javax.swing.JLabel();
jTextField3 = new javax.swing.JTextField();
jButton1 = new javax.swing.JButton();
jCheckBox1 = new javax.swing.JCheckBox();
jPanel2 = new javax.swing.JPanel();
jScrollPane1 = new javax.swing.JScrollPane();
jTable1 = new javax.swing.JTable();
jButton3 = new javax.swing.JButton();
jPanel6 = new javax.swing.JPanel();
jLabel10 = new javax.swing.JLabel();
jTextField4 = new javax.swing.JTextField();
jButton4 = new javax.swing.JButton();
jCheckBox2 = new javax.swing.JCheckBox();
jPanel7 = new javax.swing.JPanel();
jLabel11 = new javax.swing.JLabel();
jTextField5 = new javax.swing.JTextField();
jButton5 = new javax.swing.JButton();
jCheckBox3 = new javax.swing.JCheckBox();
jLabel12 = new javax.swing.JLabel();
jTextField6 = new javax.swing.JTextField();
jLabel13 = new javax.swing.JLabel();
jTextField7 = new javax.swing.JTextField();
jLabel14 = new javax.swing.JLabel();
jLabel15 = new javax.swing.JLabel();
jPasswordField1 = new javax.swing.JPasswordField();
jMenuBar1 = new javax.swing.JMenuBar();
jMenu1 = new javax.swing.JMenu();
jSeparator1 = new javax.swing.JPopupMenu.Separator();
jMenuItem1 = new javax.swing.JMenuItem();
jMenu3 = new javax.swing.JMenu();
jSeparator2 = new javax.swing.JPopupMenu.Separator();
jMenuItem7 = new javax.swing.JMenuItem();
jMenu5 = new javax.swing.JMenu();
jSeparator3 = new javax.swing.JPopupMenu.Separator();
jMenuItem14 = new javax.swing.JMenuItem();
jMenu6 = new javax.swing.JMenu();
jSeparator4 = new javax.swing.JPopupMenu.Separator();
jMenu2 = new javax.swing.JMenu();
setDefaultCloseOperation(javax.swing.WindowConstants.EXIT_ON_CLOSE);
addWindowListener(new java.awt.event.WindowAdapter() {
public void windowOpened(java.awt.event.WindowEvent evt) {
formWindowOpened(evt);
}
});

jLabel1.setFont(new java.awt.Font("Tahoma", 1, 24));


jLabel1.setForeground(new java.awt.Color(51, 0, 204));
jLabel1.setText("Computer Management");

jLabel6.setFont(new java.awt.Font("Tahoma", 1, 36));


jLabel6.setForeground(new java.awt.Color(255, 0, 51));
jLabel6.setText("Distributed Web Crawler");

jPanel1.setFont(new java.awt.Font("Tahoma", 1, 12));

jLabel3.setFont(new java.awt.Font("Tahoma", 1, 12));


jLabel3.setText("Computer IP");

jTextField1.setFont(new java.awt.Font("Tahoma", 1, 12));

jLabel4.setFont(new java.awt.Font("Tahoma", 1, 12));


jLabel4.setText("Shared Location");

jTextField2.setFont(new java.awt.Font("Tahoma", 1, 12));

jLabel9.setFont(new java.awt.Font("Tahoma", 1, 12));


jLabel9.setText("Maximum Allowed Sites");

jTextField3.setFont(new java.awt.Font("Tahoma", 1, 12));

jButton1.setText("Save Information");
jButton1.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jButton1ActionPerformed(evt);
}
});

jCheckBox1.setText("Confirm Save");

javax.swing.GroupLayout jPanel1Layout = new javax.swing.GroupLayout(jPanel1);


jPanel1.setLayout(jPanel1Layout);
jPanel1Layout.setHorizontalGroup(
jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createSequentialGroup()
.addGap(67, 67, 67)

.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(jLabel9)
.addComponent(jLabel4)
.addComponent(jLabel3)
.addComponent(jButton1, javax.swing.GroupLayout.PREFERRED_SIZE, 188,
javax.swing.GroupLayout.PREFERRED_SIZE))
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)

.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createSequentialGroup()
.addComponent(jCheckBox1)
.addContainerGap())

.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createSequentialGroup()

.addComponent(jTextField3,
javax.swing.GroupLayout.PREFERRED_SIZE, 201,
javax.swing.GroupLayout.PREFERRED_SIZE)
.addContainerGap())
.addGroup(jPanel1Layout.createSequentialGroup()

.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(jTextField1,
javax.swing.GroupLayout.DEFAULT_SIZE, 619, Short.MAX_VALUE)
.addComponent(jTextField2,
javax.swing.GroupLayout.Alignment.TRAILING,
javax.swing.GroupLayout.DEFAULT_SIZE, 619, Short.MAX_VALUE))
.addGap(110, 110, 110)))))
);
jPanel1Layout.setVerticalGroup(
jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createSequentialGroup()
.addGap(43, 43, 43)

.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING, false)
.addGroup(jPanel1Layout.createSequentialGroup()

.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jLabel3)
.addComponent(jTextField1,
javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE,
javax.swing.GroupLayout.PREFERRED_SIZE))
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED,
javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE)

.addComponent(jTextField2, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE,
javax.swing.GroupLayout.PREFERRED_SIZE))
.addGroup(jPanel1Layout.createSequentialGroup()
.addGap(45, 45, 45)
.addComponent(jLabel4)))
.addGap(18, 18, 18)

.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jTextField3, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jLabel9))
.addGap(29, 29, 29)

.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jCheckBox1)
.addComponent(jButton1))
.addContainerGap(193, Short.MAX_VALUE))
);

jTabbedPane1.addTab("Add New Computer", jPanel1);

jTable1.setFont(new java.awt.Font("Verdana", 1, 10));


jTable1.setModel(new javax.swing.table.DefaultTableModel(
new Object [][] {

},
new String [] {

}
));
jTable1.setAutoResizeMode(javax.swing.JTable.AUTO_RESIZE_ALL_COLUMNS);
jTable1.setRowHeight(25);
jScrollPane1.setViewportView(jTable1);

jButton3.setText("Show Computers");
jButton3.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jButton3ActionPerformed(evt);
}
});

javax.swing.GroupLayout jPanel2Layout = new javax.swing.GroupLayout(jPanel2);


jPanel2.setLayout(jPanel2Layout);
jPanel2Layout.setHorizontalGroup(
jPanel2Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel2Layout.createSequentialGroup()
.addGap(21, 21, 21)

.addGroup(jPanel2Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(jScrollPane1, javax.swing.GroupLayout.PREFERRED_SIZE,
948, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jButton3, javax.swing.GroupLayout.PREFERRED_SIZE, 151,
javax.swing.GroupLayout.PREFERRED_SIZE))
.addContainerGap(21, Short.MAX_VALUE))
);
jPanel2Layout.setVerticalGroup(
jPanel2Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel2Layout.createSequentialGroup()
.addGap(23, 23, 23)
.addComponent(jButton3)
.addGap(18, 18, 18)
.addComponent(jScrollPane1, javax.swing.GroupLayout.PREFERRED_SIZE, 288,
javax.swing.GroupLayout.PREFERRED_SIZE)
.addContainerGap(35, Short.MAX_VALUE))
);

jTabbedPane1.addTab("View Computers", jPanel2);

jPanel6.setFont(new java.awt.Font("Tahoma", 1, 12));

jLabel10.setFont(new java.awt.Font("Tahoma", 1, 12));


jLabel10.setText("Enter Computer IP");

jTextField4.setFont(new java.awt.Font("Tahoma", 1, 12));

jButton4.setText("Delete Computer");
jButton4.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jButton4ActionPerformed(evt);
}
});

jCheckBox2.setText("Confirm Record Delete");

javax.swing.GroupLayout jPanel6Layout = new javax.swing.GroupLayout(jPanel6);


jPanel6.setLayout(jPanel6Layout);
jPanel6Layout.setHorizontalGroup(
jPanel6Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel6Layout.createSequentialGroup()
.addGap(67, 67, 67)

.addGroup(jPanel6Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel6Layout.createSequentialGroup()
.addComponent(jCheckBox2)
.addContainerGap())
.addGroup(jPanel6Layout.createSequentialGroup()
.addComponent(jLabel10)
.addGap(57, 57, 57)
.addComponent(jTextField4, javax.swing.GroupLayout.PREFERRED_SIZE,
201, javax.swing.GroupLayout.PREFERRED_SIZE)
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED,
javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE)
.addComponent(jButton4)
.addGap(452, 452, 452))))
);
jPanel6Layout.setVerticalGroup(
jPanel6Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel6Layout.createSequentialGroup()
.addGap(43, 43, 43)

.addGroup(jPanel6Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jTextField4, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jButton4)
.addComponent(jLabel10))
.addGap(22, 22, 22)
.addComponent(jCheckBox2)
.addContainerGap(276, Short.MAX_VALUE))
);

jTabbedPane1.addTab("Delete Computer", jPanel6);

jPanel7.setFont(new java.awt.Font("Tahoma", 1, 12));

jLabel11.setFont(new java.awt.Font("Tahoma", 1, 12));


jLabel11.setText("Local System Address");

jTextField5.setFont(new java.awt.Font("Tahoma", 1, 12));

jButton5.setText("Save Details");
jButton5.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jButton5ActionPerformed(evt);
}
});

jCheckBox3.setText("Confirm Record Update");

jLabel12.setFont(new java.awt.Font("Tahoma", 1, 12));


jLabel12.setText("Use Distributed");

jTextField6.setFont(new java.awt.Font("Tahoma", 1, 12));

jLabel13.setFont(new java.awt.Font("Tahoma", 1, 12));


jLabel13.setText("User ID");

jTextField7.setFont(new java.awt.Font("Tahoma", 1, 12));

jLabel14.setFont(new java.awt.Font("Tahoma", 1, 12));


jLabel14.setText("(Press Y or N)");

jLabel15.setFont(new java.awt.Font("Tahoma", 1, 12));


jLabel15.setText("Password");

javax.swing.GroupLayout jPanel7Layout = new javax.swing.GroupLayout(jPanel7);


jPanel7.setLayout(jPanel7Layout);
jPanel7Layout.setHorizontalGroup(
jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel7Layout.createSequentialGroup()

.addGroup(jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING, false)
.addGroup(jPanel7Layout.createSequentialGroup()
.addGap(96, 96, 96)
.addGroup(jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(jCheckBox3)
.addComponent(jLabel12)
.addComponent(jLabel13)
.addComponent(jLabel15))
.addGap(43, 43, 43))
.addGroup(javax.swing.GroupLayout.Alignment.TRAILING,
jPanel7Layout.createSequentialGroup()
.addContainerGap(javax.swing.GroupLayout.DEFAULT_SIZE,
Short.MAX_VALUE)
.addComponent(jLabel11)
.addGap(63, 63, 63)))

.addGroup(jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel7Layout.createSequentialGroup()
.addComponent(jTextField6, javax.swing.GroupLayout.PREFERRED_SIZE,
201, javax.swing.GroupLayout.PREFERRED_SIZE)

.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
.addComponent(jLabel14))
.addComponent(jTextField5, javax.swing.GroupLayout.PREFERRED_SIZE,
201, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jButton5)

.addGroup(jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING, false)
.addComponent(jPasswordField1,
javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(jTextField7,
javax.swing.GroupLayout.Alignment.LEADING,
javax.swing.GroupLayout.DEFAULT_SIZE, 201, Short.MAX_VALUE)))
.addGap(440, 440, 440))
);
jPanel7Layout.setVerticalGroup(
jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel7Layout.createSequentialGroup()
.addGap(43, 43, 43)

.addGroup(jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jTextField5, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jLabel11))
.addGap(18, 18, 18)

.addGroup(jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jTextField6, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jLabel12)
.addComponent(jLabel14))
.addGap(18, 18, 18)

.addGroup(jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jLabel13)
.addComponent(jTextField7, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE,
javax.swing.GroupLayout.PREFERRED_SIZE))
.addGap(18, 18, 18)

.addGroup(jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jLabel15)

.addComponent(jPasswordField1,
javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE,
javax.swing.GroupLayout.PREFERRED_SIZE))
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED, 120,
Short.MAX_VALUE)

.addGroup(jPanel7Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jCheckBox3)
.addComponent(jButton5))
.addGap(64, 64, 64))
);

jTabbedPane1.addTab("Change Initial", jPanel7);

jMenu1.setText("Admin Panel");
jMenu1.add(jSeparator1);

jMenuItem1.setText("Show Panel");
jMenuItem1.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenuItem1ActionPerformed(evt);
}
});
jMenu1.add(jMenuItem1);

jMenuBar1.add(jMenu1);

jMenu3.setText("Web Crawler");
jMenu3.add(jSeparator2);

jMenuItem7.setText("Load Crawler");
jMenuItem7.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenuItem7ActionPerformed(evt);
}
});
jMenu3.add(jMenuItem7);

jMenuBar1.add(jMenu3);

jMenu5.setText("Search Websites");
jMenu5.add(jSeparator3);

jMenuItem14.setText("View Search Box");


jMenuItem14.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenuItem14ActionPerformed(evt);
}
});
jMenu5.add(jMenuItem14);

jMenuBar1.add(jMenu5);

jMenu6.setText("Logout");
jMenu6.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenu6ActionPerformed(evt);
}
});
jMenu6.add(jSeparator4);

jMenuBar1.add(jMenu6);

jMenu2.setText("Exit");
jMenu2.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenu2ActionPerformed(evt);
}
});
jMenuBar1.add(jMenu2);

setJMenuBar(jMenuBar1);

javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());


getContentPane().setLayout(layout);
layout.setHorizontalGroup(
layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(layout.createSequentialGroup()
.addGap(70, 70, 70)
.addComponent(jLabel6)
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED, 244,
Short.MAX_VALUE)
.addComponent(jLabel1)
.addGap(20, 20, 20))
.addGroup(layout.createSequentialGroup()
.addGap(29, 29, 29)
.addComponent(jTabbedPane1, javax.swing.GroupLayout.PREFERRED_SIZE,
995, javax.swing.GroupLayout.PREFERRED_SIZE)
.addContainerGap(39, Short.MAX_VALUE))
);
layout.setVerticalGroup(
layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(layout.createSequentialGroup()
.addContainerGap()

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jLabel6)
.addComponent(jLabel1))
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
.addComponent(jTabbedPane1, javax.swing.GroupLayout.DEFAULT_SIZE, 415,
Short.MAX_VALUE)
.addContainerGap())
);

pack();
}// </editor-fold>

private void formWindowOpened(java.awt.event.WindowEvent evt) {


// TODO add your handling code here:
this.setLocationRelativeTo(null);
jTextField5.setText(DataBaseInfo.localadd);
jTextField6.setText(DataBaseInfo.usedistributed);
jTextField7.setText(DataBaseInfo.un);
jPasswordField1.setText(DataBaseInfo.pwd);
}

private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:

try {
if (jCheckBox1.isSelected()) {
DataBaseInfo db = new DataBaseInfo();
// The fourth column is inserted as 0 for a newly added node.
PreparedStatement stmt = db.conn.prepareStatement("insert into nodesinfo values(?,?,?,0)");
stmt.setString(1, jTextField1.getText());
stmt.setString(2, jTextField2.getText());
stmt.setString(3, jTextField3.getText());

stmt.executeUpdate();
JOptionPane.showMessageDialog(this, "Node successfully added to network");
} else {
JOptionPane.showMessageDialog(this, "Please confirm node entry");
}
} catch (Exception e) {
JOptionPane.showMessageDialog(this, e.getMessage());
}
}

private void jButton3ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:

try {

DataBaseInfo db=new DataBaseInfo();

PreparedStatement stmt = db.conn.prepareStatement("select * from nodesinfo",
ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);

ResultSet rs = stmt.executeQuery();
ResultSetMetaData rdata = rs.getMetaData();
String[] str = {"Node IP Address", "Shared Path", "Maximum Limit", "Used Limit"};

int n =DataBaseInfo.returnColumn(rs);
int col = rdata.getColumnCount();

Object[][] data = new Object[n][col + 1];

rs.beforeFirst();
int an = 0;
while (rs.next()) {
for (int j = 1; j <= col; j++) {
data[an][j - 1] = rs.getString(j);
}
an++;
}

jTable1.setModel(new javax.swing.table.DefaultTableModel(data, str));

} catch (Exception ex) {


JOptionPane.showMessageDialog(this, ex);
}
}

private void jButton4ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:

try {

DataBaseInfo db=new DataBaseInfo();

PreparedStatement stmt = db.conn.prepareStatement("delete from nodesinfo where NodeIP=?",
ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
stmt.setString(1, jTextField4.getText());
int n=stmt.executeUpdate();
if(n==1)
{
JOptionPane.showMessageDialog(this,"Node successfully deleted !!");
jTextField4.setText("");
}
else
{
JOptionPane.showMessageDialog(this,"Node not found !!");
}
} catch (Exception ex) {
JOptionPane.showMessageDialog(this, ex);
}
}

private void jButton5ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:

try {
if(jCheckBox3.isSelected())
{
DataBaseInfo db=new DataBaseInfo();

PreparedStatement stmt = db.conn.prepareStatement(
"update admin set username=?, password=?, localaddress=?, usedistributed=?",
ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
stmt.setString(1, jTextField7.getText());
// getPassword() is used because JPasswordField.getText() is deprecated
stmt.setString(2, new String(jPasswordField1.getPassword()));
stmt.setString(3, jTextField5.getText());
stmt.setString(4, jTextField6.getText());

int n=stmt.executeUpdate();
if(n==1)
{
JOptionPane.showMessageDialog(this,"Admin information successfully updated !!");
DataBaseInfo.un=jTextField7.getText();
DataBaseInfo.pwd=new String(jPasswordField1.getPassword());
DataBaseInfo.localadd=jTextField5.getText();
DataBaseInfo.usedistributed=jTextField6.getText();
}
else
{
JOptionPane.showMessageDialog(this,"Admin information could not be updated !!");
}
}
else
{
JOptionPane.showMessageDialog(this,"Please confirm record update !!");
}
} catch (Exception ex) {
JOptionPane.showMessageDialog(this, ex);
}
}

private void jMenuItem1ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:

Manage_Computers mcom=new Manage_Computers();


mcom.setVisible(true);
this.dispose();
}

private void jMenuItem7ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:

mywebcrawler mcrawl =new mywebcrawler();


mcrawl.setVisible(true);
this.dispose();
}

private void jMenu6ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:
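admin_login al=new admin_login();
al.setVisible(true);
// Close this window on logout, as the other menu handlers do (Component.disable() is deprecated)
this.dispose();
}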

private void jMenu2ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:
this.dispose();
}

private void jMenuItem14ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:

displaylinks obj=new displaylinks();


obj.setVisible(true);
this.dispose();
}

/**
* @param args the command line arguments
*/
public static void main(String args[]) {
/* Set the Nimbus look and feel */
//<editor-fold defaultstate="collapsed" desc=" Look and feel setting code (optional) ">
/* If Nimbus (introduced in Java SE 6) is not available, stay with the default look and feel.
* For details see http://download.oracle.com/javase/tutorial/uiswing/lookandfeel/plaf.html
*/
try {
for (javax.swing.UIManager.LookAndFeelInfo info :
javax.swing.UIManager.getInstalledLookAndFeels()) {
if ("Nimbus".equals(info.getName())) {
javax.swing.UIManager.setLookAndFeel(info.getClassName());
break;
}
}
} catch (ClassNotFoundException ex) {
java.util.logging.Logger.getLogger(Manage_Computers.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
} catch (InstantiationException ex) {
java.util.logging.Logger.getLogger(Manage_Computers.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
} catch (IllegalAccessException ex) {
java.util.logging.Logger.getLogger(Manage_Computers.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
} catch (javax.swing.UnsupportedLookAndFeelException ex) {
java.util.logging.Logger.getLogger(Manage_Computers.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
}
//</editor-fold>

/* Create and display the form */


java.awt.EventQueue.invokeLater(new Runnable() {

public void run() {


new Manage_Computers().setVisible(true);
}
});
}
// Variables declaration - do not modify
private javax.swing.JButton jButton1;
private javax.swing.JButton jButton3;
private javax.swing.JButton jButton4;
private javax.swing.JButton jButton5;
private javax.swing.JCheckBox jCheckBox1;
private javax.swing.JCheckBox jCheckBox2;
private javax.swing.JCheckBox jCheckBox3;
private javax.swing.JLabel jLabel1;
private javax.swing.JLabel jLabel10;
private javax.swing.JLabel jLabel11;
private javax.swing.JLabel jLabel12;
private javax.swing.JLabel jLabel13;
private javax.swing.JLabel jLabel14;
private javax.swing.JLabel jLabel15;
private javax.swing.JLabel jLabel3;
private javax.swing.JLabel jLabel4;
private javax.swing.JLabel jLabel6;
private javax.swing.JLabel jLabel9;
private javax.swing.JMenu jMenu1;
private javax.swing.JMenu jMenu2;
private javax.swing.JMenu jMenu3;
private javax.swing.JMenu jMenu5;
private javax.swing.JMenu jMenu6;
private javax.swing.JMenuBar jMenuBar1;
private javax.swing.JMenuItem jMenuItem1;
private javax.swing.JMenuItem jMenuItem14;
private javax.swing.JMenuItem jMenuItem7;
private javax.swing.JPanel jPanel1;
private javax.swing.JPanel jPanel2;
private javax.swing.JPanel jPanel6;
private javax.swing.JPanel jPanel7;
private javax.swing.JPasswordField jPasswordField1;
private javax.swing.JScrollPane jScrollPane1;
private javax.swing.JPopupMenu.Separator jSeparator1;
private javax.swing.JPopupMenu.Separator jSeparator2;
private javax.swing.JPopupMenu.Separator jSeparator3;
private javax.swing.JPopupMenu.Separator jSeparator4;
private javax.swing.JTabbedPane jTabbedPane1;
private javax.swing.JTable jTable1;
private javax.swing.JTextField jTextField1;
private javax.swing.JTextField jTextField2;
private javax.swing.JTextField jTextField3;
private javax.swing.JTextField jTextField4;
private javax.swing.JTextField jTextField5;
private javax.swing.JTextField jTextField6;
private javax.swing.JTextField jTextField7;
// End of variables declaration
}
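
The "Show Computers" handler (jButton3ActionPerformed) in the listing above requests a scrollable result set and calls DataBaseInfo.returnColumn(rs) so that it knows the row count before allocating the table array. A minimal sketch of an alternative, assuming the rows are first collected in a list during a single forward pass, is given below. The class NodeTableData and the method toTableData are hypothetical names introduced only for illustration and are not part of the project, and the sample values in main are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of the original project. */
public class NodeTableData {

    /**
     * Converts rows collected during one forward pass over a result set
     * into the Object[][] form accepted by DefaultTableModel.
     * Rows shorter than 'columns' are padded with null; longer rows are truncated.
     */
    public static Object[][] toTableData(List<String[]> rows, int columns) {
        Object[][] data = new Object[rows.size()][columns];
        for (int i = 0; i < rows.size(); i++) {
            String[] row = rows.get(i);
            int n = Math.min(columns, row.length);
            // Copy at most 'columns' values; remaining cells stay null.
            System.arraycopy(row, 0, data[i], 0, n);
        }
        return data;
    }

    public static void main(String[] args) {
        // Sample values are invented for illustration only.
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"192.168.1.10", "shared", "100", "0"});
        rows.add(new String[] {"192.168.1.11", "crawl", "50", "3"});
        Object[][] data = toTableData(rows, 4);
        System.out.println(Arrays.deepToString(data));
    }
}
```

In the handler, each rs.next() iteration would append one String[] to the list, and the returned array would be passed to new javax.swing.table.DefaultTableModel(data, str) as before. With this approach a default forward-only result set should be sufficient, since the row count is no longer needed in advance.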

CODING OF CRAWLER.JAVA

package coding;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.ResultSetMetaData;
import javax.swing.JOptionPane;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

/**
*
* @author DSOFT
*/
public class mywebcrawler extends javax.swing.JFrame {

public static DataBaseInfo db = new DataBaseInfo();


String path = "";
String IP;

/** Creates new form mywebcrawler */


public mywebcrawler() {
initComponents();
}

/** This method is called from within the constructor to
* initialize the form.
* WARNING: Do NOT modify this code. The content of this method is
* always regenerated by the Form Editor.
*/
@SuppressWarnings("unchecked")
// <editor-fold defaultstate="collapsed" desc="Generated Code">
private void initComponents() {

jTextField1 = new javax.swing.JTextField();


jButton1 = new javax.swing.JButton();
jButton2 = new javax.swing.JButton();
jProgressBar1 = new javax.swing.JProgressBar();
jLabel6 = new javax.swing.JLabel();
jPanel1 = new javax.swing.JPanel();
jLabel1 = new javax.swing.JLabel();
jComboBox1 = new javax.swing.JComboBox();
jCheckBox1 = new javax.swing.JCheckBox();
jLabel2 = new javax.swing.JLabel();
jMenuBar1 = new javax.swing.JMenuBar();
jMenu1 = new javax.swing.JMenu();
jSeparator1 = new javax.swing.JPopupMenu.Separator();
jMenuItem1 = new javax.swing.JMenuItem();
jMenu3 = new javax.swing.JMenu();
jSeparator2 = new javax.swing.JPopupMenu.Separator();
jMenuItem7 = new javax.swing.JMenuItem();
jMenu5 = new javax.swing.JMenu();
jSeparator3 = new javax.swing.JPopupMenu.Separator();
jMenuItem14 = new javax.swing.JMenuItem();
jMenu6 = new javax.swing.JMenu();
jSeparator4 = new javax.swing.JPopupMenu.Separator();
jMenu2 = new javax.swing.JMenu();

setDefaultCloseOperation(javax.swing.WindowConstants.DISPOSE_ON_CLOSE);
setTitle("Web Crawler");
addWindowListener(new java.awt.event.WindowAdapter() {
public void windowOpened(java.awt.event.WindowEvent evt) {
formWindowOpened(evt);
}
});

jTextField1.setFont(new java.awt.Font("Tahoma", 1, 18));


jTextField1.setHorizontalAlignment(javax.swing.JTextField.CENTER);
jTextField1.setText("http://");

jButton1.setFont(new java.awt.Font("Tahoma", 1, 14));


jButton1.setText("Search & Save");
jButton1.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jButton1ActionPerformed(evt);
}
});

jButton2.setFont(new java.awt.Font("Tahoma", 1, 14));


jButton2.setText("Show History");
jButton2.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jButton2ActionPerformed(evt);
}
});
jProgressBar1.setForeground(new java.awt.Color(153, 0, 0));

jLabel6.setFont(new java.awt.Font("Tahoma", 1, 36));


jLabel6.setForeground(new java.awt.Color(255, 0, 51));
jLabel6.setText("Distributed Web Crawler");

jLabel1.setFont(new java.awt.Font("Tahoma", 1, 14));


jLabel1.setText("Select IP Address");

jCheckBox1.setText("Use Load Balancing");

javax.swing.GroupLayout jPanel1Layout = new javax.swing.GroupLayout(jPanel1);


jPanel1.setLayout(jPanel1Layout);
jPanel1Layout.setHorizontalGroup(
jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createSequentialGroup()
.addGap(51, 51, 51)

.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(jCheckBox1)
.addGroup(jPanel1Layout.createSequentialGroup()
.addComponent(jLabel1)
.addGap(72, 72, 72)
.addComponent(jComboBox1,
javax.swing.GroupLayout.PREFERRED_SIZE, 264,
javax.swing.GroupLayout.PREFERRED_SIZE)))
.addContainerGap(22, Short.MAX_VALUE))
);
jPanel1Layout.setVerticalGroup(
jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createSequentialGroup()
.addContainerGap()
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jLabel1)
.addComponent(jComboBox1, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE,
javax.swing.GroupLayout.PREFERRED_SIZE))
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED, 9,
Short.MAX_VALUE)
.addComponent(jCheckBox1))
);

jMenu1.setText("Admin Panel");
jMenu1.add(jSeparator1);

jMenuItem1.setText("Show Panel");
jMenuItem1.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenuItem1ActionPerformed(evt);
}
});
jMenu1.add(jMenuItem1);

jMenuBar1.add(jMenu1);

jMenu3.setText("Web Crawler");
jMenu3.add(jSeparator2);

jMenuItem7.setText("Load Crawler");
jMenuItem7.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenuItem7ActionPerformed(evt);
}
});
jMenu3.add(jMenuItem7);

jMenuBar1.add(jMenu3);

jMenu5.setText("Search Websites");
jMenu5.add(jSeparator3);

jMenuItem14.setText("View Search Box");


jMenuItem14.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenuItem14ActionPerformed(evt);
}
});
jMenu5.add(jMenuItem14);

jMenuBar1.add(jMenu5);

jMenu6.setText("Logout");
jMenu6.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenu6ActionPerformed(evt);
}
});
jMenu6.add(jSeparator4);

jMenuBar1.add(jMenu6);

jMenu2.setText("Exit");
jMenu2.addActionListener(new java.awt.event.ActionListener() {
public void actionPerformed(java.awt.event.ActionEvent evt) {
jMenu2ActionPerformed(evt);
}
});
jMenuBar1.add(jMenu2);
setJMenuBar(jMenuBar1);

javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());


getContentPane().setLayout(layout);
layout.setHorizontalGroup(
layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(layout.createSequentialGroup()

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(layout.createSequentialGroup()
.addGap(230, 230, 230)
.addComponent(jLabel6))
.addGroup(layout.createSequentialGroup()
.addGap(196, 196, 196)

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING)
.addComponent(jPanel1, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jTextField1,
javax.swing.GroupLayout.PREFERRED_SIZE, 530,
javax.swing.GroupLayout.PREFERRED_SIZE)))
.addGroup(layout.createSequentialGroup()
.addGap(269, 269, 269)
.addComponent(jButton1, javax.swing.GroupLayout.PREFERRED_SIZE,
187, javax.swing.GroupLayout.PREFERRED_SIZE)
.addGap(18, 18, 18)
.addComponent(jButton2, javax.swing.GroupLayout.PREFERRED_SIZE,
187, javax.swing.GroupLayout.PREFERRED_SIZE))
.addGroup(layout.createSequentialGroup()
.addGap(130, 130, 130)
.addComponent(jProgressBar1,
javax.swing.GroupLayout.PREFERRED_SIZE, 676,
javax.swing.GroupLayout.PREFERRED_SIZE))
.addGroup(layout.createSequentialGroup()
.addGap(152, 152, 152)
.addComponent(jLabel2, javax.swing.GroupLayout.PREFERRED_SIZE, 603,
javax.swing.GroupLayout.PREFERRED_SIZE)))
.addContainerGap(164, Short.MAX_VALUE))
);
layout.setVerticalGroup(
layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(layout.createSequentialGroup()
.addGap(46, 46, 46)
.addComponent(jLabel6)
.addGap(18, 18, 18)
.addComponent(jTextField1, javax.swing.GroupLayout.PREFERRED_SIZE, 42,
javax.swing.GroupLayout.PREFERRED_SIZE)
.addGap(27, 27, 27)
.addComponent(jPanel1, javax.swing.GroupLayout.PREFERRED_SIZE,
javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addGap(39, 39, 39)

.addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(jButton1, javax.swing.GroupLayout.PREFERRED_SIZE, 42,
javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(jButton2, javax.swing.GroupLayout.PREFERRED_SIZE, 42,
javax.swing.GroupLayout.PREFERRED_SIZE))
.addGap(22, 22, 22)
.addComponent(jLabel2)
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
.addComponent(jProgressBar1, javax.swing.GroupLayout.PREFERRED_SIZE, 23,
javax.swing.GroupLayout.PREFERRED_SIZE)
.addGap(101, 101, 101))
);

pack();
}// </editor-fold>
int count = 0;

private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {

    // Distributed mode: build a UNC path to the selected node's shared folder.
    if (DataBaseInfo.usedistributed.toUpperCase().equals("Y")) {
        IP = jComboBox1.getSelectedItem().toString();
        path = "\\\\" + IP + "\\" + loc[jComboBox1.getSelectedIndex()];
    } else {
        path = DataBaseInfo.localadd;
    }

    // Automatic selection: use the node with the largest (no_of_sites - availablesites).
    if (jCheckBox1.isSelected()) {
        try {
            DataBaseInfo db = new DataBaseInfo();
            PreparedStatement stmt = db.conn.prepareStatement(
                    "select NodeIP, sharedfolderlocation from nodesinfo"
                    + " where (no_of_sites-availablesites) ="
                    + " (select distinct max(no_of_sites-availablesites) from nodesinfo)",
                    ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
            ResultSet rs = stmt.executeQuery();

            if (rs.next()) {
                // UNC path to the chosen node's shared folder.
                path = "\\\\" + rs.getString(1) + "\\" + rs.getString(2);
                IP = rs.getString(1);
            }
        } catch (Exception ex) {
            JOptionPane.showMessageDialog(this, ex);
        }
    }

    abc obj = new abc(); // abc is the crawler thread class
    obj.start();
}
String[] loc;

private void formWindowOpened(java.awt.event.WindowEvent evt) {

    // The node selection panel is only shown in distributed mode.
    if (DataBaseInfo.usedistributed.toUpperCase().equals("N")) {
        jPanel1.setVisible(false);
    } else {
        try {
            DataBaseInfo db = new DataBaseInfo();
            PreparedStatement stmt = db.conn.prepareStatement(
                    "select NodeIP,sharedfolderlocation from nodesinfo",
                    ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
            ResultSet rs = stmt.executeQuery();
            // returnColumn is a project helper; its result sizes the array of shared-folder locations.
            int n = DataBaseInfo.returnColumn(rs);
            loc = new String[n];

            // Fill the combo box with node IPs and remember each node's shared folder.
            rs.beforeFirst();
            int i = 0;
            while (rs.next()) {
                jComboBox1.addItem(rs.getString(1));
                loc[i] = rs.getString(2);
                i++;
            }
        } catch (Exception ex) {
            JOptionPane.showMessageDialog(this, ex);
        }
    }
    this.setLocationRelativeTo(null);
}

private void jButton2ActionPerformed(java.awt.event.ActionEvent evt) {
    // Open the search window and close this form.
    displaylinks obj = new displaylinks();
    obj.setVisible(true);
    this.dispose();
}

private void jMenuItem1ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:

Manage_Computers mcom = new Manage_Computers();


mcom.setVisible(true);
this.dispose();
}

private void jMenuItem7ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:

mywebcrawler mcrawl = new mywebcrawler();


mcrawl.setVisible(true);
this.dispose();
}

private void jMenu6ActionPerformed(java.awt.event.ActionEvent evt) {
    // Logout: show the admin login form and disable this one.
    admin_login al = new admin_login();
    al.setVisible(true);
    this.setEnabled(false); // replaces the deprecated Component.disable()
}

private void jMenu2ActionPerformed(java.awt.event.ActionEvent evt) {


// TODO add your handling code here:
this.dispose();
}

private void jMenuItem14ActionPerformed(java.awt.event.ActionEvent evt) {
    // Open the search window and close this form.
    displaylinks obj = new displaylinks();
    obj.setVisible(true);
    this.dispose();
}

public void processPage(String URL) throws SQLException, Exception {

    // The duplicate check against crawledpages is disabled in this version,
    // so every call records the URL again.

    // Target folder: the selected node's shared folder in distributed mode,
    // otherwise the local download directory.
    boolean distributed = DataBaseInfo.usedistributed.toUpperCase().equals("Y");
    if (distributed) {
        IP = jComboBox1.getSelectedItem().toString();
        path = "\\\\" + IP + "\\" + loc[jComboBox1.getSelectedIndex()];
    } else {
        path = DataBaseInfo.localadd;
    }

    DataBaseInfo db = new DataBaseInfo();

    // Automatic selection: use the node with the largest (no_of_sites - availablesites).
    if (jCheckBox1.isSelected()) {
        try {
            PreparedStatement nodeStmt = db.conn.prepareStatement(
                    "select NodeIP, sharedfolderlocation from nodesinfo"
                    + " where (no_of_sites-availablesites) ="
                    + " (select distinct max(no_of_sites-availablesites) from nodesinfo)",
                    ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
            ResultSet rs1 = nodeStmt.executeQuery();
            if (rs1.next()) {
                path = "\\\\" + rs1.getString(1) + "\\" + rs1.getString(2);
                IP = rs1.getString(1);
            }
        } catch (Exception ex) {
            JOptionPane.showMessageDialog(this, ex);
        }
    }

    // Site folder: everything after "//" in the address typed by the user.
    String webpath = jTextField1.getText().substring(jTextField1.getText().indexOf("//") + 2);
    path = path + "\\" + webpath;
    String source = distributed ? IP : "Local";

    // Store the start URL in the database.
    String sql = "INSERT INTO Crawledpages values(?,?,?,?)";
    PreparedStatement stmt = db.conn.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS);
    stmt.setString(1, URL);
    stmt.setString(2, webpath);
    stmt.setString(3, source);
    stmt.setString(4, path);
    stmt.execute();

    // Fetch the page and collect every anchor that has an href attribute.
    Document doc = Jsoup.connect(URL).timeout(0).get();
    Elements questions = doc.select("a[href]");

    File dir = new File(path);
    if (!dir.exists()) {
        dir.mkdirs();
    }

    for (Element link : questions) {
        String href = link.attr("abs:href");

        // Record the discovered link.
        stmt = db.conn.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS);
        stmt.setString(1, href);
        stmt.setString(2, webpath);
        stmt.setString(3, source);
        stmt.setString(4, path);
        stmt.execute();
        jProgressBar1.setValue(count);
        jLabel2.setText(count + " files downloaded and saved !!");

        // Download the linked page and save it in the site folder,
        // named after the last segment of the link.
        String target = path + "\\" + href.substring(href.lastIndexOf('/') + 1);
        try (BufferedReader in = new BufferedReader(new InputStreamReader(new URL(href).openStream()));
                BufferedWriter writer = new BufferedWriter(new FileWriter(target))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                writer.write(inputLine);
                writer.newLine(); // readLine() strips the line terminator
            }
        } catch (IOException e) {
            e.printStackTrace();
            JOptionPane.showMessageDialog(this, e);
            return;
        }

        count++;
    }
    JOptionPane.showMessageDialog(null, count + " record found and saved to database !!");
}

class abc extends Thread {

@Override
public void run() {

try {
// TODO add your handling code here:

// db.runSql2("TRUNCATE Record;");
processPage(jTextField1.getText());
} catch (Exception ex) {

JOptionPane.showMessageDialog(null, ex.getMessage().toString());
} finally {
JOptionPane.showMessageDialog(null, count + " record found and saved to database !!\n\nFiles saved to " + path);
jProgressBar1.setVisible(false);

}
}
}

/**
* @param args the command line arguments
*/
public static void main(String args[]) {
/* Set the Nimbus look and feel */
//<editor-fold defaultstate="collapsed" desc=" Look and feel setting code (optional) ">
/* If Nimbus (introduced in Java SE 6) is not available, stay with the default look and
feel.
* For details see
http://download.oracle.com/javase/tutorial/uiswing/lookandfeel/plaf.html
*/
try {
for (javax.swing.UIManager.LookAndFeelInfo info :
javax.swing.UIManager.getInstalledLookAndFeels()) {
if ("System".equals(info.getName())) {
javax.swing.UIManager.setLookAndFeel(info.getClassName());
break;
}
}
} catch (ClassNotFoundException ex) {
java.util.logging.Logger.getLogger(mywebcrawler.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
} catch (InstantiationException ex) {
java.util.logging.Logger.getLogger(mywebcrawler.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
} catch (IllegalAccessException ex) {
java.util.logging.Logger.getLogger(mywebcrawler.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
} catch (javax.swing.UnsupportedLookAndFeelException ex) {
java.util.logging.Logger.getLogger(mywebcrawler.class.getName()).log(java.util.logging.Level.SEVERE, null, ex);
}
//</editor-fold>

/* Create and display the form */


java.awt.EventQueue.invokeLater(new Runnable() {

public void run() {


new mywebcrawler().setVisible(true);
}
});
}
// Variables declaration - do not modify
private javax.swing.JButton jButton1;
private javax.swing.JButton jButton2;
private javax.swing.JCheckBox jCheckBox1;
private javax.swing.JComboBox jComboBox1;
private javax.swing.JLabel jLabel1;
private javax.swing.JLabel jLabel2;
private javax.swing.JLabel jLabel6;
private javax.swing.JMenu jMenu1;
private javax.swing.JMenu jMenu2;
private javax.swing.JMenu jMenu3;
private javax.swing.JMenu jMenu5;
private javax.swing.JMenu jMenu6;
private javax.swing.JMenuBar jMenuBar1;
private javax.swing.JMenuItem jMenuItem1;
private javax.swing.JMenuItem jMenuItem14;
private javax.swing.JMenuItem jMenuItem7;
private javax.swing.JPanel jPanel1;
private javax.swing.JProgressBar jProgressBar1;
private javax.swing.JPopupMenu.Separator jSeparator1;
private javax.swing.JPopupMenu.Separator jSeparator2;
private javax.swing.JPopupMenu.Separator jSeparator3;
private javax.swing.JPopupMenu.Separator jSeparator4;
private javax.swing.JTextField jTextField1;
// End of variables declaration
}

CHAPTER 10
TESTING

10.1 TESTING

Testing is a process that reveals errors in a program. It is the major quality measure
employed during software development. During testing, the program is executed with a set of
conditions known as test cases, and the output is evaluated to determine whether the program
is performing as expected.

To make sure that the system is free of errors, different levels of testing are applied at
different phases of software development.

10.2 LEVELS OF TESTING

The two levels of testing are:

Unit Testing
System Testing

10.2.1. UNIT TESTING:

Unit Testing is done on individual modules as they are completed and become
executable. It is confined only to the designer's requirements.
Each module can be tested using the following two strategies:

Black Box Testing (BBT)

In this strategy, test cases are generated as input conditions that fully exercise all
functional requirements of the program. This testing has been used to find errors in
the following categories:

a) Incorrect or missing functions
b) Interface errors
c) Errors in data structure or external database access
d) Performance errors
e) Initialization and termination errors

In this testing only the output is checked for correctness. The logical flow of
the data is not checked.
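
As a minimal sketch of a black-box check, assume the folder-name rule used by the crawler (everything after "//" in the entered address) is pulled out into a small helper. The names `BlackBoxSketch`, `siteFolder` and `check` are hypothetical and not part of the project code, and the handling of an address without "//" is an assumption. Only inputs and expected outputs are compared; the internal logic is not inspected.

```java
// Hypothetical helper plus black-box checks; all names are illustrative.
class BlackBoxSketch {

    // Returns the part of the address after "//", or the whole address
    // when no scheme separator is present (assumed behaviour).
    static String siteFolder(String address) {
        int idx = address.indexOf("//");
        return idx >= 0 ? address.substring(idx + 2) : address;
    }

    // Compares the actual output with the expected output only.
    static void check(String input, String expected) {
        String actual = siteFolder(input);
        if (!actual.equals(expected)) {
            throw new AssertionError(input + " -> " + actual + ", expected " + expected);
        }
    }

    public static void main(String[] args) {
        check("http://example.com", "example.com");
        check("https://example.com/docs", "example.com/docs");
        check("example.com", "example.com");
        System.out.println("all black-box checks passed");
    }
}
```

Each call to `check` corresponds to one test case: an input condition and the output expected for it.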

White Box Testing (WBT)

In this strategy, test cases are derived from the logic of each module by drawing flow
graphs of that module, and all logical decisions are tested in every case.
It has been used to generate test cases that:

a) Guarantee that all independent paths have been executed.
b) Execute all logical decisions on their true and false sides.
c) Execute all loops at their boundaries and within their operational bounds.
d) Exercise internal data structures to ensure their validity.
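
The sketch below illustrates point (b) on a decision similar to the one in the crawler form, where the save path depends on whether distributed mode is switched on. The class `WhiteBoxSketch`, the method `resolvePath` and the sample IP address and folder names are assumptions made for illustration; the tests are chosen by reading the code so that both sides of the `if` are executed.

```java
// Hypothetical extraction of the path decision; names and values are illustrative.
class WhiteBoxSketch {

    // Distributed mode yields a UNC path to the node's shared folder,
    // any other setting yields the local directory.
    static String resolvePath(String useDistributed, String nodeIp, String share, String localDir) {
        if (useDistributed.toUpperCase().equals("Y")) {
            return "\\\\" + nodeIp + "\\" + share;
        }
        return localDir;
    }

    static void expect(String actual, String expected) {
        if (!actual.equals(expected)) {
            throw new AssertionError("got " + actual + ", expected " + expected);
        }
    }

    public static void main(String[] args) {
        // True side of the decision, including the lower-case input
        // that toUpperCase() is there to handle.
        expect(resolvePath("Y", "10.0.0.5", "crawl", "C:\\crawl"), "\\\\10.0.0.5\\crawl");
        expect(resolvePath("y", "10.0.0.5", "crawl", "C:\\crawl"), "\\\\10.0.0.5\\crawl");
        // False side of the decision.
        expect(resolvePath("N", "10.0.0.5", "crawl", "C:\\crawl"), "C:\\crawl");
        System.out.println("both branches executed");
    }
}
```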

10.3. SYSTEM TESTING (ST)

System testing involves in-house testing of the entire system before delivery to the user. Its
aim is to satisfy the user that the system meets all requirements of the client's specifications.

10.4. INTEGRATION TESTING (IT)

Integration testing ensures that software and subsystems work together as a whole. It tests
the interface of all the modules to make sure that the modules behave properly when
integrated together.

10.5. ACCEPTANCE TESTING (AT)

Acceptance testing is pre-delivery testing in which the entire system is tested at the client's
site on real-world data to find errors.

10.6. VALIDATION

The system has been tested and implemented successfully, ensuring that all the
requirements listed in the software requirement specification are fulfilled. In
case of erroneous input, corresponding error messages are displayed.

COMPILING TEST

It was a good idea to do our stress testing early on, because it gave us time to fix some of the
unexpected deadlocks and stability problems that only occurred when components were
exposed to very high transaction volumes.

EXECUTION TEST

This program was successfully loaded and executed. No execution errors were observed.

10.7. TEST CASES

Test Cases:

S.No.  Module        Test Case ID  Do                                           Expected Result
1.     Web Crawler   MQT1-001      Enter website name in the search box         Website will be downloaded.
2.     Web Crawler   MQT1-002      Click on search button                       History should be displayed.
3.     Web Crawler   MQT1-003      Click on exit button                         Form should be closed.
4.     Node Manager  MQT1-004      Add details of a computer and click on save  New computer should be added.
5.     Node Manager  MQT1-005      Add details of computer and click on delete  Record should be deleted.
6.     Node Manager  MQT2-006      Change node information                      Node information should be changed.
7.     Node Manager  MQT2-007      Click on view node info                      Record should be displayed.

CHAPTER 11

SYSTEM IMPLEMENTATION

11.1. INSTALLATION PROCEDURE OF THE SOFTWARE

To install the software, perform the following tasks.

a. First check that the system meets the minimum requirements. If it does, install
Microsoft Windows XP SP2 or above on the system on which the program is going
to be used.

b. Next, set up JDK 1.7 or above to provide the JVM.

c. Then set up the NetBeans IDE. The system is now ready for the project to be
installed.

d. Insert the project CD in the CD-ROM drive. Open NetBeans, click on the Open
menu and select the project.

e. Build and run the software by selecting Run from the context menu or by
pressing Alt+F6.

f. Select one notepad file with the list of numbers and perform the required sorting
comparison.

11.2 USAGE OF THE SOFTWARE

To use the software, a PC with the minimum hardware and software configuration specified
earlier is required. After installation, any user can make use of the software.

CHAPTER 12
CONCLUSION AND FUTURE SCOPE OF STUDY

The biggest contribution of this project is the concept of distributing crawl tasks based on
disjoint subsets of the URL crawl space. We also presented a scalable, multi-threaded,
peer-to-peer distributed architecture for a web crawler based on this concept. Another
interesting contribution of the project is the proposed probabilistic hybrid of depth-first
traversal and breadth-first traversal, although we were unable to study its advantages or
disadvantages during this project. This traversal strategy can be used to achieve a hybrid
of the two traditional strategies without any extra book-keeping and is very easy to
implement. We also implemented a complete web crawler that demonstrates all of the above
concepts.
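
The two ideas above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the report does not give the partitioning function, so hashing the host name is an assumed choice, and the names `CrawlSketch`, `ownerOf` and `HybridFrontier` are hypothetical. The frontier is a single double-ended queue, so choosing between the depth-first and breadth-first end needs no additional data structure.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;
import java.util.Random;

// Illustrative sketch; names and the hashing rule are assumptions.
class CrawlSketch {

    // Maps a URL to one of nodeCount crawler nodes. URLs with the same host
    // always map to the same node, so the nodes crawl disjoint subsets.
    // URI.create throws IllegalArgumentException for malformed input.
    static int ownerOf(String url, int nodeCount) {
        String host = URI.create(url).getHost();
        String key = (host != null) ? host.toLowerCase(Locale.ROOT) : url;
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // Frontier that behaves depth-first with the given probability
    // and breadth-first otherwise.
    static final class HybridFrontier {
        private final Deque<String> urls = new ArrayDeque<>();
        private final double depthFirstProbability;
        private final Random random;

        HybridFrontier(double depthFirstProbability, Random random) {
            this.depthFirstProbability = depthFirstProbability;
            this.random = random;
        }

        void add(String url) {
            urls.addLast(url);
        }

        boolean isEmpty() {
            return urls.isEmpty();
        }

        // Newest URL (stack order) on a depth-first draw,
        // oldest URL (queue order) on a breadth-first draw.
        String next() {
            return random.nextDouble() < depthFirstProbability
                    ? urls.removeLast()
                    : urls.removeFirst();
        }
    }

    public static void main(String[] args) {
        System.out.println("node for example.com: " + ownerOf("http://example.com/a", 4));

        HybridFrontier frontier = new HybridFrontier(0.3, new Random());
        frontier.add("http://example.com/a");
        frontier.add("http://example.com/b");
        frontier.add("http://example.com/c");
        while (!frontier.isEmpty()) {
            System.out.println("crawl " + frontier.next());
        }
    }
}
```

A probability of 0 gives pure breadth-first order and a probability of 1 gives pure depth-first order; values in between mix the two.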

FUTURE SCOPE:

Future extension of the project includes implementing a DNS cache in the crawler thread
and studying the performance of the hybrid traversal strategy at various cache-hit rates. A
number of issues need to be dealt with to make this system usable in the real world. The
crawler needs to conform to the robots exclusion protocol. We need to handle partial failure:
although at present the failure of one node will not stop other components, it would be
desirable for another node to take over the task of the node that failed. Dynamic
reconfiguration and dynamic load-balancing would also be desirable.
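
One possible shape for the proposed DNS cache is sketched below. It is an assumption about a future design, not existing project code: the class `DnsCacheSketch` and its methods are hypothetical, and the resolver is passed in as a function so that the cache and its hit-rate counter can be exercised without network access.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Illustrative host-to-address cache with a hit-rate counter; names are hypothetical.
class DnsCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> resolver;
    private final AtomicLong lookups = new AtomicLong();
    private final AtomicLong hits = new AtomicLong();

    DnsCacheSketch(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    // Returns the cached address for the host, calling the resolver
    // only the first time a host is seen.
    String resolve(String host) {
        lookups.incrementAndGet();
        String cached = cache.get(host);
        if (cached != null) {
            hits.incrementAndGet();
            return cached;
        }
        return cache.computeIfAbsent(host, resolver);
    }

    // Fraction of lookups answered from the cache.
    double hitRate() {
        long total = lookups.get();
        return total == 0 ? 0.0 : (double) hits.get() / total;
    }

    public static void main(String[] args) {
        // Stub resolver returning a documentation address; a crawler thread
        // could wrap java.net.InetAddress.getByName(host) here instead.
        DnsCacheSketch dns = new DnsCacheSketch(host -> "192.0.2.1");
        dns.resolve("example.com");
        dns.resolve("example.com");
        System.out.println("hit rate: " + dns.hitRate());
    }
}
```

The `hitRate` value is the quantity against which the hybrid traversal strategy could be studied.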

