
Diagnosis of ADHD using SVM algorithm

J Anuradha
Senior Professor
VIT University, Tamil Nadu
+91-9443130861
januradha@vit.ac.in

Tisha
B.Tech(IT)
VIT University, Tamil Nadu
+91-9952113012
tisha.dhiman88@gmail.com

Varun Ramachandran
B.Tech(IT)
VIT University, Tamil Nadu
+91-9952212336
varun.ramachandran89@gmail.com

Dr. K.V. Arulalan
M.D., D.C.H., Pediatrics-Primary Care Consultant
Gandhi Nagar, Vellore, Tamil Nadu
+91-9365836260
Arul_katpadigandhinagar@rediffmail.com

Dr. B.K. Tripathy
Senior Professor
VIT University, Tamil Nadu
tripathybk@rediffmail.com

Abstract


Attention Deficit Hyperactivity Disorder (ADHD) is a Disruptive Behaviour Disorder characterized by the presence of a set of chronic and impairing behaviour patterns that display abnormal levels of inattention, hyperactivity, or their combination. Since most individuals, especially children, display these behaviours from time to time, it is difficult to differentiate behaviours that reflect ADHD from those that are a normal part of growing up, which makes the diagnosis a tricky job. In this paper, we apply a well-known artificial intelligence technique, the SVM algorithm, to the diagnosis of the disorder. The major advantage of using SVM is that it helps in controlling the complexity of the diagnosis problem. There has not been much development or research on ADHD using the SVM algorithm. Hence this is the first attempt at diagnosing the problem using the algorithm. To improve the overall identification accuracy, we also make use of a GA-based feature selection algorithm. Genetic algorithms are known to give good solutions to very complex problems. In conclusion, we expect that AI techniques like SVM will play an essential role in future ADHD diagnosis applications.

1. Theory
Attention deficit hyperactivity disorder (ADHD) is one of the most common childhood disorders and can continue through adolescence and adulthood. ADHD is a disorder in which a person has difficulty learning effectively, caused by an unknown factor or factors. The unknown factor is the disorder that affects the brain's ability to receive and process information [2]. It typically first shows up when a person has difficulty speaking, reading, writing, figuring out a math problem, communicating with a parent, or paying attention in class [3].

ADHD is characterized by hyperactivity, impulsivity and inattention. Typical behaviours include: difficulty staying seated; fidgeting and bouncing while seated; talking excessively; seeming to be in constant motion; climbing on things and jumping off things inappropriately; running inappropriately; having great difficulty waiting for turns; interrupting children's play activities; interrupting conversations; blurting out answers to questions not directed at them; and acting recklessly without thinking of the consequences.

These are just ways of identifying whether a child may be suffering from ADHD. They are only an estimate and not a definitive way to predict the onset of ADHD. Since there is no definitive method for identifying ADHD, we plan to do so based on the algorithm. These points just give us an idea of how a child with ADHD may behave; it is also possible that a child does not suffer from ADHD in spite of showing the above behaviours [2]. The report, "Prevalence of Attention Deficit Disorder and Learning Disability," based on 1997-98 data from CDC's National ...

In this paper we attempt to develop a tool based on the SVM algorithm in order to increase the efficiency of ADHD diagnosis.

2. Algorithm & Design


Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. A support vector machine constructs a hyperplane or set of hyperplanes in a high-dimensional space, which can be used for classification, regression or other tasks [8].
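As a rough illustration of how a separating hyperplane is used, the sketch below classifies a point by the sign of w.x + b and computes the distance 2/||w|| between the two margin hyperplanes w.x + b = +1 and w.x + b = -1. The weights, bias and sample points are hypothetical values chosen only for the example; they are not taken from the ADHD data.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin_width(w):
    """Distance between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical weights and bias for a two-feature problem (assumption).
w, b = [3.0, 4.0], -1.0

def classify(x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if decision_value(w, b, x) >= 0 else -1

print(margin_width(w))
print(classify([1.0, 1.0]), classify([0.0, 0.0]))
```

A smaller ||w|| gives a wider margin, which is why SVM training minimizes ||w|| subject to the training points lying on the correct side of the margin hyperplanes.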

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Compute10, Jan 22-23, 2010, Bangalore, Karnataka, India Copyright 2010 ACM 978-1-4503-0001-8/00/0010$5.00.

Fig 1 Support Vector Machine Representation

Fig 2 Output given by SVM

Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data points of any class (the so-called functional margin), since in general the larger the margin, the lower the generalization error of the classifier [5]. The training data used is in the form:

$$D = \{(x_i, c_i) \mid x_i \in \mathbb{R}^p,\ c_i \in \{-1, 1\}\}_{i=1}^{n} \qquad (1)$$

The SVM is given by:

$$(w \cdot x_1) + b = +1, \qquad (w \cdot x_2) + b = -1 \qquad (2)$$

$$\Rightarrow\ (w \cdot (x_1 - x_2)) = 2 \ \Rightarrow\ \left(\frac{w}{\|w\|} \cdot (x_1 - x_2)\right) = \frac{2}{\|w\|} \qquad (3)$$

ADHD and its diagnosis and treatment have been considered controversial since the 1970s. The controversies have involved clinicians, teachers, policymakers, parents and the media. Opinions regarding ADHD range from not believing it exists at all to believing there are genetic and physiological bases for the condition, and there is also disagreement about the use of stimulant medications in treatment [7].

To implement this algorithm we use a tool called Clementine 12.0. Clementine provides a graphical interface that puts the power of data mining in the hands of the user. In this paper, we describe how we use this tool to diagnose ADHD [3]. The general algorithm described in Fig 3 helps in understanding the steps we follow for the collection and analysis of data. The steps are explained in brief below:

Step 1: (Collection of Data) This step deals with the collection of data and representing the data in the form of an MS Excel sheet.
Step 2: (Test Data Processing) This step deals with using the SVM tool Clementine 12.0 and processing the data through it. Thus, this step deals with the test data.
Step 3: (Diagnosing) In this step, the testing and actual diagnosing take place. We introduce the data that needs to be diagnosed and run it through the SVM tool again.

Parameter C determines the trade-off between the model complexity (flatness) and the degree to which deviations larger than ε are tolerated in the optimization formulation. For example, if C is too large (infinity), then the objective is to minimize the empirical risk only, without regard to the model complexity part of the optimization formulation [4]. Parameter ε controls the width of the ε-insensitive zone used to fit the training data. The value of ε can affect the number of support vectors used to construct the regression function: the bigger ε, the fewer support vectors are selected. On the other hand, bigger ε-values result in flatter estimates. Hence, both C and ε-values affect model complexity (but in different ways).
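The roles of C and ε can be made concrete with a small sketch of the ε-insensitive loss. It assumes the standard support vector regression objective, 0.5·||w||² + C·Σ loss, and holds the regression function fixed, which is a simplification because in practice the fitted function changes with ε. The residuals and parameter values are hypothetical.

```python
def epsilon_insensitive_loss(residual, epsilon):
    """Zero inside the tube |r| <= epsilon, growing linearly outside it."""
    return max(0.0, abs(residual) - epsilon)

def count_outside_tube(residuals, epsilon):
    """Points on or outside the tube are the ones that can become support vectors."""
    return sum(1 for r in residuals if abs(r) >= epsilon)

def svr_objective(w, residuals, C, epsilon):
    """Flatness term plus C times the empirical (epsilon-insensitive) risk."""
    flatness = 0.5 * sum(wi * wi for wi in w)
    risk = sum(epsilon_insensitive_loss(r, epsilon) for r in residuals)
    return flatness + C * risk

# Hypothetical residuals (target minus prediction) for a fixed regression function.
residuals = [0.05, -0.35, 0.8, -0.15, 0.45, -0.6, 0.2]

for eps in (0.1, 0.3, 0.5):
    # A wider tube leaves fewer points outside it.
    print(eps, count_outside_tube(residuals, eps))

for C in (0.1, 10.0):
    # A larger C makes the empirical-risk term dominate the objective.
    print(C, svr_objective([1.0, 1.0], residuals, C, 0.1))
```

With these residuals the number of points outside the tube falls as ε grows, and raising C increases the weight of the deviations relative to the flatness term.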

Fig 3 General Algorithm

Procedure to diagnose ADHD in students using SVM
Collect data set = {data set 1, data set 2, ...}
Split the data set into T and D
T: contains the training data set with diagnosis, and
D: contains the data set to be diagnosed
Repeat the next step for data sets T and D
Apply pre-processing to reduce the noisy data
Create a stream in Clementine 12.0 with SVM algorithm
Choose Radial Basis Function Kernel
Apply this stream to the training data set T
Now, apply the stream to both data sets T and D combined
Output is the diagnosis of ADHD
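The procedure in Fig 3 can be approximated outside Clementine. The sketch below is not the Clementine stream itself: it swaps in scikit-learn's `SVC` with an RBF kernel as a stand-in, and it uses randomly generated yes/no answers with a synthetic diagnosis rule, since the questionnaire data is not reproduced here. The split sizes, `C` and `gamma` values are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in for the questionnaire: 100 children, 19 yes/no answers.
n_children, n_answers = 100, 19
X = rng.integers(0, 2, size=(n_children, n_answers))

# Synthetic "diagnosis" (assumption): positive when many answers are "yes".
y = (X.sum(axis=1) >= 10).astype(int)

# T: rows with a known diagnosis; D: rows still to be diagnosed.
X_T, y_T = X[:98], y[:98]
X_D = X[98:]

# RBF-kernel SVM, the kernel type named in the procedure.
model = SVC(kernel="rbf", C=1.0, gamma="scale")
model.fit(X_T, y_T)

# Diagnosis for the undiagnosed rows, encoded as 1 (ADHD) or 0 (no ADHD).
predicted = model.predict(X_D)
print(predicted)
```

The pre-processing step is omitted here because the synthetic answers contain no noise; with real questionnaire data it would come before the call to `fit`.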

Fig 4 Input to the SVM Module

Fig 5 Output given by the SVM Module

The results provided by this module are in tabular form (they can be graphical too), with a last column added that gives the algorithm's diagnosis. For the data collected it had an efficiency of 88.3% [8]. Fig 5 shows the output of the SVM module. Columns V and W are added by the SVM algorithm; column V shows the diagnosis predicted by the algorithm. Notice that rows 98 and 99, which did not have any diagnosis value, now have a value in column V; this value is the diagnosis made by the SVM algorithm. To verify this result, a qualified doctor can be consulted.

3. Experimentation
This section explains the process and procedure we go through in order to diagnose ADHD.

3.1 Collection of Data


The first step in implementing the algorithm is supplying training data to the model. For this purpose we collected data from doctors for 100 children. The training data used is in the form:

$$D = \{(x_i, c_i) \mid x_i \in \mathbb{R}^p,\ c_i \in \{-1, 1\}\}_{i=1}^{n} \qquad (1)$$


The collected data was in the form of answers to a questionnaire along with the doctors' diagnoses. The 100 children whose test results we used were all between 7 and 10 years of age. These 100 students were given a set of 6 questions with sub-parts, each of which was answered with either a yes or a no. This data was collected and stored in a relation, with each attribute representing a question or a sub-part of a question, each tuple representing a new student, and the last column holding the doctor's diagnosis. A yes or a no is stored as a 1 or a 0. This data is used as the training data for the SVM module. This handles the particular subset problem, while it does remain highly vulnerable to other design issues, namely cost. Fig 4 gives a preview of the input given to the SVM module. Column A contains the names of the children and columns B to T contain the answers to the questions. Notice that column U denotes the diagnosis, where a 1 represents a positive diagnosis for ADHD and a 0 represents children not suffering from ADHD. In the last two rows of Fig 4, namely rows 98 and 99, there is no value in the diagnosis column. This means these are the rows for which we perform the diagnosis using the SVM algorithm.
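A minimal sketch of this data layout follows: yes/no answers are encoded as 1/0, and rows without a doctor's diagnosis are set aside as the rows to be diagnosed. The child names, the three-answer rows and the helper `encode_answer` are hypothetical and much smaller than the actual sheet.

```python
def encode_answer(answer):
    """Map a questionnaire answer to 1 (yes) or 0 (no)."""
    return 1 if answer.strip().lower() == "yes" else 0

# Hypothetical rows: name, answers to the sub-questions, doctor's diagnosis.
# A diagnosis of None marks a row that still has to be diagnosed.
raw_rows = [
    ("Child A", ["yes", "no", "yes"], 1),
    ("Child B", ["no", "no", "no"], 0),
    ("Child C", ["yes", "yes", "no"], None),
]

training, to_diagnose = [], []
for name, answers, diagnosis in raw_rows:
    features = [encode_answer(a) for a in answers]
    if diagnosis is None:
        to_diagnose.append((name, features))
    else:
        training.append((name, features, diagnosis))

print(len(training), len(to_diagnose))
```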

3.3 Diagnosing
The next step is to diagnose the disorder using this technique and a set of collected test data. We feed this to the module along with the training data and thus obtain the diagnosis according to the SVM algorithm. This diagnosis needs to be verified by a doctor in order to find out whether we are taking the correct approach. After verifying this data with the doctors, we find that our module gives the right diagnosis 88.674% of the time. This shows that for 100 test values we obtain an accuracy of about 88% in our diagnosis, which is highly encouraging at this stage of research.
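The verification step amounts to comparing the module's diagnoses with the doctors' diagnoses and reporting the fraction that agree. A minimal sketch is given below; the two short lists are hypothetical and do not reproduce the reported figure.

```python
def accuracy(predicted, confirmed):
    """Fraction of predicted diagnoses that match the doctor-confirmed ones."""
    if len(predicted) != len(confirmed):
        raise ValueError("both lists must have the same length")
    matches = sum(1 for p, c in zip(predicted, confirmed) if p == c)
    return matches / len(predicted)

# Hypothetical diagnoses: 1 = ADHD, 0 = no ADHD.
svm_diagnosis = [1, 0, 1, 1]
doctor_diagnosis = [1, 0, 0, 1]

print(accuracy(svm_diagnosis, doctor_diagnosis))
```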

4. Results & Discussion


Fig 6 shows the occurrence of ADHD in children between the ages of 6 and 11 years as evaluated by the SVM algorithm. According to this graph, the number of reported and diagnosed cases is highest for children aged 8 years, followed by children aged 7. Our results are very encouraging. Our study on diagnosing ADHD using the SVM algorithm has shown a success rate of 88.674% in diagnosing. There is a lot of scope for development in this field of study in order to increase the success rate achieved by using this technique alone.
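A graph of this kind is built by counting positive diagnoses per age. The sketch below shows that counting step with hypothetical (age, diagnosis) pairs; the values are illustrative and are not the data behind Fig 6.

```python
from collections import Counter

# Hypothetical (age in years, SVM diagnosis) pairs; 1 = ADHD, 0 = no ADHD.
records = [(7, 1), (8, 1), (8, 1), (9, 0), (8, 1), (7, 1), (10, 1), (6, 0)]

# Count only the positive diagnoses for each age.
cases_by_age = Counter(age for age, diagnosis in records if diagnosis == 1)

# Age with the largest number of positive diagnoses.
peak_age, peak_count = cases_by_age.most_common(1)[0]

for age in sorted(cases_by_age):
    print(age, cases_by_age[age])
print(peak_age, peak_count)
```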

3.2 Test Data Processing


To use the model in the Clementine tool, we create a stream to work on and add an SVM algorithm module with the radial basis function kernel type. After the data is executed through the support vector machine, the module provides its own interpretation of the results. On executing the module, the results are returned in tabular form (Fig 5).
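The radial basis function kernel is commonly defined as K(x, z) = exp(-γ·||x - z||²). The sketch below evaluates it for two binary answer vectors; the value of `gamma` and the vectors are assumptions for illustration, and the setting used in the Clementine stream is not stated in this paper.

```python
import math

def rbf_kernel(x, z, gamma=0.1):
    """Similarity exp(-gamma * ||x - z||^2); equals 1.0 when x and z coincide."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Hypothetical answer vectors for two children (1 = yes, 0 = no).
child_a = [1, 0, 1, 1, 0]
child_b = [1, 1, 0, 1, 0]

print(rbf_kernel(child_a, child_a))
print(rbf_kernel(child_a, child_b))
```

For yes/no answers the squared distance is simply the number of questions on which the two children differ, so the kernel value decreases as their answers diverge.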

4.1 Practical Application


In this paper we have tried to diagnose Attention Deficit Hyperactivity Disorder (ADHD) by applying one of the techniques of artificial intelligence to the problem. According to our results this makes the diagnosis accurate, less time-consuming and less tedious. If proven feasible, this computer-aided model can not only save time, manpower and other resources but also avoid possible human bias. However, there is still work to be done in the future. The first and most important task is to find an optimal feature set so as to develop a model that requires fewer features and has sufficiently high accuracy. Fewer features mean that parents and teachers (both regular and special education teachers) can concentrate on collecting variables that are relevant and essential. In addition, it is also necessary to collect much more data, not just in the number of samples but also in the variables associated with each sample. More data samples and additional essential features help to build a well-supported AI classification model for future prediction. This is essential since there are 11 sub-types of learning disabilities, and chances are there could be more. As we have found in our experiment with data set 1, certain students manually diagnosed as LD are always classified as non-LD. Finally, most special education teachers or professionals we talked to tend to be skeptical of these kinds of black-box predictors.

Additional research is examining the long-term outcome of ADHD [1]. How do children with ADHD turn out, compared to brothers and sisters without the disorder? As adults, how do they handle their own children? Still other studies seek to better understand ADHD in adults. Such studies give insights into what types of treatment or services make a difference in helping an ADHD child grow into a caring parent and a well-functioning adult [6].

Fig 6 Graph showing the occurrence of ADHD with age as predicted by SVM algorithm

5. Summary
Diagnosis of students with learning disability has never been an easy job. In this paper we have tried to diagnose Attention Deficit Hyperactivity Disorder (ADHD) by applying one of the techniques of artificial intelligence to the problem. According to our results this makes the diagnosis accurate, less time-consuming and less tedious. We have taken a data set which is verified by a doctor; it includes the results of a questionnaire used by the doctors to diagnose ADHD. This data set is then given to the SVM module; this is called the test data [1]. After that we introduce the data set which needs to be diagnosed and again give it to the SVM module; this time the module gives us the diagnosis. This diagnosis can be verified by any doctor. In future we can use this SVM algorithm to diagnose other ADHD-related problems. We can also create our own algorithm and test the results so as to not use a previously developed model or software. There is still a lot of research to be done on LD and ADHD, especially in India, where it is still not considered a threat.

6. References
[1] R.A. Barkley, Attention Deficit Hyperactivity Disorder: A Handbook for Diagnosis and Treatment, 2nd ed., Guilford Press, 1999.
[2] Ericka L. Wodka, Christopher Loftis, Stewart H. Mostofsky, Christine Prahme, Jennifer C. Gidley Larson, Martha B. Denckla, E. Mark Mahone, Prediction of ADHD in boys and girls using D-KEFS, Oxford Journals, Clinical Neuropsychology, 2008, 23(3):283-293.
[3] J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, V. Vapnik, Feature Selection for SVMs, IEEE paper (2001): 668-674.
[4] Stuart Andrews, Ioannis Tsochantaridis, Thomas Hofmann, Support Vector Machine for Multiple-Instance Learning, Eighteenth National Conference on Artificial Intelligence, Edmonton, Alberta, Canada, pages 943-944, 2002.
[5] Davide Anguita, Andrea Boni and Sandro Ridella, Digital Architecture for Support Vector Machines: Theory, Algorithm and FPGA Implementation, IEEE Transactions on Neural Networks, Vol. 14, No. 5 (2003), pp. 993-1009.
[6] Impact of Parent and Teacher Concordance on Diagnosing Attention Deficit Hyperactivity Disorder and its Sub-types, Indian Journal of Paediatrics, volume 75, March 2008.
[7] Machine Learning Using Support Vector Machines, King Saud University, The College of Computer & Information Science, Computer Science Department (Master), Neural Networks and Machine Learning Applications.
[8] Davide Anguita and Andrea Boni, Digital Least Squares Support Vector Machines, Neural Processing Letters, Springer, volume 18, number 1, August 2003, pages 65-72.
[9] American Academy of Pediatrics, Committee on Quality Improvement, Subcommittee on Attention-Deficit/Hyperactivity Disorder, Clinical Practice Guideline: Diagnosis and Evaluation of the Child With Attention-Deficit/Hyperactivity Disorder, Vol. 105, No. 5, May 2000.
[10] National Institute of Mental Health, ADHD Research, http://www.nimh.nih.gov/health/publications/attentiondeficit-hyperactivity-disorder/complete-index.shtml
