
PREDICTING THE MECHANICAL PROPERTIES OF PLAIN CARBON STEELS USING ANN

By

1. S. Ibtesam Hasan Abidi (Group Leader)    01 MT 23
2. Jamal Ahmed (A.G.L)                      01 MT 21
3. S. Raheem Shah                           2k-01 MT 31

Supervised by:

Professor Mohammad Hayat Jokhio


Chairman

Department of Metallurgy And Materials Engineering Mehran University Of Engineering And Technology, Jamshoro
Submitted in partial fulfillment of the requirements for the degree of Bachelor of Metallurgy and Materials Engineering

Dec 2004

Dedication
Dedicated to Time, our best teacher, and then to our Parents, our first teachers


CERTIFICATE
This is to certify that the work presented in this project thesis on Predicting the Mechanical Properties Of Plain Carbon Steels Using ANN is entirely written by the following students under the supervision of Prof. Mohammad Hayat Jokhio.

1. S. Ibtesam Hasan Abidi (Group Leader)    01 MT 23
2. Jamal Ahmed (A.G.L)                      01 MT 21
3. S. Raheem Shah                           2k-01 MT 31

Thesis Supervisor

External Examiner

______, Dec 2004 Chairman Department of Metallurgy And Materials Engineering


Acknowledgements
Every praise is to ALLAH alone, and the grace of ALLAH is on Prophet Muhammad (PBUH), the only source of guidance and knowledge for all humanity. It gives us great pleasure to record our sincere thanks to our supervisor, Professor Mohammad Hayat Jokhio, Chairman, Department of Metallurgy and Materials Engineering, who gave his consent to guide us in this project. He was very encouraging and cooperative while the work was carried out.

We are thankful to Mr. Isahque Abro, assistant professor, Department Of Metallurgy And Materials Engineering, for his contributions in making the work possible.

We consider ourselves extremely lucky to be members of Mehran University of Engineering and Technology. Our friends were great fun to be with; they always helped us during the hard times, and we will miss them all dearly. We also thank all those people who helped and contributed in the preparation of this thesis. Finally, we would like to say how much we love our families; without their support and encouragement, these years would not have been possible.


ABSTRACT
Neural networks are among the most prominent emerging technologies in the field of artificial intelligence. The present work, carried out at Mehran University of Engineering and Technology, Jamshoro, is an attempt to use a feed-forward neural network with the back-propagation training algorithm to predict the mechanical properties of plain carbon steels. The composition of the plain carbon steels was used as the input parameter. Forty samples were used to train the network, while it was validated on seven samples. The network was built with the tansig, purelin, and trainlm functions. This work should help materials engineers and manufacturers design products with suitable, high-performance mechanical properties.

CONTENTS
Chapter No. 1  INTRODUCTION  1
Chapter No. 2  LITERATURE REVIEW  3
  2.1  Neural Network
  2.2  Characteristics Of Neural Network
    2.2.1  Capability Of Modeling
    2.2.2  Ease Of Use
  2.3  Analogy To The Brain
  2.4  The Biological Network
    2.4.1  The Work Mechanism
  2.5  The Artificial Neuron
    2.5.1  The Work Mechanism
  2.6  Outstanding Features Of Neural Network
  2.7  Main Features Of Neural Network
  2.8  Limitations
  2.9  Advantages
  2.10  Classification Of Neural Networks
    2.10.1  Feed Forward Networks
    2.10.2  The Back Propagation
    2.10.3  Single Layer Perceptron
    2.10.4  Multi-Layer Perceptron
    2.10.5  Simple Recurrent Network
    2.10.6  Hopfield Network
    2.10.7  Boltzmann Machine
    2.10.8  Committee Of Machines
    2.10.9  Self-Organizing Map
  2.11  Designing
    2.11.1  Layers
    2.11.2  Communication And Types Of Connections
      2.11.2.1  Inter-Layer Connections
      2.11.2.2  Intra-Layer Connections
    2.11.3  Learning
      2.11.3.1  Off-Line Or On-Line
      2.11.3.2  Learning Laws
Chapter No. 3  MECHANICAL PROPERTIES OF STEEL AND FACTORS AFFECTING MECHANICAL PROPERTIES  23
    3.1.1  Tensile Strength
    3.1.2  Yield Strength
    3.1.3  Elasticity
    3.1.4  Plasticity
    3.1.5  Ductility
    3.1.6  Brittleness
    3.1.7  Toughness
    3.1.8  Hardness
    3.1.9  Fatigue
    3.1.10  Creep
  3.2  Factors Affecting Mechanical Properties
    3.2.1  Effect Of Grain Size On Properties Of Metals
    3.2.2  Effect Of Heat Treatment On Properties Of Metals
    3.2.3  Effect Of Environmental Variables
    3.2.4  Effect Of Alloying Elements
Chapter No. 4  TESTING TECHNIQUES  33
  4.1  Tensile Test
    4.1.1  Tensile Test Results
    4.1.2  Proof Stress
    4.1.3  The Interpretation Of Tensile Test Results
    4.1.4  The Effect Of Grain Size And Structure On Tensile Testing
  4.2  Impact Testing
    4.2.1  The Izod Test
    4.2.2  The Charpy Test
    4.2.3  The Interpretation Of Impact Tests
    4.2.4  The Effect Of Processing On Toughness
  4.3  Hardness Testing
    4.3.1  The Brinell Hardness Test
      4.3.1.1  Machinability
      4.3.1.2  Relationship Between Hardness And Tensile Strength
      4.3.1.3  Work-Hardening Capacity
    4.3.2  The Vickers Hardness Test
    4.3.3  The Rockwell Hardness Test
    4.3.4  Shore Scleroscope
      4.3.4.1  The Effect Of Processing On Hardness
Chapter No. 5  USING NEURAL NETWORK TOOLBOX  60
  5.1  Introduction
  5.2  The Structure Of The Neural Network Toolbox
  5.3  Network Layers
    5.3.1  Constructing Layers
    5.3.2  Connecting Layers
  5.4  Setting Transfer Functions
    5.4.1  Activation Functions
  5.5  Weights And Biases
  5.6  Training Functions & Parameters
    5.6.1  Performance Functions
    5.6.2  Train Parameters
    5.6.3  Adapt Parameters
  5.7  Basic Neural Network Example
    5.7.1  Manually Set Weights
    5.7.2  Training Algorithms
  5.8  Graphical User Interface
    5.8.1  Introduction Of GUI
    5.8.2  Create A Perceptron Network
    5.8.3  Input And Target
    5.8.4  Create Network
    5.8.5  Train The Perceptron
    5.8.6  Export Perceptron Results To Workspace
    5.8.7  Clear Network/Data Window
    5.8.8  Importing From Command Line
    5.8.9  Save A Variable To A File And Load It Later
Chapter No. 6  EXPERIMENTAL WORK  86
  6.1  Data Set
  6.2  Methodology
    6.2.1  Algorithm
Chapter No. 7  RESULTS AND CONCLUSION  95
  7.1  Results
  7.2  Conclusion
  7.3  Future Work
APPENDICES
  Appendix A  99
  Appendix B  103
BIBLIOGRAPHY  105


List of Tables and Illustrations
Figure 2.1  A systematic diagram of a single neuron nerve  5
Figure 2.3  Artificial neuron  6
Figure 2.4  Feed forward networks  9
Figure 2.5  Back propagation  11
Figure 2.6  Network layers  15
Figure 3.1  Yield point and yield strength  24
Figure 4.1  Testing machine  33
Figure 4.2  Tensile test specimen (round)  35
Figure 4.3  Tensile test specimen (flat)  35
Figure 4.4  Load / extension curve for low-carbon steel  36
Figure 4.5  Proof stress  37
Figure 4.6  Typical stress/strain curves  40
Figure 4.7  Effect of grain orientation on material testing  41
Figure 4.8  Effect of processing on the properties of low-carbon steel  41
Figure 4.9  Effect of tempering on tensile test  42
Figure 4.10  Effect of temperature on cold-worked material  43
Figure 4.11  Typical impact testing machine  44
Figure 4.12  Impact loading  44
Figure 4.13  Izod test  45
Figure 4.14  Charpy test  46
Figure 4.15  Standard Charpy notches  47
Figure 4.16  Effect of temperature on toughness  48
Figure 4.17  Effect of annealing on the toughness of low-carbon steel  49
Figure 4.18  Effect of tempering on quench-hardened high carbon steel  49
Figure 4.19  Brinell hardness tester  51
Figure 4.20  Brinell hardness principle  51
Figure 4.21  Work-hardening capacity  52
Figure 4.22  Micro Vickers and Vickers hardness testers  53
Figure 4.23  Rockwell hardness tester  55
Table 4.1  Rockwell hardness test conditions  56
Table 4.2  Rockwell superficial hardness test conditions  57
Figure 4.24  Effect of cold-working on the hardness of various metals  58
Figure 4.25  Effect of heating cold-worked 70/30 brass  59
Figure 4.26  Effect of heating a quench-hardened 0.8% plain carbon steel  59
Table 5.1  The XOR problem  68
Figure 5.1  The logsig activation function  69
Figure 5.2  The targets and the actual output  71
Figure 5.3  Network data manager  75
Figure 5.4  Create new data window  76
Figure 5.5  Create new network window  77
Figure 5.6  View network window  78
Figure 5.7  Main network window  79
Figure 5.8  Training result window  80
Figure 5.9  Import / load window  84

Table 6.1  Data set to be used for training the network  86
Table 6.2  Data set to be used as unseen data for the network  87
Figure 6.1  Training of the network with the TRAINLM function and 1 neuron  89
Figure 6.2  Training of the network with 7 neurons  91
Figure 6.3  Training of the network with 9 neurons  93
Figure 6.4  The network  94
Graph 7.1  Comparison of actual value and predicted value with 1 neuron  96
Graph 7.2  Comparison of actual value and predicted value with 7 neurons  97
Graph 7.3  Comparison of actual value and predicted value with 9 neurons  97
Graph 7.4  Regression line for the predicted and actual values  98



Chapter No. 1
INTRODUCTION
Neural networks are among the most prominent emerging technologies in the field of artificial intelligence. Used in almost every area of engineering, finance, defense, economics, and beyond, neural networks have proven themselves to be excellent tools for prediction and control. The present work is an attempt, in the Department of Metallurgy and Materials Engineering, MUET, Jamshoro, to use neural networks to predict the mechanical properties of plain carbon steels, since the mechanical properties are the single most important factor considered when designing a new composition. This work will also be useful in creating a reference against which tested mechanical properties can be compared. In the present work we have used the composition of the plain carbon steels to model the mechanical properties, although many other parameters also affect the mechanical properties of plain carbon steel, such as heat treatment, grain size, cold working, and environment; these are explained in Chapter 3. Forty samples were used to train the network, while it was tested on seven samples. As metallurgy and materials engineers, we have adopted and preferred a theoretical approach; however, a review of the literature reveals that a mathematical approach has been adopted on similar topics. The main objectives of this thesis work are (1) to condense neural networks into a comprehensive and simple tutorial that could help future students apply neural networks in the field of engineering materials, and (2) to create an example by implementing this technology for the prediction of mechanical properties. During the research we found neural networks to be an intelligent, powerful and useful tool for engineering applications, especially for the prediction of the mechanical properties of plain carbon steel. All the work was carried out at Mehran University of Engineering and Technology, Jamshoro. The data was obtained from Pakistan Steel Mills and the ASTM Materials Handbook on Properties of Metals, Vol. 9. The work is organized as follows. Chapter 2 gives a comprehensive but brief theoretical account of neural networks, training laws, the kinds of networks used, and some history.

Chapter 3 provides an introduction to the mechanical properties of plain carbon steel and the parameters affecting these properties. These parameters are significant because they can be used to enhance the predictive capability of the neural network. Chapter 4 covers the experimental techniques used to measure the mechanical properties on a laboratory scale; the relationships between the mechanical properties are also briefly explained. Chapter 5 is a tutorial on how to use the Neural Network Toolbox in MATLAB. MATLAB is a very powerful engineering and mathematical language, provided with a built-in neural network tool that is very easy to use. Chapter 6 describes the experimental work, and Chapter 7 presents the results, conclusion, and suggestions for future work.
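As a preview of the methodology detailed in Chapters 5 and 6, the sketch below shows how such a network can be set up in MATLAB's Neural Network Toolbox using the older newff syntax. It is only an illustrative outline: the file name steeldata, the variable names P, T and Punseen, and the training settings shown are assumptions made for the example, not the exact script used in this work.

    % P: composition inputs (one column per steel sample); T: target mechanical properties
    % (hypothetical file and variable names, standing in for the 40 training samples)
    load steeldata                      % provides P, T and the unseen inputs Punseen

    % Two-layer feed-forward network: tansig hidden layer, purelin output layer,
    % trained with the Levenberg-Marquardt algorithm (trainlm)
    net = newff(minmax(P), [7 1], {'tansig', 'purelin'}, 'trainlm');

    net.trainParam.epochs = 1000;       % maximum number of training epochs
    net.trainParam.goal   = 1e-3;       % mean-squared-error goal

    net = train(net, P, T);             % back-propagation training on the 40 samples
    Y   = sim(net, Punseen);            % predictions for the 7 unseen samples

Varying the hidden-layer size (1, 7 and 9 neurons, as in Chapter 6) is simply a matter of changing the first entry of the layer-size vector.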


Chapter No. 2
LITERATURE REVIEW
2.1 NEURAL NETWORK
An artificial neural network can best be defined as a system loosely modeled on the human brain. This field goes by many names, such as connectionism, parallel distributed processing, neuro-computing, natural intelligent systems, machine learning algorithms and artificial neural networks. Neural networks have seen an explosion of interest over the last few years, and are being successfully applied across an extraordinary range of problems, in areas as diverse as finance, medicine, engineering, geology and physics. In short, wherever there is a problem of prediction, classification or control, neural networks are finding application. A neural network is an attempt to simulate, within specialized hardware or sophisticated software, multiple layers of simple processing elements called neurons. Each neuron is linked to certain of its neighbours with varying coefficients of connectivity that represent the strengths of these connections; learning is accomplished by adjusting these strengths so that the overall network outputs appropriate results.

2.2 CHARACTERISTICS OF NEURAL NETWORKS
2.2.1 Capability of modeling
Neural networks are very sophisticated modeling techniques capable of modeling extremely complex functions; in particular, they are non-linear. For many years linear modeling has been the commonly used technique in most modeling domains, since linear models have well-known optimization strategies. Where the linear approximation was not valid, the models suffered accordingly.
2.2.2 Ease of use
Since neural networks learn by example, the neural network user gathers representative data first and then uses training algorithms to automatically learn the structure of the data. Although the user does need some heuristic knowledge of how to select and prepare data, how to select an appropriate neural network, and how to

interpret the results. The level of user knowledge needed to successfully apply neural networks is much lower than would be the case when using more traditional non-linear statistical methods.

2.3 ANALOGY TO THE BRAIN
The most basic components of neural networks are modeled after the structure of the brain. Some neural network structures are not closely modeled on the brain, and some do not have a biological counterpart in the brain. However, neural networks have a strong similarity to the biological brain, and therefore a great deal of the terminology is borrowed from the brain sciences.

2.4 THE BIOLOGICAL NETWORK
The most basic element of the human brain is a specific type of cell which provides us with the abilities to remember, think and apply previous experiences to our every action. These cells are known as neurons; each of these neurons can connect with up to 200,000 other neurons. The power of the brain comes from the numbers of these basic components and the multiple connections between them. All neurons have four basic components: dendrites, soma, axon and synapses (Fig 2.1). Basically, a biological neuron receives inputs from other sources, combines them in some way, performs a generally non-linear operation on the result, and then outputs the final result. The figure below shows a simplified biological neuron and the relationship of its four components.

2.4.1 The work mechanism
The brain is principally composed of a very large number (circa 10,000,000,000) of neurons, massively interconnected (with an average of several thousand interconnects per neuron, although this varies enormously). Each neuron is a specialized cell which can propagate an electrochemical signal. The neuron has a branching input structure (the dendrites), a cell body, and a branching output structure (the axon). The axons of one cell connect to the dendrites of another via a synapse. When a neuron is activated, it fires an electrochemical signal along the axon. This signal crosses the synapses to other neurons, which may in turn fire. A neuron fires only if the total signal received at the cell body from the dendrites exceeds a certain level (the firing threshold).

The strength of the signal received by a neuron (and therefore its chances of firing) critically depends on the efficacy of the synapse. Each synapse actually contains a gap, with neurotransmitter chemicals poised to transmit a signal across the gap. Thus, from a very large number of extremely simple processing units (each performing a weighted sum of its inputs, and then firing a binary signal if the total input exceeds a certain level) the brain manages to perform extremely complex tasks. Of course, there is a great deal of complexity in the brain which has not been discussed here, but it is interesting that artificial neural networks can achieve some remarkable results using a model not much more complex than this.

Figure 2.1 A schematic diagram of a single nerve cell (neuron)

2.5 THE ARTIFICIAL NEURON
The basic unit of neural networks, the artificial neuron, simulates the four basic functions of natural neurons. Artificial neurons are much simpler than biological neurons; the figure below shows the basics of an artificial neuron.

Figure 2.3 Artificial neuron

Note that the various inputs to the network are represented by the mathematical symbol x(n). Each of these inputs is multiplied by a connection weight; these weights are represented by w(n). In the simplest case, these products are simply summed, fed through a transfer function to generate a result, which is then output. Even though all artificial neural networks are constructed from this basic building block, the fundamentals may vary in these building blocks and there are differences.

2.5.1 The work mechanism
The artificial neuron receives a number of inputs (either from the original data or from the output of other neurons in the neural network). Each input comes via a connection that has a strength (or weight); these weights correspond to synaptic efficacy in a biological neuron. Each neuron also has a single threshold value. The weighted sum of the inputs is formed, and the threshold subtracted, to compose the activation of the neuron (also known as the post-synaptic potential, or PSP, of the neuron). The activation signal is passed through an activation function (also known as a transfer function) to produce the output of the neuron.
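In symbols, the computation just described takes the standard form (added here for clarity, using the notation introduced above):

    a = \sum_{i=1}^{n} w_i x_i - \theta, \qquad y = f(a)

where x_i are the inputs, w_i the connection weights, \theta the threshold, f the activation (transfer) function, and y the neuron output.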

2.6 OUTSTANDING FEATURES OF NEURAL NETWORKS
Neural networks perform successfully where other methods do not: recognizing and matching complicated, vague, or incomplete patterns. Neural networks have been applied in solving a wide variety of problems.
Prediction: The most common use of neural networks is to predict what will most likely happen. There are many areas where prediction can help in setting priorities. For example, the emergency room at a hospital can be a hectic place; knowing who needs the most critical help can enable a more successful operation. Basically, all organizations must establish priorities, which govern the allocation of their resources. Neural networks have been used as a mechanism of knowledge acquisition for expert systems in stock market forecasting, with astonishingly accurate results. Neural networks have also been used for bankruptcy prediction for credit card institutions.
Other common applications of neural networks fall into the following categories:
Classification: Use input values to determine the classification.
Data association: Like classification, but it also recognizes data that contain errors. For example, it not only identifies the characters that were scanned but also identifies when the scanner is not working properly.
Data conceptualization: Analyze the inputs so that grouping relationships can be inferred, for example extracting from a database the names of the customers most likely to buy a particular product.
Data filtering: Smooth an input signal, for example taking the noise out of a telephone signal.
Generally speaking, neural network systems can be applied for interpretation, prediction, diagnosis, planning, monitoring, debugging, repair, instruction, and control.
Application of neural networks in materials: Most neural network applications in materials science and engineering lie in the category of prediction, modeling and control. For example, the current work is the prediction of the mechanical properties of plain carbon steels; Jokhio [2004] has applied neural networks in the field of powder metallurgy.

Iqbal Shah [2002] has worked on predicting the tensile properties of austenitic stainless steels. H. K. D. H. Bhadeshia [1999] describes neural network applications in controlling welding robots; predicting the solidification cracking of welds, the strength of steel welds, hot cracking of welds, creep, fatigue properties, the fatigue threshold and the martensite start temperature; and, most importantly, predicting the continuous cooling transformation (or TTT) diagram.

2.7 MAIN FEATURES OF NEURAL NETWORKS
Artificial neural networks (ANNs) learn by experience rather than by modeling or programming. ANN architectures are distributed, inherently parallel and potentially real-time. They have the ability to generalize. They do not require a prior understanding of the process or phenomenon being studied. They can form arbitrary continuous non-linear mappings. They are robust to noisy data. VLSI implementation is easy.

2.8 LIMITATIONS
Tools for analysis and model validation are not well established. An intelligent machine can only solve the specific problems for which it is trained. The human brain is very complex and cannot be fully simulated with present computing power; an artificial neural network does not have the capability of the human brain.

2.9 ADVANTAGES
I. Adaptive learning: an ability to learn how to do tasks based on the data given for training or initial experience.
II. Self-organization: an ANN can create its own organization or representation of the information it receives during learning.

III. Real-time operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.
IV. Fault tolerance via redundant information coding: partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with major network damage.

2.10 CLASSIFICATION OF NEURAL NETWORKS
Neural networks can be classified as [wikipedia]:
a) Feed forward networks
b) Back propagation networks
According to architecture, neural networks can be classified as:
a) Single-layer perceptrons
b) Multi-layer perceptrons
Other types include:
a) Simple recurrent networks
b) Hopfield networks
c) Boltzmann machines
d) Support vector machines
e) Committees of machines
f) Self-organizing maps

2.10.1 Feed forward networks
Feed-forward ANNs allow signals to travel one way only, from input to output. There is no feedback (loop); i.e., the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organization is also referred to as bottom-up or top-down.

Figure: 2.4 Feed Forward Networks

2.10.2 The back propagation
The term is an abbreviation for "backwards propagation of errors". Back propagation still has advantages in some circumstances, and is the easiest algorithm to understand. There are also heuristic modifications of back propagation which work well for some problem domains, such as quick propagation.
In back propagation, the gradient vector of the error surface is calculated. This vector points along the line of steepest descent from the current point, so we know that if we move along it a short distance, we will decrease the error. A sequence of such moves (slowing as we near the bottom) will eventually find a minimum of some sort. The difficult part is to decide how large the steps should be. Large steps may converge more quickly, but may also overstep the solution or (if the error surface is very eccentric) go off in the wrong direction. A classic example of this in neural network training is where the algorithm progresses very slowly along a steep, narrow valley, bouncing from one side across to the other. In contrast, very small steps may go in the correct direction, but they also require a large number of iterations. In practice, the step size is proportional to the slope (so that the algorithm settles down in a minimum) and to a special constant: the learning rate. The correct setting for the learning rate is application-dependent, and is typically chosen by experiment; it may also be time-varying, getting smaller as the algorithm progresses. The algorithm is also usually modified by the inclusion of a momentum term: this encourages movement in a fixed direction, so that if several steps are taken in the same direction, the algorithm picks up speed, which gives it the ability to (sometimes) escape local minima, and also to move rapidly over flat spots and plateaus.
The algorithm therefore progresses iteratively, through a number of epochs. On each epoch, the training cases are each submitted in turn to the network, the target and actual outputs are compared and the error is calculated. This error, together with the error surface gradient, is used to adjust the weights, and then the process repeats. The initial network configuration is random, and training stops when a given number of epochs elapses, when the error reaches an acceptable level, or when the error stops improving (you can select which of these stopping conditions to use).
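The weight update described above is usually written as follows (the standard textbook form, with the momentum term included; it is not quoted from this thesis):

    \Delta w_{ij}(t) = -\eta \, \frac{\partial E}{\partial w_{ij}} + \alpha \, \Delta w_{ij}(t-1)

where E is the network error, \eta the learning rate and \alpha the momentum coefficient; one such update is applied to every weight in each epoch.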


Figure 2.5 Back propagation

2.10.3 Single layer perceptron
The earliest kind of neural network is a single-layer perceptron network, which consists of a single layer of output nodes; the inputs are fed directly to the outputs via a series of weights. In this way it can be considered the simplest kind of feedforward network. The sum of the products of the weights and the inputs is calculated in each node, and if the value is above some threshold (typically 0) the neuron fires and takes the value 1; otherwise it takes the value -1. Neurons with this kind of activation function are also called McCulloch-Pitts neurons or threshold neurons. In the literature the term perceptron often refers to networks consisting of just one of these units. Perceptrons can be trained by a simple learning algorithm that is usually called the delta rule. It calculates the errors between calculated output and sample output data, and uses this to create an adjustment to the weights, thus implementing a form of gradient descent.
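Written out (a standard formulation rather than one taken from this thesis), the perceptron output is:

    y = \begin{cases} +1 & \text{if } \sum_i w_i x_i \ge \theta \\ -1 & \text{otherwise} \end{cases}

where \theta is the threshold (typically 0); the delta rule mentioned above then adjusts the weights in proportion to the error between this output and the sample output.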

2.10.4 Multi-layer perceptron
This class of networks consists of multiple layers of computational units, usually interconnected in a feed-forward way. This means that each neuron in one layer has directed connections to the neurons of the subsequent layer. In many applications the units of these networks apply a sigmoid function as an activation function.
The universal approximation theorem for neural networks states that every continuous function that maps intervals of real numbers to some output interval of real numbers can be approximated arbitrarily closely by a multi-layer perceptron with just one hidden layer. This result holds only for restricted classes of activation functions, e.g. for the sigmoidal functions.
Multi-layer networks use a variety of learning techniques, the most popular being backpropagation. Here the output values are compared with the correct answer to compute the value of some predefined error function. By various techniques the error is then fed back through the network. Using this information, the algorithm adjusts the weights of each connection in order to reduce the value of the error function by some small amount. After repeating this process for a sufficiently large number of training cycles the network will usually converge to some state where the error of the calculations is small. In this case one says that the network has learned a certain target function.
To adjust weights properly one applies a general method for non-linear optimization called gradient descent. For this, the derivative of the error function with respect to the network weights is calculated, and the weights are then changed such that the error decreases (thus going downhill on the surface of the error function). For this reason backpropagation can only be applied to networks with differentiable activation functions.
In general the problem of teaching a network that performs well, even on examples that were not used as training examples, is a quite subtle issue that requires additional techniques. This is especially important for cases where only very limited numbers of training examples are available. The danger is that the network overfits the training data and fails to capture the true statistical process generating the data. Statistical learning theory is concerned with training classifiers on a limited amount of data. In the context of neural networks a simple heuristic, called early stopping, often ensures that the network will generalize well to examples not in the training set.
Other typical problems of the back-propagation algorithm are the speed of convergence and the possibility of ending up in a local minimum of the error function. Today there are practical solutions that make backpropagation in multi-layer perceptrons the solution of choice for many machine learning tasks.

2.10.7 Boltzmann machine

The Boltzmann machine can be thought of as a noisy Hopfield network. The Boltzmann machine was important because it was one of the first neural networks in which learning of latent variables (hidden units) was demonstrated. Boltzmann machine learning was slow to simulate, but the Contrastive Divergence algorithm of Geoff Hinton allows models including Boltzmann machines and Products of Experts to be trained much faster.
2.10.8 Committee of machines
A committee of machines (CoM) is a collection of different neural networks that together vote on a given example. It has been seen that this gives a much better result. In fact, in many cases, starting with the same architecture and training but different initial random weights gives vastly different networks. A CoM tends to stabilize the result.
2.10.9 Self-organizing map
The self-organizing map (SOM), sometimes referred to as a "Kohonen map", is an unsupervised learning technique that reduces the dimensionality of data through the use of a self-organizing neural network. A probabilistic version of SOM is the Generative Topographic Map (GTM).
2.11 DESIGNING
The developer must go through a period of trial and error in the design decisions before coming up with a satisfactory design. The design issues in neural networks are complex and are the major concerns of system developers. Designing a neural network consists of:

Arranging neurons in various layers.
Deciding the type of connections among neurons for different layers, as well as among the neurons within a layer.
Deciding the way a neuron receives input and produces output.


Determining the strength of the connections within the network by allowing the network to learn the appropriate values of the connection weights using a training data set.

The process of designing a neural network is an iterative process. Below are its basic steps.
2.11.1 Layers
Biologically, neural networks are constructed in a three-dimensional way from microscopic components. These neurons seem capable of nearly unrestricted interconnections. This is not true of any man-made network. Artificial neural networks are simple clusterings of primitive artificial neurons. This clustering occurs by creating layers, which are then connected to one another. How these layers connect may also vary. Basically, all artificial neural networks have a similar structure or topology. Some of the neurons interface with the real world to receive its inputs and other neurons provide the real world with the network's outputs. All the rest of the neurons are hidden from view.

Figure: 2.6 network layers

As the figure above shows, the neurons are grouped into layers. The input layer consists of neurons that receive input from the external environment. The output layer consists of neurons that communicate the output of the system to the user or external environment. There are usually a number of hidden layers between these two layers; the figure above shows a simple structure with only one hidden layer. When the input layer receives the input, its neurons produce output, which becomes input to the other layers of the system. The process continues until a certain condition is satisfied or until the output layer is invoked and fires its output to the external environment.
To determine the number of hidden neurons the network should have to perform at its best, one is often left with the method of trial and error. If you increase the number of hidden neurons too much you will get over-fitting; that is, the net will have problems generalizing: the training set of data will be memorized, making the network useless on new data sets.
2.11.2 Communication and types of connections
Neurons are connected via a network of paths carrying the output of one neuron as input to another neuron. These paths are normally unidirectional; there might, however, be a two-way connection between two neurons, because there may be another path in the reverse direction. A neuron receives input from many neurons, but produces a single output, which is communicated to other neurons. The neurons in a layer may communicate with each other, or they may not have any connections. The neurons of one layer are always connected to the neurons of at least one other layer.

2.11.2.1 Inter-layer connections
There are different types of connections used between layers; these connections between layers are called inter-layer connections.
Fully connected: Each neuron on the first layer is connected to every neuron on the second layer.
Partially connected: A neuron of the first layer does not have to be connected to all neurons on the second layer.
Feed forward: The neurons on the first layer send their output to the neurons on the second layer, but they do not receive any input back from the neurons on the second layer.
Bi-directional: There is another set of connections carrying the output of the neurons of the second layer into the neurons of the first layer. Feed-forward and bi-directional connections could be fully or partially connected.

Hierarchical

If a neural network has a hierarchical structure, the neurons of a lower layer may only communicate with neurons in the next layer up.


Resonance The layers have bi-directional connections, and they can continue sending messages across the connections a number of times until a certain condition is achieved.

2.11.2.2 Intra-layer connections
In more complex structures the neurons communicate among themselves within a layer; this is known as an intra-layer connection. There are two types of intra-layer connections.

Recurrent: The neurons within a layer are fully or partially connected to one another. After these neurons receive input from another layer, they communicate their outputs with one another a number of times before they are allowed to send their outputs to another layer. Generally some conditions among the neurons of the layer should be achieved before they communicate their outputs to another layer.

On-center/off-surround: A neuron within a layer has excitatory connections to itself and its immediate neighbors, and has inhibitory connections to other neurons. One can imagine this type of connection as a competitive gang of neurons. Each gang excites itself and its gang members and inhibits all members of other gangs. After a few rounds of signal interchange, the neuron with an active output value will win and is allowed to update its weights and those of its gang members. (There are two types of connections between two neurons, excitatory or inhibitory. In an excitatory connection, the output of one neuron increases the action potential of the neuron to which it is connected. When the connection type between two neurons is inhibitory, then the output of the neuron

sending a message would reduce the activity or action potential of the receiving neuron. One causes the summing mechanism of the next neuron to add while the other causes it to subtract. One excites while the other inhibits.)

2.11.3 Learning
The brain basically learns from experience. Neural networks are sometimes called machine-learning algorithms, because changing the connection weights (training) causes the network to learn the solution to a problem. The strength of a connection between neurons is stored as a weight value for that specific connection. The system learns new knowledge by adjusting these connection weights. The learning ability of a neural network is determined by its architecture and by the algorithmic method chosen for training. The training method usually consists of one of three schemes:
1. Unsupervised learning
The hidden neurons must find a way to organize themselves without help from the outside. In this approach, no sample outputs are provided to the network against which it can measure its predictive performance for a given vector of inputs. This is learning by doing.
2. Reinforcement learning
This method works on reinforcement from the outside. The connections among the neurons in the hidden layer are randomly arranged, then reshuffled as the network is told how close it is to solving the problem. Reinforcement learning is also called supervised learning, because it requires a teacher. The teacher may be a training

set of data or an observer who grades the performance of the network results. Both unsupervised and reinforcement learning suffer from relative slowness and inefficiency, relying on random shuffling to find the proper connection weights.
3. Back propagation
This method has proven highly successful in training multilayered neural nets. The network is not just given reinforcement for how it is doing on a task; information about errors is also filtered back through the system and is used to adjust the connections between the layers, thus improving performance. It is a form of supervised learning.
2.11.3.1 Off-line or On-line
One can categorize the learning methods into yet another group: off-line or on-line. When the system uses input data to change its weights to learn the domain knowledge, the system is in training (learning) mode. When the system is being used as a decision aid to make recommendations, it is in operation mode; this is also sometimes called recall.

Off-line: In off-line learning methods, once the system enters the operation mode, its weights are fixed and do not change any more. Most networks are of the off-line learning type.

On-line: In on-line or real-time learning, when the system is in operating mode (recall), it continues to learn while being used as a decision tool. This type of learning has a more complex design structure.

2.11.3.2 Learning laws
There are a variety of learning laws in common use. These laws are mathematical algorithms used to update the connection weights. Most of them are some variation of the best-known and oldest learning law, Hebb's Rule. Man's understanding of how neural processing actually works is very limited; learning is certainly more complex than the simplification represented by the learning laws currently developed. Research into different learning functions continues as new ideas routinely show up in trade publications. A few of the major laws are given as examples below.

Hebb's Rule

The first and the best-known learning rule was introduced by Donald Hebb. This basic rule is: If a neuron receives an input from another neuron, and if both are highly active (mathematically have the same sign), the weight between the neurons should be strengthened.
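Expressed symbolically (the usual textbook form, added here for illustration):

    \Delta w_{ij} = \eta \, x_i \, y_j

where x_i is the activation of the sending neuron, y_j that of the receiving neuron, and \eta the learning rate, so the weight grows when both activations carry the same sign.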

Hopfield Law

This law is similar to Hebb's Rule with the exception that it specifies the magnitude of the strengthening or weakening. It states, "if the desired output and the input are both active or both inactive, increment the connection weight by the learning rate, otherwise decrement the weight by the learning rate." (Most learning functions have some provision for a learning rate, or a learning constant. Usually this term is positive and between zero and one.)
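The statement above can be written compactly as (illustrative notation, not taken from the original text):

    \Delta w_{ij} = \begin{cases} +\eta & \text{if input } x_i \text{ and desired output } d_j \text{ are both active or both inactive} \\ -\eta & \text{otherwise} \end{cases}

with \eta the learning rate.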

The Delta Rule

The Delta Rule is a further variation of Hebb's Rule, and it is one of the most commonly used. This rule is based on the idea of continuously modifying the strengths of the input connections to reduce the difference (the delta) between the desired output value and the actual output of a neuron. This rule changes the connection weights in the way that minimizes the mean squared error of the network. The error is back-propagated into previous layers one layer at a time; the process of back-propagating the network errors continues until the first layer is reached. The network type called Feed-forward Back-propagation derives its name from this method of computing the error term. This rule is also referred to as the Widrow-Hoff Learning Rule and the Least Mean Square Learning Rule.
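For a single output neuron the delta rule reads (standard form, shown here for completeness):

    E = \tfrac{1}{2}\,(d - y)^2, \qquad \Delta w_i = \eta \,(d - y)\, x_i

where d is the desired output, y the actual output and x_i the i-th input; the update is exactly the gradient-descent step that reduces the squared error E.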

Kohonen's Learning Law

This procedure, developed by Teuvo Kohonen, was inspired by learning in biological systems. In this procedure, the neurons compete for the opportunity to learn, or to update their weights. The processing neuron with the largest output is declared the winner and has the capability of inhibiting its competitors as well as exciting its neighbors. Only the winner is permitted an output, and only the winner plus its neighbors are allowed to update their connection weights. The Kohonen rule does not require a desired output; therefore it is implemented in unsupervised methods of learning. Kohonen has used this rule combined with the on-center/off-surround intra-layer connection to create the self-organizing neural network, which has an unsupervised learning method.
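The competitive update can be summarized as (standard form, given for illustration):

    \Delta w_{j^*i} = \eta \,(x_i - w_{j^*i})

applied only to the winning neuron j^* (and, usually with a reduced rate, to its neighbours), pulling their weight vectors toward the current input.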


Chapter No. 3
MECHANICAL PROPERTIES OF STEEL AND FACTORS AFFECTING MECHANICAL PROPERTIES
The mechanical properties of materials describe the ability of materials to withstand internal and external physico-mechanical forces such as pulling, pushing, twisting, bending and sudden impact. In general terms, these properties are various kinds of strength. These properties are measured by means of destructive testing of materials in the laboratory; however, it is very difficult to reproduce actual service conditions in the laboratory. The following material properties are of great importance.
3.1.1 Tensile Strength
The ratio of the maximum load to the original cross-sectional area is called the tensile strength or ultimate tensile strength; it reflects the force needed to fracture the material. It relates to the ability of a material to withstand external mechanical forces such as pulling, pushing, twisting, bending and sudden impact, and to its ductility. Tensile strength (ultimate strength) is the maximum point shown on the stress-strain curve (Fig. 3.1c). The tensile strength value is commonly taken as a basis for fixing the working stresses, especially in brittle materials. The units of tensile strength are kg/cm².

Figure 3.1 Yield point and yield strength: (a) low-carbon steel; (b) non-ferrous metals; (c) stress-strain curve showing types of static strength.

3.1.2 Yield Strength
When metals are subjected to a tensile force, they stretch or elongate as the stress increases. The point where the stretch suddenly increases is known as the yield point of the material. The yield strength of a material represents the stress below which the deformation is almost entirely elastic; it is the value of stress at which a material exhibits a specified deviation from proportionality of stress and strain. It can be defined as the ability of a material to resist plastic deformation, and is calculated by dividing the force initiating yield by the original cross-sectional area of the specimen. In materials where the proportional limit or the elastic limit (Fig. 3.1b) is less obvious, it is common to define the yield load as the force required to give a 0.2% plastic offset. In other words, the yield strength is defined as the stress required to produce an arbitrary permanent deformation. The deformation most often used is 0.2% (Fig. 3.1), commonly referred to as the proof strain.
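For reference, the two quantities defined above can be written compactly (standard definitions, added here for clarity):

    \sigma_{UTS} = \frac{P_{max}}{A_0}, \qquad \sigma_{0.2} = \frac{P_{0.2\%\,\text{offset}}}{A_0}

where P_max is the maximum load sustained, P_{0.2% offset} the load that produces a permanent strain of 0.2%, and A_0 the original cross-sectional area of the specimen.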


3.1.3 Elasticity
Loading a solid will change its dimensions, but the resulting deformation will disappear upon unloading. This tendency of a deformed solid to seek its original dimensions upon unloading is described by a property called elasticity. The recovery from the distorting effects of the loads may be instantaneous or gradual, complete or partial. A solid is called perfectly elastic if this recovery is instantaneous and complete; it is said to exhibit delayed elasticity or inelastic effects, respectively, if the recovery is gradual or incomplete. Accurate measurements reveal some delayed elasticity and inelastic effects in all solids.

3.1.4 Plasticity
Plasticity is that property of a material by virtue of which it may be permanently deformed when it has been subjected to an externally applied force great enough to exceed the elastic limit. It is of great importance to the fabrication engineer because it is the property that enables a material to be shaped in the solid state. For most materials, plastic deformation follows the elastic deformation. Referring to the stress-strain curve (Fig. 3.1), a material obeys the law of elastic solids for stresses below the yield stress, and this is followed by plastic deformation. The mechanism of plastic deformation is essentially different in crystalline materials and amorphous materials. Crystalline materials undergo plastic deformation as the result of slip along definite crystallographic planes, whereas in amorphous materials plastic deformation occurs when individual molecules or groups of molecules slide past one another.
3.1.5 Ductility
Ductility refers to the capability of a material to undergo deformation under tension without rupture. It is the ability of a material to be drawn from a large section to a small section, such as in wire drawing. Ductility may be expressed as percent elongation (%EL) or percent area reduction (%AR). From a tensile test:
Percent elongation, %EL = [(l - lo) / lo] x 100

Percent area reduction, %AR = [(Ao - A) / Ao] x 100
where l is the gauge length at fracture, lo is the original gauge length, Ao is the original cross-sectional area, and A is the cross-sectional area at the point of fracture.
Ductility is a measure of the degree of plastic deformation that has been sustained at fracture. Knowledge of the ductility of materials is important for at least two reasons: first, it indicates to a designer the degree to which a structure will deform plastically before fracture; second, it specifies the degree of allowable deformation during fabrication operations.
3.1.6 Brittleness
Brittleness is defined as a tendency to fracture without appreciable deformation and is therefore the opposite of ductility or malleability. A brittle material will fracture with little permanent deformation or distortion; it is a sudden failure. A brittle material is hard and has little ductility. It will not stretch or bend before breaking. Cast iron is an example of a brittle material. If a material can be mechanically worked to a different size or shape without breaking or shattering, it is ductile or malleable; but if little or no change in the dimensions can be made before fracture occurs, it is brittle. Technically speaking, if an elongation of less than 5% takes place in a 50 mm gauge length, the material is regarded as brittle. Brittle fractures normally follow the grain boundaries (intergranular or intercrystalline), whereas ductile fractures normally occur through the grains (transgranular or transcrystalline).
3.1.7 Toughness
Toughness is the ability of a material to absorb energy during plastic deformation up to fracture. It refers to the ability of a material to withstand bending or the application of shear stresses without fracture. By this definition, copper is extremely tough but cast iron is not. Specimen geometry as well as the manner of load application is important in toughness determinations. For dynamic loading conditions and when a notch (or point of

stress concentration) is present, notch toughness is assessed by using an impact test. Furthermore, fracture toughness is a property indicative of a material's resistance to fracture when a crack is present. For the static situation, toughness may be ascertained from the results of a tensile stress-strain test: the toughness of a material is then indicated by the total area under the material's tensile stress-strain curve up to the point of fracture.
3.1.8 Hardness
Hardness is the resistance of a material to penetration or scratching. However, the term may also refer to stiffness or temper, or to resistance to abrasion or cutting. Tests such as Brinell, Rockwell, Vickers, etc., are generally employed to measure hardness. The hardness of materials depends upon the type of bonding forces between atoms, ions or molecules and increases, like strength, with the magnitude of these forces. Thus molecular solids such as plastics are relatively soft, metallic and ionic solids are harder than molecular solids, and covalent solids are the hardest materials known.
3.1.9 Fatigue
When subjected to fluctuating or repeated loads (or stresses), materials tend to develop a characteristic behavior which is different from that under steady load. Fatigue is the phenomenon that leads to fracture under such conditions. Fracture takes place under repeated or fluctuating stresses whose maximum value is less than the tensile strength of the material (under steady load). Fatigue fracture is progressive, beginning as minute cracks that grow under the action of the fluctuating stress. The term fatigue is used because this type of failure normally occurs after a lengthy period of repeated stress or strain cycling. Fatigue is important inasmuch as it is the single largest cause of failure in metals (bridges, aircraft and machine components), estimated to comprise approximately 90% of all metallic failures; it is catastrophic and insidious, occurring very suddenly and without warning. Fatigue failure is brittle-like in nature even in normally ductile metals, in that there is very little, if any, gross plastic deformation associated with failure. The process occurs by the initiation and propagation of cracks, and ordinarily the fracture surface is perpendicular to the direction of an applied tensile stress.

3.1.10 Creep
Creep is the time-dependent permanent deformation that occurs under stress; for most materials, it is important only at elevated temperatures. Materials are often placed in service at elevated temperatures and exposed to static mechanical stresses (e.g., turbine rotors in jet engines and steam generators that experience centrifugal stresses, and high-pressure steam lines). Deformation under such circumstances is termed creep.
3.2 FACTORS AFFECTING MECHANICAL PROPERTIES
Mechanical properties of materials are affected by:
1. Alloy content; additions such as W, Cr, etc. improve hardness and strength.
2. Grain size and microstructure.
3. Crystal imperfections such as dislocations.
4. Manufacturing defects such as cracks, blowholes, etc.
5. Physico-mechanical treatments.
3.2.1 Effect Of Grain Size On Properties Of Metals
On the basis of grain size, materials may be classified as:
1. Coarse-grained materials (the grain size is large).
2. Fine-grained materials (the grain size is small).
Grain size is very important in deciding the properties of polycrystalline materials because it affects the area and length of the grain boundaries. Various effects of grain size on the mechanical properties of metals are:
1. Fine-grained materials possess higher strength, toughness, hardness and resistance to suddenly applied forces.
2. Fine-grained materials possess better fatigue resistance and impact strength.
3. Fine-grained materials are more crack-resistant and provide a better finish in deep drawing, unlike coarse-grained ones, which give rise to the orange-peel effect.
4. Fine-grained steel develops hardness faster in carburising (heat treatment).
5. Fine-grained materials are preferred for structural applications.
6. Fine-grained materials generally exhibit greater yield stresses than coarse-grained materials at low temperature, whereas at high temperatures grain boundaries become weak and sliding occurs.
7. A coarse-grained material is responsible for surface roughness.

8. A coarse-grained material possesses more ductility, malleability (forging, rolling, etc.) and better machinability.
9. Coarse-grained metals are difficult to polish or plate (a rough surface is visible even after polishing).
10. Coarse-grained steels have greater depth of hardening power compared to fine-grained ones.
11. At elevated temperatures, coarse-grained materials show better creep strength than fine-grained ones.
3.2.2 Effect Of Heat Treatment On Properties Of Metals
Heat treatment is an operation or combination of operations involving the heating and cooling of a metal or alloy in the solid state to obtain a desirable behavior or set of properties. It affects the grain size and shape in the metal, and a change in microstructure may or may not take place. By controlling the grain size and type of microstructure, the desired mechanical properties can be achieved. Some important heat-treatment processes are: annealing, normalizing, hardening, tempering, martempering, austempering, etc.
One or another of these heat-treatment processes produces the following effects on the properties of metals:
1. Hardens and strengthens the metals.
2. Improves machinability.
3. Changes or refines grain size.
4. Softens metals for further working, as in wire drawing.
5. Improves ductility and toughness.
6. Increases resistance of materials to heat, wear, shock and corrosion.
7. Improves electrical and magnetic properties.
8. Homogenises the metal structure.
9. Relieves internal stresses developed in metals / alloys during cold working, welding, casting, forging, etc.
10. Produces a hard, wear-resistant surface on a ductile steel piece (as in case hardening).
11. Improves thermal properties such as conductivity.
3.2.3 Effect of environmental variables
Gaseous environment: The atmosphere contains mainly nitrogen and oxygen, to which are added gaseous products such as sulphur dioxide, hydrogen sulphide, moisture, chlorine, fluorine, etc., as industrial and other pollutants. On account of oxygen, an oxide film forms on metals. In the presence of humid air, an oxide film (rust) can be seen on the surface of mild steel, which is not desirable.
Liquid environment: When exposed to a moist (and saline) atmosphere, metals may corrode. Corrosion is a gradual chemical attack on a metal under the influence of a moist atmosphere (or of a natural or artificial solution).
Working temperature: When exposed to a very cold atmosphere, even ductile metals may behave like brittle metals. Water pipes in very cold countries often burst; this is an effect of atmospheric exposure. When metals are subjected to a very hot atmosphere there is:
1. Accelerated oxidation and / or corrosion.
2. Creep.
3. Grain boundary weakening.
4. Allotropic and other phase changes.
5. Change of conventional properties.
6. Reduction in tensile strength and yield point.

3.2.4 Effect of alloying elements

Carbon With an increase in the amount of carbon, the hardness and tensile strength of the steel also increase (which slows as the level of carbon rises). An increase in carbon thusly causes a decrease in both ductility and weldability.

Manganese: Will also increase hardness as levels increase, but not to the same degree as carbon. Ductility and weldability are decreased but, again, to a lesser degree than with carbon.
Phosphorus: Benefits machinability and resistance to atmospheric corrosion. It increases strength and hardness, much like carbon, but it decreases ductility and impact strength (toughness). Phosphorus is often considered an impurity except in specific situations.
Sulphur: Like phosphorus, sulphur is generally undesired, except where machinability is an important goal for the steel. Ductility, impact strength (toughness), weldability, and surface quality are all adversely affected by sulphur content.
Silicon: Serves as a principal deoxidizer in steel. Its content depends upon the steel type; killed steel has the highest percentage of silicon, upwards of 0.60 percent.
Copper: The sole purpose of copper is to increase resistance to atmospheric corrosion. It does not significantly affect mechanical properties, but it causes brittleness in the steel at high temperatures, thereby negatively affecting surface quality.
Chromium (Cr): Increases the steel's hardenability and corrosion resistance, and provides wear and abrasion resistance in the presence of carbon. It is largely present in stainless steels, usually ranging from 12 to 20%.
Molybdenum (Mo): Its use as an alloying element in steel increases hardenability.
Nickel (Ni): One of the most widely used alloying elements in steel. In amounts of 0.50% to 5.00% its use in alloy steels increases the toughness and tensile strength without detrimental effect on the ductility. Nickel also increases the hardenability. In larger quantities, 8.00% and upwards, nickel is a constituent, together with chromium, of many corrosion-resistant and stainless austenitic steels.

Titanium (Ti): Small amounts added to steel contribute to its soundness and give a finer grain size. Titanium carbide is also used with tungsten carbide in the manufacture of hard-metal tools.
Tungsten (W): When used as an alloying element it increases the strength of steel at normal and elevated temperatures. Its "red hardness" makes it suitable for cutting tools, as it enables the tool edge to be maintained at high temperatures.
Vanadium (V): Steels containing vanadium have a much finer grain structure than steels of similar composition without vanadium. It raises the temperature at which grain coarsening sets in and increases hardenability where it is in solution in the austenite prior to quenching. It also lessens softening on tempering and confers secondary hardness on high-speed steels.
In the present study only the effect of composition on the mechanical properties has been considered.


Chapter No. 4
TESTING TECHNIQUES
4.1 TENSILE TEST
Strength is defined as the ability of a material to resist applied forces without yielding or fracturing. By convention, strength usually denotes the resistance of a material to a tensile load applied axially to a specimen; this is the principle of the tensile test. Figure 4.1 shows a sophisticated testing machine suitable for industrial and research laboratories. This machine is capable of performing compression, shear and bending tests as well as tensile tests. Such machines apply a carefully controlled tensile load to a standard specimen and measure the corresponding extension of that specimen.

Figure 4.1 Testing machine

Figure 4.2 and Figure 4.3 show some standard specimens and the direction of the applied load. These specimens are based upon British Standard BS 18. For the test results to be consistent for any given material, it is most important that the standard dimensions and profiles are adhered to. The shoulder radii are particularly critical, and small variations, or the presence of tooling marks, can cause considerable differences in the test data obtained. Flat specimens are usually machined only on their edges so that the

plate or sheet surface finish, and any structural deformation at the surface caused by the rolling process, are taken into account in the test results. The gauge length is the length over which the elongation of the specimen is measured. The minimum parallel length is the minimum length over which the specimen must maintain a constant cross-sectional area before the test load is applied. The lengths Lo, Lc, L1 and the cross-sectional area (a) are all specified in BS 18. Cylindrical test specimens are proportioned so that the gauge length Lo and the cross-sectional area a maintain a constant relationship; hence such specimens are called proportional test pieces. The relationship is given by the expression:

Lo = 5.56 √a

Since a = 0.25 π d², √a = 0.886 d, and thus

Lo = 5.56 × 0.886 d = 4.93 d ≈ 5 d

Therefore a specimen 5 mm in diameter will have a gauge length of approximately 25 mm. The elongation obtained for a given force depends upon the length and cross-sectional area of the specimen or component, since

Elongation = (Force × L) / (a × E)

where L = length, a = cross-sectional area and E = elastic modulus. Therefore, if L/a is kept constant (as it is in a proportional test piece) and E remains constant for a given material, then comparisons can be made between elongation and applied force for specimens of different sizes.
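The proportional test-piece arithmetic above is easy to check numerically. A minimal MATLAB sketch is given below for illustration only; the applied force and elastic modulus are assumed values, not part of BS 18:

>> d  = 5;                      % specimen diameter, mm
>> a  = 0.25*pi*d^2;            % original cross-sectional area, mm^2
>> Lo = 5.56*sqrt(a)            % gauge length, mm (comes out close to 5*d)
>> F  = 2000;                   % applied axial force, N (assumed value)
>> E  = 200e3;                  % elastic modulus of steel, N/mm^2 (typical value)
>> elongation = F*Lo/(a*E)      % elastic elongation over the gauge length, mm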


Figure 4.2 Tensile test specimen ( round )

Figure 4.3 Tensile test specimen ( flat )

4.1.1 Tensile test results
The load applied to the specimen and the corresponding extension are plotted in the form of a graph, as shown in Fig. 4.4.
(a) From a to b the extension is proportional to the applied load. Also, if the applied load is removed the specimen returns to its original length. Under these relatively lightly loaded conditions the material is showing elastic properties.
(b) From b to c it can be seen from the graph that the metal suddenly extends with no increase in load. If the load is removed at this point the metal will not spring back to its original length and it is said to have taken a permanent set. This is the yield point, and the yield stress, which is the stress at the yield point, is the load at b divided by the original cross-sectional area of the specimen. Usually the designer works at about 50 percent of this figure to allow for a factor of safety.

(c) From c to d extension is no longer proportional to the load, and if the load is removed little or no spring back will occur. Under these relatively greater loads the material is showing plastic properties.
(d) The point d is referred to as the ultimate tensile strength when referring to load/extension graphs, or the ultimate tensile stress (UTS) when referring to stress/strain graphs. The ultimate tensile stress is calculated by dividing the load at d by the original cross-sectional area of the specimen. Although a useful figure for comparing the relative strengths of materials, it has little practical value since engineering equipment is not usually operated so near to the breaking point.
(e) From d to e the specimen appears to be stretching under reduced load conditions. In fact the specimen is thinning out (necking) so that the load per unit area, or stress, is actually increasing. The specimen finally work hardens to such an extent that it breaks at e.
In practice, values of load and extension are of limited use since they apply only to one particular size of specimen, and it is more usual to plot the stress/strain curve. (An example of a stress/strain curve for a low-carbon steel is shown in Fig. 4.4.) Stress and strain are calculated as follows:

Stress = load / original cross-sectional area

Strain = extension / original length

Figure 4.4 Load / extension curve for low-carbon steel
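As a worked illustration of the stress and strain definitions above, the following MATLAB lines convert a set of load / extension readings into engineering stress and strain and plot the curve. The numerical readings are invented purely for illustration; they are not measured data from this work:

>> d    = 10;  Lo = 50;               % specimen diameter and gauge length, mm
>> area = 0.25*pi*d^2;                % original cross-sectional area, mm^2
>> P    = [0 5 10 15 18]*1e3;         % applied loads, N (assumed readings)
>> ext  = [0 0.02 0.04 0.06 0.30];    % measured extensions, mm (assumed readings)
>> stress = P/area;                   % engineering stress, N/mm^2 (MPa)
>> strain = ext/Lo;                   % engineering strain
>> plot(strain, stress), xlabel('strain'), ylabel('stress (MPa)')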

4.1.2 Proof stress


Only very ductile materials such as fully annealed mild steel show a clearly defined yield point. The yield point will not even appear on bright-drawn low-carbon steel, which has become slightly work hardened during the drawing process. Under such circumstances the proof stress is used. Proof stress is defined as the stress which produces a specified amount of plastic strain, such as 0.1 or 0.2 percent. Figure 4.5 shows a typical stress/strain curve for a material of relatively low ductility, such as a hardened and tempered medium-carbon steel. If a point such as C is taken, the corresponding strain is given by D, and this consists of a combination of plastic and elastic components. If the stress is now gradually reduced (by reducing the load on the specimen), the strain is also reduced, and the stress/strain relationship during this reduction in stress is represented by the line CB. During the reduction in stress the elastic deformation is recovered, so that the line CB is straight and parallel to the initial stages of the loading curve for the material, that is, the part of the loading curve where the material is showing elastic properties. In the example shown, the stress at C has produced a plastic strain of 0.2 percent, as represented by AB. Thus the stress at C is referred to as the 0.2 percent proof stress, AB being the plastic deformation and BD being the elastic deformation when the specimen is stressed to the point C. The material will have fulfilled its specification if, after the proof stress has been applied for 15 seconds and removed, the permanent set of the specimen is not greater than the specified percentage of the gauge length which, in this example, is 0.2 percent.
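The offset construction described above can also be carried out numerically. The MATLAB sketch below is our own illustration with made-up stress/strain data: it displaces the initial elastic line by 0.2 percent strain and interpolates for the stress at which the measured curve meets that line, which is the 0.2 percent proof stress:

>> strain = [0 0.001 0.002 0.004 0.006 0.010];   % measured strain (assumed data)
>> stress = [0 200   380   500   540   560  ];   % measured stress, MPa (assumed data)
>> E  = stress(2)/strain(2);                     % slope of the initial elastic line
>> d  = stress - E*(strain - 0.002);             % height of the curve above the 0.2% offset line
>> k  = find(d <= 0, 1);                         % first point at or below the offset line
>> proofStress = stress(k-1) + (stress(k)-stress(k-1))*d(k-1)/(d(k-1)-d(k))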

Figure 4.5 Proof stress

4.1.3 The interpretation of tensile test results
The interpretation of tensile test data requires skill borne of experience, since many factors can affect the test results: for instance, the temperature at which the test is carried out, since the tensile modulus and tensile strength decrease as the temperature rises for most metals and plastics, whereas the ductility increases as the temperature rises. The test results are also influenced by the rate at which the specimen is strained.

Figure 4.6(a) shows a typical stress/strain curve for annealed mild steel. From such a curve the following information can be deduced.
(a) The material is ductile since there is a long plastic range.
(b) The material is fairly rigid since the slope of the initial elastic range is steep.
(c) The limit of proportionality (elastic limit) occurs at about 230 MPa.
(d) The upper yield point occurs at about 260 MPa.
(e) The lower yield point occurs at about 230 MPa.
(f) The ultimate tensile stress (UTS) occurs at about 400 MPa.
Figure 4.6(b) shows a typical stress/strain curve for a gray cast iron. From such a curve the following information can be deduced.
(a) The material is brittle since there is little plastic deformation before it fractures.
(b) Again, the material is fairly rigid since the slope of the initial elastic range is steep.
(c) It is difficult to determine the point at which the limit of proportionality occurs, but it is approximately 200 MPa.
(d) The ultimate tensile stress (UTS) is the same as the breaking stress for this sample. This indicates negligible reduction in cross-section (necking) and minimal ductility and malleability. It occurs at approximately 250 MPa.
Figure 4.6(c) shows a typical stress/strain curve for a wrought light alloy. From such a curve the following information can be deduced.

(a) The material has a high level of ductility since it shows a long plastic range.
(b) The material is much less rigid than either low-carbon steel or cast iron, since the slope of the initial elastic range is much less steep when plotted to the same scale.
(c) The limit of proportionality is almost impossible to determine, so proof stress is specified instead. For this sample the 0.2 percent proof stress is approximately 500 MPa (AB).
The tensile test can also yield other important facts about a material

under test. ( a ) Stress/strain curve for annealed mild steel

( b) Stress/strain curve for gray cast iron


( c ) Stress/strain curve for light alloy Figure 4.6 Typical stress/strain curves

4.1.4 The effect of grain size and structure on tensile testing
The test piece should be chosen so that it reflects as closely as possible the component and the material from which the component is produced. This is relatively easy for components produced from bar stock, but not so easy for components produced from forgings, as the grain flow will be influenced by the contour of the component and will not be uniform. Castings also present problems, since the properties of a specially cast test piece are unlikely to reflect those of the actual casting. This is due to the difference in size and the corresponding difference in cooling rates. The lay of the grain in rolled bar and plate can greatly affect the tensile strength and other properties of a specimen taken from them. Figure 4.7 shows the relative grain orientation for transverse and longitudinal test pieces. The tensile strength for the

longitudinal test piece is substantially greater than that of the transverse test piece, a factor which the designer of large fabrications must take into account. Figure 4.8 shows the effect of processing upon the properties of a material. A low-carbon steel of high ductility, in the annealed condition, shows the classical stress/strain curve with a pronounced yield point and a long plastic deformation range. The same material, after finishing by cold drawing, no longer shows a yield point, and the plastic range is noticeably reduced.

Figure 4.7 Effect of grain orientation on material testing

( i ) Annealed low-carbon steel ( ii ) Cold-drawn low-carbon steel Figure 4.8 effect of processing on the properties of low-carbon steel

Figure 4.9 shows the effect of heat treatment upon the properties of a medium-carbon steel. In this example the results have been obtained by quench hardening a batch of identical specimens and then tempering them at different temperatures. Figure 4.10 shows the effect of heat treatment upon the properties of a work-hardened metallic material. Stress relief (recovery) has very little effect upon the tensile strength and elongation (ductility) until the recrystallization (annealing) temperature is reached. The metal initially shows the high tensile strength and lack of ductility associated with a severely distorted grain structure. After stress relief the tensile strength rises and the ductility falls until the recrystallization temperature range is reached. During the recrystallization range there is a marked change in properties. The tensile strength is rapidly reduced and the ductility, in terms of elongation percentage, rapidly increases.

Figure 4.9 Effect of tempering on tensile test


Figure 4.10 Effect of temperature on cold-worked material

4.2 IMPACT TESTING
The tensile test does not tell the whole story. Figure 4.12 shows how a piece of

high-carbon steel rod will bend when in the annealed condition yet snap easily in the quench-hardened condition, despite the fact that in the latter condition it will show a much higher value of tensile strength. Impact tests consist of striking a suitable specimen with a controlled blow and measuring the energy absorbed in bending or breaking the specimen. The energy value indicates the toughness of the material under test. Figure 4.11 shows a typical impact-testing machine. This machine has a hammer which is suspended like a pendulum, a vice for holding the specimen in the correct position relative to the hammer, and a dial for indicating the energy absorbed in carrying out the test in joules (J). If there is maximum over-swing, as there would be if no specimen were placed in the vice, then zero energy absorption is indicated. If the hammer is stopped by the specimen with no over-swing, then maximum energy absorption is indicated. Intermediate readings are the impact values (J) of the materials being tested (their toughness or lack of brittleness). There are two standard tests currently in use.
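The dial reading described above is simply the difference in potential energy of the pendulum hammer before and after the blow. The following MATLAB fragment illustrates the idea; the mass, arm length and swing angles are assumed values for illustration, not data for any particular machine:

>> m  = 22;                     % effective mass of the hammer, kg (assumed)
>> g  = 9.81;  L = 0.8;         % gravitational acceleration, m/s^2, and arm length, m (assumed)
>> a0 = 140;                    % release angle from the vertical, degrees (assumed)
>> a1 = 95;                     % over-swing angle after breaking the specimen, degrees (assumed)
>> h0 = L*(1 - cosd(a0));       % height through which the hammer falls, m
>> h1 = L*(1 - cosd(a1));       % height to which it rises after impact, m
>> absorbed = m*g*(h0 - h1)     % energy absorbed by the specimen, J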


Figure 4.11 Typical impact testing machine

(a)

(b)

Figure 4.12 Impact loading
( a ) A piece of high-carbon steel rod (1.0% C) in the annealed (soft) condition will bend when struck with a hammer. UTS 925 MPa.
( b ) The same piece of high-carbon steel rod, as in ( a ), after hardening and light tempering will fracture when hit with a hammer, despite its UTS having increased to 1285 MPa.

4.2.1 The Izod Test
In this test a 10 mm square, notched specimen is used. The striker of the pendulum hits the specimen with a kinetic energy of 162.72 J at a velocity of 3.8 m/s. Figure 4.13 shows details of the specimen and the manner in which it is supported.

Detail of notch

Section of test piece Position of the striker

Figure 4.13 Izod test

4.2.2 The Charpy Test
In the Izod test the specimen is supported as a cantilever, but in the Charpy test it is supported as a beam. It is struck with a kinetic energy of 298.3 J at a velocity of 5 m/s. Figure 4.14 shows details of the Charpy test specimen and the manner in which it is supported. Since both tests use a notched specimen, useful information can be obtained regarding the resistance of the material to the spread of a crack, which may originate from a point of stress concentration such as sharp corners, undercuts, sudden changes in section, and machining marks in stressed components. Such points of stress concentration should be eliminated during design and manufacture.

Figure 4.14 Charpy test

4.2.3 The interpretation of impact tests
The results of an impact test should specify the energy used to bend or break the specimen and the particular test used, i.e. Izod or Charpy. In the case of the Charpy test it is also necessary to specify the type of notch used, as this test allows for three types of notch, as shown in Fig. 4.15. A visual examination of the fractured surface after the test also provides useful information.
(a) Brittle metals: A clean break with little deformation and little reduction in cross-sectional area at the point of fracture. The fractured surface will show a granular structure.
(b) Ductile metals: The fracture will be rough and fibrous. In very ductile materials the fracture will not be complete, the specimen bending over and showing only slight tearing from the notch. There will also be some reduction in cross-sectional area at the point of fracture or bending.

Figure 4.15 Standard charpy notches

The temperature of the specimen at the time of making the test also has an important influence on the test results. Figure 4.16 shows the embrittlement of low-carbon steels at refrigerated temperatures, and hence their unsuitability for use in refrigeration plant and space vehicles.

4.2.4 The effect of processing on toughness
Impact tests are frequently used to determine the effectiveness of annealing temperatures on the grain structure and impact strength of cold-worked ductile metals. In the case of cold-worked low-carbon steel, the impact strength is initially quite low, as the heavily deformed grain structure will be relatively brittle and lacking in ductility, particularly if the limit of cold working has been approached. Annealing at low temperatures has little effect, as it only promotes recovery of the crystal lattice on the atomic scale and does not result in recrystallization. In fact, during recovery there may even be a slight reduction in the impact strength. However, at about 550 °C to 650 °C recrystallization of low-carbon steels occurs with only slight grain growth. Annealing in this temperature range results in the impact strength increasing dramatically, as shown in Fig. 4.17, and the appearance of the fracture changes from that of a brittle material to that of a ductile material. Annealing at higher temperatures, or prolonged soaking at the lower annealing temperature, results in grain growth and a corresponding fall in impact strength.

Figure 4.16 Effect of temperature on toughness The effect of tempering on the impact value of a quench hardened high carbon steel is shown in fig.4.18. Initially, only stress relief occurs but as the tempering temperature increases, the toughness also increases which is why cutting tools are tempered. Tempering modifies the extremely hard and brittle martensitic structure of

49 quench hardened plain carbon steels and causes a considerable increase in toughness with very little loss of hardness.

Figure 4.17 Effect of Annealing on the toughness of low-carbon steel

Figure 4.18 Effect of tempering on quench-hardened high-carbon steel

4.3 HARDNESS TESTING
Hardness is defined as the resistance of a material to indentation or abrasion by another hard body. It is by indentation that most hardness tests are performed. A hard

indenter is pressed into the specimen by a standard load, and the magnitude of the indentation (either area or depth) is taken as a measure of hardness.

4.3.1 The Brinell hardness test
In this test, hardness is measured by pressing a hard steel ball into the surface of the test piece, using a known load. It is important to choose the combination of load and ball size carefully so that the indentation is free from distortion and suitable for measurement. The relationship between load P (kg) and the diameter D (mm) of the hardened ball indenter is given by the expression:

P/D² = K

where K is a constant. Typical values of K are:
Ferrous metals                        K = 30
Copper and copper alloys              K = 10
Aluminium and aluminium alloys        K = 5
Lead, tin, and white bearing metals   K = 1
Thus, for steel, a load of 3000 kg is required if a 10 mm diameter ball indenter is used. Figure 4.20 shows the principle of the Brinell hardness test. The diameter of the indentation d is measured in two directions at right angles and the average taken. The hardness number HB is the load divided by the spherical area of the indentation, which can be calculated knowing the values of d and D. In practice, conversion tables are used to translate the value of the diameter d directly into hardness numbers HB.
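Choosing the load from P/D² = K and converting a measured indentation diameter into a hardness number can be sketched in MATLAB as follows. The indentation diameter below is an assumed reading, and the expression used is the standard Brinell spherical-area formula:

>> D = 10;  K = 30;                          % ball diameter, mm, and K for ferrous metals
>> P = K*D^2                                 % required load, kg (3000 kg for steel)
>> d = 4.0;                                  % measured indentation diameter, mm (assumed)
>> HB = 2*P/(pi*D*(D - sqrt(D^2 - d^2)))     % Brinell hardness number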


Figure 4.19 Brinell hardness tester

Figure 4.20 Brinell hardness principle

To ensure consistent results the following precautions should be observed: (a) (b) (c) the thickness of the specimen should be at least seven times the depth of the indentation to allow unrestricted plastic flow below the indenter; the edge of the indentation should be at least three times the diameter of the indentation from the edge of the test piece; the test is unsuitable for materials whose hardness exceeds 500 HB, as the ball indenter tends to flatten.

4.3.1.1 Machinability
With high-speed steel cutting tools the hardness of the stock being cut should not be excessive, while very soft stock (around HB = 100) will tend to tear and leave a poor surface finish.

4.3.1.2 Relationship Between Hardness And Tensile Strength
There is a definite relationship between strength and hardness, and the ultimate tensile stress (UTS) of a component can be approximated as follows:
UTS (MPa) = HB × 3.54 (for annealed plain-carbon steels)
          = HB × 3.25 (for quench-hardened and tempered plain-carbon steels)
          = HB × 5.6  (for ductile brass alloys)
          = HB × 4.2  (for wrought aluminium alloys)
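These rule-of-thumb factors are easy to apply directly; a small MATLAB illustration follows, in which the hardness value is assumed:

>> HB = 200;                    % measured Brinell hardness (assumed value)
>> UTS_annealed = 3.54*HB       % approximate UTS, MPa, for an annealed plain-carbon steel
>> UTS_hardened = 3.25*HB       % approximate UTS, MPa, for a quench-hardened and tempered steel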

Figure: 4.21 Work-hardening capacity

4.3.1.3 Work-hardening capacity
Materials which will cold work without work hardening unduly will pile up round the indenter, as shown in Fig. 4.21(a). Materials which work-harden readily will sink around the indenter, as shown in Fig. 4.21(b).

4.3.2 The Vickers hardness test
This test is preferable to the Brinell test where hard materials are concerned, as it uses a diamond indenter. (Diamond is the hardest material known, approximately 6000 HB.) The diamond indenter is in the form of a square-based pyramid with an angle of 136° between opposite faces. Since only one type of indenter is used, the load has to be varied for different hardness ranges. Standard loads are 5, 10, 20, 30, 50 and 100 kg. It is necessary to state the load when specifying a Vickers hardness number. For example, if the hardness number is found to be 200 when using a 50 kg load, then the hardness number is written HD (50) = 200. Figure 4.22(a) shows a universal hardness testing machine suitable for performing both Brinell and Vickers hardness tests, whilst Fig. 4.22(b) shows the measuring screen for determining the distance across the corners of the indentation. The screen can be rotated so that two readings at right angles can be taken, and the average is used to determine the hardness number (HD). This is calculated by dividing the load by the sloping (contact) area of the indentation, which gives HD = 1.854 P / D², where D = the average diagonal (mm) and P = load (kg).

Figure 4.22 Micro Vicker And Vicker Hardness Testers

4.3.3 The Rockwell hardness test
Although not so reliable as the Brinell and Vickers hardness tests for laboratory purposes, the Rockwell test is widely used in industry, as it is quick, simple, and direct reading. Figure 4.23 shows a typical hardness indicating scale. Universal electronic hardness testing machines are now widely used which, at the turn of a switch, can provide either Brinell, Vickers, or Rockwell tests and which show the hardness number as a digital readout automatically. They also give a hard-copy printout of the test result together with the test conditions and date. However, the mechanical testing machines described in this chapter are still widely used and will be for some time to come. In principle the Rockwell hardness test compares the difference in depth of penetration of the indenter when using forces of two different values; that is, a minor force is first applied (to take up the backlash and pierce the skin of the component) and the scales are set to read zero. Then a major force is applied over and above the minor force, and the increased depth of penetration is shown on the scales of the machine as a direct reading of hardness, without the need for calculation or conversion tables. The indenters most commonly used are a 1.588 mm diameter hard steel ball and a diamond cone with an apex angle of 120°. The minor force in each instance is 98 N. Table 4.1 gives the combinations of type of indenter and additional (major) force for the range of Rockwell scales, together with typical applications. The B and C scales are the most widely used in engineering. The standard Rockwell test cannot be used for very thin sheet and foils, and for these the Rockwell Superficial Hardness Test is used. The minor force is reduced from 98 N to 29.4 N and the major force is also reduced. Typical values are listed in Table 4.2.


Figure 4.23 Rockwell hardness tester

Scale   Indenter          Additional force (kN)   Applications
A       Diamond cone      0.59                    Steel sheet; shallow case-hardened components
B       Ball, 1.588 mm    0.98                    Copper alloys; aluminium alloys; annealed low-carbon steels
C       Diamond cone      1.47                    Most widely used range: hardened steels; cast irons; deep case-hardened components
D       Diamond cone      0.98                    Thin but hard steel; medium-depth case-hardened components
E       Ball, 3.175 mm    0.98                    Cast iron; aluminium alloys; magnesium alloys; bearing metals
F       Ball, 1.588 mm    0.59                    Annealed copper alloys; thin soft sheet metals
G       Ball, 1.588 mm    1.47                    Malleable irons; phosphor bronze; gun-metal; cupro-nickel alloys, etc.
H       Ball, 3.175 mm    0.59                    Soft materials; aluminium, lead, zinc
K       Ball, 3.175 mm    1.47                    Aluminium and magnesium alloys

Table 4.1 Rockwell hardness test conditions

Scale   Indenter          Additional force (kN)
15-N    Diamond cone      0.14
30-N    Diamond cone      0.29
45-N    Diamond cone      0.44
15-T    Ball, 1.588 mm    0.14
30-T    Ball, 1.588 mm    0.29
45-T    Ball, 1.588 mm    0.44

Table 4.2 Rockwell superficial hardness test conditions

4.3.4 Shore scleroscope
In the tests previously described, the test piece must be small enough to mount in the testing machine, and hardness is measured as a function of indentation. The scleroscope, however, works on a different principle: hardness is measured as a function of resilience. Further, since the scleroscope can be carried to the workpiece, it is useful for testing large surfaces such as the slideways on machine tools. A diamond-tipped hammer of mass 2.5 g drops through a height of 250 mm. The height of the first rebound indicates the hardness on a 140-division scale.

4.3.4.1 The effect of processing on hardness
All metals work-harden to some extent when cold-worked. Figure 4.24 shows the relationship between the Vickers hardness number (HD) and the percentage reduction in thickness for rolled strip. The metals become harder and more brittle as the amount of cold working increases, until a point is reached where the metal is so hard and brittle that cold working cannot be continued. Aluminium reaches this state when a 60 percent reduction in strip thickness is achieved in one pass through the rolls of a rolling mill. In this condition the material is said to be fully work-hardened. The degree of work hardening, or temper, of strip and sheet material is arbitrarily stated as soft (fully annealed), quarter hard, half hard, three-quarters hard, and hard (fully work-hardened). The effect of heating a work-hardened material such as brass is shown in Fig. 4.25. Once again very little effect occurs until the temperature of recrystallisation is reached. At this temperature there is a rapid fall off in hardness, after which the decline in hardness becomes more gradual as grain growth occurs and the metal becomes fully annealed. The effect of heating a quench-hardened plain-carbon steel is more gradual, as shown in Fig. 4.26. During the tempering range of the steel no grain growth occurs, but there are structural changes. Initially, there is a change in the very hard martensite as particles of carbide precipitate out. As tempering proceeds and the temperature is increased, the structure loses its acicular martensitic appearance and spheroidal carbide particles in a matrix of ferrite can be seen under high magnification. These structural changes increase the toughness of the metal considerably, but with some loss of hardness.

Figure 4.24 Effect of cold-working on the hardness of various metals


Chapter No. 5
USING NEURAL NETWORK TOOL BOX
5.1 INTRODUCTION
Neural networks involve a very large amount of calculation, in the shape of manipulating data for training and verification, and this becomes much easier with the help of a computer. The manipulation can be carried out in any programming language, e.g. C++, Fortran, Matlab, etc. In this work Matlab is used for this purpose because Matlab is a very powerful tool for mathematical calculation, visualization and programming. In addition to the pure mathematical part of Matlab there are several toolboxes available to expand its capabilities; the Neural Network Toolbox (NN Toolbox) is one of these toolboxes. This chapter is intended to give students unacquainted with Matlab and the Neural Network Toolbox some practice in using these tools. The contents of this chapter are focused more on practical examples and problems than on theory. The reason for this is that most of the theory is covered in the help files of Matlab itself. In fact, one will not be able to learn Matlab and the NN Toolbox from this tutorial alone; one will have to actively explore the documentation and demos available in Matlab. What Matlab lacks is a set of examples and problems that helps the user learn to use the tools.

5.2 THE STRUCTURE OF THE NEURAL NETWORK TOOLBOX
The toolbox is based on the network object. This object contains information about everything that concerns the neural network. Type network at the Matlab command prompt, and an empty network will be created and its parameters will be shown.
>> network

ans =

Neural Network object:

architecture:


numInputs: 0 numLayers: 0 biasConnect: [] inputConnect: [] layerConnect: [] outputConnect: [] targetConnect: []

numOutputs: 0 numTargets: 0 numInputDelays: 0 numLayerDelays: 0

(read-only) (read-only) (read-only) (read-only)

First the architecture parameters are shown. Because the network command creates an empty network, all parameters are set to 0. The subobject structures follow:
subobject structures:

inputs: {0x1 cell} of inputs layers: {0x1 cell} of layers outputs: {1x0 cell} containing no outputs targets: {1x0 cell} containing no targets biases: {0x1 cell} containing no biases inputWeights: {0x0 cell} containing no input weights layerWeights: {0x0 cell} containing no layer weights

This paragraph lists the subobject structures: the various inputs, layers, outputs, targets, biases, and input and layer weights.
functions:

adaptFcn: (none) initFcn: (none) performFcn: (none) trainFcn: (none)

The next paragraph is interesting: it contains the training, initialization and performance functions. The trainFcn and adaptFcn functions are essentially the same, but trainFcn will be used in this tutorial. By setting the trainFcn parameter you tell Matlab which training algorithm it should use. The ANN toolbox includes almost 20 training functions. The performance function is the function that determines how well the ANN is doing its task. The initFcn is the function that initializes the weights and biases of the network. To get a list of the functions that are available, type help nnet. To change one of these functions to another one in the toolbox, or to one that you have created, just assign the name of the function to the parameter, e.g.
net.trainFcn = 'mytrainingfun';

The parameters that concerns these functions are listed in the next paragraph.
parameters:

adaptParam: (none) initParam: (none) performParam: (none) trainParam: (none)

By changing these parameters you can change the default behavior of the functions mentioned above. The parameters you will use the most are probably the components of trainParam. The most used of these are net.trainParam.epochs, which tells the algorithm the maximum number of epochs to train, and net.trainParam.show that tells the algorithm how many epochs there should be between each presentation of the performance. Type help train for more information. The weights and biases are also stored in the network structure:
weight and bias values:

IW: {0x0 cell} containing no input weight matrices LW: {0x0 cell} containing no layer weight matrices b: {0x1 cell} containing no bias vectors

other:

userdata: (user stuff)

The .IW component is a cell array that holds the weights between the input layer and the first hidden layer. The .LW component holds the weights between the hidden layers and the output layer.


5.3

CONSTRUCTING LAYERS It is assumed that you have an empty network object named `net' in your

workspace; if not, type >> net = network; to get one. Let's start with defining the properties of the input layer. The NNT supports networks that have multiple input layers. Let's set this to 1:
>> net.numInputs = 1;

Now we should define the number of neurons in the input layer. This should of course be equal to the dimensionality of your data set. The appropriate property to set is
net.inputs{i}.size, where i is the index of the input layers. So to make a network,

which has 2 dimensional points as inputs, type:


>> net.inputs{1}.size = 2;

This defines (for now) the input layer. The next properties to set are net.numLayers, which not surprisingly sets the total number of layers in the network, and net.layers{i}.size, which sets the number of neurons in the ith layer. To build our example network, we define 2 extra layers (a hidden layer with 3 neurons and an output layer with 1 neuron), using:
>> net.numLayers = 2; >> net.layers{1}.size = 3; >> net.layers{2}.size = 1;

For details refer to Appendix B.

5.3.2 Connecting Layers
Now it's time to define which layers are connected. First, define to which layer the inputs are connected by setting net.inputConnect(i) to 1 for the appropriate layer i (usually the first, so i = 1). The connections between the rest of the layers are defined by a connectivity matrix called net.layerConnect, which can have either 0 or 1 as element entries. If element (i,j) is 1, then the outputs of layer j are connected to the inputs of layer i. We also have to define which layer is the output layer by setting
net.outputConnect(i) to 1 for the appropriate layer i.

64 Finally, if we have a supervised training set, we also have to define which layers are connected to the target values. (Usually, this will be the output layer.) This is done by setting net.targetConnect(i) to 1 for the appropriate layer i. So, for our example, the appropriate commands would be
>> net.inputConnect(1) = 1; >> net.layerConnect(2, 1) = 1; >> net.outputConnect(2) = 1; >> net.targetConnect(2) = 1;

5.4

SETTING TRANSFER FUNCTIONS Each layer has its own transfer function which is set through the

net.layers{i}.transferFcn property. So to make the first layer use sigmoid transfer

functions, and the second layer linear transfer functions, use


>> net.layers{1}.transferFcn = 'logsig'; >> net.layers{2}.transferFcn = 'purelin';

For detail refer appendix B

5.5

WEIGHTS AND BIASES Now, define which layers have biases by setting the elements of

net.biasConnect to either 0 or 1, where net.biasConnect(i) = 1 means layer i has

biases attached to it. To attach biases to each layer in our example network, we'd use
>> net.biasConnect = [ 1 ; 1];

Now you should decide on an initialization procedure for the weights and biases. When done correctly, you should be able to simply issue a
>> net = init(net);

to reset all weights and biases according to your choices. The first thing to do is to set net.initFcn. Unless you have built your own initialization routine, the value 'initlay' is the way to go. This lets each layer of weights and biases use its own initialization routine.
>> net.initFcn = 'initlay';

Exactly which function this is should of course be specified as well. This is done through the property net.layers{i}.initFcn for each layer. The two most practical options here are Nguyen-Widrow initialization ('initnw', type 'help initnw' for details), or 'initwb', which lets you choose the initialization for each set of weights and biases separately. When using 'initnw' you only have to set
>> net.layers{i}.initFcn = 'initnw';

For each layer i When using 'initwb', you have to specify the initialization routine for each set of weights and biases separately. The most common option here is 'rands', which sets all weights or biases to a random number between -1 and 1. First, use
>> net.layers{i}.initFcn = 'initwb';

For each layer i. Next, define the initialization for the input weights,
>> net.inputWeights{1,1}.initFcn = 'rands';

And for each set of biases


>> net.biases{i}.initFcn = 'rands';

And weight matrices


>> net.layerWeights{i,j}.initFcn = 'rands';

Where net.layerWeights{i,j} denotes the weights from layer j to layer i. 5.6 TRAINING FUNCTIONS & PARAMETERS The difference between train and adapt One of the more counterintuitive aspects of the NNT is the distinction between
train and adapt. Both functions are used for training a neural network, and most of the

time both can be used for the same network. What then is the difference between the two? The most important one has to do with incremental training (updating the weights after the presentation of each single training sample) versus batch training (updating the weights after each presentation of the complete data set). When using adapt, both incremental and batch training can be used. Which one is actually used depends on the format of your training set. If it consists of two matrices of input and target vectors, like
>> P = [ 0.3 0.2 0.54 0.6 ; 1.2 2.0 1.4 1.5]


P =

0.3000 1.2000

0.2000 2.0000

0.5400 1.4000

0.6000 1.5000

>> T = [ 0 1 1 0 ]

T =

     0     1     1     0

The network will be updated using batch training. (In this case, we have 4 samples of 2 dimensional input vectors, and 4 corresponding 1D target vectors). If the training set is given in the form of a cell array,
>> P = {[0.3 ; 1.2] [0.2 ; 2.0] [0.54 ; 1.4] [0.6 ; 1.5]}

P =

[2x1 double]

[2x1 double]

[2x1 double]

[2x1 double]

>> T = { [0] [1] [1] [0] }

T =

[0]

[1]

[1]

[0]

Then incremental training will be used. When using train on the other hand, only batch training will be used, regardless of the format of the data (you can use both). The big plus of train is that it gives you a lot more choice in training functions (gradient descent, gradient descent w/ momentum, Levenberg-Marquardt, etc.), which are implemented very efficiently. So when you don't have a good reason for doing incremental training, train is probably your best choice. (And it usually saves you setting some parameters). The most important difference between adapt and train is the difference between passes and epochs. When using adapt, the property that determines how many

67 times the complete training data set is used for training the network is called
net.adaptParam.passes. But, when using train, the exact same property is now called net.trainParam.epochs.

5.6.1 Performance Functions The two most common options here are the Mean Absolute Error (mae) and the Mean Squared Error (mse). The mae is usually used in networks for classification, while the mse is most commonly seen in function approximation networks. The performance function is set with the net.performFcn property, for instance:
>> net.performFcn = 'mse';

5.6.2 Train Parameters If you are going to train your network using train, the last step is defining
net.trainFcn, and setting the appropriate parameters in net.trainParam. Which

parameters are present depends on your choice for the training function. So if you for example want to train your network using a Gradient Descent w/ Momentum algorithm, you'd set
>> net.trainFcn = 'traingdm';

And then set the parameters


>> net.trainParam.lr = 0.1; >> net.trainParam.mc = 0.9;

To the desired values (In this case, lr is the learning rate, and mc the momentum term.) Two other useful parameters are net.trainParam.epochs, which is the maximum number of times the complete data set may be used for training, and
net.trainParam.show, which is the time between status reports of the training function.

For example,
>> net.trainParam.epochs = 1000; >> net.trainParam.show = 100;

5.6.3 Adapt Parameters The same general scheme is also used in setting adapt parameters. First, set
net.adaptFcn to the desired adaptation function. We'll use adaptwb (from 'adapt

weights and biases'), which allows for a separate update algorithm for each layer. Again, check the Matlab documentation for a complete overview of possible update algorithms.

>> net.adaptFcn = 'adaptwb';

Next, since we're using adaptwb, we'll have to set the learning function for all weights and biases:
>> net.inputWeights{1,1}.learnFcn = 'learnp'; >> net.biases{1}.learnFcn = 'learnp';

Where in this example we've used learnp, the Perceptron learning rule. (Type 'help
learnp', etc.)

Finally, a useful parameter is net.adaptParam.passes, which is the maximum number of times the complete training set may be used for updating the network:
>> net.adaptParam.passes = 10;

5.7

BASIC NEURAL NETWORK EXAMPLE The task is to create and train a neural network that solves the XOR problem. XOR is a function that returns 1 when the two inputs are not equal see table 5.1. Table 5.1: The XOR-problem A B A XOR B 1 1 0 1 0 1 0 1 1 0 0 0

To solve this we will need a feedforward neural network with two input neurons and one output neuron. Because the problem is not linearly separable, it will also need a hidden layer with two neurons. Now we know what our network should look like, but how do we create it? To create a new feedforward neural network use the command newff. You have to enter the max and min of the input values, the number of neurons in each layer and, optionally, the activation functions.


>> net = newff([0 1; 0 1],[2 1],{'logsig','logsig'})

The variable net will now contain an untrained feedforward neural network with two neurons in the input layer, two neurons in the hidden layer and one output neuron, exactly as we want it. The [0 1; 0 1] tells matlab that the input values ranges between 0 and 1. The {'logsig','logsig'} tells matlab that we want to use the logsig function as activation function in all layers. The first parameter tells the network how many nodes there should be in the input layer, hence you do not have to specify this in the second parameter. You have to specify at least as many transfer functions as there are layers, not counting the input layer. If you do not specify any transfer function Matlab will use the default settings.

Figure 5.1: The logsig activation function Now we want to test how good our untrained network is on the XOR problem. First we construct a matrix of the inputs. The input to the network is always in the columns of the matrix. To create a matrix with the inputs "1 1", "1 0", "0 1" and "0 0" we enter:
>> input = [1 1 0 0; 1 0 1 0]

input =

1 1

1 0

0 1

0 0

Now we have constructed inputs to our network. Let us push these into the network to see what it produces as output. The command sim is used to simulate the network and calculate the outputs, for more information on how to use the command type
helpwin sim. The simplest way to use it, is to enter the name of the neural network and

input matrix, it returns an output matrix.


>> output=sim(net,input)

output =

0.5923

0.0335

0.9445

0.3937

The output was not exactly what we wanted! We wanted (0 1 1 0) but got near to (0.60 0.03 0.95 0.40). (Note that your network might give a different result, because the network's weights are given random values at the initialization.) You can now plot the output and the targets; the targets are the values that we want the network to generate. Construct the target vector:
>> target = [0 1 1 0]

target =

     0     1     1     0

To plot points we use the command "plot". We want the targets to be shown as small circles, so we use the command:
>> plot(target, 'o')

We want to plot the output in the same window. Normally the contents in a window are erased when you plot something new in it. In this case we want the targets to remain in the picture so we use the command hold on. The output is plotted as +'s.

>> hold on

>> plot(output, '+')

In the resulting figure (Fig5.2) it's easy to see that the network does not give the wanted results. To change this we have to train it. Now we will train the network by hand by adjusting the weights manually.

Figure 5.2: The targets and the actual output from an untrained XOR network. The targets are represented as 'o' and the output as '+'

5.7.1

Manually set weights The network we have constructed so far does not really behave as it should. To

correct this the weights will be adjusted. All the weights are stored in the net structure that was created with newff. The weights are numbered by the layers they connect and the neurons within these layers. To get the values of the weights between the input layer and the first hidden layer we type:
>> net.IW

ans =

[2x2 double] []

>> net.IW{1,1}

ans =

5.5008 2.5404

-5.6975 -7.5011

This means that the weight from the second neuron in the input layer to the first neuron in the hidden layer is -5.6975. To change it to 1, enter:
>> net.IW{1,1}(1,2)=1; >> net.IW{1,1}

ans =

5.5008 2.5404

1.0000 -7.5011

The weights between the hidden layers and the output layer are stored in the .LW component, which can be used in the same manner as .IW.
>> net.LW

ans =

[] [1x2 double]

[] []

>> net.LW{2,1}

ans =

-3.5779

-4.3080

The change we made in the weight makes our network give another output when we simulate it; try it by entering:
>> output=sim(net,input)

output =


0.8574

0.0336

0.9445

0.3937

>> plot(output,'g*');

Now the new output will appear as green stars in your picture, are they closer to the o's than the +'s were?

5.7.2

Training Algorithms In the neural network toolbox there are several training algorithms already

implemented. That is good, because they can do the heavy work of training much more smoothly and faster than we can by manually adjusting the weights. Now let us apply the default training algorithm to our network. The Matlab command to use is train; it takes the network, the input matrix and the target matrix as input. The train command returns a new trained network. For more information type helpwin train. In this example we do not need all the information that the training algorithm shows, so we turn it off by entering:
>> net.trainParam.show=NaN;

The most important training parameters are .epochs which determines the maximum number of epochs to train, .show the interval between each presentation of training progress. If the gradient of the performance is less than .min_grad the training is ended. The .time component determines the maximum time to train. And to train the network enter:
>> net = train(net,input,target);

Because of the small size of the network, the training is done in only a second or two. Now we try to simulate the network again, to see how it reacts to the inputs:
>> output = sim(net,input)

output =

0.0000

1.0000

1.0000

0.0000

That was exactly what we wanted the network to output! You may now plot the output and see that the +'s fall on the o's. Now examine the weights that the training algorithm has set; do they look like the weights that you found?
>> net.IW{1,1}

ans =

11.0358 16.8909

-9.5595 -17.5570

>> net.LW{2,1}

ans =

25.9797

-25.7624

It is also possible to enter the name of the training algorithm when the network is created, see help newff for more information

5.8 GRAPHICAL USER INTERFACE


5.8.1 Introduction to the GUI The graphical user interface (GUI) is designed to be simple and user friendly, but we will go through a simple example to get you started. In what follows you bring up a GUI Network/Data Manager window. This window has its own work area, separate from the more familiar command line workspace. Thus, when using the GUI, you might "export" the GUI results to the (command line) workspace. Similarly you may want to "import" results from the command line workspace to the GUI. Once the Network/Data Manager is up and running, you can create a network, view it, train it, simulate it and export the final results to the workspace. Similarly, you can import data from the workspace for use in the GUI.

75 The following example deals with a perceptron network. We go through all the steps of creating a network and show you what you might expect to see as you go along. 5.8.2 Create a Perceptron Network (nntool) We create a perceptron network to perform the AND function in this example. It has an input vector p= [0 0 1 1;0 1 0 1] and a target vector t=[0 0 0 1]. We call the network ANDNet. Once created, the network will be trained. We can then save the network, its output, etc., by "exporting" it to the command line.
5.8.3 Input and target

To start, type nntool. The following window appears.

Figure 5.3 Network data manager

76 Click on Help to get started on new problem and see descriptions of the buttons and lists. First, we want to define the network input, which we call p, as having the particular value [0 0 1 1;0 1 0 1]. Thus, the network had a two-element input and four sets of such two-element vectors are presented to it in training. To define this data, click on New Data, and a new window, Create New Data appears. Set the Name to p, the Value to [0 0 1 1;0 1 0 1], and make sure that Data Type is set to Inputs.The Create New Data window will then look like this:

Figure 5.4 Create new data window Now click Create to actually create an input file p. The Network/Data Manager window comes up and p shows as an input. Next we create a network target. Click on New Data again, and this time enter the variable name t, specify the value [0 0 0 1], and click on Target under data type. Again click on Create and you will see in the resulting Network/Data Manager window that you now have t as a target as well as the previous p as an input.

5.8.4 Create Network

Now we want to create a new network, which we will call ANDNet.To do this, click on New Network, and a CreateNew Network window appears. Enter ANDNet under Network Name. Set the Network Type to Perceptron, for that is the kind of network we want to create. The input ranges can be set by entering numbers in that field, but it is easier to get them from the particular input data that you want to use. To do this, click on the down arrow at the right side of Input Range. This pull-down menu shows that you can get the input ranges from the file p if you want. That is what we want to do, so click on p. This should lead to input ranges [0 1;0 1].We want to use a hardlim transfer function and a learnp learning function, so set those values using the arrows for Transfer function and Learning function respectively. By now your Create New Network window should look like:

Figure 5.5 Create new network window Next you might look at the network by clicking on View. For example:


Figure 5.6 View network window This picture shows that you are about to create a network with a single input (composed of two elements), a hardlim transfer function, and a single output. This is the perceptron network that we wanted. Now click Create to generate the network. You will get back the Network/Data Manager window. Note that ANDNet is now listed as a network.

5.8.5

Train the Perceptron To train the network, click on ANDNet to highlight it. Then click on Train. This

leads to a new window labeled Network:ANDNet. At this point you can view the network again by clicking on the top tab Train. You can also check on the initialization by clicking on the top tab Initialize. Now click on the top tab Train. Specify the inputs and output by clicking on the left tab Training Info and selecting p from the pop-down list of inputs and t from the pull-down list of targets. The Network:ANDNet window should look like:


Figure 5.7 Main network window Note that the Training Result Outputs and Errors have the

name ANDNet appended to them. This makes them easy to identify later when they are exported to the command line. While you are here, click on the Training Parameters tab. It shows you parameters such as the epochs and error goal. You can change these parameters at this point if you want. Now click Train Network to train the perceptron network. You will see the following training results.


Figure 5.8 training result window Thus, the network was trained to zero error in four epochs. (Note that other kinds of networks commonly do not train to zero error and their error commonly cover a much larger range. On that account, we plot their errors on a log scale rather than on a linear scale such as that used above for perceptrons.) You can check that the trained network does indeed give zero error by using the input p and simulating the network. To do this, get to the Network/Data Manager window and click on Network Only: Simulate). This will bring up the Network:ANDNet window. Click there on Simulate. Now use the Input pull-down menu to specify p as the input, and label the output as ANDNet_outputsSim to distinguish it from the training output. Now click Simulate Network in the lower right corner. Look at the Network/Data Manager and you will see a new variable in the

output: ANDNet_outputsSim. Double-click on it and a small window Data:ANDNet_outputsSim appears with the value

[0 0 0 1]

Thus, the network does perform the AND of the inputs, giving a 1 as an output only in this last case, when both inputs are 1. 5.8.6 Export Perceptron Results to Workspace To export the network outputs and errors to the MATLAB command line workspace, click in the lower left of the Network:ANDNet window to go back to the Network/Data Manager. Note that the output and error for the ANDNet are listed in the Outputs and Error lists on the right side. Next click on Export This will give you an Export or Save from Network/Data Manager window. Click on ANDNet_outputs and
ANDNet_errors to highlight them, and then click the Export button. These two variables

now should be in the command line workspace. To check this, go to the command line and type who to see all the defined variables. The result should be
who

Your variables are:

ANDNet_errors  ANDNet_outputs

You might type ANDNet_outputs and ANDNet_errors to obtain the following


ANDNet_outputs = 0 0 0 1

and
ANDNet_errors =

0 0 0 0.

You can export p, t, and ANDNet in a similar way. You might do this and check with who to make sure that they got to the command line. Now that ANDNet is exported you can view the network description and examine the network weight matrix. For instance, the command
ANDNet.iw{1,1}

gives
ans = 2 1

Similarly,
ANDNet.b{1}

yields
ans = -3.

5.8.7

Clear Network/Data Window You can clear the Network/Data Manager window by highlighting a variable

such as p and clicking the Delete button until all entries in the list boxes are gone. By doing this, we start from a clean slate. Alternatively, you can quit MATLAB. A restart with a new MATLAB session, followed by nntool, gives a clean Network/Data Manager window.

Recall, however, that we exported p, t, etc., to the command line from the perceptron example. They are still there for your use even after you clear the Network/Data Manager.

5.8.8 Importing from the Command Line

To make things simple, quit MATLAB. Start it again, and type nntool to begin a new session. Create a new vector.
r = [0; 1; 2; 3]
r =
     0
     1
     2
     3

Now click on Import, and set the destination Name to R (to distinguish between the variable named at the command line and the variable in the GUI). You will have a window that looks like this


Figure 5.9 Import / load window

Now click Import and verify by looking at the Network/Data Manager that the variable R is there as an input.

5.8.9 Save a Variable to a File and Load It Later

Bring up the Network/Data Manager and click on New Network. Set the name to mynet. Click on Create. The network name mynet should appear in the Network/Data Manager. In this same manager window click on Export. Select mynet in the variable list of the Export or Save window and click on Save. This leads to the Save to a MAT file window. Save to the file mynetfile. Now let's get rid of mynet in the GUI and retrieve it from the saved file. First go to the Network/Data Manager, highlight mynet, and click Delete. Next click on Import. This brings up the Import or Load to Network/Data Manager window. Select the Load from Disk button and type mynetfile as the MAT-file Name. Now click on Browse. This brings up the Select MAT file window with mynetfile as an option that you can select as a variable to be imported. Highlight mynetfile, press Open, and you return to the Import or Load to Network/Data Manager window. On the Import As list, select Network. Highlight mynet and click on Load to bring mynet to the GUI. Now mynet is back in the GUI Network/Data Manager window.
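
The same save-and-reload round trip can also be done entirely from the command line once mynet has been exported to the workspace. A minimal sketch (mynetfile matches the file name used above):

save('mynetfile','mynet')     % writes mynet into mynetfile.mat
clear mynet                   % removes it from the workspace
load('mynetfile')             % reads mynet back from mynetfile.mat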


Chapter No. 6
EXPERIMENTAL WORK
The data shown in table 6.1 was used to train the network. A separate portion of the data, shown in table 6.2, was held back as unseen data for testing the trained network.

6.1 DATA SET USED
C = carbon (minimum and maximum), Mn = manganese (minimum and maximum), P = phosphorus, S = sulphur, T-S = tensile strength

S.No.  SAE No.  C-min  Mn-min  P-max  S-max  T-S
1      1006     0.08   0.25    0.04   0.05   43000
2      1008     0.1    0.25    0.04   0.05   44000
3      1009     0.15   0.6     0.04   0.05   43000
4      1010     0.08   0.3     0.04   0.05   47000
5      1012     0.1    0.3     0.04   0.05   48000
6      1015     0.13   0.3     0.04   0.05   50000
7      1016     0.13   0.6     0.04   0.05   55000
8      1017     0.15   0.3     0.04   0.05   53000
9      1018     0.15   0.6     0.04   0.05   58000
10     1019     0.15   0.7     0.04   0.05   59000
11     1020     0.18   0.3     0.04   0.05   55000
12     1022     0.18   0.7     0.04   0.05   62000
13     1023     0.2    0.3     0.04   0.05   56000
14     1024     0.19   1.35    0.04   0.05   74000
15     1025     0.22   0.3     0.04   0.05   58000
16     1027     0.22   1.2     0.04   0.05   75000
17     1030     0.28   0.6     0.04   0.05   68000
18     1033     0.3    0.7     0.04   0.05   72000
19     1035     0.32   0.6     0.04   0.05   72000
20     1037     0.32   0.7     0.04   0.05   74000
21     1038     0.35   0.6     0.04   0.05   75000
22     1039     0.37   0.7     0.04   0.05   79000
23     1040     0.37   0.6     0.04   0.05   76000
24     1042     0.4    0.6     0.04   0.05   80000
25     1043     0.4    0.7     0.04   0.05   82000
26     1045     0.43   0.6     0.04   0.05   82000
27     1046     0.43   0.7     0.04   0.05   85000
28     1050     0.48   0.6     0.04   0.05   90000
29     1052     0.47   1.2     0.04   0.05   108000
30     1055     0.5    0.6     0.04   0.05   94000
31     1060     0.55   0.6     0.04   0.05   98000
32     1065     0.6    0.6     0.04   0.05   100000
33     1070     0.65   0.6     0.04   0.05   102000
34     1074     0.7    0.5     0.04   0.05   105000
35     1078     0.72   0.3     0.04   0.05   100000
36     1084     0.8    0.6     0.04   0.05   119000
37     1085     0.8    0.7     0.04   0.05   121000
38     1086     0.8    0.3     0.04   0.05   112000
39     1090     0.85   0.6     0.04   0.05   122000
40     1095     0.9    0.3     0.04   0.05   120000

Table 6.1. Data set to be used for training the network


S.No.  SAE No.  C-min  Mn-min  P-max  S-max  T-S
1      1021     0.18   0.6     0.04   0.05   61000
2      1026     0.22   0.6     0.04   0.05   64000
3      1036     0.3    1.2     0.04   0.05   83000
4      1041     0.36   1.35    0.04   0.05   92000
5      1049     0.46   0.6     0.04   0.05   87000
6      1064     0.6    0.5     0.04   0.05   97000
7      1080     0.75   0.6     0.04   0.05   112000

Table 6.2. Data set to be used as unseen data for the network

6.2 METHODOLOGY

The data shown in table 6.1 was used to train the network, while the performance of the network was tested on the data shown in table 6.2.

6.2.1 Algorithm

% p1, p2, p3 and p4 are the vectors holding the values of C, Mn, P and S respectively %
p1=[0.08 0.1 0.15 0.08 0.1 0.13 0.13 0.15 0.15 0.15 0.18 0.18 0.2 0.19 0.22 0.22 0.28 0.3 0.32 0.32 0.35 0.37 0.37 0.4 0.4 0.43 0.43 0.48 0.47 0.5 0.55 0.6 0.65 0.7 0.72 0.8 0.8 0.8 0.85 0.9];
p2=[0.25 0.25 0.6 0.3 0.3 0.3 0.6 0.3 0.6 0.7 0.3 0.7 0.3 1.35 0.3 1.2 0.6 0.7 0.6 0.7 0.6 0.7 0.6 0.6 0.7 0.6 0.7 0.6 1.2 0.6 0.6 0.6 0.6 0.5 0.3 0.6 0.7 0.3 0.6 0.3];
p3=[0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04];
p4=[0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05];
% t holds the target tensile strength values for the above inputs; they are used for training %
t=[43000 44000 43000 47000 48000 50000 55000 53000 58000 59000 55000 62000 56000 74000 58000 75000 68000 72000 72000 74000 75000 79000 76000 80000 82000 82000 85000 90000 108000 94000 98000 100000 102000 105000 100000 119000 121000 112000 122000 120000];
p=[p1;p2;p3;p4];
% Creating the net with one tansig neuron in the hidden layer, one purelin neuron in the output layer, and trainlm as the training function %
net=newff(minmax(p),[1 1],{'tansig','purelin'},'trainlm');
** Warning in INIT **
Network "input{1}.range" has a row with equal min and max values.
Constant inputs do not provide useful information.
net.trainparam.show=10;
net.trainparam.goal=0.001;
% Training the network with inputs p and target t %
net=train(net,p,t);
TRAINLM, Epoch 0/100, MSE 6.6312e+009/0.001, Gradient 3.15473e+006/1e-010
TRAINLM, Epoch 4/100, MSE 5.57749e+008/0.001, Gradient 5.05325e-006/1e-010
TRAINLM, Maximum MU reached, performance goal was not met.

Figure 6.1 Training of the network with the TRAINLM function and 1 neuron

% Defining input vectors (the unseen steels of table 6.2) for testing %
in1=[0.18;0.6;0.04;0.05];
in2=[0.22;0.6;0.04;0.05];
in3=[0.3;1.2;0.04;0.05];
in4=[0.36;1.35;0.04;0.05];
in5=[0.46;0.6;0.04;0.05];
in6=[0.6;0.5;0.04;0.05];
in7=[0.75;0.6;0.04;0.05];
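
The seven test vectors can also be gathered into one matrix and simulated in a single call, which returns all predictions at once. A minimal sketch equivalent to the per-sample calls below (ptest and ypred are names introduced here for illustration):

ptest = [in1 in2 in3 in4 in5 in6 in7];   % each column is one unseen steel from table 6.2
ypred = sim(net, ptest)                  % row vector of the seven predicted tensile strengths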

NUMBER OF NEURONS = 1

% Simulating the network with inputs 1, 2, 3, 4, 5, 6 and 7 %
y=sim(net,in1)


y = 7.7270e+004
y=sim(net,in2)
y = 7.7270e+004
y=sim(net,in3)
y = 8.5667e+004
y=sim(net,in4)
y = 8.5667e+004
y=sim(net,in5)
y = 7.7270e+004
y=sim(net,in6)
y = 7.7270e+004
y=sim(net,in7)
y = 7.8113e+004


NUMBER OF NEURONS = 7

Figure 6.2 Training of the network with 7 neurons

% Simulating the network with inputs 1, 2, 3, 4, 5, 6 and 7 %
y=sim(net,in1)
y = 5.2638e+004
y=sim(net,in2)
y = 6.8513e+004
y=sim(net,in3)
y = 6.8513e+004


y=sim(net,in4)
y = 6.8513e+004
y=sim(net,in5)
y = 8.8258e+004
y=sim(net,in6)
y = 1.0500e+005
y=sim(net,in7)
y = 8.8258e+004

NUMBER OF NEURONS = 9

Figure 6.3 Training of the network with 9 neurons

% Simulating the network with inputs 1, 2, 3, 4, 5, 6 and 7 %
y=sim(net,in1)
y = 5.5400e+004
y=sim(net,in2)
y = 8.2920e+004
y=sim(net,in3)
y = 1.0800e+005

y=sim(net,in4)
y = 1.0800e+005
y=sim(net,in5)
y = 8.2920e+004
y=sim(net,in6)
y = 8.2920e+004
y=sim(net,in7)
y = 8.2920e+004

Figure 6.4 the network


Chapter No. 7
RESULTS AND CONCLUSION
7.1 RESULTS
S.No.  SAE No.  C-min  Mn-min  P-max  S-max  T-S (act)  1 neuron  7 neurons  9 neurons
1      1021     0.18   0.6     0.04   0.05   61000      77270     52638      55400
2      1026     0.22   0.6     0.04   0.05   64000      77270     68513      82920
3      1036     0.3    1.2     0.04   0.05   83000      85667     68513      108000
4      1041     0.36   1.35    0.04   0.05   92000      85667     68513      108000
5      1049     0.46   0.6     0.04   0.05   87000      77270     88258      82920
6      1064     0.6    0.5     0.04   0.05   97000      77270     105000     82920
7      1080     0.75   0.6     0.04   0.05   112000     78113     88258      82920
Average Error                                           18.6062   16.26343   18.26556
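
The average error figures in the last row can be obtained by comparing the predicted tensile strengths with the actual ones. A minimal sketch, assuming the error is expressed as a mean absolute percentage (the variable names are introduced here for illustration):

ts_act  = [61000 64000 83000 92000 87000 97000 112000];   % actual T-S of the unseen steels (table 6.2)
ts_pred = [77270 77270 85667 85667 77270 77270 78113];     % e.g. the 1-neuron predictions from section 6.2
avg_err = mean(abs(ts_pred - ts_act)./ts_act)*100          % average percentage error over the seven steels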

Weights to layer 1 from input
net.iw{1,1}
ans =
  1.0e+004 *
    0.8415   -0.2418    0.0978    0.1222
    0.5361   -5.3917   -0.1597   -0.1996
   -4.1999    2.2293   -0.2352   -0.2940
    0.0239    0.0299    4.8301   -1.5508
   -0.2510   -0.9045   -0.0411   -0.0513
   -1.7587    3.4961   -0.1782   -0.2228
    0.1268    0.0877    0.1428    0.1585
    0.1096    0.1785   -0.9265   -2.8471
    0.5230   -6.9870   -2.4701   -6.0088

Weights to layer 2 from layer 1
net.lw{2,1}
[9468.7077 27719.2353 -11582.5085 16750 3650.5453 -20946.5199 -12540 16540 30300]

Bias to layer 1
net.b{1}
[24454.0639; -39904.3358; -58806.4363; 5984.5838; -10268.4537; -44553.1615; 31696.3873; 21915.2677; 35697.9616]

Bias to layer 2
net.b{2}
[54322.0444]


Graph 7.1 Comparisons of actual value and predicted value with 1 neuron


Graph 7.2 Comparisons of actual value and predicted value with 7 neurons

Graph 7.3 Comparisons of actual value and predicted value with 9 neurons



Graph 7.4 Regression line for the predicted and actual values

7.2 CONCLUSION

The mechanical properties of plain carbon steels were predicted using a feed forward back propagation artificial neural network, with an average error of 18.6062 with 1 neuron, 16.26343 with 7 neurons and 18.26556 with 9 neurons. One reason for these errors is the use of constant data for sulphur and phosphorus, which provide no useful information to the network. Using more parameters and more experimental data can reduce the errors. The overall performance of the neural networks was very satisfactory; they are a highly significant and beneficial tool in the design, development and analysis of plain carbon steels and can help increase efficiency and productivity.
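
Because the phosphorus and sulphur inputs are constant for every steel, they trigger the INIT warning seen in section 6.2.1 and carry no information. A minimal sketch of dropping them before training (assuming p and t are arranged as in section 6.2.1, with rows C, Mn, P, S; p_reduced and net2 are names introduced here for illustration):

p_reduced = p([1 2],:);                                               % keep only the carbon and manganese rows
net2 = newff(minmax(p_reduced),[7 1],{'tansig','purelin'},'trainlm'); % same architecture, two informative inputs
net2 = train(net2, p_reduced, t);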

7.3 FUTURE WORK

In future, more data and parameters can be used to upgrade the present work, such as:
- Process of manufacturing
- Heat treatment performed
- Type of product
- Mechanical working done

A genetic algorithm may be applied to obtain reverse predictions, such as obtaining the composition by using mechanical properties as the input parameters.
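
Before a full genetic algorithm is implemented, a plain numerical search over the composition can already give a feel for such reverse predictions. A hedged sketch using MATLAB's fminsearch rather than a genetic algorithm (the target strength and starting guess are illustrative; P and S are held at their constant values):

target_ts = 90000;                                              % desired tensile strength
f = @(x) (sim(net,[x(1); x(2); 0.04; 0.05]) - target_ts)^2;     % squared error between prediction and target
x_best = fminsearch(f,[0.4; 0.7])                               % searched values of [C; Mn]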

APPENDICES

APPENDIX A
DEFINITION OF TERMS

Activation
The time varying value that is the output of a neuron.

Backpropagation (generalized delta-rule)
A name given to the process by which the Perceptron neural network is "trained" to produce good responses to a set of input patterns. In light of this, the Perceptron network is sometimes called a "back-prop" network.

Bias
The net input (or bias) is proportional to the amount that incoming neural activations must exceed in order for a neuron to fire.

Connectivity
The amount of interaction in a system, the structure of the weights in a neural network, or the relative number of edges in a graph.

Pattern recognition
The act of identifying patterns within previously learned data. A neural network can carry this out even in the presence of noise or when some data is missing.

Epoch
One complete presentation of the training set to the network during training.

Input layer
Neurons whose inputs are fed from the outside world.

Learning algorithms (supervised, unsupervised)


An adaptation process whereby synapses, the weights of a neural network, classifier strengths, or some other set of adjustable parameters is automatically modified so that some objective is more readily achieved. The backpropagation and bucket brigade algorithms are two types of learning procedures.

Learning rule
The algorithm used for modifying the connection strengths, or weights, in response to training patterns while training is being carried out.

Layer
A group of neurons that have a specific function and are processed as a whole. The most common example is in a feedforward network that has an input layer, an output layer and one or more hidden layers.

Monte-Carlo method
The Monte-Carlo method provides approximate solutions to a variety of mathematical problems by performing statistical sampling experiments on a computer.

Multilayer perceptron (MLP)
A type of feedforward neural network that is an extension of the perceptron in that it has at least one hidden layer of neurons. Layers are updated by starting at the inputs and ending with the outputs. Each neuron computes a weighted sum of the incoming signals to yield a net input, and passes this value through its sigmoidal activation function to yield the neuron's activation value. Unlike the perceptron, an MLP can solve linearly inseparable problems.

Neural Network (NN)
A network of neurons that are connected through synapses or weights. Each neuron performs a simple calculation that is a function of the activations of the neurons that are connected to it. Through feedback mechanisms and/or the nonlinear output response of neurons, the network as a whole is capable of performing extremely complicated tasks, including universal computation and universal approximation. Three different classes of neural networks are


feedforward, feedback, and recurrent neural networks, which differ in the degree and type of connectivity that they possess.

Neuron
A simple computational unit that performs a weighted sum on incoming signals, adds a threshold or bias term to this value to yield a net input, and maps this last value through an activation function to compute its own activation. Some neurons, such as those found in feedback or Hopfield networks, will retain a portion of their previous activation.

Output neuron
A neuron within a neural network whose outputs are the result of the network.

Perceptron
An artificial neural network capable of simple pattern recognition and classification tasks. It is composed of three layers where signals only pass forward from nodes in the input layer to nodes in the hidden layer and finally out to the output layer. There are no connections within a layer.
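
The neuron defined above takes only a couple of lines to write down in MATLAB. A minimal sketch, assuming a logsig activation and made-up weights:

w = [0.5 -0.3 0.8];       % synaptic weights (illustrative values)
b = 0.1;                  % bias / threshold term
p = [1; 0; 1];            % incoming activations
a = logsig(w*p + b)       % weighted sum plus bias, passed through the activation function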

Sigmoid function
An S-shaped function that is often used as an activation function in a neural network.

Threshold
A quantity added to (or subtracted from) the weighted sum of inputs into a neuron, which forms the neuron's net input. Intuitively, the net input (or bias) is proportional to the amount that the incoming neural activations must exceed in order for a neuron to fire.

Training set
A neural network is trained using a training set. A training set comprises information about the problem to be solved as input stimuli. In some computing systems the training set is called the "facts" file.


Weight
In a neural network, the strength of a synapse (or connection) between two neurons. Weights may be positive (excitatory) or negative (inhibitory). The thresholds of a neuron are also considered weights, since they undergo adaptation by a learning algorithm.


APPENDIX B
Network Layers

The term `layer' in the neural network sense means different things to different people. In the NNT, a layer is defined as a layer of neurons, with the exception of the input layer. So in NNT terminology this would be a one-layer network:

Figure:

And this would be a two-layer network:

Figure:

Each layer has a number of properties, the most important being the transfer functions of the neurons in that layer, and the function that defines the net input of each neuron given its weights and the output of the previous layer.
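
In NNT terms the two networks sketched above could be created as follows; a minimal example using the same newff call as elsewhere in this appendix (the layer sizes are illustrative):

% one-layer network: two inputs feeding a single purelin neuron
net1 = newff([0 1; 0 1],[1],{'purelin'});
% two-layer network: a hidden layer of three tansig neurons and one purelin output neuron
net2 = newff([0 1; 0 1],[3 1],{'tansig','purelin'});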


Activation Functions

When a neuron updates, it passes the sum of the incoming signals through an activation function, or transfer function as Matlab calls it. There are different types of activation functions: some are saturated and ensure that the output value lies within a specific range, like logsig, tansig, hardlims and satlin, while others, like purelin, are not saturated. Some of the transfer functions in the neural network toolbox are plotted in the figure below. The transfer function is chosen when you create the network, and is assigned to each layer. To create a feed forward network with two inputs, three tansig neurons in the hidden layer and one logsig neuron in the output layer, enter:
>> net=newff([0 1;0 1],[3 1],{'tansig','logsig'});

Figure: Transfer functions supplied by Matlab plotted in the same scale. Note the difference between tansig and logsig. tansig ranges between -1 and 1 and logsig ranges between 0 and 1. The same relationship applies between hardlim and hardlims and between satlin and satlins.
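
The difference in output range is easy to see by evaluating both functions over the same net inputs. A minimal sketch:

n = -5:0.1:5;                       % a range of net input values
plot(n, tansig(n), n, logsig(n))    % tansig spans -1 to 1, logsig spans 0 to 1
legend('tansig','logsig')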



BIBLIOGRAPHY
1. R. L. Timings, Engineering Materials, Vol. 1, Longman Group, United Kingdom, 1994.
2. Heikki Koivo, Neural Networks: Basics Using Matlab Neural Network Toolbox, USA, 2000.
3. Iqbal Shah, Tensile Properties of Austenitic Stainless Steels, UK, 2002.
4. H. K. D. H. Bhadeshia, "Neural Networks in Materials Science", ISIJ International, 39(10), 1999, pp. 966-979.
5. E. Paul DeGarmo, Materials and Processes in Manufacturing, 9th edition, John Wiley and Sons, United States, 2003.
6. Vito J. Colangelo, Analysis of Metallurgical Failures, 2nd edition, John Wiley and Sons, Singapore, 1987.
7. O. P. Khanna, Text Book of Metallurgy and Materials Engineering, India, 2002.
8. M. H. Jokhio, M. A. Unar, "Application of Neural Network in Powder Metallurgy", Engineering Materials Proceedings, 2004.
9. Internet web site: www.ide.his.se
10. H. Demuth, M. Beale, Neural Network Toolbox for Matlab, The MathWorks Inc., USA, 2000.
11. Internet web site: www.astm.org
12. Internet web site: www.mathworks.com
13. Internet web site: www.igi.trgraz.at
14. Internet web site: www.cs.wisc.edu
15. Internet web site: www.statsoftinc.com
16. Internet web site: www.azom.com
17. Internet web site: http://carol.wins.ura.nl
18. Internet web site: http://envistat.esa.cn
19. Internet web site: http://www.brain.web-us.com
20. Internet web site: http://njuct.edu.cn
21. Internet web site: www.baldwininternational.com
22. Internet web site: www.cs.man.ac.uk
23. Internet web site: www.tms.org/pubs/jom.htm
24. Internet web site: http://www.torch.ch/matos/convolutions.pdf
25. Carlos Gershenson, Artificial Neural Networks for Beginners, UK, 2003.
26. Internet web site: www.benbest.com
27. Ivan Galkin, UMass Lowell, Crash Introduction to Artificial Neural Networks (materials for the UML 91.531 Data Mining course).
