Archive for the ‘Models and Techniques’ Category
Product Description
Deals with the estimation of natural resources using a Monte Carlo methodology. Includes a set of tools to describe the morphological, statistical and stereological properties of spatial random models. Discusses the geostatistical simulation techniques in such a specific way. … More >>
This is the second of four videos covering the basics of Connected Solutions using the Advanced Design System from Agilent EEsof EDA. For more information: www.agilent.com For a free evaluation copy of ADS: www.agilent.com
Product Description
Covers the moving averages trading method from start to finish, building a solid foundation using the most basic moving averages (MA) concepts to advanced techniques. These methods can be applied for day-trading or for position traders of individual stocks. Included are chapters explaining the 20-day/40-day MA technique, the 30-day/60-day MA method, and the 50-day/100-day MA tactic. Scores of actual stock chart examples provided showing how each moving average se… More >>
Product Description
Successful traders know that using Moving Averages can result in more profitable trades–if applied properly. But, what are Moving Averages? When–and how–should they be used? Now, noted trader Clif Droke takes the mystery out of Moving Averages by explaining them in detail, describing how they can be employed to zero in on buy/sell signals that result in more profitable trades–more often. Traders of every level will also discover how to:
-Calculat… More >>
An introduction to Moving Averages and the quest for the perfect Moving Average. My first YouTube video, so not technically perfect, but the idea taught is simple and sound.
Product Description
Monte Carlo simulation has become an essential tool in the pricing of derivative securities and in risk management. These applications have, in turn, stimulated research into new Monte Carlo methods and renewed interest in some older techniques. This book develops the use of Monte Carlo methods in finance and it also uses simulation as a vehicle for presenting models and ideas from financial engineering. It divides roughly into three parts. The first part develo… More >>
Product Description
Optimization models play an increasingly important role in financial decisions. This is the first textbook devoted to explaining how recent advances in optimization models, methods and software can be applied to solve problems in computational finance more efficiently and accurately. Chapters discussing the theory and efficient solution methods for all major classes of optimization problems alternate with chapters illustrating their use in modeling problems of mathe… More >>
Product Description
This is the collected papers presented at the Integrating Safety and Environmental Knowledge Into Food Studies towards European Sustainable Development (ISEKI) workshop on risk assessment in the food industry…. More >>
1. INTRODUCTION
There are many alternatives for representing classifiers. The decision tree is probably the most widely used approach for this purpose. It was originally studied in the fields of decision theory and statistics, but it has also proved effective in other disciplines such as data mining, machine learning, and pattern recognition. Decision trees are also implemented in many real-world applications. Given the long history of and intense interest in this approach, it is not surprising that several surveys on decision trees are available in the literature. Nevertheless, this survey offers a thorough but concise description of issues related specifically to the top-down construction of decision trees, which is considered the most popular construction approach. This paper aims to organize all significant methods developed into a coherent and unified reference.
2. DECISION TREES
A decision tree (or tree diagram) is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. Decision trees are commonly used in operations research, specifically in decision analysis, to help identify a strategy most likely to reach a goal. Another use of decision trees is as a descriptive means for calculating conditional probabilities. In data mining and machine learning, a decision tree is a predictive model; that is, a mapping from observations about an item to conclusions about its target value. More descriptive names for such tree models are classification tree (discrete outcome) or regression tree (continuous outcome). In these tree structures, leaves represent classifications and branches represent conjunctions of features that lead to those classifications. The machine learning technique for inducing a decision tree from data is called decision tree learning, or (colloquially) decision trees.
3. DECISION TREE REPRESENTATION
The decision tree induction algorithm has been used broadly for several years. It is a method for approximating discrete-valued functions and can yield many useful expressions. It is one of the most important methods for classification. The algorithm’s terminology follows the “tree” metaphor. A tree has a root, which is the first split point on a data attribute, and it has leaves, so that every path from root to leaf forms a rule that is easily understood. Since the decision tree is built from the given data, the values and characteristics of the data matter. For example, the amount of data affects the result of the tree-building procedure, and so does the type of the attribute values. Decision trees need two kinds of data: training and testing.
Training data, which are usually the larger part of the data, are used for constructing trees. In general, the more training data collected, the higher the accuracy of the results tends to be. The other group of data, the testing data, is used to measure the accuracy rate and misclassification rate of the decision tree. Many decision-tree algorithms have been developed. One of the most famous is ID3 (Quinlan 1986, 1983), whose choice of split attribute is based on information entropy. C4.5 is an extension of ID3 (Prather et al. 1997). It improves computing efficiency, deals with continuous values, handles attributes with missing values, avoids overfitting, and performs other functions.
CART (Classification and Regression Trees) is a data-exploration and prediction algorithm that, like C4.5, constructs a tree. Breiman et al. (1984) summarized the classification and regression tree. Instead of information entropy, it introduces measures of node impurity. It is used on a variety of different problems, such as the detection of chlorine from the data contained in a mass spectrum. Although decision trees may not be the best method in terms of classification accuracy, even people who are not familiar with them find them easy to use and understand. Figure 1 shows a binary decision tree. It gives an impression of how a decision is made. It uses a circle for each decision node and a square for each terminal node. Each decision node has a condition that is represented by a function F, whose parameter is the split point of the split attribute. Each terminal node has a class label C, the value of which represents a class. A decision tree is easily translated into rules that can then be analyzed, and it gives an easily interpreted representation of a nonlinear input-output mapping (Jang 1994).
Figure 1: A typical binary decision tree
Many works address the choice of splitting node and the optimization of tree size, but less attention has been given to the weight of the data attributes. In this study, we use a system-reconstruction analysis method to get the weight of each attribute, which we use to reform the raw data. After that, we use the decision-tree algorithm mentioned above to build a decision tree, from which we can find the decision-accuracy and misclassification rates.
4. ID3 ALGORITHM
The ID3 algorithm can be summarized as follows:
1. Take all unused attributes and calculate their entropy with respect to the training samples.
2. Choose the attribute for which entropy is minimum (equivalently, for which information gain is maximum).
3. Make a node containing that attribute.
The algorithm is described in more detail below.
According to Gestwicki, the Iterative Dichotomiser 3 algorithm, better known as the ID3 algorithm, was first introduced by J. R. Quinlan in the late 1970s. The algorithm ‘learns’ from a relatively small training set of data to organize and process very large data sets. Ballard stated that ID3 is a greedy algorithm that selects the next attribute based on the information gain associated with the attributes. Information gain is measured using entropy, an idea first introduced by Claude Shannon in 1948.
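For reference, the standard definitions behind these two terms can be written out as below. The notation is not from the source and is assumed here: S is a set of training samples, c is the number of classes, p_i is the fraction of S belonging to class i, A is an attribute, and S_v is the subset of S in which A takes the value v.

```latex
H(S) = -\sum_{i=1}^{c} p_i \log_2 p_i
\qquad
\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \, H(S_v)
```

As a quick check, a set split evenly between two classes has H(S) = 1 bit, while a set containing a single class has H(S) = 0. Because H(S) is fixed at a given node, choosing the attribute with the highest gain is the same as choosing the one with the lowest weighted entropy after the split.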
The ID3 algorithm prefers shorter trees, with the attributes of lower entropy placed near the top of the tree. This preference is in the spirit of Occam’s Razor, which states that “one should not increase, beyond what is necessary, the number of entities required to explain anything”; that is, one should make no more assumptions than the minimum needed. Hild described the basic technique for implementing the ID3 algorithm, which is shown below.
1. For each uncategorized attribute, its entropy is calculated with respect to the categorized attribute, or conclusion.
2. The attribute with the lowest entropy is selected.
3. The data are divided into sets according to the attribute’s value. For example, if the attribute ‘Size’ is chosen, and the values for ‘Size’ are ‘big’, ‘medium’ and ‘small’, then three sets are created, divided by these values.
4. A tree with branches that represent the sets is constructed. For the above example, three branches are created, where the first branch is ‘big’, the second branch is ‘medium’ and the third branch is ‘small’.
5. Step 1 is repeated for each branch, but the already selected attribute is removed and only the data that exist in the corresponding set are used.
6. The process stops when there are no more attributes to be considered or all the data in the set have the same conclusion, for example, all data have ‘Result’ = yes.
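The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Hild's or Quinlan's implementation: the function names, the dict-of-dicts tree format, the majority-vote leaf when attributes run out, and the toy data (including the hypothetical ‘Colour’ attribute) are all introduced here for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def split_entropy(rows, attr, target):
    """Weighted entropy of the target after dividing rows by the values of attr."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

def id3(rows, attrs, target):
    """Return a nested dict {attribute: {value: subtree}}, or a class label at a leaf."""
    labels = [row[target] for row in rows]
    # Stop when all rows share one conclusion or no attributes remain
    # (majority vote in the latter case is an assumption of this sketch).
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Select the attribute with the lowest entropy with respect to the target.
    best = min(attrs, key=lambda a: split_entropy(rows, a, target))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = id3(subset, remaining, target)  # repeat on each branch
    return {best: branches}

# Hypothetical toy data, loosely following the 'Size' example in the text.
rows = [
    {"Size": "big", "Colour": "red", "Result": "yes"},
    {"Size": "big", "Colour": "blue", "Result": "yes"},
    {"Size": "medium", "Colour": "red", "Result": "no"},
    {"Size": "small", "Colour": "red", "Result": "no"},
    {"Size": "small", "Colour": "blue", "Result": "no"},
]
tree = id3(rows, ["Size", "Colour"], "Result")
print(tree)  # {'Size': {'big': 'yes', 'medium': 'no', 'small': 'no'}}
```

Here ‘Size’ separates the two conclusions perfectly, so it is chosen at the root and every branch ends immediately in a leaf.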
The ID3 algorithm has been used and implemented in many fields. One of its earliest implementations was for a chess game, by the artificial intelligence researcher Ivan Bratko. According to Gestwicki, Bratko supplied the ID3 program with several pages of textbook recommendations for playing the chess endgame of white king and rook versus black king and knight. He built the rules around the idea of ‘knight’s side lost in at most n moves’. The result shows that the ID3 algorithm is efficient in both time and space: the feature vector of the games and the size of the decision tree are small compared to the number of training instances.
In a study by Gestwicki, an experiment was conducted to predict greyhound races. The experiment compared the net profit gained by the ID3 algorithm with that gained by three greyhound-racing experts. The system was trained with 200 training races and 1600 dogs. The result shows that there were 26 races on which ID3 did not place any bet. This shows that the system was kept from making illogical choices, unlike humans, who may gamble without logic in the hope of winning more.
5. C4.5 ALGORITHM
At each node of the tree, C4.5 chooses one attribute of the data that most effectively splits its set of samples into subsets enriched in one class or the other. Its criterion is the normalized information gain (difference in entropy) that results from choosing an attribute for splitting the data. The attribute with the highest normalized information gain is chosen to make the decision. The C4.5 algorithm then recurses on the smaller sublists. This algorithm has a few base cases.
- All the samples in the list belong to the same class. When this happens, it simply creates a leaf node for the decision tree saying to choose that class.
- None of the features provide any information gain. In this case, C4.5 creates a decision node higher up the tree using the expected value of the class.
- Instance of previously-unseen class encountered. Again, C4.5 creates a decision node higher up the tree using the expected value.
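The splitting criterion can be illustrated with a short sketch. The text does not spell out how the information gain is normalized; the code below assumes the usual gain ratio, that is, information gain divided by the entropy of the split itself. The function names and toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy, in bits, of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def gain_ratio(attr_values, labels):
    """Information gain of splitting on an attribute, divided by the split's own entropy."""
    total = len(labels)
    groups = {}
    for value, label in zip(attr_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(attr_values)  # large for attributes with many distinct values
    return gain / split_info if split_info > 0 else 0.0

labels = ["yes", "yes", "no", "no"]
two_valued = ["x", "x", "y", "y"]   # separates the classes with two branches
id_like = ["1", "2", "3", "4"]      # separates them too, but with one branch per sample

print(gain_ratio(two_valued, labels))  # 1.0
print(gain_ratio(id_like, labels))     # 0.5
```

Both attributes have the same raw information gain (1 bit), but the normalization penalizes the identifier-like attribute, so the two-valued attribute is preferred.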
In pseudocode, the algorithm is:
Check for base cases
For each attribute a
    Find the normalized information gain from splitting on a
Let a_best be the attribute with the highest normalized information gain
Create a decision node that splits on a_best
Recurse on the sublists obtained by splitting on a_best, and add those nodes as children of node
Improvements from ID3 algorithm
C4.5 made a number of improvements to ID3. Some of these are:
- Handling both continuous and discrete attributes: In order to handle continuous attributes, C4.5 creates a threshold and then splits the list into those whose attribute value is above the threshold and those that are less than or equal to it.
- Handling training data with missing attribute values: C4.5 allows attribute values to be marked as missing. Missing attribute values are simply not used in gain and entropy calculations.
- Handling attributes with differing costs.
- Pruning trees after creation: C4.5 goes back through the tree once it’s been created and attempts to remove branches that do not help by replacing them with leaf nodes.
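The threshold idea for continuous attributes can be sketched as follows. This is only an illustration under assumptions: candidate thresholds are taken as midpoints between consecutive distinct values, and candidates are scored by plain weighted entropy rather than C4.5's full criterion. The function names and data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted entropy) for the best binary split value <= threshold."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold fits between equal values
        threshold = (low + high) / 2  # midpoint candidate (an assumption of this sketch)
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["no", "no", "no", "yes", "yes", "yes"]
threshold, score = best_threshold(values, labels)
print(threshold)  # 6.5
```

Samples with a value less than or equal to 6.5 go to one branch and the rest to the other, which here yields two pure subsets.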
6. CART ALGORITHM
Classification and regression trees (CART) is a non-parametric technique that produces either classification or regression trees, depending on whether the dependent variable is categorical or numeric, respectively. Trees are formed by a collection of rules based on values of certain variables in the modeling data set.
- Rules are selected based on how well splits based on variables’ values can differentiate observations based on the dependent variable.
- Once a rule is selected and splits a node into two, the same logic is applied to each “child” node (i.e. it is a recursive procedure).
- Splitting stops when CART detects no further gain can be made, or some pre-set stopping rules are met.
Each branch of the tree ends in a terminal node:
- Each observation falls into one and exactly one terminal node.
- Each terminal node is uniquely defined by a set of rules.
The basic idea of tree growing is to choose a split among all the possible splits at each node so that the resulting child nodes are the “purest”. In this algorithm, only univariate splits are considered. That is, each split depends on the value of only one predictor variable. All possible splits consist of possible splits of each predictor.
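The search for the “purest” univariate split can be sketched as below. The text does not name an impurity measure; this sketch assumes Gini impurity, a measure commonly associated with CART classification trees, and handles numeric predictors only. The function names and data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_univariate_split(X, y):
    """Try every predictor and every threshold; return (feature index, threshold, impurity)."""
    n = len(y)
    best = (None, None, float("inf"))
    for j in range(len(X[0])):
        # Skip the largest value so that the right child is never empty.
        for threshold in sorted({row[j] for row in X})[:-1]:
            left = [y[i] for i in range(n) if X[i][j] <= threshold]
            right = [y[i] for i in range(n) if X[i][j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, threshold, score)
    return best

X = [[2.0, 7.0], [3.0, 1.0], [8.0, 6.0], [9.0, 2.0]]
y = ["a", "a", "b", "b"]
split = best_univariate_split(X, y)
print(split)  # (0, 3.0, 0.0)
```

Each candidate split depends on a single predictor, and the one whose child nodes have the lowest weighted impurity is kept. A full tree would apply the same search recursively to each child.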
7. COMPARISON OF ID3, C4.5 and CART
Algorithm designers have had much success with greedy, divide-and-conquer approaches to building class descriptions. This survey focuses on the decision tree learners made popular by ID3, C4.5 (Quinlan 1986) and CART (Breiman, Friedman, Olshen, and Stone 1984), because they are relatively fast and typically produce competitive classifiers. In fact, the decision tree generator C4.5, a successor to ID3, has become a standard point of comparison in machine learning research, because it produces good classifiers quickly. For non-numeric datasets, the run time of ID3 (and C4.5) grows linearly with the number of examples.
The practical run-time complexity of C4.5 has been determined empirically to be worse than O(e^2) on some datasets, where e is the number of examples. One possible explanation is based on the observation of Oates and Jensen (1998) that the size of C4.5 trees increases linearly with the number of examples. One of the factors of a (the number of attributes) in C4.5’s run-time complexity corresponds to the tree depth, which cannot be larger than the number of attributes. Tree depth is related to tree size, and thereby to the number of examples. When compared with C4.5, the run-time complexity of CART is satisfactory.
8. CONCLUSION
The decision-tree algorithm is one of the most effective classification methods. Its efficiency and its rate of correct classification depend on the data. This survey covered the decision tree algorithms ID3, C4.5 and CART, looking at the steps by which they process data and at their run-time complexity. The inductive learning algorithms successfully recognized and generalized the rules contained in the given training data. The accuracies of the algorithms were also very high, which means the system produced a reliable result. This result also shows that inductive learning can be successfully applied in a complex problem domain, and is therefore useful for real-world problems. The second conclusion is that the algorithms are able to learn new rules and therefore to adapt to changes. Finally, among the three algorithms, CART performs best in terms of the rules generated and accuracy: it produced fewer rules yet was more accurate than the other two algorithms. This shows that CART is better at induction and rule generalization than ID3 and C4.5.
ACKNOWLEDGEMENT
First, I would like to thank Almighty for His blessings towards the successful completion of this survey paper. I would like to extend my thanks to my Research Guide Dr. (Mrs.) M. Punithavalli, Director, Dept. of Computer Science, Sri Rama Krishna College for Women, Coimbatore for her valuable assistance, help and guidance during the research process. I also would like to extend my gratitude to my Husband Mr. M. S. Raja Sekaran for his moral support and co-operation.
REFERENCES
[1] S. R. Safavian and D. Landgrebe. A survey of decision tree classifier methodology. IEEE Trans. on Systems, Man and Cybernetics, 21(3):660-674, 1991.
[2] S. K. Murthy, Automatic Construction of Decision Trees from Data: A MultiDisciplinary Survey. Data Mining and Knowledge Discovery, 2(4):345-389, 1998.
[3] R. Kohavi and J. R. Quinlan. Decision-tree discovery. In Will Klosgen and Jan M. Zytkow, editors, Handbook of Data Mining and Knowledge Discovery, chapter 16.1.3, pages 267-276. Oxford University Press, 2002.
[4] S. Grumbach and T. Milo: Towards Tractable Algebras for Bags. Journal of Computer and System Sciences 52(3): 570-588, 1996.
[5] L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Wadsworth Int. Group, 1984.
[6] J.R. Quinlan, Simplifying decision trees, International Journal of Man-Machine Studies, 27, 221-234, 1987.
[7] T. R. Hancock, T. Jiang, M. Li, J. Tromp: Lower Bounds on Learning Decision Lists and Trees. Information and Computation 126(2): 114-122, 1996.
[8] L. Hyafil and R.L. Rivest. Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 5(1):15-17, 1976
[9] H. Zantema and H. L. Bodlaender, Finding Small Equivalent Decision Trees is Hard, International Journal of Foundations of Computer Science, 11(2):343-354, 2000.
[10] G.E. Naumov. NP-completeness of problems of construction of optimal decision trees. Soviet Physics: Doklady, 36(4):270-271, 1991.
[11] J.R. Quinlan, Induction of decision trees, Machine Learning 1, 81-106, 1986.
ROSILINE JEETHA B.1 Dr. (Mrs.) PUNITHAVALLI M.2
1. DEPARTMENT OF MCA, RVS COLLEGE OF ARTS & SCIENCE, COIMBATORE
2. DIRECTOR, DEPARTMENT OF COMPUTER SCIENCE, SRI RAMAKRISHNA COLLEGE FOR WOMEN, COIMBATORE
- ISBN13: 9781590477038
- Condition: USED – VERY GOOD
- Notes:
Product Description
Predictive Modeling with SAS Enterprise Miner: Practical Solutions for Business Applications demonstrates how to make the fullest use of SAS Enterprise Miner software. Kattamuri Sarma provides an in-depth explanation of the methodology and the theory behind each tool that he covers, and then shows you how the software performs the tasks. Step by step, you’ll be able to compare manual calculations with the calculations that are performed by SAS Enterprise Miner. Exam… More >>
Predictive Modeling With SAS Enterprise Miner: Practical Solutions for Business Applications
Describes the differences between traditional point estimate forecast methods and Monte Carlo simulation methods, along with a description of the advantages of simulation over point estimates.
I’ll add to this later
It used to be that basic data was enough to make successful decisions within an organization. A CEO could look at common key performance indicators such as net profit margin, debt to income ratio, and return on investment and be able to make the best decisions available at the time.
For the past several decades, companies have collected large amounts of data in order to evaluate why they performed the way they did and to understand their customers’ needs and preferences. They built data warehouses and advanced reports to improve accuracy, improve key processes, and optimize performance.
As time went on, companies learned that they could use historical data and trends to predict future behavior and to make decisions. For example, a call center manager might use call-volume-by-hour statistics to staff a call center for peak and non-peak times.
Then organizations moved beyond reporting capabilities and began gathering even larger amounts of data and applying statistical analysis to further predict future trends and behavioral patterns. For example, the banking industry uses credit history, residential information, job information, debts, etc. to calculate a credit score that indicates whether a person is likely to pay off a loan. This is an example of predictive analytics, and organizations in all sectors are learning to apply it to their reporting capabilities. Predictive analytics uses large volumes of past data to capture relationships between explanatory variables (variables used in a relationship to explain or predict changes in the values of another variable) and predicted variables, and applies those relationships to predict future outcomes.
Predictive modeling is the process by which data is modeled and diagnosed to try to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory to try to guess the probability of a signal given a set amount of input data. Models can use one or more classifiers in trying to determine the probability of a set of data belonging to another set.
There are three main types of models associated with predictive analytics: predictive models, descriptive models, and decision models.
Predictive models predict future behavior and anticipate the consequences of change. Predictive models are comprised of a number of predictors (factors likely to affect future behavior or results). For example, in marketing a customer’s age, sex and income can be used to predict the likelihood of buying.
Predictive analytics’ central building block is the predictor, a single value measured for each customer. For example, ‘most recent’, which is based on the number of weeks since the customer’s last purchase, has higher values for more recent customers. This predictor is usually a reliable campaign response predictor: you will receive more responses from those customers more highly ranked by ‘most recent’. That means that if you contact your customers in order of ‘most recent’ (first, call the most-recent customer; next, call the next-most-recent customer; and so on), you will improve your response rate. For each prediction goal, there is an abundance of predictors that will help rank your customer database. For example, consider a customer’s online behavior: customers who spend less time logged on may be less likely to renew their annual subscription. In this case, retention campaigns can be cost-effectively targeted to customers with a low monthly usage predictor value.
Descriptive models quantify the relationships between data in order to classify customers into groups. While predictive models focus on predicting one customer’s behavior, descriptive models identify relationships between several customers or products. Descriptive models do not predict a target value, but focus more on the intrinsic structure, relations, interconnectedness, etc. Descriptive models are used in our earlier example of the financial industry and credit scores.
Cluster analysis is a descriptive modeling technique that identifies clusters embedded in the data. A cluster is a collection of data objects that are similar in some sense to one another.
Another descriptive modeling technique is the k-means algorithm. K-means algorithm is a distance-based clustering algorithm that partitions the data into a predetermined number of clusters (provided there are enough distinct cases). The k-means algorithm works only with numerical attributes. Distance-based algorithms rely on a distance metric (function) to measure the similarity between data points.
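A minimal k-means sketch may make the description more concrete. It assumes Euclidean distance, random initial centroids drawn from the data with a fixed seed, and the common assign-then-recompute iteration; the function name, parameters and data points are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Partition numeric points into k clusters; return (centroids, assignment)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct data points as starting centroids
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid (Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # nothing moved, so the partition is stable
        centroids = new_centroids
    return centroids, assignment

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, k=2)
print(assignment)
```

With these points, the first three and the last three end up in separate clusters. The number of clusters k has to be chosen in advance, as noted above.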
Decision models describe the relationship between all decision elements and predict the results of decisions, allowing you to try different scenarios and optimize results. Clinical Decision Support Systems use predictive analysis in the health care industry to identify at-risk patients and sometimes to determine which course of action would be best given an array of variables.
Rational decision models are based around a cognitive judgment of the pros and cons of various options. They are organized around selecting the most logical and sensible alternative that will have the desired effect. The decisions are normally organized through a detailed analysis of alternatives and a comparative assessment of the advantages of each. Weighted criteria scoring is an example of a rational decision model.
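Weighted criteria scoring can be shown with a small worked example. Everything in it is hypothetical: the criteria, the weights (chosen to sum to 1), the vendor names and the 1 to 10 ratings are made up for illustration.

```python
def weighted_scores(options, weights):
    """Return {option: sum of weight * rating over all criteria}."""
    return {
        name: sum(weights[criterion] * rating for criterion, rating in ratings.items())
        for name, ratings in options.items()
    }

# Hypothetical criteria weights and ratings on a 1-10 scale.
weights = {"cost": 0.5, "quality": 0.3, "speed": 0.2}
options = {
    "Vendor A": {"cost": 6, "quality": 9, "speed": 5},
    "Vendor B": {"cost": 8, "quality": 6, "speed": 7},
}

scores = weighted_scores(options, weights)
best = max(scores, key=scores.get)
print(best)  # Vendor B
```

Vendor A scores about 6.7 and Vendor B about 7.2, so Vendor B is selected even though Vendor A rates higher on quality, because cost carries the largest weight.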
Hopefully this has given you a better understanding of the basic predictive analysis models that drive predictive analytics. Check out my article on predictive modeling techniques to learn about 12 common techniques used to predict future behavior.
Victor Holman is a performance management expert who provides fast, simple and inexpensive ways to transform organizational performance.
Check out his FREE performance management kit, which includes several templates, plans, and guides to help you get started with your next initiative.
Victor’s Complete Lifecycle Performance Management Kit is a turnkey organizational performance management solution consisting of a web based organizational performance analysis, 7 guides, 39 templates, 600+ metrics, 35 best practices, 48 key processes, a performance roadmap and more.
Learn all about performance management at The Performance Portal
demonstrations.wolfram.com The Wolfram Demonstrations Project contains thousands of free interactive visualizations, with new entries added daily. This Demonstration approximates π using the Monte Carlo method: (1) randomly select points in a square with an inscribed circle; (2) multiply the number of points inside the circle by four; (3) divide by the total number of points in the square. Contributed by: Zubeyir Cinkir
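The three steps can be reproduced in a few lines of Python. This is an independent sketch, not the Demonstration's own code; the function name, the seed and the choice of the square [-1, 1] x [-1, 1] with an inscribed unit circle are assumptions made for the example.

```python
import random

def estimate_pi(n_points, seed=0):
    """Estimate pi from random points in a square with an inscribed circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        # (1) pick a random point in the square [-1, 1] x [-1, 1]
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1  # the point falls inside the inscribed unit circle
    # (2) multiply the count inside the circle by four, (3) divide by the total
    return 4.0 * inside / n_points

estimate = estimate_pi(200_000)
print(estimate)
```

The method works because the circle covers π/4 of the square's area, so the fraction of points landing inside it approaches π/4 as the number of points grows. The error shrinks only slowly, roughly in proportion to one over the square root of the number of points.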






