Warsaw University of Technology Faculty of Mathematics and Information Science
Artificial Neural Networks in the Simulation of a Real Scene
Author: Maciej Gorywoda Supervisor: prof. dr hab. B. Macukow
WARSAW JUNE 2005
For Marta ‘Achi’ Charkiewicz (7 VI 1983 – 7 VI 2005)
Introduction
The theory: Artificial Neural Networks as Decision-Making Units
1. Simulating a real scene – the most important aspects of placing an A.N.N.-controlled object in a simulation of a real world
  1.1. Physics in the simulation of a real scene
  1.2. Path-Finding Algorithms
    1.2.1. Flood Path-Finding
    1.2.2. The A* Algorithm
  1.3. Collecting data about the world
    1.3.1. Waypoints
    1.3.2. Camera
2. Other methods of implementing Artificial Intelligence as decision-making units for agents moving in simulations of a real scene
  2.1. Finite State Machines
    2.1.1. Characteristics of FSM
  2.2. Fuzzy Logic Decision-Making Architecture
    2.2.1. Characteristics of FLA
3. Specifics of implementation of an Artificial Neural Network as a decision-making unit
  3.1. Controlling an object
  3.2. The external memory
  3.3. Teaching an Artificial Neural Network
4. Proposed Artificial Neural Network models
  4.1. The Hopfield model
  4.2. The Multi-Layered Perceptron
The practice: Implementation of an example simulation
1. The simulation module
  1.1. The scene
  1.2. A cell
  1.3. A material
  1.4. The array of materials
2. Objects
  2.1. An object
  2.2. A mobile object
  2.3. An energy source
  2.4. A data source
  2.5. Vehicle
  2.6. A camera
3. The agent module
  3.1. The agent
  3.2. A situation
  3.3. The analyzer
  3.4. The interpreter
4. The Artificial Neural Network implementation
  4.1. A common superclass
  4.2. The Hopfield model
  4.3. The Multi-Layered Perceptron
  4.4. The suspension mechanism
Conclusions
Technicalia
Documentation
Automation of the decision-making process is still in its infancy. The presence of a human being is obligatory even if the task is performed in an environment as controlled as a warehouse where a fork-lift operates. Although we are able to automate almost everything – from washing machines to wheel-chairs to sophisticated airplane mechanisms – there is still a need for a human finger to press the right button. This problem may be reduced thanks to progress in the field of Artificial Intelligence. Situations in which we have to choose from a set of possible decisions may be simplified and then, to some degree, measured and presented in the form of sequences of numbers. Next, such sequences may be processed by one of many algorithms known to Artificial Intelligence scientists, resulting in the association of the original situation with one of a few previously defined classes of situations – and every such class is in turn associated with one of previously defined answers.
Let us consider for a moment how a human brain works when it is presented with a situation in which it has to make a choice out of a few possibilities and the decision has to be made quickly – in a couple of seconds at most. In such cases human beings usually improvise. We erase from the given situation the factors we are not certain of, simplifying it, and then we compare the result with our memories – patterns of situations from the past, along with our reaction at that moment and the results of that reaction. Of course, we search for a pattern connected to a positive reaction. A negative reaction can tell us how not to react this time, but since we have to make a decision in a matter of seconds this information is not useful. So we search for a pattern connected to a positive reaction, and as soon as we find it we apply the reaction to our current situation – even if the current situation differs significantly from the pattern.
The point is that it differs even more from any other pattern we remember.
The task of my master's thesis is to propose an example implementation of an Artificial Neural Network – a sub-field of Artificial Intelligence inspired by the way the human brain works – as a mechanism controlling a mobile robot, a probe or, in a simulated environment, any kind of agent designed to gather information. Being inspired by biology, an Artificial Neural Network (a neural network, or an A.N.N. in short) works in a way which, to some extent, resembles the way we make decisions. It takes a simplified version of the situation the agent is presented with – a version previously analyzed and prepared by other mechanisms – and tries to find the most similar pattern. The comparison is performed by erasing uncertainties and replacing details which do not fit completely with the corresponding parts of the patterns. The pattern which needs the least effort is the chosen one – but please note that, just as human beings sometimes make decisions because they only “seem” to be right, the least-effort pattern of a neural network may not be the best one. In other words, a neural network may come up with a result which is only a local optimum, not the global one.
Additionally, the thesis tries to compare two types of Artificial Neural Networks: a Multi-Layered Perceptron (MLP in short) and a Hopfield network. What are their characteristics? What do they have in common? How do they differ from the programming point of view? The MLP model is more straightforward: after presenting it with the vector of input data we process each intermediary and output neuron, and then we collect the vector of output data, which represents one of the possible decisions. On the other hand, a Hopfield network works in a less direct way. Its neurons are connected to each other – each gathers input from all other neurons and exposes its output to all the others as well. When presented with a vector of input data, a Hopfield network tries to convert it to one of the patterns it remembers by processing its neurons one after another – and repeating the process if necessary. After the pattern is found we need to use some sort of dictionary which associates the pattern with a decision.
The thesis is divided into several parts:
• The theoretical part explains the most important problems of implementing a simulation of a real scene, such as path-finding and simplifying the input data, and approaches to solving them, as well as the theoretical basis for both models of Artificial Neural Networks.
• The practical part describes how an example implementation of such a simulation can be done. The program which is attached to the thesis is based on this description (it differs in several places, but all mechanisms and features presented in the thesis are implemented in the program).
• The conclusions answer the questions asked above: how do both models of Artificial Neural Networks cope with the problem of controlling objects in a simulated environment, and what are their similarities and differences.
• Technicalia and Bibliography – a list of programming tools, books, and articles which were used in the process of writing the thesis.
The theory: Artificial Neural Networks as Decision-Making Units
1. Simulating a real scene – the most important aspects of placing an A.N.N.-controlled object in a simulation of a real world
To implement a simulation of a real scene three things are needed:
• a simulated environment along with its physics and a method of representation of the terrain;
• a class of objects which exist in the environment, some of them being mobile (that is, they can move through the environment if pushed or pulled), others being immobile;
• a class of agents, controlling some of the mobile objects.
The environment is composed of various materials, each having different characteristics, for example friction. The environment may be compared to a game board, while the physics – describing how objects behave and what the results of agents' actions are – may be compared to the rules of the game. If the environment is a board, then objects are pawns. Some of them are immobile and are used as sources of energy or information, or simply as obstacles on a path from one point to another. Others are mobile: they move if they are pushed or pulled, according to the physics of the simulation, but they are passive – they do not move by themselves. These objects may be used as sources of energy or data as well. Some of the mobile objects may be vehicles. A vehicle contains an engine and moves because of the force the engine generates. The vehicles are controlled by agents; in other words, an agent may perform actions on a vehicle, such as:
• turning it to a given direction,
• turning the engine on and off,
• requesting to draw energy from an energy source or data from a data source,
in order to complete a task given to it by the user.
The core of an agent is its decision-making unit. In fact, the agent may be considered a wrapper around a decision-making unit: it collects data from the environment and objects, filters it, and converts it to a form understandable to the decision-making unit. After that the unit starts to work, a decision is returned to the agent, and the agent converts it into a set of actions. The following sub-chapters present implementation methods for some of the aspects of a real scene simulation. The implementation is described from the point of view of the influence the given solutions have on the later programming of the decision-making unit. After that, in the practical part of the thesis, it is shown how some of these solutions can be implemented in a C++ program.
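The wrapper relationship described above can be sketched in C++. The class and method names below (DecisionUnit, Agent, ThresholdUnit) are illustrative assumptions introduced for this sketch, not classes of the attached program:

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch: the agent as a wrapper around a decision-making
// unit. All class and method names here are assumptions for the example.
struct DecisionUnit {
    // Takes the simplified, numeric form of a situation and returns the
    // index of the chosen class of decisions.
    virtual int decide(const std::vector<double>& situation) = 0;
    virtual ~DecisionUnit() {}
};

class Agent {
public:
    explicit Agent(DecisionUnit* u) : unit(u) {}

    // Collects raw sensor data, filters and converts it, and asks the
    // unit for a decision, which would then be mapped onto actions.
    int step(const std::vector<double>& rawSensorData) {
        std::vector<double> situation = simplify(rawSensorData);
        return unit->decide(situation);
    }

private:
    // Placeholder for the filtering/conversion stage described in the text.
    std::vector<double> simplify(const std::vector<double>& raw) {
        return raw; // a real agent would erase uncertain factors here
    }

    DecisionUnit* unit;
};

// A trivial concrete unit: decision 1 if the first input is positive.
struct ThresholdUnit : DecisionUnit {
    int decide(const std::vector<double>& s) {
        return (!s.empty() && s[0] > 0.0) ? 1 : 0;
    }
};
```

In the practical part, the neural network models play the role of the decision-making unit behind such an interface.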
1.1. Physics in the simulation of a real scene
Designing and then implementing an artificial environment simulating the real world is a continuous balancing act between two extremes. The first is a situation when the designed mechanism perfectly resembles the functionality of the real world. However, in consequence, it is so sophisticated that implementing it in a realistic amount of time is simply impossible. Or, if it is programmable, it requires so much time and memory that there is no point in using it. (And, unfortunately, such a solution is usually both hard to program and resource-consuming.) The other extreme is the case when our design is, as a matter of fact, easy to implement, and its requirements are reasonable, but instead we are forced to accept unrealistic assumptions, to give up handling situations which are rare (yet still occur sometimes), or we need to consent to a high margin of error in computations. Programming dilemmas of this type are not exclusive to simulations of a real scene. We do not need to look far: computer graphics artists are forced to make such Shakespearean choices as well, e.g. between realistic models of the play of light and shade and realistic
requirements of the animation they implement. A good example of such a dilemma in a simulation of a real scene is a collision between two objects. Let us consider it briefly. In the beginning the situation seems to be simple. Each object is characterized by its mass and velocity – together they form momentum. Using the momentum conservation rule we can write the equation computing the common velocity of two objects after a perfectly inelastic collision:

v_f = p / (m_1 + m_2) = (m_1·v_1 + m_2·v_2) / (m_1 + m_2)
Where:
p – the resulting momentum,
m_1, m_2 – the masses of the first and the second object, respectively,
v_1, v_2 – their velocities,
v_f – the resulting common velocity.
But it's just the beginning. Above all, on a real scene collisions take place in three dimensions, not one. We can assume that objects in our simulation move on a (flat) surface, which reduces the number of dimensions to two, but still we are forced to add to the object's characteristics a structure describing the movement vector. Because of this our equation immediately gets more complicated. Because momentum is a vector, whenever we analyze a collision in two or three dimensions the momentum has to be split up into components. Let us consider the following example to see how this works. A 1000 kg car traveling at 30 m/s, 30° south of east, collides with a 3000 kg truck heading northeast at 20 m/s. The collision is completely inelastic, so the two vehicles stick together after the collision. How fast, and in what direction, are the car and the truck traveling after the collision?
To solve this, the algorithm should compute two conservation of momentum equations, one for the y-direction (positive y being north) and another for the x-direction (positive x being east). A vector diagram should be helpful in visualizing this:
• p_c – the car's momentum
• p_t – the truck's momentum
• p_f – the resulting momentum
• θ – the angle of the resulting vector
Let's write down the equation for the momentum before the collision in the y-direction and set it equal to the momentum after the collision in the y-direction, and then do the same thing in the x-direction.
y-direction: −m_c·v_c·sin30° + m_t·v_t·sin45° = (m_c + m_t)·v_f·sinθ
x-direction: m_c·v_c·cos30° + m_t·v_t·cos45° = (m_c + m_t)·v_f·cosθ
m_c, m_t – the car's and the truck's masses, respectively
v_c, v_t – the car's and the truck's velocities, respectively
The y equation can be rearranged to solve for the y component of the final velocity:
v_fy = v_f·sinθ = (−m_c·v_c·sin30° + m_t·v_t·sin45°) / (m_c + m_t) = ... = 6.857 m/s
Similarly, in the x-direction:

v_fx = v_f·cosθ = (m_c·v_c·cos30° + m_t·v_t·cos45°) / (m_c + m_t) = ... = 17.102 m/s
And, after that, we can compute the final velocity:

v_f = √(v_fx² + v_fy²) = ... = 18.4 m/s
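The worked example can be checked with a short C++ transcription of the equations above (the function names are introduced only for this illustration):

```cpp
#include <cassert>
#include <cmath>

// A direct transcription of the car-and-truck example: a perfectly
// inelastic collision, y positive to the north, x positive to the east.
const double PI = 3.14159265358979323846;

double finalVy(double mc, double vc, double mt, double vt) {
    // The car heads 30 degrees south of east, hence the minus sign.
    return (-mc * vc * std::sin(30.0 * PI / 180.0)
            + mt * vt * std::sin(45.0 * PI / 180.0)) / (mc + mt);
}

double finalVx(double mc, double vc, double mt, double vt) {
    return (mc * vc * std::cos(30.0 * PI / 180.0)
            + mt * vt * std::cos(45.0 * PI / 180.0)) / (mc + mt);
}

double finalSpeed(double mc, double vc, double mt, double vt) {
    double vy = finalVy(mc, vc, mt, vt); // ~6.857 m/s for the example
    double vx = finalVx(mc, vc, mt, vt); // ~17.102 m/s for the example
    return std::sqrt(vx * vx + vy * vy); // ~18.4 m/s for the example
}
```

Calling finalSpeed(1000, 30, 3000, 20) reproduces the 18.4 m/s result computed by hand above.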
But this is not over yet. As mentioned above, the collision is not elastic. This assumption makes sense if our simulation deals with collisions of cars and trucks, but what about a collision of two pool balls? If we want our algorithm to handle such collisions too, we have to add to the object's characteristics information about whether the collisions the object is involved in are elastic or not – and if they are, we need to account for it in our algorithm. Additionally, up to now we have been assuming that our objects are points. None of the above-mentioned equations considers a situation when the collision occurs in a place significantly distant from the centre of one of the objects. This situation may occur if a car hits one of the ends of a big truck or when two boats or ships collide. Computations used in this kind of realistic model may complicate the algorithm to a large extent. Therefore the designer, before starting her work, needs to think over what she really expects from the algorithm she designs. What assumptions may be made? Will a simplification of the representation of an object significantly influence the results? Are there algorithms which give results similar to the real ones but are simpler and/or faster than brute force? For example,
when computing the rotation resulting from a collision, quite often the algorithm resorts to a kind of “fraud”:
• the object is artificially divided into parts (e.g. “an arm”, “a trunk”),
• the collision is computed only within the range of the part it occurred in,
• subsequently, the momentum change is transferred to the other parts of the object via equations specific to the given object.
The task of the designer is to choose the mechanisms of the scene simulation in such a way that the wolf is satisfied and the sheep stays alive. That is, objects on the real scene have to behave naturally, and at the same time the algorithms creating this natural-looking behaviour have to be simple enough to be implemented in a reasonable amount of time – and their requirements have to allow us to carry out simulations without unwanted delays.
1.2. Path-Finding Algorithms
Inside the agent implementation we are forced to make decisions similar to the ones above. An important, if not the most important, aspect of the Artificial Intelligence module – implemented as an Artificial Neural Network or in any other way – is making decisions about the goal the controlled object should approach. The decision is made on the basis of the accessibility and importance of the potential goals. The importance of a goal is usually a constant value (although not always – e.g. if the AI module is informed that the surroundings are becoming dangerous for the further work or even existence of the object, the AI may decide that goals described as “shelters” are at the moment more important than others). The accessibility component varies depending on the location of the object. A goal with a high priority, but a distant one, may be discarded in favour of a goal of lower quality but easier access. Thus it is important to find the shortest path from the actual position of the controlled object to each of its known goals – and to compute the total cost of going that way. (By “total cost” we mean the sum of the weights assigned to the connections between the points on the path.)
Below I would like to present two popular algorithms used to compute such paths – the first one is easy to implement but relatively slow, the second is faster but harder to program.
1.2.1. Flood Path-Finding
The Flood Path-Finding algorithm takes advantage of the graph-searching functionality of the Breadth-First Search method. The surface the object moves on is divided into fields of the same size, which are subsequently treated as vertices of a graph. The weights of the connections between vertices correspond to the “difficulty characteristics” of the terrain – that is, the difference of altitudes between two given fields and/or the quality of the surface. It is quite common that instead of assigning a weight to each edge we define a constant value for the whole vertex – the cost of traveling through the corresponding field. This simplification may lead to wrong results, for example if the cause of the high movement cost is the difference of altitudes between two fields – going uphill is more expensive than going downhill, which means that it is the direction of movement that matters, not the simple fact of moving through a particular field. However, in most cases we can consent to this flaw, and in exchange we get a simpler and faster path-finding algorithm. Below you can find definitions of a few structures used both by Flood Path-Finding and by the A* algorithm (described in the next sub-chapter), and the Flood Path-Finding algorithm itself in the form of pseudo-code. Definitions:
• node – contains information describing the corresponding field.
• cost(node1,node2) – returns the cost of traveling from node1 to node2.
• total_cost(node) – a structure – for example a list or a hash-table – containing the total cost of traveling from the starting point to the node.
• accessible(node) – returns TRUE if the node is accessible (i.e. if it is possible to travel through it) or FALSE otherwise.
• open_list – a list of nodes pending processing.
• closed_list – a list of already processed nodes.
• path – a path between the starting and the ending point, in the form of a list.
The algorithm: The algorithm receives the starting_node and the ending_node via parameters.
initialize all elements of the total_cost structure to
    an artificial value indicating that the total cost
    for this element was not computed
add the starting_node to the open_list
    // at this point it is the only node on the open_list
total_cost(starting_node) := 0
while true
begin
    if the open_list is empty then
        return error: the path cannot be found
    actual_node := the first node on the open_list
    remove the first node from the open_list
    if the actual_node = ending_node then
        break from the while loop
    actual_cost := total_cost(actual_node)
    for every node connected to the actual_node (known as processed_node)
    begin
        if accessible(processed_node) = true then
        begin
            c := actual_cost + cost(actual_node,processed_node)
            if (processed_node is on the closed_list)
               and (total_cost(processed_node) > c) then
                total_cost(processed_node) := c
            else if the processed_node is not on the closed_list then
            begin
                total_cost(processed_node) := c
                add the processed_node to the end of the open_list
            end if
        end if
    end for
    add the actual_node to the closed_list
end while
If the path-finding was successful the algorithm sets the entries of the total_cost structure to costs of traveling from the starting point to fields corresponding to the entries.
Now, to construct a list of nodes belonging to the path we need to process this structure – starting from the ending node, in each iteration we search for a node connected with the actual one such that the corresponding entry in the total_cost structure has the minimal value (but under the condition that the value was computed by the algorithm – this is why at the beginning of the above-mentioned algorithm we initialize the entries of the total_cost structure to an artificial value). This is the pseudo-code describing this operation:
actual_node := ending_node
add the actual_node to the path
while the actual_node is not the starting_node
begin
    minimal_node := nil
    minimal_cost := +infinity
    for every node connected to the actual_node (known as processed_node)
    begin
        if total_cost(processed_node) < minimal_cost then
        begin
            minimal_node := processed_node
            minimal_cost := total_cost(processed_node)
        end if
    end for
    actual_node := minimal_node
    add the actual_node to the path
end while
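As an illustration, the cost-computing step above can be condensed into a C++ sketch operating on a small grid with per-cell movement costs. The grid representation, the helper names, and the folding of the open/closed bookkeeping into a single "not yet computed" check are assumptions made for this example:

```cpp
#include <cassert>
#include <deque>
#include <utility>
#include <vector>

// Flood path-finding sketch on a W x H grid; a negative cost marks an
// inaccessible cell. The per-cell cost corresponds to the simplification
// described in the text (a constant cost per field instead of per edge).
const int W = 5, H = 5;
const double UNKNOWN = -1.0; // the artificial "not computed" value

typedef std::pair<int, int> Node; // (x, y)

std::vector<double> floodFill(const std::vector<double>& cellCost, Node start) {
    std::vector<double> total(W * H, UNKNOWN);
    std::deque<Node> open;
    open.push_back(start);
    total[start.second * W + start.first] = 0.0;
    const int dx[] = { 1, -1, 0, 0 }, dy[] = { 0, 0, 1, -1 };
    while (!open.empty()) {
        Node a = open.front();
        open.pop_front();
        int ai = a.second * W + a.first;
        for (int d = 0; d < 4; ++d) {
            int x = a.first + dx[d], y = a.second + dy[d];
            if (x < 0 || x >= W || y < 0 || y >= H) continue;
            int i = y * W + x;
            if (cellCost[i] < 0.0) continue; // inaccessible field
            double c = total[ai] + cellCost[i];
            if (total[i] == UNKNOWN) {       // first visit: record and queue
                total[i] = c;
                open.push_back(Node(x, y));
            } else if (total[i] > c) {       // already seen: relax the cost
                total[i] = c;
            }
        }
    }
    return total;
}
```

On a uniform grid of cost 1.0, the total cost from one corner to the other equals the number of moves, i.e. 8 on a 5 × 5 grid.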
1.2.2. The A* Algorithm
A* is a directed algorithm - it does not search for a path blindly (like the Flood Path-Finding algorithm) but instead tries to guess the best direction to explore, sometimes even backtracking to try alternatives. This is what makes the A* algorithm so flexible. A* explores the map by creating nodes corresponding to various positions. The nodes are used to record data about the progress of the search. In addition to the positional data, each node contains variables for fitness, goal, and heuristics values. Meanings of these terms are explained below.
• the goal value (g) is the cost to get from the starting node to the given one.
• the heuristic value (h) is the estimated cost – or distance – to get from the given node to the goal.
• the fitness value (f) is the sum of g and h. The f value represents our best guess for the cost of a path going through the given node. The lower the value, the better.
Contrary to the Flood Path-Finding algorithm, A* traverses its open list so as to always choose the best node to explore – that is, the node with the lowest f value. Because of this the open list is usually implemented as a priority queue where nodes are sorted according to their f values, so the best node is always on top. A priority queue allows for fast removal (from the top), but slower insertion. The other possibility is to use an unsorted list, where insertion is fast but removal is slow – in fact [AIGPW02] claims that the second way is better, since there are always four or eight times more insertion than removal operations. The pseudo-code for the A* algorithm looks as follows:
initialize all elements of the total_cost structure to
    an artificial value indicating that the total cost
    for this element was not computed
P := the starting_node
total_cost(P) := P.g := 0
P.h := compute_distance(starting_node,goal_node)
P.f := P.g + P.h
add P to the open_list
    // at this point P is the only node on the open_list
while true
begin
    if the open_list is empty then
        return error: the path cannot be found
    B := get_node_with_lowest_f(open_list)
    if B = goal_node then
        break from the while loop
    for every node connected to B (known as C)
    begin
        if accessible(C) = true then
        begin
            t := B.g + cost(B,C)
            if C is on the closed_list and total_cost(C) > t then
                total_cost(C) := C.g := t
            else if C is not on the closed_list then
            begin
                total_cost(C) := C.g := t
                C.h := compute_distance(C,goal_node)
                C.f := C.g + C.h
                add C to the open_list
            end if
        end if
    end for
    add B to the closed_list
end while
If the path-finding was successful the algorithm sets the entries of the total_cost structure to costs of traveling from the starting point to fields corresponding to the entries – however, contrary to the Flood Path-Finding Algorithm, A* starts its processing from the most promising nodes, passing over the majority or even all of the nodes which do not belong to the path, so it is very probable that it will find the path much more quickly than the Flood Path-Finding. The entries of the total_cost structure corresponding to the omitted nodes will preserve their artificial value which they were initialized to at the beginning of the algorithm. The second step, the construction of the path, is exactly the same as in the case of the Flood Path-Finding algorithm – we use the constructing-path code described in the previous subchapter.
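The sorted open list discussed earlier can be tried out with the standard library's priority queue. The node is reduced here to an f value and an identifier, which is an assumption made for the sketch:

```cpp
#include <cassert>
#include <queue>
#include <vector>

// A minimal A* open list: the node is reduced to its f value and an
// identifier. std::priority_queue is a max-heap by default, so the
// comparison is reversed to keep the node with the lowest f on top.
struct OpenNode {
    double f;
    int id;
};

struct CompareF {
    bool operator()(const OpenNode& a, const OpenNode& b) const {
        return a.f > b.f; // reversed: the smallest f gets the highest priority
    }
};

typedef std::priority_queue<OpenNode, std::vector<OpenNode>, CompareF> OpenList;

// Pops and returns the identifier of the best (lowest-f) node.
int popBest(OpenList& open) {
    int id = open.top().id;
    open.pop();
    return id;
}
```

Insertion into such a heap costs O(log n), while the unsorted-list alternative from [AIGPW02] trades O(1) insertion for O(n) removal.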
1.3. Collecting data about the world
Another important element of an application simulating the real scene is the method of collecting data about the surroundings, implemented in the agent. The way of collecting data is tightly connected to the method of representation of the scene.
One of the most popular implementations of data-collecting methods is a mechanism taking advantage of “waypoints”. The simulation stores objects of a special kind, associated with defined points in the scene and containing information about the surroundings of these points. These objects – waypoints – have no physical characteristics; in other words, they are
not visible inside the simulation and cannot interact with other objects, but they are used as sources of information. An agent may ask the simulation to find its closest waypoint and send the information stored by the waypoint to the agent. Among other data, a waypoint may store identifiers of the paths it belongs to. That is, with the help of a waypoint it is possible to collect information about where an agent can arrive if it follows a path leading through that waypoint. Thanks to this solution the path-finding algorithm (for example Flood Path-Finding or A*) does not have to process a large collection of nodes created for path-finding only; it can use waypoints – which already exist in the simulation and are loosely coupled (i.e. far fewer waypoints than nodes are used to span the same surface), so there is much less to process. The main limitation of the waypoint mechanism is the fact that waypoints store only “static” data about their surroundings, that is, data which does not change during the simulation. If, due to some kind of event (for example, another, big object blocking it), a path leading through one of the waypoints ceases to exist, the waypoint will still send information that it is possible to travel the path. (Of course, we can implement an algorithm which checks the validity of all waypoints at regular intervals of time, but it is a time-expensive process.) What is more, waypoints do not store information about mobile objects moving in their surroundings. Therefore we need to implement additional functionality enabling us to determine which of the remaining mobile objects are visible to our agent (a way of implementing such functionality is proposed in the following sub-chapter). To summarize, collecting data with the use of the waypoint mechanism looks as follows:
closest_waypoint := nil
closest_waypoint_distance := +infinity
for every waypoint on the scene (known as current_waypoint)
begin
    current_waypoint_distance := distance(agent, current_waypoint)
    if closest_waypoint_distance > current_waypoint_distance then
    begin
        closest_waypoint_distance := current_waypoint_distance
        closest_waypoint := current_waypoint
    end if
end for
get static data about the surroundings from the closest_waypoint
for every mobile_object on the scene
begin
    check if the mobile_object is visible
        // with the camera functionality, for example
    if it is visible then
        get data about the mobile_object
end for
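The first loop of the pseudo-code, selecting the closest waypoint, could look as follows in C++ (the Waypoint structure and the Euclidean distance measure are illustrative assumptions):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// An illustrative waypoint: a position plus (omitted here) its static data.
struct Waypoint {
    double x, y;
};

// A linear scan for the waypoint closest to the agent's position, as in
// the pseudo-code above; returns an index into the vector, or -1 if the
// scene holds no waypoints.
int closestWaypoint(const std::vector<Waypoint>& wps, double ax, double ay) {
    int best = -1;
    double bestDist = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < wps.size(); ++i) {
        double dx = wps[i].x - ax, dy = wps[i].y - ay;
        double d = std::sqrt(dx * dx + dy * dy);
        if (d < bestDist) {
            bestDist = d;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

For large scenes the linear scan could be replaced by a spatial index, but for the loosely coupled waypoint sets described above it is usually sufficient.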
Another way, more realistic from the point of view of the simulation of a real scene – but at the same time more sophisticated – is to implement an object used as a camera (or the “eyes”) of an agent. The camera object contains:
• data defining its location on the scene,
• the direction it is turned to,
• some configuration parameters – describing, for example, the quality of the picture, and
• an algorithm determining what part of the scene is visible to the camera.
The camera asks the simulation for the data of all objects in the visible area and returns it to its agent in the form defined by the designer. However, an implementation referring to the way a real camera works has some serious disadvantages. The model of a real camera should consist of the following characteristics: the horizontal angle of view, the vertical angle of view, and the quality of the picture. Information about the angles enables the algorithm to compute the area of the scene visible to the camera. Information about the quality of the picture is used to determine how faint the picture of an object may be while still being identifiable¹. The horizontal and vertical angles of view let the algorithm compute two normalized vectors being the opposite edges of a cone whose apex is placed in the point where the camera is located. To determine if a given object is visible, we compute a vector with its beginning in the location of the camera and its end in the location of the object. Then we normalize the vector and check whether it falls into the interval fixed by the cone.
vector_x := object_location_x - agent_location_x
vector_y := object_location_y - agent_location_y
vector_z := object_location_z - agent_location_z
vector_normalized := normalize(vector)

if vector_normalized_x <= cone_edge1_x OR vector_normalized_x >= cone_edge2_x
   OR vector_normalized_y <= cone_edge1_y OR vector_normalized_y >= cone_edge2_y
   OR vector_normalized_z <= cone_edge1_z OR vector_normalized_z >= cone_edge2_z then
    return error: the object is not in the visible area

If the vector happens to be inside the cone, the algorithm checks whether its length is smaller than the value determined by the quality of the picture. If it is not, the object is too far away to be identifiable.

length := compute_length(vector)
if length > compute_max_length(picture_quality) then
    return error: the picture of the object is too faint

¹ We assume that it is not necessary to identify an object judging from its shape, colour, or behaviour. If an object is visible with sufficient quality, and is not eclipsed by another object, it is identified immediately – that is, the simulation sends the agent information about the object, along with its identifier. The mechanism of identification of objects is not the subject of this work.
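The two tests above (the cone interval and the picture quality) can be combined into a single C++ routine. Note that the component-wise interval check mirrors the pseudo-code, which is itself only an approximation of a true cone test; the structure and function names are assumptions for this sketch:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 normalize(Vec3 v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    Vec3 r = { v.x / len, v.y / len, v.z / len };
    return r;
}

// Returns true if the object at 'obj' is visible from the camera at 'cam':
// within the viewing distance allowed by the picture quality (maxDistance)
// and inside the interval spanned by the two normalized cone edges.
bool isVisible(Vec3 cam, Vec3 obj, Vec3 edge1, Vec3 edge2, double maxDistance) {
    Vec3 v = { obj.x - cam.x, obj.y - cam.y, obj.z - cam.z };
    double length = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    if (length > maxDistance) return false; // the picture is too faint
    Vec3 n = normalize(v);
    // Component-wise interval check between the two cone edges,
    // mirroring the pseudo-code above.
    if (n.x <= edge1.x || n.x >= edge2.x) return false;
    if (n.y <= edge1.y || n.y >= edge2.y) return false;
    if (n.z <= edge1.z || n.z >= edge2.z) return false;
    return true;
}
```

A fully correct cone test would compare the angle between the view direction and the object vector against the half-angle of view, but the interval check above is cheaper and matches the simplification used in the text.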
If the quality of the picture is sufficient, we can move to the next step, that is, checking whether the object is eclipsed by another object. To do this the camera uses an LOS (Line Of Sight) algorithm. In its simplest form, the line-of-sight algorithm blindly performs a series of expensive ray-polygon intersections, trying to find out whether a given object "collides" with the line of sight. You can find a more sophisticated and optimized solution in [AIGPW03]. Usually, optimization methods are based on minimizing the number of intersections which need to be computed. Nevertheless, this solution is very expensive.
2. Other methods of implementing Artificial Intelligence as decision-making units for agents moving in simulations of a real scene.
2.1. Finite State Machines
Finite State Machines (FSM), also known as Finite State Automata (FSA), are models of the behaviour of an agent, with a limited number of defined conditions or modes, where mode transitions depend on data coming from the agent's sensors. Finite state machines consist of four main elements:
• states, which define behavior and may produce actions,
• state transitions, which are movements from one state to another,
• rules or conditions, which must be met to allow a state transition,
• input events, which are either externally or internally generated and may possibly trigger rules and lead to state transitions.
A finite state machine preserves its current state, which is the result of the last state transition. Received input events act as triggers which cause an evaluation of the rules that govern the transitions from the current state to other states. An FSM is typically used as a type of control system where knowledge is represented in the states and actions are constrained by rules. A finite state machine implemented according to these principles is deterministic; in other words, for the same input and the same current state it results in the same state transition (but, contrary to the characteristics of artificial neural networks described below, a finite state machine may be programmed to give different output for the same input data, since it remembers, and alters, its internal state).
2.1.1 Characteristics of FSM
• Relatively simple to design and implement – the design is very easy to represent on a graph, making it easy to transfer from the design phase to the coding phase.
• Predictable – given a set of inputs and a known current state, the state transition can be predicted, allowing for easy testing.
• Efficient, due to their design simplicity.
• The predictable nature of deterministic FSMs can be unwanted in some domains, such as computer games and artificial life – this disadvantage may be avoided by adding a factor of unpredictability, like an additional input coming from a random numbers generator.
• Not suited to all problem domains – FSMs should only be used when a system's behaviour can be decomposed into separate states with well-defined conditions for state transitions. This means that all states, transitions and conditions need to be known up front and be well defined.
• The conditions for state transitions are fixed – this disadvantage may be avoided by adding fuzzy logic functionality to the conditions.
Finite State Machines are a light, fast, and relatively easy to implement solution for the decision-making unit of an agent. Their very nature supports the process of designing the agent's behaviour and makes it easy to extend if there is such a need in the future. However, their use is limited to problems which can be defined clearly, with no "no-man's land" between possible answers. This condition may be relaxed a bit by using a fuzzy logic architecture (described below) to qualify situations which are not clear enough into one of the previously defined classes of situations – but it nevertheless makes Finite State Machines less universal than Artificial Neural Networks.
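The four elements listed above can be sketched in a few lines of C++. The states, input events and transition rules below are purely illustrative examples, not taken from the thesis implementation:

```cpp
#include <cassert>

// A minimal finite state machine for an agent. Hypothetical states and
// input events; real agents would derive these from their sensors.
enum State { IDLE, SEEK_ENERGY, FLEE };
enum Input { NOTHING, LOW_ENERGY, DANGER_SPOTTED, DANGER_GONE, ENERGY_FULL };

// The transition function: given the current state and an input event,
// return the next state. Combinations with no rule keep the current state.
State transition(State current, Input input) {
    switch (current) {
    case IDLE:
        if (input == LOW_ENERGY)     return SEEK_ENERGY;
        if (input == DANGER_SPOTTED) return FLEE;
        break;
    case SEEK_ENERGY:
        if (input == ENERGY_FULL)    return IDLE;
        if (input == DANGER_SPOTTED) return FLEE;  // danger overrides hunger
        break;
    case FLEE:
        if (input == DANGER_GONE)    return IDLE;
        break;
    }
    return current; // no rule fired - stay in the current state
}
```

Note how directly the graph representation of an FSM maps onto the code: each `case` is a node, each `if` an edge.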
2.2. Fuzzy Logic Decision-Making Architecture
Fuzzy logic is a part of mathematics dealing with concepts that cannot be defined as either true or false but rather as being true to some degree. It has been growing in use in science and engineering over the last decades. A Fuzzy Logic Architecture consists of a memory holding a set of characteristics – or "impressions", for lack of a better word – an agent "feels" for every object, or type of objects, it may interact with during the simulation. The values of these characteristics, along with the location, the distance from the given object, and any other information the designer decided to use, compose the input data for the decision-making unit. Decision making consists in qualifying these values into appropriate intervals (e.g., the distance may be "short", "medium" or "long") and subsequently in choosing one of the implemented behaviours from a hash-table whose arguments are identifiers of these intervals. To show how a fuzzy decision-making process works I will use a simple simulation scenario. An agent "A" moves through a scene and encounters a movable object "M". "A" spots "M" in its data collecting phase through its camera. "A" searches its memory and finds a data structure describing its impressions about "M". This structure contains the characteristic "Danger", scored on a range from 0.0 to 1.0, with a value of 0.7. Additionally, "A" estimates the distance between itself and "M" as 0.15. These values serve as the input data to our fuzzy system. Below, you can find tables with the defined possible input ranges.
Distance Input Range
  Short    0.00 - 0.45
  Medium   0.35 - 0.75
  Long     0.65 - 1.00
Danger Input Range
  Low      0.00 - 0.50
  Medium   0.30 - 0.70
  High     0.65 - 1.00
Together, these input ranges create fuzzy membership sets. The agent needs to select a behaviour from its available behaviour set (the table below). It can run away from the object, come closer to it, or behave as if the object's existence had no meaning.
Possible reactions
  Run away
  Stay neutral
  Come closer
To choose the best behaviour, the Fuzzy Logic Architecture checks the hash-table, which takes fuzzy membership set identifiers as parameters and returns the identifier of the behaviour associated with the product of these sets. The following table shows all possible answers.
Behaviour hash-table
  Distance \ Danger   Low           Medium        High
  Short               Stay neutral  Run away      Run away
  Medium              Come closer   Stay neutral  Run away
  Long                Come closer   Come closer   Stay neutral
In this example the distance is short and the danger is medium/high, so "A" decides to run away. If the distance were medium, the agent would have the possibility to choose between staying neutral and running away – in such cases the decision is made by choosing a random number and comparing it to a previously defined coefficient. A Fuzzy Logic Architecture may contain more than one hash-table, each connected to a different membership set. This way an agent is able to make a number of parallel decisions; for example, apart from running away it may decide to send a message about the danger to another agent (or it may decide not to send any message if the danger is not big enough, or if conditions occur which result in rejecting this possibility). The basis of the flexibility of Fuzzy Logic Architectures is the fact that such decisions are made in parallel; they are not the results of behaviour coming from only one source.
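The example above can be sketched in C++. The input ranges and the behaviour hash-table reproduce the tables above; the function names and the way the random tie-break is passed in (as a `coin` value) are assumptions of this sketch:

```cpp
#include <cassert>

// A sketch of the fuzzy decision step described above.
enum Behaviour { RUN_AWAY, STAY_NEUTRAL, COME_CLOSER };

struct Range { double lo, hi; };

// Distance: Short / Medium / Long; Danger: Low / Medium / High.
const Range DISTANCE[3] = { {0.00, 0.45}, {0.35, 0.75}, {0.65, 1.00} };
const Range DANGER[3]   = { {0.00, 0.50}, {0.30, 0.70}, {0.65, 1.00} };

// The behaviour hash-table: rows = distance, columns = danger.
const Behaviour TABLE[3][3] = {
    { STAY_NEUTRAL, RUN_AWAY,     RUN_AWAY     },  // short distance
    { COME_CLOSER,  STAY_NEUTRAL, RUN_AWAY     },  // medium distance
    { COME_CLOSER,  COME_CLOSER,  STAY_NEUTRAL },  // long distance
};

// Classify a crisp value into one of three intervals. In an overlap zone the
// value belongs to two sets; 'coin' (0.0-1.0, e.g. from a random numbers
// generator compared to a coefficient) decides which of the two is picked.
int classify(const Range sets[3], double value, double coin) {
    for (int i = 0; i < 3; ++i) {
        if (value >= sets[i].lo && value <= sets[i].hi) {
            bool overlap = (i < 2 && value >= sets[i + 1].lo);
            return (overlap && coin < 0.5) ? i + 1 : i;
        }
    }
    return 2; // out-of-range values are treated as the highest interval
}

Behaviour decide(double distance, double danger, double coin) {
    return TABLE[classify(DISTANCE, distance, coin)]
                [classify(DANGER,   danger,   coin)];
}
```

With the values from the scenario (distance 0.15, danger 0.7), the danger falls into the Medium/High overlap, but both table cells for a short distance say "Run away", so the random tie-break does not change the outcome.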
2.2.1 Characteristics of FLA
• Simple, easy to debug.
• In controversial situations it may give unpredictable answers, thus simulating the behaviour of a living organism.
• Due to the ability to use more than one hash-table, it allows the designer to easily divide the situation into parallel components.
• Not universal – fuzzy membership sets have to be defined separately for each application of a Fuzzy Logic Architecture.
• Hard to expand – creating a new interval in a range requires adding a new row/column to each hash-table which uses the range. For the same reason a big and sophisticated FLA can be quite memory expensive.
3. Specifics of implementation of an Artificial Neural Network as a decision-making unit
An Artificial Neural Network, implemented in compliance with the Object Oriented Programming paradigm, takes on the features of a black box. An agent passes data to it in the form of a vector of numbers, starts the processing and, subsequently, collects the answer, also in the form of a vector of numbers. The intermediate states which occur during the processing of an A.N.N. have no meaning to the agent or to the user. Moreover, they are almost completely incomprehensible. We are able to say what the input vector means because it is created on the basis of information passed to the network from the outside – the input vector is a reflection of the current state of the agent and its surroundings. Also, we are able to interpret the answer and extract information about the actions the agent should take. But we are not able to describe what the intermediate states mean – what the vector of signals of one of the interim iterations in the Hopfield model means, or what the neurons of the hidden layer in the perceptron-based network correspond to. The design of an Artificial Neural Network as a black box has one very important advantage – it is completely independent from the implementation of the agent and the real scene. The code responsible for converting data into a form understandable to the network is implemented in the agent, not in the A.N.N., as is the code responsible for converting the answer into a form understandable to the rest of the application. In contrast, the code of the A.N.N. itself may be used without any changes to solve very different problems – from hand-writing recognition, to forecasting of stock exchange indexes, to controlling a simulated, or a real, robot/vehicle.
The Artificial Neural Network learning algorithm is also independent from the environment it works in. As with using an already trained neural network, the burden of interpreting data lies in the implementation of the agent. The A.N.N. simply receives a set of input vectors along with associated output vectors. The training process takes place inside the black box.
3.1. Controlling an object
In my paper I focus on the last of the three examples of utilizing an Artificial Neural Network mentioned above: controlling a vehicle (in the case of my thesis, a simulated one). With this assumption we can say something more about the methods of interpreting the collected data and the answers. Apart from the decision-making unit, the most important elements of the agent implementation are:
• sensors – a mechanism of collecting data from the environment,
• effectors – a mechanism of acting upon the environment,
• an accumulator/engine – collecting and distributing the energy needed to perform actions,
• "internal sensors" – a mechanism of collecting data about the internal state of the agent.
The decision-making unit receives information about the situation of the agent; in other words, about the current state of the visible part of the simulated scene (data collected using the external sensors, for example a camera) and about the internal state of the agent (for example, the amount of energy stored in the accumulator). This data is filtered and converted into an input vector which is subsequently passed to the Neural Network. The network computes the answer, which is later converted into the identifier of a reaction the agent should make. The reaction identifier is passed to the agent and results in:
• collecting from the accumulator the energy needed to perform an action,
• performing the action described by the identifier.
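The cycle described above can be sketched as follows. The class and method names are illustrative only; in the thesis implementation this work is split between the agent module's classes (Agent, Analyzer, Interpreter), and the network is, of course, a real one, not a stub:

```cpp
#include <cassert>
#include <vector>

// A stand-in for the real A.N.N.: here it just returns a zero vector.
struct NeuralNetwork {
    std::vector<float> process(const std::vector<float>& input) {
        return std::vector<float>(input.size(), 0.0f);
    }
};

// The sense -> decide -> act cycle of an agent (hypothetical layout).
struct Agent {
    NeuralNetwork ann;
    float energy = 10.0f;                   // the accumulator

    std::vector<float> sense() {            // external + internal sensors
        return std::vector<float>{0.0f, 0.0f};
    }
    int interpret(const std::vector<float>& answer) {
        return 0;                           // answer vector -> reaction id
    }
    bool act(int reaction, float cost) {    // effectors + accumulator
        if (energy < cost) return false;    // not enough energy stored
        energy -= cost;                     // collect the energy, then act
        return true;
    }
    bool update() {
        std::vector<float> input  = sense();
        std::vector<float> answer = ann.process(input);
        return act(interpret(answer), 1.0f);
    }
};
```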
3.2. The external memory
It would not be the best solution to give the Neural Network an input vector describing the situation of the agent directly. First, if the agent is updated (see the "Definitions" chapter, actualization of an agent) often enough, the situation rarely changes between two subsequent turns. And if the situation does not change, the input vector is the same, so an Artificial Neural Network (assuming that its implementation does not use a random numbers generator to alter the signals) will give the same answer as in the previous turn. Therefore it is reasonable to implement functionality for comparing the current situation to the most recently remembered one – if the situations are identical, there is no need to use the Neural Network. Second, usually there are only a few objects in the surroundings of the agent that are important to its decision-making process. In such a situation, passing all data about the surroundings to the Neural Network, and forcing it to filter out the important information by itself, is an enormous waste of resources – not to mention that the process of training a Neural Network to work under such conditions would be very hard, if not impossible. The solution to this problem is to filter out unimportant data with the use of an external memory – external in relation to the Artificial Neural Network. The external memory stores a copy of information about the scene. The agent collects data from the surroundings and checks if the corresponding entries of the external memory need to be updated. If yes, the update
is performed and the agent sets a flag indicating that the situation has changed and that the Artificial Neural Network needs to be launched. Then the algorithm extracts from the external memory information about the objects important to the process of decision-making – and based on this data the input vector is created. To implement an external memory we need a two- or three-dimensional array whose entries are defined as simple data structures, with characteristics depending on the method of representation of the scene. The coordinates of the entries correspond to the coordinates of fields of the simulated scene (the scene is artificially divided into a grid – each field in the grid is described by one entry). If the scene is small, the external memory array may span the whole world. If it is bigger, the decision-making unit may internally divide the representation of the scene into areas – if the agent leaves the area, the contents of the memory are erased and the coordinates are redefined.
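The two ideas above – a grid mirroring the scene and a "changed" flag deciding whether the A.N.N. has to be launched – can be sketched as follows. The fields of the entry structure are illustrative; real entries would depend on the method of representation of the scene:

```cpp
#include <cassert>
#include <vector>

// One field of the external memory grid (hypothetical contents).
struct MemoryEntry {
    int material  = -1;   // what the agent last saw in this field
    int object_id = -1;   // id of the object occupying it, -1 if none
    bool operator==(const MemoryEntry& o) const {
        return material == o.material && object_id == o.object_id;
    }
};

// The external memory: a 2D grid of entries mirroring (a part of) the scene.
class ExternalMemory {
    int width, height;
    std::vector<MemoryEntry> grid;
    bool changed = false;
public:
    ExternalMemory(int w, int h) : width(w), height(h), grid(w * h) {}

    // Called for every field the sensors currently see: update the copy
    // only if it differs from what is remembered, and raise the flag.
    void observe(int x, int y, const MemoryEntry& seen) {
        MemoryEntry& remembered = grid[y * width + x];
        if (!(remembered == seen)) {
            remembered = seen;
            changed = true;
        }
    }

    // True if anything changed since the last call; resets the flag, so the
    // neural network is launched at most once per change.
    bool situationChanged() {
        bool result = changed;
        changed = false;
        return result;
    }
};
```

If `situationChanged()` returns false, the agent can skip the (comparatively expensive) neural network pass for this turn entirely.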
3.3. Teaching an Artificial Neural Network
To train an Artificial Neural Network to control an object in the simulation of a real scene, we need to prepare a training set consisting of pairs of data (situation, reaction). After preparing it, in the form of a text file for example, and uploading it into an agent, the data is converted into a set of input patterns (vectors) along with their corresponding output vectors. The vectors are passed to the Artificial Neural Network and the process of training starts. From the point of view of a decision-making unit, an Artificial Neural Network is a hetero-associative memory – that is, for a given input it tries to find the output vector corresponding to the input pattern (remembered during the training process) which is most similar to the given input. The remembered input pattern and the remembered output vector do not have to be identical. The method of training an Artificial Neural Network depends on the A.N.N. model used.
3.4. Proposed Artificial Neural Network models
3.4.1. The Hopfield model
The Hopfield model was proposed by John Hopfield of the California Institute of Technology during the early 1980s. The publication of his work in 1982 significantly contributed to the renewed interest in research in artificial neural networks. He showed how an ensemble of simple processing units can have fairly complex collective computational abilities and behaviour.
The Hopfield network computes its output recursively until the system becomes stable. Unlike a multi-layered perceptron network, the Hopfield model consists of a single layer of processing elements where each unit is connected to every other unit in the network. The neurons in the Hopfield model act as both input and output. A pattern is stored by computing the pattern
weight matrix as follows:

W_k = X_k * X_k^T

where:
• W_k – the k-th pattern weight matrix
• X_k – the k-th input vector (a column vector)

The whole weight matrix is the sum of the pattern weight matrices:

W = W_1 + W_2 + ... + W_p

where:
• p – the number of patterns
A detailed description of the training and decision-making process can be found in [NET03]. The object-oriented implementation of both methods is explained in the practical part of the thesis. Please note that the Hopfield model works not as a hetero-associative but as an auto-associative memory; in other words, it recreates a pattern based on a similar input vector, but it does not remember its counterpart, that is, the response. To achieve this functionality we need to implement a "dictionary" of a sort – a set of pairs (situation, reaction) used after the decoding process to identify the response associated with the decoded pattern. In its simplest implementation the dictionary is an array, resulting in linear time complexity O(p), where p is the number of patterns, but with little effort it can be programmed as a binary tree, where subsequent levels of nodes correspond to subsequent values of the pattern vector, or as a hash-table, resulting in constant time complexity (dependent only on the length of the vectors).
During decoding, there are several schemes that can be used to update the outputs of the units. The updating schemes are synchronous, asynchronous, or a combination of the two.
• Using the synchronous updating scheme, the outputs of the units are updated as a group prior to feeding the output back to the network.
• Using the asynchronous updating scheme, the outputs of the units are updated in some order (e.g. random or sequential) and the outputs are then fed back to the network after each unit update.
• Using the hybrid synchronous-asynchronous updating scheme, subgroups of units are updated synchronously, while units in each subgroup are updated asynchronously.
The choice of the updating scheme is up to the designer. From the point of view of implementation, one more feature of the Hopfield model is important: the maximum number of patterns that can be stored in a Hopfield model of N neurons before the error in the retrieved pattern becomes severe is around 0.15N. (It can be increased by various methods, including deterministic chaos and stochastic models.)
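The storage rule given above and the synchronous updating scheme can be sketched in C++ for bipolar (+1/-1) patterns. The class layout is my own, not the thesis implementation; the diagonal of the weight matrix is kept at zero, as is usual for this model:

```cpp
#include <cassert>
#include <vector>

// A minimal Hopfield network for bipolar (+1/-1) patterns.
class Hopfield {
    int n;
    std::vector<std::vector<int>> W; // the weight matrix
public:
    explicit Hopfield(int neurons)
        : n(neurons), W(neurons, std::vector<int>(neurons, 0)) {}

    // Storage: W += X * X^T for one pattern (diagonal kept at zero).
    void store(const std::vector<int>& x) {
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                if (i != j) W[i][j] += x[i] * x[j];
    }

    // Decoding with the synchronous updating scheme: all units are updated
    // as a group from the previous state, iterated until the state is stable.
    std::vector<int> recall(std::vector<int> s, int max_iter = 20) const {
        for (int it = 0; it < max_iter; ++it) {
            std::vector<int> next(n);
            for (int i = 0; i < n; ++i) {
                int sum = 0;
                for (int j = 0; j < n; ++j) sum += W[i][j] * s[j];
                next[i] = (sum >= 0) ? 1 : -1;
            }
            if (next == s) break;   // stable state reached
            s = next;
        }
        return s;
    }
};
```

With a single stored pattern of six neurons, an input with one flipped bit converges back to the stored pattern in one iteration; this is exactly the "recreating a pattern based on a similar input vector" behaviour described above.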
3.4.2. The Multi-Layered Perceptron model
The Multi-Layered Perceptron model (MLP) is a natural extension of the single-layer perceptrons that were very popular in the 1960s. The MLP model is able to overcome the limitations of its single-layer predecessor. This, plus the availability of several learning algorithms for finding suitable weights, made multi-layered perceptrons widely popular – with applications in finance, chemistry, plant control, autonomous vehicle steering, and various other function approximation and general pattern recognition problems. The most popular way of training the MLP network is called back-propagation. Back-propagation networks, contrary to the Hopfield model, have distinct input, output, and hidden layers. The neurons function basically like perceptrons, except that the transition (output) rule and the weight update (learning) mechanism are more complex.
The figure below presents the architecture of back-propagation networks. There may be any number of hidden layers, and any number of hidden units in any given hidden layer. However, usually only one hidden layer is used, since it was shown that any function approximated by an MLP model with more than one hidden layer can also be approximated by an MLP network with only one hidden layer.
A detailed description of the training (using the back-propagation method) and decision-making process can be found in [NET04]. The object-oriented implementation of both methods is explained in the practical part of the thesis. Unlike the Hopfield model, the Multi-Layered Perceptron is not a recursive network; in other words, to compute the result each neuron has to be activated only once. Additionally, the MLP model is a hetero-associative memory, so there is no need for artificial structures like the dictionary used to change the Hopfield model from an auto-associative into a hetero-associative network. In this situation only one thing still needs to be discussed: the number of neurons of an MLP model. Unfortunately, there are no concrete indications as to the number of hidden neurons needed for any given function to be approximated. As such, the most common approaches are either:
• start with one hidden neuron and proceed to add one hidden unit at a time, each time fully training the network, until the network is performing within the maximum tolerable error;
• or start with a sufficiently large network (this requires some estimation and heuristics) and proceed by removing one hidden neuron at a time until an adequate network is obtained.
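The decision-making (forward) pass of a one-hidden-layer MLP can be sketched as follows. The logistic sigmoid transition rule and the weight layout are common choices, assumed for this sketch rather than taken from the thesis code; the weights would normally come from back-propagation training:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// The logistic sigmoid, a common transition (output) rule for MLP neurons.
static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One layer: out[i] = sigmoid( bias[i] + sum_j w[i][j] * in[j] ).
std::vector<double> layer(const std::vector<std::vector<double>>& w,
                          const std::vector<double>& bias,
                          const std::vector<double>& in) {
    std::vector<double> out(w.size());
    for (std::size_t i = 0; i < w.size(); ++i) {
        double sum = bias[i];
        for (std::size_t j = 0; j < in.size(); ++j) sum += w[i][j] * in[j];
        out[i] = sigmoid(sum);
    }
    return out;
}

// Unlike the Hopfield model, each neuron is activated exactly once:
// input -> hidden layer -> output layer, with no recursion.
std::vector<double> forward(const std::vector<std::vector<double>>& w_hidden,
                            const std::vector<double>& b_hidden,
                            const std::vector<std::vector<double>>& w_out,
                            const std::vector<double>& b_out,
                            const std::vector<double>& input) {
    return layer(w_out, b_out, layer(w_hidden, b_hidden, input));
}
```

A classic illustration of why the hidden layer matters: with hand-picked weights this two-hidden-neuron network computes XOR, a function a single-layer perceptron cannot represent.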
The practice: Implementation of an example simulation
The practical part of the thesis consists mainly of a detailed description of an example implementation of a simulation of a real scene. The description shows how to design and implement such a simulation and how to design and implement an artificial neural network which controls an object performing simple tasks in the simulation. The design uses the object-oriented paradigm, with the result that the artificial neural network works as a black box: its computations are hidden from the rest of the program. This way the network becomes a very flexible and universal tool which can be used to solve a great variety of problems. The practical part of the thesis is divided into the following chapters:
• The simulation module
Describes the classes which form the very core of the program: World, Cell, Material and MaterialArray. Every action which takes place in the simulation has to, at some point, access information defined in these classes.
• Objects
Describes the classes defining "physical" objects (physical in the sense of the simulation) which move, interact with the scene and with each other, or simply are present in the simulation.
• The agent module
Consists of the description of the Agent class and the classes which define its parts. The agent is an abstract layer between the "physical" functionality of the simulation and the artificial neural network.
• The Artificial Neural Network implementation
Describes in detail how to design and implement two of the many models of artificial neural networks: the Hopfield model and the Multi-Layered Perceptron, and discusses their similarities, differences, abilities and limitations.
Each chapter starts with a UML diagram describing how the classes of the given module interact with each other and then moves to a brief explanation of the main functionality of each class, or only the most important of them. It is important to note that the
program, which presents an example implementation of this design, does not follow this description in every detail: some classes of the program encapsulate the functionality in a slightly different way than presented below, and the data types used in the diagrams and in the description are not necessarily the ones used in the implementation of the program. The program usually uses aliases of data types. For example, in places where the paper describes the integer number type as 'int', the implementation may use 'int', 'DWORD' (an alias for 'unsigned long') or 'wsize' (an alias for 'unsigned int'). Similarly, floating-point numbers are described in the paper as 'float', whereas the implementation uses both 'float' and 'double'. Also, the following description of the program does not discuss low-level functionality, such as the implementation of the priority queues used to sort data in the agent module, or the Singleton class used as a superclass of two main classes of the program.
1. The simulation module
1.1. The scene
The simulation's main class is called simply "World". The world consists of a three-dimensional array of cells. Though the array is 3D, it can be said that the simulation is in fact 2.5D: the scene is three-dimensional, but all interesting events take place on its surface.
As mentioned in the theoretical part, many algorithms, for example path-finding and line-of-sight algorithms, divide the scene into squares, or cubes, and process them as if the characteristics of all details inside one square/cube were unified. In fact, the process of dividing the scene in this way is so common that it was decided that the program would implement the scene as already divided into cubes. This solution makes it possible to greatly simplify two very important algorithms used in the program: the A* path-finding and the calculation of the movement of objects (i.e. the physics of the scene). As a side-effect, the solution makes it easy to display the map of the scene in the console window, used as the main user interface of the program.
API
Main characteristics:
• The cells array
A 3D array of cells – each cell describes the characteristics of its contents. The contents can be air, ground, or an object.
• The objects array
An array of objects (i.e. instances of the class Object) currently in the simulation.
• The agents array
An array of agents currently in the simulation.
• Artificial objects: the surface and the outside
The scene defines two artificial objects (so-called sentinels) to simplify some of the algorithms. As their names suggest, the surface object is used when an algorithm, while searching for an object in a given cell, finds the surface of the scene, and the outside object is used when the algorithm wants to search for an object outside the scene.
• The gravity coefficient
A constant describing the gravity in the scene. It is used to compute the force pushing an object down the slope of the surface.
• ANN-iterations-per-tick
A constant describing how long an Artificial Neural Network should operate in one
turn before giving control back to the agent. If it is set to 0, the neural network is free to work as long as it needs to compute its results. For a more detailed description see "The suspension mechanism" in "The Artificial Neural Network implementation" chapter.
Main methods:
• int surfaceat(int x, int y)
The scene is oriented in such a way that the x and y coordinates describe the horizontal directions, whereas the z coordinate describes the altitude. The 'surfaceat' method takes as parameters the horizontal position (x,y) in the scene and returns the altitude of the surface of the scene at the given position.
• void slidingforce(int position,float weight,float force)
The method computes and returns the force which pushes an object of the given weight down the slope of the surface at the given position.
• float frictioneffect(int position,float weight)
The method computes and returns the coefficient of friction of the surface at the given position, based on the slope at the given position and the weight of the object. The coefficient is later used to compute the reduction of the velocity vector of the (mobile) object.
• float energycost(int position,float direction,float weight)
The method estimates how much energy is used to move to the given position from the given direction for an object of the given weight. The result is then used by the path-finding algorithm to compute the total cost of moving from the starting position to the given one.
• void tick(void)
The most important – yet very simple – method of the simulation. The 'tick' method simply calls the 'update' method for every object in the objects array and, after that, the 'update' method for every agent in the agents array.
Pseudo-code describing the 'void World::tick(void)' method:
for every object in the objects array
    object.update()
for every agent in the agents array
    if agent.canbeexcludedfromupdate == false then
        agent.update(ann-iterations-per-tick)
For the explanation of the 'canbeexcludedfromupdate' property please go to “The Agent module” chapter.
1.2. A cell
API
Main characteristics:
• The default material
The default material, used when no object occupies the cell.
• The actual material
If no object occupies the cell, this characteristic is the same as the default material. Otherwise it is the material of the object.
• A reference to an object
A reference to an object, or a part of an object, spanning the cell. If the cell is not claimed by any object, the reference is set to NULL.
One can think about the relation between a cell, a material, and an object as about a similar relation in the real world: a cell is an abstract cube in space. The space can be temporarily occupied by an object – a wall, a chair, or a human being. The material of the cell is either the material of the object or a default material, for example air or
soil, used when no object occupies the cell.
• The slope of the surface
If the cell contains the surface of the scene, the entries of this two-dimensional vector describe its slope. Otherwise it is undefined. The information is used to compute the force which pushes objects down the slope due to the gravity of the scene.
1.3. A material
API
Main characteristics:
1. The friction
2. The density
3. The identifier
4. The colour
1.4. The array of materials
API
Main characteristics:
1. The array of materials
The physics of the world, as well as the user interface, often needs information about the material being a part of an object or of the surface of the scene in a given place. Instead of searching for the given material in the scene's cells array, it is much faster to identify the material in the array of materials and to store a reference to it in an artificial structure (for example, the SurfaceInfo class described in the Agent module).
2. Objects
2.1. An object
This is the common superclass for all objects populating the simulation. It provides the basic
functionality to:
• describe and identify an immobile object in the simulation
• specify its position
• a cell may hold only one object at a time
• an object may span more than one cube
• the exact position of an object is specified by the size of its shape and its offset – the entry in the object's shape array where its "centre of weight" is located
• an object is immovable; only subclasses of the Object class provide functionality for movement
API
Main characteristics:
• The size
A 3D vector of positive integers holding the size of the shape array. A programmer accesses the shape array via the OBJMAT(x,y,z) macro, which returns a pointer to the material of the specified entry, or NULL if the entry is empty – in other words, it belongs to the shape array but describes a field which is outside the shape, for example a hole. The shape does not have to be rectangular, so the shape array may hold empty spaces.
• The centre of mass
A 3D vector of positive integers holding the position of the centre of mass of the object.
• The offset
A 3D vector of positive integers holding the offset. To calculate which cells of the scene are taken by the object, we need to know the size of the object and the position of the cell at which the object's shape array starts. To calculate it we use the centre of mass and the offset. The starting position of the shape array is:

start = centre of mass - offset
Please note that the last cell taken by the object is:

end = start + size - (1,1,1)
If, for example, the size is (5,4,3) (5 cells in x, 4 in y, 3 in z), the centre of the object is at [10,10,10] and the offset is [2,2,2], we can calculate that the object spans from [8,8,8] (the first cell taken by the object) to [12,11,10] – although some of the entries of the shape array may be empty.
• The mass
A floating point number describing the mass of the object. Every entry of the shape array is either NULL or holds a pointer to a material. Every material has its density – the mass of one cell of this material. The mass of an object is the sum of the masses of the cells the object spans.
• The shape array
A 3D array of pointers to materials. The array describes a box containing the object. Its entries may be either NULL or may hold pointers to materials. If the entry is NULL, it means the cell it refers to is not a part of the object.
Main methods:
• byte load(const char * filename)
Loads up an object - its shape, position and other characteristics - from a file. Returns 1 if it succeeds, 0 otherwise.
• void update(void)
Updates the object's characteristics. This is a virtual function overridden by every subclass of the Object class. When the simulation calls the 'update' method of an object in its objects array, the program finds and executes the 'update' method version corresponding to the real type of the object. For example, for an immobile object the 'update' method does nothing, whereas for a mobile object it checks what forces act upon the object and updates its position and velocity vector, and for a vehicle the 'update' method uses the vehicle's engine to create the force which moves the vehicle in the desired direction.
• byte init(wsize x,wsize y)
Places the object onto the surface of the world; in other words, it places the centre of mass of the object at the (x,y,z) cell, where z is the altitude of the surface of the world at the (x,y) horizontal coordinates.
• byte teleport(wsize x,wsize y)
A primitive method, created for testing purposes. It moves the object “out of” the world, changes its position, and moves it back “into” the world. Faster than destroying the old object and creating its exact replica in another place.
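The position arithmetic from the worked example above (an object centred at [10,10,10] with offset [2,2,2] and size (5,4,3) spans the cells from [8,8,8] to [12,11,10]) can be sketched as follows; the `Vec3` structure and function names are mine, not the thesis API:

```cpp
#include <cassert>

struct Vec3 { int x, y, z; };

// The first cell taken by the object: centre of mass minus offset.
Vec3 startOfShape(const Vec3& centre, const Vec3& offset) {
    return Vec3{ centre.x - offset.x, centre.y - offset.y, centre.z - offset.z };
}

// The last cell follows from the size: start + size - (1,1,1).
Vec3 endOfShape(const Vec3& centre, const Vec3& offset, const Vec3& size) {
    Vec3 s = startOfShape(centre, offset);
    return Vec3{ s.x + size.x - 1, s.y + size.y - 1, s.z + size.z - 1 };
}
```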
2.2. A mobile object
This is the common superclass for all objects that can be moved by an external or internal (i.e. engine) force. It provides functionality for:
• dealing with the physics of the world, including the gravity and the friction of the surface, forces applied to the object, and collisions,
• enabling the programmer to "push" the object, thus providing the foundation for the vehicle's functionality.
Additional limitations: mobile objects are always of size (1,1,1) (one cell long, wide, and high).
API
Main characteristics:
• The velocity
A 3D vector of floating point numbers describing the velocity of the object. The numbers range from -1.0 to 1.0 and denote what part of a cell in the given direction the object can move by per turn. (The vector can take negative numbers, so it tells us not only about the speed but also about the direction the object is moving in.) Note that because we also know the mass of the object, this is equivalent to knowing the momentum of the object.
• The shift
A 3D vector of floating point numbers telling us about the exact position of a mobile object. An immobile object always has the shift = [0,0,0], i.e. its centre of mass is always in the centre of a cell, but a moving one may be "caught" while it moves from one cell to another. For example, if the position of the mobile object is (1,1,1) and its shift is [0.5,0,0], the object is exactly halfway between (1,1,1) and (2,1,1). When the shift increases to 1.0, or decreases to -1.0, the position of the object changes.
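The interplay between the shift and the cell position described above can be sketched as follows. This is only an illustration, not the program's actual code; the structure and method names are assumptions:

```cpp
#include <cmath>

// Hypothetical sketch of how a per-axis shift could carry over into the
// integer cell position of a mobile object; the real MobileObject code
// may differ in details.
struct MobilePos {
    int   position[3]; // cell coordinates
    float shift[3];    // fraction of a cell, in (-1.0, 1.0)

    void move(const float velocity[3]) {
        for (int d = 0; d < 3; ++d) {
            shift[d] += velocity[d]; // velocity is in cells per turn
            // when |shift| reaches 1.0 the object enters the next cell
            while (shift[d] >= 1.0f)  { shift[d] -= 1.0f; ++position[d]; }
            while (shift[d] <= -1.0f) { shift[d] += 1.0f; --position[d]; }
        }
    }
};
```

With a shift of 0.5 and a velocity of 0.6 along the x axis, one call to move leaves the object at position x+1 with a shift of about 0.1, matching the carry-over behaviour described in the text.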
• void update(void)
Accesses the physics of the simulation to calculate the forces working upon the object, that is, the friction of the terrain, the slope of the terrain, and the gravitational force. Subsequently, it calculates the new velocity of the object and uses it to change its shift and/or its position.
• void applyforce(vector3D & force)
Enables the programmer to “push” the object. The applied force is used to calculate the new velocity of the object.
• void collision(Object * with_object)
Calculates a collision between two mobile objects (calculation of a mobile object hitting an immobile one is handled separately due to optimizations) and updates their velocities.
The collision's algorithm
The algorithm solving a collision between two mobile objects is a very simplified version of the algorithm shown in the theoretical part. Both objects are assumed to be of the same size and shape and the collision is assumed to be perfectly elastic.

k := object1.mass / object2.mass // mass coefficient
v1 := object1.velocity // old values, saved before the update
v2 := object2.velocity
for all dimensions (known as d) begin
    object1.velocity[d] := - v1[d] * (1-k)/(1+k) + v2[d] * 2/(1+k)
    object2.velocity[d] := v2[d] * (1-k)/(1+k) + v1[d] * 2*k/(1+k)
end
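The same per-axis exchange can be written in C++. This is a sketch under the stated simplifications (equal-sized objects, perfectly elastic collision), with names of my own choosing; note that the old velocities are saved first, so the second formula does not read an already-updated value:

```cpp
// Sketch of a 1D-per-axis elastic collision between two equal-sized
// objects; k is the mass ratio m1/m2. Illustrative, not the program's
// exact code.
struct Body { float mass; float velocity[3]; };

void collide(Body &a, Body &b) {
    float k = a.mass / b.mass; // mass coefficient
    for (int d = 0; d < 3; ++d) {
        float v1 = a.velocity[d], v2 = b.velocity[d]; // save old values
        a.velocity[d] = (v1 * (k - 1.0f) + 2.0f * v2)     / (k + 1.0f);
        b.velocity[d] = (v2 * (1.0f - k) + 2.0f * k * v1) / (k + 1.0f);
    }
}
```

For equal masses (k = 1) the two objects simply swap velocities, the textbook result for a head-on elastic collision.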
2.3. An energy source
A small subclass of MobileObject, used – as the name suggests – as an energy source for vehicles. The energy accumulated in the source is assumed to be infinite.
API
Main methods:
• float getenergy(float wanted)
The method simply returns the wanted amount of energy (technically, the number passed to the method as the parameter 'wanted').
2.4. A data source
A small subclass of MobileObject, used as a data source for vehicles controlled by agents.
API
Main characteristics:
• The data
A floating-point number describing the data held in the data source. The number is from -1.0 to 1.0.
Main methods:
• float getdata(void)
Used to declare that a vehicle accesses the data contained in the data source. Subsequently, the method returns the number it contains.
2.5. A vehicle
A subclass of MobileObject, extended by functionality enabling an object of this class to move by itself, that is, to use its engine to generate force pushing it in a given direction. The usual way of using objects of the Vehicle class is to put them under the control of agents, but a vehicle can also operate under the control of a human user.
API
Main characteristics:
• The engine
The vehicle can draw energy from an energy source and store it in the engine (which also works as an accumulator) to later use it to create force that "pushes" the vehicle
in a desired direction. To draw a desired amount of energy from the engine, the vehicle uses its method 'void Engine::setpower(float coefficient)', where the coefficient is a real number from 0 to 1 describing how much of the engine's power (a property set during the initialization) should be used to produce the energy. Then the vehicle calls 'float Engine::getenergy(void)' to draw the energy. If the engine runs out of energy, the vehicle cannot move anymore.
• The camera
The functionality of a camera enables an agent to retrieve data about the surroundings of the vehicle it is controlling. The vehicle itself has no use for a camera.
• The locked velocity
Used by the "autopilot": the locked velocity describes the velocity and the direction the vehicle should move in.
• The maximal velocity
Defines the maximal velocity of the vehicle. If the energy retrieved from the accumulator would raise the velocity of the vehicle above this limit, the velocity is lowered to the maximal velocity and the rest of the energy is lost.
• void update(void)
The method extends the 'update' method defined in the MobileObject class with accessing the engine of the vehicle (namely, the method 'float Engine::getenergy(void)'). With this call the vehicle retrieves an amount of energy (methods of the Engine class enable its user - an agent or the user of the program - to set how much energy should be retrieved in one turn) and uses it to "push" the vehicle in the direction it is turned to.
• void autopilot(float direction, float velocity)
The method helps to control the vehicle by automating the process of maintaining a constant velocity and direction of movement. In each turn the physics of the world alters the movement of a vehicle – the friction of the terrain slows it down and the slope and the gravitational force change its velocity and direction of movement. The autopilot stores the parameters it is given (that is, the desired direction of movement and the desired velocity) and, in each turn, changes the vector of force created by the vehicle so that the resulting velocity and direction are as close as possible to the desired ones.
• SurfaceInfo * takeashot(void)
This method works as a facade for the camera. It simply calls 'SurfaceInfo * Camera::takeashot(void)' and returns the resulting array.
• float getdata(void)
When the vehicle uses this method, it symbolizes the act of accessing data from a data source (or from data sources) in one (or some) of the cells surrounding the position of the vehicle. The method calls the 'float DataSource::getdata(void)' method and returns the data of the data source it has accessed.
• float getenergy(void)
Using this method the vehicle draws energy from an energy source (or energy sources) in one (or some) of the cells surrounding its own position (the first energy source found is used). The method calls the 'float EnergySource::getenergy(float wanted)' method, where wanted := capacity – accumulated.
Pseudo-code for the 'void Vehicle::update(void)' method.
access the MobileObject functionality to compute the changes \
    in the velocity vector made due to the gravitational force
if the vehicle is near an energy source then
    draw energy from the energy source to the engine
if the vehicle is near a data source then
    draw data from the data source
if the autopilot is on then begin
    actual_momentum := velocity_vector * mass
    desired_momentum := autopilot_v_vector * mass
    force_vector_needed := desired_momentum – actual_momentum
    energy_needed := 0.0
    for all dimensions do
        energy_needed := energy_needed + force_vector_needed[dimension]
    if engine.max_power > energy_needed then
        engine.setpower( energy_needed / engine.max_power )
    else
        engine.setpower( 1.0 )
    energy := engine.getenergy()
    if energy < energy_needed then
        for every dimension (known as d) do
            force_vector[d] := force_vector_needed[d] * energy / energy_needed
    else
        for every dimension (known as d) do
            force_vector[d] := force_vector_needed[d]
    applyforce(force_vector) // this results in a change of the velocity vector
end if
r := length_of(velocity_vector)
if r > max_velocity then
    for every dimension (known as d) do
        velocity[d] := velocity[d] * (max_velocity/r)
access the MobileObject functionality to compute the changes \
    in the velocity vector made due to the friction of the surface
2.6. A camera
A camera, although technically a part of a vehicle, is useful only if the vehicle is controlled by an agent – that is, the human user or an object of the class Agent. The implementation of the camera in the program is greatly simplified compared to the theoretical model presented in the previous part of the thesis. The program uses neither a cone of sight, as described before, nor the functionality of one of the Line Of Sight algorithms. The camera contains only two characteristics: the resolution and the quality, which together describe a rectangle of the surface visible to the camera. We assume that all cells in the rectangle are visible – a Line Of Sight algorithm is not used.
API
Main characteristics:
• The resolution
The width of the rectangle.
• The quality of the image
The length of the rectangle.
Main methods:
• SurfaceInfo * takeashot(void)
The method makes a snapshot, that is, it returns information about the cells in the visible rectangle in the form of an array of SurfaceInfo objects. SurfaceInfo is a simple structure holding the position of the cell it describes, the material and the slope of the surface in this cell – or, if the cell contains an object (an entity of the Object class), the identifier of this object – and the number of the turn the snapshot was taken.
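The structure described above could be laid out roughly as follows. This is a hypothetical sketch: the field names (beyond those mentioned in the text) and the use of -1 for "no object" are my assumptions, not the program's actual declaration:

```cpp
// Hypothetical layout of the SurfaceInfo structure described in the text;
// exact field names and types in the real program may differ.
struct SurfaceInfo {
    int   x, y;      // position of the described cell
    int   material;  // index into the array of materials
    float slope;     // slope of the surface in this cell
    int   object_id; // identifier of an occupying object, or -1 if none
    int   turn;      // number of the turn the snapshot was taken
};
```

A snapshot is then simply an array of such records, one per cell of the visible rectangle.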
3. The agent module
As mentioned in the theoretical part of the thesis, an agent – in the sense of the simulation of a real scene – may be considered a wrapper around the decision-making unit, that is, the artificial neural network. The task of an agent is to retrieve information about the scene and make decisions based on it. To perform this task an agent takes the following steps:
1. Retrieve data about the scene and the vehicle it is connected to;
2. Filter the necessary information;
3. Convert it into the form of a vector of floating-point numbers;
4. Load the vector into the artificial neural network;
5. Launch the network;
6. Retrieve the output of the network in the form of a vector;
7. Convert the vector into an identifier of one of the possible actions;
8. Perform the action on the vehicle.
In the implementation of the program each of these steps is associated with one of classes which together form the agent module.
#  Step                    Method
1  Retrieve data           SurfaceInfo * Camera::takeashot(void) // via Vehicle::takeashot(void)
2  Filter information      Situation * Analyzer::analyzedata(SurfaceInfo *)
3  Convert to a vector     bool Interpreter::createinputvector(float *, Situation *)
4  Load it to the A.N.N.   bool ANN::feed(float *)
5  Launch the A.N.N.       bool ANN::work(int)
6  Retrieve the output     bool ANN::getresponse(float *)
7  Convert to an action    Situation * Interpreter::createreaction(float *)
8  Perform the action      bool Agent::dispatch(Situation *)
The classes Agent, Analyzer and Interpreter, as well as their components, are discussed in the following chapter. ANN – the class of the artificial neural network – as well as its subclasses, HopfieldANN and MultiLayeredANN, and its components, are discussed in the next chapter: "The Artificial Neural Network".
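The eight steps in the table above chain together as one pass of the agent's update. The following schematic sketch uses the method names from the table, but the classes here are minimal stand-ins of my own (the real classes are far richer, and the real control flow also involves the suspension mechanism):

```cpp
#include <string>
#include <vector>

// Schematic sketch of the agent's decision pipeline; the stub classes
// only record the order in which the steps run.
std::vector<std::string> trace;

struct Situation {};
struct Vehicle  { void* takeashot() { trace.push_back("takeashot"); return nullptr; } };
struct Analyzer { Situation* analyzedata(void*) { trace.push_back("analyzedata"); return nullptr; } };
struct Interpreter {
    void createinputvector(float*, Situation*) { trace.push_back("createinputvector"); }
    Situation* createreaction(float*)          { trace.push_back("createreaction"); return nullptr; }
};
struct ANN {
    void feed(float*)        { trace.push_back("feed"); }
    void work(int)           { trace.push_back("work"); }
    void getresponse(float*) { trace.push_back("getresponse"); }
};
struct Agent {
    Vehicle v; Analyzer a; Interpreter i; ANN ann;
    void dispatch(Situation*) { trace.push_back("dispatch"); }
    void update(int iterations) {
        float in[1] = {0}, out[1] = {0};
        Situation* s = a.analyzedata(v.takeashot()); // steps 1-2
        i.createinputvector(in, s);                  // step 3
        ann.feed(in);                                // step 4
        ann.work(iterations);                        // step 5
        ann.getresponse(out);                        // step 6
        dispatch(i.createreaction(out));             // steps 7-8
    }
};
```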
3.1. The agent
API
Main characteristics:
• A reference to a vehicle
A pointer (technically) to an object of the class Vehicle which is under the control of the agent. Via this pointer the agent accesses the camera and retrieves information about the scene – as well as accesses the vehicle for information about its internal state – and, after the decision is made, performs the chosen action.
• Analyzer
Data retrieved from the camera is passed to the analyzer for initial processing. The analyzer contains several structures used to filter important information and, after that, it returns a Situation object which contains only data necessary to make a
decision.
• Interpreter
A tool used to convert the Situation object returned by the analyzer into the form of a vector of floating-point numbers – that is, the input data for the artificial neural network. After the A.N.N. makes its computations, the interpreter is used to convert the output vector into another Situation object containing the decision.
• Artificial Neural Network
The core of the agent module. In fact, the agent holds an object of the class ANN – the abstract superclass of the two models of neural networks used in the program: the Hopfield model and the Multi-Layered Perceptron model. The decision which of the models is used is made during the initialization.
• Excluded-from-update flag
If, for any reason, an agent should not be processed in a given turn, the program sets this flag to TRUE. The main loop of the program checks whether it is set, and only if it is not is the agent processed. Possible reasons for setting this flag are: the agent's vehicle has run out of energy, or the user wants to test another agent and does not want this one to interfere.
The main methods:
• void update(int ann_iterations_per_tick)
The main method of the agent. The parameter indicates the maximal number of neurons of the A.N.N. which should be processed before the A.N.N. gives control back to the agent. A more detailed explanation can be found in the "Suspension mechanism" sub-chapter of the "Artificial Neural Network" chapter.
• bool learn(const char *learningfile)
Can be called by the user to train the agent (more precisely, its artificial neural network). The method loads data from a given text file and converts it into objects of the Situation class, which hold both a pattern the network should be able to recognize and a reaction associated with that pattern. The Situation objects are then converted into two sets: the set of input vectors and the set of output vectors, and passed to the artificial neural network's 'void ANN::learn(float *inputset, float *outputset, int number_of_examples)' method.
• bool preprocessing(void)
The method encapsulates the processing from the point of accessing the data about the scene and the vehicle to the point of producing an input vector for the artificial neural network.
• void postprocessing(void)
The method encapsulates the processing from the point of acquiring the output vector of the artificial neural network to performing the decision on the vehicle.
• void dispatch(Situation *r)
A big switch/case statement containing the set of possible actions and their definitions (i.e. the sequences of calls to methods of the vehicle needed to perform the given action).
Pseudo-code of the 'void Agent::update(int ann_iterations_per_tick)' method:
if ANN.isLocked() = false then begin
    preprocessing()
    if preprocessing returned false then
        end the algorithm: the situation is the same as in the
            previous turn and nothing needs to be changed
    end if
end if
let the ANN process ANN_iterations_per_tick neurons
if ANN.isLocked() = false then begin
    postprocessing()
end if
Pseudo-code of the ‘bool Agent::preprocessing(void)’ method:
snapshot := vehicle.takeashot()
actual_situation := analyzer.analyzedata(snapshot)
allocate memory for the A.N.N.'s input_vector
interpreter.createinputvector(input_vector, actual_situation)
if input_vector is the same as previous_turn_input_vector then
    return false: the vectors are the same
previous_turn_input_vector := input_vector
ann.feed(input_vector)
return true
Pseudo-code of the ‘void Agent::postprocessing(void)’ method:
allocate memory for the A.N.N.'s output_vector
ann.getresponse(output_vector)
reaction := interpreter.createreaction(output_vector)
dispatch(reaction)
Pseudo-code of the ‘void Agent::dispatch(Situation *reaction)’ method:
if reaction is equal to:
    STOP:
        turn off the autopilot of the vehicle
        set power of the vehicle to 0
    end case
    LOOKAROUND:
        for i in 1 .. 4 do begin
            snapshot := vehicle.takeashot()
            analyzer.analyzedata(snapshot)
            turn the vehicle to the left
        end for
    end case
    GOTOENERGYSOURCE:
        esi := analyzer.getclosestESI()
        if actual_goal_position != esi.position then begin
            actual_goal_position := esi.position
            analyzer.setpath(vehicle.position, esi.position)
        end if
        next_step := analyzer.nextstep()
        for every dimension (known as d) do
            direction_vector[d] := next_step[d] - vehicle.position[d]
        turn on the autopilot of the vehicle
        set the desired speed of the vehicle to maximum
        set the desired direction of the vehicle to direction_vector
    end case
    GOTODATASOURCE:
        dsi := analyzer.getclosestDSI()
        if actual_goal_position != dsi.position then begin
            actual_goal_position := dsi.position
            analyzer.setpath(vehicle.position, dsi.position)
        end if
        next_step := analyzer.nextstep()
        for every dimension (known as d) do
            direction_vector[d] := next_step[d] - vehicle.position[d]
        turn on the autopilot of the vehicle
        set the desired speed of the vehicle to maximum
        set the desired direction of the vehicle to direction_vector
    end case
end if
Pseudo-code of the ‘void Agent::learn(const char *learning_file)’ method:
get the number_of_examples from the learning_file
allocate memory for input_vectors_array based on \
    the number_of_examples and the input_vector size
allocate memory for output_vectors_array based on \
    the number_of_examples and the output_vector size
for every example in the learning_file do begin
    create a situation_reaction object based on the example
    // the situation_reaction object contains data about both
    // the situation and a reaction to this situation
    allocate memory for the A.N.N.'s input_vector
    interpreter.createinputvector(input_vector, situation_reaction)
    add the input_vector to the input_vectors_array
    allocate memory for the A.N.N.'s output_vector
    interpreter.createoutputvector(output_vector, situation_reaction)
    add the output_vector to the output_vectors_array
end for
ann.learn(input_vectors_array, output_vectors_array, number_of_examples)
3.2. A situation
Describes the situation in the vicinity of the agent's vehicle, as well as the internal state of the vehicle. All values stored in an object of the class Situation are in quantified form – that is, they are described by constant positive integers.
The main characteristics:
• Distance to the closest data source
• Distance to the closest energy source
The valid values of these variables are: VERYLARGE, LARGE, MEDIUM, SMALL, ALMOSTZERO, or UNKNOWN if the agent does not know the position of any data or energy source in the scene.
• Amount of energy in the vehicle's engine
The valid values are: HIGH, MEDIUM, LOW and ALMOSTZERO.
• Ratio of the number of cells in the vicinity of the vehicle which are known to the agent to the number of all cells in the vicinity
The valid values are: HIGH, MEDIUM, LOW and ALMOSTZERO.
• The reaction associated with the given situation
The valid values are: GOTOENERGYSOURCE, GOTODATASOURCE, LOOKAROUND and STOP.
3.3. The analyzer
A part of the agent dedicated to analyzing and filtering the data which comes to it in the form of an array of SurfaceInfo objects, as well as information about how much energy is accumulated in the vehicle's engine. The analyzer encapsulates a few structures and variables working as the external memory of the agent – the most important of them are explained below. The result of the analyzer's work is a Situation object which is subsequently transformed into an input vector for the artificial neural network.
API
The main characteristics:
• The map
The map is a 2D array of SurfaceInfo objects. Each SurfaceInfo describes one of the cells of the scene's surface – the position of the given SurfaceInfo in the map corresponds to the position of the given cell in the scene. Every time a new snapshot of the vehicle's vicinity is passed to the analyzer, the map is updated. The updated version of the map can subsequently be used to calculate distances between the vehicle and known energy and data sources (that is, those sources whose positions are remembered by the agent), as well as to set a path between the vehicle and the data or energy source which the agent wants to approach.
• The energy sources' identifiers queue
• The data sources' identifiers queue
Two priority queues used to store identifiers of energy and data sources which were previously spotted by the vehicle's camera. The sources are sorted according to their distance to the vehicle (each in their corresponding queue). Every time the position of the vehicle changes, the distances are recalculated and the queues are re-sorted. After the analysis of the data, quantified versions of the distances to the closest energy source and the closest data source are encapsulated in a Situation object and returned to
the agent. The algorithm used to sort the identifiers is Heap Sort.
The main methods:
• Situation * analyzedata(SurfaceInfo * snapshot, int width, int length, float accumulated_energy)
The main method of the class: the agent supplies it with the snapshot taken by the vehicle's camera, two parameters describing the width and the length of the snapshot, and the ratio between the amount of energy accumulated by the vehicle's engine and the maximal amount of energy the engine is able to accumulate. The method updates the map and analyzes the current situation, which results in returning an object of the type Situation.
• float distance(int from, int to)
• bool setpath(int from, int to)
Both methods access the map and use the A* path-finding algorithm to calculate the shortest way of traveling from the position 'from' to the position 'to'. After that, the 'distance' method simply returns the calculated distance between those two positions (so it uses only the first part of the algorithm, as explained in "The A* algorithm" chapter), whereas 'setpath' additionally creates a path between them (using both the first and the second part of the algorithm) and returns 'true' if it is successful or 'false' if the path cannot be determined. Calculating distances between the current position of the vehicle and all known data and energy sources is crucial to the decision-making process because it determines how much energy the vehicle will spend on moving from one place to another.
• bool nextstep(int currentpos,int * nextpos)
When the path is calculated by the 'setpath' method, it is stored in the map. The 'nextstep' method, supplied with the current position of the vehicle, sets the 'nextpos' array to the next cell on the path and returns 'true' – or returns 'false' if the current position of the vehicle is outside the previously calculated path. In such a case a new 'setpath' call is needed, which will override the current path.
• ESInfo * getclosestESI(void)
• DSInfo * getclosestDSI(void)
These methods simply return the identifiers of the closest sources - energy sources and data sources, respectively. Since the identifiers are stored in priority queues there is no need to search for the closest ones - they are always on top.
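The queue behaviour described above, re-sorting by distance on every position change with the closest identifier always on top, can be sketched as follows. This is an illustration under my own naming; Euclidean distance stands in here for the A* distance the program actually uses, and std::make_heap/std::sort_heap stand in for the Heap Sort mentioned in the text:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Sketch of a sources' identifiers queue: identifiers sorted by their
// distance to the vehicle, re-sorted (heap sort) whenever the vehicle's
// position changes. Names and distance metric are illustrative.
struct SourceId { int id; int x, y; float dist; };

struct SourceQueue {
    std::vector<SourceId> ids;

    void resort(int vx, int vy) {
        for (SourceId &s : ids)   // recalculate distances to the vehicle
            s.dist = std::hypot(float(s.x - vx), float(s.y - vy));
        auto byDist = [](const SourceId &a, const SourceId &b)
                      { return a.dist < b.dist; };
        std::make_heap(ids.begin(), ids.end(), byDist); // heap sort:
        std::sort_heap(ids.begin(), ids.end(), byDist); // ascending order
    }
    // the closest source is always on top - no searching needed
    const SourceId& closest() const { return ids.front(); }
};
```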
• int quantifydistance(float v)
• int quantifyenergy(float v)
• int quantifyvicinity(float v)
Based on the interval boundaries taken from the configuration file during the initialization of the agent, these methods quantify given values of distance, energy, and the known-vicinity ratio into integer numbers which describe those values as "short", "medium", "long", etc.
Pseudo-code for the 'Situation * Analyzer::analyzedata(SurfaceInfo * snapshot, int width, int length, float accumulated_energy)' method:
// the first part of the algorithm updates the map with the SurfaceInfo
// objects from the snapshot
for every entry in the snapshot begin
    cell := the cell corresponding to the entry
    if the cell is taken by any object (entity of the type Object) then
        continue with the next entry
    update the corresponding entry of the map
    for every identifier in the d.s. identifiers queue begin
        if identifier.position = entry.position begin
            // apparently, the data source identified by DSI
            // has moved to another place or ceased to exist
            remove the identifier from the d.s. identifiers queue
            break the for loop // only one object can span one cell
                               // at the same time so there is no need
                               // to search for more
        end if
    end for
    for every identifier in the e.s. identifiers queue begin
        if identifier.position = entry.position begin
            remove the identifier from the e.s. identifiers queue
            break the for loop
        end if
    end for
end for
// the second part of the algorithm searches through the snapshot for
// energy and data sources and puts their identifiers in their
// corresponding queues
for every entry in the snapshot begin
    cell := the cell corresponding to the entry
    if the cell is NOT taken by an object (entity of the type Object) then
        continue with the next entry
    O := cell.actualobject
    if O is an energy source then begin
        for every identifier in the e.s. identifiers queue do
            if the identifier identifies O AND positions of the identifier
                    and O are not identical then
                remove the identifier from the e.s. identifiers queue
        end for
        identifier := create an identifier of O
        add the identifier to the e.s. identifiers queue
    end if
    if O is a data source then begin
        for every identifier in the d.s. identifiers queue do
            if the identifier identifies O AND positions of the identifier
                    and O are not identical then
                remove the identifier from the d.s. identifiers queue
        end for
        identifier := create an identifier of O
        add the identifier to the d.s. identifiers queue
    end if
end for
// the third part of the algorithm creates the Situation object
// corresponding to the current situation in the environment and the
// internal state of the vehicle
situation := a new Situation object
esi := getclosestESI()
esi_distance := distance(vehicle.position, esi.position)
esi_quantified := quantifydistance(esi_distance)
situation.closest_es_distance := esi_quantified
dsi := getclosestDSI()
dsi_distance := distance(vehicle.position, dsi.position)
dsi_quantified := quantifydistance(dsi_distance)
situation.closest_ds_distance := dsi_quantified
energy_quantified := quantifyenergy(energy_accumulated)
situation.energy := energy_quantified
known_vicinity := calculate how many entries of the map in the range
    [vehicle.position[X] - length, vehicle.position[Y] - length,
     vehicle.position[X] + length, vehicle.position[Y] + length]
    are filled with data
known_vicinity_ratio := known_vicinity / (4 * length)
known_vicinity_quantified := quantifyvicinity(known_vicinity_ratio)
situation.knownvicinity := known_vicinity_quantified
return situation
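The quantify* helpers used in the pseudo-code above map a continuous value onto one of a few integer labels using interval boundaries read from the configuration file. A minimal sketch, with labels and boundary values that are purely illustrative (the program's actual constants and boundaries come from its configuration):

```cpp
// Sketch of a quantification routine: ascending interval boundaries map a
// continuous value onto a small integer label. The enum and the example
// boundaries are assumptions, not the program's values.
enum { ALMOSTZERO, SMALL, MEDIUM, LARGE, VERYLARGE };

int quantifydistance(float v, const float borders[4]) {
    // borders must be ascending, e.g. {2.0, 8.0, 20.0, 50.0}
    for (int i = 0; i < 4; ++i)
        if (v < borders[i]) return i; // ALMOSTZERO .. LARGE
    return VERYLARGE;
}
```

quantifyenergy and quantifyvicinity would follow the same pattern with their own boundaries and label sets.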
3.4. The interpreter
The interpreter works as a translator between the analyzer and the Artificial Neural Network. It has two main methods – the first for creating the input vector for the A.N.N. based on a Situation object, and the second for finding the ID of the reaction computed by the A.N.N., based on the output vector previously taken from the Artificial Neural Network. Additionally, the interpreter exposes one more method for changing a Situation object containing the answer into an output vector. This way the user is able to prepare a training set for the Artificial Neural Network in the form of Situation objects (which should contain both a description of the situation and the ID of the reaction associated with the given situation). During the learning process the agent uses the interpreter to translate those Situation objects into pairs (input vector, output vector) and then feeds them to the 'void ANN::learn(float * inputset, float * outputset, int number_of_examples)' method.
• void createinputvector(Situation * situation, float * inputvector)
Creates the input vector for the Artificial Neural Network based on the Situation object created previously by the analyzer. See the pseudo-code below for more details.
• int createreaction(float * outputvector)
Translates the output vector given by the Artificial Neural Network into a positive integer identifying the decision. Subsequently, the number is given to the 'void Agent::dispatch' method, which performs the actions associated with the decision.
• void createoutputvector(int reaction, float * outputvector)
Based on the reaction ID number given to it as the first parameter, this method recreates the output vector of the Artificial Neural Network corresponding to the given reaction. It is used in the learning process.
Pseudo-code for the 'void Interpreter::createinputvector(Situation * situation, float * inputvector)' method:

size := number of input neurons
for i in 0 .. size-1 do
    inputvector[i] := -1.0
end for
pointer := 0
// first 5 entries denote the distance to the closest energy source
if situation.closest_es_distance is equal to:
    VERYLARGE:
        // nothing to do, the case denoted as [-----]
    end case
    LARGE:
        inputvector[pointer] := +1.0 // the case denoted as [+----]
    end case
    MEDIUM:
        for i in 0 .. 1 do inputvector[pointer+i] := +1.0 // [++---]
    end case
    SMALL:
        for i in 0 .. 2 do inputvector[pointer+i] := +1.0 // [+++--]
    end case
    VERYSMALL:
        for i in 0 .. 3 do inputvector[pointer+i] := +1.0 // [++++-]
    end case
    ALMOSTZERO:
        for i in 0 .. 4 do inputvector[pointer+i] := +1.0 // [+++++]
    end case
end if
pointer := 5
// next 5 entries denote the distance to the closest data source
if situation.closest_ds_distance is equal to:
    ... // as above
end if
pointer := 10
// next 4 entries denote how much energy the engine accumulated
if situation.energy is equal to:
    ALMOSTZERO:
        // nothing to do, the case denoted as [----]
    end case
    LOW:
        inputvector[pointer] := +1.0 // the case denoted as [+---]
    end case
    MEDIUM:
        for i in 0 .. 1 do inputvector[pointer+i] := +1.0 // [++--]
    end case
    HIGH:
        for i in 0 .. 2 do inputvector[pointer+i] := +1.0 // [+++-]
    end case
    VERYHIGH:
        for i in 0 .. 3 do inputvector[pointer+i] := +1.0 // [++++]
    end case
end if
pointer := 14
// the last 4 entries denote how much of the vicinity is
// known to the agent
if situation.knownvicinity is equal to:
    ... // as above
end if
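The coding used above is a unary ("thermometer") code: a quantified level k becomes k leading +1.0 entries in a field of fixed width, with the rest staying at -1.0. Each field of the vector can therefore be filled by one small helper; this sketch uses a name of my own choosing:

```cpp
// Sketch of the unary ("thermometer") coding used for the input vector:
// level k yields k leading +1.0 entries in a field of the given width,
// the remaining entries being -1.0. The function name is illustrative.
void encodefield(float *inputvector, int pointer, int width, int level) {
    for (int i = 0; i < width; ++i)
        inputvector[pointer + i] = (i < level) ? +1.0f : -1.0f;
}
```

For example, encoding MEDIUM (level 2) into a 5-entry distance field produces the pattern [++---] from the pseudo-code.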
4. The Artificial Neural Network implementation
For the decision-making process it is enough for the agent to be equipped with only one instance of an Artificial Neural Network. In fact, designing the Agent class in a way which explicitly shows that there may be more than one type of A.N.N. at work would be bad practice. The fact that the implementation is used not only for performing tasks but also for testing various models of neural networks should be hidden from the agent. If we implement the simulation within the Object-Oriented Paradigm, this problem can easily be solved by using the concept of virtuality. First, we design a class called ANN which holds all characteristics and functionality common to both the Multi-Layered Perceptron and the Hopfield model, as well as declares methods which work differently in each of these models but have common names and parameters. These are virtual methods – they will be defined separately, for each model in its own class. The agent does not know which model it uses. It treats them both as instances of the ANN class. This way, after the test phase is finished, we can get rid of one of the models and use only the one we have chosen, without making any changes to the rest of the agent module.
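The virtuality idea can be sketched like this. The class and method names (ANN, HopfieldANN, MultiLayeredANN, feed, work, getresponse) follow the text; the trivial bodies and the name() helper are placeholders of my own:

```cpp
#include <string>

// Sketch of the class hierarchy described above: the agent sees only the
// abstract ANN interface; which model actually runs is decided at
// initialization. Method bodies are trivial placeholders.
struct ANN {
    virtual ~ANN() {}
    virtual void feed(float*)        = 0;
    virtual void work(int)           = 0;
    virtual void getresponse(float*) = 0;
    virtual std::string name() const = 0; // helper for this sketch only
};

struct HopfieldANN : ANN {
    void feed(float*) override {}
    void work(int) override {}
    void getresponse(float*) override {}
    std::string name() const override { return "Hopfield"; }
};

struct MultiLayeredANN : ANN {
    void feed(float*) override {}
    void work(int) override {}
    void getresponse(float*) override {}
    std::string name() const override { return "MLP"; }
};
```

The agent then holds an ANN* and every call to feed/work/getresponse is dispatched to whichever concrete model was constructed during initialization.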
4.1. The common superclass
The main characteristics:
• Number of input neurons
• Number of output neurons
Two variables describing how many input and output neurons the Artificial Neural Network has. The number of input neurons is defined during the initialization of the program, so it could be stored in a more global structure to save memory and speed up access, but for the sake of Object-Oriented Programming each instance of the A.N.N. holds its own variables of this type.
• The locking flag
A flag indicating whether the network is locked – that is, whether it is permitted to pass an input vector to it and/or collect a response. For details, see the following sub-chapter explaining the suspension mechanism.
• The iterations counter
A variable used during the process of decision-making to count the number of neurons which have already been processed (and since in C++ indexes of elements in an array start from zero, the number of processed neurons may also be used as the index of the next neuron to be processed). It becomes helpful when the suspension mechanism is used.
The main methods (all virtual):
• void feed(float * inputvector)
Sets the signals of the input neurons to the corresponding entries of the input vector.
• void work(int ann_iterations_per_tick)
The main method of an Artificial Neural Network class, starting the processing of the neurons. If ann_iterations_per_tick is equal to zero, the network stops the processing when the response is computed. Otherwise, if after processing ann_iterations_per_tick neurons the response is still not computed, the method sets the locking flag to 'true' and stops, waiting for the next call.
• void getresponse(float * outputvector)
Copies the signals of the output neurons to the corresponding entries of the output vector.
• void learn(float * inputset, float * outputset, int number_of_examples)
The input set and the output set are two-dimensional arrays whose rows are associated input and output vectors. The size of the input set is number of input neurons * number of examples and the size of the output set is number of output neurons * number of examples. The method iterates through both sets, training the network to correctly associate the vector pairs.
4.2. The Hopfield model
API
The main characteristics:
• An array of neurons
The number of input neurons is the same as the number of output neurons and is equal to the number of all neurons in the network. The threshold function used by the Hopfield model’s neurons is defined as follows:
if input < 0 then output := -1.0 else if input > 0 then output := +1.0 else output := 0
• The last output
A vector of floating point numbers. It stores the output signals of the neurons from the last iteration so that the network can compare them to the new result. If they are the same, the network stops.
• A dictionary
A set of input vectors from the training input set with associated output vectors from the training output set. After a pattern is recognized, the Hopfield model searches through the dictionary and returns the output vector associated with the pattern. The program implements the dictionary as two arrays of numbers of the type ‘float’, filled with data in such a way that if inputarray[i] is the first entry of a given input vector, then outputarray[i] is the first entry of the associated output vector. The searching is done using the very efficient, low-level function ‘int memcmp(const void * m1, const void * m2, size_t number_of_bytes)’, which compares the given number of bytes starting from two given addresses in memory and returns 0 if they are identical.
The main methods:
• void work(int ann_iterations_per_tick)
The Hopfield model processes its neurons sequentially, working in the asynchronous mode. After processing the last neuron the network compares the result with the one from the previous N iterations (where N is the number of neurons in the network) and, if they differ, starts again from the first neuron.
• void learn(float * inputset, float * outputset, int number_of_examples)
The algorithm used for the learning is the Hebbian rule. See the pseudo-code below.
• void getresponse(float * outputvector)
Copies signals of the neurons into a local array and searches through the dictionary for the output vector associated with the right pattern. Subsequently, it copies the output vector from the dictionary into the array passed to it as the parameter.
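As an illustration, a dictionary lookup along these lines could look as follows in C++. The function and parameter names are assumed, not the thesis’ own; the vectors are stored row after row in flat arrays, and memcmp returns 0 when the two memory blocks are byte-for-byte identical:

```cpp
#include <cstring>  // std::memcmp, std::memcpy

// Searches the dictionary for 'pattern'. On a match, copies the
// associated output vector into 'outputvector' and returns true.
bool lookup(const float* pattern,
            const float* inputarray, const float* outputarray,
            int number_of_examples, int input_len, int output_len,
            float* outputvector)
{
    for (int n = 0; n < number_of_examples; ++n) {
        const float* candidate = inputarray + n * input_len;
        // Byte-for-byte comparison of the two vectors.
        if (std::memcmp(pattern, candidate,
                        input_len * sizeof(float)) == 0) {
            std::memcpy(outputvector, outputarray + n * output_len,
                        output_len * sizeof(float));
            return true;
        }
    }
    return false;  // no pattern matched; the agent falls back to "stop"
}
```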
Pseudo-code of the ‘void HopfieldANN::work(int ann_iterations_per_tick)’:
if the_last_output = null then
begin
    allocate memory for the_last_output
    for i in 0 .. number_of_neurons-1 do
        the_last_output[i] := 0
end if
if ann_iterations_per_tick = 0 then
begin
    while the neurons’ signals differ from the_last_output do
        fill the_last_output with the neurons’ signals
        for i in 0 .. number_of_neurons-1 do
            compute the new signal for the neuron i
    end while
end
else
begin
    lock the network
    iterations := 0
    while the neurons’ signals differ from the_last_output
          AND iterations < ann_iterations_per_tick do
        fill the_last_output with the neurons’ signals
        for i in (the_iterations_counter modulo number_of_neurons) .. number_of_neurons-1 do
        begin
            compute the new signal for the neuron i
            iterations := iterations + 1
            if iterations = ann_iterations_per_tick then
            begin
                the_iterations_counter := the_iterations_counter + iterations
                return from the method
            end if
        end for
    end while
end if
unlock the network
the_iterations_counter := 0
Pseudo-code of the ‘void HopfieldANN::learn(float * inputset, float * outputset, int number_of_examples)’:
copy the inputset to the dictionary’s input array
copy the outputset to the dictionary’s output array
set all entries of the weights’ matrix to 0
for n in 0 .. number_of_examples-1 do
    for i in 0 .. number_of_neurons-1 do
        for j in 0 .. number_of_neurons-1 do
            if i <> j then
                matrix[i,j] := matrix[i,j] + inputset[n,i] * inputset[n,j]
        end for
    end for
end for
for i in 0 .. number_of_neurons-1 do
    for j in 0 .. number_of_neurons-1 do
        set weight of the connection between the neurons i and j
            to matrix[i,j] / number_of_examples
    end for
end for
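The Hebbian loop above can be rendered in C++ roughly as follows. This is a sketch with assumed names; the weight matrix is stored as a flat row-major array, and each training example occupies one row of the flat inputset:

```cpp
// Hebbian learning for a Hopfield network: each weight is the average
// over all examples of the product of the two neurons' signals, with
// the diagonal (self-connections) kept at zero.
void hebbian_learn(float* weights, const float* inputset,
                   int number_of_neurons, int number_of_examples)
{
    // Set all entries of the weight matrix to 0.
    for (int i = 0; i < number_of_neurons * number_of_neurons; ++i)
        weights[i] = 0.0f;
    // Accumulate the outer products of the training patterns.
    for (int n = 0; n < number_of_examples; ++n) {
        const float* x = inputset + n * number_of_neurons;
        for (int i = 0; i < number_of_neurons; ++i)
            for (int j = 0; j < number_of_neurons; ++j)
                if (i != j)
                    weights[i * number_of_neurons + j] += x[i] * x[j];
    }
    // Normalize, as in the pseudo-code: matrix[i,j] / number_of_examples.
    for (int i = 0; i < number_of_neurons * number_of_neurons; ++i)
        weights[i] /= number_of_examples;
}
```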
Due to its limitations in memorizing patterns, the Hopfield network used in the program associated with this paper learns only three patterns: one for the reaction “go to an energy source”, one for “go to a data source” and one for “look around”. This is because the number of neurons in the Hopfield network used in the program is equal to 18 ( = the length of the input vector ), and the number of patterns such a network can reliably store is ≈ 0.15·N = 0.15·18 = 2.7. The reaction “stop” is used artificially, when the network is unable to associate the given situation with any pattern.
4.3. A Multi-Layered Perceptron
API
The main characteristics:
• An array of neurons
Even though it consists of three layers, the Multi-Layered Perceptron implementation holds its neurons in a single array. The number of input neurons and the number of output neurons are used to create “borders” between subsequent layers. The number of input and output neurons depends solely on the method of interpretation of Situation objects into input vectors. The threshold function used by the Multi-Layered Perceptron’s neurons is defined as follows:
output := 1.0 /( 1 + exp(-2.0 * input) )
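In C++ this threshold function is a one-liner; a possible rendering (the function name is illustrative):

```cpp
#include <cmath>  // std::exp

// The sigmoid threshold function quoted above: 1 / (1 + e^(-2x)).
// Maps any input to the open interval (0, 1), with threshold(0) = 0.5.
float threshold(float input)
{
    return 1.0f / (1.0f + std::exp(-2.0f * input));
}
```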
The main methods:
• void work(int ann_iterations_per_tick)
The Multi-Layered Perceptron processes its neurons sequentially, from the first neuron of the hidden layer to the last neuron of the output layer. After processing the last neuron the response is ready.
• void learn(float * inputset, float * outputset, int number_of_examples)
The learning algorithm used is back-propagation. See the pseudo-code below.
Pseudo-code of the ‘void MultiLayeredANN::work(int ann_iterations_per_tick)’:
if ann_iterations_per_tick = 0 then
begin
    while the_iterations_counter < number_of_neurons do
    begin
        compute the new signal for the neuron with index = the_iterations_counter
        the_iterations_counter := the_iterations_counter + 1
    end while
end
else
begin
    lock the network
    iterations := 0
    while the_iterations_counter < number_of_neurons do
    begin
        compute the new signal for the neuron with index = the_iterations_counter
        the_iterations_counter := the_iterations_counter + 1
        iterations := iterations + 1
        if iterations = ann_iterations_per_tick then
            return from the method
    end while
end if
unlock the network
the_iterations_counter := number_of_input_neurons
Pseudo-code of the ‘void MultiLayeredANN::learn(float * inputset, float * outputset, int number_of_examples)’:
set weights of connections to random values
offset := number_of_neurons – number_of_output_neurons
hiddenneurons := offset – number_of_input_neurons
for nt in 1 .. number_of_trials do
    learning_coefficient := 1.0 / nt
    for k in 1 .. number_of_examples do
        extract the k-th inputvector from the inputset
        extract the k-th desiredoutput from the outputset
        feed(inputvector)
        work(0)
        getresponse(outputvector)
        for i in 0 .. number_of_output_neurons-1 do
            outputerror[i] := (desiredoutput[i] - outputvector[i])
                * 2.0 * outputvector[i] * (1.0 - outputvector[i])
        end for
        for i in 0 .. hiddenneurons-1 do
            sum := 0
            for j in 0 .. number_of_output_neurons-1 do
                sum := sum + neuronsarray[offset+j].weight[i] * outputerror[j]
            end for
            hiddenerror[i] := sum * 2.0
                * neuronsarray[number_of_input_neurons+i].getsignal()
                * (1 - neuronsarray[number_of_input_neurons+i].getsignal())
        end for
        for i in 0 .. number_of_output_neurons-1 do
            for j in 0 .. hiddenneurons-1 do
                neuronsarray[offset+i].weight[j] := neuronsarray[offset+i].weight[j]
                    + learning_coefficient * outputerror[i]
                    * neuronsarray[number_of_input_neurons+j].getsignal()
            end for
        end for
        for i in 0 .. hiddenneurons-1 do
            for j in 0 .. number_of_input_neurons-1 do
                neuronsarray[number_of_input_neurons+i].weight[j] :=
                    neuronsarray[number_of_input_neurons+i].weight[j]
                    + learning_coefficient * hiddenerror[i]
                    * neuronsarray[j].getsignal()
            end for
        end for
    end for
end for
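For instance, the output-layer error term from the pseudo-code, with the sigmoid f(x) = 1/(1+exp(-2x)) whose derivative is 2*f*(1-f), could be written as a small helper (the name is illustrative):

```cpp
// Error term for one output neuron in back-propagation:
// (desired - actual) multiplied by the sigmoid's derivative 2*f*(1-f),
// where 'actual' is the neuron's output signal f(x).
float output_error(float desired, float actual)
{
    return (desired - actual) * 2.0f * actual * (1.0f - actual);
}
```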
The Multi-Layered Perceptron is able to memorize all four patterns (each associated with one of the reactions: “go to an energy source”, “go to a data source”, “look around”, “stop”).
4.4. The suspension mechanism
An already trained Artificial Neural Network makes its computations in (usually) constant time. In other words, no matter how sophisticated the situation is, a Neural Network of the Multi-Layered Perceptron type uses the same amount of time to make a decision. The Hopfield network may need a different number of iterations for a situation which is quite different from any pattern it remembers than for a situation almost identical to one of them. But since we know that an asynchronous Hopfield model always converges, and we can compute the maximal number of iterations needed (let’s call it I), we can say that the network needs at most I units of time for its calculations, which is a constant number.

However, the time the Artificial Neural Network needs may be too long for the user to accept. For example, if the network is used as a decision-making unit in a real-time simulation, where one turn has to be performed in a given time interval, the decision-making process cannot exceed the pre-defined part of this interval assigned to it. On the other hand, a network of a given size has to spend a constant amount of time on solving a problem, and sometimes it is not possible to reconcile this with the requirements of the real-time simulation the network is a part of.

Please note that this is a common problem of all Artificial Intelligence techniques. To make a decision, a given amount of time is needed. During that time the control is inside the decision-making unit, and the simulation – or at least the agent, if we use multi-threading or if the agent controls a real vehicle rather than a simulated one – cannot do anything else. This problem can be neglected in the case of a simple A.I., for example a small Finite State Machine, where the time needed to calculate the decision is relatively short, but it may become serious if the decision-making unit is designed to handle many complex situations and the set of possible reactions is large.
The suspension mechanism is a feature unique to Artificial Neural Networks, very helpful in such cases. The way a neural network performs its computations – by repeating the same small number of operations for each neuron – allows us to easily divide the process into several parts. At the beginning of the computations the network is locked; in other words, the program sets a flag indicating that the agent should neither pass new input vectors to the network nor extract output vectors from it. In each part a constant number K of neurons is processed (in the case of the MLP K <= N, where N is the number of all neurons; in the case of the Hopfield model K may be any number), and the counter used to identify the next neuron to be processed is updated. After the last part of neurons is processed, the network is unlocked and the agent can collect the output vector as if the computations had been performed at once. Please note that N does not have to be divisible by K. In the case of a Multi-Layered Perceptron the computations come to an end when the last neuron of the output layer is processed. In the case of the Hopfield model, after each processing of N neurons the network compares its current state with the one from the last comparison; if they are the same, the process ends.

Between each two subsequent parts the control is given back to the agent, which can either perform other tasks or give it back to the simulation. For example, the agent can collect information about the scene, analyze it and, if the resulting Situation object is very different from the previous one (indicating that there was a sudden change in the environment), abort the current computations. Each part of the decision-making process is performed in constant time, and its length depends on the number of neurons processed in each part. This number is set during the initialization of the agent and can be changed during the simulation if needed.

Other Artificial Intelligence techniques, although usually faster than Artificial Neural Networks, do not have this ability. To make them useful in complex simulations of a real scene, their functionality has to be simple, able to react only in a couple of ways. On the other hand, the suspension mechanism allows programmers to implement and train a big, sophisticated Artificial Neural Network which is able to remember a large set of (Situation, Reaction) pairs.
The neural network will be slower, but the suspension mechanism lets us divide its computations in such a way that the processing time of each part is shorter than or equal to the time interval given to the decision-making process in one turn. The processing is spread over many turns, but this does not degrade the overall performance of the program.
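The turn-by-turn interplay between the agent and a suspendable network can be sketched as follows. This is a toy model with assumed names: ‘work’ stands in for the real method, simply counting neurons, processing at most K of them per call and clearing the lock once all are done:

```cpp
// A stand-in for a suspendable network: 'work(k)' processes at most k
// neurons per call; the lock stays set until all neurons are done.
struct SuspendableAnn {
    int processed;  // the iterations counter: next neuron to process
    int total;      // number of all neurons
    bool locked;
    SuspendableAnn(int n) : processed(0), total(n), locked(false) {}
    void work(int k) {
        locked = true;
        for (int i = 0; i < k && processed < total; ++i)
            ++processed;  // "compute the new signal for the next neuron"
        if (processed >= total) {      // response ready:
            locked = false;            // unlock the network
            processed = 0;             // reset the counter
        }
    }
};

// Simulates the agent's loop: one work() slice per simulation turn,
// until the network unlocks. Returns the number of turns needed.
int turns_until_response(int neurons, int k_per_turn)
{
    SuspendableAnn ann(neurons);
    int turns = 0;
    do {
        ann.work(k_per_turn);  // one part of the computation
        ++turns;               // control returns to the simulation here
    } while (ann.locked);
    return turns;
}
```

Note that, as in the text, the number of neurons does not have to be divisible by K; the last slice simply processes fewer neurons.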
The design and implementation of a simulation of a real scene, and of objects and agents interacting within such a simulation, is not an easy task. Fortunately, using Object-Oriented Programming the designer is able to encapsulate the functionality of each mechanism in a couple of classes, making it easier to comprehend the whole design. The classes presented in the example implementation describe general features of such a simulation. Every scene has to be built of materials of a sort and has to be able to store information about positions and characteristics of its objects – both mobile and immobile. What is even more important, every agent has to be equipped with a set of sensors gathering data from its vicinity, as well as data about the internal state of the agent. Subsequently, the agents need a tool for making decisions based on the data and a set of effectors which turn the decision into a sequence of actions. And the decision-making part of the agent, if we decide to implement it as an Artificial Neural Network, needs a mechanism to filter important information from the received data, an interpreter to translate the information into a form understandable to the A.N.N. and back again into a form understandable to the rest of the agent – and the Artificial Neural Network itself.

Other techniques of Artificial Intelligence usually do not need an interpreter. They operate on symbols which are understandable to the rest of the agent module. But similarly to the Artificial Neural Network, in most cases they need some sort of an analyzer. What is different is the fact that a conventional A.I. works in ways far more transparent to programmers than an Artificial Neural Network – and usually it does its job faster. A well-programmed Finite State Machine needs only a couple of transitions to come to the same answer as an Artificial Neural Network which has to process all its neurons.
And if it is a Hopfield network, there is a possibility that it will have to process its neurons more than once. Two arguments may be used in defense of Artificial Neural Networks. The first is the fact that the decision-making process of a network may be suspended – the feature explained in the previous sub-chapter. This ability enables the agent to avoid one of the threats of sophisticated decision-making: if it takes too much time, the conditions may change during the processing and the response will not be valid, or even not applicable. The second argument is that the fact that Artificial Neural Networks are slower than other, more conventional techniques of A.I. is a price we pay for flexibility. An Artificial Neural Network does not have to be re-programmed in order to handle a new problem. It is sufficient that the solution comes in the form of a file describing the problem and the proposed reaction.

The comparison of two popular models of A.N.N. – the Hopfield model and the Multi-Layered Perceptron – shows that the differences between them are not very large. The Multi-Layered Perceptron is a bit simpler to implement, since it does not need additional features like a dictionary, and its memorizing abilities do not depend on the length of the input vector. On the other hand, its learning process is more sophisticated.

In summary, Artificial Neural Networks can be used as a solution to the problem of controlling mobile objects with positive results, under one condition: using them, we cannot expect lightning-fast responses. But since computers are becoming faster and faster, this flaw does not have to be crucial. After all, we do not usually have to make tens or hundreds of decisions per second. More important is that implementing a neural network may take more time than implementing another A.I. technique (but again, only if we start from scratch – in many cases we can use an already programmed A.N.N. and just train it to solve new problems). So, we should think about using Artificial Neural Networks mainly if our goal is flexibility. For simulations and environments which do not need a flexible, expandable solution, simpler techniques are usually sufficient.
In the process of designing and implementing an example simulation of a real scene I used the following tools:
• ArgoUML – a UML toolkit (http://argouml.tigris.org)
• KDevelop 3.0.3 – a programming environment for Linux (http://www.kdevelop.org)
• GCC 3.2 – a C++ compiler (http://gcc.gnu.org)
• Mesa 6.2.1 – a free OpenGL implementation (http://www.mesa3d.org)
• FLTK (Fast Light Tool Kit) – a GUI library (http://www.fltk.org)
The implementation was done under Linux Mandrake 9.1 and the programming language was C++.
[AIGPW01] Matthews, J., “Basic A* Pathfinding Made Simple”, AI Game Programming Wisdom, Charles River Media Inc., 2002.
[AIGPW02] Higgins, D., “How to Achieve Lightning Fast A*”, AI Game Programming Wisdom, Charles River Media Inc., 2002.
[AIGPW03] Vykruta, T., “Simple and Efficient Line-of-Sight for 3D Landscapes”, AI Game Programming Wisdom, Charles River Media Inc., 2002.
[NET01] http://physics.bu.edu/~duffy/py105/notes/Momentum.html
[NET02] http://ai-depot.com/FiniteStateMachines/FSM-Framework.html
[NET03] http://www.comp.nus.edu.sg/~pris/AssociativeMemory/HopfieldModel.html
[NET04] http://www.comp.nus.edu.sg/~pris/ArtificialNeuralNetworks/MultiLayeredPerceptrons.html
[THM01] Macukow, B., a lecture for the 4th year of Computer Science MSc studies, “The Hopfield Model”
Additional note: If you want to contact the author, please send an e-mail to [email protected] or [email protected] (the second one is more official). I will try to answer questions, if you have any, and – if you wanted me to get involved in something – I enjoy writing articles and working on ambitious projects (though usually I don't have time for this). You can find more about me at my LiveJournal: http://www.livejournal.com/~makingthematrix . You can download the binaries of the program associated with the thesis from: http://satan.aster.net.pl/~samuel/annsim.tgz (though the link may not be valid). Additionally, I would like to mention that the parts about Finite State Automata and the A* algorithm were based on articles from "AI Game Programming Wisdom" (Charles River Media Inc., 2002). It is a great book. If you are interested in programming AI for games, you ought to read it.

Maciej Gorywoda, aka Samuel Rai