Complete the following problems, worth a total of 10 points. Points for each question are shown below for ISE483 and SSIE583.
Please summarize your answers in one document file (.doc,.pdf)
including pictures. Organize this document well, make it clear which
questions you are answering and provide appropriate labels where you
can. Submit your well-documented code as separate python files. Also,
mention in your summary document the names of the relevant python files
for each question.
Please submit your well-documented code as separate python files, one for each answer. For the Eureqa question, attach the '.fxp' file containing your work. Summarize your answers in a PDF document including
figures, if any. Organize this document well, clearly indicating the question being answered and the respective Python file name(s). Upload all files into the 'Assignment 5' folder in Brightspace.
[2 points][1 point]The basics: Read Lab slides and class lecture 18. Formalize the
Ant Clustering Algorithm (ACA) in pseudo-code or in words. Give detailed instructions for calculating the 'perceived
fraction of items' f and the interactions of each ant with its environment (pickup, move, drop-off of material). This is only a warm-up exercise and is not asking you to solve any particular problem using ACA.
[3 points][3 points] Grooming an ant colony: Implement a
2-dimensional grid-world with wrap-around borders just like the cellular automata from lab3 (only, it is 2-dimensional now). Randomly place a few 'dead ants' on it (simply fill the cells of the world containing dead ants with a particular color). Now, randomly place a few 'living ants' on the world.
Your objective is to have the live ants cluster together the dead ants using ACA. Define a neighborhood size for each living ant; the immediate neighborhood of 9 cells (3x3 square) is a good choice. Compute f, as the fraction of dead ants perceived by a living ant in its neighborhood, which is a value between 0 and 1. Implement equations (5.11) and (5.12) that define the pick-up and drop-off probabilities. Choose values for k1 and k2 so that they are neither too large nor too small when compared to the range of f. Run your ACA for many iterations and watch the ant colony getting groomed. How many clusters do you see? Tune the above parameters until you are satisfied with the results.
For a particular parameter choice, rerun your program a few times starting from different initial conditions. Report your parameter choices and the corresponding results. You can use our sample code that draws a rectangular grid-world, and randomly places a few dead ants and live ants on it.
[5 points][4 points]Data clustering: (This question is also described in exercise 7 in p.260 of the textbook) Now, instead of dead ants, use the "data items" corresponding to the 16 animals described in this dataset
(you can use membership in 'terrestrial' and 'birds' in order to color
your items, so that you can see how well your clustering is
proceeding). Initialize your virtual world by placing the 16 data items
randomly across your 2D grid. (Note:
each data item is an array of boolean
values as shown in the file. You can assign an ID to each vector and
then refer to the IDs in
your program). Compute f as the similarity of a data item (that a living ant is sitting on) with all others in its neighborhood, using a variant of equation (5.13) described in exercise 7 in p.260. Provided is a sample piece of code to compute the similarity measure that uses equation (5.13) for a pair of 2-dimensional data items. Additionally, have the ants interact with the items by
picking them up, carrying, and dropping them according to the same
probability functions defined in equations (5.11) and (5.12). Choose parameter values for k1, k2, and α that give you meaningful clusterings of the 16 animals. One possible clustering you may expect to see is all the 'terrestrial' data items in one cluster and the 'bird' items in another.
[Optional, extra 2 points][2 points]
The parameter α defines the scale of dissimilarity: it determines when two items should be or should not be located next to each other (p.229). Once you are satisfied with the results from the above, rerun your algorithm by varying just α. How do higher and lower values of α affect the clustering performance?