CS 461: Machine Learning
Instructor: Kiri Wagstaff

CS 461 Homework 3

Due: Midnight, February 12, 2009

Please put your answers to these questions in <yourlastname>-hw3.txt (or PDF).

Part 1: Neural Networks (50 points)

  1. Given the following perceptron:

    calculate the output y given this input: x = [ -6 1 5 ]. (6 points)

  2. Draw a multi-layer perceptron (MLP) that operates on input data having four features and with two nodes in the hidden layer. You may draw your perceptron in ASCII text, or upload a file called <yourlastname>-hw3-perceptron.pdf (or .jpg or .gif). (8 points)

  3. How many weights must the MLP you created for question 2 learn? (2 points)

  4. Consider the XOR problem, where the data set has two features and the label for a data item is positive if exactly one feature is 1 and negative otherwise. Why can't a single perceptron learn to classify data based on the XOR concept? (4 points)

  5. Why does a multilayer perceptron succeed at the XOR problem? (4 points)

  6. Read this short paper:
    An Artificial Intelligence System Suggests Arbitrariness of Death Penalty by Stamos T. Karamouzis and Dee Wood Harper, 2007.

    6a. What kind of feature would you use to represent "race"? (2 points)

    6b. What kind of feature would you use to represent "highest year of education completed at time of first imprisonment for capital offense"? (2 points)

    6c. Which single feature of the 17 input features (exclude ID number and the outcome) do you think should be the most relevant for predicting the class label ("execution" vs. "non-execution"), and why? (2 points)

    6d. This study used "cross-validation" but not "k-fold cross-validation". What is the difference? (2 points)

    6e. Considering "executed" to be the positive class, write down the confusion matrix for the neural network's performance on the test set. (4 points)

    6f. What were the neural network's precision and recall on the test set? Show your work. (4 points)

    6g. What is your opinion of this study? Write at least one paragraph. (Consider commenting on the motivations, the methodology, the machine learning, the results, what the results mean, any questions you have that the paper didn't address, etc.) (10 points)
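
As a reminder for questions 6e and 6f, precision and recall are both computed from the counts in a confusion matrix. The sketch below is only an illustration: the counts are made up (they are not the paper's results), and the function name precision_recall is hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) computed from confusion-matrix counts.

    tp = true positives, fp = false positives, fn = false negatives.
    """
    # Precision: of the items predicted positive, what fraction really are positive?
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: of the items that really are positive, what fraction did we find?
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Hypothetical counts for illustration only (NOT taken from the paper).
tp, fp, fn, tn = 8, 2, 4, 6

p, r = precision_recall(tp, fp, fn)
print(f"precision = {p:.3f}, recall = {r:.3f}")
```

Note that the true negative count (tn) is part of the confusion matrix but does not appear in either formula.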

Part 2: Bayesian Methods (50 points)

  1. Assume that the probability of there being rain on any given day is 0.15, and that the (independent) probability of getting in a car accident is 0.02. What is the probability that a given day is rainy and that you also get in a car accident on that day? (5 points)

  2. From vehicle accident records, assume that we've determined that the probability of it being a rainy day, given that a car accident was observed, is 0.3. Using Bayes' Rule and the prior probabilities in question 1, what is the conditional probability of getting into a car accident today, given that you already know today is rainy? (8 points)

    Why is this probability different from the one you got for question 1? (2 points)

  3. I want to model the factors affecting whether a research paper will be rejected by a journal (not published). Draw a Bayesian network to capture the following information. Put your network in a file called <yourlastname>-hw3-bayesian.pdf (or .jpg or .gif). (5 points)

    The random variables, which should become nodes in your network, are shown in quotation marks.

    • The probability of getting a "grouchy reviewer" who doesn't like anything they read is 0.15.
    • The probability of there being a "technical flaw" in the paper is 0.05.
    • If there is a "grouchy reviewer" and a "technical flaw", the probability of rejection is 0.95.
    • If there is no "grouchy reviewer" but there is a "technical flaw", the probability of rejection is still high: 0.8.
    • If there is a "grouchy reviewer" and no "technical flaw", the probability of rejection is 0.3 (the grouchy reviewer may be overridden by a nicer reviewer).
    • If there is neither a "grouchy reviewer" nor a "technical flaw", the probability of rejection is 0.25 (there are many other reasons a paper may be rejected).

    Annotate each node with the appropriate probabilities, as in Figure 3.3 in the Alpaydin book. (5 points)

    What is the probability that the paper is rejected, given that there is a technical flaw? (10 points)

  4. Given this data set describing students, in which the class label is the person's major:

    Height   Gender   Wears glasses?   Class (major)
    short    male     yes              Computer Science
    tall     male     no               Computer Science
    tall     male     yes              Computer Science
    medium   female   yes              Computer Science
    medium   male     yes              Electrical Engineering
    tall     male     no               Electrical Engineering
    tall     male     yes              Electrical Engineering

    Use the Naive Bayes classifier to predict this person's major (show your work): (15 points)

    Height   Gender   Wears glasses?   Class (major)

  5. What is an association rule? Give an example. (5 points)
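
As a reminder for questions 1 and 2 of this part, the sketch below contrasts the joint probability of two independent events with a conditional probability obtained from Bayes' Rule. The events A and B and all of the numbers are hypothetical and deliberately differ from the ones in the questions; the function names are only illustrative.

```python
def joint_independent(p_a, p_b):
    """P(A and B) when A and B are independent."""
    return p_a * p_b


def bayes_rule(p_b_given_a, p_a, p_b):
    """P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Hypothetical values for illustration only (NOT the homework's numbers).
p_a = 0.1          # prior P(A)
p_b = 0.4          # prior P(B)
p_b_given_a = 0.6  # likelihood P(B | A)

p_joint = joint_independent(p_a, p_b)
p_a_given_b = bayes_rule(p_b_given_a, p_a, p_b)

print(f"P(A and B), assuming independence: {p_joint:.3f}")
print(f"P(A | B), from Bayes' Rule:        {p_a_given_b:.3f}")
```

The two quantities answer different questions: the first is the chance that both events occur, while the second is the chance of A once B is already known to have occurred.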

What to turn in

Upload these files to CSNS under "Homework 3":

  1. <yourlastname>-hw3.txt (or <yourlastname>-hw3.pdf if you prefer to submit in PDF format)
  2. <yourlastname>-hw3-perceptron.pdf (or .gif or .jpg)
  3. <yourlastname>-hw3-bayesian.pdf (or .gif or .jpg)

Extra credit

If you wish to tackle some extra credit, you may do so here. You can earn up to 5 points to be applied to any of your homework assignments (but not to exceed 100 on any assignment). To receive these points, you must get at least a 70% on the main part of Homework 3, so finish the regular assignment before moving on to this part.

A). (5 points) Given your Bayesian network in problem 3 of part 2, what is the probability of getting a grouchy reviewer, given that the paper was rejected?

What to turn in

Upload this file to CSNS under "Homework 3: Extra Credit":

  1. <yourlastname>-hw3-extra.txt (or <yourlastname>-hw3-extra.pdf if you prefer to submit in PDF format)