Machine Learning addresses the problem of identifying patterns in data. The major goal of machine learning is to allow computers to learn (potentially complex) patterns from data, and then make decisions based on these patterns. This class will provide an introduction to the fundamentals of this discipline.
Much of machine learning relies on mathematical foundations from Probability and Statistics. The course will provide an overview of the requisite math. However, students with some exposure to these fields will have a smoother time understanding the mathematical underpinnings of the material.
Upon successful completion of this course, a student can expect to have:
Come to Class. It will be difficult to do well in the class without regular attendance. There is no additional penalty for missing class.
Cell phones must be on silent, and are not to be checked or used during class - if you are expecting an urgent call, tell the instructor at the start of class.
Laptops are fine for taking notes. No internet, no chat, no games.
Cell phone and Laptop policy: One warning, after that 5 points off the next homework or exam for each issue. Same policy for the instructor. One warning, after that, everyone gets 5 points on the next homework or exam.
Pattern Recognition and Machine Learning by Christopher Bishop.
Retail Price: $89.95 (12/27 Amazon Price: $58.83)
Assignments: 50% (5 x 10%)
Final Project: 50%
The Final Letter Grade will be based on a scaled adjustment of the Final Numeric Grade. When the scale has been determined, the class will be informed either in class or over email, and it will be posted to the course webpage.
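As a rough illustration of the weighting above, the sketch below computes a Final Numeric Grade as a plain weighted sum. It assumes the final project is scored out of 100 like the assignments, which the syllabus does not state, and the function name `final_numeric_grade` is made up for this example. The scaled adjustment to a letter grade is not modeled, since the scale is announced later.

```python
def final_numeric_grade(assignment_scores, project_score):
    """Weighted sum: five assignments at 10% each, final project at 50%.

    Assumption: every score, including the project, is out of 100.
    """
    if len(assignment_scores) != 5:
        raise ValueError("expected exactly 5 assignment scores")
    # Integer percentage weights keep the arithmetic exact for whole scores.
    weighted_total = 10 * sum(assignment_scores) + 50 * project_score
    return weighted_total / 100


# Hypothetical scores, for illustration only.
print(final_numeric_grade([90, 80, 100, 70, 85], 88))  # 86.5
```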
Do not cheat. You may discuss assignments with your classmates, but write or program your assignment alone. Do not ask for or offer to share code, or written assignments. If you discuss an assignment with a classmate, or on an online forum, include the name of the classmate or URL of the forum on your assignment or in the documentation of your code. The first instance of cheating results in an automatic zero for the assignment (or midterm or final project). A second instance of cheating results in a zero (F) for the course. The Computer Science Department will be notified in writing of all instances of cheating. On a second instance a report will be submitted to the Office of Academic Integrity.
Assignments will be posted to the website after class on Tuesdays.
All assignments will be scored out of 100 points.
There are 5 assignments. Each assignment will have a theoretical (pen-and-paper) component. Assignments may also include an implementation (coding) component.
Assignments will be due just after the start of class, 11:50am. Assignments should be delivered electronically, where possible.
Deliver assignments with a timestamp before 11:50am to avoid a late penalty. If an extension is needed, let me know as soon as possible. I will do my best to be reasonable to you and fair to the rest of the class. Delivering an assignment while being more than 5 minutes late for class will make the assignment count as Late. There is a 5 point Late Penalty for each 12 hours the assignment is late. Friday 11:50am - Friday 11:50pm: -5 points. Friday 11:51pm - Saturday 11:50am: -10 points. Saturday 11:51am - Saturday 11:50pm: -15 points. Saturday 11:51pm - Sunday 11:50am: -20 points.
Grades will be returned one class after the assignment is due. No assignments will be accepted after 11:50am two days after the due date.
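The late policy above can be sketched as a small calculation. This is only an illustration under stated assumptions, not the official grading script: the function name `late_penalty` is made up, a timestamp exactly at the deadline is treated as late (following "Friday 11:50am - Friday 11:50pm: -5 points"), and timestamps are compared exactly as given rather than rounded to the minute.

```python
from datetime import datetime, timedelta

PERIOD = timedelta(hours=12)   # one penalty step per 12 hours late
CUTOFF = timedelta(days=2)     # nothing is accepted more than 2 days late
POINTS_PER_PERIOD = 5


def late_penalty(due, submitted):
    """Return the points deducted, or None if the submission is not accepted.

    Assumption: a timestamp exactly at the deadline already counts as late.
    """
    late = submitted - due
    if late < timedelta(0):
        return 0                       # on time
    if late > CUTOFF:
        return None                    # past the 2-day cutoff
    # Ceiling division of the lateness by 12 hours, with at least one period.
    periods = max(1, -(-late // PERIOD))
    return POINTS_PER_PERIOD * periods


# Hypothetical Friday deadline, for illustration only.
due = datetime(2011, 3, 11, 11, 50)
print(late_penalty(due, due + timedelta(hours=13)))  # 10
```

A return value of `None` rather than a large penalty keeps "not accepted" distinct from "accepted with a deduction".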
After each assignment and the midterm are graded, the anonymized mean, median, and standard deviation of the scores will be presented during class.
Coding assignments can be written in C++, Java, or MATLAB.
In general, grading will be 65% Implementation (compilation, passing tests, implementation details) and 35% Documentation and Style. This may be adjusted for some assignments. Always read the assignment for the grading breakdown.
Testing will be performed automatically. Sample tests will be delivered with each assignment. If code does not operate using the published and distributed testing format, the assignment will be considered Incorrect and a significant (~50%) Implementation penalty will be imposed.
Detailed requirements will accompany each assignment. The instructions and requirements on a particular assignment always take precedence over the general guidelines on the course website.
Coding assignments should be submitted electronically. Submitting multiple times is fine. The latest submission made on time will be graded. If you submit an assignment late after already submitting it on time, you must let me know, via email, that you would like the late submission graded instead.
Each coding assignment will require a README file as a component of its documentation. A README file should provide a high-level description of your assignment, or project.
A successful README file will include the following:
A sample README will be distributed with the first implementation assignment to serve as a template.
Written Assignments should also be delivered electronically.
Electronic copies must be in one of the following formats: .pdf, Microsoft Word .doc, Google Docs.
Points for each question will be described in each assignment.
The Final Project can take one of two forms: A paper or an implementation project. Possible project ideas will be presented in class. Individual meetings about the project topics will take place early in the semester. A progress meeting will take place at least 2 weeks before the project is due. Part of the project will be a short (15-20 minute) presentation of your work.
Identify a relatively narrow machine learning topic. Extensively review the literature, and write a paper describing the topic, approaches, and evaluation. The goal of this paper should be that an intelligent and reasonably informed reader with no exposure to the topic would be able to understand the topic, know the current open issues (unresolved research questions), and know where to look for more information on the topic.
A successful survey paper would be approximately 10 pages.
Perform a machine learning experiment. This will involve implementing a machine learning algorithm, evaluating it on a data set, and comparing the results to other approaches. Some part of this experiment should be novel: a modification to an existing machine learning algorithm, a novel evaluation technique, or an application of the algorithm to a new problem or new set of data. Note: a successful project does not need to generate state-of-the-art results. Some element of novelty, however, is required. A short (4-page) report on the algorithm, dataset/problem, and evaluation is expected as part of the project.
Welcome. Introduction. Basic Classification
|Math Review. Probability. Linear Algebra. Vector Calculus.||Read Chapter 1.1, 1.2, 2.1, 2.2, 2.3, 3.1. HW 1 Assigned.|
|February 11||Lincoln's Birthday Observed. No class.|
|Lagrange Multipliers. Linear Regression. Regularization.||HW 1 Due. Read Chapter 1.1, 1.2, 1.5, 4.1, 4.2, 4.3. HW 2 Assigned (train and test data).|
|Information Gain Decision Trees. Graphical Models.||Read Chapter 8.1, 8.2, 8.4|
|March 11||Junction Tree Algorithm. Belief Propagation.||HW 2 Due. Read Chapter 13.1, 13.2. HW 3 Assigned.|
|March 18||Hidden Markov Models. Sampling.||Read Chapter 4.1.7, 5|
|March 25||Perceptrons.||Read Chapter 6.|
|April 1||Support Vector Machines and Kernel Methods.|
|Supervised Learning Recap. Clustering Overview. Gaussian Mixture Models.||HW 3 Due. HW 4 Assigned.|
Expectation Maximization (also EM in graphical models). PCA.
|April 22||Spring Recess. No Class.|
|Model Adaptation. Evaluation Methods, Parts 1 and 2.||HW 4 Due. HW 5 Assigned (data).|
|May 6||Genetic Programming and Dimensionality Reduction (Guest Lecturer: Ilknur Icke) [11:45am-12:45pm]|
|May 13||No class -- Work on your projects, office hours by appointment.|
|May 20 - 10:45-1:45||Note the Earlier Start time!||Final Project Due 11:45am|
|May 27||Class Over.||HW 5 Due (Extended deadline)|