Course Title: | 11-411/11-611: Natural Language Processing |
Semester: | Spring 2025 |
Meeting Times: | Monday, Wednesday, Friday |
Location: | Gates Hall |
Instructors: | Prof. David R. Mortensen and Prof. Eric H. Nyberg |
This course is an introduction to natural language processing, one of the most exciting and important fields of artificial intelligence. It is a multidisciplinary field that combines insights and methodologies from machine learning, theoretical computer science, linguistics, and the social sciences.
In this course, you will learn the fundamental concepts of NLP that will allow you to apply natural language processing not just in the current technological environment (which is heavily focused on self-supervised language models) but in the future, when new technologies have achieved the state of the art. Students will learn not just how to develop models, but how to develop data sets and use evaluation metrics so that these models can be trained and evaluated.
The first half of the course will focus on fundamental methodologies for representing and modeling natural language using statistical methods, including machine learning and neural networks. The second half of the course will focus on specific applications and tasks, such as information extraction, question answering, machine translation and speech processing.
Speech and Language Processing (3rd Edition) by Daniel Jurafsky and James H. Martin
This course will use the draft 3rd edition of Jurafsky and Martin's Speech and Language Processing (SLP3). It is available online for free.
You are expected to do the readings before attending class since they will provide background needed to understand the lectures.
The course will have six graded components (none of which will be curved):
Assessment | Percentage |
---|---|
HW1 | 10.00% |
HW2 | 10.00% |
HW3 | 10.00% |
HW4 | 20.00% |
Midterm Exam | 25.00% |
Final Exam | 25.00% |
Total | 100.00% |
Grade | Range |
---|---|
A+ | 100% to 97.0% |
A | < 97.0% to 93.0% |
A- | < 93.0% to 90.0% |
B+ | < 90.0% to 87.0% |
B | < 87.0% to 83.0% |
B- | < 83.0% to 80.0% |
C+ | < 80.0% to 77.0% |
C | < 77.0% to 73.0% |
C- | < 73.0% to 70.0% |
D+ | < 70.0% to 67.0% |
D | < 67.0% to 63.0% |
D- | < 63.0% to 60.0% |
R | < 60.0% to 0.0% |
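As an illustration of how the two tables above combine, the following minimal Python sketch computes a weighted course percentage and maps it to a letter grade. It is not the official grade calculation. It assumes that each assessment is scored on a 0 to 100 scale, that the lower bound of each range is an inclusive cutoff, and that no rounding is applied to the final percentage (the syllabus does not say). The function names and the sample scores are hypothetical.

```python
# Weights copied from the assessment table (percent of the course grade).
WEIGHTS = {
    "HW1": 10.0, "HW2": 10.0, "HW3": 10.0, "HW4": 20.0,
    "Midterm Exam": 25.0, "Final Exam": 25.0,
}

# Lower bound of each letter grade, copied from the grade table.
# Assumption: the lower bound is inclusive.
CUTOFFS = [
    (97.0, "A+"), (93.0, "A"), (90.0, "A-"),
    (87.0, "B+"), (83.0, "B"), (80.0, "B-"),
    (77.0, "C+"), (73.0, "C"), (70.0, "C-"),
    (67.0, "D+"), (63.0, "D"), (60.0, "D-"),
    (0.0, "R"),
]


def course_percentage(scores):
    """Weighted average of per-assessment scores, each on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


def letter_grade(percentage):
    """Return the first letter whose lower bound the percentage meets."""
    for lower, letter in CUTOFFS:
        if percentage >= lower:
            return letter
    raise ValueError("percentage must be non-negative")


# Hypothetical scores, for illustration only.
example_scores = {
    "HW1": 95, "HW2": 90, "HW3": 85, "HW4": 92,
    "Midterm Exam": 88, "Final Exam": 91,
}
example_total = course_percentage(example_scores)
example_letter = letter_grade(example_total)
```

With these sample scores the weighted total is 90.15%, which falls in the A- range. Because nothing is curved, the mapping depends only on a student's own scores.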
All homework assignments will be submitted via Gradescope.
There will be two in-person, written exams (midterm and final). Both will consist of a small number of multipart questions. The midterm will consist of 3–4 questions (each with 3–5 parts) and the final will consist of 4–5 questions of similar length.
The questions will be designed to test conceptual understanding (not recall of facts). The exams are open book and open note but non-collaborative (you may not consult with one another or any other sentient being, including LaMDA).
Each student is allocated five late days (sometimes called "grace days") to use as needed throughout the semester. If a student hands in a homework assignment two days late, this will use two of their late days (leaving three days if no late days had been used previously).
There is no grade penalty for using late days. However, after late days are exhausted, late work will not be accepted for credit. The only exceptions to this are unexpected events—the death of a friend or family member, a severe illness, a mental health crisis or episode, etc. These exceptions will be handled on a case-by-case basis with the instructors.
No accommodations for health or disability matters will be made without mediation by the Office of Disability Resources. Students are encouraged to reach out to instructors as soon as they anticipate requiring accommodations beyond the grace day limit, ideally in advance of the assignment due date.
Because of this policy, students are urged to manage their late days with care and to save them for dire times.
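The late-day accounting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: lateness is counted in whole days, and a submission that would exceed the remaining budget is not accepted and does not consume any late days. The syllabus does not spell out either detail, and instructor-approved exceptions are outside the sketch. The function name `apply_late_days` is hypothetical.

```python
TOTAL_LATE_DAYS = 5  # from the policy above


def apply_late_days(days_late_per_assignment, budget=TOTAL_LATE_DAYS):
    """Track the late-day budget across assignments, in submission order.

    Returns (remaining_days, accepted), where accepted[i] says whether
    assignment i would be accepted for credit under the stated assumptions.
    """
    remaining = budget
    accepted = []
    for days_late in days_late_per_assignment:
        if days_late <= remaining:
            remaining -= days_late
            accepted.append(True)
        else:
            # Assumption: a rejected submission consumes no late days.
            accepted.append(False)
    return remaining, accepted


# The syllabus example: one assignment handed in two days late.
syllabus_example = apply_late_days([2])

# A hypothetical semester: 2 days late, on time, 1 day late, 3 days late.
semester_example = apply_late_days([2, 0, 1, 3])
```

In the hypothetical semester, the first three submissions use three late days in total, so the fourth (three days late with only two days left) would not be accepted for credit.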
Important information about the course such as changes to this syllabus, due dates, office hours, etc. will be communicated in class as well as on Piazza. Grades will be available via Gradescope and Canvas, and slides will be posted on Canvas.
Students are encouraged to post questions (and answers!) on Piazza, using anonymous/limited audience posting as appropriate. Everyone, both students and course staff, is expected to check Piazza regularly, and to communicate clearly, kindly, and with respect.
Students attending CMU in Pittsburgh are expected to attend every class in person. That being said, we understand that scheduling conflicts may arise infrequently throughout the semester, such as a job interview, conference, or illness. Lectures will be recorded and access to missed lectures can be requested using the form provided on Canvas.
Any cheating or plagiarism will be dealt with according to the University policies on academic integrity. In general, high-level discussion of tools, concepts, frameworks, and formalisms is acceptable collaboration and is encouraged. Sharing specific aspects of solutions or results with other students, or consulting work from previous semesters or other universities, is considered cheating.
Submitting work produced by generative AI systems like ChatGPT or GitHub Copilot will be considered silly, for reasons we will explore in this course.
Many people have disabilities, including members of our own families. We see disabilities as deficits not in disabled people but in the institutions and societies that are structured in ways that disadvantage them. We wish to do our part to overcome this disparate treatment.
If you have a disability (visible or invisible), please let us know as soon as possible (you don't need to tell us the nature of the disability) and work via the Office of Disability Resources to develop a set of accommodations which we can then approve. These may include extra time on exams, a quiet place in which to take an exam, alt text on all images, documents that work for people with differences in vision, sign language interpretation, captioning, etc.
Throughout human history, some people have been denied the rights and opportunities available to others on the basis of their race, gender, economic class, caste, ancestry, language community, age, religion, beliefs, political affiliation, and abilities (visible and invisible). A single course cannot undo the injustices of history, but we—as a teaching staff—are committed to fighting inequity and promoting inclusion. We encourage you to join us.
If you feel that you, or those around you, have been treated unfairly on the basis of identity (or perceived identity) by us, by other members of the teaching staff, or by other students in the course, we ask that you share your experience with the Center for Student Diversity and Inclusion (csdi@andrew.cmu.edu, (412) 268-2150) or through the anonymous Report-It reporting platform (reportit.net, username: tartans, password: plaid).