Course Syllabus - Database Design & Implementation
New York University
Department of Computer Science
Course description
In this course, we introduce principles and applications of databases. We begin by studying issues related to data collection and discovery, and techniques for prepping and munging the raw data to shape it into a form that is ready for database storage. Studying the design of relational databases, we learn how to craft databases to model “real-world” data in a way that lends itself to the types of manipulation and analysis that we have in mind. Once we have a clear understanding of these issues, we look at more advanced topics in data, such as data analysis and visualization in Python, how NoSQL databases like MongoDB compare to relational systems, ethical and societal implications of the world of data, and how to build simple web apps that publish and save data stored in databases.
Credits
4 credits
Meeting pattern
Tuesdays/Thursdays 11:00-12:15
Prerequisites
Introduction to Computer Programming (either CSCI-UA.0002 or CSCI-UA.0003) with a grade of C or better. Students who have successfully completed Data Management and Analysis (CSCI-UA 479) are not eligible to take this course.
Learning objectives
Upon completing this course, students will be familiar with some of the most common database systems and practices, including:
- Explorations of “data in the wild” - common sources of and issues related to working with real data sets
- Textual representations of data - common formats for storing data as text, e.g. CSV, JSON, XML, HTML, and fixed-width columnar text files
- Spreadsheets as databases - using database features of standard spreadsheet software such as Microsoft Excel and Google Sheets
- Using Python to prepare datafiles - scrubbing, munging, and massaging data to prepare it for analysis
- Relational database programming with SQL - a deep dive into the SQL language
- Python database integration - connecting Python programs to both SQL and NoSQL databases
- Data analysis - contemporary data analysis techniques, including standard Python modules and the pandas library within a Jupyter Notebook environment
- Visual display of data - using common data visualization libraries, such as pandas and matplotlib, to create charts and graphs to more easily spot and communicate trends in data
- Constructing and evaluating data-based claims - using data to make and test hypotheses and evaluate claims about the world
- Societal implications of data collection and use - evaluating the societal and ethical consequences of different forms of data collection and utilization
- Relational database theory - the relational model, normalization, and Entity-Relationship Diagrams
- MongoDB - comparing and contrasting MongoDB with relational systems and viewing it as an example of the trend towards NoSQL databases
- Web application implementation - implementation of a Python-based database-driven web application using MongoDB, Flask, and pymongo
- Time-permitting, we will cover blockchain technology in detail as a unique variety of database
To achieve mastery in these topics, students will complete quizzes and exercises corresponding to each lecture topic, as well as a midterm and a final exam.
While one class session is usually dedicated to hands-on student work to start each assigned exercise, students are expected to work independently to complete the exercises for approximately 10-15 hours each week.
Instructor
Amos Bloomberg
WWH 424
Department
This course is offered by the Computer Science Department. For department-related questions or concerns, please see the department’s contact information.
Textbooks
All books are available online:
- via Online Access through NYU Libraries / O'Reilly Safari (log in using your NYU Net ID email address; see notes on accessing via Safari apps)
- free html / ebook versions supplied by author or publisher
- database platform's documentation
Readings will be selected from the following books:
- Python for Everybody: Exploring Data Using Python 3 by Charles Severance (py4e)
- Bad Data Handbook by Q. Ethan McCallum
- Using SQLite by Jay A. Kreibich
- Database Design by Adrienne Watt (primary author)
- Learning MySQL and MariaDB: heading in the right direction with MySQL and MariaDB by Russell J. T. Dyer
- MongoDB Manual
Getting help
The help resources available to you are listed below in order of increasing urgency:
Messaging
Our course will use a message board (link to be distributed in class) as its main communication channel for announcements and discussion. This is a good place to ask questions that anyone - other students, graders, tutors, or the professor - can answer. This is a resource best used when the answer is not required urgently.
Tutoring
Tutors for this course are waiting to answer your questions, either on our message board or during dedicated tutoring hours (hours to be distributed in class). Use tutoring for more involved questions and when you prefer a more immediate answer.
Talk with the instructor
For any issues at all, contact the instructor:
- see me before class
- raise your hand or simply speak during class
- see me after class
- come to my open office hours (hours to be distributed in class)
Additional tutoring resources
Additional academic support is also available through the University Learning Center.
Attendance & participation
Attendance is mandatory, and more than two absences may incur a penalty of up to 10% of the total grade. In-class and online message board participation is encouraged. Anecdotally, students who do not attend class regularly and who do not participate in discussions tend to do poorly.
Student and instructor interaction during class
Class sessions are a mixture of lecture, discussion, and project work. During any lecture or discussion, students are generally encouraged to participate with questions, comments, and constructive criticism of the material being covered. On days when students work on their assigned projects, students work individually and occasionally in small groups of typically 2-3 people to complete specific projects, with help and guidance provided by the instructor.
Required software and hardware
All students require access to a desktop or laptop computer on which they can write software using a specific set of applications.
i6 account
In addition to your NYU Home Account, we will be using a special computer account on a Unix Web server named i6.cims.nyu.edu which will be assigned to you automatically based on your enrollment. This is called an i6 account and we will use it to host our websites.
- Common questions about i6 accounts are answered on this FAQ page.
- If you forget your i6 password and would like to reset it, go to the i6 password reset page for instructions on how to do so.
- If you do not receive notification that an account has been created for you, check your spam, and try to reset your password using the link above.
Grading
You will receive a grade calculated mechanically on the following rubric:
- 15%: Quizzes
- 35%: Exercises
- 25%: Midterm exam
- 25%: Final exam
Attendance may be taken into account in the final grade.
Letter grades
The final class grade will be assigned as follows:
Grade Range | Letter Grade |
---|---|
93-100% | A |
90-92% | A- |
87-89% | B+ |
83-86% | B |
80-82% | B- |
77-79% | C+ |
70-76% | C |
60-69% | D |
0-59% | F |
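
As a rough illustration of how the rubric weights and the letter-grade table combine, here is a minimal Python sketch. The function names `weighted_score` and `letter_grade` are hypothetical, the sketch ignores any attendance adjustment, and it assumes that a fractional score such as 92.5 falls into the lower range, since the table does not say how such scores are rounded.

```python
# Rubric weights in percentage points (they sum to 100).
WEIGHTS = {"quizzes": 15, "exercises": 35, "midterm": 25, "final": 25}

# Lower bound of each letter-grade range from the table, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (70, "C"), (60, "D"), (0, "F"),
]


def weighted_score(scores):
    """Combine per-component percentages (0-100) into one overall percentage."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percent):
    """Map an overall percentage to a letter grade.

    Assumption: a score belongs to the highest range whose lower bound
    it meets, so 92.5 maps to A- rather than being rounded up to A.
    """
    for lower_bound, letter in CUTOFFS:
        if percent >= lower_bound:
            return letter
    return "F"


example = {"quizzes": 90, "exercises": 80, "midterm": 70, "final": 100}
total = weighted_score(example)
print(total, letter_grade(total))  # prints: 84.0 B
```

The weights are kept as whole percentage points and divided by 100 at the end, which avoids small floating-point errors when the component scores are whole numbers.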
Notification of grades
Students will be sent their complete individual grades via email approximately once per week.
Quizzes
Quizzes are completed outside of class. You must be logged into Google with your NYU Net ID account in order to view the quizzes.
Each quiz is submitted as a Google Form.
Exercises
Exercises are usually begun in-class with the remainder completed outside of class.
All exercises are submitted by pushing code to GitHub.
- we will cover how to push code to GitHub
- unless you have good reason to do otherwise, follow best-practices for all basic file names and file extensions
Late policy
All assigned work is due before class on the due date indicated on the schedule:
- for every 24 hours that work is late, we apply a 10% penalty on the grade, up to a maximum penalty of 30%.
- after 72 hours, we will no longer accept the work.
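
The late policy above can be sketched as a small Python function. This is only an illustration under stated assumptions: the name `late_penalty_percent` is hypothetical, and the sketch assumes that every started 24-hour period counts as a full day late, which the policy does not spell out. How the penalty is then applied to the grade is left out.

```python
import math


def late_penalty_percent(hours_late):
    """Return the late penalty in percent, or None if the work is not accepted.

    Assumption: a partial 24-hour period counts as a full day late.
    """
    if hours_late <= 0:
        return 0  # submitted on time
    if hours_late > 72:
        return None  # after 72 hours the work is no longer accepted
    days_late = math.ceil(hours_late / 24)
    return min(10 * days_late, 30)  # 10% per day, capped at 30%


for hours in (0, 5, 30, 72, 80):
    print(hours, late_penalty_percent(hours))
```

Returning `None` rather than a 100% penalty keeps "not accepted" distinct from "accepted with a penalty".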
Extensions
Students are automatically granted two extensions for late assignments, each allowing up to 3 days of lateness, with the exception that all assignments must be submitted before the last day of regular classes before the final exam period.
- extensions must be claimed at the time the work is submitted and cannot be applied retroactively.
- when submitting an assignment for which you would like to use one of these automatic extensions, you must notify the grader that you are using the extension, otherwise your assignment will be rejected.
- for any group work, each member of the group must use an extension (or lose points if none is available) for the entire group to submit work late.
- No additional extensions will be given, except in the case of a documented medical emergency.
Regrade requests
If a student requests a regrade of any work, we will regrade the work in full, not just the part that the student believes has been mis-graded.
Disability disclosure statement
Academic accommodations are available for students with disabilities. Please contact the Moses Center for Student Accessibility (212-998-4980 or mosescsd@nyu.edu) for further information. Students who are requesting academic accommodations are advised to reach out to the Moses Center as early as possible in the semester for assistance.
Student wellness
In a large, complex community like NYU, it’s vital to reach out to others, particularly those who are isolated or engaged in self-destructive activities. Student wellness is the responsibility of all of us.
The NYU Wellness Exchange is the constellation of NYU’s programs and services designed to address the overall health and mental health needs of its students. Students can access this service 24 hours a day, seven days a week - wellness.exchange@nyu.edu; (212) 443-9999. Students can call the Wellness Exchange hotline (212-443-9999) or the NYU Counseling Service (212-998-4780) to make an appointment for Single Session, Short-term, or Group counseling sessions.
Academic Integrity
Working with others and leveraging all resources available to you is a prerequisite for success. This is different from copying, cheating, plagiarism, and mental laziness. All submitted work must be your own. There are very reliable systems we use to detect plagiarism in computer code, such as moss and compare50. If you submit any work that is not your own, you risk failure or worse.
Please read the Computer Science department’s policy on academic integrity and the University-wide policy which supersedes it.