Course Syllabus - Database Design & Implementation
Database Design & Web Implementation
New York University
Department of Computer Science
In this course, we introduce principles and applications of databases. We begin by studying issues related to data collection and discovery, and techniques for prepping and munging the raw data to shape it into a form that is ready for database storage. Studying the design of relational databases, we learn how to craft databases to model “real-world” data in a way that lends itself to the types of manipulation and analysis that we have in mind. Once we have a clear understanding of these issues in relational database design, we look at more advanced topics in data, such as data analysis in Python, NoSQL databases like MongoDB, web APIs, blockchain, and building simple web apps that publish data held in databases.
Introduction to Computer Programming (either CSCI-UA.0002 or CSCI-UA.0003) with a grade of C or better. Students who have successfully completed Data Management and Analysis (CSCI-UA 479) are not eligible to take this course.
Upon completing this course, students will be familiar with some of the most common database systems and practices, including:
- Explorations of “data in the wild” - common sources of real data sets and the issues that arise when working with them
- Spreadsheets as databases - Using database features of standard spreadsheet software such as Microsoft Excel and Google Sheets
- Textual representations of data - common formats for storing data as text, e.g. HTML and fixed-width columnar text files
- Using Python to prepare datafiles - scrubbing, munging, and massaging data to prepare it for analysis
- Relational database theory - the relational model, normalization, and Entity-Relationship Diagrams
- Relational database programming with SQL - a deep dive into the SQL language
- MongoDB - comparing and contrasting MongoDB with relational systems and viewing it as an example of the trend towards NoSQL databases
- Python database integration - connecting Python programs to both SQL and NoSQL databases
- Web application implementation - implementation of a Python-based database-driven web application using MongoDB
- Data analysis with pandas - contemporary data analysis techniques using the pandas library within the Jupyter Notebook environment
- Data visualization with matplotlib - using matplotlib to create charts and graphs to more easily spot and communicate trends in data
- Time-permitting, we will cover blockchain technology in detail as a unique variety of database
To achieve mastery in these topics, students will take quizzes and exercises corresponding to each lecture topic as well as a midterm and final exam.
While one class session is usually dedicated to hands-on student work to start each assigned exercise, students are expected to work independently to complete the exercises for approximately 10-15 hours each week.
amos at cs dot nyu dot edu
All books are available online:
- via Online Access through NYU Libraries / O'Reilly Safari (log in using your NYU Net ID email address; see notes on accessing via Safari apps)
- free html / ebook versions supplied by author or publisher
- database platform's documentation
Readings will be selected from the following books:
- Python for Everybody: Exploring Data Using Python 3 by Charles Severance (py4e)
- Bad Data Handbook by Q. Ethan McCallum
- Using SQLite by Jay A. Kreibich
- Database Design by Adrienne Watt (primary author)
- Learning MySQL and MariaDB: heading in the right direction with MySQL and MariaDB by Russell J. T. Dyer
- MongoDB Manual
Help resources available to you are listed in order of urgency of your problem:
Our course will use a message board (link to be distributed in class) as its main communication channel for announcements and discussion. This is a good place to ask questions that anyone - other students, graders, tutors, or the professor - can answer. This is a resource best used when the answer is not required urgently.
Tutors for this course are available to answer your questions, either on our message board or during dedicated tutoring hours. Use tutoring for more involved questions and when you prefer a more immediate answer.
Tutoring hours (all times in Eastern Time):
Talk with the instructor
For any issues at all, contact the instructor:
- see me before class
- raise your hand or simply speak during class
- see me after class
- come to my open office hours - hours to be distributed in class
Additional tutoring resources
Additional academic support is also available through the University Learning Center.
Attendance & participation
Attendance is mandatory. In-class and online message board participation is encouraged. Anecdotally, students who do not attend class regularly and who do not participate in discussions tend to do poorly.
Student and instructor interaction during class
Class sessions are a mixture of lecture, discussion, and project work. During any lecture or discussion, students are generally encouraged to participate with questions, comments, and constructive criticism of the material being covered. On days when students work on their assigned projects, students work individually and occasionally in small groups of typically 2-3 people to complete specific projects, with help and guidance provided by the instructor.
Required software and hardware
All students require access to a desktop or laptop computer on which they can write software using a specific set of applications.
In addition to your NYU Home Account, we will be using a special computer account on a Unix Web server named i6.cims.nyu.edu which will be assigned to you automatically based on your enrollment. This is called an i6 account and we will use it to host our websites.
- Common questions about i6 accounts are answered on this FAQ page.
- If you forget your i6 password and would like to reset it, go to the i6 password reset page for instructions on how to do so.
- If you do not receive notification that an account has been created for you, check your spam, and try to reset your password using the link above.
You will receive a grade calculated mechanically on the following rubric:
- 15%: Quizzes
- 35%: Exercises
- 25%: Midterm exam
- 25%: Final exam
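As an illustration of how the rubric above combines into a single number, here is a minimal Python sketch. It assumes each component has already been averaged into a percentage from 0 to 100; the syllabus does not say how component averages are computed, and the names `WEIGHTS` and `course_grade` are hypothetical.

```python
# Weights taken from the rubric above, in percent. They sum to 100.
WEIGHTS = {
    "quizzes": 15,
    "exercises": 35,
    "midterm": 25,
    "final": 25,
}


def course_grade(scores):
    """Return the weighted course grade on a 0-100 scale.

    `scores` maps each rubric component to a percentage (0-100).
    Assumption: every component is already expressed as a percentage.
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Example with made-up scores.
example = {"quizzes": 80, "exercises": 90, "midterm": 70, "final": 85}
print(course_grade(example))  # 82.25
```

Because exercises carry the largest weight, a ten-point change in the exercise average moves the course grade by 3.5 points, versus 1.5 points for the same change in the quiz average.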
Notification of grades
Students will be sent their complete individual grades via email approximately once per week.
Quizzes are completed outside of class. You must be logged into Google with your NYU Net ID account in order to view the quizzes.
Quizzes are submitted via a Google Form.
Exercises are usually begun in-class with the remainder completed outside of class.
All exercises are submitted by pushing code to GitHub.
- we will cover how to push code to GitHub
- unless you have good reason to do otherwise, follow best practices for all basic file names and file extensions
All assigned work is due before class on the due date indicated on the schedule.
- for every 24 hours that work is late, we apply a 10% penalty on the grade, up to a maximum penalty of 30%.
- after 72 hours, we will no longer accept the work.
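The late-penalty arithmetic above can be sketched in Python as follows. This is an illustration under two assumptions the syllabus does not state: that each started 24-hour period counts as a full period (so work that is one hour late loses 10%), and that the penalty is taken as a percentage of the earned score. The function name `late_adjusted_score` is hypothetical.

```python
import math
from typing import Optional


def late_adjusted_score(raw_score: float, hours_late: float) -> Optional[float]:
    """Apply the late policy to a raw score.

    Returns the penalized score, or None if the work is more than
    72 hours late and is therefore no longer accepted.
    Assumption: any started 24-hour period counts as a full period.
    """
    if hours_late <= 0:
        return raw_score  # on time, no penalty
    if hours_late > 72:
        return None  # past the acceptance window
    periods = math.ceil(hours_late / 24)
    penalty_percent = min(10 * periods, 30)  # 10% per period, capped at 30%
    return raw_score * (100 - penalty_percent) / 100


print(late_adjusted_score(90, 0))    # 90
print(late_adjusted_score(90, 5))    # 81.0
print(late_adjusted_score(90, 50))   # 63.0
print(late_adjusted_score(90, 80))   # None
```

Under these assumptions the 30% cap and the 72-hour cutoff coincide: the third 24-hour period is the last one in which work is accepted.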
Students are automatically granted 2 late assignment extensions of up to 3 days late each, with the exception that all assignments must be submitted before the last day of regular classes before the final exam period.
- When submitting an assignment for which you would like to use one of these automatic extensions, you must notify the grader that you are using the extension; otherwise, your assignment will be rejected.
- for any group work, each member of the group must use an extension (or lose points if none is available) for the entire group to submit work late.
- Do not ask for any extensions from the professor
If a student requests a regrade of any work, we will regrade the work in full, not just the part that the student believes has been mis-graded.
Student Accommodations and Accessibility
Students who believe that they may need accessibility accommodations in this class are encouraged to contact the Moses Center for Student Accessibility at (212) 998-4980 as soon as possible to better ensure that such accommodations are implemented in a timely fashion.
Working with others and leveraging all resources available to you is a prerequisite for success. This is different from copying, cheating, plagiarism, and mental laziness. All submitted work must be your own. There are very reliable systems we use to detect plagiarism in computer code, such as moss and compare50. If you submit any work that is not your own, you risk failure or worse.
Please read the Computer Science department’s policy on academic integrity and the University-wide policy which supersedes it.