The goal of the Final Project is to prove or disprove a hypothesis using skills learned in this class, and to demonstrate your understanding of those techniques by explaining them to others. It's open-ended: you decide what you're investigating. We're looking for you to be creative, and just the right amount of ambitious.
- General assignment information
- Create a new notebook to do the actual analysis; that is what you'll turn in.
- Go back and find any information that's available around the data, to get a better understanding of what it contains and means.
- Might include a data dictionary
- Might involve poking around a government agency's web site to understand their processes
- Understand what all the different columns and values represent
- If you answer your initial research question easily, before you've met the requirements below, ask and answer follow-up question(s).
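As a first pass at understanding what the columns and values represent, a short pandas summary can help before you read the data dictionary in detail. This is a minimal sketch: the `df` below is a made-up stand-in for your own dataset, and `summarize_columns` is a hypothetical helper, not something provided by the course.

```python
import pandas as pd

# Hypothetical stand-in for your real dataset; replace with your own DataFrame.
df = pd.DataFrame(
    {
        "complaint_type": ["Noise", "Heat", "Noise", None],
        "borough": ["BRONX", "QUEENS", "BRONX", "BRONX"],
        "response_hours": [2.5, 30.0, 4.0, 12.0],
    }
)


def summarize_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: its dtype, missing count, and distinct count."""
    return pd.DataFrame(
        {
            "dtype": frame.dtypes.astype(str),
            "missing": frame.isna().sum(),
            "distinct": frame.nunique(),  # missing values are not counted
        }
    )


summary = summarize_columns(df)
print(summary)

# For a categorical column, list each value and how often it appears,
# including missing values.
print(df["complaint_type"].value_counts(dropna=False))
```

Columns with many missing values or unexpected categories are good candidates to look up in the data dictionary or on the agency's site.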
In addition to the applicable general assignment requirements, your submission should:
- Read like a blog post, and do the following - 35 points
- Re-state the question, hypothesis, and data source(s) with link(s)
- If you hit any dead ends in your analysis, leave them in.
- For example, include charts that you generate that may not show anything interesting and explain what you are choosing to look at instead.
- You should still be cleaning up unused/broken code to make your notebook readable.
- You may need to tweak your research question as you go. Show and explain why.
- Have a conclusion that speaks to your question and hypothesis.
- Use pandas - 15 points
- Not be trivial - 35 points - requiring:
- Have a visualization (chart or map) of some kind - 15 points
- Follow best practices
If you answer the first question easily, that's fine; dig into / build off of it. Go deep, not broad.
If you insist on a number: make sure you use at least 40 lines of code to come to a conclusion.
- That code should be relevant to answering your question. In other words, having 40 lines of `print("hello world")` wouldn't count.
- If you meet all the requirements above, you will likely be well over this number.
- [How to count them automatically](final_project/resources.md#counting-lines-of-code)
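For a rough idea of what an automatic count might look like, here is a minimal sketch that reads a notebook's JSON and counts lines in code cells. It is not necessarily the method described in the linked resource. The function names are hypothetical, and the choice to skip blank lines, comments, and lines starting with `%` or `!` is an assumption about what "relevant" code means.

```python
import json


def count_code_lines(notebook: dict) -> int:
    """Count non-blank, non-comment lines across all code cells."""
    total = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # The notebook format stores source as a list of lines or a single string.
        text = "".join(source) if isinstance(source, list) else source
        for line in text.splitlines():
            stripped = line.strip()
            # Assumption: blanks, comments, magics (%), and shell commands (!) don't count.
            if not stripped or stripped.startswith(("#", "%", "!")):
                continue
            total += 1
    return total


def count_code_lines_in_file(path: str) -> int:
    """Load a downloaded .ipynb file and count its code lines."""
    with open(path, encoding="utf-8") as f:
        return count_code_lines(json.load(f))


# Tiny in-memory notebook, for illustration only.
sample = {
    "cells": [
        {"cell_type": "markdown", "source": ["# My question\n"]},
        {
            "cell_type": "code",
            "source": [
                "import pandas as pd\n",
                "\n",
                "# load the data\n",
                "df = pd.DataFrame({'a': [1, 2]})\n",
            ],
        },
        {"cell_type": "code", "source": "!pip install plotly\ndf.head()"},
    ]
}
print(count_code_lines(sample))
```

To run it on your own work, download the notebook as an `.ipynb` file and pass its path to `count_code_lines_in_file`.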
Note that submission works differently than it does for other assignments.
- Remove the following from the notebook, filename, file paths, etc:
- Name
- NetID
- Sensitive information
- API keys
- Personally-identifiable information (PII)
- Share the notebook. Under General Access, select {% if id == "columbia" %}LionMail{% else %}Anyone with the link{% endif %}, then Viewer.
- Submit.
{% if id == "columbia" -%}
- Go to the Final Project Assignment in {{lms_name}}.
{% else -%}
- Go to Content->Final Project.
{%- endif %}
- Submit the URL to your notebook (https://colab.research.google.com/drive/...).
- DO NOT WAIT UNTIL THE LAST MINUTE TO SUBMIT. Leave yourself time to fix any issues that come up along the way, such as your computer crashing.
- Because it's the end of the course and your peers are doing the reviews, there will be no extensions.
- Hold off on responding to comments on your notebook until you get your Project grade.
The instructor {% if id == "columbia" %}and {{assistant_name}} don't{% else %}doesn't{% endif %} have bandwidth to review everyone's full notebooks. Therefore, to be fair to everyone, any requests to have notebooks reviewed end to end will be denied, aside from appeals to the peer grade. In other words, please don't ask "I think I'm done; can you make sure my Final Project is ok?" That said, specific questions and requests for help troubleshooting specific sections are welcome.
To confirm you meet the requirements prior to submitting, you can:
- Take a pass through your own notebook, pretending you are grading someone else
- Ask someone else in the class to do so