Wojciech Hardy

Office hours, Room B201 or A204:
Mondays, 16:45-17:45.
Wednesdays, 13:30-14:30.
Send me an e-mail first (!)

Reproducible Research | Results

Grading

Class activities (50 pts.) + Final project (50 pts.) for a total of 100 pts.

1) Class activities (50 pts.)

Some classes will include class activities, done during the class itself.
– You’ll be asked to upload your work to an online repository by the end of the class; this is a prerequisite for having it counted. Until we learn how to work with online repositories, you’ll send it in some other way.
– Activities will not take place every week. You can check whether there are assignments once I upload the materials, which should happen no later than Wednesday morning.
– Each activity will be counted as done (1) or not done (0). At the end, the total will be rescaled to a maximum of 50 points towards the final grade. For example, if you do 4 out of 7 activities, you’ll get 4*(50/7) ≈ 28.6, i.e. 29 pts. towards the final grade.
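
The rescaling above can be sketched in a few lines of Python. This is only an illustration: the function name `activity_points` is made up for the example, and rounding ties upward (e.g. 12.5 to 13) is an assumption, since the rules only show 28.6 being rounded to 29.

```python
def activity_points(done: int, total: int, max_points: int = 50) -> int:
    """Rescale the number of completed activities to whole points."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= done <= total:
        raise ValueError("done must be between 0 and total")
    # Integer form of floor(max_points * done / total + 0.5).
    # Rounding ties upward is an assumption, not a stated course rule.
    return (2 * max_points * done + total) // (2 * total)


print(activity_points(4, 7))  # 4 * (50 / 7) is about 28.6, so this prints 29
```

Integer arithmetic is used here so that the result does not depend on floating-point rounding.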

You’ll have two options to have your activities/assignments included in the final score:
Option 1) You participate in the class. Sometimes we’ll have more time for the exercises, sometimes less. By the end of the class you submit what you have (in the way described in the materials), and I check whether you have actually been trying to solve the exercises and following the instructions. The point is to check your activity during the class, not whether you finished everything: since we might not always have enough time for all the tasks, what counts is the effort you put in during the provided time.
 
In option 1, you need to submit your class activities (in the specified manner) by the time the classes end.
 
Option 2) You can’t participate (or perhaps you already know the particular topic). The assumption is that if you don’t participate in the class, you know the topic well enough to solve all the provided exercises on your own. In that case, you’ll be expected to solve all the exercises from the materials and submit them by the respective Thursday, 20:00 (not after).
 
The assignments will always be indicated by early Wednesday and can be checked in the Materials tab.
 

2) Final project (50 pts.):

Tl;dr

A reproduction project that is itself reproducible, done in teams of 3. Example topics are listed below.

Long version

Deadlines

March 31, 2023 -> send information about teams (send an e-mail, including names of the three members)

April 16, 2023 -> send a link to the team’s GitHub repository (or invite via GitHub)

May 5, 2023 -> confirm project topics (send an e-mail with the topic for confirmation)

June 18, 2023 -> no further changes in the project repository allowed

You can finish the project earlier and ask for earlier grading as well.

Detailed project grading

1. project stored in a repository from the start (viewable history of well-described contributions from all team members who collaborate) (20 pts.)

2. code and results with documentation (e.g. Markdown) sufficient for *full* reproducibility (including software versions, etc.) (20 pts.)

Note: I’ll have to be able to understand and run your project on my own.

3. code in a clean and easily readable format (10 pts.)

Note: use a repository dedicated solely to the project. You can collaborate with your team members via pull requests or by making them collaborators on your repository. You can use branches for development but don’t have to (do what’s best for your workflow). You will be graded based on that repository. Make sure that it’s well-kept, that there are no junk files, that it’s clear what’s inside, that it has proper and neat documentation (Readme.md), that the results are well-described, and that the history of commits shows a continuous workflow and collaboration from all team members (not e.g. a one-time dump of all the files).
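
One possible way to cover the “software version” part of the grading criteria above is to let a short script report the environment, and then paste its output into the project documentation. The sketch below is just an illustration for a Python project: the function name `describe_environment` and the package names in the example call are hypothetical, and projects in R or Stata would need their own equivalent.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Return a dict with the Python version, OS, and package versions."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    # Hypothetical package list; replace it with what your project imports.
    for key, value in describe_environment(["numpy", "pandas"]).items():
        print(f"{key}: {value}")
```

Running the script on the machine that produced the results, and committing its output together with the code, makes it easier for someone else to rebuild the same setup.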

The core idea

The project isn’t aimed at testing your programming, econometric or statistical skills. The correctness of your analytic reasoning will not be graded in this course.

Instead, your projects should focus on:
1) the process of reproduction (did it succeed? what were the challenges? what was missing in the original source? what didn’t work? what could have been done in the original work instead? etc.)
2) making your own process reproducible
3) documentation and good coding practices
4) collaboration and version control via Git

Rules on plagiarism

No tolerance for plagiarism, including self-plagiarism (!).

Plagiarism is not only about failing to provide a source; it is also about copying large parts from one project to another, especially without adequate modifications.

AI policy

This course is not about your programming or literacy skills. It’s about understanding the importance of reproducibility and the ways of achieving it.

You can use AI to support your writing and coding. If so, please state:
– the scope of the support,
– the AI model and version,
– the (exact) prompts used.

Be aware that AI can go wrong. It can support your work, but it requires conscious double-checking. You are responsible for what you include in your project.

(Also cite other online sources, even if you only used them for code snippets.)

Teamwork

Work in teams of **3**. If you can’t find a team by the time of the deadline, you will be assigned to a team.

If you run into any problems regarding your collaboration, address them on the go. You may involve the course coordinator if needed.

Do not wait until the deadline to notify the coordinator about not having been able to work with other team members.

Project topics

Project topics (and scope!) are agreed upon with the coordinator via e-mail. If you change them without approval, your project will not be graded (or will be graded according to the initial scope that was agreed upon).

Pick one:

1) Take an econometric / statistical analysis (e.g. from your bachelor’s thesis or Kaggle):
– Translate the code to a different programming language (e.g. R to Python or Stata to R, etc.)*
– Reproduce the results.
– Pick one of the following and do it: improve the study, update the data**, or perform a robustness check.
– Discuss the findings, potential problems, inconsistencies and conclusions.

* you should try to exploit the advantages and functions of the ‘new’ language
** data update can include: collecting a small amount of new survey data, using newer datasets, etc.

2) Take a research paper published in a peer-reviewed journal with no code attached. Reproduce its (main) findings. Report on the problems along the way.

3) Take a simple meta-analysis study (examples below). Then do the following:
– Reproduce the obtained results using the reported sample of studies.
– Add 2-5 newer studies, preferably using the selection process reported in the original study.
– Replicate the results with the extended sample.
– Describe your findings and discuss them.

Examples:

Card, D., & Krueger, A. (1995). Time-Series Minimum-Wage Studies: A Meta-analysis. The American Economic Review, 85(2), 238-243. [Original table with sample of studies available upon request (albeit with some missing information)]

Gorg, H., & Strobl, E. (2001). Multinational Companies and Productivity Spillovers: A Meta-Analysis. The Economic Journal, 111(475), 723-739.

Glass, G. V., & Smith, M. L. (1979). Meta-Analysis of Research on Class Size and Achievement. Educational Evaluation and Policy Analysis, 1(1), 2-16.

4) Have a different idea that fits the scope and theme of the course? Write an e-mail and we can discuss it.

3) Scoring

91-100 pts – 5
81-90 pts – 4.5
71-80 pts – 4
61-70 pts – 3.5
51-60 pts – 3
0-50 pts   – 2
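
The scale above can also be written as a small lookup, which may help when checking where a given total lands. This is only a sketch: the function name `final_grade` is made up for the example, and it assumes the total is a whole number of points between 0 and 100.

```python
def final_grade(points: int) -> float:
    """Map a total of 0-100 whole points to a grade on the scale above."""
    if not 0 <= points <= 100:
        raise ValueError("points must be between 0 and 100")
    # (minimum points, grade) pairs, checked from the highest band down.
    thresholds = [(91, 5.0), (81, 4.5), (71, 4.0), (61, 3.5), (51, 3.0)]
    for minimum, grade in thresholds:
        if points >= minimum:
            return grade
    return 2.0  # 0-50 pts


print(final_grade(79))  # 79 falls in the 71-80 band, so this prints 4.0
```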

 

Last but not least

If you have a cough, please stay at home. You can submit the activities remotely, or provide me with a doctor’s note and you’ll get extended time for the exercises. If you need to catch up afterwards, we can set up an appointment during office hours – no worries.