Provide citation guidance in the Downloads tab
Migrated from Notion: https://www.notion.so/owid/Provide-citation-guidance-in-the-Downloads-tab-50edb8b91cc24a3f9bfd2dc164f659fe
Problem
People download and reuse datasets, then cite us instead of the original providers. This makes original providers less happy sharing data with us, and by extension the general public.
Quick fix solution
Put citation guidance into the download tab of every chart, under the download button. In that guidance, we should separate how to cite the data (which should match the Sources shown in the chart) from how to cite the chart.
Full discussion below.
Background
We have a general worry that data providers do not get enough credit in our work.
That’s obviously bad, because they deserve lots of credit. But it’s also a strategic risk for us: in order for us to do our job, we need data providers to be happy and supportive of our work.
One aspect of this general worry is how people cite data when accessing it through us. People commonly write ‘Source: OWID’ at the bottom of a chart they’ve made, so that the data provider gets no credit.
We should do what we can to avoid this happening.
A simple step is to give users clear guidance on how they should write the citation when using data from OWID but not produced by OWID.
Where should the citation guidance be given?
❗ We should show it prominently in the place where most people reusing our data get it – the download tab on charts.

- We should also think about how we can make this clearer for people taking data from GitHub.
For instance, in the COVID dataset we mention that “you should always check the license of any such third-party data”, but we don’t tell people how they should cite this data – only how they should cite ‘our’ testing and vaccinations data.
(And the same would apply to a future API…)
 
Our general policy on this should maybe also be written up as an FAQ in our About section.
What should the citation guidance be? How should we implement it?
Quick fix proposal for now
Add something like the following to all charts:
How to cite this work
Data should be cited as: ‘*source*’. Chart should be cited as: ‘Data from *source*, Chart from Our World in Data’
Or to make it very explicit:
How to cite this work
If reusing this work, please provide a citation that makes clear the contribution of the data providers: Data should be cited as: ‘*source*’. Chart should be cited as: ‘Data from *source*, Chart from Our World in Data’
This message should be automatically generated, but with the possibility for manual override (i.e. it’s another field in the Grapher/Bulk-FASTT admin).
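A minimal sketch of how the auto-generated message with a manual override could work. The `ChartCitationInfo` shape, its field names and `buildCitationGuidance` are hypothetical and not taken from the Grapher codebase; the wording follows the explicit variant proposed above.

```typescript
// Hypothetical shape: field names are assumptions, not the real Grapher schema.
interface ChartCitationInfo {
    // Source names as shown in the chart's Sources line, e.g. ["World Bank, Povcal"].
    sources: string[]
    // Optional text entered in the admin that replaces the generated guidance.
    citationOverride?: string
}

const CITATION_INTRO =
    "If reusing this work, please provide a citation that makes clear the contribution of the data providers:"

function buildCitationGuidance(chart: ChartCitationInfo): string {
    // A non-empty manual override always wins over the generated text.
    const override = chart.citationOverride?.trim()
    if (override) return override

    // Otherwise derive both citations from the chart's sources.
    const source = chart.sources.join("; ")
    return [
        CITATION_INTRO,
        `Data should be cited as: ‘${source}’.`,
        `Chart should be cited as: ‘Data from ${source}; Chart from Our World in Data’`,
    ].join(" ")
}
```

Keeping the override as a plain optional string means less typical charts can be handled manually at first, while everything else falls through to the generated text.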
Longer-term solution
There are different cases for which we need to think through what the guidance should be. In the short term we could override less typical cases manually, but in the longer term these different cases should mostly be handled automatically.
- The typical scenario – data from the World Bank, FAO, WHO etc.
The text given above would apply to most typical scenarios. For instance, for the [Share in extreme poverty](https://ourworldindata.org/grapher/share-of-population-in-extreme-poverty?country=BGD~BOL~MDG~IND~CHN~ETH~COD) from World Bank, Povcal:
If reusing this work, please provide a citation that makes clear the contribution of the data providers: Data should be cited as: ‘World Bank, Povcal’. Chart should be cited as: ‘Data from World Bank, Povcal; Chart from Our World in Data’
 - A very long reference or an academic paper reference
We’d perhaps need to provide both a short and a full reference? i.e.
*If reusing this work, please provide a citation that makes clear the contribution of the data providers:
- Data should be cited as: ‘Poore and Nemecek (2018)’.
- Chart should be cited as: ‘Data from Poore and Nemecek (2018); Chart from Our World in Data’

– Poore, J., & Nemecek, T. (2018). Reducing food’s environmental impacts through producers and consumers. Science, 360(6392), 987-992.
– Hannah Ritchie and Max Roser (2020) - “Environmental Impacts of Food Production”. Published at [OurWorldInData.org](http://ourworldindata.org/).*
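The short-plus-full reference idea could be sketched roughly as below. `SourceReference`, `citationLines` and the `owidFullReference` parameter are hypothetical names introduced for illustration; the layout mirrors the example above (short citations first, full references listed afterwards).

```typescript
// Hypothetical: a source carries a short citation and, optionally, a full reference.
interface SourceReference {
    short: string // e.g. "Poore and Nemecek (2018)"
    full?: string // full bibliographic reference, when one exists
}

function citationLines(
    sources: SourceReference[],
    owidFullReference?: string
): string[] {
    const short = sources.map((s) => s.short).join("; ")
    const lines = [
        `Data should be cited as: ‘${short}’.`,
        `Chart should be cited as: ‘Data from ${short}; Chart from Our World in Data’`,
    ]
    // Append full references only for sources that have one.
    for (const s of sources) {
        if (s.full) lines.push(`– ${s.full}`)
    }
    // Optionally append the full reference for the OWID publication itself.
    if (owidFullReference) lines.push(`– ${owidFullReference}`)
    return lines
}
```

For typical sources such as the World Bank, `full` would simply be left unset and only the two short lines are produced.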
 - Where OWID itself is clearly the data source
e.g. Vaccinations, Testing, War Deaths project
If there’s a separate publication (as for Vaccinations and testing) then we can mention that instead.
 - ‘OWID based on X and Y’
A more common case is where we have made changes/transformations etc. such that the data includes observations that couldn’t be found in the original sources – but where it’s not right for us to claim to be the source, at least not in isolation.
There’s really a range of sub-cases here:
- Where we have made very minimal changes, or just calculated simple transformations (per capita rates say).
 - Where we have extended series by linking a small number of sources (E.g. [Working hours](https://ourworldindata.org/grapher/annual-working-hours-per-worker))
 - Where we have done some more substantial tinkering with the methods used in the original source (E.g. Bastian’s work on the [age of democracies](https://ourworldindata.org/democracies-age).)
 
We might want a different recommended citation in these different sub-cases? (We might also want to revisit how we ourselves refer to these different cases within our own charts…)
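One possible way to model the cases above so that each can later get its own wording. This is only a sketch: the `Provenance` type, its `kind` labels and the exact strings are assumptions, and the recommended citation for each sub-case is still an open question in this issue.

```typescript
// Hypothetical model of the cases discussed above; labels are illustrative only.
type Provenance =
    | { kind: "external"; sources: string[] } // typical: World Bank, FAO, WHO etc.
    | { kind: "owid"; publication?: string } // OWID itself is clearly the source
    | { kind: "owidBasedOn"; sources: string[] } // ‘OWID based on X and Y’

// Joins names as "X", "X and Y", or "X, Y and Z".
function joinNames(names: string[]): string {
    if (names.length <= 1) return names.join("")
    return `${names.slice(0, -1).join(", ")} and ${names[names.length - 1]}`
}

function dataCitation(p: Provenance): string {
    switch (p.kind) {
        case "external":
            return p.sources.join("; ")
        case "owid":
            // Prefer a separate publication when one exists.
            return p.publication ?? "Our World in Data"
        case "owidBasedOn":
            return `Our World in Data based on ${joinNames(p.sources)}`
    }
}
```

If the sub-cases (minimal transformations, linked series, substantial changes) end up needing different wording, they could become further `kind` values without touching the other branches.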
 

Discussed this with the Future of Publishing group today:
World Bank. [2021]), which we will want to respect when we’re giving guidance.
Some notes on this after talking to @danyx23 about this issue, which I’m gonna work on next cycle:
Official data, do this) Data by UN but don’t specify the UN dataset then that’s unhelpful and we’re giving that as the “official” guidance.