Provide citation guidance in the Downloads tab

Migrated from Notion: https://www.notion.so/owid/Provide-citation-guidance-in-the-Downloads-tab-50edb8b91cc24a3f9bfd2dc164f659fe

Problem

People download and reuse datasets, then cite us instead of the original providers. This makes original providers less happy sharing data with us, and by extension the general public.

Quick fix solution

Put citation guidance into the download tab of every chart, under the download button. In that guidance, we should separate how to cite the data (it should be the same as Sources in the chart) from how to cite the chart.

Full discussion below.


Background

We have a general worry that data providers do not get enough credit in our work.

That’s bad obviously because they deserve lots of credit. But it’s also a strategic risk for us: In order for us to do our job, we need data providers to be happy and supportive of our work.

One aspect of this general worry is how people cite data when accessing it through us. It’s a common thing that people say ‘Source: OWID’ at the bottom of a chart they’ve made, so that the data provider gets no credit.

We should do what we can to avoid this happening.

A simple step is to give users clear guidance on how they should write the citation when using data that comes from OWID but was not produced by OWID.

Where should the citation guidance be given?

❗ We should show it prominently in the place where most people reusing our data get it – the download tab on charts.

[Image: how to cite in download tab]

  • We should also think about how we can make this clearer for people taking data from GitHub.

    For instance, in the COVID dataset we mention that “you should always check the license of any such third-party data”, but we don’t tell people how they should cite this data – only how they should cite ‘our’ testing and vaccinations data.

    (And the same would apply to a future API…)

Our general policy on this should maybe also be written up as an FAQ in our About section.

What should the citation guidance be? How should we implement it?

Quick fix proposal for now

Add something like the following to all charts:

How to cite this work
Data should be cited as: ‘source’.
Chart should be cited as: ‘Data from source, Chart from Our World in Data’

Or to make it very explicit:

How to cite this work
If reusing this work, please provide a citation that makes clear the contribution of the data providers:
Data should be cited as: ‘source’.
Chart should be cited as: ‘Data from source, Chart from Our World in Data’

This message should be automatically generated, but with the possibility for manual override (i.e. it’s another field in the Grapher/Bulk-FASTT admin).
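As a rough illustration of "automatically generated, with manual override", the logic could look something like the sketch below. The function name `citation_guidance` and the `override` parameter are hypothetical and are not taken from the Grapher codebase; the wording of the message follows the proposal above, using the semicolon separator from the World Bank example.

```python
from typing import Optional


def citation_guidance(source_line: str, override: Optional[str] = None) -> str:
    """Return the citation guidance text for one chart.

    `source_line` is the chart's Sources text. `override` stands in for a
    hypothetical free-text admin field; if it is filled in, it wins.
    """
    # A non-empty manual override replaces the generated message entirely.
    if override and override.strip():
        return override.strip()

    # Otherwise build the default message from the chart's source line.
    return (
        "If reusing this work, please provide a citation that makes clear "
        "the contribution of the data providers:\n"
        f"Data should be cited as: ‘{source_line}’.\n"
        f"Chart should be cited as: ‘Data from {source_line}; "
        "Chart from Our World in Data’"
    )


if __name__ == "__main__":
    print(citation_guidance("World Bank, Povcal"))
```

Treating a blank override as "not set" means an editor can clear the field to fall back to the generated text.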

Longer-term solution

There are different cases for which we need to think through what the guidance should be. In the short term we could override less typical cases manually, but in the longer term these cases should mostly be handled automatically.

  • The typical scenario – data from the World Bank, FAO, WHO etc.

    The text given above would apply to most typical scenarios. For instance, for the [Share in extreme poverty](https://ourworldindata.org/grapher/share-of-population-in-extreme-poverty?country=BGD~BOL~MDG~IND~CHN~ETH~COD) from World Bank, Povcal:

    If reusing this work, please provide a citation that makes clear the contribution of the data providers:
    Data should be cited as: ‘World Bank, Povcal’.
    Chart should be cited as: ‘Data from World Bank, Povcal; Chart from Our World in Data’

  • A very long reference or an academic paper reference

    We’d perhaps need to provide both a short and a full reference? i.e.

    If reusing this work, please provide a citation that makes clear the contribution of the data providers:

    • Data should be cited as: ‘Poore and Nemecek (2018)’.
    • Chart should be cited as: ‘Data from Poore and Nemecek (2018); Chart from Our World in Data’

    – Poore, J., & Nemecek, T. (2018). Reducing food’s environmental impacts through producers and consumers. Science, 360(6392), 987-992.
    – Hannah Ritchie and Max Roser (2020) - “Environmental Impacts of Food Production”. Published at [OurWorldInData.org](http://ourworldindata.org/).
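
    One possible way to carry both a short and a full reference is to store them side by side and render the short form in the guidance sentences, with the full forms listed underneath. This is only a sketch: the `Reference` class and `guidance_with_references` function are hypothetical names, not existing Grapher fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    short: str  # short form used inside the guidance sentences
    full: str   # full bibliographic entry listed below the guidance


def guidance_with_references(data: Reference, chart: Reference) -> str:
    """Render guidance with short citations, followed by the full references."""
    lines = [
        "If reusing this work, please provide a citation that makes clear "
        "the contribution of the data providers:",
        f"• Data should be cited as: ‘{data.short}’.",
        f"• Chart should be cited as: ‘Data from {data.short}; "
        "Chart from Our World in Data’",
        "",
        f"– {data.full}",
        f"– {chart.full}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    data_ref = Reference(
        short="Poore and Nemecek (2018)",
        full=(
            "Poore, J., & Nemecek, T. (2018). Reducing food’s environmental "
            "impacts through producers and consumers. Science, 360(6392), 987-992."
        ),
    )
    chart_ref = Reference(
        short="Our World in Data",
        full=(
            "Hannah Ritchie and Max Roser (2020) - “Environmental Impacts of "
            "Food Production”. Published at OurWorldInData.org."
        ),
    )
    print(guidance_with_references(data_ref, chart_ref))
```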

  • Where OWID itself is clearly the data source

    e.g. Vaccinations, Testing, War Deaths project

    If there’s a separate publication (as for Vaccinations and testing) then we can mention that instead.

  • ‘OWID based on X and Y’

    A more common case is where we have made changes, transformations etc. such that the data includes observations that couldn’t be found in the original sources, but where it’s not right for us to claim to be the source, at least not in isolation.

    There’s really a range of sub-cases here:

    We might want a different recommended citation in these different sub-cases? (We might also want to revisit how we ourselves refer to these different cases within our own charts…)

@JoeHasell @maxroser

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

marcelgerber commented, Oct 10, 2022 (1 reaction)

Discussed this with the Future of Publishing group today:

  • We will want a free-text field on the chart-level that can override the auto-generated citation message. One reason to have this is that some data publishers give some citation guidance themselves (e.g. World Bank. [2021]), which we will want to respect when we’re giving guidance.
  • We may want a way to disable the citation guidance display for some charts.
  • I will check all the short source lines, to see how many cases we have where the auto-generated text doesn’t line up well.
  • We should definitely make sure that the Covid Vaccinations citation works well, since that’s our most widely-accessed dataset.

marcelgerber commented, Sep 20, 2022 (0 reactions)

Some notes on this after talking to @danyx23 about this issue, which I’m gonna work on next cycle:

  • The solution in #1607 gets us 80% of the way there.
  • As mentioned in #1607, there are cases where the English sentence we build is nonsensical. We can try to:
    • either detect that using heuristics (e.g. if the source starts with Official data, do this)
    • or, offer an option to specify a custom phrase in the Admin
  • It might be the case that a citation guidance that is not great is making things worse, e.g. if we’re saying Data by UN but don’t specify the UN dataset then that’s unhelpful and we’re giving that as the “official” guidance.
  • Should we add a button like “Give feedback on this citation”?
  • I will join the Future of Publishing group after my holidays to chat about this a bit.
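
For illustration, the two options mentioned above (a heuristic check and a custom phrase from the Admin) could be combined roughly as follows. This is a sketch under assumptions: only the "Official data" prefix comes from this thread; the function names, the `custom_phrase` parameter, and the choice to return `None` (show no guidance) when the heuristic fires are all hypothetical.

```python
from typing import Optional

# Prefixes for which the auto-built sentence tends to read badly.
# "Official data" is the example from this thread; others would need auditing.
NONSENSICAL_PREFIXES = ("Official data",)


def looks_nonsensical(source_line: str) -> bool:
    """Heuristic: flag source lines that don't fit 'Data from <source>'."""
    normalized = source_line.strip().lower()
    return normalized.startswith(tuple(p.lower() for p in NONSENSICAL_PREFIXES))


def data_citation_phrase(
    source_line: str, custom_phrase: Optional[str] = None
) -> Optional[str]:
    """Return the phrase to show, or None if no guidance should be shown."""
    # A custom phrase entered by an editor always takes precedence.
    if custom_phrase and custom_phrase.strip():
        return custom_phrase.strip()
    # Assumption: when the heuristic fires and nothing was entered manually,
    # showing nothing is safer than showing a misleading "official" citation.
    if looks_nonsensical(source_line):
        return None
    return f"Data from {source_line.strip()}"
```

Returning `None` rather than a best-effort sentence reflects the concern that poor guidance may be worse than none; a caller could use it to hide the citation block or to queue the chart for manual review.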