Add datatype time to harvest_template.py for importing dates
Closed, Resolved | Public | Feature

Description

Time is one of the new Wikibase datatypes. harvest_template.py should support importing time.

https://www.wikidata.org/wiki/User:Underlying_lk/harvest_template_old.py can be used as a reference, but that code combines everything, so I split it up into multiple bugs.

See Also:
T57004: claimit.py sample: add options for entry of time values
T73699: harvest_template does not harvest

Details

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 3:08 AM
bzimport set Reference to bz64503.
bzimport added a subscriber: Unknown Object (????).

What are some example template fields which could be harvested to Wikidata?

Thinking out loud: if we get precision of date, rather than time, is that good enough? It looks like Underlying_lk's version is only date.

(In reply to John Mark Vandenberg from comment #1)

What are some example template fields which could be harvested to wikidata?

For example the date fields in https://en.wikipedia.org/wiki/Template:Persondata or in Dutch https://nl.wikipedia.org/wiki/Sjabloon:Infobox_persoon "geboortedatum" and "sterfdatum"

Thinking out loud: if we get precision of date, rather than time, is that good enough? It looks like Underlying_lk's version is only date.

For now, that's the only thing we can implement anyway. Wikibase doesn't support a precision greater than a day at the moment, so we should just focus on the date and ignore the time (for now).

I removed https://bugzilla.wikimedia.org/show_bug.cgi?id=64501 btw. This bug doesn't depend on it and it's wontfix anyway.
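As a rough illustration of the day-precision point above, here is a minimal Python sketch of what a date-only value looks like in Wikibase's JSON representation of time. The helper name `to_time_value` and the accepted `YYYY[-MM[-DD]]` input format are assumptions made for this example. It is not the code from harvest_template.py, and real template fields would need far more forgiving parsing.

```python
import re

# Wikibase precision codes for the cases handled here.
PRECISION_YEAR, PRECISION_MONTH, PRECISION_DAY = 9, 10, 11

# Entity URI of the proleptic Gregorian calendar on Wikidata.
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"

# Assumed input format for this sketch: YYYY, YYYY-MM or YYYY-MM-DD.
_DATE_RE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def to_time_value(text):
    """Hypothetical helper: build a Wikibase-style time dict from a date string.

    Returns None when the text does not match the assumed format.
    Month and day ranges are not validated in this sketch.
    """
    match = _DATE_RE.match(text.strip())
    if match is None:
        return None
    year, month, day = match.groups()
    if day is not None:
        precision = PRECISION_DAY
    elif month is not None:
        precision = PRECISION_MONTH
    else:
        precision = PRECISION_YEAR
    return {
        # Unknown month/day are written as 00; the clock part is always zero.
        "time": "+{}-{}-{}T00:00:00Z".format(year, month or "00", day or "00"),
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": precision,
        "calendarmodel": GREGORIAN,
    }
```

The clock part of the value is always zero here, which matches the decision to ignore the time for now and import only the date.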

To see this bug, run

$ python pwb.py scripts/harvest_template.py -simulate -family:wikipedia -lang:en -page:MediaWiki -template:Infobox_software released P580

(In reply to John Mark Vandenberg from comment #3)

To see this bug, run

$ python pwb.py scripts/harvest_template.py -simulate -family:wikipedia -lang:en -page:MediaWiki -template:Infobox_software released P580

Hmm. That should output 'time is not a supported datatype', but it doesn't.

http://git.wikimedia.org/blob/pywikibot%2Fcore.git/master/scripts%2Fharvest_template.py#L171

After bug 71699 was fixed, the command now shows the problem:

$ python pwb.py scripts/harvest_template.py -simulate -family:wikipedia -lang:en -page:MediaWiki -template:Infobox_software released P580
Finding redirects...
Retrieving 1 pages from wikipedia:en.


>>> MediaWiki <<<
time is not a supported datatype.
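The message above presumably comes from the script declining a property whose datatype it has no handler for. A minimal sketch of that kind of dispatch follows, assuming a hypothetical `convert_value` helper and handler table. This is not the actual harvest_template.py implementation, and the sample values are made up for illustration.

```python
def convert_value(datatype, raw, handlers):
    """Hypothetical dispatcher: convert a raw template value by datatype.

    Returns (value, None) on success, or (None, message) when no handler
    is registered for the datatype.
    """
    handler = handlers.get(datatype)
    if handler is None:
        return None, "{} is not a supported datatype.".format(datatype)
    return handler(raw), None


# Illustrative handler table that knows strings but not time values.
HANDLERS = {"string": str.strip}

value, message = convert_value("time", "2001-02-03", HANDLERS)
print(message)
```

Under this design, supporting a new datatype such as time means registering one more handler rather than changing the dispatch logic.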

I think that if T112141 gets implemented, this will work too, without any extra code.

Change 814822 had a related patch set uploaded (by Matěj Suchánek; author: Matěj Suchánek):

[pywikibot/core@master] [FEAT] Support harvesting time values

https://gerrit.wikimedia.org/r/814822

Change 814822 merged by jenkins-bot:

[pywikibot/core@master] [FEAT] Support harvesting time values

https://gerrit.wikimedia.org/r/814822

Xqt assigned this task to matej_suchanek.
matej_suchanek changed the subtype of this task from "Task" to "Feature Request".
matej_suchanek moved this task from Framework to Backlog on the Pywikibot-Wikidata board.

For the User-notice I'm wondering:

  • What is the simple summary? How would you describe the news in 1-2 sentences? What page(s) should be linked?
    • I.e. Is it a (bug-fix? feature-addition? editor-action-needed?) type of thing...
    • Suggested draft-wording always helps and is appreciated!
  • Does this belong in Tech News - I.e. is it of interest to technical editors at every wiki, or only at Wikidata?
  • Does this also need to be announced at pywikibot@ mailing list in addition to (or instead of) Tech News?

Thanks!

Ah, I see you've already added a line to https://meta.wikimedia.org/wiki/Tech/News/2022/30 (I've hesitantly removed that, for now):

It is now possible to import dates from templates to Wikidata using Pywikibot.

I'll refine my questions above to just these:

  1. I'm still uncertain about whether it belongs in Tech News. Perhaps it should instead just be within the Wikidata newsletter? i.e. https://www.wikidata.org/wiki/Wikidata:Status_updates/Next
  2. Does this also need to be announced at pywikibot@ mailing list?
  3. If you're linking the announcement (at any location) just to this task, then is the current Task Description clear enough that it will help all pywikibot developers, or could it be improved with some examples or details? Or should it be documented more formally somewhere discoverable (onwiki) ?

Thanks!

I'm still uncertain about whether it belongs in Tech News. Perhaps it should instead just be within the Wikidata newsletter? i.e. https://www.wikidata.org/wiki/Wikidata:Status_updates/Next

Thanks for reminding me, I was actually going to put it there, too. If you are hesitant about putting it in Tech News, I am definitely not insisting on it.

Does this also need to be announced at pywikibot@ mailing list?

Periodically, new releases are announced there with a list of notable changes.

If you're linking the announcement (at any location) just to this task, then is the current Task Description clear enough that it will help all pywikibot developers, or could it be improved with some examples or details? Or should it be documented more formally somewhere discoverable (onwiki) ?

Users familiar with the script will probably very quickly find out what they have to do to try this out. They can even request help within the script. Anyway, I could have at least added a link to harvest_template.py.