Wikidata:Requests for comment/how to manage software versions
An editor has requested the community to provide input on "how to manage software versions" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.
If you have an opinion regarding this issue, feel free to comment below. Thank you!
THIS RFC IS CLOSED. Please do NOT vote nor add comments.
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- No consensus to change current practice. --Pasleim (talk) 19:53, 3 March 2021 (UTC)[reply]
WikiProject Informatics has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. WikiProject Ontology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.
@Ghouston, Dhx1:
Issue
Currently, it's impossible to distinguish between major versions, minor versions, builds, etc. This is due to the fact that:
- there is only a property for software versions: software version identifier (P348);
- often Wikidata has an item only for a software as a whole and not for its versions.
As a consequence, software version identifier (P348) often lists major versions, minor versions, releases... all mixed together. Needless to say, I've never seen something like version type (P548) → minor version as a qualifier (an item like "minor version" doesn't even exist).
Proposal: treat software as we treat books
First, create an item for every version (major, minor...) of every piece of software.
In order to organize the hierarchy of items about books we have a common model (FRBR: Work → Expression/Manifestation → Exemplar), so we need only a few properties: has edition or translation (P747) and its inverse edition or translation of (P629) to link Work with Expression/Manifestation, and exemplar of (P1574) to link Expression/Manifestation with Exemplar. With software versions, no such common model exists (as far as I know). To represent all the various version numbering schemes (for example, MAJOR.MINOR.PATCH or MAJOR.MINOR.REVISION.BUILDNUMBER) adopted by software projects, I think we have two options:
- create a property (and its inverse property) for every version type: we'll have "major version", "major version of", "minor version", "minor version of", "build", "build of", "release", "release of"... properties;
- create only a "software version" property (different from software version identifier (P348) because it has the item datatype) and its inverse "software version of", AND consistently use the version type (P548) qualifier (originally proposed as object of statement has role (P3831)).
Here are examples of the two options:
Option 1
Create a "software version" property and use version type (P548) (originally proposed as object of statement has role (P3831)) as a qualifier in order to indicate the version type.
Optionally, create a "software version scheme" to verify that the version scheme is respected.
[Example statements not preserved: three "software version" statements.]
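As a rough illustration of Option 1, the sketch below models version items in Python. The class names, the `software_version` field, and the item labels (`ExampleApp`, `ExampleApp 1`, `ExampleApp 1.1`) are hypothetical stand-ins, not real Wikidata items or an existing API; only the idea of a single item-valued property qualified by version type comes from the proposal.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Statement:
    """One value of the hypothetical "software version" property."""
    target: Item
    version_type: str  # stands in for the version type (P548) qualifier


@dataclass
class Item:
    """A Wikidata-like item; labels are made up for illustration."""
    label: str
    software_version: list[Statement] = field(default_factory=list)


def add_version(parent: Item, child: Item, version_type: str) -> None:
    """Attach child to parent with a qualified "software version" statement."""
    parent.software_version.append(Statement(child, version_type))


def walk(item: Item, depth: int = 0) -> list[tuple[int, str]]:
    """Return (depth, label) pairs for the item and all nested versions."""
    rows = [(depth, item.label)]
    for statement in item.software_version:
        rows.extend(walk(statement.target, depth + 1))
    return rows


app = Item("ExampleApp")
v1 = Item("ExampleApp 1")
v1_1 = Item("ExampleApp 1.1")
add_version(app, v1, "major version")
add_version(v1, v1_1, "minor version")
```

Here minor versions hang off the major version item rather than off the main software item, so `walk(app)` visits the software, then its major version, then the minor version beneath it.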
Option 2
Create "major version", "minor version", "build"... properties (all subproperties of "software version").
[Example statements not preserved: one "major version" statement and one "minor version" statement.]
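For comparison, Option 2 might be sketched as follows. All property names and item labels here are hypothetical (none of them exist as Wikidata properties); the point is only that a consumer who wants every version of an item has to accept any property declared as a subproperty of "software version".

```python
# Hypothetical property names, each declared a subproperty of "software version".
SUBPROPERTY_OF = {
    "major version": "software version",
    "minor version": "software version",
    "build": "software version",
}

# Statements as (subject, property, object) triples with made-up labels.
STATEMENTS = [
    ("ExampleApp", "major version", "ExampleApp 1"),
    ("ExampleApp 1", "minor version", "ExampleApp 1.1"),
    ("ExampleApp 1.1", "build", "ExampleApp 1.1 build 42"),
]


def versions_of(subject: str) -> list[tuple[str, str]]:
    """Return (property, object) pairs for every direct version of subject.

    A statement counts if its property is "software version" itself or is
    declared as a subproperty of it.
    """
    return [
        (prop, obj)
        for subj, prop, obj in STATEMENTS
        if subj == subject
        and (prop == "software version"
             or SUBPROPERTY_OF.get(prop) == "software version")
    ]
```

In this layout the version type is carried by the property itself, so no qualifier is needed, at the cost of one property (plus an inverse) per version type.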
- Note that software version identifier (P348) can already be qualified with the version type (P548) property, whose values are instances or subclasses of software version type (Q28530564) -- such as stable version (Q2804309), alpha version (Q2122918), beta version (Q3295609), revision tag (Q7318449), daily build (Q5209391), software release (Q20631656), etc. (see complete list).
I would suggest that we agree on a comprehensive ontology for these and fill in the gaps, so that existing version statements in software items can be qualified appropriately. --Waldir (talk) 09:16, 22 May 2018 (UTC)[reply]
- @Waldir: The problem with software version identifier (P348) is that it's a string-valued property, so it can't point to an item. I like the options suggested above; Option 1 is probably fine (and it's fine to use version type (P548) as a qualifier, I think that makes sense). But software version identifier (P348) would need to be changed to an item-valued property - see (and please comment on) Wikidata:Property proposal/software version.
- A third option would be to only record version data in the reverse direction - see Wikidata:Property proposal/software version of, and then I don't even think qualifiers are needed, can't we just indicate the version type via instance of (P31)?? ArthurPSmith (talk) 13:33, 22 May 2018 (UTC)[reply]
- Can you expand a bit on why it's important to point to items rather than string values? Is it because we can model richer data? And does that mean that we'd be agreeing to eventually have one item for each version of each software product? Not that I object to that, just trying to understand where you're coming from. --Waldir (talk) 14:13, 22 May 2018 (UTC)[reply]
- If you'll notice in the examples above, the proposal is to attach minor versions to the items for major versions, and so on down the hierarchy, rather than having all major, minor, and other versions listed on the main item for the software. That requires items for the versions, not strings. I wouldn't expect that we would have one item for each version of every software product, just as we don't have one item for each human or star or whatever. If there's no need for any additional information, the present use of software version identifier (P348) is probably fine. But items do allow attaching additional information (such as release date, download size and URLs, etc.) to the version in a natural way, with references, etc. ArthurPSmith (talk) 14:32, 22 May 2018 (UTC)[reply]
- @Waldir:You're right, version type (P548) is better than object of statement has role (P3831).
- @ArthurPSmith: I support your proposal about recording version data only in the reverse direction, but books don't do it. Do you know why? As regards the option to indicate the version type via instance of (P31), it's possible, but I think it's better to have a specific property. --Malore (talk) 12:13, 27 May 2018 (UTC)[reply]
- We currently use subclass of (P279), see e.g., SPM99 (Q41566546), — which can be applied recursively. A more specialized property would be Wikidata:Property proposal/software version of — Finn Årup Nielsen (fnielsen) (talk) 15:42, 22 May 2018 (UTC)[reply]
- I substituted all the instances of object of statement has role (P3831) with version type (P548)--Malore (talk) 16:20, 8 June 2018 (UTC)[reply]
- Any reason we can't use software version identifier (P348) qualified by both statement is subject of (P805) and version type (P548)? --Yair rand (talk) 00:20, 24 October 2018 (UTC)[reply]
- @Yair rand: It's a personal opinion, but I don't really like statement is subject of (P805). Its description in English is « item which describes the relation identified in this statement », which taken literally means that its value describes the relationship between a software and its version. But that's not really the case: the item is about the version … I'd prefer the other way around, semantically:
⟨ software ⟩ edition/version (P9767) ⟨ unknown value ⟩
version identifier/number ⟨ x.y.z ⟩
version type (P548) ⟨ if you like ⟩
"Unknown value" is for the case where we don't have an item but the version is an entity in its own right; it clearly means there could be one, and we actually don't need statement is subject of (P805). author TomT0m / talk page 20:31, 31 October 2018 (UTC)[reply]
- Support migration; versions of software should also have their own items. --117.14.250.142 05:36, 12 April 2019 (UTC)[reply]
- I support the creation of items for versions of software. I agree that being able to model additional statements per version item is important for Wikidata. For example, different file formats are associated in different ways with versions of some software. YULdigitalpreservation (talk) 17:41, 29 May 2019 (UTC)[reply]
- Support Option 1, because it will make the correspondence between the infobox and Wikidata via Lua better than Option 2 would. It's also tidier (if you ask me). - Premeditated (talk) 14:39, 11 July 2019 (UTC)[reply]
- Strong Oppose. It's unclear how you define version and revision, and for what you want to create items. There are endlessly many different versioning schemes: the two you listed are only two of the more "conventional" types of versioning, and there are completely different schemes. Some projects use the date as the version number (20190815), some use a compilation counter (3245646) or version-control hashes (5eff689); the version number of TeX is a number getting closer and closer to Pi (the current version is 3.14159265); LaTeX currently has the version "2ε Issue 29"; and so on. There is simply no way to add a hierarchy like major, minor, bug… without distorting reality. Version numbers can only be strings; all further description has to be saved in qualifiers or other properties. So the only option, if we want to create items for versions, is to create items for literally every version and release of every software. Considering that there are projects with huge numbers of releases, it's unclear to me how this should scale. JOSM (Q12877) has 15238 releases; Chromium (Q48524) has even more, since it has a rolling release model in which every modification of the software gets a version number. Furthermore, all infoboxes in all Wikipedias would have to be rewritten, and the Lua modules would probably have to be improved for this as well. All the bots updating version numbers would have to be rewritten. So creating items for software versions is a huge project. I'm not saying that this should not be done. But I'm saying that we cannot do it today. We lack the resources to do something like that! There are way too few people updating software versions, and the bots that we have lack developers to improve them. Consider just the software in Arch Linux: there are ~300 outdated software items (some outdated for quite some time!), even though I updated thousands of software versions by hand and improved github-wiki-bot to update thousands of items.
We should not create far more work for ourselves when we can't handle the current amount of work, especially since the benefits are very small and vague. We can already map 99% of the information with software version identifier (P348) and qualifiers like version type (P548). Let us instead start to fix the urgent problems with software versions on Wikidata:
- The descriptions of version type (P548) are wrong (different from the English/German/French/… description) in some smaller languages
- stable version (Q2804309) is not properly defined and differs between different languages – which is the reason why version type (P548) isn't used widely!
- github-wiki-bot needs improvements
- Wrong ranks: https://w.wiki/77J
- and most importantly: decrease the huge back-log.
- Support Option 1, mainly because I prefer fewer properties with qualifiers over a lot of different properties. But I'm not sure about creating items for every version. I think that just having the versions listed inside the main item should be enough in most cases. --DaSch (talk) 06:47, 17 January 2020 (UTC)[reply]
- Oppose I agree with Michael. Too vague, high manpower cost. I believe the current model is good enough for now. --So9q (talk) 05:30, 22 March 2020 (UTC)[reply]
- Neutral: I understand Michael's worries, but on the other hand the current situation must be improved somehow. See this example: Microsoft Edge (Q18698690) → software engine (P408) → EdgeHTML (Q19668903), with software version identifier (P348) "< 79". It's not possible at the moment to create machine-readable documentation of when the rendering engine of Microsoft Edge was changed. -- Discostu (talk) 10:45, 30 March 2020 (UTC)[reply]
- Oppose In the particular case of Windows, Microsoft is not that clear about what is a major or minor release anymore, and in fact intends to phase out this notion of "Windows n", in a similar vein to Apple only making incremental updates to OS X. The classification of a release as "major" or "minor" should not be forced onto all software, as not all software uses that release scheme. For certain large companies like Microsoft which have a well-defined versioning scheme, we can create their own classification of versions as "major" or "minor". Windows Server 2012 R2 and Windows 8.1 in particular are ambiguous in this regard. Much better would be the ability to store the version history as a branched list of sorts, and either use a classification specific to a company or just let the consumer of the data decide. --Jasper Deng (talk) 01:46, 20 April 2020 (UTC)[reply]
- @Jasper Deng: This is impossible at least for MediaWiki (Q83), where a new version can appear as often as once a week. --Liuxinyu970226 (talk) 03:15, 14 May 2020 (UTC)[reply]
- @Liuxinyu970226: I'm not sure what you mean here. MediaWiki's versioning scheme has been one of the most consistent I've ever seen.--Jasper Deng (talk) 03:48, 14 May 2020 (UTC)[reply]
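The objection raised in the Strong Oppose comment above, that version identifiers do not share one hierarchical scheme, can be illustrated with a minimal Python sketch. It assumes a strict MAJOR.MINOR.PATCH pattern (one of the two schemes named in the proposal) and applies it to the version strings quoted in the discussion; the function name `split_version` and the sample string "1.2.3" are hypothetical and only there for contrast.

```python
from __future__ import annotations

import re

# Strict MAJOR.MINOR.PATCH: exactly three dot-separated integers.
MAJOR_MINOR_PATCH = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def split_version(version: str) -> tuple[int, ...] | None:
    """Return (major, minor, patch), or None if the string does not fit."""
    match = MAJOR_MINOR_PATCH.fullmatch(version)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# "1.2.3" is a made-up string that follows the scheme; the rest are the
# version strings quoted in the discussion.
EXAMPLES = ["1.2.3", "20190815", "3245646", "5eff689", "3.14159265", "2ε Issue 29"]
results = {version: split_version(version) for version in EXAMPLES}
```

Under this assumed pattern only "1.2.3" yields a tuple; every identifier quoted in the discussion maps to `None`, which is why a string value plus qualifiers can represent them while a fixed major/minor hierarchy cannot.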