Hello fellow wikians,
as most of you who also follow the Warframe subreddit have surely noticed, the prominent dataminer VoID_Glitch recently succeeded, after months of hard work, in decrypting the entirety of Warframe's mission reward tables, with all possible rewards and their chances for every mission. You can find his submission on reddit here. Make sure to check it out and thank him for this tremendous achievement.
This raises the question of to what extent we may use this wealth of important information on the wiki.
Should we accept datamined content?
I'm familiar with the old datamining policies. However, with the publishing of datamined content by an administrator in his blog, I've had the impression that datamining has become somewhat accepted as a method of game mechanics research where there is no alternative to it, and I'd like to stress that I highly endorse this opening.
So one intention of this blog post is to refresh the discussion on the acceptance of datamining on this wiki. I suggest allowing and accepting it only where it serves mechanics research. Especially in the field of RNG it seems inevitable: the only alternative, inferential statistics, requires a tedious and disproportionate amount of data collection to determine the chances with even barely acceptable accuracy. Samples can also very easily become biased, intentionally by trolls or unintentionally through selection bias (people are more likely to report their result if they got what they wanted than if they got something else). In practice, this makes datamining arguably more reliable than statistics, while also being more powerful (achieving perfect accuracy) and more feasible.
I strongly oppose publishing mined data of upcoming content on the wiki, like spoiling future prime accesses, but RNG mechanics research is nearly impossible to imagine without it, and mechanics research in general is one of the mainstays of this wiki.
This was my personal opinion on the subject. I encourage everyone to discuss the issue in the comments and would especially appreciate the administrators clarifying their current stance on this, and potentially adjusting the datamining policy.
How could we utilize the Mission Rewards Table?
The first and most immediate application that comes to mind is adding the individual chances to the articles on:
- the respective missions/mission types, so one can look up what can be obtained in that type of mission and with what chance.
  - For example, say you want to farm some fusion cores and can't decide whether to go to Hieracon, Pluto or Triton, Neptune, since half of the people you've asked are adamantly in favor of the one and the other half adamantly in favor of the other. You could look up the tables to see what else is rewarded in each of the two missions, and decide for yourself which of the other rewards you find more useful. You would also see that the fusion core reward chance is the same, so people's arguments about which one drops more cores are pointless.
- the rewarded items themselves. This is often even more useful.
  - For example, say you need the Loki Prime Helmet. You look it up and see it's a reward in T3 Sab, T4 Sab and T4 Sur Rotation C. Which of the three should you run to try and obtain it? No idea. You could roll dice, or reason that since T4 keys are harder to obtain, the chance for desirable rewards should be greater there, so you go with, let's say, T4 Sab. Wouldn't it have been convenient to know that the chance is actually 25.3% in T3 Sab rotation A, 7.5% in T4 Sab rotation A, and 14.3% in T4 Sur rotation C? If that had been on the wiki article you looked up, you would have gone to T3 Sab straight away and been happy: doing the first cache to get both rotation A rewards, you would have been expected to need only 2.26 runs. Instead, you ground through an awful lot of precious T4 Sab keys, maybe still don't have it by the time they're gone, and then are nastily surprised to get it on your first try in T3 Sab.
So just adding the chances in each line within the Drop Locations would already be really helpful and very easy to implement.
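The 2.26 figure from the helmet example can be reproduced in a few lines of Python. This is only a sketch: the function name `expected_runs` and the parameter `rolls_per_run` are made up for illustration, and it assumes that every run gives a fixed number of independent rolls on the same reward table (two rotation A rolls per T3 Sab run, as the example implies).

```python
def expected_runs(p, rolls_per_run=1):
    """Expected number of runs until an item with per-roll chance p drops.

    Assumes every run gives `rolls_per_run` independent rolls on the same
    reward table (an assumption made for this sketch).
    """
    chance_per_run = 1 - (1 - p) ** rolls_per_run  # at least one hit per run
    return 1 / chance_per_run  # mean of a geometric distribution


# Loki Prime Helmet chances quoted in the example above.
print(round(expected_runs(0.253, rolls_per_run=2), 2))  # T3 Sab -> 2.26
print(round(expected_runs(0.143), 2))  # T4 Sur, counted per rotation C reward
```

The T4 Sab chance of 7.5% can be plugged in the same way once you know how many rotation A rolls a run gives there.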
The second, more advanced application is quite sophisticated and builds upon this data to get closer to what we're actually looking for when we look up these reward chances: we want to get an idea of how many runs / how much time it will presumably take to obtain the desired items. This requires some probability calculus and is certainly not trivial, and is therefore worth adding to the wiki.
For a single desired part in the reward table of a certain mission, the number of tries required to obtain it is geometrically distributed, so its expected value is simply 1/p, the inverse of its chance.
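As a quick sanity check of the 1/p rule, the expected value can also be summed up directly from the geometric distribution. The function names below are hypothetical and the series is simply cut off after a fixed number of terms, so treat this as a rough numerical sketch rather than a proof.

```python
def geometric_pmf(p, n):
    """Chance that the first success happens exactly on try n (n = 1, 2, ...)."""
    return (1 - p) ** (n - 1) * p


def expected_tries(p, terms=10_000):
    """Approximate E[N] = sum of n * P(N = n) by truncating the series."""
    return sum(n * geometric_pmf(p, n) for n in range(1, terms + 1))


p = 0.143  # per-try chance, here the T4 Sur rotation C value quoted earlier
print(expected_tries(p), 1 / p)  # the two numbers should agree closely
```

Note that for very small chances the cut-off `terms` would have to be raised for the two numbers to agree.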
But generalizing this to multiple different desired rewards, and the number of tries required to obtain each at least once, becomes really tricky. Two years ago I wrote an introductory guide to this for the warframe parts rewarded on boss runs, under the naive assumption that DE had not applied annoying weights to them, an assumption that unfortunately turned out not to hold. I can still recommend reading the article, as it's a good introduction to the heart of the problem. Outside of Warframe, this is commonly called the coupon collector's problem. I did not know that at the time of writing the aforementioned Steam guide, and was pleasantly surprised to find out afterwards that my results were in accordance with the Wikipedia article and, by providing a pmf for the case n=3, actually still go beyond it to this day (to be fair, the Wikipedia article is terribly neglected).
Meanwhile, I've come a long way in two years with regard to this issue, and managed to find generalizations for arbitrary numbers of distinguishable required rewards with different weights. I'd be happy to apply these methods to the underlying probability distributions, which are now known thanks to VoID_Glitch, and integrate the results into relevant articles. I can also write another blog or article for user research explaining these methods so that the results can be reproduced.
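To give an idea of what such a generalization can look like, here is a sketch of the textbook inclusion-exclusion formula for the coupon collector's problem with unequal chances. It is not necessarily the method I would use in the follow-up write-up; the function name is made up, the weights in the second call are purely hypothetical, and it assumes one independent roll per try.

```python
from itertools import combinations


def expected_tries_for_all(chances):
    """Expected number of tries until every desired reward has dropped once.

    `chances` holds the per-try chance of each desired reward; they may sum
    to less than 1 when the table also contains rewards you do not care about.
    Uses inclusion-exclusion over all non-empty subsets of desired rewards.
    """
    total = 0.0
    for size in range(1, len(chances) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(chances, size):
            # Waiting for the first drop from `subset` is geometric with
            # success chance sum(subset), so its mean is 1 / sum(subset).
            total += sign / sum(subset)
    return total


# Three equally likely parts (the unweighted textbook case): roughly 5.5.
print(expected_tries_for_all([1 / 3, 1 / 3, 1 / 3]))
# Hypothetical weights, for illustration only (not taken from the tables).
print(expected_tries_for_all([0.5, 0.3, 0.2]))
```

The number of subsets doubles with every additional desired reward, which is fine for a handful of prime parts but would need a smarter approach for long lists.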
The tangible implementation of this would be providing some of the following:
- the expected value. "How many tries are required on average?"
- the modal value. "How many tries are most likely required?"
- the median value. "After how many tries will half the people have it / is it equally likely to have it and not to have it?"
- quantiles. "After how many tries will 75%, 85%, etc. of the people have it?"
Where "it" means all the parts that are desired.
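The mode, median and quantiles can all be read off the chance of being done within n tries. The following sketch computes that chance by inclusion-exclusion; the function names are hypothetical, and it assumes one independent roll per try with fixed chances.

```python
from itertools import combinations


def chance_done_by(chances, n):
    """P(all desired rewards have dropped at least once within n tries)."""
    total = 0.0
    for size in range(len(chances) + 1):
        for subset in combinations(chances, size):
            # (1 - sum(subset)) ** n: none of the rewards in `subset` dropped.
            total += (-1) ** size * (1 - sum(subset)) ** n
    return total


def quantile(chances, q, max_tries=100_000):
    """Smallest number of tries after which the chance of being done is >= q."""
    for n in range(1, max_tries + 1):
        if chance_done_by(chances, n) >= q:
            return n
    raise ValueError("quantile not reached within max_tries")


def mode(chances, max_tries=10_000):
    """Number of tries on which completing the set is most likely."""
    best_n, best_pmf, previous = 1, 0.0, 0.0
    for n in range(1, max_tries + 1):
        current = chance_done_by(chances, n)
        if current - previous > best_pmf:  # pmf(n) = cdf(n) - cdf(n - 1)
            best_n, best_pmf = n, current - previous
        previous = current
        if current > 1 - 1e-12:  # practically all probability mass seen
            break
    return best_n


helmet = [0.253]  # a single desired item, chance from the helmet example
print(mode(helmet), quantile(helmet, 0.5), quantile(helmet, 0.75), quantile(helmet, 0.85))
```

The median is simply `quantile(chances, 0.5)`. When two neighbouring try counts are equally likely, `mode` returns whichever one floating-point rounding happens to favour.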
Similarly to the first application, this is relevant both on the missions themselves ("How often will I need to do T4MD to obtain every prime part from there / every prime part that is exclusively there?") and on the desired items ("How often will I need to do each mission where one part of it is rewarded?", "How often will I need to do Archwing Interception to get Elytron?").
I hope I was able to illustrate the many interesting applications of this data, and I am curious about the discussion on datamining for mechanics research. I'd love to integrate this stuff into the wiki, but won't do so until I get the green light from an administrator, to make sure it's okay and wanted here.