As a rule of thumb, the list team considers the opinions of every victor when initially constructing an opinion list. However, opinions may be weighted differently based on several factors that inherently affect player reliability. A team member will ask each player to self-assess their reliability for each list opinion on a scale from 0 to 10. Reliability factors that players should consider include, but are not limited to, the following:
Players that do not cooperate with the list team when providing opinions (e.g. deliberately providing inaccurate information) will usually not be considered for future list placements or movements. Players may also choose to be excluded from providing list opinions by contacting a member of the list team.
In addition to these factors, sometimes a level may contain a built-in Low Detail Mode that makes it significantly easier. Since all built-in LDMs are considered acceptable for list records, if the LDM makes a particular level much easier, the list team will primarily consider opinions from players that use it. This emphasis allows the list placement to correspond to the easiest "official" version of the level.
Once the list team has gathered an opinion list, we analyze its statistics to determine the most likely accurate placement given what we know. Reliability weighting is applied before the final analysis begins; if the eventual placement does not seem to line up with some opinions you saw from other players, it's likely that one or more of the factors above reduced their reliability.
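As a rough illustration of how reliability weighting can shape the estimate, here is a minimal sketch in Python. This is not the list team's actual formula; the function name and the weighting scheme (using the 0-10 self-assessed scores directly as weights) are assumptions made for the example.

```python
# Illustrative sketch only: combining placement opinions using
# self-assessed reliability scores (0-10) as weights.
def weighted_placement(opinions):
    """opinions: list of (placement, reliability) pairs."""
    total_weight = sum(w for _, w in opinions)
    if total_weight == 0:
        raise ValueError("no reliable opinions to weigh")
    # Each opinion pulls the estimate in proportion to its weight.
    return sum(p * w for p, w in opinions) / total_weight

# A player rated 10 pulls the estimate far more than one rated 2:
estimate = weighted_placement([(75, 10), (80, 2)])  # close to 75.8
```

Under this scheme, an opinion with reliability 0 contributes nothing at all, which mirrors how heavily discounted opinions can appear to be "ignored" in the final placement.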
Sometimes, there is a lot of variation in the opinions we receive! Some recent notable examples include Wasureta and Void Wave (something about purple levels, maybe). While we will try our best to reach a consensus with what we have, please keep in mind that these placements are significantly more likely to change after the initial estimate as opinions from new victors come in.
Although the pure average of the list of opinions is not always the best way to estimate the proper placement, it's usually a good starting point. Especially if the distribution of opinions is relatively symmetric and if there are a lot of victors, we often use the average as the tentative placement for the new level.
However, because no distribution is perfectly symmetric, we try to capture any "skew" in the data that stems from reliable opinions. Since the median discards that skew entirely, it is a poor metric for determining list placements.
Example: Let's say a level has three opinions so far: #75, #77, and #82. Without any other opinions, we would consider #78 - the average - to be a reasonable placement for it. The median of this list is #77, which does not account for the "skew" towards the higher value (#82 is farther away from the median than #75).
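The arithmetic in the example above can be checked with Python's standard statistics module (the variable names are purely illustrative):

```python
# Reproducing the example: three opinions at #75, #77, and #82.
from statistics import mean, median

opinions = [75, 77, 82]

avg = mean(opinions)     # 78 - accounts for the pull of #82
med = median(opinions)   # 77 - ignores how far #82 sits from the center
```

The mean lands one spot lower (i.e. at a harder placement, #78) precisely because #82 is farther from the center than #75, which is the "skew" the median cannot see.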
Nonetheless, using the mean has its weaknesses as well. Most importantly, any outliers in the dataset have a magnified effect on the average, which can pull it away from an otherwise clear consensus.
Example: Let's say a level has five opinions: two at #67, two at #68, and one at #84. The average of this distribution is #71, but without the outlier opinion (#84), we see a clear consensus of either #67 or #68. We'd likely pick one of those two placements as a good fit for the level in question.
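Continuing in the same vein, the outlier's pull on the average is easy to demonstrate (again a sketch; simply dropping the outlier is one illustrative approach, not necessarily how the team handles it):

```python
from statistics import mean

# Five opinions: a clear cluster at #67-#68 plus one outlier at #84.
opinions = [67, 67, 68, 68, 84]

mean(opinions)                          # 70.8, i.e. ~#71 - dragged up by the outlier
mean([p for p in opinions if p != 84])  # 67.5 - matches the visible consensus
```

A single opinion sixteen spots away from the cluster shifts the average by more than three placements, which is why the raw mean is only treated as a starting point.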
In some cases, such as the example provided above, there may be more than one possible placement that we think is reasonable for a list-worthy level. When we have multiple options, we often consider the levels currently occupying those positions and whether they may be moved up or down in the near future.
For instance, if two possible placements for a level are #103 and #104, and the level currently at #103 is generally thought to be underrated, #104 would be a better placement for the new level.