
added revision summary for RAD revisited

This commit is contained in:
Benjamin Mako Hill 2020-11-24 10:59:10 -08:00
parent 211ba396c6
commit de63698442
4 changed files with 354 additions and 0 deletions


@@ -0,0 +1,7 @@
Rebuttal material for:
TeBlunthuis, Nathan, Aaron Shaw, and Benjamin Mako
Hill. 2018. “Revisiting The Rise and Decline in a Population of Peer
Production Projects.” In Proceedings of the 2018 CHI Conference on
Human Factors in Computing Systems (CHI '18), 355:1–355:7. New York,
New York: ACM. https://doi.org/10.1145/3173574.3173929.


@@ -0,0 +1,65 @@
Rebuttal
-------------------------------------------------
Thanks to the reviewers and ACs for your careful attention to this submission. We appreciate the reviewers' positive feedback and constructive suggestions for improvement. Below, we respond and describe several minor adjustments that we believe will address the reviewers' concerns. We agree that these changes will improve the manuscript and look forward to implementing them before the camera ready deadline.
1.
R1 and R2 raised questions about generalizability of our findings to other peer production communities and to other communities of practice. Both R4 and R2 ask us to discuss possible underlying causes of the relationships we observe.
We plan to amend our discussion to explain the following: Because our dataset is limited to wikis, we cannot address these questions empirically. However, we believe several general mechanisms may drive our findings and that these likely apply to other communities without pre-defined hierarchies or formal structures. In particular, we will cite earlier works on oligarchy (Michels, 1915) as well as "the tyranny of structurelessness" (Freeman, 1972)—both theorized as features of democratic organizations more broadly—that suggest that a tendency toward "calcification" in open organizations is likely neither unique to wikis nor peer production.
2.
R3 and R4 suggest additional description of Halfaker et al.'s methods. We had cut a longer summary of RAD's methodology to ensure that our paper's size was clearly commensurate to its contribution. On the recommendation of R3 and R4, we will reintroduce some of this text. Following R2, who said "I particularly appreciate how they've nearly entirely skipped re-iterating a methods section," we will try to limit ourselves to 3 additional paragraphs. We agree that this new text can increase clarity and transparency without attempting to simply repeat RAD. We are flexible on this point and would welcome further guidance from the reviewers in their responses.
3.
R3 proposes that we describe "shortcomings of the original RAD paper." We appreciate this idea and will add a couple of sentences to the methods section to this effect. We think the biggest issues relate to unique aspects of English Wikipedia potentially driving RAD's findings. These include questions about whether something happened around 2007 (like the rise of Facebook) that drove Wikipedia's editor decline. Our results suggest that this was likely not the case.
4.
R3 calls for more discussion and scrutiny of our use of Namespace 4 instead of policy pages to operationalize norm entrenchment. We agree that this difference is important and was glossed over in our submitted manuscript. At the same time, we maintain that Namespace 4 provides the best available opportunity to study norm entrenchment in Wikia where many wikis do not have policy pages that precisely parallel Wikipedia's. We will highlight and flag this issue as an important threat to validity in the methods paragraph for Study 3.
5.
R1 suggests adding a visual indication of uncertainty to Figure 1. We will add error bars to each point indicating bootstrapped 95% confidence intervals. We have made this change and the error bars are, as expected, larger for later periods where data is thinner but do not alter the takeaways from the figure.
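The bootstrapped intervals described in point 5 can be produced with a simple percentile bootstrap over each period's observations. A minimal sketch (hypothetical data and function names, not our actual analysis code):

```python
import random

def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap 95% CI for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement n_boot times and record each resample's mean.
    means = sorted(
        sum(rng.choice(values) for _ in range(n)) / n
        for _ in range(n_boot)
    )
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Illustrative survival indicators (1 = newcomer survived) for one period.
period_values = [0, 1, 1, 0, 1, 1, 1, 0, 1, 1]
low, high = bootstrap_ci(period_values)
```

The interval (low, high) would then be drawn as the error bar for that period's point in Figure 1; later, thinner periods have fewer observations and therefore wider intervals.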
6.
R2 pointed out an important typo that we will fix (the reported estimate for β is correct). We will also have the manuscript proofread professionally before submission of a camera ready copy to address any other stylistic issues.
7.
In addition to the issues raised in the reviews, we also propose adding one new robustness check for Studies 2 and 3 that we identified after submission. In these studies, our units of analysis are newcomers and namespace 4 edits. Because the wikis in our sample have different numbers of both, the average effects we report could disproportionately reflect the experience of users in the communities that contribute the most observations to our sample. The average user experience across all the wikis in our sample remains the most reasonable estimate to report (as our models already do), but we wanted to know if our findings described only the experience of users from the bigger wikis.
To address this, we fit another set of regression models in which each wiki is given equal weight. Our conclusions are robust to this change and the re-weighted models suggest that the RAD dynamics of entrenchment and newcomer rejection may even be stronger in smaller or less active communities. We plan to add the results of the robustness check to the supplementary material and to add a few new sentences to the discussion that summarize the threat, our new robustness check, and the substantively unchanged findings. We propose this change here because we believe the addition reflects a minor but important improvement. We apologize for not identifying this issue before submission and we hope that the reviewers are amenable to this small addition.
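The re-weighting described above amounts to giving each observation a weight inversely proportional to the number of observations its wiki contributes, so every wiki carries equal total weight in the regression. A minimal sketch of the weighting step (illustrative names, not our analysis code):

```python
from collections import Counter

def equal_wiki_weights(wiki_ids):
    """Weight each observation by 1/n_wiki so each wiki's weights sum to 1."""
    counts = Counter(wiki_ids)
    return [1.0 / counts[w] for w in wiki_ids]

# One observation per newcomer (or namespace 4 edit), tagged by wiki.
wiki_ids = ["a", "a", "a", "b", "c", "c"]
weights = equal_wiki_weights(wiki_ids)
```

These weights would then be passed to a weighted regression fit, so a wiki with three observations and a wiki with one each contribute a total weight of 1.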
Bullet-point summary of reviews:
-------------------------------------------------
Metareview (AC)
- The analysis is not as in-depth as in Halfaker
- Generalizability to non wiki communities?
- Possible underlying causes/mechanisms of these effects?
- More details on Halfaker et al.'s methods
Review 1 (2AC)
+ more discussion of how impact of the replication is limited to wikis.
+ Why or why not will this generalize to non-wiki peer production communities.
+ Error bars on Figure 1
Review 2
+ really likes that we don't re-iterate a methods section.
+ wants discussion of organizational mechanisms / "reasons these patterns happen"
+ Wants to know if Wikipedia and Wikias is the first time we noticed this dynamic. Seems to want a connection to broader theory about newcomers and groups.
+ They are OK if this is left to future work.
+ Noticed a typo in the interpretation of the coefficient of bot-revert
Review 3
+ "the difference between policy pages on wikipedia and project namespaces on the Wikia platform is significant enough to warrant greater discussion and scrutiny."
+ "articulate not just the findings, but perhaps some of the shortcomings of the original RAD paper."
Bike Rack
-------------
While this result is intriguing, we feel that understanding how community size and activity level interact with newcomer retention requires further analysis that is outside the scope of replicating RAD and is best left to future work. We plan to add the robustness check to the supplementary material and to include a few sentences of new text in the discussion that summarize the threat and the robustness check and that point to the opportunity for future work. We apologize for not including this material originally. The issue emerged in conversations with colleagues after the submission deadline.
We will add text to the discussion to describe this limitation and to assert our belief that, given more broadly applicable theoretical mechanisms, our results represent informative, but unconvincing, evidence favoring the notion that RAD's findings generalize beyond wikis.


@@ -0,0 +1,243 @@
Original reviews (full text)
CHI 2018 Papers
Reviews of submission #3186: "Revisiting The Rise and Decline in a
Population of Peer Production Projects"
------------------------ Submission 3186, Review 4 ------------------------
Reviewer: AC
Expertise
3 (Knowledgeable)
Recommendation
. . . Between possibly accept and strong accept; 4.5
Award Nomination
If accepted, this paper would be among the top 20% of papers presented at CHI (Best Paper: Honorable Mention nomination)
1AC: The Meta-Review
This short paper replicates and extends Halfaker's (2013) important paper
titled, "The Rise and Decline of an Open Collaboration System" (RAD)",
which examined open collaboration communities in Wikipedia. The authors
use similar methods as the earlier paper on a set of 700+ Wikia
communities. The results provide evidence that Halfaker's findings
extend to a wider set of wiki communities. The reviewers were uniformly
very positive about this paper and I concur. Their reviews raised only a
small set of non-critical issues that the authors might want to consider:
R1 notes that the analysis is not as in-depth as in Halfaker
R1 notes that the focus is on more wiki communities and asks the authors'
thoughts on generalizability to non-wiki communities. Similarly, R2 would
like to see more discussion of the possible underlying causes of these
effects.
R3 notes, and I agree, that the paper would benefit from more details on
Halfaker's methods that are applied here. Not everyone reading the paper
will be aware of those methods or have the earlier paper available while
reading this one.
All authors may submit an optional 5000 character rebuttal to address any
misunderstanding or factual errors in the reviews. There is not much to
rebut for this paper, though perhaps the authors will want to briefly
address the minor points above.
Rebuttal response
(blank)
------------------------ Submission 3186, Review 1 ------------------------
Reviewer: 2AC
Expertise
3 (Knowledgeable)
Recommendation
. . . Between possibly accept and strong accept; 4.5
Award Nomination
If accepted, this paper would not be among the top 20% of papers presented at CHI
Review
This paper replicates prior work by Halfaker et al. (2013) tracking the
influx of newcomers to Wikipedia. They examine this same behavior for 740
Wikia wikis. I strongly recommend publication in CHI. Replicating
important contributions like Halfaker et al. (2013) is important and
should be done more often. Understanding how participation changes over
the lifecycle of a peer production community is both theoretically
interesting from the perspective of norm development and socialization in
an online organization and is practically relevant to the survival of
this type of community.
My only reservations with this paper are 1) authors were not able to
examine behavior in as much depth as Halfaker et al (e.g. could not track
edits to deleted pages, did not distinguish good faith from bad faith
edits) and 2) they examine Wikia which shares many similarities (and
editors) with Wikipedia limiting the impact of the replication on
generalizing about the lifecycle of non-wiki peer-production communities.
However, both limitations are understandable given the constraints of a
single research study. The authors do a good job of being transparent
about the limitations of 1). I would be interested to see more discussion
of 2). For example, do the authors believe these results will replicate
in non-wiki peer production communities, why or why not?
Minor points - it would be nice if there were error bars (CI, SE) on the
points in figure 1.
Rebuttal response
(blank)
------------------------ Submission 3186, Review 2 ------------------------
Expertise
4 (Expert)
Recommendation
. . . Between possibly accept and strong accept; 4.5
Award Nomination
If accepted, this paper would be among the top 5% of papers presented at CHI (Best Paper nomination)
Review
In this paper, the authors replicate and extend the analyses performed by
Halfaker et al. 2013 "The Rise and Decline" (RAD). The authors argue
that replication of this work is essential to turn the conclusions of one
study into generalizable knowledge. They apply measurements similar to
those used in Wikipedia across a broad set of Wikia wikis (and discuss
diversity of content, time period, etc. among these wikis). They
conclude that the patterns seen in RAD *are* in fact common to this broad
set of open production environments.
This is a great, short paper. The authors make efficient use of their
prose to describe the past results of RAD, argue the importance of their
study, and to discuss methodological differences between this research
and RAD. I particularly appreciate how they've nearly entirely skipped
re-iterating a methods section and instead only discuss the differences
between their work and RAD's methods. Reviewing RAD and then reading
this paper was straightforward.
My only regret is that the authors did not provide more discussion of
organizational reasons why these patterns happen. Is the study of
Wikipedia and Wikias the first time we noticed this dynamic? I doubt it!
It seems like these patterns should be common in any community of
practice. Regardless, given that this paper is a short, I think it's
totally forgive-able that this type of discussion is left for future
work.
I just have one nit-pick:
Pg. 4:
* "Our parameter estimate for tool reverted (β = 0.22, SE = 0.28)
suggests that newcomers who are rejected by a bot might be more likely to
survive." -- Is this wrong? It seems like a negative coef suggests that
newcomers who were rejected by a bot are *less* likely to survive.
Rebuttal response
(blank)
------------------------ Submission 3186, Review 3 ------------------------
Expertise
4 (Expert)
Recommendation
. . . Between possibly accept and strong accept; 4.5
Award Nomination
If accepted, this paper would be among the top 20% of papers presented at CHI (Best Paper: Honorable Mention nomination)
Review
This short paper presents an attempt at replicating the work of an
influential prior research paper - Halfaker et al's "The Rise and Decline
of an Open Collaboration System" (RAD). I applaud these authors for their
effort in replicating the findings from RAD for a number of reasons:
- Essentially, this is only the second instance of replicating prior work
that I have encountered in the CHI community. And I agree with the
authors that in a field that rewards novelty, replication is often rarely
done, and also leads to other issues such as the generalizability of
findings. Too often in CHI we have one-off studies of systems with no way
to assess some of the claims being made, and more importantly, form
theories within the field that we can call our own.
- Open collaboration has been the subject of HCI research for some time now,
however, much of the claims of open collaboration studies have been made
through investigations of Wikipedia. This is a real issue that the
authors also explicitly set out to address. However, there are also good
reasons for why so many studies are based on Wikipedia as well - for
instance, it provides a sizable and (somewhat) easily accessible dataset
for researchers to investigate. Additionally, in "field" of open
collaboration is littered with one-off systems that are essentially
experiments - this makes it hard to do any form of comparative or
generalizable work with Wikipedia. Hence, I commend the authors'
resourcefulness in replicating the RAD study on a Wikia dataset - that is
not only somewhat similar to Wikipedia (in functionality) but also in
terms of size through the aggregation of 740 publicly hosted wikis in
their dataset.
- The authors are also rigorous in their replication study highlighting
not only the ways in which their analyses diverged from the original RAD
study. This was commendable for me as Wikia is a significantly different
platform from Wikipedia. By highlighting how they have to use different
statistical methods appropriate to nested community structure of their
data, and how they have accounted for potential repeated membership of
newcomers across the various Wikia wikis, the authors have made me more
confident in trusting their analyses.
- Most impressively, they were able to reproduce some of the main findings
found in the RAD paper - most notably that of the decline of
contributions to open collaboration systems, the survival and retention
of newcomers and the "calcification" of norms over time in wikis.
Overall, I found the findings relatively persuasive, with the exception
of the findings from Study 3 - examining the entrenchment of norms. It
seems to me that
project namespaces on the Wikia platform is significant enough to warrant
greater discussion and scrutiny.
If there is one criticism of this paper, it is the heavy reliance on the
assumption that the reader would be familiar with the original RAD paper.
This may not necessarily be the case and thus I feel that the authors
would do well to better articulate not just the findings, but perhaps
some of the shortcomings of the original RAD paper.
Overall I am pretty impressed with this submission and the succinct
clarity with which the authors have not only managed to report the
findings of reproducing the original RAD paper, but also summarizing
overall findings and making a case for replication of prior research. I
would recommend the acceptance of this paper for the CHI conference
wholeheartedly.
Rebuttal response
(blank)


@@ -0,0 +1,39 @@
We made the following substantive changes to our manuscript. Each of these changes was described in our rebuttal and the points in this summary correspond to the points in our rebuttal document.
1. We added a paragraph beginning "Despite our efforts at generalization" to the end of the discussion section. This paragraph refers to prior work that provides general mechanisms for norm entrenchment, increasing newcomer rejection, and newcomer retention and argues for the generalizability of our findings.
2. We made the following changes to our methods section:
i. To provide readers with a better mental model of the structure of the RAD study, we explain that RAD present three interdependent analyses.
ii. We amended ¶2 to clarify that RAD's plots of newcomer survival and rejection showed good-faith newcomers but that our replications of these plots show all newcomers.
iii. To better help readers grasp RAD's argument, we modified ¶2-6 to specify that RAD's time series plots are provided in support of their explanations of Wikipedia's decline while their regression models provide evidence for mechanisms.
iv. We have inserted a sentence to ¶3 explaining that RAD fit two models, one for all newcomers, and one for good-faith newcomers. We also list all the variables in RAD's regression model and amend the final sentence to explain that we replicate the model on all newcomers.
v. We inserted a new paragraph (¶4) that summarizes RAD's analysis of algorithmic tools.
vi. We amended the first sentence of ¶5 (formerly ¶3) to mention guidelines.
vii. We inserted a sentence to ¶5 describing RAD's plot showing changes in edits to norm pages over time.
viii. We broke ¶3 into two paragraphs (¶5-6). ¶7 now more fully describes RAD's second logistic regression and mentions the result that essay pages were less calcified.
ix. We split paragraph 8 into two paragraphs. The first describes RAD's use of a sample of "good faith" newcomers in greater depth. The second explains that we do not attempt to replicate this part of RAD's analysis.
x. We removed the former ¶5 because its content is now covered in the paragraphs above.
3. We added a paragraph to the end of the methods section describing two limitations of RAD and noting that our analysis partially addresses one.
4. We added two sentences to the methods section of Study 3 to explain that using all edits to Namespace 4 is a threat to the validity of our replication but represents the best available opportunity to study norm entrenchment on Wikia.
5. We added error bars to Figure 1.
6. We corrected a typo in the interpretation of the parameter estimate for tool reverted in the results for Study 2.
7. We added a new paragraph to the discussion section (¶4) describing the threat of varying levels of activity and numbers of newcomers across wikis in our sample. We explain how we address the threat and refer to supplementary material. We also add material to our supplementary material to describe and interpret the robustness check.
8. We carefully edited our paper for style and clarity. We had our paper professionally proofread.
9. We have unblinded our paper, added our copyright blurb, and added an "Acknowledgement" section. We also added a new section called "Access to Data" that includes a hyperlink to an archival copy of our code and dataset that we have published in the Harvard Dataverse.