34/34 for pass 1

2025-07-29 22:56:09 -04:00 · 6fabfdd2ab
commit 6fabfdd2ab
parent f387493d42
1 changed files with 1 additions and 1 deletions
--- a/070725_papers_master.csv
+++ b/070725_papers_master.csv
@@ -16,7 +16,7 @@ WE8VYWEX,journalArticle,2021,"Geiger, R. Stuart; Howard, Dorothy; Irani, Lilly",
 3F7CJATB,journalArticle,2022,"Yin, Likang; Chakraborti, Mahasweta; Yan, Yibo; Schweik, Charles; Frey, Seth; Filkov, Vladimir",Open Source Software Sustainability: Combining Institutional Analysis and Socio-Technical Networks,10.1145/3555129,"Procedural – discussion of governance in OSS projects ---- measured through identiification of Institutional Statements within Apache-incubator projects --- more functional-based discussions around IS statements precedes graduation from ASFI\*. more IS discussion in general precedes increased activity; more governance discussion from mentors (ASFI proxies) precedes engagement with the project. *However, without more robust description of the interpretation of the LDA topics, I have a bit of skepticism around this. There are other notes about evolution with regards to attracting contributors and setting up contributions, but that’s secondary to the adaptation discussion.","stakeholder agents for the OSS projects within ASFI (committers, contributors, mentors). Different than contributor or maintainer because the role of mentor is rather bespoke in this environment.","Apache Software Foundation incubator and graduation from the environment  “Those having foundation support, like the ASF, may additionally be in the process of organizing the developers’ structured interactions under a second tier of governance prescriptions as required by the ASF Incubator.” ([Yin et al., 2022, p. 
4]","quantitative ---- computationally modeling institutional norms and rules from OSS communications as well as socio-technical collaboration records; collecting data for both historical development activity such as commits and communications as well as project-level discussions regarding mailing list discussions  creating a socio-technical network of collaboration for the different projects  fine-tuning a BERT classifier to identify institutional statements with the ASF-specific language from different things; with this longitudinal data, they use Granger causality to identify “causal links” of different things. Limitations using granger causality when evaluating cause and effect; some discussed in RQ3 results","longstanding gripes about the generalizeability of ASF studies --- the isomorphic pressure is, I think, too big for any takeaway to be meaningfully transferred to other settings --- ;like this oper  The ASFI can have mentors that represent the desires (re: fit) of the environment more broadly, directly adapting the focal project to the envrionment;s wants"
 BFEMKQCR,journalArticle,2014,"Gamalielsson, Jonas; Lundell, Björn",Sustainability of Open Source software communities beyond a fork: How and why has the LibreOffice project evolved?,https://doi.org/10.1016/j.jss.2013.11.1077,Procedural – creating hard fork of a prior project; making the prior project now an environmental factor in the construction of the new project. Ideologically motivated hard fork (at the time of forking) LO from OO. Adaptation in the construction of the hard fork means ideologically motivated restructuring of governance and licensing (further egalitatrian and copyleft) which is even different than the AOO proejct which also evolved from OO,"engaged contributors and maintainers of the LibreOffice project, a fork from the pre-existing OpenOffice project","OpenOffice derivative forks; OO was an open source productivity library that was bought by Sun and then sold to Oracle.; LO is an offshoot of the project by OO contributors who wanted to depart from the mainline software; AO is an offshoot that is the result of Oracle selling the OO project to Apache. In a way, cross-project competition in the same product space ","mixed-methods case study analysis of the OpenOffice project and its different derivative forks.  first, quantitative analysis of the contribution patterns of the different projects and their forks; quantitative modeling spent a lot of time looking at passive evolution of projects rather than intentional adaptation.  second, interview study talking to different active participants in the LO community --- qualitative analysis based in open coding/qual coding practices","Interesting notes about sustainable communities and what that might mean for the projects involved; contributor activity peaks during the fraught times within the project’s lifecycle, when there’s a lot of activity surrounding the hard fork from upstream"
 FJSA37EW,journalArticle,2021,"Bogart, Chris; Kästner, Christian; Herbsleb, James; Thung, Ferdian",When and How to Make Breaking Changes: Policies and Practices in 18 Open Source Software Ecosystems,10.1145/3447245,procedural-- breaking changes across ecosystems; both how do developers decide and mange the performance of breaking changes for their own package and how do they respond to when a dependency makes  a breaking change,"package maintainers within different ecosystems; these package maintainers are often the same as the project developers, but not always (59%). most were 33 year old males","first study; Eclipse, NPM and CRAN package ecosystems. second study; 18 different package ecosystems.  “software ecosystems as communities built around shared programming languages, shared platforms, or shared dependency management tools, allowing developers to create packages that import and build on each others’ functionality.” ([Bogart et al., 2021, p. 3](zotero://select/library/items/FJSA37EW)) ([pdf](zotero://open-pdf/library/items/PJWUIBY2?page=3&annotation=DP2EWBXF)) Really takes a more ecosystem/environment-first perspective to the behavioral question, even though looking at the independent actions of developers. What seems to be the most important factor of the environment in the shaping of projects’ breaking change deployments are the rather gooey values and ideas held within the ecosystem, the norms around how packages relate to one another  Though spoken values are putatively shared across the ecosystem, the implemented values and practices vary widely","mixed-methods; first an interview case study across three different package management environments, then a survey, mining, and document analysis study to look across 18 ecosystems  first study; interviewing 28 developers across three ecosystems. 
sampling for package maintainers with both upstream and downstream dependencies; used inductive thematic coding  second study: then tried to do a systematic mapping of values and practices in a broad sample of ecosystems. used a grounded approach to code the free response sections of the survey. then mining stated ecosystem policies and commit activity to see how different breaking changes were actually integrated into the project.  sampling for study 2 was distinct form the sampling for study 1, however the research questions were largely derived from the interview study","“Another consequence of Eclipse’s stability, along with its use of semantic versioning, is that many packages have not changed their major version number in over 10 years” ([Bogart et al., 2021, p. 31](zotero://select/library/items/FJSA37EW)) ([pdf](zotero://open-pdf/library/items/PJWUIBY2?page=31&annotation=L49A55FP)) how do you study the absence of action?!"
-72F8GVAP,journalArticle,2025,"Jahanshahi, Mahmoud; Reid, David; Mockus, Audris",Beyond Dependencies: The Role of Copy-Based Reuse in Open Source Software Development,10.1145/3715907,,,,,
+72F8GVAP,journalArticle,2025,"Jahanshahi, Mahmoud; Reid, David; Mockus, Audris",Beyond Dependencies: The Role of Copy-Based Reuse in Open Source Software Development,10.1145/3715907,"technical-  copy-based software reuse; black-box reuse (using a dependency manager, not touching the depended-upon code) and white-box reuse (copying the original code and committing the duplicate code to the focal repository); prevalent, 6.9% of code in OSS is reused at least once, and each reused blob is copied to an average of 24 other projects; 80% of OSS projects have reused blobs from another project at least once; all types of projects big and small; (see notes in the bottom re: timeliness of this); seems to be a cultural thing for each ecosystem in terms of modularity and copying things over --- same with R, though that might also be explained by R code being very specific and statistics oriented, while R-type projects are more likely to pull in other code into their project; Reusers deny reusing the code? seems that it’s a taboo adaptation --- one that carries a lot of stigma with it; rationale: developers reuse code to incorporate existing functionalities into their projects -- in order to save time and effort in development, or to reuse resources.","commit author who introduced the reused blob in the destination repository. ‘reuser’ is the correct term because the individual could be anyone in the destination project, who just happened to introduce the pre-existing code into things.","The project that first introduced the blob is the external environment that the focal project is engaging with. E.g. if the blob first appears in project A, then project B does a white-box reuse, then project A is the environment of project B. 
I guess this is also bidirectional, though the original commit has little conception of the environment when writing the original piece of code.","mixed-methods; repository mining first and then a survey to follow.; mining: measurement framework of tracking repository blobs and their inclusion over time in different repositories and projects; by using the blob approach, they are able to compare hashes of code versions across the entire code ecosystem, tracing the origins of a specific piece of code across different repos in a way that the other identification of clones cannot --- Conversely, “This means that copy-based reuse is detected only if an entire file is duplicated without any alterations [46].” ([Jahanshahi et al., 2025, p. 10](zotero://select/library/items/72F8GVAP)) ([pdf](zotero://open-pdf/library/items/QC47MLLB?page=10&annotation=QMLZXAVJ)) This feels like an enormous blind spot when mapping the usage of copy-based reuse, as it only accounts for full and total copying of the original blob. There are so many instances of reuse where the file is copied only piecemeal!; given the scale of the data, the time-complexity was expected to take about a week. Per their estimation, looking for code-based reuse at a finer level of granularity than this would be prohibitive in terms of length of time.; for mining, only looking at the original versions of the project, no derivative forks.; also statistically modeled whether or not the different characteristics of a given project impacted its likelihood of reuse; alternatively, whether characteristics of the original hosting project impact whether the blob is copied over to a new project; because they mined using the WorldOfCode dataset, they are making the claim that their empirical analysis encapsulates ‘almost the entire OSS ecosystem’; They tried to do stratified sampling across different instances of the copy-based reuse that they had identified in the project. 
they DO use the results from the quantitative side of the methods in order to find the sample for the qualitative side; -     three waves of surveys: first picking 24 developers for an initial survey of open-ended questions; second a big survey of 724 subjects; then a bigger survey of 8734 subjects.; - they received 247 complete responses from reusers and 127 from creators; low low response rate!; Thematic analysis of the qualitative data, looking at the different free responses as well as quantitative things","in terms of adaptation, the paper explicitly cites social contagion theory for spreading the practice of software reuse  “2 Associated Risks” ([Jahanshahi et al., 2025, p. 4](zotero://select/library/items/72F8GVAP)) ([pdf](zotero://open-pdf/library/items/QC47MLLB?page=4&annotation=BCIFZUKB)) highlighting how the adaptive action could be bad for the focal project; also though framing it in the positive sense when looking at the different importance of the project  Is this currently a common adaptive action? It seems as though it is very dependent on when things are written and copied (like, decades ago) in terms of prevalence?"
 QEKG8ISF,journalArticle,2016,"Hilton, Michael; Tunnell, Timothy; Huang, Kai; Marinov, Darko; Dig, Danny","ASE - Usage, costs, and benefits of continuous integration in open-source projects",10.1145/2970276.2970358,"Procedural – adoption and change to CI systems within project builds --- though the rational for initial adoption are intrinsic to the project, the reasons for changing or evolving the CI yaml file are largely contingent on dependencies and reliability","OSS maintianers, specifically of popular projects on GitHub, many of whom have used CI workflows in their builds",OSS on GitHub --- no evaluation of whether CI changes ‘work’ for their environment though I guess adherence to the new dependency is the thing that would display that ,"mixed-methods: mining open source projects from GitHub while also surveying developers from  popular projects.survey sampling is NOT from the mined authorship data, instead focusing on popular GitHub projects",he discussion of adaptive change is often framed within the broader setting of whether or not the change is popular/relevant -- i.e. 40% of projects are using CI systems 
 ZGK4HR76,journalArticle,2015,"Vendome, Christopher; Linares-Vasquez, Mario; Bavota, Gabriele; Di Penta, Massimiliano; German, Daniel M.; Poshyvanyk, Denys",ICSME - When and why developers adopt and change software licenses,10.1109/icsm.2015.7332449,Procedural – license changes within OSS projects --- the change event is the real adaptation action; rationale for commercial reuse is also a motivating factor ,"OSS maintainers (Java, on GitHub) who change the license --- oftentimes this is a copyright holder or primary core contirbutor of the project ","nebulous environment of commercial reuse, ‘community’, and user base --- there’s also discussion around the role that changes to the broader dependency network play in the structure of the project and changes to the license",Mixed-methods; mining 16k java open source projects and their commits and then supplemented with survey study of 138 developers; sample of developers surveyed are from the trace data of the mined project actions --- specifically looking to identify projects that had shifted over time with regard to licensing,many of the intitial decisions and adaptations were motivated from a range of intrinsic and extrinsic motivations
 PSZSSAS3,journalArticle,2017,"Ding, Hui; Ma, Wanwangying; Chen, Lin; Zhou, Yuming; Xu, Baowen",APSEC - An Empirical Study on Downstream Workarounds for Cross-Project Bugs,10.1109/apsec.2017.38,"Technical – downstream workarounds from errors and bugs and undesirable method behavior introduced by upstream dependencies -- common workaround per prior literature -- workarounds fall into four common patterns: using a different method, wrapping the current method in a conditional per input, augmenting input to match method, augmenting method output to match downstream ",OSS project developers of downstream scientific python libraries on GitHub ,"packaging ecosystems, Scientific python dependency ecosystem on GitHub",mixed-methods; though statistical methods were used in evaluating hypotheses or finding differences in the workarounds; manual methods were used for cross-project bugs with workarounds in the first place as well as some characterization of the bugs ,the short-term adaptations are meaningfully different than the long-term fixes across the environment  there’s a lot of similarity in the kinds of cross-project bugs that run into this problem/adaptation --- often when the downstream project encounters an emergent case that is untested by the upstream repository