This commit is contained in:
mgaughan 2025-07-27 23:44:47 -04:00
parent 036d42e17e
commit 80abfd40fa


@@ -28,7 +28,7 @@ TFDYF5UM,journalArticle,2011,"Capiluppi, Andrea; Stol, Klaas-Jan; Boldyreff, Cor
XDY5INZ6,conferencePaper,2018,"Lotter, Adriaan; Licorish, Sherlock A.; Savarimuthu, Bastin Tony Roy; Meldrum, Sarah",Code Reuse in Stack Overflow and Popular Open Source Java Projects,10.1109/ASWEC.2018.00027,"technical -- code reuse from Stack Overflow within popular Java OSS projects--  rationale is copying from stackoverflow or other popular projects, which inherently increases technical fit with the environment --- disregarding within project copying, that is almost a meaningless metric copying between projects is larger (in size of code segment that’s copied over) and may be more prevalent?",OSS project developers for popular/well-regarded projects -- I guess this action is also at the developer/contributor level.,"The most popular Java OSS projects on SourceForge and GitHub --- in 2017, which projects had the highest weekly popularity and contained requisite Java code or, alternatively, the projects that had the highest popularity on SourceForge; Also looking at all Java StackOverflow comments from 2014-2017. pulling out the code snippets from these answers","quantitative repo mining from Stack Overflow and most popular Java projects --- focusing on weekly and all-time popularity for GitHub and SourceForge metrics --- used near-OTS code reuse identification software changing parameters for a few things ; reliant on syntax and token similarity, not AST for analysis","paper finger wags about code reuse the whole time, discussing how it’s not a good practice. Adaptive change can lead to adherence to substandard environmental norms. Seems like code reuse from Stack Overflow isn’t even that prevalent? Not a very good paper honestly; never discusses when the code was reused within or in which direction --- it’s cross sectional data!! How can you even make an argument surrounding copying, which is a time-dependent action!
MBVCDT66,journalArticle,2023,"He, Runzhi; He, Hao; Zhang, Yuxia; Zhou, Minghui",Automating Dependency Updates in Practice: An Exploratory Study on GitHub Dependabot,10.1109/TSE.2023.3278129,"organizational/procedural -- adoption of dependabot--- stated rationale: managing project dependencies --- specifically, keeping project dependencies up to date ---","OSS developers for projects that have adopted dependabot --- specifically, developers who are likely to be familiar with dependabot and its workings","GitHub ---- specifically the projects on GitHub but often, the platform itself structures the environment that projects operate in --- also dependency networks, the reasoning why dependabot is used is for projects to align better with the dependencies that they’re reliant on","mixed-methods, EDA and a developer survey --- from a sample of 1823 projects, mining the PRs made by Dependabot with a survey of the projects’ developers (n131)--- sampled for the survey from the projects that were identified in the mining scenario -- a lot of researcher selection in finding maintainers who are likely to be familiar with it","a lot of analysis on whether or not adopting dependabot works, like whether the thing that people are doing to adapt is… actually working… even though dependabot is not that useful for automatic updates, many developers believe that it is useful for notifying updates
DGV2UJNM,conferencePaper,2020,"Zhou, Shurui; Vasilescu, Bogdan; Kästner, Christian",How has forking changed in the last 20 years? a study of hard forks on GitHub,10.1145/3377811.3380412,"technical and procedural --- the changing of a fork from social to hard in GitHub;the forks often start social, but then move to hard once obstacles to contributing upstream were found, whether that be unresponsive maintainers or rejected pull requests","GitHub repo owners of hard-forked OSS projects are the ones who effect the adaptive change ; though the quantitative sample looks at 15,306 hard forks on GitHub (initiated/shepherded by the owners of those hard forks), the interview sample looks at 15 owners of forked libraries --- these are long-tenured OSS contributors ","forking networks on GitHub, the environmental characteristics are unresponsive upstream projects or barriers to contributing to the upstream project; the hard fork is downstream of the construction of the upstream project as external to the social fork","mixed-methods; mining and classifying hard-forked projects and then interviewing the owners of those repositories/maintainers of upstream; heuristic classifier to find hard forks, qualitative card sorting to characterize them; qualitative interviews to gain perspective -- interview sample created from identifying hard forked projects and authors ","hard forks are, generally, a rare phenomenon across GitHub -- though the scale of GitHub means that the total number of these forks is actually pretty large; another verification of respondent claims: “we see little evidence of actual synchronization or merging across forks in the repositories:” ([Zhou et al., 2020, p. 453"
QLSEMWTQ,journalArticle,2017,"Vendome, Christopher; Bavota, Gabriele; Penta, Massimiliano Di; Linares-Vásquez, Mario; German, Daniel; Poshyvanyk, Denys",License usage and changes: a large-scale study on gitHub,10.1007/s10664-016-9438-4,,,,,
QLSEMWTQ,journalArticle,2017,"Vendome, Christopher; Bavota, Gabriele; Penta, Massimiliano Di; Linares-Vásquez, Mario; German, Daniel; Poshyvanyk, Denys",License usage and changes: a large-scale study on gitHub,10.1007/s10664-016-9438-4,"procedural, license adoption and change in a wide range of OSS projects on GitHub. why do people add or change licenses? ranging from internal fixes to explicitly external decision making surrounding the demands of users","contributors/maintainers of the project; those who can change the project’s license --- the contributor is rarely mentioned or identified, things happen but the action data is the focal point, not the contributor or the contributor’s actions; some discussion of the contributor in terms of the project ‘developer’ or ‘contributor’ and some allusions to governance structures impacting how and when licenses are changed (e.g. cannot change license until subset of contributors sign off)","the project’s broader user/dependency environment? environment of code reuse and reproduction. dependency or user environments making requests for more legibility or accessibility for license use. License adoptions and changes are adaptive to the direct requests of the environment, where the issue-created makes a demand for a certain procedural functionality within the project ; there is a subset of focal changes that are independent of the broader environment entirely and are solely concerned with the internal workings of the project","mixed-methods; first a longitudinal analysis of license changes within Java projects on GitHub. Then qualitative analysis of 1160 projects in different programming languages to look at commit messages and other stuff that has to do with changing the license. It seems like the quantitative analysis has little significance other than to establish the scale and frequency with which this is happening across the empirical space. 
Right, like the meaning/depth is really in the qualitative evaluation, this just tracks how often these things happen. quantitative analysis -- collected all commit data for random sample of Java projects on GitHub and then tracked commits where the (coarse-level) license changed from one distinct thing to another. in the end, identifying 1833 license changes with the quantitative analysis; matching these changes to the commit messages/issue reports. qualitative analysis: used a different sampling method to find the sample of the qualitative study; using qualitative open coding to look at the different moments of selecting code changes for projects e.g. the initial adoption of licenses and the license migration. did a somewhat contrived diversity analysis of the projects in their data set to evaluate how much coverage of open source ecosystem they were able to get with their two data sets --- through the ways that they analyzed this, the data set was representative. Kind of an interesting breadth-based diversity metric analysis, trying to figure out what characteristics of broader open source are met by the projects in the data set.","big issue i have with this lies in the methods section for identifying license changes: they use two different ways of identifying license changes --- in the quantitative way they mine across source code and in the qualitative study they keyword search across commit messages. it doesn’t seem like they accounted for when a project drops its license during a period of litigation surrounding the new/next license selection e.g. apache -> no license could be followed by no license -> LGPL but apache -> LGPL isn’t really optimized
5E2EWRQN,journalArticle,2020,"Abdalkareem, Rabe; Oda, Vinicius; Mujahid, Suhaib; Shihab, Emad",On the impact of using trivial packages: an empirical case study on npm and PyPI,10.1007/s10664-019-09792-9,technical: code reuse: trivial package reuse: rationale – trivial packages provide well-implemented and tested code from the packaging ecosystem: enables adherence to the quality testing of the broader ecosystem,application developers: long-tenured JS and Python coders: largely professional but some independents,package management systems: npm and PyPI: change adheres project to well-tested and implemented environment: no project evaluation of change ‘success’ wrt environment ,mixed methods: pilot survey – data mining – follow up survey – data mining to validate survey responses: sampling from prior methods step: skews to university ,internal motivations for productivity: many also stated that reuse was bad: paper spends a lot of time defining trivial packages
P3MTJWXP,conferencePaper,2022,"Zhang, Xunhui; Wang, Tao; Yu, Yue; Zeng, Qiubing; Li, Zhixing; Wang, Huaimin","Who, What, Why and How? Towards the Monetary Incentive in Crowd Collaboration: A Case Study of Github’s Sponsor Mechanism",10.1145/3491102.3501822,procedural/organizational: developer adoption and participation in the GitHub sponsors program --- rationale for adopting the sponsorship model: I should be rewarded or recognized for my OSS work ,"OSS developers, but not necessarily those with big commits or key contributions, the popular ones who work on big projects","GitHub --- and also broader society. the adoption of the feature is bound to the platform as the environment --- as such, the bounds of the project’s activity are restricted by GitHub as a platform --- to what extent is broader social environment (intrinsic desire for payment) also the environment here?",mixed-methods -- both data mining and survey; quantitative data mining of different sponsorship events within GitHub --- pulling a lot of data on the individual sponsorships and the sponsoring events --- statistic modeling (lmer) of maintainer/contributor balance etc.; Sampling from the data mining to identify the relevant population.; qualitative already with a questionnaire about the why and what questions.  
-- survey looked up the expectations and rationales for using the sponsor feature ; two-stage survey,"again a validity check of the contributor rationales --- How effective is the sponsorship mechanism with carving out time for maintainers to work on things --- didn’t hold up!; configurable adaptations more throughout this, not the first paper that discusses this; environment (GitHub) is incredibly deterministic in establishing who adapts/adopts the feature --- instead of being an amorphous social pressure or anything like that --- it is a platform trying to get you to use their most recent feature; intersection of rationales for doing things sit at the intersection of intrinsic and extrinsic motivations"
DW9Q2W6V,conferencePaper,2022,"Businge, John; Zerouali, Ahmed; Decan, Alexandre; Mens, Tom; Demeyer, Serge; De Roover, Coen",Variant Forks - Motivations and Impediments,10.1109/SANER53432.2022.00105,"technical change rooted in procedural motivations ---- creating variant of existing project in order to redirect the technical aims of the work. The creation of the variant fork is switching the social fork from contributing back to mainline to being more of a hard fork. creating a ‘variant fork’ of a project on a social coding platform -- variant fork is the same as hard fork I think. --- technical maintenance, wanting to change something and running into issues contributing back to the project or wanting to redirect the features ;; addition of different types of things","OSS maintainers who support the variant fork, who walked away from the upstream mainline project in order to create a new repository and effort",the environment is the newly-constructed relationship between the fork project and its upstream mainline. The actions of the mainline thus become the rationale for moving the fork from social to variant. ,"Mixed-methods – largely a survey with 105 maintainers of social fork projects; 12 questions, 15 minutes long of surveying authors. then selected things Likert scales and free responses from the survey responses. But also Sampled for variant forks through heuristic-based mining of platforms like Libraries.io and GitHub. Then also collected popularity metrics for the forks and their variants --- leading to mixed-methods analysis, even though the quantitative side of it is weak… to say the least.",Interesting that the categories of technical and procedural run into issues when the motivations are based in the procedural gripes with technical maintenance

1 Key Item Type Publication Year Author Title DOI Change characteristics (blue) Actors (purple) Environmental characteristics (green) Methods details (orange) Misc. (red)
28 XDY5INZ6 conferencePaper 2018 Lotter, Adriaan; Licorish, Sherlock A.; Savarimuthu, Bastin Tony Roy; Meldrum, Sarah Code Reuse in Stack Overflow and Popular Open Source Java Projects 10.1109/ASWEC.2018.00027 technical -- code reuse from Stack Overflow within popular Java OSS projects--  rationale is copying from stackoverflow or other popular projects, which inherently increases technical fit with the environment --- disregarding within project copying, that is almost a meaningless metric copying between projects is larger (in size of code segment that’s copied over) and may be more prevalent? OSS project developers for popular/well-regarded projects -- I guess this action is also at the developer/contributor level. The most popular Java OSS projects on SourceForge and GitHub --- in 2017, which projects had the highest weekly popularity and contained requisite Java code or, alternatively, the projects that had the highest popularity on SourceForge; Also looking at all Java StackOverflow comments from 2014-2017. pulling out the code snippets from these answers quantitative repo mining from Stack Overflow and most popular Java projects --- focusing on weekly and all-time popularity for GitHub and SourceForge metrics --- used near-OTS code reuse identification software changing parameters for a few things ; reliant on syntax and token similarity, not AST for analysis paper finger wags about code reuse the whole time, discussing how it’s not a good practice. Adaptive change can lead to adherence to substandard environmental norms. Seems like code reuse from Stack Overflow isn’t even that prevalent? Not a very good paper honestly; never discusses when the code was reused within or in which direction --- it’s cross sectional data!! How can you even make an argument surrounding copying, which is a time-dependent action!
29 MBVCDT66 journalArticle 2023 He, Runzhi; He, Hao; Zhang, Yuxia; Zhou, Minghui Automating Dependency Updates in Practice: An Exploratory Study on GitHub Dependabot 10.1109/TSE.2023.3278129 organizational/procedural -- adoption of dependabot--- stated rationale: managing project dependencies --- specifically, keeping project dependencies up to date --- OSS developers for projects that have adopted dependabot --- specifically, developers who are likely to be familiar with dependabot and its workings GitHub ---- specifically the projects on GitHub but often, the platform itself structures the environment that projects operate in --- also dependency networks, the reasoning why dependabot is used is for projects to align better with the dependencies that they’re reliant on mixed-methods, EDA and a developer survey --- from a sample of 1823 projects, mining the PRs made by Dependabot with a survey of the projects’ developers (n131)--- sampled for the survey from the projects that were identified in the mining scenario -- a lot of researcher selection in finding maintainers who are likely to be familiar with it a lot of analysis on whether or not adopting dependabot works, like whether the thing that people are doing to adapt is… actually working… even though dependabot is not that useful for automatic updates, many developers believe that it is useful for notifying updates
30 DGV2UJNM conferencePaper 2020 Zhou, Shurui; Vasilescu, Bogdan; Kästner, Christian How has forking changed in the last 20 years? a study of hard forks on GitHub 10.1145/3377811.3380412 technical and procedural --- the changing of a fork from social to hard in GitHub;the forks often start social, but then move to hard once obstacles to contributing upstream were found, whether that be unresponsive maintainers or rejected pull requests GitHub repo owners of hard-forked OSS projects are the ones who effect the adaptive change ; though the quantitative sample looks at 15,306 hard forks on GitHub (initiated/shepherded by the owners of those hard forks), the interview sample looks at 15 owners of forked libraries --- these are long-tenured OSS contributors forking networks on GitHub, the environmental characteristics are unresponsive upstream projects or barriers to contributing to the upstream project; the hard fork is downstream of the construction of the upstream project as external to the social fork mixed-methods; mining and classifying hard-forked projects and then interviewing the owners of those repositories/maintainers of upstream; heuristic classifier to find hard forks, qualitative card sorting to characterize them; qualitative interviews to gain perspective -- interview sample created from identifying hard forked projects and authors hard forks are, generally, a rare phenomenon across GitHub -- though the scale of GitHub means that the total number of these forks is actually pretty large; another verification of respondent claims: “we see little evidence of actual synchronization or merging across forks in the repositories:” ([Zhou et al., 2020, p. 453
31 QLSEMWTQ journalArticle 2017 Vendome, Christopher; Bavota, Gabriele; Penta, Massimiliano Di; Linares-Vásquez, Mario; German, Daniel; Poshyvanyk, Denys License usage and changes: a large-scale study on gitHub 10.1007/s10664-016-9438-4 procedural, license adoption and change in a wide range of OSS projects on GitHub. why do people add or change licenses? ranging from internal fixes to explicitly external decision making surrounding the demands of users contributors/maintainers of the project; those who can change the project’s license --- the contributor is rarely mentioned or identified, things happen but the action data is the focal point, not the contributor or the contributor’s actions; some discussion of the contributor in terms of the project ‘developer’ or ‘contributor’ and some allusions to governance structures impacting how and when licenses are changed (e.g. cannot change license until subset of contributors sign off) the project’s broader user/dependency environment? environment of code reuse and reproduction. dependency or user environments making requests for more legibility or accessibility for license use. License adoptions and changes are adaptive to the direct requests of the environment, where the issue-created makes a demand for a certain procedural functionality within the project ; there is a subset of focal changes that are independent of the broader environment entirely and are solely concerned with the internal workings of the project mixed-methods; first a longitudinal analysis of license changes within Java projects on GitHub. Then qualitative analysis of 1160 projects in different programming languages to look at commit messages and other stuff that has to do with changing the license. It seems like the quantitative analysis has little significance other than to establish the scale and frequency with which this is happening across the empirical space. 
Right, like the meaning/depth is really in the qualitative evaluation, this just tracks how often these things happen. quantitative analysis -- collected all commit data for random sample of Java projects on GitHub and then tracked commits where the (coarse-level) license changed from one distinct thing to another. in the end, identifying 1833 license changes with the quantitative analysis; matching these changes to the commit messages/issue reports. qualitative analysis: used a different sampling method to find the sample of the qualitative study; using qualitative open coding to look at the different moments of selecting code changes for projects e.g. the initial adoption of licenses and the license migration. did a somewhat contrived diversity analysis of the projects in their data set to evaluate how much coverage of open source ecosystem they were able to get with their two data sets --- through the ways that they analyzed this, the data set was representative. Kind of an interesting breadth-based diversity metric analysis, trying to figure out what characteristics of broader open source are met by the projects in the data set. big issue i have with this lies in the methods section for identifying license changes: they use two different ways of identifying license changes --- in the quantitative way they mine across source code and in the qualitative study they keyword search across commit messages. it doesn’t seem like they accounted for when a project drops its license during a period of litigation surrounding the new/next license selection e.g. apache -> no license could be followed by no license -> LGPL but apache -> LGPL isn’t really optimized
32 5E2EWRQN journalArticle 2020 Abdalkareem, Rabe; Oda, Vinicius; Mujahid, Suhaib; Shihab, Emad On the impact of using trivial packages: an empirical case study on npm and PyPI 10.1007/s10664-019-09792-9 technical: code reuse: trivial package reuse: rationale – trivial packages provide well-implemented and tested code from the packaging ecosystem: enables adherence to the quality testing of the broader ecosystem application developers: long-tenured JS and Python coders: largely professional but some independents package management systems: npm and PyPI: change adheres project to well-tested and implemented environment: no project evaluation of change ‘success’ wrt environment mixed methods: pilot survey – data mining – follow up survey – data mining to validate survey responses: sampling from prior methods step: skews to university internal motivations for productivity: many also stated that reuse was bad: paper spends a lot of time defining trivial packages
33 P3MTJWXP conferencePaper 2022 Zhang, Xunhui; Wang, Tao; Yu, Yue; Zeng, Qiubing; Li, Zhixing; Wang, Huaimin Who, What, Why and How? Towards the Monetary Incentive in Crowd Collaboration: A Case Study of Github’s Sponsor Mechanism 10.1145/3491102.3501822 procedural/organizational: developer adoption and participation in the GitHub sponsors program --- rationale for adopting the sponsorship model: I should be rewarded or recognized for my OSS work OSS developers, but not necessarily those with big commits or key contributions, the popular ones who work on big projects GitHub --- and also broader society. the adoption of the feature is bound to the platform as the environment --- as such, the bounds of the project’s activity are restricted by GitHub as a platform --- to what extent is broader social environment (intrinsic desire for payment) also the environment here? mixed-methods -- both data mining and survey; quantitative data mining of different sponsorship events within GitHub --- pulling a lot of data on the individual sponsorships and the sponsoring events --- statistic modeling (lmer) of maintainer/contributor balance etc.; Sampling from the data mining to identify the relevant population.; qualitative already with a questionnaire about the why and what questions.  
-- survey looked up the expectations and rationales for using the sponsor feature ; two-stage survey again a validity check of the contributor rationales --- How effective is the sponsorship mechanism with carving out time for maintainers to work on things --- didn’t hold up!; configurable adaptations more throughout this, not the first paper that discusses this; environment (GitHub) is incredibly deterministic in establishing who adapts/adopts the feature --- instead of being an amorphous social pressure or anything like that --- it is a platform trying to get you to use their most recent feature; intersection of rationales for doing things sit at the intersection of intrinsic and extrinsic motivations
34 DW9Q2W6V conferencePaper 2022 Businge, John; Zerouali, Ahmed; Decan, Alexandre; Mens, Tom; Demeyer, Serge; De Roover, Coen Variant Forks - Motivations and Impediments 10.1109/SANER53432.2022.00105 technical change rooted in procedural motivations ---- creating variant of existing project in order to redirect the technical aims of the work. The creation of the variant fork is switching the social fork from contributing back to mainline to being more of a hard fork. creating a ‘variant fork’ of a project on a social coding platform -- variant fork is the same as hard fork I think. --- technical maintenance, wanting to change something and running into issues contributing back to the project or wanting to redirect the features ;; addition of different types of things OSS maintainers who support the variant fork, who walked away from the upstream mainline project in order to create a new repository and effort the environment is the newly-constructed relationship between the fork project and its upstream mainline. The actions of the mainline thus become the rationale for moving the fork from social to variant. Mixed-methods – largely a survey with 105 maintainers of social fork projects; 12 questions, 15 minutes long of surveying authors. then selected things Likert scales and free responses from the survey responses. But also Sampled for variant forks through heuristic-based mining of platforms like Libraries.io and GitHub. Then also collected popularity metrics for the forks and their variants --- leading to mixed-methods analysis, even though the quantitative side of it is weak… to say the least. Interesting that the categories of technical and procedural run into issues when the motivations are based in the procedural gripes with technical maintenance