adaptation-slr/snowballing/forward-snowball-citations.ris


TY  - NA
AU  - Ding, Steven H. H.; Fung, Benjamin C. M.; Charland, Philippe
TI  - IEEE Symposium on Security and Privacy - Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization
PY  - 2019
AB  - Reverse engineering is a manually intensive but necessary technique for understanding the inner workings of new malware, finding vulnerabilities in existing systems, and detecting patent infringements in released software. An assembly clone search engine facilitates the work of reverse engineers by identifying those duplicated or known parts. However, it is challenging to design a robust clone search engine, since there exist various compiler optimization options and code obfuscation techniques that make logically similar assembly functions appear to be very different. A practical clone search engine relies on a robust vector representation of assembly code. However, the existing clone search approaches, which rely on a manual feature engineering process to form a feature vector for an assembly function, fail to consider the relationships between features and identify those unique patterns that can statistically distinguish assembly functions. To address this problem, we propose to jointly learn the lexical semantic relationships and the vector representation of assembly functions based on assembly code. We have developed an assembly code representation learning model Asm2Vec. It only needs assembly code as input and does not require any prior knowledge such as the correct mapping between assembly functions. It can find and incorporate rich semantic relationships among tokens appearing in assembly code. We conduct extensive experiments and benchmark the learning model with state-of-the-art static and dynamic clone search approaches. We show that the learned representation is more robust and significantly outperforms existing methods against changes introduced by obfuscation and optimizations.
SP  - 472
EP  - 489
JF  - 2019 IEEE Symposium on Security and Privacy (SP)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/sp.2019.00003
ER  -

TY  - CHAP
AU  - Lundell, Björn; Gamalielsson, Jonas; Butler, Simon; Brax, Christoffer; Persson, Tomas; Mattsson, Anders; Gustavsson, Tomas; Feist, Jonas; Öberg, Jonas
TI  - OSS - Enabling OSS Usage Through Procurement Projects: How Can Lock-in Effects Be Avoided?
PY  - 2021
AB  - Formulation of mandatory requirements in procurement projects has significant influence on opportunities for development and deployment of Open Source Software (OSS). The paper contributes insights on a widespread practice amongst public procurement projects which causes problematic lock-in effects and thereby inhibits opportunities for use of OSS solutions. Through a systematic investigation of 30 randomly selected procurement projects in the software domain the paper highlights illustrative examples of mandatory requirements which cause lock-in and presents five recommendations for how requirements instead should be formulated in order to avoid causing lock-in. Findings show significant lock-in caused by current procurement practices with a stark preference for proprietary software and SaaS solutions amongst procuring organisations.
SP  - 16
EP  - 27
JF  - IFIP Advances in Information and Communication Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-030-75251-4_2
ER  -

TY  - NA
AU  - Gaughan, Matthew; Champion, Kaylea; Hwang, Sohyeon
TI  - Engineering Formality and Software Risk in Debian Python Packages
PY  - 2024
AB  - NA
SP  - 1005
EP  - 1010
JF  - 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/saner60148.2024.00108
ER  -

TY  - CONF
AU  - Borreguero, Ferran; Di Nitto, Elisabetta; Stebliuk, Dmitrii; Tamburri, Damian A.; Zheng, Chengyu
TI  - CHASE@ICSE - Fathoming software evangelists with the D-index
PY  - 2015
AB  - The increased importance represented by open-source and crowd-sourced software developers and software development in general, inspired us to consider the following dilemma: can we "compute" virtuous software developers? The D-Index is our preliminary attempt. Essentially, the D-Index meaningfully equates several indicators for the virtues of a developer, such as, contributed code, its quality, mentoring in online learning communities, community engagement. Our preliminary evaluation of the index suggests that establishing the virtues for certain developers eases the identification of software "evangelists", key success enablers for software communities.
SP  - 85
EP  - 88
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Hwang, Sohyeon; Kiene, Charles; Ong, Serene; Shaw, Aaron
TI  - Adopting Third-party Bots for Managing Online Communities
PY  - 2024
AB  - Bots have become critical for managing online communities on platforms, especially to match the increasing technical sophistication of online harms. However, community leaders often adopt third-party bots, creating room for misalignment in their assumptions, expectations, and understandings (i.e., their technological frames) about them. On platforms where sharing bots can be extremely valuable, how community leaders can revise their frames about bots to more effectively adopt them is unclear. In this work, we conducted a qualitative interview study with 16 community leaders on Discord examining how they adopt third-party bots. We found that participants addressed challenges stemming from uncertainties about a bot's security, reliability, and fit through emergent social ecosystems. Formal and informal opportunities to discuss bots with others across communities enabled participants to revise their technological frames over time, closing gaps in bot-specific skills and knowledge. This social process of learning shifted participants' perspectives of the labor of bot adoption into something that was satisfying and fun, underscoring the value of collaborative and communal approaches to adopting bots. Finally, by shaping participants' mental models of the nature, value, and use of bots, social ecosystems also raise some practical tensions in how they support user creativity and customization in third-party bot use. Together, the social nature of adopting third-party bots in our interviews offers insight into how we can better support the sharing of valuable user-facing tools across online communities.
SP  - 1
EP  - 26
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 8
IS  - CSCW1
PB  -
DO  - 10.1145/3653707
ER  -

TY  - JOUR
AU  - Amrollahi, Alireza; Rowlands, Bruce
TI  - OSPM: A design methodology for open strategic planning
PY  - 2018
AB  - This study employs a design science perspective to propose a methodology for open strategic planning (OSP). Habermas’ discourse theory and Bryson’s strategy change cycle are used as informing kernel theories. A methodology is proposed to satisfy the requirements retrieved from the kernel theories. The proposed methodology contains modules for a planning system and a planning process. Design principles are explained through a blueprint of the system and process. The proposed methodology is applied and evaluated in two cases. Contributions to the literature involve extending the literature on OSP to an applicable methodology with guidelines on how to implement open strategy.
SP  - 667
EP  - 685
JF  - Information & Management
VL  - 55
IS  - 6
PB  -
DO  - 10.1016/j.im.2018.01.006
ER  -

TY  - NA
AU  - Ilyas, Muhammad; Khan, Siffat Ullah
TI  - ICIS - Practices for software integration success factors in GSD environment
PY  - 2016
AB  - The use and size of software are both growing, due to the advances in ICTs, resulting in increased software complexity. The software vendors overcome this complexity by decomposing the product into different components and then these components are developed in-house, outsourced or purchased as off the shelf (OTS) components. The next step is to integrate these components into a final product. In our previous work we identified, through systematic literature review (SLR), a list of nine critical success factors (CSFs) for global software development (GSD) vendors in the software integration process. In order to implement the identified CSFs by GSD vendors, we conducted another SLR study and identified a total of 116 practices/solutions. These practices will assist GSD vendors in the implementation of the identified CSFs in order to overcome the complexity of the integration process in GSD projects.
SP  - 1
EP  - 6
JF  - 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icis.2016.7550828
ER  -

TY  - JOUR
AU  - Alvares, Cecília M.S.; Grossi, Luiza B.; Ramos, Ramatisa L.; Magela, Cíntia S.; Amaral, Míriam Cristina Santos
TI  - Bi-dimensional modelling of the thermal boundary layer and mass flux prediction for direct contact membrane distillation
PY  - 2019
AB  - NA
SP  - 1205
EP  - 1215
JF  - International Journal of Heat and Mass Transfer
VL  - 141
IS  - NA
PB  -
DO  - 10.1016/j.ijheatmasstransfer.2019.07.014
ER  -

TY  - JOUR
AU  - dos Santos, Carlos Denner
TI  - Changes in free and open source software licenses: managerial interventions and variations on project attractiveness
PY  - 2017
AB  - The license adopted by an open source software is associated with its success in terms of attractiveness and maintenance of an active ecosystem of users, bug reporters, developers, and sponsors because what can and cannot be done with the software and its derivatives in terms of improvement and market distribution depends on legal terms there specified. By knowing this licensing effect through scientific publications and their experience, project managers became able to act strategically, loosening up the restrictions associated with their source code due to sponsor interests, for example; or the contrary, tightening restrictions up to guarantee source code openness, adhering to the “forever free” strategy. But, have project managers behaved strategically like that, changing their projects license? Up to this paper, we did not know if and what types of changes in these legal allowances project managers have made and, more importantly, whether such managerial interventions are associated with variations in intervened project attractiveness (i.e., related to their numbers of web hits, downloads and members). This paper accomplishes these two goals and demonstrates that: 1) managers of free and open source software projects do change the distribution rights of their source code through a change in the (group of) license(s) adopted; and 2) variations in attractiveness are associated with the strategic choice of a licensing schema. To reach these conclusions, a unique dataset of open source projects that have changed license was assembled in a comparative form, analyzing intervened projects over its monthly periods of different licenses. Based on a sample of more than 3500 active projects over 44 months obtained from the FLOSSmole repository of Sourceforge.net data, 756 projects that had changed their source code distribution allowances and restrictions were identified and analyzed. A dataset on these projects’ type of changes was assembled to enable a descriptive and exploratory analysis of the types of license interventions observed over a period of almost four years anchored on projects’ attractiveness. More than 35 types of interventions were detected. The results indicate that variations in attractiveness after a license intervention are not symmetric; that is, if a change from license schema A to B is beneficial to attractiveness, a change from B to A is not necessarily prejudicial. This and other interesting findings are discussed in detail. In general, the results here reported support the current literature knowledge that the restrictions imposed by the license on the source code distribution are associated with market success vis-a-vis project attractiveness, but they also suggest that the state-of-the-science is superficial in terms of what is known about why these differences in attractiveness can be observed. The complexity of the results indicates to free software managers that no licensing schema should be seen as the right one, and its choice should be carefully made, considering project strategic goals as perceived relevant to stakeholders of the application and its production. These conclusions create awareness of several limitations of our current knowledge, which are discussed along with guidelines to understand them deeper in future research endeavors.
SP  - 1
EP  - 12
JF  - Journal of Internet Services and Applications
VL  - 8
IS  - 1
PB  -
DO  - 10.1186/s13174-017-0062-3
ER  -

TY  - JOUR
AU  - Kong, Dezhen; Liu, Jiakun; Bao, Lingfeng; Lo, David
TI  - Toward Better Comprehension of Breaking Changes in the NPM Ecosystem
PY  - 2025
AB  - Code evolution is prevalent in software ecosystems, which can provide many benefits, such as new features, bug fixes, security patches, while still introducing breaking changes that make downstream projects fail to work. Breaking changes cause a lot of effort to both downstream and upstream developers: downstream developers need to adapt to breaking changes and upstream developers are responsible for identifying and documenting them. In the NPM ecosystem, characterized by frequent code changes and a high tolerance for making breaking changes, the effort is larger. For better comprehension of breaking changes in the NPM ecosystem and to enhance breaking change detection tools, we conduct a large-scale empirical study to investigate breaking changes in the NPM ecosystem. We construct a dataset of explicitly documented breaking changes from 381 popular NPM projects. We find that 95.4% of the detected breaking changes can be covered by developers’ documentation, and 19% of the breaking changes cannot be detected by regression testing. Then in the process of investigating source code of our collected breaking changes, we yield a taxonomy of JavaScript- and TypeScript-specific syntactic breaking changes and a taxonomy of major types of behavioral breaking changes. Additionally, we investigate the reasons why developers make breaking changes in NPM and find three major reasons, i.e., to reduce code redundancy, to improve identifier names, and to improve API design, and each category contains several sub-items. We provide actionable implications for future research, e.g., automatic naming and renaming techniques should be applied in JavaScript projects to improve identifier names, future research can try to detect more types of behavioral breaking changes. By presenting the implications, we also discuss the weakness of automatic renaming and breaking change detection approaches, such as the lack of support for public identifiers and various types of breaking changes.
SP  - 1
EP  - 23
JF  - ACM Transactions on Software Engineering and Methodology
VL  - 34
IS  - 4
PB  -
DO  - 10.1145/3702991
ER  -

TY  - JOUR
AU  - Setó-Rey, Daniel; Santos-Martín, José Ignacio; López-Nozal, Carlos
TI  - Vulnerability of Package Dependency Networks
PY  - 2023
AB  - Software reuse by importing packages from centralised repositories is an efficient and increasingly widespread way to develop software. Given the transitivity of dependencies, defects introduced in the repository can have extensive effects on the software ecosystem. Drawing from complex network theory, we define a model of repository vulnerability based on the statistically expected damage that the repository sustains from the random introduction of software defects. We test the model in stylized networks derived from real repositories, PyPI, Maven and npm, and show that the existence of a giant strongly connected component (SCC) explains most of the vulnerability. Indeed, we found that theoretical protection (immunization) of this entire component would remove almost all vulnerability from the network. Since repositories and their communities have limited resources to mitigate issues, we further model the problem of how to best apply these resources, finding sets much smaller than the giant SCC whose protection is nearly as good. Furthermore, we prove that the optimal selection of sets of given size is NP-Hard but can be approached with heuristics, yielding respectable results. Our model contributes to a better understanding of software package repositories and could also be applied to other systems with a similar structure.
SP  - 1
EP  - 13
JF  - IEEE Transactions on Network Science and Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/tnse.2023.3260880
ER  -

TY  - CHAP
AU  - Deshmukh, Anand B.; Chopade, Pallavi Devidas; Yadav, Rajeev; Chavhan, Gajanan H.; Ingle, Pravin E.; Ajani, Samir N.
TI  - Collaborative Decision-Making for Adaptive Cybersecurity Policies
PY  - 2025
AB  - NA
SP  - 565
EP  - 582
JF  - Smart Innovation, Systems and Technologies
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-981-96-0147-9_46
ER  -

TY  - JOUR
AU  - Spinellis, Diomidis; Louridas, Panos; Kechagia, Maria
TI  - Software evolution: the lifetime of fine-grained elements
PY  - 2021
AB  - A model regarding the lifetime of individual source code lines or tokens can estimate maintenance effort, guide preventive maintenance, and, more broadly, identify factors that can improve the efficiency of software development. We present methods and tools that allow tracking of each line’s or token’s birth and death. Through them, we analyze 3.3 billion source code element lifetime events in 89 revision control repositories. Statistical analysis shows that code lines are durable, with a median lifespan of about 2.4 years, and that young lines are more likely to be modified or deleted, following a Weibull distribution with the associated hazard rate decreasing over time. This behavior appears to be independent from specific characteristics of lines or tokens, as we could not determine factors that influence significantly their longevity across projects. The programming language, and developer tenure and experience were not found to be significantly correlated with line or token longevity, while project size and project age showed only a slight correlation.
SP  - 1
EP  - 33
JF  - PeerJ. Computer science
VL  - 7
IS  - NA
PB  -
DO  - 10.7717/peerj-cs.372
ER  -

TY  - JOUR
AU  - Barcomb, Ann; Stol, Klaas-Jan; Fitzgerald, Brian; Riehle, Dirk
TI  - Managing Episodic Volunteers in Free/Libre/Open Source Software Communities
PY  - 2020
AB  - We draw on the concept of episodic volunteering (EV) from the general volunteering literature to identify practices for managing EV in free/libre/open source software (FLOSS) communities. Infrequent but ongoing participation is widespread, but the practices that community managers are using to manage EV, and their concerns about EV, have not been previously documented. We conducted a policy Delphi study involving 24 FLOSS community managers from 22 different communities. Our panel identified 16 concerns related to managing EV in FLOSS, which we ranked by prevalence. We also describe 65 practices for managing EV in FLOSS. Almost three-quarters of these practices are used by at least three community managers. We report these practices using a systematic presentation that includes context, relationships between practices, and concerns that they address. These findings provide a coherent framework that can help FLOSS community managers to better manage episodic contributors.
SP  - 1
EP  - 18
JF  - IEEE Transactions on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Curto-Millet, Daniel; Corsín Jiménez, Alberto
TI  - The sustainability of open source commons
PY  - 2022
AB  - The sustainability of commons has benefited from Elinor Ostrom's analysis of shared resources. In her work, sustainability was described in a univocal manner–successful or not–depending on the common's long-term capacity to survive within an uncertain environment. In recent years, this view of sustainability has been applied to the study of digital commons, including open source. Building on more recent work on sustainability, this paper challenges this univocal conception of sustainability in open source. Through a critical review of the literature, it unveils the coexistence of multiple notions of sustainability in open source and proposes a typology of sustainabilities (resource-based, infrastructural, and interactional). We propose that the degree and quality of the interrelationship between these different types of sustainability need to be explored, leading to the theorisation of three possible scenarios (trade-offs, synergy, and independence). We discuss and put forward a research agenda.
SP  - 763
EP  - 781
JF  - European Journal of Information Systems
VL  - 32
IS  - 5
PB  -
DO  - 10.1080/0960085x.2022.2046516
ER  -

TY  - BOOK
AU  - Lotter, Adriaan; Licorish, Sherlock A.; Savarimuthu, Bastin Tony Roy; Meldrum, Sarah
TI  - ASWEC - Code Reuse in Stack Overflow and Popular Open Source Java Projects
PY  - 2018
AB  - Solutions provided in Question and Answer (Q&A) websites such as Stack Overflow are regularly used in Open Source Software (OSS). However, many developers are unaware that both Stack Overflow and OSS are governed by licenses. Hence, developers reusing code from Stack Overflow for their OSS projects may violate licensing agreements if their attributions are not correct. Additionally, if code migrates from one OSS through Stack Overflow to another OSS, then complex licensing issues are likely to exist. Such forms of software reuse also have implications for future software maintenance, particularly where developers have poor understanding of copied code. This paper investigates code reuse between these two platforms (i.e., Stack Overflow and OSS), with the aim of providing insights into this issue. This study mined 151,946 Java code snippets from Stack Overflow, 16,617 Java files from 12 of the top weekly listed projects on SourceForge and GitHub, and 39,616 Java files from the top 20 most popular Java projects on SourceForge. Our analyses were aimed at finding the number of clones (indicating reuse) (a) within Stack Overflow posts, (b) between Stack Overflow and popular Java OSS projects, and (c) between the projects. Outcomes reveal that there was up to 3.3% code reuse within Stack Overflow, while 1.0% of Stack Overflow code was reused in recent popular Java projects and 2.3% in those projects that were more established. Reuse across projects was much higher, accounting for as much as 77.2%. Our outcomes have implication for strategies aimed at introducing strict quality assurance measures to ensure the appropriateness of code reuse, and licensing requirements awareness.
SP  - 141
EP  - 150
JF  - 2018 25th Australasian Software Engineering Conference (ASWEC)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/aswec.2018.00027
ER  -

TY  - NA
AU  - McGuire, Sean; Schultz, Erin; Ayoola, Bimpe; Ralph, Paul
TI  - Sustainability is Stratified: Toward a Better Theory of Sustainable Software Engineering
PY  - 2023
AB  - Background: Sustainable software engineering (SSE) means creating software in a way that meets present needs without undermining our collective capacity to meet our future needs. It is typically conceptualized as several intersecting dimensions or "pillars": environmental, social, economic, technical and individual. However, these pillars are theoretically underdeveloped and require refinement. Objectives: The objective of this paper is to generate a better theory of SSE. Method: First, a scoping review was conducted to understand the state of research on SSE and identify existing models thereof. Next, a meta-synthesis of qualitative research on SSE was conducted to critique and improve the existing models identified. Results: 961 potentially relevant articles were extracted from five article databases. These articles were de-duplicated and then screened independently by two screeners, leaving 243 articles to examine. Of these, 109 were non-empirical, the most common empirical method was systematic review, and no randomized controlled experiments were found. Most papers focus on ecological sustainability (158) and the sustainability of software products (148) rather than processes. A meta-synthesis of 36 qualitative studies produced several key propositions, most notably, that sustainability is stratified (has different meanings at different levels of abstraction) and multisystemic (emerges from interactions among multiple social, technical, and sociotechnical systems). Conclusion: The academic literature on SSE is surprisingly non-empirical. More empirical evaluations of specific sustainability interventions are needed. The sustainability of software development products and processes should be conceptualized as multisystemic and stratified, and assessed accordingly.
SP  - 1996
EP  - 2008
JF  - 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icse48619.2023.00169
ER  -

TY  - JOUR
AU  - Choi, Seongsook
TI  - The case for open source software: The interactional discourse lab
PY  - 2015
AB  - Computational techniques and software applications for the quantitative content analysis of texts are now well established, and many qualitative data software applications enable the manipulation of input variables and the visualization of complex relations between them via interactive and informative graphical interfaces. Although advances in text analysis have helped researchers mine text data for semantic content and identify language patterns in text with greater facility, interactional dynamics and patterns of talk have been neglected. This article introduces a new open-source tool, Interactional Discourse Lab. This tool is designed to map dynamics in spoken interaction and to represent them in easily accessible visual form, capturing aspects such as the frequency and patterning of exchanges, and the distribution of turns and discourse features. It is designed to contribute, with other analytical tools such as those used in text analysis, to the development of interactional topographies. The paper sets the tool within a wider case for the development of open-source software in applied linguistics as a platform for methodological innovation.
SP  - 100
EP  - 120
JF  - Applied Linguistics
VL  - 37
IS  - 1
PB  -
DO  - 10.1093/applin/amv066
ER  -

TY  - JOUR
AU  - Mirazchiyski, Plamen Vladkov
TI  - RALSA: Design and Implementation
PY  - 2021
AB  - International large-scale assessments (ILSAs) provide invaluable information for researchers and policy makers. Analysis of their data, however, requires methods that go beyond the usual analysis techniques assuming simple random sampling. Several software packages that serve this purpose are available. One such is the R Analyzer for Large-Scale Assessments (RALSA), a newly developed R package. The package can work with data from a large number of ILSAs. It was designed for user experience and is suitable for analysts who lack technical expertise and/or familiarity with the R programming language and statistical software. This paper presents the technical aspects of RALSA: the overall design and structure of the package, its internal organization, and the structure of the analysis and data preparation functions. The use of the data.table package for memory efficiency, speed, and embedded computations is explained through examples. The central aspect of the paper is the utilization of code reuse practices to achieve consistency, efficiency, and safety of the computations performed by the analysis functions of the package. The comprehensive output system to produce multi-sheet MS Excel workbooks is presented and its workflow explained. The paper also explains how the graphical user interface is constructed and how it is linked to the data preparation and analysis functions available in the package.
SP  - 233
EP  - 248
JF  - Psych
VL  - 3
IS  - 2
PB  -
DO  - 10.3390/psych3020018
ER  -

TY  - CONF
AU  - Catolino, Gemma; Palomba, Fabio; Tamburri, Damian A.; Serebrenik, Alexander; Ferrucci, Filomena
TI  - ICSE-SEIS - Refactoring community smells in the wild: the practitioner's field manual
PY  - 2020
AB  - Community smells have been defined as sub-optimal organizational structures that may lead to social debt. Previous studies have shown that they are highly diffused in both open- and closed-source projects, are perceived as harmful by practitioners, and can even lead to the introduction of technical debt in source code. Despite the presence of this body of research, little is known on the practitioners' perceived prominence of community smells in practice as well as on the strategies adopted to deal with them. This paper aims at bridging this gap by proposing an empirical study in which 76 software practitioners are inquired on (i) the prominence of four well-known community smells, i.e., Organizational Silo, Black Cloud, Lone Wolf, and Radio Silence, in their contexts and (ii) the methods they adopted to "refactor" them. Our results first reveal that community smells frequently manifest themselves in software projects and, more importantly, there exist specific refactoring practices to deal with each of the considered community smells.
SP  - 25
EP  - 34
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Weeraddana, Nimmi Rashinika; Alfadel, Mahmoud; McIntosh, Shane
TI  - Dependency-Induced Waste in Continuous Integration: An Empirical Study of Unused Dependencies in the npm Ecosystem
PY  - 2024
AB  - Modern software systems are increasingly dependent upon code from external packages (i.e., dependencies). Building upon external packages allows software reuse to span across projects seamlessly. Package maintainers regularly release updated versions to provide new features, fix defects, and address security vulnerabilities. Due to the potential for regression, managing dependencies is not just a trivial matter of selecting the latest versions. Since it is perceived to be less risky to retain a dependency than remove it, as projects evolve, they tend to accrue dependencies, exacerbating the difficulty of dependency management. It is not uncommon for a considerable proportion of external packages to be unused by the projects that list them as a dependency. Although such unused dependencies are not required to build and run the project, updates to their dependency specifications will still trigger Continuous Integration (CI) builds. The CI builds that are initiated by updates to unused dependencies are fundamentally wasteful. Considering that CI build time is a finite resource that is directly associated with project development and service operational costs, understanding the consequences of unused dependencies within this CI context is of practical importance. In this paper, we study the CI waste that is generated by updates to unused dependencies. We collect a dataset of 20,743 commits that are solely updating dependency specifications (i.e., the package.json file), spanning 1,487 projects that adopt npm for managing their dependencies. Our findings illustrate that 55.88% of the CI build time that is associated with dependency updates is only triggered by unused dependencies. At the project level, the median project spends 56.09% of its dependency-related CI build time on updates to unused dependencies. For projects that exceed the budget of free build minutes, we find that the median percentage of billable CI build time that is wasted due to unused-dependency commits is 85.50%. Moreover, we find that automated bots are the primary producers of dependency-induced CI waste, contributing 92.93% of the CI build time that is spent on unused dependencies. The popular Dependabot is responsible for updates to unused dependencies that account for 74.52% of that waste. To mitigate the impact of unused dependencies on CI resources, we introduce Dep-sCImitar, an approach to cut down wasted CI time by identifying and skipping CI builds that are triggered due to unused-dependency commits. A retrospective evaluation of the 20,743 studied commits shows that Dep-sCImitar reduces wasted CI build time by 68.34% by skipping wasteful builds with a precision of 94%.
SP  - 2632
EP  - 2655
JF  - Proceedings of the ACM on Software Engineering
VL  - 1
IS  - FSE
PB  -
DO  - 10.1145/3660823
ER  -

TY  - JOUR
AU  - Li, Xiaozhou; Moreschini, Sergio; Zhang, Zheying; Taibi, Davide
TI  - Exploring factors and metrics to select open source software components for integration: An empirical study
PY  - 2022
AB  - Open Source Software (OSS) is nowadays used and integrated in most of the commercial products. However, the selection of OSS projects for integration is not a simple process, mainly due to a lack of clear selection models and lack of information from the OSS portals. We investigate the factors and metrics that practitioners currently consider when selecting OSS. We also investigate the source of information and portals that can be used to assess the factors, as well as the possibility to automatically extract such information with APIs. We elicited the factors and the metrics adopted to assess and compare OSS performing a survey among 23 experienced developers who often integrate OSS in the software they develop. Moreover, we investigated the APIs of the portals adopted to assess OSS extracting information for the most starred 100K projects in GitHub. We identified a set consisting of 8 main factors and 74 sub-factors, together with 170 related metrics that companies can use to select OSS to be integrated in their software projects. Unexpectedly, only a small part of the factors can be evaluated automatically, and out of 170 metrics, only 40 are available, of which only 22 returned information for all the 100K projects. Therefore, we recommend project maintainers and project repositories to pay attention to provide information for the project they are hosting, so as to increase the likelihood of being adopted. OSS selection can be partially automated, by extracting the information needed for the selection from portal APIs. OSS producers can benefit from our results by checking if they are providing all the information commonly required by potential adopters. Developers can benefit from our results, using the list of factors we selected as a checklist during the selection of OSS, or using the APIs we developed to automatically extract the data from OSS projects.
SP  - 111255
EP  - 111255
JF  - Journal of Systems and Software
VL  - 188
IS  - NA
PB  -
DO  - 10.1016/j.jss.2022.111255
ER  -

TY  - JOUR
AU  - Racero, F. José; Bueno, Salvador; Gallego, M. Dolores
TI  - Predicting Students’ Behavioral Intention to Use Open Source Software: A Combined View of the Technology Acceptance Model and Self-Determination Theory
PY  - 2020
AB  - This study focuses on students’ behavioral intention to use Open Source Software (OSS). The article examines how students, who were trained in OSS, are motivated to continue using it. A conceptual model based on Self-Determination Theory and the Technological Acceptance Model (TAM) was defined in order to test the behavioral intention to use OSS, comprising six constructs: (1) autonomy, (2) competence, (3) relatedness, (4) perceived ease of use, (5) perceived usefulness and (6) behavioral intention to use. A survey was designed for data collection. The participants were recent secondary school graduates, and all of them had received mandatory OSS training. A total of 352 valid responses were used to test the proposed structural model, which was performed using the Lisrel software. The results clearly confirmed the positive influence of the intrinsic motivations; autonomy and relatedness, to improve perceptions regarding the usefulness and ease of use of OSS, and; therefore, on behavioral intention to use OSS. In addition, the implications and limitations of this study are considered.
SP  - 2711
EP  - NA
JF  - Applied Sciences
VL  - 10
IS  - 8
PB  -
DO  - 10.3390/app10082711
ER  -

TY  - CHAP
AU  - Capiluppi, Andrea; Stol, Klaas-Jan; Boldyreff, Cornelia
TI  - Software Reuse in Open Source: A Case Study
PY  - 2013
AB  - A promising way to support software reuse is based on Component-Based Software Development (CBSD). Open Source Software (OSS) products are increasingly available that can be freely used in product development. However, OSS communities still face several challenges before taking full advantage of the “reuse mechanism”: many OSS projects duplicate effort, for instance when many projects implement a similar system in the same application domain and in the same topic. One successful counter-example is the FFmpeg multimedia project; several of its components are widely and consistently reused in other OSS projects. Documented is the evolutionary history of the various libraries of components within the FFmpeg project, which presently are reused in more than 140 OSS projects. Most use them as black-box components; although a number of OSS projects keep a localized copy in their repositories, eventually modifying them as needed (white-box reuse). In both cases, the authors argue that FFmpeg is a successful project that provides an excellent exemplar of a reusable library of OSS components.
SP  - 151
EP  - 176
JF  - Open Source Software Dynamics, Processes, and Applications
VL  - NA
IS  - NA
PB  -
DO  - 10.4018/978-1-4666-2937-0.ch008
ER  -

TY  - JOUR
AU  - Llano, Stephen M.
TI  - Intercollegiate debate: stick a fork in it!
PY  - 2025
AB  - NA
SP  - 47
EP  - 65
JF  - Argumentation and Advocacy
VL  - 61
IS  - 1
PB  -
DO  - 10.1080/10511431.2024.2442837
ER  -

TY  - JOUR
AU  - Ilyas, Muhammad; Khan, Siffat Ullah; Rashid, Nasir
TI  - Empirical Validation of Software Integration Practices in Global Software Development
PY  - 2020
AB  - NA
SP  - 1
EP  - 23
JF  - SN Computer Science
VL  - 1
IS  - 3
PB  -
DO  - 10.1007/s42979-020-00175-2
ER  -

TY  - JOUR
AU  - Rahman, Shah Mohammad Motiur; Mollah, Saiful Alam; Anirban, Shikha; Rahman, Habibur; Rahman, Mostafijur; Hassan, Maruf; Sharif, Hasan
TI  - OSCRUM: A Modified Scrum for Open Source Software Development
PY  - 2019
AB  - NA
SP  - NA
EP  - NA
JF  - International Journal of Simulation: Systems, Science & Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.5013/ijssst.a.19.03.20
ER  -

TY  - JOUR
AU  - Paxton, Alexandra; Varoquaux, Nelle; Holdgraf, Chris; Geiger, R Stuart
TI  - Community, Time, and (Con)text: A Dynamical Systems Analysis of Online Communication and Community Health among Open-Source Software Communities.
PY  - 2022
AB  - Free and open-source software projects have become essential digital infrastructure over the past decade. These projects are largely created and maintained by unpaid volunteers, presenting a potential vulnerability if the projects cannot recruit and retain new volunteers. At the same time, their development on open collaborative development platforms provides a nearly complete record of the community's interactions; this affords the opportunity to study naturally occurring language dynamics at scale and in a context with massive real-world impact. The present work takes a dynamical systems view of language to understand the ways in which communicative context and community membership shape the emergence and impact of language use, specifically sentiment and expressions of gratitude. We then present evidence that these language dynamics shape newcomers' likelihood of returning, although the specific impacts of different community responses are crucially modulated by the context of the newcomer's first contact with the community.
SP  - e13134
EP  - NA
JF  - Cognitive Science
VL  - 46
IS  - 5
PB  -
DO  - 10.1111/cogs.13134
ER  -

TY  - JOUR
AU  - Islam, A. K. M. Najmul; Mäntymäki, Matti; Turunen, Marja
TI  - Why do blockchains split? An actor-network perspective on Bitcoin splits
PY  - 2019
AB  - This paper investigates the focal actors in a blockchain network and their heterogeneity in splits. Disagreements in blockchain communities often lead to splits in both the blockchain and the community. We use three key elements of the actor-network theory (punctualization, translation, and actor heterogeneity) and employ case study methodology to examine Bitcoin splits. We identify several human actors, such as miners, developers, merchants, and investors, as well as non-human actors, including blockchain, exchanges, hardware manufacturers, and wallets, involved in Bitcoin splits. Our results show that the consolidation of actors in homogeneous groups plays a key role in blockchain splits. We further describe how the human and non-human actors' fluid moves into micro and macro actor positions in the network affect the development of the split. In addition, we discuss the roles of these actors and their engagement in forming micro and macro agencies in blockchain splits.
SP  - 119743
EP  - NA
JF  - Technological Forecasting and Social Change
VL  - 148
IS  - NA
PB  -
DO  - 10.1016/j.techfore.2019.119743
ER  -

TY  - JOUR
AU  - Squire, Megan
TI  - How the FLOSS Research Community Uses Email Archives
PY  - 2012
AB  - Artifacts of the software development process, such as source code or emails between developers, are a frequent object of study in empirical software engineering literature. One of the hallmarks of free, libre, and open source software (FLOSS) projects is that the artifacts of the development process are publicly-accessible and therefore easily collected and studied. Thus, there is a long history in the FLOSS research community of using these artifacts to gain understanding about the phenomenon of open source software, which could then be compared to studies of software engineering more generally. This paper looks specifically at how the FLOSS research community has used email artifacts from free and open source projects. It provides a classification of the relevant literature using a publicly-available online repository of papers about FLOSS development using email. The outcome of this paper is to provide a broad overview for the software engineering and FLOSS research communities of how other researchers have used FLOSS email message artifacts in their work.
SP  - 37
EP  - 59
JF  - International Journal of Open Source Software and Processes
VL  - 4
IS  - 1
PB  -
DO  - 10.4018/jossp.2012010103
ER  -

TY  - JOUR
AU  - Hucka, Michael; Graham, Matthew J.
TI  - Software search is not a science, even among scientists: A survey of how scientists and engineers find software
PY  - 2018
AB  - Improved software discovery is a prerequisite for greater software reuse: after all, if someone cannot find software for a particular task, they cannot reuse it. Understanding people’s approaches and preferences when they look for software could help improve facilities for software discovery. We surveyed people working in several scientific and engineering fields to better understand their approaches and selection criteria. We found that even among highly-trained people, the rudimentary approaches of relying on general Web searches, the opinions of colleagues, and the literature were still the most commonly used. However, those who were involved in software development differed from nondevelopers in their use of social help sites, software project repositories, software catalogs, and organization-specific mailing lists or forums. For example, software developers in our sample were more likely to search in community sites such as Stack Overflow even when seeking ready-to-run software rather than source code, and likewise, asking colleagues was significantly more important when looking for ready-to-run software. Our survey also provides insight into the criteria that matter most to people when they are searching for ready-to-run software. Finally, our survey also identifies some factors that can prevent people from finding software.
SP  - 171
EP  - 191
JF  - Journal of Systems and Software
VL  - 141
IS  - NA
PB  -
DO  - 10.1016/j.jss.2018.03.047
ER  -

TY  - JOUR
AU  - Sojer, Manuel; Henkel, Joachim
TI  - License risks from ad hoc reuse of code from the internet
PY  - 2011
AB  - Software developers' reuse of code from the Internet bears legal and economic risks for their employers.
SP  - 74
EP  - 81
JF  - Communications of the ACM
VL  - 54
IS  - 12
PB  -
DO  - 10.1145/2043174.2043193
ER  -

TY  - JOUR
AU  - Borg, Markus; Chatzipetrou, Panagiota; Wnuk, Krzysztof; Alégroth, Emil; Gorschek, Tony; Papatheocharous, Efi; Shah, Syed M.; Axelsson, Jakob
TI  - Selecting component sourcing options: A survey of software engineering’s broader make-or-buy decisions
PY  - 2019
AB  - Context: Component-based software engineering (CBSE) is a common approach to develop and evolve contemporary software systems. When evolving a system based on components, make-or-buy decisions are ...
SP  - 18
EP  - 34
JF  - Information and Software Technology
VL  - 112
IS  - NA
PB  -
DO  - 10.1016/j.infsof.2019.03.015
ER  -

TY  - NA
AU  - Businge, John; Zerouali, Ahmed; Decan, Alexandre; Mens, Tom; Demeyer, Serge; De Roover, Coen
TI  - Variant Forks - Motivations and Impediments
PY  - 2022
AB  - Social coding platforms centred around git provide explicit facilities to share code between projects: forks, pull requests, cherry-picking to name but a few. Variant forks are an interesting phenomenon in that respect, as they permit for different projects to peacefully co-exist, yet explicitly acknowledge the common ancestry. Several researchers analysed forking practices on open source platforms and observed that variant forks get created frequently. However, little is known on the motivations for launching such a variant fork. Is it mainly technical (e.g., diverging features), governance (e.g., diverging interests), legal (e.g., diverging licences), or do other factors come into play? We report the results of an exploratory qualitative analysis on the motivations behind creating and maintaining variant forks. We surveyed 105 maintainers of different active open source variant projects hosted on GitHub. Our study extends previous findings, identifying a number of fine-grained common motivations for launching a variant fork and listing concrete impediments for maintaining the co-existing projects.
SP  - 867
EP  - 877
JF  - 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/saner53432.2022.00105
ER  -

TY  - CHAP
AU  - Lundell, Björn; Gamalielsson, Jonas; Tengblad, Stefan; Yousefi, Bahram Hooshyar; Fischer, Thomas; Johansson, Gert; Rodung, Bengt; Mattsson, Anders; Oppmark, Johan; Gustavsson, Tomas; Feist, Jonas; Landemoo, Stefan; Lonroth, Erik
TI  - OSS - Addressing Lock-in, Interoperability, and Long-Term Maintenance Challenges Through Open Source: How Can Companies Strategically Use Open Source?
PY  - 2017
AB  - This industry paper reports on how strategic use of open source in company contexts can provide effective support for addressing the fundamental challenges of lock-in, interoperability, and longevity of software and associated digital assets. The fundamental challenges and an overview of an ongoing collaborative research project are presented. Through a conceptual model for open source usage in company contexts we characterise how companies engage with open source and elaborate on how the fundamental challenges can be effectively addressed through open source usage in company contexts.
SP  - 80
EP  - 88
JF  - IFIP Advances in Information and Communication Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-57735-7_9
ER  -

TY  - JOUR
AU  - Tiwari, Deepika; Zhang, Long; Monperrus, Martin; Baudry, Benoit
TI  - Production Monitoring to Improve Test Suites
PY  - 2022
AB  - In this article, we propose to use production executions to improve the quality of testing for certain methods of interest for developers. These methods can be methods that are not covered by the existing test suite or methods that are poorly tested. We devise an approach called pankti which monitors applications as they execute in production and then automatically generates differential unit tests, as well as derived oracles, from the collected data. pankti’s monitoring and generation focuses on one single programming language, Java. We evaluate it on three real-world, open-source projects: a videoconferencing system, a PDF manipulation library, and an e-commerce application. We show that pankti is able to generate differential unit tests by monitoring target methods in production and that the generated tests improve the quality of the test suite of the application under consideration.
SP  - 1
EP  - 17
JF  - IEEE Transactions on Reliability
VL  - 71
IS  - 3
PB  -
DO  - 10.1109/tr.2021.3101318
ER  -

TY  - NA
AU  - Souza, Hugo Henrique Fumero de; Wiese, Igor; Steinmacher, Igor; Ré, Reginaldo
TI  - A characterization study of testing contributors and their contributions in open source projects.
PY  - 2022
AB  - Even though open source projects have some different characteristics from projects in the industry, the commitment of maintainers and contributors to achieve a high level of software quality is constant. Therefore, tests are among the main practices of the communities. Thus, motivating contributors to write new tests and maintain regression tests during testing activities is essential for the project's health. The objective of our work is to characterize testers and their contributions to open source projects as part of a broad study about testers' motivation. Thus, we conducted a study with 3,936 repositories and 7 different and important programming languages (C, C++, C#, Java, Javascript, Python, and Ruby), analyzing a total of 4,409,142 contributions to classify contributing members and their contributions. Our results show that test-only contributors exist, regardless of programming language or project. We conclude that, despite the unfavorable scenario, there are contributors who feel motivated and dedicate their time and effort to contribute to new tests or to the evolution of existing tests.
SP  - 95
EP  - 105
JF  - Proceedings of the XXXVI Brazilian Symposium on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3555228.3555244
ER  -

TY  - JOUR
AU  - Praschl, Christoph; Pointner, Andreas; Baumgartner, David; Zwettler, Gerald
TI  - Imaging framework: An interoperable and extendable connector for image-related Java frameworks
PY  - 2021
AB  - The number of computer vision and image processing tasks has increased during the last years. Although Python is most of the time the first choice in this area, there are situations where the utilization of another programming language such as Java should be preferred. For this reason, multiple Java-based frameworks such as OpenIMAJ, ND4J, or multiple OpenCV wrappers are available. Unfortunately, these frameworks are not interoperable at all. In this work, the open-source Imaging Framework is introduced to solve exactly this problem. The project features a concept for combining multiple frameworks and provides an interoperable and extendable foundation to 9 image-related projects with 10 different image representations in Java.
SP  - 100863
EP  - NA
JF  - SoftwareX
VL  - 16
IS  - NA
PB  -
DO  - 10.1016/j.softx.2021.100863
ER  -

TY  - JOUR
AU  - Constantino, Kattiana; Souza, Maurício; Zhou, Shurui; Figueiredo, Eduardo; Kästner, Christian
TI  - Perceptions of open-source software developers on collaborations: An interview and survey study
PY  - 2021
AB  - With the emergence of social coding platforms, collaboration has become a key and dynamic aspect to the success of software projects. In such platforms, developers have to collaborate and deal with issues of collaboration in open‐source software development. Although collaboration is challenging, collaborative development produces better software systems than any developer could produce alone. Several approaches have investigated collaboration challenges, for instance, by proposing or evaluating models and tools to support collaborative work. Despite the undeniable importance of the existing efforts in this direction, there are few works on collaboration from perspectives of developers. In this work, we aim to investigate the perceptions of open‐source software developers on collaborations, such as motivations, techniques, and tools to support global, productive, and collaborative development. Following an ad hoc literature review, an exploratory interview study with 12 open‐source software developers from GitHub, our novel approach for this problem also relies on an extensive survey with 121 developers to confirm or refute the interview results. We found different collaborative contributions, such as managing change requests. Besides, we observed that most collaborators prefer to collaborate with the core team instead of their peers. We also found that most collaboration happens in software development (60%) and maintenance (47%) tasks. Furthermore, despite personal preferences to work independently, developers still consider collaborating with others in specific task categories, for instance, software development. Finally, developers also expressed the importance of the social coding platforms, such as GitHub, to support maintainers, and contributors in making decisions and developing tasks of the projects. Therefore, these findings may help project leaders optimize the collaborations among developers and reduce entry barriers. Moreover, these findings may support the project collaborators in understanding the collaboration process and engaging others in the project.
SP  - NA
EP  - NA
JF  - Journal of Software: Evolution and Process
VL  - 35
IS  - 5
PB  -
DO  - 10.1002/smr.2393
ER  -

TY  - JOUR
AU  - Lin, Jiahuei; Zhang, Haoxiang; Adams, Bram; Hassan, Ahmed E.
TI  - Upstream bug management in Linux distributions
PY  - 2022
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 27
IS  - 6
PB  -
DO  - 10.1007/s10664-022-10173-y
ER  -

TY  - CHAP
AU  - Ali, Tarek A.; Nasr, Eman S.; Gheith, Mervat
TI  - CrowdSWD: A Novel Framework for Crowdsourcing Software Development Inspired by the Concept of Biological Metaphor
PY  - 2017
AB  - Crowdsourcing software development is a broad term that describes large-scale distributed systems that comprise many computing elements, each of which may have their own individual characteristics, objectives, and actions. Our society increasingly depends on such systems, in which collections of heterogeneous computing elements are tightly entangled with human and social structures to plan collective intelligence. The premise of this research is that existing frameworks for crowdsourcing software development are not powerful enough to cover large classes of aspects-relevant problems. To address this, we explored one instance of system development life cycle, which can be used to solve those problems. The outputs were in the form of (1) mechanisms for modeling the crowdsourcing software that empowers a crowd socially to solve complex problems that require effective management among participants with relevant abilities and limitations, (2) modeling supportive environments for crowdsourcing software, (3) modeling an adaptive engine that learns relevant characteristics of participants based on observations of their behavior and learned models, and (4) designing 34 heterogeneous computing elements that can be used in crowdsourcing software. A single experimental study, presented in this chapter, provides a richness of data and can lead to a deep understanding of a phenomenon in a single context.
SP  - 171
EP  - 208
JF  - Computer Communications and Networks
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-54325-3_8
ER  -

TY  - NA
AU  - Ding, Steven H. H.; Fung, Benjamin C. M.; Charland, Philippe
TI  - KDD - Kam1n0: MapReduce-based Assembly Clone Search for Reverse Engineering
PY  - 2016
AB  - Assembly code analysis is one of the critical processes for detecting and proving software plagiarism and software patent infringements when the source code is unavailable. It is also a common practice to discover exploits and vulnerabilities in existing software. However, it is a manually intensive and time-consuming process even for experienced reverse engineers. An effective and efficient assembly code clone search engine can greatly reduce the effort of this process, since it can identify the cloned parts that have been previously analyzed. The assembly code clone search problem belongs to the field of software engineering. However, it strongly depends on practical nearest neighbor search techniques in data mining and databases. By closely collaborating with reverse engineers and Defence Research and Development Canada (DRDC), we study the concerns and challenges that make existing assembly code clone approaches not practically applicable from the perspective of data mining. We propose a new variant of LSH scheme and incorporate it with graph matching to address these challenges. We implement an integrated assembly clone search engine called Kam1n0. It is the first clone search engine that can efficiently identify the given query assembly function's subgraph clones from a large assembly code repository. Kam1n0 is built upon the Apache Spark computation framework and Cassandra-like key-value distributed storage. A deployed demo system is publicly available. Extensive experimental results suggest that Kam1n0 is accurate, efficient, and scalable for handling large volume of assembly code.
SP  - 461
EP  - 470
JF  - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/2939672.2939719
ER  -

TY  - JOUR
AU  - Decan, Alexandre; Mens, Tom; Zerouali, Ahmed; De Roover, Coen
TI  - Back to the Past – Analysing Backporting Practices in Package Dependency Networks
PY  - 2022
AB  - NA
SP  - 4087
EP  - 4099
JF  - IEEE Transactions on Software Engineering
VL  - 48
IS  - 10
PB  -
DO  - 10.1109/tse.2021.3112204
ER  -

TY  - JOUR
AU  - Zhou, Hongli; Zhang, Xiaodong; Hu, Yang
TI  - Robustness of open source product innovation community’s knowledge collaboration network under the dynamic environment
PY  - 2020
AB  - NA
SP  - 122888
EP  - NA
JF  - Physica A: Statistical Mechanics and its Applications
VL  - 540
IS  - NA
PB  -
DO  - 10.1016/j.physa.2019.122888
ER  -

TY  - NA
AU  - Heinemann, Lars
TI  - Effective and Efficient Reuse with Software Libraries
PY  - 2012
AB  - This thesis empirically analyzes the extent and nature of third-party code reuse in practice. Motivated by the findings, a dynamic approach for detecting functionally similar code is evaluated. An API recommendation system is introduced that assists developers during programming with software libraries by providing context-specific suggestions for API methods within the development environment. This principle is transferred to model-based development.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Foss, Nicolai J.; Frederiksen, Lars; Rullani, Francesco
TI  - Problem‐formulation and problem‐solving in self‐organized communities: How modes of communication shape project behaviors in the free open‐source software community
PY  - 2015
AB  - Research summary: Building on the problem-solving perspective, we study behaviors related to projects and the communication-based antecedents of such behaviors in the free open-source software (FOSS) community. We examine two kinds of problem/project-behaviors: Individuals can set up projects around the formulation of new problems or join existing projects and define and/or work on subproblems within an existing problem. The choice between these two behaviors is influenced by the mode of communication. A communication mode with little a priori structure is the best mode for communicating about new problems (i.e., formulating a problem); empirically, it is associated with project launching behaviors. In contrast, more structured communication fits subproblems better and is related to project joining behaviors. Our hypotheses derive support from data from the FOSS community. Managerial summary: We study how the way in which individuals communicate influences the project-behaviors they engage in. We find that relatively unstructured communication is associated with setting up new projects, while communication that is structured around an artifact is associated with joining projects. Our findings hold implications for understanding how management may influence project behaviors and problem-solving: Firms that need to concentrate on more incremental problem-solving efforts (e.g., because a sufficient number of attractive problems have already been defined) should create environments in which interaction is undertaken mainly via artifacts. On the other hand, if firms seek to generate new problems (e.g., new strategic opportunities), they should create environments in which open-ended, verbal conversation is relatively more important than artifact-based communication. Copyright © 2015 John Wiley & Sons, Ltd.
SP  - 2589
EP  - 2610
JF  - Strategic Management Journal
VL  - 37
IS  - 13
PB  -
DO  - 10.1002/smj.2439
ER  -

TY  - CHAP
AU  - Kyriakou, Kyriakos-Ioannis D.; Tselikas, Nikolaos D.; Kapitsaki, Georgia M.
TI  - OSS - Improving C/C++ Open Source Software Discoverability by Utilizing Rust and Node.js Ecosystems
PY  - 2018
AB  - Discovering Open Source Software (OSS) components efficiently is not always an easy task. Node.js is a popular JavaScript runtime environment, whereas Rust is widely used for system programming, and both can be utilized for OSS discovery purposes. In this work, we examine whether Rust and Node.js can be used, along with their respective tooling and package repositories, in order to achieve improved discoverability of existing OSS implemented in C/C++. The paper describes how the capabilities of Rust in C/C++ interoperability can be combined with novel compilation techniques of low-level code to asm.js and WebAssembly, in order to harness JavaScript’s popularity as the medium to publicize hard to discover C/C++ OSS. A proposed incremental methodology is presented and the main, as well as the collateral, effects of enforcing the proposed methodology in a proof-of-concept situation are examined. Our findings indicate potential increase in discoverability, code quality, portability, along with viable performance degradation of portable binaries, demonstrating 8.7 times slower execution compared to machine code, in a worst-case scenario.
SP  - 181
EP  - 192
JF  - IFIP Advances in Information and Communication Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-92375-8_15
ER  -

TY  - NA
AU  - Linåker, Johan; Papatheocharous, Efi; Olsson, Thomas
TI  - How to characterize the health of an Open Source Software project? A snowball literature review of an emerging practice
PY  - 2022
AB  - Motivation: Society's dependence on Open Source Software (OSS) and the communities that maintain the OSS is ever-growing. So are the potential risks of, e.g., vulnerabilities being introduced in projects not actively maintained. By assessing an OSS project's capability to stay viable and maintained over time without interruption or weakening, i.e., the OSS health, users can consider the risk implied by using the OSS as is, and if necessary, decide whether to help improve the health or choose another option. However, such assessment is complex as OSS health covers a wide range of sub-topics, and existing support is limited. Aim: We aim to create an overview of characteristics that affect the health of an OSS project and enable the assessment thereof. Method: We conduct a snowball literature review based on a start set of 9 papers, and identify 146 relevant papers over two iterations of forward and backward snowballing. Health characteristics are elicited and coded using structured and axial coding into a framework structure. Results: The final framework consists of 107 health characteristics divided among 15 themes. Characteristics address the socio-technical spectrum of the community of actors maintaining the OSS project, the software and other deliverables being maintained, and the orchestration facilitating the maintenance. Characteristics are further divided based on the level of abstraction they address, i.e., the OSS project-level specifically, or the project's overarching ecosystem of related OSS projects. Conclusion: The framework provides an overview of the wide span of health characteristics that may need to be considered when evaluating OSS health and can serve as a foundation both for research and practice.
SP  - 1
EP  - 12
JF  - Proceedings of the 18th International Symposium on Open Collaboration
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3555051.3555067
ER  -

TY  - JOUR
AU  - Augustin, Nils; Eckhardt, Andreas; de Jong, Alexander Willem
TI  - Understanding decentralized autonomous organizations from the inside
PY  - 2023
AB  - Blockchain technology is argued to drastically change the way we operate within an organizational context, with decentralized autonomous organizations (DAOs) representing a first manifestation of this ongoing trend. DAOs are characterized by an online community that builds the organization’s backbone by providing knowledge and human resources in a transparent, virtual manner, as well as the use of blockchain technology to coordinate their endeavor. Nevertheless, current research highlights the conceptual ambiguity of this emerging phenomenon, leading to potential issues for practitioners and researchers. To provide further clarity on the phenomenon, we study DAOs through the perspective of their members with a two-staged approach by combining elements of a netnographic approach and structural topic modeling. Our findings highlight several contextual features surrounding DAOs, such as their members’ underlying beliefs and views, helping to embed DAOs in existing research streams.
SP  - NA
EP  - NA
JF  - Electronic Markets
VL  - 33
IS  - 1
PB  -
DO  - 10.1007/s12525-023-00659-y
ER  -

TY  - NA
AU  - Nyman, Linus
TI  - OpenSym - Hackers on Forking
PY  - 2014
AB  - All open source licenses allow the copying of an existing body of code for use as the basis of a separate development project. This practice is commonly known as forking the code. This paper presents the results of a study in which 11 programmers were interviewed about their opinions on the right to fork and the impact of forking on open source software development. The results show that there is a general consensus among programmers' views regarding both the favourable and unfavourable aspects that stem from the right to fork. Interestingly, while all programmers noted potential downsides to the right to fork, it was seen by all as an integral component of open source software, and a right that must not be infringed regardless of circumstance or outcome.
SP  - 6
EP  - 10
JF  - Proceedings of The International Symposium on Open Collaboration
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/2641580.2641590
ER  -

TY  - JOUR
AU  - Becker, Markus C.; Rullani, Francesco; Zirpoli, Francesco
TI  - The role of digital artefacts in early stages of distributed innovation processes
PY  - 2021
AB  - NA
SP  - 104349
EP  - NA
JF  - Research Policy
VL  - 50
IS  - 10
PB  -
DO  - 10.1016/j.respol.2021.104349
ER  -

TY  - NA
AU  - Foundjem, Armstrong
TI  - ICSE (Workshops) - Cross-distribution Feedback in Software Ecosystems
PY  - 2020
AB  - Despite the proliferation of software ecosystems (SECOs), growing a sustainable and healthy SECO remains a significant challenge. One approach to mitigate this challenge is the utilization of a mechanism that collects feedback from distributors (distros) and end-users of the SECO releases. This presentation aims at investigating the effectiveness of the feedback mechanism implemented by OpenStack to address the needs of end-users and distros. I mined the OpenStack repositories and mapped 20 distros' bug-related activities. Results suggest that OpenStack releases are actively maintained for 18 months before reaching end-of-life (EOL), which makes coordination with distros difficult because distros usually provide services to their end-users for a period between 36 - 60 months before reaching EOL. Also, bugs are fixed faster by the distros (7 - 76 days) than the OpenStack community (average of 4 months). However, only 22% of the bugs addressed by OpenStack distros are pushed back upstream.
SP  - 723
EP  - 724
JF  - Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3387940.3392188
ER  -

TY  - CHAP
AU  - Asri, Ikram El; Kerzazi, Noureddine; Benhiba, Lamia; Janati, Mohammed
TI  - PRO-VE - From Periphery to Core: A Temporal Analysis of GitHub Contributors’ Collaboration Network
PY  - 2017
AB  - Open-source projects in GitHub exhibit rich temporal dynamics, and diverse contributors’ social interactions further intensify this process. In this paper, we analyze temporal patterns associated with Open Source Software (OSS) projects and how the contributor’s notoriety grows and fades over time in a core-periphery structure. In order to explore the temporal dynamics of GitHub communities we formulate a time series clustering model using both Social Network Analysis (SNA) and technical metrics. By applying an adaptive time frame incremental approach to clustering, we locate contributors in different temporal networks. We demonstrate our approach on five long-lived OSS projects involving more than 700 contributors and found that there are three main temporal shapes of attention when contributors shift from periphery to core. Our analyses provide insights into common temporal patterns of the growing OSS communities on GitHub and broaden the understanding of the dynamics and motivation of open source contributors.
SP  - 217
EP  - 229
JF  - IFIP Advances in Information and Communication Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-65151-4_21
ER  -

TY  - NA
AU  - Farah, Juan Carlos; Spaenlehauer, Basile; Lu, Xinyang; Ingram, Sandy; Gillet, Denis
TI  - An exploratory study of reactions to bot comments on GitHub
PY  - 2022
AB  - The widespread use of bots to support software development makes social coding platforms such as GitHub a particularly rich source of data for the study of human-bot interaction. Software development bots are used to automate repetitive tasks, interacting with their human counterparts via comments posted on the various discussion interfaces available on such platforms. One type of interaction supported by GitHub involves reacting to comments using predefined emoji. To investigate how users react to bot comments, we conducted an observational study comprising 54 million GitHub comments, with a particular focus on comments that elicited the laugh reaction. The results from our analysis suggest that some reaction types are not equally distributed across human and bot comments and that a bot's design and purpose influence the types of reactions it receives. Furthermore, while the laugh reaction is not exclusively used to express laughter, it can be used to convey humor when a bot behaves unexpectedly. These insights could inform the way bots are designed and help developers equip them with the ability to recognize and recover from unanticipated situations. In turn, bots could better support the communication, collaboration, and productivity of teams using social coding platforms.
SP  - 18
EP  - 22
JF  - Proceedings of the Fourth International Workshop on Bots in Software Engineering
VL  - 1172
IS  - NA
PB  -
DO  - 10.1145/3528228.3528409
ER  -

TY  - JOUR
AU  - Han, Junxiao; Zhang, Jiahao; Lo, David; Xia, Xin; Deng, Shuiguang; Wu, Minghui
TI  - Understanding Newcomers' Onboarding Process in Deep Learning Projects
PY  - 2024
AB  - Attracting and retaining newcomers are critical for the sustainable development of Open Source Software (OSS) projects. Considerable efforts have been made to help newcomers identify and overcome barriers in the onboarding process. However, fewer studies focus on newcomers’ activities before their successful onboarding. Given the rising popularity of deep learning (DL) techniques, we wonder what the onboarding process of DL newcomers is, and if there exist commonalities or differences in the onboarding process for DL and non-DL newcomers. Therefore, we reported a study to understand the growth trends of DL and non-DL newcomers, mine DL and non-DL newcomers’ activities before their successful onboarding (i.e., past activities), and explore the relationships between newcomers’ past activities and their first commit patterns and retention rates. By analyzing 20 DL projects with 9,191 contributors and 20 non-DL projects with 9,839 contributors, and conducting email surveys with contributors, we derived the following findings: 1) DL projects have attracted and retained more newcomers than non-DL projects. 2) Compared to non-DL newcomers, DL newcomers encounter more deployment, documentation, and version issues before their successful onboarding. 3) DL newcomers statistically require more time to successfully onboard compared to non-DL newcomers, and DL newcomers with more past activities (e.g., issues, issue comments, and watch) are prone to submit an intensive first commit (i.e., a commit with many source code and documentation files being modified). Based on the findings, we shed light on the onboarding process for DL and non-DL newcomers, highlight future research directions, and provide practical suggestions to newcomers, researchers, and projects.
SP  - 443
EP  - 460
JF  - IEEE Transactions on Software Engineering
VL  - 50
IS  - 3
PB  -
DO  - 10.1109/tse.2024.3353297
ER  -

TY  - JOUR
AU  - Kyriakou, Kyriakos-Ioannis D.; Tselikas, Nikolaos D.; Kapitsaki, Georgia M.
TI  - Enhancing C/C++ based OSS development and discoverability with CBRJS: A Rust/Node.js/WebAssembly framework for repackaging legacy codebases
PY  - 2019
AB  - NA
SP  - 110395
EP  - NA
JF  - Journal of Systems and Software
VL  - 157
IS  - NA
PB  -
DO  - 10.1016/j.jss.2019.110395
ER  -

TY  - JOUR
AU  - Liao, Zhifang; Huang, Xuechun; Zhang, Bolin; Wu, Jinsong; Cheng, Yu
TI  - BDGOA: A bot detection approach for GitHub OAuth Apps
PY  - 2023
AB  - As various software bots are widely used in open source software repositories, some drawbacks are coming to light, such as giving newcomers non-positive feedback and misleading empirical studies of software engineering researchers. Several techniques have been proposed by researchers to perform bot detection, but most of them are limited to identifying bots performing specific activities, let alone distinguishing between GitHub App and OAuth App. In this paper, we propose a bot detection technique for OAuth App, named BDGOA. 24 features are used in BDGOA, which can be divided into three dimensions: account information, account activity, and text similarity. To better explore the behavioral features, we define a fine-grained classification of behavioral events and introduce self-similarity to quantify the repeatability of behavioral sequence. We leverage five machine learning classifiers on the benchmark dataset to conduct bot detection, and finally choose random forest as the classifier, which achieves the highest F1-score of 95.83%. The experimental results comparing with the state-of-the-art approaches also demonstrate the superiority of BDGOA.
SP  - 181
EP  - 197
JF  - Intelligent and Converged Networks
VL  - 4
IS  - 3
PB  -
DO  - 10.23919/icn.2023.0006
ER  -

TY  - JOUR
AU  - Sayago-Heredia, Jaime; Pérez-Castillo, Ricardo; Piattini, Mario
TI  - A Systematic Mapping Study on Analysis of Code Repositories
PY  - 2021
AB  - NA
SP  - 619
EP  - 660
JF  - Informatica
VL  - 32
IS  - 3
PB  -
DO  - 10.15388/21-infor454
ER  -

TY  - NA
AU  - Choksi, Madiha Zahrah; Mandel, Ilan; Widder, David; Shvartzshnaider, Yan
TI  - The Emerging Artifacts of Centralized Open-Code
PY  - 2024
AB  - In 2022, generative model based coding assistants became widely available with the public release of GitHub Copilot. Approaches to generative coding are often critiqued within the context of advances in machine learning. We argue that tools such as Copilot are better understood when contextualized against technologies derived from the same communities and datasets. Our work traces the historical and ideological origins of free and open source code and characterizes the process of centralization. We examine three case studies —Dependabot, Crater, and Copilot— to compare the engineering, social, and legal qualities of technical artifacts derived from shared community-based labor. Our analysis focuses on the implications these artifacts create for infrastructural dependencies, community adoption, and intellectual property. Reframing generative coding assistants through a set of peer technologies broadens considerations for academics and policymakers beyond machine learning, to include the ways technical artifacts are derived from communities.
SP  - 1971
EP  - 1983
JF  - The 2024 ACM Conference on Fairness, Accountability, and Transparency
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3630106.3659019
ER  -

TY  - NA
AU  - Constantino, Kattiana; Zhou, Shurui; Souza, Maurício; Figueiredo, Eduardo; Kästner, Christian
TI  - ICGSE - Understanding collaborative software development: an interview study
PY  - 2020
AB  - In globally distributed software development, many software developers have to collaborate and deal with issues of collaboration. Although collaboration is challenging, collaborative development produces better software than any developer could produce alone. Unlike previous work which focuses on the proposal and evaluation of models and tools to support collaborative work, this paper presents an interview study aiming to understand (i) the motivations, (ii) how collaboration happens, and (iii) the challenges and barriers of collaborative software development. After interviewing twelve experienced software developers from GitHub, we found different types of collaborative contributions, such as in the management of requests for changes. Our analysis also indicates that the main barriers for collaboration are related to non-technical, rather than technical issues.
SP  - 55
EP  - 65
JF  - Proceedings of the 15th International Conference on Global Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3372787.3390442
ER  -

TY  - JOUR
AU  - Duenas, Santiago; Cosentino, Valerio; Gonzalez-Barahona, Jesus M.; San Felix, Alvaro del Castillo; Izquierdo-Cortazar, Daniel; Cañas-Díaz, Luis; García-Plaza, Alberto Pérez
TI  - GrimoireLab: A toolset for software development analytics.
PY  - 2021
AB  - Background: After many years of research on software repositories, the knowledge for building mature, reusable tools that perform data retrieval, storage and basic analytics is readily available. However, there is still room for improvement in the area of reusable tools implementing this knowledge. Goal: To produce a reusable toolset supporting the most common tasks when retrieving, curating and visualizing data from software repositories, allowing for the easy reproduction of data sets ready for more complex analytics, and sparing the researcher or the analyst of most of the tasks that can be automated. Method: Use our experience in building tools in this domain to identify a collection of scenarios where a reusable toolset would be convenient, and the main components of such a toolset. Then build those components, and refine them incrementally using the feedback from their use in both commercial, community-based, and academic environments. Results: GrimoireLab, an efficient toolset composed of five main components, supporting about 30 different kinds of data sources related to software development. It has been tested in many environments, for performing different kinds of studies, and providing different kinds of services. It features a common API for accessing the retrieved data, facilities for relating items from different data sources, semi-structured storage for easing later analysis and reproduction, and basic facilities for visualization, preliminary analysis and drill-down in the data. It is also modular, making it easy to support new kinds of data sources and analysis. Conclusions: We present a mature toolset, widely tested in the field, that can help to improve the situation in the area of reusable tools for mining software repositories. We show some scenarios where it has already been used. We expect it will help to reduce the effort for doing studies or providing services in this area, leading to advances in reproducibility and comparison of results.
SP  - e601
EP  - NA
JF  - PeerJ. Computer science
VL  - 7
IS  - NA
PB  -
DO  - 10.7717/peerj-cs.601
ER  -

TY  - JOUR
AU  - Wibisurya, Aswin; Adinugroho, Timothy Yudi
TI  - A Reusable Software Copy Protection Using Hash Result and Asymetrical Encryption
PY  - 2014
AB  - Desktop application is one of the most popular types of application being used in computer due to the one time install simplicity and the quick accessibility from the moment the computer being turned on. Limitation of the copy and usage of desktop applications has long been an important issue to application providers. For security concerns, software copy protection is usually integrated with the application. However, developers seek to reuse the copy protection component of the software. This paper proposes an approach of reusable software copy protection which consists of a certificate validator on the client computer and a certificate generator on the server. The certificate validator integrity is protected using hashing result while all communications are encrypted using asymmetrical encryption to ensure the security of this approach.
SP  - 647
EP  - 655
JF  - ComTech: Computer, Mathematics and Engineering Applications
VL  - 5
IS  - 2
PB  -
DO  - 10.21512/comtech.v5i2.2215
ER  -

TY  - JOUR
AU  - Chua, Bee Bee; Zhang, Ying
TI  - Applying a Systematic Literature Review and Content Analysis Method to Analyse Open Source Developers’ Forking Motivation Interpretation, Categories and Consequences
PY  - 2020
AB  - In open source (OS) environments, forking is a powerful social collaborative technique that creates a social coding community and increases code visibility but it has not been adopted by OS software (OSS) developers. This paper investigates OS forking divergence using contextual frameworks (systematic literature review and content analysis) to analyse OSS developer forking motivation, interpretation, categorisation and consequences. We identified five theoretical forking patterns: 1) forking can revive original project health; 2) few effective frameworks exist to describe project-to-project developer migration; 3) there is a literature on social forking community behaviour; 4) poor guidance is a threat to forking; and 5) most research uses mixed methods. We introduce guidelines for OSS communities to reduce organisational barriers to developer motivation and highlight the importance of understanding developer forking. The challenge remains to analyse forking and sustainability from a social community perspective, particularly how programming language, file repositories and developer interest can predict forking motivation and behaviour for both novice OSS developers or experienced developers who want to improve forking performance.
SP  - NA
EP  - NA
JF  - Australasian Journal of Information Systems
VL  - 24
IS  - NA
PB  -
DO  - 10.3127/ajis.v24i0.1714
ER  -

TY  - CHAP
AU  - Wang, Ying; Cheung, Shing-Chi; Yu, Hai; Zhu, Zhiliang
TI  - Boosting the Propagation of Vulnerability Fixes in the npm Ecosystem
PY  - 2024
AB  - NA
SP  - 179
EP  - 232
JF  - Managing Software Supply Chains
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-981-96-1797-5_8
ER  -

TY  - NA
AU  - Jahn, Leonie; Engelbutzeder, Philip; Michel, Lea Katharina; Prost, Sebastian; Twidale, Michael Bernard; Randall, Dave; Wulf, Volker
TI  - Blending Code and Cause: Understanding the Dynamic Motivations of Volunteer Developers in community-driven FOSS projects
PY  - 2025
AB  - NA
SP  - 1
EP  - 17
JF  - Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3706598.3713416
ER  -

TY  - JOUR
AU  - Decan, Alexandre; Mens, Tom; Onsori Delicheh, Hassan
TI  - On the outdatedness of workflows in the GitHub Actions ecosystem
PY  - 2023
AB  - NA
SP  - 111827
EP  - 111827
JF  - Journal of Systems and Software
VL  - 206
IS  - NA
PB  -
DO  - 10.1016/j.jss.2023.111827
ER  -

TY  - NA
AU  - Yin, Likang; Chen, Zhuangzhi; Xuan, Qi; Filkov, Vladimir
TI  - Sustainability Forecasting for Apache Incubator Projects
PY  - 2021
AB  - Although OSS development is very popular, ultimately more than 80 percent of OSS projects fail. Identifying the factors associated with OSS success can help in devising interventions when a project takes a downturn. OSS success has been studied from a variety of angles, more recently in empirical studies of large numbers of diverse projects, using proxies for sustainability, e.g., internal metrics related to productivity and external ones, related to community popularity. The internal socio-technical structure of projects has also been shown important, especially their dynamics. This points to another angle on evaluating software success, from the perspective of self-sustaining and self-governing communities. To uncover the dynamics of how a project at a nascent development stage gradually evolves into a sustainable one, here we apply a socio-technical network modeling perspective to a dataset of Apache Software Foundation Incubator (ASFI), sustainability-labeled projects. To identify and validate the determinants of sustainability, we undertake a mix of quantitative and qualitative studies of ASFI projects' socio-technical network trajectories. We develop interpretable models which can forecast a project becoming sustainable with more than 93 percent accuracy, within 8 months of incubation start. Based on the interpretable models we describe a strategy for real-time monitoring and suggesting actions, which can be used by projects to correct their sustainability trajectories.
SP  - NA
EP  - NA
JF  - arXiv: Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Hachemi, Asma
TI  - Collaboration in software development processes: a review on modeling and executing collaborative processes
PY  - 2024
AB  - NA
SP  - 70
EP  - 75
JF  - Proceedings of the 2024 8th International Conference on Software and e-Business
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3715885.3715888
ER  -

TY  - NA
AU  - Zhou, Shurui; Vasilescu, Bogdan; Kästner, Christian
TI  - ESEC/SIGSOFT FSE - What the fork: a study of inefficient and efficient forking practices in social coding
PY  - 2019
AB  - Forking and pull requests have been widely used in open-source communities as a uniform development and contribution mechanism, giving developers the flexibility to modify their own fork without affecting others before attempting to contribute back. However, not all projects use forks efficiently; many experience lost and duplicate contributions and fragmented communities. In this paper, we explore how open-source projects on GitHub differ with regard to forking inefficiencies. First, we observed that different communities experience these inefficiencies to widely different degrees and interviewed practitioners to understand why. Then, using multiple regression modeling, we analyzed which context factors correlate with fewer inefficiencies. We found that better modularity and centralized management are associated with more contributions and a higher fraction of accepted pull requests, suggesting specific best practices that project maintainers can adopt to reduce forking-related inefficiencies in their communities.
SP  - 350
EP  - 361
JF  - Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3338906.3338918
ER  -

TY  - JOUR
AU  - Kapitsaki, Georgia M.; Charalambous, Georgia
TI  - Modeling and Recommending Open Source Licenses with findOSSLicense
PY  - 2021
AB  - Open source software is widely used in the software industry and the academia. Licenses applied to open source software provide the terms for its further use and distribution. Decisions regarding licensing for new software systems are essential for the system's future use. In this paper, we introduce findOSSLicense, a license recommender that guides users into choosing the appropriate open source license for their software under creation. We also introduce our license modeling concept that is used in the recommendation process. The license modeling captures the properties usually found in existing open source licenses following an analysis performed on license texts. The recommendation process of findOSSLicense is based on a hybrid recommender that uses constraint-based, content-based and collaborative filtering giving also space for flexibility in the use of the system by its end-users who can adapt some system properties. User input, but also external sources of information including existing open source projects, are considered for the creation of the recommendations, whereas licenses used in third party software employed in the software are examined on a limited basis. findOSSLicense has been evaluated with the participation of users of various expertise.
SP  - 919
EP  - 935
JF  - IEEE Transactions on Software Engineering
VL  - 47
IS  - 5
PB  -
DO  - 10.1109/tse.2019.2909021
ER  -

TY  - JOUR
AU  - Decan, Alexandre; Mens, Tom; Zerouali, Ahmed; De Roover, Coen
TI  - Back to the Past – Analysing Backporting Practices in Package Dependency Networks
PY  - 2021
AB  - NA
SP  - 1
EP  - 1
JF  - IEEE Transactions on Software Engineering
VL  - 2021
IS  - 01
PB  -
DO  - NA
ER  -

TY  - CHAP
AU  - E'mari, Salam Al; Sanjalawe, Yousef; Fataftah, Fuad
TI  - AI-Driven Security Systems and Intelligence Threat Response Using Autonomous Cyber Defense
PY  - 2025
AB  - The expanding cyber threat landscape has compelled organizations to adopt AI-driven security systems for robust defense against sophisticated attacks. This chapter explores artificial intelligence in cybersecurity, emphasizing its role in intelligent threat detection, analysis, and response. AI models, including supervised and unsupervised learning, deep learning, and reinforcement learning, have redefined cybersecurity by enabling behavior-based anomaly detection and automated threat mitigation. Key discussions highlight autonomous systems making real-time decisions, leveraging adaptive control loops, and employing self-healing mechanisms for resilience. This chapter also examines challenges in operational scalability, ethical implications of automation, and the necessity of human oversight in decision-making. The findings underscore the need for synergy between automation and human expertise to foster an intelligent, adaptive cyber defense ecosystem.
SP  - 35
EP  - 78
JF  - Advances in Computational Intelligence and Robotics
VL  - NA
IS  - NA
PB  -
DO  - 10.4018/979-8-3373-0954-5.ch002
ER  -

TY  - JOUR
AU  - Han, Yue; Ozturk, Pinar; Nickerson, Jeffrey V.
TI  - Leveraging the Wisdom of the Crowd to Address Societal Challenges: Revisiting the Knowledge Reuse for Innovation Process through Analytics
PY  - 2020
AB  - NA
SP  - 8
EP  - 1152
JF  - Journal of the Association for Information Systems
VL  - 21
IS  - 5
PB  -
DO  - 10.17705/1jais.00632
ER  -

TY  - NA
AU  - Reid, David; Mockus, Audris
TI  - Applying the Universal Version History Concept to Help De-Risk Copy-Based Code Reuse
PY  - 2023
AB  - The ability to easily copy code among open source projects makes it difficult to comply with the need to determine the provenance of code essential for cybersecurity and for complying with the licensing terms. Such provenance encompasses the exact origin of each component and its license, and various qualities of the component, such as absence of vulnerabilities and high likelihood of future maintenance. With the aim to address these challenges, we created an approach supported by a tool prototype, UVHistory, that links each piece of source code to all projects where it resides and, also, to its version histories in all these projects. This combined version history of a file from all open source projects we refer to as universal version history. We exemplify UVHistory via scenarios illustrating how it can help developers identify bugs and vulnerabilities and verify that license terms are not violated. Specifically, using UVHistory, developers can find the origin of a file including the open source repository where it originated, follow the evolution of the file over time and across different repositories, identify which authors have worked on a file, and read all the log messages for any modifications to that file in any repository. We also evaluate UVHistory in two contexts: to identify license non-compliance and to find instances of unfixed vulnerabilities. We find that in active and popular projects both problems are common and anyone can easily identify them using our approach.
SP  - 1
EP  - 12
JF  - 2023 IEEE 23rd International Working Conference on Source Code Analysis and Manipulation (SCAM)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/scam59687.2023.00012
ER  -

TY  - JOUR
AU  - Gao, Haoyu; Treude, Christoph; Zahedi, Mansooreh
TI  - Adapting Installation Instructions in Rapidly Evolving Software Ecosystems
PY  - 2025
AB  - NA
SP  - 1334
EP  - 1357
JF  - IEEE Transactions on Software Engineering
VL  - 51
IS  - 4
PB  -
DO  - 10.1109/tse.2025.3552614
ER  -

TY  - NA
AU  - Harrison, Francis
TI  - A Method for Identifying Software Ecosystems of Technically Dependent Projects
PY  - 2015
AB  - NA
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Lumbard, Kevin; Germonprez, Matt; Goggins, Sean
TI  - An empirical investigation of social comparison and open source community health
PY  - 2023
AB  - It is well known that corporations rely on open source software as part of their product development lifecycle. Given these commitments, understanding the health of open source communities is a central concern in today's business setting. Our research uses social comparison theory as a framework for understanding how open source communities consider community health beyond any single metric within any single open source community—including a broader view of how others are using these health indicators in practice. Using methods from engaged field research, including 38 interviews, we examine practices of social comparison as an advancement in understanding open source community health—and subsequently engagement with open source communities. The results of this study show that open source community health is not a single set of discrete metrics but is an ongoing social construction. Through our study, we advance theoretical and applied knowledge regarding issues of open source community health, open source community engagement, and social comparison.
SP  - 499
EP  - 532
JF  - Information Systems Journal
VL  - 34
IS  - 2
PB  -
DO  - 10.1111/isj.12485
ER  -

TY  - NA
AU  - Lundell, Björn; Gamalielsson, Jonas
TI  - OpenSym - Sustainable digitalisation through different dimensions of openness: how can lock-in, interoperability, and long-term maintenance of IT systems be addressed?
PY  - 2018
AB  - Lock-in, interoperability, and long-term maintenance are three fundamental challenges that need to be addressed by any organisation involved in development, use and procurement of IT systems. This paper clarifies fundamental concepts and key dimensions of openness and provides examples of work-practices and recommendations for achieving sustainable digitalisation through addressing the fundamental challenges. Specifically, there are three main contributions. First, the concepts open standard, open source software, and open content are clarified and elaborated. Second, the associated three dimensions standard, software, and content are elaborated through examples of how different combinations along the dimensions can enable and inhibit sustainable digitalisation when IT-systems are developed and procured. Third, work-practices used by public sector organisations in specific projects for development and procurement of IT-systems are elaborated with the view to discuss how the three fundamental challenges are being addressed and provide guidance for how organisations can achieve a sustainable digitalisation.
SP  - 3
EP  - 10
JF  - Proceedings of the 14th International Symposium on Open Collaboration
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3233391.3233527
ER  -

TY  - NA
AU  - Vrochidou, Eleni; Manios, Michail; Papakostas, George A.; Aitsidis, Charalabos N.; Panagiotopoulos, Fotis
TI  - SoftCOM - Open-Source Robotics: Investigation on Existing Platforms and Their Application in Education
PY  - 2018
AB  - In recent years a continuous effort to foster robotics at the earliest stages of education is reported. Robots’ code and hardware are subjected to licensing, a fact that places manufacturers on the top of the beneficiaries of this new trend. Hence, open-source robotics (OSR) enter the mainstream, to enable rapid development on a lower budget. This paper provides an overview of the up-to-date commercial OSR platforms that can support education, in terms of hardware, software and simulators. The aim of the paper is to provide comprehensive knowledge of the available OSR platforms, according to their most recently reported applications in education, so as to enlighten teachers, amateurs and researchers. Extensibility and applicability are investigated, and comparison of features takes place. Future challenges and possibilities are also discussed.
SP  - 1
EP  - 6
JF  - 2018 26th International Conference on Software, Telecommunications and Computer Networks (SoftCOM)
VL  - NA
IS  - NA
PB  -
DO  - 10.23919/softcom.2018.8555860
ER  -

TY  - JOUR
AU  - Alrawashdeh, Thamer A.
TI  - Evaluating Open Source Software Usability Using a Multistage Fuzzy Model Approach
PY  - 2015
AB  - In recent years, development of Open Source Software has obtained significant importance in the production of software products. Although developers of Open Source Software have developed software that is functionally competitive with closed proprietary software, computer users still prefer closed proprietary software to open source due to its usability strength. On the other hand, once the usability of OSS is evaluated, it would be easier to develop and implement an acceptable and qualitative product, since software usability is considered to be one of the most important quality factors. Thus, this work proposed a multistage fuzzy model approach for evaluating Open Source Software usability, which includes nine usability characteristics to be taken into account when designing and implementing OSS software. The model takes a project, developed in MATLAB and quantifies its usability. The Analytical Hierarchy Process (AHP) technique was employed to verify the proposed model approach and to rank its usability characteristics. These characteristics are sequenced according to their importance as follows: learnability, understandability, efficiency, error prevention, memorability, operability, familiarity, attractiveness, and usability-compliance.
SP  - 1018
EP  - 1026
JF  - International Review on Computers and Software (IRECOS)
VL  - 10
IS  - 10
PB  -
DO  - 10.15866/irecos.v10i10.7668
ER  -

TY  - BOOK
AU  - Horcas, Jose-Miguel; Galindo, José A.; Heradio, Ruben; Fernandez-Amoros, David; Benavides, David
TI  - SPLC (A) - Monte Carlo tree search for feature model analyses: a general framework for decision-making
PY  - 2021
AB  - The colossal solution spaces of most configurable systems make intractable their exhaustive exploration. Accordingly, relevant analyses remain open research problems. There exist analyses alternatives such as SAT solving or constraint programming. However, none of them have explored simulation-based methods. Monte Carlo-based decision making is a simulation-based method for dealing with colossal solution spaces using randomness. This paper proposes a conceptual framework that tackles various of those analyses using Monte Carlo methods, which have proven to succeed in vast search spaces (e.g., game theory). Our general framework is described formally, and its flexibility to cope with a diversity of analysis problems is discussed (e.g., finding defective configurations, feature model reverse engineering or getting optimal performance configurations). Additionally, we present a Python implementation of the framework that shows the feasibility of our proposal. With this contribution, we envision that different problems can be addressed using Monte Carlo simulations and that our framework can be used to advance the state of the art a step forward.
SP  - 190
EP  - 201
JF  - Proceedings of the 25th ACM International Systems and Software Product Line Conference - Volume A
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3461001.3471146
ER  -

TY  - NA
AU  - Lercher, Alexander
TI  - Managing API Evolution in Microservice Architecture
PY  - 2024
AB  - Nowadays, many software systems are split into loosely coupled microservices only communicating via Application Programming Interfaces (APIs) to improve maintainability, scalability, and fault tolerance. However, the loose coupling between microservices provides no immediate feedback on breaking API changes, and consuming services break or exhibit unexpected behavior only after the first actual call to the changed API. Hence, development teams must actively identify and communicate all breaking changes to affected teams to stay compatible. This research addresses this problem with three contributions. First, we identified API evolution strategies and open challenges in practice with an explorative study. Based on the study findings, we formulated two open research directions for evolving publicly accessible APIs, i.e., REpresentational State Transfer (REST) APIs. As the second contribution, we will introduce a REST API change extraction approach to improve the change notification accuracy. We plan experiments on open-source projects to evaluate our approach's accuracy and compare it to openapi-diff for structural changes. Third, we plan to investigate methods for automating communication with affected teams, which will then improve the change notification reliability. Finally, we will evaluate the accuracy and reliability of our notifications with a user study.
SP  - 195
EP  - 197
JF  - Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3639478.3639800
ER  -

TY  - JOUR
AU  - Dalle, Jean-Michel; David, Paul A; Rullani, Francesco; Bolici, Francesco
TI  - The interplay between volunteers and firm's employees in distributed innovation: emergent architectures and stigmergy in open source software
PY  - 2022
AB  - This paper focuses on the interplay between firms and open and collaborative innovation communities. We develop a formal model where both volunteers (agents setting their agendas freely) and firm’s employees (agents whose agenda is mostly set by their employer) participate in the creation of a common artifact. In this framework, we discuss how firms can influence the architecture of the emerging product to assure fast and performant development and a desirable distribution of innovative labor within the project team. We find that closing the project only to employees implies high speed and performance if employees are given autonomy in certain dimensions and are directed in others. In this case, however, we observe a trade-off in terms of ideal core–periphery division of labor on one side and development speed and performance on the other side. At the opposite extreme, creating a volunteer-only project can ease the trade-off but assures positive results only if the firm is able to set up an entry mechanism that “surgically” selects volunteers with specific preferences. A mixture of both employees and volunteers can strike a good balance, relaxing the two constraints.
SP  - 1358
EP  - 1386
JF  - Industrial and Corporate Change
VL  - 31
IS  - 6
PB  -
DO  - 10.1093/icc/dtac037
ER  -

TY  - NA
AU  - Ochoa, Lina; Degueule, Thomas; Falleri, Jean-Remy
TI  - BreakBot: Analyzing the Impact of Breaking Changes to Assist Library Evolution
PY  - 2022
AB  - "If we make this change to our code, how will it impact our clients?" It is difficult for library maintainers to answer this simple (yet essential!) question when evolving their libraries. Library maintainers are constantly balancing between two opposing positions: make changes at the risk of breaking some of their clients, or avoid changes and maintain compatibility at the cost of immobility and growing technical debt. We argue that the lack of objective usage data and tool support leaves maintainers with their own subjective perception of their community to make these decisions. We introduce BreakBot, a bot that analyses the pull requests of Java libraries on GitHub to identify the breaking changes they introduce and their impact on client projects. Through static analysis of libraries and clients, it extracts and summarizes objective data that enrich the code review process by providing maintainers with the appropriate information to decide whether, and how, changes should be accepted, directly in the pull requests.
SP  - 26
EP  - 30
JF  - 2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icse-nier55298.2022.9793524
ER  -

TY  - NA
AU  - Abdalkareem, Rabe
TI  - ESEC/SIGSOFT FSE - Reasons and drawbacks of using trivial npm packages: the developers' perspective
PY  - 2017
AB  - Code reuse is traditionally seen as good practice. Recent trends have pushed the idea of code reuse to an extreme, by using packages that implement simple and trivial tasks, which we call ‘trivial packages’. A recent incident where a trivial package led to the breakdown of some of the most popular web applications such as Facebook and Netflix, put the spotlight on whether using trivial packages should be encouraged. Therefore, in this research, we mine more than 230,000 npm packages and 38,000 JavaScript projects in order to study the prevalence of trivial packages. We found that trivial packages are common, making up 16.8% of the studied npm packages. We performed a survey with 88 Node.js developers who use trivial packages to understand the reasons for and drawbacks of their use. We found that trivial packages are used because they are perceived to be well-implemented and tested pieces of code. However, developers are concerned about maintaining and the risks of breakages due to the extra dependencies trivial packages introduce.
SP  - 1062
EP  - 1064
JF  - Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3106237.3121278
ER  -

TY  - NA
AU  - Sun, Jiayi
TI  - Sustaining Scientific Open-Source Software Ecosystems: Challenges, Practices, and Opportunities
PY  - 2024
AB  - NA
SP  - 234
EP  - 236
JF  - Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3639478.3639805
ER  -

TY  - CHAP
AU  - Münch, Tobias; Roosmann, Rainer
TI  - Enhance Web-Components in Order to Increase Security and Maintainability
PY  - 2022
AB  - NA
SP  - 443
EP  - 449
JF  - Lecture Notes in Computer Science
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-031-09917-5_33
ER  -

TY  - JOUR
AU  - Lei, Shaojuan; Zhang, Xiaodong; Liu, Suhui
TI  - Dynamic Robustness of Open-Source Project Knowledge Collaborative Network Based on Opinion Leader Identification
PY  - 2021
AB  - A large amount of semantic content is generated during designer collaboration in open-source projects (OSPs). Based on the characteristics of knowledge collaboration behavior in OSPs, we constructed a directed, weighted, semantic-based knowledge collaborative network. Four social network analysis indexes were created to identify the key opinion leader nodes in the network using the entropy weight and TOPSIS method. Further, three degradation modes were designed for (1) the collaborative behavior of opinion leaders, (2) main knowledge dissemination behavior, and (3) main knowledge contribution behavior. Regarding the degradation model of the collaborative behavior of opinion leaders, we considered the propagation characteristics of opinion leaders to other nodes, and we created a susceptible–infected–removed (SIR) propagation model of the influence of opinion leaders’ behaviors. Finally, based on empirical data from the Local Motors open-source vehicle design community, a dynamic robustness analysis experiment was carried out. The results showed that the robustness of our constructed network varied for different degradation modes: the degradation of the opinion leaders’ collaborative behavior had the lowest robustness; this was followed by the main knowledge dissemination behavior and the main knowledge contribution behavior; the degradation of random behavior had the highest robustness. Our method revealed the influence of the degradation of collaborative behavior of different types of nodes on the robustness of the network. This could be used to formulate the management strategy of the open-source design community, thus promoting the stable development of OSPs.
SP  - 1235
EP  - NA
JF  - Entropy (Basel, Switzerland)
VL  - 23
IS  - 9
PB  -
DO  - 10.3390/e23091235
ER  -

TY  - JOUR
AU  - Klug, Daniel; Bogart, Christopher; Herbsleb, James D.
TI  - "They Can Only Ever Guide": How an Open Source Software Community Uses Roadmaps to Coordinate Effort
PY  - 2021
AB  - Unlike in commercial software development, open source software (OSS) projects do not generally have managers with direct control over how developers spend their time, yet for projects with large, diverse sets of contributors, the need exists to focus and steer development in a particular direction in a coordinated way. This is especially important for "infrastructure" projects, such as critical libraries and programming languages that many other people depend on. Some projects have taken the approach of borrowing planning tools that originated in commercial development, despite the fact that these techniques were designed for very different contexts, e.g. strong top-down control and profit motives. Little research has been done to understand how these practices are adapted to a new context. In this paper, we examine the Rust project's use of roadmaps: how has an important OSS infrastructure project adapted an inherently top-down tool to the freewheeling world of OSS? We find that because Rust's roadmaps are built in part by summarizing what motivated developers most prefer to work on, they are in some ways more a description of the motivated labor available than they are a directive that the community move in a particular direction. They allow the community to avoid wasting time on unpopular proposals by revealing that there will be little help in building them, and encouraging work on popular features by making visible the amount of consensus in those features. Roadmaps generate a collective focus without limiting the full scope of what developers work on: roadmap issues consume proportionally more effort than other issues, but constitute a minority of the work done (i.e., issues and pull requests made) by both central and peripheral participants. They also create transparency among and beyond the community into what central contributors' plans are, and allow more rational decision-making by providing a way for evidence about community needs to be linked to decision-making.
SP  - 1
EP  - 28
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 5
IS  - CSCW1
PB  -
DO  - 10.1145/3449232
ER  -

TY  - JOUR
AU  - Samuel, Binny M.; Bala, Hillol; Daniel, Sherae L.; Ramesh, V.
TI  - Deconstructing the Nature of Collaboration in Organizations Open Source Software Development: The Impact of Developer and Task Characteristics
PY  - 2022
AB  - NA
SP  - 3969
EP  - 3987
JF  - IEEE Transactions on Software Engineering
VL  - 48
IS  - 10
PB  -
DO  - 10.1109/tse.2021.3108935
ER  -

TY  - CHAP
AU  - Laganà, Antonio; Gervasi, Osvaldo; Tasso, Sergio; Perri, Damiano; Franciosa, Francesco
TI  - ICCSA (5) - The ECTN Virtual Education Community Prosumer Model for Promoting and Assessing Chemical Knowledge
PY  - 2018
AB  - The dynamism of the learning economies is examined in order to single out the key factors allowing to promote knowledge dissemination and invention developments. The various steps involved in the production and usage of both tacit and explicit technological knowledge as common good are analysed in order to optimize its portability. The role played in this respect by business clusters (especially when adopting the prosumer model) dealing with knowledge is discussed with particular reference to chemical education in Higher Education Institutions. The adoption of the prosumer model for building a European system aimed at promoting and assessing chemical knowledge is examined. The particular case considered in the paper is the one born out of the activities of the Universities member of the European Chemistry Thematic Network association through its Virtual Education Community Committee and the operational support of the former spinoff of the University of Perugia Master-UP s.r.l. Results achieved during the first year of activity are discussed.
SP  - 533
EP  - 548
JF  - Lecture Notes in Computer Science
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-95174-4_42
ER  -

TY  - JOUR
AU  - Poderi, Giacomo
TI  - Sustaining platforms as commons: perspectives on participation, infrastructure, and governance
PY  - 2019
AB  - This work finds its place within Participatory Design (PD) as a specific approach to co-design that focuses on the politics of technological innovation and socio-technical transformations. ...
SP  - 243
EP  - 255
JF  - CoDesign
VL  - 15
IS  - 3
PB  -
DO  - 10.1080/15710882.2019.1631351
ER  -

TY  - JOUR
AU  - da Silva, Antonio Cesar Brandao Gomes; de Figueiredo Carneiro, Glauco; Brito e Abreu, Fernando; Monteiro, Miguel P.
TI  - Frequent releases in open source software: a systematic review
PY  - 2017
AB  - Context: The need to accelerate software delivery, supporting faster time-to-market and frequent community developer/user feedback are issues that have led to relevant changes in software development practices. One example is the adoption of Rapid Release (RR) by several Open Source Software projects (OSS). This raises the need to know how these projects deal with software release approaches. Goal: Identify the main characteristics of software release initiatives in OSS projects, the motivations behind their adoption, strategies applied, as well as advantages and difficulties found. Method: We conducted a Systematic Literature Review (SLR) to reach the stated goal. Results: The SLR includes 33 publications from January 2006 to July 2016 and reveals nine advantages that characterize software release approaches in OSS projects; four challenge issues; three possibilities of implementation and two main motivations towards the adoption of RR; and finally four main strategies to implement it. Conclusion: This study provides an up-to-date and structured understanding of the software release approaches in the context of OSS projects based on findings systematically collected from a list of relevant references in the last decade.
SP  - 109
EP  - NA
JF  - Information
VL  - 8
IS  - 3
PB  -
DO  - 10.3390/info8030109
ER  -

TY  - JOUR
AU  - Tamburri, Damian A.; Palomba, Fabio; Serebrenik, Alexander; Zaidman, Andy
TI  - Discovering community patterns in open-source: a systematic approach and its evaluation
PY  - 2018
AB  - The open-source phenomenon has reached the point in which it is virtually impossible to find large applications that do not rely on it. Such grand adoption may turn into a risk if the community regulatory aspects behind open-source work (e.g., contribution guidelines or release schemas) are left implicit and their effect untracked. We advocate the explicit study and automated support of such aspects and propose Yoshi (Yielding Open-Source Health Information), a tool able to map open-source communities onto community patterns, sets of known organisational and social structure types and characteristics with measurable core attributes. This mapping is beneficial since it allows, for example, (a) further investigation of community health measuring established characteristics from organisations research, (b) reuse of pattern-specific best-practices from the same literature, and (c) diagnosis of organisational anti-patterns specific to open-source, if any. We evaluate the tool in a quantitative empirical study involving 25 open-source communities from GitHub, finding that the tool offers a valuable basis to monitor key community traits behind open-source development and may form an effective combination with web-portals such as OpenHub or Bitergia. We made the proposed tool open source and publicly available.
SP  - 1369
EP  - 1417
JF  - Empirical Software Engineering
VL  - 24
IS  - 3
PB  -
DO  - 10.1007/s10664-018-9659-9
ER  -

TY  - NA
AU  - Li, Yong; Wang, Yu
TI  - Design and implementation of reservoir dam safety monitoring platform based on ASP.NET
PY  - 2017
AB  - With the rapid development of China's economy, the traditional hydrological monitoring system based on C/S architecture can't meet the increasingly complex information needs of the water industry. The hydrologic monitoring system of the reservoir is mainly composed of data acquisition system, safety monitoring system, hydrological forecasting system and so on. Based on ASP.NET technology, this paper puts forward a design idea of integrated monitoring platform for water body based on B/S structure, and builds the overall architecture of water monitoring platform, and realizes the whole architecture of water monitoring platform. A reservoir dam monitoring platform has been designed to enhance emergency response capability. The system includes Web management system, data acquisition system and database. The results show that the interface of the platform is interactive, safe and reliable. This system effectively improves the level of information and intelligence.
SP  - 2644
EP  - 2648
JF  - 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/iaeac.2017.8054504
ER  -

TY  - NA
AU  - Adams, Bram; McIntosh, Shane
TI  - FOSE@SANER - Modern Release Engineering in a Nutshell -- Why Researchers Should Care
PY  - 2016
AB  - The release engineering process is the process that brings high quality code changes from a developer's workspace to the end user, encompassing code change integration, continuous integration, build system specifications, infrastructure-as-code, deployment and release. Recent practices of continuous delivery, which bring new content to the end user in days or hours rather than months or years, have generated a surge of industry-driven interest in the release engineering pipeline. This paper argues that the involvement of researchers is essential, by providing a brief introduction to the six major phases of the release engineering pipeline, a roadmap of future research, and a checklist of three major ways that the release engineering process of a system under study can invalidate the findings of software engineering studies. The main take-home message is that, while release engineering technology has flourished tremendously due to industry, empirical validation of best practices and the impact of the release engineering process on (amongst others) software quality is largely missing and provides major research opportunities.
SP  - 78
EP  - 90
JF  - 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)
VL  - 5
IS  - NA
PB  -
DO  - 10.1109/saner.2016.108
ER  -

TY  - CONF
AU  - Murphy, Stephen
TI  - Adopting Open Source IT Certification in Higher Education: Lessons from the Field
PY  - NA
AB  - This paper suggests areas of good practice and considerations based upon the experience of embedding an open source information technology (IT) certification into a UK higher education program. Academically, open source is used as a vehicle for teaching general academic skills and values, but also as a collection of marketable skills. IT certification is used to further develop and signpost these skills to employers. This paper critically reviews literature in the fields of open source software in education and IT certification. A case study then discusses the methods used to embed such certification at Birmingham City University in the UK. Key barriers are reviewed along with a summary of lessons learned for the benefit of those considering similar actions.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.13140/rg.2.2.27661.95208
ER  -

TY  - NA
AU  - Widder, David Gray; Sunshine, Joshua; Fickas, Stephen
TI  - VL/HCC - Barriers to Reproducible Scientific Programming
PY  - 2019
AB  - Scientists about their programming practices, and the extent to which they adhere to six common best practices. We argue that these practices are essential to the core scientific value of reproducibility. Our results indicate that many of these practices are not followed because of barriers such as low self-efficacy and misaligned incentive structures. We conclude with suggested improvements to the tooling, education, and incentives of scientific programmers.
SP  - 217
EP  - 221
JF  - 2019 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/vlhcc.2019.8818907
ER  -

TY  - JOUR
AU  - Bauer, Veronika; Vetro, Antonio
TI  - Comparing reuse practices in two large software-producing companies
PY  - 2016
AB  - We compare two empirical investigations on software reuse at two large companies. We analyzed 108 survey responses and 35 h of interviews with 30 participants. Homogeneous, coherent settings produce clear benefits in development and maintenance. We identify coherence between culture and approach as important reuse success factor. Systematic reuse in heterogeneous contexts requires structured decision support. Context: Reuse can improve productivity and maintainability in software development. Research has proposed a wide range of methods and techniques. Are these successfully adopted in practice? Objective: We propose a preliminary answer by integrating two in-depth empirical studies on software reuse at two large software-producing companies. Method: We compare and interpret the study results with a focus on reuse practices, effects, and context. Results: Both companies perform pragmatic reuse of code produced within the company, not leveraging other available artefacts. Reusable entities are retrieved from a central repository, if present. Otherwise, direct communication with trusted colleagues is crucial for access. Reuse processes remain implicit and reflect the development style. In a homogeneous infrastructure-supported context, participants strongly agreed on higher development pace and less maintenance effort as reuse benefits. In a heterogeneous context with fragmented infrastructure, these benefits did not materialize. Neither case reports statistically significant evidence of negative side effects of reuse nor inhibitors. In both cases, a lack of reuse led to duplicate implementations. Conclusion: Technological advances have improved the way reuse concepts can be applied in practice. Homogeneity in development process and tool support seem necessary preconditions. Developing and adopting adequate reuse strategies in heterogeneous contexts remains challenging.
SP  - 545
EP  - 582
JF  - Journal of Systems and Software
VL  - 117
IS  - NA
PB  -
DO  - 10.1016/j.jss.2016.03.067
ER  -

TY  - BOOK
AU  - Robinson, Paul T.; Beecham, Sarah
TI  - ICSSP - TWINS: this workflow is not scrum: agile process adaptation for open source software projects
PY  - 2019
AB  - It is becoming commonplace for companies to contribute to open source software (OSS) projects. At the same time, many software organizations are applying Scrum software development practices, for productivity and quality gains. Scrum calls for self-organizing teams, in which the development team has total control over its development process. However, OSS projects typically have their own processes and standards, which might not mesh well with a company's internal processes, such as Scrum. This paper presents an experience report from Sony Interactive Entertainment (SIE), where the "toolchain CPU compiler" team directly participates in the "LLVM" OSS project. The team ran into a number of difficulties when using Scrum to manage their development. In particular, the team often failed to complete Scrum sprints where tasks required interaction with the open source community. We look at how the team redefined task flows to alleviate these difficulties, and eventually evolved a highly modified process, dubbed TWINS (This Workflow Is Not Scrum). We assess the revised process, and compare it to other established agile methods, finding it bears a strong resemblance to Scrumban (the SIE team was not aware of Scrumban previously). The TWINS framework presented here may help other organizations who develop software in-house and engage in OSS projects, to gain the best of both worlds.
SP  - 24
EP  - 33
JF  - 2019 IEEE/ACM International Conference on Software and System Processes (ICSSP)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icssp.2019.00014
ER  -

TY  - JOUR
AU  - Karhu, Kimmo; Gustafsson, Robin; Lyytinen, Kalle
TI  - Exploiting and defending open digital platforms with boundary resources: Android's five platform forks
PY  - 2018
AB  - Digital platforms can be opened in two ways to promote innovation and value generation. A platform owner can open access for third-party participants by establishing boundary resources, such as API...
SP  - 479
EP  - 497
JF  - Information Systems Research
VL  - 29
IS  - 2
PB  -
DO  - 10.1287/isre.2018.0786
ER  -

TY  - JOUR
AU  - Bock, Thomas; Alznauer, Nils; Joblin, Mitchell; Apel, Sven
TI  - Automatic Core-Developer Identification on GitHub: A Validation Study
PY  - 2023
AB  - Many open-source software projects are self-organized and do not maintain official lists with information on developer roles. So, knowing which developers take core and maintainer roles is, despite being relevant, often tacit knowledge. We propose a method to automatically identify core developers based on role permissions of privileged events triggered in GitHub issues and pull requests. In an empirical study on 25 GitHub projects, (1) we validate the set of automatically identified core developers with a sample of project-reported developer lists, and (2) we use our set of identified core developers to assess the accuracy of state-of-the-art unsupervised developer classification methods. Our results indicate that the set of core developers, which we extracted from privileged issue events, is sound and the accuracy of state-of-the-art unsupervised classification methods depends mainly on the data source (commit data versus issue data) rather than the network-construction method (directed versus undirected, etc.). In perspective, our results shall guide research and practice to choose appropriate unsupervised classification methods, and our method can help create reliable ground-truth data for training supervised classification methods.
SP  - 1
EP  - 29
JF  - ACM Transactions on Software Engineering and Methodology
VL  - 32
IS  - 6
PB  -
DO  - 10.1145/3593803
ER  -

TY  - NA
AU  - Keshani, Mehdi; Bot, Gideon; Rungta, Priyam; Izadi, Maliheh; Van Deursen, Arie; Proksch, Sebastian
TI  - Maven Unzipped: Exploring the Impact of Library Packaging on the Ecosystem
PY  - 2024
AB  - NA
SP  - 50
EP  - 62
JF  - 2024 IEEE International Conference on Software Maintenance and Evolution (ICSME)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icsme58944.2024.00016
ER  -

TY  - JOUR
AU  - Poba-Nzaou, Placide; Uwizeyemungu, Sylvestre
TI  - Worries of open source projects' contributors: Patterns, structures and engagement implications
PY  - 2019
AB  - NA
SP  - 174
EP  - 185
JF  - Computers in Human Behavior
VL  - 96
IS  - NA
PB  -
DO  - 10.1016/j.chb.2019.02.005
ER  -

TY  - NA
AU  - Xu, Weiwei; He, Hao; Gao, Kai; Zhou, Minghui
TI  - Understanding and Remediating Open-Source License Incompatibilities in the PyPI Ecosystem
PY  - 2023
AB  - The reuse and distribution of open-source software must be in compliance with its accompanying open-source license. In modern packaging ecosystems, maintaining such compliance is challenging because a package may have a complex multi-layered dependency graph with many packages, any of which may have an incompatible license. Although prior research finds that license incompatibilities are prevalent, empirical evidence is still scarce in some modern packaging ecosystems (e.g., PyPI). It also remains unclear how developers remediate the license incompatibilities in the dependency graphs of their packages (including direct and transitive dependencies), let alone any automated approaches. To bridge this gap, we conduct a large-scale empirical study of license incompatibilities and their remediation practices in the PyPI ecosystem. We find that 7.27% of the PyPI package releases have license incompatibilities and 61.3% of them are caused by transitive dependencies, causing challenges in their remediation; for remediation, developers can apply one of the five strategies: migration, removal, pinning versions, changing their own licenses, and negotiation. Inspired by our findings, we propose Silence, an SMT-solver-based approach to recommend license incompatibility remediations with minimal costs in package dependency graph. Our evaluation shows that the remediations proposed by Silence can match 19 historical real-world cases (except for migrations not covered by an existing knowledge base) and have been accepted by five popular PyPI packages whose developers were previously unaware of their license incompatibilities.
SP  - 178
EP  - 190
JF  - 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/ase56229.2023.00175
ER  -

TY  - CHAP
AU  - Kritikos, Apostolos; Stamelos, Ioannis
TI  - OSS - Open Source Software Resilience Framework
PY  - 2018
AB  - An Open Source Software (OSS) project can be utilized either as is, to serve specific needs on an application level, or on the source code level, as a part of another software system serving as a component, a library, or even an autonomous third party dependency. There are several OSS quality models that provide metrics to measure specific aspects of the project, like its structural quality. Although other dimensions, like community health and activity, software governance principles or license permissiveness, are taken into account, there is no universally accepted OSS assessment model. In this work we are proposing an evaluation approach based on the adaptation of the City Resilience Framework to OSS with the aim of providing a strong theoretical basis for evaluating OSS projects.
SP  - 39
EP  - 49
JF  - IFIP Advances in Information and Communication Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-92375-8_4
ER  -

TY  - CHAP
AU  - Paschali, Maria-Eleni; Ampatzoglou, Apostolos; Bibi, Stamatia; Chatzigeorgiou, Alexander; Stamelos, Ioannis
TI  - ICSR - A Case Study on the Availability of Open-Source Components for Game Development
PY  - 2016
AB  - Nowadays the amount of source code that is freely available inside open-source software repositories offers great reuse opportunities to software developers. Therefore, it is expected that the implementation of several requirements can be facilitated by reusing open source software components. In this paper, we focus on the reuse opportunities that can be offered in one specific application domain, i.e., game development. In particular, we performed an embedded multiple case study on approximately 110 open-source games, exploiting a large-scale repository of OSS components, and investigated: (a) which game genres can benefit from open source reuse, and (b) what types of requirements can the available open-source components map to. The results of the case study suggest that: (a) game genres with complex game logic, e.g., First Person Shooter, Strategy, Role-Playing, and Sport games offer the most reuse opportunities, and (b) the most common requirement types that can be developed by reusing OSS components are related to scenarios and characters.
SP  - 149
EP  - 164
JF  - Lecture Notes in Computer Science
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-35122-3_11
ER  -

TY  - JOUR
AU  - Jeon, Byeongmin; Geum, Youngjung
TI  - GCN-based Reviewer Recommendation in Github based on the Network Expansion
PY  - 2023
AB  - Github is an open-source platform which focuses on collaboration based on pull requests. As an important task of the Github collaboration platform, a code review is critically important, as it plays a key role in the collaboration ecosystem as well as the software quality improvement. Therefore, there have been several works to recommend proper reviewers to improve the Github collaboration ecosystem. Despite the fact, however, previous studies have some common limitations. What is at the core in previous works is to focus on the recommendation of the existing network only, not considering the chances of extending the network in order to consider more proper candidates. For this purpose, this study suggests a GCN-based link prediction based on the network extension, which expands the boundaries of potential reviewer candidates in the network. As a result, the model based on the network extension shows a good performance compared to the existing network. In addition, we conducted a comparison for the existing reviewers and new reviewers, and identified the new and potential reviewers as having a positive impact compared to the existing reviewers.
SP  - 209
EP  - 222
JF  - Journal of the Korean Institute of Industrial Engineers
VL  - 49
IS  - 3
PB  -
DO  - 10.7232/jkiie.2023.49.3.209
ER  -

TY  - JOUR
AU  - Liao, Zhifang; Deng, Libing; Fan, Xiaoping; Zhang, Yan; Liu, Hui; Qi, Xiaofei; Zhou, Yun
TI  - Empirical research on the evaluation model and method of sustainability of the open source ecosystem
PY  - 2018
AB  - The development of open source brings new thinking and production modes to software engineering and computer science, and establishes a software development method and ecological environment in which groups participate. Regardless of investors, developers, participants, and managers, they are most concerned about whether the Open Source Ecosystem can be sustainable to ensure that the ecosystem they choose will serve users for a long time. Moreover, the most important quality of the software ecosystem is sustainability, and it is also a research area in Symmetry. Therefore, it is significant to assess the sustainability of the Open Source Ecosystem. However, the current measurement of the sustainability of the Open Source Ecosystem lacks universal measurement indicators, as well as a method and a model. Therefore, this paper constructs an Evaluation Indicators System, which consists of three levels: The target level, the guideline level and the evaluation level, and takes openness, stability, activity, and extensibility as measurement indicators. On this basis, a weight calculation method, based on information contribution values and a Sustainability Assessment Model, is proposed. The models and methods are used to analyze the factors affecting the sustainability of Stack Overflow (SO) ecosystem. Through the analysis, we find that every indicator in the SO ecosystem is partaking in different development trends. The development trend of a single indicator does not represent the sustainable development trend of the whole ecosystem. It is necessary to consider all of the indicators to judge that ecosystem’s sustainability. The research on the sustainability of the Open Source Ecosystem is helpful for judging software health, measuring development efficiency and adjusting organizational structure. It also provides a reference for researchers who study the sustainability of software engineering.
SP  - 747
EP  - NA
JF  - Symmetry
VL  - 10
IS  - 12
PB  -
DO  - 10.3390/sym10120747
ER  -

TY  - JOUR
AU  - Jullien, Nicolas; Viseur, Robert; Zimmermann, Jean-Benoît
TI  - A theory of FLOSS projects and Open Source business models dynamics
PY  - 2025
AB  - NA
SP  - 112383
EP  - 112383
JF  - Journal of Systems and Software
VL  - 224
IS  - NA
PB  -
DO  - 10.1016/j.jss.2025.112383
ER  -

TY  - JOUR
AU  - Li, Hao; Bezemer, Cor-Paul
TI  - Bridging the language gap: an empirical study of bindings for open source machine learning libraries across software package ecosystems
PY  - 2024
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 30
IS  - 1
PB  -
DO  - 10.1007/s10664-024-10570-5
ER  -

TY  - NA
AU  - Harrison, Francis
TI  - Reference Coupling: A Method for Identifying Software Ecosystems of Technically Dependent Projects
PY  - 2015
AB  - NA
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Lakhani, Karim R.; Gulley, Ned
TI  - The Determinants of Individual Performance and Collective Value in Private-Collective Software Innovation
PY  - 2010
AB  - NA
SP  - NA
EP  - NA
JF  - SSRN Electronic Journal
VL  - NA
IS  - NA
PB  -
DO  - 10.2139/ssrn.1550352
ER  -

TY  - JOUR
AU  - Wei, Kangning; Crowston, Kevin; Eseryel, U. Yeliz
TI  - Participation in community-based free/libre open source software development tasks: the impact of task characteristics
PY  - 2021
AB  - This paper explores how task characteristics in terms of trigger type and task topic influence individual participation in community-based free/libre open source software (FLOSS) development by considering participation in individual tasks rather than entire projects. A quantitative study was designed using choose tasks that were carried out via the email discourse on the developers' email fora in five FLOSS projects. Choice process episodes were selected as the unit of analysis and were coded for the task trigger and topic. The impact of these factors on participation (i.e. the numbers of participants and messages) was assessed by regression. The results reveal differences in participation related to different task triggers and task topics. Further, the results suggest the mediating role of the number of participants in the relationships between task characteristics and the number of messages. The authors also speculate that project type serves as a boundary condition restricting the impacts of task characteristics on the number of participants and propose this relationship for future research. Empirical support was provided to the important effects of different task characteristics on individual participation behaviors in FLOSS development tasks. The findings can help FLOSS participants understand participation patterns in different tasks and choose the types of tasks to attend to. This research explores the impact of task characteristics on participation in FLOSS development at the task level, while prior research on participation in FLOSS development has focused mainly on factors at the individual and/or project levels.
SP  - 1177
EP  - 1202
JF  - Internet Research
VL  - 31
IS  - 4
PB  -
DO  - 10.1108/intr-03-2020-0112
ER  -

TY  - JOUR
AU  - Badampudi, Deepika; Unterkalmsteiner, Michael; Britto, Ricardo
TI  - Modern Code Reviews—Survey of Literature and Practice
PY  - 2023
AB  - Background: Modern Code Review (MCR) is a lightweight alternative to traditional code inspections. While secondary studies on MCR exist, it is unknown whether the research community has targeted themes that practitioners consider important. Objectives: The objectives are to provide an overview of MCR research, analyze the practitioners’ opinions on the importance of MCR research, investigate the alignment between research and practice, and propose future MCR research avenues. Method: We conducted a systematic mapping study to survey state of the art until and including 2021, employed the Q-Methodology to analyze the practitioners’ perception of the relevance of MCR research, and analyzed the primary studies’ research impact. Results: We analyzed 244 primary studies, resulting in five themes. As a result of the 1,300 survey data points, we found that the respondents are positive about research investigating the impact of MCR on product quality and MCR process properties. In contrast, they are negative about human factor– and support systems–related research. Conclusion: These results indicate a misalignment between the state of the art and the themes deemed important by most survey respondents. Researchers should focus on solutions that can improve the state of MCR practice. We provide an MCR research agenda that can potentially increase the impact of MCR research.
SP  - 1
EP  - 61
JF  - ACM Transactions on Software Engineering and Methodology
VL  - 32
IS  - 4
PB  -
DO  - 10.1145/3585004
ER  -

TY  - JOUR
AU  - Constantino, Kattiana; Belém, Fabiano; Figueiredo, Eduardo
TI  - Dual analysis for helping developers to find collaborators based on co‐changed files: An empirical study
PY  - 2023
AB  - Software developers must collaborate at all stages of the software life‐cycle to create successful complex software systems. To enable this collaboration, social coding platforms, for example, GitHub, include an increasing number of tools to support collaboration. However, for large projects with hundreds of dynamic developers, such as several successful open-source projects, it can be complex to find developers with the same interest and familiarity and thus, gain suitable collaborations and new insights. In this context, resources and efforts may be wasted, discouraging many developers from contributing. Moreover, it can be costly to manage many contributions, which is another challenge for the maintainer who wants to take advantage of this small, timid, but valuable contribution made by a volunteer developer in a short time. In this context, this paper presents an empirical study aiming to evaluate two strategies to recommend collaborators based on co‐changed files. Inspired by the TF-IDF (Term Frequency–Inverse Document Frequency) weighting scheme established in the Information Retrieval field, these strategies first estimate the importance of relevant files modified by developers and use these estimates to represent each developer “profile”. As a second step, they estimate the similarity between developers using the Cosine metric, providing top‐ranked developers according to this measure as recommendations. We evaluated these strategies based on an extensive survey with 102 real-world developers. We observed that developers have interest and familiarity with the co‐changed files for all strategies evaluated. These considerations are of relevance because many opportunities for contributions to the project are linked to coding. Thus, these results may indicate one less barrier for improving collaboration among developers. Overall, the strategies present an acceptance rate of up to 81%, contributing to the discovery of further collaborators.
SP  - 1438
EP  - 1464
JF  - Software: Practice and Experience
VL  - 53
IS  - 6
PB  -
DO  - 10.1002/spe.3194
ER  -

TY  - NA
AU  - Li, Hanlin; Vincent, Nicholas; Chancellor, Stevie; Hecht, Brent
TI  - The Dimensions of Data Labor: A Road Map for Researchers, Activists, and Policymakers to Empower Data Producers
PY  - 2023
AB  - Many recent technological advances (e.g. ChatGPT and search engines) are possible only because of massive amounts of user-generated data produced through user interactions with computing systems or scraped from the web (e.g. behavior logs, user-generated content, and artwork). However, data producers have little say in what data is captured, how it is used, or who it benefits. Organizations with the ability to access and process this data, e.g. OpenAI and Google, possess immense power in shaping the technology landscape. By synthesizing related literature that reconceptualizes the production of data for computing as "data labor", we outline opportunities for researchers, policymakers, and activists to empower data producers in their relationship with tech companies, e.g. advocating for transparency about data reuse, creating feedback channels between data producers and companies, and potentially developing mechanisms to share data's revenue more broadly. In doing so, we characterize data labor with six important dimensions - legibility, end-use awareness, collaboration requirement, openness, replaceability, and livelihood overlap - based on the parallels between data labor and various other types of labor in the computing literature.
SP  - 1151
EP  - 1161
JF  - 2023 ACM Conference on Fairness Accountability and Transparency
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3593013.3594070
ER  -

TY  - JOUR
AU  - Gustavsson, Tomas
TI  - Managing the Open Source Dependency
PY  - 2020
AB  - Organizations use open source software in a majority of computer application programs. Here we describe some of the technical challenges and offer recommendations about how to manage open source software dependencies and avoid the most common pitfalls that might be encountered through decision-making, automated scanning, upgrading, and strategic contributions.
SP  - 83
EP  - 87
JF  - Computer
VL  - 53
IS  - 2
PB  -
DO  - 10.1109/mc.2019.2955869
ER  -

TY  - JOUR
AU  - de la Vega, Alfonso; García-Saiz, Diego; Zorrilla, Marta E.; Sánchez, Pablo
TI  - FLANDM: a development framework of domain-specific languages for data mining democratisation
PY  - 2018
AB  - Companies have an increasing interest in employing data mining to take advantage of the vast amounts of data their systems store nowadays. This interest confronts two problems: (1) business experts usually lack the skills required to apply data mining techniques, and (2) the specialists who know how to use these techniques are a scarce and valuable asset. To help democratise data mining, we proposed, in a previous work, the development of domain-specific languages (DSLs) that hide the complexity of data mining techniques. The objective of these DSLs is to allow business experts to specify analysis processes by using high-level primitives and terminology from the application domain. These specifications would then be automatically transformed into a low-level, executable form. Although these DSLs might offer a promising solution to the aforementioned problems, their development from scratch requires a considerable effort and, consequently, they are costly. In order to make these languages affordable, we present FLANDM, an ecosystem devised for the rapid development of DSLs for data mining democratisation. FLANDM provides a base infrastructure that can be easily customised for the particularities of each domain, enabling controlled and systematic reuse of previously developed artefacts. By using FLANDM, new DSLs for data mining democratisation can be defined achieving a 50% reduction in their development costs.
SP  - 316
EP  - 336
JF  - Computer Languages, Systems & Structures
VL  - 54
IS  - NA
PB  -
DO  - 10.1016/j.cl.2018.07.002
ER  -

TY  - JOUR
AU  - Syeed, M. M. Mahbubul; Hammouda, Imed
TI  - Socio-Technical Dependencies in Forked OSS Projects: Evidence from the BSD Family
PY  - 2014
AB  - Existing studies show that open source projects may enjoy a high level of socio-technical congruence despite their open and distributed character. Such observation is yet to be confirmed in the case of forking, where projects originating from the same root evolve in parallel and are typically led by different development teams. In this paper, we empirically investigate the endogenous and exogenous characteristics of BSD family projects related to socio-technical congruence. Our motivation is that BSD family, as a representative example of forked projects, share a common development ground for both the code-base and the development community, which may influence their evolution from a socio-technical perspective. Our study results show that the BSD family maintain a certain level of collaboration throughout the project history, mainly due to a shared portion of the community. This partly explains the relative harmony of socio-technical congruence levels in the BSD projects.
SP  - 2895
EP  - 2909
JF  - Journal of Software
VL  - 9
IS  - 11
PB  -
DO  - 10.4304/jsw.9.11.2895-2909
ER  -

TY  - JOUR
AU  - Xing, Wanli; Goggins, Sean; Introne, Josh
TI  - Quantifying the Effect of Informational Support on Membership Retention in Online Communities through Large-Scale Data Analytics
PY  - 2018
AB  - NA
SP  - 227
EP  - 234
JF  - Computers in Human Behavior
VL  - 86
IS  - NA
PB  -
DO  - 10.1016/j.chb.2018.04.042
ER  -

TY  - JOUR
AU  - Alsamman, Alsamman M.
TI  - The Art of Bioinformatics Learning in Our Arabic World
PY  - 2019
AB  - Bioinformatics became a significant field in life sciences that draws a number of researchers and extends into a wide range of biological disciplines. Rendering bioinformatics analysis techniques are the most desirable skills in a variety of scholarship programs and academic positions. Teaching bioinformatics is very challenging since it is a multidisciplinary field, where most of the undergraduate programs in colleges provide only one area required for bioinformatics. Besides the regular education system, few bioinformatics training courses are offered and fewer are affordable to fresh graduates in countries most of which are categorized as developing countries. The high cost of learning, confusing education systems, and the complexity of bioinformatics science have made it very difficult to be taught and more challenging to be studied in Arab countries. This review provides possible solutions to most of these issues and offers the best practice to guide future Arab bioinformaticians to learn bioinformatics in a way that fits our social, financial and academic circumstances. Moreover, it discusses the key aspects that a bioinformatician needs to be aware of and the basic knowledge that must be gained. On the other side, it will illustrate how to start learning, to address some of these challenges and how to deal with some of the related social issues.
SP  - 1
EP  - 10
JF  - Highlights in BioScience
VL  - 2
IS  - NA
PB  -
DO  - 10.36462/h.biosci.20193
ER  -

TY  - NA
AU  - Wessel, Mairieli; Serebrenik, Alexander; Wiese, Igor; Steinmacher, Igor; Gerosa, Marco Aurélio
TI  - Quality Gatekeepers: Investigating the Effects of Code Review Bots on Pull Request Activities
PY  - 2021
AB  - Software bots have been facilitating several development activities in Open Source Software (OSS) projects, including code review. However, these bots may bring unexpected impacts to group dynamics, as frequently occurs with new technology adoption. Understanding and anticipating such effects is important for planning and management. To analyze these effects, we investigate how several activity indicators change after the adoption of a code review bot. We employed a regression discontinuity design on 1,194 software projects from GitHub. We also interviewed 12 practitioners, including open-source maintainers and contributors. Our results indicate that the adoption of code review bots increases the number of monthly merged pull requests, decreases monthly non-merged pull requests, and decreases communication among developers. From the developers' perspective, these effects are explained by the transparency and confidence the bot comments introduce, in addition to the changes in the discussion focused on pull requests. Practitioners and maintainers may leverage our results to understand, or even predict, bot effects on their projects.
SP  - NA
EP  - NA
JF  - arXiv: Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - CHAP
AU  - Wang, Qianjin; Li, Xiangdong; Yue, Chong; He, Yuchen
TI  - A Survey of Control Flow Graph Recovery for Binary Code
PY  - 2023
AB  - NA
SP  - 225
EP  - 244
JF  - Communications in Computer and Information Science
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-981-99-8761-0_16
ER  -

TY  - NA
AU  - Reyes, Frank; Baudry, Benoit; Monperrus, Martin
TI  - Breaking-Good: Explaining Breaking Dependency Updates with Build Analysis
PY  - 2024
AB  - NA
SP  - 36
EP  - 46
JF  - 2024 IEEE International Conference on Source Code Analysis and Manipulation (SCAM)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/scam63643.2024.00014
ER  -

TY  - NA
AU  - Regis, Getúlio Coimbra; Wiese, Igor; Polato, Ivanilton; Silva, Marco Aurélio Graciotto; Ré, Reginaldo; Nakamura, Walter; Steinmacher, Igor
TI  - An Empirical Study of the Comparison of Task Recommendation Techniques and Similar Source Code in Open Source Software Projects
PY  - 2025
AB  - Context: Managing issues in open-source software projects is challenging and costly, as many developers are casual and/or newcomers. On the one hand, maintainers must ensure the quality of issue descriptions and their labels and create mechanisms for recommending and assigning issues. On the other hand, to complete the issue, contributors must understand it and locate the artifacts related to a given functionality or the defect to be fixed. Objectives: This work aimed to conduct a comparative study of different models for recommending similar issues that could help developers with their contributions. Methods: We collected data on issues and pull requests from 35 open-source projects hosted on GitHub. We used the Term Frequency Inverse Document Frequency (TF-IDF), Sentence BERT (SBERT), and Word2Vec techniques to recommend similar issues and source code to assist newcomers' contributions. Results: The models based on the SBERT and TF-IDF techniques yielded better results in the recommendations generated than Word2Vec in the two evaluated scenarios (general issues and those marked as good for newcomers). SBERT was able to recommend past issues where the code used in the solution was approximately 17% similar to the actual solution of the issue used as a query to evaluate the models, reaching results similar to those of GPT 3.5 and GPT 4. Conclusion: Based on the empirical results obtained, we hope to take the next steps in transferring the knowledge gained to software projects and developers, especially by supporting newcomer developers during their first contribution.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.21203/rs.3.rs-6322361/v1
ER  -

TY  - NA
AU  - Borreguero, Ferran; Nitto, Elisabetta Di; Stebliuk, Dmitrii; Tamburri, Damian A.; Zheng, Chengyu
TI  - Fathoming Software Evangelists with the D-Index
PY  - 2015
AB  - The increased importance represented by open-source and crowd-sourced software developers and software development in general, inspired us to consider the following dilemma: can we "compute" virtuous software developers? The D-Index is our preliminary attempt. Essentially, the D-Index meaningfully equates several indicators for the virtues of a developer, such as, contributed code, its quality, mentoring in online learning communities, community engagement. Our preliminary evaluation of the index suggests that establishing the virtues for certain developers eases the identification of software "evangelists", key success enablers for software communities.
SP  - 85
EP  - 88
JF  - 2015 IEEE/ACM 8th International Workshop on Cooperative and Human Aspects of Software Engineering
VL  - 2
IS  - NA
PB  -
DO  - 10.1109/chase.2015.26
ER  -

TY  - JOUR
AU  - Foundjem, Armstrong; Eghan, Ellis E.; Adams, Bram
TI  - A Grounded Theory of Cross-Community SECOs: Feedback Diversity Versus Synchronization
PY  - 2023
AB  - Despite their proliferation, growing sustainable software ecosystems (SECOs) remains a substantial challenge. One approach to mitigate this challenge is by collecting and integrating feedback from distributors (distros) and end-users of the SECO releases into future SECO releases, tools, or policies. This paper performs a socio-technical analysis of cross-community collaboration in the OpenStack SECO, which consists of the upstream OpenStack project and 21 distribution (distro) communities. First, we followed Masood et al.'s adaptation of Strauss-Corbinian grounded theory methodology for socio-technical contexts on data from an open-ended unstructured interview, a survey, focus groups, and 384 mailing list threads to investigate how SECOs manage to sustain cross-community collaboration. Our theory has 15 constructs divided into four categories: diverse feedback types and mechanisms (2), characteristics of feedback (2), challenges (7), and the benefits (4) of cross-community collaboration. We then empirically study the salient aspects of the theory, i.e., diversity and synchronization, among 21 OpenStack distros. We empirically mined feedback that distros contribute to upstream, i.e., 140,261 mailing list threads, 142,914 bugs reported, 65,179 bugs resolved, and 4,349 new features. Then, we use influence maximization social network analysis to model the synchronization of feedback in the OpenStack SECO. Our results suggest that distros contribute substantially towards the sustainability of the SECO in the form of 25.6% of new features, 30.7% of emails, 44.3% of bug reports, and 30.7% of bug fixes. Finally, we found evidence of distros playing different roles in a SECO, with nine distros contributing all four types of feedback in equal proportions, while 12 distros specialize in one type of feedback. Distros that are influential in propagating a given type of feedback to the SECO community are not necessarily specialized in that feedback type.
SP  - 4731
EP  - 4750
JF  - IEEE Transactions on Software Engineering
VL  - 49
IS  - 10
PB  -
DO  - 10.1109/tse.2023.3313875
ER  -

TY  - NA
AU  - Gonçalves, Rodrigo Feitosa; Werner, Cláudia Maria Lima; Farias, Claudio Miceli de
TI  - Investigating Developer Experience in Software Reuse
PY  - 2024
AB  - Software reuse has been recognized as a key strategy for improving productivity, reducing development costs, and enhancing software quality. However, successfully implementing software reuse practices largely depends on the developer experience (DX). This study investigates the factors, barriers, and strategies influencing DX in software reuse. Through a Rapid Review (RR), we analyzed 328 studies, selecting 10 for detailed data extraction based on defined filters and the backward snowballing technique. Our findings identify 15 factors affecting DX in software reuse, categorized into technical, organizational, and human/social factors. We also uncover 7 barriers that impede developers from improving DX and identify 13 strategies to enhance it. The results highlight the critical role of comprehensive documentation, a clear understanding of software functionality, and robust reuse-compatible infrastructure as key technical factors. Organizational support, effective resource allocation, and fostering a communication, collaboration, and self-efficacy culture are essential for successful software reuse. This study’s insights have significant implications for researchers and practitioners, offering practical guidance to develop more effective reuse practices and improve DX.
SP  - 71
EP  - 80
JF  - Anais do XVIII Simpósio Brasileiro de Componentes, Arquiteturas e Reutilização de Software (SBCARS 2024)
VL  - NA
IS  - NA
PB  -
DO  - 10.5753/sbcars.2024.3865
ER  -

TY  - CHAP
AU  - Alrabaee, Saed; Debbabi, Mourad; Shirani, Paria; Wang, Lingyu; Youssef, Amr; Rahimian, Ashkan; Nouh, Lina; Mouheb, Djedjiga; Huang, He; Hanna, Aiman
TI  - Clone Detection
PY  - 2020
AB  - NA
SP  - 187
EP  - 209
JF  - Advances in Information Security
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-030-34238-8_8
ER  -

TY  - JOUR
AU  - Jahanshahi, Mahmoud; Reid, David; Mockus, Audris
TI  - Beyond Dependencies: The Role of Copy-Based Reuse in Open Source Software Development
PY  - 2025
AB  - In Open Source Software, resources of any project are open for reuse by introducing dependencies or copying the resource itself. In contrast to dependency-based reuse, the infrastructure to systematically support copy-based reuse appears to be entirely missing. Our aim is to enable future research and tool development to increase efficiency and reduce the risks of copy-based reuse. We seek a better understanding of such reuse by measuring its prevalence and identifying factors affecting the propensity to reuse. To identify reused artifacts and trace their origins, our method exploits World of Code infrastructure. We begin with a set of theory-derived factors related to the propensity to reuse, sample instances of different reuse types, and survey developers to better understand their intentions. Our results indicate that copy-based reuse is common, with many developers being aware of it when writing code. The propensity for a file to be reused varies greatly among languages and between source code and binary files, consistently decreasing over time. Files introduced by popular projects are more likely to be reused, but at least half of reused resources originate from “small” and “medium” projects. Developers had various reasons for reuse but were generally positive about using a package manager.
SP  - NA
EP  - NA
JF  - ACM Transactions on Software Engineering and Methodology
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3715907
ER  -

TY  - JOUR
AU  - Kovács, Adrián; Van Looy, Bart; Cassiman, Bruno
TI  - Exploring the scope of open innovation: a bibliometric review of a decade of research
PY  - 2015
AB  - NA
SP  - 951
EP  - 983
JF  - Scientometrics
VL  - 104
IS  - 3
PB  -
DO  - 10.1007/s11192-015-1628-0
ER  -

TY  - NA
AU  - Overney, Cassandra; Meinicke, Jens; Kästner, Christian; Vasilescu, Bogdan
TI  - ICSE - How to not get rich: an empirical study of donations in open source
PY  - 2020
AB  - Open source is ubiquitous and many projects act as critical infrastructure, yet funding and sustaining the whole ecosystem is challenging. While there are many different funding models for open source and concerted efforts through foundations, donation platforms like PayPal, Patreon, and OpenCollective are popular and low-bar platforms to raise funds for open-source development. With a mixed-method study, we investigate the emerging and largely unexplored phenomenon of donations in open source. Specifically, we quantify how commonly open-source projects ask for donations, statistically model characteristics of projects that ask for and receive donations, analyze for what the requested funds are needed and used, and assess whether the received donations achieve the intended outcomes. We find 25,885 projects asking for donations on GitHub, often to support engineering activities; however, we also find no clear evidence that donations influence the activity level of a project. In fact, we find that donations are used in a multitude of ways, raising new research questions about effective funding.
SP  - 1209
EP  - 1221
JF  - Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3377811.3380410
ER  -

TY  - JOUR
AU  - Rose, Jeremy; Furneaux, Brent
TI  - Innovation Drivers and Outputs for Software Firms
PY  - 2016
AB  - Software innovation, the ability to produce novel and useful software systems, is an important capability for software development organizations and information system developers alike. However, the software development literature has traditionally focused on automation and efficiency while the innovation literature has given relatively little consideration to the software development context. As a result, there is a gap in our understanding of how software product and process innovation can be managed. Specifically, little attention has been directed toward synthesizing prior learning or providing an integrative perspective on the key concepts and focus of software innovation research. We therefore identify 93 journal articles and conference papers within the domain of software innovation and analyse repeating patterns in this literature using content analysis and causal mapping. We identify drivers and outputs for software innovation and develop an integrated theory-oriented concept map. We then discuss the implications of this map for future research.
SP  - 1
EP  - 25
JF  - Advances in Software Engineering
VL  - 2016
IS  - NA
PB  -
DO  - 10.1155/2016/5126069
ER  -

TY  - JOUR
AU  - Chauhan, Aarjav; Sarkar, Dipto; Agrawaal, Taneea S.; Soden, Robert
TI  - Value Tensions in OpenStreetMap: Openness, Membership, and Policy in Online Communities
PY  - 2024
AB  - The social life and long-term trajectories of online peer production communities are shaped and animated in part by value tensions that arise when distributed, heterogeneous participants are brought together into collaboration. This study of OpenStreetMap (OSM) draws upon values-based approaches to investigate how peer production communities enact their values and navigate tensions between them. We examine how conflicts within the community over the rise of corporate participation in OSM provided a stage for the articulation and enactment of community values, shedding light on the broader dynamics and trajectory of the platform and its participants. The contributions of this work include reflections on how increasing corporate participation in OSM intersects with discourses about the emancipatory potential of emerging mapping technologies, insights into the challenges of scaling membership in peer production communities, and exploring the role of values in understanding the social life and governance of online communities.
SP  - 1
EP  - 25
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 8
IS  - CSCW2
PB  -
DO  - 10.1145/3686919
ER  -

TY  - JOUR
AU  - Gamalielsson, Jonas; Lundell, Björn
TI  - On Engagement With ICT Standards and Their Implementations in Open Source Software Projects: Experiences and Insights From the Multimedia Field
PY  - 2021
AB  - This paper presents novel results concerning engagement with ICT standards and their implementations in open source software (OSS). Specifically, findings draw from observations and analysis related to standards and implementations in the multimedia field. The first part of the study reports on experiences and insights from engagement with standards in the multimedia field and from implementation of such standards in OSS projects. The second part of the study focuses on the case of the ITU-T H.264 standard and the two OSS projects OpenH264 and x264 that both implement the standard, and reports on a characterisation of organisations that engage with and control the H.264 standard, and organisations that engage with and control OSS projects implementing the H.264 standard. Further, projects for standardisation and implementation of H.264 are contrasted with respect to mix of contributing organisations, and findings are related to organisational strategies of contributing organisations and previous research.
SP  - 1
EP  - 28
JF  - International Journal of Standardization Research
VL  - 19
IS  - 1
PB  -
DO  - 10.4018/ijsr.287102
ER  -

TY  - JOUR
AU  - Jemine, Grégory; Dubois, Christophe; Pichault, François
TI  - When the Gallic Village Strikes Back: The Politics Behind ‘New Ways of Working’ Projects*
PY  - 2020
AB  - In the last decade, the interest of managers and professionals for New Ways of Working (NWoW) has grown rapidly, as evidenced by multiple firms claiming to implement ‘NWoW workspaces’ in Be...
SP  - 146
EP  - 170
JF  - Journal of Change Management
VL  - 20
IS  - 2
PB  -
DO  - 10.1080/14697017.2020.1720777
ER  -

TY  - NA
AU  - Staal, Jens; Driege, Yasmine; Borghi, Alice; Hulpiau, Paco; Lievens, Laurens; Gul, Ismail Sahin; Sundararaman, Srividhya; Gonçalves, Amanda; Dhondt, Ineke; Braeckman, Bart P.; Technau, Ulrich; Saeys, Yvan; van Roy, Frans; Beyaert, Rudi
TI  - The CARD-CC/Bcl10/paracaspase signaling complex is functionally conserved since the last common ancestor of Planulozoa
PY  - 2016
AB  - Type 1 paracaspases originated in the Ediacaran geological period before the last common ancestor of bilaterans and cnidarians (planulozoa). Cnidarians have several paralog type 1 paracaspases, type 2 paracaspases, and a homolog of Bcl10. Notably in bilaterans, lineages like nematodes and insects lack Bcl10 whereas other lineages such as vertebrates, hemichordates, annelids and mollusks do contain Bcl10. A survey of invertebrate CARD-coiled-coil (CC) domain homologs of CARMA/CARD9 revealed such homologs only in species with Bcl10, indicating an ancient co-evolution of the entire CARD-CC/Bcl10/MALT1-like paracaspase (CBM) complex. Furthermore, vertebrate-like Syk/Zap70 tyrosine kinase homologs with the ITAM-binding SH2 domain were found in invertebrate organisms with CARD-CC/Bcl10, indicating that this pathway might be the original user of the CBM complex. We also established that the downstream signaling proteins TRAF2 and TRAF6 are functionally conserved in cnidaria. There also seems to be a correlation where invertebrates with CARD-CC and Bcl10 have type 1 paracaspases which are more similar to the paracaspases found in vertebrates. A proposed evolutionary scenario includes at least two ancestral type 1 paracaspase paralogs in the planulozoan last common ancestor, where at least one paralog usually is dependent on CARD-CC/Bcl10 for its function. Functional analyses of invertebrate type 1 paracaspases and Bcl10 homologs support this scenario and indicate an ancient origin of the CARD-CC/Bcl10/paracaspase signaling complex. Results from cnidaria, nematodes and mice also suggest an ancient neuronal role for the type 1 paracaspases.
SP  - 046789
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.1101/046789
ER  -

TY  - NA
AU  - Imtiaz, Ahsan; Shehzad, Danish; Nasim, Fawad; Afzaal, Muhammad; Rehman, Muhammad; Imran, Ali
TI  - Analysis of Cybersecurity Measures for Detection, Prevention, and Misbehaviour of Social Systems
PY  - 2023
AB  - The rapid proliferation of digital financial products has given rise to profound challenges in safeguarding consumer interests, notably concerning fraud and scams. These issues, which carry the potential to erode trust in digital services, are especially pronounced in developing nations. Consequently, preventing victimization has emerged as a paramount policy imperative. Notably, recent revelations have illustrated the misuse of Large Language Models (LLMs) for fraudulent activities, impersonation, and the generation of malicious software. Concurrently, other researchers have delved into the broader issue of AI alignment. This underscores the imperative for both developers and practitioners to remain cognizant of the security-related challenges posed by such models. The detection of online sexual predatory behaviors and the combatting of abusive language on social media platforms have assumed paramount importance in contemporary research. This concern is driven by mounting apprehensions surrounding online safety, particularly for vulnerable segments of the population, including children and adolescents. Researchers have been diligently exploring diverse techniques and approaches to devise effective detection systems capable of identifying and mitigating these inherent risks. This paper conducts a comprehensive analysis of prevalent challenges spanning various global sectors and assesses their far-reaching consequences. Additionally, it undertakes an evaluation of the associated hurdles while identifying optimal strategies to address these multifaceted challenges.
SP  - 1
EP  - 7
JF  - 2023 Tenth International Conference on Social Networks Analysis, Management and Security (SNAMS)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/snams60348.2023.10375405
ER  -

TY  - NA
AU  - Hucka, Michael; Graham, Matthew J.
TI  - Software search is not a science, even among scientists
PY  - 2016
AB  - When they seek software for a task, how do people go about finding it? Past research found that searching the Web, asking colleagues, and reading papers have been the predominant approaches---but is it still true today, given the popularity of Facebook, Stack Overflow, GitHub, and similar sites? In addition, when users do look for software, what criteria do they use? And finally, if resources such as improved software catalogs were to be developed, what kind of information would people want in them? These questions motivated our cross-sectional survey of scientists and engineers. We sought to understand the practices and experiences of people looking for ready-to-run software as well as people looking for source code. The results show that even in our highly educated sample of people, the relatively unsophisticated approaches of relying on general Web searches, the opinions of colleagues, and the literature remain the most popular approaches overall. However, software developers are more likely than non-developers to search in community sites such as Stack Overflow and GitHub, even when seeking ready-to-run software rather than source code. We also found that when searching for source code, poor documentation was the most common reason for being unable to reuse the code found. Our results also reveal a variety of characteristics that matter to people searching for software, and thus can inform the development of future resources to help people find software more effectively.
SP  - NA
EP  - NA
JF  - arXiv: Computers and Society
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Daniel, Sherae L.; Stewart, Katherine J.
TI  - Open source project success: Resource access, flow, and integration
PY  - 2016
AB  - NA
SP  - 159
EP  - 176
JF  - The Journal of Strategic Information Systems
VL  - 25
IS  - 3
PB  -
DO  - 10.1016/j.jsis.2016.02.006
ER  -

TY  - CHAP
AU  - Stefi, Anisa; Hess, Thomas
TI  - ICSOB - To Develop or to Reuse? Two Perspectives on External Reuse in Software Projects
PY  - 2015
AB  - Using existing software components is a key factor when it comes to increasing productivity and improving the quality of software. It can be regarded as a means to manage the increasing complexity of software, as software has become prevalent in most areas of our life. Thus, this study seeks to better understand the reuse of external software components. Based on two different theoretical lenses, non-rational effects on decision-making and the transaction cost theory, we analyze the degree of external reuse in software development projects. We tested our theoretical model empirically, with data collected in Germany. The empirical evidence is generally supportive of the theory with some exceptions. We find out that the not-invented-here bias plays the most important role in this strategic decision. Whereas, transaction cost constructs show mixed results. For example, technical uncertainty does not play a role, whereas business uncertainty positively influences the degree of external reuse.
SP  - 192
EP  - 206
JF  - Lecture Notes in Business Information Processing
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-19593-3_18
ER  -

TY  - BOOK
AU  - Zamansky, Anna; Reinhartz-Berger, Iris
TI  - SCME-iStarT@ER - Visualizing Code Variabilities for Supporting Reuse Decisions.
PY  - 2017
AB  - NA
SP  - 25
EP  - 34
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Yin, Ying; Zhao, Yuhai; Sun, Yiming; Chen, Chen
TI  - Automatic Code Review by Learning the Structure Information of Code Graph.
PY  - 2023
AB  - At present, the explosive growth of software code volume and quantity makes the code review process very labor-intensive and time-consuming. An automated code review model can assist in improving the efficiency of the process. Tufano et al. designed two automated tasks to help improve the efficiency of code review based on the deep learning approach, from two different perspectives, namely, the developer submitting the code and the code reviewer. However, they only used code sequence information and did not explore the logical structure information with a richer meaning of the code. To improve the learning of code structure information, a program dependency graph serialization algorithm, PDG2Seq, is proposed, which converts the program dependency graph into a unique graph code sequence in a lossless manner, while retaining the program structure information and semantic information. We then designed an automated code review model based on the pre-trained model CodeBERT architecture, which strengthens the learning of code information by fusing program structure information and code sequence information, and then fine-tuned the model according to the code review activity scene to complete the automatic modification of the code. To verify the efficiency of the algorithm, the two tasks in the experiment were compared with the best Algorithm 1-encoder/2-encoder. The experimental results show that the model we proposed has a significant improvement under the BLEU, Levenshtein distance and ROUGE-L metrics.
SP  - 2551
EP  - 2551
JF  - Sensors (Basel, Switzerland)
VL  - 23
IS  - 5
PB  -
DO  - 10.3390/s23052551
ER  -

TY  - JOUR
AU  - Zhou, Jiayuan; Wang, Shaowei; Kamei, Yasutaka; Hassan, Ahmed E.; Ubayashi, Naoyasu
TI  - Studying donations and their expenses in open source projects: a case study of GitHub projects collecting donations through open collectives
PY  - 2021
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 27
IS  - 1
PB  -
DO  - 10.1007/s10664-021-10060-y
ER  -

TY  - JOUR
AU  - Menéndez-Caravaca, Eloísa; Bueno, Salvador; Gallego, M. Dolores
TI  - Exploring the link between free and open source software and the collaborative economy: A Delphi-based scenario for the year 2025
PY  - 2021
AB  - Despite the growth experienced by the Collaborative Economy in recent years, there are still unexplored gaps within this phenomenon. One of the areas of study with scarce literature is linked with the impact of the Information and Communication Technologies based on collaborative environments, such as Free and Open Source Software, on the spread of the Collaborative Economy. Some questions are raised, such as: (1) To what extent do organizations linked with Collaborative Economy make use of Free and Open Source Software?, (2) What are the incentives that motivate the implementation of Free and Open Source Software in Collaborative Economy companies?, (3) What use do Collaborative Economy companies give to Free and Open Source Software?, and (4) Is there a greater use of Free and Open Source Software expected for the coming years among these organizations? To answer these questions, a study based on the Delphi method has been designed. To this end, a panel of 15 high-level experts in the field was formed. From the consensus of the experts, a significant role for Free and Open Source Software in the different collaborative components and industries is evident, with the current levels practically being maintained by the year 2025.
SP  - 121087
EP  - NA
JF  - Technological Forecasting and Social Change
VL  - 173
IS  - NA
PB  -
DO  - 10.1016/j.techfore.2021.121087
ER  -

TY  - NA
AU  - Balali, Sogol; Annamalai, Umayal; Padala, Hema Susmita; Trinkenreich, Bianca; Gerosa, Marco Aurélio; Steinmacher, Igor; Sarma, Anita
TI  - OpenSym - Recommending Tasks to Newcomers in OSS Projects: How Do Mentors Handle It?
PY  - 2020
AB  - Software developers who want to start contributing to an Open Source Software (OSS) project often struggle to find appropriate first tasks. The voluntary, self-organizing distribution of decentralized labor and the distinct nature of some OSS projects intensifies this challenge. Mentors, who work closely with newcomers, develop strategies to recommend tasks. However, to date neither the challenges mentors face in recommending tasks nor their strategies have been formally documented or studied. In this paper, we interviewed mentors of well-established OSS projects (n=10) and qualitatively analyzed their answers to identify both challenges and strategies related to recommending tasks for newcomers. Then, we employed a survey (n=30) to map the strategies to challenges and collect additional strategies. Our study identified 7 challenges and 13 strategies related to task recommendation. Strategies such as "tagging the issues based on difficulty," "adding documentation," "assigning a small task first and then challenge the newcomers with bigger tasks," and "dividing tasks into smaller pieces" were frequently mentioned as ways to overcome multiple challenges. Our results provide insights for mentors about the strategies OSS communities can use to guide their mentors and for tool builders who design automated support for task assignment.
SP  - 1
EP  - 14
JF  - Proceedings of the 16th International Symposium on Open Collaboration
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3412569.3412571
ER  -

TY  - JOUR
AU  - Decan, Alexandre; Constantinou, Eleni; Mens, Tom; Rocha, Henrique
TI  - GAP: Forecasting commit activity in git projects
PY  - 2020
AB  - NA
SP  - 110573
EP  - NA
JF  - Journal of Systems and Software
VL  - 165
IS  - NA
PB  -
DO  - 10.1016/j.jss.2020.110573
ER  -

TY  - JOUR
AU  - Khadpe, Pranav; Xu, Olivia; Kaufman, Geoff; Kulkarni, Chinmay
TI  - Hug Reports: Supporting Expression of Appreciation between Users and Contributors of Open Source Software Packages
PY  - 2025
AB  - Contributors to open source software packages often describe feeling discouraged by the lack of positive feedback from users. This paper describes a technology probe, Hug Reports, that provides users a communication affordance within their code editors, through which users can convey appreciation to contributors of packages they use. In our field study, 18 users interacted with the probe for 3 weeks, resulting in messages of appreciation to 550 contributors, 26 of whom participated in subsequent research. Our findings show how locating a communication affordance within the code editor, and allowing users to express appreciation in terms of the abstractions they are exposed to (packages, modules, functions), can support exchanges of appreciation that are meaningful to users and contributors. Findings also revealed the moments in which users expressed appreciation, the two meanings that appreciation took on -- as a measure of utility and as an act of expressive communication -- and how contributors' reactions to appreciation were influenced by their perceived level of contribution. Based on these findings, we discuss opportunities and challenges for designing appreciation systems for open source in particular, and peer production communities more generally.
SP  - 1
EP  - 32
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 9
IS  - 2
PB  -
DO  - 10.1145/3710997
ER  -

TY  - JOUR
AU  - Chrysos, Paris
TI  - Empathy in the business model: how Facebook and Google Maps manage external problem-solving processes
PY  - 2018
AB  - This paper shows how leading Internet enterprises manage problem-solving processes occurring on their interfaces through the use of empathy. The data of the developer support forums of Facebook and Google Maps reveal a particularly low problem solving rate (less than 15% of problems solved over a period of six months). To explain this phenomenon a generic construct for business models is proposed on the basis of empirical examination of the problem-solving process followed in those forums, rendering compatible the notion of empathy with well-known value adding activities.
SP  - NA
EP  - NA
JF  - SSRN Electronic Journal
VL  - NA
IS  - NA
PB  -
DO  - 10.2139/ssrn.3266866
ER  -

TY  - JOUR
AU  - Linåker, Johan; Regnell, Björn; Damian, Daniela
TI  - A Community Strategy Framework – How to obtain influence on requirements in meritocratic open source software communities?
PY  - 2019
AB  - NA
SP  - 102
EP  - 114
JF  - Information and Software Technology
VL  - 112
IS  - 112
PB  -
DO  - 10.1016/j.infsof.2019.04.010
ER  -

TY  - JOUR
AU  - Tabash, Mosab I.; Kumar, Ashish; Sharma, Shikha; Vashistha, Ritu; El Refae, Ghaleb A.
TI  - International journal of organizational analysis: a bibliometric review (2005–2020)
PY  - 2022
AB  - Purpose: The International Journal of Organizational Analysis (IJOA) is a leading journal that has published high-quality research focused on various facets of organizational analysis since 1993. This paper aims to conduct a retrospective analysis of the IJOA journey from 2005 to 2020. Design/methodology/approach: The data used in this study was extracted using the Scopus database. The bibliometric analysis, using several indicators, is adopted to reveal the major trends and themes of the journal. The mapping of bibliographic data is carried using VOSviewer and Biblioshiny. Findings: The study findings indicate that IJOA has grown for publications and citations since its inception. Five significant research directions emerged, i.e. organizational diagnostics, organization citizenship behaviour, organizational commitment to employee retention, psychological capital and firm performance, based on cluster analysis of IJOA’s publications. Originality/value: To the best of the authors’ knowledge, this is the first study to conduct a comprehensive bibliometric analysis of IJOA. The study presents the key themes and trends emerging from a leading journal, considered a high-quality journal, for researching various facets of organizational functioning by academicians, scholars and practitioners.
SP  - 2141
EP  - 2182
JF  - International Journal of Organizational Analysis
VL  - 31
IS  - 6
PB  -
DO  - 10.1108/ijoa-10-2021-2990
ER  -

TY  - NA
AU  - Schwittek, Widura; Eicker, Stefan
TI  - CBSE - A study on third party component reuse in Java enterprise open source software
PY  - 2013
AB  - Recent studies give empirical evidence that much of today's software is to a large extent built on preexisting software, such as commercial-off-the-shelf (COTS) and open source software components. In this exploratory study we want to contribute to this small but increasing body of knowledge by investigating third party component reuse in 36 Java web applications that are open source and are meant to be used in an enterprise context. Our goal is to get a better understanding on how third party components are reused in web applications and how to better support it. The results are in line with existing research in this field. 70 third party components are being reused on average. 50 percent of the 40 most reused third party components are maintained by the Apache Foundation. Further research questions based on the study results were generated and are presented at the end of this paper.
SP  - 75
EP  - 80
JF  - Proceedings of the 16th International ACM Sigsoft symposium on Component-based software engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/2465449.2465468
ER  -

TY  - CHAP
AU  - Truman, Barbara
TI  - Fostering Collaborative Open Simulation for Next-Gen Enterprise Learning Ecosystems
PY  - 2019
AB  - Leaders seek models of healthy ecosystems to better foster systemic, predictable performance improvement for their learning enterprises. Ecosystems may be viewed narrowly, involving information technology architecture, content, and standards for interoperability or expansively where stakeholders connect to seek next-generation, transdisciplinary learning opportunities across society. Ecosystem stewardship is a responsibility of community/societal leaders and citizens who must collaborate to shape and harness forces and drivers of emerging technology. Mass collaboration is needed to push open simulation into an enterprise capability that monitors and models what was, what is, and can be. This chapter frames academic need and United States military use of open simulation suitable for exploring new ways to steward ecosystem wealth in the interest of learning enterprises and beyond.
SP  - 255
EP  - 280
JF  - Advances in Educational Technologies and Instructional Design
VL  - NA
IS  - NA
PB  -
DO  - 10.4018/978-1-5225-9679-0.ch013
ER  -

TY  - JOUR
AU  - Viseur, Robert; Charleux, Amel
TI  - Changement de gouvernance et communautés open source : le cas du logiciel Claroline
PY  - 2019
AB  - Since the mid-1990s, free and open source software has gradually entered the economic sphere and driven the development of large online communities. Although these communities have been widely studied, their coexistence with software publishers remains poorly understood. More specifically, the role these communities can play beyond their technical contribution is under-studied. In our research, we question the relationship between a publisher and its community, taking as the unit of analysis the community's reaction to strategic governance changes initiated by the publisher. To achieve our objective, a longitudinal study was required. Claroline is a popular open source Learning Management System project from which several other open source projects are derived, directly or indirectly. Given its diffusion, the evolution of its governance, its forks, and its resilience, Claroline offered an ideal research setting for understanding community dynamics. Finally, our results show that free and open source communities can be a source of resistance and can hinder the implementation of certain strategic decisions, notably those related to modes of governance. JEL codes: L17
SP  - 71
EP  - 104
JF  - Innovations
VL  - 58
IS  - 1
PB  -
DO  - 10.3917/inno.058.0071
ER  -

TY  - NA
AU  - Cui, Xing; Wu, Jingzheng; Wu, Yanjun; Wang, Xu; Luo, Tianyue; Qu, Sheng; Ling, Xiang; Yang, Mutian
TI  - An Empirical Study of License Conflict in Free and Open Source Software
PY  - 2023
AB  - Free and Open Source Software (FOSS) has become the fundamental infrastructure of mainstream software projects. FOSS is subject to various legal terms and restrictions, depending on the type of open source license in force. Hence it is important to remain compliant with the FOSS license terms. Identifying the licenses that provide FOSS and understanding the terms of those licenses is not easy, especially when dealing with a large amount of reuse that is common in modern software development. Since reused software is often large, automated license analysis is needed to address these issues and support users in license compliant reuse of FOSS. However, existing license assessment tools can only identify the name and quantity of licenses embedded in software and thus cannot identify whether the licenses are being used safely and correctly. Moreover, they cannot provide a comprehensive analysis of the compatibility and potential risk that come with the term conflicts. In this paper, we propose DIKE, an automated tool that can perform license detection and conflict analysis for FOSS. First, DIKE extracts 12 terms under 3,256 unique open source licenses by manual analysis and Natural Language Processing (NLP) and constructs a license knowledge base containing the responsibilities of the terms. Second, DIKE scans all licenses from the code snippet for the input software and outputs the scan results in a tree structure. Third, the scan results match the license knowledge base to detect license conflicts from terms and conditions. DIKE designs two solutions for software with license conflicts: license replacement and code replacement. To demonstrate the effectiveness of DIKE, we first evaluate with the term extraction and responsibility classification, and the results show that their F1-scores reach 0.816 and 0.948, respectively. In addition, we conduct a measurement study of 16,341 popular projects from GitHub based on our proposed DIKE to explore the conflict of license usage in FOSS. The results show that 1,787 open source licenses are used in the project, and 27.2% of licenses conflict. Our new findings suggest that conflicts are prevalent in FOSS, warning the open source community about intellectual property risks.
SP  - 495
EP  - 505
JF  - 2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icse-seip58684.2023.00050
ER  -

TY  - NA
AU  - Link, Georg J.P.; Jeske, Debora
TI  - OpenSym - Understanding Organization and Open Source Community Relations through the Attraction-Selection-Attrition Model
PY  - 2017
AB  - Organizations increasingly engage with open source communities. Extant research identified the benefits to organizations for engaging with open source and documented how open source communities operate to accommodate organizational engagement. The complexities involved in what attracts organizations to specific communities, how they choose to engage, and how subsequently the organizational-communal engagement shapes the community and organization are not yet well understood. In this paper, we explore how the Attraction-Selection-Attrition Model supports the study of how communities attract, retain, and lose members, and how these aspects relate to organizational-communal engagement between organizations and open source communities. This conceptual paper provides an introduction to the ASA model, having briefly outlined the lack of research connecting ASA and open source communities. Following this, the paper outlines how existing research related to the ASA model may be effectively related to existing open source research, resulting in several questions for future research.
SP  - 17
EP  - 8
JF  - Proceedings of the 13th International Symposium on Open Collaboration
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3125433.3125472
ER  -

TY  - NA
AU  - Liu, Tao; Liu, Chengwei; Liu, Tianwei; Wang, He; Wu, Gaofei; Liu, Yang; Zhang, Yuqing
TI  - Catch the Butterfly: Peeking into the Terms and Conflicts Among SPDX Licenses
PY  - 2024
AB  - NA
SP  - 477
EP  - 488
JF  - 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/saner60148.2024.00056
ER  -

TY  - NA
AU  - Linåker, Johan; Runeson, Per
TI  - OpenSym - Public Sector Platforms going Open: Creating and Growing an Ecosystem with Open Collaborative Development
PY  - 2020
AB  - Background: By creating ecosystems around platforms of Open Source Software (OSS) and Open Data (OD), and adopting open collaborative development practices, platform providers may exploit open innovation benefits. However, adopting such practices in a traditionally closed organization is a maturity process that we hypothesize cannot be undergone without friction. Objective: This study aims to investigate what challenges may occur for a newly-turned platform provider in the public sector, aiming to adopt open collaborative practices to create an ecosystem around the development of the underpinning platform. Method: An exploratory case-study is conducted at a Swedish public sector platform provider, which is creating an ecosystem around OSS and OD, related to the labor market. Data is collected through interviews, document studies, and prolonged engagement. Results: Findings highlight a fear among developers of being publicly questioned for their work, as they represent a government agency undergoing constant scrutiny. Issue trackers, roadmaps, and development processes are generally closed, while multiple channels are used for communication, causing internal and external confusion. Some developers are reluctant to communicate externally as they believe it interferes with their work. Lack of health metrics limits possibilities to follow ecosystem growth and for actors to make investment decisions. Further, an autonomous team structure is reported to complicate internal communication and enforcement of the common vision, as well as collaboration. A set of interventions for addressing the challenges are proposed, based on related work. Conclusions: We conclude that several cultural, organizational, and process-related challenges may reside, and by understanding these early on, platform providers can be preemptive in their work of building healthy ecosystems.
SP  - 10
EP  - 10
JF  - Proceedings of the 16th International Symposium on Open Collaboration
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3412569.3412572
ER  -

TY  - JOUR
AU  - Samaana, Haya; Costa, Diego Elias; Abdellatif, Ahmad; Shihab, Emad
TI  - Opportunities and security risks of technical leverage: A replication study on the NPM ecosystem
PY  - 2025
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 30
IS  - 4
PB  -
DO  - 10.1007/s10664-025-10648-8
ER  -

TY  - NA
AU  - Stoudt, Sara; Jernite, Yacine; Marshall, Brandeis; Marwick, Ben; Sharan, Malvika; Whitaker, Kirstie; Danchev, Valentin
TI  - Ten simple rules for building and maintaining a responsible data science workflow.
PY  - 2024
AB  - Contributors and beneficiaries of data-intensive research have become increasingly concerned about social and ethical risks from data science and machine learning applications [1][2][3][4][5][6]. Instances of unethical use of technology and harms caused to vulnerable communities have made it even more urgent for researchers to broaden the considerations of ethics and societal impact in their research. There has been a proliferation of ethical guidelines [7-10], checklists for responsible research [11,12], and teaching materials [13] encouraging the application of good research practices in all areas of data science research, including machine learning (ML), artificial intelligence (AI), and natural language processing (NLP). While encouraging, there is also a risk that ethical considerations from guidelines and checklists may be added to a project as an afterthought unless such considerations are incorporated into the research process from the onset so that data science can be performed responsibly by design (in a similar vein as advocated for by Open Science by Design [14]). To help enable this goal of incorporating ethics through the entire research process, we outline 10 simple rules of a responsible data science workflow. A responsible data science workflow scaffolds practices and processes of ethical research, defined by the European Commission as "an approach that anticipates and assesses potential implications and societal expectations with regard to research and innovation, with the aim to foster the design of inclusive and sustainable research and innovation" [15]. We stress that this approach should be considered at each stage of the data science lifecycle [6,16,17]-ranging from team assembling and research design to data collection and evaluation, model building, model evaluation, and reporting. Data science projects often involve multiple teams and contributor groups, and hence, it is our ethical responsibility to embed practices for inclusive and collaborative research as well. A responsible data science workflow identifies and invites different stakeholders, possibly with different interests, expertise and access to resources [18], to participate in the workflow and provide feedback, especially those who are affected by data science research, including research subjects, collaborators, community members, and those from marginalized groups (see Fig 1). Historically, questions and considerations around research ethics have primarily focused on the issues of privacy, confidentiality, and rights of research participants (or data subjects) [19]. More recently, attention has also been placed on fairness and bias of prediction models and modelers' responsibility towards users and members from minorities and underrepresented groups as well as regulations concerning data use and privacy as well as explainability of outputs to those affected [4]. The movement towards openness and research transparency
SP  - e1012232
EP  - e1012232
JF  - PLoS computational biology
VL  - 20
IS  - 7
PB  -
DO  - 10.1371/journal.pcbi.1012232
ER  -

TY  - NA
AU  - Vendome, Christopher; German, Daniel M.; Di Penta, Massimiliano; Bavota, Gabriele; Linares-Vasquez, Mario; Poshyvanyk, Denys
TI  - ICSE - To distribute or not to distribute?: why licensing bugs matter
PY  - 2018
AB  - Software licenses dictate how source code or binaries can be modified, reused, and redistributed. In the case of open source projects, software licenses generally fit into two main categories, permissive and restrictive, depending on the degree to which they allow redistribution or modification under licenses different from the original one(s). Developers and organizations can also modify existing licenses, creating custom licenses with specific permissive/restrictive terms. Having such a variety of software licenses can create confusion among software developers, and can easily result in the introduction of licensing bugs, not necessarily limited to well-known license incompatibilities. In this work, we report a study aimed at characterizing licensing bugs by (i) building a catalog categorizing the types of licensing bugs developers and other stakeholders face, and (ii) understanding the implications licensing bugs have on the software projects they affect. The presented study is the result of the manual analysis of 1,200 discussions related to licensing bugs carried out in issue trackers and in five legal mailing lists of open source communities. Our findings uncover new types of licensing bugs not addressed in prior literature, and a detailed assessment of their implications.
SP  - 268
EP  - 279
JF  - Proceedings of the 40th International Conference on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3180155.3180221
ER  -

TY  - NA
AU  - Linåker, Johan; Link, Georg; Lumbard, Kevin
TI  - Sustaining Maintenance Labor for Healthy Open Source Software Projects through Human Infrastructure: A Maintainer Perspective
PY  - 2024
AB  - NA
SP  - 37
EP  - 48
JF  - Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3674805.3686667
ER  -

TY  - NA
AU  - Yin, Likang; Chen, Zhuangzhi; Xuan, Qi; Filkov, Vladimir
TI  - ESEC/SIGSOFT FSE - Sustainability forecasting for Apache incubator projects
PY  - 2021
AB  - Although OSS development is very popular, ultimately more than 80% of OSS projects fail. Identifying the factors associated with OSS success can help in devising interventions when a project takes a downturn. OSS success has been studied from a variety of angles, more recently in empirical studies of large numbers of diverse projects, using proxies for sustainability, e.g., internal metrics related to productivity and external ones, related to community popularity. The internal socio-technical structure of projects has also been shown important, especially their dynamics. This points to another angle on evaluating software success, from the perspective of self-sustaining and self-governing communities. To uncover the dynamics of how a project at a nascent development stage gradually evolves into a sustainable one, here we apply a socio-technical network modeling perspective to a dataset of Apache Software Foundation Incubator (ASFI), sustainability-labeled projects. To identify and validate the determinants of sustainability, we undertake a mix of quantitative and qualitative studies of ASFI projects’ socio-technical network trajectories. We develop interpretable models which can forecast a project becoming sustainable with 93+% accuracy, within 8 months of incubation start. Based on the interpretable models we describe a strategy for real-time monitoring and suggesting actions, which can be used by projects to correct their sustainability trajectories.
SP  - 1056
EP  - 1067
JF  - Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3468264.3468563
ER  -

TY  - JOUR
AU  - Foundjem, Armstrong; Constantinou, Eleni; Mens, Tom; Adams, Bram
TI  - A mixed-methods analysis of micro-collaborative coding practices in OpenStack.
PY  - 2022
AB  - Technical collaboration between multiple contributors is a natural phenomenon in distributed open source software development projects. Macro-collaboration, where each code commit is attributed to a single collaborator, has been extensively studied in the research literature. This is much less the case for so-called micro-collaboration practices, in which multiple authors contribute to the same commit. To support such practices, GitLab and GitHub started supporting social coding mechanisms such as the "Co-Authored-By:" trailers in commit messages, which, in turn, enable to empirically study such micro-collaboration. In order to understand the mechanisms, benefits and limitations of micro-collaboration, this article provides an exemplar case study of collaboration practices in the OpenStack ecosystem. Following a mixed-method research approach we provide qualitative evidence through a thematic and content analysis of semi-structured interviews with 16 OpenStack contributors. We contrast their perception with quantitative evidence gained by statistical analysis of the git commit histories (~1M commits) and Gerrit code review histories (~631K change sets and ~2M patch sets) of 1,804 OpenStack project repositories over a 9-year period. Our findings provide novel empirical insights to practitioners to promote micro-collaborative coding practices, and to academics to conduct further research towards understanding and automating the micro-collaboration process.
SP  - 120
EP  - NA
JF  - Empirical software engineering
VL  - 27
IS  - 5
PB  -
DO  - 10.1007/s10664-022-10167-w
ER  -

TY  - NA
AU  - Foundjem, Armstrong
TI  - ICSE (Companion Volume) - Release synchronization in software ecosystems
PY  - 2019
AB  - Software ecosystems bring value by integrating projects related to a given domain, for example, open source projects in a Linux distribution or mobile apps on the Android platform. However, the major challenge of managing an infrastructure ecosystem like OpenStack or Debian is to provide a polished, well-integrated product to the end user, since each individual project has its own release cycle and roadmap. To understand how modern ecosystems deal with this challenge, I empirically study the release synchronization strategy of the OpenStack ecosystem, in which a central release management team manages the six-month release cycle of the overall OpenStack product. By studying one year of release team IRC meeting logs, 9 major federated release management activities were identified, which were cataloged and documented. My findings suggest that even though an ecosystem's power lies in the interaction of autonomous projects, release synchronization is a non-trivial goal. Currently, I am performing interviews with key software developers within the OpenStack ecosystem, in order to understand the major release activities.
SP  - 135
EP  - 137
JF  - 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icse-companion.2019.00058
ER  -

TY  - JOUR
AU  - von Krogh, Georg; Rossi-Lamastra, Cristina; Haefliger, Stefan
TI  - Phenomenon-based Research in Management and Organisation Science: When is it Rigorous and Does it Matter?
PY  - 2012
AB  - Recently, the editors of Long Range Planning called for more phenomenon-based research. Such research focuses on identifying and reporting on new or recent phenomena of interest and relevance to management and organisation science. In this article, we explore the nature of phenomenon-based research and develop a research strategy that provides guidelines for researchers seeking to make this type of scientific inquiry rigorous and relevant. Phenomenon-based research establishes and describes the empirical facts and constructs that enable scientific inquiry to proceed. An account of the study of open source software development illustrates the research strategy. Rigorous phenomenon-based research tackles problems that are relevant to management practice and fall outside the scope of available theories. Phenomenon-based research also bridges epistemological and disciplinary divides because it unites diverse scholars around their shared interest in the phenomenon and their joint engagement in the research activities: identification, exploration, design, theorising and synthesis.
SP  - 277
EP  - 298
JF  - Long Range Planning
VL  - 45
IS  - 4
PB  -
DO  - 10.1016/j.lrp.2012.05.001
ER  -

TY  - NA
AU  - Sotiropoulos, Thodoris; Mitropoulos, Dimitris; Spinellis, Diomidis
TI  - ICSE - Practical fault detection in puppet programs
PY  - 2020
AB  - Puppet is a popular computer system configuration management tool. By providing abstractions that model system resources it allows administrators to set up computer systems in a reliable, predictable, and documented fashion. Its use suffers from two potential pitfalls. First, if ordering constraints are not correctly specified whenever a Puppet resource depends on another, the non-deterministic application of resources can lead to race conditions and consequent failures. Second, if a service is not tied to its resources (through the notification construct), the system may operate in a stale state whenever a resource gets modified. Such faults can degrade a computing infrastructure's availability and functionality. We have developed an approach that identifies these issues through the analysis of a Puppet program and its system call trace. Specifically, a formal model for traces allows us to capture the interactions of Puppet resources with the file system. By analyzing these interactions we identify (1) resources that are related to each other (e.g., operate on the same file), and (2) resources that should act as notifiers so that changes are correctly propagated. We then check the relationships from the trace's analysis against the program's dependency graph: a representation containing all the ordering constraints and notifications declared in the program. If a mismatch is detected, our system reports a potential fault. We have evaluated our method on a large set of popular Puppet modules, and discovered 92 previously unknown issues in 33 modules. Performance benchmarking shows that our approach can analyze in seconds real-world configurations with a magnitude measured in thousands of lines and millions of system calls.
SP  - 26
EP  - 37
JF  - Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3377811.3380384
ER  -

TY  - JOUR
AU  - Basir, Md. Samiul; Buckmaster, Dennis; Raturi, Ankita; Zhang, Yaguang
TI  - From pen and paper to digital precision: a comprehensive review of on-farm recordkeeping
PY  - 2024
AB  - In the present era of agricultural digitalization, documenting on-farm operations is critical. These records contextualize other layers of data and underpin economic analysis and informed decision-making. On-farm recordkeeping is rooted in an ancient tradition and has evolved from pen and paper to digital means integrating diverse tools and methods. These tools vary widely in mode of data recording and this presents challenges in achieving complete, accurate and interoperable data. Assessing this diversity of existing recordkeeping systems is a key step toward the improvement in recordkeeping systems that enhance data quality and interoperability. Despite the importance, as of present, comprehensive studies addressing this challenge are lacking. A systematic review of existing on-farm recordkeeping systems was carried out to address their advantages and weaknesses and to analyze their features and traits, focusing on interoperability and adherence to efficient and comprehensive on-farm recordkeeping. Paper-based recordkeeping, a longstanding and reliable method, is gradually being replaced by digital platforms. Many universities and agencies have released farm management spreadsheets and interactive database forms representing the initial step toward intuitive recordkeeping. Furthermore, farm management software, web apps, and user-friendly smartphone apps are increasingly crucial for handling agricultural big data. Notably, among the surveyed software packages and apps, most of them are not free and only a few support data interoperability. The survey also indicates a scope for further development in open-source tools with automation in recordkeeping. Adopting digital on-farm recordkeeping tools can positively impact both on and off the farm, fostering data interoperability, controlled yet flexible data access, completeness, and appropriate accuracy.
SP  - 2643
EP  - 2682
JF  - Precision Agriculture
VL  - 25
IS  - 5
PB  -
DO  - 10.1007/s11119-024-10172-7
ER  -

TY  - NA
AU  - Santos, Fabio; Wiese, Igor; Trinkenreich, Bianca; Steinmacher, Igor; Sarma, Anita; Gerosa, Marco Aurélio
TI  - MSR - Can I Solve It? Identifying APIs Required to Complete OSS Tasks
PY  - 2021
AB  - Open Source Software projects add labels to open issues to help contributors choose tasks. However, manually labeling issues is time-consuming and error-prone. Current automatic approaches for creating labels are mostly limited to classifying issues as a bug/non-bug. In this paper, we investigate the feasibility and relevance of labeling issues with the domain of the APIs required to complete the tasks. We leverage the issues’ description and the project history to build prediction models, which resulted in precision up to 82% and recall up to 97.8%. We also ran a user study (n=74) to assess these labels’ relevancy to potential contributors. The results show that the labels were useful to participants in choosing tasks, and the API-domain labels were selected more often than the existing architecture-based labels. Our results can inspire the creation of tools to automatically label issues, helping developers to find tasks that better match their skills.
SP  - 346
EP  - 257
JF  - 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/msr52588.2021.00047
ER  -

TY  - JOUR
AU  - Lercher, Alexander; Glock, Johann; Macho, Christian; Pinzger, Martin
TI  - Microservice API Evolution in Practice: A Study on Strategies and Challenges
PY  - 2024
AB  - Nowadays, many companies design and develop their software systems as a set of loosely coupled microservices that communicate via their Application Programming Interfaces (APIs). While the loose coupling improves maintainability, scalability, and fault tolerance, it poses new challenges to the API evolution process. Related works identified communication and integration as major API evolution challenges but did not provide the underlying reasons and research directions to mitigate them. In this paper, we aim to identify microservice API evolution strategies and challenges in practice and gain a broader perspective of their relationships. We conducted 17 semi-structured interviews with developers, architects, and managers in 11 companies and analyzed the interviews with open coding used in grounded theory. In total, we identified six strategies and six challenges for REpresentational State Transfer (REST) and event-driven communication via message brokers. The strategies mainly focus on API backward compatibility, versioning, and close collaboration between teams. The challenges include change impact analysis efforts, ineffective communication of changes, and consumer reliance on outdated versions, leading to API design degradation. We defined two important problems in microservice API evolution resulting from the challenges and their coping strategies: tight organizational coupling and consumer lock-in. To mitigate these two problems, we propose automating the change impact analysis and investigating effective communication of changes as open research directions. Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
SP  - 112110
EP  - 112110
JF  - Journal of Systems and Software
VL  - 215
IS  - NA
PB  -
DO  - 10.1016/j.jss.2024.112110
ER  -

TY  - JOUR
AU  - Dong, John Qi; Wu, Weifang; Zhang, Yixin Sarah
TI  - The faster the better? Innovation speed and user interest in open source software
PY  - 2019
AB  - It is often believed that for open source software (OSS) projects the faster the release, the better for attracting user interest in the software. Whether this is true, however, is still open to question. There is considerable information asymmetry between OSS projects and potential users as project quality is unobservable to users. We suggest that innovation speed of OSS project can signal the unobservable project quality and attract users’ interest in downloading and using the software. We contextualize innovation speed of OSS projects as initial release speed and update speed and examine their impacts on user interest. Drawing on the signaling theory, we propose a signaling effect through which a higher initial release speed or update speed increases user interest, while the effect diminishes as initial release or update speed increases. Using a large-scale panel data set from 7442 OSS projects on SourceForge between 2007 and 2010, our results corroborate the inverted U-shaped relationships between initial release speed and user downloads and between update speed and user downloads.
SP  - 669
EP  - 680
JF  - Information & Management
VL  - 56
IS  - 5
PB  -
DO  - 10.1016/j.im.2018.11.002
ER  -

TY  - JOUR
AU  - Kapitsaki, Georgia M.; Kramer, Frederik; Tselikas, Nikolaos D.
TI  - Automating the license compatibility process in open source software with SPDX
PY  - 2017
AB  - NA
SP  - 386
EP  - 401
JF  - Journal of Systems and Software
VL  - 131
IS  - NA
PB  -
DO  - 10.1016/j.jss.2016.06.064
ER  -

TY  - NA
AU  - Chakraborti, Mahasweta; Bonagiri, Sailendra Akash; Virgüez-Ruiz, Santiago; Frey, Seth
TI  - NLP4Gov: A Comprehensive Library for Computational Policy Analysis
PY  - 2024
AB  - Formal rules and policies are fundamental in formally specifying a social system: its operation, boundaries, processes, and even ontology. Recent scholarship has highlighted the role of formal policy in collective knowledge creation, game communities, the production of digital public goods, and national social media governance. Researchers have shown interest in how online communities convene tenable self-governance mechanisms to regulate member activities and distribute rights and privileges by designating responsibilities, roles, and hierarchies. We present NLP4Gov, an interactive kit to train and aid scholars and practitioners alike in computational policy analysis. The library explores and integrates methods and capabilities from computational linguistics and NLP to generate semantic and symbolic representations of community policies from text records. Versatile, documented, and accessible, NLP4Gov provides granular and comparative views into institutional structures and interactions, along with other information extraction capabilities for downstream analysis.
SP  - 1
EP  - 8
JF  - Extended Abstracts of the CHI Conference on Human Factors in Computing Systems
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3613905.3650810
ER  -

TY  - JOUR
AU  - Yurtsever, M. Mücahit Enes; Özcan, Muhammet; Taruz, Zübeyir; Eken, Süleyman; Sayar, Ahmet
TI  - Figure search by text in large scale digital document collections
PY  - 2021
AB  - Digital document collections have been created with the transfer of a large number of documents to digital media. These digital archives have provided many benefits to users. As the diversity and size of digital image collections have grown exponentially, it has become increasingly important and difficult to obtain the desired image from them. The images on the document might contain critical information about the subject of it. In this study, an architecture is developed that can work on large-scale data by creating regular expressions together with full-text search approaches. The performance of the system has been tested on different academic documents and Elasticsearch and Apache Solr insert times are compared. Compared to Elasticsearch, Apache Solr achieved faster and more successful results.
SP  - NA
EP  - NA
JF  - Concurrency and Computation: Practice and Experience
VL  - 34
IS  - 1
PB  -
DO  - 10.1002/cpe.6529
ER  -

TY  - NA
AU  - Gamalielsson, Jonas; Lundell, Björn
TI  - OpenSym - On licensing and other conditions for contributing to widely used open source projects: an exploratory analysis
PY  - 2017
AB  - Open source software (OSS) projects are provided under different open source licenses and some projects use other conditions (in addition to licensing terms) for contributors to adhere to. Licensing terms and conditions may affect community involvement and contributions, and are perceived differently by different stakeholders in different OSS projects. The study reports from an exploratory analysis of licensing terms and other conditions for 200 widely used OSS projects, and an investigation of the relationship between licensing terms and other conditions for contributing. We find that strong copyleft licenses are most common and are used in the majority of the projects. Further, a clear majority of the OSS projects use no specific other condition for contributing in addition to the license terms. However, a clear majority of the OSS projects supported by foundations use other conditions for contributing in addition to the license terms. Finally, use of no specific other conditions in addition to the license terms is more common for projects using strong copyleft licensing compared to projects using non-copyleft licensing.
SP  - 1
EP  - 14
JF  - Proceedings of the 13th International Symposium on Open Collaboration
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3125433.3125456
ER  -

TY  - JOUR
AU  - Jarvenpaa, Sirkka L.; Välikangas, Liisa
TI  - Forking From the Future: How an Interorganizational Network Learned Its Way to New Software Business
PY  - 2024
AB  - Technological disruptions call for new capabilities beyond the reach of a single firm. How does a large interorganizational network learn its way to new Cloud software business while leveraging forking—that is, copying and branching of code common to software development? Such forking promotes commitment and innovation but may compromise joint learning in the network required for developing new capabilities under technological disruption. Elaborating a seminal learning theory in an interorganizational network context, we demonstrate how forking was about to sacrifice long-term aspirations to short-term interests early on. We contribute a new type of forking—forking from the future—and describe how this type of forking harnesses future aspirations and builds collaborative capabilities in the network while reducing tendencies for shortsightedness. Forking from the future consists of the introduction of external perspectives, the implementation of new rules, the use of integrative tools, and generally accelerating collaboration among the network participants from the future back (not from the present forward). As software becomes increasingly important across industries, forking from the future becomes foundational for future competitiveness.
SP  - 2744
EP  - 2757
JF  - IEEE Transactions on Engineering Management
VL  - 71
IS  - NA
PB  -
DO  - 10.1109/tem.2022.3193959
ER  -

TY  - NA
AU  - Nahar, Nadia; Zhou, Shurui; Lewis, Grace A.; Kästner, Christian
TI  - More Engineering, No Silos: Rethinking Processes and Interfaces in Collaboration between Interdisciplinary Teams for Machine Learning Projects.
PY  - 2021
AB  - The introduction of machine learning (ML) components in software projects has created the need for software engineers to collaborate with data scientists and other specialists. While collaboration can always be challenging, ML introduces additional challenges with its exploratory model development process, additional skills and knowledge needed, difficulties testing ML systems, need for continuous evolution and monitoring, and non-traditional quality requirements such as fairness and explainability. Through interviews with 45 practitioners from 28 organizations, we identified key collaboration challenges that teams face when building and deploying ML systems into production. We report on common collaboration points in the development of production ML systems for requirements, data, and integration, as well as corresponding team patterns and challenges. We find that most of these challenges center around communication, documentation, engineering, and process and collect recommendations to address these challenges.
SP  - NA
EP  - NA
JF  - arXiv: Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Plegat, Manon; Lewkowicz, Myriam
TI  - "I do not see the point of using the system". Understanding a Broken Policy Knot in the Primary Care Sector
PY  - 2025
AB  - This paper explores the implementation of a policy promoting new cooperative practices in the primary care sector. Through a two-year multi-sited ethnographic study of Multi-Professional Healthcare Centers (MPHCs) and their coordination mechanisms, we highlight the gaps between the coordinative protocols that prescribe how these structures operate, and the certified Health Information System (HIS) that has been defined by public authorities to support the new practices. These gaps make us say that the policy knot - that entangles policy, practice, and design - broke. To understand why, we studied the biography of the certified HIS and identified that it is based on the practice of general medicine only. Without simply concluding that public policy has failed because of the system's shortcomings, we reveal the human effort involved in compensating for the HIS's inability to support the articulation work required for multi-professional coordination. Based on this empirical contribution, we offer the policy-oriented technological frame as a concept to make sense of IT in public health, and the congruence loop as a guideline to avoid the breakage of a policy knot.
SP  - 1
EP  - 28
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 9
IS  - 2
PB  -
DO  - 10.1145/3711003
ER  -

TY  - JOUR
AU  - Zerouali, Ahmed; Mens, Tom; Decan, Alexandre; De Roover, Coen
TI  - On the impact of security vulnerabilities in the npm and RubyGems dependency networks
PY  - 2022
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 27
IS  - 5
PB  -
DO  - 10.1007/s10664-022-10154-1
ER  -

TY  - NA
AU  - Catolino, Gemma; Palomba, Fabio; Tamburri, Damian A.; Serebrenik, Alexander; Ferrucci, Filomena
TI  - Refactoring community smells in the wild
PY  - 2020
AB  - NA
SP  - 25
EP  - 34
JF  - Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Society
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3377815.3381380
ER  -

TY  - JOUR
AU  - Adams, Bram; Kavanagh, Ryan; Hassan, Ahmed E.; German, Daniel M.
TI  - An empirical study of integration activities in distributions of open source software
PY  - 2015
AB  - NA
SP  - 960
EP  - 1001
JF  - Empirical Software Engineering
VL  - 21
IS  - 3
PB  -
DO  - 10.1007/s10664-015-9371-y
ER  -

TY  - JOUR
AU  - Kim, Hee-Woong; Chan, Hock Chuan; Lee, So-Hyun
TI  - User Resistance to Software Migration: The Case on Linux
PY  - 2014
AB  - The demand for software has increased rapidly in the global industrial environment. Open source software (OSS) has exerted significant impact on the software industry. Large amounts of resources and effort have been devoted to the development of OSS such as Linux. Based on the technology adoption model (TAM), the development of Linux as the most well-known OSS with a graphical user interface designed for ease of use and a wide range of functionalities is expected to result in high levels of Linux adoption by individual users. Linux, however, currently controls about 1% of the operating system market for personal computers. The resistance of users to switch to a new operating system remains one of the major obstacles to widespread adoption of Linux among individual users. Based on the integration of the equity implementation model and the TAM, this study examines the formation of user resistance, as well as the effects of user resistance, on the migration to Linux for personal computers. This study discusses the role and effect of user resistance based on the equity implementation model in comparison with the two main determinants in the TAM. This study contributes to the advancement of theoretical understanding of Linux migration and user resistance. The findings also offer suggestions for software communities and practitioners, of OSS in particular, to promote the use of new software by individual users.
SP  - 59
EP  - 79
JF  - Journal of Database Management
VL  - 25
IS  - 1
PB  -
DO  - 10.4018/jdm.2014010103
ER  -

TY  - JOUR
AU  - Scacchi, Walt; Alspaugh, Thomas A.
TI  - Understanding the role of licenses and evolution in open architecture software ecosystems
PY  - 2012
AB  - NA
SP  - 1479
EP  - 1494
JF  - Journal of Systems and Software
VL  - 85
IS  - 7
PB  -
DO  - 10.1016/j.jss.2012.03.033
ER  -

TY  - JOUR
AU  - Hsieh, Jane; Kim, Joselyn; Dabbish, Laura; Zhu, Haiyi
TI  - "Nip it in the Bud": Moderation Strategies in Open Source Software Projects and the Role of Bots
PY  - 2023
AB  - Much of our modern digital infrastructure relies critically upon open sourced software. The communities responsible for building this cyberinfrastructure require maintenance and moderation, which is often supported by volunteer efforts. Moderation, as a non-technical form of labor, is a necessary but often overlooked task that maintainers undertake to sustain the community around an OSS project. This study examines the various structures and norms that support community moderation, describes the strategies moderators use to mitigate conflicts, and assesses how bots can play a role in assisting these processes. We interviewed 14 practitioners to uncover existing moderation practices and ways that automation can provide assistance. Our main contributions include a characterization of moderated content in OSS projects, moderation techniques, as well as perceptions of and recommendations for improving the automation of moderation tasks. We hope that these findings will inform the implementation of more effective moderation practices in open source communities.
SP  - 1
EP  - 29
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 7
IS  - CSCW2
PB  -
DO  - 10.1145/3610092
ER  -

TY  - JOUR
AU  - Foroudi, Pantea; Marvi, Reza; Cuomo, Maria Teresa; D'Amato, Antonio
TI  - Sustainable Development Goals in a regional context: conceptualising, measuring and managing residents' perceptions
PY  - 2024
AB  - This study explores how national-level Sustainable Development Goals (SDGs) are implemented at the regional level in Italy by achieving three objectives: (1) conceptualising local residents' perceptions of the SDGs; (2) creating a scale to measure these perceptions; and (3) validating this scale across Italian regions. Using a six-step methodology, including panel data analysis and surveys with 2303 respondents, this research validates key SDGs significant to Italian regions. The results provide policymakers with a framework to tailor regional policies that resonate with residents' views on SDGs.
SP  - 1
EP  - 16
JF  - Regional Studies
VL  - 59
IS  - 1
PB  -
DO  - 10.1080/00343404.2024.2373871
ER  -

TY  - NA
AU  - Gu, Zuguang
TI  - Two Separated Worlds: on the Preference of Influence in Life Science and Biomedical Research
PY  - 2024
AB  - We introduced a new metric, “citation enrichment”, to measure country-to-country influence using citation data. This metric evaluates the degree to which a country prefers to cite another country compared to a random citation process. We applied the citation enrichment method to over 12 million publications in the life science and biomedical fields and we have the following key findings: 1) The global scientific landscape is divided into two separated worlds where developed Western countries exhibit an overall mutual under-influence with the rest of the world; 2) Within each world, countries form clusters based on their mutual citation preferences, with these groupings strongly associated with their geographical and cultural proximity; 3) The two worlds exhibit distinct patterns of the influence balance among countries, revealing underlying mechanisms that drive influence dynamics. We have constructed a comprehensive world map of scientific influence which greatly enhances the deep understanding of the international exchange of scientific knowledge. The citation enrichment metric is developed under a well-defined statistical framework and has the potential to be extended into a versatile and powerful tool for bibliometrics and related research fields.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.1101/2024.05.10.592442
ER  -

TY  - JOUR
AU  - Sojer, Manuel; Alexy, Oliver; Kleinknecht, Sven; Henkel, Joachim
TI  - Understanding the Drivers of Unethical Programming Behavior: The Inappropriate Reuse of Internet-Accessible Code
PY  - 2014
AB  - Programming is riddled with ethical issues. Although extant literature explains why individuals in IT would act unethically in many situations, we know surprisingly little about what causes them to do so during the creative act of programming. To address this issue, we look at the reuse of Internet-accessible code: software source code legally available for gratis download from the Internet. Specifically, we scrutinize the reasons why individuals would unethically reuse such code by not checking or purposefully violating its accompanying license obligations, thus risking harm for their employer. By integrating teleological and deontological ethical judgments into a theory of planned behavior model—using elements of expected utility, deterrence, and ethical work climate theory—we construct an original theoretical framework to capture individuals’ decision-making process leading to the unethical reuse of Internet-accessible code. We test this framework with a unique survey of 869 professional softwa...
SP  - 287
EP  - 325
JF  - Journal of Management Information Systems
VL  - 31
IS  - 3
PB  -
DO  - 10.1080/07421222.2014.995563
ER  -

TY  - NA
AU  - Zhou, Shurui; Vasilescu, Bogdan; Kästner, Christian
TI  - How has forking changed in the last 20 years?
PY  - 2020
AB  - NA
SP  - 445
EP  - 456
JF  - Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3377811.3380412
ER  -

TY  - NA
AU  - Jahanshahi, Mahmoud; Mockus, Audris
TI  - Dataset: Copy-based Reuse in Open Source Software
PY  - 2024
AB  - NA
SP  - 42
EP  - 47
JF  - Proceedings of the 21st International Conference on Mining Software Repositories
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3643991.3644868
ER  -

TY  - NA
AU  - Wessel, Mairieli; Gerosa, Marco A.; Shihab, Emad
TI  - Software bots in software engineering
PY  - 2022
AB  - Software bots are becoming increasingly popular in software engineering (SE). In this tutorial, we define what a bot is and present several examples. We also discuss the many benefits bots provide to the SE community, including helping in development tasks (such as pull request review and integration) and onboarding newcomers to a project. Finally, we discuss the challenges related to interacting with and developing software bots.
SP  - 724
EP  - 725
JF  - Proceedings of the 19th International Conference on Mining Software Repositories
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3524842.3528533
ER  -

TY  - NA
AU  - Wu, Yiwen; Zhang, Yang; Xu, Kele; Wang, Tao; Wang, Huaimin
TI  - Understanding and Predicting Docker Build Duration: An Empirical Study of Containerized Workflow of OSS Projects
PY  - 2022
AB  - Docker building is a critical component of containerized workflow, which automates the process by which sources are packaged and transformed into container images. If not run properly, Docker builds can bring long durations (i.e., slow builds), which increases the cost in human and computing resources, and thus inevitably affect the software development. However, the current status and remedy for the duration cost in Docker builds remain unclear and need an in-depth study. To fill this gap, this paper provides the first empirical investigation on 171,439 Docker builds from 5,833 open source software (OSS) projects. Starting with an exploratory study, the Docker build durations can be characterized in real-world projects, and the developers' perceptions of slow builds are obtained via a comprehensive survey. Driven by the results of our exploratory study, we propose a prediction modeling of Docker build duration, leveraging 27 handcrafted features from build-related context and configuration and 8 regression algorithms for the prediction task. Our results demonstrate that Random Forest model provides the superior performance with a Spearman's correlation of 0.781, outperforming the baseline random model by 82.9% in RMSE, 90.6% in MAE, and 94.4% in MAPE, respectively. The implications of this study will facilitate research and assist practitioners in improving the Docker build process.
SP  - 1
EP  - 13
JF  - Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3551349.3556940
ER  -

TY  - JOUR
AU  - Zirpoli, Francesco; Rullani, Francesco; Becker, Markus C.
TI  - Coordination of joint search in distributed innovation processes. Lessons from the effects of initial code release in Open Source Software development
PY  - 2013
AB  - This paper casts light on the role of initial code release for providing coordination of joint search processes, i.e., search processes that involve several agents who search together. We develop hypotheses about the role of initial code release for providing coordination, and for whether development projects remain active. We test these hypotheses on a dataset of 5703 open source software projects registered on SourceForge during a two-year period. We find that initial code release is indeed associated with improved coordination, and a higher chance that software development projects will actually release further code subsequently. We contribute to theory on coordination in joint search, common in distributed innovation settings.
SP  - 1
EP  - 30
JF  - SSRN Electronic Journal
VL  - NA
IS  - NA
PB  -
DO  - 10.2139/ssrn.2343917
ER  -

TY  - JOUR
AU  - Strasser, Carly; Hertweck, Kate; Greenberg, Josh; Taraborelli, Dario; Vu, Elizabeth
TI  - Ten simple rules for funding scientific open source software.
PY  - 2022
AB  - Scientific research increasingly relies on open source software (OSS). Funding OSS development requires intentional focus on issues of scholarly credit, unique forms of labor, maintenance, governance, and inclusive community-building. Such issues cut across different scientific disciplines that make them of interest to a variety of funders and institutions but may present challenges in understanding generalized needs. Here we present 10 simple rules for investing in scientific OSS and the teams who build and maintain it.
SP  - e1010627
EP  - e1010627
JF  - PLoS computational biology
VL  - 18
IS  - 11
PB  -
DO  - 10.1371/journal.pcbi.1010627
ER  -

TY  - JOUR
AU  - Racero, F. José; Bueno, Salvador; Gallego, M. Dolores
TI  - Can the OSS-Focused Education Impact on OSS Implementations in Companies? A Motivational Answer through a Delphi-Based Consensus Study
PY  - 2021
AB  - In the last few decades, the Open Source Software (OSS) diffusion has grown remarkably in companies. In this context, the present study has analyzed the factors that incentivize OSS implementations for enterprise purposes, linking two perspectives: (1) managerial and (2) educational. Thus, the Delphi methodology was applied to a panel of experts with two aims: (1) to know managers’ perceptions about organizational users’ motivations toward OSS after receiving OSS training and (2) to develop a forecasting study to examine the OSS diffusion in the medium term in companies and educational centers. In this context, the Self-Determination Theory (SDT) was the theoretical approach through which we identified the motivational factors. Specifically, three SDT motivations were added: (1) autonomy, (2) competence and (3) relatedness. The 104 selected experts were managers from companies with employees who have studied in educational centers where OSS usage is mandatory. The results show that managers perceive that OSS training incentivizes OSS implementations in companies. At the same time, user motivations are considered to be extremely relevant, especially autonomy. In addition, the results foresee a similar level of OSS implementation in the business and educational fields in the medium term. Finally, conclusions, practical implications and limitations are discussed.
SP  - 277
EP  - NA
JF  - Electronics
VL  - 10
IS  - 3
PB  -
DO  - 10.3390/electronics10030277
ER  -

TY  - NA
AU  - Liu, Peng; Gui, Liang
TI  - DSIT - Structural Analysis of Collaboration Network in OSS Communities
PY  - 2021
AB  - The success of open-source software (OSS) depends on the self-organizing collaboration of developers and the structure of developer collaboration network are intensively investigated in the literature. However, the research on the relationship between network structure and developers’ contribution is still insufficient. This paper investigates developer collaboration networks in three OSS communities by data analytics. The results indicate that real networks are mainly characterized by the modular small-world structure, which is inherently correlated with the sub-project participation of developers. Most module members are single-dimensional developers whose coding-collaboration focuses on a small number of sub-projects (called the main dimension of the module), while a small proportion of module members are multi-dimensional developers who conduct coding-collaboration in the main dimension of different modules. These results may deepen our understandings of the collaborative pattern of OSS communities, and also have some reference value for the studies of open collaborative innovation in large-scale crowds.
SP  - 84
EP  - 91
JF  - 2021 4th International Conference on Data Science and Information Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3478905.3478923
ER  -

TY  - JOUR
AU  - Kulkarni, Naveen; Varma, Vasudeva
TI  - Perils of opportunistically reusing software module
PY  - 2016
AB  - Opportunistic reuse is a need based sourcing of software modules without a prior reuse plan. It is a common tactical approach in software development. Developers often reuse an external software module opportunistically to improve their productivity. But, studies have shown that this results in extensive refactoring and adds maintenance owes. We assert this problem to the mismatches between the software under development and the reused external module; caused because of their different assumptions and constraints. We highlight the problems of such opportunistic reuse practices with the help of a case study. In our study, we found issues such as unanticipated behavior, violated constraints, conflict in assumption, fragile structure, and software bloat. In this paper, we like to draw attention of the research community to the wide spread opportunistic reuse practices and the lack of methods to pro-actively identify and resolve the mismatches. We propose the need for supporting developers in reasoning before reuse from the perspective of identifying and fixing both local and global mismatches. Furthermore, we identify other opportunistic software development practices where similar issues can be observed and also suggest the research areas where further investigation can benefit developers in improving their productivity. Copyright © 2016 John Wiley & Sons, Ltd.
SP  - 971
EP  - 984
JF  - Software: Practice and Experience
VL  - 47
IS  - 7
PB  -
DO  - 10.1002/spe.2439
ER  -

TY  - NA
AU  - Vendome, Christopher; Linares-Vasquez, Mario; Bavota, Gabriele; Di Penta, Massimiliano; German, Daniel M.; Poshyvanyk, Denys
TI  - ICSME - When and why developers adopt and change software licenses
PY  - 2015
AB  - Software licenses legally govern the way in which developers can use, modify, and redistribute a particular system. While previous studies either investigated licensing through mining software repositories or studied licensing through FOSS reuse, we aim at understanding the rationale behind developers' decisions for choosing or changing software licensing by surveying open source developers. In this paper, we analyze when developers consider licensing, the reasons why developers pick a license for their project, and the factors that influence licensing changes. Additionally, we explore the licensing-related problems that developers experienced and expectations they have for licensing support from forges (e.g., GitHub). Our investigation involves, on one hand, the analysis of the commit history of 16,221 Java open source projects to identify the commits where licenses were added or changed. On the other hand, it consisted of a survey—in which 138 developers informed their involvement in licensing-related decisions and 52 provided deeper insights about the rationale behind the actions that they had undertaken. The results indicate that developers adopt licenses early in the project's development and change licensing after some period of development (if at all). We also found that developers have inherent biases with respect to software licensing. Additionally, reuse—whether by a non-contributor or for commercial purposes—is a dominant reason why developers change licenses of their systems. Finally, we discuss potential areas of research that could ameliorate the difficulties that software developers are facing with regard to licensing issues of their software systems.
SP  - 31
EP  - 40
JF  - 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icsm.2015.7332449
ER  -

TY  - NA
AU  - Wilner, Tamar; Adavi, Krishna Akhil Kumar; Mandava, Sreehana; Bhimdiwala, Ayesha; Frluckaj, Hana; Turns, Jennifer; Arif, Ahmer
TI  - From Concept to Community: Unpacking the Work of Designing Educational and Activist Toolkits
PY  - 2024
AB  - Toolkits are an important means of sharing expertise and influencing practice. However, the work of making and sustaining toolkits is not well understood. We address this gap by conducting 20 semi-structured interviews with toolkit designers, focusing on toolkits intended to help practitioners such as librarians, teachers, and community workers. We analyze these interviews to surface key aspects of participants' design journeys: (1) how their projects began; (2) how they conceptualized use; (3) how they collaborated with users; (4) and what happened once their toolkit was released. We illustrate these aspects through three narratives, and discuss our findings to provide considerations for designers and scholars. We highlight how designers co-construct communities alongside their toolkits, helping us form a more nuanced understanding of the social aspects underpinning toolkit projects. Collectively, these contributions can help us identify challenges and opportunities in this design space, laying the groundwork to increase toolkits' social impact.
SP  - 1
EP  - 15
JF  - Proceedings of the CHI Conference on Human Factors in Computing Systems
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3613904.3642681
ER  -

TY  - NA
AU  - Sharma, Pankajeshwara Nand; Savarimuthu, Bastin Tony Roy; Stanger, Nigel; Licorish, Sherlock A.; Rainer, Austen
TI  - EASE - Investigating developers' email discussions during decision-making in Python language evolution
PY  - 2017
AB  - Context: Open Source Software (OSS) developers use mailing lists as their main forum for discussing the evolution of a project. However, the use of mailing lists by developers for decision-making has not received much research attention. Objective: We have explored this issue by studying developers' email discussions around Python Enhancement Proposals (PEPs). Method: Our dataset comprised 42,672 emails from six different mailing lists pertaining to PEP development. We performed multiple forms of analysis on these emails, involving both quantitative measures (e.g., frequency) and deeper analysis of specific PEP discussions (i.e., outlier analysis). Results: Out of three PEP types (Informational, Process and Standard Track), Standard Track PEPs attract a large amount of discussion (both in volume and average number of messages per proposal). Our study also identified specific PEP states and topics that generated a disproportionate amount of discussion. Conclusion: Our outcomes point to several opportunities for improving the management of an OSS team based on the knowledge generated from discussions. We have also identified several interesting avenues for future work such as identifying individuals or groups that present persuasive arguments during decision-making.
SP  - 286
EP  - 291
JF  - Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3084226.3084271
ER  -

TY  - NA
AU  - Zhang, Xunhui; Wang, Tao; Yu, Yue; Zeng, Qiubing; Li, Zhixing; Wang, Huaimin
TI  - Who, What, Why and How? Towards the Monetary Incentive in Crowd Collaboration: A Case Study of Github's Sponsor Mechanism
PY  - 2021
AB  - While many forms of financial support are currently available, there are still many complaints about inadequate financing from software maintainers. In May 2019, GitHub, the world's most active social coding platform, launched the Sponsor mechanism as a step toward more deeply integrating open source development and financial support. This paper collects data on 8,028 maintainers, 13,555 sponsors, and 22,515 sponsorships and conducts a comprehensive analysis. We explore the relationship between the Sponsor mechanism and developers along four dimensions using a combination of qualitative and quantitative analysis, examining why developers participate, how the mechanism affects developer activity, who obtains more sponsorships, and what mechanism flaws developers have encountered in the process of using it. We find a long-tail effect in the act of sponsorship, with most maintainers' expectations remaining unmet, and sponsorship has only a short-term, slightly positive impact on development activity but is not sustainable. While sponsors participate in this mechanism mainly as a means of thanking the developers of OSS that they use, in practice, the social status of developers is the primary influence on the number of sponsorships. We find that both the Sponsor mechanism and open source donations have certain shortcomings and need further improvements to attract more participants.
SP  - NA
EP  - NA
JF  - arXiv: Human-Computer Interaction
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Imam, Ahmed; Dey, Tapajit; Nolte, Alexander; Mockus, Audris; Herbsleb, James D.
TI  - The Secret Life of Hackathon Code.
PY  - 2021
AB  - Background: Hackathons have become popular events for teams to collaborate on projects and develop software prototypes. Most existing research focuses on activities during an event with limited attention to the evolution of the code brought to or created during a hackathon. Aim: We aim to understand the evolution of hackathon-related code, specifically, how much hackathon teams rely on pre-existing code or how much new code they develop during a hackathon. Moreover, we aim to understand if and where that code gets reused, and what factors affect reuse. Method: We collected information about 22,183 hackathon projects from DEVPOST -- a hackathon database -- and obtained related code (blobs), authors, and project characteristics from the World of Code. We investigated if code blobs in hackathon projects were created before, during, or after an event by identifying the original blob creation date and author, and also checked if the original author was a hackathon project member. We tracked code reuse by first identifying all commits containing blobs created during an event before determining all projects that contain those commits. Result: While only approximately 9.14% of the code blobs are created during hackathons, this amount is still significant considering the time and member constraints of such events. Approximately a third of these code blobs get reused in other projects. The number of associated technologies and the number of participants in a project increase reuse probability. Conclusion: Our study demonstrates to what extent pre-existing code is used and new code is created during a hackathon and how much of it is reused elsewhere afterwards. Our findings help to better understand code reuse as a phenomenon and the role of hackathons in this context and can serve as a starting point for further studies in this area.
SP  - NA
EP  - NA
JF  - arXiv: Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Sojer, Manuel; Henkel, Joachim
TI  - License Risks from Ad-Hoc Reuse of Code from the Internet: An Empirical Investigation
PY  - 2011
AB  - Reusing code that is downloadable from the Internet—particularly open source software (OSS) code—in commercial software development is attractive for both firms and their software developers. However, to avoid serious economic and legal consequences for firms, the license obligations of the reused code have to be met. While this risk seems to be manageable in systematic reuse, colloquial evidence suggests that when reusing Internet code in ad-hoc fashion, individual professional software developers sometimes do not treat license obligations properly. Quantitatively investigating this issue, we explore the ad-hoc Internet code reuse of professional software developers with a particular focus on license issues by analyzing a unique global dataset of 869 professional software developers. We find that ad-hoc Internet code reuse has become prevalent in commercial software development. Despite this, when reusing Internet code in ad-hoc fashion, professional software developers appear not to fully account for license issues potentially resulting from their behavior. Moreover, our results point out that professional software developers receive little effective training and information on the topic of Internet code reuse from official channels. Furthermore, professional software developers are on average not fully aware of many common Internet code license obligations, and tend to overestimate their own knowledge. Most firms also do not provide close guardrails to their software developers regarding Internet code reuse through policies. Consequently, a considerable share of professional software developers has violated Internet code license obligations in the past. Based on our findings we discuss practical implications for firms developing software and suggest levers to reduce the economic and legal risks from license violations through professional software developers’ ad-hoc reuse of Internet code.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - CHAP
AU  - Heinemann, Lars; Deissenboeck, Florian; Gleirscher, Mario; Hummel, Benjamin; Irlbeck, Maximilian
TI  - ICSR - On the extent and nature of software reuse in open source Java projects
PY  - 2011
AB  - Code repositories on the Internet provide a tremendous amount of freely available open source code that can be reused for building new software. It has been argued that only software reuse can bring the gain of productivity in software construction demanded by the market. However, knowledge about the extent of reuse in software projects is only sparse. To remedy this, we report on an empirical study about software reuse in 20 open source Java projects with a total of 3.3 MLOC. The study investigates (1) whether open source projects reuse third party code and (2) how much white-box and black-box reuse occurs. To answer these questions, we utilize static dependency analysis for quantifying black-box reuse and code clone detection for detecting white-box reuse from a corpus with 6.1 MLOC of reusable Java libraries. Our results indicate that software reuse is common among open source Java projects and that black-box reuse is the predominant form of reuse.
SP  - 207
EP  - 222
JF  - Lecture Notes in Computer Science
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-642-21347-2_16
ER  -

TY  - CHAP
AU  - Gallo, Giuseppe; Tuzzolino, Giovanni Francesco
TI  - Open-Source for a Sustainable Development of Architectural Design in the Fourth Industrial Revolution
PY  - 2023
AB  - NA
SP  - 113
EP  - 131
JF  - Lecture Notes in Mechanical Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-031-36922-3_8
ER  -

TY  - CHAP
AU  - Stol, Klaas-Jan; Fitzgerald, Brian
TI  - Contemporary Empirical Methods in Software Engineering - Guidelines for Conducting Software Engineering Research
PY  - 2020
AB  - This chapter presents a holistic overview of software engineering research strategies. It identifies the two main modes of research within the software engineering research field, namely knowledge-seeking and solution-seeking research—the Design Science model corresponding well with the latter. We present the ABC framework for research strategies as a model to structure knowledge-seeking research. The ABC represents three desirable aspects of research—generalizability over actors (A), precise control of behavior (B), and realism of context (C). Unfortunately, as our framework illustrates, these three aspects cannot be simultaneously maximized. We describe the two dimensions that provide the foundation of the ABC framework—generalizability and control, explain the four different types of settings in which software engineering research is conducted, and position eight archetypal research strategies within the ABC framework. We illustrate each strategy with examples, identify appropriate metaphors, and present an example of how the ABC framework can be used to design a research program.
SP  - 27
EP  - 62
JF  - Contemporary Empirical Methods in Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-030-32489-6_2
ER  -

TY  - CHAP
AU  - Businge, John; Abdi, Mehrdad; Demeyer, Serge
TI  - Analyzing Variant Forks of Software Repositories from Social Coding Platforms
PY  - 2023
AB  - NA
SP  - 131
EP  - 152
JF  - Software Ecosystems
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-031-36060-2_6
ER  -

TY  - JOUR
AU  - Radu, Laura Diana
TI  - An Ecological View on Software Reuse
PY  - 2018
AB  - The increase of consumption is an important motivation for the reuse of either physical or virtual products. As the software market has risen, software reuse has become a practice with favourable effects for software development companies and their clients. The most important benefits are increased productivity, reduced costs, better and easier maintenance, decreased development lead times and the improved quality of software products. Successful reuse depends on several technical and non-technical factors. The ecological impact of software is an important non-technical factor of software reuse that needs to be analysed in the context of the rapid evolution of optimization techniques. The main goal of this study is to identify ecological perspectives on software reuse. These will complete the framework of software reuse together with other technical factors, such as compatibility, and non-technical factors, such as economic and ethical implications.
SP  - 75
EP  - 85
JF  - Informatica Economica
VL  - 22
IS  - 3
PB  -
DO  - 10.12948/issn14531305/22.3.2018.07
ER  -

TY  - JOUR
AU  - Liu, Yaxin; He, Peng; Wu, Gaoyan; Li, Yilu
TI  - Towards Understanding Developers’ Collaborative Behavior in Open Source Software Ecosystems
PY  - 2017
AB  - NA
SP  - 393
EP  - 405
JF  - Journal of Software
VL  - 12
IS  - 6
PB  -
DO  - 10.17706/jsw.12.6.393-405
ER  -

TY  - NA
AU  - Allen, John; Kelleher, Caitlin
TI  - An Exploratory Study of Programmers' Analogical Reasoning and Software History Usage During Code Re-Purposing
PY  - 2024
AB  - Background: Software development relies on collaborative problem-solving. Understanding previously addressed problems in software is crucial for developers to identify and repurpose functionalities for new problem-solving contexts.
SP  - 109
EP  - 120
JF  - Proceedings of the 2024 IEEE/ACM 17th International Conference on Cooperative and Human Aspects of Software Engineering
VL  - 26
IS  - NA
PB  -
DO  - 10.1145/3641822.3641864
ER  -

TY  - JOUR
AU  - Eckert, Remo; Stuermer, Matthias; Myrach, Thomas
TI  - Alone or Together? Inter-organizational affiliations of open source communities
PY  - 2019
AB  - NA
SP  - 250
EP  - 262
JF  - Journal of Systems and Software
VL  - 149
IS  - NA
PB  -
DO  - 10.1016/j.jss.2018.12.007
ER  -

TY  - NA
AU  - Rastogi, Ayushi; Nagappan, Nachiappan
TI  - SANER - Forking and the Sustainability of the Developer Community Participation -- An Empirical Investigation on Outcomes and Reasons
PY  - 2016
AB  - A majority of OSS projects fails due to their inability to garner significant and sustained developer community participation. The problem proliferates when competing projects emerge from the source code of an existing project, a phenomenon called forking of the original project, claiming existing and potential developer community participation. In this study, we empirically analyze the influence of forking on the sustainability of the developer community participation in the original project. Further, we try to explain the observed behavior in terms of the characteristics of the project observed at the time of forking. A large-scale study of 2,217 projects hosted on GitHub shows that 1 in every 5 original projects observes a decline in the sustainability of the developer community participation after forking. We find that the negative effect is more pronounced in projects ported to GitHub from other platforms (≈ 20%), compared to GitHub developed projects (≈ 9%). We also find that the observed behavior can be explained in terms of the characteristics of the competing projects at the time of forking. For instance, in medium sized projects an increase in the maturity of the original project by a year decreases the odds of decline in the sustainability of the developer participation by 23%.
SP  - 102
EP  - 111
JF  - 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)
VL  - 1
IS  - NA
PB  -
DO  - 10.1109/saner.2016.27
ER  -

TY  - NA
AU  - Merendino, Nicolò; Rodà, Antonio
TI  - Defining an open source CAD workflow for experimental music and media arts
PY  - 2021
AB  - The practice of designing and building instruments, interfaces and hardware in general, became a crucial part of contemporary audio and media arts productions. This task could benefit from the high performance tools offered by state of the art Open source Computer Aided Design (CAD). Although these applications have reached a good level of maturity, their use in the artistic field is still not so widespread, due to an initial barrier probably caused by a lack of accessible documentation and best practices.
SP  - 1
EP  - 6
JF  - 10th International Conference on Digital and Interactive Arts
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3483529.3483715
ER  -

TY  - NA
AU  - Zhang, Xunhui; Wang, Tao; Yu, Yue; Zeng, Qiubing; Li, Zhixing; Wang, Huaimin
TI  - Who, What, Why and How? Towards the Monetary Incentive in Crowd Collaboration: A Case Study of Github's Sponsor Mechanism
PY  - 2022
AB  - NA
SP  - 1
EP  - 18
JF  - CHI Conference on Human Factors in Computing Systems
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3491102.3501822
ER  -

TY  - JOUR
AU  - Crowston, Kevin; Shamshurin, Ivan
TI  - Core-periphery communication and the success of free/libre open source software projects
PY  - 2017
AB  - We examine the relationship between communications by core and peripheral members and Free/Libre Open Source Software project success. The study uses data from 74 projects in the Apache Software Foundation Incubator. We conceptualize project success in terms of success building a community, as assessed by graduation from the Incubator. We compare successful and unsuccessful projects on volume of communication and on use of inclusive pronouns as an indication of efforts to create intimacy among team members. An innovation of the paper is that use of inclusive pronouns is measured using natural language processing techniques. We also compare the volume and content of communication produced by core (committer) and peripheral members and by those peripheral members who are later elected to be core members. We find that volume of communication is related to project success but use of inclusive pronouns does not distinguish successful projects. Core members exhibit more contribution and use of inclusive pronouns than peripheral members.
SP  - 1
EP  - 11
JF  - Journal of Internet Services and Applications
VL  - 8
IS  - 1
PB  -
DO  - 10.1186/s13174-017-0061-4
ER  -

TY  - JOUR
AU  - de Almeida Borges, Ana; Casanueva Artís, Annalí; Falleri, Jean-Rémy; Gallego Arias, Emilio Jesús; Martin-Dorel, Érik; Palmskog, Karl; Serebrenik, Alexander; Zimmermann, Théo
TI  - Lessons for Interactive Theorem Proving Researchers from a Survey of Coq Users
PY  - 2025
AB  - The Coq Community Survey 2022 was an online public survey of users of the Coq proof assistant conducted during February 2022. Broadly, the survey asked about use of Coq features, user interfaces, libraries, plugins, and tools, views on renaming Coq and Coq improvements, and also demographic data such as education and experience with Coq and other proof assistants and programming languages. The survey received 466 submitted responses, making it the largest survey of users of an interactive theorem prover (ITP) so far. We present the design of the survey, a summary of key results, and analysis of answers relevant to ITP technology development and usage. In particular, we analyze user characteristics associated with adoption of tools and libraries and make comparisons to adjacent software communities. Notably, we find that experience has significant impact on Coq user behavior, including on usage of tools, libraries, and integrated development environments (IDEs).
SP  - NA
EP  - NA
JF  - Journal of Automated Reasoning
VL  - 69
IS  - 1
PB  -
DO  - 10.1007/s10817-025-09720-1
ER  -

TY  - NA
AU  - Chen, Qihong; Câmara, Rúben; Campos, José; Souto, André; Ahmed, Iftekhar
TI  - The Smelly Eight: An Empirical Study on the Prevalence of Code Smells in Quantum Computing
PY  - 2023
AB  - Quantum Computing (QC) is a fast-growing field that has enhanced the emergence of new programming languages and frameworks. Furthermore, the increased availability of computational resources has also contributed to an influx in the development of quantum programs. Given that classical and QC are significantly different due to the intrinsic nature of quantum programs, several aspects of QC (e.g., performance, bugs) have been investigated, and novel approaches have been proposed. However, from a purely quantum perspective, maintenance, one of the major steps in a software development life-cycle, has not been considered by researchers yet. In this paper, we fill this gap and investigate the prevalence of code smells in quantum programs as an indicator of maintenance issues. We defined eight quantum-specific smells and validated them through a survey with 35 quantum developers. Since no tool specifically aims to detect quantum smells, we developed a tool called QSmell that supports the proposed quantum-specific smells. Finally, we conducted an empirical investigation to analyze the prevalence of quantum-specific smells in 15 open-source quantum programs. Our results showed that 11 programs (73.33%) contain at least one smell and, on average, a program has three smells. Furthermore, the long circuit is the most prevalent smell present in 53.33% of the programs.
SP  - 358
EP  - 370
JF  - 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icse48619.2023.00041
ER  -

TY  - NA
AU  - Wessel, Mairieli; Wiese, Igor; Steinmacher, Igor; Gerosa, Marco Aurélio
TI  - Don't Disturb Me: Challenges of Interacting with Software Bots on Open Source Software Projects
PY  - 2021
AB  - Software bots are used to streamline tasks in Open Source Software (OSS) projects' pull requests, saving development cost, time, and effort. However, their presence can be disruptive to the community. We identified several challenges caused by bots in pull request interactions by interviewing 21 practitioners, including project maintainers, contributors, and bot developers. In particular, our findings indicate noise as a recurrent and central problem. Noise affects both human communication and development workflow by overwhelming and distracting developers. Our main contribution is a theory of how human developers perceive annoying bot behaviors as noise on social coding platforms. This contribution may help practitioners understand the effects of adopting a bot, and researchers and tool designers may leverage our results to better support human-bot interaction on social coding platforms.
SP  - NA
EP  - NA
JF  - arXiv: Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Lampropoulos, Alexander; Ampatzoglou, Apostolos; Bibi, Stamatia; Chatzigeorgiou, Alexander; Stamelos, Ioannis
TI  - QUATIC - REACT - A Process for Improving Open-Source Software Reuse
PY  - 2018
AB  - Software reuse is a popular practice, which is constantly gaining ground among practitioners. The main reason for this is the potential that it provides for reducing development effort and increasing the end-product quality. At the same time, Open-Source Software (OSS) repositories are nowadays flourishing and can facilitate the reuse process, through the provision of a variety of software artifacts. However, up-to-date OSS reuse processes have mostly been opportunistic, leading to not fully capitalizing existing reuse potentials. In this study we propose a process (namely REACT) for improving planned OSS reuse practices, i.e., we define the activities that a software engineer can perform to reuse OSS artifacts. To illustrate the applicability of REACT, we provide an example, in which a mobile application is developed based upon the reuse of OSS artifacts. To validate the proposed process we compared the effort required to develop the application with and without adapting REACT process. Our preliminary results suggest that REACT may reduce up to 50% the effort required to build an application from scratch.
SP  - 251
EP  - 254
JF  - 2018 11th International Conference on the Quality of Information and Communications Technology (QUATIC)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/quatic.2018.00044
ER  -

TY  - JOUR
AU  - Ochoa, Lina; Hammad, Muhammad; Giray, Görkem; Babur, Önder; Bennin, Kwabena
TI  - Characterising harmful API uses and repair techniques: Insights from a systematic review
PY  - 2025
AB  - NA
SP  - 100732
EP  - 100732
JF  - Computer Science Review
VL  - 57
IS  - NA
PB  -
DO  - 10.1016/j.cosrev.2025.100732
ER  -

TY  - NA
AU  - Butler, Simon; Gamalielsson, Jonas; Lundell, Björn; Jonsson, Per; Sjoberg, Johan; Mattsson, Anders; Ricko, Niklas; Gustavsson, Tomas; Feist, Jonas; Landemoo, Stefan; Lonroth, Erik
TI  - ICSE (SEIP) - An investigation of work practices used by companies making contributions to established OSS projects
PY  - 2018
AB  - Professionals contribute to open source software (OSS) projects as part of their employment. Previous research has addressed motivations of individuals and the ways they engage with OSS projects. However, there is a lack of research which examines and explains work practices used by companies in their engagement with projects. Work practices used by companies to contribute to five established OSS projects are investigated through examination of the actions of employees in public communication channels and draw on our experiences when analysing engagement with the same projects. We find that companies utilise work practices for contributing which are congruent with the circumstances and their capabilities that support their short and long term needs. We find that companies contribute to OSS projects in different ways, such as employing core project developers, making donations, and joining project steering committees in order to advance strategic interests.
SP  - 201
EP  - 210
JF  - Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3183519.3183531
ER  -

TY  - JOUR
AU  - Alghaswyneh, Odai Falah
TI  - Exploring the Impact of AI-Driven Personalization on Consumer Engagement in Digital Marketing
PY  - 2025
AB  - Research background and purpose: This study examines AI-driven impact personalization on consumer engagement within Saudi Arabia’s digital marketing landscape, aligning with Vision 2030 objectives. It underscores the transformative potential of artificial intelligence in enhancing customer interaction, satisfaction, and loyalty by delivering tailored experiences that address consumer preferences. The research focuses on key factors—ethical considerations, technological readiness, organizational culture, and cost—that influence the effectiveness of AI-driven personalization, providing insights into fostering robust consumer relationships and supporting Saudi Arabia’s digital transformation initiatives. Design/methodology/approach: The study uses a descriptive-analytical approach to explore the relationship between AI-driven personalization and consumer engagement. Researchers collected data through a structured questionnaire distributed to a randomly selected sample of 350 participants and analyzed 300 valid responses. They applied statistical methods, including descriptive statistics and correlation analysis, to examine the relationships between variables. Additionally, Cronbach’s alpha evaluated the reliability of the research instruments. Findings: The study reveals a significant positive relationship between AI-driven personalization and consumer engagement. Ethical considerations, particularly data privacy and transparency (correlation coefficient = 0.81), play the most influential role by emphasizing the need for secure and transparent data practices to build trust. Organizational culture (0.75) also plays a crucial role, with innovation and professionalism strengthening consumer trust and loyalty. Technological readiness and cost further enhance engagement, as organizations leverage advanced AI technologies and strategic pricing to deliver personalized experiences. Participants appreciate the convenience, efficiency, and tangible benefits provided by these personalized services. Value added and limitations: This study provides critical insights into the role of AI in Saudi Arabia’s digital economy, emphasizing the integration of ethical standards and technological innovation to gain a competitive edge. However, reliance on self-reported data and a geographically confined sample may limit generalizability. Future research should include broader demographics and additional variables to expand these findings.
SP  - 117
EP  - 144
JF  - Management
VL  - NA
IS  - 1
PB  -
DO  - 10.58691/man/200968
ER  -

TY  - NA
AU  - Vendome, Christopher
TI  - Assisting Software Developers With License Compliance
PY  - NA
AB  - NA
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.21220/s2-xp8w-0w53
ER  -

TY  - JOUR
AU  - Flath, Christoph M.; Friesike, Sascha; Wirth, Marco; Thiesse, Frédéric
TI  - Copy, transform, combine: exploring the remix as a form of innovation
PY  - 2017
AB  - The reuse of existing knowledge is an indispensable part of the creation of novel ideas. In the creative domain knowledge reuse is a common practice known as “remixing”. With the emergence of open internet-based platforms in recent years, remixing has found its way from the world of music and art to the design of arbitrary physical goods. However, despite its obvious relevance for the number and quality of innovations on such platforms, little is known about the process of remixing and its contextual factors. This paper considers the example of Thingiverse, a platform for the 3D printing community that allows its users to create, share, and access a broad range of printable digital models. We present an explorative study of remixing activities that took place on the platform over the course of six years by using an extensive set of data on models and users. On the foundation of these empirically observed phenomena, we formulate a set of theoretical propositions and managerial implications regarding (1) the role of remixes in design communities, (2) the different patterns of remixing processes, (3) the platform features that facilitate remixes, and (4) the profile of the remixing platform’s users.
SP  - 306
EP  - 325
JF  - Journal of Information Technology
VL  - 32
IS  - 4
PB  -
DO  - 10.1057/s41265-017-0043-9
ER  -

TY  - JOUR
AU  - Zhang, Beiqi; Fu, Liming; Liang, Peng; Yu, Jiaxin; Wang, Chong
TI  - Demystifying code snippets in code reviews: a study of the OpenStack and Qt communities and a practitioner survey
PY  - 2024
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 29
IS  - 4
PB  -
DO  - 10.1007/s10664-024-10484-2
ER  -

TY  - BOOK
AU  - Palomba, Fabio; Serebrenik, Alexander; Zaidman, Andy
TI  - BENEVOL - Social debt analytics for improving the management of software evolution tasks
PY  - 2017
AB  - The success of software engineering projects is in a large part dependent on social and organization aspects of the development community. Indeed, it not only depends on the complexity of the product or the number of requirements to be implemented, but also on people, processes, and how they impact the technical side of software development. Social debt represents patterns across the organizational structure around a software system that may lead to additional unforeseen project costs. Condescending behavior, disgruntlement or rage quitting are just some examples of social issues that may occur among the developers of a software project. While the research community has recently investigated the underlying dynamics leading to the introduction of social debt (e.g., the so-called “community smells” which represent symptoms of the presence of social problems in a community), as well as how such debt can be paid off, there is still a noticeable lack of empirical evidence on how social debt impacts software maintenance and evolution. In this paper, we present our position on how social debt can impact technical aspects of source code by presenting a road map toward a deeper understanding of such relationship.
SP  - 18
EP  - 21
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Couldry, Nick; Rodríguez, Clemencia; Bolin, Göran; Cohen, Julie E.; Goggin, Gerard; Kraidy, Marwan M.; Iwabuchi, Koichi; Lee, Kwang-Suk; Qiu, Jack Linchuan; Volkmer, Ingrid; Wasserman, Herman; Zhao, Yuezhi; Koltsova, Olessia; Rakhmani, Inaya; Rincón, Omar; Magallanes-Blanco, Claudia; Thomas, Pradip
TI  - Inequality and Communicative Struggles in Digital Times: a Global Report on Communication for Social Progress
PY  - 2018
AB  - NA
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Ni, Rong; Wang, Jue
TI  - Navigating disruptions: The effects of the pandemic on scientific collaboration and research novelty in Hong Kong
PY  - 2025
AB  - NA
SP  - 101656
EP  - 101656
JF  - Journal of Informetrics
VL  - 19
IS  - 2
PB  -
DO  - 10.1016/j.joi.2025.101656
ER  -

TY  - JOUR
AU  - Shatnawi, Raed
TI  - A Classification of Software Modules into Library and Application Components in the Open-Source Field
PY  - 2016
AB  - Software reuse significantly reduces the costs of software production in the field of open-source software (OSS) development and leads to more reliable software systems. Software metrics have been proposed as indicators of software quality factors such as reusability. However, few empirical research papers have validated the relationship between component reusability and software metrics. This research aims to validate Chidamber and Kemerer (CK) metrics as predictors of software reusability. In order to achieve this goal, an empirical study is conducted to validate metrics in classifying two groups of components: library (reuse-prone) and non-library (less reuse-prone). A nearest-neighbors technique is used to classify library and application components using object-oriented software metrics. The approach is applied to a number of library and application systems available online. The resulting nearest-neighbors models have produced acceptable classification. The results provide evidence of using metrics as surrogates of software reusability when models are evaluated using F-measure. CK metrics can be used to measure component reuse-proneness and can be used to differentiate between library and application components. A nearest-neighbors technique can be used to identify the reuse-prone components in open-source applications.
SP  - 179
EP  - 190
JF  - International Journal of Software Engineering and Its Applications
VL  - 10
IS  - 3
PB  -
DO  - 10.14257/ijseia.2016.10.3.16
ER  -

TY  - NA
AU  - Jamieson, Jack; Yamashita, Naomi; Foong, Eureka
TI  - Predicting open source contributor turnover from value-related discussions: An analysis of GitHub issues
PY  - 2024
AB  - Discussions about project values are important for engineering software that meets diverse human needs and positively impacts society. Because value-related discussions involve deeply held beliefs, they can lead to conflicts or other outcomes that may affect motivations to continue contributing to open source projects. However, it is unclear what kind of value-related discussions are associated with significant changes in turnover. We address this gap by identifying discussions related to important project values and investigating the extent to which those discussions predict project turnover in the following months. We collected logs of GitHub issues and commits from 52 projects that share similar ethical commitments and were identified as part of the DWeb (Decentralized Web) community. We identify issues related to DWeb's core values of respectfulness, freedom, broadmindedness, opposing centralized social power, equity & equality, and protecting the environment. We then use Granger causality analysis to examine how changes in the proportion of discussions related to those values might predict changes in incoming and outgoing turnover. We found multiple significant relationships between value-related discussions and turnover, including that discussions about respectfulness predict an increase in contributors leaving and a decrease in new contributors, while discussions about social power predicted better contributor retention. Understanding these antecedents of contributor turnover is important for managing open source projects that incorporate human-centric issues. Based on the results, we discuss implications for open source maintainers and for future research.
SP  - 1
EP  - 13
JF  - Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3597503.3623340
ER  -

TY  - JOUR
AU  - Wang, Jing; Shih, Patrick C.; Wu, Yu; Carroll, John M.
TI  - Comparative case studies of open source software peer review practices
PY  - 2015
AB  - NA
SP  - 1
EP  - 12
JF  - Information and Software Technology
VL  - 67
IS  - 67
PB  -
DO  - 10.1016/j.infsof.2015.06.002
ER  -

TY  - JOUR
AU  - Moriwaki, Takuya; Igaki, Hiroshi; Yamanaka, Yuki; Yoshida, Norihiro; Kusumoto, Shinji; Inoue, Katsuro
TI  - Towards an Analysis of Who Creates Clone and Who Reuses it
PY  - NA
AB  - Code clone analysis is valuable because it can reveal reuse behaviours efficiently from software repositories. Recently, some code reuse analyses using clone genealogies and code clones over multiple projects were conducted. However, most of the conventional analyses do not consider the developers' individual difference to reuse behaviors. In this paper, we propose a method for code reuse analysis which takes particular note of the differences among individuals. Our analysis method clarifies who reused whose source code across multiple repositories. We believe the result might provide us with constructive perceptions such as characteristics of reused code itself by multiple developers, and developers who implement reusable code.
SP  - NA
EP  - NA
JF  - Electronic Communication of The European Association of Software Science and Technology
VL  - 63
IS  - NA
PB  -
DO  - 10.14279/tuj.eceasst.63.924
ER  -

TY  - JOUR
AU  - Islam, Syful; Gaikovina Kula, Raula; Treude, Christoph; Chinthanet, Bodin; Ishio, Takashi; Matsumoto, Kenichi
TI  - An Empirical Study of Package Management Issues via Stack Overflow
PY  - 2023
AB  - The package manager (PM) is crucial to most technology stacks, acting as a broker to ensure that a verified dependency package is correctly installed, configured, or removed from an application. Diversity in technology stacks has led to dozens of PMs with various features. While our recent study indicates that package management features of PM are related to end-user experiences, it is unclear what those issues are and what information is required to resolve them. In this paper, we have investigated PM issues faced by end-users through an empirical study of content on Stack Overflow (SO). We carried out a qualitative analysis of 1,131 questions and their accepted answer posts for three popular PMs (i.e., Maven, npm, and NuGet) to identify issue types, underlying causes, and their resolutions. Our results confirm that end-users struggle with PM tool usage (approximately 64-72%). We observe that most issues are raised by end-users due to lack of instructions and errors messages from PM tools. In terms of issue resolution, we find that external link sharing is the most common practice to resolve PM issues. Additionally, we observe that links pointing to useful resources (i.e., official documentation websites, tutorials, etc.) are most frequently shared, indicating the potential for tool support and the ability to provide relevant information for PM end-users.
SP  - 138
EP  - 147
JF  - IEICE Transactions on Information and Systems
VL  - E106.D
IS  - 2
PB  -
DO  - 10.1587/transinf.2022mpp0001
ER  -

TY  - JOUR
AU  - NA
TI  - Agile Transformation in Public Sector IT Projects Using Lean-Agile Change Management and Enterprise Architecture Alignment
PY  - 2024
AB  - This review explores the strategic integration of Lean-Agile change management frameworks and enterprise architecture (EA) alignment in driving successful agile transformations within public sector IT projects. Unlike the private sector, public institutions often operate under rigid regulatory frameworks, legacy systems, and complex stakeholder landscapes, which pose significant barriers to agile adoption. This paper critically examines how Lean-Agile principles—such as incremental delivery, customer-centric value streams, and cross-functional collaboration—can be tailored to the public sector’s unique operational constraints to enhance IT service delivery, transparency, and responsiveness. The study also evaluates the pivotal role of enterprise architecture in aligning business goals with agile practices by providing a structural blueprint that harmonizes governance, digital infrastructure, and evolving stakeholder needs. Key focus areas include the synchronization of EA layers with agile portfolios, the role of leadership in orchestrating agile governance, and the adoption of value stream mapping to eliminate bureaucratic inefficiencies. Through the synthesis of case studies, industry frameworks like SAFe (Scaled Agile Framework) and TOGAF (The Open Group Architecture Framework), and empirical findings, the review highlights best practices, implementation challenges, and metrics for evaluating transformation success. Ultimately, the paper provides a comprehensive roadmap for public organizations aiming to modernize IT delivery through scalable, sustainable agile transformation strategies rooted in enterprise alignment and lean governance principles.
SP  - 21
EP  - 39
JF  - International Journal of Scientific Research and Modern Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.38124/ijsrmt.v3i8.432
ER  -

TY  - CHAP
AU  - Viseur, Robert; Charleux, Amel
TI  - OSS - Open Source Communities and Forks: A Rereading in the Light of Albert Hirschman's Writings
PY  - 2021
AB  - The literature dedicated to free and open source software emphasizes the support given by the community to software producers. However, the community is also a place of conflict and can sometimes experience violent splits (forks). Communities can show different forms of resistance to change. In this research, we propose a re-reading of these mechanisms of opposition in light of Albert Hirschman's theory (exit, voice, loyalty). We present the fork as a new form of defection (exit) allowed by licenses and discuss the rationality of choice for the economic actors who implement it.
SP  - 59
EP  - 67
JF  - IFIP Advances in Information and Communication Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-030-75251-4_6
ER  -

TY  - JOUR
AU  - Ruohonen, Jukka; Rauti, Sampsa; Hyrynsalmi, Sami; Leppänen, Ville
TI  - A case study on software vulnerability coordination
PY  - 2018
AB  - Context: Coordination is a fundamental tenet of software engineering. Coordination is required also for identifying discovered and disclosed software vulnerabilities with Common Vulnerabilities and Exposures (CVEs). Motivated by recent practical challenges, this paper examines the coordination of CVEs for open source projects through a public mailing list. Objective: The paper observes the historical time delays between the assignment of CVEs on a mailing list and the later appearance of these in the National Vulnerability Database (NVD). Drawing from research on software engineering coordination, software vulnerabilities, and bug tracking, the delays are modeled through three dimensions: social networks and communication practices, tracking infrastructures, and the technical characteristics of the CVEs coordinated. Method: Given a period between 2008 and 2016, a sample of over five thousand CVEs is used to model the delays with nearly fifty explanatory metrics. Regression analysis is used for the modeling. Results: The results show that the CVE coordination delays are affected by different abstractions for noise and prerequisite constraints. These abstractions convey effects from the social network and infrastructure dimensions. Particularly strong effect sizes are observed for annual and monthly control metrics, a control metric for weekends, the degrees of the nodes in the CVE coordination networks, and the number of references given in NVD for the CVEs archived. Smaller but visible effects are present for metrics measuring the entropy of the emails exchanged, traces to bug tracking systems, and other related aspects. The empirical signals are weaker for the technical characteristics. Conclusion: Software vulnerability and CVE coordination exhibit all typical traits of software engineering coordination in general. The coordination perspective elaborated and the case studied open new avenues for further empirical inquiries as well as practical improvements for the contemporary CVE coordination.
SP  - 239
EP  - 257
JF  - Information and Software Technology
VL  - 103
IS  - NA
PB  -
DO  - 10.1016/j.infsof.2018.06.005
ER  -

TY  - CHAP
AU  - Wang, Ying; Cheung, Shing-Chi; Yu, Hai; Zhu, Zhiliang
TI  - Common Types of Dependency Issues
PY  - 2024
AB  - NA
SP  - 35
EP  - 52
JF  - Managing Software Supply Chains
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-981-96-1797-5_3
ER  -

TY  - CHAP
AU  - Nketsiah, Richard Nana; Millham, Richard C.; Agbehadji, Israel Edem; Freeman, Emmanuel; Epizitone, Ayogeboh
TI  - Optimising a Formulated Cost Model to Minimise Labour Cost of Computer Networking Infrastructure: A Systematic Review
PY  - 2023
AB  - NA
SP  - 427
EP  - 442
JF  - Communications in Computer and Information Science
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-031-48858-0_34
ER  -

TY  - NA
AU  - Reid, David; Rahkema, Kristiina; Walden, James
TI  - Large Scale Study of Orphan Vulnerabilities in the Software Supply Chain
PY  - 2023
AB  - The security of the software supply chain has become a critical issue in an era where the majority of software projects use open source software dependencies, exposing them to vulnerabilities in those dependencies. Awareness of this issue has led to the creation of dependency tracking tools that can identify and remediate such vulnerabilities. These tools rely on package manager metadata to identify dependencies, but open source developers often copy dependencies into their repositories manually without the use of a package manager.
SP  - 22
EP  - 32
JF  - Proceedings of the 19th International Conference on Predictive Models and Data Analytics in Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3617555.3617872
ER  -

TY  - NA
AU  - Cai, Jie; Lin, Ya-Fang; Zhang, He; Carroll, John M.
TI  - Third-Party Developers and Tool Development For Community Management on Live Streaming Platform Twitch
PY  - 2024
AB  - Community management is critical for stakeholders to collaboratively build and sustain communities with socio-technical support. However, most of the existing research has mainly focused on the community members and the platform, with little attention given to the developers who act as intermediaries between the platform and community members and develop tools to support community management. This study focuses on third-party developers (TPDs) for the live streaming platform Twitch and explores their tool development practices. Using a mixed method with in-depth qualitative analysis, we found that TPDs maintain complex relationships with different stakeholders (streamers, viewers, platform, professional developers), and the multi-layered policy restricts their agency regarding idea innovation and tool development. We argue that HCI research should shift its focus from tool users to tool developers with regard to community management. We propose designs to support closer collaboration between TPDs and the platform and professional developers and streamline TPDs' development process with unified toolkits and policy documentation.
SP  - 1
EP  - 18
JF  - Proceedings of the CHI Conference on Human Factors in Computing Systems
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3613904.3642787
ER  -

TY  - NA
AU  - Li, Hanlin; Ajmani, Leah; Zhou, Moyan; Vincent, Nicholas; Hwang, Sohyeon; Piccardi, Tiziano; Narayan, Sneha; Daniel, Sherae; Veselovsky, Veniamin
TI  - Ethical Tensions, Norms, and Directions in the Extraction of Online Volunteer Work
PY  - 2022
AB  - Online volunteer work such as moderating forums and participating in open source projects not only underpins today's digital infrastructures, but also helps companies generate immense profits. However, there remains a lack of ethical norms around using volunteer labor for corporate interests, opening opportunities for unchecked extraction of online volunteer work at scale. Early evidence suggests that the extraction of online volunteer work may have negative implications on the tech ecosystem and obfuscate the potential for exploitative labor practices. In this workshop, we invite participants to discuss 1) what ethical tensions exist in the current approaches to extracting online volunteer work, 2) what ethical norms should be followed or recommended and 3) what are the opportunities for social computing technologies to promote these norms. Furthermore, we open a dialogue around whether online platforms should be providing non-monetary compensation, such as education and resources, that is often promised in in-person volunteer settings. We plan to involve a diversity of roles beyond academic researchers, such as online volunteers and practitioners to discuss these questions.
SP  - 273
EP  - 277
JF  - Companion Publication of the 2022 Conference on Computer Supported Cooperative Work and Social Computing
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3500868.3560923
ER  -

TY  - NA
AU  - Linåker, Johan; Runeson, Per
TI  - Sustaining Open Data as a Digital Common – Design principles for Common Pool Resources applied to Open Data Ecosystems
PY  - 2022
AB  - Motivation. Digital commons is an emerging phenomenon and of increasing importance, as we enter a digital society. Open data is one example that makes up a pivotal input and foundation for many of today's digital services and applications. Ensuring sustainable provisioning and maintenance of the data, therefore, becomes even more important. Aim. We aim to investigate how such provisioning and maintenance can be collaboratively performed in the community surrounding a common. Specifically, we look at Open Data Ecosystems (ODEs), a type of community of actors, openly sharing and evolving data on a technological platform. Method. We use Elinor Ostrom's design principles for Common Pool Resources as a lens to systematically analyze the governance of earlier reported cases of ODEs using a theory-oriented software engineering framework. Results. We find that, while natural commons must regulate consumption, digital commons such as open data maintained by an ODE must stimulate both use and data provisioning. Governance needs to enable such stimulus while also ensuring that the collective action can still be coordinated and managed within the frame of available maintenance resources of a community. Subtractability is, in this sense, a concern regarding the resources required to maintain the quality and value of the data, rather than the availability of data. Further, we derive empirically-based recommended practices for ODEs based on the design principles by Ostrom for how to design a governance structure in a way that enables a sustainable and collaborative provisioning and maintenance of the data. Conclusion. ODEs are expected to play a role in data provisioning which democratize the digital society and enables innovation from smaller commercial actors. Our empirically based guidelines intend to support this development.
SP  - NA
EP  - NA
JF  - Proceedings of the 18th International Symposium on Open Collaboration
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3555051.3555066
ER  -

TY  - JOUR
AU  - DION-POULIN, ALEXANDRA; VEILLEUX, SOPHIE; PERREAULT, VÉRONIQUE; TURGEON, SYLVIE L.
TI  - MANAGING THE CO-CREATION PROCESS: WHEN THE CAKE DOES NOT RISE
PY  - 2023
AB  - Co-creation is recognised in the literature as fostering successful collaboration between academia and industry. Although models do exist, they only contain general principles and provide no details about the process from ideation to value creation. Moreover, they are established based on a consideration that industry submits a problem and the university provides solutions. However, with increasing pressure on researchers for their research to lead to tangible applications, universities must now also turn to firms to pinpoint their needs and practices. The purpose of this paper is to understand how a researcher can implement and manage a co-creation project in collaboration with firms to foster innovation. A university research team in food science and technology, in response to the issue of allergen management in the food service industry, more specifically the use of eggs in pastries, has led a co-creation project with six professional pastry chefs to improve cake formulations, in which eggs were replaced with legume puree. Based on the results and the literature, a model to manage the co-creation process between academia and industry that incorporates a collaboration platform is proposed. This paper also identifies the concrete practices that foster creativity and interaction among participants and that lead to innovation.
SP  - NA
EP  - NA
JF  - International Journal of Innovation Management
VL  - 27
IS  - 5
PB  -
DO  - 10.1142/s136391962340008x
ER  -

TY  - NA
AU  - Ding, Hui; Ma, Wanwangying; Chen, Lin; Zhou, Yuming; Xu, Baowen
TI  - APSEC - An Empirical Study on Downstream Workarounds for Cross-Project Bugs
PY  - 2017
AB  - GitHub has fostered complicated and enormous software ecosystems, in which projects depend on and co-evolve with each other. An error in an upstream project may affect its downstream projects through inter-dependencies, forming cross-project bugs. Though the upstream developers should fix the bugs on their side, proposing a workaround, i.e., a temporary solution in the downstream project, is a common practice for the downstream developers. In this study, we empirically investigated the characteristics of downstream workarounds in the scientific Python ecosystem. Combining the statistical comparisons and manual inspection, we have the following three main findings. First, in general, the workarounds and the corresponding upstream fixes are significantly different in code size and code structure. Second, there are three kinds of cross-project bugs that the downstream developers usually work around. Last, four types of common patterns are identified from the investigated workarounds. The findings of this study lead to better understanding of cross-project bugs and the practices of developers in software ecosystems.
SP  - 318
EP  - 327
JF  - 2017 24th Asia-Pacific Software Engineering Conference (APSEC)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/apsec.2017.38
ER  -

TY  - NA
AU  - Meloca, Rômulo; Pinto, Gustavo; Baiser, Leonardo; Mattos, Marco; Polato, Ivanilton; Wiese, Igor; German, Daniel M.
TI  - MSR - Understanding the usage, impact, and adoption of non-OSI approved licenses
PY  - 2018
AB  - The software license is one of the most important non-executable pieces of any software system. However, due to its non-technical nature, developers often misuse or misunderstand software licenses. Although previous studies reported problems related to license clashes and inconsistencies, in this paper we shed light on an important yet overlooked issue: the use of non-approved open-source licenses. Such licenses claim to be open-source, but have not been formally approved by the Open Source Initiative (OSI). When a developer releases software under a non-approved license, even if the interest is to make it open-source, the original author might not be granting the rights required by those who use the software. To uncover the reasons behind the use of non-approved licenses, we conducted a mixed-method study, mining data from 657K open-source projects and their 4,367K versions, and surveying 76 developers that published some of these projects. Although 1,058,554 of the project versions employ at least one non-approved license, non-approved licenses account for 21.51% of license usage. We also observed that it is not uncommon for developers to change from a non-approved to an approved license. When asked, some developers mentioned that this transition was due to a better understanding of the disadvantages of using a non-approved license. This perspective is particularly important since developers often rely on package managers to easily and quickly get their dependencies working.
SP  - 270
EP  - 280
JF  - Proceedings of the 15th International Conference on Mining Software Repositories
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3196398.3196427
ER  -

TY  - JOUR
AU  - Luo, Zhenhao; Wang, Baosheng; Tang, Yong; Xie, Wei
TI  - Semantic-Based Representation Binary Clone Detection for Cross-Architectures in the Internet of Things
PY  - 2019
AB  - Code reuse is widespread in software development as well as internet of things (IoT) devices. However, code reuse introduces many problems, e.g., software plagiarism and known vulnerabilities. Solving these problems requires extensive manual reverse analysis. Fortunately, binary clone detection can help analysts mitigate manual work by matching reusable code and known parts. However, many binary clone detection methods are not robust to various compiler optimization options and different architectures. While some clone detection methods can be applied across different architectures, they rely on manual features based on human prior knowledge to generate feature vectors for assembly functions and fail to consider the internal associations between features from a semantic perspective. To address this problem, we propose and implement a prototype GeneDiff, a semantic-based representation binary clone detection approach for cross-architectures. GeneDiff utilizes a representation model based on natural language processing (NLP) to generate high-dimensional numeric vectors for each function based on the Valgrind intermediate representation (VEX). This is the first work that translates assembly instructions into an intermediate representation and uses a semantic representation model to implement clone detection for cross-architectures. GeneDiff is robust to various compiler optimization options and different architectures. Compared to approaches using symbolic execution, GeneDiff is significantly more efficient and accurate. The area under the curve (AUC) of the receiver operating characteristic (ROC) of GeneDiff reaches 92.35%, which is considerably higher than the approaches that use symbolic execution. Extensive experiments indicate that GeneDiff can detect similarity with high accuracy even when the code has been compiled with different optimization options and targeted to different architectures. We also use real-world IoT firmware across different architectures as targets, therein proving the practicality of GeneDiff in being able to detect known vulnerabilities.
SP  - 3283
EP  - NA
JF  - Applied Sciences
VL  - 9
IS  - 16
PB  -
DO  - 10.3390/app9163283
ER  -

TY  - CHAP
AU  - Poo-Caamaño, Germán; German, Daniel M.
TI  - OSS - The Right to a Contribution: An Exploratory Survey on How Organizations Address It
PY  - 2015
AB  - Free and Open Source Software (FOSS) projects are characterized by the opportunity to attract external contributors, where contributions can be in any form of copyrightable material, such as code or documentation. In most of them it is understood that contributions would be licensed in similar or compatible terms than the project’s license. Some projects require a copyright transfer from the contributor to an organization for the work contributed to a project, such documents are known as copyright assignment agreements. In a way, it is similar to the copyright transfer than some researchers grant to a publisher. In this work we present an exploratory survey of the multiple visions of copyright assignments, and aggregate them in a work that researchers and practitioners could use to get informed of the alternatives available in the literature. We expect that our findings help inform practitioners on legal concerns when receiving external contributions.
SP  - 157
EP  - 167
JF  - IFIP Advances in Information and Communication Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-17837-0_15
ER  -

TY  - JOUR
AU  - German, Daniel M.; Di Penta, M.
TI  - A Method for Open Source License Compliance of Java Applications
PY  - 2012
AB  - Open source license compliance (OSLC) is the process of ensuring that an organization satisfies the licensing requirements of the open source software it reuses, whether for its internal use or as a part of a product it ships. The major challenges of OSLC include component identification, provenance discovery, license identification, and licensing requirements analysis. Kenen is an approach that assists organizations in OSLC for Java components.
SP  - 58
EP  - 63
JF  - IEEE Software
VL  - 29
IS  - 3
PB  -
DO  - 10.1109/ms.2012.50
ER  -

TY  - JOUR
AU  - Cao, Yulu; Chen, Lin; Ma, Wanwangying; Li, Yanhui; Zhou, Yuming; Wang, Linzhang
TI  - Towards Better Dependency Management: A First Look at Dependency Smells in Python Projects
PY  - 2023
AB  - Managing cross-project dependencies is tricky in modern software development. A primary way to manage dependencies is using dependency configuration files, which brings convenience to the entire software ecosystem, including developers, maintainers, and users. However, developers may introduce dependency smells if dependency configuration files are not well written and maintained. Dependency smells are recurring violations of dependency management in dependency configuration files and can potentially lead to severe consequences. This paper provides an in-depth look at three dependency smells, namely, Missing Dependency, Bloated Dependency, and Version Constraint Inconsistency in Python projects. First, we implement a tool called Python Cross-project Dependency (PyCD) to accurately extract dependency information from configuration files. The evaluation result on 212 Python projects shows that PyCD outperforms state-of-the-art tools. Then, we make an empirical study for three dependency smells in 132 Python projects to investigate the pervasiveness, causes, and evolution. The results show that: 1) dependency smells are prevalent in Python projects and exist inconsistently in different projects; 2) dependency smells are introduced into Python projects for different reasons, mainly due to the problems of synchronous update and collaborative development; and 3) dependency smells can be removed with different patterns according to different dependency smells. Furthermore, we report and get responses for 40 harmful dependency smell instances, 34 of which have been responded that these dependency smells do exist in the projects, and 10 instances are fixed or under process. The feedback from developers indicates that dependency smells can have a negative impact on project maintenance. Our study highlights that these dependency smells deserve the attention of developers.
SP  - 1741
EP  - 1765
JF  - IEEE Transactions on Software Engineering
VL  - 49
IS  - 4
PB  -
DO  - 10.1109/tse.2022.3191353
ER  -

TY  - JOUR
AU  - Cheng, Kathy; Zhou, Shurui; Olechowski, Alison
TI  - "A Lot of Moving Parts": A Case Study of Open-Source Hardware Design Collaboration in the Thingiverse Community
PY  - 2024
AB  - Open-source is a decentralized and collaborative method of development that encourages open contribution from an extensive and undefined network of individuals. Although commonly associated with software development (OSS), the open-source model extends to hardware development, forming the basis of open-source hardware development (OSH). Compared to OSS, OSH is relatively nascent, lacking adequate tooling support from existing platforms and best practices for efficient collaboration. Taking a necessary step towards improving OSH collaboration, we conduct a detailed case study of DrawBot, a successful OSH project that remarkably fostered a long-term collaboration on Thingiverse - a platform not explicitly intended for complex collaborative design. Through analyzing comment threads and design changes over the course of the project, we found how collaboration occurred, the challenges faced, and how the DrawBot community managed to overcome these obstacles. Beyond offering a detailed account of collaboration practices and challenges, our work contributes best practices, design implications, and practical implications for OSH project maintainers, platform builders, and researchers, respectively. With these insights and our publicly available dataset, we aim to foster more effective and efficient collaborative design in OSH projects.
SP  - 1
EP  - 29
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 8
IS  - CSCW2
PB  -
DO  - 10.1145/3687008
ER  -

TY  - JOUR
AU  - Mens, Tom; Cataldo, Marcelo; Damian, Daniela
TI  - The Social Developer: The Future of Software Development [Guest Editors' Introduction]
PY  - 2019
AB  - Contemporary Software Engineering has inevitably become much more social. Due to the size, complexity, and diversity of today's software systems, there is a need to interact across organizational, geographical, cultural, and socioeconomic boundaries. Large-scale software development now implies active user involvement and requires close cooperation and collaboration between team members and all types of development activities. Members of software projects across all roles must communicate and interact continuously with other project members as well as with a variety of stakeholders, such as users, analysts, suppliers, customers, and business partners. This theme issue aims to inform software engineering practitioners about current trends and recent advances in research and practice of sociotechnical analysis and support for large-scale software development.
SP  - 11
EP  - 14
JF  - IEEE Software
VL  - 36
IS  - 1
PB  -
DO  - 10.1109/ms.2018.2874316
ER  -

TY  - NA
AU  - Jiang, Ling; Yuan, Hengchen; Tang, Qiyi; Nie, Sen; Wu, Shi; Zhang, Yuqun
TI  - Third-Party Library Dependency for Large-Scale SCA in the C/C++ Ecosystem: How Far Are We?
PY  - 2023
AB  - Existing software composition analysis (SCA) techniques for the C/C++ ecosystem tend to identify the reused components through feature matching between target software project and collected third-party libraries (TPLs). However, feature duplication caused by internal code clone can cause inaccurate SCA results. To mitigate this issue, Centris, a state-of-the-art SCA technique for the C/C++ ecosystem, was proposed to adopt function-level code clone detection to derive the TPL dependencies for eliminating the redundant features before performing SCA tasks. Although Centris has been shown effective in the original paper, the accuracy of the derived TPL dependencies is not evaluated. Additionally, the dataset to evaluate the impact of TPL dependency on SCA is limited. To further investigate the efficacy and limitations of Centris, we first construct two large-scale ground-truth datasets for evaluating the accuracy of deriving TPL dependency and SCA results respectively. Then we extensively evaluate Centris where the evaluation results suggest that the accuracy of TPL dependencies derived by Centris may not well generalize to our evaluation dataset. We further infer the key factors that degrade the performance can be the inaccurate function birth time and the threshold-based recall. In addition, the impact on SCA from the TPL dependencies derived by Centris can be somewhat limited. Inspired by our findings, we propose TPLite with function-level origin TPL detection and graph-based dependency recall to enhance the accuracy of TPL reuse detection in the C/C++ ecosystem. Our evaluation results indicate that TPLite effectively increases the precision from 35.71% to 88.33% and the recall from 49.44% to 62.65% of deriving TPL dependencies compared with Centris. Moreover, TPLite increases the precision from 21.08% to 75.90% and the recall from 57.62% to 64.17% compared with the SOTA academic SCA tool B2SFinder and even outperforms the well-adopted commercial SCA tool BDBA, i.e., increasing the precision from 72.46% to 75.90% and the recall from 58.55% to 64.17%.
SP  - 1383
EP  - 1395
JF  - Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3597926.3598143
ER  -

TY  - BOOK
AU  - Lundell, Björn; Gamalielsson, Jonas
TI  - SER&IP@ICSE - Collaborative research involving small companies: experiences from co-production of knowledge for research and practice through use of an action case approach
PY  - 2017
AB  - In order for the conduct of collaborative research projects and their outcomes to be valuable for both research and practice it is necessary to successfully address a number of socio-technical challenges in the field of software engineering. Collaborative research involving researchers and practitioners related to software systems have utilised a variety of different research approaches. Adoption of an effective research approach for the situation at hand in a research project may significantly contribute to project success. Experiences from collaborative research show that action case can be an appropriate choice of approach for addressing socio-technical challenges in the software domain, which is appealing to both practitioners and researchers. This paper elaborates on a number of challenges for successful conduct of collaborative research projects and reports on experiences from use of action case as a research approach for conduct of collaborative research related to software systems.
SP  - 24
EP  - 30
JF  - 2017 IEEE/ACM 4th International Workshop on Software Engineering Research and Industrial Practice (SER&IP)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/ser-ip.2017..4
ER  -

TY  - JOUR
AU  - Terzi, Anastasia; Bibi, Stamatia
TI  - Opening Software Research Data 5Ws+1H
PY  - 2024
AB  - Open Science describes the movement of making any research artifact available to the public, fostering sharing and collaboration. While sharing the source code is a popular Open Science practice in software research and development, there is still a lot of work to be done to achieve the openness of the whole research and development cycle from the conception to the preservation phase. In this direction, the software engineering community faces significant challenges in adopting open science practices due to the complexity of the data, the heterogeneity of the development environments and the diversity of the application domains. In this paper, through the discussion of the 5Ws+1H (Why, Who, What, When, Where, and How) questions, referred to as Kipling’s framework, we aim to provide a structured guideline to motivate and assist the software engineering community on the journey to data openness. Also, we demonstrate the practical application of these guidelines through a use case on opening research data.
SP  - 411
EP  - 441
JF  - Software
VL  - 3
IS  - 4
PB  -
DO  - 10.3390/software3040021
ER  -

TY  - JOUR
AU  - Sharif, Khaironi Yatim; English, Michael; Ali, Nour; Exton, Chris; Collins, John; Buckley, Jim
TI  - An empirically-based characterization and quantification of information seeking through mailing lists during Open Source developers’ software evolution
PY  - 2015
AB  - Context: Several authors have proposed information seeking as an appropriate perspective for studying software evolution. Empirical evidence in this area suggests that substantial time delays can accrue, due to the unavailability of required information, particularly when this information must travel across geographically distributed sites. Objective: As a first step in addressing the time delays that can occur in information seeking for distributed Open Source (OS) programmers during software evolution, this research characterizes the information seeking of OS developers through their mailing lists. Method: A longitudinal study that analyses 17 years of developer mailing list activity in total, over 6 different OS projects is performed, identifying the prevalent information types sought by developers, from a qualitative, grounded analysis of this data. Quantitative analysis of the number-of-responses and response time-lag is also performed. Results: The analysis shows that Open Source developers are particularly implementation centric and team focused in their use of mailing lists, mirroring similar findings that have been reported in the literature. However novel findings include the suggestion that OS developers often require support regarding the technology they use during development, that they refer to documentation fairly frequently and that they seek implementation-oriented specifics based on system design principles that they anticipate in advance. In addition, response analysis suggests a large variability in the response rates for different types of questions, and particularly that participants have difficulty ascertaining information on other developer’s activities. Conclusion: The findings provide insights for those interested in supporting the information needs of OS developer communities: They suggest that the tools and techniques developed in support of co-located developers should be largely mirrored for these communities: that they should be implementation centric, and directed at illustrating “how” the system achieves its functional goals and states. Likewise they should be directed at determining the reason for system bugs: a type of question frequently posed by OS developers but less frequently responded to.
SP  - 77
EP  - 94
JF  - Information and Software Technology
VL  - 57
IS  - 57
PB  -
DO  - 10.1016/j.infsof.2014.09.003
ER  -

TY  - CHAP
AU  - da Silva, Antonio Cesar Brandao Gomes; de Figueiredo Carneiro, Glauco; Monteiro, Miguel P.; Brito e Abreu, Fernando; Constantino, Kattiana; Figueiredo, Eduardo
TI  - On the Impact of Product Quality Attributes on Open Source Project Evolution
PY  - 2017
AB  - Context: Several Open Source Software (OSS) projects have adopted frequent releases as a strategy to deliver both new features and fixed bugs on time. This cycle begins with express requests from the project’s community, registered as issues in bug repositories by active users and developers. Each OSS project has its own priorities established by their respective communities. A still open question is the set of criteria and priorities that influence the decisions of which issues should be analyzed, implemented/solved and delivered in next releases. In this paper, we present an exploratory study whose goal is to investigate the influence of target product quality attributes in software evolution practices of OSS projects. The goal is to search for evidence of relationships between these target attributes, priorities assigned to the registered issues and the ways they are delivered by product releases. To this end, we asked six participants of an exploratory study to identify these attributes through the data analysis of repositories of three well-known OSS projects: Libre Office, Eclipse and Mozilla Firefox. Evidence indicated by the participants suggests that OSS community developers use criteria/priorities driven by specific software product quality attributes, to plan and integrate software releases.
SP  - 613
EP  - 620
JF  - Advances in Intelligent Systems and Computing
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-54978-1_77
ER  -

TY  - JOUR
AU  - Cohoon, Johanna
TI  - *READ**THIS*!! Spam as a threat for open science
PY  - 2024
AB  - Drawing on multiple sources of qualitative data, I describe a case of open science infrastructure (OSI) abuse. The case illustrates how developers navigated scholarly value tensions and issues of epistemic and platform legitimacy while battling spam on their open science webapp. Notably, their struggle used precious financial resources and drew attention away from other development tasks like feature expansion. This research makes evident that not only is OSI abuse like spam a financial burden, but it puts scholarly information security—specifically, the legitimacy of open science content—at risk. However, protecting against such abuse is not a trivial matter; it raises questions of who is responsible for defining and enforcing scholarly values. The urgency of this issue is magnified by OSI’s relationship to public trust in science.
SP  - NA
EP  - NA
JF  - New Media & Society
VL  - NA
IS  - NA
PB  -
DO  - 10.1177/14614448241248655
ER  -

TY  - JOUR
AU  - Wessel, Mairieli; Serebrenik, Alexander; Wiese, Igor; Steinmacher, Igor; Gerosa, Marco A.
TI  - Quality gatekeepers: investigating the effects of code review bots on pull request activities
PY  - 2022
AB  - Software bots have been facilitating several development activities in Open Source Software (OSS) projects, including code review. However, these bots may bring unexpected impacts to group dynamics, as frequently occurs with new technology adoption. Understanding and anticipating such effects is important for planning and management. To analyze these effects, we investigate how several activity indicators change after the adoption of a code review bot. We employed a regression discontinuity design on 1,194 software projects from GitHub. We also interviewed 12 practitioners, including open-source maintainers and contributors. Our results indicate that the adoption of code review bots increases the number of monthly merged pull requests, decreases monthly non-merged pull requests, and decreases communication among developers. From the developers’ perspective, these effects are explained by the transparency and confidence the bot comments introduce, in addition to the changes in the discussion focused on pull requests. Practitioners and maintainers may leverage our results to understand, or even predict, bot effects on their projects.
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 27
IS  - 5
PB  -
DO  - 10.1007/s10664-022-10130-9
ER  -

TY  - JOUR
AU  - Shukla, Sanjai Kumar; Sushil, NA
TI  - Evaluating the Practices of Flexibility Maturity for the Software Product and Service Organizations
PY  - 2020
AB  - NA
SP  - 71
EP  - 89
JF  - International Journal of Information Management
VL  - 50
IS  - NA
PB  -
DO  - 10.1016/j.ijinfomgt.2019.05.005
ER  -

TY  - JOUR
AU  - Young, Amber Grace; Majchrzak, Ann; Kane, Gerald C.
TI  - Reflection on Writing a Theory Paper: How to Theorize for the Future
PY  - 2021
AB  - NA
SP  - 1212
EP  - 1223
JF  - Journal of the Association for Information Systems
VL  - 22
IS  - 5
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Gbagir, Augustine-Moses Gaavwase; Ek, Kylli; Colpaert, Alfred
TI  - OpenDroneMap: Multi-Platform Performance Analysis
PY  - 2023
AB  - This paper analyzes the performance of the open-source OpenDroneMap image processing software (ODM) across multiple platforms. We tested desktop and laptop computers as well as high-performance cloud computing and supercomputers. Multiple machine configurations (CPU cores and memory) were used. We used eBee S.O.D.A. drone image datasets from Namibia and northern Finland. For testing, we used the OpenDroneMap command line tool with default settings and the fast orthophoto option, which produced a good quality orthomosaic. We also used the “rerun-all option” to ensure that all jobs started from the same point. Our results show that ODM processing time is dependent upon the number of images, a high number of which can lead to high memory demands, with low memory leading to an excessively long processing time. Adding additional CPU cores is beneficial to ODM up to a certain limit. A 20-core machine seems optimal for a dataset of about 1000 images, although 10 cores will result only in slightly longer processing times. We did not find any indication of improvement when processing larger datasets using 40-core machines. For 1000 images, 64 GB memory seems to be sufficient, but for larger datasets of about 8000 images, higher memory of up to 256 GB is required for efficient processing. ODM can use GPU acceleration, at least in some processing stages, reducing processing time. In comparison to commercial software, ODM seems to be slower, but the created orthomosaics are of equal quality.
SP  - 446
EP  - 458
JF  - Geographies
VL  - 3
IS  - 3
PB  -
DO  - 10.3390/geographies3030023
ER  -

TY  - NA
AU  - Bauer, Veronika; Volke, Tobias; Eder, Sebastian
TI  - IWSC@SANER - Combining Clone Detection and Latent Semantic Indexing to Detect Re-implementations
PY  - 2016
AB  - Semantic redundancies are frequently reported in practice and cause increased efforts for development and maintenance. However, instances are hard to find with existing approaches that tend to deliver a daunting number of imprecise findings for this specific problem. Can these issues be mitigated by combining different detection techniques? In this paper, we investigate whether a combination of clone detection and latent semantic indexing improves the detection of candidate re-implementations. We evaluate the combination of both techniques on an industrial system, assess the results of both techniques, characterize the different findings, and present a practitioner judgement of their relevance. Our findings suggest that (1) latent semantic indexing and clone detection complement each other, (2) aggregated clone detection can be a better indicator for re-implementations than LSI, and (3) the combination of the techniques provides high quality result sets which were considered relevant and actionable by practitioners.
SP  - 23
EP  - 29
JF  - 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)
VL  - 3
IS  - NA
PB  -
DO  - 10.1109/saner.2016.26
ER  -

TY  - JOUR
AU  - Abhadiomhen, Stanley Ebhohimhen; Nzeakor, Emmanuel Onyekachukwu; Oyibo, Kiemute
TI  - Health Risk Assessment Using Machine Learning: Systematic Review
PY  - 2024
AB  - According to the World Health Organization, chronic illnesses account for over 70% of deaths globally, underscoring the need for effective health risk assessment (HRA). While machine learning (ML) has shown potential in enhancing HRA, no systematic review has explored its application in general health risk assessments. Existing reviews typically focus on specific conditions. This paper reviews published articles that utilize ML for HRA, and it aims to identify the model development methods. A systematic review following Tranfield et al.’s three-stage approach was conducted, and it adhered to the PRISMA protocol. The literature was sourced from five databases, including PubMed. Of the included articles, 42% (11/26) addressed general health risks. Secondary data sources were most common (14/26, 53.85%), while primary data were used in eleven studies, with nine (81.81%) using data from a specific population. Random forest was the most popular algorithm, which was used in nine studies (34.62%). Notably, twelve studies implemented multiple algorithms, while seven studies incorporated model interpretability techniques. Although these studies have shown promise in addressing digital health inequities, more research is needed to include diverse sample populations, particularly from underserved communities, to enhance the generalizability of existing models. Furthermore, model interpretability should be prioritized to ensure transparent, trustworthy, and broadly applicable healthcare solutions.
SP  - 4405
EP  - 4405
JF  - Electronics
VL  - 13
IS  - 22
PB  -
DO  - 10.3390/electronics13224405
ER  -

TY  - JOUR
AU  - Dennehy, Denis; Conboy, Kieran; Ferreira, Jennifer; Babu, Jaganath
TI  - Sustaining Open Source Communities by Understanding the Influence of Discursive Manifestations on Sentiment
PY  - 2020
AB  - NA
SP  - 241
EP  - 257
JF  - Information Systems Frontiers
VL  - 25
IS  - 1
PB  -
DO  - 10.1007/s10796-020-10059-8
ER  -

TY  - JOUR
AU  - Reid, Brittany; d'Amorim, Marcelo; Wagner, Markus; Treude, Christoph
TI  - NCQ: Code Reuse Support for Node.js Developers
PY  - 2023
AB  - NA
SP  - 3205
EP  - 3225
JF  - IEEE Transactions on Software Engineering
VL  - 49
IS  - 5
PB  -
DO  - 10.1109/tse.2023.3248113
ER  -

TY  - NA
AU  - Devarasetty, Prasad; Reddy, Satyananda; Mic, Hs
TI  - Open Source Software: A Review of Characteristics That Made it Popular and Successful
PY  - 2015
AB  - Open Source Software is an alternative to proprietary software. It is being popular day by day which has brought about an increase in research. It has been successful as the research in this area is growing rapidly and results published support this argument. Research is in progress to identify the characteristics unique to this phenomenon. There are a large set of characteristics that drive to the success of Open Source software. Some of them include the type of the development model, the phenomenon of forking that facilitates to achieve sustainability, the role of contributors and the attractiveness of OSS, the behaviors of Commits, and how the training phenomenon facilitates the acceptance of the OSS. This study elaborates how the five characteristics facilitate to the success and popularity of the Open Source Software.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Ochoa, Lina; Degueule, Thomas; Falleri, Jean-Rémy
TI  - Analyzing the Impact of Pull Requests to Guide Library Evolution
PY  - 2021
AB  - "If we make this change to our code, how will it impact our clients?" It is difficult for library maintainers to answer this simple -- yet essential! -- question when evolving their libraries. Library maintainers are constantly balancing between two opposing positions: make changes at the risk of breaking some of their clients, or avoid changes and maintain compatibility at the cost of immobility and growing technical debt. We argue that the lack of objective usage data and tool support leaves maintainers with their own subjective perception of their community to make these decisions. We introduce BreakBot, a bot that analyses the pull requests of Java libraries on GitHub to identify the breaking changes they introduce and their impact on client projects. Through static analysis of libraries and clients, it extracts and summarizes objective data that enrich the code review process by providing maintainers with the appropriate information to decide whether -- and how -- changes should be accepted, directly in the pull requests.
SP  - NA
EP  - NA
JF  - arXiv: Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Khatami, Ali; Zaidman, Andy
TI  - Quality Assurance Awareness in Open Source Software Projects on GitHub
PY  - 2023
AB  - Software engineers employ a variety of approaches to ensure the quality of software systems, including software testing, modern code review, automated static analysis, build automation, and continuous integration. To make effective decisions regarding quality assurance (QA), software engineers need to have an awareness of (1) the QA approaches that are in use in a project, and (2) how they are used. Through an exploratory, mixed-methods investigation we set out to better understand the awareness of software engineers in open-source software (OSS) development with regard to QA practices. This involved a large-scale survey of 471 maintainers and contributors on GitHub. Our findings indicate that a high-level awareness among the respondents is common, but also that the respondents are less certain about how the practices are adopted; we further consider the perspective of both the contributor and the maintainer.
SP  - 174
EP  - 185
JF  - 2023 IEEE 23rd International Working Conference on Source Code Analysis and Manipulation (SCAM)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/scam59687.2023.00027
ER  -

TY  - JOUR
AU  - Link, Georg J.P.; Rao, Malvika; Marti, Don; Leak, Andy; Bodo, Rich
TI  - Marktplatz zur Koordinierung und Finanzierung von Open Source Software
PY  - 2018
AB  - Open source is an increasingly popular collaboration mechanism for developing software, including within companies. Our work creates the missing link between open source projects, companies, and markets. Without this link, coordination and funding problems became apparent that lead to serious security vulnerabilities. In this paper, we develop eight design features that a marketplace for open source should have in order to eliminate these problems. We justify each design feature with existing open source practices and present a prototype. Finally, we discuss what effects the introduction of such a marketplace could have.
SP  - 419
EP  - 437
JF  - HMD Praxis der Wirtschaftsinformatik
VL  - 56
IS  - 2
PB  -
DO  - 10.1365/s40702-018-00474-6
ER  -

TY  - CHAP
AU  - Farah, Juan Carlos; Spaenlehauer, Basile; Ingram, Sandy; Purohit, Aditya K.; Holzer, Adrian; Gillet, Denis
TI  - Harnessing Rule-Based Chatbots to Support Teaching Python Programming Best Practices
PY  - 2024
AB  - NA
SP  - 455
EP  - 466
JF  - Lecture Notes in Networks and Systems
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-031-51979-6_47
ER  -

TY  - NA
AU  - Latendresse, Jasmine; Mujahid, Suhaib; Costa, Diego Elias; Shihab, Emad
TI  - Not All Dependencies are Equal: An Empirical Study on Production Dependencies in NPM
PY  - 2022
AB  - Modern software systems are often built by leveraging code written by others in the form of libraries and packages to accelerate their development. While there are many benefits to using third-party packages, software projects often become dependent on a large number of software packages. Consequently, developers are faced with the difficult challenge of maintaining their project dependencies by keeping them up-to-date and free of security vulnerabilities. However, how often are project dependencies used in production where they could pose a threat to their project's security?
SP  - 1
EP  - 12
JF  - Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3551349.3556896
ER  -

TY  - JOUR
AU  - Lee, Sanghoon; German, Daniel M.; Hwang, Seung-won; Kim, Sunghun
TI  - Crowdsourcing Identification of License Violations
PY  - 2015
AB  - Free and open source software (FOSS) has created a large pool of source codes that can be easily copied to create new applications. However, a copy should preserve copyright notice and license of the original file unless the license explicitly permits such a change. Through software evolution, it is challenging to keep original licenses or choose proper licenses. As a result, there are many potential license violations. Despite the fact that violations can have high impact on protecting copyright, identification of violations is highly complex. It relies on manual inspections by experts. However, such inspection cannot be scaled up with open source software released daily worldwide. To make this process scalable, we propose the following two methods: use machine-based algorithms to narrow down the potential violations; and guide non-experts to manually inspect violations. Using the first method, we found 219 projects (76.6%) with potential violations. Using the second method, we show that the accuracy of crowds is comparable to that of experts. Our techniques might help developers identify potential violations, understand the causes, and resolve these violations.
SP  - 190
EP  - 203
JF  - Journal of Computing Science and Engineering
VL  - 9
IS  - 4
PB  -
DO  - 10.5626/jcse.2015.9.4.190
ER  -

TY  - JOUR
AU  - Butler, Simon; Gamalielsson, Jonas; Lundell, Björn; Brax, Christoffer; Sjoberg, Johan; Mattsson, Anders; Gustavsson, Tomas; Feist, Jonas; Lonroth, Erik
TI  - On Company Contributions to Community Open Source Software Projects
PY  - 2021
AB  - The majority of contributions to community open source software (OSS) projects are made by practitioners acting on behalf of companies and other organisations. Previous research has addressed the motivations of both individuals and companies to engage with OSS projects. However, limited research has been undertaken that examines and explains the practical mechanisms or work practices used by companies and their developers to pursue their commercial and technical objectives when engaging with OSS projects. This research investigates the variety of work practices used in public communication channels by company contributors to engage with and contribute to eight community OSS projects. Through interviews with contributors to the eight projects we draw on their experiences and insights to explore the motivations to use particular methods of contribution. We find that companies utilise work practices for contributing to community projects which are congruent with the circumstances and their capabilities that support their short- and long-term needs. We also find that companies contribute to community OSS projects in ways that may not always be apparent from public sources, such as employing core project developers, making donations, and joining project steering committees in order to advance strategic interests. The factors influencing contributor work practices can be complex and are often dynamic arising from considerations such as company and project structure, as well as technical concerns and commercial strategies. The business context in which software created by the OSS project is deployed is also found to influence contributor work practices.
SP  - 1381
EP  - 1401
JF  - IEEE Transactions on Software Engineering
VL  - 47
IS  - 7
PB  -
DO  - 10.1109/tse.2019.2919305
ER  -

TY  - NA
AU  - Kim, Hyungjin; Kwon, Yonghwi; Joh, Sangwoo; Kwon, Hyukin; Ryou, Yeonhee; Kim, Taeksu
TI  - Understanding automated code review process and developer experience in industry
PY  - 2022
AB  - Code Review Automation can reduce human efforts during code review by automatically providing valuable information to reviewers. Nevertheless, it is a challenge to automate the process for large-scale companies, such as Samsung Electronics, due to their complexity: various development environments, frequent review requests, huge size of software, and diverse process among the teams.
SP  - 1398
EP  - 1407
JF  - Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3540250.3558950
ER  -

TY  - NA
AU  - Zargar, Mahmood Shafeie
TI  - Reusing or Reinventing the Wheel: The Search-Transfer Issue in Open Source Communities (Research-in-Progress)
PY  - 2013
AB  - Despite the rising awareness about the importance of open innovation communities in knowledge economies, empirical evidence about the structural determinants of knowledge reuse in these communities is lacking. In order to address this gap, the current study sets out to investigate the network-level determinants of knowledge reuse in open source projects. I suggest tracking code reuse across open source projects as a feasible and accessible proxy measure for knowledge reuse. I argue that in spite of favorable conditions, search and processing costs associated with knowledge reuse remain high enough to localize reuse behavior of open source developers. I hypothesize that network-level proximity of projects within the social network of open source community is a significant determinant of code reuse. A report on the progress of the empirical section of the research project along with some confirmatory preliminary results has been included.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Saligue, Emannuel T.; Saligue, Rosemarie Y.; Agoylo Jr., Jose C.
TI  - Semi-Dynamic Mobile-Based Virtual Reality System for University
PY  - 2024
AB  - Most of the Virtual Reality today is a static image in which the creator will manually replace it in the system and even replace it as a whole. There are several steps to obtain VR projects, namely: data processing, VR roles to follow, subsystems, data acquisition, data processing, 3D modeling, the object-oriented property of VR systems, and visualization task. This study aimed to develop and evaluate a Semi-Dynamic Mobile-Based Virtual Reality System for University. The system was developed using server side (PHP) and client side (JavaScript) scripting language for web-based Java, and PhoneGap for android application. Bootstrap and jQuery are also used in the development of the system with MySQL as the database so it can communicate with other platforms. The system can be downloaded from a git repository together with the Android application, and developers can use it for customization. A developmental-Evaluative survey method was used in gathering the data using the adapted questionnaire. The respondents are very satisfied with the performance level of the system in terms of functionality, compatibility, security, and maintainability. Therefore, it is an indication that the developed system is an excellent platform in file management.
SP  - 34
EP  - 42
JF  - International Journal of Latest Technology in Engineering Management & Applied Science
VL  - 13
IS  - 7
PB  -
DO  - 10.51583/ijltemas.2024.130705
ER  -

TY  - JOUR
AU  - Mäenpää, Hanna; Mäkinen, Simo; Kilamo, Terhi; Mikkonen, Tommi; Männistö, Tomi; Ritala, Paavo
TI  - Organizing for openness: six models for developer involvement in hybrid OSS projects
PY  - 2018
AB  - This article examines organization and governance of commercially influenced Open Source Software development communities by presenting a multiple-case study of six contemporary, hybrid OSS projects. The findings provide in-depth understanding on how to design the participatory nature of the software development process, while understanding the factors that influence the delicate balance of openness, motivations, and governance. The results lay ground for further research on how to organize and manage developer communities where needs of the stakeholders are competing, yet complementary.
SP  - 1
EP  - 14
JF  - Journal of Internet Services and Applications
VL  - 9
IS  - 1
PB  -
DO  - 10.1186/s13174-018-0088-1
ER  -

TY  - NA
AU  - Schilling, Andreas
TI  - HICSS - What Do We Know about FLOSS Developers' Attraction, Retention, and Commitment? A Literature Review
PY  - 2014
AB  - Free Libre Open Source Software (FLOSS) is an essential part of our daily life. Many companies and private households rely on FLOSS every day. However, the vast majority of FLOSS initiatives fail. In order to support future research and derive operational advice for FLOSS projects, this research reviews and categorizes the managerial insights from over 20 years of FLOSS research. Based on the central role of the developer base and research on human resource management, developer attraction, retention and commitment are identified as core management areas for FLOSS projects. A detailed analysis of 43 journal articles on FLOSS management identifies an extensive body, which analyses project members' commitment. In contrast, there is relatively little dedicated research on FLOSS developers' attraction and retention. Moreover, the literature review reveals that most articles use solely either an individual-, group- or project-centric research perspective although these perspectives are interrelated with each other.
SP  - 4003
EP  - 4012
JF  - 2014 47th Hawaii International Conference on System Sciences
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/hicss.2014.495
ER  -

TY  - JOUR
AU  - Vendome, Christopher; Bavota, Gabriele; Di Penta, Massimiliano; Linares-Vasquez, Mario; German, Daniel M.; Poshyvanyk, Denys
TI  - License usage and changes: a large-scale study on GitHub
PY  - 2016
AB  - NA
SP  - 1537
EP  - 1577
JF  - Empirical Software Engineering
VL  - 22
IS  - 3
PB  -
DO  - 10.1007/s10664-016-9438-4
ER  -

TY  - NA
AU  - Lekanova, Anna
TI  - Establishing a community for an open-source software startup
PY  - 2020
AB  - NA
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Wang, Kai; Ding, Yujie; Jia, Shuai; Ma, Tianyi; Zhang, Yin; Cao, Bin
TI  - Integrating Retrieval-Augmented Generation for Enhanced Code Reuse: A Comprehensive Framework for Efficient Software Development
PY  - 2024
AB  - NA
SP  - 1315
EP  - 1321
JF  - 2024 IEEE Smart World Congress (SWC)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/swc62898.2024.00205
ER  -

TY  - JOUR
AU  - Santhanam, Sivasurya; Hecking, Tobias; Schreiber, Andreas; Wagner, Stefan
TI  - Bots in software engineering: a systematic mapping study
PY  - 2022
AB  - Bots have emerged from research prototypes to deployable systems due to the recent developments in machine learning, natural language processing and understanding techniques. In software engineering, bots range from simple automated scripts to decision-making autonomous systems. The spectrum of applications of bots in software engineering is so wide and diverse, that a comprehensive overview and categorization of such bots is needed. Existing works considered selective bots to be analyzed and failed to provide the overall picture. Hence it is significant to categorize bots in software engineering through analyzing why, what and how the bots are applied in software engineering. We approach the problem with a systematic mapping study based on the research articles published in this topic. This study focuses on classification of bots used in software engineering, the various dimensions of the characteristics, the more frequently researched area, potential research spaces to be explored and the perception of bots in the developer community. This study aims to provide an introduction and a broad overview of bots used in software engineering. Discussions of the feedback and results from several studies provide interesting insights and prospective future directions.
SP  - e866
EP  - e866
JF  - PeerJ Computer Science
VL  - 8
IS  - NA
PB  -
DO  - 10.7717/peerj-cs.866
ER  -

TY  - NA
AU  - Zaimi, Asimina; Ampatzoglou, Apostolos; Triantafyllidou, Noni; Chatzigeorgiou, Alexander; Mavridis, Androklis; Chaikalis, Theodore; Deligiannis, Ignatios; Sfetsos, Panagiotis; Stamelos, Ioannis
TI  - BCI - An Empirical Study on the Reuse of Third-Party Libraries in Open-Source Software Development
PY  - 2015
AB  - Software development based on third-party libraries is becoming increasingly popular in recent years. Nowadays, the plethora of open-source libraries that are freely available to developers, offer great reuse opportunities, with relatively low cost. However, the reuse process is in many cases rather ad-hoc. In this paper, we investigate reuse processes in five successful open-source projects, with respect to: (a) the extent to which software functionality is built from scratch or reused, (b) the frequency with which reuse decisions are modified, and (c) the effect of reuse on software product quality. The results of the study suggest that: (a) OSS projects heavily reuse third-party libraries, (b) reuse decisions are not frequently revisited, and (c) there is no clear evidence that reuse decisions are quality-driven.
SP  - 4
EP  - 8
JF  - Proceedings of the 7th Balkan Conference on Informatics Conference
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/2801081.2801087
ER  -

TY  - NA
AU  - Chakraborti, Mahasweta; Atkisson, Curtis; Stănciulescu, Ştefan; Filkov, Vladimir; Frey, Seth
TI  - Do We Run How We Say We Run? Formalization and Practice of Governance in OSS Communities
PY  - 2024
AB  - Open Source Software (OSS) communities often resist regulation typical of traditional organizations. Yet formal governance systems are being increasingly adopted among communities, particularly through non-profit project-sponsoring foundations. Our study looks at the Apache Software Foundation Incubator program and 208 of the projects it has supported. We assemble a scalable, semantic pipeline to discover and analyze the governance behavior of projects from their mailing lists. We then investigate the relationship of such behavior to what the formal policies prescribe, through their own governance priorities and how their members internalize them. Our findings indicate that a greater amount of policy over a governed topic doesn't elicit more governed activity on that topic, but does predict greater internalization by community members. Moreover, alignment of community operations with foundation governance, be it dedicating their governance focus or adopting policy along topics seeing greater policy-making, has limited association with project outcomes.
SP  - 1
EP  - 26
JF  - Proceedings of the CHI Conference on Human Factors in Computing Systems
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3613904.3641980
ER  -

TY  - CHAP
AU  - Mens, Tom; Roover, Coen De
TI  - An Introduction to Software Ecosystems
PY  - 2023
AB  - NA
SP  - 1
EP  - 29
JF  - Software Ecosystems
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-031-36060-2_1
ER  -

TY  - JOUR
AU  - Rahkema, Kristiina; Pfahl, Dietmar; Ramler, Rudolf
TI  - The impact of new package managers on the library dependency ecosystem.
PY  - 2024
AB  - Adding dependencies to third-party libraries through package managers is a common practice in software development. The evolution of library dependency networks has been analyzed for many package managers. There are, however, no studies on how the library dependency networks of multiple package managers behave in the same ecosystem. The library dependency network in the Swift ecosystem encompasses libraries from CocoaPods, Carthage, and Swift Package Manager (Swift PM). These three package managers are used when developing, for example, iOS or macOS applications in Swift or Objective-C. In this study, we analyze how the introduction of new package managers has affected the evolution of the library dependency network of the Swift ecosystem. We found that overall the popularity of using package managers has grown over time. We saw that the introduction of Carthage and Swift PM had some but not a large influence on the popularity of CocoaPods. Carthage users, however, are increasingly migrating to Swift PM. This discrepancy could stem from the fundamental differences between CocoaPods and the other two package managers, as well as similarities between Carthage and Swift PM. Based on our observations, we speculate that Apple could increase the popularity of Swift PM by adding features that have so far only been available in CocoaPods, such as a central repository.
SP  - e2617
EP  - e2617
JF  - PeerJ. Computer science
VL  - 10
IS  - NA
PB  -
DO  - 10.7717/peerj-cs.2617
ER  -

TY  - NA
AU  - Yan, Yibo; Frey, Seth; Zhang, Amy; Filkov, Vladimir; Yin, Likang
TI  - GitHub OSS Governance File Dataset
PY  - 2023
AB  - Open-source Software (OSS) has become a valuable resource in both industry and academia over the last few decades. Despite the innovative structures they develop to support the projects, OSS projects and their communities have complex needs and face risks such as getting abandoned. To manage the internal social dynamics and community evolution, OSS developer communities have started relying on written governance documents that assign roles and responsibilities to different community actors. To facilitate the study of the impact and effectiveness of formal governance documents on OSS projects and communities, we present a longitudinal dataset of 710 GitHub-hosted OSS projects with GOVERNANCE.MD governance files. This dataset includes all commits made to the repository, all issues and comments created on GitHub, and all revisions made to the governance file. We hope its availability will foster more research interest in studying how OSS communities govern their projects and the impact of governance files on communities.
SP  - 630
EP  - 634
JF  - 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/msr59073.2023.00089
ER  -

TY  - CONF
AU  - Zhou, Shurui; Vasilescu, Bogdan; Kästner, Christian
TI  - ICSE (Companion Volume) - How has forking changed in the last 20 years?: a study of hard forks on GitHub
PY  - 2020
AB  - The notion of forking has changed with the rise of distributed version control systems and social coding environments, like GitHub. Traditionally forking refers to splitting off an independent development branch (which we call hard forks); research on hard forks, conducted mostly in pre-GitHub days showed that hard forks were often seen critical as they may fragment a community. Today, in social coding environments, open-source developers are encouraged to fork a project in order to contribute to the community (which we call social forks), which may have also influenced perceptions and practices around hard forks. To revisit hard forks, we identify, study, and classify 15,306 hard forks on GitHub and interview 18 owners of hard forks or forked repositories. We find that, among others, hard forks often evolve out of social forks rather than being planned deliberately and that perception about hard forks have indeed changed dramatically, seeing them often as a positive non-competitive alternative to the original project.
SP  - 268
EP  - 269
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Xu, Wenhan; Bo, Hongguang; Chen, Yinglian
TI  - System dynamics-based model for supply chain organizational collaboration
PY  - 2021
AB  - In order to explore the impact of the system-driven supply chain, collaborative operations, and organizational characteristics on supply chain operational performance, this paper based on the system dynamics method to simulate the established information collaborative supply chain model, analyze market demand data, inventory before and after the supply chain sharing The changes of inventory fluctuations in the supply chain and related calculations are compared with the simulation results under the current model to prove the importance of implementing information collaboration in the supply chain of a large retailer-led supply chain. The research in this paper shows that with the supply chain information collaboration model, the average value of the manufacturer’s order quantity has dropped by 30.4%. Affected by this, the dispersion coefficient has also dropped from 0.76 to 0.6, and the average number of orders in the distribution center has also dropped by 12.2%; With the supply chain information synergy model, the average value of the raw material inventory of manufacturers has dropped significantly, from 3400 in the current model to 2500 in the information synergy model, a decrease of 27%, the standard deviation has also decreased by 57%, and the dispersion coefficient has dropped from 0.98 to 0.50; The standard deviation rate of the inventory of the distribution center is 30%; from the perspective of the overall retail supply chain, the inventory has fallen by 14%, the standard deviation has fallen by 34%, and the dispersion coefficient has dropped from 0.76 in the current model to the information collaboration model. 0.6, it can be seen that the mode of supply chain information coordination has a great effect on reducing supply chain costs and improving supply chain efficiency.
SP  - 3085
EP  - 3095
JF  - Journal of Intelligent & Fuzzy Systems
VL  - 40
IS  - 2
PB  -
DO  - 10.3233/jifs-189347
ER  -

TY  - CHAP
AU  - Reyes, Rolando P.; Fonseca, R C Efraín; Castro, John W.; Vaca, Hugo Pérez; Calderón, Manolo Paredes
TI  - ICITS - An Empirical Evaluation of Open Source in Telecommunications Software Development: The Good, the Bad, and the Ugly
PY  - 2018
AB  - Software development for the communication networks’ monitoring is usually based on Open Source software components, as an effective and low cost technological option. However, when we evaluated a product developed with Open Source components, we found that its efficiency is less than other similar Open Source software developed with proprietary tools; which is unusual or at least it isn’t expected. To the best of our knowledge this phenomenon has not been reported in the literature. Hence, our aim was to identify the circumstances that explain why the efficiency of Open Source software applications tends to be less than Open Source applications developed with proprietary software tools. A controlled experiment was performed at Universidad de las Fuerzas Armadas ESPE of Ecuador to compare the performance of two software tools for communication networks’ monitoring. A post hoc analysis revealed some causal relationships that could explain the unexpected behavior of compared applications’ efficiency. From the statistical perspective, there is no significant difference in effectiveness between Open Source and proprietary applications for communication networks’ monitoring. Efficiency of the Open Source tools depends on a large extent of software components used for their integration, which apparently is not considered in the development of this kind of applications.
SP  - 508
EP  - 517
JF  - Advances in Intelligent Systems and Computing
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-73450-7_48
ER  -

TY  - JOUR
AU  - Wessel, Mairieli; Wiese, Igor; Steinmacher, Igor; Gerosa, Marco Aurélio
TI  - Don't Disturb Me: Challenges of Interacting with Software Bots on Open Source Software Projects
PY  - 2021
AB  - Software bots are used to streamline tasks in Open Source Software (OSS) projects' pull requests, saving development cost, time, and effort. However, their presence can be disruptive to the community. We identified several challenges caused by bots in pull request interactions by interviewing 21 practitioners, including project maintainers, contributors, and bot developers. In particular, our findings indicate noise as a recurrent and central problem. Noise affects both human communication and development workflow by overwhelming and distracting developers. Our main contribution is a theory of how human developers perceive annoying bot behaviors as noise on social coding platforms. This contribution may help practitioners understand the effects of adopting a bot, and researchers and tool designers may leverage our results to better support human-bot interaction on social coding platforms.
SP  - 1
EP  - 21
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 5
IS  - CSCW2
PB  -
DO  - 10.1145/3476042
ER  -

TY  - NA
AU  - Wang, Zhitui; Xiang, Ruoxi; Li, Jimei
TI  - Design of a Classroom Big Data Analysis System Based on Open Source Technology
PY  - 2024
AB  - Based on open-source technology, this paper has designed and implemented the technical solution for a classroom Big Data analysis system and explored the integrated application of this system in the field of education Big Data. Facing the growing demand for education Big Data management, this paper takes classroom Big Data analysis as the service object, using Python+Flask framework as the technical basis, comprehensively applying both API (Application Programming Interface) integration and source code integration, and implementing the technical solution of integrating open interfaces and open-source systems, such as Tencent Text Extraction, LTP (Language Technology Platform) Text Analysis, Superset Analysis and Visualization, and Whoosh Retrieval System, into the classroom Big Data analysis system. It realizes the complete data management function of classroom Big Data from collection, annotation, analysis, visualization to retrieval, providing a technical solution for the analysis and visualization of educational Big Data that can be quickly integrated and applied. This design is very effective in analyzing Big Data in the classroom, which can effectively improve the ability to analyze data and help educators better understand the effectiveness of classroom teaching and learning. This study provides a new idea for the Big Data technology framework in the field of education and gives a useful reference for the design of technical solutions for the in-depth development of educational data analysis.
SP  - 377
EP  - 382
JF  - 2024 13th International Conference on Educational and Information Technology (ICEIT)
VL  - 14
IS  - NA
PB  -
DO  - 10.1109/iceit61397.2024.10540683
ER  -

TY  - JOUR
AU  - Cohoon, Johanna; Du, Caifan; Howison, James
TI  - Tales of Transitions: Seeking Scientific Software Sustainability
PY  - 2025
AB  - Software is crucial to science, but sustaining projects for long term impact is challenging. Scientists and funders look to "the open source way" (known as peer production in the organizational literature) as a promising route to sustainability. We studied scientific software projects funded by research grants and which were encouraged by their funder to develop a sustainable community around their open source code. Using interviews and content analysis of online presences, we studied the projects over a seven year period, from receiving grants around 2014 through the typically three years of funding, and for up to four years following the funding. We make four contributions: First, we find that by far the most successful route to peer production was beginning as peer production, a result with clear policy implications. Second, through our taxonomies of organizational forms and of organizational change, we paint a landscape of the variety of forms that scientific software development can take on and transition between providing language for discussion in the literature and among communities of practice. Third, we use these taxonomies to describe multiple routes to sustainable software development. Finally, we discuss the challenges and strategies involved in sustaining scientific software, including choices not to pursue peer production at all.
SP  - 1
EP  - 25
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 9
IS  - 1
PB  -
DO  - 10.1145/3701208
ER  -

TY  - JOUR
AU  - Splitter, Violetta; Dobusch, Leonhard; von Krogh, Georg; Whittington, Richard; Walgenbach, Peter
TI  - Openness as Organizing Principle: Introduction to the Special Issue.
PY  - 2022
AB  - 'Openness' has become an organizational leitmotif of our time, spreading across a growing set of organizational domains. However, discussions within these specialized domains (e.g. open data, open government or open innovation) treat openness in isolation and specific to the particularities of those domains. The intention of this Special Issue therefore is to foster cross-domain conversations to exchange insights and build cumulative knowledge on openness. To do so, this Introduction to the Special Issue argues that openness should be investigated as a general organizing principle, which we refer to as Open Organizing. Across domains, we define Open Organizing as a dynamic organizing principle along the primary dimension of transparency/opacity and the secondary dimensions of inclusion/exclusion and distributed/concentrated decision rights. As such, Open Organizing raises an overarching problem of design, which results from more specific epistemic, normative and political challenges.
SP  - 7
EP  - 27
JF  - Organization studies
VL  - 44
IS  - 1
PB  -
DO  - 10.1177/01708406221145595
ER  -

TY  - JOUR
AU  - Gandal, Neil; Naftaliev, Peter; Stettner, Uriel
TI  - Following the Code: Spillovers and Knowledge Transfer
PY  - 2017
AB  - Knowledge spillovers in Open Source Software (OSS) can occur via two channels: In the first channel, programmers take knowledge and experience gained from one OSS project they work on and employ it in another OSS project they work on. In the second channel, programmers reuse software code by taking code from an OSS project and employing it in another. We develop a methodology to measure software reuse in a large OSS network at the micro level and show that projects that reuse code from other projects have higher success. We also demonstrate knowledge spillovers from projects connected via common programmers.
SP  - 243
EP  - 267
JF  - Review of Network Economics
VL  - 16
IS  - 3
PB  -
DO  - 10.1515/rne-2017-0056
ER  -

TY  - JOUR
AU  - Businge, John; Openja, Moses; Nadi, Sarah; Berger, Thorsten
TI  - Reuse and maintenance practices among divergent forks in three software ecosystems
PY  - 2022
AB  - With the rise of social coding platforms that rely on distributed version control systems, software reuse is also on the rise. Many software developers leverage this reuse by creating variants through forking, to account for different customer needs, markets, or environments. Forked variants then form a so-called software family; they share a common code base and are maintained in parallel by same or different developers. As such, software families can easily arise within software ecosystems, which are large collections of interdependent software components maintained by communities of collaborating contributors. However, little is known about the existence and characteristics of such families within ecosystems, especially about their maintenance practices. Improving our empirical understanding of such families will help build better tools for maintaining and evolving such families. We empirically explore maintenance practices in such fork-based software families within ecosystems of open-source software. Our focus is on three of the largest software ecosystems existence today: , , and . We identify and analyze software families that are maintained together and that exist both on the official distribution platform (Google play, , and ) as well as on GitHub, allowing us to analyze reuse practices in depth. We mine and identify 38 software families, 526 software families, and 8,837 software families from the ecosystems of , , and , to study their characteristics and code-propagation practices. We provide scripts for analyzing code integration within our families. Interestingly, our results show that there is little code integration across the studied software families from the three ecosystems. Our studied families also show that techniques of direct integration using git outside of GitHub is more commonly used than GitHub pull requests. Overall, we hope to raise awareness about the existence of software families within larger ecosystems of software, calling for further research and better tools support to effectively maintain and evolve them.
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 27
IS  - 2
PB  -
DO  - 10.1007/s10664-021-10078-2
ER  -

TY  - JOUR
AU  - Ikuine, Fumihiko; Fujita, Hideki
TI  - How to Avoid Fork
PY  - 2014
AB  - In terms of software, "continuous development" of the software is the best quality assurance. Continuous development requires either the original developer to continue development, or the next generation of developers to take over the development. It has been noted that when the original developer has used "open source," a development paradigm in which the source code is kept open to all, highly motivated and competent developers will participate in development. This makes it easier for a project to survive. However, at the same time, when the source code is opened to a broad audience a fork in the source code tends to occur. When this happens, those with a high level of competence and motivation abandon the development and go their separate ways. In such a situation, it is difficult for a project to survive. In the case of Denshin 8 go, a guardian for the source code was appointed, and the original developer and the guardians avoided this dilemma. Richard Stallman of the GNU project and Linus Torvalds of Linux project act as legitimate guardians to avoid fork. Keywords: software development, open source, fork, buried source code, legitimate guardian, Denshin 8 go. Introduction: For software, "continuous development" of the software is the best quality assurance (Ikuine & Fujita, 2013). Continuous development requires either the original developer to continue development or others to take over (Fujita & Ikuine, 2013b). When other developers take over, the original developer must share the source code with others. Open Source is a paradigm wherein the source code is shared and made broadly available to others. It has been noted that the project adopting open source can attract competent and highly motivated developers. Thus, some claim that developers can readily ensure the survival of software development projects when the source code is open. However, opening the source code to a broad audience can cause heterogeneity in the source code, even when people continue the development based on the original source code, a phenomenon known as "fork (forking)." In the case of open source software (hereinafter, "OSS"), literally anyone can obtain the source code and modify it to develop new software and start up new projects. Therefore, forks often occur in the OSS project, and the impact of forking has been studied for very long (Dibona, Ockman, & Stone, 1999; Kogut & Metiu, 2001; Nyman & Lindman, 2013; Nyman, Mikkonen, Lindman, & Fougere, 2012; Raymond, 1997, 1999). Moreover, some studies have focused on how to maintain forked projects, under the assumption that forking will inevitably occur (Gamalielesson & Lundell, 2013; Ray & Kim, 2012; Ray, Wiley, Kim, 2012). When forking does occur, competent and motivated developers start to abandon the project and go their separate ways, which may affect the survival of these projects. In other words, sharing the source code comes with a dilemma. If one loosens restrictions and shares the source code broadly, that project may survive; however, if the project is forked, competent successors with strong skills and desires will end up working on their own, or perhaps inter-project competition may lead to vying for competent successors. On the other hand, while establishing strict conditions and sharing the source code only with a select few may make the source code easy to control and reduce the possibility of forking, it may become more difficult to find competent successors willing to participate under those conditions. This puts the survival of the project at risk. Ensuring the survival of a project requires that one avoid this dilemma of source code sharing. Research on avoiding the fork has heretofore focused on developer motivation, licenses, and project governance (Ernst, Easterbrook, & Mylopoulos, 2010; Neville-Neil, 2011; Nyman & Mikkonen, 2011; Robles & Gonzalez-Barahona, 2012; Viseur, 2012). However, forking is a phenomenon not limited to OSS, and examples can be seen in Unix as well (Takahashi & Takamatsu, 2002, 2013). …
SP  - 283
EP  - 298
JF  - Annals of Business Administrative Science
VL  - 13
IS  - 5
PB  -
DO  - 10.7880/abas.13.283
ER  -

TY  - JOUR
AU  - Yue, Yang; Wang, Yi; Redmiles, David
TI  - Off to a Good Start: Dynamic Contribution Patterns and Technical Success in an OSS Newcomer's Early Career
PY  - 2023
AB  - Attracting and retaining newcomers are critical aspects for OSS projects, as such projects rely on newcomers' sustainable contributions. Considerable effort has been made to help newcomers by identifying and overcoming the barriers during the onboarding process. However, most newcomers eventually fail and drop out of their projects even after successful onboarding. Meanwhile, it has been long known that individuals' early career stages profoundly impact their long-term career success. However, newcomers' early careers are less investigated in SE research. In this paper, we sought to develop an empirical understanding of the relationships between newcomers' dynamic contribution patterns in their early careers and their technical success. To achieve this goal, we compiled a dataset of newcomers' contribution data from 54 large OSS projects under three different ecosystems and analyzed it with time series analysis and other statistical analysis techniques. Our analyses yield rich findings. The correlations between several contribution patterns and technical success were identified. In general, being consistent and persistent in newcomers' early careers is positively associated with their technical success. While these correlations generally hold in all three ecosystems, we observed some differences in detailed contribution patterns correlated with technical success across ecosystems. In addition, we performed a case study to investigate whether another type of contributions, i.e., documentation contribution, could potentially have positive correlations with newcomers' technical success. We discussed the implications and summarized practical recommendations to OSS newcomers. The insights gained from this work demonstrated the necessity of extending the focus of research and practice to newcomers' early careers and hence shed light on future research in this direction.
SP  - 529
EP  - 548
JF  - IEEE Transactions on Software Engineering
VL  - 49
IS  - 2
PB  -
DO  - 10.1109/tse.2022.3156071
ER  -

TY  - CHAP
AU  - Couldry, Nick; Rodríguez, Clemencia; Bolin, Göran; Cohen, Julie E.; Goggin, Gerard; Kraidy, Marwan M.; Iwabuchi, Koichi; Lee, Kwang-Suk; Qiu, Jack Linchuan; Volkmer, Ingrid; Wasserman, Herman; Zhao, Yuezhi; Koltsova, Olessia; Rakhmani, Inaya; Rincón, Omar; Magallanes-Blanco, Claudia; Thomas, Pradip
TI  - Media and Communications
PY  - NA
AB  - Developments in digital technologies over the last 30 years have expanded massively human beings' capacity to communicate and connect. Media infrastructures have acquired huge complexity as a resul ...
SP  - 523
EP  - 562
JF  - Rethinking Society for the 21st Century
VL  - NA
IS  - NA
PB  -
DO  - 10.1017/9781108399647.006
ER  -

TY  - JOUR
AU  - Venturini, Daniel; Cogo, Filipe Roseiro; Polato, Ivanilton; Gerosa, Marco A.; Wiese, Igor Scaliante
TI  - I Depended on You and You Broke Me: An Empirical Study of Manifesting Breaking Changes in Client Packages
PY  - 2023
AB  - Complex software systems have a network of dependencies. Developers often configure package managers (e.g., npm) to automatically update dependencies with each publication of new releases containing bug fixes and new features. When a dependency release introduces backward-incompatible changes, commonly known as breaking changes, dependent packages may not build anymore. This may indirectly impact downstream packages, but the impact of breaking changes and how dependent packages recover from these breaking changes remain unclear. To close this gap, we investigated the manifestation of breaking changes in the npm ecosystem, focusing on cases where packages’ builds are impacted by breaking changes from their dependencies. We measured the extent to which breaking changes affect dependent packages. Our analyses show that around 12% of the dependent packages and 14% of their releases were impacted by a breaking change during updates of non-major releases of their dependencies. We observed that, from all of the manifesting breaking changes, 44% were introduced in both minor and patch releases, which in principle should be backward compatible. Clients recovered themselves from these breaking changes in half of the cases, most frequently by upgrading or downgrading the provider’s version without changing the versioning configuration in the package manager. We expect that these results help developers understand the potential impact of such changes and recover from them.
SP  - 1
EP  - 26
JF  - ACM Transactions on Software Engineering and Methodology
VL  - 32
IS  - 4
PB  -
DO  - 10.1145/3576037
ER  -

TY  - NA
AU  - Nahar, Nadia; Zhou, Shurui; Lewis, Grace; Kästner, Christian
TI  - Collaboration challenges in building ML-enabled systems
PY  - 2022
AB  - The introduction of machine learning (ML) components in software projects has created the need for software engineers to collaborate with data scientists and other specialists. While collaboration can always be challenging, ML introduces additional challenges with its exploratory model development process, additional skills and knowledge needed, difficulties testing ML systems, need for continuous evolution and monitoring, and non-traditional quality requirements such as fairness and explainability. Through interviews with 45 practitioners from 28 organizations, we identified key collaboration challenges that teams face when building and deploying ML systems into production. We report on common collaboration points in the development of production ML systems for requirements, data, and integration, as well as corresponding team patterns and challenges. We find that most of these challenges center around communication, documentation, engineering, and process, and collect recommendations to address these challenges.
SP  - 413
EP  - 425
JF  - Proceedings of the 44th International Conference on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3510003.3510209
ER  -

TY  - JOUR
AU  - Li, Zhixing; Yu, Yue; Wang, Tao; Li, Shanshan; Wang, Huaimin
TI  - Opportunities and Challenges in Repeated Revisions to Pull-Requests: An Empirical Study
PY  - 2022
AB  - Background: The Pull-Request (PR) model is a widespread approach adopted by open source software (OSS) projects to support collaborative software development. However, it is often challenging to continuously evaluate and revise PRs in several iterations of code reviews involving technical and social aspects. Aim: Our objective is twofold: identifying best practices for effective collaboration in continuous PR improvement and uncovering problems that deserve special attention to improve collaboration efficiency and productivity. Method: We conducted a mixed-methods empirical study of repeatedly revised PRs (i.e. those that have undergone a high number of revisions). Historical trace data of five long-lived popular GitHub projects were used for manual investigation of practices for requesting changes to PRs and reasons for nonacceptance of repeatedly revised PRs. Surveys of OSS practitioners were conducted to evaluate the results of manual analysis and to provide additional insights into developers' willingness regarding PR revisions and factors causing avoidable revisions in practice. Results: The main results of our research were as follows: (1) We identified 15 code review practices for requesting changes to PRs, among which practices with respect to explaining the reasoning behind requested changes and tracking the progress of PR review and revision were undervalued by reviewers; (2) While submitters can in general undergo 1-5 rounds of revisions, they are willing to offer more revisions when they are in a friendly community and receive helpful feedback; (3) We revealed 11 factors causing avoidable revisions regarding to reviewers' feedback, code review policy, pre-submission issues, and implementation of new revisions; and (4) Nonacceptance of repeatedly revised PRs was due mainly to inactivity of submitters or reviewers and being superseded for better maintenance. Finally, based on these findings, we proposed recommendations and implications for OSS practitioners and tool designers to facilitate efficient collaboration in PR revisions.
SP  - 1
EP  - 35
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 6
IS  - CSCW2
PB  -
DO  - 10.1145/3555208
ER  -

TY  - NA
AU  - Wessel, Mairieli; Serebrenik, Alexander; Wiese, Igor; Steinmacher, Igor; Gerosa, Marco Aurélio
TI  - ICSME - Effects of Adopting Code Review Bots on Pull Requests to OSS Projects
PY  - 2020
AB  - Software bots, which are widely adopted by Open Source Software (OSS) projects, support developers on several activities, including code review. However, as with any new technology adoption, bots may impact group dynamics. Since understanding and anticipating such effects is important for planning and management, we investigate how several activity indicators change after the adoption of a code review bot. We employed a regression discontinuity design on 1,194 software projects from GitHub. Our results indicate that the adoption of code review bots increases the number of monthly merged pull requests, decreases monthly non-merged pull requests, and decreases communication among developers. Practitioners and maintainers may leverage our results to understand, or even predict, bot effects on their projects’ social interactions.
SP  - 9240622
EP  - 11
JF  - 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icsme46990.2020.00011
ER  -

TY  - NA
AU  - Hoffmann, Manuel; Boysel, Sam; Nagle, Frank; Peng, Sida; Xu, Kevin
TI  - Generative AI and the Nature of Work
PY  - 2024
AB  - NA
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.2139/ssrn.5007084
ER  -

TY  - NA
AU  - Kurppa, Kimmo
TI  - Competitive advantage from leveraging external resources: reuse of open source software components
PY  - 2013
AB  - NA
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Santos, Fabio; Penney, Jacob; Pimentel, João Felipe; Wiese, Igor; Steinmacher, Igor; Gerosa, Marco A.
TI  - Tell Me Who Are You Talking to and I Will Tell You What Issues Need Your Skills
PY  - 2023
AB  - Selecting an appropriate task is challenging for newcomers to Open Source Software (OSS) projects. To facilitate task selection, researchers and OSS projects have leveraged machine learning techniques, historical information, and textual analysis to label tasks (a.k.a. issues) with information such as the issue type and domain. These approaches are still far from mainstream adoption, possibly because of a lack of good predictors. Inspired by previous research, we advocate that label prediction might benefit from leveraging metrics derived from communication data and social network analysis (SNA) for issues in which social interaction occurs. Thus, we study how these "social metrics" can improve the automatic labeling of open issues with API domains—categories of APIs used in the source code that solves the issue—which the literature shows that newcomers to the project consider relevant for task selection. We mined data from OSS projects' repositories and organized it in periods to reflect the seasonality of the contributors' project participation. We replicated metrics from previous work and added social metrics to the corpus to predict API-domain labels. Social metrics improved the performance of the classifiers compared to using only the issue description text in terms of precision, recall, and F-measure. Precision (0.922) increased by 15.82% and F-measure (0.942) by 15.89% for a project with high social activity. These results indicate that social metrics can help capture the patterns of social interactions in a software project and improve the labeling of issues in an issue tracker.
SP  - 611
EP  - 623
JF  - 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/msr59073.2023.00087
ER  -

TY  - JOUR
AU  - Venigalla, Akhila Sri Manasa; Chimalakonda, Sridhar
TI  - An exploratory study of software artifacts on GitHub from the lens of documentation
PY  - 2024
AB  - NA
SP  - 107425
EP  - 107425
JF  - Information and Software Technology
VL  - 169
IS  - NA
PB  -
DO  - 10.1016/j.infsof.2024.107425
ER  -

TY  - NA
AU  - Pinho, Giniele; Caçula, Aguiar Jeová; Costa, Lucas; Wiese, Igor; Araújo, Allysson Allex
TI  - Challenges and Solutions of Free and Open Source Software Documentation: A Systematic Mapping Study
PY  - 2024
AB  - Software documentation is a relevant process for delivering quality software, as it assists stakeholders in using, understanding, maintaining, and implementing software productively. However, notable particularities emerge when investigating the context of Free and Open Source Software (FOSS) projects, which require special attention. Therefore, through a Systematic Mapping Study (SMS), this work aims to map the challenges and solutions regarding software documentation in FOSS based on the last ten years of scientific research (published between 2013 and 2023). From an initial set of 1271 papers, 12 primary studies were identified from which it was possible to categorize five challenges (Collaboration, Quality, Incompleteness, Maintainability, and Categorization) and three general perspectives of solutions (Strategic Use of README, Adoption of Artificial Intelligence, and Support Tools & Approaches). As an academic contribution, we provide an SMS revealing a set of challenges and solutions related to software documentation, a topic still underexplored in the FOSS research context. From a practical and industrial standpoint, this paper promotes a reflection on the use of documentation in FOSS projects, echoing challenges and solutions that can contribute to improving the quality of documentation.
SP  - 114
EP  - 125
JF  - Anais do XXXVIII Simpósio Brasileiro de Engenharia de Software (SBES 2024)
VL  - NA
IS  - NA
PB  -
DO  - 10.5753/sbes.2024.3307
ER  -

TY  - NA
AU  - Imam, Ahmed; Dey, Tapajit; Nolte, Alexander; Mockus, Audris; Herbsleb, James D.
TI  - MSR - The Secret Life of Hackathon Code Where does it come from and where does it go
PY  - 2021
AB  - Background: Hackathons have become popular events for teams to collaborate on projects and develop software prototypes. Most existing research focuses on activities during an event with limited attention to the evolution of the code brought to or created during a hackathon. Aim: We aim to understand the evolution of hackathon-related code, specifically, how much hackathon teams rely on pre-existing code or how much new code they develop during a hackathon. Moreover, we aim to understand if and where that code gets reused, and what factors affect reuse. Method: We collected information about 22,183 hackathon projects from Devpost, a hackathon database, and obtained related code (blobs), authors, and project characteristics from the World of Code. We investigated if code blobs in hackathon projects were created before, during, or after an event by identifying the original blob creation date and author, and also checked if the original author was a hackathon project member. We tracked code reuse by first identifying all commits containing blobs created during an event before determining all projects that contain those commits. Result: While only approximately 9.14% of the code blobs are created during hackathons, this amount is still significant considering time and member constraints of such events. Approximately a third of these code blobs get reused in other projects. The number of associated technologies and the number of participants in a project increase reuse probability. Conclusion: Our study demonstrates to what extent pre-existing code is used and new code is created during a hackathon and how much of it is reused elsewhere afterwards. Our findings help to better understand code reuse as a phenomenon and the role of hackathons in this context and can serve as a starting point for further studies in this area.
SP  - 68
EP  - 79
JF  - 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/msr52588.2021.00020
ER  -

TY  - JOUR
AU  - Vial, Gregory
TI  - A Complex Adaptive Systems Perspective of Software Reuse in the Digital Age: An Agenda for IS Research
PY  - 2023
AB  - Most software on which we rely to help us organize our professional and personal lives is based on the reuse of other pieces of software that are created and maintained by groups of software developers that work independently from one another. Oftentimes, these groups simply publish their software in the form of self-contained packages available on dedicated repositories, facilitating the widespread diffusion of their work. Whereas the production and publication of software packages fosters unprecedented levels of digital innovation, there are also drawbacks associated with software reuse (e.g., as was publicly discussed in 2021 with the discovery of the Log4Shell vulnerability). Building on previous research, our work explores the implications associated with the unprecedented scale and uncoordinated nature of packaged software reuse. We use complex adaptive systems as a generative lens to help us conceptualize the phenomenon and identify promising avenues for research and practice on this topic. Our work, therefore, draws attention to the importance of the packaged software reuse phenomenon as well as the need for research to help increase our understanding of its nature and implications considering its prevalence in software development practice and the overall importance of software in our everyday lives.
SP  - 1728
EP  - 1743
JF  - Information Systems Research
VL  - 34
IS  - 4
PB  -
DO  - 10.1287/isre.2023.1200
ER  -

TY  - NA
AU  - Schorlemmer, Taylor R.; Kalu, Kelechi G.; Chigges, Luke; Ko, Kyung Myung; Ishgair, Eman Abu; Bagchi, Saurabh; Torres-Arias, Santiago; Davis, James C.
TI  - Signing in Four Public Software Package Registries: Quantity, Quality, and Influencing Factors
PY  - 2024
AB  - NA
SP  - 1160
EP  - 1178
JF  - 2024 IEEE Symposium on Security and Privacy (SP)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/sp54263.2024.00215
ER  -

TY  - JOUR
AU  - Barcomb, Ann; Stol, Klaas-Jan; Fitzgerald, Brian; Riehle, Dirk
TI  - Managing Episodic Volunteers in Free/Libre/Open Source Software Communities
PY  - 2022
AB  - NA
SP  - 260
EP  - 277
JF  - IEEE Transactions on Software Engineering
VL  - 48
IS  - 1
PB  -
DO  - 10.1109/tse.2020.2985093
ER  -

TY  - JOUR
AU  - Steinmacher, Igor; Robles, Gregorio; Fitzgerald, Brian; Wasserman, Anthony I.
TI  - Free and open source software development: the end of the teenage years
PY  - 2017
AB  - NA
SP  - 1
EP  - 4
JF  - Journal of Internet Services and Applications
VL  - 8
IS  - 1
PB  -
DO  - 10.1186/s13174-017-0069-9
ER  -

TY  - NA
AU  - Meankaew, Pongthep
TI  - Cross-Platform Mobile App Development for Disseminating Public Health Information to Travelers in Thailand: Development and Usability (Preprint)
PY  - 2021
AB  - BACKGROUND: Disease is a risk that travelers have identified as a key factor in deciding about their travel plans; many people are concerned about getting sick while traveling abroad. Information from mobile devices could be an effective means for travelers to access information about the disease situation in their planned destinations, thereby reducing the risk of disease exposure. OBJECTIVE: We developed a mobile app using cross-platform technology to provide information about the disease situation for travelers to Thailand. We aimed to assess the app’s usability in terms of engagement, search logs, and effectiveness among target users. METHODS: We developed the app, ThaiEpidemics, using the principle of mobile application development life cycle. The app employed cross-platform technology and ran on both iOS and Android. As its data source, the app used data from national disease surveillance. We conducted our study among visitors to Travel Clinic in the Hospital for Tropical Medicine. The participants were informed that the app would collect the usage and search logs related to their queries. After the second log-in, the app prompted participants to complete an e-survey regarding their opinions and preferences related to their awareness of the disease situation. RESULTS: We based our prototype of ThaiEpidemics on a conceptualized framework for visualizing the distribution of 14 major diseases of concern to tourists in Southeast Asia. The app used weekly updated national disease surveillance data, and it provided users with functions and features to search for and visualize the disease situation in Thailand. The participants could access information about their current location and elsewhere in the country. In all, 83 people installed the app, and 52 responded to the e-survey. Regardless of age, education, and continent of origin, almost all e-survey respondents believed the app had raised their awareness of the disease situation when travelling. Most participants searched for information for all 14 diseases; some searched for information about dengue and malaria. CONCLUSIONS: ThaiEpidemics evidently has potential usefulness for travelers. It is important for app developers to address standardization of the data source and users’ concerns about the confidentiality and safety of their mobile devices. We developed ThaiEpidemics as open source software; thus, general developers and others can use our files for further analysis and to build reports and dashboards that meet their own requirements.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.2196/preprints.33707
ER  -

TY  - CHAP
AU  - Serbout, Souhaila; Pautasso, Cesare
TI  - How Many Web APIs Evolve Following Semantic Versioning?
PY  - 2024
AB  - NA
SP  - 344
EP  - 359
JF  - Lecture Notes in Computer Science
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-031-62362-2_25
ER  -

TY  - JOUR
AU  - Meankaew, Pongthep; Lawpoolsri, Saranath; Piyaphanee, Watcharapong; Wansatid, Peerawat; Chaovalit, Pimwadee; Lawawirojwong, Siam; Kaewkungwal, Jaranit
TI  - Cross-platform mobile app development for disseminating public health information to travelers in Thailand: development and usability
PY  - 2022
AB  - BACKGROUND: The risk of disease is a key factor that travelers have identified when planning to travel abroad, as many people are concerned about getting sick. Mobile devices can be an effective means for travelers to access information regarding disease prevalence in their planned destinations, potentially reducing the risk of exposure. METHODS: We developed a mobile app, ThaiEpidemics, using cross-platform technology to provide information about disease prevalence and status for travelers to Thailand. We aimed to assess the app's usability in terms of engagement, search logs, and effectiveness among target users. The app was developed using the principle of mobile application development life cycle, for both iOS and Android. As its data source, the app used weekly data from national disease-surveillance reports. We conducted our study among visitors to the Travel Clinic in the Hospital for Tropical Diseases, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand. The participants were informed that the app would collect usage and search logs related to their queries. After the second log-in, the app prompted participants to complete an e-survey regarding their opinions and preferences related to their awareness of disease prevalence and status. RESULTS: We based our prototype of ThaiEpidemics on a conceptualized framework for visualizing the distribution of 14 major diseases of concern to tourists in Southeast Asia. The app provided users with functions and features to search for and visualize disease prevalence and status in Thailand. The participants could access information for their current location and elsewhere in the country. In all, 83 people installed the app, and 52 responded to the e-survey. Regardless of age, education, and continent of origin, almost all e-survey respondents believed the app had raised their awareness of disease prevalence and status when travelling. Most participants searched for information for all 14 diseases; some searched for information specifically about dengue and malaria. CONCLUSIONS: ThaiEpidemics is evidently potentially useful for travelers. Should the app be adopted for use by travelers to Thailand, it could have an impact on wider knowledge distribution, which might result in decreased exposure, increased prophylaxis, and therefore a potential decreased burden on the healthcare system. For app developers who are developing/implementing this kind of app, it is important to address standardization of the data source and users' concerns about the confidentiality and safety of their mobile devices. © 2022. The Author(s).
SP  - 17
EP  - NA
JF  - Tropical diseases, travel medicine and vaccines
VL  - 8
IS  - 1
PB  -
DO  - 10.1186/s40794-022-00174-6
ER  -

TY  - JOUR
AU  - Trinkenreich, Bianca; Wiese, Igor; Sarma, Anita; Gerosa, Marco; Steinmacher, Igor
TI  - Women's Participation in Open Source Software: A Survey of the Literature
PY  - 2022
AB  - Women are underrepresented in Open Source Software (OSS) projects, as a result of which, not only do women lose career and skill development opportunities, but the projects themselves suffer from a lack of diversity of perspectives. Practitioners and researchers need to understand more about the phenomenon; however, studies about women in open source are spread across multiple fields, including information systems, software engineering, and social science. This article systematically maps, aggregates, and synthesizes the state-of-the-art on women’s participation in OSS. It focuses on women contributors’ representation and demographics, how they contribute, their motivations and challenges, and strategies employed by communities to attract and retain women. We identified 51 articles (published between 2000 and 2021) that investigated women’s participation in OSS. We found evidence in these papers about who are the women who contribute, what motivates them to contribute, what types of contributions they make, challenges they face, and strategies proposed to support their participation. According to these studies, only about 5% of projects were reported to have women as core developers, and women authored less than 5% of pull-requests, but had similar or even higher rates of pull-request acceptances than men. Women make both code and non-code contributions, and their motivations to contribute include learning new skills, altruism, reciprocity, and kinship. Challenges that women face in OSS are mainly social, including lack of peer parity and non-inclusive communication from a toxic culture. We found 10 strategies reported in the literature, which we mapped to the reported challenges. Based on these results, we provide guidelines for future research and practice.
SP  - 1
EP  - 37
JF  - ACM Transactions on Software Engineering and Methodology
VL  - 31
IS  - 4
PB  -
DO  - 10.1145/3510460
ER  -

TY  - NA
AU  - Vendome, Christopher; Linares-Vasquez, Mario; Bavota, Gabriele; Di Penta, Massimiliano; German, Daniel M.; Poshyvanyk, Denys
TI  - ICSE - Machine learning-based detection of open source license exceptions
PY  - 2017
AB  - From a legal perspective, software licenses govern the redistribution, reuse, and modification of software as both source and binary code. Free and Open Source Software (FOSS) licenses vary in the degree to which they are permissive or restrictive in allowing redistribution or modification under licenses different from the original one(s). In certain cases, developers may modify the license by appending to it an exception to specifically allow reuse or modification under a particular condition. These exceptions are an important factor to consider for license compliance analysis since they modify the standard (and widely understood) terms of the original license. In this work, we first perform a large-scale empirical study on the change history of over 51K FOSS systems aimed at quantitatively investigating the prevalence of known license exceptions and identifying new ones. Subsequently, we performed a study on the detection of license exceptions by relying on machine learning. We evaluated the license exception classification with four different supervised learners and sensitivity analysis. Finally, we present a categorization of license exceptions and explain their implications.
SP  - 118
EP  - 129
JF  - 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icse.2017.19
ER  -

TY  - NA
AU  - Abdalkareem, Rabe; Nourry, Olivier; Wehaibi, Sultan; Mujahid, Suhaib; Shihab, Emad
TI  - ESEC/SIGSOFT FSE - Why do developers use trivial packages? an empirical case study on npm
PY  - 2017
AB  - Code reuse is traditionally seen as good practice. Recent trends have pushed the concept of code reuse to an extreme, by using packages that implement simple and trivial tasks, which we call 'trivial packages'. A recent incident where a trivial package led to the breakdown of some of the most popular web applications such as Facebook and Netflix made it imperative to question the growing use of trivial packages. Therefore, in this paper, we mine more than 230,000 npm packages and 38,000 JavaScript applications in order to study the prevalence of trivial packages. We found that trivial packages are common and are increasing in popularity, making up 16.8% of the studied npm packages. We performed a survey with 88 Node.js developers who use trivial packages to understand the reasons and drawbacks of their use. Our survey revealed that trivial packages are used because they are perceived to be well implemented and tested pieces of code. However, developers are concerned about maintaining and the risks of breakages due to the extra dependencies trivial packages introduce. To objectively verify the survey results, we empirically validate the most cited reason and drawback and find that, contrary to developers' beliefs, only 45.2% of trivial packages even have tests. However, trivial packages appear to be 'deployment tested' and to have similar test, usage and community interest as non-trivial packages. On the other hand, we found that 11.5% of the studied trivial packages have more than 20 dependencies. Hence, developers should be careful about which trivial packages they decide to use.
SP  - 385
EP  - 395
JF  - Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3106237.3106267
ER  -

TY  - NA
AU  - Osborne, Cailean
TI  - Open Source Software Developers' Views on Public and Private Funding: A Case Study on scikit-learn
PY  - 2024
AB  - Governments are increasingly funding open source software (OSS) development to enhance software security, digital sovereignty, and national competitiveness in science and innovation, amongst others. However, little is known about how OSS developers view the relative benefits and drawbacks of governmental funding compared to other funding sources. This study explores this question through a case study on scikit-learn, a Python library for machine learning, funded by public research grants, commercial sponsorship, micro-donations, and a €32 million grant announced in France's artificial intelligence strategy. Through 25 interviews with scikit-learn's maintainers and funders, this study makes two key contributions. First, it contributes empirical findings about the benefits and drawbacks of public and private funding for OSS developers, and the governance protocols employed by the maintainers to balance the diverse interests of their funders and community. Second, it offers practical lessons on funding for OSS developers, governments, and companies based on the experience of scikit-learn. The paper concludes with recommendations for future research and practice.
SP  - 154
EP  - 161
JF  - Companion Publication of the 2024 Conference on Computer-Supported Cooperative Work and Social Computing
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3678884.3681844
ER  -

TY  - JOUR
AU  - Tumbas, Sanja; Berente, Nicholas; vom Brocke, Jan
TI  - Digital innovation and institutional entrepreneurship: Chief Digital Officer perspectives of their emerging role
PY  - 2018
AB  - NA
SP  - 188
EP  - 202
JF  - Journal of Information Technology
VL  - 33
IS  - 3
PB  -
DO  - 10.1057/s41265-018-0055-0
ER  -

TY  - JOUR
AU  - Dennehy, Denis; Conboy, Kieran; Ferreira, Jennifer; Babu, Jaganath
TI  - Sustaining Open Source Communities by Understanding the Influence of Discursive Manifestations on Sentiment
PY  - 2020
AB  - Sustaining open source (OS) communities is fundamental to the long-term success of any open source software (OSS) project. An OSS project consists of a community of software developers who are part of a larger business ecosystem involving hardware and software companies. Peer review of software code, known as patch review comments, is an important quality assurance activity for OSS development that requires developers to provide feedback concerning their degree of satisfaction. Despite the importance of feedback, which can affect sentiment of OS communities, the underlying discourse has not been studied. In this study, we use Activity Theory to identify and categorise 20,651 discursive manifestations of contradictions that occurred in patch review comments of a large, evolving OS community. Unique community-specific expressions are identified and mapped to developers’ sentiment during a software release cycle. The study contributes new insights concerning discursive manifestations of contradictions as a driving force for sustaining OS communities.
SP  - 1
EP  - 17
JF  - Information Systems Frontiers
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Turzo, Asif Kamal; Sultana, Sayma; Bosu, Amiangshu
TI  - From First Patch to Long-Term Contributor: Evaluating Onboarding Recommendations for OSS Newcomers
PY  - 2025
AB  - NA
SP  - 1303
EP  - 1318
JF  - IEEE Transactions on Software Engineering
VL  - 51
IS  - 4
PB  -
DO  - 10.1109/tse.2025.3550881
ER  -

TY  - JOUR
AU  - Abdalkareem, Rabe; Oda, Vinicius; Mujahid, Suhaib; Shihab, Emad
TI  - On the impact of using trivial packages: an empirical case study on npm and PyPI
PY  - 2020
AB  - NA
SP  - 1168
EP  - 1204
JF  - Empirical Software Engineering
VL  - 25
IS  - 2
PB  -
DO  - 10.1007/s10664-019-09792-9
ER  -

TY  - JOUR
AU  - Frluckaj, Hana; Stevens, Nikki; Howison, James; Dabbish, Laura
TI  - Paradoxes of Openness: Trans Experiences in Open Source Software
PY  - 2024
AB  - In recent years, concerns have increased over the lack of contributor diversity in open source software (OSS), despite its status as a paragon of open collaboration. OSS is an important form of digital infrastructure and part of a career path for many developers. While there exists a growing body of literature on cisgender women's under-representation in OSS, the experiences of contributors from other marginalized groups are comparatively absent from the literature. Such is the case for trans contributors, a historically influential group in OSS. In this study, we interviewed 21 trans participants to understand and represent their experiences in the OSS literature. From their experiences, we theorize two related paradoxes of openness in OSS: the paradox of openness and display and the paradox of openness and governance. In an increasingly violent world for trans people, we draw on our theorizing to build recommendations for more inclusive and safer OSS projects for contributors.
SP  - 1
EP  - 24
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 8
IS  - CSCW2
PB  -
DO  - 10.1145/3687047
ER  -

TY  - NA
AU  - Miranda, André; Pimentel, João
TI  - SAC - On the use of package managers by the C++ open-source community
PY  - 2018
AB  - The use of package managers is commonplace for software developers working with programming languages such as Ruby, Python, and JavaScript. This is not the case for C++ developers, which present a low adoption rate of package managers. The goal of this study is to understand what is preventing C++ developers from adopting package managers in the context of open-source software (OSS) projects. In order to achieve this goal, we performed a questionnaire survey with 343 developers from 42 OSS projects. The survey participants answered a questionnaire with 29 questions. After the analysis of the collected data, we could conclude that most participants are not reluctant to use C++ package managers and that Open-Source licensing, High Availability of Libraries, Good Documentation, and Ease of Configuration can be considered crucial factors for the successful adoption of C++ dependency management via language-specific package managers.
SP  - 1483
EP  - 1491
JF  - Proceedings of the 33rd Annual ACM Symposium on Applied Computing
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3167132.3167290
ER  -

TY  - JOUR
AU  - Capiluppi, Andrea; Stol, Klaas-Jan; Boldyreff, Cornelia
TI  - Software Reuse in Open Source: A Case Study
PY  - 2011
AB  - A promising way to support software reuse is based on Component-Based Software Development (CBSD). Open Source Software (OSS) products are increasingly available that can be freely used in product development. However, OSS communities still face several challenges before taking full advantage of the "reuse mechanism": many OSS projects duplicate effort, for instance when many projects implement a similar system in the same application domain and in the same topic. One successful counter-example is the FFmpeg multimedia project; several of its components are widely and consistently reused in other OSS projects. Documented is the evolutionary history of the various libraries of components within the FFmpeg project, which presently are reused in more than 140 OSS projects. Most use them as black-box components; although a number of OSS projects keep a localized copy in their repositories, eventually modifying them as needed (white-box reuse). In both cases, the authors argue that FFmpeg is a successful project that provides an excellent exemplar of a reusable library of OSS components.
SP  - 10
EP  - 35
JF  - International Journal of Open Source Software and Processes
VL  - 3
IS  - 3
PB  -
DO  - 10.4018/jossp.2011070102
ER  -

TY  - JOUR
AU  - Lin, Jiahuei; Zhang, Haoxiang; Adams, Bram; Hassan, Ahmed E.
TI  - Vulnerability management in Linux distributions
PY  - 2023
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 28
IS  - 2
PB  -
DO  - 10.1007/s10664-022-10267-7
ER  -

TY  - JOUR
AU  - Gu, Zuguang
TI  - Two separated worlds: On the preference of influence in life science and biomedical research
PY  - 2025
AB  - NA
SP  - 101641
EP  - 101641
JF  - Journal of Informetrics
VL  - 19
IS  - 2
PB  -
DO  - 10.1016/j.joi.2025.101641
ER  -

TY  - NA
AU  - Cabrey, Craig
TI  - Identifying the Presence of Known Vulnerabilities in the Versions of a Software Project
PY  - 2016
AB  - Department of Software Engineering Master of Science in Software Engineering Identifying the Presence of Known Vulnerabilities in the Versions of a Software Project
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Buccella, Agustina; Cechich, Alejandra; Pol'la, Matías; Arias, Maximiliano
TI  - Software Product Line Reengineering: A Case Study on the Geographic Domain
PY  - 2016
AB  - The growing adoption of software product lines (SPL) represents perhaps a paradigm shift in software development aiming at improving cost, quality, time to market, and developer productivity. While the underlying concepts are straightforward enough (building a family of related products or systems by planned and careful reuse of a base of generalized software development assets), the problems can be in the details, as successful product line practice involves domain understanding, technology selection, and so forth. Today, there is an important increment on reporting experiences and lessons about SPL development by capturing aspects that have been gathered during daily practice. Following this line, in this paper we start from our experiences of developing a software product line on the Marine Ecology domain highlighting our reasons for reengineering a previous SPL. Then, we explain step-by-step reengineering activities in terms of motivation, solutions, and lessons learned, which summarize strengths and limitations of the applied practices. Differently from other cases, here we take advantage of using domain standards as well as open source implementations within the geographic domain.
SP  - 14
EP  - 28
JF  - Journal of Computer Science and Technology
VL  - 16
IS  - 1
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Yin, Likang; Zhang, Xiyu; Filkov, Vladimir
TI  - On the Self-Governance and Episodic Changes in Apache Incubator Projects: An Empirical Study
PY  - 2023
AB  - Sustainable Open Source Software (OSS) projects are characterized by the ability to attract new project members and maintain an energetic project community. Building sustainable OSS projects from a nascent state requires effective project governance and socio-technical structure to be interleaved, in a complex and dynamic process. Although individual disciplines have studied each separately, little is known about how governance and software development work together in practice toward sustainability. Prior work has shown that many OSS projects experience large, episodic changes over short periods of time, which can propel them or drag them down. However, sustainable projects typically manage to come out unscathed from such changes, while others do not. The natural questions arise: Can we identify the back-and-forth between governance and socio-technical structure that lead to sustainability following episodic events? And, how about those that do not lead to sustainability? From a data set of social, technical, and policy digital traces from 262 sustainability-labeled ASF incubator projects, here we employ a large-scale empirical study to characterize episodic changes in socio-technical aspects measured by Change Intervals (CI), governance rules and regulations in a form of Institutional Statements (IS), and the temporal relationships between them. We find that sustainable projects during episodic changes can adapt themselves to institutional statements more efficiently, and that institutional discussions can lead to episodic changes intervals in socio-technical aspects of the projects, and vice versa. In practice, these results can provide timely guidance beyond socio-technical considerations, adding rules and regulations in the mix, toward a unified analytical framework for OSS project sustainability.
SP  - 678
EP  - 689
JF  - 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icse48619.2023.00066
ER  -

TY  - JOUR
AU  - Barcomb, Ann; Kaufmann, Andreas; Riehle, Dirk; Stol, Klaas-Jan; Fitzgerald, Brian
TI  - Uncovering the Periphery: A Qualitative Survey of Episodic Volunteering in Free/Libre and Open Source Software Communities
PY  - 2020
AB  - Free/Libre and Open Source Software (FLOSS) communities are composed, in part, of volunteers, many of whom contribute infrequently. However, these infrequent volunteers contribute to the sustainability of FLOSS projects, and should ideally be encouraged to continue participating, even if they cannot be persuaded to contribute regularly. Infrequent contributions are part of a trend which has been widely observed in other sectors of volunteering, where it has been termed “episodic volunteering” (EV). Previous FLOSS research has focused on the Onion model, differentiating core and peripheral developers, with the latter considered as a homogeneous group. We argue this is too simplistic, given the size of the periphery group and the myriad of valuable activities they perform beyond coding. Our exploratory qualitative survey of 13 FLOSS communities investigated what episodic volunteering looks like in a FLOSS context. EV is widespread in FLOSS communities, although not specifically managed. We suggest several recommendations for managing EV based on a framework drawn from the volunteering literature. Also, episodic volunteers make a wide range of value-added contributions other than code, and they should neither be expected nor coerced into becoming habitual volunteers.
SP  - 962
EP  - 980
JF  - IEEE Transactions on Software Engineering
VL  - 46
IS  - 9
PB  -
DO  - 10.1109/tse.2018.2872713
ER  -

TY  - JOUR
AU  - Hou, Shengjie; Zhang, Xiang; Yi, Biyi; Tang, Yi
TI  - Public attitudes on open source communities in China: A text mining analysis
PY  - 2022
AB  - NA
SP  - 102112
EP  - 102112
JF  - Technology in Society
VL  - 71
IS  - NA
PB  -
DO  - 10.1016/j.techsoc.2022.102112
ER  -

TY  - JOUR
AU  - Bajaj, Rahul; Fernandes, Eduardo; Adams, Bram; Hassan, Ahmed E.
TI  - Unreproducible builds: time to fix, causes, and correlation with external ecosystem factors
PY  - 2023
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 29
IS  - 1
PB  -
DO  - 10.1007/s10664-023-10399-4
ER  -

TY  - CHAP
AU  - Wessel, Mairieli; Mens, Tom; Decan, Alexandre; Mazrae, Pooya Rostami
TI  - The GitHub Development Workflow Automation Ecosystems
PY  - 2023
AB  - NA
SP  - 183
EP  - 214
JF  - Software Ecosystems
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-031-36060-2_8
ER  -

TY  - NA
AU  - Barcomb, Ann; Stol, Klaas-Jan; Riehle, Dirk; Fitzgerald, Brian
TI  - ICSE - Why do episodic volunteers stay in FLOSS communities
PY  - 2019
AB  - Successful Free/Libre and Open Source Software (FLOSS) projects incorporate both habitual and infrequent, or episodic, contributors. Using the concept of episodic volunteering (EV) from the general volunteering literature, we derive a model consisting of five key constructs that we hypothesize affect episodic volunteers' retention in FLOSS communities. To evaluate the model we conducted a survey with over 100 FLOSS episodic volunteers. We observe that three of our model constructs (social norms, satisfaction and community commitment) are all positively associated with volunteers' intention to remain, while the two other constructs (psychological sense of community and contributor benefit motivations) are not. Furthermore, exploratory clustering on unobserved heterogeneity suggests that there are four distinct categories of volunteers: satisfied, classic, social and obligated. Based on our findings, we offer suggestions for projects to incorporate and manage episodic volunteers, so as to better leverage this type of contributors and potentially improve projects' sustainability.
SP  - 948
EP  - 959
JF  - 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/icse.2019.00100
ER  -

TY  - JOUR
AU  - Meluso, John; Chambers, Cassandra R.; Littauer, Richard; Llamas, Nerea; Long Lingo, Elizabeth; Mhangami, Marlene; Pitt, Beck; Splitter, Violetta; Wang, Huajin
TI  - Opening Up: Interdisciplinary Guidance for Managing Open Ecosystems
PY  - 2024
AB  - NA
SP  - NA
EP  - NA
JF  - SSRN Electronic Journal
VL  - NA
IS  - NA
PB  -
DO  - 10.2139/ssrn.4821969
ER  -

TY  - NA
AU  - Guo, Philip J.
TI  - UIST - Ten Million Users and Ten Years Later: Python Tutor’s Design Guidelines for Building Scalable and Sustainable Research Software in Academia
PY  - 2021
AB  - Research software is often built as prototypes that never get widespread usage and are left unmaintained after a few papers get published. To counteract this trend, we propose a method for building research software with scale and sustainability in mind so that it can organically grow a large userbase and enable longer-term research. To illustrate this method, we present the design and implementation of Python Tutor (pythontutor.com), a code visualization tool that is, to our knowledge, one of the most widely-used pieces of research software developed within a university lab. Over the past decade, it has been used by over ten million people in over 180 countries. It has also contributed to 55 publications from 35 research groups in 13 countries. We distilled lessons from working on Python Tutor into three sets of design guidelines: 1) user experience design for scale and sustainability, 2) software architecture design for long-term sustainability, and 3) designing a sustainable software development workflow within academia. These guidelines can enable a student to create long-lasting software that reaches many users and facilitates research from many independent groups.
SP  - 1235
EP  - 1251
JF  - The 34th Annual ACM Symposium on User Interface Software and Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3472749.3474819
ER  -

TY  - NA
AU  - Boisselle, Vincent; Adams, Bram
TI  - SCAM - The impact of cross-distribution bug duplicates, empirical study on Debian and Ubuntu
PY  - 2015
AB  - Although open source distributions like Debian and Ubuntu are closely related, sometimes a bug reported in the Debian bug repository is reported independently in the Ubuntu repository as well, without the Ubuntu users nor developers being aware. Such cases of undetected cross-distribution bug duplicates can cause developers and users to lose precious time working on a fix that already exists or to work individually instead of collaborating to find a fix faster. We perform a case study on Ubuntu and Debian bug repositories to measure the amount of cross-distribution bug duplicates and estimate the amount of time lost. By adapting an existing within-project duplicate detection approach (achieving a similar recall of 60%), we find 821 cross-duplicates. The early detection of such duplicates could reduce the time lost by users waiting for a fix by a median of 38 days. Furthermore, we estimate that developers from the different distributions lose a median of 47 days in which they could have collaborated together, had they been aware of duplicates. These results show the need to detect and monitor cross-distribution duplicates.
SP  - 131
EP  - 140
JF  - 2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/scam.2015.7335409
ER  -

TY  - NA
AU  - Wu, Xin; Wu, Jian-Yu; Zhou, Minghui; Wang, Zhi-Qiang; Yang, Li-Yun
TI  - Analysis of open source license selection for the GitHub programming community.
PY  - 2020
AB  - Developers usually select different open source licenses to restrain the conditions of using open source software, in order to protect intellectual property rights effectively and maintain the long-term development of the software. However, the open source community has a wide variety of licenses available, developers generally find it difficult to understand the differences between different open source license. And existing open source license selection tools require developers to understand the terms of the open source license and identify their business needs, which makes it hard for developers to make the right choice. Although academia has extensive research to the open source license, but there is no systematic analysis on the actual difficulties of the developers to choose the open source license, thus lacking a clear understanding, for this reason, the purpose of this paper is to understand the difficulties faced by open source developers in choosing open source licenses, analyze the components of open source license and the affecting factors of open source license selection, and to provide references for developers to choose open source licenses.
SP  - NA
EP  - NA
JF  - arXiv: Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - BOOK
AU  - Wittke, Volker; Hanekop, Heidemarie
TI  - New forms of collaborative innovation and production on the internet - an interdisciplinary perspective
PY  - 2011
AB  - The Internet has enabled new forms of large-scale collaboration. Voluntary contributions by large numbers of users and co-producers lead to new forms of production and innovation, as seen in Wikipedia, open source software development, in social networks or on user-generated content platforms as well as in many firm-driven Web 2.0 services. Large-scale collaboration on the Internet is an intriguing phenomenon for scholarly debate because it challenges well established insights into the governance of economic action, the sources of innovation, the possibilities of collective action and the social, legal and technical preconditions for successful collaboration. Although contributions to the debate from various disciplines and fine-grained empirical studies already exist, there still is a lack of an interdisciplinary approach.
SP  - 196
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.17875/gup2011-287
ER  -

TY  - BOOK
AU  - Petrov, Dmitrij; Obwegeser, Nikolaus
TI  - ISD - Barriers to Open-Source Software Adoption: Review and Synthesis.
PY  - 2018
AB  - NA
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Frluckaj, Hana; Dabbish, Laura; Widder, David Gray; Qiu, Huilian Sophie; Herbsleb, James D.
TI  - Gender and Participation in Open Source Software Development
PY  - 2022
AB  - Open source software represents an important form of digital infrastructure as well as a pathway to technical careers for many developers, but women are drastically underrepresented in this setting. Although there is a good body of literature on open source participation, there is very little understanding of the participation trajectories and contribution experiences of women developers, and how they compare to those of men developers, in open source software projects. In order to understand their joining and participation trajectories, we conducted interviews with 23 developers (11 men and 12 women) who became core in an open source project. We identify differences in women and men's motivations for initial contributions and joining processes (e.g. women participating in projects that they have been invited to) and sustained involvement in a project. We also describe unique negative experiences faced by women contributors in this setting in each stage of participation. Our results have implications for diversifying participation in open source software and understanding open source as a pathway to technical careers.
SP  - 1
EP  - 31
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 6
IS  - CSCW2
PB  -
DO  - 10.1145/3555190
ER  -

TY  - NA
AU  - Jahn, Leonie; Engelbutzeder, Philip; Randall, Dave; Bollmann, Yannick; Ntouros, Vasilis; Michel, Lea Katharina; Wulf, Volker
TI  - In Between Users and Developers: Serendipitous Connections and Intermediaries in Volunteer-Driven Open-Source Software Development
PY  - 2024
AB  - Technology plays a pivotal role in driving transformation through grassroots movements, which operate on a local scale while embracing a global perspective on sustainability. Consequently, research emerged within Sustainable HCI, aiming to derive design principles that can empower these movements to scale their impact. However, a notable gap exists in contributions when addressing scalability of large free and open-source software (FOSS) projects.
SP  - 1
EP  - 15
JF  - Proceedings of the CHI Conference on Human Factors in Computing Systems
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3613904.3642541
ER  -

TY  - NA
AU  - Ochoa, Lina; Degueule, Thomas; Falleri, Jean-Rémy
TI  - BreakBot
PY  - 2022
AB  - "If we make this change to our code, how will it impact our clients?" It is difficult for library maintainers to answer this simple---yet essential!---question when evolving their libraries. Library maintainers are constantly balancing between two opposing positions: make changes at the risk of breaking some of their clients, or avoid changes and maintain compatibility at the cost of immobility and growing technical debt. We argue that the lack of objective usage data and tool support leaves maintainers with their own subjective perception of their community to make these decisions.
SP  - 26
EP  - 30
JF  - Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3510455.3512783
ER  -

TY  - JOUR
AU  - Berkel, Niels Van; Pohl, Henning
TI  - Collaborating with Bots and Automation on OpenStreetMap
PY  - 2024
AB  - OpenStreetMap (OSM) is a large online community where users collaborate to map the world. In addition to manual edits, the OSM mapping database is regularly modified by bots and automated edits. In this article, we seek to better understand how people and bots interact and conflict with each other. We start by analysing over 15 years of mailing list discussions related to bots and automated edits. From this data, we uncover five themes, including how automation results in power differentials between users and how community ideals of consensus clash with the realities of bot use. Subsequently, we surveyed OSM contributors on their experiences with bots and automated edits. We present findings about the current escalation and review mechanisms, as well as the lack of appropriate tools for evaluating and discussing bots. We discuss how OSM and similar communities could use these findings to better support collaboration between humans and bots.
SP  - 1
EP  - 30
JF  - ACM Transactions on Computer-Human Interaction
VL  - 31
IS  - 3
PB  -
DO  - 10.1145/3665326
ER  -

TY  - JOUR
AU  - Jamieson, Jack; Foong, Eureka; Yamashita, Naomi
TI  - Maintaining Values
PY  - 2022
AB  - Communication technologies have significant social impacts, and it is important to consider how designers' and developers' values shape their design. Increasingly, these technologies are released as continually evolving platforms and services, so their development involves ongoing discussions and debates about unforeseen problems and future directions. However, there is a gap in research about how designers, developers, and other stakeholders engage with values during later stages of development. We investigate discussions about values in the context of open source software development, focusing on projects related to the Decentralized Web. We conducted a large-scale analysis of GitHub issues among diverse yet ideologically-related projects. We show that the percentage of discussions about values increases later in development, and we identify features and outcomes of conflicts related to open source participants' values. Finally, we propose suggestions to improve upon existing discussion practices by supporting common ground among collaborators with diverse goals, perspectives, and experiences.
SP  - 1
EP  - 28
JF  - Proceedings of the ACM on Human-Computer Interaction
VL  - 6
IS  - CSCW2
PB  -
DO  - 10.1145/3555550
ER  -

TY  - NA
AU  - Nevo, Dorit; Furneaux, Brent
TI  - THE POWER OF COMMUNITIES: FROM OBSERVED OUTCOMES TO MEASURABLE
PY  - 2012
AB  - Considerable research has sought to establish the benefits that technology-mediated online communities offer their members. In an effort to capitalize on these benefits, organizations have been introducing internally oriented communities to support a wide range of tasks. As a result, such communities have become quite common within organizations and it is therefore becoming increasingly important to link community participation to business outcomes such as team and organizational performance. This paper develops a model linking three commonly identified outcomes of communities: knowledge access, trust, and bridging ties to team performance. We examine two routes from community outcomes to performance, one direct and the other mediated by individual team member innovation. Results of empirical analysis conducted with members of 115 global teams linked to 41 distinct communities support our hypotheses. These findings provide evidence for the business value of communities and offer insights into the value of communities across levels of analysis.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Paschali, Maria-Eleni; Ampatzoglou, Apostolos; Bibi, Stamatia; Chatzigeorgiou, Alexander; Stamelos, Ioannis
TI  - Reusability of open source software across domains
PY  - 2017
AB  - NA
SP  - 211
EP  - 227
JF  - Journal of Systems and Software
VL  - 134
IS  - NA
PB  -
DO  - 10.1016/j.jss.2017.09.009
ER  -

TY  - JOUR
AU  - Chen, Liang; Yang, Yun; Wang, Wei
TI  - Temporal Autoregressive Matrix Factorization for High-Dimensional Time Series Prediction of OSS.
PY  - 2024
AB  - Open-source software (OSS) plays an increasingly significant role in modern software development tendency, so accurate prediction of the future development of OSS has become an essential topic. The behavioral data of different open-source software are closely related to their development prospects. However, most of these behavioral data are typical high-dimensional time series data streams with noise and missing values. Hence, accurate prediction on such cluttered data requires the model to be highly scalable, which is not a property of traditional time series prediction models. To this end, we propose a temporal autoregressive matrix factorization (TAMF) framework that supports data-driven temporal learning and prediction. Specifically, we first construct a trend and period autoregressive model to extract trend and period features from OSS behavioral data, and then combine the regression model with a graph-based matrix factorization (MF) to complete the missing values by exploiting the correlations among the time series data. Finally, use the trained regression model to make predictions on the target data. This scheme ensures that TAMF can be applied to different types of high-dimensional time series data and thus has high versatility. We selected ten real developer behavior data from GitHub for case analysis. The experimental results show that TAMF has good scalability and prediction accuracy.
SP  - 13741
EP  - 13752
JF  - IEEE Transactions on Neural Networks and Learning Systems
VL  - 35
IS  - 10
PB  -
DO  - 10.1109/tnnls.2023.3271327
ER  -

TY  - JOUR
AU  - Ghosh, Uttam Kumar
TI  - CRITICALLY ANALYSING THE IMPACT OF AI BASED MARKETING ON THE RETAIL SECTOR IN INDIA
PY  - 2024
AB  - The introduction of “Artificial Intelligence” or “AI” has revolutionised the India retail market. Its integration in the marketing strategies has created a profound impact making it an essential topic of study. Several technologies of AI are involved in this process. It includes, “machine learning”, “predictive analysis” and “language processing” technology, which has helped the retailers optimise and automate their processes. It has also resulted in better decision making, and thereby improved the customer interactions. The current study aims to analyse the “effects of AI” in the Indian retail markets. This study, additionally tries to highlight the various opportunities created and the associated challenges that arise. The current study also focuses on the current mechanisms of adaptation to “AI based technology”. It would therefore help in a better understanding of both the benefits and challenges of “AI integration”. For this study, a “mixed methods” approach has been taken. Therefore, both “qualitative” and “quantitative” methods have been considered. The findings provide a comprehensive view of the “AI integration” in marketing. AI has helped the retailers to effectively increase “customer interaction” with the help of “predictive analytics” and has helped automate the marketing campaigns. Additionally, this study has identified several barriers to the adoption of AI. The associated “high costs”, and the concern for “data security” have been identified as top concerns of the retailers in the Indian market. Additionally, the study has revealed that the effects of AI on the “small and medium enterprises” (SMEs) of India is significant. It has led to them having long term “socio-economic” effects on their businesses. However, AI as a tool in the Indian retail sector has significant potential. Overcoming certain barriers would help the market reach its optimal potential. In this paper, additionally, the research gaps are also highlighted.
SP  - NA
EP  - NA
JF  - ShodhKosh: Journal of Visual and Performing Arts
VL  - 5
IS  - 6
PB  -
DO  - 10.29121/shodhkosh.v5.i6.2024.3458
ER  -

TY  - JOUR
AU  - Miller, Tymoteusz; Durlik, Irmina; Kostecka, Ewelina; Łobodzińska, Adrianna; Matuszak, Marcin
TI  - The Emerging Role of Artificial Intelligence in Enhancing Energy Efficiency and Reducing GHG Emissions in Transport Systems
PY  - 2024
AB  - The global transport sector, a significant contributor to energy consumption and greenhouse gas (GHG) emissions, requires innovative solutions to meet sustainability goals. Artificial intelligence (AI) has emerged as a transformative technology, offering opportunities to enhance energy efficiency and reduce GHG emissions in transport systems. This study provides a comprehensive review of AI’s role in optimizing vehicle energy management, traffic flow, and alternative fuel technologies, such as hydrogen fuel cells and biofuels. It explores AI’s potential to drive advancements in electric and autonomous vehicles, shared mobility, and smart transportation systems. The economic analysis demonstrates the viability of AI-enhanced transport, considering Total Cost of Ownership (TCO) and cost-benefit outcomes. However, challenges such as data quality, computational demands, system integration, and ethical concerns must be addressed to fully harness AI’s potential. The study also highlights the policy implications of AI adoption, underscoring the need for supportive regulatory frameworks and energy policies that promote innovation while ensuring safety and fairness.
SP  - 6271
EP  - 6271
JF  - Energies
VL  - 17
IS  - 24
PB  -
DO  - 10.3390/en17246271
ER  -

TY  - CHAP
AU  - Stewart, Brian; Khare, Anshuman
TI  - eLearning and the Sustainable Campus
PY  - 2014
AB  - eLearning has been commonly accepted as a comparatively cost effective and environmentally friendly educational delivery technology. On the other hand, however, agreement as to its efficacy, quality and appropriateness for the social development and maturing of young adults, among other issues do not yield to a similar consensus. Following Brundtland’s model of sustainable development, this paper employs the Sustainability Circle Framework developed by the Global Compact Cities Programme and applies it to eLearning to provide novel and beneficial insights into the critical factors for the on-going sustainability of eLearning. The paper analyses eLearning with regard to the four domains of ecology, economy, culture and politics, providing a comprehensive and rounded perspective of the critical factors that enable eLearning to be sustainable. In addition the findings will provide input into on-going discussions regarding the adoption of eLearning by different educational sectors, within differing disciplines and across differing economic and cultural regions. Achieving an improved understanding of educational sustainability drivers has potential to not only reduce the ecological footprint of educational institutions but also to ensure that such reductions become systemic and endemic into the fabric of the institutions.
SP  - 291
EP  - 305
JF  - World Sustainability Series
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-08837-2_20
ER  -

TY  - JOUR
AU  - Foundjem, Armstrong; Adams, Bram
TI  - Release synchronization in software ecosystems
PY  - 2021
AB  - NA
SP  - 1
EP  - 50
JF  - Empirical Software Engineering
VL  - 26
IS  - 3
PB  -
DO  - 10.1007/s10664-020-09929-1
ER  -

TY  - BOOK
AU  - Goggins, Sean; Lumbard, Kevin; Germonprez, Matt
TI  - SoHeal@ICSE - Open Source Community Health: Analytical Metrics and Their Corresponding Narratives
PY  - 2021
AB  - Open source projects are most often evaluated by potential contributors and consumers using metrics that describe a level of activity within the project because those measurements are available. The principle question in the minds of most evaluators, however, is “How healthy and sustainable is this project in the context of its competitors or dependent projects”? Limitations of current analysis methods focused on trace data alone are discussed, and reviewed in depth. Next, our methods for conducting engaged field research, developing metrics standards as part of a corporate communal partnership, and molding tools that evolve through a reflexive discourse with practitioners using standard metrics is framed as an approach to consider for examining open source software health and sustainability. Researchers, in particular, need tools for increasing the feasibility of comprehensive, multi-project health and sustainability studies, and connecting trace data with human experience. From a practice perspective, these same conditions are increasing the difficulty organizations and individuals engaged in open source face when trying to understand the status, condition, and health of a particular project, the project's ecosystem or ecosystems emerging around their specific project context. This study examines the work of a Linux Foundation working group, CHAOSS (Community Health Analytics Open Source Software) during the first four years of the formation. The paper concludes with examples of CHAOSS metrics operationalized in partnership with corporate collaborators in a manner that emphasizes comparison, transparency, trajectory and visualization as components for discursive, evolutionary understanding of open source software health.
SP  - 25
EP  - 33
JF  - 2021 IEEE/ACM 4th International Workshop on Software Health in Projects, Ecosystems and Communities (SoHeal)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/soheal52568.2021.00010
ER  -

TY  - JOUR
AU  - Zhang, Wen; Zhao, Jiangpeng; Peng, Rui; Wang, Song; Yang, Ye
TI  - SusRec: An Approach to Sustainable Developer Recommendation for Bug Resolution Using Multimodal Ensemble Learning
PY  - 2023
AB  - The sustainability of an open source project is essential for the long-term and reliable development of software. Most existing studies focus on the recommendation accuracy of bug report assignment while ignoring inexperienced developers in the open source community. This gives inexperienced developers less opportunity to resolve bugs and can cause them to gradually lose interest in the development of open source software (OSS). To address this problem, this article proposes a novel approach called sustainable recommender (SusRec) to make sustainable report assignments without sacrificing accuracy. The SusRec approach is based on multimodal learning and ensemble learning, and it consists of two stages: the preprocessing stage and the developer scoring stage. In the preprocessing stage, the approach selects candidate developers who have participated in the resolution of bugs under the product of a new bug report. It then divides the candidate developers into three types—core developers, active developers, and peripheral developers—according to their experience. In the developer scoring stage, multimodal learning is adopted to score the three types of bug report–developer pairs, and ensemble learning is adopted to weight the scores of the three types of bug report–developer pairs and recommend developers for bug reports. We conduct extensive experiments using the bug repositories of the Eclipse and Mozilla projects to compare the proposed SusRec approach with the baseline methods in bug report assignment. The results demonstrate that the proposed SusRec approach can not only improve the accuracy of developer recommendations for bug reports, but also the sustainability of OSS projects by providing more opportunities for active developers and peripheral developers to participate in bug resolution.
SP  - 61
EP  - 78
JF  - IEEE Transactions on Reliability
VL  - 72
IS  - 1
PB  -
DO  - 10.1109/tr.2022.3176733
ER  -

TY  - JOUR
AU  - Alfadel, Mahmoud; Costa, Diego Elias; Shihab, Emad; Adams, Bram
TI  - On the Discoverability of npm Vulnerabilities in Node.js Projects
PY  - 2023
AB  - The reliance on vulnerable dependencies is a major threat to software systems. Dependency vulnerabilities are common and remain undisclosed for years. However, once the vulnerability is discovered and publicly known to the community, the risk of exploitation reaches its peak, and developers have to work fast to remediate the problem. While there has been a lot of research to characterize vulnerabilities in software ecosystems, none have explored the problem taking the discoverability into account. Therefore, we perform a large-scale empirical study examining 6,546 Node.js applications. We define three discoverability levels based on vulnerabilities lifecycle (undisclosed, reported, and public). We find that although the majority of the affected applications (99.42%) depend on undisclosed vulnerable packages, 206 (4.63%) applications were exposed to dependencies with public vulnerabilities. The major culprit for the applications being affected by public vulnerabilities is the lack of dependency updates; in 90.8% of the cases, a fix is available but not patched by application maintainers. Moreover, we find that applications remain affected by public vulnerabilities for a long time (103 days). Finally, we devise DepReveal, a tool that supports our discoverability analysis approach, to help developers better understand vulnerabilities in their application dependencies and plan their project maintenance.
SP  - 1
EP  - 27
JF  - ACM Transactions on Software Engineering and Methodology
VL  - 32
IS  - 4
PB  -
DO  - 10.1145/3571848
ER  -

TY  - JOUR
AU  - Plantin, Jean-Christophe; Thomer, Andrea
TI  - Platforms, programmability, and precarity: The platformization of research repositories in academic libraries
PY  - 2023
AB  - We investigate in this article how repository platforms change the sharing and preservation of digital objects in academic libraries. We use evidence drawn from semi-structured interviews with 31 data repository managers working at 21 universities using the product Figshare for institutions. We first show that repository managers use this platform to bring together actors, technologies, and processes usually scattered across the library to assign to them the tasks that they value less (such as data preparation or IT maintenance) and spend more time engaging in activities they appreciate (such as raising awareness of data sharing). While this platformization of data management improves their job satisfaction, we reveal how it simultaneously accentuates the outsourcing of libraries’ core mission to private actors. We eventually discuss how this platformization can deskill librarians and perpetuate precarity politics in university libraries.
SP  - 338
EP  - 358
JF  - New Media & Society
VL  - 27
IS  - 1
PB  -
DO  - 10.1177/14614448231176758
ER  -

TY  - NA
AU  - El-Halawany, Ahmed M.; Elminir, Hamdy K.; El-Bakry, Hazem
TI  - Improving Reuse During the Development Process for Web Systems
PY  - 2024
AB  - Software reuse has emerged as a crucial practice in the software industry, offering significant benefits in time-to-market and resources management. This is particularly pertinent in web systems development, where the integration of diverse technologies and the varied backgrounds of technical teams pose substantial challenges. The rapid expansion of web systems underscores the urgent need to adopt best practices and methodologies for web system reuse to streamline the development process, reducing effort, cost, and time. The objective of this paper is to identify the key challenges of web system reuse in the context of small and medium-sized software companies in Egypt and Saudi Arabia. Using qualitative research methods, including interviews, focus groups, and participant observations, we conducted an empirical study to examine current reuse practices, identify, and understand the root causes of common challenges. According to the results of the empirical study, we have developed a systematic approach to enhance web system reuse during the development process in the context of small and medium-sized software companies in Egypt and Saudi Arabia. Our proposed approach fills critical gaps in current practices, offering practical guides to improve efficiency, reduce development time, and enhance overall software quality. This research contributes to the broader discourse on software reuse by providing context-specific insights and adaptable solutions that are relevant to similar markets worldwide.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.21203/rs.3.rs-4719614/v1
ER  -

TY  - JOUR
AU  - Ochoa, Lina; Degueule, Thomas; Falleri, Jean-Rémy; Vinju, Jurgen
TI  - Breaking bad? Semantic versioning and impact of breaking changes in Maven Central
PY  - 2022
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 27
IS  - 3
PB  -
DO  - 10.1007/s10664-021-10052-y
ER  -

TY  - JOUR
AU  - Kritikos, Apostolos; Stamelos, Ioannis
TI  - A resilience‐based framework for assessing the evolution of open source software projects
PY  - 2023
AB  - Open source software (OSS) has been developing for more than two decades. It originated as a movement with the introduction of the first free/libre OSS operating system, became a popular trend among the developer community, led to enterprise solutions widely embraced by the global market, and began garnering attention from significant players in the software industry (such as IBM's acquisition of RedHat). Throughout the years, numerous software assessment models have been suggested, some of which were created specifically for OSS projects. Most of these assessment models focus on software quality and maintainability. Some models are taking under consideration health aspects of OSS projects. Despite the multitude of these models, there is yet to be a universally accepted model for assessing OSS projects. In this work, we aim to adapt the City Resilience Framework (CRF) for use in OSS projects to establish a strong theoretical foundation for OSS evaluation focusing on the project's resilience as it evolves over time. We would like to highlight that our goal with the proposed assessment model is not to compare two OSS solutions with each other, in terms of resilience, or even do a resilience ranking between the available OSS tools. We are aiming to investigate resilience of an OSS project as it evolves and identify possible opportunities of improvements in the four dimensions we are defining. These dimensions are as follows: source code, business and legal, integration and reuse, and social (community). The CRF is a framework that was introduced to measure urban resilience and most specifically how cities' resilience is changing as they evolve. We believe that a software evaluation model that focuses on resilience can complement the pre‐existing models based on software quality and software health.
Although concepts that are related to resilience, like sustainability or viability, already appear in literature, to our best knowledge, there is no OSS assessment model that evaluates the resilience of an OSS project. We argue that cities and OSS projects are both dynamically evolving systems with similar characteristics. The proposed framework utilizes both quantitative and qualitative indicators, which is viewed as an advantage. Lastly, we would like to emphasize that the framework has been tested on the enterprise software domain as part of this study, evaluating five major versions of six OSS projects, Laravel, Composer, PHPMyAdmin, OKApi, PatternalPHP, and PHPExcel, the first three of which are intuitively considered resilient and the three latter nonresilient, to provide a preliminary validation of the models' ability to distinguish between resilient and not resilient projects.
SP  - NA
EP  - NA
JF  - Journal of Software: Evolution and Process
VL  - 36
IS  - 5
PB  -
DO  - 10.1002/smr.2597
ER  -

TY  - NA
AU  - Reid, Brittany; Barbosa, Keila; d'Amorim, Marcelo; Wagner, Markus; Treude, Christoph
TI  - NCQ: code reuse support for Node.js developers.
PY  - 2021
AB  - Code reuse is an important part of software development. The adoption of code reuse practices is especially noticeable in Node.js. The Node.js package manager, NPM, contains over 1 million packages and developers often seek out packages to solve programming tasks. Due to the vast number of packages, selecting the right package is difficult and time consuming. With the goal of improving productivity of developers that heavily reuse code through third-party packages, we present Node Code Query (NCQ), a custom Read-Eval-Print Loop environment that allows developers to 1) search for NPM packages using natural language queries, 2) search for code snippets related to those packages, 3) quickly set up new environments for testing those snippets, and 4) transition between search and editing modes. Our user study shows that participants began programming faster and concluded tasks faster with NCQ than with a baseline approach, and that they liked, among other features, the search for code snippets and packages. Our results suggest that NCQ makes Node.js developers more efficient in reusing code.
SP  - NA
EP  - NA
JF  - arXiv: Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - JOUR
AU  - Read, Sarah; Swarts, Jason
TI  - Visualizing and Tracing: Research Methodologies for the Study of Networked, Sociotechnical Activity, Otherwise Known as Knowledge Work
PY  - 2014
AB  - This article demonstrates, by example, 2 approaches to the analysis of knowledge work. Both methods draw on network as a framework: a Latourian actor–network theory analysis and a network analysis. The shared object of analysis is a digital humanities and digital media research lab that is the outcome of the collective and coordinated efforts of researchers and other stakeholders at North Carolina State University. The authors show how the two methods are drawn to different objects of study, different data sources, and different assumptions about how data can be reduced and made understandable. The authors conclude by arguing that although these methods yield different outlooks on the same object, their findings are mutually informing.
SP  - 14
EP  - 44
JF  - Technical Communication Quarterly
VL  - 24
IS  - 1
PB  -
DO  - 10.1080/10572252.2015.975961
ER  -

TY  - NA
AU  - Tiwari, Deepika; Gamage, Yogya; Monperrus, Martin; Baudry, Benoit
TI  - PROZE: Generating Parameterized Unit Tests Informed by Runtime Data
PY  - 2024
AB  - NA
SP  - 166
EP  - 176
JF  - 2024 IEEE International Conference on Source Code Analysis and Manipulation (SCAM)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/scam63643.2024.00025
ER  -

TY  - JOUR
AU  - Li, Zhixing; Yu, Yue; Wang, Tao; Yin, Gang; Li, ShanShan; Wang, Huaimin
TI  - Are You Still Working on This? An Empirical Study on Pull Request Abandonment
PY  - 2022
AB  - NA
SP  - 2173
EP  - 2188
JF  - IEEE Transactions on Software Engineering
VL  - 48
IS  - 6
PB  -
DO  - 10.1109/tse.2021.3053403
ER  -

TY  - CHAP
AU  - Eckert, Remo
TI  - OSS - How Can Open Source Software Projects Be Compared with Organizations
PY  - 2018
AB  - The existence of a community plays a central role in the development of Open Source Software (OSS). Communities are commonly defined as a group of people sharing common norms or values. The common interest of an OSS project is obvious: to develop software under an OSS license. When we look at the rather general definition of a community, we see that there is a similarity to the term ‘organization’. This paper draws parallels between OSS projects and the general elements of an organization and shows the different elements comprised in an OSS community: people, organization and assets. Each of those elements is enriched with examples from different research in the corresponding OSS research stream and provides a broad overview of the elements of OSS projects. With the help of this comparison, research on OSS can be made more focused and aligned with organizational research.
SP  - 3
EP  - 14
JF  - IFIP Advances in Information and Communication Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-319-92375-8_1
ER  -

TY  - JOUR
AU  - El-Halawany, Ahmed M; Elminir, Hamdy K; El-Bakry, Hazem
TI  - Improving reuse during the development process for web systems.
PY  - 2024
AB  - Software reuse has emerged as a crucial practice in the software industry, offering significant benefits in time-to-market, and resources management. This is particularly pertinent in web systems development, where the integration of diverse technologies and the varied backgrounds of technical teams pose substantial challenges. The rapid expansion of web systems underscores the urgent need to adopt best practices and methodologies for web system reuse to streamline the development process, reducing effort, cost, and time. This paper aims to identify the key challenges of web system reuse in the context of small and medium-sized software companies in Egypt and Saudi Arabia. Using qualitative research methods, including interviews, focus groups, and participant observations, an empirical study was conducted to examine current reuse practices and understand the root causes of common challenges. Based on the results of the empirical study, a systematic approach was developed to enhance web system reuse during the development process in the context of small and medium-sized software companies in Egypt and Saudi Arabia. The proposed approach addresses critical gaps in current practices, offering practical guidelines to improve efficiency, reduce development time, and enhance overall software quality. This research contributes to the broader discourse on software reuse by providing context-specific insights and adaptable solutions that are relevant to similar markets worldwide.
SP  - 24023
EP  - NA
JF  - Scientific reports
VL  - 14
IS  - 1
PB  -
DO  - 10.1038/s41598-024-74643-7
ER  -

TY  - JOUR
AU  - Ait, Adem; Cánovas Izquierdo, Javier Luis; Cabot, Jordi
TI  - On the suitability of hugging face hub for empirical studies
PY  - 2025
AB  - Context: Empirical studies in software engineering mainly rely on the data available on code-hosting platforms, with GitHub being the most representative. Nevertheless, in recent years, the emergence of Machine Learning (ML) has led to the development of platforms specifically designed for hosting ML-based projects, with Hugging Face Hub (HFH) as the most popular one. So far, there have been no studies evaluating the potential of HFH for such studies. Objective: We aim to perform an exploratory study of the current state of HFH and its suitability to be used as a source platform for empirical studies. Method: We conduct a qualitative and quantitative analysis of HFH. The former will be performed by comparing the features of HFH with those of other code-hosting platforms, such as GitHub and GitLab. The latter will be performed by analyzing the data available in HFH. Results: We propose a feature framework to characterize HFH and report on the current usage of the platform, both in terms of number and types of projects (and surrounding community) and the features they mostly rely on. Conclusions: The results confirm that HFH offers enough features and diverse enough data to be the source of relevant empirical studies on the development, evolution and usage of AI-related projects. The results also triggered a discussion on aspects of HFH that should be considered when performing such empirical studies.
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 30
IS  - 2
PB  -
DO  - 10.1007/s10664-024-10608-8
ER  -

TY  - JOUR
AU  - Almarimi, Nuri; Ouni, Ali; Chouchen, Moataz; Mkaouer, Mohamed Wiem
TI  - Improving the detection of community smells through socio‐technical and sentiment analysis
PY  - 2022
AB  - Open source software development is regarded as a collaborative activity in which developers interact to build a software product. Such a human collaboration is described as an organized effort of the “social” activity of organizations, individuals, and stakeholders, which can affect the development community and the open source project health. Negative effects of the development community manifest typically in the form of community smells, which represent symptoms of organizational and social issues within the open source software development community that often lead to additional project costs and reduced software quality. Recognizing the advantages of the early detection of potential community smells in a software project, we introduce a novel approach that learns from various community organizational, social, and emotional aspects to provide an automated support for detecting community smells. In particular, our approach learns from a set of interleaving organizational–social and emotional symptoms that characterize the existence of community smell instances in a software project. We build a multi‐label learning model to detect 10 common types of community smells. We use the ensemble classifier chain (ECC) model that transforms multi‐label problems into several single‐label problems, which are solved using genetic programming (GP) to find the optimal detection rules for each smell type. To evaluate the performance of our approach, we conducted an empirical study on a benchmark of 143 open source projects. The statistical tests of our results show that our approach can detect community smells with an average F‐measure of 93%, achieving a better performance compared to different state‐of‐the‐art techniques. Furthermore, we investigate the most influential community‐related metrics to identify each community smell type.
SP  - NA
EP  - NA
JF  - Journal of Software: Evolution and Process
VL  - 35
IS  - 6
PB  -
DO  - 10.1002/smr.2505
ER  -

TY  - CHAP
AU  - Kim, Hee-Woong; Chan, Hock Chuan; Lee, So-Hyun
TI  - User Resistance to Software Migration
PY  - 2015
AB  - The demand for software has increased rapidly in the global industrial environment. Open source software (OSS) has exerted significant impact on the software industry. Large amounts of resources and effort have been devoted to the development of OSS such as Linux. Based on the technology adoption model (TAM), the development of Linux as the most well-known OSS with a graphical user interface designed for ease of use and a wide range of functionalities is expected to result in high levels of Linux adoption by individual users. Linux, however, currently controls about 1% of the operating system market for personal computers. The resistance of users to switch to a new operating system remains one of the major obstacles to widespread adoption of Linux among individual users. Based on the integration of the equity implementation model and the TAM, this study examines the formation of user resistance, as well as the effects of user resistance, on the migration to Linux for personal computers. This study discusses the role and effect of user resistance based on the equity implementation model in comparison with the two main determinants in the TAM. This study contributes to the advancement of theoretical understanding of Linux migration and user resistance. The findings also offer suggestions for software communities and practitioners, of OSS in particular, to promote the use of new software by individual users.
SP  - 506
EP  - 527
JF  - Open Source Technology
VL  - NA
IS  - NA
PB  -
DO  - 10.4018/978-1-4666-7230-7.ch028
ER  -

TY  - JOUR
AU  - Dong, John Qi; Götz, Sebastian Johannes
TI  - Project leaders as boundary spanners in open source software development: A resource dependence perspective
PY  - 2020
AB  - Digital social innovation is important for addressing various social needs, especially from those who are economically disadvantaged. For instance, open source software (OSS) is developed by mass collaboration on digital communities to provide software users free alternatives to commercial products. OSS is particularly valuable to meet the needs of numerous disadvantaged users for whom proprietary software is not affordable. While OSS projects lack formal organizational structure, project leaders play a significant role in initiating and managing these projects and eventually, influencing the degree to which the developed software is used and liked by users. Drawing on resource dependence theory, we investigate the impacts of two team‐level characteristics of OSS project leaders (ie, size and tenure) on how well the developed software can address users' needs, with regard to the quantity of software being used by users and the quality of software to users' satisfaction. Further, from a resource dependence perspective, we examine the moderating role of project leaders' network ties in shaping the contingency of these effects. By using a large‐scale dataset from 43 048 OSS development projects in SourceForge community, we find empirical evidence corroborating our theory. Taken together, our findings suggest the boundary‐spanning role of project leaders in developing digital social innovation.
SP  - 672
EP  - 694
JF  - Information Systems Journal
VL  - 31
IS  - 5
PB  -
DO  - 10.1111/isj.12313
ER  -

TY  - JOUR
AU  - Jayasuriya, Dhanushka; Terragni, Valerio; Dietrich, Jens; Blincoe, Kelly
TI  - Understanding the Impact of APIs Behavioral Breaking Changes on Client Applications
PY  - 2024
AB  - Libraries play a significant role in software development as they provide reusable functionality, which helps expedite the development process. As libraries evolve, they release new versions with optimisations like new functionality, bug fixes, and patches for known security vulnerabilities. To obtain these optimisations, the client applications that depend on these libraries must update to use the latest version. However, this can cause software failures in the clients if the update includes breaking changes. These breaking changes can be divided into syntactic and semantic (behavioral) breaking changes. While there has been considerable research on syntactic breaking changes introduced between library updates and their impact on client projects, there is a notable lack of research regarding behavioral breaking changes introduced during library updates and their impacts on clients.
We conducted an empirical analysis to identify the impact behavioral breaking changes have on clients by examining the impact of dependency updates on client test suites. We examined a set of Java projects built using Maven, which included 30,548 dependencies under 8,086 Maven artifacts. We automatically updated out-of-date dependencies and ran the client test suites. We found that 2.30% of these updates had behavioral breaking changes that impacted client tests. Our results show that most breaking changes were introduced during a non-Major dependency update, violating the semantic versioning scheme. We further analyzed the effects these behavioral breaking changes have on client tests. We present a taxonomy of effects related to these changes, which we broadly categorize as Test Failures and Test Errors. Our results further indicate that the library developers did not adequately document the exceptions thrown due to precondition violations.
SP  - 1238
EP  - 1261
JF  - Proceedings of the ACM on Software Engineering
VL  - 1
IS  - FSE
PB  -
DO  - 10.1145/3643782
ER  -

TY  - JOUR
AU  - Assavakamhaenghan, Noppadol; Wattanakriengkrai, Supatsara; Shimada, Naomichi; Kula, Raula Gaikovina; Ishio, Takashi; Matsumoto, Kenichi
TI  - Does the first response matter for future contributions? A study of first contributions
PY  - 2023
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 28
IS  - 3
PB  -
DO  - 10.1007/s10664-023-10299-7
ER  -

TY  - BOOK
AU  - Gakh, Dmitriy
TI  - ISM@FedCSIS - Street Addressing System as an Essential Component in Sustainable Development and Realization of Smart City Conception
PY  - 2021
AB  - Although Street Addressing (SA) seems to be a well-known conception, there is misunderstanding of it and its contribution to the Urban Economics, Smart Cities (SC) initiatives, and Sustainable Development (SD). The role of SA in planning, building, and operation of SC remains unclear. Considering SA as a socio-technical system and underlying part of ICT services allowed formulating peculiarities of SA system towards the implementation of the SC conception [1]. This paper expands the SA for SC research and includes considering SA contribution to SD. Discovery of indirect influence of SA peculiarities to SD is the main and the most important result of this article. This discovery can be considered as an example of SA Quality Assurance for actual SA program and in some extent fills the research gap in SA methodology. The goal of this article is an analysis of how SA affects SC initiatives and SD. Uniqueness of this research is concluded in analysis of SA as a urban infrastructure component through the prism of Software Quality that enables building quality SC services. This approach can be considered as a contribution to science because it shows a possibility to synthesize the ICT and urban management/civil engineering. It also highlights a role of the government in developing and implementing the SA programs as well as issues of integration of SA with urban infrastructure identification and inventory systems. The only limitation could be the use of the available SC literature that is not sufficient to confidently draw long-term conclusions.
SP  - 127
EP  - 145
JF  - Lecture Notes in Business Information Processing
VL  - NA
IS  - NA
PB  -
DO  - 10.1007/978-3-030-71846-6_7
ER  -

TY  - JOUR
AU  - Li, Zhixing; Yu, Yue; Wang, Tao; Yin, Gang; Li, Shanshan; Wang, Huaimin
TI  - Are You Still Working on This? An Empirical Study on Pull Request Abandonment
PY  - 2021
AB  - The great success of numerous community-based open source software (OSS) is based on volunteers continuously submitting contributions, but ensuring sustainability is a persistent challenge in OSS communities. Although the motivations behind and barriers to OSS contributors' joining and retention have been extensively studied, the impacts of, reasons for and solutions to contribution abandonment at the individual level have not been well studied, especially for pull-based development. To bridge this gap, we present an empirical study on pull request abandonment based on a sizable dataset. We manually examine 321 abandoned pull requests on GitHub and then quantify the manual observations by surveying 710 OSS developers. We find that while the lack of integrators' responsiveness, the lack of contributors' time and interest remain the main reasons that deter contributors from participation, limitations during the processes of patch updating and consensus reaching can also cause abandonment. We also show the significant impacts of pull request abandonment on project management and maintenance. Moreover, we elucidate the strategies used by project integrators to cope with abandoned pull requests and highlight the need for a practical handover mechanism. We discuss the actionable suggestions and implications for OSS practitioners and tool builders, which can help to upgrade the infrastructure and optimize the mechanisms of OSS communities.
SP  - 1
EP  - 1
JF  - IEEE Transactions on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Sotiropoulos, Thodoris; Mitropoulos, Dimitris; Spinellis, Diomidis
TI  - Detecting Missing Dependencies and Notifiers in Puppet Programs.
PY  - 2019
AB  - Puppet is a popular computer system configuration management tool. It provides abstractions that enable administrators to set up their computer systems declaratively. Its use suffers from two potential pitfalls. First, if ordering constraints are not specified whenever an abstraction depends on another, the non-deterministic application of abstractions can lead to race conditions. Second, if a service is not tied to its resources through notification constructs, the system may operate in a stale state whenever a resource gets modified. Such faults can degrade a computing infrastructure's availability and functionality.
We have developed an approach that identifies these issues through the analysis of a Puppet program and its system call trace. Specifically, we present a formal model for traces, which allows us to capture the interactions of Puppet abstractions with the file system. By analyzing these interactions we identify (1) abstractions that are related to each other (e.g., operate on the same file), and (2) abstractions that should act as notifiers so that changes are correctly propagated. We then check the relationships from the trace's analysis against the program's dependency graph: a representation containing all the ordering constraints and notifications declared in the program. If a mismatch is detected, our system reports a potential fault.
We have evaluated our method on a large set of Puppet modules, and discovered 57 previously unknown issues in 30 of them. Benchmarking further shows that our approach can analyze in minutes real-world configurations with a magnitude measured in thousands of lines and millions of system calls.
SP  - NA
EP  - NA
JF  - arXiv: Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Jin, Siyuan; Li, Ziyuan; Chen, Bichao; Zhu, Bing; Xia, Yong
TI  - Software Code Quality Measurement: Implications from Metric Distributions
PY  - 2023
AB  - Software code quality is a construct with three dimensions: maintainability, reliability, and functionality. Although many firms have incorporated code quality metrics in their operations, evaluating these metrics still lacks consistent standards. We categorized distinct metrics into two types: 1) monotonic metrics that consistently influence code quality; and 2) non-monotonic metrics that lack a consistent relationship with code quality. To consistently evaluate them, we proposed a distribution-based method to get metric scores. Our empirical analysis includes 36,460 high-quality open-source software (OSS) repositories and their raw metrics from SonarQube (https://www.sonarsource.com) and CK (https://github.com/mauricioaniche/ck). The evaluated scores demonstrate great explainability on software adoption. Our work contributes to the multidimensional construct of code quality and its metric measurements, which provides practical implications for consistent measurements on both monotonic and non-monotonic metrics.
SP  - 488
EP  - 496
JF  - 2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security (QRS)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/qrs60937.2023.00054
ER  -

TY  - NA
AU  - Almeida, Aylton; Xavier, Laerte; Valente, Marco Tulio
TI  - Automatic Library Migration Using Large Language Models: First Results
PY  - 2024
AB  - Despite being introduced only a few years ago, Large Language Models (LLMs) are already widely used by developers for code generation. However, their application in automating other Software Engineering activities remains largely unexplored. Thus, in this paper, we report the first results of a study in which we are exploring the use of ChatGPT to support API migration tasks, an important problem that demands manual effort and attention from developers. Specifically, in the paper, we share our initial results involving the use of ChatGPT to migrate a client application to use a newer version of SQLAlchemy, an ORM (Object Relational Mapping) library widely used in Python. We evaluate the use of three types of prompts (Zero-Shot, One-Shot, and Chain Of Thoughts) and show that the best results are achieved by the One-Shot prompt, followed by the Chain Of Thoughts. Particularly, with the One-Shot prompt we were able to successfully migrate all columns of our target application and upgrade its code to use new functionalities enabled by SQLAlchemy's latest version, such as Python's asyncio and typing modules, while preserving the original code behavior.
SP  - 427
EP  - 433
JF  - Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3674805.3690746
ER  -

TY  - JOUR
AU  - Lundell, Bjorn; Butler, Simon; Fischer, Thomas; Gamalielsson, Jonas; Brax, Christoffer; Feist, Jonas; Gustavsson, Tomas; Katz, Andrew; Kvarnstrom, Bengt; Lonroth, Erik; Mattsson, Anders
TI  - Effective Strategies for Using Open Source Software and Open Standards in Organizational Contexts: Experiences From the Primary and Secondary Software Sectors
PY  - 2022
AB  - NA
SP  - 84
EP  - 92
JF  - IEEE Software
VL  - 39
IS  - 1
PB  -
DO  - 10.1109/ms.2021.3059036
ER  -

TY  - NA
AU  - Mahmood, Ahmad Kamil; Soofi, Aized Amin; Khan, M. Irfan; Iskandar, Bandar Seri
TI  - Using Open Source Software in Reuse-Intensive Software Development - A Qualitative Study
PY  - 2013
AB  - Open Source Software (OSS) is one of the emerging areas in software engineering. Reuse of OSS is employed in reuse-intensive software development such as Component Based Software Development and Software Product Lines. OSS is gaining the interest of the software development community due to its enormous benefits. The context of this study is the use of OSS in reuse-intensive software development. The use of OSS in the systematic reuse of software, such as in Software Product Lines (SPLs) is a new phenomenon. The aim of this study is to identify the different dimensions of this phenomenon. In this study, a qualitative method, namely the interview, is used to acquire data. Interviews are conducted with seven respondents. The data is analyzed using an adapted form of grounded theory. The results of this study include seven categories and their 39 subcategories / dimensions. The results of the study are compared with contemporary studies in this area to highlight the contributions and to complement them. The findings of this study provide an in-depth view of the issues related to the use of OSS in reuse-intensive software development. These findings will help the community to improve their practices and to initiate steps to cope with the challenges.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - München, Ludwig-Maximilians-Universität
TI  - DO DEVELOPERS MAKE UNBIASED DECISIONS? - THE EFFECT OF MINDFULNESS AND NOT-INVENTED-HERE BIAS ON THE ADOPTION OF SOFTWARE COMPONENTS
PY  - 2015
AB  - Software reuse can lower costs and increase the quality of software development. Despite a large body of research focused on technical and organisational factors, there is still limited research on the software developers’ perspective regarding software component reuse. Therefore, this paper investigates the developers’ adoption intention to use existing software components. Information systems adoption research has extensively focused on the technological aspects and less on the individual factors. However, studying these individual differences is important, as research has shown that individuals do not always behave according to rational assumptions. This study analyses the adoption of software components based on the unified theory of acceptance and use of technology and extends the research model by integrating the not-invented-here bias and the concept of mindfulness to account for individual differences. An empirical study with 142 software developers was conducted to empirically validate the research model. The results show that performance expectancy, social influence and not-invented-here bias play an important role in the developers’ decision to adopt software components. Furthermore, findings show that a mindfulness state has a negative influence on the not-invented-here bias and it directly affects the intention to adopt existing software components.
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - NA
ER  -

TY  - NA
AU  - Jullien, Nicolas; Viseur, Robert; Zimmermann, Jean-Benoit
TI  - Open-Source Business Models: Beneficiaries and Drivers of Floss Project Dynamics
PY  - 2024
AB  - NA
SP  - NA
EP  - NA
JF  - NA
VL  - NA
IS  - NA
PB  -
DO  - 10.2139/ssrn.4889813
ER  -

TY  - NA
AU  - Constantino, Kattiana; Figueiredo, Eduardo
TI  - CoopFinder: Finding Collaborators Based on Co–Changed Files
PY  - 2022
AB  - Successful software projects require engaged collaborators interacting with each other across the entire development life-cycle. Unfortunately, for social coding platforms, e.g., GitHub, identifying a suitable collaborator to strengthen their ties and improve their engagement in the project is challenging, given that reliable information for collaborator identification is often not readily available. In this work, we propose and evaluate a collaborator recommendation tool - CoopFinder - to help developers find collaborators in a specific project based on similar interests related to co–changed files. We design our Web prototype–tool using visualization techniques. As a result of the preliminary user study, we observed that 95% of the participants had positive impressions of the tool. Furthermore, 75% of participants stated that they would use the tool in their daily lives or recommend it to others. Repository: https://github.com/kattiana/CoopFinder
SP  - 1
EP  - 3
JF  - 2022 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)
VL  - NA
IS  - NA
PB  -
DO  - 10.1109/vl/hcc53370.2022.9833126
ER  -

TY  - JOUR
AU  - He, Runzhi; He, Hao; Zhang, Yuxia; Zhou, Minghui
TI  - Automating Dependency Updates in Practice: An Exploratory Study on GitHub Dependabot
PY  - 2023
AB  - Dependency management bots automatically open pull requests to update software dependencies on behalf of developers. Early research shows that developers are suspicious of updates performed by dependency management bots and feel tired of overwhelming notifications from these bots. Despite this, dependency management bots are becoming increasingly popular. Such contrast motivates us to investigate Dependabot, currently the most visible bot on GitHub, to reveal the effectiveness and limitations of state-of-art dependency management bots. We use exploratory data analysis and a developer survey to evaluate the effectiveness of Dependabot in keeping dependencies up-to-date, interacting with developers, reducing update suspicion, and reducing notification fatigue. We obtain mixed findings. On the positive side, projects do reduce technical lag after Dependabot adoption and developers are highly receptive to its pull requests. On the negative side, its compatibility scores are too scarce to be effective in reducing update suspicion; developers tend to configure Dependabot toward reducing the number of notifications; and 11.3% of projects have deprecated Dependabot in favor of other alternatives. The survey confirms our findings and provides insights into the key missing features of Dependabot. Based on our findings, we derive and summarize the key characteristics of an ideal dependency management bot which can be grouped into four dimensions: configurability, autonomy, transparency, and self-adaptability.
SP  - 4004
EP  - 4022
JF  - IEEE Transactions on Software Engineering
VL  - 49
IS  - 8
PB  -
DO  - 10.1109/tse.2023.3278129
ER  -

TY  - NA
AU  - Fang, Hongbo; Herbsleb, James; Vasilescu, Bogdan
TI  - Novelty Begets Popularity, But Curbs Participation - A Macroscopic View of the Python Open-Source Ecosystem
PY  - 2024
AB  - Who creates the most innovative open-source software projects? And what fate do these projects tend to have? Building on a long history of research to understand innovation in business and other domains, as well as recent advances towards modeling innovation in scientific research from the science of science field, in this paper we adopt the analogy of innovation as emerging from the novel recombination of existing bits of knowledge. As such, we consider as innovative the software projects that recombine existing software libraries in novel ways, i.e., those built on top of atypical combinations of packages as extracted from import statements. We then report on a large-scale quantitative study of innovation in the Python open-source software ecosystem. Our results show that higher levels of innovativeness are statistically associated with higher GitHub star counts, i.e., novelty begets popularity. At the same time, we find that controlling for project size, the more innovative projects tend to involve smaller teams of contributors, as well as be at higher risk of becoming abandoned in the long term. We conclude that innovation and open source sustainability are closely related and, to some extent, antagonistic.
SP  - 1
EP  - 11
JF  - Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
VL  - NA
IS  - NA
PB  -
DO  - 10.1145/3597503.3608142
ER  -

TY  - CHAP
AU  - Capiluppi, Andrea; Stol, Klaas-Jan; Boldyreff, Cornelia
TI  - Software Reuse in Open Source: A Case Study
PY  - 2014
AB  - A promising way to support software reuse is based on Component-Based Software Development (CBSD). Open Source Software (OSS) products are increasingly available that can be freely used in product development. However, OSS communities still face several challenges before taking full advantage of the “reuse mechanism”: many OSS projects duplicate effort, for instance when many projects implement a similar system in the same application domain and in the same topic. One successful counter-example is the FFmpeg multimedia project; several of its components are widely and consistently reused in other OSS projects. Documented is the evolutionary history of the various libraries of components within the FFmpeg project, which presently are reused in more than 140 OSS projects. Most use them as black-box components; although a number of OSS projects keep a localized copy in their repositories, eventually modifying them as needed (white-box reuse). In both cases, the authors argue that FFmpeg is a successful project that provides an excellent exemplar of a reusable library of OSS components.
SP  - 1900
EP  - 1926
JF  - Software Design and Development
VL  - NA
IS  - NA
PB  -
DO  - 10.4018/978-1-4666-4301-7.ch090
ER  -

TY  - JOUR
AU  - Murali, Aniruddhan; Sahu, Gaurav; Thangarajah, Kishanthan; Zimmerman, Brian; Rodríguez-Pérez, Gema; Nagappan, Meiyappan
TI  - Diversity in issue assignment: humans vs bots
PY  - 2024
AB  - NA
SP  - NA
EP  - NA
JF  - Empirical Software Engineering
VL  - 29
IS  - 2
PB  -
DO  - 10.1007/s10664-023-10424-6
ER  -

TY  - JOUR
AU  - Yang, Shouguo; Dong, Chaopeng; Xiao, Yang; Cheng, Yiran; Shi, Zhiqiang; Li, Zhi; Sun, Limin
TI  - Asteria-Pro: Enhancing Deep Learning-based Binary Code Similarity Detection by Incorporating Domain Knowledge
PY  - 2023
AB  - Widespread code reuse allows vulnerabilities to proliferate among a vast variety of firmware. There is an urgent need to detect these vulnerable codes effectively and efficiently. By measuring code similarities, AI-based binary code similarity detection is applied to detecting vulnerable code at scale. Existing studies have proposed various function features to capture the commonality for similarity detection. Nevertheless, the significant code syntactic variability induced by the diversity of IoT hardware architectures diminishes the accuracy of binary code similarity detection. In our earlier study and the tool Asteria, we adopted a Tree-LSTM network to summarize function semantics as function commonality, and the evaluation result indicates an advanced performance. However, it still has utility concerns due to excessive time costs and inadequate precision while searching for large-scale firmware bugs. To this end, we propose a novel deep learning-enhancement architecture by incorporating domain knowledge-based pre-filtration and re-ranking modules, and we develop a prototype named Asteria-Pro based on Asteria. The pre-filtration module eliminates dissimilar functions, thus reducing the subsequent deep learning-model calculations. The re-ranking module boosts the rankings of vulnerable functions among candidates generated by the deep learning model. Our evaluation indicates that the pre-filtration module cuts the calculation time by 96.9%, and the re-ranking module improves MRR and Recall by 23.71% and 36.4%, respectively. By incorporating these modules, Asteria-Pro outperforms existing state-of-the-art approaches in the bug search task by a significant margin. Furthermore, our evaluation shows that embedding baseline methods with pre-filtration and re-ranking modules significantly improves their precision. We conduct a large-scale real-world firmware bug search, and Asteria-Pro manages to detect 1,482 vulnerable functions with a high precision 91.65%.
SP  - 1
EP  - 40
JF  - ACM Transactions on Software Engineering and Methodology
VL  - 33
IS  - 1
PB  -
DO  - 10.1145/3604611
ER  -

TY  - JOUR
AU  - Chen, Hongzhou; Cai, Wei
TI  - A Comparative Analysis of Centralized and Decentralized Developer Autonomous Organizations Managing Conflicts in Discussing External Crises
PY  - 2024
AB  - NA
SP  - 8118
EP  - 8129
JF  - IEEE Transactions on Computational Social Systems
VL  - 11
IS  - 6
PB  -
DO  - 10.1109/tcss.2023.3247464
ER  -

TY  - JOUR
AU  - Wang, Ying; Sun, Peng; Pei, Lin; Yu, Yue; Xu, Chang; Cheung, Shing-Chi; Yu, Hai; Zhu, Zhiliang
TI  - Plumber: Boosting the Propagation of Vulnerability Fixes in the npm Ecosystem
PY  - 2023
AB  - Vulnerabilities are known reported security threats that affect a large amount of packages in the npm ecosystem. To mitigate these security threats, the open-source community strongly suggests vulnerable packages to timely publish vulnerability fixes and recommends affected packages to update their dependencies. However, there are still serious lags in the propagation of vulnerability fixes in the ecosystem. In our preliminary study on the latest versions of 356,283 active npm packages, we found that 20.0% of them can still introduce vulnerabilities via direct or transitive dependencies although the involved vulnerable packages have already published fix versions for over a year. Prior study by (Chinthanet et al. 2021) lays the groundwork for research on how to mitigate propagation lags of vulnerability fixes in an ecosystem. They conducted an empirical investigation to identify lags that might occur between the vulnerable package release and its fixing release. They found that factors such as the branch upon which a fix landed and the severity of the vulnerability had a small effect on its propagation trajectory throughout the ecosystem. To ensure quick adoption and propagation of a release that contains the fix, they gave several actionable advice to developers and researchers. However, it is still an open question how to design an effective technique to accelerate the propagation of vulnerability fixes. Motivated by this problem, in this paper, we conducted an empirical study to learn the scale of packages that block the propagation of vulnerability fixes in the ecosystem and investigate their evolution characteristics. Furthermore, we distilled the remediation strategies that have better effects on mitigating the fix propagation lags. Leveraging our empirical findings, we propose an ecosystem-level technique, Plumber, for deriving feasible remediation strategies to boost the propagation of vulnerability fixes. To precisely diagnose the causes of fix propagation blocking, Plumber models the vulnerability metadata, and npm dependency metadata and continuously monitors their evolution. By analyzing a full-picture of the ecosystem-level dependency graph and the corresponding fix propagation statuses, it derives remediation schemes for pivotal packages. In the schemes, Plumber provides customized remediation suggestions with vulnerability impact analysis to arouse package developers’ awareness. We applied Plumber to generating 268 remediation reports for the identified pivotal packages, to evaluate its remediation effectiveness based on developers’ feedback. Encouragingly, 47.4% our remediation reports received positive feedback from many well-known npm projects, such as Tensorflow/tfjs, Ethers.js, and GoogleChrome/workbox. Our reports have boosted the propagation of vulnerability fixes into 16,403 root packages through 92,469 dependency paths. On average, each remediated package version is receiving 72,678 downloads per week by the time of this work.
SP  - 3155
EP  - 3181
JF  - IEEE Transactions on Software Engineering
VL  - 49
IS  - 5
PB  -
DO  - 10.1109/tse.2023.3243262
ER  -