diff --git a/Ch.00_Abstract.tex b/Ch.00_Abstract.tex index 3dbcb64..aaba617 100644 --- a/Ch.00_Abstract.tex +++ b/Ch.00_Abstract.tex @@ -12,19 +12,19 @@ \begin{otherlanguage}{english} \begin{abstract} - \textbf{Context:} Public copyright licenses (PCL) play a major part in software engineering. For example in open source there must be an appropriate PCL attached to the source code in order for open-source software to be freely available for possible modification and redistribution. Understanding PCLs can be difficult. This could stem from the legal nature of the license texts and the large number of already-existing PCLs. sub-problem here + \textbf{Context:} Public copyright licenses (PCLs) are central to the distribution of works in software engineering. For example, in open source there must be an appropriate PCL attached to the source code in order for open-source software to be freely available for possible modification and redistribution. Understanding PCLs can be difficult. This could stem from the legal nature of the license texts and the large number of already-existing PCLs. As a result, some actions taken within the boundaries of the PCLs may come as a surprise to the public. - \textbf{Objective:} Thesis' contribution to solution here. + \textbf{Objective:} The primary goal of this research is to conduct a multivocal literature review of the current state of PCLs in software engineering, the evaluation of them and the evidence level of the research. The research aims to provide a novel perspective on relevant licenses and to extract key findings through a rigorous literature review process. This study has two main viewpoints: to provide rigorous research on PCLs to the academic field and to provide insights on PCLs to the professional field of software engineering. The grand goal of this thesis is to raise awareness of the importance of PCLs so that more licensers make mindful, correct choices based on their situations and needs.
- \textbf{Method:} Tell about the research method here. + \textbf{Method:} The search strategy examined 6666 sources, found through websites that list PCLs and through ad hoc searches. Applying inclusion and exclusion criteria resulted in the selection of 666 sources, which made relevant contributions related to PCLs in software engineering. - \textbf{Results:} Tell about the results here. + \textbf{Results:} - \textbf{Conclusions:} Tell about the discussion here. + \textbf{Conclusions:} \end{abstract} \end{otherlanguage} \section*{Acknowledgements} -much love to suvi, artemis, sami nurmivaara, prof männistö and mika mäntylä +much love to suvi, artemis, sami nurmivaara, prof männistö and prof mäntylä dedicated to suvi <3 \ No newline at end of file diff --git a/Ch.10_Introduction.tex b/Ch.10_Introduction.tex index 38a4d5e..83c3a64 100644 --- a/Ch.10_Introduction.tex +++ b/Ch.10_Introduction.tex @@ -1,6 +1,6 @@ \chapter{Introduction\label{intro}} -PCLs play a major part in software engineering. For example in open source there must be an appropriate PCL attached to the source code in order for open-source software to be freely available for possible modification and redistribution. Because open source is central to software engineering the licenses enabling open source must also be considered important in the same context. +PCLs are central to the distribution of works in software engineering. For example, in open source there must be an appropriate PCL attached to the source code in order for open-source software to be freely available for possible modification and redistribution. Because open source is central to software engineering, the licenses enabling open source must also be considered important in the same context.
Public copyright license is defined by Wikipedia with the following words \citep{wiki:publiclicenses}: \begin{quote} @@ -18,10 +18,10 @@ \chapter{Introduction\label{intro}} On top of PCL details, software engineers in general have a tough time understanding the basic goals of PCLs used in software engineering. In the instance of the RHEL incident it would not have been a big surprise to software engineers if they would have known about other licenses and what they try to achieve or how old is GPLv2 and why it has been succeeded by GNU General Public License version 3 (GPL-3.0). -This thesis' goal is to contribute into the solving these problems in a structured manner. First we state definitions and terminology used in the scope of this thesis. We go over the reasons why there does not exist consistent terminology in this area and why the conversely the definitions are the most stabile ones in this area. Second we take a deep dive into the multivocal literature through a systematic literature review. To make more information available, a mapping study connected to the terminology scope defined in the first step is needed. Third includes our own suggestions and basic knowledge for professionals and academics in the industry to enhance the understanding of PCLs in software engineering. This step also includes discussion of the future research and contributes to stablizing the terminology and reinforcing the already-existing definitions in the academic field. +This thesis' goal is to contribute to solving these problems in a structured manner. First, we state the definitions and terminology used in the scope of this thesis. We go over the reasons why consistent terminology does not exist in this area and why, conversely, the definitions are the most stable ones in this area. Second, we take a deep dive into PCLs through a multivocal literature review.
To make more information available, a mapping study connected to the terminology scope defined in the first step is needed. Third, we present our own suggestions and basic knowledge for professionals and academics in the industry to enhance the understanding of PCLs in software engineering. This step also includes a discussion of future research and contributes to stabilizing the terminology and reinforcing the already-existing definitions in the academic field. \section{Research goal, questions and contributions} -The secondary goal of this research is to conduct a multivocal literature review of the current state of PCLs in software engineering, the evaluation of the them and the evidence level of the research. The research aims to provide a novel perspective on relevant licenses and to extract key findings through a rigorous literature review process. The research questions of the review are: +The primary goal of this research is to conduct a multivocal literature review of the current state of PCLs in software engineering, the evaluation of them and the evidence level of the research. The research aims to provide a novel perspective on relevant licenses and to extract key findings through a rigorous literature review process. The research questions of the review are: \begin{itemize} \item RQ1: How many PCLs are there in software engineering? \item RQ2: What is the average length of a PCL in software engineering? @@ -32,14 +32,20 @@ \section{Research goal, questions and contributions} Terms such as open source, source code, software freedom and other vocabulary must be defined in the scope of this thesis. \hyperref[sec:bg]{Section 1.3} will examine this plethora of of terminology and definitions and will be used to establish a sound basis for discussing this broad subject. -This study has two goals: to provide rigorous research on PCLs to the academic field and to provide insights to the professional field of software engineering on PCLs.
The grand goal of this thesis is to raise awareness of the importance of PCLs so that more licensers would make the correct choices based on their situations and needs in a mindful way. +This study has two main viewpoints: to provide rigorous research on PCLs to the academic field and to provide insights to the professional field of software engineering on PCLs. The grand goal of this thesis is to raise awareness of the importance of PCLs so that more licensers would make the correct choices based on their situations and needs in a mindful way. \section{Thesis structure} This thesis follows the IMRaD structure. \hyperref[intro]{Chapter 1} introduces the problem, this thesis' possible contributions and some further background. \hyperref[methods]{Chapter 2} goes over the process and the methods of the multivocal literature review. This is where most of the actual research takes place in. \hyperref[results]{Chapter 3} presents results to the research questions. \hyperref[discussion]{Chapter 4} discusses implications for research. The chapter also discusses software engineering professionals in the thesis' context and the validity of the thesis' research. \hyperref[conclusions]{Chapter 5} concludes this thesis with the help of the research questions and the future of the research. \section{Background and terminology of PCLs} \label{sec:bg} -The current terminology is used with different definitions which leads to inconsistencies in the field of software engineering. For example The Open Source Initiative (OSI) classifies GPL-3.0 under the term ''open source'' whereas the Free Software Foundation (FSF) classifies GPL-3.0 under the term ''free software'' \citep{osi:gplv3}\citep{rms:opensource}. This is because their definitions on open source and free software differ from each other. Some parts of the two definitions are even mutually exclusive. 
This is rarely mentioned when people talk about Free and Open Source Software (FOSS) or Free / Libre and Open Source Software (FLOSS) which leads to misunderstanding that the two approaches are the same. This is why our focus will be PCLs in software engineering, which distinguishes our investigation from the broader topic of PCLs or the copyright law. This includs also PCLs that are not approved by the FSF nor OSI hence not falling under the group of FLOSS licenses. In this section we aim to increase the accessibility of our discussion by providing a concise overview of the background of the field of PCLs and the terms we employ. +The current terminology is used with different definitions, which leads to inconsistencies in the field of software engineering. For example, the Open Source Initiative (OSI) classifies GPL-3.0 under the term ``open source'' whereas the Free Software Foundation (FSF) classifies GPL-3.0 under the term ``free software'' \citep{osi:gplv3}\citep{rms:opensource}. This is because their definitions of open source and free software differ from each other. Some parts of the two definitions are even mutually exclusive. This is rarely mentioned when people talk about Free and Open Source Software (FOSS) or Free / Libre and Open Source Software (FLOSS), which leads to the misunderstanding that the two approaches are the same. This is why our focus will be PCLs in software engineering, which distinguishes our investigation from the broader topic of PCLs or copyright law. This also includes PCLs that are approved by neither the FSF nor the OSI and hence do not fall under the group of FLOSS licenses. The term ``copyleft'' is defined by \cite{mustonen2003} in the following way: +\begin{quote} + ``Copyleft is a novel licensing scheme. It facilitates open and decentralized software development.
Its key feature is that once a program is licensed by the inventor, the subsequent programs based on the original must also be licensed similarly.'' +\end{quote} +This is why the term is often used in the context of free software. + +In this section we aim to increase the accessibility of our discussion by providing a concise overview of the background of the field of PCLs and the terms we employ. To explain our emphasis on PCLs in software engineering, it is essential to examine the other possible areas of interest in PCLs. Our study classifies such efforts into eight domains as mentioned by the GNU Project \citep{gnu:licenselist}. diff --git a/Ch.20_Methods.tex b/Ch.20_Methods.tex index 2e6374f..dbfbf61 100644 --- a/Ch.20_Methods.tex +++ b/Ch.20_Methods.tex @@ -1,9 +1,9 @@ \chapter{Methods\label{methods}} This chapter aims to establish a precisely defined and rigorous research approach to enhance transparency and repeatability. We will take the steps required to ensure that every phase and decision is thoroughly documented, enabling the reader to retrace the research process. In a thesis made by a single researcher the lack of cross-examination of results with multiple researchers and the validation of evaluation criteria for opinion bias pose threats to validity, as will be clarified further in \hyperref[discussion]{Chapter 4}. Therefore, special attention will be paid to address these concerns. By following this approach, this research endeavors to contribute to the existing body of knowledge in the field of computer science in a robust and reliable manner. -The systematic literature review method is a well-established approach for conducting a comprehensive and rigorous analysis of the existing research on specific research question or subject \citep{kitchenham2007}. This method was selected for this study to facilitate a thorough and scientifically interdisciplinary examination of PCLs in software engineering. 
The existing literature consists of PCLs and as such are considered gray literature, making the thesis a multivocal literature review. The method of a systematic literature review is still conducted the same way. +The systematic literature review (SLR) method is a well-established approach for conducting a comprehensive and rigorous analysis of the existing research on a specific research question or subject \citep{kitchenham2007}. This thesis presents a multivocal literature review (MLR). An MLR is an SLR that includes both academic literature (AL) and grey literature (GL). This method was selected for this study to facilitate a thorough and scientifically interdisciplinary examination of PCLs in software engineering. The existing literature consists of PCLs, which are considered grey literature; this is what makes the review multivocal. -This study follows the guidelines outlined by \cite{kitchenham2007}, to ensure its quality. The systematic review method consists of three distinct phases: planning, conducting and reporting the review. This study stricly adhered to this structure. The phases can be further broken down into a research protocol, as illustrated in \hyperref[fig:slrphases]{Figure 2.1}. Adhering to the protocol is the first step in ensuring a well-documented and rigorous process, which increases the validity and auditability of the study. +This study follows the guidelines outlined by \cite{kitchenham2007} to ensure its quality. The multivocal review method consists of three distinct phases: planning, conducting and reporting the review. This study strictly adhered to this structure. The phases can be further broken down into a research protocol, as illustrated in \hyperref[fig:slrphases]{Figure 2.1}. Adhering to the protocol is the first step in ensuring a well-documented and rigorous process, which increases the validity and auditability of the study.
\begin{figure} \centering @@ -12,7 +12,7 @@ \chapter{Methods\label{methods}} \label{fig:slrphases} \end{figure} -The systematic literature review process began with the formulation of research questions and the establishment of a comprehensive search strategy and scope. The search process was conducted by employing a quasi-gold standard (QGS) approach based on the implementation by \cite{qgs}. After the completion of the search process, the inclusion and exclusion criteria were defined, and a strategy was developed for assessing the quality of the multivocal literature that met these criteria. To ensure a structured evaluation of the literature, a data extraction form was created. Finally, a strategy for analyzing the extracted data from the literature was designed. +The multivocal literature review process began with the formulation of research questions and the establishment of a comprehensive search strategy and scope. The search process was conducted by employing a quasi-gold standard (QGS) approach based on the implementation by \cite{qgs}. After the completion of the search process, the inclusion and exclusion criteria were defined, and a strategy was developed for assessing the quality of the multivocal literature that met these criteria. To ensure a structured evaluation of the literature, a data extraction form was created. Finally, a strategy for analyzing the extracted data from the literature was designed. To ensure the reliability and validity of the research protocol, it was validated against similar systematic literature reviews in computer science, the aforementioned guidelines by \cite{kitchenham2007}, and was further refined through an iterative process. Specifically, a subset of the data was tested on (The QGS) and any identified issues or problems were recorded and addressed. The details of this process are explained and thoroughly documented in the following sections. 
Similarly, the same approach was followed for the data extraction process, whereby a subset of literature was tested to refine the data extraction form. The revision of the form was undertaken as necessary to guarantee the completeness and accuracy of the extracted data. @@ -27,7 +27,7 @@ \section{Research questions} \item RQ5: Why do PCLs in software engineering get new versions? \end{itemize} -The systematic literature review in this thesis begins with addressing RQ1, which aims to provide the amount of PCLs that exist in software engineering. The review takes into account attributes like versions, supersedences to a different license family, formal or otherwise and recognizability. These attributes give us different amounts to existing PCLs in software engineering. This information could be most valuable for the practicioners out of all the research questions in the thesis since it could give some sense of the scale when picking a PCL that would serve the practicioners' needs the best. +The multivocal literature review in this thesis begins by addressing RQ1, which aims to provide the number of PCLs that exist in software engineering. The review takes into account attributes like versions, supersedences to a different license family, formal or otherwise, and recognizability. These attributes yield different counts of existing PCLs in software engineering. Of all the research questions in the thesis, this one could be the most valuable for practitioners, since it could give some sense of scale when picking the PCL that would best serve their needs. Next RQ2 seeks to find the average length of the text of a PCL in software engineering. This research question has attributes like the number of characters, sentences, distinct sections and the size of the license on a computer screen.
This information could be valuable for the practicioners mentioned in the previous parapgrah for the same reasons of getting a better overview of the PCLs in software engineering. The research questions could also be beneficial for the practicioners working directly within the meta plane of PCLs in software engineering. Let us refer to the latter as researchers. @@ -40,8 +40,10 @@ \section{Search stragey} \subsection{Search method} The search was conducted on Google Search, as mentioned earlier, to obtain a broad set of multivocal literature. This approach yielded a large number of literature that were processed to a subset of high-relevance literature using exclusion and quality criteria presented later in this chapter. Manual searching of databases with thousands of PCLs is not feasible, and it is prone to researcher bias and may overlook relevant venues from other scientific disciplines. However, a preliminary manual search was performed to reduce the number of iterations required and establish the quasi-gold standard (QGS) mentioned earlier. \subsection{Search scope and terms} -The search terms for this study were determined through an iterative process that took into account the research questions and topic. Synonyms for key terms were included and combined using Boolean logic to form a comprehensive search string. The search function on the search engine Google Search served as the foundation for developing the search string. Further specification on Google Search is available online \citep{google:howsearchworks}. +EXPLAIN WHY THIS DID NOT HAPPEN EITHER +The search terms for this study were determined through an iterative process that took into account the research questions and topic. +EXPLAIN WHY QGS SEARCH STRING COULD NOT BE DONE BUT WIKIPEDIA INSTEAD The search string was established on a basis of a quasi-gold standard as proposed by \cite{qgs}.
For establishing a QGS we employed a manually crafted search string based on the topic and research questions of this study. As we defined PCLs in software engineering as copyright licenses where the licensees are not limited and the copyright license in question is meant be used in licensing software source code in \hyperref[methods]{Chapter 2} and our research questions focus on finding measurements and reasonings to the PCLs' various attributes, we manually formulated the search string: \section{Search process} diff --git a/HY-CS-main.pdf b/HY-CS-main.pdf index a0014ac..73ca939 100644 Binary files a/HY-CS-main.pdf and b/HY-CS-main.pdf differ diff --git a/README.md b/README.md index 33eaae9..0ea783e 100644 --- a/README.md +++ b/README.md @@ -54,21 +54,24 @@ wed 12.6: student submits thesis to E-thesis If 10.6 grappa deadline is missed the next grappa deadline is in the beginning of september. ## General ### Pomodoros -- write new paragraph titles -- replace "literature" and "studies" with "sources" -- write abstract until results -- make other necessary changes to existing text to reflect the mlr nature instead of pure slr nature +- remove any connections to google search or scholar - start describing literature gathering process from the wikipedia mit license ### Break activity - pride and prejudice - tekken +- meditation ### Notes ad hoc notes place reserved here
Diary ## Diary +### week 15 +mon: 3 pomodoros. nice. + ### week 14 +fri: not much done. sent an email to supervisor regarding the sms but i'll continue working on the slr style. + thu: i'm not completely sure if i should write a systematic mapping study or a systematic literature review. ask about this from supervisor. wed: 2 pomodoros. finished paragraphizing kuutila et al. 2020. tomorrow should happen something diff --git a/bibliography.bib b/bibliography.bib index 4d2b2c2..2449e32 100644 --- a/bibliography.bib +++ b/bibliography.bib @@ -113,10 +113,15 @@ @inproceedings{qgs series = {EASE'10} } -@misc{google:howsearchworks, - title = "Google Search - What Is Google Search And How Does It Work", - author = "Google", - howpublished = "\url{https://www.google.com/search/howsearchworks/}", - year = "2024", - note = "Accessed: 2024 March 20" +@article{mustonen2003, + title = {Copyleft—the economics of Linux and other open source software}, + journal = {Information Economics and Policy}, + volume = {15}, + number = {1}, + pages = {99--121}, + year = {2003}, + issn = {0167-6245}, + doi = {10.1016/S0167-6245(02)00090-2}, + url = {https://www.sciencedirect.com/science/article/pii/S0167624502000902}, + author = {Mikko Mustonen}, } \ No newline at end of file