More and more companies are using ChatGPT, GitHub Copilot, Amazon CodeWhisperer, Tabnine, Codeium and other tools to make software development more efficient with the help of artificial intelligence (AI). Below, we summarize what companies should consider from a legal perspective when using these services.
AI as a software developer
AI-supported applications enable companies to solve complex problems while saving time and costs. In software development, too, a growing number of AI solutions are designed to help developers work faster, better and more efficiently.
While ChatGPT & Co. were initially also used for software development purposes at the beginning of the “AI hype”, there are now more and more specialized AI tools geared specifically towards software development, such as:
- GitHub Copilot,
- Amazon CodeWhisperer,
- Tabnine and
- Codeium.
Depending on the scope of functions, software developers can use such AI tools to automatically adopt code suggestions, for example, or to detect and automatically correct errors in the program code.
This in turn offers companies significant savings potential. According to a survey conducted by GitHub together with Wakefield Research, 92% of 500 software developers surveyed in US companies already use AI development tools. A steadily growing market penetration can also be expected in Europe.
Risks when using ChatGPT, GitHub Copilot, Amazon CodeWhisperer & Co.
The use of AI tools in software development involves some legal risks that companies should be aware of. Particular challenges for companies are posed by issues in connection with
- copyrights (own and third-party copyrights),
- the – possibly unnoticed – integration of open-source software into your own program code and
- IT security.
In principle, no copyright under German law in AI-generated code
Whether and to what extent companies or providers of the AI tools acquire rights to AI-generated program code depends on the
- applicable copyright law (in Germany, the German Copyright Act) and
- the contractual provisions concluded with the AI tool provider.
German copyright law protects personal intellectual creations. As a rule, software is also protected by copyright insofar as it is individual and original, i.e., has a certain level of creation and was created by a human being.
As a rule, program code generated with the help of AI is not deemed to be an intellectual creation within the meaning of German copyright law.
As a result, AI-generated code is generally in the public domain in accordance with German law. This means that in principle, anyone can freely use the code generated by the AI, so that competitors or other third parties may copy and modify the code. If relevant functions of your own program code are based on AI-generated code, this can significantly reduce the market value of the relevant software.
In addition to statutory provisions, the contractual conditions of the AI tool provider also play an important role. Under such terms, providers may, for example, reserve comprehensive rights to the AI-generated code, use proprietary parts of the company’s program code (i.e., the input into the AI tool) for their own purposes (e.g., for training) or prohibit certain use scenarios for the AI tool or the AI-generated code.
However, companies should note that such contractual provisions bind only the contracting party, i.e., the company or the relevant developer. Third parties, by contrast, are not bound by such restrictions or other provisions of the provider.
Copyright infringements by AI-generated code
Irrespective of the question of whether and to what extent rights to AI-generated code are established, it is particularly relevant for companies whether the use of AI tools infringes third-party copyrights.
AI tools generate program code based on data sets that are often protected by copyright. A good example of this problem is the current discussion about GitHub Copilot. To train the AI behind GitHub Copilot, the program code of GitHub users was used. After the launch of GitHub Copilot, it turned out that in some cases the suggestions for code completion largely corresponded to the program code of GitHub users who, according to their own statements, had not made their program code publicly available and had not made it available for training purposes.
Consequently, in practice the use of AI tools in software development always raises the question for companies of whether the AI-generated code infringes the copyrights of the original code’s developer once the AI output is incorporated into the company’s own code. Whether third-party copyrights are affected depends on whether the AI-generated code constitutes an adaptation or reproduction of the original code that is relevant under copyright law. The decisive issue is whether the third party’s original code is still “recognizable” in the AI-generated code. In which cases this is to be affirmed for AI-generated code has not yet been clarified, so companies are currently exposed to legal uncertainty. To minimize risk, it is therefore highly recommended to examine AI-generated code for congruence with third-party program code within the scope of the existing possibilities.
If AI-generated code affects copyrights on the original code, companies should not use the AI-generated code without the legal or contractual permission of the copyright holder, otherwise they may be subject to injunctive relief and claims for damages.
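Such a congruence check can be approximated in an automated first pass. The following Python sketch is a simplified illustration only, not one of the specialized services named below; the snippets compared are hypothetical. It uses the standard library’s difflib to score the similarity between an AI-generated snippet and a known reference snippet:

```python
import difflib

def similarity(code_a: str, code_b: str) -> float:
    """Return a rough similarity ratio (0.0 to 1.0) between two code snippets."""
    # Normalize whitespace so that formatting differences do not dominate the score.
    norm_a = "\n".join(line.strip() for line in code_a.splitlines() if line.strip())
    norm_b = "\n".join(line.strip() for line in code_b.splitlines() if line.strip())
    return difflib.SequenceMatcher(None, norm_a, norm_b).ratio()

# Hypothetical example: compare an AI suggestion against known third-party code.
ai_snippet = "def add(a, b):\n    return a + b\n"
reference  = "def add(x, y):\n    return x + y\n"

score = similarity(ai_snippet, reference)
if score > 0.8:
    print(f"High similarity ({score:.2f}): manual legal review recommended")
```

A high score is only a signal that the snippet deserves manual and, where appropriate, legal review; it is neither proof of infringement nor does a low score rule it out.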
Unnoticed license violations through the use of open-source software
When using AI tools, companies should also pay particular attention to the unnoticed introduction of open-source software (“OSS”) into their own program code, because this may involve more extensive obligations.
OSS is software that anyone may use free of charge as long as the relevant license conditions are complied with. OSS licenses are sometimes very far-reaching and may require, among other things, disclosure of the program code, attribution of the authors, or licensing of the program code under the OSS license of the OSS used.
AI-generated code, or parts of it, may be “infected” by OSS, since many AI tools have also been trained on OSS. Code snippets or even entire code sections from OSS-licensed projects can therefore be adopted into the output, in which case the AI output may itself be subject to an open-source license. If “infected” AI code is adopted into a company’s own code, the user must comply with the license conditions of the OSS provider in order to be allowed to use the code. OSS licensing terms can basically be divided into three types:
- Strict copyleft licenses: The user is obliged to license the edited program code under the OSS license and to publish it. Depending on the license, the integration of the OSS may in individual cases also result in the proprietary program code or only parts of it having to be licensed under the OSS license and possibly disclosed. This may preclude the exploitation of the proprietary program code against payment.
- Limited copyleft licenses: Usually only adaptations of the OSS are subject to the OSS license (so-called “file-based copyleft”); they allow exceptions to the obligation to license the program code under the OSS license, so that this license type is less strict.
- Permissive licenses: Do not oblige the user to license added or modified code under the OSS license and are liberal in their terms.
Particular caution is required when AI-generated code is used that is subject to a strict copyleft license. The company may be obliged via the OSS license provisions to disclose its own program code or even to publish it under the OSS license. This can be particularly problematic if the company wants to use its proprietary program code commercially or protect it from copying by any competitors.
Some AI tool providers have recognized these risks and flag parts of the AI-generated code for the user during the development process if they “resemble” OSS training data and are thus potentially subject to an OSS license (e.g., Amazon CodeWhisperer). Developers can then decide for themselves whether or not they want to use the code and submit to the terms of the relevant OSS license.
How reliably a given AI tool flags any OSS used and its licensing conditions cannot be assessed across the board; this depends on the tool in question and should be critically reviewed by the company. Initial test reports suggest that these functions do not work reliably with some AI tools, so special attention should be paid to the OSS issue.
What should companies do? Recommendations for action
In the long term, developers and companies will probably not be competitive without the use of AI tools in software development. In order to minimize the aforementioned risks, companies should in particular
- check the terms of use of the AI tool to be used for problematic clauses (use of program code for training purposes; granting of rights to the program code to the provider; liability); keep an eye on updates;
- ask the provider about the type of training data of the AI tool; there are some AI tools that only use OSS with permissive OSS licenses for training, so the risks in relation to OSS are lower;
- select “data-saving” settings that restrict the provider’s use of the data entered, in particular the proprietary program code, for its own purposes (e.g., for training purposes);
- check whether the AI-generated code possibly infringes third-party copyrights; this can be done by a manual check, which is error-prone and time-consuming, or with the help of special services that perform a comparison with a certain database and indicate similarities (e.g., MOSS, Codequiry, HackerRank, JPlag, etc.); however, residual risks remain;
- check whether the AI-generated code contains parts of OSS, so that OSS license provisions must be complied with; in this case, the rights and obligations resulting from the OSS license must be checked; particular caution is required in the case of strict copyleft licenses;
- document the use of the AI tool and the AI-generated code in the software accordingly, so that the origin of the program code parts can be proven in the event of disputes;
- check the AI-generated code for malicious code before it is used productively, either manually or with the help of special services;
- check the data protection and security policies of the AI tool to be used with regard to the security measures taken; carry out independent research into known security incidents or warnings;
- regularly back up the program code and other data to minimize potential damage from data loss;
- continuously explore the market for alternative providers;
- raise awareness among developers and other users of AI tools; and
- issue guidelines or manuals for the use of such AI tools and the handling of OSS.
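As a rough first pass on the OSS point in the recommendations above, AI-generated code can be scanned automatically for common license markers. The following Python sketch is an illustrative assumption, not a substitute for dedicated license-scanning tools or legal review; the marker patterns are deliberately simplified. It searches code text for well-known license names and maps them to the license families described above:

```python
import re

# Simplified, hypothetical marker patterns for a first-pass scan.
# A real audit should rely on dedicated license-scanning tooling and legal review.
LICENSE_MARKERS = {
    "GPL (strict copyleft)": re.compile(r"GNU General Public License|\bGPL\b", re.I),
    "LGPL (limited copyleft)": re.compile(r"Lesser General Public License|\bLGPL\b", re.I),
    "MIT (permissive)": re.compile(r"\bMIT License\b", re.I),
    "Apache (permissive)": re.compile(r"Apache License", re.I),
}

def scan_for_license_markers(code: str) -> list[str]:
    """Return the names of all license families whose markers appear in `code`."""
    return [name for name, pattern in LICENSE_MARKERS.items() if pattern.search(code)]
```

For example, scanning a snippet whose header mentions “Apache License, Version 2.0” would flag the permissive Apache family, while any copyleft hit should trigger the particular caution described above before the code is adopted.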
Author

LUTZ | ABEL Rechtsanwalts PartG mbB, Hamburg
Senior Associate, Attorney-at-Law
