A programmer is suing Microsoft, GitHub and OpenAI over artificial intelligence technology that generates its own computer code.
Cade Metz, based in San Francisco, writes about artificial intelligence and other emerging technologies.
In late June, Microsoft released a new kind of artificial intelligence technology that could generate its own computer code.
Called Copilot, the tool was designed to speed the work of professional programmers. As they typed away on their laptops, it would suggest ready-made blocks of computer code they could instantly add to their own.
The lawsuit has echoes in the last few decades of the technology industry. In the 1990s and into the 2000s, Microsoft fought the rise of open source software, seeing it as an existential threat to the future of the company’s business. As the importance of open source grew, Microsoft embraced it and even acquired GitHub, a home to open source programmers and a place where they built and stored their code.
Nearly every new generation of technology — even online search engines — has faced similar legal challenges. Often, “there is no statute or case law that covers it,” said Bradley J. Hulbert, an intellectual property lawyer who specializes in this increasingly important area of the law.
“The ambitions of Microsoft and OpenAI go way beyond GitHub and Copilot,” Mr. Butterick said in an interview. “They want to train on any data anywhere, for free, without consent, forever.”
By pinpointing patterns in all that text, this system learned to predict the next word in a sequence. When someone typed a few words into this “large language model,” it could complete the thought with entire paragraphs of text. In this way, the system could write its own Twitter posts, speeches, poems and news articles.
Much to the surprise of the researchers who built the system, it could even write computer programs, having apparently learned from an untold number of programs posted to the internet.
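The next-word prediction described above can be illustrated at toy scale. The sketch below is not how Copilot or any OpenAI system is built; those rely on large neural networks trained on enormous amounts of text. It is a minimal stand-in, assuming a simple word-pair counting approach, and the function names (`train_bigram_model`, `predict_next`, `complete`) and the sample sentence are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def predict_next(model, word):
    """Return the word seen most often after `word`, or None if it was never seen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def complete(model, prompt, max_words=5):
    """Extend the prompt one predicted word at a time."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        next_word = predict_next(model, words[-1])
        if next_word is None:
            break
        words.append(next_word)
    return " ".join(words)


# Hypothetical training text, chosen only to keep the example small.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram_model(corpus)
print(predict_next(model, "the"))
print(complete(model, "sat", max_words=2))
```

A system like this only looks one word back, which is why its output quickly turns repetitive. The large language models described in the article weigh far more context, but the basic task of guessing what comes next is the same.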
This new system became the underlying technology for Copilot, which Microsoft distributed to programmers through GitHub. After being tested with a relatively small number of programmers for about a year, Copilot rolled out to all coders on GitHub in July.
For now, the code that Copilot produces is simple. It might be useful to a larger project but must be massaged, augmented and vetted, many programmers who have used the technology said. Some programmers find it useful only if they are learning to code or trying to master a new language.
Mr. Butterick identifies as an open source programmer, part of the community of programmers who openly share their code with the world. Over the past 30 years, open source software has helped drive the rise of most of the technologies that consumers use each day, including web browsers, smartphones and mobile apps.
Though open source software is designed to be shared freely among coders and companies, this sharing is governed by licenses designed to ensure that it is used in ways that benefit the wider community of programmers. Mr. Butterick believes that Copilot has violated these licenses and, as it continues to improve, will make open source coders obsolete.
After publicly complaining about the issue for several months, he filed his suit with a handful of other lawyers. The suit is still in the earliest stages and has not yet been granted class-action status by the court.
Mr. Butterick and another lawyer behind the suit, Joe Saveri, said the suit could eventually tackle the copyright issue.
Asked if the company could discuss the suit, a GitHub spokesman declined, before saying in an emailed statement that the company has been “committed to innovating responsibly with Copilot from the start, and will continue to evolve the product to best serve developers across the globe.” Microsoft and OpenAI declined to comment on the lawsuit.
Under existing laws, most experts believe, training an A.I. system on copyrighted material is not necessarily illegal. But doing so could be illegal if the system ends up creating material that is substantially similar to the data it was trained on.
Pam Samuelson, a professor at the University of California, Berkeley, who specializes in intellectual property and its role in modern technology, said legal thinkers and regulators briefly explored these legal issues in the 1980s, before the technology existed. Now, she said, a legal assessment is needed.
“It is not a toy problem anymore,” Dr. Samuelson said.
Source: New York Times