The AI research organization LAION has sent an open letter to the European Parliament recommending that regulatory efforts in the AI sector should encourage the use of open source AI models. LAION’s datasets are widely used to train AI models.
According to the letter, which is backed by the European Laboratory for Learning and Intelligent Systems (ELLIS) and several other leading AI developers and researchers, the transparency of open-source AI models makes them safer, more accountable, more reproducible, and more robust than closed-source models.
The letter suggests that the use of open-source models can foster innovation, particularly in small and medium-sized enterprises, minimize redundant training runs, be more environmentally friendly, and address global challenges in areas such as health and climate change.
The benefits of open-source AI models
The transparency of open-source AI also allows for extensive community review and helps identify and address security vulnerabilities earlier than is possible with closed systems, LAION says, citing Linux as a positive example.
Reproducible and transparent open-source models enable independent validation of AI results, ensuring trust and scientific integrity.
Focusing regulatory efforts on AI applications rather than enabling technologies could preserve the benefits of open-source models without stifling innovation, the letter argues.
This proposal is likely motivated by the fact that smaller companies and open-source communities can struggle to comply with complex regulatory requirements when developing new AI technologies without immediate commercial intent.
The authors also recommend that commercial companies be incentivized to open source their base models, while retaining ownership of fine-tuned versions for industry-specific applications. This strategy would allow broader access to the underlying models without compromising commercial competitiveness.
Meta currently appears to be following this strategy with its Llama models, while other AI technology companies such as Google, OpenAI, and Microsoft keep their models closed source.
OpenLM aims to make AI training more efficient
LAION also introduces OpenLM, a new PyTorch codebase for efficiently training medium-sized language models. According to LAION, OpenLM is designed to fully utilize GPU resources, increase training speed, and adapt easily to new research and development projects.
For testing, LAION trained the OpenLM-1B and OpenLM-7B language models on large sets of text tokens (1.6 trillion and 1.25 trillion, respectively). A token is a small unit of text, typically a word or word fragment, that a language model processes in sequence; tokens combine into larger units of meaning, much as words and punctuation combine into sentences.
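To make the idea concrete, here is a minimal sketch of splitting text into tokens. Note this is purely illustrative: real LLM tokenizers learn subword units from data (e.g. byte-pair encoding) rather than splitting on whitespace and punctuation.

```python
import re

def toy_tokenize(text):
    # Split into word tokens and single punctuation tokens.
    # Real tokenizers (e.g. BPE) instead learn subword pieces,
    # so rare words get split into several smaller tokens.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("Tokens form sentences, much like words do.")
print(tokens)
# ['Tokens', 'form', 'sentences', ',', 'much', 'like', 'words', 'do', '.']
```

Counting by units like these is how training-set sizes such as "1.6 trillion tokens" are measured.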
For tasks such as zero-shot text classification and multiple-choice question answering, the OpenLM models are reported to outperform open-source alternatives such as OPT-1.3B and Pythia-1B, and to achieve performance on par with LLaMA-7B and MPT-7B. LAION releases the language models and training datasets on Hugging Face.
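Zero-shot multiple-choice evaluation of this kind is commonly done by scoring each candidate answer with the model and picking the highest-scoring one. A hedged sketch of that idea, with a stand-in scoring function in place of a real model (the names and scores here are hypothetical, not from OpenLM):

```python
def pick_answer(question, choices, score_fn):
    # score_fn stands in for a language model's log-likelihood of
    # the full "question + answer" text; the highest-scoring
    # candidate is returned as the model's zero-shot choice.
    return max(choices, key=lambda c: score_fn(question + " " + c))

# Hypothetical log-likelihoods standing in for a real model's output.
fake_scores = {"The sky is blue": -1.2, "The sky is green": -7.5}
answer = pick_answer("The sky is", ["blue", "green"], fake_scores.__getitem__)
print(answer)  # blue
```

No prompt-specific training is involved, which is what makes the evaluation "zero-shot": the pretrained model is applied to the task as-is.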
LAION’s petition for an international open-source AI supercomputer
In addition to the open letter, LAION launched a petition back in April to establish a publicly funded supercomputer to support international open-source AI research and development. The petition collected 3,627 signatures.
The proposal aims to ensure technological independence from large technology corporations. The proposed supercomputer would support the development of fundamental open-source AI models and provide the necessary infrastructure for a transparent future of AI.
LAION recently won “The Falling Walls Science Breakthrough of the Year Award” in the Science and Innovation Management category for “democratizing AI research by providing open access to advanced AI models, tools, and datasets, fostering public engagement and awareness, and promoting international collaboration to create a transparent and inclusive AI ecosystem that benefits everyone.”