Weekly Insights: Artificial Intelligence & Machine Learning / Open-source AI Models

Stay ahead with our expertly curated weekly insights on the latest trends, developments, and news in Artificial Intelligence & Machine Learning - Open-source AI models.

Recent Articles

Open-Sourced AI Models May Be More Costly in the Long Run, Study Finds

Open-source AI models can require significantly more computing power than their closed-source counterparts for equivalent tasks, a difference in efficiency and resource utilization that can make them more expensive to run over time.


Why do open-source AI models require more computing power than closed-source models?
Open-source AI models often require more computing power because they may not be as optimized for efficiency as closed-source models. This leads to higher resource utilization for equivalent tasks, increasing energy consumption and operational costs over time.
What does 'compute' mean in the context of AI models, and why is it important?
'Compute' refers to the hardware resources such as CPUs, GPUs, or TPUs used to perform the numerical calculations necessary for training and running AI models. It is measured in FLOPS (floating-point operations per second). The amount of compute determines how quickly and effectively a model can learn from data and perform tasks, impacting both performance and energy consumption.
Sources: [1]
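To make the idea of compute concrete, a common rule of thumb is that a transformer needs roughly 2 FLOPs per active parameter for each token it generates, so a less efficient model simply burns through more operations for the same output. The sketch below is a back-of-the-envelope illustration; the model sizes and token counts are made-up assumptions, not figures from the study.

```python
# Back-of-the-envelope compute estimate for text generation.
# Rule of thumb: ~2 FLOPs per active parameter per generated token.
# Model sizes and token counts below are illustrative assumptions only.

def inference_flops(active_params: float, tokens: int) -> float:
    """Approximate floating-point operations to generate `tokens` tokens."""
    return 2.0 * active_params * tokens

efficient_model = inference_flops(active_params=5e9, tokens=1_000)   # hypothetical well-optimized model
verbose_model = inference_flops(active_params=20e9, tokens=3_000)    # hypothetical model that also emits more tokens

print(f"Efficient model:      {efficient_model:.2e} FLOPs")
print(f"Less efficient model: {verbose_model:.2e} FLOPs")
print(f"Relative compute cost: {verbose_model / efficient_model:.0f}x")
```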

15 August, 2025
Gizmodo

OpenAI has new, smaller open models to take on DeepSeek - and they'll be available on AWS for the first time

OpenAI has launched two open-weight models, gpt-oss-120B and gpt-oss-20B, designed for edge use and available on AWS. These models aim to enhance AI accessibility while competing with existing large language models, despite lacking independent performance evaluations.


What are the gpt-oss-120B and gpt-oss-20B models, and how do they differ from previous OpenAI models?
The gpt-oss-120B and gpt-oss-20B are OpenAI's new open-weight language models released under the Apache 2.0 license. The 120B model has 117 billion parameters and activates 5.1 billion parameters per token using a mixture-of-experts architecture, while the 20B model has 21 billion parameters and activates 3.6 billion parameters per token. These models are designed for efficient deployment, with the 20B model able to run on consumer-grade hardware with just 16 GB of RAM, making it suitable for edge and on-device use. They demonstrate strong reasoning, tool use, and structured output capabilities, approaching or surpassing the performance of proprietary models like OpenAI's o4-mini and o3-mini on various benchmarks.
Sources: [1], [2], [3]
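As a quick illustration of the mixture-of-experts numbers quoted above, only a small fraction of each model's total parameters does work on any given token, which is why these models can be served on comparatively modest hardware. The parameter counts below come from the answer above; the script itself is just arithmetic.

```python
# Active vs. total parameters for the gpt-oss models (counts as reported above).
models = {
    "gpt-oss-120b": {"total_params": 117e9, "active_per_token": 5.1e9},
    "gpt-oss-20b":  {"total_params": 21e9,  "active_per_token": 3.6e9},
}

for name, p in models.items():
    share = p["active_per_token"] / p["total_params"]
    print(f"{name}: {share:.1%} of parameters active per token")
```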
What does it mean that these models are 'open-weight' and available on AWS for the first time?
'Open-weight' means that the full model weights are publicly available under a permissive Apache 2.0 license, allowing developers to run, modify, and deploy the models without restrictions typical of proprietary models. This is significant because it enables local and edge deployment, reducing reliance on cloud infrastructure. The availability of these models on AWS for the first time means that users can access and run these open models directly on Amazon Web Services, facilitating scalable cloud-based use and integration into existing AWS workflows.
Sources: [1], [2]

10 August, 2025
TechRadar

OpenAI’s New Open Source Models Are A Very Big Deal: 3 Reasons Why

OpenAI's new open-source models highlight the competitive AI landscape, particularly between China and the U.S. The article explores the implications for companies navigating an evolving AI technology roadmap and the role of innovation and collaboration in that competition.


What does it mean that OpenAI’s new models are 'open source' and why is this significant?
OpenAI’s new models, gpt-oss-120b and gpt-oss-20b, are released with open weights under the permissive Apache 2.0 license, meaning anyone can access, run, and modify the models locally without relying on OpenAI’s API. This is significant because previous models like GPT-3.5 and GPT-4 were closed-source and API-only, limiting control over latency, cost, and privacy. Open-source availability enables developers and companies to deploy powerful AI on consumer hardware, fostering innovation and reducing infrastructure costs.
Sources: [1], [2]
How do OpenAI’s new open-source models compare in performance and capabilities to previous models?
The gpt-oss-120b model matches the performance of OpenAI’s o4-mini model on reasoning benchmarks while requiring significantly less hardware (a single 80 GB GPU). The smaller gpt-oss-20b model performs similarly to o3-mini and can run on consumer devices with a 16 GB GPU. Both models excel in advanced reasoning and tool use (such as web search and code execution), support chain-of-thought reasoning and structured outputs, and outperform other open-source models of similar size.
Sources: [1], [2]

07 August, 2025
Forbes - Innovation

How DeepSeek and Open-Source Models Are Shaking Up AI

The rise of generative artificial intelligence has intensified the ongoing debate among tech companies and academics regarding the risks and rewards of open-source software development, highlighting its growing importance in the tech landscape.


What is the Mixture-of-Experts (MoE) architecture used by DeepSeek AI, and how does it improve AI performance?
DeepSeek AI uses a Mixture-of-Experts (MoE) architecture, which consists of multiple specialized sub-models or 'experts' trained for specific tasks. When processing a query, only the most relevant experts are activated instead of the entire model. This selective activation reduces computational load, increases efficiency, and improves accuracy by leveraging specialized expertise for different types of data or tasks.
Sources: [1], [2]
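The selective activation described above can be sketched in a few lines: a router scores every expert for the incoming token and only the top-k experts actually compute, so most of the network stays idle for any single query. This is a generic, toy MoE illustration with random weights, not DeepSeek's actual routing implementation.

```python
import numpy as np

# Toy mixture-of-experts routing: score experts, keep only the top-k.
rng = np.random.default_rng(0)
num_experts, top_k, hidden = 8, 2, 16

token = rng.normal(size=hidden)                      # one token's hidden state
router = rng.normal(size=(hidden, num_experts))      # learned routing weights (random here)
experts = [rng.normal(size=(hidden, hidden)) for _ in range(num_experts)]

scores = token @ router                              # one score per expert
chosen = np.argsort(scores)[-top_k:]                 # indices of the top-k experts
weights = np.exp(scores[chosen]) / np.exp(scores[chosen]).sum()  # softmax over selected experts

# Only the chosen experts do any work; the other six are skipped entirely.
output = sum(w * (token @ experts[i]) for w, i in zip(weights, chosen))
print("experts used:", chosen, "output shape:", output.shape)
```

Scaled up, this is how a model can hold a very large total parameter count while keeping per-token compute close to that of a much smaller dense model.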
Why is open-source software development significant in the context of AI models like DeepSeek?
Open-source software development allows AI models like DeepSeek to be more transparent, accessible, and collaboratively improved by the tech community. This approach accelerates innovation, enables diverse contributions, and helps balance the risks and rewards associated with AI development. The rise of generative AI has intensified debates about open-source's role, highlighting its growing importance in shaping the future of AI technology.

06 August, 2025
Bloomberg Technology

OpenAI now offers open AI models, but CIOs need to assess the risk

OpenAI introduces two open models, providing enterprise IT with the opportunity to create customized large language models (LLMs) trained on specific corporate content, enhancing tailored solutions for businesses. This innovation marks a significant advancement in AI technology for enterprises.


What are OpenAI's new open models and how do they differ from previous models?
OpenAI has released two new open-weight AI reasoning models called gpt-oss-20B and gpt-oss-120B. Because the weights are openly available, enterprises can customize them by training on specific corporate data, enabling tailored AI solutions. Unlike OpenAI's recent proprietary models, these open models can be run locally on hardware ranging from consumer laptops to single Nvidia GPUs, and they support advanced reasoning and tool use. This marks OpenAI's first open model release since GPT-2, over five years ago.
Sources: [1], [2]
What risks should CIOs consider when adopting OpenAI's open models for enterprise use?
CIOs need to assess risks related to data security, privacy, and governance when deploying OpenAI's open models. Although these models enable customization on corporate data, enterprises must ensure proper usage governance, including logging, guardrails, and personally identifiable information (PII) detection, to prevent data leaks or misuse. Additionally, integrating open models with proprietary cloud AI services may introduce complexity and require careful risk management to balance innovation with security.
Sources: [1], [2]
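To illustrate the kind of guardrail the answer above refers to, the sketch below screens prompts for obvious PII and writes an audit log before anything reaches a locally hosted model. It is a deliberately simplified, assumption-based example; the regex patterns and function names are hypothetical, and production systems would rely on dedicated PII-detection and logging tooling.

```python
import re
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm-guardrail")

# Very simplified PII patterns for illustration only; real deployments use
# dedicated PII-detection services rather than a handful of regexes.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def screen_prompt(prompt: str) -> str:
    """Log the request and redact obvious PII before it reaches the model."""
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(prompt):
            log.warning("Redacting %s from prompt before inference", label)
            prompt = pattern.sub(f"[{label.upper()} REDACTED]", prompt)
    log.info("Prompt forwarded to local model (%d chars)", len(prompt))
    return prompt

print(screen_prompt("Email jane.doe@example.com about contract 123-45-6789."))
```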

06 August, 2025
ComputerWeekly.com

Why OpenAI’s Open Source Models Are A Big Deal

OpenAI's new open-weight models showcase impressive reasoning capabilities, though they come with a tradeoff in hallucinations. The publication highlights the significance of these advancements in the realm of artificial intelligence and their potential impact on future developments.


What are OpenAI's open-weight models and why are they significant?
OpenAI's open-weight models, such as gpt-oss-20B and gpt-oss-120B, are AI models released with open weights under a permissive Apache 2.0 license, allowing anyone to run and customize them on their own infrastructure. These models support advanced reasoning and tool use, enabling complex multi-step tasks and domain-specific AI applications. Their open availability marks a major step in democratizing AI access, allowing developers, enterprises, nonprofits, and governments worldwide to build and innovate without relying solely on proprietary models. This openness also lowers barriers for emerging markets and smaller organizations, fostering broader AI-driven economic growth and innovation globally [1], [2], [3], [4].
Sources: [1], [2], [3], [4]
What tradeoffs come with using OpenAI's open-weight models?
While OpenAI's open-weight models demonstrate impressive reasoning capabilities and flexibility, they come with a tradeoff in terms of hallucinations: instances where the AI generates plausible but incorrect or fabricated information. Additionally, these models may require more customization and integration effort compared to closed-source models that offer more turnkey solutions. However, OpenAI mitigates some limitations by enabling these open models to connect with more capable closed models in the cloud for tasks they cannot perform alone, such as image processing. This hybrid approach balances openness with performance and reliability [2], [3].
Sources: [1], [2]

06 August, 2025
Search Engine Journal

OpenAI returns to its open-source roots with new open-weight AI models, and it's a big deal

The article explains that models licensed under Apache 2.0 benefit from one of the most permissive open licenses, allowing for broad usage and modification. This fosters innovation and collaboration within the tech community, enhancing accessibility and development opportunities.


What does it mean that OpenAI's new AI models are licensed under the Apache 2.0 license?
The Apache 2.0 license is a permissive open-source license that allows users to freely use, modify, distribute, and sublicense the AI models, including for commercial purposes. Users must include the original copyright notice and license text, and disclose significant changes made to the code. This license encourages innovation and broad usage without API fees, enabling deployment on consumer hardware or cloud platforms.
Sources: [1], [2]
How does the Apache 2.0 license affect the ownership and use of data generated by OpenAI's models?
Under OpenAI's terms, users retain ownership of both their input data and the output generated by the models. The output can be licensed under permissive licenses like Apache 2.0 or MIT by the user, allowing them to share, modify, or use the data freely, including for training other models. Restrictions apply only to the original user, not to the data itself, promoting openness and reuse within the community.
Sources: [1]

06 August, 2025
ZDNet

OpenAI releases two open-weight AI models, including one that runs well on Apple Silicon Macs

OpenAI has unveiled two new open-weight AI models, gpt-oss-20b and gpt-oss-120b, fulfilling its earlier promise. These models are now available for free download, enhancing accessibility for developers and researchers in the AI community.


What does 'open-weight' mean in the context of OpenAI's new AI models?
'Open-weight' means that the full model weights (parameters) are publicly released and available for free download, allowing developers and researchers to run, fine-tune, and customize the models without restrictions. This contrasts with proprietary models where weights are kept private and only accessible via API. OpenAI’s gpt-oss-120b and gpt-oss-20b models are open-weight, licensed under Apache 2.0, enabling experimentation and commercial use without copyleft or patent risks.
Sources: [1], [2]
How can the gpt-oss-20b model run efficiently on Apple Silicon Macs?
The gpt-oss-20b model is smaller (21 billion parameters) and optimized to run with lower latency and within limited memory (16GB), making it suitable for local use on consumer hardware like Apple Silicon Macs. Its native MXFP4 quantization and efficient architecture allow it to operate without requiring large, specialized GPUs, unlike the larger gpt-oss-120b which needs more powerful hardware such as an 80GB GPU.
Sources: [1], [2]
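A rough calculation shows why the 16 GB figure is plausible: at roughly 4 bits per weight under MXFP4 quantization, the weights alone occupy about half a byte per parameter, leaving headroom for activations and the KV cache. The parameter counts come from the articles above; treating every weight as exactly 4 bits is a simplifying assumption.

```python
# Rough memory footprint of 4-bit (MXFP4) quantized weights.
# Assumes every parameter is stored at exactly 4 bits, which is a simplification.

def weight_memory_gb(num_params: float, bits_per_weight: float = 4.0) -> float:
    """Gigabytes needed just for the model weights."""
    return num_params * bits_per_weight / 8 / 1e9

for name, params in [("gpt-oss-20b", 21e9), ("gpt-oss-120b", 117e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.1f} GB of weights, plus activation/KV-cache overhead")
```

By this estimate the 20B model's weights fit in roughly 10 to 11 GB, consistent with running on a 16 GB machine, while the 120B model's roughly 59 GB of weights explain the need for an 80 GB GPU.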

05 August, 2025
9to5Mac

OpenAI Open Models

The article discusses the release of open-weight language models, gpt-oss-120b and gpt-oss-20b, highlighting their potential impact on AI development and accessibility. The authors emphasize the importance of these models in advancing natural language processing technology.


What are the main differences between the gpt-oss-120b and gpt-oss-20b models?
The gpt-oss-120b model has 117 billion total parameters and activates about 5.1 billion parameters per token, delivering stronger reasoning and better performance on complex tasks. It requires a high-end GPU like the NVIDIA H100 or a multi-GPU setup. The gpt-oss-20b model has 21 billion total parameters, activates about 3.6 billion parameters per token, and is optimized for speed and accessibility, fitting on a single 16GB GPU, making it suitable for on-device or low-cost server inference. Both models use Mixture-of-Experts (MoE) architecture with 4-bit quantization for efficient inference but differ in scale and hardware requirements.
Sources: [1], [2]
What does it mean that the gpt-oss models are 'open-weight' and why is this important?
'Open-weight' means that the full model weights of gpt-oss-120b and gpt-oss-20b are publicly released under a permissive Apache 2.0 license, allowing developers and researchers to freely access, run, modify, and fine-tune these models. This openness promotes transparency, wider accessibility, and innovation in AI development by enabling experimentation without the restrictions of proprietary models. It also allows deployment on consumer hardware and supports diverse applications such as tool use, chain-of-thought reasoning, and agentic tasks.
Sources: [1], [2]

05 August, 2025
Product Hunt

OpenAI Releases Open-Weight Models After DeepSeek’s Success

OpenAI has launched two open-weight AI models designed to replicate human reasoning, following the global spotlight on China's DeepSeek and its AI software. The release marks a significant step in the field of artificial intelligence.


What are open-weight AI models and why is OpenAI releasing them now?
Open-weight AI models are artificial intelligence models whose internal parameters (weights) are made publicly accessible, allowing researchers and developers to study, modify, and build upon them. OpenAI's release of two open-access models follows the success of China's DeepSeek, which gained global attention for its innovative AI software. This move by OpenAI represents a significant advancement in AI, promoting transparency and wider collaboration in replicating human reasoning capabilities.
Sources: [1], [2]
How do OpenAI's new open-weight models compare to DeepSeek's AI models?
OpenAI's new open-weight models are designed to replicate human reasoning and follow DeepSeek's success in this area. DeepSeek's R1 model slightly outperforms OpenAI's o1 model in some reasoning benchmarks, such as mathematical reasoning (DeepSeek-R1 scored 97.3% vs. OpenAI o1's 96.4%) and operates at a lower cost. However, OpenAI's models maintain strong coding capabilities and have recently released frontier models like o3 that surpass DeepSeek's R1 in overall performance. This competitive dynamic has driven OpenAI to release open-weight models to foster innovation and accessibility.
Sources: [1], [2], [3]

05 August, 2025
Bloomberg Technology

OpenAI's first new open-weight LLMs in six years are here

OpenAI has launched its first new open-weight large language models, gpt-oss-120b and gpt-oss-20b, since 2019. These models, available for download, offer flexibility for users while lacking multi-modal input capabilities, marking a significant step in democratizing AI access.


What does 'open-weight' mean in the context of OpenAI's new language models?
'Open-weight' means that the full model parameters (weights) are publicly available for download and use, allowing researchers and developers to run, modify, and build upon the models without restrictions. This contrasts with proprietary models where weights are kept private and only accessible via API.
Sources: [1], [2]
Why do OpenAI's new open-weight models lack multi-modal input capabilities?
The new open-weight models, gpt-oss-120b and gpt-oss-20b, focus on text-based reasoning and generation and do not support multi-modal inputs like images or audio. This limitation likely reflects a trade-off to prioritize accessibility and flexibility for users while maintaining manageable model complexity and hardware requirements.
Sources: [1], [2]

05 August, 2025
Engadget

OpenAI has finally released open-weight language models

OpenAI has launched its first open-weight large language models since 2019, available for free download and modification. This move aims to reestablish OpenAI's presence in the open model landscape amid rising competition from Chinese models and Meta's shift towards closed releases.


What does 'open-weight' mean in the context of OpenAI's language models?
'Open-weight' means that the full model parameters (weights) are publicly available for download, modification, and deployment without restrictive licensing. This allows developers and researchers to customize, fine-tune, and run the models on their own hardware, unlike closed models that only provide API access.
Sources: [1], [2]
Why is OpenAI releasing open-weight models now after years of focusing on closed models?
OpenAI is releasing open-weight models to reestablish its presence in the open model landscape amid rising competition from Chinese AI models and Meta's shift towards closed releases. This move also reflects a balance between openness and safety, as OpenAI has introduced new safety protocols to mitigate risks associated with open-source models.
Sources: [1], [2]

05 August, 2025
MIT Technology Review

Deep Cogito v2: Open-source AI that hones its reasoning skills

Deep Cogito has launched Cogito v2, a groundbreaking open-source AI model family that enhances its reasoning abilities. Featuring models up to 671B parameters, it employs Iterated Distillation and Amplification for efficient learning, outperforming competitors while remaining cost-effective.


What is Iterated Distillation and Amplification (IDA) and how does it improve Deep Cogito v2's reasoning?
Iterated Distillation and Amplification (IDA) is a training technique where the AI model internalizes the reasoning process through iterative policy improvement rather than relying on longer search times during inference. This method enables Deep Cogito v2 models to learn more efficient and accurate reasoning skills, improving performance on complex tasks such as math and language benchmarks while remaining cost-effective.
Sources: [1]
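At a high level, Iterated Distillation and Amplification alternates between producing better answers with extra inference-time effort (amplification) and training the base policy to reproduce those answers directly (distillation). The sketch below is a schematic of that loop under stated assumptions, not Deep Cogito's actual training code; `amplify`, `distill`, and `base_policy` are hypothetical placeholders.

```python
# Schematic of an Iterated Distillation and Amplification (IDA) loop.
# `base_policy`, `amplify`, and `distill` are hypothetical placeholders, not a real API.

def amplify(policy, problem):
    """Spend extra inference-time effort (e.g. longer step-by-step reasoning)."""
    return policy(problem) + " [refined with extra reasoning]"

def distill(policy, examples):
    """Return a policy that reproduces the amplified answers directly."""
    lookup = dict(examples)
    return lambda problem: lookup.get(problem, policy(problem))

base_policy = lambda problem: f"draft answer to {problem!r}"
problems = ["2+2", "integrate x^2"]

for iteration in range(3):
    # 1) Amplification: solve problems with extra search/reasoning effort.
    amplified = [(p, amplify(base_policy, p)) for p in problems]
    # 2) Distillation: fold those improved answers back into the base policy.
    base_policy = distill(base_policy, amplified)

print(base_policy("2+2"))
```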
What does it mean that Deep Cogito v2 models are 'hybrid reasoning models'?
Deep Cogito v2 models are called hybrid reasoning models because they can toggle between two modes: a fast, direct-response mode for simple queries and a slower, step-by-step reasoning mode for complex problems. This hybrid approach allows the models to efficiently handle a wide range of tasks by balancing speed and depth of reasoning, outperforming other open-source models of similar size.
Sources: [1], [2]

01 August, 2025
AI News
