AI Safety

Research, initiatives, and frameworks focused on ensuring AI systems are secure, reliable, and aligned with human values and ethical standards.

IFS Joins UK's AI Policy Advisory Board

IFS has been appointed as an Advisory Board Member of the UK's All-Party Parliamentary Group on AI, contributing to AI policy discussions alongside major industry players.

March 18, 2025

Anthropic's New Techniques to Detect Deceptive AI

Anthropic has developed methods to identify when AI systems conceal their true objectives, a significant step in AI safety research. The company trained a version of its AI assistant, Claude, to hide its goals, then successfully detected these hidden agendas using various auditing techniques.

March 14, 2025

NewsGuard Launches FAILSafe to Protect AI from Foreign Disinformation

NewsGuard has introduced the FAILSafe service to shield AI models from foreign influence operations, particularly targeting Russian, Chinese, and Iranian disinformation.

March 11, 2025

Google Removes Diversity Mentions from AI Team Webpage

Google has updated its Responsible AI team webpage, removing references to 'diversity' and 'equity'. This change follows similar actions by other tech companies.

March 09, 2025

CompScience Partners with CMTA and Bender Insurance to Modernize Workers' Compensation

CompScience has teamed up with the California Manufacturers & Technology Association and Bender Insurance Solutions to launch an AI-driven program aimed at reducing workplace injuries and insurance costs for California manufacturers.

March 04, 2025

HiddenLayer Report Highlights Rising AI Breaches and Security Challenges

HiddenLayer's latest report reveals a significant increase in AI breaches, with 74% of organizations experiencing incidents in 2024. The report emphasizes the need for enhanced security measures as AI adoption grows.

March 04, 2025

ABM Unveils World's First Emotion Processing Unit Chip

Advanced Brain Methodologies Inc. (ABM) has announced what it calls the world's first Emotion Processing Unit (EPU) chip, a neuro-chip aimed at mental health and cognitive performance applications.

March 03, 2025

Safe Pro Appoints Young J. Bang to Lead AI Integration for U.S. Military

Safe Pro Group announced in a press release that it has appointed Young J. Bang, former Principal Deputy Assistant Secretary of the Army, to spearhead the integration of AI technology into U.S. military systems.

March 03, 2025

Anthropic Introduces Hierarchical Summarization for AI Monitoring

Anthropic has unveiled a new approach called hierarchical summarization to enhance AI monitoring, particularly for its computer use capabilities.

February 28, 2025

Infosys Introduces Open-Source Responsible AI Toolkit

Infosys announced in a press release that it has launched an open-source Responsible AI Toolkit to enhance trust and transparency in AI. The toolkit is part of the Infosys Topaz Responsible AI Suite.

February 26, 2025

Leidos and SeeTrue Partner to Enhance AI Threat Detection

Leidos and SeeTrue have announced a collaboration to improve AI-powered threat detection technology for airport security and customs screenings.

February 25, 2025

OpenAI Bans Accounts Misusing ChatGPT for Surveillance

OpenAI has banned accounts from China and North Korea for using ChatGPT in surveillance and influence operations, according to Reuters.

February 23, 2025

Exabits and Phala Network Enhance AI Security with TEE-Enabled Infrastructure

Exabits announced in a press release that it has partnered with Phala Network to offer TEE-enabled GPU clusters for secure AI data processing.

February 22, 2025

DeepSeek to Open-Source AGI Research Amid Privacy Concerns

DeepSeek, a Chinese AI startup, plans to open-source five repositories next week to promote transparency and community-driven innovation, amid ongoing privacy concerns.

February 22, 2025

Securiti and Databricks Collaborate to Enhance Enterprise AI Systems

Securiti has partnered with Databricks to integrate Databricks Mosaic AI and Delta tables into its Gencore AI solution, enabling safer enterprise AI development, according to a press release.

February 19, 2025

Giskard Unveils Phare: A New Benchmark for Evaluating AI Models

Giskard has launched Phare, an open and independent benchmark to assess AI models on security dimensions like hallucination and bias, with Google DeepMind as a research partner.

February 19, 2025

Mira Murati Launches Thinking Machines Lab with AI Focus

Former OpenAI CTO Mira Murati has launched a new AI startup, Thinking Machines Lab, with a team of top researchers and engineers, including many from OpenAI.

February 19, 2025

Pangea Launches AI Security Guardrails and $10,000 Jailbreak Competition

Pangea has announced the availability of AI Guard and Prompt Guard to enhance AI security, alongside a $10,000 jailbreak competition to highlight AI vulnerabilities.

February 18, 2025

OpenAI Co-Founder Sutskever's Startup Valued Over $30 Billion

Ilya Sutskever, co-founder of OpenAI, is raising over $1 billion for his startup Safe Superintelligence, which is now valued at more than $30 billion.

February 18, 2025

Caseware AiDA Receives Positive Evaluation for AI Safety Protocols

Caseware's AI digital assistant, AiDA, has been positively evaluated for its safety protocols by the Holistic AI Governance Platform, ensuring data security and compliance for accounting professionals.

February 13, 2025
