New Microsoft tool lets devs spin up AI behavior tests using text descriptions

AI researchers and labs have advanced by leaps and bounds in evaluating AI models for everything from safety and compliance to sycophancy and alignment. But companies and developers now appear to face a new, narrower need: making sure that their AI system behaves as intended for their specific product or service.

In a bid to make that testing process simpler, Microsoft on Tuesday took the wraps off ASSERT, short for Adaptive Spec-driven Scoring for Evaluation and Regression Testing.

The open-source framework, Microsoft says, makes evaluating application-specific AI behavior easy by using AI to turn high-level, natural-language descriptions of goals, policies, or intended behaviors into thorough, scored tests that can be investigated.

ASSERT takes plain-language descriptions of an AI model’s expected behavior and policies, turns them into a structured set of acceptable and unacceptable behaviors, generates problem scenarios and test cases, runs them against the target system, and...

Copyright of this story belongs solely to techcrunch.com.

I put the Mac mini-sized Kensington SD5010T5 EQ to the test and discovered a fully featured Thunderbolt 5 docking station that doesn’t take up much desk space

Its dual native HDMI ports alongside downstream Thunderbolt 5 architecture eliminate messy legacy adapter chains, making it an elite choice for multi-display setups. However, its immense bandwidth capacity remains heavily bottlenecked by a market still playing catch-up on compatible host devices.

Pros
+ Massive 140W host charging over a single cable