
    From Mike Powell@1:2320/105 to All on Tue Nov 25 09:27:36 2025
    Experts tried to get AI to create malicious security threats - but what it
    did next was a surprise even to them

    Date:
    Mon, 24 Nov 2025 22:26:00 +0000

    Description:
    Experiments find LLMs can create harmful scripts, although real-world reliability failures prevent them from enabling fully autonomous cyberattacks today.

    FULL STORY

    Despite growing fear around weaponized LLMs, new experiments have revealed
    that the potential for malicious output is far from dependable.

    Researchers from Netskope tested whether modern language models could support the next wave of autonomous cyberattacks, aiming to determine if these
    systems could generate working malicious code without relying on hardcoded logic.

    The experiment focused on core capabilities linked to evasion, exploitation, and operational reliability - and came up with some surprising results.

    Reliability problems in real environments

    The first stage involved convincing GPT-3.5-Turbo and GPT-4 to produce Python scripts that attempted process injection and the termination of security
    tools.

    GPT-3.5-Turbo immediately produced the requested output, while GPT-4 refused until a simple persona prompt lowered its guard.

    The test showed that bypassing safeguards remains possible, even as models
    add more restrictions.

    After confirming that code generation was technically possible, the team
    turned to operational testing - asking both models to build scripts designed
    to detect virtual machines and respond accordingly.

    These scripts were then tested on VMware Workstation, an AWS Workspace VDI,
    and a standard physical machine, but frequently crashed, misidentified environments, or failed to run consistently.

    On physical hosts the logic performed well, but the same scripts collapsed inside cloud-based virtual environments.
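
    Environment checks of this kind are brittle almost by construction. The sketch
    below is a minimal illustration, not Netskope's actual test code: it assumes a
    Linux host, uses only the Python standard library, and relies on a hypothetical
    looks_virtual() helper that reads the DMI product name and scans it for
    well-known hypervisor strings.

        import platform
        from pathlib import Path

        # Illustrative heuristic only -- an assumption about what such VM-detection
        # scripts typically do, not code produced or published by Netskope.
        HYPERVISOR_MARKERS = ("vmware", "virtualbox", "kvm", "qemu", "xen", "hvm")

        def looks_virtual() -> bool:
            """Guess whether this host is a VM from the Linux DMI product name."""
            if platform.system() != "Linux":
                # The heuristic only covers Linux DMI data; anything else is a guess.
                return False
            try:
                product = Path("/sys/class/dmi/id/product_name").read_text().lower()
            except OSError:
                # Missing or unreadable DMI data: the check silently gives up,
                # a common failure mode on locked-down cloud desktops.
                return False
            return any(marker in product for marker in HYPERVISOR_MARKERS)

        if __name__ == "__main__":
            print("virtual machine suspected" if looks_virtual() else "physical host assumed")

    A VMware Workstation guest usually exposes a recognisable marker, but on a
    Nitro-based AWS instance the DMI product name is often just the instance type,
    so a check like this can report a physical host even though the desktop is fully
    virtualized, mirroring the misidentification the researchers observed.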

    These findings undercut the idea that AI tools can immediately support automated malware capable of adapting to diverse systems without human intervention.

    The limitations also reinforced the value of traditional defenses, such as a firewall or antivirus, since unreliable code is less capable of bypassing them.

    On GPT-5, Netskope observed major improvements in code quality, especially in cloud environments where older models struggled.

    However, the improved guardrails created new difficulties for anyone
    attempting malicious use: rather than refusing requests outright, the model redirected outputs toward safer functions, which made the resulting code unusable for multi-step attacks.

    The team had to employ more complex prompts and still received outputs that contradicted the requested behavior.

    This shift suggests that higher reliability comes with stronger built-in controls. The tests show large models can generate harmful logic in controlled settings, but the code remains inconsistent and often ineffective.

    Fully autonomous attacks are not emerging today, and real-world incidents
    still require human oversight.

    The possibility remains that future systems will close reliability gaps
    faster than guardrails can compensate, especially as malware developers experiment.

    ======================================================================
    Link to news story: https://www.techradar.com/pro/experts-tried-to-get-ai-to-create-malicious-security-threats-but-what-it-did-next-was-a-surprise-even-to-them

    $$
    --- SBBSecho 3.28-Linux
    * Origin: capitolcityonline.net * Telnet/SSH:2022/HTTP (1:2320/105)