Purdue researchers develop ‘PickleBall’ to make AI model sharing safer — what it means for open-source AI

Researchers at Purdue University have developed a new security tool called PickleBall, designed to make it safer to load machine learning models shared through public repositories. The tool prevents malicious code hidden inside shared models from executing when those models are loaded, a growing threat as more developers rely on pre-trained models from platforms like Hugging Face.

In tests, PickleBall blocked 100% of malicious models in the researchers’ dataset while correctly loading nearly 80% of benign models, all with only about 1.75% performance overhead.

The work directly targets an emerging problem in the AI supply chain: when “loading a model” can silently turn into “running an attacker’s code” on your infrastructure.


Key Takeaways

  • Purdue researchers created PickleBall, a secure loader for pickle-based machine learning models.
  • In evaluations, PickleBall blocked all malicious models while successfully loading nearly 80% of benign models, outperforming other secure loaders.
  • The tool targets a critical AI supply-chain risk: attackers hiding malicious code in shared models hosted on public repositories.
  • Existing defences — model scanners, restrictive policies, “safer” formats — show gaps in coverage and practicality.
  • For open-source AI, tools like PickleBall could become standard for safely reusing community models in production.

Explore More

Want to go deeper into AI security and open-source model ecosystems? Explore these hubs on Arti-Trends:

  • AI Guides Hub — foundational explainers on AI infrastructure, model hosting and supply-chain security
  • AI Tools Hub — evaluations of AI devtools, security tooling and infrastructure platforms
  • AI News Hub — fast coverage of new AI security research and open-source threats
  • AI Investing Hub — analysis of AI security, infrastructure and tooling companies shaping this space

These hubs help you connect individual research breakthroughs like PickleBall to the broader trends in secure and trustworthy AI.


The problem: malicious code hidden in AI models

AI developers increasingly download pre-trained models from hubs like Hugging Face instead of training from scratch. That convenience comes with a hidden risk: a model file can behave like executable code. If an attacker slips malicious payloads into a model or its configuration, loading that model may quietly execute arbitrary code on the host system.

Recent studies have shown that:

  • model formats like Python’s pickle can embed code execution
  • automated security checks on public model hubs may miss malicious models
  • organisations often trust “popular” models or repos without deep inspection

This creates a classic software-supply-chain problem: AI teams may be importing malware when they think they’re just importing a model.
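
To make that concrete, the sketch below shows the classic, widely documented way a pickle file can be weaponised. It is a generic textbook illustration, not a payload from the Purdue study: any class can define __reduce__, and pickle will call whatever function it returns at load time.

```python
import os
import pickle

# A minimal illustration of why unpickling untrusted data is dangerous.
# Any class can define __reduce__ to tell pickle how to "reconstruct" it;
# an attacker can abuse this hook to make the loader call an arbitrary function.
class MaliciousPayload:
    def __reduce__(self):
        # On load, pickle will call os.system("echo pwned") instead of
        # rebuilding a harmless object. A real payload could do far worse.
        return (os.system, ("echo pwned",))

poisoned_bytes = pickle.dumps(MaliciousPayload())

# The "victim" side: simply loading the blob runs the attacker's command.
pickle.loads(poisoned_bytes)
```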


What Purdue’s PickleBall actually does

PickleBall focuses on one of the most widely used — and most problematic — formats in the AI ecosystem: pickle-based models. According to the researchers, nearly 45% of popular models on Hugging Face still rely on insecure pickle serialization, despite known risks.

The tool works in two main stages:

  1. Static policy generation
    • PickleBall analyses the source code of the machine learning library (e.g., PyTorch, scikit-learn) to understand what legitimate pickle operations look like.
    • It then builds a custom, library-specific policy defining what is allowed during model deserialization.
  2. Safe, policy-enforced loading
    • At load time, PickleBall acts as a drop-in replacement for Python’s pickle module.
    • It enforces the generated policy, allowing benign models while rejecting models that attempt dangerous operations (like arbitrary function invocation); a simplified sketch of this allow-list idea follows the list.
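
The exact policies PickleBall generates are library-specific and produced automatically from the library’s own code. As a rough analogy only, the sketch below uses Python’s documented hook for restricting globals during unpickling (Unpickler.find_class) with a hand-written allow-list; the allow-list contents are illustrative assumptions, not PickleBall’s actual policy.

```python
import io
import pickle

# Illustrative allow-list: only these (module, name) pairs may be resolved
# during deserialization. A tool like PickleBall derives such a policy from
# the ML library's source; this hard-coded set is purely for demonstration.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
    ("builtins", "list"),
    ("builtins", "dict"),
}

class PolicyUnpickler(pickle.Unpickler):
    """Unpickler that rejects any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) not in ALLOWED_GLOBALS:
            raise pickle.UnpicklingError(
                f"blocked import during unpickling: {module}.{name}"
            )
        return super().find_class(module, name)

def safe_loads(data: bytes):
    return PolicyUnpickler(io.BytesIO(data)).load()

# Benign data loads normally; a blob that tries to resolve os.system
# (as in the earlier example) raises UnpicklingError instead of executing.
print(safe_loads(pickle.dumps({"weights": [0.1, 0.2, 0.3]})))
```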

In evaluations, PickleBall:

  • correctly loaded 79.8% of benign models in their dataset
  • rejected 100% of malicious models
  • introduced only about 1.75% overhead, which is negligible for most workflows

By contrast, the team found that:

  • some model scanners failed to detect known malicious models, and
  • a leading secure loader blocked significantly more benign models than PickleBall, making it less usable in practice.

Why this matters for open-source AI

For open-source AI developers and startups, the implications are significant:

1. Model hubs are part of your attack surface

The research reinforces what security teams have been warning: model hubs are the new package registries. Just like npm or PyPI, they are attractive targets for attackers trying to slip malicious content into widely used ecosystems.

2. “Safe because it’s popular” is a dangerous assumption

Popularity, downloads and star counts don’t guarantee safety. Some malicious models have already bypassed basic scanning in well-known AI repositories.

3. Secure loading should become a default practice

Purdue’s work suggests that secure loaders — not just static scanners — should be part of standard AI engineering practice, especially in:

  • MLOps pipelines
  • enterprise AI products
  • SaaS platforms that allow users to upload or plug in models

Practical implications for startups, teams and tool builders

For AI startups & SaaS platforms

  • Treat model loading as a security-critical operation, like executing untrusted code.
  • Integrate tools like PickleBall (or similar secure loaders; see the interim hardening sketch after this list) into:
    • internal inference services
    • user-provided model upload flows
    • CI/CD pipelines for AI services
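
If a dedicated secure loader is not yet wired into your stack, one interim hardening step for PyTorch-based services is the framework’s built-in restricted loading mode. The service structure and file path below are hypothetical; the weights_only flag is a real option in recent PyTorch releases, and it narrows, but does not remove, the need for scanning and provenance checks.

```python
import torch

# Hypothetical inference-service load path. weights_only=True tells PyTorch
# to use a restricted unpickler that only reconstructs tensors and other
# allow-listed types, refusing arbitrary Python objects in the checkpoint.
def load_checkpoint(path: str):
    return torch.load(path, map_location="cpu", weights_only=True)

state_dict = load_checkpoint("models/user_uploaded_model.pt")
```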

For open-source AI developers

  • Review your own projects’ use of pickle and other risky formats.
  • Prefer safer serialization formats where possible, but don’t assume they’re magically secure; the sketch after this list shows one option.
  • Document model-loading expectations clearly to help downstream users manage risk.
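
As one example of a safer format, the sketch below assumes the safetensors library and PyTorch tensors. safetensors stores raw tensor data plus a small JSON header, so loading it never involves unpickling arbitrary Python objects; the tensor names and file name are illustrative.

```python
import torch
from safetensors.torch import save_file, load_file

# Save weights as raw tensors instead of a pickled object graph.
weights = {"linear.weight": torch.randn(4, 8), "linear.bias": torch.zeros(4)}
save_file(weights, "model.safetensors")

# Loading reads tensor buffers directly; no Python code is executed.
restored = load_file("model.safetensors")
print(restored["linear.weight"].shape)  # torch.Size([4, 8])
```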

For security & platform teams

  • Add model repositories to your supply-chain risk assessments.
  • Consider policies where all downloaded models must:
    • be scanned
    • be loaded in sandboxed environments
    • go through secure deserialization layers

What this means for the future of open-source & community AI

The Purdue team frames PickleBall as a way to raise the security baseline of the entire model-sharing ecosystem, not a replacement for hubs or scanners.

If widely adopted, tools like this could:

  • allow open-source AI to remain vibrant and accessible
  • reduce the risk that a single malicious model compromises an entire organisation
  • shift cultural norms so that “secure loading” becomes as standard as “virtual environments” or “dependency pinning”

For the broader community, this is a sign that AI is leaving its “anything goes” phase and entering a maturity phase where security engineering and ML engineering must merge.


What happens next

PickleBall is currently a research prototype, but the authors position it as a drop-in, practical tool that AI teams can adopt without overhauling their entire stack.

The next steps likely include:

  • integration into major ML frameworks and MLOps platforms
  • collaboration with model hubs to improve default safety
  • follow-up research on protecting other formats and configuration-based attacks

As AI supply-chain incidents grow, similar tools may become part of the baseline expectations for any serious AI platform.


Sources

  • Purdue Engineering: “Purdue ECE researcher helps develop award-winning tool to make AI model sharing safer”
