LLMs in Physical Unclonable Function (PUF) Analysis
- Suhas Bhairav

- Aug 1
- 3 min read
Physical Unclonable Functions (PUFs) are hardware primitives that exploit manufacturing variations to generate unique, device-specific secrets. These functions are used in secure key generation, device authentication, and anti-counterfeiting in hardware security systems.

PUFs are attractive because they're “unclonable” — even the manufacturer cannot reproduce the same physical response. However, as with any security mechanism, PUFs need rigorous testing, modeling, and analysis to ensure they resist attacks, particularly modeling attacks.
Enter Large Language Models (LLMs) and Generative AI, which are beginning to play a key role in understanding, modeling, and even attacking or validating the robustness of PUFs in novel ways.
🧠 What Is a PUF?
A PUF maps challenges (inputs) to responses (outputs) based on the physical characteristics of a circuit. Examples include:
Ring Oscillator PUFs
Arbiter PUFs
SRAM PUFs
Butterfly PUFs
Each PUF instance is unique due to tiny variations in semiconductor manufacturing. Ideally, the mapping from challenge to response is hard to predict without physical access.
⚠️ The Threat: Modeling Attacks
Machine learning models have been used to approximate PUF behavior, especially for Arbiter PUFs, by learning from observed challenge-response pairs (CRPs) and building predictive models.
Traditional models include:
Logistic regression
Support vector machines (SVMs)
Neural networks
These raise the question: Can LLMs or transformer architectures also model or analyze PUFs — and do it better?
🤖 How LLMs Are Used in PUF Analysis
1. Modeling Attacks with Transformers
LLMs (or transformer-based models like BERT or GPT) can:
Take a sequence of challenges as input
Predict corresponding responses
Generalize to unseen challenges, especially for PUFs with a linear delay structure (e.g., Arbiter PUFs)
Such attacks simulate an adversary who, after collecting enough CRPs, can clone or impersonate a PUF without further access to the physical device.
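As a rough illustration of the idea (not the setup from any specific paper), the sketch below simulates an Arbiter PUF with the standard additive delay model and trains a tiny transformer encoder on its CRPs. The stage count, model size, and training schedule are arbitrary choices made only to keep the demo self-contained.

```python
# Minimal sketch of a transformer-based modeling attack on a *simulated*
# Arbiter PUF (assumes PyTorch; all hyperparameters are illustrative).
import torch
import torch.nn as nn

N_STAGES, N_CRPS = 32, 20000

# --- Simulate an Arbiter PUF with the standard additive delay model ---
torch.manual_seed(0)
weights = torch.randn(N_STAGES + 1)                    # random stage delays
challenges = torch.randint(0, 2, (N_CRPS, N_STAGES))   # random challenges
phi = torch.flip(
    torch.cumprod(torch.flip(1 - 2 * challenges.float(), dims=[1]), dim=1),
    dims=[1],
)                                                       # parity (phi) features
features = torch.cat([phi, torch.ones(N_CRPS, 1)], dim=1)
responses = (features @ weights > 0).long()             # 0/1 response bits

# --- Tiny transformer encoder: each challenge bit is a token ---
class CRPTransformer(nn.Module):
    def __init__(self, d_model=32, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(2, d_model)
        self.pos = nn.Parameter(0.02 * torch.randn(N_STAGES, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 2)                # predict the response bit

    def forward(self, c):
        h = self.encoder(self.embed(c) + self.pos)
        return self.head(h.mean(dim=1))                  # pool over the stages

model = CRPTransformer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

train_c, test_c = challenges[:16000], challenges[16000:]
train_r, test_r = responses[:16000], responses[16000:]

for epoch in range(3):                                   # short demo run
    for i in range(0, len(train_c), 256):
        loss = loss_fn(model(train_c[i:i+256]), train_r[i:i+256])
        opt.zero_grad()
        loss.backward()
        opt.step()

with torch.no_grad():
    acc = (model(test_c).argmax(dim=1) == test_r).float().mean()
print(f"Prediction accuracy on unseen challenges: {acc:.3f}")
```

High accuracy on held-out challenges is exactly the failure mode a modeling attack exploits: the "unclonable" mapping has been learned from data alone.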
2. Challenge Pattern Discovery
LLMs are exceptional at pattern extraction. They can analyze large CRP datasets and uncover:
Linearity
Redundancy in responses
Poor entropy or bias
This helps researchers identify weak PUF configurations.
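Whether or not an LLM is in the loop, any pattern it claims to find should be confirmed with plain statistics. A minimal, hypothetical screening helper for response bias, response entropy, and per-bit correlation might look like this:

```python
# Rough sketch (hypothetical helper, not from this article) of a statistical
# screen for CRP datasets: response bias, response entropy, and the correlation
# between individual challenge bits and the response.
import numpy as np

def screen_crps(challenges: np.ndarray, responses: np.ndarray) -> dict:
    """challenges: (N, n_bits) in {0,1}; responses: (N,) in {0,1}."""
    bias = responses.mean()                        # ideal value is 0.5
    # Large per-bit correlations hint at linearity that modeling attacks exploit.
    centered_c = challenges - challenges.mean(axis=0)
    centered_r = responses - bias
    corr = centered_c.T @ centered_r / (
        np.linalg.norm(centered_c, axis=0) * np.linalg.norm(centered_r) + 1e-12
    )
    # Shannon entropy of the response bit, in bits (1.0 is ideal).
    p = np.clip(bias, 1e-12, 1 - 1e-12)
    entropy = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return {"bias": float(bias),
            "entropy_bits": float(entropy),
            "max_abs_bit_correlation": float(np.abs(corr).max())}

# Example with random (i.e., ideally unpredictable) CRPs:
rng = np.random.default_rng(0)
print(screen_crps(rng.integers(0, 2, (5000, 64)), rng.integers(0, 2, 5000)))
```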
3. PUF Behavior Explanation
Given raw CRP logs or error patterns:
“What might be causing these unstable responses in the PUF?”
LLMs can suggest:
Environmental drift (voltage/temp)
Aging effects
Setup errors
Structural PUF weaknesses
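One way to pose that kind of diagnostic question programmatically is sketched below. It assumes the OpenAI Python client; the model name, summary text, and prompt wording are placeholders rather than a recommended setup.

```python
# Sketch (assumed setup, not from this article) of handing a summary of
# unstable CRP behavior to an LLM for a diagnostic explanation.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

unstable_summary = (
    "Challenge 0x3A7F flips its response in 18% of repeated reads; "
    "the flip rate rises as die temperature goes from 25C to 70C."
)

reply = client.chat.completions.create(
    model="gpt-4o",  # assumed model choice
    messages=[
        {"role": "system",
         "content": "You are a hardware security analyst reviewing PUF data."},
        {"role": "user",
         "content": f"What might be causing these unstable responses in the PUF? {unstable_summary}"},
    ],
)
print(reply.choices[0].message.content)
```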
4. Synthetic CRP Generation
Generative models can create synthetic challenge-response datasets to:
Train hardware emulators
Perform stress testing
Benchmark modeling attack resilience
This is useful for researchers working in simulation rather than with real silicon.
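As one illustration (a simple XOR-Arbiter delay model with Gaussian measurement noise, chosen only to keep the example self-contained), synthetic CRPs can be generated and written out in a JSON-lines format like the one used in the workflow below. The file name and field names are illustrative.

```python
# Sketch: generate noisy synthetic CRPs from a simulated XOR-Arbiter PUF and
# dump them as JSON lines for emulators or modeling-attack benchmarks.
import json
import numpy as np

rng = np.random.default_rng(42)
N_BITS, N_XOR, N_CRPS = 64, 4, 10000

weights = rng.normal(size=(N_XOR, N_BITS + 1))            # per-arbiter delays
challenges = rng.integers(0, 2, size=(N_CRPS, N_BITS))

# Parity (phi) features of the standard additive delay model.
phi = np.fliplr(np.cumprod(np.fliplr(1 - 2 * challenges), axis=1))
features = np.hstack([phi, np.ones((N_CRPS, 1))])

# Each arbiter decides by the sign of its (noisy) delay difference;
# the XOR-PUF response is the XOR of the individual arbiter bits.
noise = rng.normal(scale=0.05, size=(N_CRPS, N_XOR))
bits = (features @ weights.T + noise) > 0
responses = np.bitwise_xor.reduce(bits.astype(int), axis=1)

with open("synthetic_crps.jsonl", "w") as f:
    for c, r in zip(challenges, responses):
        record = {"challenge": "".join(map(str, c)), "response": str(r)}
        f.write(json.dumps(record) + "\n")
```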
🛠️ Sample Workflow
Collect CRP dataset from the target PUF
Format for LLM ingestion:
{"challenge": "0010110101", "response": "1"}
Feed to a transformer model to:
Predict future CRPs
Score entropy
Identify instability or bias
Ask GPT-4 for insight:
“Analyze this sequence of 1000 CRPs and identify if there’s any linearity or predictability.”
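Putting the last two steps together, a minimal sketch might load the JSON-lines CRPs (here reusing the hypothetical synthetic_crps.jsonl file from the earlier example), fold the question above into a prompt, and send it to an OpenAI model. The model name is an assumption, and any patterns the model reports should still be verified statistically.

```python
# Sketch of the final workflow steps: load CRPs in the JSON-lines format shown
# above and ask an LLM to look for linearity or predictability.
import json
from openai import OpenAI

with open("synthetic_crps.jsonl") as f:
    crps = [json.loads(line) for line in f][:1000]        # take 1000 CRPs

crp_text = "\n".join(f"{c['challenge']} -> {c['response']}" for c in crps)
prompt = ("Analyze this sequence of 1000 CRPs and identify if there's any "
          "linearity or predictability.\n\n" + crp_text)

client = OpenAI()  # expects OPENAI_API_KEY in the environment
reply = client.chat.completions.create(
    model="gpt-4o",                                        # assumed model choice
    messages=[{"role": "user", "content": prompt}],
)
print(reply.choices[0].message.content)
```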
🔐 Applications
Security Validation: Use LLMs to simulate attacks and improve PUF robustness before deployment.
Embedded Debugging: Explain unstable responses from a failing or aging chip.
Design Assistance: Suggest optimal configurations (e.g., number of stages, XORs) for strong PUF entropy.
Synthetic Data Creation: Train other models using GPT-generated CRP data for experimentation.
🧪 Research Potential
Recent papers have explored using transformer models as part of PUF modeling pipelines, achieving higher accuracy than classical ML in some cases.
LLMs can handle long-range dependencies, which helps in modeling composite or XOR PUFs.
With transfer learning, LLMs can adapt across different PUF architectures.
Hybrid approaches (LLMs combined with entropy metrics) can score the robustness and uniqueness of a PUF design.
⚠️ Challenges
Data Volume: Training LLMs requires large CRP datasets, which may not be feasible to collect for all PUF types.
Computation Cost: Transformer training and inference can be heavy for embedded environments.
False Confidence: LLMs can hallucinate patterns that don’t exist. Proper statistical validation is crucial.
Security Risks: LLM-based modeling attacks may help attackers clone weak PUFs — highlighting the need for strong entropy and non-linearity.
🔮 Future Directions
LLM agents that assist PUF designers by simulating attacks and validating strength.
Explainable PUF analyzers that describe how predictable or cloneable a PUF may be.
Self-healing firmware that uses LLMs to detect when PUFs are degrading or producing unstable outputs.
RAG-based PUF security advisors, integrating AI with academic and real-world PUF threat data.
✅ Conclusion
Generative AI and LLMs are powerful tools in the PUF ecosystem — not just for simulating attacks, but also for strengthening design, analyzing behavior, and ensuring long-term robustness. As PUFs become central to IoT, embedded security, and authentication at scale, AI-powered analysis will be key to maintaining trust in unclonable hardware identities.


