How LLMs compare to other models

Mohamed Amine Ferrag, PhD

Associate Professor of AI & Cybersecurity | PhD and HDR degrees | IEEE IoTJ Editor

❓ How do we evaluate the cybersecurity knowledge of LLMs? Introducing CyberMetric-80, CyberMetric-500, CyberMetric-2000, and CyberMetric-10000: comprehensive Q&A benchmark datasets designed for this purpose. 📚 Using LLMs and RAG, we created these datasets from NIST standards, research papers, and more, with more than 200 hours of expert validation. We tested 25 top LLMs and involved 30 human participants in CyberMetric-80. 🔍 Results: GPT-4o, GPT-4-turbo, Mixtral-8x7B-Instruct, Falcon-180B-Chat, and GEMINI-pro 1.0 excelled, outperforming human participants, though experienced experts still surpassed smaller models like Llama-3-8B. 📖 Read the full paper here: https://lnkd.in/eaJvbmSK #Cybersecurity #AI #LLM #CyberMetric #Innovation
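For readers who want a feel for how a Q&A benchmark of this kind is typically scored, here is a minimal sketch. It assumes each item is a multiple-choice question with lettered options and a single correct letter. The `Item` class, the `accuracy` function, and the two sample questions are hypothetical illustrations, not the CyberMetric schema, data, or evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One multiple-choice question (hypothetical layout, not the CyberMetric format)."""
    question: str
    options: dict[str, str]  # option letter -> option text
    answer: str              # letter of the correct option


def accuracy(items: list[Item], predictions: list[str]) -> float:
    """Return the fraction of predictions that match the reference answers."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    # Normalize whitespace and case so that " c" and "C" count as the same answer.
    correct = sum(
        1
        for item, pred in zip(items, predictions)
        if pred.strip().upper() == item.answer.strip().upper()
    )
    return correct / len(items)


# Made-up sample items for illustration only.
sample_items = [
    Item(
        question="Which TCP port does HTTPS use by default?",
        options={"A": "21", "B": "80", "C": "443", "D": "25"},
        answer="C",
    ),
    Item(
        question="Which security property does encryption primarily protect?",
        options={"A": "Availability", "B": "Confidentiality",
                 "C": "Non-repudiation", "D": "Redundancy"},
        answer="B",
    ),
]

# Answers as they might come back from a model or a human participant.
sample_predictions = ["C", "A"]

print(f"accuracy = {accuracy(sample_items, sample_predictions):.2f}")
```

Scoring every model and every human participant with the same function on the same question set is what makes a comparison like the one above possible.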


The paper is very interesting; thank you for sharing. I would like to add that there are existing AI-powered security LLMs that help cybersecurity analysts detect threats earlier, respond faster, and stay ahead of attacks. One example is Purple AI, offered by SentinelOne. What additional benefits do you think training ChatGPT would bring, given that AI-powered security LLMs already exist?

Mohamed Rihan

Associate Professor of Wireless Communications | Editorial board member of EURASIP Journal on Wireless Communications and Networking | Former Marie Curie Postdoctoral Research Fellow | Senior Member, IEEE.

7mo

Impressive

Ashfaaq Farzaan

AI/ML in Cyber Security | 🚀 Building LLM & GenAI Applications for Cybersecurity

7mo

Very informative 🙌🏼
