When we are testing the A.I.

Posted: Tue Jun 03, 2025 5:35 pm
by vmk1oc
Suppose the A.I. says: given that input, the chance is 23.5% that there is an Earth-like planet at that location in the universe.

How are we going to verify an A.I.'s output? Talk with us about this subject. :arrow:
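
One possible angle on a claim like "23.5%" is calibration: a single probability cannot be checked on its own, but over many cases with known answers, things predicted at about 23.5% should turn out true about 23.5% of the time. Below is a minimal sketch of that check in plain Python. The function names and the example numbers are made up for illustration and are not from any particular tool.

```python
from collections import defaultdict


def brier_score(probs, outcomes):
    """Mean squared gap between predicted probability and the 0/1 outcome.

    0.0 is perfect; always answering 0.5 scores 0.25.
    """
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_table(probs, outcomes, n_bins=10):
    """Group predictions into equal-width probability bins.

    Returns one (mean predicted probability, observed rate, count)
    tuple per non-empty bin, ordered from low to high probability.
    """
    bins = defaultdict(list)
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    table = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        rate = sum(y for _, y in pairs) / len(pairs)
        table.append((mean_p, rate, len(pairs)))
    return table


# Made-up data: predicted probabilities and what actually happened (1 = true).
example_probs = [0.2, 0.25, 0.2, 0.25, 0.9, 0.8]
example_outcomes = [0, 0, 1, 0, 1, 1]

print("Brier score:", brier_score(example_probs, example_outcomes))
for mean_p, rate, count in calibration_table(example_probs, example_outcomes):
    print(f"predicted ~{mean_p:.2f}, observed {rate:.2f}, n={count}")
```

If the observed rate in a bin sits far from the mean predicted probability, the system is over- or under-confident in that range. This only works where ground truth exists for a reasonable number of cases, which is exactly the hard part for something like exoplanets.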

Re: When we are testing the A.I.

Posted: Mon Jul 14, 2025 4:33 pm
by vmk1oc
Subject: Great Resources That Explain How AI Is Tested

Looking for clear, reliable information on how Artificial Intelligence (AI) is tested? Here’s a curated list of top websites that explain AI testing methods in detail, from technical benchmarking to real-world evaluation and ethical testing.

🧪 In-depth Technical Resources:

1. Papers with Code – Evaluation
Shows how AI models are evaluated using datasets, metrics (accuracy, F1, BLEU, etc.), and academic benchmarks. Great for understanding how AI testing is done in research.

2. DeepLearning.AI Blog & Courses
Covers error analysis, bias detection, overfitting, and data leakage testing. Ideal for learners and practitioners alike.

3. MosaicML Blog
Focuses on how to test large language models (LLMs) in practice. Includes examples of robustness, hallucination, and edge-case testing.
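
To make the metrics named in item 1 a bit more concrete, here is a small sketch of accuracy and F1 computed by hand in plain Python. The helper names and the toy labels are illustrative assumptions; in practice a library such as scikit-learn would normally be used instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive and t != positive:
            fp += 1
        elif p != positive and t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy, imbalanced labels: 3 positives out of 8, the model finds only one.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
```

On imbalanced data like this, accuracy can look respectable while F1 shows that most positives were missed, which is one reason benchmarks report more than one metric.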

🧰 Practical Tools & Frameworks for Testing AI:

4. MLflow
An open-source platform for logging, testing, and tracking machine learning models. Useful when you're training your own models.

5. Great Expectations
Validates the consistency of input data pipelines, a key aspect of reliable AI systems.

6. CheckList by Microsoft Research
A framework for testing NLP models on robustness, consistency, fairness, etc. Includes both manual and automated test cases.
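
The kind of robustness and fairness check mentioned in item 6 can be sketched as an invariance test: change something that should not matter (here, a person's name) and check that the prediction stays the same. The sketch below is plain Python and does not use the CheckList library's own API. `toy_sentiment`, the name list, and the template are hypothetical stand-ins for a real model and real test data.

```python
import re

# Hypothetical fillers: swapping these names should not change the prediction.
NAMES = ["Alice", "Bob", "Fatima", "Chen"]


def toy_sentiment(text):
    """Hypothetical stand-in for a real model: keyword-based sentiment."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if words & {"great", "love", "excellent"}:
        return "positive"
    if words & {"terrible", "hate", "awful"}:
        return "negative"
    return "neutral"


def invariance_failures(model, template, fillers):
    """Fill the template with each filler and compare against the first one.

    Returns the fillers whose prediction differs from the baseline;
    an empty list means the model passed this invariance test.
    """
    baseline = model(template.format(name=fillers[0]))
    return [
        filler
        for filler in fillers[1:]
        if model(template.format(name=filler)) != baseline
    ]


template = "{name} said the service was great."
print("failures:", invariance_failures(toy_sentiment, template, NAMES))
```

A real test suite would run many templates and many perturbation types (typos, negation, paraphrases) and report a failure rate per test rather than a single pass or fail.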

🧠 Ethics & Broader AI Testing Concerns:

7. Partnership on AI
Publishes papers and guides on how to evaluate AI ethically, including bias, transparency, and trustworthiness.

8. AI Testing Manifesto (Fraunhofer Institute)
Outlines practical quality criteria for safe and robust AI systems, such as those used in automotive or industrial settings.

✅ Bonus – Dutch Context:

9. TNO – AI Testing & Trust (Netherlands)
Explains how to build and test AI systems with a focus on Dutch and EU regulations. Also discusses certification and compliance with the EU AI Act.