notebook: Document an agentic AI system #472

validbeck · 2026-01-28T22:43:24Z

Pull Request Description

What and why?

sc-12466

In accordance with our writing style guide, I renamed, edited, and restructured the contents of the "AI Agent Validation with ValidMind - Banking Demo" notebook to: Document an agentic AI system
I also added the Agentic AI template as a notebook artifact because it isn't a default out-of-the-box template: agentic_ai_template.yaml

How to test

Pull down this PR: gh pr checkout 472
Make sure the newly added StepEfficiency scorer is available in your library environment by registering a new Python kernel
Run notebooks/code_samples/agents/document_agentic_ai.ipynb
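
Before running the notebook, a quick check along these lines can confirm that the kernel sees the scorer. This is only a minimal sketch: the module path is inferred from the file path validmind/scorers/llm/deepeval/StepEfficiency.py in this PR rather than from documented usage, and `scorer_available` is a hypothetical helper name, not part of the library.

```python
import importlib.util

# Module path inferred from the file added in this PR (an assumption,
# not a documented import path).
SCORER_MODULE = "validmind.scorers.llm.deepeval.StepEfficiency"


def scorer_available(module_name: str = SCORER_MODULE) -> bool:
    """Return True if the module can be located in the active environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (for example validmind itself) is missing.
        return False


if __name__ == "__main__":
    print("StepEfficiency scorer found:", scorer_available())
```

If this reports that the scorer is not found, the kernel is most likely pointing at an environment that doesn't include this branch.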

What needs special review?

The notebook runs end-to-end without issues in my environment, but please check that everything looks fine to you as well.
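
One rough way to confirm a clean run is to scan the saved notebook for error outputs after running it. The sketch below assumes the standard nbformat v4 JSON layout (a top-level `cells` list, with code cells carrying an `outputs` list); `find_error_cells` is a hypothetical helper name, and the check is only meaningful if the notebook was saved after execution.

```python
import json
from pathlib import Path


def find_error_cells(notebook_path: str) -> list[int]:
    """Return indices of code cells whose saved outputs include an error."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    failed = []
    for index, cell in enumerate(notebook.get("cells", [])):
        # Markdown and raw cells never carry outputs, so skip them.
        if cell.get("cell_type") != "code":
            continue
        outputs = cell.get("outputs", [])
        if any(out.get("output_type") == "error" for out in outputs):
            failed.append(index)
    return failed
```

For example, `find_error_cells("notebooks/code_samples/agents/document_agentic_ai.ipynb")` should return an empty list after a successful run.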

General topic and test wording

I understand how to use the library functions, how to run tests, and so on very well at this point, but I don't necessarily know why we're choosing to run the tests that we do, so please make sure that my descriptions are accurate and relevant.

Assigning AI evaluation metric scores section

Important

Please check that the following are all correct and do what we want them to do for the reasons outlined in the notebook:

I added some information here on returning scorers related to the assignments we want to make.
I also added the following execution layer test, which was missing according to this conversation in Fix langgraph version in the agentic demo notebook #471 (comment): StepEfficiency.py

Dependencies, breaking changes, and deployment notes

Refer to the line about StepEfficiency.py above.

Release notes

Learn how to build and document an agentic AI system with the ValidMind Library with our new notebook. Construct a LangGraph-based banking agent that selects and invokes tools in response to user requests. You'll assign AI evaluation scores to your agent, run accuracy, RAGAS, and safety tests, and log the results of your tests to the ValidMind Platform.

Document an agentic AI system

Checklist

validmind/scorers/llm/deepeval/__init__.py

validmind/scorers/llm/deepeval/StepEfficiency.py

validmind/tests/__types__.py

notebooks/code_samples/agents/document_agentic_ai.ipynb

AnilSorathiya

minor comment:
We have removed the StepEfficiency scorer from the code due to a bug. The main branch doesn't have it.

Otherwise it looks good to me. Thanks 👍

validbeck · 2026-01-29T19:55:35Z

@AnilSorathiya It works and passes the code quality test in the current version:

github-actions · 2026-02-02T14:31:15Z

PR Summary

This pull request introduces a new YAML template (agentic_ai_template.yaml) that defines comprehensive guidelines for documenting agentic AI systems. The template is structured into multiple sections including conceptual soundness, data evaluation, model evaluation, and observability and monitoring. Each section comprises detailed guidelines (with examples and hierarchical parent section references) aimed at enabling users to document features such as autonomy, reasoning, memory, risk management, regulatory compliance, and more.

Additionally, a new Jupyter Notebook (document_agentic_ai.ipynb) has been added. This notebook provides step‐by‐step instructions to build and document an agentic AI system using the ValidMind Library. It includes detailed markdown explanations, code cells for installing dependencies, initializing the ValidMind environment, building and testing agent workflows, and running validation tests. The notebook guides users to verify LLM API access via environment variable configuration, integrate banking tools, bind the tools to the agent, and finally to capture test results including AI evaluation metrics.

Significantly, a legacy notebook (langgraph_agent_simple_banking_demo.ipynb) has been removed, likely because its functionality is being replaced or superseded by the new, more comprehensive documentation template and notebook. Overall, the PR refactors the documentation and testing approach for agentic AI systems by providing a structured template and modernizing the developer guides.

Test Suggestions

Run a YAML linter to ensure the syntax and formatting of agentic_ai_template.yaml are correct.
Execute the new document_agentic_ai.ipynb notebook cell-by-cell to verify that all code cells run without errors.
Perform integration tests to check that the ValidMind library correctly picks up the new template for model documentation.
Verify that the removal of the legacy banking demo notebook does not break any external references or dependencies.
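
The last suggestion could be partly covered by scanning the repository for leftover mentions of the removed notebook. The following is a minimal sketch under stated assumptions: `find_references` and the set of file suffixes are hypothetical choices for illustration, and the scan only catches references inside the checked-out repository, not links from external sites.

```python
from pathlib import Path

# File name of the notebook removed in this PR.
LEGACY_NAME = "langgraph_agent_simple_banking_demo.ipynb"
# Suffixes treated as text files worth scanning (an illustrative choice).
TEXT_SUFFIXES = {".md", ".qmd", ".py", ".ipynb", ".yaml", ".yml", ".txt", ".rst"}


def find_references(root: str, needle: str = LEGACY_NAME) -> list[tuple[str, int]]:
    """Return (file path, line number) pairs for lines that mention needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        # Skip version-control internals.
        if ".git" in path.parts:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), line_number))
    return hits
```

Running `find_references(".")` from the repository root and getting an empty list would suggest that no in-repo links still point at the legacy notebook.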

validbeck added 30 commits January 26, 2026 11:27

Creating new version of agentic AI notebook

6024baf

Edit: Intro

570cebf

Save point

08ae74d

Save point

49ebc39

Save point

fdaec52

Building agent - Test banking tools

9cbe8d1

Save point

8e796cc

Save point

d4ee437

Building agent - Create agent

4a442f1

Save point

319df40

Save point

8fb6df0

Save point

ae9c3b9

Save point

569045e

Save point

18d702c

Save point

6ae8866

Clarifying OpenAI access

bc188bb

Save point

2f7bca0

Save point

054f0ac

Setup - Running tests

001d8b6

Save point

1114efd

Save point

a168ec4

Save point

2df63b7

Save point

f55f099

Applying some of Anil's changes to the edited cells

83dbc8a

Setup rest of headings

4b55be1

Save point

762c442

Save point

c8e5dbe

Save point

dc48a53

Save point

f50772e

Running evaluation tests — Custom response accuracy

c35683e

validbeck added 5 commits January 28, 2026 13:30

Running safety tests

49fe39c

Cleaning up intro

384e70e

Save point

e7e1b0e

Cleanup: Next steps

93ecd47

Removing old notebook & adding toc

15879de

validbeck self-assigned this Jan 28, 2026

validbeck added the documentation and enhancement labels Jan 28, 2026

validbeck added 3 commits January 28, 2026 14:52

Pulling in from main

435b843

Running make copyright

9053cd3

Removing whitespaces from StepEfficiency.py

8ea7f00

validbeck requested review from AnilSorathiya, cachafla, juanmleng and nrichers January 29, 2026 00:24