What does the evaluation actually cover?

The full AI system — not just the model. Data flows, preprocessing, business rules, user interface, human oversight, and real-world outcomes. Scope is aligned to the risk dimensions most relevant to your system.

How long does it take?

A typical evaluation takes anywhere from a few days to a couple of weeks. The exact timeline depends on scope and the depth of data access. That time is not a limitation — it is what makes the evaluation meaningful.

Is your evaluation automated or manual?

Both. We use proprietary methodology and tooling developed over more than a decade, which makes us significantly faster than a purely manual approach. But every evaluation is adapted to the specific system. You may also see this work referred to as an AI audit — the terms are often used interchangeably, but we prefer evaluation because it more accurately describes what we do.

What access do you need from us?

We work with whatever access is available and always aim to go as deep as possible. Before we begin, we agree the scope together — including what data and documentation you can share. The depth of the evaluation is directly proportional to the depth of access.

What is the difference between the evaluation and ongoing monitoring?

The evaluation is always the starting point: a structured assessment of your system as it currently performs in production. Monitoring is a continuation — we track the metrics we identified together and alert you when something changes materially. Monitoring is not available as a standalone service.

Is this compatible with our existing compliance or GRC processes?

Yes, and it goes further. GRC platforms tell you what processes you have in place. Independent evaluation tells you whether your AI system actually performs as those processes intend — in production, with real data. They are complementary, not competing. This includes compliance with the EU AI Act.

Your AI system, evaluated by independent experts

We independently assess how AI systems behave in the real world, giving you credible assurance that your AI is safe and dependable in operation.

Who this is for

Developers

You build AI systems.

Your clients trust that they work responsibly.

We give you the evidence to back that up.

Deployers

You deploy AI built by someone else.

You are still accountable for what it does in your operations and how it impacts your users, regardless of who built it.

AI System Risk Management Process

Step 0: Forward Deployment

Get our experts to advise your team at earlier stages of the design or development process, so you can minimize risks from the outset.

Step 1: First Evaluation

Get a clear picture of the risks your system is exposed to, in language that board members or investors can understand and that your team can act on to mitigate those risks.

Step 2: Automated Monitoring

After the evaluation, we track agreed metrics at defined intervals, depending on the risk profile of your system. We alert you to any drift and interpret what it means.

Step 3: New Evaluations

An evaluation, or audit, is only valid for a specific period, because AI systems evolve over time. We regularly provide a new full evaluation guided by our experts.

Step 1: First Evaluation


Our experts listen to your concerns, whether you are worried about bias, hallucination, ROI, or liability, are curious about environmental impact, or just want insight to make sure your AI systems are performing as expected. Then, drawing on our expertise in where the biggest risks lie for your particular system and on your additional requirements, we build the risk matrix and a data schema.

With your input, we assign metrics, benchmarks, and AI judges to scan your production data and quantify risks, diagnose risk sources, and suggest mitigation strategies.

Most importantly, you get an AI leaflet with a standardized scorecard. You can share it with users, clients, investors, regulators, and partners in the form of an Eticas.ai Seal to show your commitment to AI accountability.

Dashboard including last read values and evolution
Alerts when metrics drift beyond agreed thresholds
Remediation guidance when issues surface: not just a flag, but a path forward

At the end of the evaluation

Risk picture in board language: What the system is doing, where it could fail, and how serious the exposure is; translated out of technical language
Risk dynamics dashboard: Visualization of risk levels across the metrics that matter for your specific system
Impact insight: How the system is affecting your business, your staff and your learners. Not just whether it works, but who it works for
Mitigation pathways: Prioritized recommendations for each risk surfaced, with clear actions the team can implement
Monitoring-ready benchmarks: Metrics and thresholds agreed during evaluation, ready to carry forward into continuous monitoring
The Eticas.ai Leaflet: A standardized scorecard shareable with clients, investors, regulators, and partners as evidence of AI accountability

Step 2: Monitoring


Based on risk metrics and the data schema, our platform is configured to automatically flag performance drifts and changes in model or user behavior. The frequency depends on your risk profile: our platform can check every minute or every month. You and your team get access to:

Sandbox to test potential impact of system changes and request a new evaluation
Evaluation trail to present as proof of audit
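
The drift flagging described in this step can be pictured as comparing each tracked metric against the baseline and tolerance agreed during the evaluation. The following is a minimal Python sketch of that idea only. The `Benchmark` class, the `find_drift` function, the metric names, and all numeric values are hypothetical illustrations and are not taken from the Eticas platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    """Baseline and tolerance agreed during the evaluation (hypothetical values)."""
    metric: str
    baseline: float
    tolerance: float  # maximum allowed absolute deviation from the baseline


def find_drift(benchmarks, latest_values):
    """Return one alert per metric whose latest value left the agreed band."""
    alerts = []
    for b in benchmarks:
        value = latest_values.get(b.metric)
        if value is None:
            # A metric that stopped reporting is itself worth flagging.
            alerts.append(f"{b.metric}: no recent value")
            continue
        deviation = abs(value - b.baseline)
        if deviation > b.tolerance:
            alerts.append(
                f"{b.metric}: {value:.2f} deviates {deviation:.2f} "
                f"from baseline {b.baseline:.2f} (tolerance {b.tolerance:.2f})"
            )
    return alerts


# Hypothetical metrics and readings, for illustration only.
benchmarks = [
    Benchmark("approval_rate_gap", baseline=0.02, tolerance=0.03),
    Benchmark("answer_accuracy", baseline=0.91, tolerance=0.05),
]
latest = {"approval_rate_gap": 0.08, "answer_accuracy": 0.90}

for alert in find_drift(benchmarks, latest):
    print(alert)
```

In this sketch only `approval_rate_gap` is flagged, because it moved further from its baseline than the agreed tolerance. A check like this could be scheduled as often as the risk profile requires, from every minute to every month.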

Step 3: New Evaluation


As your systems become more complex, we revisit risks, metrics and benchmarks with you, and update the data schema as appropriate.

In addition to everything that comes with any evaluation (as per step 1), you get:

Overall, you have an up-to-date picture of how your AI system is performing in your specific context, control over impacts on revenue, operators, users and compliance, as well as defensible assurance and safety controls.

Over time, you make better decisions, with the right AI, at the right price - and it shows.

Data flows

Model & AI components

Business rules & UX

Human oversight

Organisational context

Real-world outcomes

Why Eticas

Independent

With no interest in the outcome, we objectively quantify any risk.

Socio-technical

We evaluate the full system: data, model, business rules, software, people and process.

Powered by Tech

Proprietary methodology and tech built over more than a decade of auditing systems in production.

Guided by Experts

Evaluating AI since 2012. We don’t stop at metrics. We give you the evidence and guidance to act.

FAQs

What does the evaluation actually cover?

The full AI system, not just the model. That means data flows, preprocessing, business rules, user interface, human oversight, and real-world outcomes. We scope every evaluation (also called an audit) around a structured set of risk dimensions, which we prioritize according to what matters most for your specific system.

How long does it take?

A typical evaluation takes between a few days and a couple of weeks, depending on scope and data access. That time is a feature, not a limitation: a genuine end-to-end audit of a production AI system requires understanding the specific system, the organization around it, and the context in which it operates. Our methodology and tooling make the process significantly faster than a purely manual approach, but full automation is incompatible with what an evaluation needs to do.

Is your audit automated or manual?

Both. We use a “Service as Software” model: a structured methodology and software that we have been developing for over a decade, including open-source risk-assessment libraries, make us faster and more consistent than a purely manual approach. But every evaluation is adapted to the specific system, its data, and its organizational context. Technology is key to executing the audits, but the expertise of our human auditors is critical for guidance and certification.

What access do you need from us?

While we always aim to go as deep as possible, we work with whatever access is available. Before we begin, we agree on the scope together: what data and documentation you can share, which system components we can test directly, and what we assess from outputs alone. The depth of the evaluation is proportional to the depth of access, and we are transparent about what each level of access allows us to certify.

What is the difference between evaluation and ongoing monitoring?

The evaluation is always the starting point, and it needs to be repeated regularly because AI systems evolve over time. Every evaluation involves working with your team and may lead to changes in, for example, the prioritized risk dimensions or the data schema. Monitoring is the continuation of an evaluation: we track the metrics and benchmarks identified and alert you when something changes. Monitoring is not available as a standalone service: it depends on the evaluation to establish what to measure and what the thresholds should be.

Is this compatible with our existing compliance or GRC processes?

Yes, and it goes further. GRC platforms help you record what processes you have in place and how they match the processes required by existing standards (like ISO/IEC 42001 or SOC 1/2) and regulations (like the EU AI Act). Independent evaluation tells you whether your AI system actually performs as those processes intend in production, with real data. The two are complementary, not competing.

Real clients. All verticals. Real impact.
