
Frequently asked questions

The questions we hear most when someone starts using ArtificialQA. If yours isn't here, contact us via artificialqa.com.

General

What exactly is ArtificialQA?

A platform to test, evaluate, and monitor the quality of AI agents. You generate cases, run them against your real AI agent, evaluate them with deterministic asserts and 17 LLM evaluators, and get auditable reports.

How does it differ from traditional testing?

Traditional testing assumes input → exact output. AI agents break that premise: the same question can have multiple valid answers. ArtificialQA is designed specifically to evaluate non-deterministic responses by combining hard rules with qualitative LLM-based evaluation.

Do I need to know how to code?

No, not to use the platform end-to-end. AI generation, browser connection, suite creation, and evaluation are all done through the UI. Coding helps if you want to use the REST API, write regex or JSON Schema asserts, or integrate runs into your CI.
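If you do want to go beyond the UI, a deterministic assert boils down to checks like the ones below. This is only an illustration: the response text, field names, and patterns are made up, but the idea (exact-structure and regex checks on your AI agent's real output) is the same.

```python
import json
import re

# Hypothetical raw response from an AI agent's HTTP endpoint.
response = '{"status": "resolved", "ticket_id": "TCK-4821"}'

# Deterministic asserts: parse, then check structure and formats.
data = json.loads(response)                            # must be valid JSON
assert data["status"] in {"resolved", "escalated"}     # allowed values
assert re.fullmatch(r"TCK-\d{4}", data["ticket_id"])   # ID format check
print("deterministic asserts passed")
```

Unlike an LLM evaluator, these checks either pass or fail the same way every run, which is why they pair well with qualitative scoring.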

Account and plans

Do I need a credit card to start?

No. The Free plan doesn't require a credit card and has no time limit.

Can I cancel anytime?

Yes. Cancellation takes effect at the end of the current billing cycle. No minimum commitment.

How do I switch plans?

For now, plan changes are assisted: contact us via artificialqa.com or our contact email and we'll coordinate the upgrade/downgrade and payment with you. Online payment and plan management are on the roadmap, not enabled yet.

Do you have discounts for startups or academia?

We evaluate case by case. Reach out to us.

AI generation

How good are the questions the AI generates?

It depends a lot on the additional context you provide. The more specific the context (what your AI agent does, what tone it should have, what critical cases concern you), the better the cases. That's why they land in a review view where you can edit each one before sending it to your catalog or a specific suite.

Which industries does it support?

15 industries: general, customer support, healthcare, finance, ecommerce, travel, telecom, education, legal, HR, SaaS, insurance, real estate, food, safety. Each with a tuned prompt.

Can I generate in Spanish and English?

Yes, both languages are supported.

AI agent connection

Does my AI agent have to be publicly accessible?

For Browser, the chat URL has to be reachable from our execution workers. For HTTP, the endpoint too. If your AI agent lives behind a VPN or IP allowlist, contact us to coordinate.

What if my AI agent has login?

In Browser, you define the Login Steps: the sequence of selectors and actions to authenticate before chatting. The platform runs them before each case.
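As a rough illustration (the field names and actions here are invented, not the platform's actual schema), a Login Steps sequence is essentially an ordered list of selector/action pairs the worker replays before the first message:

```python
# Hypothetical shape of a Browser connection's Login Steps.
# Secrets would come from your connection settings, never hardcoded.
login_steps = [
    {"action": "fill",     "selector": "#email",                "value": "qa-bot@example.com"},
    {"action": "fill",     "selector": "#password",             "value": "${SECRET_PASSWORD}"},
    {"action": "click",    "selector": "button[type='submit']"},
    {"action": "wait_for", "selector": ".chat-widget"},         # chat is ready
]
print(f"{len(login_steps)} login steps defined")
```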

Can I test multiple environments of the same AI agent?

Yes. Create multiple Agent Connections (one per environment: dev, staging, production) and build Test Plans against each.

Tokens and consumption

When are tokens consumed?

On AI generation, on LLM evaluation, on AI-powered enhanced reports (Enterprise), and when running Test Plans against AI agents with a Browser (Playwright) connection — because the platform uses an LLM (AI Locator) to dynamically detect the chat inside the page. When running against an HTTP/API connection, the call to your AI agent's own endpoint doesn't consume tokens.

What if I run out of tokens?

We notify you at 80% of your quota. When you hit 100%, token-consuming operations pause until the next cycle or an upgrade. Execution via HTTP/API connection remains available (no tokens consumed). If you need extra token packs on top of your current plan, contact us and we'll set it up.

Do unused Free tokens roll over?

No. Quotas refresh each month; they don't accumulate.

Evaluation

Do I always need to activate all 17 evaluators?

No. Activate only those that make sense for your use case. For general customer support, 5–7 evaluators usually suffice; for critical domains (health, finance) it pays to activate more, especially data_accuracy, hallucination, and security.

Can I trust an LLM evaluator's score?

Yes. The 17 evaluators come pre-calibrated by our team: we validate each one against reference datasets to make sure their scores are reliable before enabling them in production. More detail in Security & compliance.

Do evaluators always give the same score?

There's inherent variability when using LLMs. To reduce it we use stable prompts and low temperature, but the result is never 100% deterministic — that's by design when scoring language. Internal calibration ensures that variability stays within an acceptable margin.
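As a toy illustration with made-up numbers (the real scores come from the platform, and the 0.05 margin is an arbitrary example, not our calibration threshold), "within an acceptable margin" is a check like this:

```python
import statistics

# Simulated scores from re-running one LLM evaluator on the
# same response five times. Values are illustrative only.
scores = [0.82, 0.85, 0.83, 0.84, 0.82]

mean = statistics.mean(scores)
spread = max(scores) - min(scores)   # worst-case disagreement

# Treat a spread under 0.05 as acceptable for this example.
assert spread < 0.05
print(f"mean={mean:.3f} spread={spread:.3f}")
```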

Data and privacy

Where is data stored?

In cloud infrastructure with at-rest and in-transit encryption. If you have data residency requirements by region, we coordinate it in the Enterprise plan.

Do you use my data to train models?

No. Your organization's data is used exclusively to operate your instance of the platform. We don't feed AI models with customer data.

Can you delete my data if I stop using the platform?

Yes. Submit a deletion request and we'll proceed per the timelines in our policy. For Enterprise customers we formalize this in the DPA.

Integrations

Do you have Slack/Jira integration?

Not natively today. It's on the roadmap. In the meantime, you can build the bridge with the REST API.
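As a sketch of such a bridge (the run-summary fields are assumptions for illustration, not the documented API schema), you could turn a Test Plan result into a Slack incoming-webhook payload:

```python
import json

# Hypothetical run summary, e.g. fetched from the REST API.
run = {"plan": "checkout-bot staging", "passed": 42, "failed": 3}

# Slack incoming webhooks accept a JSON body with a "text" field.
payload = {
    "text": f"ArtificialQA run '{run['plan']}': "
            f"{run['passed']} passed, {run['failed']} failed"
}

body = json.dumps(payload)
# POST `body` to your Slack incoming-webhook URL with any HTTP client.
print(body)
```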

Do you have an SDK?

No official SDKs yet. The REST API is available (Enterprise plan) and follows standard HTTP, so it works from any language that can make HTTP requests.

If none of this answers your question

Get in touch via artificialqa.com. Real user questions are what end up shaping this section.