Why AI development in 2026 looks different
Two years ago, AI development meant training your own model or pasting a chatbot into a website. Neither holds up today. LLMs are commodity infrastructure now — the differentiation has moved from the model to the architecture around it: how AI accesses your data, where it touches your existing processes, how decisions stay auditable. AI development in 2026 is an architecture discipline — and that is where we work.
What we mean by AI development. We do not build ChatGPT wrappers. We build custom AI solutions that integrate with your data model, your auth layer, and your audit trail:
- AI agents for defined workflows: multi-step procedures where the AI calls tools, checks intermediate results, and logs each step
- RAG systems on your knowledge: answers come from your documents, databases, and policies — with source citation, not from a vendor's training corpus
- Custom machine learning for your domain: classification, forecasting, anomaly detection — where a generic LLM is too vague
- Aligned with GDPR and nDSG: EU and Swiss hosting, open-source LLMs on request, data minimisation at every layer
- Measurable impact instead of AI theatre: we define before project start which metric must move — handling time, classification accuracy, share of tickets resolved
- Integration with your stack: ERP, CRM, databases, and internal APIs are connected through an auth and audit layer — not via copy-paste
What that looks like in practice: a RAG system over 12,000 internal PDFs replaces Confluence search. A classification model triages incoming service tickets before a human touches them. An AI agent drafts quotes — your team still does the final review. AI development as a tool, not as an end in itself.
Where AI development actually delivers
Not every task needs AI. These five fields have emerged from our practice as the production-grade patterns — architectures we have built, run, and iterated on repeatedly.
AI agents for service and back-office workflows
What AI agents do in 2026 — and what they do not. AI agents handle multi-step procedures in narrowly defined domains: read the request, check it against the knowledge base, call a tool, log the result, escalate to a human when needed. They do not replace a profession — they take the recurring 70 per cent off the desk so your team can give the demanding 30 per cent proper attention.
- Service agent for first responses: categorise requests, answer standard cases directly, hand complex cases — including a pre-check — to your team
- Document agent: extract invoices, delivery notes, and contracts; push them into the ERP; flag deviations
- Sync agent: reconcile records between CRM, ERP, and custom backend — with an audit trail kept revision-safe
- Research agent: market data, competitor updates, lead enrichment — as scheduled runs, not as a live chatbot
- Review agent: check incoming documents against your policies, document findings with source citation — the final approval step stays with a human
Document processing with RAG
In 2026, RAG is the most mature architectural pattern in AI development. Instead of letting the LLM guess, we index your documents in a vector database, retrieve the relevant passages before each answer, and ship source citations along. Hallucinations drop significantly because the model only has to phrase what your sources actually say.
- Make internal knowledge searchable: employees ask the AI instead of digging through SharePoint, Confluence, and old email threads
- Service answers from product docs: the AI grounds itself in your manuals and release notes, not in a vendor training corpus
- Contract and policy research: clauses, rules, and internal guidelines made semantically searchable — with a direct pointer to the location in the source document
- Sales enablement: product details, reference cases, and pricing structure from one source, not from ten parallel slide versions
What changes: response time on internal questions drops from minutes to seconds. Onboarding becomes noticeably easier — new colleagues can ask questions without interrupting anyone. Content stays current because the system reads your originals, not a stale export.
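Stripped to its core, the retrieval step looks like this: a minimal sketch with a toy corpus and hand-made vectors. In production the embeddings come from an embedding model and live in a vector database such as pgvector; names and figures here are purely illustrative.

```python
import math

# Toy corpus of (source, passage, embedding). In production the embeddings
# come from an embedding model and live in a vector database.
CORPUS = [
    ("handbook.pdf#p12", "Refunds are processed within 14 days.", [0.9, 0.1, 0.0]),
    ("policy.pdf#p3", "Remote work requires manager approval.", [0.1, 0.8, 0.2]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(query_embedding, top_k=1):
    # Rank passages by similarity to the query and keep the best hits,
    # source reference included: this is what enables citation.
    ranked = sorted(CORPUS, key=lambda d: cosine(query_embedding, d[2]), reverse=True)
    return [(source, text) for source, text, _ in ranked[:top_k]]

hits = retrieve([0.85, 0.15, 0.0])
```

The retrieved passages and their source references are then placed into the LLM prompt, which is why every answer can carry a citation back to the original document.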
Custom machine learning for domain-specific tasks
When a generic LLM is too vague, custom ML development pays off: your own model, trained on your historical data, with measurable accuracy on your classes. Typically lighter, faster, and cheaper to run than an LLM call per request.
- Forecasting: sales, utilisation, or demand predictions from your historical data
- Classification: assign tickets, emails, and documents to the right category automatically
- Anomaly detection: spot irregularities in transactions, sensor data, or logs before they escalate
- Computer vision: image and video analysis for quality control, stock-taking, and visual inspection
- Specialised NLP: entity extraction, intent detection, sentiment in domains where an LLM stays too generic
LLM integration into existing platforms
When AI should sit inside your application, not next to it. We connect LLMs to your existing auth layer, route every call through your audit trail, and make behaviour controllable via prompts and configuration — without forcing your frontend team to suddenly learn AI engineering.
- Auth and role model respected: the LLM sees only what the user is allowed to see — RAG hits from out-of-scope areas are filtered at retrieval time, not censored after the fact
- Audit trail kept revision-safe: prompt, model version, sources, response, and timestamp end up in your logging — traceable on request
- Feature flags for AI capabilities: turn individual features on per tenant, per user group, or per region — a controlled rollout, not a big bang
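The audit-trail idea reduces to a few lines: every LLM call passes through one wrapper that records prompt, model version, sources, response, and timestamp. This is a simplified illustration (function and field names are ours, not a fixed API), but the shape is what lands in your logging.

```python
import time
from typing import Callable

def audited_call(llm: Callable[[str], str], prompt: str, model_version: str,
                 sources: list, log: list) -> str:
    """Route every LLM call through one structured audit record."""
    response = llm(prompt)
    log.append({
        "timestamp": time.time(),
        "model_version": model_version,
        "prompt": prompt,
        "sources": sources,
        "response": response,
    })
    return response

audit_log = []
stub_llm = lambda p: "stub answer"  # stands in for the real model call
answer = audited_call(stub_llm, "Summarise ticket #123", "llama-3-70b",
                      ["crm:ticket/123"], audit_log)
```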
Semantic search and knowledge base
Full-text search alone is often not enough. Semantic search finds the right document even when the searcher uses different words than the author. We combine classical full-text search, vector embeddings, and — depending on data volume and accuracy needs — a re-ranking model.
- Similarity search: "Have we seen a case like this before?" — the platform surfaces comparable tickets, parts, or contracts
- Cross-silo search: one search box, behind it SharePoint, ERP, wiki, and mail archive — results with source and permission filter
- Structured extracts: search results as JSON or tables for downstream processes — one of the underrated strengths of current AI systems
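One common way to merge full-text and vector results is reciprocal rank fusion. The sketch below shows only the scoring step; the constant k=60 is the usual default from the retrieval literature, and the document IDs are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked result lists (e.g. full-text and vector hits) into one.
    k=60 is the commonly used default; it damps the influence of lower ranks."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

fulltext_hits = ["doc_a", "doc_c", "doc_b"]  # keyword match order
vector_hits = ["doc_b", "doc_a", "doc_d"]    # embedding similarity order
fused = reciprocal_rank_fusion([fulltext_hits, vector_hits])
```

A document that ranks well in both lists (here doc_a) rises to the top, even when neither search alone put it first.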
How AI development with us works
An AI project rarely fails on the model — it fails on an unclear use case, thin data, or missing integration. Our approach is split into five phases, each with a defined deliverable. You can step out after any phase.
1. Use-case discovery and data audit (1–2 weeks)
- Workshop: which procedures consume the most time today, and where is an AI solution genuinely the right lever — rather than just a better script?
- Data audit: is there enough material in sufficient quality, in which format does it sit, and what needs cleanup before any model discussion?
- Impact hypothesis: which metric should move by how much, how do we measure it before project start, and when does the project count as failed?
- Feasibility sketch: is the right answer RAG, a custom ML model, an AI agent — or a noticeably simpler piece of software without AI?
- MVP definition: the smallest cut that delivers real impact, not a demo that only holds up during the pitch
2. Architecture and model selection (1–2 weeks)
Before the first line of code, we decide the tradeoffs together: open-source LLM in your own hands versus API vendor, EU or Swiss hosting, choice of vector database, auth and audit layer. These decisions carry the project for years — so no gut-feel call.
- Model choice: open-source (Llama, Mistral) for data control and cost, commercial APIs (Anthropic, OpenAI) for top language quality — often hybrid, by use case
- Hosting path: EU region (Frankfurt), Switzerland (Zurich), or on-prem — depending on industry, data classification, and nDSG requirements
- Vector database: PostgreSQL pgvector for existing systems, Qdrant or Weaviate at higher volumes
- Integration layer: how the AI accesses your data, where auth runs, where logging happens — before we write code
3. MVP development (typically 4–6 weeks)
Within four to six weeks, a near-production version handles real procedures — not a pitch-only demo. You test on your own data, we measure against the impact hypothesis from phase 1.
- Agent or RAG pipeline: from the first call to the logged response
- Indexing of your documents, embedding strategy, retrieval and re-ranking logic
- For custom ML: data preparation, model selection, validation against a hold-out set
- Connection to at least one target system (ERP, CRM, internal backend) including auth
- Eval suite: defined test cases against which we measure accuracy, latency, and cost
4. Integration into your existing stack
This is where the MVP becomes part of your platform — and this is where most AI projects in the market fall over. We wire the solution into your auth, your audit trail, your secrets management, and your monitoring. No shadow IT.
- Integration with your existing auth (SSO, OIDC, internal identity provider)
- Audit trail kept revision-safe: every AI decision logged with prompt, sources, and model version
- Secrets management: API keys and model access centrally managed, not in code repos
- Monitoring: response times, error rates, cost per request in your existing dashboard (Grafana, Sentry, LangFuse)
5. Operations and continued development
- Operating infrastructure: cloud (EU/Switzerland) or on-prem, depending on requirements — we run both
- Drift monitoring: spotting shifts in input data or response behaviour — before users complain
- Human-in-the-loop: critical decisions still pass through an approval step, documented and traceable
- Iteration with user feedback: prompts, model versions, retrieval strategies evolve from real usage — not from a vacuum
- Hand-off to your team: documentation, training, a clear line between "you maintain" and "we operate" — end-to-end responsibility on our side, if you want it that way
What does AI development cost?
An AI project is not a SaaS subscription — it is a clearly scoped build. These are the three price brackets we see in practice, as a fixed-price frame or on time and materials, depending on how sharply the use case is defined up front.
AI Agent — Entry
from 20,000 €
- One clearly scoped workflow (e.g. service triage, document extraction)
- LLM via API (Anthropic Claude, OpenAI, Google Gemini) or open source
- Connection to 2–3 of your existing tools
- Tool calls, logging, escalation path to a human
- Monitoring on response time, cost per request, error rate
- Documentation and hand-off to your team
- Timeline: 4–6 weeks
Custom AI development
from 50,000 €
- Your own ML model trained on your data, or a specialised RAG system
- Data pipeline with cleanup, enrichment, and versioning
- Eval suite and model assessment against defined accuracy thresholds
- API in your own hands: production-grade, documented, versioned
- RAG systems over your documents — with source citation in the output
- AI agent with multiple tools and multi-step tasks
- Timeline: 8–12 weeks
Enterprise AI platform
from 90,000 €
- Multiple agents or models, orchestrated through your backend layer
- Custom ML components plus RAG plus LLM integration from a single architecture
- Connection to ERP, CRM, and internal databases — with auth and audit layer
- Multi-stage approval workflows, tenant separation, role model
- Drift monitoring, latency and cost tracking, alerting into your existing stack
- Iteration from real usage: prompt versioning, model swaps, retrieval tuning
- Timeline: 12–20 weeks
How we measure impact. Before project start we define a concrete metric — handling time per service ticket, share of tickets resolved without human input, classification accuracy against a hold-out set. For a 20,000 € AI agent project, we work the maths through together: how many hours per month are addressed, what hosting and model calls cost — and when the build pays for itself. If the maths does not work, we do not take the project.
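The payback calculation itself is simple arithmetic. A sketch with purely illustrative figures (the hourly rate, hours saved, and run cost are placeholders, not quotes):

```python
def payback_months(build_cost_eur, hours_saved_per_month, hourly_rate_eur,
                   monthly_run_cost_eur):
    """Months until the build pays for itself; None if it never does."""
    monthly_net_saving = hours_saved_per_month * hourly_rate_eur - monthly_run_cost_eur
    if monthly_net_saving <= 0:
        return None  # the maths does not work -> we do not take the project
    return build_cost_eur / monthly_net_saving

# Illustrative: a 20,000 EUR agent, 60 h/month saved at 70 EUR/h,
# 500 EUR/month for hosting and model calls.
months = payback_months(20_000, 60, 70, 500)
```

With these example figures the build amortises in roughly five and a half months; with too few hours saved, the function returns None, which is exactly the case where we decline the project.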
Technology stack for modern AI development
We use mature, production-proven tools — not experiments nobody will maintain next quarter. Which concrete building blocks we pick depends on where your data is allowed to live, which accuracy requirements apply, and how much you want to operate yourself.
LLMs, RAG frameworks, and vector databases
| Open-source LLMs and API providers | Open-source models (Llama, Mistral) sit in your own hands and are the first choice when data must not leave the building. We use Anthropic Claude, OpenAI ChatGPT, and Google Gemini deliberately where language quality or tool use is decisive. Hybrid setups — open source for sensitive procedures, APIs for complex language tasks — are often the pragmatic middle ground in 2026. |
| LangChain and LlamaIndex | Production-tested frameworks for RAG systems and AI agents — tool calls, memory, multi-step pipelines. We do not rebuild every Lego brick ourselves; we use what has hardened in the community — and complement it with our own code where things become specific. |
| PostgreSQL pgvector, Qdrant, and Weaviate | Vector databases hold embeddings of your documents so a RAG system finds the right sources in milliseconds. PostgreSQL pgvector is enough for many SMB setups and reuses your existing database skills. At higher volumes or with strict isolation we move to Qdrant or Weaviate — both EU-hostable. |
Custom machine learning and operations
| Python ML stack | scikit-learn for classical ML, PyTorch for deep learning, Pandas and NumPy for data preparation. Proven, well documented, large community — and maintainable when the team rotates two years from now. |
| MLOps and model operations | MLflow for experiment versioning, Docker for reproducible deployment, FastAPI as the model API. LangFuse as an observability layer for LLM-based applications — prompt versions, trace inspection, cost per call. Sentry for classical errors, Grafana for dashboards. |
| Hosting in the EU and Switzerland | AWS Frankfurt, Azure Germany/Switzerland, Google Zurich, Hetzner, IONOS, Exoscale, and pure on-prem scenarios — all paths we have already walked. Which option fits is decided by data classification and nDSG requirements, not by gut feeling. |
How we choose — and what we leave out
Open source versus API. For sensitive data, high volumes, or real data-sovereignty requirements, a self-hosted open-source LLM almost always wins. For rare, linguistically demanding requests, a commercial provider delivers more per euro. We recommend the path per use case, not per belief.
Frameworks versus your own code. LangChain and LlamaIndex save weeks — as long as your use case stays inside the intended patterns. The moment custom logic is needed (own escalation paths, unusual tool calls, multiple tenants), we lift parts out and write them ourselves. Mix, not religion.
Managed service versus your own infrastructure. Managed services (AWS Bedrock, Vertex AI) are fast to set up but cost data sovereignty and carry vendor risk. Running it yourself is more expensive at first and often cheaper and more independent in the long run. We set up what fits your risk and operations stance — and say so plainly when both options are valid.
Specialised stacks for particular needs
Our standard stack covers roughly 80 per cent of mid-market AI projects. For the remaining 20 per cent, we reach for the following options:
| R and statistical modelling | If your team already works in R — in regulated areas or classical statistics — we build the model logic there and expose it through a production-grade interface. |
| On-device and edge AI | For mobile apps, IoT devices, or scenarios without stable cloud connectivity, we deploy compact models directly on the device — low latency, no recurring API cost, clear data sovereignty. |
| Fine-tuning and custom transformers | When a standard LLM stays too vague in your domain: targeted fine-tuning or custom transformer variants — appropriate when data and volume justify the effort. |
AI development GDPR- and nDSG-compliant
An AI system changes the compliance picture in two ways: first, data flows through an additional model that interprets. Second, the <a href="/en/blog/ai-act-ki-verordnung-software-architektur"><strong>EU AI Act</strong></a> has been entering into force in stages since August 2024 — obligations for GPAI models from August 2025, full applicability from August 2026. We account for both layers from the start.
Data minimisation and protection of personal data
- Only the data actually needed: AI agents and RAG systems access exclusively explicitly cleared sources — checked at the auth layer, not in the prompt
- Pseudonymisation before the model: personal data is removed or pseudonymised before training or inference, where the use case allows
- Retention rules: prompts, responses, and embeddings are deleted after defined periods — automated, documented, auditable
- Encryption: data at rest (AES-256) and in transit (TLS 1.3) encrypted — standard, not up for debate
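The retention rule reduces to a periodic purge. A sketch with an assumed 90-day window (in practice the period is set per data class, and the deleted IDs go to the audit log so the purge itself stays traceable):

```python
RETENTION_DAYS = 90  # assumed window; in practice set per data class

def purge_expired(records, today):
    """Split stored prompts/embeddings into kept records and deleted IDs.
    The deleted IDs are logged so the purge itself is auditable."""
    kept = [r for r in records if today - r["created_day"] <= RETENTION_DAYS]
    deleted = [r["id"] for r in records if today - r["created_day"] > RETENTION_DAYS]
    return kept, deleted

records = [
    {"id": "prompt-1", "created_day": 10},   # 240 days old -> purge
    {"id": "prompt-2", "created_day": 200},  # 50 days old -> keep
]
kept, deleted = purge_expired(records, today=250)
```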
Hosting in the EU and Switzerland
- EU region and Switzerland: AWS Frankfurt, Azure Germany and Switzerland, Google Zurich, alternatively Hetzner, IONOS, or Exoscale — data and models stay in your own hands
- On-prem deployment: open-source LLMs and custom ML models run in your data centre when data class or industry obligations demand it
- Open-source LLMs: Llama or Mistral — you fully control weights, updates, and inference path
- Without hyperscaler dependency: on request, fully without US hyperscalers — often the clean choice with sensitive data
AI Act, audit trail, and explainability
- Audit trail kept revision-safe: prompt, model version, source hits, response, and timestamp for every AI call — one of the most important architectural decisions in a production-grade AI system
- Human oversight for high-risk applications: where the AI Act requires it, every automated decision passes through a documented approval step — no black-box automation
- Explainability of the decision: for every answer, the sources and rules that led to the result are visible
- GDPR Art. 22 — right to explanation: for automated decisions with legal effect, those affected receive a traceable explanation — the architecture provides for it, rather than reconstructing it after the fact
Measuring impact, controlling latency and cost
Three dimensions decide whether an AI system holds up in production: response time, accuracy, and cost per request. We treat all three explicitly — with numbers you can measure the project against three months in.
Response time
- Quantised models: smaller bit widths (4-bit, 8-bit) reduce inference time markedly, with a moderate accuracy trade-off — we assess size and suitability per use case
- Response caching: recurring questions with identical context come from the cache instead of bothering the model again
- Streaming responses: tokens flow as soon as they are produced — the user sees that the system is working immediately
- Edge deployment where it fits: compact models directly on site or on device — low latency, clear data sovereignty, no API cost per call
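Response caching in its simplest form: answers keyed on a hash of prompt plus retrieved context, so identical questions skip the model entirely. This is a sketch; a production cache also needs invalidation when sources change and a size limit.

```python
import hashlib

class ResponseCache:
    """Answers keyed on prompt plus retrieved context: identical questions
    with identical context are served without hitting the model again."""
    def __init__(self):
        self._store = {}

    def _key(self, prompt, context):
        return hashlib.sha256(f"{prompt}|{context}".encode()).hexdigest()

    def get_or_call(self, prompt, context, llm):
        key = self._key(prompt, context)
        if key not in self._store:
            self._store[key] = llm(prompt, context)
        return self._store[key]

calls = []
def stub_llm(prompt, context):
    calls.append(prompt)  # counts how often the model is actually hit
    return "answer"

cache = ResponseCache()
first = cache.get_or_call("question", "ctx", stub_llm)
second = cache.get_or_call("question", "ctx", stub_llm)  # served from cache
```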
Cost per request
- Routing per request type: simple requests to a smaller open-source model, demanding ones to Claude or Gemini — the mix is markedly cheaper than "everything via the most expensive API"
- Prompt discipline: shorter, structured prompts without ballast lower token cost and improve accuracy at the same time
- Self-hosted LLMs: from moderate volume onwards, running it yourself is usually cheaper and more independent than API calls
- Batch processing: collect non-time-critical tasks and process them in batches instead of real time — lower peak load, lower cost
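Routing per request type can be as small as one function. In this sketch a crude word-count heuristic stands in for a real complexity classifier; the model names are placeholders.

```python
def route(request_text, word_threshold=30):
    """Pick a model per request. The word count stands in for a real
    complexity classifier: a deliberate simplification for illustration."""
    if len(request_text.split()) < word_threshold:
        return "small-open-source"  # cheap self-hosted model
    return "premium-api"            # e.g. Claude or Gemini

simple = route("Where is my invoice?")
demanding = route(" ".join(["word"] * 40))
```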
Scaling and drift
- Horizontal scaling: multiple inference instances behind a load balancer — important when the AI system becomes part of a real-time product
- Auto-scaling: instances spin up during peaks and shut down again in quieter periods
- Queue-based architecture: requests decoupled through a queue (e.g. RabbitMQ, NATS) — protects against load spikes and simplifies retry logic
- Drift monitoring: input data and response behaviour shift over time — we measure actively (with LangFuse or Sentry, for example) and react before users notice. More hosting paths in Cloud Services.
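At its core, drift monitoring compares a recent window of a metric against a baseline. A sketch with average input length as the monitored metric and an assumed relative threshold of 20 per cent:

```python
def mean(values):
    return sum(values) / len(values)

def drift_alert(baseline, recent, threshold=0.2):
    """Flag when a monitored metric (here: average input length) shifts by
    more than the relative threshold against the baseline window."""
    shift = abs(mean(recent) - mean(baseline)) / mean(baseline)
    return shift > threshold

baseline_lengths = [100, 102, 98, 101, 99]  # input lengths last quarter
recent_lengths = [150, 148, 152, 149, 151]  # inputs this week
alert = drift_alert(baseline_lengths, recent_lengths)
```

Production setups track several such metrics at once (input length, retrieval scores, refusal rate) and alert into the existing monitoring stack instead of returning a boolean.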
Integration into your existing IT landscape
An AI solution creates impact only when it lives inside the data flows people already work with. We connect AI to your existing systems — through auth, audit trail, and secrets management, not through copy-paste or a browser plugin.
- ERP systems: SAP, Microsoft Dynamics, Odoo, proAlpha — read and write through official APIs, respecting the permission model
- CRM integration: Salesforce, HubSpot, Pipedrive — enriching customer data, lead scoring, research agent
- Communication tools: Slack, Microsoft Teams, email — AI capabilities where your team already works, not in yet another tool
- Databases: PostgreSQL, MongoDB, MS SQL, Snowflake — direct read path for RAG, with row- and field-level permissions
- Document stores: SharePoint, Google Drive, S3, Nextcloud — indexing with permission filter, so RAG only surfaces what the user is allowed to see
- Your own APIs: internal services via REST or GraphQL — versioned, documented, with the auth chain running from the AI through to the target system
Maintenance and continued development
An AI system does not end with go-live. Models, prompts, and data age. Our maintenance model keeps the solution dependable over years — with clear responsibilities between you and us.
- Drift and accuracy monitoring: we continuously measure whether input data and response behaviour are shifting — and alert before users complain
- Model refresh: for custom ML, periodic re-training on updated data; for LLMs, a structured switch to new model versions with eval comparison
- Prompt versioning: prompts are versioned like code, tested, and rolled back when needed — no silent changes in production
- Cost and latency monitoring: calls per day, average response time, cost per request — visible in your dashboard, not in a spreadsheet
- RLHF and iteration from real usage: thumbs-up/-down feedback and corrections flow back into the system in a structured way, improving answer quality over time
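Prompt versioning in its simplest form: prompts live in a versioned registry next to the code, and a rollback is just flipping the active pointer back. The structure and prompt texts here are illustrative.

```python
# Prompts are versioned like code; the active pointer decides which version
# runs, and a rollback is just flipping it back to the previous entry.
PROMPTS = {
    "triage": {
        "v1": "Classify this ticket: {text}",
        "v2": "Classify this ticket into billing/technical/other: {text}",
    }
}
ACTIVE = {"triage": "v2"}

def render(task, **kwargs):
    version = ACTIVE[task]
    return PROMPTS[task][version].format(**kwargs), version

prompt, version = render("triage", text="App crashes on login")
```

Because every rendered prompt carries its version, the audit trail can always say which prompt produced which answer.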
More on the operations and maintenance model: Maintenance & Support.
Why IntegrIT for mid-market AI development?
- Senior engineering, no junior pool: you talk to engineers who have been shipping software for years — not to an account layer that just forwards tickets
- AI solutions as an architecture discipline: we understand LLMs, vector databases, and agent pipelines — and equally the auth, audit, and data-model layer all of that has to embed into
- First near-production version in 4–6 weeks: we build for real impact, not for the demo at the steering committee in two months
- Aligned with GDPR, nDSG, and the EU AI Act: hosting in the EU or Switzerland, audit trail kept revision-safe, risk class assessed per use case
- Impact before kickoff: we define the success metric with you up front — and say "no" honestly when a use case does not pay off
- Code, models, and data in your own hands: none of it sits with us or a sub-vendor — you can end the contract at any time without losing data
- End-to-end responsibility across the platform: we also build your backend and your apps — AI as an integral part of your architecture, instead of three vendors arguing over interfaces
Next step: framing AI development together
Send us a short note describing what you have in mind — informally to development@integritsol.de, or via the Calendly link below. We reply within one working day and set up a first conversation. You get an honest assessment, not a sales tour.
First conversation about your AI development project
30 to 60 minutes, no obligation. We go through one or two use cases, look at data situation and architecture, and tell you directly whether an AI solution is the right path — or whether a simpler piece of software is enough.
Or call directly: +49 1522 3635395