Bringing artificial intelligence into business production is now an operational decision, no longer a technological gamble. The difference between a project that works and one that drags is method: a clear sequence of phases, each with specific questions to answer before moving to the next.
In this article we propose a framework in 6 phases, built on the experience of real projects. It's designed for Italian SMEs implementing AI for the first or second time — or who had a first failed attempt and want to restart well.
Phase 1: Use case scoping (1-2 weeks)
The goal is to exit this phase with a single use case well defined, written on an A4 page.
Questions to answer:
- What is the concrete problem we want to solve?
- How much does that problem cost today (man-hours, errors, missed opportunities)?
- What is the observable metric we want to improve? By how much?
- Is there enough historical data to train/calibrate a system?
- What happens when AI gets it wrong? What is the tolerance level?
- Who is the recipient of the AI output: an end customer, an employee, a system?
The output of this phase is a document that, if shown to an outside person, allows them to understand in 5 minutes what we are trying to do and why.
Typical mistake
Skipping this phase to "start developing immediately". The most common result is finding yourself 2 months later with a system that works but doesn't solve the right problem.
Phase 2: Privacy and compliance audit (1 week)
Before touching a single piece of data, do a preventive assessment. It's not bureaucracy: it's the phase that protects you from the most serious and least visible risks.
Questions:
- What categories of personal data are passed to the AI system?
- Is there a clear legal basis for their processing (consent, contract, legitimate interest)?
- Is the chosen AI provider in the EU or does it have standard contractual clauses?
- Do we need sensitive data (health, opinions, financial data)? If so, is a data protection impact assessment planned?
- How is the right to be forgotten implemented on the AI side? How does a user get "forgotten"?
- Is there an updated processing register including the AI system?
For projects involving personal data, a privacy expert consultation in this phase costs much less than a subsequent fine.
Typical mistake
"We'll decide later". Privacy issues almost always require architectural choices made at the start (where data lives, how it's encrypted, which provider). Rethinking them once the system is built is expensive.
Phase 3: Proof of Concept (2-4 weeks)
The goal is to exit with a working prototype demonstrating the use case is technically feasible and the chosen model achieves minimum quality to be useful.
What it includes:
- A set of representative test cases (50-200 real examples)
- A first version of the prompt (or prompt pipeline) producing the desired output
- A quantitative quality assessment: % of correctly handled cases, most frequent errors
- An estimate of cost per call and monthly projection on expected volumes
- A clear idea of the edge cases the system doesn't handle well
The PoC output is a report saying: "Yes, it works, and here are the numbers" or "No, it doesn't work, and here are the reasons". Both outcomes are valid.
Typical mistake
Considering the PoC as "the final product" and trying to ship it directly to production. A PoC is designed to be fast, not robust. Reusing a PoC as production is the shortcut you pay for immediately after.
Phase 4: Hardening and production (2-4 weeks)
The PoC is transformed into a system ready for the real world. It's the densest technical phase.
What needs adding compared to the PoC:
- Robust error handling: what does the system do when the AI model is down, slow, returns invalid output
- Human fallback: mechanisms to forward low-confidence cases to a person
- Caching and cost optimization: avoiding redundant calls, grouping requests
- Structured logging: tracing every input/output for subsequent audits (respecting privacy)
- Security: protection against prompt injection, hostile input handling, rate limiting
- Prompt versioning: every prompt change is a "release" with commit, test, possible rollback
- Automated regression tests: a suite verifying that changing prompt or model doesn't worsen output
Typical mistake
Neglecting prompt security. An AI app allowing users to send free text must be designed to resist "jailbreak" attempts on the prompt. It's not theory: it's a regular practice in production.
Phase 5: Launch and adoption (1-2 weeks)
The system enters production, but cautiously.
Recommended strategy:
- Progressive launch: first a subset of users (5%), then 25%, then 100%. Helps intercept problems before they impact everyone.
- Internal communication: employees who will use the system or be impacted must be informed, trained, listened to
- Feedback channel: a simple way to collect signals from the first users
- Rollback procedure: how to quickly return to the previous system if needed
- Role definition: who responds if the system causes harm? Who can stop it?
Typical mistake
Big bang launch on all users simultaneously. If something goes wrong, the impact is maximum and the damage is already done before being able to react.
Phase 6: Continuous monitoring and iteration (ongoing)
AI in production is never "done". Quality degrades over time for many reasons: input data changes, users change, providers change, prices change, new models are released.
What needs continuous monitoring:
| Area | What to measure | Frequency |
|---|---|---|
| Output quality | Error rate, user reports, accuracy on test set | Weekly |
| AI costs | Daily spend, cost per call, anomalies | Daily |
| Performance | Response times, errors, provider availability | Real-time + alerts |
| Privacy | Periodic log audits, accesses, retention | Quarterly |
| Security | Prompt injection attempts, abuses | Weekly |
| User satisfaction | Survey, NPS, qualitative feedback | Monthly |
At least once per quarter, a complete system review is worthwhile: is the use case still current? Is ROI where we expected? Are there choices to revisit?
Typical mistake
"We'll notice when someone complains". Without active monitoring, the system can degrade for months before someone officially notices. By then, trust loss has already happened.
Want to implement AI in your business following a method like this?
We support Italian companies and SMEs in all 6 phases: from scoping to monitoring. We work in short sprints, with clear metrics. Even a single free 20-minute AI audit can help understand where to start.
Book a free AI auditFramework summary
| Phase | Goal | Duration | Output |
|---|---|---|---|
| 1. Scoping | Define a clear use case | 1-2 weeks | 1-page document |
| 2. Privacy audit | Identify regulatory risks | 1 week | Assessment + architectural decisions |
| 3. PoC | Verify technical feasibility | 2-4 weeks | Prototype + quality report |
| 4. Hardening | Transform PoC to production | 2-4 weeks | Robust, secure, tested system |
| 5. Launch | Activate in controlled way | 1-2 weeks | Live system + initial feedback |
| 6. Monitoring | Maintain and improve | ongoing | Dashboard + periodic reviews |
Conclusion
AI projects that fail rarely do so for technical limits. They fail for method choices not made upfront: vague use case, ignored privacy, launch without measuring, abandonment after 30 days.
The framework in this guide is not "the right way" — it's a way that works, distilled from real projects that went through. Adapt it to your context, compress phases if needed, but don't skip them.
AI in business is an investment. Like all investments, it pays off when managed well.
Frequently asked questions
Related services
The services this article talks about
AI Completa end-to-end
Completiamo end-to-end il tuo progetto AI: refactor codice ChatGPT/Claude/Cursor, sicurezza, database, hosting, CI/CD, deploy e supporto continuativo.
Discover the service →AI Server Setup
Setup infrastruttura cloud per progetti AI: AWS, GCP, Azure, Vercel, container Docker/Kubernetes, CI/CD, database, monitoring, backup e disaster recovery.
Discover the service →