AI Procurement Best Practices: Sourcing and Contracting AI Solutions in KSA

Nora Al-Rashidi | March 6, 2026 | 15 min read

When a Saudi government ministry or a private-sector enterprise decides to procure an AI system, it is not making a software purchase in the conventional sense. It is entering into a long-term relationship with a technology that will evolve, with a vendor whose interests will not always align with its own, and with a set of regulatory obligations that are themselves still taking shape. The contracting moment — the point at which a vendor is selected and terms are signed — is only a threshold. What happens before it, and what is built into the agreement to govern what comes after, determines whether the AI investment delivers its intended value or becomes a source of persistent organizational friction.

Saudi Arabia's AI ecosystem has developed with remarkable speed since the establishment of the Saudi Data and Artificial Intelligence Authority (SDAIA) in 2019. The National AI Strategy, SDAIA's vendor registry, and the growing body of regulatory guidance from NCA and SDAIA have created a more structured environment than existed even a few years ago. Yet the procurement practices of many KSA organizations have not kept pace with the sophistication of the systems they are now acquiring. Standard IT procurement processes — designed for software with predictable, static behavior — routinely fail to account for the properties that make AI distinctive: the continuous learning that changes model behavior over time, the dependency on data quality that cannot be fully assessed before deployment, the ethical implications of automated decision-making, and the difficulty of attributing responsibility when an AI-generated output causes harm.

This guide sets out the principles and practices that KSA organizations — government entities, regulated industries, and private-sector organizations pursuing Vision 2030's digital transformation agenda — should embed in their AI procurement processes.

Before the Market: Defining What You Actually Need

The most consequential decisions in AI procurement are made before any vendor enters the picture. Organizations that approach the market without a clear, honest account of what they need, what they have, and what they can realistically manage tend to procure systems that are either overspecified for their actual use case or underscoped for their actual ambitions. Both failures are expensive and time-consuming to correct.

Defining objectives with precision is harder than it sounds. "Improve customer service efficiency," "enhance fraud detection," and "optimize procurement spend" are all legitimate AI use-case categories, but they are not procurement specifications. A genuine specification identifies the specific decision or prediction the system needs to make, the data inputs available to make it, the operational context in which outputs will be used, and the consequences of errors in either direction — false positives and false negatives often carry very different costs, and a procurement process that does not surface this asymmetry will struggle to evaluate vendor proposals meaningfully.

Alignment with Vision 2030 objectives is a real consideration for both government entities and private organizations operating within the Kingdom's strategic priorities. For government procurement specifically, the degree to which an AI initiative supports national goals — whether in digital government services, healthcare transformation, economic diversification, or logistics modernization — can affect both procurement authority and evaluation criteria. Organizations should document this alignment explicitly, not as a formality but as a discipline that forces clarity about the system's intended role in a larger strategic picture.

Organizational readiness assessment is perhaps the most frequently skipped pre-procurement step, and its omission accounts for a significant share of AI implementation failures. The question is not simply whether the organization has the technical infrastructure to run an AI system — cloud capacity, network bandwidth, integration points — but whether it has the data, the people, and the change management capacity to operate one effectively. AI models require high-quality, relevant training data, and organizations that have not invested in data governance frequently discover, mid-implementation, that the data they believed they had is not in a condition to support the system they have purchased. Internal talent matters too: a sophisticated AI system that arrives without a credible plan for who will monitor its performance, interpret its outputs, investigate anomalies, and manage vendor relationships is a liability dressed as an investment.

Engaging the Market

Market engagement for AI procurement is different from issuing an RFP for standard software because the vendor landscape is more heterogeneous and the evaluation task is more complex. International vendors bring mature platforms and, in some cases, extensive deployment experience in comparable contexts. Local and regional vendors often bring capabilities that matter more than they initially appear: genuine familiarity with Arabic language processing, understanding of KSA regulatory requirements, and the proximity that enables responsive support and collaborative customization. A hybrid approach — international platform with local implementation partner, or a local vendor with deep sector expertise — is often more appropriate than either extreme.

The Personal Data Protection Law creates specific vendor selection constraints that many organizations have not fully internalized. PDPL's restrictions on cross-border data transfer mean that a vendor whose AI platform processes all data in a jurisdiction outside the Kingdom may be non-compliant with requirements that apply to the data the system will handle. SDAIA has provided guidance on data localization requirements that procurement teams must review before finalizing their vendor shortlists. NCA's cybersecurity requirements for cloud and AI systems add additional layers of vendor qualification criteria, particularly for organizations operating in regulated sectors or handling sensitive government data.

Sector-specific regulatory experience is a genuine differentiator in vendor selection, not merely a marketing claim. An AI vendor that has deployed similar systems within Saudi healthcare has had to satisfy SFDA requirements. One working in financial services has had to navigate SAMA's framework for algorithmic systems. A vendor operating in the customs and logistics space has worked within ZATCA's integration requirements. The ability to demonstrate compliance in these sector-specific contexts is qualitatively different from general statements about regulatory awareness, and evaluation processes should probe the distinction.

A well-constructed Request for Proposals is a governance document as much as a procurement instrument. It should specify not just functional requirements and technical specifications, but the transparency and explainability standards the organization will require — documentation of model architecture, training data provenance, decision logic, and the conditions under which the model's outputs should and should not be trusted. Saudi regulators, following the principles articulated by SDAIA, increasingly expect organizations to be able to account for consequential automated decisions. An organization that has not required this documentation in its RFP will struggle to produce it when a regulator asks.

What the Evaluation Must Examine

Vendor evaluation for AI procurement requires capabilities that many procurement teams have not previously needed. Technical assessment of an AI model's performance — its accuracy across different segments of the input distribution, its behavior on edge cases and adversarial inputs, its sensitivity to changes in the underlying data — requires analytical skills that go beyond procurement expertise. Organizations should plan for this gap explicitly: by building the internal capability, by engaging external technical advisors, or by running structured technical evaluations — proof-of-concept pilots using realistic sample data — that generate evidence rather than relying on vendor self-assessment.
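
To make this concrete, here is a minimal sketch, in Python, of the kind of segment-level accuracy check a structured pilot might produce. The column names, the CSV export, and the 0.90 floor are illustrative assumptions, not values from any particular vendor engagement.

```python
# Minimal sketch: per-segment accuracy on pilot data. The column names
# ("segment", "label", "prediction"), the file name, and the 0.90 floor
# are illustrative assumptions.
import pandas as pd

def segment_accuracy(df: pd.DataFrame, floor: float = 0.90) -> pd.DataFrame:
    """Report accuracy per input segment and flag segments below the floor."""
    report = (
        df.assign(correct=df["label"] == df["prediction"])
          .groupby("segment")["correct"]
          .agg(n="size", accuracy="mean")
          .reset_index()
    )
    report["below_floor"] = report["accuracy"] < floor
    return report

pilot = pd.read_csv("pilot_predictions.csv")  # hypothetical pilot export
print(segment_accuracy(pilot))
```

A model with strong aggregate accuracy can still fail badly on one segment of inputs; a report of this shape surfaces that before the contract is signed rather than after.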

Pilots deserve particular emphasis because AI systems often behave differently in a vendor's demonstration environment than in the organization's actual operational context. The training data a vendor uses to build a demonstration may not reflect the idiosyncrasies of the customer's actual data. The integration challenges that will consume implementation effort often only become visible when the system encounters real data pipelines. A structured pilot, even a limited one, provides information that no amount of vendor documentation can substitute for.

Security and compliance verification should not rely solely on vendor self-certification. Third-party audits and certifications — ISO 27001 for information security management, compliance attestations from recognized certification bodies — provide independent assurance that vendor claims are grounded in actual practice. NCA's requirements for AI systems in critical and sensitive applications establish a baseline that vendor security documentation must address, and procurement evaluation should verify this against the actual NCA guidance rather than accepting vendor summaries of it.

Total cost of ownership analysis for AI systems must extend well beyond licensing fees. Implementation costs — data preparation, system integration, configuration and customization, staff training — often substantially exceed the cost of the platform itself. Ongoing costs include model retraining as the underlying data distribution shifts over time, performance monitoring infrastructure, vendor support, and the internal staff time required to manage the system effectively. Contracts that appear cost-effective based on headline licensing fees can prove expensive when these downstream costs are properly accounted for.
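
A rough worked example makes the point. The sketch below compares a headline licensing figure with a fuller five-year total; every number in it is a placeholder assumption, and the categories matter more than the figures.

```python
# Illustrative five-year TCO arithmetic. Every figure is a placeholder
# assumption; the point is the cost categories, not the numbers.
YEARS = 5

one_time = {
    "licensing_setup": 400_000,   # SAR, assumed
    "data_preparation": 350_000,
    "integration": 300_000,
    "staff_training": 100_000,
}
annual = {
    "licensing": 250_000,
    "retraining_cycles": 120_000,
    "monitoring_infrastructure": 60_000,
    "vendor_support": 80_000,
    "internal_staff_time": 200_000,
}

headline = one_time["licensing_setup"] + YEARS * annual["licensing"]
tco = sum(one_time.values()) + YEARS * sum(annual.values())
print(f"Headline licensing over {YEARS} years: SAR {headline:,}")
print(f"Full TCO over {YEARS} years:           SAR {tco:,}")
```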

The Contract as a Governance Instrument

Standard IT contracts are inadequate for AI procurement, and the inadequacy is not merely technical. IT contracts typically define performance in terms of uptime and availability: the system either runs or it does not. AI system performance is more complex and more variable. A model can be technically running — returning outputs on schedule, with no system errors — while producing results that have quietly degraded in accuracy, developed systematic biases against particular input types, or begun reflecting drift in the underlying data that was not present when the system was originally trained. Contracts that do not define performance in terms of model accuracy and output quality, with specified thresholds and consequences for falling below them, create environments in which vendors have no contractual obligation to address model degradation.

Performance guarantees in AI contracts should specify minimum accuracy thresholds for the predictions or decisions the system is responsible for, the methodology by which performance will be measured, the frequency of performance review, and the remediation process when performance falls below specified levels — including provisions for model retraining, rollback to a previous version, or contract termination if acceptable performance cannot be restored. These provisions require negotiation because they impose genuine obligations on vendors, but they are the mechanism through which organizations maintain accountability for AI systems operating in their name.
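
As an illustration of how such a clause might be operationalized internally, the following sketch maps a measured accuracy figure onto a remediation ladder. The thresholds, cure period, and termination trigger are assumed values standing in for whatever the negotiated contract schedule actually specifies.

```python
# Sketch of a periodic contractual performance check. All values are
# illustrative assumptions, not terms from any real agreement.
from dataclasses import dataclass

@dataclass
class PerformanceClause:
    min_accuracy: float                     # contractual floor, e.g. 0.92
    cure_period_days: int                   # time the vendor has to remediate
    consecutive_breaches_to_terminate: int  # when termination rights arise

def review(measured_accuracy: float, prior_breaches: int,
           clause: PerformanceClause) -> str:
    """Map a measured accuracy figure to the contract's remediation ladder."""
    if measured_accuracy >= clause.min_accuracy:
        return "compliant"
    breaches = prior_breaches + 1
    if breaches >= clause.consecutive_breaches_to_terminate:
        return "termination right triggered"
    return (f"breach {breaches}: vendor retraining or rollback "
            f"due within {clause.cure_period_days} days")

clause = PerformanceClause(min_accuracy=0.92, cure_period_days=30,
                           consecutive_breaches_to_terminate=3)
print(review(0.89, prior_breaches=1, clause=clause))
```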

Data ownership and usage rights require explicit treatment because the defaults in standard contracts are often unfavorable to customers. Organizations should specify who owns the training data used to build or fine-tune the model, who owns any custom model components developed for the engagement, whether and how the vendor may use the customer's data to improve their own models or services, and what happens to data when the contract ends. The last question is particularly important: AI systems can accumulate substantial knowledge about an organization's operations, customers, and internal processes, and contract exit provisions should address both data return and data deletion in a manner consistent with PDPL requirements.

Intellectual property provisions in AI contracts can be genuinely complex when the engagement involves co-development — situations in which the customer's domain expertise, proprietary data, or operational requirements have meaningfully shaped the model or the system. Joint ownership arrangements are possible but require careful drafting to avoid disputes over exploitation rights. Organizations that have contributed substantially to the development of a model should not assume that standard contract terms reflect this contribution; it must be negotiated explicitly.

Liability and indemnification provisions for AI systems present challenges that the law has not yet fully resolved, in Saudi Arabia or elsewhere. When an AI system produces an output that causes harm — a misclassified customs declaration that delays a critical shipment, a credit scoring error that incorrectly denies financing, a diagnostic recommendation that contributes to a clinical error — the question of who bears responsibility requires contractual treatment even in the absence of fully settled legal frameworks. Vendors who accept no responsibility for AI-driven decisions create moral hazard; those who accept unlimited liability for model outputs may be accepting obligations they cannot realistically fulfill. Negotiated positions that allocate liability based on the locus of failure — vendor responsibility for model defects, customer responsibility for deployment context and oversight — are more defensible than either extreme, and they create appropriate incentives on both sides.

Model versioning and update protocols deserve more attention than they typically receive. AI models are not static: vendors release updated versions, retrain models on expanded datasets, modify architectures to improve performance, and push changes that may affect the behavior of systems customers depend on. Contracts should establish requirements for advance notice of model changes, customer consent for changes that affect defined performance characteristics, rollback procedures if a model update degrades performance, and version control documentation that enables the customer to understand what changed and when.
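
One way to make these obligations checkable is to treat every model change as a structured record. The sketch below is a hypothetical shape for such a record; the field names are assumptions, and the substance is that a change touching contracted performance characteristics cannot deploy without customer consent and a documented rollback target.

```python
# Sketch of the version-control record a contract might require for every
# model change. Field names are illustrative assumptions.
from dataclasses import dataclass
from datetime import date

@dataclass
class ModelChangeNotice:
    version: str                      # e.g. "2.4.0"
    effective: date
    summary: str                      # what changed and why
    affects_contracted_metrics: bool  # triggers customer consent if True
    customer_consented: bool = False
    rollback_version: str | None = None  # version restored on degradation

def may_deploy(notice: ModelChangeNotice) -> bool:
    """A change touching contracted performance characteristics needs
    explicit customer consent and a documented rollback target."""
    if not notice.affects_contracted_metrics:
        return True
    return notice.customer_consented and notice.rollback_version is not None

notice = ModelChangeNotice("2.4.0", date(2026, 4, 1),
                           "retrained on expanded dataset",
                           affects_contracted_metrics=True,
                           rollback_version="2.3.2")
print(may_deploy(notice))  # False until the customer signs off
```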

Ethical Frameworks and Human Oversight

SDAIA's AI ethics principles — which encompass fairness, transparency, accountability, and human-centered design — are not aspirational statements. They represent the regulatory direction of travel in the Kingdom, and procurement contracts that do not address them are likely to require renegotiation as regulatory guidance matures.

Bias mitigation requirements should be written into contracts as operational obligations rather than aspirational commitments. This means requiring vendors to implement bias detection methodologies appropriate to the system's use case, to conduct regular bias audits against defined criteria, to disclose audit results, and to remediate identified biases within defined timeframes. For systems making decisions that affect individuals — employees, customers, applicants, patients — the fairness obligations are particularly acute, and organizations should be prepared to specify the protected characteristics they expect the system to treat equitably.
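
To show what one such audit check can look like in practice, the sketch below compares positive-decision rates across groups defined by a protected attribute. The column names and the 80% ratio rule of thumb are illustrative assumptions, not a methodology mandated by SDAIA or any KSA regulator.

```python
# Minimal sketch of one bias-audit check: comparing positive-decision
# rates across groups. Column names, the attribute, and the 0.8 ratio
# rule are illustrative assumptions.
import pandas as pd

def disparate_impact(df: pd.DataFrame, group_col: str,
                     decision_col: str) -> pd.Series:
    """Ratio of each group's positive-decision rate to the highest group's."""
    rates = df.groupby(group_col)[decision_col].mean()
    return rates / rates.max()

audit = pd.read_csv("decisions_sample.csv")   # hypothetical audit extract
ratios = disparate_impact(audit, group_col="applicant_group",
                          decision_col="approved")
print(ratios[ratios < 0.8])  # groups failing the illustrative 80% rule
```

A single metric of this kind is never sufficient on its own; the contractual point is that some defined, repeatable check runs on a schedule, with disclosure and remediation obligations attached to its results.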

Human-in-the-loop protocols define the boundary between automated and human decision-making, and that boundary requires deliberate design. Not every AI output warrants human review — building in mandatory human oversight for every low-stakes prediction would negate the efficiency benefits of automation. But high-stakes decisions — those with significant consequences for individuals or for the organization, or those operating in domains where the cost of error is asymmetric — should have clearly defined human review requirements embedded in the system's operating procedures and, where appropriate, in the contract itself. Saudi regulatory frameworks for healthcare, financial services, and government decision-making each have their own specifications for what constitutes an automated decision requiring human accountability, and these should be reflected in procurement requirements.
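
A simple routing rule illustrates how that boundary can be encoded. In the sketch below, the stakes taxonomy and the confidence cutoff are assumptions to be replaced by the organization's own operating procedures and any applicable sector rules.

```python
# Sketch of a human-review routing rule: automate low-stakes, confident
# outputs; route high-stakes or low-confidence ones to a reviewer queue.
# The taxonomy and cutoff are illustrative assumptions.
from enum import Enum

class Stakes(Enum):
    LOW = "low"
    HIGH = "high"   # significant consequences for individuals or the org

def route(stakes: Stakes, confidence: float,
          confidence_cutoff: float = 0.85) -> str:
    if stakes is Stakes.HIGH:
        return "human_review_queue"   # mandatory review, regardless of score
    if confidence < confidence_cutoff:
        return "human_review_queue"   # model is unsure; a person decides
    return "auto_decision"            # logged and auditable

print(route(Stakes.HIGH, confidence=0.99))  # -> human_review_queue
print(route(Stakes.LOW, confidence=0.70))   # -> human_review_queue
print(route(Stakes.LOW, confidence=0.95))   # -> auto_decision
```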

Transparency and documentation obligations support both internal governance and regulatory audit readiness. Organizations should require vendors to provide documentation sufficient to explain model architecture, describe the training data used, characterize the system's known limitations and failure modes, and support the explanation of individual outputs when challenged. This documentation is not merely a procurement nicety — it is the foundation of the accountability that SDAIA's framework requires.

Building for the Long Term

AI procurement is not completed when contracts are signed. The governance structures, monitoring practices, and vendor management capabilities built during and after procurement determine whether the system continues to deliver value over its operational life.

Performance monitoring requires infrastructure and attention. Model accuracy in production is not self-evident; it must be measured against ground truth, tracked over time, and reviewed by people with both technical literacy and operational domain knowledge. Performance drift — the gradual degradation of model accuracy as the real-world patterns the model was trained on shift — is normal and expected. What is not acceptable is discovering it only after the drift has produced significant operational consequences. Monitoring systems should generate regular performance reports, flag anomalies against defined thresholds, and trigger defined response protocols when performance falls below acceptable levels.
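
One widely used drift signal is the Population Stability Index (PSI), which compares a feature's distribution at training time with its recent production distribution. The sketch below is a minimal implementation; the 0.2 alert threshold is a common rule of thumb, not a regulatory figure, and the sample data is synthetic.

```python
# Sketch of a PSI drift check between a feature's training-time and
# production distributions. The 0.2 threshold is a rule of thumb.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI over shared bins; higher values mean larger distribution shift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)  # avoid log(0) on empty bins
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train_feature = rng.normal(0.0, 1.0, 10_000)  # stand-in for training data
live_feature = rng.normal(0.4, 1.2, 10_000)   # stand-in for shifted live data
score = psi(train_feature, live_feature)
print(f"PSI = {score:.3f}",
      "-> drift alert" if score > 0.2 else "-> stable")
```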

Model retraining is an operational reality that procurement budgets frequently underestimate. A model trained on last year's data must perform on this year's inputs, and the gap between the two widens over time. Retraining requires fresh labeled data — which must be collected, cleaned, and annotated — as well as computational resources and, often, significant vendor involvement. Procurement should account for these costs explicitly, including the question of who bears them under what circumstances.

Governance and oversight structures must exist before the system is deployed rather than being assembled in response to problems. An AI ethics committee or review board with cross-functional membership — legal and compliance, operations, IT, HR, and where relevant, external members with domain expertise — provides a forum for the ongoing questions that AI systems inevitably generate. Risk registers for AI systems, maintained and updated as systems evolve and as the organization's understanding of their behavior matures, provide structured situational awareness. Audit trails that log AI-driven decisions, model changes, and user interventions support both internal accountability and the external audit readiness that regulators increasingly expect.
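
As a concrete example of the audit-trail idea, the sketch below appends one JSON record per AI-driven decision, capturing the model version, the output, and any human intervention. The field names are illustrative assumptions; what matters is that every decision is timestamped, attributable to a specific model version, and reviewable later.

```python
# Sketch of an append-only audit record for AI-driven decisions.
# Field names are illustrative assumptions.
import json
from datetime import datetime, timezone

def log_decision(path: str, *, model_version: str, input_ref: str,
                 output: str, confidence: float,
                 human_override: str | None = None) -> None:
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "input_ref": input_ref,   # pointer to the input, not the data itself
        "output": output,
        "confidence": confidence,
        "human_override": human_override,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

log_decision("ai_audit.jsonl", model_version="2.3.2",
             input_ref="case-10482", output="flag_for_inspection",
             confidence=0.91, human_override=None)
```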

For government entities, SDAIA's evolving vendor registry and certification frameworks are relevant procurement resources. These pre-screening mechanisms reduce but do not eliminate the due diligence burden — organizations still need to assess whether a pre-screened vendor is appropriate for their specific use case — but they provide a useful baseline from which to work. Shared services and joint procurement opportunities across government entities can also reduce per-agency costs and create more consistent governance standards across the public sector's AI deployments.

The investment KSA organizations make in developing rigorous AI procurement capabilities compounds over time. Organizations that build genuine expertise in evaluating AI vendors, negotiating AI-specific contracts, and managing AI systems through their operational lifecycle are not just better positioned for their current procurement. They develop institutional knowledge that improves every subsequent procurement, reduces the gap between what they purchase and what they actually need, and equips them to engage with an AI ecosystem that will continue to evolve in capability, complexity, and regulatory expectation.


Published by PeopleSafetyLab — AI safety and governance research for KSA organizations.

Nora Al-Rashidi

Expert in AI Safety and Governance at PeopleSafetyLab. Dedicated to building practical frameworks that protect organizations and families, ensuring ethical AI deployment aligned with KSA and international standards.
