Data sovereignty for Australian businesses
Data sovereignty used to be a policy conversation. It is now an operational one. Buyers ask harder questions about where data goes, and engineering teams end up owning the practical answer.
This post walks through what sovereignty means in practice, which industries feel it most, and what infrastructure choices usually matter. Treat it as engineering guidance, not legal advice.
What data sovereignty actually means
Sovereignty means your data is subject to the laws of wherever it lives, and to the laws that can reach the provider holding it. That is why teams end up discussing foreign-jurisdiction exposure, not just region selection. The CLOUD Act comes up often in procurement and risk reviews for exactly that reason.
For plenty of workloads, none of this matters. For healthcare, finance, legal, and government, it can matter a great deal.
Residency is the narrower question: where is the data physically stored? Australian residency means the bytes don't cross the border. You can have residency without sovereignty. Data sitting in a Sydney data centre that gets processed through systems subject to foreign law fails the sovereignty test even if it passes the residency one.
The regulatory context
A handful of frameworks usually drive the conversation.
The OAIC guidance on APP 8 is the right starting point for cross-border disclosure of personal information. If you are sending personal information to an overseas recipient, that is not a box-ticking exercise. Your organisation still carries responsibility for how that data is handled.
For APRA-regulated entities, CPS 234 and CPS 231 guidance shape how security capability, outsourcing, and third-party oversight get discussed. For healthcare workloads tied to My Health Record data, the My Health Records Act 2012 is the relevant source text. If you are building for government or regulated sectors, pull in legal and compliance input early instead of trying to reverse-engineer it after an architecture choice has already been made.
Where this becomes a practical problem
For most teams, the real problem sits with data in motion, not data at rest.
A modern application can leak data to twenty or thirty third parties without anyone properly clocking it: analytics, CRMs, support tools, AI APIs, error trackers. Each of those is a potential residency violation if nobody has checked what it sends and where it sends it.
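One way to start checking what goes where is to compare observed outbound hostnames against a list someone has actually reviewed. The sketch below assumes a hypothetical allowlist; the hostnames and the `check_destination` helper are illustrative, and in practice the input would come from egress proxy or DNS logs.

```shell
#!/bin/sh
# Hypothetical list of destinations that have been reviewed and approved.
# These hostnames are placeholders, not recommendations.
APPROVED_HOSTS="
s3.ap-southeast-2.amazonaws.com
rds.ap-southeast-2.amazonaws.com
"

# Print "approved" or "unreviewed" for a hostname.
# Returns non-zero for unreviewed hosts so it can be used in an `if`.
check_destination() {
    host=$1
    for approved in $APPROVED_HOSTS; do
        if [ "$host" = "$approved" ]; then
            echo "approved"
            return 0
        fi
    done
    echo "unreviewed"
    return 1
}

# Example: classify two observed destinations (the second is made up).
for host in s3.ap-southeast-2.amazonaws.com api.example-analytics.com; do
    printf '%s: %s\n' "$host" "$(check_destination "$host")"
done
```

Anything reported as unreviewed is not necessarily a violation. It is a prompt to find out what that integration sends and where it is processed.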
AI sharpens the edge. The use cases that justify investing in AI, like reading patient notes, reviewing contracts, or summarising customer support tickets, are the ones that usually trigger the hardest sovereignty questions. The engineering mistake is not using AI. It is sending sensitive data to a third party before anyone has checked whether that data class is allowed to leave your chosen boundary.
Building compliant infrastructure
The infrastructure side has three layers.
Data at rest. Use AWS's Australian regions: ap-southeast-2 in Sydney and ap-southeast-4 in Melbourne. Lock down S3, RDS, and DynamoDB so nothing replicates outside Australia: check S3 cross-region replication rules, RDS cross-region read replicas and backup replication, and DynamoDB global tables. Encrypt at rest with KMS, and use customer-managed keys where the sensitivity warrants it.
Data in transit. TLS on everything. Use VPC endpoints (PrivateLink-backed interface endpoints, or gateway endpoints for S3 and DynamoDB) so traffic to AWS services doesn't traverse the public internet. Add service-mesh mTLS for inter-service traffic if the data classification calls for it.
Data in processing. This is the awkward one. Any compute that touches sensitive data should run in the deployment boundary you have chosen and documented. For some teams that means Australian-hosted models. For others it may mean a documented exception with minimised data, contractual controls, and explicit approval.
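# Verify your S3 bucket is in the correct region
# (LocationConstraint should be ap-southeast-2 or ap-southeast-4)
aws s3api get-bucket-location \
    --bucket patient-records-au

# Verify an RDS instance is in ap-southeast-2
# (the call fails if the instance is not in the queried region;
# on success it prints the Availability Zone, e.g. ap-southeast-2a)
aws rds describe-db-instances \
    --db-instance-identifier production-primary \
    --query 'DBInstances[0].AvailabilityZone' \
    --region ap-southeast-2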
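The single-resource checks above can be extended to sweep every bucket in an account. This is a minimal sketch, not a complete audit: `is_au_region` and `report_offshore_buckets` are hypothetical helper names, and the sweep assumes the AWS CLI is installed with credentials that can list buckets.

```shell
#!/bin/sh
# Succeed if the region name is one of the two Australian AWS regions
# named in this post (Sydney and Melbourne).
is_au_region() {
    case "$1" in
        ap-southeast-2|ap-southeast-4) return 0 ;;
        *) return 1 ;;
    esac
}

# Print every bucket whose location is outside the Australian regions.
# Needs the AWS CLI and credentials, so it is defined here but not run.
report_offshore_buckets() {
    for bucket in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
        region=$(aws s3api get-bucket-location --bucket "$bucket" \
            --query 'LocationConstraint' --output text)
        # us-east-1 buckets report a null constraint, which the CLI
        # prints as "None" in text output.
        [ "$region" = "None" ] && region="us-east-1"
        is_au_region "$region" || echo "outside AU: $bucket ($region)"
    done
}
```

Running `report_offshore_buckets` on a schedule turns a one-off check into something closer to a control, though it only covers S3. RDS and DynamoDB would need their own equivalent.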
What to document
Compliance isn't only about being right. It's about being able to show your work. If you're in a regulated industry, keep this set of documents current:
- A data flow diagram covering every system that receives personal or sensitive information
- A list of third-party services that receive data, with their data processing agreements attached
- Evidence of encryption config for storage and transit
- Access logs for sensitive systems (who, when, what)
- A breach response plan with notification timelines
That's the file you pull out for auditors, hand to procurement, or reach for at 2am when something has gone sideways.
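Some of that evidence can be captured by script so it stays current. The sketch below snapshots encryption settings into dated files. The helper names and the output layout are assumptions for illustration, and the collection function needs the AWS CLI with read access to S3 and RDS.

```shell
#!/bin/sh
# Build a dated output path so each snapshot is kept, not overwritten.
# Usage: evidence_path <directory> <label>
evidence_path() {
    printf '%s/%s-%s.json\n' "$1" "$2" "$(date -u +%Y-%m-%d)"
}

# Save the encryption configuration of one bucket and of the RDS
# instances in one region. Defined here but not run automatically.
# Usage: collect_encryption_evidence <directory> <bucket> <region>
collect_encryption_evidence() {
    out_dir=$1
    bucket=$2
    region=$3
    mkdir -p "$out_dir"
    aws s3api get-bucket-encryption --bucket "$bucket" --output json \
        > "$(evidence_path "$out_dir" "s3-$bucket")"
    aws rds describe-db-instances --region "$region" --output json \
        --query 'DBInstances[].{id:DBInstanceIdentifier,encrypted:StorageEncrypted,kmsKey:KmsKeyId}' \
        > "$(evidence_path "$out_dir" "rds-$region")"
}
```

For example, `collect_encryption_evidence ./evidence patient-records-au ap-southeast-2` would write one file per service for today's date. Dated files make it easier to show what the configuration was at a point in time, not just what it is now.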
AI services and data sovereignty
AI creates a specific risk surface. Teams ship AI features fast, and the sensitivity conversation often happens after the fact.
Our approach with clients is to set the policy before the architecture. Which data classifications can go offshore? Which cannot? Put technical controls in place to enforce those lines, not just a wiki page that people skim once.
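As a sketch of what a technical control can look like at its simplest, the functions below gate a send on data classification. The classification names (`public`, `internal`, `personal`, `health`) and the policy itself are hypothetical. Yours would come from your own data classification scheme and legal input.

```shell
#!/bin/sh
# Hypothetical policy: only these classifications may be sent offshore.
may_go_offshore() {
    case "$1" in
        public|internal) return 0 ;;
        *) return 1 ;;   # personal, health, and anything unknown stay onshore
    esac
}

# Decide whether data of a given classification may go to a destination.
# Usage: allow_send <classification> <onshore|offshore>
allow_send() {
    class=$1
    destination=$2
    case "$destination" in
        onshore)  return 0 ;;
        offshore) may_go_offshore "$class" ;;
        *)        return 1 ;;   # unknown destinations are denied
    esac
}

# Example: a health record bound for an offshore API is refused.
if allow_send health offshore; then
    echo "send permitted"
else
    echo "send blocked by policy"
fi
```

The design choice worth copying is the default: unknown classifications and unknown destinations are denied, so a new data type or integration has to be reviewed before it can leave the boundary.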
For workloads that cannot leave Australia, we deploy private LLM infrastructure on Australian compute. For workloads that can use an external provider, we document the exception, check the commercial and compliance posture, and minimise the data sent to what the task at hand needs.
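Minimising what is sent can start with stripping obvious identifiers before text leaves your boundary. The filter below is a deliberately crude sketch: the `redact` name and both patterns are assumptions, and pattern matching on its own is not de-identification.

```shell
#!/bin/sh
# Replace email addresses and long digit runs (8 or more digits, such as
# phone or account numbers) with placeholders. Reads stdin, writes stdout.
redact() {
    sed -E \
        -e 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[EMAIL]/g' \
        -e 's/[0-9]{8,}/[NUMBER]/g'
}

# Example with made-up data:
printf 'Contact jane@example.com re account 1234567890\n' | redact
# prints: Contact [EMAIL] re account [NUMBER]
```

Names, addresses, and free-text clinical detail pass straight through a filter like this, which is why it belongs alongside a documented exception and not in place of one.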
Running private AI infrastructure is more accessible than many teams assume. The hard part is usually not the model server. It is the surrounding operating model: data flow review, access control, logging, and handover.
Getting started
If you haven't done a data flow audit recently, start there. Map where sensitive data actually goes, not where you think it goes. The gaps are almost always in integrations that got added quickly and never got a second look.
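A quick first pass at that map is to list every external hostname hard-coded in a codebase or config directory. This is a rough sketch with a hypothetical helper name. It only finds literal URLs, so endpoints configured at runtime or buried inside SDKs will not show up.

```shell
#!/bin/sh
# List the distinct hostnames that appear in http(s) URLs under a directory.
# Usage: list_external_hosts <directory>
list_external_hosts() {
    dir=$1
    grep -rhoE 'https?://[A-Za-z0-9.-]+' "$dir" 2>/dev/null \
        | sed -E 's#^https?://##' \
        | sort -u
}
```

Running `list_external_hosts ./src` gives a starting list to compare against what network egress logs show. The differences between the two lists are usually where the unreviewed integrations are.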
After that, it's prioritisation. Which flows carry the most risk, and what's the right control for each?
Our AI Services engagement covers the AI-specific side, including private LLM deployment for use cases that can't touch offshore services. Platform Engineering addresses the broader infrastructure and access-control work.