Unicode Conversion: Everything That Can Go Wrong
Unicode conversion is less common than it was a decade ago: most organisations that were going to convert have done so, and new implementations have been Unicode from the start. But there remain ECC systems in production that are still non-Unicode, and for those, a conversion is required before any migration to S/4HANA, which runs only on HANA and only as a Unicode system. This guide covers what the conversion involves and where it fails.
What Unicode Conversion Is and Why It Is Still Relevant
Non-Unicode SAP systems store character data in a legacy code page: a single-byte code page for Latin or Cyrillic scripts, or a double-byte code page for Asian languages, depending on the primary language of the implementation. This means the system can only correctly represent characters from that code page. Displaying text from other languages either corrupts the characters or requires system-level code page switching, which is error-prone.
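The limitation is easy to reproduce outside SAP. The sketch below uses Python's ISO-8859-1 and ISO-8859-5 codecs as stand-ins for a Latin and a Cyrillic code page; the codec choice and the helper names are assumptions for illustration, not SAP code page identifiers.

```python
# Illustration only: Python codecs stand in for SAP code pages.
LATIN = "iso-8859-1"     # assumed stand-in for a Latin code page
CYRILLIC = "iso-8859-5"  # assumed stand-in for a Cyrillic code page


def representable(text: str, code_page: str) -> bool:
    """Return True if every character of text exists in code_page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


def misread(text: str, stored_as: str, read_as: str) -> str:
    """Store text in one code page, then read the same bytes with another."""
    return text.encode(stored_as).decode(read_as)


if __name__ == "__main__":
    print(representable("Müller", LATIN))       # a Latin name fits the Latin page
    print(representable("Москва", LATIN))       # Cyrillic text does not
    print(misread("Москва", CYRILLIC, LATIN))   # same bytes, wrong page: garbled text
```

The third call is the corruption case described above: the bytes are intact, but the reader applies the wrong code page.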
Unicode systems represent all character data in a single Unicode encoding (UTF-16 on the application server), which can represent any character from any language consistently. The database, the application server, and the frontend all agree on how characters are encoded.
The conversion from non-Unicode to Unicode is not simply a database character set change — it requires an export of the entire database, conversion of all character data, and reimport into a new Unicode system instance. The technical tool for this is R3load. The process has a downtime window measured in hours to days depending on data volume.
If you are planning an S/4HANA migration from a non-Unicode system, the Unicode conversion must happen first. It cannot be combined with the S/4HANA migration in a single step.
The UCCHECK Transaction
UCCHECK is the pre-conversion analysis transaction. It scans the ABAP repository for programs and function modules that contain code patterns that are problematic in a Unicode environment.
The primary categories of findings are: type C fields treated as byte strings, non-standard character assignments, and string operations that make code-page-specific assumptions. The first category covers ABAP operations such as MOVE that treat character data as raw bytes. This breaks under Unicode because a character is no longer guaranteed to occupy exactly one byte.
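The byte-string problem can be shown with a small analogy. This is Python rather than ABAP, and the record layout (a two-character country key followed by a four-character number) is a hypothetical example, so treat it as a sketch of the failure mode and not as a reproduction of any UCCHECK finding.

```python
def field_by_byte_offset(raw: bytes, offset: int, length: int, encoding: str) -> str:
    """Byte-oriented access: offset and length are counted in bytes."""
    return raw[offset:offset + length].decode(encoding, errors="replace")


def field_by_char_offset(raw: bytes, offset: int, length: int, encoding: str) -> str:
    """Character-oriented access: decode first, then slice by characters."""
    return raw.decode(encoding)[offset:offset + length]


RECORD = "DE4711"                      # hypothetical layout: 2-char key + 4-char number
LEGACY = RECORD.encode("iso-8859-1")   # one byte per character
UNICODE = RECORD.encode("utf-16-le")   # two bytes per character

if __name__ == "__main__":
    # Works by accident in the single-byte world: bytes and characters coincide.
    print(field_by_byte_offset(LEGACY, 2, 4, "iso-8859-1"))   # 4711
    # Same offsets against UTF-16 data land in the middle of the key field.
    print(field_by_byte_offset(UNICODE, 2, 4, "utf-16-le"))   # E4
    # Counting in characters gives the intended field in either encoding.
    print(field_by_char_offset(UNICODE, 2, 4, "utf-16-le"))   # 4711
```

Code that counted in bytes was correct only while one character equalled one byte; the remediation is to count in characters.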
Run UCCHECK before estimating the conversion effort. The number of critical findings is the primary driver of pre-conversion development work. A system with 5,000 custom programs and significant legacy ABAP typically has hundreds to thousands of UCCHECK findings at the critical and warning level.
Each critical finding requires a developer to review the code, understand whether the byte-string treatment was intentional (rare) or accidental (common), and adapt the code to Unicode-safe operations.
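To turn a findings list into a work queue, it helps to rank programs by their number of critical findings. The sketch below assumes the findings have been exported into simple (program, severity) pairs; that record shape, the severity labels, and the program names are hypothetical and would need to be mapped from whatever export format is actually available.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical record shape: (program name, severity) from an exported findings list.
Finding = Tuple[str, str]


def critical_counts(findings: Iterable[Finding]) -> List[Tuple[str, int]]:
    """Count critical findings per program, largest count first."""
    counts = Counter(program for program, severity in findings if severity == "critical")
    return counts.most_common()


if __name__ == "__main__":
    sample = [
        ("ZFI_REPORT", "critical"),
        ("ZFI_REPORT", "critical"),
        ("ZMM_LOAD", "warning"),
        ("ZSD_PRINT", "critical"),
    ]
    for program, count in critical_counts(sample):
        print(program, count)
```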
Custom Code Scanning
UCCHECK covers syntactic issues, but some Unicode problems are semantic — they depend on runtime behaviour that static analysis cannot detect. Test-driven validation of custom programs against a Unicode test system is the only way to catch these.
The standard approach is to build a Unicode sandbox, meaning a Unicode instance using a copy of the production data, and run custom programs against it with representative test data. Programs that produce different results in the Unicode sandbox than in the non-Unicode production system are the candidates to investigate for Unicode conversion issues.
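The comparison itself can be automated once the output of each program has been captured from both systems. A minimal sketch, assuming the outputs are available as text keyed by program name (how they are captured is outside the scope of this example, and the program names are hypothetical):

```python
from typing import Dict, List


def diverging_programs(baseline: Dict[str, str], sandbox: Dict[str, str]) -> List[str]:
    """Return the programs whose captured output differs between the two systems.

    A program missing from either side is reported as diverging.
    """
    names = sorted(set(baseline) | set(sandbox))
    return [name for name in names if baseline.get(name) != sandbox.get(name)]


if __name__ == "__main__":
    non_unicode_output = {"ZFI_REPORT": "Müller GmbH", "ZMM_LIST": "Bolt M8"}
    unicode_output = {"ZFI_REPORT": "M?ller GmbH", "ZMM_LIST": "Bolt M8"}
    print(diverging_programs(non_unicode_output, unicode_output))
```

A difference is a prompt for review, not proof of a defect: timestamps, document numbers, and other run-specific values need to be normalised before comparing.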
Prioritise the programs that are run by high-volume business processes, month-end close, and payroll. These are the ones where a Unicode conversion issue discovered post-go-live has the highest business impact.
The Conversion Process
The Unicode conversion uses SWPM (Software Provisioning Manager) to orchestrate the R3load export and import. The high-level steps:
Pre-conversion preparation: complete all UCCHECK remediations, run data archiving to reduce data volume, verify the target server meets Unicode and HANA sizing requirements, install and configure the new Unicode system skeleton (including kernel, database, and SAP instance installation without data).
Export phase: R3load exports the entire database from the source non-Unicode system into a set of export files, performing character encoding conversion on the way out. The export needs a consistent source, so it runs either during a downtime window or, where the configuration allows it, with the source restricted to read-only access.
Import phase: R3load imports the converted export files into the new Unicode system. This is the primary downtime phase. Keep the source system locked until the import is complete, because any change users make on the source after the export has started will not be in the Unicode system.
Post-import steps: apply Support Packages to reach the target patch level (if the Unicode conversion is combined with a system update), run post-conversion programmes (UCCHECK_SPOOLS, post-conversion reports), validate data integrity.
The total downtime window depends on data volume and hardware. For a typical mid-size ECC system (2-5 TB), expect 8-24 hours of technical downtime. Very large systems can take longer. Run the conversion in a sandbox first to get a realistic timing estimate before committing to a production window.
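A sandbox run can be turned into a first production estimate with simple arithmetic. The sketch below assumes throughput scales linearly with data volume, which is a simplification, and the default safety factor of 1.5 is an arbitrary placeholder, not a figure from SAP.

```python
def estimate_downtime_hours(production_gb: float, sandbox_gb: float,
                            sandbox_hours: float, safety_factor: float = 1.5) -> float:
    """Extrapolate production downtime from a measured sandbox run.

    Assumes linear scaling with data volume; safety_factor is a
    placeholder to be replaced with the project's own margin.
    """
    if sandbox_gb <= 0 or sandbox_hours <= 0:
        raise ValueError("sandbox_gb and sandbox_hours must be positive")
    throughput_gb_per_hour = sandbox_gb / sandbox_hours
    return production_gb / throughput_gb_per_hour * safety_factor


if __name__ == "__main__":
    # Hypothetical figures: a 1.5 TB sandbox converted in 6 hours, 3 TB in production.
    print(estimate_downtime_hours(3000, 1500, 6))
```

With these hypothetical inputs the measured throughput is 250 GB per hour, so 3,000 GB takes 12 hours before the margin and 18 hours with it.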
Data Volume Impact
Unicode stores characters in UTF-16, which is two bytes per character for Basic Multilingual Plane characters (which covers virtually all characters used in business applications). A non-Unicode system using a single-byte code page stores those same characters in one byte. This means the Unicode conversion approximately doubles the storage size of pure text fields.
In practice, SAP tables contain a mix of character data and numeric/binary data. The overall database size increase from Unicode conversion is typically 20-40% rather than the theoretical maximum doubling. Run the SAP sizing tool or consult SAP's sizing guides for your specific system to get a realistic estimate.
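The arithmetic behind these figures can be sketched as follows. The model is deliberately naive: it assumes every character field moves from one byte to two bytes per character and ignores compression, indexes, and database-specific storage, so it is a sanity check and not a substitute for a proper sizing. The function names and the character-data fraction are assumptions for illustration.

```python
def bytes_needed(text: str, encoding: str) -> int:
    """Number of bytes a text value occupies in the given encoding."""
    return len(text.encode(encoding))


def estimated_unicode_size_gb(current_gb: float, char_fraction: float) -> float:
    """Naive estimate: the character share doubles, everything else is unchanged."""
    if not 0.0 <= char_fraction <= 1.0:
        raise ValueError("char_fraction must be between 0 and 1")
    return current_gb * (1.0 + char_fraction)


if __name__ == "__main__":
    print(bytes_needed("Müller", "iso-8859-1"))   # 6
    print(bytes_needed("Müller", "utf-16-le"))    # 12
    # Hypothetical 1 TB database in which 30% of the volume is character data.
    print(estimated_unicode_size_gb(1000, 0.3))
```

In this model a 20-40% overall increase corresponds to character data making up 20-40% of the current volume.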
This storage increase has implications for the target system sizing, backup window duration, and — on HANA — memory requirements. Do not assume the Unicode system will have the same storage footprint as the non-Unicode source.
Testing Requirements
Unicode conversion testing has three layers.
Technical validation: verify that the converted system starts, the post-conversion programmes complete without errors, and row counts in critical tables match between source and target (with appropriate adjustments for archiving).
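The row-count check is mechanical and worth scripting. A minimal sketch, assuming the counts have already been collected per table from both systems; the function name and the sample counts are hypothetical, and BKPF and MARA are used only as familiar table names.

```python
from typing import Dict, Optional


def row_count_mismatches(source: Dict[str, int], target: Dict[str, int],
                         archived: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Return {table: actual - expected} for tables where the target is off.

    expected = source rows minus rows archived before the export.
    """
    archived = archived or {}
    mismatches = {}
    for table, source_rows in source.items():
        expected = source_rows - archived.get(table, 0)
        actual = target.get(table, 0)
        if actual != expected:
            mismatches[table] = actual - expected
    return mismatches


if __name__ == "__main__":
    source_counts = {"BKPF": 1000, "MARA": 500}
    target_counts = {"BKPF": 900, "MARA": 499}
    archived_rows = {"BKPF": 100}
    print(row_count_mismatches(source_counts, target_counts, archived_rows))
```

In the sample data BKPF matches once archiving is accounted for, while MARA is one row short and would be reported.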
Data integrity validation: spot-check character data in key tables. Documents, material descriptions, customer names, and address fields are the first places to look for encoding corruption. Pay particular attention to special characters in your primary language and any multilingual content.
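Part of the spot-check can be narrowed down by a script that flags values containing characters that should never appear in names or descriptions. The sketch below is a heuristic under stated assumptions: it looks only for the Unicode replacement character, control characters, and private-use, unassigned, or surrogate code points. Corruption that produces valid but wrong letters passes this check, so it reduces the manual review and does not replace it.

```python
import unicodedata
from typing import Iterable, List

REPLACEMENT_CHAR = "\ufffd"               # inserted by decoders for undecodable input
ALLOWED_CONTROLS = {"\t", "\n", "\r"}     # whitespace controls that can be legitimate
SUSPECT_CATEGORIES = {"Cc", "Co", "Cn", "Cs"}  # control, private use, unassigned, surrogate


def looks_corrupted(value: str) -> bool:
    """Heuristic: True if value contains a character that hints at encoding damage."""
    for ch in value:
        if ch == REPLACEMENT_CHAR:
            return True
        if ch in ALLOWED_CONTROLS:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            return True
    return False


def suspicious_values(values: Iterable[str]) -> List[str]:
    """Return the values that should be reviewed by a person."""
    return [value for value in values if looks_corrupted(value)]


if __name__ == "__main__":
    sample = ["Müller GmbH", "Москва", "M\ufffdller GmbH", "Stra\x9fe 5"]
    print(suspicious_values(sample))
```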
Business process validation: run end-to-end business processes in the Unicode system before go-live. Payroll, FI posting, outbound IDoc processing, and any print-intensive processes are the highest priority. Character encoding problems in output forms are often discovered only when actual output is reviewed.
What Goes Wrong
The most common failure is insufficient pre-conversion UCCHECK remediation. Teams that begin the conversion before completing critical finding remediation discover runtime errors in custom programs after go-live. The fix after go-live is an emergency correction transport under pressure — avoidable with adequate preparation time.
The second failure is timing underestimation. Production downtime almost always runs longer than the sandbox estimate, because sandbox systems typically hold less data than production and rarely match its hardware. Build the conversion timeline from a production-representative dataset on comparable hardware, not from sandbox tests on a subset.
The third failure is inadequate post-conversion validation. The system starts and basic transactions work, so the go-live proceeds. Two weeks later, month-end close discovers that a custom FI report is producing garbled output for special characters in document texts. Extended validation of data-intensive and output-intensive processes before go-live would have caught this.
> Editorial note: Unicode conversion tooling and procedures have evolved across SAP releases. The R3load-based approach described here applies to ABAP stack systems. Java stack (AS Java) has a different conversion path. Verify the current procedure against SAP Note 552464 and the SWPM documentation for your specific release.