In my experience creating 3D assets for augmented reality, the difference between a model that works and one that fails comes down to ruthless optimization for real-time performance. This guide is for 3D artists, developers, and designers who need to bridge the gap between high-fidelity creation and the constrained, dynamic environments of AR. I'll share the core technical requirements, my personal workflow for optimization, and how requirements shift across different AR applications, so you can build assets that are not just visually compelling but technically robust.
Key takeaways:

- Treat the polygon budget as the primary constraint: roughly under 50k triangles for complex objects, under 10k for props, with clean, quad-dominant topology.
- Win back visual fidelity with baked PBR textures; keep resolutions to 1k–2k and pack metallic, roughness, and AO into one map.
- Deploy in runtime formats: glTF 2.0 (.glb) for broad compatibility, USDZ for Apple's AR Quick Look.
- Budgets and priorities shift by application, from brutal social-filter limits to higher-fidelity industrial visualization.
For AR, especially on mobile, the polygon budget is your primary constraint. I typically aim for models under 50k triangles for complex objects, and often under 10k for simpler props or characters that need to be instantiated multiple times. The goal isn't just a low number, but efficient geometry. What I've found is that clean, quad-dominant topology with minimal n-gons and stray triangles is crucial. This ensures the model deforms correctly if animated and subdivides predictably if a higher-fidelity LOD is needed.
Poor topology leads to shading artifacts and inefficient rendering, which drains battery life and causes frame drops. My rule of thumb: every polygon must justify its existence. Use supporting edge loops only where deformation or sharp edges are required, and rely on normal maps to convey surface detail that geometry once did.
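To keep myself honest about that budget, I run a quick count on every export. Here's a minimal sketch using the open-source trimesh library (my choice here, not a requirement; any DCC's statistics panel reports the same numbers), with a placeholder file name and budget:

```python
# Quick budget check on an exported mesh before it enters the AR build.
# Assumes the open-source `trimesh` library (pip install trimesh); the
# file name and budget below are placeholders.
import trimesh

TRIANGLE_BUDGET = 50_000  # complex hero asset; drop to ~10_000 for props

mesh = trimesh.load("hero_asset.glb", force="mesh")  # flatten the scene to one mesh
tri_count = len(mesh.faces)  # faces are triangles after loading

print(f"{tri_count:,} triangles ({tri_count / TRIANGLE_BUDGET:.0%} of budget)")
if tri_count > TRIANGLE_BUDGET:
    raise SystemExit("Over budget: decimate, or push detail into normal maps.")
```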
Textures are where you win back the visual fidelity sacrificed by low-poly geometry. I always bake high-poly detail—scratches, grooves, fabric weave—into normal, ambient occlusion, and roughness maps. Keep texture resolutions as low as possible while maintaining clarity on the target device screen; 1k or 2k maps are often sufficient for AR. Crucially, I pack metallic, roughness, and ambient occlusion into a single texture's RGB channels to minimize texture samples.
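The packing itself is trivial to script. Below is a minimal Pillow sketch with placeholder file names; the only hard requirement is that the engine-side shader samples the channels in the same order you packed them. (If you reuse one texture for glTF's occlusion and metallic-roughness slots, the spec expects AO in R, roughness in G, and metallic in B.)

```python
# Pack grayscale metallic / roughness / AO maps into one RGB texture (MRAO).
# Uses Pillow (pip install Pillow); file names are placeholders for whatever
# your baker outputs. All three maps must share the same resolution, and the
# AR shader must read R = metallic, G = roughness, B = AO to match this order.
from PIL import Image

metallic = Image.open("metallic.png").convert("L")
roughness = Image.open("roughness.png").convert("L")
ao = Image.open("ao.png").convert("L")

packed = Image.merge("RGB", (metallic, roughness, ao))
packed.save("mrao_packed.png")
```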
For materials, use a PBR (Physically Based Rendering) workflow. It's the standard for real-time engines like Unity and Unreal, which power most AR experiences. Avoid overly complex shader networks. In AR, a model might be viewed under any lighting condition, so materials must react plausibly to unpredictable environmental light.
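Part of why this workflow travels so well is how lean the glTF material model is. For illustration, here is roughly what a metallic-roughness material looks like, written as a Python dict mirroring the glTF 2.0 JSON (the texture indices are placeholders into the file's textures array):

```python
# Shape of a glTF 2.0 metallic-roughness material, shown as a Python dict
# for illustration. Factors of 1.0 let the packed texture drive the response.
simple_pbr_material = {
    "name": "painted_metal",
    "pbrMetallicRoughness": {
        "baseColorTexture": {"index": 0},
        "metallicRoughnessTexture": {"index": 1},  # G = roughness, B = metallic
        "metallicFactor": 1.0,
        "roughnessFactor": 1.0,
    },
    "normalTexture": {"index": 2},
    "occlusionTexture": {"index": 1},  # AO read from the R channel of the same map
}
```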
Format choice dictates where and how your model can be used. For broadest compatibility in mobile AR development (ARKit, ARCore), glTF 2.0 (.glb) is my go-to. It's a modern, efficient format that bundles geometry, materials, textures, and even animations into a single file, and it's well supported on the web through WebGL-based viewers. USDZ is essential for Apple's ecosystem (iOS AR Quick Look); it likewise carries scene data, materials, and animations in a single file.
I always export from my main 3D package into these runtime formats as a final step. FBX is still useful as an interchange format during production, but for deployment, glTF or USDZ are what actually run in the AR session.
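As a sketch of that last export step, assuming Blender as the main 3D package (any DCC with a glTF exporter works the same way):

```python
# Final export to a deployable .glb, assuming Blender as the DCC. Run from
# Blender's Python console or via `blender --background --python export_glb.py`.
# GLB bundles geometry, materials, textures, and animations into one binary
# file, so there are no external texture paths to break on the device.
import bpy

bpy.ops.export_scene.gltf(
    filepath="/tmp/hero_asset.glb",  # placeholder output path
    export_format="GLB",             # single binary file for deployment
)
```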
My workflow is a constant balance between creation and constraint. I start by blocking out the model with primary forms, strictly mindful of the final poly budget. Once the high-poly sculpt is done for detail, I create a low-poly version—this is the actual AR mesh. I then UV unwrap the low-poly model meticulously to maximize texel density and minimize seams.
The critical phase is baking: I transfer all the high-poly detail onto texture maps for the low-poly model. Finally, I author the final PBR textures (base color, normal, packed MRAO) at target resolutions. The last step is a clean export to glTF or USDZ, ensuring all paths are relative and materials are correctly assigned.
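For the relative-path check, a few lines of standard-library Python catch the most common mistakes in a .gltf (separate-file) export before it ships; a .glb embeds its textures, so this mainly matters for the text form of the format. The file name here is a placeholder:

```python
# Pre-deployment sanity check: flag texture references that are absolute or
# remote, which break once the asset is moved into an app bundle.
import json
from pathlib import Path

gltf_path = Path("hero_asset.gltf")
doc = json.loads(gltf_path.read_text())

for image in doc.get("images", []):
    uri = image.get("uri", "")
    if uri.startswith(("/", "http://", "https://", "file:")):
        print(f"Non-relative texture reference: {uri}")
```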
I've integrated AI generation into the early stages of this workflow to save days of work. For instance, I can use a text prompt or a concept sketch in a tool like Tripo AI to generate a base 3D mesh in seconds. This gives me a fantastic starting point for concept validation and rapid prototyping. The generated model often comes with a sensible initial topology, which I then take into my standard software for the essential optimization steps: retopologizing for cleaner edge flow, UV unwrapping, and texture baking.
This approach lets me bypass the most time-consuming parts of traditional modeling (blocking, sculpting basic forms) and jump straight to the technical refinement that makes or breaks an AR asset. It's particularly useful for generating variations of environmental assets or props where speed is key.
The tracking method dictates your model's first impression. For marker-based AR, the model appears anchored to a flat image. Here, I pay extra attention to the model's "bottom" or contact surface, ensuring it sits convincingly on the marker without floating. The initial "pop-in" animation should be smooth to mask tracking initialization.
For markerless/plane-detection AR (like placing furniture on a floor), the model must interact with environmental lighting and cast plausible shadows. I spend more time tweaking material roughness and metallic values so the object looks grounded. The model often needs multiple Levels of Detail (LODs) so it remains performant when viewed from afar.
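For those rough LOD meshes, automated decimation gets me most of the way before hand cleanup. Here's a minimal sketch assuming the open-source Open3D library, with placeholder file names and triangle targets; engine-side LOD switching still has to be configured separately.

```python
# Rough LOD generation by quadric decimation, assuming Open3D
# (pip install open3d). File names and triangle targets are placeholders.
import open3d as o3d

mesh = o3d.io.read_triangle_mesh("sofa_lod0.obj")
lod_targets = {"lod1": 20_000, "lod2": 5_000}

for name, target in lod_targets.items():
    lod = mesh.simplify_quadric_decimation(target_number_of_triangles=target)
    o3d.io.write_triangle_mesh(f"sofa_{name}.obj", lod)
```

Automated decimation tends to degrade UVs and shading normals, so I treat these outputs as starting points and re-check seams and bakes on every LOD.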
These represent opposite ends of the AR spectrum. A social media filter (e.g., for Instagram or TikTok) has an extremely tight polygon and texture budget—often under 20k triangles and a single 1k texture atlas. The focus is on stylized, expressive performance and flawless real-time face tracking. Optimization is brutal.
For industrial visualization (e.g., viewing a machine part in a factory), visual accuracy is paramount. Poly counts can be higher (50k-100k), and textures are more detailed to show wear, labels, and material differences. However, the model must still run at 60 FPS on a tablet or AR headset, so efficient LOD systems and careful draw call batching are my focus here.
I build my AR assets to be modular and future-proof. This means keeping the high-poly source and the clean low-poly mesh together, maintaining LODs and packed PBR textures, and exporting to open runtime formats like glTF and USDZ, so the same asset can be redeployed as platforms and requirements change.