The Spatial Person: My Complete Workflow for Production-Ready 3D Characters
In my practice, a "spatial person" is far more than a 3D avatar; it's a fully realized digital entity defined by its geometry, topology, and the spatial data that allows it to exist and interact convincingly within a 3D environment. This concept is fundamental because it shifts the focus from mere visual representation to functional, animatable assets ready for production. I'll walk through my complete workflow for creating them, from defining purpose to final export, and explain why a hybrid approach—leveraging AI for ideation and speed while applying traditional skills for precision—consistently yields the best results. This guide is for 3D artists, game developers, and XR creators who want to build characters that are not just seen, but that truly inhabit a space.
Key takeaways:
- A spatial person is a production-ready asset defined by geometry, topology, and spatial data, not just a visual sculpt.
- Topology planned for deformation from the first polygon saves hours of rework in rigging and animation.
- A hybrid workflow, using AI for base meshes and manual modeling for precision, gives speed without sacrificing control.
- Real-world scale, clean UVs, and organized export data make engine integration predictable instead of painful.
When I say "spatial person," I'm not just talking about a static model. I'm referring to a data structure designed for spatial reasoning. It's an asset that understands its place in a 3D coordinate system, with geometry that's built to move, deform, and interact. A simple sculpt is a statue; a spatial person is an actor. The distinction lies in intent and construction from the very first polygon.
The foundation has three pillars. Geometry is the visible shape. Topology—the flow and connection of polygons—is the unseen armature that dictates how the shape bends and moves. Spatial Data encompasses everything from the model's real-world scale and pivot point to UV coordinates and skeleton binding. Neglecting any one of these results in a model that looks good in a preview but fails in production.
Adopting this mindset changes everything. I don't start modeling a face; I start planning how the mouth will open and the cheeks will squash. I don't just sculpt a hand; I ensure the finger joints have the proper loop density for a clean fist. This forward-thinking approach saves me countless hours of rework later during rigging, animation, and engine integration.
I always begin with questions: Is this for a VR social app, a cinematic, or a mobile game? The answers dictate every technical decision. A VR avatar needs extreme deformation clarity for lip sync, while a background cinematic character might prioritize subdivision surface detail. I write down key specs: target polygon count, required bone count, and the primary animations (e.g., walking, gesturing).
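To make those specs concrete, I find it helps to capture them as structured data before modeling begins. A minimal sketch in Python (the class and field names here are my own, not from any particular pipeline; the budgets are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class SpatialPersonSpec:
    """Production spec captured before modeling begins (names are illustrative)."""
    target: str                           # e.g. "vr_social", "cinematic", "mobile_game"
    max_triangles: int                    # polygon budget for the target platform
    max_bones: int                        # skeleton budget
    primary_animations: list = field(default_factory=list)

# Example: a VR avatar prioritizes lip sync, so it appears in the animation list.
vr_avatar = SpatialPersonSpec(
    target="vr_social",
    max_triangles=40_000,
    max_bones=90,
    primary_animations=["idle", "walk", "gesture", "lip_sync"],
)
print(vr_avatar.max_triangles)  # 40000
```

Writing the spec down as data, rather than keeping it in my head, means every later decision (retopology, rig design, texture resolution) can be checked against it.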
My quick checklist:
- Target platform and use case (VR, cinematic, mobile game)
- Polygon and bone budgets
- Primary animations and deformation requirements (e.g., lip sync, full-body locomotion)
- Real-time constraints versus offline rendering freedom
This is where I establish the critical edge flow. I typically start with a primitive or a very basic humanoid base. My focus is entirely on topology: creating loops around the eyes, mouth, and joints that will support natural deformation. I keep the mesh low-poly and quad-dominant at this stage. For rapid prototyping, I often use Tripo AI to generate a base mesh from a text prompt or sketch, which gives me a great starting topology to refine, rather than starting from a cube.
With a clean base, I subdivide or use sculpting tools to add secondary and tertiary forms—muscle definition, wrinkles, facial features. Crucially, I constantly reference real-world proportions. I use a standard human scale (usually 1.8 meters) and check proportions (e.g., head-to-body ratio) against my concept. This ensures the character feels grounded in its space.
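The proportion check reduces to simple arithmetic. A hypothetical helper (the 7.5–8 "heads tall" range is the standard figure-drawing convention for realistic adults):

```python
def head_to_body_ratio(total_height_m: float, head_height_m: float) -> float:
    """Return how many 'heads' tall the character is; realistic adults are ~7.5-8."""
    return total_height_m / head_height_m

# A 1.8 m character with a 0.23 m head is about 7.8 heads tall.
ratio = head_to_body_ratio(1.8, 0.23)
print(round(ratio, 1))  # 7.8
```

Stylized characters deliberately break this ratio (chibi figures sit around 2–4 heads), but I still measure it so the stylization is a choice rather than an accident.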
I create a skeleton (armature) that matches the topology's flow. Weight painting is where good topology pays off; clean loops result in predictable, smooth joint deformation. I always test a basic set of poses (idle, walk cycle, extreme expressions) to identify and fix any weighting issues before the model goes to an animator.
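Before handoff, a quick programmatic pass over the skin weights catches most weighting problems. A sketch, assuming weights are available as per-vertex lists of bone influences (the function and the 4-influence cap are illustrative; 4 is a common real-time limit):

```python
def check_skin_weights(weights_per_vertex, tol=1e-4, max_influences=4):
    """Flag vertices whose bone weights don't sum to 1 or exceed the influence cap."""
    problems = []
    for i, weights in enumerate(weights_per_vertex):
        if abs(sum(weights) - 1.0) > tol:
            problems.append((i, "weights not normalized"))
        if sum(1 for w in weights if w > 0) > max_influences:
            problems.append((i, "too many influences"))
    return problems

# Vertex 1 is only 90% weighted -- a classic cause of geometry drifting in poses.
print(check_skin_weights([[0.6, 0.4], [0.5, 0.4]]))
# -> [(1, 'weights not normalized')]
```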
The rule is simple: edge loops must follow the contours of underlying anatomy. Loops should circle the eyes, run from the nose to the mouth, and follow the major muscle groups around the shoulders and limbs. I avoid triangles and n-gons in deformation areas at all costs, as they cause pinching and artifacts during animation.
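A quick way to audit this is to count face types across the mesh. A minimal sketch, assuming faces are available as lists of vertex indices:

```python
def topology_report(faces):
    """Summarize polygon types, given each face as a list of vertex indices."""
    quads = sum(1 for f in faces if len(f) == 4)
    tris = sum(1 for f in faces if len(f) == 3)
    ngons = sum(1 for f in faces if len(f) > 4)
    return {"quads": quads, "tris": tris, "ngons": ngons}

# Two quads and one triangle -- the triangle needs attention if it sits on a joint.
faces = [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6]]
print(topology_report(faces))  # {'quads': 2, 'tris': 1, 'ngons': 0}
```

A nonzero triangle or n-gon count isn't automatically wrong; what matters is whether those faces sit in deformation areas, which is a manual inspection.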
Common pitfalls I avoid:
- Triangles or n-gons at joints or on the face, which pinch during animation
- Edge loops that cut across the flow of underlying anatomy instead of following it
- Insufficient loop density around the eyes, mouth, and finger joints, which makes clean deformation impossible
I model with the end in mind. For a game character, I maintain a low-poly cage and bake high-frequency details from a sculpt onto normal maps. For film, I might work directly with a dense subdivision surface mesh. The key is to place polygon density where it's needed for deformation or silhouette, and reduce it everywhere else.
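For game targets, a common starting heuristic (my rule of thumb, not a requirement of any specific engine) is to halve the triangle budget per LOD level, then adjust by eye:

```python
def lod_budget(base_triangles, levels):
    """Halve the triangle budget per LOD level -- a common starting heuristic."""
    return [base_triangles // (2 ** i) for i in range(levels)]

# A 40k hero character with three progressively cheaper LODs:
print(lod_budget(40_000, 4))  # [40000, 20000, 10000, 5000]
```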
A model built at the wrong scale is a nightmare to integrate. I always work in real-world units (meters). I keep a reference object—like a standard door or a chair—in my scene to visually gauge scale. Consistent proportions are what make a character feel believable, even if it's stylized.
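Scale mistakes are easy to catch programmatically. A hypothetical sanity check that guesses whether a humanoid was accidentally modeled in centimeters instead of meters:

```python
def detect_unit_mismatch(bbox_height, expected_m=1.8):
    """Guess the unit system from a humanoid's bounding-box height.

    A height near 1.8 'units' is likely meters; near 180 suggests centimeters.
    """
    if 0.5 * expected_m <= bbox_height <= 2.0 * expected_m:
        return "meters (ok)"
    if 50 * expected_m <= bbox_height <= 200 * expected_m:
        return "centimeters -- scale by 0.01 before export"
    return "unknown scale -- check manually"

print(detect_unit_mismatch(180.0))  # centimeters -- scale by 0.01 before export
print(detect_unit_mismatch(1.8))   # meters (ok)
```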
For brainstorming and overcoming the blank canvas, AI is unparalleled. I can input a text description like "cyberpunk samurai with hydraulic arms" and get a viable 3D base mesh in seconds. This is incredible for rapid prototyping, generating asset variations, or finding a creative direction I wouldn't have initially considered. It compresses hours of blocking-in into a moment.
When I need specific, controlled results—correcting anatomy to match a concept sheet, designing complex hard-surface parts, or crafting a specific emotional expression—manual modeling and sculpting are irreplaceable. I have complete control over every vertex and edge loop, which is essential for final-quality assets and solving specific technical or artistic problems.
My preferred workflow leverages the best of both. I'll use Tripo AI to generate 2-3 base mesh options from a prompt. I'll pick the most promising one, not as a final asset, but as a topologically sound starting block. Then I import it into my main DCC tool (such as Blender or Maya) for serious refinement: fixing proportions, optimizing topology for my specific rig, and adding precise detail through sculpting. This approach gives me a massive head start while retaining full artistic and technical control.
After UV unwrapping—a step I ensure is clean to avoid stretching—I texture based on the spatial purpose. For real-time use, I create efficient texture atlases. I use PBR (Physically-Based Rendering) workflows for consistency across engines. Material IDs are crucial; I separate materials for skin, eyes, teeth, and clothing to allow for flexible shader adjustments later.
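Texture efficiency ultimately comes down to texel density: how many texture pixels cover a meter of surface. A sketch of the calculation, assuming you can measure a UV island's area in 0–1 UV space and its corresponding 3D surface area:

```python
import math

def texel_density(texture_px, uv_area, world_area_m2):
    """Texels per meter for a UV island: higher means more texture detail.

    uv_area is the island's area in 0-1 UV space; world_area_m2 is the
    corresponding 3D surface area in square meters.
    """
    return texture_px * math.sqrt(uv_area / world_area_m2)

# A face occupying 25% of a 2048 map over 0.04 m^2 of surface:
print(round(texel_density(2048, 0.25, 0.04)))  # 5120
```

Keeping density roughly uniform across islands (with a deliberate boost for the face) is what prevents visible resolution seams between body parts.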
Beyond basic rigging, I prepare the character for its role. This might mean adding blend shapes for facial animation, setting up corrective shapes for better knee and elbow bends, or defining attachment points for weapons/accessories. I create a simple animation test scene to verify everything works in context before handing it off.
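Under the hood, blend shapes are just per-vertex deltas added to the base mesh, scaled by a weight. A minimal illustration (the data layout here is my own, not any particular DCC's):

```python
def apply_blend_shapes(base, deltas, weights):
    """Linearly combine shape-key deltas onto base vertex positions.

    base: list of (x, y, z); deltas: {shape_name: list of (dx, dy, dz)};
    weights: {shape_name: float in 0-1}.
    """
    result = [list(v) for v in base]
    for name, w in weights.items():
        for i, (dx, dy, dz) in enumerate(deltas[name]):
            result[i][0] += w * dx
            result[i][1] += w * dy
            result[i][2] += w * dz
    return [tuple(v) for v in result]

# Half-open jaw: apply the 'jaw_open' delta at 50% strength to one vertex.
base = [(0.0, 0.0, 0.0)]
deltas = {"jaw_open": [(0.0, -0.02, 0.0)]}
print(apply_blend_shapes(base, deltas, {"jaw_open": 0.5}))
```

Corrective shapes for knees and elbows work the same way, except the weight is driven by the joint angle instead of being set by an animator directly.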
This final step is critical. I meticulously check:
- Real-world scale and units (meters), with the pivot at the character's feet
- Clean transforms (rotation and scale applied/frozen)
- UVs sitting within the 0-1 tile with no unintentional overlaps
- Material IDs and texture assignments intact
- Skeleton hierarchy, bone count, and naming matching the target engine's conventions
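Checks like these lend themselves to automation before every export. A hypothetical pre-export validator over an illustrative asset description (the dict keys are my own, standing in for whatever your pipeline exposes):

```python
def validate_export(asset):
    """Run pre-export sanity checks; returns a list of human-readable issues."""
    issues = []
    if not 1.0 <= asset["height_m"] <= 2.5:
        issues.append("suspicious scale -- expected a human-sized figure in meters")
    if asset["pivot"] != (0.0, 0.0, 0.0):
        issues.append("pivot not at the origin (character's feet)")
    if asset["ngons_in_deform_areas"] > 0:
        issues.append("n-gons remain in deformation areas")
    if not asset["uvs_within_0_1"]:
        issues.append("UVs fall outside the 0-1 tile")
    return issues

asset = {"height_m": 1.8, "pivot": (0.0, 0.0, 0.0),
         "ngons_in_deform_areas": 0, "uvs_within_0_1": True}
print(validate_export(asset))  # []
```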
By treating the spatial person as an integrated system of data from day one, this final export is never a panic-filled fix-it session, but a smooth, predictable conclusion to the creation process.