Monocular Depth Cues

Static monocular depth cues are the features of a flat image that allow us to perceive depth on the basis of “the position of objects on the retinal image, the size of their retinal image and the effects of lighting on the retinal image” (Yantis, 2014, p. 193). These cues work because of assumptions we make based on our experience of the 3D world. For example, when we see a painting in which one item covers part of another, we assume that the covered item is a whole object sitting behind the other, not a fragment cut out in exactly the shape of the item in front of it.

There are two primary cues related to the position of objects on the retinal image: (1) partial occlusion (interposition) and (2) relative height. With partial occlusion, we can determine depth when we see one item partially covering another. This tells our visual system that the item being partially covered is behind, or farther away than, the item covering it. Relative height relies on perspective: the farther away things are, the closer they appear to eye level (the horizon). Things below you appear below the horizon, but as they get farther away they rise toward it. Similarly, things above you appear above the horizon, and as they get farther away they come down toward it. Anyone who has learned 1-point and 2-point perspective in an art class has seen this concept in practice.
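
The relative height cue can be illustrated with a small sketch. It assumes a simple pinhole projection, which is a simplification and not something taken from Yantis (2014); the function name `image_height`, the eye height of 1.6 m, and the unit focal length are all hypothetical values chosen for illustration.

```python
def image_height(eye_height_m, point_height_m, distance_m, focal_length=1.0):
    """Vertical image position of a point, measured from the horizon line.

    Assumes a pinhole projection with the viewer looking straight ahead.
    Positive values lie above the horizon, negative values below it.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return focal_length * (point_height_m - eye_height_m) / distance_m


EYE_HEIGHT_M = 1.6  # assumed eye height, for illustration only

for distance in (2, 10, 50):
    ground = image_height(EYE_HEIGHT_M, 0.0, distance)    # point on the ground
    overhead = image_height(EYE_HEIGHT_M, 3.0, distance)  # point 3 m above the ground
    print(f"{distance:>3} m  ground {ground:+.3f}  overhead {overhead:+.3f}")
```

Under these assumptions, both the ground point and the overhead point move toward zero (the horizon) as the distance grows, one from below and one from above, which is the pattern the relative height cue describes.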

Retinal image size is related to three concepts: (1) the size-distance relationship, (2) the visual angle and (3) size perspective. There are four primary cues related to the size of an object’s retinal image: (1) familiar size (we know how large a familiar object should look at a given distance from our eyes), (2) relative size (when we can assume two items are the same physical size and one appears half the size of the other, we can assume that it is about twice as far away), (3) texture gradients (texture elements that we assume are roughly the same size appear smaller the farther away they are), and (4) linear perspective (parallel lines appear to converge as they recede into the distance).
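
The size-distance relationship behind the relative size cue can be sketched numerically. The example below uses the standard visual angle formula, 2·atan(size / (2·distance)); the function names and the 0.6 m object viewed at 5 m and 10 m are hypothetical choices for illustration, not values from the text.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Visual angle (in degrees) subtended by an object viewed head-on."""
    if size_m <= 0 or distance_m <= 0:
        raise ValueError("size_m and distance_m must be positive")
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))


def relative_distance(angle_a_deg, angle_b_deg):
    """How many times farther B is than A, assuming both are the same physical size."""
    tan_a = math.tan(math.radians(angle_a_deg) / 2)
    tan_b = math.tan(math.radians(angle_b_deg) / 2)
    return tan_a / tan_b


# Two objects assumed to be the same size (0.6 m across) at 5 m and 10 m.
near = visual_angle_deg(0.6, 5)
far = visual_angle_deg(0.6, 10)

print(f"near: {near:.2f} deg, far: {far:.2f} deg")
print(f"far object is {relative_distance(near, far):.2f}x as distant")
```

For small angles like these, doubling the distance roughly halves the visual angle, which is why an item that looks half as large is judged to be about twice as far away when the two are assumed to be the same size.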

There are three primary cues related to the effects of lighting on the retinal image: (1) atmospheric perspective (the farther away an object is, the hazier it appears, because we view it through more of the atmosphere), (2) shading (a curved surface is not a single solid color, and the variation in shading indicates the object’s shape), and (3) cast shadows (where an object’s shadow falls helps indicate the object’s position and its distance from our eye).
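
Atmospheric perspective can also be sketched with a toy model. The example assumes that apparent contrast falls off exponentially with viewing distance, a common simplification that is not taken from Yantis (2014); the function name `apparent_contrast` and the extinction value of 0.1 per kilometer are hypothetical and chosen only for illustration.

```python
import math


def apparent_contrast(intrinsic_contrast, distance_km, extinction_per_km=0.1):
    """Contrast of an object against the sky after viewing it through haze.

    Assumes simple exponential attenuation with distance; the default
    extinction coefficient is an illustrative value, not a measured one.
    """
    if distance_km < 0 or extinction_per_km < 0:
        raise ValueError("distance_km and extinction_per_km must be non-negative")
    return intrinsic_contrast * math.exp(-extinction_per_km * distance_km)


for distance in (0, 1, 5, 20):
    print(f"{distance:>2} km  contrast {apparent_contrast(1.0, distance):.3f}")
```

Under this assumption, contrast only ever decreases with distance, so a hazier, lower-contrast object is a reasonable sign that it is farther away.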

Image 1. This is a motorcycle that I drew. It uses several monocular cues to depict depth. First, shading on every part indicates that the surfaces are curved rather than flat. We also see relative size in action: both wheels are the same physical size, but the back wheel is drawn slightly smaller, indicating that it is farther away than the front wheel. Finally, cast shadows appear on the exhaust, below the fairing and gas tank, and around the wheels and rims.

Image 2 is a fabulous still life that incorporates many of the depth cues. Cast shadows indicate position, shading shows the curvature of each item, and partial occlusion orders the items in depth: the pear covers the pitcher, the grapes cover the pitcher, and the pitcher covers the wall.

Image 3’s primary cue is atmospheric perspective: objects farther away have a distinct haziness, and we have come to expect that hazier means farther away.


1. Yantis, S. (2014). Sensation and perception. New York, NY: Worth Publishers.  Retrieved from