Journal of Mathematical Psychology 41, 89-98 (1997), Article No. MP971151
Hyperbolic Representation of Global Structure of Visual Space
Tarow Indow
University of California, Irvine
Most studies on visual perception assume a limited region in visual space to be Euclidean. In a series of alley experiments, in which extensive configurations of stimulus points in a frameless space were dealt with, it was found that a horizontal or slanted plane extending from the subject is best described by hyperbolic geometry, whereas a frontoparallel plane in front of the subject is best described by Euclidean geometry. Theoretical problems around these findings and two properties of visual space (VS) were discussed: (1) VS is closed in the sense that no percepts can appear at an infinite distance. (2) VS is dynamic in the sense that its global structure critically depends upon the configuration of objects in the physical space. Two questions were also discussed: (1) How far is VS extended beyond the farthest percept under various conditions? (2) How does the sky, as the boundary of VS, in daytime as well as at night, change its shape in accordance with what we see in VS? © 1997 Academic Press
RIEMANNIAN REPRESENTATION OF VISUAL SPACE
Let us denote by VS the space we are perceiving in front of ourselves and by X the physical space from which light stimuli come. VS is a coherent complex that is segmented into figures, background, and the self. The self is a percept and it is to be distinguished from the body that is a physical object in X (Kohler, 1929). Both VS and the self are final products of the long series of processes. The series underlying VS consists of the physical process in X from physical objects to the retina and the physiological process from the retina to the brain. The process underlying the self stems from proprioceptive stimulation. Some parts of the self, such as hands, may be visible, but the main function of the self in VS is to determine fundamental directions of VS, above and below as well as to the right and left, and to function as the origin for distances, how far away figures are. We regard VS to be three-dimensional. It is through VS that we guide our bodies in X so as to reach, manipulate, or avoid objects therein. The global structure of VS was described before in (Indow, 1991, 1995). The following five features will be relevant to the discussion in this article.
Correspondence and reprint requests should be addressed to Tarow Indow, Department of Cognitive Sciences, School of Social Sciences, University of California, Irvine, CA 92697. E-mail: tindow@uci.edu.
VS1. VS is closed. At the end of our line of regard, no matter where it is directed, there is always a percept appearing at a finite distance. Indoors, it may be a table or the wall, and outdoors, trees, terrain, or the sky. Neither infinity nor nothing can be percepts. In a segmented VS, however, there exists a terra incognita between the self and percepts.
VS2. We perceive a number of geometrical patterns in VS; straight lines and their lengths, a flat plane, angles, betweenness, parallelness, etc.
VS3. VS is dynamic. The size and structure of VS change according to the physical condition in X, especially to the farthest physical objects. In other words, VS is not a solid container into which various percepts are placed. Rather it is like a balloon.
VS4. Under ordinary conditions, the sky appears as a vault, and when visible, the horizon is always at the "eyelevel" no matter whether the eyes are directed upward or downward (Sedgwick, 1980, 1986). Hence, according to the height of eye and the direction of regard, the areas occupied by sky and ground or ocean change in VS.
VS5. VS is stable. Unless we fixate something, VS is a product based on multiple glances. Nevertheless, we usually see a coherent VS. When the direction of regard is changed or the body moves, we feel the change of the direction or the position of self in a stable VS.
Most people may think that none of the mathematically well-established geometries is suitable for describing the global structure of VS. However, to account for results of the so-called alley experiment, Luneburg (1947, 1950), a geometer, postulated that VS is a Riemannian space (R) of constant curvature (K) and conjectured that K<0. The idea was reiterated by Blank (1958, 1959), a theoretical physicist. Following the style of Busemann (1942, 1955), the basic motives for this postulation can be briefly summarized as follows:
RS1. If VS is finitely compact and convex and if the distance δ we see between any two percepts in VS satisfies Fréchet's conditions (δik ≈ δki ≻ 0, δii ≈ 0, δij ⊕ δjk ≻ or ≈ δik, where "≈, ≻, and ⊕" respectively denote "to appear equal to, larger than, and to be concatenated"), then VS is regarded as a metric space. These prerequisites are not contradictory to our perceptual experience.
0022-2496/97 $25.00
Copyright 1997 by Academic Press All rights of reproduction in any form reserved.
RS2. If VS is locally Euclidean, as tacitly assumed in most studies of visual perception in a limited location in VS, then VS is a Riemannian space R. The geometrical property of a local region in R is characterized by the Gaussian total curvature K therein.
RS3. If a space R allows free mobility, K should not change from region to region. For R of constant curvature, three geometries are possible: elliptic (K>0), Euclidean (K=0), and hyperbolic (K<0). It is called the Helmholtz-Lie problem to discuss the conditions of physical space under which a physical object can move from one position to another without changing its shape and size. The conclusion is that the physical space must be structured according to one of these geometries (Busemann, 1955; Freudenthal, 1965; Suppes et al., 1989). As to VS, we have to deal with invariance of a perceived figure, not that of a solid object as its physical counterpart. With the same logic, we can argue that, if we can see two congruent figures at any two locations in a subspace of VS by appropriately adjusting the size and shape of respective physical objects, then the subspace must be an R of constant K (Indow, 1991). Furthermore, if a similarity transformation is possible in a subspace of VS to change the size of a figure keeping its shape strictly invariant, then the subspace must be Euclidean. Mathematically speaking, unless K=0, a proportional change in lengths introduces some distortion in angles.
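The closing remark of RS3 can be checked numerically. The following is a minimal sketch, assuming a surface of constant curvature K and the textbook law of cosines for that surface; the function name and the side lengths are illustrative and do not come from the article. It computes the interior angle of an equilateral geodesic triangle, so that a proportional change of all sides can be compared across the three geometries.

```python
import math

def equilateral_angle_deg(side, K):
    """Interior angle (degrees) of an equilateral geodesic triangle.

    Standard law of cosines for a surface of constant curvature K
    (textbook geometry, not taken from the article).
    """
    if K == 0:
        return 60.0
    if K < 0:
        a = math.sqrt(-K) * side
        cos_angle = math.cosh(a) / (1.0 + math.cosh(a))
    else:
        a = math.sqrt(K) * side  # a proper triangle needs a < 2*pi/3
        cos_angle = math.cos(a) / (1.0 + math.cos(a))
    return math.degrees(math.acos(cos_angle))

# Scaling the sides leaves the angles unchanged only when K = 0.
for K in (-1.0, 0.0, 1.0):
    angles = [round(equilateral_angle_deg(s, K), 2) for s in (0.5, 1.0, 2.0)]
    print(f"K={K:+.0f}: angles for sides 0.5, 1, 2 ->", angles)
```

Under these assumptions the angle stays at 60 degrees for K=0, shrinks with size for K<0, and grows with size for K>0, which is the distortion a similarity transformation would have to avoid.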
Wang (1951, 1952) discussed free mobility of small line segments, instead of figures, and showed that the possible geometries of the so-called G-spaces of Busemann (more general than R) are limited to the above-mentioned three if the dimensionality of the G-space is even or three. We are concerned only with 3D VS as R3 or subplanes in VS as R2. This condition may seem easier to test experimentally. However, if we design an experiment to verify the prerequisite of VS being R of constant curvature, it seems to me more important to rely upon a more overall cognitive impression such as the perceptual congruence or similarity rather than to carry out piecemeal tests of mobility of line segments. I do not know any psychophysical experiment in which the possibility of continuous maintenance of perceptual congruence or similarity of a figure over a wide range of VS has been carefully tested. The possibility of perceptual similarity transformation will be a necessary condition to have a realistic picture (trompe l'oeil) of a scene. This problem will not be discussed in this article. The possibility of perceptual congruence is presupposed in the procedure to match two figures. This procedure is widely used in many experiments. It is not clear to what extent this overall impression guarantees that the subject sees strict identities in lengths and angles. However, whenever the matching is possible in a subspace of VS, we can say that the subspace can be regarded as an R of constant curvature within that degree of uncertainty. The physical space X is a 3D Euclidean space E3.
We can fit theoretical equations to the data of so-called alley experiments, and all the results were shown in Indow (1991) and Indow & Watanabe (1984a). In this experiment, pairs of stimulus points [QLi, QRi] are presented, one on the left and the other on the right of the median line, with i=1 to n from the farthest pair to the nearest pair to the subject. All are on the horizontal plane passing through the eyes, and the farthest pair [QL1, QR1] is fixed. The subject is asked to adjust the positions of the remaining Q's in two ways. The head is fixed, but the subject is allowed to move the eyes during the adjustments. The resulting configuration [QLi, QRi] is called a parallel alley (P) when adjusted so that each series, QLi and QRi, appears straight and the two are parallel, and a distance alley (D) when adjusted so that all pairs, QLi and QRi, appear to have the same lateral distances. In experiments in which no framework, such as a wall or the edges of a table, is visible, the two alleys do not coincide and the D-alley, [QLi, QRi]D, lies outside of the P-alley, [QLi, QRi]P. This was true not only for [QLi, QRi] on the horizontal plane but also for [QLi, QRi] on a slanted plane passing through the eyes. This fact implies that each of these planes in VS with various upward directions of regard is an R2 of K<0. We had satisfactory fits of theoretical equations to [QLi, QRi]P and [QLi, QRi]D by adjusting two free parameters, K and σ. The latter parameter σ will be explained in the next section. K always turned out to be negative, which implies that VS under this condition is structured as a hyperbolic space.
EUCLIDEAN MAP OF VS
There are various ways to depict Rm in a Euclidean space EM, where m and M denote respective dimensions. Herein, Poincaré's model for R2 or R3 will be used to define the theoretical alley curves. In contrast to other models in which a larger dimension M is necessary to represent Rm, no extra dimension is needed in this model (M=m), and let us call it the Euclidean map (EM) of R. Although this can be used as a model for either case K>0 or K<0 (Indow, 1979), only the hyperbolic case will be explained herein. Fig. 1a shows EM2 for VS2 slanted with regard to the horizontal plane, where (ξ, η, ζ) are the Cartesian coordinate axes with the origin O corresponding to the self. A point P will be defined by polar coordinates ρ0, φ, and ϑ, the meaning of which will be clear from the figure. Fig. 1c is the physical space X3 where the two eyes are denoted by R, L and a stimulus point is denoted by Q. What are meant by γ, e0, ϕ, and θ will also be clear. The following characteristics of EM are important for the subsequent discussion.
EM1. If a slanted plane VS2 with an elevation angle ϑ is regarded as R2 of constant K, it is represented within the basic circle, BC(ϑ). BC(ϑ) in the region ξ>0 corresponds to infinite perceptual radial distance (δ0=∞). As mentioned in VS1, all Q's beyond a certain distance, max e0 in Fig. 1c, appear in VS at a finite distance, max δ0. This visual radial distance is represented by max ρ0 in Figs. 1a and 1b. BC(ϑ) being a circle is due to the condition that K is constant within the slanted plane. This condition does not necessarily imply that either max ρ0 or max e0 must be of constant length irrespective of angle ϕ or φ. If the entire VS3 is regarded as an R of constant K, it is represented within the semisphere obtained by rotating BC(ϑ), −90°<ϑ<90°.
EM2. Within a BC(ϑ) and in the range of ϑ in which K remains constant, a geodesic between any two points in R2(ϑ), which represents a straight line in VS, is given by the circle that passes these points and is orthogonal to BC(ϑ) at both ends (circles A and B passing through P in Fig. 1a). As shown by B, one side can be in the region where ξ is negative, the region having no perceptual counterpart in VS. Any straight line starting from O in EM, such as the axis ξ(ϑ) or η, is orthogonal to the BC(ϑ), and hence represents a perceptual straight line in VS extending from the self in that direction. The length of this perceptual radial line (depth distance) will be denoted by δ0.
EM3. VS and EM are not isometric but conformal. Denote by ρjk the length of the above mentioned arc between Pj and Pk, and by δjk the perceptual distance between corresponding points in VS. Then, δjk ≠ ρjk, but

$$q = \frac{\sqrt{-K}}{2}, \qquad K < 0, \tag{1}$$

$$\delta_{jk} \propto q^{-1} \sinh^{-1}\!\left[ q\rho_{jk} \left(1 + (q\rho_{0j})^{2}\right)^{-1/2} \left(1 + (q\rho_{0k})^{2}\right)^{-1/2} \right], \tag{2}$$

where ∝ denotes "being proportional." The same size of ρjk in EM represents shorter visual distance δjk in VS when points are far from O and hence the radial distances from O to Pj and Pk, ρ0j and ρ0k, are large. When it is not necessary to specify a point, radial distances in VS and in EM will be represented by δ0 and ρ0. Then, the two are related as follows:

$$\delta_{0} \propto q^{-1} \tanh^{-1}\left[ q\rho_{0} \right]. \tag{3}$$

Because δ0 to BC(ϑ) is infinity, the radius of BC(ϑ) is ρ0=q^(-1). Derivations of these equations are explained in Indow (1991). In contrast to distance, any angle in EM is the undistorted representation of the corresponding perceived angle in VS. In other words, EM is conformal with VS. Hence, a right angle in EM implies that the corresponding perceptual angle in VS is also a right angle.
EM4. If max ρ0 remains constant independent of the directional angle φ in VS2, the relationship between the dotted circle and BC(ϑ) depends only upon K. In this case, it is convenient to define max ρ0 to be 2 as the unit to measure ρjk and ρ0. In terms of this unit for EM, the range of K is so constrained that −1<K<0.
FIG. 1. Euclidean map EM of visual space VS as R of K<0 (a, b) and physical space X (c).
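The relation between K, the basic circle, and the radial distance can be made concrete with a short calculation. The sketch below is only an illustration: it implements (1) and (3) with the proportionality constant in (3) set to 1, which is an assumption, and the function names are hypothetical. K=−0.38 is an example value reported later in the article for one fit.

```python
import math

def q_from_K(K):
    """Eq. (1): q = sqrt(-K)/2, defined for the hyperbolic case K < 0."""
    if K >= 0:
        raise ValueError("the hyperbolic case requires K < 0")
    return math.sqrt(-K) / 2.0

def radial_delta0(rho0, K):
    """Eq. (3) with the proportionality constant assumed to be 1."""
    q = q_from_K(K)
    x = q * rho0
    if not 0.0 <= x < 1.0:
        raise ValueError("rho0 must lie inside the basic circle (rho0 < 1/q)")
    return math.atanh(x) / q

K = -0.38                      # example value from the article
bc_radius = 1.0 / q_from_K(K)  # radius of BC, where delta0 diverges
print("radius of BC:", round(bc_radius, 3))
print("delta0 at max rho0 = 2:", round(radial_delta0(2.0, K), 3))
for rho0 in (0.5, 1.0, 2.0, 3.0):
    print(rho0, "->", round(radial_delta0(rho0, K), 3))
```

With max ρ0 defined as 2, the radius of BC exceeds 2 exactly when −1<K<0, which is the constraint stated in EM4.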
THEORETICAL EQUATIONS AND MAPPING FUNCTIONS
Theoretical geodesics representing P- and D-alleys on the horizontal plane VS2 are readily defined in EM from the properties stated above (Fig. 1b where ϑ=0). There are an infinite number of geodesic circles passing through PL1 and PR1 which do not intersect within the semicircle representing max ρ0. Luneburg defined as the representation of P-alley such a set of non-intersecting geodesics that are orthogonal to the η-axis, because P-alley extends in the direction of the axis ξ(ϑ) and the two axes, ξ(ϑ) and η, are orthogonal. Let us denote by H a plane or a line in VS that appears frontoparallel to the self. The representation of an H-line in the horizontal plane of VS is given by a geodesic circle that is orthogonal to the ξ(ϑ)-axis (Fig. 1b). The D-circles passing through PL1 and PR1 are the loci of geodesic on H that represents the same constant lateral length as δLR. Notice that the curves connecting these loci are circles in EM but not geodesics because they are not orthogonal to BC(ϑ). As shown by heavy curves in Fig. 1b, when K<0, the D-circles are outside of the P-circles for any pairs [PL, PR] closer than [PL1, PR1].
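The construction of the two curves in EM can be sketched in a few lines. This is an illustration under stated assumptions, not the fitting procedure used in the experiments: coordinates are taken in units of qξ and qη so that BC is the unit circle, only the right-hand curve (η>0) is computed, and the function names and the example point (0.5, 0.1) are hypothetical. The P-curve is the circle through the farthest point that is orthogonal to BC and to the η-axis; the D-curve is the circle through the same point and through the two ends of the ξ-axis on BC, which is the locus of constant hyperbolic distance from the ξ-axis.

```python
import math

def p_alley_eta(xi, xi1, eta1):
    """eta on the P-geodesic through (xi1, eta1).

    The circle is orthogonal to the unit basic circle and to the eta-axis,
    so its centre (0, c) lies on the eta-axis and r^2 = c^2 - 1.
    """
    c = (xi1 ** 2 + eta1 ** 2 + 1.0) / (2.0 * eta1)
    r2 = c * c - 1.0
    return c - math.sqrt(r2 - xi * xi)

def d_alley_eta(xi, xi1, eta1):
    """eta on the D-curve through (xi1, eta1).

    The circle passes through (-1, 0) and (1, 0) on the basic circle;
    its centre is (0, -k) and its squared radius is 1 + k^2.
    """
    k = (1.0 - xi1 ** 2 - eta1 ** 2) / (2.0 * eta1)
    return -k + math.sqrt(1.0 + k * k - xi * xi)

xi1, eta1 = 0.5, 0.1  # hypothetical position of the farthest right-hand point
for i in range(6):
    xi = 0.1 * i
    print(f"xi={xi:.1f}  P: {p_alley_eta(xi, xi1, eta1):.4f}  "
          f"D: {d_alley_eta(xi, xi1, eta1):.4f}")
```

For every ξ nearer to O than the farthest point, the D value exceeds the P value, and the two coincide at the farthest point, in line with the heavy curves described for Fig. 1b.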
In order to have theoretical curves for stimulus point Q, point P(ρ0, φ, ϑ) on these theoretical curves in EM must be mapped into the physical space X. Let us represent the position of Q in terms of bipolar coordinates, γ (or e0), ϕ, and θ (Fig. 1c). In the region where the convergence angle γ is effective, we can use γ instead of e0. In order to account for the experimental result of the D-alley being outside of the P-alley in X, it is sufficient to assume that the mapping is "monotone." In order to have theoretical curves in X, however, it is necessary to quantify relationships between (ρ0, φ, ϑ) and (γ (or e0), ϕ, θ). Luneburg defined the following simple mapping functions in the a priori manner:

$$\rho_{0} = g(\gamma; \sigma) = 2e^{-\sigma\gamma}, \qquad \varphi = \phi, \qquad \vartheta = \theta. \tag{4}$$
Let us call (4) Luneburg's mapping functions. In so far as P- and D-alleys on a slanted VS in laboratory experiments are concerned, the theoretical curves projected from EM into X through Luneburg's mapping functions describe the data well. The equations have two parameters, K in the equations in EM and σ in the mapping functions to X. The most appropriate values of K and σ were always estimated for the configuration [Qj] constructed by each subject (e.g., Indow, 1991; Indow & Watanabe, 1984a). In some alley experiments, Q's were also adjusted to form frontoparallel lines. With a fixed point Q0i on the x-axis at a distance e0i, [QLi, QRi]P and [QLi, QRi]D were adjusted to satisfy the condition that the five Q's appear frontoparallel (an H-curve). The distance e0i was appropriately defined each for [QLi, QRi]P and [QLi, QRi]D, i>1. All the results were fitted by the respective theoretical equations with the same values of K and σ. For instance, K=−0.38 and σ=19.8 (K with the unit defined in EM4 and γ in radians) gave such curves in X that passed through the total configuration of stimulus points [Qi] for P- and D-alleys as well as for H-curves passing through Q0i.
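As an illustration of how the first mapping function in (4) combines with (3), the sketch below takes a convergence angle γ in radians and returns ρ0 and a perceptual radial distance. The function names are hypothetical, the proportionality constant in (3) is assumed to be 1, and the γ values in the loop are arbitrary; only K=−0.38 and σ=19.8 are taken from the example above.

```python
import math

def luneburg_rho0(gamma, sigma):
    """First mapping function in (4): rho0 = 2*exp(-sigma*gamma)."""
    return 2.0 * math.exp(-sigma * gamma)

def delta0_from_gamma(gamma, sigma, K):
    """Chain (4) with (3); the constant of proportionality is assumed 1."""
    q = math.sqrt(-K) / 2.0
    return math.atanh(q * luneburg_rho0(gamma, sigma)) / q

K, sigma = -0.38, 19.8  # example values quoted in the text
for gamma in (0.0, 0.01, 0.02, 0.05):  # arbitrary convergence angles (rad)
    rho0 = luneburg_rho0(gamma, sigma)
    d0 = delta0_from_gamma(gamma, sigma, K)
    print(f"gamma={gamma:.2f}  rho0={rho0:.3f}  delta0={d0:.3f}")
```

In this sketch γ=0 gives ρ0=2, the value defined as max ρ0, and both ρ0 and the radial distance decrease monotonically as γ grows, that is, as the stimulus comes nearer.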
MP1. The mapping functions (4) are completely egocentric. This is the reason why all the experiments were performed only in frameless VS: small light points in the dark or small objects on a table at eyelevel (θ=0) where the edges of the table and the wall of the room were invisible.
MP2. The last two equations in (4) imply that directions from the self in VS are preserved in EM (EM3). The first equation g(γ; σ) implies that ρ0 and hence δ0 depend upon γ only and the equation remains invariant for all directions from the self. If this is the case, VS is an R satisfying isotropy and max ρ0 forms a circle in Fig. 1.
MP3. The first equation is meaningful only in the region where γ changes significantly with e0. In most of the laboratory experiments we performed, the distance e01 to Q1 was less than 5 m. In an experiment carried out in a gymnasium, e01 to Q1 was about 16.10 m. Including this experiment, individual values of K and σ obtained in experiments before 1979 were listed in Table 1 of Indow and Watanabe (1984a).
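The remark in MP3 about the range in which γ is effective can be illustrated as follows. This is a sketch under assumptions that are not in the article: the stimulus is taken to lie straight ahead on the median line, e0 is measured from the midpoint between the eyes, and the interpupil distance is set to a hypothetical 0.065 m.

```python
import math

def convergence_angle(e0, ipd=0.065):
    """Convergence angle gamma (radians) for a point straight ahead.

    Assumes the point lies on the median line at distance e0 (metres) from
    the midpoint between the eyes; ipd is a hypothetical interpupil distance.
    """
    return 2.0 * math.atan(ipd / (2.0 * e0))

for e0 in (0.5, 1.5, 5.0, 16.1, 100.0):
    gamma = convergence_angle(e0)
    print(f"e0={e0:6.1f} m  gamma={gamma:.5f} rad  ({math.degrees(gamma):.3f} deg)")
```

Under these assumptions γ falls roughly in inverse proportion to e0, so that beyond a few meters further increases in distance change γ very little.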
FINDINGS FROM LABORATORY EXPERIMENTS
The following findings are relevant to the subsequent discussions. For other findings in my laboratory and in experiments conducted by many other investigators, consult Indow (1991).
EF1. Values of K (<0) remain almost invariant in each subject for the following change of context: e01=150~417 cm, the interval between QL1 and QR1=12~76 cm, θ=0~90°, and Q=a small light point in the dark and a small black object in an illuminated frameless space. Individual differences of K are relatively small, in the range from −0.3 to −0.4. What changes with the context more systematically is the value of σ. It tends to be larger in an illuminated space than in a dark space.
EF2. The alley experiment in a dark gymnasium gave clearly different values of K and σ: |K| was about 2 times and σ was 2.1 to 2.5 times larger than those in the laboratory room (Fig. 5 in Indow, 1991).
EF3. Presenting the total configuration of stimulus points [Qj] for P- and D-alleys and H-lines constructed by a subject, we can ask the subject to make paired comparison judgments on perceptual distances δjk between various pairs of points. Always, ratios between two perceptual distances, from a common point Qi to Qj and Qk, δij/δik, were assessed, and from these raw data we can obtain the matrix of scaled distances D=(djk), j, k=1 to n, including the self as a point. Then, through an MDS program in which K is involved as a free parameter, we can construct such a configuration [Pj] in EM that satisfies two criteria: (A) the degree of coincidence between the pattern of [Pj] and the theoretical equations for alleys and H-lines in EM, and (B) the degree of coincidence between data djk and the theoretical distances δjk that are obtained through (2) from interpoint distances ρjk of [Pj]. The judgments on ratios between δ's were shown to be very consistent and the optimized K turned out to be negative in the laboratory experiments (Indow, 1982, 1991). It was not necessary to assume mapping functions in advance. The Luneburg mapping functions (4) were not well supported, especially the exponential form between ρ0 and γ, by the a posteriori mapping relationship that we obtained by comparing [Pj] constructed in EM with [Qj] in X.
EF4. The parallel lines we see in our daily life are not extended in the form of a P-alley. In most cases, they are running horizontally or vertically on a frontoparallel H-plane. Experiments to construct horizontal P- and D-alleys on an H-plane and also to ask judgments on ratios between δ's have been performed. The results in these two approaches (one using and the other not using Luneburg's mapping functions) were unequivocal. The geometry of the perceptually frontoparallel plane is Euclidean, K=0 (Indow, 1979, 1982, 1988; Indow & Watanabe, 1984b, 1988). This result is in agreement with the observation in daily life that, on a frontoparallel plane H and between two H's, we can have similarity transformation to keep the shape invariant while the size is altered (RS3 in the first section). On a slanted plane where K<0 (including θ=0), we can imagine continuous translocation of a figure keeping its perceptual shape and size invariant, provided that the size and shape of the object in X are appropriately adjusted. However, such a translocation is not imaginable from a frontoparallel plane to a slanted plane. In other words, there is no logical ground to postulate that these two subspaces VS2 must have the same value of K.
BOUNDARY OF VS AND THE ROLE OF K
We see a number of curved surfaces in VS and also in a picture. A vase or a torso appears to have different curvature from point to point and the same is true in its photo (e.g., Koenderink et al., 1992). On the other hand, we cannot see the curvature of the VS itself. When we say that a horizontal or slanted plane in VS (Fig. 1b, ϑ=0 or ϑ>0) has K<0, it does not mean that we cannot see straight lines in this plane. It only means that the behavior of perceptual straight lines and angles therein obeys hyperbolic geometry; for instance, the frontoparallel geodesic representing a constant perceptual length (thick arc) reduces its size in EM when it moves toward BC(ϑ). When K becomes closer to 0, BC(ϑ) goes to infinity and the total VS is represented in a small area around O in EM. Then it will be intuitively clear that both D-curves and P-geodesics become the same parallel straight lines.
FIG. 2. Relationships between three radial distances, e0 and binocular convergence γ in physical space X, ρ0 in Euclidean map EM, and δ0 in visual space VS: (a) e0 → ρ0, (b) ρ0 → δ0, (c) e0 and γ → δ0, (d) e0 → γ.
In this section, the discussion will be limited to perceptual radial distance δ0 in a given direction φ on a slanted plane VS2(ϑ). Assume that δ0, changing from 0 to max δ0 in the given direction, behaves as a geodesic of R1 with constant K. Max δ0 is the distance to the boundary of the VS in this direction. Define max ρ0, representing max δ0, to be 2; then BC(ϑ)=2/√(−K)=q^(-1), as shown with regard to the direction of η in Fig. 1b. The perceptual counterpart of the boundary, max δ0, may not be a visible entity by itself. Sometimes it is more convenient to use qη instead of η (the bottom of Fig. 1b); then BC(ϑ)=1 and max δ0 is represented by q max ρ0=√(−K).
Fig. 2a shows three curves, A, B, and C, to relate the radial distance ρ0 in EM to its representation e0 in X in a given direction (θ, ϕ). This functional form g can be different according to the direction and the context. Fig. 2b shows the relationship between perceptual distance δ0 and ρ0. This curve is fixed for a given value of K and a fixed unit for δ0. How δ0 changes according to e0 or the binocular convergence γ is given in Fig. 2c, which depends upon the function g in Fig. 2a. Fig. 2d gives γ as a function of e0, which depends upon the interpupil distance only. The first equation of Luneburg's mapping functions (4) gives ρ0 as a function of γ. It will be clear from Fig. 2d that to use γ becomes meaningless for e0 larger than a certain value. According to the Luneburg equation, max ρ0=2 is defined by the distance at which γ=0. However, defining max ρ0 and hence max δ0 in this way is not practical. The real boundary of VS may correspond to max e0, in which γ>0, if calculated. In the next paragraph, a different definition of max δ0 is discussed.
FIG. 3. R=max δ0/δ01 under various conditions, in which δ01 is based upon Luneburg's mapping function and max δ0 corresponds to max ρ0=2.
Outdoors, the perceptual distance to the sky (ϑ>0) and the horizon (ϑ=0), when these are visible, may correspond to max δ0, but indoors it is not clear what delimits VS. Let us consider the situation in which there is a stimulus Q in X that appears in VS as the farthest percept in that direction. As in Fig. 1b denote by Q1 this stimulus and by P1 its representation in EM. If we know the values of K and ρ01, in so far as max ρ0 under the given condition is defined to be 2, the ratio

$$R = \frac{\max \delta_{0}}{\delta_{01}} = \frac{\tanh^{-1}\left(\sqrt{-K}\right)}{\tanh^{-1}\left(\frac{\sqrt{-K}}{2}\,\rho_{01}\right)} \tag{5}$$
can be determined and we know how far the boundary of VS is beyond the farthest percept. The boundary may not be perceptible. Fig. 3 shows the ratio R under various conditions of the alley experiment, in which theoretical curves using Luneburg's mapping function ρ0=g(γ; σ) described the data well and ρ01 was given through γ. Geo M is the geometrical mean of individual values of R. Conditions 1 to 3 were described before. The background (dimly illuminated checkerboard) was presented behind the fixed point pair [QL1, QR1] at three different distances in 4 and 6. In Condition 4, positions of [QLi, QRi] were adjusted as described in the first section. In 6, as in Indow & Watanabe (1984a), only one pair [QL1, QR1] was presented that appeared to move toward the subject (apparent movement). The subject adjusted their trajectories so that the movement was straight and parallel (P-alley) or with the constant width (D-alley). In Condition 5, three moving patterns were projected on the screen behind [QLi, QRi]. Though the configuration of points does not necessarily appear as being embedded in the scene, the background pattern by itself gave the impression of a dynamic perspective.
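The ratio in (5) is easy to evaluate once K and ρ01 are given. The following is a minimal sketch with max ρ0 defined as 2 as in the text; the function name and the ρ01 values in the loop are hypothetical and are not data from Fig. 3.

```python
import math

def boundary_ratio(rho01, K):
    """Eq. (5): R = atanh(sqrt(-K)) / atanh(sqrt(-K) * rho01 / 2).

    Assumes max rho0 = 2, hence -1 < K < 0 and 0 < rho01 <= 2.
    """
    if not -1.0 < K < 0.0:
        raise ValueError("K must satisfy -1 < K < 0 in this unit")
    if not 0.0 < rho01 <= 2.0:
        raise ValueError("rho01 must satisfy 0 < rho01 <= 2")
    s = math.sqrt(-K)
    return math.atanh(s) / math.atanh(s * rho01 / 2.0)

K = -0.38  # example value quoted earlier in the article
for rho01 in (0.5, 1.0, 1.5, 2.0):  # hypothetical positions of P1 in EM
    print(f"rho01={rho01:.1f}  R={boundary_ratio(rho01, K):.3f}")
```

R equals 1 when the farthest percept sits on the boundary itself (ρ01=2) and grows as the farthest percept is represented nearer to O.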
Compared with other conditions, R is smaller in Conditions 1 and 6. In Condition 1, nothing was visible beyond the fixed light points [QL1, QR1]. In Condition 6, there were no Q1's remaining at e01. In the illuminated space Condition 2, where the subject saw the configuration of objects in front of a large white curtain, R is large. Though the subject did not have a clear impression about the distance between the farthest pair of percepts and the curtain, the boundary of VS must be at or beyond the perceived background. When stationary light points were presented in front of a background, 4 and 5, the condition of the background had only a minor, if any, effect on R. However, R's were larger in these conditions than in the cases without background, 1 and 3. In the large light point configurations without background, R of Condition 3 was in between R of Condition 1 and the R's of Conditions 4 and 5.
Throughout these results, we can say that the size of VS, max δ0, changes in accordance with the perceptual distance δ01 to the farthest object Q1. The distance δ01 is determined by the position of Q1 in X and also by other conditions in X. Outdoors, δ01 is much larger than δ01 in Condition 3. However, it is still true that there is max δ0 of a finite length beyond δ01 (VS1 in the first section), and the two are related as discussed above, provided that K remains constant in the given direction. As pointed out in VS3, VS is dynamic. When the retina is stimulated by dim light reflected from a large homogeneous wall completely covering the visual field, the VS is not structured and the subject sees the fog of light extending from right in front of self (Metzger, 1930). In order for the VS to be segmented into the wall and self with a terra incognita in between, some heterogeneity in the stimulation is necessary. Furthermore, how far the wall appears from the self depends upon this heterogeneity. Under ordinary conditions, the retinal stimulation consists of images of objects in X at various distances from the body; then we see a stable 3D pattern of percepts in front of self and VS organizes itself in accordance with this pattern.
OUTDOOR VS
The sky, in daytime as well as at night, appears as a vault. This fact, however, does not tell anything about the curvature of VS as R3. A curved surface having positive or negative curvature can exist in a 3D Euclidean space, E3. An experiment was carried out at a beach in which a configuration of 11 stars clearly distinguishable from others was used as [Qj]. As described in EF3, through ratio judgments δij/δik where i, j, k refer to stars or the self, the matrix D=(djk), 12×12, was obtained with two subjects. The subject was allowed to move the head. Otherwise, it was impossible to see the entire configuration of stars. Scaled results were highly consistent, e.g., djk≈dkj. At first, the configuration [Pj] was constructed in E3 by a metric MDS program (Indow, 1968). The interstar distance, δjk, was defined by the length of the chord, not that of the arc along the perceived vault. This is the condition to meet the logic of MDS and also it was easier for the subjects to see the distance in this way. Later, the data were processed by a more flexible MDS program that allows one to construct [Pj] in EM3 by optimizing the value of K, either positive or negative, and allows more flexibility in the relationship between data djk and interpoint distance d̂jk in [Pj]. As shown in Fig. 10 in Indow (1991), the representations according to three geometries, Euclidean, elliptic, and hyperbolic, gave [Pj]'s in EM that were indistinguishable from each other and also satisfied the criterion B in EF3 equally well (Fig. 9 in that article). Namely, the d̂jk's in each [Pj], with K=0, K<0, or K>0, turned out to be proportional to data djk with the same level of very small scatters. The other criterion A was not applicable in this case. The perceived sky can be represented by a curved surface in an R3 of constant K. In order to differentiate geometries, however, we need such a [Qi] that has stimulus points in between stars and self to form a 3D configuration in a stronger sense. The indistinguishability in EM does not imply that each [Pj] represents the same configuration of stars embedded in the same form of vault. Only [Pj] in E3 (EM of K=0) directly tells us the shape of sky the subject is perceiving. In the other two cases, [Pj]'s in EM give more or less distorted pictures of the perceptual sky (RS3). Because the sky is an extended form of H-plane, in which K was shown to be 0 (EF4), it is not implausible that [Pj] in E3 truly represents the night sky in which the stars were embedded under this observation condition. Putting aside the problem of which geometry is most appropriate to describing this VS, it was common in all three representations that the shape of [Pj] in EM clearly deviates from a semicircle. Namely, max ρ0 changes its length according to direction and the isotropy condition in MP2 does not hold herein. The perceived sky extended most in the direction of η, next in the direction of ζ, and least in the direction of ξ in Fig. 1, which may be ascribed to the following circumstances. The subject faced toward the dark ocean and nothing was visible in the direction of ξ. The silhouette of terrain and lights of town were visible in the right and left peripheries (η). In the zenith direction (ζ) there were stars. Again, the result shows that how VS extends critically depends upon what we see in respective directions (VS3).
The sky in daytime is generally regarded as a bowl flattened in the zenith direction. In this case, no object is visible in this direction and we perceive objects as being at various distances on the ground. The attempt by meteorologists and other scientists to determine the shape of the sky has a long history in Europe. The study is based on the following assumption. The perceived sky in the direction of $\xi$ in Fig. 1a is represented by an arc of a circle of radius r that is vertically shifted by the amount $\Delta$ (Fig. 4a), where Z and H respectively correspond to the zenith and the horizon. The ground line intersects the radius from the center of the circle to Z at the point O that corresponds to the self. Namely, OZ and OH represent the largest perceptual distances in the respective directions, $\max \delta_0(Z)$ and $\max \delta_0(H)$. Let us call this form of sky a shifted circle. The subject is asked to bisect the arc between Z and H. Under this hypothesis, with the ratio $R(HZ) = \max \delta_0(H) / \max \delta_0(Z) = \cot \omega$, the angle $\omega$ is related to the bisecting angle $\alpha$ in the following way:
$$\tan \alpha = \frac{\cos \omega - \cos 2\omega}{\sin \omega}. \qquad (6)$$
If there is no shift and the sky is a semicircle, $R(HZ) = 1$ and $\alpha = 45^\circ$, of course. According to Filehne (1912), this method was used by Smith in 1728 ($\alpha = 23^\circ$) and followed by Reimann in 1890 ($\alpha = 22.33^\circ$), which implies that $R(HZ) \approx 3.5$. This ratio should depend upon the condition of the sky. According to Table 1 in Neuberger (1951) and Table 1 in Miller & Neuberger (1945), for the daytime sky $R(HZ) \approx 2.3$ when cloudy and $R(HZ) \approx 2.1$ when clear, and for a clear moonlit night sky $R(HZ) \approx 2.2$. It was not stated what was visible in the direction of H. Eq. (6) is based on Euclidean geometry. No one paid attention to the angle $\beta$ in Fig. 4a under this assumption. For $\alpha = 22^\circ$ ($R(HZ) = 3.5$) and $31^\circ$ ($R(HZ) = 2.3$), $\beta = 32^\circ$ and $47^\circ$. According to my observation, these are too acute for the angle at which the sky appears to meet the ground or ocean at the horizon.
FIG. 4. Bisection of perceived sky: Shifted circle in VS as $E^3$ (a), and shifted circle in EM for VS3 as $R^3$ of $K < 0$ (b and c).
Fig. 4b depicts a shifted circle in EM with $q = \sqrt{-K}/2$ for the sky in VS, where $\max q\rho_0$ changes its size as a function of elevation as if the center is shifted on the $\zeta$-axis in Fig. 1. Now it is convenient to define the unit in EM so that BC = 1 (Fig. 1b). In these coordinates with q, $q\rho_0(H) = \sqrt{-K} < 1.0$. It was assumed that the bisecting point M was determined in terms of a chord (Fig. 4c). In the experiment with stars described before, the subject found it difficult to perceive distances between stars as arcs along the vault. Hence, it is more likely that the subject bisects the sky so that $\delta(ZM) \sim \delta(MH)$. Then, $\rho(ZM) \neq \rho(MH)$ in II. If K, R(HZ) with a value of $q\rho_0(Z)$, and the angle $\beta$ are given, under the condition that $\delta(ZM) = \delta(MH)$,
$$\frac{\rho(MH)}{\rho(ZM)} = \sqrt{\frac{1 - (q\rho_0(H))^2}{1 - (q\rho_0(Z))^2}} \qquad (7)$$
from (2) and $\alpha$ is determined. This value of $\alpha$ has to reproduce the given R(HZ) through
$$R(HZ) = \frac{\max \delta_0(H)}{\max \delta_0(Z)} = \frac{\tanh^{-1} q\rho_0(H)}{\tanh^{-1} q\rho_0(Z)} \qquad (8)$$
When $K = -0.88$, $q \max \rho_0(Z) = 0.6$, and $\beta = 65^\circ$, the following values in EM satisfy (7) and (8): $\alpha = 21^\circ$, $R(HZ) = 2.5$. Because VS and EM are conformal (EM4), we can say that the perceived sky having $R(HZ) = 2.5$ meets the horizon at an angle of $65^\circ$, a more reasonable prediction than that from (6). The perceived sky is not shifted in this representation. It is the locus of $\max \delta_0$ that varies according to (3) with $\max q\rho_0$ as a function of elevation in EM, in which the ratio between $\max \delta_0$ at the zenith (elevation $\pi/2$) and $\max \delta_0$ at the horizon (elevation 0) is 1 : 2.5. Because $K \neq 0$, it is not possible to draw this sky on a sheet of paper without distorting either length or angle (RS3 in the first section). Fig. 4 gives a schematic illustration of the sky under discussion. It would not be fruitful to pursue this argument any further because there is no a priori guarantee for either hypothesis to hold: the sky as a shifted circle in Euclidean VS (Fig. 4a) or the representation by a shifted circle in EM for the sky in VS with $K < 0$ (Fig. 4b). The sky bisecting experiments were referred to here only for the purpose of showing that the assumption of VS being Euclidean has been taken for granted, explicitly and implicitly, in various contexts and that calculation based on a different geometry leads to different predictions.
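The numerical values quoted for the two sky models can be checked with a short sketch. It evaluates Eq. (6) for a given R(HZ) and Eq. (8) for a given K and zenith value. The function names are hypothetical, and the relation beta = 2*omega is an assumption derived here from the shifted-circle geometry (the tangent at H is perpendicular to the radius to H); it is not stated in the text, although it reproduces the reported 32 and 47 degrees.

```python
import math

def shifted_circle_angles(ratio_hz):
    """Eq. (6): bisecting angle alpha and horizon angle beta, in degrees.

    ratio_hz is R(HZ) = cot(omega). beta = 2 * omega is an assumption
    derived from the circle geometry, not a formula given in the text.
    """
    omega = math.atan(1.0 / ratio_hz)
    alpha = math.atan((math.cos(omega) - math.cos(2 * omega)) / math.sin(omega))
    beta = 2 * omega
    return math.degrees(alpha), math.degrees(beta)

def hyperbolic_ratio(curvature, q_rho_zenith):
    """Eq. (8), using q * rho_0(H) = sqrt(-K) (unit chosen so that BC = 1)."""
    q_rho_horizon = math.sqrt(-curvature)
    return math.atanh(q_rho_horizon) / math.atanh(q_rho_zenith)

if __name__ == "__main__":
    for ratio in (1.0, 2.3, 3.5):
        alpha, beta = shifted_circle_angles(ratio)
        print(f"R(HZ) = {ratio}: alpha = {alpha:.1f} deg, beta = {beta:.1f} deg")
    ratio = hyperbolic_ratio(-0.88, 0.6)
    print(f"K = -0.88, q*rho_0(Z) = 0.6: R(HZ) = {ratio:.2f}")
```

The sketch does not solve Eq. (7) for the bisecting angle, because that step needs Eq. (2) and the chord construction of Fig. 4c, which are not reproduced here.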
For outdoor VS, the information on $\max \delta_0$ and its relation to the physical condition in X is important, because all perceptual phenomena take place within this boundary. As shown in Fig. 2d, to take binocular convergence $\gamma$ as the determining variable is meaningless when $\{Q_i\}$ extends over a large area. It was unfortunate that Battro et al. (1976) used Luneburg's mapping function (4) in their P- and D-alley experiments performed in a large garden and in a polo field in daytime. Of the largest $\{Q_{Li}, Q_{Ri}\}$, $Q_{L1}$ and $Q_{R1}$ were at x = 240 and y = $\pm$48 m. Using $\rho_0 = g(\gamma; \sigma)$, they reported that $K \geq 0$ in 38 cases and $K < 0$ in 52 cases. The use of $\gamma$ for these stimulus configurations made it difficult to evaluate the fits of theoretical curves. However, it is clear that there were subjects giving the P-alley inside of the D-alley as well as subjects giving the reversed relationship. This is not surprising. Under this condition, trees, terrain, etc., must have been visible beyond the field, and $\max e_0$ is much larger than $e_{01} = 245$ m. Then, $\{Q_{Li}, Q_{Ri}\}$ is represented in a small portion of the dotted circle ($\max \rho_0$) in Fig. 1b, and hence the discrepancy between P- and D-alleys cannot be large.
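To see why convergence carries almost no information at these distances, the following sketch computes the angle subtended at a stimulus point by the two eyes. It assumes the eyes sit at (0, +/-a) on the lateral axis with a hypothetical interpupillary distance of 65 mm, and it treats this subtended angle as the convergence variable; neither the function name nor the 65 mm value comes from the text.

```python
import math

def convergence_angle_deg(x, y, half_ipd=0.0325):
    """Angle (degrees) subtended at point (x, y) by eyes at (0, +/-half_ipd).

    x is depth and y is lateral offset, both in metres. The default
    half_ipd (65 mm interpupillary distance) is an illustrative assumption.
    """
    to_left_eye = math.atan2(y + half_ipd, x)
    to_right_eye = math.atan2(y - half_ipd, x)
    return math.degrees(to_left_eye - to_right_eye)

if __name__ == "__main__":
    # Radial distance of the farthest stimulus pair at x = 240 m, y = 48 m.
    print(f"e_01 = {math.hypot(240, 48):.1f} m")
    for x, y in ((1, 0), (10, 0), (240, 48)):
        arcsec = convergence_angle_deg(x, y) * 3600
        print(f"x = {x} m, y = {y} m: gamma = {arcsec:.1f} arcsec")
```

Under these assumptions the angle at the farthest pair is under one arcminute, compared with several degrees at 1 m, so differences in convergence across such a field are very small.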
A number of outdoor experiments on $\delta_0 = G(e_0; g)$ in a direction g (Fig. 2c) have been made, in which equisection, fractionation, magnitude estimation, etc., were used to obtain a scaled value $d_0$ of $\delta_0$. Often a power function $d_0 \propto G(e_0) = e_0^{\beta}$ was fitted in the range $0 < e_0 < e_{01}$, and $\beta$ was slightly smaller than 1.0 on average for the direction g ($\theta = \phi = 0$): 0.95 in Cook (1978), 0.86 in Da Silva & Da Silva (1983), 0.85 to 0.99 according to $e_{01}$ in Teghtsoonian & Teghtsoonian (1970), etc. In some indoor settings, $\beta$ was slightly larger than 1.0 (Künnapas, 1960; Teghtsoonian & Teghtsoonian, 1969; etc.). In all these studies, large individual differences were found in $\beta$. In other words, there must have been two groups of subjects having different forms of $G(e_0)$, one convex and the other concave upward, as the B and C curves in Fig. 2c. Galanter & Galanter (1973) used as Q an aircraft with the sky as the background or a small boat with water as the background. Q always passed perpendicular to the line of regard at various distances $e_0$. The distances $e_0$ were determined with a radar and covered a wide range from a few hundred yards to more than 5 miles. The exponent $\beta$ systematically changed according to $\theta$ in Fig. 1b ($\phi = 0$): 1.25 to 1.27 when $\theta$ is close to $0^\circ$ (H-direction), 1.0 for $12^\circ$, and 0.80 for $90^\circ$ (Z-direction). Baird & Wagner (1982) performed an experiment at night on the Dartmouth College campus in which Q was a building at a distance $e_0$. The scaled perceptual distance $d_0$ was a power function of $e_0$ with the exponent $\beta = 1.17$, and $d_0$ to the sky right above the building was a power function of $d_0$ to that building with $\beta = 0.46$.
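As an illustration of how an exponent of this kind can be estimated, the sketch below fits a power function by ordinary least squares on log-transformed values. This is one common procedure and not necessarily the one used in the studies cited; the function name and the data are hypothetical, with the data generated from a known exponent rather than taken from any experiment.

```python
import math

def fit_power_exponent(distances, judgments):
    """Least-squares fit of log(d0) = log(scale) + beta * log(e0).

    Returns (beta, scale) for the power function d0 = scale * e0 ** beta.
    """
    xs = [math.log(e) for e in distances]
    ys = [math.log(d) for d in judgments]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    scale = math.exp(mean_y - beta * mean_x)
    return beta, scale

if __name__ == "__main__":
    # Hypothetical noise-free data generated with beta = 0.9 and scale = 2.0.
    e0 = [5.0, 10.0, 20.0, 40.0, 80.0]
    d0 = [2.0 * e ** 0.9 for e in e0]
    beta, scale = fit_power_exponent(e0, d0)
    print(f"beta = {beta:.3f}, scale = {scale:.3f}")
```

With real judgments the fit would be run per subject, since the text notes large individual differences in the exponent.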
These studies are not related to the geometry of the VS, and the dependency of the form of $G(e_0)$ on individual, direction, and context may be ascribed to differences in the mapping function and/or the process of making judgments on $\delta_0$. If we scale perceptual distances $\delta_0$ to have data $d_0$, it is crucial to test consistency in $d_0$, as discussed before in the experiment with stars. Provided $d_0 \propto \delta_0$, the exponent $\beta$ in $G(e_0)$ being larger than 1.0 means that $\rho_0 = g(e_0, g)$ in Fig. 2a is not so convex upward as A and B in the range $0 < e_0 < e_{01}$. The boundary of VS in the direction g may be behind the farthest percept ($\max e_0 > e_{01}$) or at $e_{01}$ ($\max e_0 = e_{01}$). Outdoors, we often see the sky filling the gap between some farthest percepts, e.g., between buildings. The buildings and the sky appear at the same radial distance. In this case, the sky may or may not correspond to the boundary of VS. When we do not see any percept appearing at the same distance as the sky, the distance to the sky is $\max \delta_0$. Then, it is an interesting question to ask what part of the physical space X corresponds to $\max e_0$. This problem will be discussed elsewhere in connection with the appearance of the horizon.
We have to take into account the geometrical structure of VS when the appearance of 3D figures at various e0 in X becomes the matter of concern. To the best of my knowledge, linear perspective, the set of rules artists use to accurately create 2D projections of the outline forms of 3D patterns in X, is completely based on Euclidean geometry (e.g., Kemp, 1990; Sedgwick, 1986). Yokochi (1995) analyzed many 18th century paintings in Japan (Kokan, Hiroshige, Okyo, etc.) and China (Ching Dynasty) and showed that the linear perspective was only partially followed. It is possible to think of linear perspectives based on other geometries (Finch, 1977). Realistic painting is essentially a similarity transformation on the H-plane of the original scene in VS3. Hence, it would be an interesting project to compare various linear perspectives based on different geometries as to which gives the most natural impression of a large scale scene, especially from the viewpoint of the geometry of the VS.
ACKNOWLEDGMENTS
In experiments performed when the author was a professor of Keio University in Tokyo, a number of former students participated as subjects. Collaborators of the unpublished experiments 5 and 6 in the fifth section, which were carried out here at UCI, are Kevin Wright and Toshio Watanabe.
REFERENCES
Baird, J. C., & Wagner, M. (1982). The moon illusion. I. How high is the sky? Journal of Experimental Psychology: General, 111, 296–303.
Battro, A. M., di Pierro Netto, S., & Rozestraten, R. J. A. (1976). Riemannian geometries of variable curvature in visual space: Visual alleys, horopter, and triangles in big open fields. Perception, 5, 9–23.
Blank, A. A. (1958). Axiomatics of binocular vision. The foundation of metric geometry in relation to space perception. Journal of Optical Society of America, 48, 328–334.
Blank, A. A. (1959). The Luneburg theory of binocular space perception. In S. Koch (Ed.), Psychology: A study of a science (pp. 395–426). New York: McGraw Hill.
Busemann, H. (1942). Metric methods in Finsler spaces and in the foundation of geometry. Princeton, NJ: Princeton University Press.
Busemann, H. (1955). The geometry of geodesics. New York: Academic Press.
Cook, M. (1978). The judgment of distance on a plane surface. Perception & Psychophysics, 23, 85–90.
Da Silva, J. A., & Da Silva, C. B. (1983). Scaling apparent distance in a large open field: Some new data. Perception & Motor Skill, 56, 135–138.
Filehne, W. (1912). Die mathematische Ableitung der Form des scheinbaren Himmelsgewölbes. Archiv für Physiologie, 5, 1–32.
Finch, D. (1977). Hyperbolic geometry as an alternative to perspective for constructing drawings of visual space. Perception, 6, 221–225.
Freudenthal, H. (1965). Lie groups in the foundations of geometry. Advances in Mathematics, 1, 145–190.
Indow, T. (1968). Multidimensional mapping of visual space with real and simulated stars. Perception & Psychophysics, 3, 45–64.
Indow, T. (1979). Alleys in visual space. Journal of Mathematical Psychology, 19, 221–258.
Indow, T. (1982). An approach to geometry of visual space with no a priori mapping functions: Multidimensional mapping according to Riemannian metrics. Journal of Mathematical Psychology, 26, 204–236.
Indow, T. (1988). Alleys on apparent frontoparallel plane. Journal of Mathematical Psychology, 32, 259–284.
Indow, T. (1991). A critical review of Luneburg's model with regard to global structure of visual space. Psychological Review, 98, 430–453.
Indow, T. (1995). Psychophysical scaling: scientific and practical application. In R. D. Luce, M. D'Zmura, D. Hoffman, G. J. Iverson, & A. K. Romney (Eds.), Geometric representations of perceptual phenomena: Papers in honor of Tarow Indow on his 70th birthday (pp. 1–28). Mahwah, NJ: Lawrence Erlbaum.
Indow, T., & Watanabe, T. (1984a). Parallel- and distance-alleys with moving points in the horizontal plane. Perception & Psychophysics, 35, 144–154.
Indow, T., & Watanabe, T. (1984b). Parallel- and distance-alleys on horopter plane in the dark. Perception, 13, 165–182.
Indow, T., & Watanabe, T. (1988). Alleys on an extensive apparent frontoparallel plane: A second experiment. Perception, 17, 647–666.
Kemp, M. (1990). The science of art. New Haven: Yale University Press.
Koenderink, J. J., van Doorn, A. J., & Kappers, A. M. L. (1992). Surface perception in pictures. Perception & Psychophysics, 52, 487–496.
Kohler, W. (1929). Ein altes Scheinproblem. Naturwissenschaften, 17, 395–401.
Künnapas, T. M. (1960). Scales for subjective distance. Scandinavian Journal of Psychology, 1, 187–192.
Luneburg, R. K. (1947). Mathematical analysis of binocular vision. Princeton, NJ: Princeton University Press.
Luneburg, R. K. (1950). The metric of binocular visual space. Journal of Optical Society of America, 50, 637–642.
Metzger, W. (1930). Optische Untersuchungen am Ganzfeld. II. Zur Phänomenologie des homogenen Ganzfeld. Psychologische Forschung, 13, 6–29.
Miller, A., & Neuberger, H. (1945). Investigations into the apparent shape of the sky. Bulletin American Meteorological Society, 26, 212–216.
Neuberger, H. (1951). General meteorological optics. In T. H. Malone (Ed.), Compendium of Meteorology (pp. 61–78). Boston: American Meteorological Society.
Sedgwick, H. A. (1980). The geometry of spatial layout in pictorial representation. In M. A. Hagen (Ed.), The perception of pictures (Vol. 1), Alberti's window (pp. 33–90). New York: Academic Press.
Sedgwick, H. A. (1986). Space perception. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of perception and human performance (Vol. 1), Sensory processes and perception (Chap. 21). New York: Wiley.
Suppes, P., Krantz, D. M., Luce, R. D., & Tversky, A. (1989). Foundations of measurement (Vol. 2), Geometrical, threshold and probabilistic representations. New York: Academic Press.
Teghtsoonian, M., & Teghtsoonian, R. L. (1969). Scaling apparent distance in natural indoor settings. Psychonomic Science, 16, 281–283.
Teghtsoonian, R. L., & Teghtsoonian, M. (1970). Scaling apparent distance in a natural outdoor setting. Psychonomic Science, 21, 215–216.
Wang, H. C. (1951). Two theorems on metric spaces. Pacific Journal of Mathematics, 1, 473–480.
Wang, H. C. (1951). Two-point homogeneous spaces. Annals of Mathematics, 55, 177–191.
Yokochi, K. (1995). Perspective in ukiyoe. Tokyo: Sanseido [in Japanese].
Received: November 6, 1996