Relation of tracks played and recordings at the environments.
Audio representation is critical for immersive virtual environments. This article presents a quasi-experiment based on architecture students evaluating the immersive impact of 3D audio in the representation of urban environments. In the framework of acoustic urban heritage preservation, a set of city squares with varying acoustic features were used as case studies in a two-step process: an objective analysis of the acoustic properties of these spaces; and the users’ subjective perceptions of the virtual environment of the squares. The study shows that we can gain a better understanding of the objective parameters through the subjective views of users. Acoustic heritage can be assessed subjectively using an immersive system such as virtual reality, in which audio representation is a key factor.
- virtual reality
- urban planner
- sound design
The current chapter is based on the answers to two basic questions arising in evaluations of digital cultural resources, and more specifically, sonic cultural heritage: “what” and “how.” Firstly, what are we evaluating when we refer to digital cultural resources? Is it a tangible issue or an intangible perception? Does it consist of a series of personal impressions or can we establish an objective parameter? The United Nations’ Millennium Ecosystem Assessment was carried out from 2003 to 2005, and the subcategory of cultural heritage in Cultural Ecosystem Services (CES) was introduced over a decade later. Yet there is a lack of consensus about what cultural heritage refers to within the Ecosystem Services (ES) context. Secondly, how should cultural heritage be evaluated: through abstract concepts or immersive experience? It seems practically impossible to imagine an evaluation of cultural resources that is only based on abstract concepts. However, the evaluation also requires a common basis to enable comparisons of the results. This second answer supports the immersive experience as a powerful method for cultural resources evaluation.
An explanation is required to understand the methodology used in this paper. In general, people know very little about decibels of sound, and much less about sound roughness, musical clarity or speech intelligibility. Only a small group of scientists understand the operation of acoustic science. Therefore, when urban acoustic heritage is evaluated, why are ordinary people forced to refer to numbers and graphs? Surely the evaluation would be more reasonable if it were made using real sound samples? In this context, Virtual Reality (VR) provides an easy, interactive framework for ordinary people to evaluate urban acoustic heritage.
The interest in conservation of tangible and intangible cultural heritage has been rising notably in recent years. Apart from its own value, cultural heritage fosters economic and social growth. The Heritage Research National Plan, drawn up by the Spanish Cultural Heritage Institute, highlights the importance of cultural heritage as a local development engine and a stimulus for tourism, and its relevance as a generator of culture and knowledge. However, the Plan also stresses the complexity of research in this field, due to a range of characteristics and problems, and because of the high number of factors involved that make it necessary to apply human and experimental sciences in interdisciplinary teams.
1.1. Evaluation of acoustic quality in outdoor spaces
The evaluation of the acoustic quality of a space is fundamental to determine possible interventions in it and the suitability of its future uses. Several studies have established optimal indexes and ranges for the various measurable parameters [2, 3]. Nevertheless, as in current regulations, the focus has been concert halls, which have different features and requirements from outdoor spaces.
Very few objective and subjective tests have been undertaken in these kinds of spaces, due to the difficulty in installing measuring instruments and the variable conditions of the environments. In this study, four outdoor spaces were tested and a great effort was made to find the best environmental conditions. Preliminary work was done in the studied environments.
The application of new technologies in cultural heritage is a practice that has become increasingly widespread. The construction of virtual models allows us to reproduce environments for their study, avoiding direct intervention in these spaces and encouraging their conservation. After some data collection in the actual place, a model is designed and calibrated in which the environment can be recreated as many times as desired, without the need to travel there. This methodology could overcome the major difficulty that an in-place test might present.
Some authors have attempted to investigate urban sound propagation. They have centered on the complexity of the medium: irregular faces, interconnection with adjacent canyons, and a large variety of materials and boundary conditions. Moreover, a predominant characteristic of the urban environment is that it is open to the sky, which induces large radiative losses [5, 6, 7]. Much of the literature is focused on propagation in a single urban canyon [8, 9, 10, 11, 12, 13]. A few authors have attempted to model wave propagation in parallel or intersecting streets [14, 15, 16, 17] or in larger urban areas, but often limited to 2D geometries. Others have used a coupled modal-finite elements method to address the problem, while others have introduced the frontier finite elements method.
1.2. Spatial audio in architectural representation
Spatial audio in virtual reality has received increasing attention in recent years, due to its impact on the immersive experience. Spatial audio is the representation of audio features of reality that intentionally exploit sound localization. It has many possible uses in the gaming industry, entertainment or military applications. Most of these uses rely on both acoustic and spatial information about the sound. However, although spatial information is addressed, architectural design representation does not currently pay much attention to spatial audio as a factor in spatial representation.
Many other factors that have been considered in architectural design representation are linked to visual features [19, 20]. Natural light modeling and rendering [21, 22], artificial light control [23, 24], texture cognition and representation [25, 26], color discernment [27, 28] or material visualisation [29, 30, 31, 32, 33, 34] are some of the countless details that an architect must manage when representing a building. However, although the effect of sound on spatial cognition is recognized, it has received little attention in architectural representation.
In 2003, Kang et al. highlighted the introduction of new EU noise policies and noted that noise-mapping software/techniques are being widely used in European cities. Nevertheless, they noted that these techniques can provide an overall picture for macro-scale urban areas, but that the micro-scale, for example an urban street or a square, is better studied with detailed acoustic simulation techniques. In addition, applications that predict and measure micro-scale environments are still not sufficiently user-friendly, and the computation time is rather long. Kang et al. presented two computer models based on the radiosity and image source methods in an attempt to offer urban designers an interface that could be useful in the design stage, using simple formulae that can estimate sound propagation in micro-scale urban areas.
This paper presents a set of criteria for implementing 3D audio in virtual urban environments. The study is based on the definition of a new virtual audio format, generated from the combination of object-based and ambisonic formats. This new audio format was explained in 2017. Using these criteria, we then describe the preparation of a set of experiments with architecture students. The results of the experiments confirm that the implementation of 3D audio enhances the immersive experience in the environments.
2. The case study environments
Four main performance environments located within the heart of the Ciutat Vella of Barcelona, the area surrounded by the former Roman Walls, were studied: Plaça Sant Felip Neri, the corner between Carrer del Bisbe and Carrer Santa Llúcia, Plaça Sant Iu, and Plaça del Rei.
2.1. Plaça Sant Felip Neri
This quiet and secluded public square, located at the end of Montjuïc del Bisbe street, is one of a set of closed squares in the Ciutat Vella of Barcelona. Its floor plan is an irregular pentagon with a central fountain. The 505 sqm square has a uniform stone floor and is enclosed by five façades, also made of stone. One of the five façades belongs to Sant Felip Neri church, while the others house a school, a hotel, some dwellings and the parish rooms. Three big, old trees, distributed asymmetrically in plan, cover the square with their foliage. Their trunks serve as irregular columns that support the green ceiling, enclosing the square and preventing people from seeing the open sky. No sound of traffic is heard, because the Plaça is far from main roads. However, the noise of shouting children fills the square every morning, when a group of them play during breaktime in their beautiful schoolyard: Plaça Sant Felip Neri. During the rest of the day, a few groups of tourists arrive and look at the pockmarked stones on the church façade, marks that remind us of the Spanish Civil War. At any time of day, a street musician may use the square to play the guitar or violin in the most distant corner or near the central fountain, providing a musical accompaniment for couples who are out walking, in a romantic scene.
2.2. Carrer del Bisbe-Carrer de Santa Llúcia
This little crossroads near the cathedral square seems an ordinary place. Nevertheless, a closer examination reveals that several factors come together in this single crossing. Geometrically, the floor plan forms a T pattern in which the crossing point coincides with the bishop’s palace door. This door, when opened, reveals an interior courtyard that extends Carrer de Santa Llúcia, leading into this peaceful enclosure. The façade of Santa Llúcia chapel in the same corner gives a monumental and ceremonial character to the place. On the opposite side, the entrance to the Casa de l’Ardiaca museum is extended by a ramp. During the day, some street vendors invade the corner and try to sell their products in front of Santa Llúcia chapel or at the beginning of the ramp. However, the bishop’s palace door is always fully clear, because of the presence of a guard when the door is open, or even on account of the large number of people who circulate through Carrer del Bisbe. Only at night, and particularly on Saturday nights, do people tend to fill the area in front of the closed door of the bishop’s palace, standing and looking in the opposite direction. There, an old man sings opera arias and recitatives over an amplified orchestral base. His voice invades the corner and goes beyond those limits, turning the old streets into an urban opera theatre.
2.3. Plaça de Sant Iu
Like a widening of the street, Plaça de Sant Iu is located in front of the eastern door of the cathedral. A gigantic gothic door crowned by an octagonal bell tower constitutes the west façade of this little square. On the opposite façade, a gallery formed by five stone arches closes the square. The north façade is formed by the classic-style entrance of Frederic Marès museum, whilst the south façade presents a flat wall without any door that serves as a perfect backdrop for street musicians. This is the preferred point for buskers in Plaça de Sant Iu, not only because of the presence of the big wall behind them, but also because of a long stone bench on the east façade that allows listeners to sit down. In this privileged environment, groups of one, two or three musicians sing or play their instruments. A single stone material covers the four façades of the square and the floor and gives it a uniform appearance.
2.4. Plaça del Rei
Some meters behind Plaça de Sant Iu is Plaça del Rei, a totally different environment both in size and proportions. There are no trees in this 745 sqm area of stone pavement, whilst its four façades enclose the square up to a height of 20 m. In the north corner of the square, a monumental staircase rises from the floor to the Museu de la Ciutat door. Street musicians enliven the atmosphere with their instruments every day, and crowds of tourists occupy the entire square looking at the royal shields on the walls, the pointed arches of the windows, or the tower of Santa Àgata chapel. The everyday life of the square is always very busy, and total silence only occurs when the square transforms into a concert hall for choir, orchestra or band performances. At these times, the players are usually situated on the corner stairs and the public occupy the rest of the square. When this happens, the sound of the musicians can be heard bouncing off the hard stone of the rear walls, creating a sense of spatiality that envelops the audience.
3. The methodology used for the in situ measurements
The four environments have some features of open-air places. However, bearing in mind studies on the evaluation of outdoor space acoustics, we analysed them using a closed concert hall acoustic method. This decision was taken after considering three factors. The first concerns the openness of the places: the four environments can be seen as boxes in which the floor and walls are made of stone and the ceiling is made of the most absorbent material possible, because no sound is reflected back from the open sky. The second concerns size: the smallest environment, Carrer de Santa Llúcia, holds an air volume of 1800 m³, which makes it similar to a typical hall for speech; the largest environment, Plaça del Rei, with a volume of 12,000 m³, does not exceed the volume of a big concert hall such as the Berlin Philharmonic Concert Hall. The third concerns the sound sources: in an open-air environment, they change position at every moment. This would be decisive if we were studying the soundscape of an everyday configuration, with running children, singers, street vendors or even police sirens. However, we recorded the places in a street concert configuration, in which there is one player at a fixed point and the listeners stand still and remain quiet.
The measurement method was based on controlled reproduction: an acoustic signal is first recorded in an anechoic chamber, the same signal is then played back and recorded in the environment, and the two recordings are compared. The anechoic chamber recordings were made with a calibrated reproduction system and a calibrated recording system. The reproduction system consisted of a directional speaker LD 90 W connected to a 230 V power supply. It was positioned in one of the corners of the anechoic chamber. The recording system consisted of RODE NT-55 pair-matched microphones connected to a ZOOM H6 handy recorder on a stand. This recording system was positioned in the middle of the chamber, 2.5 m from the speaker. Additionally, the anechoic measurements were recorded with a head and torso simulator (HATS) connected to a laptop (Figures 1 and 2).
The in situ measurements were performed with the same equipment as the anechoic chamber measurements, except the HATS system. The measurement distances varied in each environment, as shown in Figure 3.
To ensure uniform measurements, some tracks were repeated in the four environments, while others were recorded in specific places, considering the normal musical use of the environments. Tracks 1–13 were repeated in the four environments. However, tracks 17 (the final part of Stravinsky’s Firebird Suite) and 18 (a solo harp piece by Lucien) were reproduced only in Plaça de Sant Felip Neri. Track 16 (the initial measures of Puccini’s Nessun dorma) was reproduced in Carrer de Santa Llúcia due to the usual type of music played in this environment. Tracks 14 (initial measures of Victoria’s choral work O sacrum convivium) and 15 (a flute-guitar piece by Dvorak) were played in Plaça de Sant Iu. Finally, track 19 (initial measures of Orff’s Carmina Burana) was recorded in Plaça del Rei. The dates of the recordings are as follows (Table 1):
| Track | Plaça de Sant Felip Neri | Carrer de Santa Llúcia | Plaça de Sant Iu | Plaça del Rei |
|---|---|---|---|---|
| 1. La oboe | 30/03/2017 | 24/04/2017 | 07/04/2017 | 06/04/2017 |
| 5. 63 Hz | 30/03/2017 | 24/04/2017 | 07/04/2017 | 06/04/2017 |
| 6. 160 Hz | 30/03/2017 | 24/04/2017 | 07/04/2017 | 06/04/2017 |
| 7. 400 Hz | 30/03/2017 | 24/04/2017 | 07/04/2017 | 06/04/2017 |
| 8. 1000 Hz | 30/03/2017 | 24/04/2017 | 07/04/2017 | 06/04/2017 |
| 9. 2000 Hz | 30/03/2017 | 24/04/2017 | 07/04/2017 | 06/04/2017 |
| 10. 4000 Hz | 30/03/2017 | 24/04/2017 | 07/04/2017 | 06/04/2017 |
| 11. White noise | 30/03/2017 | 24/04/2017 | 07/04/2017 | 06/04/2017 |
| 12. Pink noise | 30/03/2017 | 24/04/2017 | 07/04/2017 | 06/04/2017 |
4. What is evaluated?
The four environments were studied using a reproduction-recording system. In this system, an impulse signal previously calibrated in the anechoic chamber was emitted in the environment. This signal was captured at different positions in each environment, as already mentioned. Figure 3 shows the measured positions. Each recording point was captured on two channels, left (L) and right (R). Therefore, the name of each recording consists of the number of the recording point, followed by an underscore and the capital letter L or R: 1_L, 1_R, 2_L, 2_R, etc.
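The naming scheme can be written down in a couple of lines. This is only an illustrative sketch: the function name `recording_names` is hypothetical, and the number of recording points per environment is passed in as a parameter rather than assumed.

```python
def recording_names(num_points):
    """Return labels such as '1_L' and '1_R' for every recording point."""
    return [f"{point}_{channel}"
            for point in range(1, num_points + 1)
            for channel in ("L", "R")]


# Two recording points give four channel recordings.
print(recording_names(2))
```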
4.1. Acoustic framework
Once the recordings had been made, different parameters of acoustic quality were obtained by signal processing. The following parameters were studied:
Reverberation time (T60): when a sound source that has been radiating continuously in an enclosure suddenly stops, a listener in the hall continues to hear the sound for a period of time while its energy is absorbed by the surfaces bounding the enclosure. The T60 value is the time the sound level takes to fall by 60 dB, obtained from the slope of the decay curve. The T60 of an empty hall varies with frequency. Generally, for music halls, T60 is higher at low frequencies and decreases as the frequency increases. This typical reverberation spectrum is known as the tonal curve.
Early decay time (EDT): this is the reverberation time derived from the first 10 dB of the decay, that is, six times the time the level takes to fall by 10 dB. EDT is more closely related to the subjective impression of reverberation in an enclosure than T60. To ensure good diffusion of sound in a hall, the EDT at 500 Hz and 1 kHz should be of the same order as T60.
Speech clarity (C50): registered C50 values vary with the listening point. According to Carrión Isbert, the recommended value of C50 at each point in an occupied hall must fulfill C50 > 2 dB. The higher the value, the greater the speech intelligibility and sonority at the point considered.
Definition (D50): the higher the definition, the better suited the hall is to speech, as in theatres or conference halls. Thus, a D50 value over 65% is appropriate for this kind of hall. A concert hall with good acoustics has a definition index lower than 50% at the central frequencies of 500 and 1000 Hz. In concert halls, the higher the definition index, the worse the acoustic quality.
Musical clarity (C80): registered C80 values vary with the listening point. Beranek recommends an average of −4 ≤ C80 ≤ 0 dB over the 500 Hz, 1 kHz and 2 kHz bands for an empty hall. Values over +1 dB should be avoided.
Strength (G): G values remain similar at each of the measurement points. They approximately follow a decreasing line from low frequencies (G = 30 dB) to high frequencies (G = 10 dB). UNE-EN ISO 3382 recommends G values between 4 and 5.5 dB.
4.2. Acoustic quality parameters in the environments
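As an illustration of how the two decay times in this list can be read from a measured response, the sketch below builds the backward-integrated (Schroeder) energy decay curve and fits a straight line to part of it. The chapter does not state which evaluation range or software was used, so the −5 to −35 dB range for T60, the function names and the synthetic test signal are all assumptions of this example.

```python
import math


def schroeder_decay_db(impulse_response):
    """Backward-integrated energy decay curve in dB, 0 dB at the first sample."""
    remaining = []
    total = 0.0
    for sample in reversed(impulse_response):
        total += sample * sample
        remaining.append(total)
    remaining.reverse()
    reference = remaining[0]
    return [10.0 * math.log10(r / reference) if r > 0.0 else float("-inf")
            for r in remaining]


def decay_time(decay_db, sample_rate, start_db, end_db):
    """Least-squares slope between start_db and end_db, extrapolated to 60 dB."""
    points = [(i / sample_rate, level)
              for i, level in enumerate(decay_db)
              if end_db <= level <= start_db]
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(level for _, level in points) / len(points)
    slope = (sum((t - mean_t) * (level - mean_l) for t, level in points)
             / sum((t - mean_t) ** 2 for t, _ in points))  # dB per second
    return -60.0 / slope


# Synthetic response (assumption): an envelope that loses 60 dB of energy in 1.5 s.
SAMPLE_RATE = 1000
TRUE_T60 = 1.5
response = [10.0 ** (-3.0 * n / (SAMPLE_RATE * TRUE_T60))
            for n in range(3 * SAMPLE_RATE)]

curve = schroeder_decay_db(response)
t60 = decay_time(curve, SAMPLE_RATE, -5.0, -35.0)  # assumed evaluation range
edt = decay_time(curve, SAMPLE_RATE, 0.0, -10.0)   # first 10 dB of the decay
print(round(t60, 2), round(edt, 2))
```

For a purely exponential decay like this one, both estimates coincide. In a real recording they differ, and that difference is what the comparison between EDT and reverberation time is meant to reveal.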
4.2.1. Reverberation time (T60)
In Sant Felip Neri square (top left of Figure 4) we can observe an increase in reverberation time as we step away from the source. The T60 values range from those of a conference hall (T60 = 0.7–1.0 s: point 1), an opera theatre (T60 = 1.2–1.5 s: point 2) and a chamber music concert hall (T60 = 1.3–1.7 s: points 3 and 4) to those of a symphonic concert hall (T60 = 1.8–2.0 s), according to the values recommended by Carrión. Plaça del Rei (top right) presents typical values of a symphonic hall as we step away from the source. Meanwhile, Plaça Sant Iu (bottom left) shows a typical curve of a speech hall at all the interior points of the square, except for the point in the alley, which presents T60 values like those of a chamber music hall. Carrer de Santa Llúcia (bottom right) presents some features of an opera theatre at points 1, 2 and 3, that is, at the points in front of the source, whilst behind the source some symphonic hall features appear.
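The comparison with hall types amounts to a range lookup, sketched below. The ranges are illustrative: the conference hall range is taken here as 0.7–1.0 s, which is an assumption of this sketch and should be checked against Carrión's table, and the function name is hypothetical. Because the opera and chamber music ranges overlap, the function returns every matching type rather than a single label.

```python
# Reverberation-time ranges in seconds (illustrative; check against the source table).
HALL_RANGES = {
    "conference hall": (0.7, 1.0),  # upper bound is an assumption of this sketch
    "opera theatre": (1.2, 1.5),
    "chamber music concert hall": (1.3, 1.7),
    "symphonic concert hall": (1.8, 2.0),
}


def matching_hall_types(t60_seconds, ranges=HALL_RANGES):
    """Return every hall type whose recommended range contains t60_seconds."""
    return [name for name, (low, high) in ranges.items()
            if low <= t60_seconds <= high]


print(matching_hall_types(1.4))
print(matching_hall_types(1.9))
```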
4.2.2. Speech clarity (C50)
In Plaça de Sant Felip Neri (top left of Figure 5) only the recordings at points 1, 2 and 3 exceed 2 dB of C50 for high frequencies. Note that points 1, 2 and 3 are the nearest points to the source and it is natural that clarity is better near the speaker. Thus, we can deduce that this is not a square with clear acoustics for speech in most of the recording locations and frequencies. In Plaça del Rei (top right) we find a similar situation at first glance. However, clarity is very appropriate at point 1 for mid-high frequencies. Moreover, as we step away from the source, that is, at points 1, 2 and 3, clarity is restricted only to high frequencies, whilst at lateral points, the clarity is below accepted levels. Plaça de Sant Iu (bottom left) presents a similar scheme to those seen above: a lack of clarity for low-mid frequencies and better clarity for high frequencies. Note that point 4 is the only one that does not follow the typical curve of the other points. This is due to its position in the access alley to the square rather than inside the enclosure. Thus, its behavior is different from the others. In Carrer de Santa Llúcia (bottom right) we can observe a progressive decrease in speech clarity from point 1 to point 5 for mid-high frequencies. This indicates that speech clarity is acceptable only at points 1 and 2, and only for mid-high frequencies. Considering that this environment is generally used by opera singers, and that most of the audience occupies the zone around points 1 and 2, we can say that the acoustics of this space are extremely favorable to its use.
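For reference, C50 is the ratio, in decibels, between the energy arriving in the first 50 ms and the energy arriving afterwards. A minimal sketch of that calculation follows. It assumes the response has already been band-filtered and trimmed so that its first sample is the direct sound; the function name and the toy response are hypothetical.

```python
import math


def clarity_db(impulse_response, sample_rate, limit_ms):
    """Early-to-late energy ratio in dB, split at limit_ms (50 for C50, 80 for C80)."""
    split = int(round(sample_rate * limit_ms / 1000.0))
    early = sum(s * s for s in impulse_response[:split])
    late = sum(s * s for s in impulse_response[split:])
    return 10.0 * math.log10(early / late)


# Toy response sampled at 1 kHz: unit reflections at 0, 10, 60 and 90 ms.
response = [0.0] * 100
for index in (0, 10, 60, 90):
    response[index] = 1.0

c50 = clarity_db(response, 1000, 50)  # early energy 2, late energy 2
c80 = clarity_db(response, 1000, 80)  # early energy 3, late energy 1
print(c50, round(c80, 2))
```

The same function with an 80 ms limit gives the musical clarity C80, which is why the two indices can rank the same listening point differently.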
4.2.3. Musical clarity (C80)
In Sant Felip Neri Square (top left of Figure 6) we can see that all the C80 values remain above −4 dB, but they are higher than +1 dB from 2 kHz at points 1, 2 and 4. If we compare C50 and C80 values, we can deduce that this square is clearer for music than for speech. In Plaça del Rei (top right), C80 values at points 2, 4 and 5 are within the desired limits. However, at points 1 and 2, C80 values exceed +1 dB, thus those points are not optimum for musical clarity. This square has good musical clarity at the points furthest from the source, that is to say, points that belong to the reverberant field and not to the direct field. In contrast, Plaça de Sant Iu (bottom left) only shows C80 values within the desired limit when we consider low and mid frequencies. These recommended frequencies hold C80 values exceeding the +1 dB criterion. These data, compared with C50, suggest that the square is more appropriate for speech than for music. It is a square in which the spoken word is correctly understood, although music is not at a disadvantage. Finally, Carrer de Santa Llúcia (bottom right) has better musical clarity at points 3, 4 and 5. Curiously enough, these are the same points that showed the poor speech clarity (C50) mentioned above. This fact suggests that the points with better qualities for speech are not the optimum ones for music, and vice versa.
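The criterion applied in this subsection (an average over the 500 Hz, 1 kHz and 2 kHz bands between −4 and 0 dB, with values above +1 dB to be avoided) can be expressed as a small check. The band values in the example are made up for illustration and are not measurements from the squares; the function names and verdict strings are hypothetical.

```python
def music_clarity_average(c80_by_band):
    """Mean C80 in dB over the 500 Hz, 1 kHz and 2 kHz bands."""
    return sum(c80_by_band[band] for band in (500, 1000, 2000)) / 3.0


def assess_c80(c80_by_band):
    """Compare the three-band average with the range quoted in the text."""
    average = music_clarity_average(c80_by_band)
    if average > 1.0:
        return "above +1 dB: to be avoided"
    if -4.0 <= average <= 0.0:
        return "within the recommended range"
    return "outside the recommended range"


# Made-up band values in dB, for illustration only.
print(assess_c80({500: -2.0, 1000: -1.0, 2000: 0.0}))
print(assess_c80({500: 2.0, 1000: 3.0, 2000: 4.0}))
```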
4.2.4. Definition (D50)
In Plaça Sant Felip Neri (top left of Figure 7), we can see that D50 values were below 50% for low frequencies, but above 50% for high frequencies (from 500 Hz). This was true particularly at points 1, 2 and 4. Thus, this square is very appropriate for music at points 3 and 5 (lateral and distant from the source) and better for speech at points 1, 2 and 4 (points near to or centered with the source). Similarly, Plaça del Rei (top right) had D50 values that exceeded 50% for high frequencies at points 1, 2 and 3 (these points were aligned frontally with the source). The lateral points maintained D50 values under 50%. Thus, these lateral points are appropriate for music. These data, together with those reviewed for C50 and C80, again indicate that this square has good acoustics for music and worse acoustics for speech. In contrast, Plaça Sant Iu (bottom left) showed the same trend at almost all the measured points, except for the point measured in the alley. In particular, D50 was above 65% for high frequencies and under 50% for mid and low frequencies. Therefore, we restate that this square works better for speech and has too much definition for music. Perhaps for this reason, and because of its size, the square is ideal for solo singers or those accompanied by chamber instruments. Finally, in Carrer de Santa Llúcia (bottom right) definition tends to fall as we move away from the source or stand behind it. In particular, points 1 and 2 are more suitable for speech or opera, whilst points 3, 4 and 5 have better features for music. Again, we can note that the audience zone corresponds to points 1 and 2.
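D50 and C50 are two views of the same early/late energy split, so the thresholds used for one can be translated into the other. With D50 expressed as a fraction, C50 = 10 log10(D50 / (1 − D50)). The sketch below applies this relation to the 50% and 65% limits and to the 2 dB speech criterion; the function names are hypothetical.

```python
import math


def c50_from_d50(d50_fraction):
    """Speech clarity in dB from definition given as a fraction between 0 and 1."""
    return 10.0 * math.log10(d50_fraction / (1.0 - d50_fraction))


def d50_from_c50(c50_db):
    """Definition as a fraction from speech clarity in dB."""
    ratio = 10.0 ** (c50_db / 10.0)
    return ratio / (1.0 + ratio)


print(round(c50_from_d50(0.50), 2))  # equal early and late energy
print(round(c50_from_d50(0.65), 2))  # speech-hall limit for D50
print(round(d50_from_c50(2.0), 3))   # D50 equivalent of the 2 dB criterion
```

Under this relation a D50 of 50% corresponds to a C50 of 0 dB, and the 65% limit for speech halls sits slightly above the 2 dB speech clarity criterion, so the two recommendations are broadly consistent with each other.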
4.2.5. Early decay time (EDT)
In Plaça de Sant Felip Neri (top left of Figure 8) we can see that EDT values for 500 Hz and 1 kHz are similar to T60 values, except for the T60 peak at point 5, which is the result of a measurement error due to the high level of background noise at that time. Similarly, in Plaça del Rei (top right), Plaça de Sant Iu (bottom left) and Carrer de Santa Llúcia (bottom right), the EDT values are very similar to the T60 values, which indicates that there is good sound diffusion in these environments.
5. How is it evaluated?
Quantitative approaches are the main methods of scientific research. They focus on analysing the degree of association between quantified variables, an approach promoted by logical positivism. Therefore, possible answers need to be constrained in order to evaluate results objectively. Some evaluation investigations have already been done with architecture students [44, 45].
Qualitative research is less common in education areas, because it focuses on detecting and processing intentions. Unlike quantitative methods, qualitative approaches require deduction to interpret the results. The qualitative approach is subjective, because it is assumed that reality is multifaceted and cannot be reduced to a universal parameter. Interviewers are passive observers: they take notes and classify them. These methods have traditionally been associated with the social sciences, due to their connection with human factors and the user’s experience. In fact, User Experience (UX) is a discipline focused on the study of behavioral patterns in working environments. Our case study is framed in teaching process usability. Thus, a brief discussion of what usability means is necessary.
5.1. Usability evaluation
We could define usability as a general quality that indicates the suitability for a specific purpose of a particular artefact (appropriateness for a purpose).
This term is linked with the development of products (which could be systems, technologies, tools, applications or devices) that are easy to learn, effective and enjoyable in the user’s experience. Nevertheless, usability can be considered one factor in a wider concept called the acceptability of a system. Thus, acceptability defines whether a system is good enough to meet all of a user’s needs.
In ISO/IEC 9126, usability is defined as “the capability of the software product to be understood, learned, used and attractive to the user, when used under specified conditions.” However, usability is not limited to computer systems. It is a concept that can be applied to any element in which an interaction between a human and an artefact occurs.
In addition, ISO 9241-11 describes guidelines for the usability of a particular product. Here, usability is defined as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.” In our research, the effectiveness of a system is related to its goals, efficiency is related to the performance of the resources used to reach the goals, and satisfaction is related to its acceptability and comfort. This definition is based on the concept of quality in use, and describes how the user carries out particular tasks in particular environments in an effective way. For Bevan, quality in use, measured in terms of effectiveness, efficiency and satisfaction, is determined not only by the product, but also by the context (kind of users, tasks of the users and physical environment). Therefore, usability, understood as the quality in use of a product, is the interaction between a user and a product while a task is being accomplished in a technical, physical, social and organisational environment.
In our study, usability defines the general quality that indicates the suitability of an immersive scenario for educational purposes. As in related work, the goal is to evaluate the students’ motivation before and after the use of such technologies. Users are asked to evaluate the quality of the soundscape representation in this scenario. Both visual and acoustic data have a direct impact on the perception of the space, and the realism of this representation is the focus of the evaluation.
5.2. Experiments with architecture students
Two kinds of experiments were carried out with architecture students. The first group of experiments considered a quantitative approach to the evaluation, whereas the second group was performed according to a qualitative approach.
5.2.1. The quantitative test
A set of multiple choice questions about several audio sequences was administered to a group of 17 people. In this test, some urban acoustic features were analysed. This first test, as shown in Table 2, was given once the recordings and acoustic analysis had been carried out. The test had two objectives: to characterise people’s perception and knowledge of acoustic and sonic features of the outdoor space, and to obtain feedback on the most relevant aspects of street music.
| E. Code | Description | Option A | Mention index, MI (%), option A | Option B | Mention index, MI (%), option B |
|---|---|---|---|---|---|
| 1. I | Speech clarity (C50) | C50 (500 Hz) = 2 dB (from Figure 5: Carrer Santa Llúcia, recording point 1, Puccini track) | 100 | C50 (500 Hz) = −7.5 dB (from Figure 5: Carrer Santa Llúcia, recording point 3, Puccini track) | 0 |
| 2. S | Sense of space | (Plaça de Sant Iu, recording point 3, Tchaikovsky track) | 82.4 | (Plaça del Rei, recording point 1, Tchaikovsky track) | 17.6 |
| 3. EDT | Early decay time | EDT (500 Hz) = 1.3 s (from Figure 8: Plaça Sant Felip Neri, recording point 1, Mendelssohn track) | 29.4 | EDT (500 Hz) = 2.3 s (from Figure 8: Plaça del Rei, recording point 4, Mendelssohn track) | 70.6 |
| 4. Br | Brightness | (Plaça Sant Iu, recording point 1, Dvorak track) | 58.8 | (Plaça Sant Iu, recording point 4, Dvorak track) | 41.2 |
| 5. T60 | Reverberation time | T60 (500 Hz) = 0.65 s (from Figure 4: Plaça del Rei, recording point 1, Orff track) | 88.2 | T60 (500 Hz) = 1.75 s (from Figure 4: Plaça del Rei, recording point 4, Orff track) | 11.8 |
| 6. BR | Bass ratio | (Plaça Sant Felip Neri, recording point 1, Lucien track) | 82.4 | (Plaça Sant Felip Neri, recording point 4, Lucien track) | 17.6 |
The results should allow for an initial approximation of whether people are aware of the nuances and differences between a street music recording and a concert hall music recording. Above all, it should be possible to test people on the big differences between the acoustics of the different public spaces. Therefore, different recordings of the same music but from different spatial points were compared. A total of six questions, each with a new melody, covered the following topics: speech intelligibility, sense of space, reverberation time, timbre modification, EDT and bass amplification.
|USERS||U 1||U 2 (M)||U 3 (M)||U 4||U 5||U 6||U 7||U 8||U 9||U 10 (M)||U 11||U 12||U 13 (M)||U 14||U 15 (M)||U 16||U 17|
After an analysis of the survey results, we can highlight some important findings. First, in every question in Table 2 except the fourth, which is more ambiguous, more than 70% of respondents chose the same option (A or B). This shows a high consensus about the acoustic features that were being evaluated. Second, Table 3 identifies the users who had professional or higher music qualifications. These users agreed unanimously, or almost unanimously (all but one), in their decisions. The only question on which they disagreed with each other concerns the sense of space.
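The 70% consensus check can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the majority counts can be back-computed from the percentages in Table 2 and the 17 respondents; the raw counts are not listed in the text, so `majority_counts` is an inference rather than reported data.

```python
N_RESPONDENTS = 17

# Majority counts per question, back-computed from the percentages in Table 2
# (an assumption: the raw answer counts are not listed in the text).
majority_counts = {
    "1. I": 17,
    "2. S": 14,
    "3. EDT": 12,
    "4. Br": 10,
    "5. T60": 15,
    "6. BR": 14,
}

# Share of respondents who chose the majority option, in percent.
majority_share = {
    code: round(100.0 * count / N_RESPONDENTS, 1)
    for code, count in majority_counts.items()
}

# Questions where the majority option did not exceed 70%.
below_threshold = [code for code, share in majority_share.items() if share <= 70.0]
```

Under these assumed counts, only the brightness question (4. Br) falls below the threshold, which matches the reading of Table 2 given above.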
5.2.2. The qualitative test
In this context, the analysis of some typical street music locations in the Ciutat Vella district of Barcelona was included in this soundscape evaluation. One of the environments studied was Plaça Sant Felip Neri. Several recordings were made there from different listener positions. For the current study, one track was selected: The Fountain by Marcel Lucien, which was played back in Plaça Sant Felip Neri and recorded at five positions. Figure 1 shows the emitting point and the numbering of the recording points in each square.
To create the test conditions, Plaça Sant Felip Neri needed to be reproduced as faithfully as possible, in terms of visual and auditory aspects.
First, a 3D model was created using photogrammetric processing of digital images to generate 3D spatial data. This method is relatively fast to implement and, unlike laser scanning, does not require specialised hardware. If performed correctly, it produces high-quality results that are less precise than those of other techniques but more than sufficient to give the user a faithful 3D recreation of the square.
The next goal was to recreate the soundscape. Two options were developed and presented to the test subjects. The first option was to use the original concert hall recording and present it to the users as it is, without any distortion, reverberation or additional ambient sounds from the square.
The second option was more difficult to create. As stated, five recordings of the song were made from different locations in the square. These recordings captured the subtleties a user would experience listening in the real square. The challenge was to allow the test subjects to move freely around the square and still be able to listen to the song in conditions as close as possible to the real conditions at any point in the square, not only the five recording points.
To extend the experience to any given point, a mixing algorithm performed a logarithmic interpolation between the nearest recordings. The result was identical to the original recording when the user stood exactly at a recording point, and faded seamlessly into the next recording as the user moved towards it.
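The text does not give the formula behind this interpolation, so the following is only a sketch of one possible reading: a crossfade between the two nearest recordings whose gains follow a logarithmic curve. The function names, the shape parameter `k` and the coordinates are all hypothetical and are not taken from the study.

```python
import math


def log_fade(t, k=9.0):
    """Gain that is 1.0 at t = 0 and 0.0 at t = 1, following a logarithmic curve.

    k is an assumed shape parameter; it is not taken from the study.
    """
    return math.log1p(k * (1.0 - t)) / math.log1p(k)


def mix_gains(listener, recording_points):
    """Return {point_index: gain} for the two recording points nearest the listener."""
    ranked = sorted(
        range(len(recording_points)),
        key=lambda i: math.dist(listener, recording_points[i]),
    )
    near, far = ranked[0], ranked[1]
    d_near = math.dist(listener, recording_points[near])
    d_far = math.dist(listener, recording_points[far])
    total = d_near + d_far
    # t is 0 at the nearest recording point and 0.5 halfway between the two.
    t = d_near / total if total > 0 else 0.0
    return {near: log_fade(t), far: log_fade(1.0 - t)}


# Hypothetical 2D coordinates in metres, not the positions used in the square.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
at_point = mix_gains((0.0, 0.0), points)
halfway = mix_gains((5.0, 0.0), points)
```

In this sketch, a listener standing exactly on a recording point hears only that recording (gain 1.0, neighbour at 0.0), while a listener halfway between two points hears both at the same gain. Because the same curve is applied symmetrically to both recordings, the gains stay continuous when the listener crosses the halfway line and the roles of nearest and second-nearest point swap.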
Finally, a head mounted display needed to be used to show the virtual reality to the test subjects. The Oculus Rift was selected for this task due to its compact size, high quality display and integrated headphones. This provided a fully immersive environment with a great sense of presence for the users (Figure 9).
The Oculus Rift allows for room scale tracking. This means that the user can move around the real space, and that movement translates to the virtual world. This greatly amplifies the sense of presence and makes the experience a lot more realistic and comfortable. However, it has limitations: the length of the cable and the resolution of the tracking cameras only allow the user to move around a space 3 m long and 2 m wide, approximately.
To improve this aspect, a teleport system was created. Using the Oculus Rift touch controllers, the user could point to any location in the square, and instantly teleport to that location. This provided the necessary freedom to move around the square, while maintaining all the benefits of room tracking.
The result of this process was an experience that allowed free movement around a realistic 3D recreation of the original square. Users perceived two distinctly different audio environments: one that recreated concert hall conditions, and a second one to experience as closely as possible the square’s sound conditions (Figure 10).
For the qualitative study, a sample of 18 students (11 men and 8 women) who agreed to participate was selected.
The BLA method works on positive and negative poles to define the strengths and weaknesses of the product. Once an element is obtained, the laddering technique can be applied to define the relevant details of the product. The object of a laddering interview is to uncover how product attributes, usage consequences and personal values are linked in a person’s mind. The characteristics obtained through laddering define which specific factors make an element either a strength or a weakness. The BLA process consisted of three steps, following a similar method to Fonseca, Redondo and Villagrasa:
Elicitation of elements. The test started with a blank template for the positive (most favorable) and negative (least favorable) elements. The interviewer (in this case the professor) asked each user (a student) to mention a positive and a negative aspect of each of the two types of music that could be heard (Option A and Option B). Thus, we obtained two positive aspects and two negative aspects per student.
Marking of elements. Once the list of positive and negative elements had been completed, the interviewer asked the user to mark each one from 0 (lowest possible level of satisfaction) to 10 (maximum level of satisfaction).
Element definition. Once the elements had been assessed, the qualitative phase started. The interviewer asked for justification of each one of the elements by performing the laddering technique. Questions were asked such as “Why is it a positive element?” “Why did you give it this mark?” The answers had to be specific explanations of the exact characteristics that made the mentioned element a strength or weakness of the product.
From the results obtained, the next step was to polarize the elements based on two criteria:
Positive (Px)/Negative (Nx): the student had to differentiate between elements perceived as strong points of the experience that helped them to consider the music as satisfactory, compared to negative aspects that were not satisfactory or simply needed to be modified to be satisfactory.
Common Elements (xC)/Particular (xP): finally, the positive and negative elements that were repeated in the students’ answers (common points) and the responses that were only given by one of the students (particular points) were separated according to the coding scheme shown in Tables 1–3.
The common elements that were mentioned at a higher rate were the most important aspects to use, improve or modify (according to their positive or negative sign). Particular elements, which were mentioned by only one user, could be ruled out or treated in later stages for development (Table 4).
|E. code||Description||Av. score (Av)||Mention index (MI) (%)|
|1PC (A)||Clarity of music||7.7||44.4|
|2PC (A)||Guiding thread for music||8.3||16.6|
|3PC (A)||Quality of sound||8.3||16.6|
|4PC (A)||Focused on the music||9||11.1|
|1PP (A)||Peaceful music||9||5.6|
|1NC (A)||Not realistic||3||33.3|
|2NC (A)||No sense of space||4||22.2|
|3NC (A)||No background||4.5||11.1|
|4NC (A)||Movement too fast||4||11.1|
|1NP (A)||No variance of echo||4||5.6|
|2NP (A)||Like a television||4||5.6|
|3NP (A)||Too loud||4||5.6|
|2PC (B)||Sense of the place||8.7||33.3|
|2PP (B)||Softer and modulated||7||5.6|
|3PP (B)||More natural||7||5.6|
|1NC (B)||No clarity of music||3.8||22.2|
|2NC (B)||Relation between background and vision||4.7||16.7|
|3NC (B)||Disturbing background||3.7||16.7|
|4NC (B)||Problems with volume||3.7||16.7|
|5NC (B)||It is not real enough||3.5||11.1|
|1NP (B)||Quality of hardware||5||5.6|
|2NP (B)||Sudden changes in sound||3||5.6|
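The average score and mention index reported above can be derived from the individual marks. The following is a minimal sketch, assuming that the mention index is the percentage of participants who mentioned an element and that each participant mentions a given element at most once. The function name and the sample marks are hypothetical and are not the study's raw data.

```python
from collections import defaultdict


def summarise_elements(mentions, n_participants):
    """Summarise BLA marks given as (participant_id, element, score from 0 to 10)."""
    scores = defaultdict(list)
    for _participant, element, score in mentions:
        scores[element].append(score)

    summary = {}
    for element, values in scores.items():
        summary[element] = {
            # Average score (Av) over the participants who mentioned the element.
            "av": round(sum(values) / len(values), 1),
            # Mention index (MI): share of all participants, in percent.
            "mi": round(100.0 * len(values) / n_participants, 1),
            # Common if repeated across participants, particular if mentioned once.
            "kind": "common" if len(values) > 1 else "particular",
        }
    return summary


# Hypothetical marks for illustration only.
mentions = [
    (1, "Clarity of music", 8),
    (2, "Clarity of music", 7),
    (3, "Peaceful music", 9),
]
summary = summarise_elements(mentions, n_participants=18)
```

With 18 participants, a single mention yields a mention index of 5.6% and two mentions yield 11.1%, which is consistent with the smallest values in the table.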
The individual values obtained for positive and negative indicators are shown in Table 5. Once the features mentioned by the students were identified and given values, the third step defined by the BLA initiated the qualitative stage in which the students described and provided solutions or improvements for each of their contributions in the format of an open interview.
Table 6 shows the main improvements or changes that the students proposed for both positive and negative elements.
|USERS||U 1||U 2||U 3||U 4||U 5||U 6||U 7||U 8||U 9||U 10||U 11||U 12||U 13||U 14||U 15||U 16||U 17||U 18|
|E. Code||Description||Mention index (MI) (%)|
|1CI (A)||Improve the relation with the environment||66.7|
|2CI (A)||Improve the background sound||22.2|
|3CI (A)||Change the position of the sounds||11.1|
|2PI (A)||Decrease the volume of the sound||5.7|
|3PI (A)||Improve the relation with the musician||5.7|
|4PI (A)||Improve the quality of the sound||5.7|
|1CI (B)||Improve sound quality||27.8|
|2CI (B)||The changes between position could be softer||22.2|
|3CI (B)||Balance the volume levels between different points||11.1|
|4CI (B)||Improve the relation between vision and sound||11.1|
|5CI (B)||Decrease the background noise||11.1|
|1PI (B)||Improve the clarity of sound||5.7|
|1PI (B)||Improve the relation with the place||5.7|
At this point, we can identify the most relevant items obtained from the BLA: those with high mention rates, high scores or a combination of both. It is important to separate the types of results obtained. The first group belongs to option A (concert hall recording), and the second group to option B (public square recording). After presenting the most relevant features of each, we conclude by comparing them.
Option A (concert hall recording). Students valued this kind of recording for its clarity of music (MI: 44.4%, Av: 7.8), for providing a guiding thread for the music (MI: 16.6%, Av: 8.3) and for the quality of the sound (MI: 16.6%, Av: 8.3). As for the main negative comments, students clearly identified a lack of realism in this kind of experience (MI: 33.3%, Av: 3), which was related to the lack of a sense of space (MI: 22.2%, Av: 4), and they missed the background noise (MI: 11.1%, Av: 4.5). These aspects were directly related to the design of the application.
Option B (public square recording). Students highlighted two main positive aspects: the high degree of realism of the application in both visual and acoustic terms (MI: 50%, Av: 8.4), and the good relation between sound and place (MI: 33.3%, Av: 8.7). Conversely, they made some negative comments: a lack of clarity in the music (MI: 22.2%, Av: 3.8); a poor relation between background and vision (MI: 16.7%, Av: 4.7), which could be solved through the positioning of different visual avatars; and the presence of some disturbing background sound (MI: 16.7%, Av: 3.7), caused by the original recordings having been made at different times. Technically, these would be the main aspects to modify in future iterations of the proposed method.
In summary, two clear opinions about the experiment emerged, consistent with the answers to the first question of the survey: Which recording do you prefer, A or B? Most participants (61.1%) preferred option B to option A (38.9%). The reasons for this preference were clearly explained in the rest of the survey. Although the realism of the application in both visual and acoustic terms was highly valued in option B (MI: 50%, Av: 8.4), the clarity of music in option B was clearly not as good as in option A, as a comparison of 1PC (A) with 1NC (B) shows. This confirms that the street music recording entails a loss of quality in the music played. This loss could be a drawback for musicians who want to perform in the middle of the city. However, the survey reveals another feature that must be taken into account: a third of the students (MI: 33.3%) gave option B an almost excellent score (Av: 8.7) for the quality of the sense of space (2PC [B]). This shows the hidden potential of spatial sound, that is, sound spatialization. The history of music contains several attempts by composers to write music with the spatial features of the intended venue in mind. However, these compositions tend to be confined to closed spaces, and their spatial possibilities are limited to the specific space. A wide range of possibilities arises when a closed concert hall is replaced by the openness of squares and public spaces. Coupled volumes, streets, galleries, balconies or even stairs now belong to this new stage for music, which can be explored in countless ways.
The study aimed to address the questions of what is to be evaluated and how it is to be evaluated within the context of cultural heritage evaluation. In this case study of a higher education evaluation, we have explained that both questions can be answered in three words: objectivity through subjectivity. In fact, what was evaluated with the quantitative and qualitative tests was not far from what was studied with the acoustic parameters. Furthermore, the subjective opinions were based on the objective parameters. This built a bridge over the wide gap between these two poles, and helped us to understand that no objective parameters can be evaluated without subjective insight. For the cultural evaluation, a scientific basis must be established to achieve reliable results. Nevertheless, a unilateral evaluation that relies only on these scientific data would overlook the valuable opinions of users. What is more, without the users’ insight, the analysis would neglect the term “cultural,” because no culture is possible without the action of humans, that is, the users. Here, cultural is defined as the opposite of natural, as a synonym of artificial, as something that is evaluated by a human.
However, some limitations of the study need to be addressed in further research. The number of participants in the surveys should be increased, and a pre- and post-test evaluation of satisfaction with the process should be introduced. Similarly, the immersive experiment should be extended to other outdoor environments (Carrer Santa Llúcia, Plaça de Sant Iu and Plaça del Rei) so that the objective parameters can be compared with the users’ subjective opinions.
Further research must also be carried out on the implementation of these representation techniques in the higher education system, especially in architecture degree courses, in which spatial understanding is crucial. In this context, it is clear that architecture students should be able to deal with spatial representations that cover not only visual features, but also sonic or even thermal components of architecture. Today’s technology has reached such a high level of representation capability that a vague idea of what an environment looks like is no longer acceptable. An architect should master these tools, both to present a new building and to protect existing constructions that are regarded as cultural heritage.
This research was supported by the National Programme of Research, Development and Innovation aimed to the Society Challenges BIA2016-77464-C2-1-R & BIA2016-77464-C2-2-R of the National Plan for Scientific Research, Development and Technological Innovation 2013-2016, Government of Spain, titled “Gamificación para la enseñanza del diseño urbano y la integración en ella de la participación ciudadana (EduGAME4CITY),” and “Diseño Gamificado de visualización 3D con sistemas de realidad virtual para el studio de la mejora de competencias motivacionales, sociales y espaciales del usuario (EduGAME4CITY).”