Recreating the Past

Duncan Brown, Kate Devlin, Alan Chalmers, Philippe Martinez, Paul Debevec, Greg Ward

Course #27, SIGGRAPH 2002, San Antonio, Texas, USA. 21-26 July 2002

Abstract

Recent developments in computer graphics and interactive techniques are providing powerful tools for modelling multi-dimensional aspects of data gathered by archaeologists. This course addresses the problems associated with reconstructing archaeological and heritage sites on computer and evaluating the realism of the resultant models. The crucial question considered is: are the results misleading, and thus are we in fact misinterpreting the past?

We will never know precisely what was in the mind of our ancestors as they painted rock shelters in France 25 thousand years ago, raised the pyramids in Egypt, or even purchased a particular brightly coloured pot during the Middle Ages. Recently archaeologists have been increasingly turning to computer graphics and interactive techniques to help interpret material preserved from ancient cultures. This course describes currently used state-of-the-art techniques for reconstructing archaeological sites and addresses the issues that still need to be resolved so that these techniques can indeed play a significant role in helping us understand the past.

Attendees should have an interest in "understanding the past" and a basic knowledge of computer graphics. No prior knowledge of laser scanning, lighting simulation or visual perception evaluation is assumed, although any such knowledge will be an advantage. The course covers the following topics: creating models of the sites, including laser scanning; very realistic lighting simulation; quantifying the realism of the results using human visual perception and psychophysical methods; and valid interpretation of the results by archaeologists and the general public. All topics are illustrated by case studies.

Course Schedule

Module 1 - Creating the Past
1.30 Introduction to Recreating the Past - Chalmers
 - some intuitive examples of applications
 - role of realism
 - our focus: understanding the past using computer graphics methods
1.50 Creating the Models - Debevec & Martinez
 - using all the evidence
 - laser scanning
 - recreating color and textures
2.40 Very Realistic Lighting Simulation - Ward & Brown
 - experimental archaeology to recreate the ancient fuel types
 - accurate simulation of light propagation
 - tone mapping operators to achieve meaningful results
3.15 Break

Module 2 - Interpreting the Past
3.30 Quantifying Realism - Chalmers
 - psychophysics: procedures for comparing real and synthetic images
 - fidelity assessment
 - case studies
4.00 Interpretation of the Models - Brown & Devlin
 - displaying the information
 - setting standards
 - interpreting the results
 - avoiding misinterpretation
 - developing new hypotheses
4.40 Conclusion & Summary - Chalmers & Martinez
5.00 Discussion and questions - All

About the authors

Alan Chalmers is a Reader in the Department of Computer Science at the University of Bristol, UK. He has published over 80 papers in journals and international conferences on very realistic graphics. He is currently Vice President of ACM SIGGRAPH. His research investigates the use of very realistic graphics in the accurate visualisation of archaeological site reconstructions, and techniques which may be used to reduce computation times without affecting the perceptual quality of the images.
Dept. of Computer Science, University of Bristol, Woodland Road, Bristol BS8 1UB, United Kingdom.
Email: [email protected] URL: http://www.cs.bris.ac.uk/~alan/

Kate Devlin is a research member of the Graphics Group in the Department of Computer Science at the University of Bristol, UK. Her undergraduate degree was in archaeology, and she worked as a field archaeologist and site draughtsman for two years before studying for an MSc in Computer Science. Her research interests are realistic archaeological reconstructions of flame-lit environments and the accurate display of high dynamic range scenes. Her personal interests lie in the representation and interpretation of archaeological records.
Dept. of Computer Science, University of Bristol, Woodland Road, Bristol BS8 1UB, United Kingdom. Email: [email protected] URL: http://www.cs.bris.ac.uk/~devlin/

Duncan Brown is the Curator of Archaeological Collections at Southampton City Heritage, UK; a visiting Research Fellow at the University of Southampton, UK; and a freelance medieval pottery specialist who has spent several years digging in Bucks, Northants, Wales, Sussex and Cheshire. He has worked on assemblages from Fyfield Down, Winchester, the Tower of London, Reading, the Mary Rose, Guernsey, the Isle of Wight and elsewhere. Duncan specialises in pottery imported from mainland Europe and has taught ceramic analysis at Bournemouth University and archaeological theory at King Alfred's College. Duncan has been co-editor of the newsletter of the Society for Medieval Archaeology, Secretary of the Medieval Pottery Research Group, and a founder committee member and subsequently chair of the IFA Finds Group.
City Heritage Services, Southampton City Council, Southampton, United Kingdom. Email: [email protected]

Paul Debevec received his Ph.D. from UC Berkeley in 1996 and leads a computer graphics research group at the University of Southern California's Institute for Creative Technologies. His computer graphics research has concerned reconstructing real-world environments from photographs, augmenting such models with computer-generated objects and humans, and creating animations using image-based rendering and global illumination. His films and art installations have featured photo-real reconstructions of Rouen Cathedral, the UC Berkeley campus, and St. Peter's Basilica, and have contributed to the visual effects techniques seen in "The Matrix" and "X-Men". Debevec is a member of ACM SIGGRAPH and the Visual Effects Society, and the recipient of SIGGRAPH's 2001 Significant New Researcher Award.
Executive Producer, Graphics Research, USC Institute for Creative Technologies, 13274 Fiji Way, 5th Floor, Marina del Rey, CA 90292, USA. Email: [email protected] URL: http://www.debevec.org/

Philippe Martinez has a Ph.D. in Egyptology and a research degree from the École du Louvre. After completing his studies he joined the staff of the Franco-Egyptian Center in Karnak and worked there as a research assistant until 1992. During this period he was in charge of on-site documentation, specializing in the epigraphical survey and study of the dismantled limestone monuments of the Middle Kingdom and the beginning of the 18th dynasty. In 1989, Philippe Martinez joined a team of computer engineers at the R&D department of France's national electricity utility, working on a 3D reconstruction of the ancient Egyptian temples of Karnak and Luxor. His major contribution was in the use of 3D modelling in archaeological hypothesis testing.
Since 1999, Philippe Martinez has been a member of the ECHO Project (Egyptian Cultural Heritage Operation) as lead archaeologist. Using 3D reconstruction in the field of monumental archaeology, he is working on the very idea of digital epigraphy, in the hope of applying fast, reliable, robust and cost-effective documentation techniques to the thousands of monuments currently disappearing in Egypt as well as in the rest of the world. His current mission takes him around the Mediterranean, to Italy, Tunisia and Egypt.
École Normale Supérieure, Paris, France. Email: [email protected]

Greg Ward (a.k.a. Greg Ward Larson) graduated in Physics from UC Berkeley in 1983 and earned a Masters in Computer Science from SF State University in 1985. Since 1985 he has worked in the field of light measurement, simulation, and rendering, variously at the Berkeley National Lab, EPFL Switzerland, Silicon Graphics Inc., Shutterfly, and Exponent. He is the author of the widely used Radiance package for lighting simulation and rendering.
1200 Dartmouth St., #C, Albany, CA 94706, USA. Email: [email protected]

Contents

1 Introduction
  1.1 Cap Blanc
2 Creating the models
  2.1 3D scanning in archaeological perspective
  2.2 Affordable 3D scanning for archaeologists: the possibilities of structured light
  2.3 Slides
  2.4 A photometric approach to digitizing cultural artifacts: slides
3 Very Realistic Lighting Simulation
  3.1 Slides
4 Quantifying Realism
  4.1 Luminaires
    4.1.1 Creating a Realistic Flame
    4.1.2 Converting Pixel Information to Radiance Files
    4.1.3 Creation of a Radiance Scene
    4.1.4 Creation of Final Images
    4.1.5 Automation
  4.2 Implementation
    4.2.1 Flame to Film to Frames
    4.2.2 Creating Sphere Data
    4.2.3 Superimposing the Flame
  4.3 Converting Luminaire Data
  4.4 Validating Realism
5 Representation and Interpretation
  5.1 Introduction
  5.2 A brief history of archaeological illustration
    5.2.1 Archaeological illustration: an overview
    5.2.2 Case study: seeing Stonehenge
  5.3 The idea of realism
    5.3.1 Terms and concepts
    5.3.2 The nature of archaeological data
    5.3.3 Context
    5.3.4 An established reality
  5.4 Representing for a purpose
    5.4.1 Representations for the archaeologist
    5.4.2 Representations for the computer scientist
    5.4.3 Representation as advertising
    5.4.4 Representations for the public
    5.4.5 Fit for purpose
  5.5 Misinterpretation
    5.5.1 Different outcomes from the same evidence
    5.5.2 Seeing what we want to see
    5.5.3 Reducing misinterpretation
  5.6 Setting standards
    5.6.1 Metadata
    5.6.2 Alternative representations
    5.6.3 Preserving information
    5.6.4 Standardisation
  5.7 Developing new hypotheses
    5.7.1 New ideas from light and colour perception
    5.7.2 Case study: Medieval pottery
  5.8 Summary
  5.9 Slides
Bibliography
A Included papers

Chapter 1
Introduction
by Alan Chalmers, University of Bristol, UK.

Recent developments in computer graphics and interactive techniques are providing powerful tools for modelling multi-dimensional aspects of data gathered by archaeologists. This course addresses the problems associated with reconstructing archaeological and heritage sites on computer and evaluating the realism of the resultant models. The crucial question considered is: are the results misleading, and thus are we in fact misinterpreting the past?

Archaeology provides an excellent opportunity for computer graphics to explore scientific methods and problems at a human scale, and to introduce a cultural dimension which opens up avenues for new and alternative interpretations. For example, pottery is a very common find at the excavation of medieval sites. Archaeologists have been studying this material for many years, and its means of manufacture, distribution and use are very well understood. With this sound basis provided by years of data acquisition, we can now begin to investigate less easily comprehended aspects of pottery. Medieval pottery was often very colourful, wonderfully textured and vibrantly decorated. Was this necessary because of the lighting conditions which prevailed in medieval society, or were the pots perfectly visible and the people simply wanted some colour in their otherwise dull lives? Computer graphics offers the possibility of exploring these questions.

In order to minimise misinterpretation of the archaeological evidence, the question is whether computer visualisation of archaeological sites should be very realistic, including the accurate modelling of the prevalent illumination, the complex 3D environment, and atmospheric factors such as smoke and dust. We will never know exactly why medieval pottery was so brightly coloured, but perhaps computer graphics and interactive techniques can establish a framework of possibilities within which archaeologists can hypothesise as to probable solutions.

Figure 1.1: Part of the frieze from Cap Blanc

1.1 Cap Blanc

As a way of illustrating the potential computer graphics has to offer archaeology, we consider the prehistoric site of Cap Blanc.
The rock shelter site of Cap Blanc, overlooking the Beaune valley in the Dordogne, contains perhaps the most dramatic and impressive example of Upper Palaeolithic haut-relief carving. A frieze of horses, bison and deer, some overlain on other images, was carved into the limestone some 25,000 years ago, to a depth of as much as 45 cm, and covers 13 m of the wall of the shelter. Since its discovery in 1909 by Raymond Peyrille, several descriptions, sketches and surveys of the frieze have been published, but they appear to be variable in their detail and accuracy.

In 1999, a laser scan was taken of part of the frieze (Figure 1.1) at 20 mm precision [9]. It was obviously of the utmost importance that an eye-safe laser was used, to ensure there was no damage at all to the site. Some 55,000 points were obtained in two scans of the upper and lower parts of the selected area, Figure 1.1 (the MDL laser scanner used did not have sufficient memory to store all the points in a single scan). These points were stitched together and converted into a triangular mesh. Detailed photographs of the frieze were taken, each one of which included a standard rock art colour chart, just visible in Figure 1.1. As the exact spectral data for the colour chart is known, this enabled us to compensate for the lighting in the photograph and thus obtain approximately illumination-free textures to include with the wire-frame model.

Images were rendered using Radiance. Figure 1.3(a) shows the horse illuminated by a simulated 55W incandescent bulb (as in a low-power floodlight), which is how visitors view the actual site today. In Figure 1.3(b) the horse is instead illuminated by an animal-fat tallow candle, as it may have been viewed 25,000 years ago. As can be seen, the difference between the two images is significant, with the candle illumination giving a 'warmer glow' to the scene as well as more shadows. This shows that it is important for archaeologists to view such artwork under (simulated) original conditions rather than under modern lighting. (It is of course impossible to investigate these sensitive sites with real flame sources.)

For this site, we wanted to investigate whether the dynamic nature of flame, coupled with the careful use of three-dimensional structure, may have been used by our prehistoric ancestors to create animations in the cave art sites of France 25,000 years ago. The shadows created by the moving flame do indeed appear to give the horse motion. We will of course never know for certain whether the artists of the Upper Palaeolithic were in fact creating animations, but the reconstructions do show that it certainly is possible. There is other intriguing evidence to support this hypothesis. As can be seen in the figures, the legs of the horse are not present in any detail. This has long been believed by archaeologists to be due to erosion, but if this is the case, why is the rest of the horse not equally eroded? Could it be that the legs were deliberately not carved in any detail to accentuate any motion, that is, to create some form of motion blur? Furthermore, traces of red ochre have been found on the carvings. It is interesting to speculate whether the application of red ochre at key points on the horse's anatomy may also have been used to enhance any motion effects.

Figure 1.2: Cap Blanc horse, from point cloud to reconstruction. (a) Clouds of points from the scan; (b) the reconstructed horse.
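The texture-correction step mentioned above, using a colour chart of known reflectance to remove the photograph's own lighting, can be illustrated very simply. The following is only a minimal sketch, not the procedure used for the Cap Blanc model: it assumes a linear (RAW or HDR) photograph, a single grey patch of known reflectance located by hand, and an assumed 18% grey value; a real chart-based correction would use all of the chart's patches.

    import numpy as np

    def correct_with_grey_patch(img, patch_box, patch_reflectance=0.18):
        """Rescale a linear RGB image so a grey chart patch of known
        reflectance reads correctly, removing the overall colour cast and
        intensity of the illumination present in the photograph.

        img               -- float array, shape (H, W, 3), linear RGB
        patch_box         -- (y0, y1, x0, x1) pixel bounds of the grey patch
        patch_reflectance -- known reflectance of that patch (assumed 18% grey)
        """
        y0, y1, x0, x1 = patch_box
        measured = img[y0:y1, x0:x1].reshape(-1, 3).mean(axis=0)  # per-channel mean over the patch
        gain = patch_reflectance / measured                        # per-channel correction factor
        return np.clip(img * gain, 0.0, 1.0)                       # approximate illumination-free texture

A fuller correction would fit a 3x3 (or higher-order) transform over all of the chart's patches rather than a single grey patch, but the principle is the same: known reflectances in, measured pixel values out, solve for the lighting.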
Figure 1.3: Lighting the reconstructed frieze. (a) 55W incandescent bulb; (b) animal-fat candle.

Chapter 2
Creating the models
by Paul Debevec, USC Institute for Creative Technologies, and Philippe Martinez, École Normale Supérieure.

2.1 3D scanning in archaeological perspective

For most people, archaeology means digging things up and putting them on a shelf in some kind of museum, putting back together the so-called material culture of a civilisation. But for the professional archaeologist or art historian, the main part of this process is spent on what we shall bluntly call 'documentation'. Very generally and simply put, this means the faithful and complete portraiture, through any available means and medium, of the discovered artefacts in connection with their context of discovery. In most cases this has classically been done through verbal descriptions and drawings. The end of the 19th century brought a more consistently faithful tool with the advent of photography. However, the time as well as the equipment needed to pursue the goal of complete and accurate documentation of ancient remains have contributed to raising its cost and lowering its effectiveness. Where actual publications, which are most of the time the only form of documentation available, need to present the whole range of aspects of the discoveries, they still tend to give a very verbal, and sometimes verbose if not partial, account of the archaeological context or monuments they concern.

The digital tools that have appeared during the last two decades are slowly making these difficulties disappear. The total stations used for topographical surveys now enable us to get a fair idea of the setting of huge settlements. Databases make the difficult art of dealing with thousands of very varied types of data at least possible, while CAD tools enable archaeologists who are not professional draughtsmen to create drawings that are sometimes much better than the simple sketches we had unfortunately come to tolerate with the slow disappearance of dedicated skills and staff. Very simple and effective technologies such as QTVR also give us very realistic and reasonably complete visual descriptions of objects or settlements. However, all these possibilities may also become the source of new problems when the same staff member tries to cover all the different needs of documentation and publication while never taking, or simply never having, the time to learn the different skills that lie behind the interface of the software they use.

The appearance during the last decade of new 3D tools now lets us at least try to move towards something like the global documentation archaeologists have been dreaming of over the last two centuries, regretting the incomplete documents left by their predecessors. Apart from the 3D modelling packages that enable an artistic or technical representation of the object, architecture or terrain studied, the recent availability of 3D scanning techniques makes it possible to obtain a very accurate and faithful representation of the archaeological remains, and to work on a model that is as close as possible to reality.
These techniques enable us to take virtual casts of objects without having to physically touch or risk harming them, while also avoiding human intellectual reinterpretation interfering with the final portraiture of reality (even if this human interference can never be totally avoided during the complex processing of the original, very accurate set of physically captured data). The available techniques have diversified greatly over the years. However, the cost, fragility and difficulty of use of most of these technologies have also made them almost impossible for 'impoverished' archaeological research institutions and professionals to use directly on site. Where they should be used on a daily basis to salvage information that is destroyed every day by trowels and shovels, they are unfortunately reserved for luckier projects that try to lure sponsors or media representatives by electing a very well known and not really threatened masterpiece as their centre of interest. However, as these technologies are clearly still developing, these facts may be a small price to pay for their availability, growing robustness and cost effectiveness.

Different aspects of the studied subjects have to be taken into account: size, weight, material, colour, setting, artistic quality, fragility, the reflective or non-reflective qualities of the material, etc. These intrinsic qualities have to be listed in order to choose the technology that best fits the aims and needs. If we examine the different technologies, it is clear that none of them can be seen as universal, and that anyone envisaging this approach has to deal with a fairly complex toolbox. The same can be said of the methodology to be employed once one or more technologies have been singled out. There is no methodology that can be applied blindly and thoroughly; each project will in fact bring the definition of a specific strategy, according to the target and to the type of final representation that is the desired aim of the project. To go one step further, the very idea of using 3D scanning tools on cultural heritage and archaeological subjects has to be defined according to goals which are clearly complementary, but which can end up being almost contradictory during the actual performance of the project.

A very big issue is the quality of the measurements gathered, and thus the quality of the final model derived from them. The archaeologist or art historian will naturally tend to ask for a very precise virtual cast with as high a resolution as possible. This stage has to be considered the core of documentation, and the file thus gathered can be seen as a digital archive that would enable its curators to replace the object should it ever be damaged or destroyed. But a high resolution scan turns out to be a painful thing to use once you try to turn it into any kind of modelling and rendering. Thus the graphics specialists who take part in such projects would rather gather less rich 3D scans that they can use simply as templates for a precise but somewhat reinterpreted representation. It should be stressed from the outset, however, that it is always more important to get the best possible virtual cast as an archive, to save as much of the available information as possible, and then to resample the data so that it can be used for modelling.
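The archive-then-resample workflow described above (keep the full-resolution scan untouched, thin a copy for modelling and rendering) can be shown in a few lines of code. This is only a sketch of the idea, not the authors' pipeline: it assumes the scan is stored as a plain text file of x y z points (the file name is hypothetical) and uses a simple voxel-grid average to reduce the point count.

    import numpy as np

    def load_points(path):
        """Load an archive scan stored as whitespace-separated x y z lines."""
        return np.loadtxt(path)

    def voxel_downsample(points, voxel_size):
        """Thin a point cloud by averaging all points falling in the same voxel.
        The full-resolution file stays untouched as the archive copy; only this
        reduced copy is passed on to meshing and rendering."""
        buckets = {}
        for p in points:
            key = tuple((p // voxel_size).astype(int))   # which voxel this point falls in
            buckets.setdefault(key, []).append(p)
        return np.array([np.mean(b, axis=0) for b in buckets.values()])

    # Hypothetical usage: archive the raw scan, model from the thinned copy.
    # raw = load_points("frieze_scan.xyz")
    # working = voxel_downsample(raw, voxel_size=0.05)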
Apart from this idea of neutral cultural heritage archiving, one also has to think about our perception of these objects in order to understand, and take into account, the need for precise 3D capture, modelling and rendering of cultural objects. Most such objects are linked to some artistic and aesthetic quality that makes them unique, and that produces a complex mixture of feelings when they are perceived by the human mind. To make a very straightforward comparison, human bodies and faces have been the nightmare of computer graphics for years, and the problem remains largely unsolved today. The reason does not lie in technical inability but perhaps in our perception, which is centred on what the essence of a human being is: our brain reacts very suspiciously to any 'unnatural' or not realistically true representation of a human being. It is striking that, on the contrary, phoney or caricatured, humanised animals are easily accepted by our minds. We can possibly draw the same line between an industrial object that has no specifics to make it unique, and a cultural artefact that is most of the time the result of very specific craftsmanship (if not artistry), and that comes to us shaped by human use and the passing of time. Capturing the soul and essence of these objects so as to transmit it to the largest possible audience, away from the object itself, is the challenge facing anyone wishing to use digital tools to document, understand and explain ancient cultural artefacts and monuments. The problematic subject of realistic representation is discussed in more detail in section 5.3.

2.2 Affordable 3D scanning for archaeologists: the possibilities of structured light

Among the existing technologies, the use of structured light may very well be the most promising, as it seems to be the one that takes into account most of the real needs of the archaeologist. This technology is based mainly on an optical approach that can be summarised as the projection onto an object of one or more geometrical patterns or grids whose topology is precisely known. The capture consists of digital images showing the grid as projected on the object, and/or an image of the object itself taken from the very same angle, to be used as a texture. The grid projected onto the object is of course deformed by the shape of the object. It is this deformation that is processed by software to extract (x, y, z) measurements for the points of the projected grid or grids.

Different software packages are already available on the market today, but research is still taking shape. At present, the limitations of this technology lie mainly in the need for powerful projection devices that allow the digital camera to capture a fine but dense grid on any kind of material, and in the difficult, sometimes very bright, lighting conditions encountered on site. All this tends to make structured light a technology best suited to small or medium sized capture. However, it enables the processing of dense and precise measurements. The quality of the optics and the resolution of the camera are of course of the greatest importance, but the progress made in these areas during the last few years makes it reasonable to expect that the necessary specifications will be available in affordable cameras in the coming years.
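To make the geometry behind those (x, y, z) measurements concrete, here is a minimal triangulation sketch. It is not taken from any particular commercial package: it assumes the simplest possible set-up, in which the projector emits a known plane of light for each stripe and the camera is calibrated, so that each lit pixel defines a ray whose intersection with the stripe's plane gives a 3D point. The numbers in the example are placeholders, not measured values.

    import numpy as np

    def pixel_ray(u, v, K):
        """Direction of the camera ray through pixel (u, v), camera at the origin.
        K is the 3x3 intrinsic matrix of the calibrated camera."""
        d = np.linalg.inv(K) @ np.array([u, v, 1.0])
        return d / np.linalg.norm(d)

    def intersect_ray_plane(ray_dir, plane_point, plane_normal):
        """Point where a ray from the camera origin meets the projected light
        plane (both expressed in the camera coordinate frame)."""
        t = np.dot(plane_point, plane_normal) / np.dot(ray_dir, plane_normal)
        return t * ray_dir

    # Hypothetical example: one decoded stripe pixel.
    K = np.array([[1500.0, 0.0, 640.0],
                  [0.0, 1500.0, 480.0],
                  [0.0, 0.0, 1.0]])            # assumed camera intrinsics
    plane_point = np.array([0.2, 0.0, 0.0])     # a point on the stripe's plane, from projector calibration
    plane_normal = np.array([0.94, 0.0, -0.34])
    xyz = intersect_ray_plane(pixel_ray(700, 512, K), plane_point, plane_normal)

Real systems decode many stripes per image (for example Gray-code or phase-shift patterns) and must calibrate the projector as carefully as the camera, but each reconstructed point still comes from exactly this kind of ray-plane intersection.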
Aside from that, the required apparatus can be put together mainly from off-the-shelf components. To produce a portable, flexible device, it has to be constructed from different parts that are reassembled for every use, and whose precise spatial relationship is a condition of the quality of the capture. The capture itself can be done with either a still digital camera or a high resolution video camera. The choice is open, but it also depends on whether one or many different grids are used. These can be projected either from glass slides or from computer generated and displayed graphic files. In the latter case, it is convenient to use a digital projector directly linked to a laptop computer, equipped with a powerful graphics card, that simultaneously controls the display and the capture. These two devices, projector and camera, have to be mounted together on a stand (which can be any professional, stable photographic tripod). However, the need to take the whole device onto the site makes it very difficult to ensure that the spatial relationship of projector and camera will always be the same, even approximately, yet this is necessary for the actual processing of the data and a condition of the quality of the results. From this fact comes the need to establish this spatial relationship through calibration files, which capture the image of the grid or grids on a known object (either a plane, or a box which is itself 'configured' with a regular, known pattern applied to it). It has to be stressed that the greatest care must be taken with this reference object, as any deformation of it inevitably leads to flaws in the quality and homogeneity of the measurements captured. A calibration file thus has to be processed every time the spatial relationship between camera and projector is altered. However, if the set-up is good enough to accommodate a whole series of objects, or different parts of the same large object, then hundreds of captures can be made with the same fixed configuration, with the possibility of putting the whole rig on wheels to move along or around the studied objects. In this case the camera needs to be kept at the same distance from, and angle to, the object to maintain reasonable homogeneity and quality in the captured data. Very simple devices can be used to this effect, such as markings on the ground showing the different predefined positions of the camera, and a track enabling the operating staff to check the relative position of the camera-projector rig with respect to the surface of the object, using, for example, laser pointers to keep the rig aligned to the predefined track.

With these precautions taken, the capture device can be used to record the object from different, complementary angles. The technology being optically based, it has to be kept in mind that we are in fact taking 3D photographs of complex objects. It is thus necessary to take multiple views of the same object to capture the complexity of its topology: either going around it (or turning it in front of the camera if we are dealing with small portable artefacts), or panning in front of it from different directions and angles to cover as much as possible of its geometric specifics. This then leads to the difficult challenge of putting these different views of the object together once the original measured data has been processed.
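The next paragraph describes how this registration stage is handled in practice, either with targets or with automatic surface fitting. As a rough illustration of what the automatic approach does, here is a minimal point-to-point ICP-style sketch; it is not the algorithm of any specific scanning package, and it assumes the two scans already overlap substantially and start from an approximate manual alignment.

    import numpy as np

    def best_rigid_transform(src, dst):
        """Least-squares rotation R and translation t mapping src onto dst
        (rows correspond), via the usual SVD construction."""
        cs, cd = src.mean(0), dst.mean(0)
        H = (src - cs).T @ (dst - cd)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:             # avoid a reflection
            Vt[-1] *= -1
            R = Vt.T @ U.T
        return R, cd - R @ cs

    def icp(moving, fixed, iterations=30):
        """Repeatedly match each moving point to its nearest fixed point, then
        solve for the rigid motion that best aligns the matches."""
        current = moving.copy()
        for _ in range(iterations):
            d = ((current[:, None, :] - fixed[None, :, :]) ** 2).sum(-1)
            nearest = fixed[d.argmin(axis=1)]
            R, t = best_rigid_transform(current, nearest)
            current = current @ R.T + t
        return current

Production pipelines use k-d trees for the nearest-neighbour search, reject poor matches, and often minimise point-to-plane rather than point-to-point distances, but the loop above is the essence of surface-based registration.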
This registration stage can be solved either by the use of specific targets present in the scene or on the object, or, with most of the existing software packages, by a simple, robust and automatic fitting of the entities through common surfaces that are defined topologically or through the precise 'mapping' of the pixels of the captured images. All this makes it very important to have as much common ground between the captured images as possible, to ensure a steady workflow and precise results during the post-processing of the data. In comparison with the data gathered through metrically based 3D measurement and scanning, which generate only (x, y, z) text files, archiving is an issue to be taken seriously, as the original raw data consists of a large number of high resolution images; but current progress in storage makes it possible to deal with this with some confidence. It should also be kept in mind that the images thus gathered are precious 'portraits' of the object which, used as texture maps, result in a very rich and natural rendering of complex artefacts.

We can thus describe this technique as very straightforward, cost and time effective, once specific methodological precautions are applied during the original capture stage of the work. Its portability, flexibility and robustness make it a precious tool for those who are dealing with fragile objects, or who want to capture as much as possible of the information that gets destroyed daily during excavations.

2.3 Slides

Creating the Models. Paul Debevec, USC Institute for Creative Technologies; Philippe Martinez, École Normale Supérieure.

3D Scanning Overview

Triangulation

Time of Flight

Recent 3D Scanning Projects

Low-cost, high speed, sculpture scanner: structured light approach
• Projector sends out a unique signal pattern in each direction
• Camera records signals returning from each direction, and analyzes the pattern

Advantages/Disadvantages

Pixel Accuracy

Pixel Close-up

Sub-pixel Accuracy (projector and camera)

Sub-pixel Curve Modeling. Chris Tchou, Master's Thesis 2002.

3D Scanning Parthenon Sculptures. Basel Skulpturhalle, October 2001; Musée du Louvre, October 2001.
Additional Evidence: Drawings can provide additional lost information. How can this be incorporated? 3D scan, 2001; Carrey drawing, 1674.

Scanning Casts: Sometimes in better condition than originals. Original; scan of cast.

Scan with and without texture

Scanning Environments

Rendering Archaeological Models with Global Illumination and Image-Based Lighting

Acquiring Real-World Illumination

Outdoor Light Probes

Untextured model rendered with real-world illumination

Lighting Concept Drawings by Mark Brownlow

Computer model of the Parthenon, c. 1830, illuminated with image-based lighting, Arnold global illumination, depth of field

Model of the contemporary Parthenon, illuminated by the evening light of Marina del Rey, CA

Model of the Christian Parthenon, c. 1000 AD, showing the Apse addition

Computer model of the Duveen Gallery in the British Museum, site of many of the Parthenon sculptures

Rendering of a computer scan of a cast of West Panel II of the Parthenon frieze in the Basel Skulpturhalle

Rendering of a computer scan of the head of a Caryatid cast scanned in the Basel Skulpturhalle

Modeling and Animation: Brian Emerson, Craig "X-Ray" Halperin, Mark Brownlow, Yikuong Chen, Diane Suzuki, Hiroyuki Matsuguma, Jamie Waese, Rippling Tsou, Shivani Khanna, Patrick Lee. Arnold Rendering Software: Marcos Fajardo. HDR Image Processing: Chris Tchou. Archaeological Consultant: Philippe Martinez. Sculpture Scanning: Chris Tchou, Tim Hawkins, Paul Debevec, Philippe Martinez. Scanning Hardware: Tim Hawkins, Chris Tchou, Paul Debevec. Scanning Software: Chris Tchou, Jonathan Cohen, Fred Pighin. Video Editing: Paul Asplund. 3D scanning made possible by Tomas Lochman of the Basel Skulpturhalle, Jean-Luc Martinez of the Musée du Louvre, and with the support of TOPPAN Printing Co., Ltd.

2.4 A photometric approach to digitizing cultural artifacts: slides

This section corresponds to the "A Photometric Approach to Digitizing Cultural Artifacts" paper [23] included in the Appendix.

A Photometric Approach to Digitizing Cultural Artifacts

Current 3D Scanning Process
§ Acquire laser or structured light scans from different angles
§ Merge scans into a complete model
§ Register photographs to determine object appearance or reflectance
§ Project and merge photographs onto geometry
Deriving Object Appearance
§ Approach 1: Project photos onto object as texture maps (Façade 96, Pulli 97, Miller 98, Wood 2000); lighting is fixed
§ Approach 2: Perform reflectometry to estimate diffuse and specular components for each surface point (Sato 97, Rushmeier 98, Marschner 99, Yu 99, Levoy 2000); still not general

Deriving Textures
§ Approach 2: Derive lighting-independent textures
§ Photograph surfaces under known illumination conditions
§ Estimate diffuse albedo and specular components for each surface
§ Disadvantages: difficult to control illumination; complex reflectance models require many observations; models do not extend to translucency, subsurface effects

Strengths of current 3D Scanning
• Works well for plaster and marble sculptures
• Well-defined geometric surfaces
• Mostly diffuse reflectance
• Limited mutual illumination
• Limited self-occlusion

Challenging cases for current 3D scanning
• Complex geometry (fur, plants)
• Non-diffuse reflectance: shininess (esp. spatially varying), anisotropy
• Translucency (jade, gems)
• Interreflection (anything non-convex and light-colored)

The Reflectance Field (Debevec et al. 00): How an object transforms incident illumination into radiant illumination

Capturing Illumination

Captured Lighting Environments

Light Stage 1.0

Sample Images

4D Reflectance Field Data

An Illuminated Reflectance Field

Re-Illuminated Data

Capturing Translucency

The Need for High Dynamic Range

Future work: Capturing from all viewpoints

View Interpolation

Conclusion: What do you do with the virtual object?
• View it from different angles
• Illuminate it differently
• Place it in virtual surroundings
• Compute its volume
• Manipulate it
• Feel its shape and materials
• Sense its smells
• Analyze its composition

Thanks!
• George Randall – collection owner
• Maya Martinez – production coordinator
• Chris Tchou – rendering programs
• Dan Maas and Chris Tchou – real-time demo
• Brian Emerson – 3D modeling
• Andrew Gardner – video editing
• Bobbie Halliday and Jamie Waese – production
• USC ICT, TOPPAN Printing, Inc. – support

Chapter 3
Very Realistic Lighting Simulation
by Greg Ward, Exponent.

3.1 Slides
The accuracy you require for your simulation depends on the application. It may not be as great as you think. If you only want to see what it would have looked like, you can often get by with relatively simple and inexpensive measurement techniques.

Tape Measure or Depth Scanner?
• Tape measure requirements and accuracy: digital camera and an assistant (both optional); centimeter accuracy
• Depth scanner requirements and accuracy: scanner + computer + set-up + data reduction; millimeter accuracy
The modeling step following each measurement may require more or less time depending on the tools available and geometric complexity.

Tape Measure Example
• Accident reconstruction (recent archaeology)
• Needed to know the driver's eye height
• Photo with tape measure followed by computer modeling
• Centimeter accuracy
Use a tape measure that is visible in photographs and/or record measurements on audio or in a notebook. Photogrammetry may also be used in scenes containing straight lines.

Depth Scanner Example
• Pietà Project (www.research.ibm.com/pieta)
• Multi-baseline stereo camera with 5 lights
• Captured geometry and reflectance
• Sub-millimeter accuracy
The technically involved procedure of measuring artifacts or sites with laser scanners requires painstaking set-up and tedious data reduction. The results can be quite impressive.

Macbeth Chart or Spectrophotometer?
• Macbeth chart requirements and accuracy: ColorChecker chart and a digital camera; accurate to about 8 ΔE (1994 CIE Lab)
• Spectrophotometer requirements and accuracy: hand-held spectrophotometer; accurate to about 1 ΔE

Macbeth Chart Example
• Digital photo with ColorChecker under uniform illumination
• Compare points on image and interpolate
• Best to work with an HDR image
• Accurate to ~8 ΔE
Need to make sure you have diffuse illumination at a uniform angle and no highlights. Average over areas to avoid problems with noise in the image and texture on the object. Values are difficult to interpolate from a standard image.

Spectrophotometer Example
• Commercial spectrophotometers run about $5K
• Measure reflectance spectrum for simulation under any light source
• Accurate to ~1 ΔE
(Plot: measured reflectance spectrum, wavelength 380-720 nm.)
Some spectrophotometers can separate diffuse and specular components.
Most devices record reflectance at 10 nm increments over the 400-700 nm range. Measuring textured materials is a problem.

Aerial Photo or Site Survey?
• Aerial requirements and accuracy: satellite photos or (better) fly-over; 1-10 meter accuracy, usually without elevation
• Site survey requirements and accuracy: GPS or traditional surveying equipment; 1-10 centimeter accuracy, with elevation

Aerial Photo Example
• Giza Pyramids
• Fly-over aerial photo shows positions of pyramids and tombs
• Requires perspective correction
• Accuracy is ~5 m
Taken from the website http://sphinxtemple.virtualave.net

Site Surveying
• Traditional instruments measure point-to-point
• GPS equipment measures absolute position
• Accuracy 1-10 cm
Taken from the website http://www.johann-sandra.com/surveying1.htm

2. Simulation / Rendering
• Radiance Input Requirements
• Rendering Time and Accuracy
• Output Options
The Radiance website is http://radsite.lbl.gov/radiance. The Siggraph '94 paper is in the Appendix.

Radiance Input Requirements
• Radiance is a physically-based lighting simulation and rendering tool from LBNL
• Takes geometry and RGB reflectances (also BRDFs, procedural textures, etc.)
• Prefiltering with the source illuminant yields accurate colors (~2 ΔE)
• Give attention to source photometry!
See "Picture Perfect RGB Rendering Using Spectral Prefiltering and Sharp Color Primaries" on the course notes CD-ROM.

Example Radiance Image
Simulation of the San Francisco air traffic control tower, created to examine different shading devices and monitor equipment. The model was created from many measurements, including spectrophotometry and captured textures and patterns.

Rendering Time and Accuracy
• Diffuse interreflection and output resolution are the main parameters
• Many other parameters control the time and accuracy of direct, specular, etc.
• A user-friendly front-end program called "rad" is handy for controlling rendering
Computers are so much faster than when Radiance was written; it can now handle huge models with difficult lighting quite easily. Other methods such as Monte Carlo path tracing start to be competitive.

Rad Parameter Settings

Rendering Quality Comparison
7 seconds, 1.5 minutes, 2 hours: Low, Medium, and High quality rendering parameters as set by "rad".

Output Options
• The Radiance picture format is gaining popularity in HDR imaging
• 4-byte RGBE uses a common exponent per pixel
• XYZE format for photometric images
• Converters to and from other formats, including LogLuv TIFF
• Interactive rendering for previewing
• Direct numerical output is also supported
The usual output from Radiance is a picture, but numerical output and direct ray-tracing control are possible with the rtrace program. Interactive rendering is also supported with rview and rholo.

Interactive / False Color / Panoramic
Interactive rendering allows us to preview our model and find good views. False color images allow us to analyze light levels. Panoramic images permit QTVR viewing of the scene.

3. Visualization
• Numerical visualization
• False color images and plots
• Visibility analysis: how well could people see?
• Tone-mapping
• Visibility-matching tone operator
• High Dynamic Range Display
Visibility-matching tone operators include Larson et al. '97 (in the Appendix) and Pattanaik et al. (Siggraph '98).

Tone-mapping Goal: Colorimetric
We cannot represent the entire range on the display, so we end up clamping.

Tone-mapping Goal: Optimize Contrast
Produces a nice-looking result, but what does it mean?

Tone-mapping Goal: Match Visibility
Goal: if we can see it in the real world, we can see it on the display, and vice versa.

Operator Comparison
(Plot: log10 display luminance versus log10 world luminance for the visibility-matching, contrast and clamped operators.)
Clamping is obvious. Differences in the other operators are more difficult to understand. All examples here are global operators; spatially varying operators are more interesting and potentially more powerful.

• Match contrast sensitivity
• Scotopic and mesopic color sensitivity
• Disability (veiling) glare
• Loss of visual acuity in dim environments
Contrast and color adjustments are global over the image; glare and acuity simulation are local.

Dark and light colorimetric exposures vs. histogram adjustment.

Contrast & Color Sensitivity
Matching contrast and color sensitivity yields a more representative visualization.

Veiling Glare Simulation
Glare is caused by scattering in the lens, iris, and aqueous humor, and results in a veil cast on nearby photoreceptors.

• Good dynamic range, tunable gamut; widely used for still projection systems; already in trials for digital cinema
• Amazing dynamic range, widest gamut; still in development; promising for digital cinema
These projection systems are under development, and may provide wider gamuts and greater dynamic range than current LCD-based systems.
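Returning to the operator comparison above, the difference between simple clamping and a global tone-mapping operator can be made concrete with a small sketch. This is not Ward's histogram-adjustment or visibility-matching operator; it is a generic log-average exposure followed by a compressive curve, shown only to illustrate what a global operator does.

    import numpy as np

    def clamp_tonemap(hdr, exposure=1.0):
        """Naive operator: scale and clip. Everything brighter than the display
        white is lost, which is what the clamping example shows."""
        return np.clip(hdr * exposure, 0.0, 1.0)

    def global_tonemap(hdr, key=0.18, eps=1e-6):
        """Generic global operator: normalise by the log-average luminance, then
        compress with L/(1+L) so very bright values roll off smoothly."""
        lum = 0.2126 * hdr[..., 0] + 0.7152 * hdr[..., 1] + 0.0722 * hdr[..., 2]
        log_avg = np.exp(np.mean(np.log(lum + eps)))
        scaled = key * lum / log_avg
        display_lum = scaled / (1.0 + scaled)
        return hdr * (display_lum / (lum + eps))[..., None]

A visibility-matching operator goes further, shaping the curve from measured human contrast sensitivity so that what is visible (or invisible) in the real scene stays visible (or invisible) on the display.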
• 1024x768 resolution
• 60,000:1 dynamic range
• 2,000 cd/m2 maximum luminance

• 2048x1536 resolution
• 70,000:1 dynamic range and 30,000 cd/m2 maximum luminance
Working with a Canadian research group to develop systems and applications for high dynamic range imaging.

High Dynamic Range Viewer
The first prototype was developed at LBNL 10 years ago. The paper from the PICS conference is included on the course CD-ROM.

Further Reference
• viz.cs.berkeley.edu/gwlarson: publication list with online links; LogLuv TIFF pages and images
• www.debevec.org: publication list with online links; Radiance RGBE images and light probes
• radsite.lbl.gov/radiance: Radiance rendering software and links

Chapter 4
Quantifying Realism
by Alan Chalmers, University of Bristol, UK.

The appearance of computer reconstructions of archaeological sites can often owe more to the imagination of the artist than to the physics of the model. The rendered images may look 'real', but how can their validity be guaranteed? One approach is to render images based on as much real-world data as is obtainable, rather than on a preconceived aesthetic. Much research has been undertaken into accurately modelling archaeological sites and reconstructing incomplete structures. Unfortunately, the luminaires used by standard modelling packages to render these scenes tend to be based on parameters for daylight, filament bulbs or cold fluorescent tubes, rather than lamp or candle light.

Before the advent of modern lighting, illumination within ancient environments was dependent on daylight and flame. The lack of windows in ancient environments (certainly the case in caves), and later the facts that glass, where available, was expensive and that windows were often inappropriate for defence, meant that even during daylight hours some form of firelight was necessary for interior illumination. The fuel used for the fire directly affects the visual appearance of the scene. Furthermore, flames are not static, and their flicker may create patterns and moving shadows that further affect how objects lit by these flames might look. Any realistic reconstruction of an archaeological site must take into account that these would have been flame-lit environments, and thus the reconstructions should incorporate not only the accurate spectral profile of the fuel being burnt, but also the manner in which the flame moves over time.

4.1 Luminaires

"Command the Israelites to bring you clear oil of pressed olives for the light so that the lamps may be kept burning." Exodus 27:20

This section describes techniques which combine experimental archaeology and psychophysics with computer graphics and vision, in order to simulate flame-lit environments to new levels of visual appearance and accuracy.

4.1.1 Creating a Realistic Flame

Modelling the shape of a flame mathematically is complex. An alternative approach is to incorporate video footage of a real flame into the virtual environment. Correctly included in the virtual environment, the video provides a realistically shaped flame, while the illumination from the flame within the virtual environment can be computed from the size and position of this real flame.
To capture the correct shape of a flame, and following this its movement, the first step is to film a real flame. This is best done using a high quality digital video camera. The film can then be transferred onto a computer and, using the techniques described below, the flame from each frame can be incorporated into the virtual environment.

'Green Screen' Technique

The simple technique of blue-screening, widely employed in the film industry, can be used to cut an object out of its background surroundings. Filming the flame against an evenly coloured, matt background enables thresholding of each frame, which is used to identify and discard a background colour, effectively separating the flame from any unwanted parts of the scene. The background colour should be chosen so that it does not occur within the foreground. 'Blue screen' uses thresholding techniques to achieve this: thresholding is the process of identifying a range of colours and changing all areas of an image that fall within this colour range to another specified colour. Simple, solid objects can easily be separated from a background, but a flame brings some difficulties of its own:

It is useful to have static, even lighting on the background to simplify the thresholding process. Filming a flame clearly creates a problem in that the flame is itself a light source, and it is moving, disrupting the otherwise static lighting and producing lighting effects on the background.

Parts of a flame may be translucent or partly transparent, so on film the background colour may seep through into what we would identify as the flame. This is hard to avoid, but can be compensated for later by deliberately seeping the background colour of the modelled scene into the flame.

The efficiently burning part of the flame, around its base, is generally blue in colour, so rather than using a blue screen, another colour, such as green, should be employed.

Once an object has been separated from its background, it can be placed in front of other backgrounds. If used sensibly, the object can be blended into the new background to give the appearance of an unaltered, original scene.

Capturing the Flame

As discussed above, care has to be taken when filming in front of the green screen, to simplify the separation of the flame from its background. The green screen should be as evenly coloured as possible, and with as little shine as can be achieved. Simple measures, such as moving the flame as far from the screen as possible, help to achieve this. Figure 4.1 shows the required set-up. It is important to capture as much of the actual flame as possible, and as the melting of candle wax at the base of the flame creates a substantial depression, care has to be taken to keep the whole flame visible.

Figure 4.1: Green Screen filming

The video is streamed onto a computer, and thresholding is used to separate the background. The digital video has to be broken down into individual frames before this can take place, and a suitable file format used that allows manipulation of the picture files. Common picture formats use different amounts of encoding to reduce their size, so they need to be decoded before any changes can take place. It is also important at this stage to preserve as much accuracy as possible, so lossy formats such as JPEG should be avoided. A rate of 25 frames per second is used, which is sufficient to provide smooth, flowing results.
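A minimal version of the thresholding step just described might look like the following. It is only a sketch, not the authors' implementation: it assumes the frames have already been extracted to lossless images, that the background is a reasonably pure green (the colour and tolerance values are placeholders to be tuned per shoot), and it simply marks as 'flame' every pixel that is not close to the background colour.

    import numpy as np

    def flame_mask(frame, bg_rgb=(0, 180, 0), tolerance=60):
        """Return a boolean mask that is True where the pixel does NOT match the
        green background, i.e. where we take the flame to be.

        frame     -- uint8 array, shape (H, W, 3), one decoded video frame
        bg_rgb    -- assumed background colour (sampled from the real screen)
        tolerance -- maximum distance from bg_rgb still counted as background
        """
        diff = frame.astype(np.int32) - np.array(bg_rgb, dtype=np.int32)
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        return dist > tolerance

    def matte_flame(frame, mask, matte_rgb=(0, 0, 0)):
        """Replace everything outside the mask with a flat matte colour, so that
        later stages see only the flame."""
        out = frame.copy()
        out[~mask] = matte_rgb
        return out

In practice the semi-transparent edges of the flame are better handled with a soft matte (a weight between 0 and 1) than with the hard yes/no mask shown here, which is one reason the text suggests seeping the new background colour back into the flame when compositing.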
Correct thresholding should immediately identify the area in which the flame exists. One way the flame could be represented is with an enclosed edge map giving the outline of the flame, either in a picture file or as a series of connected coordinates.

To ensure high fidelity reconstructions, rendering is carried out using Radiance [28]. Light sources in Radiance are best constructed from various predefined geometric shapes, such as cones, cylinders and spheres. The simplest way of representing the shape of the flame is with spheres. This involves finding the top and the base of the flame and dividing the space in between into a number of segments, determined by how many spheres will be used to model the flame. On each of the dividing lines a sphere is created, the radius and centre point of which can be found by looking for the boundaries of the flame along the dividing line. This is achieved using the thresholded image. Each pixel on a line is tested, starting from the sides of the image and working in towards the flame. When a non-background colour is found, that pixel represents the edge of the flame on one side. The same is done from the other side, until two coordinate positions are found, representing the flame boundaries on that particular line.

Figure 4.2: Basic flame

Figure 4.3 shows the flame represented by just 3 spheres: it is clearly not modelling the shape of the flame accurately, shown by the large amounts of yellow ‘flame’ that are not covered by the green spheres. Increasing the number of spheres should increase the accuracy of the representation. However, using more spheres leads to greater complexity in the rendering process, as each of the spheres will be an individual light source. Note: care must be taken if the spheres overlap each other, as the manner in which Radiance implements the illum material can mean this leads to problems of self-shadowing amongst the spheres. This will need to be accounted for during the rendering process.

Figure 4.3: 3 sphere representation
Figure 4.4: 7 sphere representation

As Figures 4.3–4.5 show, the theoretical benefit of adding more spheres to the representation falls off quite quickly, the jump from 3 to 7 spheres being more useful, in terms of the percentage of the flame covered, than raising from 7 to 15 spheres. Also, there is a large amount of duplication as more spheres are used, leading to an inefficient representation which will result in a much longer rendering time.

The above simple method for obtaining sphere data is flawed when the flame is not vertical, overemphasising the size of the flame as shown in Figure 4.6. To deal with this, the centreline of the flame should be found, running from wick to tip. By using the normal to this line, a more accurate description can be produced. Figure 4.7 shows how, with a rough red central line found for the flame, normals to this line can be used to divide the flame up in the same manner as described above. This time, by working outwards from where the centre line intersects each normal, the boundaries for the flame can be found and a sphere created.

To define a sphere in 3D space we need a centre point (x, y, z) and a radius. A flame is created using many of these sets of data, one for each sphere. This information then needs to be output to a text file. The information is currently relative to pixels, not to any real physical size. This means that using different resolution images of the same flame at a particular point in time will give different data for the output spheres. This does not matter, though, as the dimensions of each sphere and the positions of all the spheres are relative. The spheres will need to be scaled suitably to fit in with the modelled environment.
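A minimal sketch of this simple, vertical-flame fitting step is shown below. It reuses the Pixmap type and background fill colour from the keying sketch above; the array names and the sphere count are illustrative, and the implementation section later describes the fuller version that also handles tilted flames via the centre-line normal.

    #define SPHERES 7     /* number of spheres used to model the flame (illustrative) */

    /* Fit SPHERES spheres to a roughly vertical flame in a thresholded pixmap.
       topH and baseH bound the flame vertically; background pixels carry the
       fill colour COLR, COLG, COLB.  Results go into centreH, centreW, radius. */
    void fit_spheres(Pixmap img, int topH, int baseH,
                     double centreH[], double centreW[], double radius[])
    {
        int i, leftW, rightW;
        for (i = 0; i < SPHERES; i++) {
            /* dividing line for this sphere, evenly spaced between tip and base */
            int h = topH + (i + 1) * (baseH - topH) / (SPHERES + 1);

            /* scan inwards from the left edge until a non-background pixel is met */
            for (leftW = 0; leftW < WIDTH; leftW++)
                if (img[h][leftW][0] != COLR || img[h][leftW][1] != COLG ||
                    img[h][leftW][2] != COLB)
                    break;

            /* scan inwards from the right edge for the opposite boundary */
            for (rightW = WIDTH - 1; rightW > leftW; rightW--)
                if (img[h][rightW][0] != COLR || img[h][rightW][1] != COLG ||
                    img[h][rightW][2] != COLB)
                    break;

            centreH[i] = h;                          /* centre lies on the dividing line   */
            centreW[i] = (leftW + rightW) / 2.0;
            radius[i]  = (rightW - leftW) / 2.0;     /* half the flame width at this height */
        }
    }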
There are some problems with representing the flame in this manner. Firstly, it is quite likely that some of the spheres will be larger than the part of the flame they are attempting to model. This is particularly the case near the base and tip of the flame, or any part where the gradient of the edge is changing quickly. These bigger spheres mean slightly more light will be produced than is wanted. Another problem is the computational expense. Figure 4.7 demonstrates the difference between the 3, 7 and 15 sphere representations. With 15 spheres there is massive overlapping, and only very small parts of the original 3 and 7 spheres can be seen. It would be far better to find a way of converting the numerous spheres into one object, cutting out the duplication.

Figure 4.5: 15 sphere representation
Figure 4.6: Oversized model
Figure 4.7: 3, 7 and 15 sphere representations

4.1.2 Converting Pixel Information to Radiance Files

The above method creates data on a pixel level, but to be used in Radiance this data needs to be scaled and converted from raw data. Scaling can only be achieved by judgement and trial and error within the Radiance scene. A Radiance definition for a sphere is as follows:

    candlelight sphere sphere1
    0
    0
    4 x y z r

where candlelight is a previous declaration of the material type to be used, sphere describes the object to be created, and sphere1 is the name of the new object. The next two lines (both zero) show that there are no string or integer arguments, and x, y, z and r represent the (x, y, z) coordinates of the centre and the radius. The candlelight declaration will be used to describe the spectral properties of the light emitted from this sphere, in terms of a red, a green and a blue component. A file containing the above description (with x, y, z and r replaced with suitable numbers) can be used in Radiance to create a single sphere. Many of these descriptions will be used to create the flame.

4.1.3 Creation of a Radiance Scene

With a Radiance file created for each frame of flame, this now needs to be incorporated into a Radiance scene. The only requirement for this part is that the scene be interesting enough, and able to show the movement of the flame and the light emitted from it. In Radiance there are several different materials that can be used as light sources. Most commonly used is light, but in this case light would be inappropriate, as the light source itself remains visible when viewed directly. What is needed is an invisible light source, which is provided by the material type illum. When viewed directly an object made from illum is invisible, but light is still emitted from it. Accordingly, the illum material type will be used to create the flames.

4.1.4 Creation of Final Images

With pictures rendered, it is now necessary for the original picture of the flame to be pasted in the right place on the scene. Some care needs to be taken here to accurately position the flame, and also to blend it into the scene rather than simply sticking it on. As discussed earlier, parts of the flame may contain some of the old background colour within them, so this needs to be compensated for with the new background colour. With the flame replaced in all of the rendered frames, these can be played in sequence, at 25 fps, to create the finished animation. Following the above process should provide a path from the original film of a flame to another animation, with the same flame, but in a completely different, realistic environment.
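Tying sections 4.1.2 and 4.1.3 together, the sketch below writes one frame's flame as a Radiance scene file from C. It is a hedged illustration rather than the original generator: the emission colour and the pixel-to-scene scale factor are assumed values, and it relies on the standard Radiance illum syntax with a void alternate material so that the spheres emit light yet stay invisible to direct view.

    #include <stdio.h>

    /* Write one frame's flame as a Radiance scene file: an invisible (illum)
       "candlelight" material followed by one sphere primitive per fitted sphere.
       Colour values and the scale factor below are illustrative assumptions.     */
    int write_flame_rad(const char *path, int n,
                        const double x[], const double z[], const double r[])
    {
        int i;
        FILE *rad = fopen(path, "w");
        if (!rad)
            return -1;

        /* candle-coloured illum: emits light but is invisible when viewed directly */
        fprintf(rad, "void illum candlelight\n1 void\n0\n3 5.0 3.5 1.5\n\n");

        for (i = 0; i < n; i++)
            fprintf(rad, "candlelight sphere sphere%d\n0\n0\n4 %f %f %f %f\n\n",
                    i,
                    x[i] * 0.0005,   /* pixel -> scene units (1/2000, illustrative)   */
                    0.0,             /* flame filmed in one plane: depth held constant */
                    z[i] * 0.0005,
                    r[i] * 0.0005);

        return fclose(rad);
    }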
4.1.5 Automation

It is very important that the whole process should be automated. When filming at a rate of 25 fps, many, many frames need to be produced to get any reasonable length of animation out. The programs for thresholding, creating flame data and replacing the flame must work for all the frames at once, without needing to change any settings. If the process works for the first frame, it must work for all the frames. For the rendering itself this is less important, as a complex scene may currently take many hours to render, so due to hardware constraints frames may have to be processed individually anyway.

4.2 Implementation

4.2.1 Flame to Film to Frames

After filming the candle flame in front of a green screen, a file format was chosen into which to convert the digital video. Binary .ppm (Poskanzer Portable Pixmap) files were chosen for several reasons. Firstly they are not lossy, and although ppm files are not widely used in Windows environments, they are a standard on Unix and can easily be converted to and from Radiance pic files using ra_ppm, a built-in Radiance program. For instance,

    ra_ppm input.pic output.ppm

would convert a Radiance image (pic file) into a .ppm image. Another reason is that .ppm files can be manipulated using standard image processing techniques, such as those provided by the Image Processing Library (IPLIB). This provides C functions for decoding ppm files, enabling them to be altered. The function readppm reads in a .ppm file and converts it into a pixmap, a 3D array. The first two indices represent pixel height and width; the third index of the array represents the red, green and blue parts of the image, stored at indices 0, 1 and 2 respectively. For instance, the statement

    pixmap[15][34][1] = 240;

would assign the green value of the pixel at (15, 34) to 240. The values that can be contained in the array are unsigned chars, i.e. any value from 0 to 255.

The video was streamed onto a computer and split into .bmp frames using Adobe Premiere, and then converted into .ppm files using Paint Shop Pro. The frames must be ordered with a 3-digit sequence number prior to the file extension: if the first file is called flame000.ppm, the second must be flame001.ppm, and so on. This is a quick yet necessary process to turn the film into suitable media. Figure 4.8 shows one frame from the original candle movie, shot against a green screen.

The original film was recorded at a rate of 25 frames per second, which leads to a great number of individual frames. Up to this starting point, other programs have been employed to turn the original footage into a series of images. It makes sense to use these packages to get to this early stage, but from this point onwards new programs need to be written. Programs (written in C) are needed that can take a large series of individual images and quickly create data representing each of the flames within the images. Part of this also involves separating the flame from the green screen background. This leaves a series of flames that can be used in a second program: code that blends the individual flames onto the corresponding rendered scenes to create the final realistic images and animations.

Figure 4.8: The original candle flame
Figure 4.9: Get Flame program sequence
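Because every stage must run unattended over hundreds of frames, a small driver that walks the numbered frame files keeps the whole chain hands-off. The sketch below shows one plausible shape for it; the frame count and the process_frame stub are illustrative stand-ins for the real thresholding, sphere-fitting and Radiance-output steps.

    #include <stdio.h>

    #define FILES 250     /* number of frames to process: 10 seconds at 25 fps (illustrative) */

    /* stand-in for the real per-frame work: threshold the frame, fit the
       spheres and write the corresponding Radiance scene description        */
    static int process_frame(const char *ppm_name, const char *rad_name)
    {
        printf("processing %s -> %s\n", ppm_name, rad_name);
        return 0;
    }

    int main(void)
    {
        char ppm_name[64], rad_name[64];
        int frame;

        for (frame = 0; frame < FILES; frame++) {
            /* frames follow the 3-digit naming convention: flame000.ppm, flame001.ppm, ... */
            sprintf(ppm_name, "flame%03d.ppm", frame);
            sprintf(rad_name, "f%03d.rad", frame);

            if (process_frame(ppm_name, rad_name) != 0) {
                fprintf(stderr, "failed on %s\n", ppm_name);
                return 1;
            }
        }
        return 0;
    }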
4.2.2 Creating Sphere Data

At each stage in Figure 4.9, functions need to be written to carry out the specified task. The process starts with the original series of images. They are first reduced in size through cropping, to remove unwanted background where the flame does not appear. The next stage is to threshold the image to separate the background from the flame itself. The crux of the process, creating the sphere data, is then undertaken. Finally, this is converted into Radiance files that can be used in the rendering process.

Cropping

The larger the picture files, the more work will need to be done, so it is important to get rid of unnecessary data at an early stage. It is also important to remove any candle or base that lies beneath the flame (Figure 4.10). Accordingly, two sets of coordinates are required, one specifying the top left corner of the cropped picture and the other the bottom right. These coordinates are entered into the definitions section at the beginning of the program. As they are only entered once for the whole film, it is important to allow enough room so that the moving flame will always remain within the specified boundaries.

Figure 4.10: Original flame image to cropped flame

Thresholding

Thresholding replaces a specified range of colour values with a single colour. As shown in Figure 4.11, this has been used to separate the flame from its background, which has been turned to a single colour, in this case black. Some of the heat haze around the central flame has also been captured, to add to the realism of the picture. Changing the threshold settings can alter the amount of haze captured; if just the pure flame is required with no haze, this can be achieved by altering the inputs slightly. These parameters are located in the definitions section at the start of the getFlame program file. Maximum and minimum values can be specified for each of the red, green and blue channels to separate those parts of the image that are to be removed from those which are to be kept.

Figure 4.11: Cropped flame to thresholded flame

The following fragment shows the thresholding of one pixel. The pre-threshold image is stored in one pixmap (here in) and the new, thresholded image in a second pixmap (out):

    if (in[h][w][0] >= RLOW && in[h][w][0] <= RHIGH &&
        in[h][w][1] >= GLOW && in[h][w][1] <= GHIGH &&
        in[h][w][2] >= BLOW && in[h][w][2] <= BHIGH) {
        out[h][w][0] = COLR;          /* background: replace with the fill colour */
        out[h][w][1] = COLG;
        out[h][w][2] = COLB;
    } else {
        out[h][w][0] = in[h][w][0];   /* flame: keep the original values */
        out[h][w][1] = in[h][w][1];
        out[h][w][2] = in[h][w][2];
    }

If the pixel's red, green and blue values are within the designated colour range, specified by RLOW, RHIGH, GLOW, GHIGH, BLOW and BHIGH, it is considered unwanted background and is replaced in the new image by another colour, specified by COLR, COLG and COLB. If its colour is not within the boundaries, the values of the original image are retained in the new image. Suitable threshold values were chosen to remove the green background used here; with different colour backgrounds they can easily be altered.

Finding Spheres

The number of spheres used to model each flame can be specified at the start of the program using the SPHERES definition.
This parameter can be used to change the balance between the accuracy of the flame and the speed of the rendering. A higher number of spheres leads to a better representation, but one that will take longer to render. The spheres are found using the method previously described. Some key code fragments demonstrating important parts of this process are as follows:

    /* find the uppermost non-background pixel: the tip of the flame */
    for (h = 0; h < HEIGHT && !found; h++)
        for (w = 0; w < WIDTH; w++)
            if (out[h][w][0] != COLR || out[h][w][1] != COLG ||
                out[h][w][2] != COLB) {
                topH = h;
                topW = w;
                found = 1;
                break;
            }

The above code shows a method for finding the top of the flame, needed to work out the central line of the flame. It cycles through the thresholded image and finds the uppermost pixel that is not the same colour as the background (i.e. not COLR, COLG, COLB), which is therefore the highpoint of the flame.

    gradient = (double)(baseW - topW) / (double)(baseH - topH);  /* centre line: width change per unit height */
    normal   = -gradient;                                        /* gradient of the normal: height change per unit width */
    divH     = (double)(baseH - topH) / (SPHERES + 1);
    divW     = (double)(baseW - topW) / (SPHERES + 1);

Once the centre line gradient has been found, the gradient of the normal can also be calculated. At this point the centre line can be divided up into the required number of chunks, specified through the SPHERES definition. divH and divW represent the height and width jumps from one sphere to the next, with (topH, topW) and (baseH, baseW) representing the top and base coordinates of the flame. It is now possible, using the equation of the centre line and starting from the tip of the flame, to work down it, using divH and divW as the decrement values. Each time the decrement occurs a new sphere can be created.

    /* walk outwards along the normal from the centre line until background is met */
    for (step = 0; centreW + step < WIDTH; step++) {
        w = (int)(centreW + step + 0.5);
        h = (int)(centreH + step * normal + 0.5);     /* round to pixel values */
        if (out[h][w][0] == COLR && out[h][w][1] == COLG &&
            out[h][w][2] == COLB)
            break;
        rightH = h;
        rightW = w;
    }

On each decrement, we start with a centre point of the sphere on the centre line. It is then necessary to find the edges of the flame on the normal to the centre line, and from these to calculate a more accurate centre point and radius for the sphere. The above code shows how the equation of the normal is used to find the pixel coordinates on which it lies; these values have to be rounded to keep them as pixel values. The fragment works outwards from the centre point to find the right hand side boundary of the flame, stored in the rightH and rightW variables. The same is then done to find the left hand side boundary.

    /* true centre is the midpoint of the two boundaries; the radius is half
       the distance between them                                              */
    centreH = (leftH + rightH) / 2.0;
    centreW = (leftW + rightW) / 2.0;
    radius  = sqrt((double)((rightH - leftH) * (rightH - leftH) +
                            (rightW - leftW) * (rightW - leftW))) / 2.0;

The real centre point can then be determined, as well as the radius of the flame at that point, as the previous piece of code shows. So for each flame a number of spheres have been defined as a triple: an (x, z) coordinate centre point and a radius. All the sphere data for all the files is output into a single text file called flameinfo.txt. The number of files that are processed is specified by the FILES definition. The first line of this file contains the number of spheres used in each file and the number of files.

Figure 4.12 shows the flameinfo output if SPHERES was set to 5 and FILES set to 4 (i.e. each flame representation comprises 5 spheres). The rest of the text gives the centre coordinates and radius of each sphere. The code also allows extra images to be produced at this stage, showing the centre points for each sphere and the sphere edge boundaries superimposed on the flame image. This is used as a tool to check that the algorithm is working correctly. Figure 4.13 shows two such images. The green dots show the boundary of the flames.
The red dots represent the predicted centre line, calculated by forming the line from tip to base. The blue dots show the actual centres at each particular point in the flame. By taking the blue dots as centre points, and using the distance from each to the closest green point on the normal line as the radius, each of the spheres is formed.

Producing Radiance Scene Files

The next part of the program reads in the file flameinfo.txt and creates the output as Radiance scene descriptions. These are ordered in the same manner as the input files, going from f000.rad up to fnnn.rad.

Figure 4.12: File information

    fprintf(rad, "candlelight sphere sphere%d\n0\n0\n4 %f %f %f %f\n",
            i,
            x * SCALEX,       /* width coordinate read from flameinfo.txt   */
            0.0,              /* depth held constant for the planar flame   */
            z * SCALEZ,       /* height coordinate                          */
            r * SCALER);      /* radius                                     */

The data has been scaled so that it creates flames of the right size within Radiance, through the SCALEX, SCALEZ and SCALER definitions. These may need to be adapted after rendering a scene, to check that the flame is the correct size in the picture. It is hard to predict the relative size beforehand, so actually rendering the scene with the spheres as solid, non-light-emitting objects is perhaps the best way of scaling properly. It was found that, at this pixel level, dividing the original data by a factor of 2000 led to flames being created at the right scale within Radiance.

Figure 4.13: Edge and centre markers for 7 and 20 spheres

Other Matters

Another value that must be specified at the start of the program is the centre point of the candle (i.e. the width value at the base of the wick). This should not change throughout, so it only needs to be specified once. This makes it simpler to find the centre lines of the flames. The program should be written so that it may easily be adapted to suit other needs, when using other film with different characteristics.

4.2.3 Superimposing the Flame

With the rendering completed, a large sequence of images will have been produced. It now remains for the real flame corresponding to each image to be reinserted to create the final scenes. Another program needs to be written to handle this. As the images have retained the same sequential numbering system throughout, it is a simple task to match rendered images with the correct cut-out flame. However, some groundwork must be done to ensure the flame is positioned correctly within the scene. A coordinate position must be specified so that the flame is correctly placed at the top of the candle. It is obvious when this is wrong, as it looks strange in the output, so trial and error is again the best way of achieving this. The actual flame picture may also need to be scaled at this point, if the candle in the scene is further away from or closer to the viewpoint than the original flame was to the camera.

    /* classify each flame pixel by its colour range (exact ranges omitted) */
    if (in_core_range(flame[h][w])) {             /* bright core or dark wick   */
        for (c = 0; c < 3; c++)                   /* place straight on          */
            scene[posH + h][posW + w][c] = flame[h][w][c];
    } else if (in_haze_range(flame[h][w])) {      /* heat haze and blue base    */
        for (c = 0; c < 3; c++)                   /* blend with the background  */
            scene[posH + h][posW + w][c] =
                (flame[h][w][c] + scene[posH + h][posW + w][c]) / 2;
    }
    /* otherwise the pixel is pure background colour: leave the rendered scene alone */

The central parts of the flame are placed straight on. These are the areas where none of the background colour will come through, such as the brightest part of the flame and the darkest parts such as the wick. This is shown in the if and else if parts of the above code.
The surrounding areas, like the heat haze produced by the flame and the blue at the base around the wick, are blended in with the background colour to produce a more realistic effect. This is demonstrated in Figure 4.14. The left image shows the flame placed straight on, with no blending techniques applied. This gives a harsher, unrealistic edge to the flame, with a solid haze. The right image shows the haze somewhat wispier, as would be expected. Also, on the left image the blue base of the flame does not depict the transparency that a real flame gives. The right image is truer, with some of the background showing through the transparent blue base. The program allows different areas of the flame to be identified through a colour range, and each one can then be treated differently. They can be placed straight on, blended in to varying degrees, or not put on at all. Examples of the results can be seen in Figures 4.15–4.16.

Figure 4.14: Different ways of blending the flame

4.3 Converting Luminaire Data

The final problem to overcome is the conversion of luminaire data into RGB values. Because we wish to represent the results on computer displays, we need to break the data down into spectral contributions for the three primary phosphor colours. When this process is performed, the detailed spectrum data from the spectroradiometer is merged into values representing the red, green and blue portions of the spectrum. It is essential that this conversion is calculated in a perceptually valid way, as defined by the CIE (Commission Internationale de l'Eclairage) 1931 1-degree standard observer. This system specifies a perceived colour as a tristimulus value (i.e. three coordinates, X, Y and Z) indicating the luminance and chromaticity of a stimulus as it is perceived in a 1-degree field around the foveal centre. Figure 4.17 shows the colour-matching functions for the X, Y and Z channels. The Y channel measures the luminance of a source, and the X and Z channels contribute to its chromaticity. This information is more useful when broken down as follows. If we let

    x = X / (X + Y + Z)   and   y = Y / (X + Y + Z)

then we can express the exact colour of the red, green and blue sections of the spectrum as (x, y) chromaticity coordinates, disregarding luminance. A canonical set of VDU phosphor chromaticities can then be used for the red, green and blue primaries. Many luminance/chrominance meters will record these coordinates for the CIE standard observer. Fortunately, Radiance comes with a program, rcalc, that can convert the CIE coordinates to RGB values using xyz_rgb.cal.

It is worth re-iterating a couple of points. Firstly, exact modelling of luminaire colour and temperature is usually unnecessary unless high measurement accuracy is required: rendered with the measured values, the image will appear unrealistically tinted, because the eye adapts to the illuminant and the perceived colour is always shifted towards white. Secondly, a reliance on extreme accuracy in these images is probably unwise, because an RGB calculation is by definition an approximation of the colours present. If very accurate colour readings are required, we should calculate the convolution of the emission spectrum of the light source with the reflectance curve of the material under examination.
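For readers who want the conversion spelled out, the sketch below turns a measured XYZ tristimulus value into linear RGB. The matrix shown assumes sRGB/Rec.709 primaries with a D65 white point; that choice is an assumption of this example rather than the phosphor set or the xyz_rgb.cal coefficients used in the original work.

    /* Convert a CIE XYZ tristimulus value to linear RGB, assuming sRGB/Rec.709
       primaries and a D65 white point.  A different phosphor set needs a
       different matrix; negative results mean the colour is out of gamut.
       A meter reporting chromaticity (x, y) and luminance Y can be converted
       back to XYZ first:  X = x*Y/y,  Z = (1 - x - y)*Y/y.                     */
    void xyz_to_rgb(double X, double Y, double Z,
                    double *r, double *g, double *b)
    {
        *r =  3.2406 * X - 1.5372 * Y - 0.4986 * Z;
        *g = -0.9689 * X + 1.8758 * Y + 0.0415 * Z;
        *b =  0.0557 * X - 0.2040 * Y + 1.0570 * Z;
    }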
4.4 Validating Realism

The aim of realistic image synthesis is the creation of accurate, high quality imagery which faithfully represents a physical environment, the ultimate goal being to create images that are perceptually indistinguishable from an actual scene. Advances in image synthesis techniques allow us to simulate the distribution of light energy in a scene with great precision. Unfortunately, this does not ensure that the displayed image will have a high fidelity visual appearance. Reasons for this include the limited dynamic range of displays, any residual shortcomings of the rendering process, and the extent to which human vision encodes such departures from perfect physical realism. Conversely, along many parameters the visual system has strong limitations, and ignoring these leads to an over-specification of accuracy beyond what can be seen on a given display system. This gives rise to unnecessary computational expense.

It is increasingly important to provide quantitative data on the fidelity of rendered images. This can be done either by developing computational metrics which aim to predict the degree of fidelity, or by carrying out psychophysical investigations into the degree of similarity between the original and rendered images. Techniques to compare real and synthetic images, identify important visual system characteristics, and thus produce benefits to the graphics community such as being able to reduce rendering times significantly, have been the subject of two previous courses at SIGGRAPH: “Seeing is Believing: Reality Perception in Modeling, Rendering and Animation” [12] and “Image Quality Metrics” [11]. McNamara et al.'s paper “High Fidelity Image Synthesis” [30], which discusses many of these issues, is included in Appendix A, as is Devlin and Chalmers' application of the above methods, “Realistic Visualisation of the Pompeii Frescoes” [16].

Acknowledgements

We would like to thank Jean Archambeau, the owner of Cap Blanc, for his permission to work at the site and for his interest, and Francesco d'Errico, Kate Robson Brown, Ian Roberts, Chris Green, Natasha Chick and Michael Hall for their input to this work. Many thanks also to the Bristol/Bordeaux Twinning Association and the ALLIANCE / British Council programme (Action intégrée franco-britannique) for their financial support. Much of this work first appeared in: I. Roberts, Realistic modelling of flame, A. Chalmers (advisor), BSc Hons Thesis, University of Bristol, May 2001; and C. Green, The visualisation of ancient lighting conditions, A. Chalmers (advisor), BSc Hons Thesis, University of Bristol, May 1999.

Figure 4.15: Array of gently flickering flames
Figure 4.16: Other candlelit environments
Figure 4.17: The CIE tristimulus curves

Chapter 5
Representation and Interpretation
by Kate Devlin, University of Bristol, UK.

5.1 Introduction

The idea of representing a past environment in the form of an interpretive illustration is by no means a new one. While reconstruction drawings date back to the beginnings of the discipline itself, computer graphics has enabled new dimensions, quite literally, to be added to visual representations of archaeological sites. The archaeological community has embraced this technology, finding in it new methods of presenting the past to a public whose demands grow increasingly sophisticated.
With the advent of ‘media archaeology’ in the form of television documentaries and World Wide Web presentations, and the increasing use of audio-visual displays in museums and interpretive centres, three-dimensional computer graphics provide an aesthetic and convenient means of enhancing an archaeological experience, allowing us a glimpse of the past that might otherwise be difficult to appreciate. To date, however, the emphasis has been on using three-dimensional computer graphics for display purposes, with interpretation and research taking second place to the need for media representations. The current trend for artistic conception and photo-realism in reconstructions is not enough to benefit the archaeological community, and for the archaeologist to use computer-generated environments as a research tool, stricter controls are necessary. It is only when we can explain and quantify the accuracy of the generated image that it can be used for an interpretative purpose.

5.2 A brief history of archaeological illustration

“Without drawing or designing the Study of Antiquities or any other Science is lame and imperfect.” William Stukeley, 1717.

This section provides a brief overview of the development of archaeological illustration, from depictions of monuments in historical manuscripts through to the immersive Virtual Reality worlds created today, outlining the main developments in society and technology that permitted advances to be made. This provides a context for the aspects of representation that will be discussed later in this chapter.

5.2.1 Archaeological illustration: an overview

Illustrations depicting archaeological sites date back to the medieval period, as far back as the recorded interest in antiquities itself. The Renaissance heralded a rediscovery of classical Antiquity and marked a new approach to knowledge that in the UK, for example, was manifested in the foundation of the national academy of science, the Royal Society, in 1660 [34]. This led in turn to the regular meetings of the Society of Antiquaries of London from 1717 onwards, with the aim of “the encouragement, advancement and furtherance of the study and knowledge of the antiquities and history of this and other countries”. Systematic illustration of excavated artefacts took hold, and by the mid-nineteenth century archaeology (rather than antiquarianism) became established as a discipline. Advances in metal-plate engraving for the printing of illustrations, and developments in wood engraving and lithography, permitted the creation of detailed and intricate work. When the pioneering archaeologist Pitt-Rivers published his excavation reports in the late nineteenth century, his illustrations came closer to the standard of work required today than to that of his contemporaries. The advent of photography brought a greater choice of media in the twentieth century, and following the First World War the potential of aerial photography was realised, as was the importance of distribution maps for analytical purposes [1]. The possibility of printing good quality photographic reproductions meant that the ‘realistic’ drawing of the site could be discarded, yet the archaeologist today still works with stylised two-dimensional site plans. Stripped of its subjectivity, the drawing in the form of a plan remains a means of explicitly conveying information.
Computer Aided Design (CAD) was first developed towards the end of the 1970s, initially two-dimensional but overtaken by three-dimensional packages in the second half of the 1980s. CAD packages have found a market in the archaeological community, where they provide a robust, user-friendly means of linking site plans and section drawings to create a complete three-dimensional record of an excavation [15]. Visualisation projects in the UK originated in the late 1980s with Woodwark and Bowyer's reconstruction of the Temple Precinct from Roman Bath, a project that inspired a succession of similar applications elsewhere [32]. Other forms of computer applications were subsequently explored. The use of GIS (Geographic Information Systems) was first seen in archaeology in 1986 in the US. GIS provides a method of combining spatial data and textual information for the purpose of landscape visualisation and analysis, and is in widespread use today. In terms of three-dimensional representations, VRML (Virtual Reality Modelling Language) was recognised as an international standard in 1997, providing a scene description language that permits the viewing and manipulation of a 3D ‘world’. The low cost, ease of use and portability of VRML led to its adoption for interactive explorations of archaeological representations, although interactivity comes at the cost of realism.

Figure 5.1: INSITE project reconstruction

Conversely, the move towards photo-realism in computer graphics inspired ‘accurate reconstructions’ of sites based on excavation reports or standing remains. However, the idea of portraying a subjective interpretation as ‘real’ is no less fanciful than the antiquarians' paintings of the past. Approaches have been taken to quantify the realism of the images offered, such as the INSITE project at the University of Bristol, which sought to accurately simulate the light distribution in a scene using the spectral value of the original light source (Figure 5.1) [10].

Having moved into the twenty-first century, multi-sensory and mixed reality applications now provide further ways to present heritage information. Shaderlamps, a method of graphically animating physical real-world models with projectors (Figure 5.2), has brought three-dimensional computer graphics and animation outside of the computer monitor [35]. From head-mounted displays to the total immersion of the CAVE (Cave Automatic Virtual Environment), Virtual Reality has been embraced as a new way of imparting archaeological representations [3]. Nowadays, museums and heritage centres come replete with audio-visual displays, yielding to the demands of an increasingly sophisticated public. The all-pervading media of television and the Internet have brought archaeology into our homes. The study of the past has become high-tech and sexy: slick, speedy and visually stunning presentation is expected.

Figure 5.2: Shaderlamps: illuminating with projectors. (a) Shaderlamps set-up; (b) resulting model. By permission of R. Raskar.

The future of archaeological representation is limited only by technology. As computer power continues to grow and hardware and software fall in price, new applications can and will be found. We are moving into the realm of multi-sensory VR experiences: acoustic rendering is a growing area of interest, and touch and smell have been integrated into museum displays.
The promising area of augmented reality can integrate archaeology into our everyday experiences. So far, the demand is there, and the thrill of the subject is as relevant today as it was to the first antiquarians of the Renaissance. With the current media fascination for all things archaeological, perhaps we should ask ourselves how far we can go with excavation plans and records before it all becomes infra dig?

5.2.2 Case study: seeing Stonehenge

A good example of the changing nature of the representation of archaeological sites is that of Stonehenge in Wiltshire, England. It is depicted in a fourteenth century manuscript (albeit as an illustration of a legend, first appearing around 1136, of the magician Merlin building the monument; Figure 5.3), but the essential form of the henge is apparent in this and also in a contemporary manuscript where the circle of stones is shown as a square due to the layout of the page.

Figure 5.3: Fourteenth century manuscript depicting Merlin erecting Stonehenge (MS Egerton 3208 f.30r). By permission of the British Library.

Paintings in the sixteenth and seventeenth centuries, presumably based on verbal accounts given some inaccuracies, show an increasing awareness in antiquarian studies. A ‘reconstruction’ of the monument as a purported Roman temple, Stonehenge Restored, by Inigo Jones in 1655, portrayed a completed and orderly drawing in full architectural splendour: the way he thought it should have been. John Aubrey made the first plan of the site in 1666. In William Stukeley's topographical recording of Stonehenge, published in 1740 at a time when the Society of Antiquaries was well-established, scientific visualisations of a sort (for his imagination ran unfettered in some of his representations) were truly underway, such were his intentions to carry out the work for a distinctly archaeological purpose. Stonehenge remained in the eye of the artist, however, and Turner's watercolour of 1829 is a no-holds-barred, intimidating portrayal of storm-lashed stones, a howling dog and an unfortunate shepherd struck dead by lightning. Constable also painted the monument in a similarly apocalyptic manner in 1836, continuing the tradition of romance and drama that so inspired earlier depictions. These nineteenth century paintings may have called artistic licence into play, but they are nonetheless imbued with the approach seen several hundred years before, and if we are to refer to the latter as a form of archaeological representation, then on the same grounds we should not exclude the former.

The twentieth century produced standardised archaeological excavation plans of Stonehenge, with symbols and conventions demarcating the relevant features. A proliferation of drawings, paintings, photographs and postcards added to the purely archaeological record. The site never once wavered in popularity, with a three-dimensional replica of the monument even appearing, tragically not to scale, in the 1984 rock ‘mockumentary’ This Is Spinal Tap. As VRML grew in popularity, so did a rash of virtual Stonehenge models created by enthusiasts the world over, its potential and popularity easy to comprehend. It is fair to say, though, that none came close in sheer scale and complexity to that developed by Virtual Presence Ltd., co-sponsored by Intel and English Heritage. Using source data such as photogrammetry, GIS data, site plans, excavation reports and astronomical maps, a detailed interactive model was created.
(Figure 5.4 shows screenshots from this project.) Stonehenge will undoubtedly continue to prove inspirational, and we should look forward to future interpretations using the visualisation techniques that are developing today. From the earliest recorded drawings to an interactive replica, each interpretation stimulates interest in the monument, and that, above all, is surely what we should be trying to achieve.

Figure 5.4: VR Stonehenge screenshots: (a) close-up of trilithon; (b) aerial view; (c) sunrise; (d) sunrise (detail). By permission of Virtual Presence Ltd.

5.3 The idea of realism

“Even where both the light rays and the momentary external conditions are the same, the preceding train of visual experience, together with the information gathered from all sources, can make a vast difference in what is seen. If not even the former conditions are the same, duplication of light rays is no more likely to result in identical perception than is duplication of the conditions if the light rays differ... the behaviour of light sanctions neither our usual nor any other way of rendering space; and perspective provides no absolute or independent standard of fidelity.” Nelson Goodman, 1976.

This section discusses issues with interpreting images that are intended to be realistic, and defines the key terms and concepts in this area. It briefly outlines the problems of defining what we mean when we say an image is real. It queries the amount of authenticity we can expect from a virtual representation of an archaeological site, given limitations in both image generation and archaeological data, and should illustrate the need for placing representations into a context in order for them to be useful. Because this topic covers some theoretical approaches to archaeological representation, an example is helpful. Suppose we have a house that was inhabited during the epidemic of Bubonic Plague that swept through Europe in the fourteenth century. Archaeologists excavate the house and it is decided to use it as the basis for a museum on life in that time. We will use this example to illustrate issues of particular relevance to representation.

5.3.1 Terms and concepts

Reconstruction vs. representation

To begin, we must find a term to define the end-product of our work. The idea of classifying a generated image as a reconstruction (as such computer-generated past environments are often described) is slightly misleading. Although the geometry of the synthesised scene can be based on site plans and other captured data, any gaps in the information cannot be portrayed with justification. Where information is lacking, conjecture takes over. Where a construction is the original real-world scene, a reconstruction implies an objective rebuilding of this based on the material remains. By contrast, a representation more accurately describes one particular interpretation, to which there may be many and varied alternatives. If we consider our example of the Plague house, we can rebuild it objectively, either physically, on paper, or on computer, using the same dimensions and materials. This is a reconstruction. When we recreate the house and portray it as we think it was in that time, where gaps in our knowledge are filled in with educated guesses, it becomes a representation.
In this chapter we will therefore use the term ‘representation’ when referring to our computer-generated archaeological scenes.

Defining realism

The human visual system is too complex and unwieldy for its processes to be fully understood at present. It operates as a whole; we cannot yet replicate that on a computer. If we present humans with the same information, the visual system can process it in the same manner, but at a higher (personal) level there is subjectivity. Before we produce an image and claim it is ‘realistic’, we must first define what we mean by ‘realism’. To say an image is physically real (i.e. the scene geometry is identical to the scene in the real world and the light distribution is accurately simulated) is concrete and understood, but to say it is perceptually real is to use a term that needs to be quantified. There exists a wide range of philosophical literature and discussion on the nature of realism, and each tends to state that our perception of a scene is subject to some factor: association, connotation, contextual/comparative views, semantic meaning, Platonic idealism, expectation and experience, discrimination vs. association, recognition over time/exposure, sensory appeal, familiarity and recognition, and so on, and so forth...

With so many apparent influences it becomes difficult to know how to approach a definition of realism. If the viewer brings their own experiences and subjectivity to an interpretation, then each viewer must have their own version of reality. By trying to create reality by faithfully copying something, we are immediately limited by our inability to specify what it is that we are copying. How do we decide if an image is real unless we compare it with something that we have defined as being real? What, therefore, constitutes realism of representation? Goodman suggests that it is “the probability of confusing the representation with the represented... how far the picture and object, under conditions of observation appropriate to each, give rise to the same responses and expectations” [19]. Certainly, in perceptually realistic graphics, the idea is to evoke the same response from a generated scene as we might have to a real-world scene. Admittedly, it is an explanation that can be questioned given the subjective nature of personal response, but it does suffice to avoid becoming entrenched in tangled rhetoric.

What is in a name? Virtual Reality vs. hyperreality

Much of the above can be (and indeed has been) applied to the area of Virtual Reality (VR). Virtual Reality (a phrase which, in itself, seems contradictory) describes an environment generated by computer software. The use of VR as a term to describe the creation of realistic past environments has come under scrutiny, with suggestions that the term ‘hyperreality’ is perhaps a better description for the representation of past environments [17, 3]. The term hyperreality, which is still evolving in definition, was brought into use by Baudrillard's discussion of simulation: “it is the generation by models of a real without origin or reality: a hyperreal” [5]. The term describes a functioning copy of something that never existed in the first place. Spinal Tap, the mock rock group mentioned in Section 5.2, is an example of this. They were a fictional band, played by actors, featuring in a spoof documentary-style fictional film, yet the success of the film led to them producing records and touring despite the fact that they were not a real band and never actually existed.
Now apply this term to computer-generated representations of archaeological sites: we can interpret the evidence and create a site that never actually existed, but which has become real to us because we have generated it. The idea of more real than real can be seen in the concept of living museums such as folk parks, where actors dressed in period costume portray the day-to-day life of that time. They represent a reality, but it is a reality taken out of context, where certain parts of the experience are emphasised and heightened, with the customer experiencing a conjectural world as a real one. Applying the example of the Plague house, if there were actors portraying the daily life of someone during the time the house was in use, then the situation arises where the real (the remains of the house) and the fake (the actors, reproduction furniture and artefacts) co-exist. Historical information has been displayed alongside duplications and fakes in order to make people view it as being more authentic. This hyperreality is no more unusual than escapism in a novel or film: we seek out something that is more exciting, more dramatic and more memorable than mundane life. For a museum visitor, this may not necessarily be such a bad thing. If we are stimulated, does it matter if it is simulated? This is where the purpose of the representation comes into play; each need has a niche. To encourage people to think how life might have been in the past, it may be enough to offer them a lesser degree of reality. For the archaeologist to test and explore new hypotheses, more accuracy may be required.

Virtual Reality, virtuality, hyperreality: it is the underlying concept rather than the name that matters. Reality can be viewed as a singularity; no matter how much we try to emulate it, we will always fall on the side of either hypo- or hyperreality. In spite of this, a lack of authenticity does not mean representations are useless. We must be aware of what our motivation and purpose are (see Section 5.4). Also, providing we determine what we are trying to achieve and place it in a broader context, we do not need to spend too much time worrying over nomenclature, but such a discussion helps to highlight the issues in defining realism.

5.3.2 The nature of archaeological data

One of the first discussions of the idea of Virtual Archaeology categorised the VR model as a replacement, a duplicate, for an original, with the fidelity of the model dictated by the dataset from which it was created [36]. However, the above section indicates that this does not necessarily guarantee realism. Also, given the wide-ranging aspects of archaeological evidence, it is improbable that we can display it all at once in any meaningful way. Gillings drew attention to the problematic nature of a tangible referent, the idea of an existing reality to which we can compare our model [17]. Several issues stem from the fact that we wish to portray this tangible referent. The first problem is that the site may no longer exist in the form in which we wish to represent it: walls tumble down, buildings crumble, paintings fade. The second problem is the diachronous nature of archaeological sites. In general, an archaeological site consists of material remains that have accumulated over a period of time; that period may be very short or may span hundreds of years. Either way, the site exists over a period of time; it cannot be isolated to one single instance.
How, therefore, do we display this? A three-dimensional spatial representation depicts a single moment in time. To show the site evolving in a temporal manner we must add a fourth dimension: a timeline. By way of example, suppose that the museum at the Plague house depicts life during an ordinary day in the fourteenth century. Presumably the people will have been living there for some time: the house has been built, it is full of furniture and useful items, and the people are carrying out their usual daily tasks. However, an archaeological record of that site does not focus on one particular day in the fourteenth century. The archaeologists may have sifted through twenty-first century topsoil, through twentieth century discarded trash, backwards through the waste of the preceding centuries until they reach the remains of the Plague house. These are carefully recorded, but the excavation does not necessarily stop there. They can go further back in time, and uncover older archaeology lying beneath the house. They may keep excavating until they reach the subsoil, where no more archaeology is to be found. From all of this comes an incredibly detailed excavation record spanning centuries, and yet only one typical fourteenth century day is shown to the public.

It is tempting to say that a tangible original never actually existed, that there is no single, all-encompassing form that can be represented. Aside from the purely temporal, a site has many different aspects and can be viewed in many different ways. Having said that, the process of archaeology itself is materially and culturally selective; human influence exerts itself at every level. Sites are excavated in a manner suited to each excavation director, and may vary from one trench to the next. The archaeological record depends on how the site has been dug. The interpretation of this record depends on the interpreter. The reality “exists in the interface between the site and the excavator or analysis. It is the information itself” [33].

5.3.3 Context

The context of time has already been mentioned, but this is merely one aspect that needs to be considered. Social context is important, and is something that is lacking in many computer representations due to the absence of human figures in scenes. The addition of a human presence in a representation conflicts with achieving physical reality, as it requires the addition of extraneous, conjectural material. Human figures feature in many drawn archaeological illustrations, providing not only a social element but a convenient measure of scale. In synthesised scenes, however, even if we try to keep our representations as hands-off as possible in the hope of preserving a greater degree of authenticity, adding evidence of human habitation that cannot be inferred from the archaeological record amounts to artistic interpretation. This is an area where a balance needs to be found. Social context comes at the expense of physical accuracy. It is a dilemma that again calls purpose and motivation into question. If we want to safeguard a degree of realism, then our virtual worlds are barren and unpopulated. Conversely, how can we identify with such a sterile scene? How can we view an empty room as being real when it was probably furnished and populated in the past? This problem is not unique to computer representations, and the idea of living museums is, once more, testament to that.
Archaeology is the study of material remains, but it was people who made and used these. If we neglect them, we exclude a facet of reality. If we include them, we stand accused of conjecture.

Another interesting factor is the perspective from which we view the represented scene. Often it takes the form of a fly-by, where the user zooms over and around the virtual site for a panoramic view. This is not a real view if we consider that the inhabitants of the fourteenth century did not have helicopters, and so would never have seen their home from this viewpoint. Nonetheless, it is a useful form of display, not only for the public to get an overall view, but also for the archaeologist to establish spatial relationships. Conversely, when working from an eye-level view, it is important to remember that our tall and healthy bodies of today may not bear great resemblance to those of a medieval peasant, so a decision has to be made as to whether we are at our present eye-level or the eye-level of the original inhabitant.

Emotionally, it is difficult to convey realism: our response to a representation of a house during the time of Plague is unlikely to be the same as our response to the original environment. There is no danger, no fear, no worry or hope or loss in a safe representation. We do not have the threat of death and disease hanging over us when we gaze at an image or walk through a museum conveying such times. We remain distant and detached because we are not experiencing this original reality. The long list of factors that may shape our personal interpretations comes into play when we consider that we are subconsciously influenced by a wide range of experiences, unique to each individual. In a way, we look at things with our own archaeology of past events shaping each interpretation. However, although the image may not display context, there is a way of providing it. Computer-generated scenes offer means of putting objects and places into context in a way that museums, with their shelves of artefacts, cannot. By providing a method of determining contextual information, representations are afforded a way of explaining the processes that led to their creation. Section 5.6 will discuss this in more detail.

5.3.4 An established reality

If we are to use computer graphics for predictive purposes, for sites that no longer exist (e.g. archaeology) or those that have not yet been built (e.g. architectural simulations), there is no real-world scene with which to compare our representation. Since we do not have a tangible original, we must try to establish that we have made the scene as close a representation as possible by using some form of quantification. If we are worried about artistic interpretation and conjecture, then a method of providing information about the virtual image needs to be employed. The degree of realism we require depends on purpose, on the questions that we ask, rather than being a blanket term for an impossible standard to which we try to adhere.

5.4 Representing for a purpose

Representations of archaeological sites can be created for a number of different reasons, and this section aims to identify some of the key reasons and show how the intended purpose of an image can influence the ways in which it is represented, and thus interpreted.
It is also important to realise that even if an image is not useful in an archaeological respect, it is not useless altogether, as some of the cases below will illustrate.

5.4.1 Representations for the archaeologist

The primary purpose of representations created for use by an archaeologist is likely to be research. While personal interest might warrant the creation of images for aesthetic reasons, computer-generated representations provide the archaeologist with a means of exploring the past and testing new hypotheses in a safe and controlled manner (see Section 5.7).

Establishing spatial relationships

Three-dimensional computer graphics provide a convenient means for the archaeologist to gain spatial awareness of a two-dimensional site record and to establish spatial relationships within that environment. For the visualisation of the site layout and the distribution of artefacts, a CAD wireframe model or a GIS model may well be sufficient. Realism is not a priority, but the ability to quickly and easily navigate the virtual world is essential. This is the most objective form of representation, as only the details from the excavation record are likely to be displayed.

Investigating new hypotheses

A computer-generated representation of the archaeological evidence provides the archaeologist with the chance to manipulate variables in the virtual environment in a way that cannot be done in the real environment. This is explained in greater detail in Section 5.7. The main thrust here is that the archaeologist is choosing to emphasise their own ideas, so subjectivity is a factor. The end product is intended either to prove or to disprove a particular hypothesis; the archaeologist may not necessarily be interested in exploring other avenues of thought. The representation may therefore have a tendency to be biased towards supporting the archaeologist's views. This is not a bad thing; indeed, it is the very nature of archaeology to provide new theories, and this gives the archaeologist a chance to visualise their ideas where beforehand they may only have written about them.

5.4.2 Representations for the computer scientist

There is the possibility that the archaeological representation has been created for the purpose of providing an application for new computer graphics techniques. In this instance there is the danger that the archaeological aspects of the representation may be neglected, with the emphasis placed on demonstrating advances in computer graphics research [40]. The image may well be meaningful, but the motivation behind it might limit its usefulness for the purposes of archaeological research. However, it could prove highly desirable as a demonstration of cutting-edge graphics.

5.4.3 Representation as advertising

Cynical as it might seem, there is always the chance that visualisation might have been carried out for the primary reason of advertising. Such projects in the past have been the result of sponsorship by large commercial organisations, concentrating on archaeology as an area rich in media attention and public interest and likely to provide public relations points [32]. In these cases research was not the main aim of the project, but it certainly did not harm the archaeologists, who benefited from the money and resources. However, with an emphasis on advertising and less direct control from the archaeological side of things, and with well-known archaeological examples (rather than unexplored datasets) being chosen as subjects for the work, there was little new insight to be gained.
unexplored datasets) being chosen as subjects for the work, there was little new insight to be gained.

5.4.4 Representations for the public

Archaeological representations aimed at the public fall into two broad areas: education and entertainment. Increasingly, the boundary between the two is blurring as the idea of making education enjoyable — or making entertainment educational — takes hold.

Education

The move towards increased use of audio-visual displays for heritage purposes has led to a demand for high-quality, comprehensive representations that inspire as well as instruct. Computer-generated representations designed for this purpose run the risk of providing a model that is visually appealing at the expense of being informative, but in a museum context the primary information (in the form of the museum displays) reinforces the didactic objectives of the virtual environment. This is an area where representations can vary greatly depending on the demographics of the intended audience. Nonetheless, the main objective is to provide information to the non-specialist.

Entertainment

The growth of media archaeology — television and the Internet being the main examples — has led to representations whose chief merit is that they look stunning and slick and attract the attention of the viewer, thus pushing up the ratings or the hit-counter. Indeed, it is becoming rare nowadays to watch an archaeological documentary on television that does not contain some sort of virtual representation (generally touted in the television listings as "state-of-the-art computer graphics"). These programmes may run the gamut from the educational documentary to the sensationalist supposition, but they work from the same motivation — attract as many viewers as possible. Like the case for advertising mentioned above, this is not always particularly useful, but it does provoke interest in the subject and can therefore be considered as indirectly educational.

5.4.5 Fit for purpose

The above cases highlight the need for clearly establishing the motivation behind creating a representation. As we expect, representations must be tailored towards their intended user. This also gives us greater insight into the path that led to the specific interpretation portrayed.

5.5 Misinterpretation

"In presenting a very visual and solid model of the past there is a danger that techniques of visualization will be used to present a single politically correct view of the past, and will deny the public the right to think for themselves." Miller and Richards, 1994.

Given the problems in archaeological representation with defining realism, the motivation of the work and the inclusion of inference, the possibility of misinterpretation is high. As Miller and Richards remarked in 1994, "there is little, if any quality control for computer graphics and they are not subject to the same intense peer review as scientific papers" [32].

5.5.1 Different outcomes from the same evidence

One way of demonstrating the potential of myriad interpretations derived from a single dataset is to ask a group of people to independently represent the same scene. A project of this nature was undertaken by Hodgson in 2001. He advertised on the web site of the Association of Archaeological Illustrators and Surveyors in the UK, appealing to people interested in creating a representation of a site. Each participant was given the same design brief containing information about the
site (Dewlish Roman villa), the purpose of the end product (a popular publication about the site for the "interested layman"), photographs, plans and sketches from the excavation report, and a choice of any medium. Additionally, participants were informed that "the illustrator is not bound to use only that material which is contained in the brief. Use of reference material for costume, furnishings, implements etc. is perfectly valid; illustrators are only asked to keep a record of any additional sources used for the 'debriefing' questionnaire" [25]. At the time of writing, Hodgson was compiling and analysing the completed representations. The intended outcome is an overall picture of "how the reconstruction process functions" [25], taking into consideration the content of the completed images, the illustrator's responses to the questionnaire, and the audience's reaction. We await the results with interest. Projects of this type are an excellent way of highlighting how many interpretations can be derived from the one source, and are useful in emphasising to the public that a single representation cannot be taken for granted.

5.5.2 Seeing what we want to see

Conversely to the above, given the previous discussion in Section 5.3 where the viewer brings their own experiences to their viewing of an image, there is the chance that the synthesised scene becomes a type of Rorschach test where the viewer projects their thoughts onto the image and sees it as it was never actually intended to look. It is not just the public who are at risk of this. The archaeologist may focus on an insignificant detail and magnify its importance, losing the overall impact of the image (and the idea of alternatives) by concentrating all their attention on a singular aspect. Likewise, a computer scientist may marvel over a great graphics effect, which to them is of the utmost importance to the scene (a scenario not unlike graphics researchers watching animated feature films and focusing all their attention on how amazing the rendered hair looks, rather than on the storyline).

5.5.3 Reducing misinterpretation

Since the chances of misinterpretation are significant, a method of reducing it must be implemented. Information about the dataset and the interpretative process is key, and informing the viewer of the methods and decisions taken to produce a synthesised scene allows them to place the image in context. Section 5.6 discusses how this can be achieved.

5.6 Setting standards

The previous sections have demonstrated how misinterpretation of an image might arise. By incorporating information pertaining to a VR world or computer-generated image, each synthesised scene can be analysed and compared, allowing the viewer to determine information regarding decisions over representations of incomplete geometry and inclusion of artefacts, and also information about image attributes such as rendering quality or resolution. If a level of standardisation can be reached, representations of past environments can be of more use to both the archaeologists and the public. If alternatives are provided, or if the image is treated with the same scrutiny as a documented source, we can limit the danger of awarding graphics more influence than they should actually hold. This section discusses the application of metadata, tagging and standardisation to computer graphics as a means of increasing contextual information and reducing misinterpretation.
5.6.1 Metadata

To avoid misinterpretation due to a lack of information and to place an archaeological representation in its appropriate context, some form of description of the dataset and the decisions involved in creating the representation is required. Metadata — data about data, or information about information — is one method of providing this. The idea of providing metadata has flourished with the advent of the Internet, although the phrase has been around since the 1960s. Metadata exists in familiar forms, such as a bibliography that tells us when a book was written and by whom it was published, or on a map where we can check the scale and the date of survey. The idea of metadata is not dissimilar to treating the image in the same manner as a documentary source. Any source used for historical/archaeological evidence is subject to scrutiny, and the nature of the document is always questioned — who wrote it, when, where, why and in what context. Questioning a computer-generated image in the same way makes good sense, and metadata provides the answers.

There are two issues about metadata that need to be standardised: the format of the metadata (the syntax, file format, etc.) and the actual metadata required, which is subject-area specific. A number of standards for metadata exist, and metadata has proved especially useful in archaeology when documenting and cataloguing material for digital archives. Applications and standards for metadata for digital archives are well documented (metadata about metadata!) and published guides (such as "Creating Digital Resources for the Visual Arts: Standards and Good Practice" [21], published by the UK's Arts and Humanities Data Service) seek to make digital archives more accessible.

Syntax

Standards for metadata may be highly specific, such as MARC (MAchine Readable Catalogue), used for a traditional library, or the FGDC (Federal Geographic Data Committee) standard, or may be simple and in widespread use, such as the Dublin Core standard [26], an international and interdisciplinary initiative which is commonly used to describe online resources on the Internet. This is an attempt to standardise the information located in the META tags within the HEAD tags of an HTML (HyperText Markup Language) file, thus hopefully returning more comprehensive and less random information from an Internet search. There are fifteen core elements: structured categories such as title, author, date, source, etc.

Metadata for archaeological purposes

Metadata can be represented in several different syntaxes, and Dublin Core is applicable to virtually any file format. As mentioned above, HTML is one commonly known form, but XML (eXtensible Markup Language) is perhaps a better-suited format for our purposes, allowing the creator to define the actual semantics [13]. Like HTML, XML uses tags and attributes to delimit data, but unlike HTML it is not a fixed format, so elements can be customised to suit the user's purpose. A customised markup application can therefore be created for exchanging information in a particular subject area, and the same information can be read across different operating systems. Work on online image retrieval has focused attention on the need for metadata for images, but it is the setting of information standards for VR that is perhaps more useful.
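As a concrete illustration, the short sketch below writes a Dublin Core description of a virtual reality model to an XML file using Python's standard library. The Dublin Core element names used here are part of the published standard [26]; the record values and the output filename are invented purely for illustration and do not describe any real archive entry.

```python
# Minimal sketch: serialising a Dublin Core record for a VR model as XML.
# Element names follow the Dublin Core standard; the values are invented
# examples, not a real archive record.
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set namespace
ET.register_namespace("dc", DC)

record = {
    "title": "Hall of a medieval town house (VR model)",   # example value
    "creator": "Example modelling team",                    # example value
    "subject": "medieval; town house; lighting simulation",
    "description": "Reconstruction of the hall; furnishings beyond "
                   "the excavation record are conjectural.",
    "date": "2002",
    "type": "interactive resource",
    "format": "model/vrml",
    "coverage": "Southampton, UK; c. 1290",
    "rights": "Copyright of the project team",
}

root = ET.Element("metadata")
for element, value in record.items():
    # Clark notation {namespace}tag attaches each element to the dc: namespace.
    ET.SubElement(root, f"{{{DC}}}{element}").text = value

ET.ElementTree(root).write("model_metadata.xml",
                           encoding="utf-8", xml_declaration=True)
```

Because the result is plain XML, the same record can be read on any platform and extended with subject-specific elements of the kind listed in Tables 5.1 and 5.2 below.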
The AHDS has determined the core metadata required for VR (Table 5.1) and has also compiled a checklist of information that should be included when documenting virtual reality models (Table 5.2) [42].

Table 5.1: AHDS core metadata for virtual reality models, from the AHDS guide to creating and using virtual reality [42].

Title: The name of the bubbleworld, panorama or virtual reality model.
Survey index: The identification number/code used internally for the project.
Description: A brief summary of the main aims and objectives of the project for which the model was developed and a summary description of the model itself.
Language: An indication of the program language(s) in which interactions take place in the virtual reality model, e.g. Javascript.
Type: The type of resource, e.g. three-dimensional model, interactive resource, collaborative virtual environment.
Format: The data format of the resource, e.g. VRML 97.
Subject: Keywords indexing the subject content of the model. If possible these can be drawn from existing documentation standards, e.g. for archaeology the English Heritage Thesaurus of Monument Types or the mda Archaeological Objects Thesaurus. If a local documentation standard is used a copy should be included with the data set.
Temporal coverage: The time period covered by the virtual reality model.
Spatial coverage: Where the model relates to a real-world location, give the current and contemporary name(s) of the country, region, county, town or village covered by the model and map co-ordinates (e.g. in the UK national grid).
Administrative area: Where appropriate, give the District/County/Unitary Authority in which the model lies.
Country: Where appropriate, give the country to which the model relates.
Date: The dates of the first and last day of the virtual reality modelling project.
Creator: The name(s), address(es) and roles of the creator(s), compiler(s), funding agencies, or other bodies or people intellectually responsible for the model.
Publisher: Details of any organisation that has published the model, including the URL of on-line resources.
Depositor: The name, address and role of the organisation or individual(s) who deposited the data related to the virtual reality model.
Related archives: References to the original material from which the model was derived in whole or in part, from published or unpublished sources, whether printed or digital. Give details of where the sources are located and how they are identified there (e.g. by file name or accession number).
Copyright: A description of any known copyrights held on the source material.

Table 5.2: A check-list for documenting virtual reality models, from the AHDS guide [42].

Project documentation: project name; survey index; description; bibliographic references; subject keywords; spatial coverage; administrative area; country; date; subject (discipline, type, period); creator; client; funding body; depositor; primary archives; related archives; copyright.
Target audience: audience; mediator; standard; interactivity type; interactivity level; typical time.
Application development: model type; application platform; hardware platform; authoring software; 3D drawing tools; 3D scanners; animation scripts; sound clips; image format.
Delivery platform: operating system; browser; plug-in / viewer; scripting language; hardware platform; network connection; target frame rate.
Description of archive: list of all file names; explanation of codes used in file names; description of file formats; list of codes used in files; date of last modification.

5.6.2 Alternative representations

In addition to the provision of metadata, providing multiple interpretations of archaeological data is another means of drawing attention to the subjective aspects of representation. In order to provide a method of portraying the tentative nature of interpretations, Roberts and Ryan identified four modes of operation that allow the generation of different views of a site representation [40]. Each of these four modes allows alternative interpretations of the same data to be represented in a VRML world:

Require New hinges on the archaeologist providing multiple instances of the world, including the description of possibilities and arrangements that could have existed. The client then downloads each model they require.

Switch Change has multiple interpretations in one document, and the client (who requests this single document) can switch between various configurations.
Functional Change — a world is generated with objects and characteristics that can be easily altered (for example, building height) by means of a control panel.

Program Run works on the basis of providing a version of the data and program to the client, who can run it on their own machine and generate different views and models.

This idea of providing alternatives is nothing new — archaeological illustrators have often raised similar issues about drawings, with pleas for "more partial and avowedly tentative reconstructions, with more alternatives offered when, as so often, the evidence is indecisive" [43]. The ability to identify which parts of a scene are conjectural and which come from the excavation record would undoubtedly be useful. A tool for switching attributes of a scene on and off, or for rearranging furniture, or altering lighting, allows greater flexibility for the archaeologist's investigations, and a chance for the public to see how the decision process works regarding interpretation.

5.6.3 Preserving information

Another important aspect is to store both the synthesised scenes and the associated information in a relevant technical form. A digital resource is of no use if it cannot be read, and changes in technology should be considered. An example of digital mismanagement is the BBC's (British Broadcasting Corporation) Domesday Project of the 1980s. The original Domesday Book was a survey commissioned by William the Conqueror in 1085 AD. Upon completion in August 1086 it contained records for 13,418 settlements in England, providing a detailed record of life in Britain in the eleventh century. The BBC launched a project involving schoolchildren all over the UK, asking them to record their life and lands 900 years after the original Domesday Book. The information gathered by the schoolchildren was stored on 12-inch video discs which today can no longer be read as the technology is obsolete. By contrast, the original eleventh-century Domesday Book is still legible (albeit in handwritten Latin). This serves as a warning that advances in technology should be considered and storage of information should be carefully planned and monitored.

5.6.4 Standardisation

In a subject such as archaeology where alternative explanations may be equally plausible, the option to move between a number of different interpretations is undoubtedly important.
A form of standardisation is most desirable, but rather impractical given the diverse scope of the subject. In the same way that the archaeological evidence on an excavation needs to be recorded as thoroughly as possible, so too does the process used to create the computer-generated representations, so that all the factors might be displayed, allowing the user to make up their own mind based on the supporting material. If we strive to provide information about the underlying decisions taken in the creation of our work then our virtual worlds have the potential of being meaningful, useful pieces of information.

5.7 Developing new hypotheses

(This section is by Duncan Brown, Southampton City Heritage, UK. The papers that form the basis of this discussion are included in the Appendix.)

As well as providing a good means of teaching the public about past environments, computer representations of archaeological sites can also be of use to the archaeologist. The immediate advantage is the ability to provide a spatial framework. For an archaeologist used to working with two-dimensional site plans which must be correlated with section drawings and level readings, a three-dimensional view of their site provides a framework for determining spatial relationships. This might well prove useful for establishing hypotheses about the function of certain areas, or artefacts found therein. However, computer graphics also offers the chance to explore another aspect of past environments that does not appear in the archaeological record: light.

5.7.1 New ideas from light and colour perception

Light is something we take for granted, that we create through the movement of a switch. Light is not something archaeologists can recover or record and consequently its importance is rarely considered in interpretations of past ways of living. Light is fundamentally important, yet at present it is rarely considered as a medium for comprehending how people behaved or how they perceived their environment. The hard archaeological evidence is unrevealing. Windows remain as evidence for the way of introducing light; lamps, lanterns and candlesticks for the means of its creation. Yet there is a more subtle way of approaching this problem and that is to look at the objects we find, the clues to the ways in which buildings were decorated, in an effort to understand what sort of environment past peoples created for themselves. Their perceptions can be revealed by the appearance of the objects they used.

In searching for insights into the ways past people perceived their environments and the objects they interacted with, it is important to consider how they illuminated their lives. Even more crucial is the relationship between light and colour. We are not used to looking at objects in conditions where light sources are dim, generally of a red cast and also moving. Nor indeed, once they have been removed from the ground, are we used to viewing artefacts only in daylight. In our age, the variations in the provision of light are beyond our immediate experience and understanding. The ways in which we view, perceive and understand objects are governed by current lighting methods (electric light and large windows), but in order to understand how objects were viewed and understood in the past we must consider how they were illuminated.
It is essential that any system for reconstructing and visualising ancient environments be as accurate as possible and flexible, allowing archaeologists to alter the scene parameters in order to investigate different hypotheses concerning the structure and contents of a site [10].

Figure 5.5: Examples of medieval pottery

5.7.2 Case study: Medieval pottery

The basic premise is that the colours of medieval pots are related to the lighting conditions that medieval people were accustomed to. Some pots are brightly coloured and highly decorated, others are dull (Figure 5.5). This is related in part to vessel function but must also reflect the intended place of use and thus variations in lighting conditions. Furthermore, pottery colours may actually reflect a typical absence, rather than presence, of light, or at least lighting at much lower levels than we are used to. This case study therefore considers the ways in which medieval interiors were illuminated and how lighting conditions might affect the ways in which objects were perceived and designed.

Provision of light

As has already been inferred, archaeological evidence for the creation of light is relatively rare. The ceramic evidence is perhaps most commonplace on excavations and a variety of lamps and candle-holders are known throughout the medieval period. These are generally portable types and are consequently small in size. It is difficult to envisage such objects being used to illuminate whole rooms. It is known, of course, that torches were extensively used and wall-brackets for these survive. Another source of light, and possibly a very important one, must have been the fires and braziers that were used to heat rooms. On this evidence it seems that the medieval interior must have been a flickering, smoke-beset world and perhaps this explains the bright colours used to decorate medieval objects and indeed rooms. The evidence for those is drawn primarily from manuscript illuminations and paintings, where furniture is often shown brightly painted, and walls display rich hangings. It may, however, be the case that most of these things were not meant to be seen at their best in artificial light.

The hours of daylight regulated pre-industrial life and provided the fundamental means of illumination. In the present day, sunlight is almost irrelevant to the conducting of our lives; houses, offices, shops, factories are almost all permanently lit by artificial means. In the medieval period the sun was probably viewed as the only constant source of light and it is in the architecture that the best evidence for lighting can be found. Big windows provided lots of light but given the limited availability of window-glass they also brought draughts. Windows also created a security risk, as is shown most obviously in castles. Indeed the largest windows may be found in ecclesiastical buildings, those which one might presume to have been least threatened. This provokes the thought that the provision of light through such grand openings was as much a signal of devotion to God as the building of the entire edifice. Light and holiness seem almost to be related. The way light was used within a church or cathedral may also, perhaps, reflect the controlling aspects of ecclesiastical architecture. Medieval pottery, however, was used more frequently in a domestic environment and the windows in houses are therefore more pertinent to this discussion.
Here, window size might be related to status. The windows in surviving English peasant houses are generally small, leading to the conclusion that warmth was more important than light, perhaps because the rural lifestyle was in any case governed by the rising and the setting of the sun. Rural manor houses, and the homes of late medieval yeomen farmers, exhibited larger windows, as did medieval town houses. In all instances the window provided light for the carrying out of daily activities such as weaving and sewing. In towns, where jettying of upper storeys often brought houses within a few feet of each other, windows had to be larger but they also allowed those sitting at them to communicate with their neighbours, as well as people in the street. The window therefore played an important social role as well as a domestic one. It is clear that different dwellings gave different lighting conditions. One might therefore expect the most brightly coloured objects to be associated with the best lit settings and this is, to some extent, true. The most highly decorated types of pottery, for instance, are not found at sites of the lowest status. However, that does not tell us how such objects were perceived by those who used them but simply how much light might have been available to see them by. It is an understanding of medieval perception that is being sought here and the next section considers ways of looking for that, if not necessarily finding it.

Colour in the medieval period

The pottery used in England throughout the medieval period was mostly earthenware and may be divided into white-firing and red-firing types. Pots made from white-firing clays appear white to buff in colour but were usually covered, completely or partially, in a lead glaze. Clear lead glazes give a yellow to amber appearance to white pottery but the addition of copper creates green, which can vary from dark through olive to bright, apple hues. In general, the most brightly coloured medieval pottery was made from white-firing clays. Vessels made from iron-rich, red-firing clays range in colour from brick red to dull brown when fired in oxidising conditions. Clear lead glazes can give a red-orange appearance to red wares while lead with copper glazes produce a variety of greens generally less vibrant or consistent in colour than those seen on white earthenware. Red clays fired in an oxygen-free atmosphere range in colour from grey through to black and in such conditions an ordinary lead glaze will turn slightly green or greenish-clear. In short, medieval pots were given a variety of colours, including white, red, brown, yellow, orange, green, grey and black. The appearance of vessels was also enhanced, and the colour effect slightly altered, by decoration such as painted lines, coloured slips and applied clays.

"So what?" may be the first question that springs to mind after reading that. What, if anything, does this information tell us about the medieval period? It is possible to quantify medieval pottery assemblages by colour, but is that a useful thing to do? Did the colours of medieval pots have any meaning? The range of colours produced in pottery at least tells us something about ceramic technology. Aspects of kiln construction and management, clay exploitation and mineral use are all illuminated through an understanding of how pots were made in the colours that they were. Also revealed is the importance of colour in medieval society.
The most colourful and highly decorated vessels were jugs, normally identified as table ware, used in the serving of liquids, especially wine. It is possible that these were used at high-profile dinners. The dullest vessels, unglazed and undecorated, were those used in low-visibility activities such as cooking. Colour was therefore an important element in the culture of display that prevailed in medieval society, at least in its upper echelons. In these terms it is useful, therefore, to consider quantifying pottery assemblages by colour, as the quantities of highly decorated and plain pots might be related to status: high numbers of gaudy vessels might indicate a household where presentation was important.

It may be possible to put meanings to colours but one might be wary of doing so in the case of pottery. After all, the range of colours is partly determined by the available technology, in terms of glaze and mineral use and kiln structure. Such a view, of course, excludes the element of human decision-making and if we accept that medieval potters gave pots the colours their customers wanted then we must ask why it was that those colours appealed. An analysis of colour symbolism may not appear very helpful, however. The most common pottery was coloured red-brown. In medieval art, brown was associated with mourning and red with God, while green, the most common colour for glazed pottery, signified the Holy Spirit and therefore also the bishops. Yellow was regarded as a substitute for gold, which denoted heaven [6]. The next step, of course, is to ask whether medieval potters were fully aware of the symbolism in art. The same colours have meanings in folk-lore that are not entirely consistent with those of medieval Christianity. Brown is of course the colour of the earth and its association with mourning is perhaps related to the notion that humans are earthly beings who, in death, revert to clay. Red is the colour of blood and, perhaps more pertinently with regard to pottery, of fire, and in its positive manifestation signifies life, love, warmth and passion. Its negative aspect is that of destruction and war. Green was also related to life, through its association with plants and the shoots of springtime as well as water. It thus symbolised hope and longevity [6]. There is more to go on here, and it is easy to create associations between pots of certain colours and the offering, presumably through their contents, of valued qualities such as life, warmth and longevity. Yellow is more of a problem, however, as in the Middle Ages it was the colour of envy. It may be safer, therefore, to look upon it as a substitute for gold, and here is the crux of the matter — it is we who are now doing the looking, thus creating our own understandings of the colours we see.

Medieval pottery

Issues of colour and perception in pottery have already been raised and it is clear that we must understand the relationships between pottery and those who used it. In the first place, it is not always clear who those users were, or at least it is clear that in many medieval households several different users were involved. In the townhouse of a wealthy burgess, such as that which provides the setting for the case study set out below, there was a hierarchy of individuals and their roles. It seems reasonable, therefore, that different types of vessel were also fitted into that scheme. At the lowest level, ceramically, we may place vessels used in food preparation.
Medieval cooking pots were simple forms, cheaply made and cheap to buy. Their use would have been confined to the kitchen and scullery areas and cooking pots would probably not have appeared at high table or been considered for display. These vessels were of unglazed earthenware, usually buff, red-brown or grey in colour. Unglazed, or partially glazed, jugs were also produced and these may have been used in the same areas of the house. It is most likely that domestic servants used pottery bought for storage and cooking and, as far as the householder was concerned, the appearance of that pottery may have been largely irrelevant. There seems, therefore, to be little perceptual influence in the acquisition and use of kitchenwares and this is borne out by the fact that in most excavated medieval assemblages cooking pots and jars are invariably of local origin [7] and have a uniform appearance.

Jugs, however, the medieval vessel type we commonly characterise as tableware, were derived from a wider variety of sources, even at humble farmsteads, and issues of taste and display must be considered. Rich glazes and elaborate decoration suggest use in public ways, certainly outside the confines of the kitchen area. A penchant for display is known to have permeated most levels of medieval society and ceramics would have been a cheap way of assuaging such a requirement. It is important to recognise that pottery could never compete with glass or metal as a medium for showing off but it is easy to identify highly decorated vessels as intended for use at table. Here, aspects of colour become more important perhaps. The use of glaze allows a greater degree of consistency in the colouring of pottery and it must be assumed that this was done deliberately to appeal to the tastes of prospective customers. If, therefore, a domestic assemblage yields tableware of a certain range of hues, then it seems reasonable to suggest that those were the colours that the users preferred. The conclusion, therefore, is that the jug is a suitable focus for research into colour and perception in medieval pottery, at least at this initial stage. Jugs present the widest range of shapes, decorative designs and colours of any ceramic form in medieval assemblages; they were probably used in a greater variety of domestic situations and might to a higher degree reflect the tastes of consumers.

Putting things in context

The first question that needs to be answered in this consideration of colour and perception must be this: what did medieval pots look like in their original setting? That question has been addressed by research conducted at the Department of Computer Science at Bristol University, where a computer model of the hall of a medieval town house has been constructed.

Figure 5.6: Photograph of the Medieval Merchant's House, Southampton, UK.

The model is based on the Medieval Merchant's House museum in Southampton, a half-timbered structure renovated by English Heritage as accurately as possible to represent a 13th-century dwelling of some economic status (Figure 5.6). Computer modelling offers a flexible approach to investigating hypotheses regarding colour and lighting. It is not possible to occupy the actual building, light a fire, position candles and torches and record the results, and even if it were, it would still not be feasible to remove the existing fireplace and chimney-breast.
In a computer-generated environment we are able to remove extraneous structures, change the size and colour of wall-hangings, even change the colours of the walls. We can increase or diminish the size of the hearth and alter its position, and we can move other illuminants around as we wish, placing them on furniture or anywhere on the walls. It may also be possible to recreate vision defects, such as short-sightedness, that were left uncorrected in the 13th century.

Figure 5.7 shows a view of the hall with the jugs placed on the main table. This scene is an accurate depiction of the actual building; the wall hanging and the furniture are pieces currently on display. The three jugs are a local redware baluster, red-brown in colour with a partial greenish-clear glaze; a Dorset white ware jug with an overall yellow glaze, decorated with dark brown applied vertical stripes and lines of pellets; and a Saintonge white ware vessel with an overall bright green glaze. They are contemporary, dateable to around 1270-1300, and it is possible that similar vessels could have been in use in a Southampton household at the same time.

Figure 5.7: Computer-generated model of the hall of the Medieval Merchant's House, French Street, Southampton, lit by a generic approximation of daylight (created by Ann McNamara, now of Trinity College, Dublin, Ireland).

Figure 5.8: Computer-generated model of the hall of the Medieval Merchant's House, French Street, Southampton, lit by candlelight (created by Patrick Ledda, University of Bristol, UK).

The rendering in Figure 5.7 is lit with generic lighting, comparable with daylight, rather than a specifically modelled light source. Figure 5.8 shows the same scene from a viewpoint on the gallery above the hall, lit with candles. These computer-generated images (Figures 5.7 and 5.8) serve to illustrate the value of this approach. This work has also confirmed the necessity of understanding how objects looked in their original contexts if we seek any insight into past perceptions. It is already apparent that many pots would have looked brightest when lit from above. Only the top half of some jugs is glazed and decorated, and this is perhaps indicative of how they were illuminated in use, perhaps by daylight through windows or perhaps from torches hung on walls. The purpose of this research is the revelation of detail such as this and there is great potential to go deeper into medieval ways of living.

In searching for insights into the ways past people perceived their environments and the objects they interacted with, it is important to consider how they illuminated their lives. Even more crucial is the relationship between light and colour. Colours will change in appearance according to the types of light source present; yellow, for instance, is especially affected by the RGB balance of certain illuminants. The recreation of medieval lighting conditions is therefore seen as a vital step in comprehending attitudes to colour, and eventually perhaps, shape and decoration. If there is any symbolic meaning in the use of colour on pottery then this might be revealed through an exploration of medieval perception, through the recreation of a medieval environment. The modelling of a realistic environment, through the application to computer graphics of computer science and psychophysics, is perceived to be the most far-reaching and flexible way of exploring human perceptions in the past.
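The dependence of apparent colour on the light source can be illustrated with a deliberately crude sketch. The three-band illuminant and glaze values below are invented placeholders rather than measured spectra, and a real study would use full spectral rendering of the kind described in the Appendix papers; even so, the band-by-band product of illuminant and reflectance shows why a glaze that reads bright yellow in daylight shifts towards orange-red under a flame-based source.

```python
# Toy illustration (not measured data): how the same glaze can read differently
# under daylight and candlelight. Reflected light = illuminant power x surface
# reflectance, band by band; the balance between bands shifts the perceived hue.

BANDS = ("blue", "green", "red")

# Relative spectral power, coarsely binned into three bands (illustrative values).
daylight    = {"blue": 1.00, "green": 1.00, "red": 1.00}   # roughly flat
candlelight = {"blue": 0.15, "green": 0.55, "red": 1.00}   # strongly red-biased

# Approximate band reflectances for two glazes (again, illustrative only).
yellow_glaze = {"blue": 0.10, "green": 0.75, "red": 0.85}  # e.g. a yellow lead glaze
green_glaze  = {"blue": 0.20, "green": 0.60, "red": 0.25}  # e.g. a copper-green glaze

def reflected(illuminant, reflectance):
    """Band-wise product of illuminant power and surface reflectance."""
    return {b: illuminant[b] * reflectance[b] for b in BANDS}

def normalised(spectrum):
    """Scale so the bands sum to 1, making the hue balance easy to compare."""
    total = sum(spectrum.values())
    return {b: round(spectrum[b] / total, 3) for b in BANDS}

for name, glaze in (("yellow glaze", yellow_glaze), ("green glaze", green_glaze)):
    for light_name, light in (("daylight", daylight), ("candlelight", candlelight)):
        print(name, "under", light_name, normalised(reflected(light, glaze)))
```

Run with these placeholder numbers, the red band comes to dominate the reflected light under the candle-like source, which is one reason a yellow glaze drifts towards orange in firelight; the same calculation, performed with measured spectra and a proper colour appearance model, is essentially what a physically based lighting simulation does at every visible surface point.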
This research into colour and light has shown how easy it is for our own preconceptions to intrude into the ways we view archaeological objects or sites. Our research is intended to reveal more about medieval perceptions by investigating the context of colour. This has led to a re-evaluation of how we ourselves look at pottery, the chosen medium for our research. One result of our inquiries may be to suggest new, or additional, ways of recording ceramic assemblages. The aim is to find methods of analysis that could take us beyond those typical questions of chronology and provenance. If the aim of archaeology is to provide insights into the lives of past individuals, communities and cultures then we need to show a greater respect for the things they have left behind and attempt more refined ways of understanding them. This project is an attractive mix of archaeological intuition and philosophy together with hard science. The proposition is even more exciting when viewed as a voyage of discovery which is bound to open up new lines of enquiry and thought.

5.8 Summary

This chapter has discussed the problems of representation and interpretation of archaeological datasets. Visualising a past environment is fraught with difficulties from the outset. The archaeological record itself can be subjective, the level of realism must be defined, motivation and audience should be established, the potential for misinterpretation needs to be minimised, and standards have to be set. If the archaeological information is placed in context and the decision-making process is outlined then the synthesised scenes can become meaningful and useful. It is only then that we can confidently begin to explore and investigate these virtual environments as a means of understanding the past.

5.9 Slides

Representation and Interpretation
• Archaeological illustration
• The idea of realism
• Representing for a purpose
• Misinterpretation
• Setting standards
• Developing new hypotheses

The remaining slides survive in this copy as titles only: Recording sites; Visualising the data; Case Study: Depicting Stonehenge; Terms and Concepts; Defining realism; The tangible referent; Context; Representing for a purpose; Misinterpretation;
Setting standards; Developing new hypotheses; Making it meaningful; Case study: Medieval pottery (three slides); Summary; Acknowledgements; Further information.

Acknowledgements

Many thanks indeed to Bob Stone, John Hodgson and Ramesh Raskar for providing some of the images and information referred to in this chapter, and for doing so in such willing and helpful ways.

Bibliography

[1] L. Adkins and R. A. Adkins. Archaeological Illustration. Cambridge University Press, Cambridge, UK, 1989.
[2] R. Arnheim. Art and Visual Perception. University of California Press, Berkeley, CA, second edition, 1974.
[3] J. A. Barcelo, M. Forte, and D. H. Sanders. Virtual Reality in Archaeology. Archeopress, Oxford, UK, 2000.
[4] J. Bateman. Immediate realities: an anthropology of computer visualisation in archaeology. Internet Archaeology, 8, 2000.
[5] J. Baudrillard. Simulacra and Simulations. University of Michigan Press, 1994.
[6] U. Becker. The Continuum Encyclopedia of Symbols. Continuum, New York, NY, 2000.
[7] D. H. Brown. Pots from houses. Medieval Ceramics, 1997.
[8] D. H. Brown and A. Chalmers. Light, perception and medieval pottery. Unpublished.
[9] K. A. Robson Brown, A. G. Chalmers, T. Saigol, C. Green, and F. d'Errico. An automated laser scan survey of the Upper Palaeolithic rock shelter of Cap Blanc. Journal of Archaeological Science, 28:283–289, 2001.
[10] A. Chalmers, S. Stoddart, J. Tidmus, and R. Miles. Insite: an interactive visualisation system for archaeological sites. In J. Huggett and N. Ryan, editors, Computer Applications and Quantitative Methods in Archaeology 1994, pages 225–228. BAR International Series 600, Archeopress, 1995.
[11] A. G. Chalmers, A. McNamara, S. Daly, K. Myszkowski, and T. Troscianko. Image quality metrics. In SIGGRAPH 2000 Course #44.
ACM SIGGRAPH, July 2000. [12] A. G. Chalmers, A. McNamara, S. Daly, K. Myszkowski, and T. Troscianko. Seeing is believing: Reality perception in modeling, rendering and animation. In SIGGRAPH 2001 Course #21. ACM SIGGRAPH, August 2001. [13] World Wide Web Consortium. Extensible markup language (xml). URL http://w3.org/XML/. [14] G. Currie. Image and Mind. Cambridge University Press, Cambridge, UK, 1995. [15] R. Daniels. The need for the solid modelling of structure in the archaeology of buildings. Internet Archaeology, 2, 1997. [16] K. Devlin and A. Chalmers. Realistic visualisation of the pompeii frescoes. In Alan Chalmers and Vali Lalioti, editors, AFRIGRAPH 2001, pages 43–47. ACM SIGGRAPH, November 2001. [17] M. Gillings. Engaging place: a framework for the integration and realisation of virtual-reality approaches in archaeology. In L. Dingwall, S. Exon, V. Gaffney, S. Laflin, and M. van Leusen, editors, Computer Applications and Quantitative Methods in Archaeology 1997, pages 187–200. BAR International Series 750, Archeopress, 1999. [18] E. H. Gombrich. The Image and the Eye. Phaidon Press Limited, London, UK, 1999. [19] N. Goodman. Languages of Art. Hackett Publishing Company, Indianapolis, Indiana, second edition, 1976. [20] R. L. Gregory. Eye and Brain. Oxford University Press, Oxford, UK, fifth edition, 1998. [21] C. Grout, P. Purdy, J. Rymer, K. Youngs, J. Williams, A. Lock, and D. Brickley. Creating Digital Resources for the Visual Arts: Standards and Good Practice. Oxbow Books, Oxford, UK, 2000. [22] II H. Eiteljorg. The compelling computer image - a doubleedged sword. Internet Archaeology, 8, 2000. BIBLIOGRAPHY [23] T. Hawkins, J. Cohen, and P. Debevec. Photometric approach to digitizing cultural artifacts. In 2nd International Symposium on Virtual Reality, Archaeology, and Cultural Heritage (VAST 2001), 2001. [24] J. Hodgson. Style and content: The effects of style on archaeological reconstruction. Graphic Archaeology, 1997. [25] J. Hodgson. Dewlish roman villa: Design brief. Design brief for reconstruction project, 2001. [26] Dublin Core Metadata Initiative. The dublin core metadata initiative. URL http://dublincore.org. [27] J. Kantner. Realism vs. reality: Creating virtual reconstructions of prehistoric architecture. In Virtual Reality in Archaeology. Archeopress, Oxford, UK, 2000. [28] G. Ward Larson and R. Shakespeare. Rendering with Radiance: The Art and Science of Lighting Visualization. Morgan Kaufmann, San Francisco, CA, 1998. [29] A. McNamara, A. Chalmers, and D. Brown. Light and the culture of medieval pottery. In Proceedings of theInternational Conference on Medival Archaeology, pages 207–219, October 1997. [30] A. McNamara, A. Chalmers, T. Troscianko, and I. Gilchrist. Comparing real and synthetic scenes using human judgements of lightness. In Proceedings of the 11th Eurographics Rendering Workshop, pages 207–219. Springer Verlag, June 2000. [31] A. McNamara, A. Chalmers, T. Troscianko, and E. Reinhard. Fidelity of graphics reconstructions: A psychophysical investigation. In Proceedings of the 9th Eurographics Rendering Workshop, pages 237–246. Springer Verlag, June 1998. [32] P. Miller and J. Richards. The good, the bad, and the downright misleading: archaeological adoption of computer visualisation. In J. Huggett and N. Ryan, editors, Computer Applications and Quantitative Methods in Archaeology 1994, pages 19–22. BAR International Series 600, Archeopress, 1995. 147 148 BIBLIOGRAPHY [33] B. Molyneaux. 
From virtuality to actuality: the archaeological site simulation environment. In Archaeology and the Information Age. Routledge, London, UK, 1992. [34] S. Piggott. Antiquity Depicted. Thames and Hudson, London, UK, 1978. [35] R. Raskar, G. Welch, K. Low, and D. Bandyopadhyay. Shader lamps: Animating real objects with image-based illumination. In S. J. Gortler and K. Myszkowski, editors, Rendering Techniques 2001, pages 89–102. Springer-Verlag, 2001. [36] P. Reilly. Towards a virtual archaeology. In K. Lockyear and S. Rahtz, editors, Computer Applications and Quantitative Methods in Archaeology 1990, pages 133–140. BAR International Series 565, Archeopress, 1991. [37] P. Reilly. Three-dimensional modelling and primary archaeological data. In Archaeology and the Information Age. Routledge, London, UK, 1992. [38] P. Reilly and S. Rahtz. Archaeology and the Information Age. Routledge, London, UK, 1992. [39] C. S. Rhyne. Computer images for research, teaching, and publication in art history and related disciplines. An International Journal of Documentation, XIL:19–51, 1995. [40] J. C. Roberts and N. Ryan. Alternative archaeological representations within virtual worlds. In Richard Bowden, editor, Proceedings of the 4th UK Virtual Reality Specialist Interest Group Conference, pages 179–188, Uxbridge, Middlesex, November 1997. [41] N. Ryan. Computer based visualisation of the past: technical ‘realism’ and historical credibility. In P. Main T. Higgins and J. Lang, editors, Imaging the past: electronic imaging and computer graphics in museums and archaeology, number 114 in Occasional Papers, pages 95–108. The British Museum, London, November 1996. [42] Archaeology Data Service. Creating and using virtual reality: a guide for the arts and humanities. URL http://ads.ahds.ac.uk/project/goodguides/vr/appendix2.html. BIBLIOGRAPHY [43] J. T. Smith. The validity of inference from archaeological evidence. In P. J. Drury, editor, Structural Reconstruction, pages 7– 19, Oxford, UK, 1982. BAR International Series 110, Archaeopress. [44] D. Spicer. Computer graphics and the perception of archaeological information: Lies, damned statistics and...graphics! In C. L. N. Ruggles and S. P. Q. Rahtz, editors, Computer Applications and Quantitative Methods in Archaeology 1987, pages 187–200. BAR International Series 393, Archeopress, 1988. [45] T. Troscianko, A. McNamara, and A. Chalmers. Measures of lightness constancy as an index of the perceptual fidelity of computer graphics. In European Conference on Visual Perception 1998, Perception Vol 27 Supplement, pages 25–25. Pion Ltd, August 1998. [46] G. Ward and E. Eydelberg-Vileshin. Picture perfect rgb rendering using spectral prefiltering and sharp color primaries. In (to appear) Eurographics 2002, September 2002. 149 150 BIBLIOGRAPHY Appendix A Included papers 151 LIGHT AND THE CULTURE OF COLOUR IN MEDIEVAL POTTERY by Duncan H. Brown, Alan Chalmers and Ann MacNamara Light is something we take for granted, that we create through the movement of a switch. Light is not something archaeologists can recover or record and consequently its importance is rarely considered in interpretations of past ways of living. Light is something that we, as students of the past, need to understand, for the introduction or the creation of light, and its use, facilitated the activities, rituals and lives of our ancestor societies. 
Light is fundamentally important, yet at present it is rarely considered as a medium for comprehending how people behaved or how they perceived their environment. The hard archaeological evidence is unrevealing. Windows remain as evidence for the way of introducing light; lamps, lanterns and candlesticks for the means of its creation. Yet there is a more subtle way of approaching this problem and that is to look at the objects we find, the clues to the ways in which buildings were decorated, in an effort to understand what sort of environment past peoples created for themselves. Their perceptions can be revealed by the appearance of the objects they used. This paper is specifically concerned with medieval pottery because that is my particular specialism but the philosophy behind this discussion should lend itself to the study of any type of object of any date. The basic premise is that the colours of medieval pots are related to the lighting conditions that medieval people were accustomed to. Some pots are brightly coloured and highly decorated, others are dull. This is related in part to vessel function but must also reflect the intended place of use and thus variations in lighting conditions. Furthermore, pottery colours may actually reflect a typical absence, rather than presence of light, or at least lighting at much lower levels than we are used to. This short paper therefore considers the ways in which medieval interiors were illuminated and how lighting conditions might affect the ways in which objects were perceived and designed. The basis of this discussion is a computer science research project which places medieval pots into simulated environments and introduces different types of illumination. This project is in its early stages but this is seen as an opportunity to show how even these preliminary developments can lead to new lines of enquiry of medieval ceramics. This paper therefore presents a philosophical discussion rather than hard evidence but in doing so hopefully suggests new lines of enquiry and thought. MEDIEVAL LIGHTING As has already been inferred, archaeological evidence for the creation of light is relatively rare. The ceramic evidence is perhaps most commonplace on excavations and a variety of lamps and candle-holders are known throughout the medieval period. These are generally portable types and are consequently small in size. It is difficult to envisage such objects being used to illuminate whole rooms. It is known, of course, that torches were extensively used and wall-brackets for these survive. Another source of light, and possibly a very important one, must have been the fires and braziers that were used to heat rooms. On this evidence it seems that the medieval interior must have been a flickering, smoke-beset world and perhaps this explains the bright colours used to decorate medieval objects and indeed rooms. The evidence for those is drawn primarily from manuscript illuminations and paintings, where furniture is often shown brightly painted, and walls display rich hangings. It may, however, be the case that most of these things were not meant to be seen at their best in artificial light. The hours of daylight regulated pre-industrial life and provided the fundamental means of illumination. In the present day, sunlight is almost irrelevant to the conducting of our lives; houses, offices, shops, factories are almost all permanently lit by artificial means. 
In the medieval period the sun was probably viewed as the only constant source of light and it is in the architecture that the best evidence for lighting can be found. Big windows provided lots of light but given the limited availability of window-glass they also brought draughts. Windows also created a security risk, as is shown most obviously in castles. Indeed the largest windows may be found in ecclesiastical buildings, those which one might presume to have been least threatened. This provokes the thought that the provision of light on through such grand openings was as much a signal of devotion to God as the building of the entire edifice. Light and holiness seem almost to be related. The way light was used within a church or cathedral may also, perhaps, reflect the controlling aspects of ecclesiastical architecture. Medieval pottery, however, was used more frequently in a domestic environment and the windows in houses are therefore more pertinent to this discussion. Here, window size might be related to status. The windows in surviving English peasant houses are generally small, leading to the conclusion that warmth was more important than light, perhaps because the rural lifestyle was in any case governed by the rising and the setting of the sun. Rural manor houses, and the homes of late medieval yeomen farmers exhibited larger windows, as did medieval town houses. In all instances the window provided light for the carrying out of daily activities such as weaving and sewing. In towns, where jettying of upper stories often brought houses within a few feet of each other, windows had to be larger but they also allowed those sitting at them to communicate with their neighbours, as well as people in the street. The window therefore played an important social role as well as a domestic one. It is clear that different dwellings gave different lighting conditions. One might therefore expect the most brightly coloured objects to bee associated with the best lit settings and this is, to some extent true. The most highly decorated types of pottery, for instance, are not found at sites of the lowest status. However, that does not tell us how such objects were perceived by those who used them but simply how much light might have been available to see them by. It is an understanding of medieval perception that is being sought here and the next section considers ways of looking for that, if not necessarily finding it. LIGHT AND COLOUR It was in the thirteenth and fourteenth centuries, the high medieval period, that English pottery was most elaborately decorated and given the brightest colours. There were dark grey pots, brown, red-brown, brick red, orange, pink and white ones, and those were not glazed. Medieval lead glazes tend either to draw their own colour from that of the clay beneath, thus a clear glaze on a white body appears bright yellow; or are themselves coloured by means of additives, so a bright green is created with the addition of copper. External glazing was almost certainly a form of decoration and it is hard to find any symbolic distinctions between colours in a medium which is inherently prosaic. At the same time, pottery played an increasingly important role in domestic life from the 12th century onwards and it seems to have been consumed at a high rate. The exuberant ceramic forms of the high medieval period coincided with many other cultural developments and at present it is the intention to concentrate on those types. 
The purpose is to examine the relationship between lighting conditions and the consumption of pots of a particular appearance by means of trying to find out how those pots might have looked in different settings.

PHOTO-REALISTIC VISUALISATION

Those settings are being created on a computer at the University of Bristol. A selection of pots of different shapes and colours is being computer-modelled and different environments and lighting conditions are being computer-simulated. The value of using a computer is the speed at which one can alter environments. Initially, it is necessary to simulate and measure the optical environment prevailing at the time. This will be achieved by using a controlled test environment, and simulating different conditions by filling it with smoke and dust. Optical measurements of the scene, including replicas of medieval pots, will be made under these murky conditions, using a hyper-spectral camera system developed at Bristol University for the purpose of scene analysis. Similar measurements will also be made of a restored medieval house in Southampton. The next goal is to be able to represent the scene on a high-quality stereoscopic computer display device. This will allow convenient inspection of the likely appearance of artefacts under conditions prevailing at their time of use, and it will be possible to examine the scene from different viewpoints. The fidelity of the reproduction process will be assessed by comparing human visual performance both in the original room and on the computer screen; if the optic arrays are similar in both cases then visual perception and discrimination would follow identical functions. Thus, it will be possible to use an operational measure of human vision to ensure the computer is representing the original scene with accuracy. Finally, the computer-based virtual environment will be used to assess the effects of smoke, dust and other particles on the appearance of a pot.

It is essential that any system for reconstructing and visualising ancient environments be as accurate and as flexible as possible, allowing archaeologists to alter the scene parameters in order to investigate different hypotheses concerning the structure and contents of a site (Chalmers and Stoddart, 1994). Over the last decade, computer graphics techniques have shown an astonishing increase in functionality and performance. Real-time generation of images has been made possible by implementing projective display algorithms in specialised hardware. With the latest graphical hardware systems it is now possible to walk through virtual environments and scenes and perform tasks within these environments. Although the image quality of projective methods is good enough for spatial impressions and for interaction in a virtual environment, it is not sophisticated enough for realistic lighting simulation. Only by simulating the physical propagation of light in the environment can something approaching photo-realism be achieved. In computer graphics, the illumination at any point in a scene can be determined by solution of the rendering equation (Kajiya, 1986). Unfortunately, the general form of this equation involves a complex integral over the entire environment and, as such, photo-realistic computer graphics techniques are only able to approximate the solution. Of those currently in use, the particle tracing method is able to approximate most closely all the lighting effects in a closed environment (Pattanaik, 1993).
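For reference, the rendering equation can be written in a standard hemispherical form (the notation below is a common textbook formulation rather than a quotation from Kajiya's paper):

    L_o(x, \omega_o) = L_e(x, \omega_o) + \int_{\Omega} f_r(x, \omega_i, \omega_o) \, L_i(x, \omega_i) \, (\omega_i \cdot n) \, d\omega_i

Here L_e is the radiance emitted at point x, f_r is the surface's bidirectional reflectance distribution function, L_i is the radiance arriving from direction \omega_i, and the integral runs over the hemisphere \Omega above the surface normal n. It is this integral over all incident directions, which itself depends recursively on the outgoing radiance of every other surface, that forces practical techniques such as particle tracing to approximate the solution stochastically.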
The particle tracing model follows the path of photons as they are emitted from the surface of the light sources and uses the reflected particle flux given by a large number of these particles per unit time as a measure of the illumination of points in the environment. In this way, the particle tracing method is able to simulate direct as well as indirect reflection, diffuse and specular reflection, and the effects of participating media such as flame, smoke, dust and fog (Lafortune and Willems, 1996). All these effects are essential for a physically accurate lighting simulation.

CONCLUSION

It may seem that computer simulation will never approach the authenticity of finding a medieval house, filling it with pottery and lighting fires, torches and lamps. However, the time and resources needed for doing that are very restrictive. On a computer it is possible to change the shape of a room, the colours of the walls, the colours of the pots, and the quantity of light and smoke, with relative ease. That facility will open up new lines of enquiry. The project is in its early stages but it is already revealing: observation so far shows that the appearance of the objects in a room changes with the colours of the walls and the angle of the light source. The constancy of the light source will also have an effect, hence the development of a flickering light and the introduction of smoke and dust particles. This project is an attractive mix of archaeological intuition and philosophy together with hard science. The proposition is even more exciting when viewed as a voyage of discovery which is bound to open up new lines of enquiry and thought.

BIBLIOGRAPHY

Chalmers, AG and Stoddart, SKF, 1994, Photo-realistic graphics for visualising archaeological adoption of computer visualisation. In Huggett, J and Ryan, N, 'Computer applications and quantitative methods in archaeology', pp 19-22, Tempus Reparatum, Glasgow.

Kajiya, JT, 1986, The Rendering Equation, in ACM Computer Graphics, 20 (4), pp 143-150.

Lafortune, E and Willems, Y, 1996, Rendering participating media with bidirectional path tracing, in Pueyo, X and Schroeder, P, 'Seventh Eurographics Workshop on Rendering', pp 92-101, Oporto.

LIGHT, COLOUR, PERCEPTION AND MEDIEVAL POTTERY
Duncan H. Brown and Alan Chalmers

This paper begins with a discussion of some of the issues that need to be acknowledged by archaeologists studying the use and perception of colour in past societies. This is followed by a case study that, it is hoped, further illuminates the subject, but is something of an interim statement on a continuing research project, where computer graphics are used to re-create medieval lighting effects. The purpose is to reach some understanding of how objects, specifically pottery vessels, might have appeared in a medieval environment. The context given to the whole of this paper is therefore that of medieval England, although the general principles should be applicable throughout archaeology.

LIGHT

It is difficult to appreciate colour without understanding light, for without light there is no colour. In attempting to understand any significance particular colours might have had in the past, it is therefore necessary to overcome some of our own preconceptions. We, after all, are used to illuminating our environment simply by flicking a switch, but the types of light cast by electric sources are very different to those experienced by our medieval predecessors.
Each different light source has its own spectral profile: some lights are more red, for instance, and others more blue. This is visible to us as ‘warm’ or ‘cold’ light. Undergraduate research by Natasha Chick, of the Department of Computer Science at Bristol University, compared the spectral properties of various light sources by converting spectroradiometer data to RGB format and taking readings from a MacBeth colour chart. At the yellow square of the chart an animal fat candle gives relative RGB readings of 0.759 red, 0.24 green and 0.001 blue. A 55-watt bulb, by comparison, shows as 0.524 red, 0.345 green and 0.131 blue (Chick, 2000). This demonstrates that an electric bulb gives a much greater blue cast than a tallow flame, which is more red. The results are shown visually in figures 1 and 2. Light sources also differ widely in the amount of light produced, so that there are bright or dim lights. Flames were the usual means of introducing light in the past, and these have the extra dimension of flicker, thus creating patterns and moving shadows that further affect how objects might have looked. We are not used to looking at objects in conditions where light sources are dim, generally of a red cast and also moving. Nor indeed, once they have been removed from the ground, are we used to viewing artefacts only in daylight.

The role of sunlight would seem less important when studying objects that were mainly, as far as we understand, used indoors, but in this we might be mistaken. Medieval daily life was governed by the rising and setting of the sun, as documents such as books of hours demonstrate. The creation of light was a costly business, especially through the use of candles, which were beyond the means of many, and most people would not have stayed up much beyond nightfall. Some medieval buildings, such as churches and shops, were constructed with large windows that were designed to make the most of sunlight, allowing visitors a good view of what was being offered, for there is no doubt that most of the public rituals or transactions enacted at these places took place during daylight hours. Others, such as castles and rural dwellings, contained small windows that served to provide illumination without compromising the requirements of protection, either from people or the elements. The provision of daylight was equally crucial in these places, however, and in the rural context at least, once the sun had gone activities and movement were restricted. In any event, the quality of daylight available must have affected the appearance of the objects used indoors and again, in our well-glazed age, the variations in the provision of light are beyond our immediate experience and understanding. The ways in which we view, perceive and understand objects are governed by current lighting methods (electric light and large windows) but in order to understand how objects were viewed and understood in the past we must consider how they were illuminated.

COLOUR

The pottery used in England throughout the medieval period was mostly earthenware and may be divided into white-firing and red-firing types. Pots made from white-firing clays appear white to buff in colour but were usually covered, completely or partially, in a lead glaze. Clear lead glazes give a yellow to amber appearance to white pottery but the addition of copper creates green, which can vary from dark through olive to bright apple hues. In general, the most brightly coloured medieval pottery was made from white-firing clays.
Vessels made from iron-rich, red-firing clays range in colour from brick red to dull brown when fired in oxidising conditions. Clear lead glazes can give a red-orange appearance to red wares while lead glazes with added copper produce a variety of greens generally less vibrant or consistent in colour than those seen on white earthenware. Red clays fired in an oxygen-free atmosphere range in colour from grey through to black and in such conditions an ordinary lead glaze will turn slightly green or greenish-clear. In short, medieval pots were given a variety of colours, including white, red, brown, yellow, orange, green, grey and black. The appearance of vessels was also enhanced, and the colour effect slightly altered, by decoration such as painted lines, coloured slips and applied clays.

‘So what?’ may be the first question that springs to mind after reading that. What, if anything, does this information tell us about the medieval period? It is possible to quantify medieval pottery assemblages by colour, but is that a useful thing to do? Did the colours of medieval pots have any meaning? The range of colours produced in pottery at least tells us something about ceramic technology. Aspects of kiln construction and management, clay exploitation and mineral use are all illuminated through an understanding of how pots were made in the colours that they were. Also revealed is the importance of colour in medieval society. The most colourful and highly decorated vessels were jugs, normally identified as tableware, used in the serving of liquids, especially wine. It is possible that these were used at high-profile dinners. The dullest vessels, unglazed and undecorated, were those used in low-visibility activities such as cooking. Colour was therefore an important element in the culture of display that prevailed in medieval society, at least in its upper echelons. In these terms it is useful, therefore, to consider quantifying pottery assemblages by colour, as the quantities of highly decorated and plain pots might be related to status: high numbers of gaudy vessels might indicate a household where presentation was important.

It may be possible to put meanings to colours but one might be wary of doing so in the case of pottery. After all, the range of colours is partly determined by the available technology, in terms of glaze and mineral use and kiln structure. Such a view, of course, excludes the element of human decision-making, and if we accept that medieval potters gave pots the colours their customers wanted then we must ask why it was that those colours appealed. An analysis of colour symbolism may not appear very helpful, however. The most common pottery was coloured red-brown. In medieval art, brown was associated with mourning and red with God, while green, the most common colour for glazed pottery, signified the Holy Spirit and therefore also the bishops (Becker, 2000). Yellow was regarded as a substitute for gold, which denoted heaven (ibid). The next step, of course, is to ask whether medieval potters were fully aware of the symbolism in art. The same colours have meanings in folklore that are not entirely consistent with those of medieval Christianity. Brown is of course the colour of the earth (ibid) and its association with mourning is perhaps related to the notion that humans are earthly beings who, in death, revert to clay. Red is the colour of blood and, perhaps more pertinently with regard to pottery, of fire, and in its positive manifestation signifies life, love, warmth and passion.
Its negative aspect is that of destruction and war (ibid). Green was also related to life, through its association with plants and the shoots of springtime as well as water. It thus symbolised hope and longevity (ibid). There is more to go on here, and it is easy to create associations between pots of certain colours and the offering, presumably through their contents, of valued qualities such as life, warmth and longevity. Yellow is more of a problem, however, as in the Middle Ages it was the colour of envy. It may be safer, therefore, to look upon it as a substitute for gold, and here is the crux of the matter. For it is we who are now doing the looking, thus creating our own understandings of the colours we see.

PERCEPTION

There is no guarantee that the green pot we now look upon was also understood to be green by medieval people. Or to put it another way, although they may have known it to be green, they might have appreciated it as something else. There is no doubt that colour can transmit mood and meaning, and medieval folk may have read their own particular meanings into pots of certain colours. The use of green, for instance, might have signified its quality and its suitability for certain functions; furthermore, the application of slipped decoration might even have indicated those uses to which a pot should not be put. We come, here, to a further issue. If there is a gap between that which we perceive and what medieval people understood, there was also a gap between what the creator of a pot understood and the perceptions of the consumer. This may be true, and perhaps more readily recognised, in terms of function as well as appearance. A pot that the potter considered to be an excellent wine jug may have been bought for use as a piss-pot. A simple example of differences in perception may be that of a vessel made in Muslim Spain and imported into Christian England. It is likely that any significance or message in the decoration became irrelevant in the transfer from one culture to another. The same may therefore be true of colour. This discussion, however, is focused on the perceptions of the users, rather than the makers, of medieval pottery and here again we must recognise how discrepancies might creep in. It is unlikely, at certain levels of medieval society, that domestic goods were acquired directly by those householders who kept servants. The pots we recover from their dwellings may not, therefore, accurately reflect the tastes of the principal occupants. It is, however, reasonable to assume that householders would ensure that they used objects they liked the look of. Once we acknowledge taste as a factor in deciding which objects to use then we raise the question of perception, for no colour, shape or decorative design is actually worse or better; it is only perceived to be so. Cultural, social, political and economic attitudes will be brought to bear in determining what is acceptable, and may be expressed in the symbolism of artefact appearance and use and domestic ritual. Here again, our own preconceptions will interfere. The framework for our own perceptions is quite different from that for medieval society, which will remain, at best, difficult to comprehend. Certain clues survive, and the significance of colour in medieval culture is one of them.
There is no physical difference between humans in the present and the medieval past in the actual mechanisms for perceiving, the eye and the brain, so changes in our perceptions must be explained culturally and emotionally rather than clinically.

POTTERY

Issues of colour and perception in pottery have already been raised and it is clear that we must understand the relationships between pottery and those who used it. In the first place, it is not always clear who those users were, or at least it is clear that in many medieval households several different users were involved. In the townhouse of a wealthy burgess, such as that which provides the setting for the case study set out below, there was a hierarchy of individuals and their roles. It seems reasonable, therefore, that different types of vessel were also fitted into that scheme. At the lowest level, ceramically, we may place vessels used in food preparation. Medieval cooking pots were simple forms, cheaply made and cheap to buy. Their use would have been confined to the kitchen and scullery areas and cooking pots would probably not have appeared at high table or been considered for display. These vessels were of unglazed earthenware, usually buff, red-brown or grey in colour. Unglazed, or partially glazed, jugs were also produced and these may have been used in the same areas of the house. It is most likely that domestic servants used pottery bought for storage and cooking and, as far as the householder was concerned, the appearance of that pottery may have been largely irrelevant. There seems, therefore, to be little perceptual influence in the acquisition and use of kitchenwares and this is borne out by the fact that in most excavated medieval assemblages cooking pots and jars are invariably of local origin (Brown 1997) and have a uniform appearance. Jugs, however, the medieval vessel type we commonly characterise as tableware, were derived from a wider variety of sources, even at humble farmsteads (ibid), and issues of taste and display must be considered. Rich glazes and elaborate decoration suggest use in public ways, certainly outside the confines of the kitchen area. A penchant for display is known to have permeated most levels of medieval society and ceramics would have been a cheap way of assuaging such a requirement. It is important to recognise that pottery could never compete with glass or metal as a medium for showing off but it is easy to identify highly decorated vessels as intended for use at table. Here, perhaps, aspects of colour become more important. The use of glaze allows a greater degree of consistency in the colouring of pottery and it must be assumed that this was done deliberately to appeal to the tastes of prospective customers. If, therefore, a domestic assemblage yields tableware of a certain range of hues, then it seems reasonable to suggest that those were the colours that the users preferred. The conclusion, therefore, is that the jug is a suitable focus for research into colour and perception in medieval pottery, at least at this initial stage. Jugs present the widest range of shapes, decorative designs and colours of any ceramic form in medieval assemblages; they were probably used in a greater variety of domestic situations and might, to a higher degree, reflect the tastes of consumers.

CONTEXT

The first question that needs to be answered in this consideration of colour and perception must be this: what did medieval pots look like in their original setting?
That question has been addressed by research conducted at the Department of Computer Science at Bristol University, where a computer model of the hall of a medieval town house has been constructed. The model is based on the Medieval Merchant’s House museum in Southampton, a half-timbered structure renovated by English Heritage as accurately as possible to represent a 13th-century dwelling of some economic status. The modelling was carried out by Ann McNamara as part of her post-graduate research into visual perception and computer graphics. Three 13th-century jugs were chosen from the archaeology collections of Southampton City Council and these have been modelled, not to a very high degree of accuracy as yet, and inserted into the environment. Realistic renderings of medieval-type light sources have not yet been introduced, although Natasha Chick’s research now makes that possible. Figure 3 shows a view of the hall with the jugs placed on the main table. This scene is an accurate depiction of the actual building; the wall hanging and the furniture are pieces currently on display. The three jugs are a local redware baluster, red-brown in colour with a partial greenish-clear glaze; a Dorset white ware jug with an overall yellow glaze, decorated with dark brown applied vertical stripes and lines of pellets; and a Saintonge white ware vessel with an overall bright green glaze. They are contemporary, dateable to around 1270-1300, and it is possible that similar vessels could have been in use in a Southampton household at the same time. The rendering in figure 3 is lit with generic lighting, comparable with daylight, rather than a specifically modelled light source. Figure 4 shows the same scene lit from a central hearth with two candles on the table. The later fireplace and chimney shown in figure 3, which are still in place in the actual building, have been removed from the rendering in figure 4, thus giving a more accurate model of the 13th-century hall. The next steps will be to illuminate the scene with accurately rendered light sources, such as tallow flames, then to introduce atmospheric pollutants, such as smoke and dust, that will affect the behaviour of the light. More accurate models of the pots will probably be best achieved through the use of a high-resolution laser scanner.

Once a computer model has been generated, however, it is necessary to ascertain that the viewer’s perception of what is visible on a monitor equates with how an actual scene would be perceived. In other words, how ‘real’ is a computer-generated environment? The aim of realistic image synthesis is the creation of accurate, high-quality imagery that faithfully represents a physical environment, the ultimate goal being to create images that are perceptually indistinguishable from an actual scene. Advances in image synthesis techniques allow us to simulate the distribution of light energy in a scene with great precision. This does not, unfortunately, ensure that the displayed image will have a high-fidelity visual appearance. Reasons for this include the limited dynamic range of displays, any residual shortcomings of the rendering process, and the extent to which human vision encodes such departures from perfect physical realism. Computer graphics techniques are increasingly being used to reconstruct and visualise features of cultural heritage sites that may otherwise be difficult to appreciate.
While this new perspective may enhance our understanding of the environments in which our ancestors lived, if we are to avoid misleading impressions of a site, then the computer-generated images should not only look "real", but there should be a quantifiable metric of image fidelity by which this "realism" can be measured. Computational image quality metrics may be used to provide quantitative data on the fidelity of rendered images; however, if such images are to be used to investigate visual perception of environments by humans then they must include an understanding of the features of the Human Visual System (HVS). The HVS comprises many complex mechanisms that work in conjunction with each other, making it necessary to consider the HVS as a whole rather than study each function independently. Psychophysical experiments can be used to investigate the human perception of scenes. Subjects are presented with a (random) selection of images produced on a viewing device and the same real-world scenario. Interactive responses and detailed questionnaires are used to evaluate the perceptual quality of the rendered images. Outcomes from these psychophysical experiments then feed back to further improve the rendering process. Preliminary results, based on work Ann McNamara carried out at Bristol, show that there is a close correlation between perceptions of actual and computer-generated scenes (McNamara et al, 2000).

We are a long way from a satisfactory model but this exercise has already thrown up a number of issues. Computer modelling offers a far more flexible approach. It is not possible to occupy the actual building, light a fire, position candles and torches and record the results, and even if it were, it would still not be feasible to remove the existing fireplace and chimney-breast. In a computer-generated environment we are able to remove extraneous structures, change the size and colour of wall-hangings, even change the colours of the walls. We can increase or diminish the size of the hearth and alter its position and we can move other illuminants around as we wish, placing them on furniture or anywhere on the walls. It may also be possible to recreate vision defects, such as short-sightedness, that were left uncorrected in the 13th century. These computer-generated images are far from ideal but figures 3 and 4 serve to illustrate the value of this approach. This work has also confirmed the necessity of understanding how objects looked in their original contexts if we seek any insight into past perceptions. It is already apparent that many pots would have looked brightest when lit from above. Only the top halves of some jugs are glazed and decorated, and this is perhaps indicative of how they were illuminated in use, perhaps by daylight through windows or perhaps from torches hung on walls. The purpose of this research is the revelation of detail such as this and there is great potential to go deeper into medieval ways of living.

CONCLUSIONS

The research described above has a long way to go yet, but certain points have already been raised and are perhaps worth summarising here. In searching for insights into the ways past people perceived their environments and the objects they interacted with, it is important to consider how they illuminated their lives. Even more crucial is the relationship between light and colour. Colours will change in appearance according to the types of light source present; yellow, for instance, is especially affected by the RGB balance of certain illuminants.
The recreation of medieval lighting conditions is therefore seen as a vital step in comprehending attitudes to colour and, eventually perhaps, shape and decoration. If there is any symbolic meaning in the use of colour on pottery then this might be revealed through an exploration of medieval perception, through the recreation of a medieval environment. The modelling of a ‘realistic’ environment, through the application of computer science and psychophysics to computer graphics, is perceived to be the most far-reaching and flexible way of exploring human perceptions in the past. Here, however, the ways in which a computer-generated image might be perceived also need to be understood. A broader issue, therefore, is how all of us read our experiences. This research into colour and light has shown how easy it is for our own preconceptions to intrude into the ways we view archaeological objects or sites. It is rare for archaeologists to attempt to see things differently (ask the protesters at Sea Henge) and the ways in which we recover details of the past, through the recording of demonstrably secure data such as composition or dimensions, are designed to facilitate research into human activity rather than human thought. We all take too much for granted, certainly in the ways we consume but also in the ways we interpret. Our research is intended to reveal more about medieval perceptions by investigating the context of colour. This has led to a re-evaluation of how we ourselves look at pottery, the chosen medium for our research. One result of our inquiries may be to suggest new, or additional, ways of recording ceramic assemblages. The aim is to find methods of analysis that could take us beyond those typical questions of chronology and provenance. If the aim of archaeology is to provide insights into the lives of past individuals, communities and cultures then we need to show a greater respect for the things they have left behind and attempt more refined ways of understanding them. How do archaeologists perceive the past and objects from the past, as archaeologists or as human beings; and why might there be a difference between the two? If nothing else, our research has raised such questions, and may help to answer them.

BIBLIOGRAPHY

Becker, U, 2000, ‘The Continuum Encyclopedia of Symbols’, Continuum.

Brown, DH, 1997, 'Pots from Houses', Medieval Ceramics 21, 83-94.

Chick, NJ, 2000, ‘Realistic visualisation of medieval sites’, Final year project dissertation, University of Bristol.

McNamara, A, Chalmers, AG, Troscianko, T and Gilchrist, I, 2000, ‘High Fidelity Image Synthesis’, 11th Eurographics Workshop on Rendering, Brno, June 2000.

CAPTIONS

Figure 1: Computer-generated environment with MacBeth chart, lit by tallow flame (created by Natasha Chick).

Figure 2: Computer-generated environment with MacBeth chart, lit by 55-watt electric light (created by Natasha Chick).

Figure 3: Computer-generated model of the hall of the Medieval Merchant’s House, French Street, Southampton, lit by a generic approximation of daylight (created by Ann McNamara, now of Trinity College, Dublin).

Figure 4: Computer-generated model of the hall of the Medieval Merchant’s House, French Street, Southampton, lit by a generic approximation of firelight (created by Ann McNamara, now of Trinity College, Dublin).
The following paper is a reprint of:

A Photometric Approach to Digitizing Cultural Artifacts
Tim Hawkins, Jonathan Cohen, Paul Debevec
University of Southern California Institute for Creative Technologies

Published at: 2nd International Symposium on Virtual Reality, Archaeology, and Cultural Heritage, David Arnold, Alan Chalmers, and Dieter Fellner, co-chairs, Glyfada, Greece, November 2001. http://www.eg.org/events/VAST2001/

A Photometric Approach to Digitizing Cultural Artifacts
Tim Hawkins, Jonathan Cohen, Paul Debevec
University of Southern California Institute for Creative Technologies¹

¹ USC Institute for Creative Technologies, 13274 Fiji Way 5th floor, Marina del Rey, CA, 90292. Email: {timh,paul}@ict.usc.edu.

ABSTRACT
In this paper we present a photometry-based approach to the digital documentation of cultural artifacts. Rather than representing an artifact as a geometric model with spatially varying reflectance properties, we instead propose directly representing the artifact in terms of its reflectance field – the manner in which it transforms light into images. The principal device employed in our technique is a computer-controlled lighting apparatus which quickly illuminates an artifact from an exhaustive set of incident illumination directions and a set of digital video cameras which record the artifact’s appearance under these forms of illumination. From this database of recorded images, we compute linear combinations of the captured images to synthetically illuminate the object under arbitrary forms of complex incident illumination, correctly capturing the effects of specular reflection, subsurface scattering, self-shadowing, mutual illumination, and complex BRDFs often present in cultural artifacts. We also describe a computer application that allows users to realistically and interactively relight digitized artifacts.

Categories and subject descriptors: I.2.10 [Artificial Intelligence]: Vision and Scene Understanding - intensity, color, photometry and thresholding; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism - color, shading, shadowing, and texture; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism - radiosity; I.4.1 [Image Processing and Computer Vision]: Digitization and Image Capture - radiometry, reflectance, scanning; I.4.8 [Image Processing]: Scene Analysis - photometry, range data, sensor fusion.

Additional Key Words and Phrases: image-based modeling, rendering, and lighting.

1 Introduction
Creating realistic computer models of cultural artifacts can aid their study by remote parties as well as serve as an improved record of the artifact for archival purposes. While a photograph faithfully records an artifact’s appearance from a single point of view in a particular lighting environment, it can be far more informative to be able to see the artifact from any angle and in any form of lighting - allowing a scholar to give it greater scrutiny as well as the general observer to see the artifact in its natural environment.

The standard computer model of a cultural artifact consists of a geometric surface model covered by a texture map which represents the artifact’s spatially varying diffuse reflectance properties. Sometimes, a specular component is added to render the shininess, often set to a single representative value for the object. An artifact’s geometry is most commonly acquired using active sensing such as laser scanning or structured light, and the texture maps are usually constructed using a simplified form of reflectometry: lighting the object from a known direction with a calibrated light source, and then, using an estimate of the object’s surface orientation, determining the spatially varying albedo of the object. This basic approach has produced numerous high-quality records and renderings of cultural artifacts such as statues, vases, and interior environments. Unfortunately, this traditional approach is difficult to apply to a large class of cultural artifacts – ones that exhibit complex reflectance properties such as anisotropy or iridescence, ones that exhibit significant self-shadowing or mutual illumination, ones that exhibit significant subsurface reflection, objects that are highly specular or translucent, and objects with intricate surface geometry. We can illustrate these difficulties with several examples.

A fur headband would be difficult to digitize since it does not have a well-defined surface – the stripe of a laser scanner or video projector would scatter through the fur rather than drape over it, and the surface reconstruction algorithm would have difficulty reconstructing a surface from the data. Even if the fur headband’s geometry could be captured, it could be impractical to represent the geometry of tens of thousands of individual hair segments, and to subsequently realistically render the effects of this fur being illuminated. Similar challenges exist for digitizing many types of clothing as well as for human hair.

A small jade sculpture would exhibit significant subsurface scattering - light hitting it from behind would cause it to light up from the front, and light striking the front would penetrate the surface a considerable distance. This could complicate the range scanning process, but would pose an even more significant problem for reflectometry: current techniques neither estimate nor represent subsurface scattering properties of an object. A computer rendering of the jade sculpture might more closely resemble painted green rock than luminous jade.

An intricately carved ivory sculpture with complex internal geometry would be challenging to digitize due to both significant self-shadowing and mutual illumination. Scanning its geometry would be complicated by the self-shadowing; narrow crevices are difficult to record with triangulation-based scanning methods in which each surface point must be visible to both the source of light and the image sensor. Furthermore, the complexity of the geometry could complicate the range map merging process, which works best when there are relatively coherent surface sections. Mutual illumination – the fact that light will bounce between surfaces inside the object’s reflective concavities – would complicate the reflectometry stage since it becomes very difficult to reliably control the illumination incident on any particular surface. The significant mutual illumination will also complicate rendering, requiring expensive global illumination algorithms to produce images that correctly replicate the appearance of the original object.

A polished silver necklace, encrusted with rubies, diamonds, and abalone, would also be difficult to digitize. The low diffuse reflectance of the silver and the internal reflections and refractions in the rubies and diamonds would make geometry capture with structured light or laser scanning impractical.
It would be hard to measure the reflectance of the reflective surfaces since most reflectometry techniques are better suited to materials with a significant diffuse reflection component. The translucence of the rubies and diamonds would make modeling their reflection characteristics difficult, and the reflectance of the iridescent abalone would be too complex to represent with most currently available reflectance models.

In summary, a large class of cultural artifacts exhibiting complex geometric and reflectance properties cannot be effectively digitized using currently available techniques. This poses a significant problem for the application of computer graphics to cultural heritage, as many of the materials and designs used by craftspeople and artisans are specifically chosen to have complex geometry or to reflect light in interesting ways. In this paper, we show that an alternative approach based on capturing reflectance fields [4] of cultural artifacts can acquire, represent, and render any of the above artifacts just as easily as it could a clay jug or granite statue, with simple acquisition and rendering processes, and can produce photorealistic results. The technique is data-intensive, requiring thousands of photographs of the artifact, and as currently applied allows only for relatively low-resolution renderings. As such we discuss its advantages and disadvantages over current techniques as well as potential hybrid methods.

In our proposed technique the artifact is photographed under a dense sampling of incident illumination directions from a dense array of camera viewpoints. The device we use to acquire this dataset is a light stage consisting of a semicircular rotating arm with an array of strobe lights capable of illuminating an object placed at its center from up to a thousand different directions covering the entire sphere of incident illumination. Images taken with digital cameras are compressed together into a single multi-dimensional dataset comprising the object’s reflectance field [4], which characterizes how the object transforms incident illumination into radiant imagery. Renderings of the object can then be created under any form of illumination - such as the light in a forest, a cathedral, a museum, or a candlelit hut - by taking linear combinations of the images in the reflectance field dataset. No geometric model of the object is required, and the resulting renderings capture the full complexity of the object’s interaction with light, including self-shadowing, mutual illumination, subsurface scattering, and translucency, as well as non-Lambertian diffuse and anisotropic specular reflection. In this paper we apply this process to a number of Native American cultural artifacts including an otter fur headband, a feathered headdress, an animal-skin drum, and several pieces of neckwear and clothing. We show these artifacts illuminated by several real-world natural lighting environments, and describe a software program for interactively re-illuminating artifacts in real time. We also describe the reflectance field capture process and the equipment involved. We also propose using view interpolation to extrapolate a discrete set of original viewpoints of an artifact to arbitrary novel viewpoints in conjunction with the reflectance field capture technique.
The central contribution of this paper is to demonstrate that the reflectance field capture technique, introduced in the context of rendering human faces in [4], provides particular advantages over current digitization techniques when applied to cultural artifacts. Since artifacts can have stronger specular components than human skin, we also show that capturing shiny artifacts requires the acquisition of high dynamic range [5] reflectance field image data. We present an interactive program for visualizing re-illuminated reflectance fields, and suggest how view interpolation could be used to continuously vary the viewpoint of an artifact from a discretely sampled set of viewpoints. Finally, we suggest improvements to the light stage apparatus specifically for acquiring complete view-independent models of cultural artifacts.

2 Background and Related Work
Current leading techniques for digitizing three-dimensional cultural artifacts involve acquiring multiple range scans of the artifact, assembling them into a complete geometric surface model, and then using a form of reflectometry to derive lighting-independent texture maps for the artifact’s surfaces. Range scans are acquired most commonly through laser-stripe scanning, by projecting patterns of light from a video projector, or through illumination-assisted stereo correspondence. Individual range scans can be aligned to each other using the Iterated Closest Points (ICP) algorithm [23] and merged using either polygon zippering [19] or volumetric range merging [2]. A sampling of recent projects which have used these techniques to derive geometric models of cultural artifacts includes IBM Watson’s Florentine Pietà Project [17], Stanford’s Digital Michelangelo Project [11], Electricité de France’s Cosquer cave (1994) and Colossus of Ptolemy (1997) projects, the Canadian National Research Council’s museum artifact scanning work [1], and work to scan vases [16] at the Istituto di Elaborazione dell’Informazione in Pisa.

Reconstructing an artifact’s appearance - not just its geometry - is the second stage of the digitizing process. Some techniques (e.g. [6], [22]) directly project photographs of the object onto the artifact’s geometry to form diffuse texture maps; this technique has the advantage that the renderings will exhibit realistic, pre-rendered shading effects including self-shadowing and mutual illumination, but has the disadvantage that the lighting is fixed according to the conditions in the original imagery. For environments it is sometimes acceptable to have static lighting; for artifacts static lighting is generally less acceptable since the directions of incident illumination on an artifact change as the artifact is rotated and there is often a desire to visualize an artifact as it would be seen in different illumination environments. Deriving lighting-independent texture maps for artifacts has been done in several projects. One set of techniques lights an object from one or more directions and uses the geometric model to estimate the surface’s diffuse albedo for all points on the surface. [13] solved for the spatially varying diffuse reflectance properties across a diffuse object using different lighting and observation directions. Related techniques are used in [17] and [11] to derive illumination-independent texture maps for marble statues. [1] uses the intensity return of its collimated tri-colored laser scanner to derive color lighting-independent texture maps. Recovering specular object properties has been investigated in [18].
These current techniques have produced excellent 3D models of artifacts with well-defined surface geometry and generally diffuse reflectance characteristics. However, these current techniques for model acquisition are difficult or impossible to apply to a large class of artifacts exhibiting complex surface microstructure, spatially varying specular reflection, complex BRDFs, translucency, and subsurface scattering. As a result, artifacts featuring silver, gold, glass, fur, cloth, jewels, jade, leaves, or feathers are very challenging to accurately digitize and to convincingly render.

Recent work [4] building upon related image-based rendering techniques [9, 15, 21] presented a technique for creating relightable computer models of human faces without explicitly modeling their geometric or reflectance properties. Instead, the face was illuminated from a dense array of incident illumination and a set of digital images was captured to represent the face’s reflectance field. The images from the face’s reflectance field were then combined together in order to produce images of the face under any form of illumination, including lighting environments captured from the real world as in [3]. [12] applied this technique in rendering cultural artifacts exhibiting diffuse reflectance properties. In this paper, we show that this technique can be applied to the digitization of cultural artifacts exhibiting any geometric or reflectance properties including those that are traditionally difficult to model and render. We discuss issues which arise in applying these techniques to capturing cultural artifacts, including the need to acquire image data sets using high dynamic range photography [5] to properly capture and render specular highlights. We furthermore describe an interactive lighting tool that allows artifacts to be re-illuminated by a user in real time, and propose image-based rendering techniques that will allow an artifact to be manipulated in 3D as well as being arbitrarily illuminated. In this work we use a collection of Native American clothing and jewelry to demonstrate the possibilities of the technique.

3 Dataset Acquisition

Figure 1: The Light Stage illuminates an artifact (center) from a dense array of incident illumination directions as its appearance is recorded by high-speed digital video cameras (one can be seen in the upper left). This quarter-second photographic exposure shows several of the lights on at once, although only one strobe light flashes at any given time.

The data acquisition apparatus used in this work consists of a semicircular arm three meters in diameter that rotates about a vertical axis through its endpoints. Attached to the arm are twenty-seven evenly spaced xenon strobe lights, which fire sequentially at up to 200 Hz as the arm rotates around the subject. The arm position and strobe lights are computer-controlled, allowing the strobes to synchronize with high-speed video cameras. We have used two models of high-speed video cameras in this work. The first is the Uniq Vision UC-610; it has a single image sensor of 660 by 494 pixels and can run asynchronously at up to 110 frames per second, producing digital output to the computer. The second camera is a Sony DXC-9000, which has separate 640 × 480 image sensors for red, green, and blue channels and which runs progressively at 60 frames per second. The Sony camera produces sharper images and more vibrant colors, although it has analog video output which must be digitized by the computer.
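For readers unfamiliar with the data layout, the captured dataset described in the next paragraph is organised as 64 arm positions (longitude) by 27 strobe positions (latitude). The following sketch simply enumerates such a grid of incident lighting directions as unit vectors; the angular spacing, pole handling and variable names are illustrative assumptions, not the authors' exact light stage geometry.

    import numpy as np

    # Sketch only: a 64 x 27 longitude-latitude grid of incident lighting directions.
    N_LON, N_LAT = 64, 27                                        # arm positions x strobes

    lon = np.linspace(0.0, 2.0 * np.pi, N_LON, endpoint=False)   # arm rotation angles
    lat = np.linspace(-np.pi / 2, np.pi / 2, N_LAT)              # strobe elevation angles

    phi, theta = np.meshgrid(lon, lat, indexing="ij")            # each of shape (64, 27)
    directions = np.stack([np.cos(theta) * np.cos(phi),          # x
                           np.cos(theta) * np.sin(phi),          # y
                           np.sin(theta)], axis=-1)              # z -> shape (64, 27, 3)

    # Each of the 1,728 images in a reflectance field capture corresponds to one
    # entry of this grid: image[i, j] was lit from directions[i, j].
    print(directions.shape)   # (64, 27, 3)

The same (longitude, latitude) indexing is what allows a resampled lighting environment to be lined up with the captured images later in the paper.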
Figure 1 shows the light stage capturing the reflectance field of a Lakota Native American headdress; the capture process takes approximately fifteen seconds to acquire 1,728 images of the artifact. The 1,728 images are arranged as an array of 64 directions of longitude corresponding to the rotation of the arm and 27 directions of latitude corresponding to the individual strobe lights. An alternative low-cost light stage apparatus presented in [4] consists of a single traditional light source on a two-axis rotation mechanism that spirals around the artifact, beginning at the north pole and spiraling down along the surface of a sphere to the south. This alternative device is less expensive to construct, but the device we present in this current paper allows the lights to be positioned with greater precision and repeatability and the datasets to be acquired significantly more rapidly, and with much less work.

Figure 2 shows a captured reflectance field dataset of the headdress. Images near the top of the figure are illuminated from above; images in the center of the figure are illuminated from straight ahead, and those to the right of the center are illuminated from the right. Images at the far left and right of the figure are actually illuminated from various directions behind the artifact; these images are important to capture since they show how light grazes along the sides of the artifact and shines through the artifact’s translucent areas. The actual dataset taken by the apparatus is of considerably higher resolution; the figure shows just every fourth image in both the horizontal and vertical dimensions.

For artifacts exhibiting relatively shiny reflectance, it becomes necessary to record the reflectance field dataset in high dynamic range – with imagery that can capture a greater ratio between light and dark areas than single exposures from the video cameras. For this we can employ a high dynamic range image acquisition method as in [5] to combine both under- and over-exposed images taken from the same viewpoint in the same lighting into images that capture the full dynamic range of the artifact under that lighting. To take these images, we capture the reflectance field dataset more than once; we first record it normally and then, on the next pass, place neutral density filters over the video camera lenses in order to reduce the exposure of the imagery. In these subsequent passes, the images are darkened to the point that the specular highlights will be properly imaged without saturating the image sensor. In this work, we use 3-stop neutral density filters which reduce the exposure by a factor of eight. If the specular highlights still saturate the image, the exposure can be further reduced by adding additional neutral density filters or setting the cameras to a smaller aperture. In Section 5 we show an example of using this procedure to faithfully reproduce specular reflections in a synthetically illuminated artifact.

4 Illuminating Reflectance Fields of Artifacts

Figure 3: Real-World Lighting Environments. (a) A light probe [3] image records the illumination in San Francisco’s Grace Cathedral. (b) The probe image is resampled into a latitude-longitude format having the same coordinate system and resolution as the captured reflectance field in Figure 2. (c) A light probe image recording the incident illumination in a Eucalyptus grove. (d) The resampled version of (c).
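Before turning to relighting, it may help to picture how the normal-exposure and neutral-density passes described above could be merged into a single high dynamic range image. The sketch below is written in the spirit of the method of [5] but is greatly simplified; the two-pass switching scheme, function name and threshold are assumptions for illustration, not the authors' implementation.

    import numpy as np

    def merge_exposures(bright, dark, nd_factor=8.0, clip=0.98):
        """Merge two exposures of the same view into one HDR image (sketch).

        bright    - normally exposed image, linear (radiometrically calibrated)
                    values scaled to [0, 1]; highlights may be clipped
        dark      - same view captured through a 3-stop ND filter, i.e. ~1/8 the exposure
        nd_factor - exposure ratio between the two passes (8 for a 3-stop filter)
        clip      - pixels in `bright` at or above this level are treated as
                    saturated and taken from the rescaled dark image instead
        """
        bright = np.asarray(bright, dtype=np.float64)
        dark = np.asarray(dark, dtype=np.float64) * nd_factor   # bring to a common scale
        saturated = bright >= clip
        return np.where(saturated, dark, bright)                 # trust the dark pass in highlights

A full implementation would weight and average several exposures per pixel rather than switching between two, but the essential idea of rescaling by the known exposure ratio before combining is the same.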
Figure 2: A Reflectance Field Dataset. This mosaic of images of the headdress shows a sampling of the 1,728 images acquired in a light stage capture session (the original 64 × 27 dataset is shown as a 16 × 8 dataset). The dataset shows the headdress illuminated from all possible directions of incident illumination.

The incident illumination datasets were recorded by taking omnidirectional high dynamic range images of real lighting environments. The motivation for our approach is to be able to realistically show digitized artifacts illuminated by any desired form of illumination. This ability can allow scholars to study how an artifact responds to light and how it may have appeared to people of the corresponding culture in the artifact’s natural environments. It can also allow the artifact to be realistically integrated into virtual cultural recreations or virtual museums, illuminating it with the specific illumination present in any given virtual environment.

To re-illuminate the artifacts, we employ the reflectance field illumination process originally applied for relighting human faces in [4]. First, an image of an incident illumination environment is captured or rendered; for this, the light probe technique presented in [3] can be employed. In this technique, a series of differently exposed images of a mirrored ball are combined to produce a high dynamic range omnidirectional image that measures the color and intensity of the illumination arriving from every direction in the environment. The two light probe images used in this paper are shown in Figure 3. To create a rendering of an artifact as it would appear in such a sampled lighting environment, the light probe image is resampled into the same coordinate space and resolution as the artifact’s reflectance field; in our work this is a 64 × 27 image in a latitude-longitude coordinate system. The two images on the right of Figure 3 show the Grace Cathedral and Eucalyptus Grove lighting environments resampled into this coordinate system and resolution.

The next step is to multiply the red, green, and blue color channels of each of the reflectance field dataset images by the red, green, and blue colors of the corresponding pixel of the resampled lighting environment. For example, suppose that the pixel in the lighting environment corresponding to light coming directly from the right of the artifact is bright yellow. Then the reflectance field image illuminated from this direction in the light stage will be scaled so that it too is correspondingly bright and yellow. Thus, each image in the dataset becomes an accurate rendering of how the artifact would appear if illuminated by just its corresponding direction of light in the environment. The illuminated reflectance field dataset for the headdress is shown in Fig. 4. The final step is to sum all of the images in the illuminated reflectance field dataset, producing a final rendered image showing the artifact as it would appear as illuminated by the entire sampled lighting environment at once. This procedure works because of the additive nature of light [9]: if an artifact is illuminated by two sources of light and photographed separately as illuminated by each, then the sum of these two images will show what the artifact will look like as if illuminated by both sources at once. This assumes that the cameras taking these images are radiometrically calibrated, i.e. that the pixel values in each image are proportional to the amount of light received by the image sensor; we use the method of [5] in order to perform this calibration.
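The multiply-and-sum relighting step just described maps directly onto a few lines of array code. The sketch below assumes the reflectance field is stored as an array of shape (64, 27, H, W, 3) and the resampled light probe as (64, 27, 3); the array names and layout are illustrative assumptions rather than the authors' code.

    import numpy as np

    def relight(reflectance_field, environment):
        """Relight a captured reflectance field with a resampled lighting environment.

        reflectance_field - float array, shape (64, 27, H, W, 3): one radiometrically
                            calibrated image per incident lighting direction
        environment       - float array, shape (64, 27, 3): RGB colour and intensity of
                            the light arriving from each of those same directions
        returns           - float array, shape (H, W, 3): the artifact as it would
                            appear illuminated by the whole environment at once
        """
        # Scale each dataset image by the colour of the light arriving from its direction...
        scaled = reflectance_field * environment[:, :, None, None, :]
        # ...then sum over all incident directions (the additive property of light).
        return scaled.sum(axis=(0, 1))

Any solid-angle weighting of the environment map is assumed here to have been folded into the resampling step; the essential structure of the technique is simply this linear combination of captured images.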
Figure 6 shows the result of illuminating the reflectance field dataset of the headdress by the two lighting environments in Figure 3 as well as a user-constructed lighting environment made with the interactive relighting tool described in Figure 5. We note that this illumination technique yields results quite close to the physically correct answer of how the artifact would appear in the given light. Since the rendered image is a linear combination of real images of the artifact, the rendering will exhibit correct real-world illumination effects including those of anisotropic specularity, iridescence, self-shadowing, mutual illumination, subsurface scattering, translucency, and complex surface microstructure; and thus is a faithful rendering of what the artifact would look like in the specified environments. This is notable given that there is no modeling of the artifact’s surface geometry nor any steps to derive reflectance data for the artifact.

It should be mentioned that the technique will not yield perfect results in all cases. Scenes with very concentrated light sources should produce very sharp shadows on the artifact; such shadows will become slightly blurred using this technique since the reflectance fields are acquired at a finite resolution. The technique may also produce incorrect results if the spectrum of either the illumination or of the artifact’s reflectance has significant spectral peaks or valleys; carrying out the calculation solely on trichromatic RGB pixel values may fail to yield precise color balance in the renderings in such cases. Finally, this technique assumes that the artifact is illuminated by an even field of illumination; additional data acquisition and rendering would be required to show an artifact as it would be illuminated by dappled light or in partial shadow, or in close proximity to other artifacts or light sources.¹

¹ This additional data could be recorded by using pixel-addressable video projectors, rather than uniform light sources, to illuminate the artifact.

Figure 4: Lighting a Reflectance Field Dataset. A reflectance field dataset is illuminated by coloring each image in the dataset according to the color and intensity of the illumination coming from the corresponding direction in a sampled lighting environment. The images in this figure are colored according to the illumination captured in Grace Cathedral (see Figure 3) using the light probe technique in [3]. A final image of the artifact as illuminated by the environment is obtained by adding together all of the transformed images in the dataset; such an image can be seen in Figure 6.

4.1 Interactive Reflectance Field Illumination
We have written an interactive computer program that implements our relighting technique in real time on a standard PC equipped with an OpenGL graphics card. Screen snapshots from this program are shown in Figure 5. The program operates in two modes; in the first, the user can choose from a variety of captured lighting environments with which to illuminate the artifact, and can rotate the environment about the y-axis to see the light from the environment reflect differently off of the artifact. In the second mode, the user can construct the lighting environment by hand using a number of light sources.
For each source, the user selects its intensity, color, and direction, as well as whether the light is a hard source such as a point light or a soft source such as an area light. As the user moves the lights, the appearance of the artifacts updates interactively at over twenty frames per second on contemporary mid-range PCs. The demo uses compressed versions of the reflectance field datasets in order to compute the renderings; this degrades the quality of the renderings slightly but makes it possible to process the quantity of data necessary at interactive rates. We have found that being able to interactively control realistic illumination significantly helps a user sense both the geometric and material properties of an artifact.

In the work so far we only render the artifact from a static viewpoint. In the next section we describe how the artifact can be rendered from novel viewpoints in addition to being rendered under arbitrary illumination.

5 Results

We have applied the photometric digitization technique to a variety of cultural artifacts chosen to exhibit complex geometry and reflectance properties that would be difficult to capture using traditional digitization techniques. Some descriptive information on the artifacts themselves is presented in Section 8.

Figure 6 shows a Lakota headdress synthetically illuminated by light captured from the Grace cathedral and Eucalyptus grove environments in Figure 3, as well as a user-specified lighting environment. Despite the headdress' complex and in places fuzzy geometry and its generally complex reflectance properties, the headdress appears realistically illuminated by each environment. A rendering of the headdress from a novel viewing angle is shown in Figure 10.

Figure 8 shows an otter skin cap and an animal bone choker necklace being captured and then illuminated by the Grace cathedral environment. The figure illustrates the need to capture the reflectance fields of shiny artifacts in high dynamic range, using multiple varying exposures to properly record both albedo reflection and specular highlights. Image (c) was illuminated using only standard single-exposure imagery in which highlight values were clipped, and yields a diffuse appearance to the reflective abalone necklace decoration. Image (d) used high dynamic range images to illuminate the artifact and thus properly replicates the shininess of the abalone.

Figure 7 shows two original light stage images and two synthetically illuminated renderings of a flat drum. Since the skin of the drum is thin and translucent, it becomes significantly illuminated even when lit from behind. Since this effect is captured in the reflectance field, the drum head will properly light up when illuminated from behind by a sampled illumination environment such as by the bright yellowish altar in the Grace cathedral lighting environment.

Figure 9 shows an approach to photometrically recording jewelry and clothing by having them be worn by an actual person. The light stage is outfitted with a chair and the reflectance field of the subject wearing the clothing is captured. Capturing clothing and jewelry in this manner can help us to better understand its physical and artistic design, and greater context is provided when the clothing is worn by a member of the culture from which the artifacts originate.
(a) (b) Figure 5: An Interactive Illumination Program. The program seen above allows reflectance fields of cultural artifacts to be interactively illuminated with (a) different sampled illumination environments or (b) arbitrary user-specified illumination.

(a) (b) (c) (d) Figure 6: Synthetically Illuminating an Artifact. (a) One of 1,728 original images in the reflectance field dataset of a headdress. (b) The headdress synthetically illuminated by the environmental lighting captured in Grace cathedral in Figure 3. (c) The headdress synthetically illuminated by the environmental lighting captured in a eucalyptus grove in Figure 3. (d) The headdress synthetically illuminated by a user-constructed lighting environment using the interactive relighting program seen in Figure 5. The renderings exhibit all of the artifact's complex properties of specularity, anisotropic reflection, translucency, and mutual illumination; such effects are usually challenging to model, represent, and render using currently available techniques for artifact digitization.

(a) (b) (c) (d) Figure 7: Capturing Translucency. (a) The reflectance field of a Flat Drum is captured in the light stage. (b) When the drum is lit from behind, the front of the drum lights up due to the translucency of the drum head. The shadow of the strings that tighten the drum is also visible. (c) This image of the reflectance field of the drum illuminated by the Grace cathedral lighting environment in Figure 3 does not reveal the translucency since the environment is situated so that more light strikes the front of the drum than the back. (d) Rotating the lighting environment so that more of the light comes from behind the drum reveals the drum's translucency. A traditionally scanned 3D model of the drum with a diffuse texture map would not be able to reproduce this effect.

(a) (b) (c) (d) Figure 8: The Need for High Dynamic Range Datasets. (a) A dataset image of an otter headband and an abalone shell necklace. The bright reflection in the necklace is too bright to be captured accurately by the video camera; the pixel values in the highlight have been clipped to the maximum pixel value. (b) A dataset image taken on a second pass of the light stage with a 3-stop neutral density filter placed in front of the camera lens. While most of the image is too dark to be useful, the specular highlight in the shell is properly imaged within the range of sensitivity of the camera. The two images can be combined into a single high dynamic range image as in [5]. Relighting the dataset with the method described in Section 4 will produce correct results only if the high dynamic range dataset is used. (c) A rendering of the artifacts using only the low dynamic range imagery captured from a single pass of the light stage as in (a); the abalone necklace piece appears to be made of a diffuse plaster-like material. (d) A rendering of the artifacts using high dynamic range imagery captured from multiple passes of the light stage faithfully produces the bright iridescent specular reflection in the necklace.

(a) (b) (c) (d) (e) Figure 9: Capturing Jewelry and Clothing. The light stage can have a chair mounted within it that allows the appearance of traditional clothing and jewelry to be captured as worn by a person, in this case a member of the particular culture of the artifacts' origin.
(a) and (b) show tribal elder George Randall being photometrically recorded in the light stage wearing the otter fur hat, a deerskin Ghostdance shirt, a hair pipe necklace, and holding a prayer staff (Section 8 provides some more detailed information about the artifacts). Both images used an extended shutter speed of ten seconds to capture an image of all of the lighting directions being illuminated at once. (c) and (d) show close-ups of Randall in the Grace Cathedral and a manually specified lighting environment. (e) shows a wider view of Randall with a prayer staff synthetically re-illuminated in the Grace Cathedral lighting environment.

6 Future Work

A benefit of the technique presented in this paper is that no geometric model of the artifact need be supplied or acquired. However, the technique as so far presented does not allow the viewpoint of the object to be changed, which is the principal way to give a sense of the geometry and reflectance of an object. One technique to render a reflectance field of an artifact from novel viewpoints would be to acquire a geometric model of the artifact, texture map the model with illuminated renderings of the artifact, and then render the texture-mapped model from different points of view. In this paper, we wish to model artifacts whose geometry can not easily be scanned with currently available 3D scanning techniques. As a result, we propose an image-based approach to rendering artifacts from novel viewpoints.

The Light Field [10] and Lumigraph [8] works suggest acquiring a two-dimensional dataset of images from different viewpoints situated on a viewing surface around an object and then composing new viewpoints by sampling similar pixel rays found in existing images. Since this is done entirely with images, no three-dimensional model of the object is required. However, to produce sharp renderings from novel viewpoints the spacing of the image viewpoints must be very fine, which for our purposes would multiply the already great quantity of data required to represent an object's reflectance field from just one viewpoint. In order to reduce the amount of data required, a practical suggestion would be to acquire more widely spaced viewpoints and geometrically interpolate between them using view interpolation [20]. To show the potential merit of such an approach, we produced two rendered views (Figures 10(a) and 10(c)) of the headdress made from reflectance fields acquired by two simultaneously running digital video cameras. From these two static viewpoints, we used manually provided image correspondences to produce an intermediate viewpoint as seen in Figure 10(b). Ideally, one would like to find such correspondences automatically using an optic flow technique. While finding such correspondences can be challenging [14], we note that the problem may be easier in our case since we have many pictures under different lighting conditions of the object seen from the two viewpoints. Thus, matching could be performed on a much higher-dimensional space of image pixels (under all the different lighting conditions) to help disambiguate correspondences.

Figure 11 shows a possible augmentation of the light stage apparatus that would capture such datasets. Instead of just a few cameras, a linear array of sixteen cameras aimed at the subject is distributed along a second semicircular arm.
Each time the arm of lights goes around, a reflectance field of the artifact is captured from a variety of latitudes but from just one longitudinal direction. To capture views spaced out in longitude as well, there are two possibilities. The first is that the array of cameras would rotate in fixed angular increments after each rotation of the lights; the other is that the artifact itself would be rotated by a motion-controlled platform after each lighting pass. In this manner, we could effectively capture an entire light field [10, 8] of the artifact as illuminated from every possible incident illumination direction, and from such a dataset could render the artifact from arbitrary angles and in arbitrary illumination, all without having geometric information for the artifact. For the moment, we leave this acquisition process for future work.

Another important avenue for future work will be to capture higher spectral resolution of the light reflected from the object as illuminated by the light stage. Treating reflectance and incident illumination with just three spectral bands (R, G, and B) is an approximation that can create color rendition problems when either the object's reflectance or the illuminant has a complex spectrum. One technique for this would be to use a monochrome camera equipped with a multispectral filter wheel to capture an object's reflectance field at a variety of spectral bands. Another technique would be to place a Liquid Crystal Tunable Filter (LCTF) [7] in front of the camera to choose the reflectance bands. Combined with multispectral measurements of incident illumination (taken, for example, by placing an LCTF in front of the camera imaging a mirrored ball), such datasets could yield significantly more accurate renditions of artifacts under novel illumination conditions (such as under firelight from a torch or fluorescent light in a museum), even if the target rendering space remains just (R,G,B).

(a) (b) (c) Figure 10: Rendering the Artifact from Arbitrary Viewpoints. (a) and (c) Renderings of the artifact illuminated by the Grace Cathedral environment from different viewing directions. (b) An illuminated rendering of the artifact from an intermediate novel viewpoint made using the image-based rendering technique of view interpolation [20]. Using this technique, the artifact can be rendered from any viewpoint if reflectance field data is captured from a sufficiently dense sampling of discrete viewing directions. An augmented light stage that would automate such a capture process is shown in Figure 11.

Figure 11: An Evolved Light Stage includes a stationary array of cameras, seen at right, that record the artifact from a one-dimensional array of directions, and a motorized platform that can rotate the artifact a set number of degrees for every rotation of the light stage arm. The device would capture a complete light field of the artifact for every direction of incident illumination, allowing artifacts to be rendered from any viewpoint as well as illuminated from any direction.

7 Conclusion

In this paper we have presented an alternative technique for photometrically acquiring computer graphics models of real-world cultural artifacts. In this work we acquire reflectance fields of the artifacts – image datasets that directly measure how an artifact transforms incident illumination into radiant imagery – rather than surface geometry and texture maps.
The method allows the artifacts to be rendered under arbitrary illumination conditions, including image-based illumination sampled from the real world. In this work we have focused on rendering artifacts from the same viewpoints from which their imagery is captured, but have shown that image-based rendering techniques can allow rendering from novel viewpoints to be performed as well. We demonstrated realistic illuminated renderings of a variety of cultural artifacts which would be challenging to model, represent, and render using current digitization techniques. It is our hope that this line of research will eventually help yield practical methods for digitizing photometrically accurate models of cultural artifacts, as well as provide insight into improving current techniques.

8 About the Artifacts

The artifacts featured in this paper are from the private collection of George Randall, a tribal elder from the White Earth Chippewa Reservation in Minnesota. The headdress in Figure 6 is a classic war bonnet from the Lakota tribe made around 1920 at the Rosemont Reservation in South Dakota near Little Big Horn. The main head band is made from a Union Army blanket, and the headdress features bead work, metal bells, white and black ermine fur, colored ribbons, and turkey feathers augmented with tufts of rabbit fur at the tips. All the materials are organic except for the bells and mirrors, which were typical trade goods. The drum in Figures 7 and 5 is a ceremonial flat drum made of stretched elk stomach held together by leather lacing. The painted design shows two eagles, or hanbli, a prominent icon in the tribe's narrative tradition. The otter skin hat in Figure 8 was made in California from a freshwater otter skin from Canada. The hat is approximately twenty years old and is typical of a style of headdress in use for approximately two hundred years. The choker necklace in Figure 8 is also approximately twenty years old and is made from "hair pipe" (hollowed out bone from small animals), trade beads, and an abalone shell at the front. The Ghostdance shirt in Figure 9 is sewn from baby deer skin. The front features six strands of horse hair wrapped with red thread, as well as three horse shoe prints indicating that the wearer owns three horses. The back of the shirt features three ermine tails attached to abalone shell circles. These styles of clothes and jewelry are typical of Native American tribes spanning from the Northeast to Wisconsin such as the Seneca, Mohawk and Wampanoag.

Acknowledgements

We would like to express our gratitude to George Randall for making himself and his collection of artifacts available for this project and our great thanks to Maya Martinez-Smith for organizing his visit and making this collaboration possible. We thank Chris Tchou for his extensive contributions to the software employed in creating the renderings and both Chris Tchou and Dan Maas for writing the interactive reflectance field visualization program. We also wish to thank Brian Emerson for his modeling for Figure 11, Andy Gardner for helping with the video editing, and Bobbie Halliday and Jamie Waese for their invaluable production assistance. This work was supported by the University of Southern California, the United States Army, and TOPPAN Printing Co., Inc. and does not necessarily reflect any corresponding positions or policies and no official endorsement should be inferred.

References

[1] Baribeau, R., Cournoyer, L., Godin, G., and Rioux, M.
Colour three-dimensional modelling of museum objects. Imaging the Past, Electronic Imaging and Computer Graphics in Museum and Archaeology (1996), 199–209.
[2] Curless, B., and Levoy, M. A volumetric method for building complex models from range images. In SIGGRAPH 96 (1996), pp. 303–312.
[3] Debevec, P. Rendering synthetic objects into real scenes: Bridging traditional and image-based graphics with global illumination and high dynamic range photography. In SIGGRAPH 98 (July 1998).
[4] Debevec, P., Hawkins, T., Tchou, C., Duiker, H.-P., Sarokin, W., and Sagar, M. Acquiring the reflectance field of a human face. Proceedings of SIGGRAPH 2000 (July 2000), 145–156. ISBN 1-58113-208-5.
[5] Debevec, P. E., and Malik, J. Recovering high dynamic range radiance maps from photographs. In SIGGRAPH 97 (August 1997), pp. 369–378.
[6] Debevec, P. E., Taylor, C. J., and Malik, J. Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach. In SIGGRAPH 96 (August 1996), pp. 11–20.
[7] Gat, N. Real-time multi- and hyper-spectral imaging for remote sensing and machine vision: an overview. In Proc. 1998 ASAE Annual International Mtg. (Orlando, Florida, July 1998).
[8] Gortler, S. J., Grzeszczuk, R., Szeliski, R., and Cohen, M. F. The Lumigraph. In SIGGRAPH 96 (1996), pp. 43–54.
[9] Haeberli, P. Synthetic lighting for photography. Available at http://www.sgi.com/grafica/synth/index.html, January 1992.
[10] Levoy, M., and Hanrahan, P. Light field rendering. In SIGGRAPH 96 (1996), pp. 31–42.
[11] Levoy, M., Pulli, K., Curless, B., Rusinkiewicz, S., Koller, D., Pereira, L., Ginzton, M., Anderson, S., Davis, J., Ginsberg, J., Shade, J., and Fulk, D. The Digital Michelangelo Project: 3D scanning of large statues. Proceedings of SIGGRAPH 2000 (July 2000), 131–144. ISBN 1-58113-208-5.
[12] Malzbender, T., Gelb, D., and Wolters, H. Polynomial texture maps. Proceedings of SIGGRAPH 2001 (August 2001), 519–528. ISBN 1-58113-292-1.
[13] Marschner, S. Inverse Rendering for Computer Graphics. PhD thesis, Cornell University, August 1998.
[14] McMillan, L., and Bishop, G. Plenoptic modeling: An image-based rendering system. In SIGGRAPH 95 (1995).
[15] Nimeroff, J. S., Simoncelli, E., and Dorsey, J. Efficient re-rendering of naturally illuminated environments. Fifth Eurographics Workshop on Rendering (June 1994), 359–373.
[16] Rocchini, C., Cignoni, P., and Montani, C. Multiple textures stitching and blending on 3D objects. Eurographics Rendering Workshop 1999 (June 1999). Held in Granada, Spain.
[17] Rushmeier, H., Bernardini, F., Mittleman, J., and Taubin, G. Acquiring input for rendering at appropriate levels of detail: Digitizing a Pietà. Eurographics Rendering Workshop 1998 (June 1998), 81–92. ISBN 3-211-83213-0. Held in Vienna, Austria.
[18] Sato, Y., Wheeler, M. D., and Ikeuchi, K. Object shape and reflectance modeling from observation. In SIGGRAPH 97 (1997), pp. 379–387.
[19] Turk, G., and Levoy, M. Zippered polygon meshes from range images. In SIGGRAPH 94 (1994), pp. 311–318.
[20] Williams, L., and Chen, E. View interpolation for image synthesis. In SIGGRAPH 93 (1993).
[21] Wong, T.-T., Heng, P.-A., Or, S.-H., and Ng, W.-Y. Image-based rendering with controllable illumination. Eurographics Rendering Workshop 1997 (June 1997), 13–22.
[22] Wood, D. N., Azuma, D. I., Aldinger, K., Curless, B., Duchamp, T., Salesin, D. H., and Stuetzle, W.
Surface light fields for 3D photography. Proceedings of SIGGRAPH 2000 (July 2000), 287–296. ISBN 1-58113-208-5.
[23] Chen, Y., and Medioni, G. Object modeling from multiple range images. Image and Vision Computing 10, 3 (April 1992), 145–155.

Realistic Visualisation of the Pompeii Frescoes

Kate Devlin and Alan Chalmers
University of Bristol, Merchant Venturers Building, Woodland Road, Bristol BS8 1UB

ABSTRACT

Three dimensional computer reconstruction provides us with a means of visualising past environments, allowing us a glimpse of the past that might otherwise be difficult to appreciate. Many of the images generated for this purpose are photorealistic, but no attempt has been made to ensure they are physically and perceptually valid. We are attempting to rectify these inadequacies through the use of accurate lighting simulation. By determining the appropriate spectral data of the original light sources and using them to illuminate a scene, the viewer can perceive a site and its artefacts in close approximation to the original environment. The richly decorated and well-preserved frescoes of the House of the Vettii in Pompeii have been chosen as a subject for the implementation of this study. This paper describes how, by using photographic records, modelling packages and luminaire values from a spectroradiometer, a three dimensional model can be created and then rendered in a lighting visualisation system to provide us with images that go beyond photorealistic, accurately simulating light behaviour and allowing us a physically and perceptually valid view of the reconstructed site. A method for capturing real flame and incorporating it in a virtual scene is also discussed, with the intention of recreating the movement of a flame in an animated scene.

KEYWORDS

Computer graphics, reconstructions, archaeology, visualization, visual perception.

1. INTRODUCTION

The application of computer graphics to the field of archaeology is becoming more commonplace. From providing the archaeologist with an aid to interpretation to giving the public an animated glimpse of the past, the use of realistic graphics provides a powerful tool for modelling multi-dimensional aspects of archaeological data. Sites and artefacts can be reconstructed and visualised in 3D space, providing a safe and controlled method of studying past environments. This new perspective may enhance our understanding of the conditions in which our ancestors lived and worked. To date, however, limitations have been imposed with regard to the validity of these reconstructions [1].

The concept of realistic image synthesis centres on generating scenes with an authentic visual appearance. The modelled scene should not only be physically correct but also perceptually equivalent to the real scene it portrays [10] and this research seeks to address the problems encountered in the realistic simulation of archaeological sites. Today our world is lit by bright and steady light, but past societies relied on daylight and flame for illumination. There is well-documented archaeological evidence for the use of flame; the presence of hearths, the remains of lamps and historical documentation where it exists all provide a source of information regarding the use of artificial light. If we consider our perception of the world we inhabit today and compare the modern lighting with that of the past, it is evident that there are significant differences in how it appears [2].
It would seem, therefore, that the photo-realistic site reconstructions often produced are flawed in regard to lighting conditions. Although they may look "real" their validity cannot be guaranteed as no attempt has been made to use physically accurate values for ancient light sources and surface reflectance. They owe more to an artist's imagination than to an interpretation based on numerical simulation. The commonly used software packages base the lighting conditions on daylight, fluorescent light or filament bulbs and not the lamp and candlelight that would have been used in the past. In some cases the reconstructions are lit with physically impossible lighting values. Our perception of past environments should consider the lighting conditions of that time - the use of natural daylight and the use of flame in a variety of forms. The different fuel types of each of these sources will affect our perception of a scene, and this needs to be taken into account [6]. Our perception of colour is affected by the amount and nature of light reaching the eye, so by simulating the behaviour of the appropriate type of light in an environment it should be possible to demonstrate how it may have looked in the past. The goal is to produce images that recreate accurately the visual appearance of an environment illuminated by flame.

2. LUMINAIRES

The luminosity of flame is due to glowing particles of solids in laminar flux, the colour of which is primarily related to the emission from incandescent carbon particles. A typical fuel/air wick flame consists of three distinct zones: the inner core, the blue intermediate zone and the outer cone [5]. The different zones of the flame produce different emissions depending on the fuel type and environment conditions. Previous work on modelling flame has focussed on large-scale flames such as fires [4], fireballs and explosions [15][13] or more generic flames [17][18][19]. Inakage introduced a simplified candle flame model [7], which Raczkowski extended to incorporate the dynamic nature of the flame [14]. In this study we are interested in the perception of objects illuminated by different fuel types.

2.1 Building the luminaires

The acquisition of valid experimental data is of vital importance as the material used may have had a significant influence on the perception of the ancient environment. In consultation with the Department of Archaeology at the University of Bristol, various light sources were recreated. These included processed beeswax candles, tallow candles (of vegetable origin), unrefined beeswax candles, a selection of reeds coated in vegetable tallow, a rendered animal fat lamp, and an olive oil lamp.

Figure 1. Experimental archaeology: reconstructing ancient light sources

The appropriate sources for this project were judged to be olive oil lamps, the most readily available fuel type for that area. Water was added to some of the lamps to keep them cool whilst being carried and to stop the oil from sticking. Salt was added to others to make the oil burn for longer. Detailed spectral data was gathered using a spectroradiometer, allowing us to measure the absolute value of the spectral characteristics without making physical contact with the flame. This device can measure the emission spectrum of a light source from 380nm to 760nm, in 5nm wavelength increments, giving an accurate breakdown of the emission lines generated by the combustion of a particular fuel. The measurements were performed in a completely dark room and the device was aimed at a board coated with a 99% optically pure Eastman Kodak Standard white powder, which diffusely reflects the aggregate incident light. Ten readings were taken for each lamp type and an average was calculated. This data can be used to create an RGB colour model for use in rendering the scene.

3. THE POMPEII FRESCOES

The House of the Vettii in Pompeii is one of the best-preserved and decorated buildings in this World Heritage site, and is the most frequently visited building in Pompeii [12]. The rich colours and extensive use of artistic techniques such as trompe l'oeil, along with its magnificent state of preservation, draw millions of visitors through its rooms each year. However, the impact of time and tourism on such a site has led to serious deterioration.
Computer reconstruction of the House of the Vettii allows us to view it as it might have been when it was in use before the eruption of Vesuvius in AD 79. The room chosen for the study was an oecus, or reception room, which opens onto a colonnaded garden. The high quantity of red and yellow pigments in this room was of specific interest to our study. Its three walls are richly decorated by intricate frescoes in the IV Style, also termed the "illusionist style" (Figure 2). Descriptions of how frescoes are created appear in Classical literature. It involves the application of colour pigments to wet plaster so that the plaster and the paint are merged and dry together, creating a permanent and vivid display. The fact that the House of the Vettii was immediately and sympathetically restored around the frescoes has meant that the paint colours have been well preserved.

Figure 2. The room as it appears today

The frescoes in the room were recorded photographically. A colour chart was included at either side of each photo to permit calibration, identify illumination levels and allow any gradient in the light to be calculated. Using a scale plan of the room a 3D model was generated.

3.1 Converting the luminaire data

The resulting luminaire data obtained from the spectroradiometer was then converted to RGB values to enable display on a computer monitor. It is essential, when converting the detailed spectrum data from the spectroradiometer into values representing the red, green and blue portions of the spectrum, that this conversion is calculated in a perceptually valid way, as defined by the CIE (Commission Internationale de l'Eclairage) 1931 1-degree standard observer (1).

(1) This system specifies a perceived colour as a tristimulus value indicating the luminance and chromaticity of a stimulus as it is perceived in a 1-degree field around the foveal centre.

Figure 3. CIE Tristimulus functions

Figure 3 shows the functions for the X, Y and Z channels. The Y channel measures the luminance of a source, and the X and Z channels measure the chromaticity. This information is more useful when broken down as follows: if we let x = X/(X+Y+Z) and y = Y/(X+Y+Z), then we can calculate the exact colour values for the red, green and blue sections of the spectrum, disregarding luminance. For a canonical set of VDU phosphors: RED(x,y) = (0.64, 0.33); GREEN(x,y) = (0.29, 0.60); and BLUE(x,y) = (0.15, 0.06) [22]. Radiance [22], a lighting visualization system, was used to render the images. It contains a function (rcalc) to convert the xyY coordinates to RGB values. Radiance then takes these RGB values and accurately simulates the associated light source behaviour in a modelled scene.
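The conversion described in Section 3.1 amounts to integrating the measured spectrum against the CIE 1931 colour matching functions and then taking the chromaticity coordinates defined above; the final xyY-to-RGB step is handled by Radiance's rcalc. The following sketch is illustrative only — it assumes the colour matching functions are available as an array sampled on the same 380–760 nm, 5 nm grid as the spectroradiometer readings, and the names are not taken from the project's own tools.

```python
import numpy as np

def spectrum_to_xyY(emission, cmf, step_nm=5.0):
    """Convert a measured emission spectrum to CIE xyY.

    emission: spectroradiometer readings at 380-760 nm in 5 nm steps.
    cmf: (N, 3) array of the CIE 1931 colour matching functions
        (x_bar, y_bar, z_bar) sampled on the same wavelength grid.
    """
    emission = np.asarray(emission, dtype=float)
    cmf = np.asarray(cmf, dtype=float)
    # Integrate the spectrum against the colour matching functions.
    X, Y, Z = (emission[:, None] * cmf).sum(axis=0) * step_nm
    s = X + Y + Z
    x, y = X / s, Y / s     # chromaticity, exactly as defined in Section 3.1
    return x, y, Y          # Y carries the luminance
```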
3.2 Modelling the flame

The type of flames that we are interested in are categorised as diffusion wick flames, in which heat transfer from the flame causes a steady production of flammable vapour. Observations from the reconstructed light sources showed that all of the flames were small and (ignoring the effect of air turbulence for now) fairly steady. Rather than attempting to model the shape of the flame accurately for this initial stage of the project, video footage of the real flames from the reconstructed light sources was processed using computer vision techniques, the shape of the flame extracted and the real flame incorporated in the virtual scene.

To capture the flame a 'blue screen' technique was used. This simple technique is widely employed in the film industry, and is used to cut an object from its background surroundings. Filming the flame against an evenly coloured, matt background enables thresholding of each frame, which is used to identify and dismiss a background colour, effectively separating the flame from any unwanted parts of the scene. The background colour should be chosen so that it does not occur within the foreground. As the intermediate zone of the flame is generally blue in colour, a green screen rather than a blue screen should be used (Figure 4).

Figure 4: Video still of the flame against a green screen

Thresholding is the process of identifying a range of colour, and changing all areas within an image that fall into this range to another specified colour. Simple, solid objects can easily be separated from a background, but a flame produces some of its own difficulties. It is useful to have static and even lighting on the background to simplify the thresholding process. Filming a flame creates a problem in that it is itself a light source and may therefore disrupt otherwise static lighting, producing unwanted background lighting effects. Parts of a flame may be translucent or partly transparent, so seepage of the background colour into what we identify as the flame may occur. This is difficult to avoid, but we can compensate for this at a later stage by deliberately seeping the background colour of the modelled scene into the flame. By splitting the video stream into separate frames an animated sequence can be achieved. Once an object has been separated from its original background, it can then be placed in a new scene. If used sensibly, the object can be blended into the new background to give the appearance that this is an unaltered, original scene.

The illumination of the flame in the environment was achieved by approximating the shape of the flame by a series of illum spheres [22], as shown in Figure 5(a) and included in a virtual scene in Figure 5(b). The material type illum is an invisible light source. When viewed directly, the object made from illum is invisible but it still emits light. The number and size of the spheres can of course be varied to achieve a better "fit" to the shape of the flame for each frame of the video sequence. Once the images have been rendered the original picture of the flame is pasted into the scene. Some care needs to be taken here to accurately position the original picture of the flame and blend it into the scene.
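The green-screen separation described above is essentially a per-pixel distance test against the background colour. A minimal sketch follows, assuming 8-bit RGB video frames; the key colour and tolerance are placeholders that would be tuned to the actual footage, and a real matte would also need the partial-transparency compensation discussed in the text.

```python
import numpy as np

def extract_flame(frame, key_rgb=(0, 180, 0), tol=60):
    """Cut the flame out of a green-screen video frame by thresholding.

    frame: (H, W, 3) uint8 frame; key_rgb and tol are an assumed background
    colour and per-channel tolerance. Returns the frame with the background
    zeroed, plus a boolean matte marking the pixels kept as flame.
    """
    key = np.array(key_rgb, dtype=np.int16)
    diff = np.abs(frame.astype(np.int16) - key)
    background = (diff < tol).all(axis=-1)   # pixels close to the key colour
    cut_out = frame.copy()
    cut_out[background] = 0
    return cut_out, ~background
```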
Figure 5: (a) Simple luminaire model (b) real flame in virtual environment

A set of programs was written to take a sequence of flame pictures as input and to output data representing the flame, to any desired level of accuracy [16]. This provides an efficient method of incorporating the real flame in a synthetic image.

3.3 Changes in perception in flame-lit conditions

Conversion of the spectral data to RGB does of course lead to an approximation of the colours present. In future, for more accuracy we will need to consider calculating the convolution of the emission spectrum of the light source with the reflectance curve of the material under examination. However, even with the approximation, significant perceptual differences related to fuel type are achieved. These simulations can be validated with real scenes [11]. Figure 6 shows a test scene including a MacBeth colour chart illuminated with (a) a 55W electric bulb, (b) an olive oil fuel. The difference in fuel type has a discernible effect on the appearance of the MacBeth chart. The apparent differences indicate that it is important for archaeologists to view such artwork under (simulated) original conditions rather than under modern lighting. It is of course impossible to investigate these sensitive sites with real flame sources.

Figure 6: The effect of different fuel types (a) modern (b) olive oil

The initial results of this on the Pompeii frescoes can be seen below. It is noticeable that the lamp-lit scenes (Figures 7b – 7d) can be perceived as warmer in appearance when compared to the modern light (Fig. 7a), with the yellow and red pigments particularly well emphasised. The appearance of the three-dimensional trompe l'oeil art is also influenced.

Figure 7. Clockwise from top left: (a) modern lighting (b) olive oil lamp (c) olive oil lamp with salt (d) olive oil lamp with water

These are preliminary images only, and current work involves the addition of models of appropriate Roman furniture and artefacts to the scene. This will not only create a more realistic scene, but will allow archaeologists to investigate the appearance of objects under their original lighting conditions. It is important to remember that a reconstruction is only one glimpse of many valid interpretations. Various configurations of lamps are also being modelled to provide a number of possible scenarios.

4. FUTURE WORK

In flame-lit environments the range of light can vary greatly over short distances. Human vision ranges over nine orders of magnitude whereas the dynamic range of most display devices covers only two, thus some form of compensation is required to map the light-dependent way we view a scene [20][3][21]. The ultimate aim of realistic graphics is the creation of images that provoke the same response and sensation as a viewer would have to a real scene, i.e. the images are physically and perceptually accurate when compared to reality. Given that the aim is to create an environment that can be perceived as real, future work involving tone mapping will allow us to gain more perceptual accuracy of the scene. Furthermore, the use of an eye-tracking device to measure involuntary eye movement will enable us to define which areas of the room are emphasised under different lighting conditions, and will give us an insight into the effectiveness of the three-dimensional paint techniques employed in the frescoes.
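To illustrate the kind of compensation referred to above, a global tone-mapping step might look like the sketch below. This is a generic log-average scaling followed by simple highlight compression, intended only as an illustration; it is not one of the perceptually motivated operators of [3], [20] or [21], and the luminance weights are approximate.

```python
import numpy as np

def simple_global_tonemap(radiance_rgb, key=0.18):
    """Map HDR radiance values into [0, 1] for display (illustration only)."""
    lum = radiance_rgb @ np.array([0.265, 0.670, 0.065])  # approximate luminance weights
    log_avg = np.exp(np.mean(np.log(1e-6 + lum)))         # scene log-average luminance
    scaled = (key / log_avg) * radiance_rgb               # anchor the average at a mid grey
    return scaled / (1.0 + scaled)                        # compress highlights into [0, 1]
```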
Above all this, the need to establish a metric for realism through comparison with reality is essential if the images are to be of full use [11], and work in the future will attempt to quantify how "real" our reconstructions actually are. 5. CONCLUSIONS To date, the work has provided us with a means of viewing a reconstructed site under its original lighting, and allows us to place a real flame in a virtual environment. The method of incorporating a real flame in a rendered image also provides for movement by means of a sequence of frames, so that an animation of the scene can have a dynamic flame inserted in it. This method of visualization of past environments provides a safe and controlled manner in which the archaeologist can test hypotheses regarding perception and purpose of colour in decoration and artefacts. Computer generated imagery indistinguishable from the real physical environment will be of substantial benefit to the archaeological community, and this research is one method of moving beyond the current trend of photo-realistic graphics into physically and perceptually realistic scenes which are ultimately of greater use to those investigating our past. 6. ACKNOWLEDGMENTS Many thanks to Ian Roberts for his work regarding flame modelling, and Helen Legge for her input in Italy. We would also like to thank the Defence Science and Technology Laboratory (formerly DERA), and in particular Marilyn Gilmore, for their help and financial assistance. 7. REFERENCES [1] Barcelo, J.A., Forte, M. and Sanders, D.H., eds. Virtual Reality in Archaeology. (2000) ArchaeoPress. [2] Chalmers A., Green C. and Hall M. “Firelight: Graphics and Archaeology”, Electronic Theatre, SIGGRAPH 00, New Orleans, July 2000. [3] Ferwerda, J. A., Pattanaik, S., Shirley, P. and Greenberg, D. P. "A Model of Visual Adaptation for Realistic Image Synthesis". Proceedings of ACM SIGGRAPH 96 (August 1996) Addison Wesley, pp. 249-258. [4] Gardner G.Y., "Modeling Amorphous Natural Features", in SIGGRAPH 94 Course Notes 22 (1994). [5] Gaydon, A.G. and Wolfard, H.G. Flames: Their Structure, Radiation and Temperature. (1979) Chapman and Hall. [6] Green, C. The Visualisation of Ancient Lighting Conditions. Project Thesis submitted in support of the degree of Bachelor of Science in Computer Science, University of Bristol (1999). [7] Inakage M., “A Simple Model of Flames”, Computer Graphics Around the World, ed. T.S.Chua, T.L.Kunii, Proceedings of Computer Graphics International, (1990) Springer-Verlag, pp. 71-81. [8] Mavrodineanu R. et al., “Analytical Flame Spectroscopy, Selected Topics”. (1970) MacMillan. [9] McNamara A., Chalmers A., Troscianko T. and Gilchrist I. “Comparing real & synthetic scenes using human judgements of lightness”. In B. Peroche and H. Rushmeier, editors, Rendering Techniques 2000, Springer Wien. [10] McNamara, A., Chalmers, A., Troscianko, T. and Reinhard, E., “Fidelity of Graphics Reconstructions: A Psychophysical Investigation”. Proceedings of the 9th Eurographics Workshop on Rendering (June 1998) Springer Verlag, pp. 237 - 246. [11] McNamara, A. and Chalmers, A., Image Quality Metrics, Image Quality Metrics Course Notes, SIGGRAPH 00, (July 2000). [12] Nappo, S., Pompeii: Guide to the Lost City. (1998) Weidenfeld and Nicolson. [13] Perlin K., Hoffert E.M., “Hypertexture”, Proceedings of ACM SIGGRAPH 89 (1989) pp. 253-262, 1989. [14] Raczkowski J. “Visual Simulation and Animation of a laminar Candle Flame”. International Conference on Image Processing and Computer Graphics, (1996) Poland. 
[15] Reeves W.T., "Particle Systems - A Technique for Modeling a Class of Fuzzy Objects". Proceedings of ACM SIGGRAPH 83 (1983) pp. 359-376. [16] Roberts, I. Modelling Realistic Flame. Project Thesis submitted in support of the degree of Bachelor of Science in Computer Science, University of Bristol (2001). [17] Rushmeier, Holly E. "Rendering Participating Media: Problems and Solutions from Application Areas". Proceedings of the Fifth Eurographics Workshop on Rendering (June 1995) Springer-Verlag. [18] Sakas G., "Cloud modeling for visual simulators", in G. von Bally and H.I. Bjelkhagen, editors, Optics for protection of man and environment against natural and technological disasters (1993) Elsevier Science Publishers B.V., pp. 323-333. [19] Stam J., Fiume E., "Turbulent Wind Fields for Gaseous Phenomena". Proceedings of SIGGRAPH 93 (1993) pp. 369-376. [20] Tumblin, J. and Rushmeier, H., "Tone Reproduction for Realistic Images", IEEE Computer Graphics and Applications (November 1993) 13(6), pp. 42 – 48. [21] Ward Larson, G., Rushmeier, H. and Piatko, C. "A Visibility Matching Tone Operator for High Dynamic Range Scenes", IEEE Transactions on Visualization and Computer Graphics 3 (1997) no. 4, pp. 291 – 306. [22] Ward Larson, G. and Shakespeare, R., Rendering with RADIANCE: The art and science of lighting simulation. (1998) Morgan Kaufmann.

Computer Representation of the House of the Vettii, Pompeii (colour plate): under modern (55W) light; under olive oil lamp; under olive oil lamp, with furniture to show shadow effects. Kate Devlin and Alan Chalmers, Department of Computer Science, University of Bristol.

Comparing Real & Synthetic Scenes using Human Judgements of Lightness

Ann McNamara, Alan Chalmers (Department of Computer Science), Tom Troscianko, Iain Gilchrist (Department of Experimental Psychology), University of Bristol

Abstract. Increased application of computer graphics in areas which demand high levels of realism has made it necessary to examine the manner in which images are evaluated and validated. In this paper, we explore the need for including the human observer in any process which attempts to quantify the level of realism achieved by the rendering process, from measurement to display. We introduce a framework for measuring the perceptual equivalence (from a lightness perception point of view) between a real scene and a computer simulation of the same scene. Because this framework is based on psychophysical experiments, results are produced through study of vision from a human rather than a machine vision point of view. This framework can then be used to evaluate, validate and compare rendering techniques.

1 Introduction

The aim of realistic image synthesis is the creation of accurate, high quality imagery which faithfully represents a physical environment, the ultimate goal being to create images which are perceptually indistinguishable from an actual scene. Rendering systems are now capable of accurately simulating the distribution of light in an environment. However, physical accuracy does not ensure that the displayed images will have authentic visual appearance. Reliable image quality assessments are necessary for the evaluation of realistic image synthesis algorithms. Typically the quality of an image synthesis method is evaluated using numerical techniques which attempt to quantify fidelity using image to image comparisons (often comparisons are made with a photograph of the scene that the image is intended to depict).
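The simplest such image-to-image comparison is a purely numerical one such as the mean squared error discussed next. The short sketch below shows how little of the visual system such a measure captures: every pixel difference counts equally, regardless of luminance level or spatial frequency.

```python
import numpy as np

def mean_squared_error(image_a, image_b):
    """Pixel-wise MSE between two images of identical size (no HVS modelling)."""
    a = np.asarray(image_a, dtype=np.float64)
    b = np.asarray(image_b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))
```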
Several image quality metrics have been developed whose goals are to predict the visible differences between a pair of images. It is well established that simple approaches, such as mean squared error (MSE), do not provide meaningful measures of image fidelity; more sophisticated techniques are necessary. As image quality assessments should correspond to assessments made by humans, a better understanding of features of the Human Visual System (HVS) should lead to more effective comparisons, which in turn will steer image synthesis algorithms to produce more realistic, reliable images. Any feature of an image not visible to a human is not worth computing.

Results from psychophysical experiments can reveal limitations of the HVS. However, problems arise when trying to incorporate such results into computer graphics algorithms. This is due to the fact that, often, experiments are designed to explore a single dimension of the HVS at a time under laboratory conditions. The HVS comprises many complex mechanisms, which rather than function independently, often work in conjunction with each other, making it more sensible to examine the HVS as a whole. Rather than attempting to reuse results from previous psychophysical experiments, new experiments are needed which examine the complex response of the HVS as a whole rather than trying to isolate features for individual investigations.

In this work we study the ability of the HVS to perceive albedo and the impact of rendering quality on this task. Rather than deal with atomic aspects of perception, this study examines a complete task in a more realistic setting. Human judgements of lightness are compared in real scenes and synthetic images. Correspondence between these judgements is then used as an indication of the fidelity of the synthetic image.

1.1 Lightness Perception

Fig. 1. Importance of depth perception for lightness constancy

Lightness is apparent reflectance; brightness is apparent intensity of the illuminant. Reflectance is the proportion of light falling on an object that is reflected to the eye of the observer. Reflectance (albedo) is constant, and the perception of lightness depends on reflectance [1]. Gilchrist [8] showed that the perception of the degree of "lightness" of a surface patch (i.e. whether it is white, gray or black) is greatly affected by the perceived distance and orientation of the surface in question, as well as the perceived illumination falling on the surface - where the latter were experimentally manipulated through a variety of cues such as occlusion, or perspective. Perception of the lightness of patches varying in reflectance may thus be a suitable candidate for the choice of visual task. It is simple to perform, and it is known that lightness constancy depends on the successful perception of lighting and the 3D structure of a scene, for example Figure 1. When viewed in isolation the patches in the top left hand corner appear to be of different luminance. However, when examined in the context of the entire scene, it can be seen that the patches have been cut from the edge of the stairwell, and are perceived as an edge where the entire stairwell has the same luminance. Eliminating the depth cues means the patches are perceived as different, demonstrating the dependency of lightness perception on the correct perception of three dimensional structure [10]. As the key features of any scene are illumination, geometry and depth, the task of lightness matching encapsulates all three key characteristics into one task.
This task is particularly suited to this experimental framework: apart from being simple to perform, it also allows excellent control over experimental stimuli. Subsequent sections describe an experimental framework, with such a lightness matching task at the core, to allow human observers to compare real and synthetic scenes. The remainder of this paper is divided into the following sections. In Section 2, we describe previous research. In Section 3, we describe the steps taken to build the experiment in order to facilitate easy human comparison between real and synthetic scenes; we also discuss the actual organisation of participants in terms of scheduling. Section 4 describes the experiment, the results are presented in Section 5 and, finally, conclusions are drawn in Section 6.

2 Previous Work

Models of visual processing enable the development of perceptually based error metrics for rendering algorithms that will reduce the computational demands of rendering while preserving the visual fidelity of the rendered images. Much research investigating this issue is under way. Using a simple five sided cube as their test environment, Meyer et al [13] presented an approach to image synthesis comprising separate physical and perceptual modules. They chose diffusely reflecting materials to build a physical test environment. Each module is verified using experimental techniques. The test environment was placed in a small dark room. Radiometric values predicted using a radiosity lighting simulation of a basic environment are compared to physical measurements of radiant flux densities in the real environment. Then the results of the radiosity calculations are transformed to the RGB values for display, following the principles of colour science. Measurements of irradiation were made at 25 locations in the plane of the open face for comparison with the simulations. Results show that irradiation is greatest near the centre of the open side of the cube. This area provides the best view of the light source and other walls. The calculated values are much higher than the measurements. In summary, there is good agreement between the radiometric measurements and the predictions of the lighting model.

Meyer et al. then proceeded by transforming the validated simulated value to values displayable on a television monitor. A group of twenty experimental participants were asked to differentiate between the real environment and the displayed image, both of which were viewed through the back of a view camera. They were asked which of the images was the real scene. Nine out of the twenty participants (45%) indicated that the simulated image was actually the real scene, i.e. selected the wrong answer, revealing that observers were simply guessing. Although participants considered the overall match and colour match to be good, some weaknesses were cited in the sharpness of the shadows (a consequence of the discretisation in the simulation) and in the brightness of the ceiling panel (a consequence of the directional characteristics of the light source). The overall agreement lends strong support to the perceptual validity of the simulation and display process.

Rushmeier et al. [15] used perceptually based metrics to compare image quality to a captured image of the scene being represented. The image comparison metrics were derived from [4], [6], [11]. Each is based on ideas taken from image compression techniques.
The goal of this work was to obtain results from comparing two images using these models that were large if large differences between the images exist, and small when they are almost the same. These suggested metrics include some basic characteristics of human vision described in image compression literature. First, within a broad band of luminance, the visual system senses relative rather than absolute luminances. For this reason a metric should account for luminance variations, not absolute values. Second, the response of the visual system is non-linear. The perceived "brightness" or "lightness" is a non-linear function of luminance. The particular non-linear relationship is not well established and is likely to depend on complex issues such as perceived lighting and 3-D geometry. Third, the sensitivity of the eye depends on the spatial frequency of luminance variations. The perceptual metrics derived were used to compare images in a manner that roughly corresponds to subjective human vision; in particular the Daly model performed very well.

The Visible Difference Predictor (VDP) is a perceptually based image quality metric proposed by Daly [4]. Myskowski [14] realised the VDP had many potential applications in realistic image synthesis. He completed a comprehensive validation and calibration of VDP response via human psychophysical experiments. Then, he used the VDP local error metric to steer decision making in adaptive mesh subdivision, and isolated regions of interest for more intensive global illumination computations. The VDP was tested to determine how close VDP predictions come to subjective reports of visible differences between images by designing two human psychophysical experiments. Results from these experiments showed a good correspondence with VDP results for shadow and lighting pattern masking and in comparison of the perceived quality of images generated as subsequent stages of indirect lighting solutions.

McNamara et al [12] built an experimental framework to facilitate human comparison between real and synthetic scenes. They ran a series of psychophysical experiments in which human observers were asked to compare regions of a real physical scene with regions of the computer generated representation of that scene. The comparison involved lightness judgements in both the generated image and the real scene. Results from these experiments showed that the visual response to the real scene and a high fidelity rendered image was similar. The work presented in this paper extends this work to investigate comparisons using three dimensional objects as targets, rather than simple regions. This allows us to examine scene characteristics such as shadow, object occlusion and depth perception.

3 Experimental Design

This section outlines the steps involved in building a well articulated scene containing three dimensional objects placed within a custom built environment to evoke certain perceptual cues such as lightness constancy, depth perception and the perception of shadows. Measurements of virtual environments are often inaccurate. For some applications (1) such estimation of input may be appropriate. However, for these purposes an accurate description of the environment is essential to avoid introducing errors at such an early stage. Also, once the global illumination calculations have been computed, it is important to display the resulting image in the correct manner while taking into account the limitations of the display device.
As we are interested in comparing different rendering engines, it is vital that we minimise errors in the model and display stages; this means that any errors arising can be attributed to the rendering technique employed to calculate the image. This study required an experimental set-up comprised of a real environment and a computer representation of that three dimensional environment. The measurements required for this study and the equipment used to record them are described herein, along with the rendering process employed to generate the physical stimuli.

(1) The level of realism required is generally application dependent. In some situations a high level of realism is not required, for example games, educational techniques and graphics for web design.

Fig. 2. The test environment showing real environment and computer image.

3.1 The Real Scene

The test environment was a five sided box shown in Figure 2. Several objects were placed within the box for examination. All interior surfaces of the box were painted with white matt house paint. To accommodate the three dimensional objects, custom paints were mixed, using precise ratios to serve as the basis for materials in the scene. To ensure correct, accurate ratios were achieved, 30ml syringes were used to mix paint in parts as shown in Table 1. The spectral reflectance of the paints was measured using a TOPCON-100 spectroradiometer, and these values were transformed to RGB tristimulus values following [16].

Table 1. Paint Reflectance along with Reflectance of Corresponding Patch

Appearance    | % White | Reflectance | Patch# | Patch Reflectance
Black         | 0       | 0.0471      | 0      | 0.0494
Dark Gray     | 10      | 0.0483      | 0      | 0.0494
Dark Gray     | 20      | 0.0635      | 2      | 0.0668
Dark Gray     | 30      | 0.0779      | 4      | 0.0832
Dark Gray     | 40      | 0.0962      | 6      | 0.1012
Dark Gray     | 50      | 0.1133      | 7      | 0.1120
Gray          | 60      | 0.1383      | 9      | 0.1224
Gray          | 70      | 0.1611      | 14     | 0.1680
Light Gray    | 80      | 0.2002      | 15     | 0.2259
Light Gray    | 90      | 0.3286      | 19     | 0.3392
Light Gray    | 95      | 0.4202      | 23     | 0.4349
Almost White  | 97.5    | 0.5292      | 26     | 0.5512
Almost White  | 98.25   | 0.5312      | 26     | 0.5512
White         | 100     | 0.8795      | 29     | 0.8795

Fig. 3. Correspondence of Patches to Paints

As in [12] a small, front-silvered, high quality mirror was incorporated into the set-up to allow alternation between the two viewing conditions: viewing of the original scene or viewing of the modelled scene on the computer monitor. When the optical mirror was in position, subjects viewed the original scene. In the absence of the optical mirror the computer representation of the original scene was viewed. The angular sub-tenses of the two displays were equalised, and the fact that the display monitor had to be closer to the subject for this to occur was allowed for by the inclusion of a +2 diopter lens in its optical path; the lens equated the optical distances of the two displays.

3.2 Illumination

The light source consisted of a 24 volt quartz halogen bulb mounted on optical bench fittings at the top of the test environment. This was supplied by a stabilised 10 amp DC power supply, stable to 30 parts per million in current. The light shone through a 70 mm by 115 mm opening at the top of the enclosure. Black masks, constructed of matt cardboard sheets, were placed framing the screen and the open wall of the enclosure, and a separate black cardboard sheet was used to define the eye position.
An aperture in this mask was used to enforce monocular vision, since the VDU display did not permit stereoscopic viewing.

3.3 The Graphical Representations

Ten images were considered for comparison to the real scene; they are listed here along with the aims that we hoped to achieve from each comparison.

1. Photograph: Comparison to a photograph is needed to enable us to evaluate our method against more traditional image comparison metrics. The reasoning behind this is that most current techniques compare to “reality” by comparing to a captured image. We wanted to see if this is equivalent to comparing to a real physical environment, and so included a photograph, taken with a digital camera, as one of our test images.
2. Radiance: 2 Ambient Bounces: A Radiance [17] image generated using 2 ambient bounces is generally considered to be a high quality image. Here we wanted to determine if 2 ambient bounces gives a similar perceptual impression to an 8 ambient bounce image, which is more compute intensive.
3. Radiance: 8 Ambient Bounces: We wanted to investigate if there was a marked difference using a Radiance image generated using 8 ambient bounces, as this involves considerably more compute time and might not be necessary, i.e. may not provide any more perceptual information than an image rendered using 2 ambient bounces.
4. Radiance: 8 Ambient Bounces BRIGHT: This image had its brightness increased manually to see if this affected perception. The brightness was doubled (i.e. the intensity of each pixel was multiplied by 2) to see what effect, if any, this had on the perception of the image.
5. Radiance: Default: Image generated with the default Radiance parameters. This would determine whether extra compute time makes a significant difference. The default image renders in a very short time; however, ambient bounces of light are absent, and we wanted to compare this to imagery where interreflections were catered for.
6. Radiance: Controlled Errors in Estimated Reflectance Values: The RGB values for the materials were set to equal values to see what difference, if any, this made compared to using measured values. A poor perceptual response to this image would confirm our suspicion that material properties must be carefully quantified if an accurate result is required. This comparison, and the next, was to demonstrate the importance of using exact measurements rather than estimations for material values.
7. Radiance: Controlled Errors in Estimated Light Source: The RGB values for the light source were set to equal values to see what difference this made compared to using measured values. This experiment will show the necessity of measuring the emission properties of sources in an environment if accuracy is the aim.
8. Radiance: Tone Mapped: We wanted to investigate the difference tone mapping would make to our test image. Tone mapping transforms the radiance values computed by the rendering engine to values displayable on a display device in a manner that preserves the subjective impression of the scene. The Tone Mapping Operator (TMO) used here was introduced by Ferwerda et al. [5]. Although the image examined does not have a very high dynamic range, we were interested to see the effects tone mapping would have on image perception.
9. Renderpark: Raytraced: This was a very noisy image generated using stochastic raytracing. This experiment was designed to see how under-sampling would affect perception.
Here the effect of under-sampling is exaggerated, but it might give insights into how much under-sampling a rendering engine can “get away with” without affecting perceptual performance.
10. Renderpark: Radiosity: Finally, to investigate the effects of meshing in a radiosity solution, a poorly meshed radiosity image was used. We wanted to demonstrate the importance of using an accurate meshing strategy when employing radiosity techniques.

These images are shown in the accompanying colour plate. The medium used for stimulus presentation was a gamma corrected 20-inch monitor with the following phosphor chromaticity coordinates: xr = 0.6044, yr = 0.3434; xg = 0.2808, yg = 0.6016; xb = 0.1520, yb = 0.0660; xw = 0.2786, yw = 0.3020.

4 Experiment

Eighteen observers participated in the experiment, all naive to its purpose. All had normal or corrected-to-normal vision. Both condition order and trial order were fully randomised across subjects and conditions. Participants were given clear instructions.

4.1 Training on Munsell Chips

Fig. 4. Patch arrangement used to train participants (with reference chart).

In [12], the task involved matching regions to a control chart, which meant observers had to look away from the scene under examination to choose a match. Moving between scene and chart may affect adaptation to the scene in question, and the view point is not fixed; for this reason we decided to train participants on the control patches first. Once trained on the patches, participants could then recall the match from memory. Training was conducted as follows. Observers were asked to select, from a numbered grid of 30 achromatic Munsell chips presented on a white background, a sample to match a second unnumbered grid (figure 4) simultaneously displayed on the same background, under constant illumination. The unnumbered grid comprised 60 chips. At the start of each experiment participants were presented with two grids, one an ordered, numbered, regular grid, the other an unordered, unnumbered, irregular grid comprising one or more of the chips from the numbered grid. Both charts were hung on the wall approximately one metre from the participant. Each participant was asked to match the chips on the unnumbered grid to one of the chips on the numbered grid on the left; in other words, for each chip on the unnumbered grid they identified the numbered chip that matched it exactly. This was done in a random order, and a laser pointer (a non-invasive medium) was used to indicate the unnumbered chip under examination. Then the numbered chart was removed, and the unnumbered chart was replaced by a similar chart in which the chips had a different order. Participants repeated the task, this time working from memory to recall the number each chip would match to. The results of this training exercise are graphed in figure 5.

Fig. 5. Average match to the training patches across 18 subjects (left) and the average correlation for each participant (right), both with and without the reference chart.

The graph on the left shows the average match across 18 subjects, both with the reference chart and without the reference chart. The graph on the right shows the average correlation.
This correlation gives an indication of the extent to which two sets of data are linearly related. A value close to 1 indicates a strong relationship, while a value of 0 signifies there is no linear relationship. A correlation of 1 would result if the participant matched each unnumbered patch to its corresponding numbered patch. In reality some small errors are made; what we need to determine is whether the errors made when matching from memory (i.e. without the chart) are about the same size as the errors made with the reference chart in place. The correlation value when matching the patches with the chart in place is 0.96, and when matching from memory it is 0.92, a very small difference of 0.04 between the two conditions. From this small difference we can conclude that participants are just as good at matching the patches without the reference chart in place. Thus, this training paradigm proved to be reliable and stable. This has the dual benefit of speeding up the time taken per condition and ensuring participants do not need to move their gaze from image to chart, thus eliminating any influence due to adaptation.

4.2 Matching to Images

Each participant was presented with a series of images, in a random order, one of which was the real environment. Participants were not explicitly informed which image was the physical environment. The images presented were the real scene, the photograph and the 9 rendered images. There were 17 different objects in the test environment; subjects were also asked to match the 5 sides of the environment (floor, ceiling, left wall, back wall and right wall), giving a total of 21 matches. The paints used on the objects match to the training patches as shown in figure 3 and detailed in Table 1. Participants were asked to judge the lightness of target objects in a random order. We chose this particular task, that of matching materials in the scene against a display of originals, because it has a number of attractive features. First, Gilchrist [9, 7] has shown that the perception of lightness (the perceptual correlate of reflectance) is strongly dependent on the human visual system’s rendition of both illumination and 3-D geometry. These are key features of the perception of any scene and are in themselves complex attributes. The simple matching procedure used here depends critically on the correct representation of these parameters; therefore, the task should be sensitive to any mismatch between the original and the rendered scene. Secondly, the matching procedure is a standard psychophysical task and allows excellent control over the stimulus and the subject’s response. The task chosen here corresponds closely to the methodology of Gilchrist [2, 9, 7], which permits simple measures (of lightness) to be made at locations in complex scenes. Ultimately, the task was chosen to be simple while also being sensitive to perceptual distortions in the scene.

5 Results

Results for each participant were recorded and analysed independently. The value (or gray level) chosen by each participant in the real scene was compared with the values chosen in the rendered image. For a rendered image to be a faithful reproduction, the values in both cases should be closely related. To examine this relationship we carried out a linear correlation for each subject. This correlation gives an indication of the extent to which two sets of data are linearly related.
A value close to 1 indicates a strong relationship, whilst a value of 0 signifies there is no linear relationship. A correlation of 1 would result if the participant chose exactly the same gray level for each object in the real scene and the rendered image. Correlation values are shown in Table 2 and graphed in the accompanying colour plate; the graph on the right shows these values averaged. To examine the pattern of these correlations across participants we carried out an ANalysis Of VAriance (ANOVA). ANOVA is a powerful set of procedures used for testing significance where two or more conditions are compared; here 10 conditions were examined [3]. A repeated measures, within subjects ANOVA was used, as each subject performed each condition. There was a significant effect of condition: F(9, 153) = 80.3, p < .001. This can be read as follows: the F statistic equals 80.3, with 9 degrees of freedom (10 images) and 153 degrees of freedom for the error term (calculated as a function of image combinations). The p value indicates the probability that these differences occur by chance. This means there are statistically reliable differences between the conditions, which is to be expected, as some images were deliberately selected for variation in quality. The ANOVA thus showed there are significant differences in perception across images. Further analyses were carried out to investigate where these differences occur. These analyses took the form of paired comparison t-tests: we took the correlation between the real scene and the photograph, and compared it to the correlation of the real scene to each of the other images. Results from the correlations are shown in the following table.

Image                              Mean Correlation with REAL
Photograph                           .8918
* 2 Ambient Bounces                  .843
8 Ambient Bounces                    .884
Brightened 8 Ambient Bounces         .865
* Default                            .337
* Controlled Error Materials         .692
Tone Mapped                          .879
Controlled Error Illumination        .862
* Raytraced                          .505
* Radiosity                          .830

Table 2. Comparison of rendered images to the real environment

A star in the table indicates a statistically significant difference, reflecting a reliable decrement in quality when compared to the photograph. The significant t values were as follows: Two Ambient Bounces: t(17) = 3.11, p < .01; Default Image: t(17) = 12.4, p < .001; Guessed Materials Image: t(17) = 10.7, p < .001; Raytraced Image: t(17) = 9.36, p < .001; Radiosity Image: t(17) = 3.00, p < .01. The t statistic equals (taking Two Ambient Bounces as an example) 3.11, with 17 degrees of freedom (18 participants), and the probability, p, of this result occurring by chance is less than 0.01. For the remaining images, the differences between matching to the photograph and matching to the image were small and not statistically significant. In summary, our results show that there is evidence that the 2 Ambient Bounces image, the Default image, the Controlled Error Materials image, the Raytraced image and the Radiosity image are perceptually degraded compared to the photograph. However, there is no evidence that the other images in this study are perceptually inferior to the photograph. From this we can conclude that the 8 Ambient Bounces image, the Brightened 8 Ambient Bounces image, the Tone Mapped image and the Controlled Error Illumination image are of the same perceptual quality as a photograph of the real scene.
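The analysis just described, per-participant correlations followed by paired comparisons against the photograph condition, can be sketched in a few lines. The data layout and numbers below are synthetic placeholders, not the study's data; the repeated measures ANOVA itself would need a dedicated routine (for example AnovaRM in the statsmodels package).

import numpy as np
from scipy.stats import pearsonr, ttest_rel

# synthetic stand-in data: gray-level matches for 18 participants and
# 21 targets, for the real scene and two of the presented images
rng = np.random.default_rng(0)
n_subjects, n_targets = 18, 21
base = rng.uniform(0, 30, size=(n_subjects, n_targets))
matches = {
    "real": base,
    "photograph": base + rng.normal(0.0, 1.0, size=base.shape),
    "2 ambient bounces": base + rng.normal(0.0, 3.0, size=base.shape),
}

def per_subject_r(real, rendered):
    # Pearson correlation between the gray levels chosen in the real
    # scene and in one condition, computed separately per participant
    return np.array([pearsonr(real[s], rendered[s])[0]
                     for s in range(real.shape[0])])

r_photo = per_subject_r(matches["real"], matches["photograph"])
r_image = per_subject_r(matches["real"], matches["2 ambient bounces"])

# paired comparison t-test across participants, as in the text
t, p = ttest_rel(r_photo, r_image)
print(f"t({len(r_photo) - 1}) = {t:.2f}, p = {p:.4f}")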
6 Conclusions

We have introduced a method for measuring the perceptual equivalence between a real scene and a computer simulation of that scene, from a lightness matching point of view. Because this method is based on psychophysical experiments, the results are produced through the study of human vision rather than from a machine vision point of view. By conducting a series of experiments based on the psychophysics of lightness perception, we can estimate how closely a rendered image resembles the original scene. Results show that, given a real scene and a faithful representation of that scene, the visual response in the two cases is similar. Because of the complexity of human perception and the computational expense of today's rendering algorithms, future work should focus on developing efficient methods whose resulting graphical representations of scenes yield the same perceptual effects as the original scene. To achieve this, the full gamut of colour perception, as opposed to simply lightness, must be considered by introducing scenes of increasing complexity.

References

1. E. H. Adelson, Lightness perception and lightness illusions, MIT Press, 1999, pp. 339–351.
2. J. Cataliotti and A. Gilchrist, Local and global processes in lightness perception, Perception and Psychophysics, vol. 57(2), 1995, pp. 125–135.
3. H. Coolican, Research methods and statistics in psychology, Hodder and Stoughton, Oxford, 1999.
4. S. Daly, The visible differences predictor: an algorithm for the assessment of image fidelity, in A. B. Watson (ed.), Digital Images and Human Vision, MIT Press, 1993, pp. 179–206.
5. J. A. Ferwerda, S. N. Pattanaik, P. Shirley, and D. P. Greenberg, A model of visual adaptation for realistic image synthesis, Computer Graphics 30 (1996), Annual Conference Series, 249–258.
6. J. Gervais, L. O. Harvey Jr., and J. O. Roberts, Identification confusions among letters of the alphabet, Journal of Experimental Psychology: Human Perception and Performance, vol. 10(5), 1984, pp. 655–666.
7. A. Gilchrist, Lightness contrast and failures of lightness constancy: a common explanation, Perception and Psychophysics, vol. 43(5), 1988, pp. 125–135.
8. A. Gilchrist, S. Delman, and A. Jacobsen, The classification and integration of edges as critical to the perception of reflectance and illumination, Perception and Psychophysics 33 (1983), no. 5, 425–436.
9. A. Gilchrist and A. Jacobsen, Perception of lightness and illumination in a world of one reflectance, Perception 13 (1984), 5–19.
10. A. L. Gilchrist, The perception of surface blacks and whites, Scientific American 240 (1979), no. 3, 88–97.
11. J. L. Mannos and D. J. Sakrison, The effects of a visual fidelity criterion on the encoding of images, IEEE Transactions on Information Theory IT-20 (1974), no. 4, 525–536.
12. A. McNamara, A. Chalmers, T. Troscianko, and E. Reinhard, Fidelity of graphics reconstructions: A psychophysical investigation, Proceedings of the 9th Eurographics Rendering Workshop, Springer Verlag, June 1998, pp. 237–246.
13. G. W. Meyer, H. E. Rushmeier, M. F. Cohen, D. P. Greenberg, and K. E. Torrance, An experimental evaluation of computer graphics imagery, ACM Transactions on Graphics 5 (1986), no. 1, 30–50.
14. K. Myszkowski, The visible differences predictor: Applications to global illumination problems, Rendering Techniques ’98 (Proceedings of the Eurographics Rendering Workshop ’98), G. Drettakis and N. Max (eds.), Springer Wien, 1998, pp. 233–236.
15. H. Rushmeier, G. Ward, C. Piatko, P.
Sanders, and B. Rust, Comparing real and synthetic images: Some ideas about metrics, Eurographics Rendering Workshop 1995, Eurographics, June 1995. 16. D. Travis, Effective color displays, Academic Press, 1991. 17. G. J. Ward, The RADIANCE lighting simulation and rendering system, Proceedings of SIGGRAPH ’94 (Orlando, Florida, July 24–29, 1994) (Andrew Glassner, ed.), Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, ACM Press, July 1994, ISBN 0-89791-667-0, pp. 459–472. 12 -- -- The RADIANCE Lighting Simulation and Rendering System Gregory J. Ward Lighting Group Building Technologies Program Lawrence Berkeley Laboratory (e-mail: [email protected]) ABSTRACT This paper describes a physically-based rendering system tailored to the demands of lighting design and architecture. The simulation uses a light-backwards ray-tracing method with extensions to efficiently solve the rendering equation under most conditions. This includes specular, diffuse and directionaldiffuse reflection and transmission in any combination to any level in any environment, including complicated, curved geometries. The simulation blends deterministic and stochastic ray-tracing techniques to achieve the best balance between speed and accuracy in its local and global illumination methods. Some of the more interesting techniques are outlined, with references to more detailed descriptions elsewhere. Finally, examples are given of successful applications of this free software by others. CR Categories: I.3.3 [Computer Graphics]: Picture/image generation - Display algorithms ; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism - Shading. Additional Keywords and Phrases: lighting simulation, Monte Carlo, physically-based rendering, radiosity, ray-tracing. 1. Introduction Despite voluminous research in global illumination and radiosity over the past decade, few practical applications have surfaced in the fields that stand the most to benefit: architecture and lighting design. Most designers who use rendering software employ it in a purely illustrative fashion to show geometry and style, not to predict lighting or true appearance. The designers cannot be blamed for this; rendering systems that promote flash over content have been the mainstay of the graphics industry for years, and the shortcuts employed are well-understood by the software community and well-supported by the hardware manufacturers. Why has radiosity not yet taken off in the rendering market? Perhaps not enough time has passed since its introduction to the graphics community a decade ago [8]. After all, it took ray-tracing nearly that long to become a mainstream, commercial rendering technique. Another possibility is that the method is too compute-intensive for most applications, or that it simply does not fulfill enough people’s needs. For example, most radiosity systems are not well automated, and do not permit general reflectance models or curved surfaces. If we are unable to garner support even from the principal beneficiaries, designers, what does that say of our chances with the rest of the user community? Acceptance of physically-based rendering is bound to improve†, but researchers must first demonstrate the real-life applicability of their techniques. There have been few notable successes in applying radiosity to the needs of practicing designers [6]. 
While much research has focused on improving the efficiency of the basic radiosity method, problems associated with more realistic, complicated geometries have only recently gotten the attention they deserve [2,19,22]. For whatever reason, it appears that radiosity has yet to fulfill its promise, and it is time to reexamine this technique in light of real-world applications and other alternatives for solving the rendering equation [10]. There are three closely related challenges to physically-based rendering for architecture and lighting design: accuracy, generality and practicality. The first challenge is that the calculation must be accurate; it must compute absolute values in physical units with reasonable certainty. Although recent research in global illumination has studied sources of calculation error [1,20], few researchers bother to compute in physical lighting units, and even fewer have compared their results to physical experiments [15]. No matter how good the theory is, accuracy claims for simulation must ultimately be backed up with comparisons to what is being simulated. The second challenge is that a rendering program must be general. It is not necessary to simulate every physical lighting phenomenon, but it is important to do enough that the unsolvable rendering problems are either unimportant or truly exceptional. The third challenge for any rendering system is that it be practical. This includes a broad spectrum of requirements, from being reliable (i.e. debugged and tested) to being application-friendly, to producing good results in a reasonable time. All three of the above challenges must be met if a physically-based rendering package is to succeed, and all three must be treated with equal importance.

†The term "physically-based rendering" is used throughout the paper to refer to rendering techniques based on physical principles of light behavior for local and global illumination. The term "simulation" is more general, referring to any algorithm that mimics a physical process.

Radiance is the name of a rendering system developed by the author over the past nine years at the Lawrence Berkeley Laboratory (LBL) in California and the Ecole Polytechnique Federale de Lausanne (EPFL) in Switzerland. It began as a study in ray-tracing algorithms, and after demonstrating its potential for saving energy through better lighting design, acquired funding from the U.S. Department of Energy and later from the Swiss government. The first free software release was in 1989, and since then it has found an increasing number of users in the research and design community. Although it has never been a commercial product, Radiance has benefited enormously from the existence of an enthusiastic, active and growing user base, which has provided invaluable debugging help and stress-testing of the software. In fact, most of the enhancements made to the system were the outcome of real or perceived user requirements. This is in contrast to much of the research community, which tends to respond to intriguing problems before it responds to critical ones. Nine years of user-stimulated software evolution gives us the confidence to claim we have a rendering system that goes a long way towards satisfying the needs of the design community. Further evidence has been provided by the two or three design companies who have abandoned their own in-house software (some of which cost over a million dollars to develop) in favor of Radiance.
In this paper, we describe the Radiance system design goals, followed with the principal techniques used to meet these goals. We follow with examples of how users have applied Radiance to specific problems, followed by conclusions and ideas for future directions. 2. System Design Goals The original goals for the Radiance system were modest, or so we thought. The idea was to produce an accurate tool for lighting simulation and visualization based on ray-tracing. Although the initial results were promising, we soon learned that there was much more to getting the simulation right than plugging proper values and units into a standard ray-tracing algorithm. We needed to overcome some basic shortcomings. The main shortcoming of conventional ray-tracing is that diffuse interreflection between surfaces is approximated by a uniform "ambient" term. For many scenes, this is a poor approximation, even if the ambient term is assigned correctly. Other difficulties arise in treating light distribution from large sources such as windows, skylights, and large fixtures. Finally, reflections of lights from mirrors and other secondary sources are problematic. These problems, which we will cover in some detail later, arose from the consideration of our system design goals, given below. The principal design goals of Radiance were to: 1. Ensure accurate calculation of luminance 2. Model both electric light and daylight 3. Support a variety of reflectance models 4. Support complicated geometry 5. Take unmodified input from CAD systems These goals reflect many years of experience in architectural lighting simulation; some of them are physically-motivated, others are user-motivated. All of them must be met before a lighting simulation tool can be of significant value to a designer. 2.1. Ensure Accurate Calculation of Luminance Accuracy is one of the key challenges in physically-based rendering, and luminance (or the more general "spectral radiance") is probably the most versatile unit in lighting. Photometric units such as luminance are measured in terms of visible radiation, and radiometric units such as radiance are measured in terms of power (energy/time). Luminance represents the quantity of visible radiation passing through a point in a given direction, measured in lumens/steradian/meter2 in SI units. Radiance is the radiometric equivalent of luminance, measured in watts/steradian/meter2. Spectral radiance simply adds a dependence on wavelength to this. Luminance and spectral radiance are most closely related to a pixel, which is what the eye actually "sees." From this single unit, all other lighting metrics can be derived. Illuminance, for example, is the integral of luminance over a projected hemisphere (lumens/meter2 or "lux" in SI units). Luminous intensity and luminous flux follow similar derivations. By computing the most basic lighting unit, our simulation will adapt more readily to new applications. To assure that a simulation delivers on its promise, it is essential that the program undergo periodic validation. In our case, this means comparing luminance values predicted by Radiance to measurements of physical models. An initial validation was completed in 1989 by Grynberg [9], and subsequent validations by ourselves and others confirm that the values are getting better and not worse [14]. 2.2. Model Both Electric Light and Daylight In order to be general, a lighting calculation must include all significant sources of illumination. 
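As an aside to the unit definitions in section 2.1 above (this relation is standard photometry, not something specific to Radiance), the derivation of illuminance from luminance can be written out explicitly:

E = \int_{2\pi} L(\omega)\,\cos\theta\,d\omega

so a surface that sees a uniform luminance L over the entire hemisphere receives E = \pi L (lux when L is in candela per square metre). This is the sense in which the other lighting metrics can be derived from the single directional quantity computed by the simulation.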
Daylight simulation is of particular interest to architects, since the design of the building facade and to a lesser degree the interior depends on daylight considerations. Initially, Radiance was designed to model electric light in interior spaces. With the addition of algorithms for modeling diffuse interreflection [25], the software became more accurate and capable of simulating daylight (both sun and sky contributions) for building interiors and exteriors. The role of daylight simulation in Radiance was given new importance when the software was chosen by the International Energy Agency (IEA) for its daylight modeling task* [4].

*The IEA is a consortium of researchers from developed nations cooperatively seeking alternative energy sources and ways of improving energy efficiency in their countries.

2.3. Support a Variety of Reflectance Models

Luminance is a directional quantity, and its value is strongly determined by a material’s reflectance/transmittance distribution function. If luminance is calculated using a Lambertian (i.e. diffuse) assumption, specular highlights and reflections are ignored and the result can easily be wrong by a hundred times or more. We cannot afford to lose directional information if we hope to use our simulation to evaluate visual performance, visual comfort and aesthetics. A global illumination program is only as general as its local illumination model. The standard model of ambient plus diffuse plus Phong specular is not good enough for realistic image synthesis. Radiance includes the ability to model arbitrary reflectance and transmittance functions, and we have also taken empirical measurements of materials and modeled them successfully in our system [29].

2.4. Support Complicated Geometry

A lighting simulation of an empty room is not very interesting, nor is it very informative. The contents of a room must be included if light transfer is to be calculated correctly. Also, it is difficult for humans to evaluate aesthetics based on visualizations of empty spaces. Furniture, shadows and other details provide the visual cues a person needs to understand the lighting of a space. Modeling exteriors is even more challenging, often requiring hundreds of thousands of surfaces. Although we leave the definition of "complicated geometry" somewhat loose, including it as a goal means that we shall not limit the geometric modeling capability of our simulation in any fundamental way. To be practical, data structure size should grow linearly (at worst) with geometric complexity, and there should be no built-in limit as to the number of surfaces. To be accurate, we shall support a variety of surface primitives, also ensuring our models are as memory-efficient as possible. To be general, we shall provide N-sided polygons and a mechanism for interpolating surface normals, so any reasonable shape may be represented. Finally, computation time should have a sublinear relationship to the number of surfaces so that the user does not pay an unreasonable price for accurate modeling.

2.5. Take Unmodified Input from CAD Systems

If we are to model complicated geometry, we must have a practical means to enter these models into our simulation. The creation of a complicated geometric model is probably the most difficult task facing the user. It is imperative that the user be allowed every means to simplify this task, including advanced CAD systems and input devices. If our simulation limits this process in any way, its value is diminished.
Therefore, to the greatest degree possible, we must accept input geometry from any CAD environment. This is perhaps the most difficult of the goals we have outlined, as the detail and quality of CAD models varies widely. Many CAD systems and users produce only 2D or wireframe models, which are next to useless for simulation. Other CAD systems, capable of producing true 3D geometric models, cannot label the component surfaces and associate the material information necessary for an accurate lighting simulation. These systems require a certain degree of user intervention and post-processing to complete the model. Even the most advanced CAD systems, which produce accurate 3D models with associated surface data, do not break surfaces into meshes suitable for a radiosity calculation. The missing information must either be added by the user, inferred from the model, or the need for it must be eliminated. In our case, we eliminate this need by using something other than a radiosity (i.e. finite element) algorithm. CAD translators have been written for AutoCAD, GDS, ArchiCAD, DesignWorkshop, StrataStudio, Wavefront, and Architrion, among others. None of these translators requires special intervention by the user to reorient surface normals, eliminate T-vertices, or mesh surfaces. The only requirement is that surfaces must somehow be associated with a layer or identifier that indicates their material type.

3. Approach

We have outlined the goals for our rendering system and linked them back to the three key challenges of accuracy, generality and practicality. Let us now explore some of the techniques we have found helpful in meeting these goals and challenges. We start with a basic description of the problem we are solving and how we go about solving it in section 3.1, followed by specific solution techniques in sections 3.2 to 3.5. Sections 3.6 to 3.9 present some important optimizations, and section 3.10 describes the overall implementation and use of the system.

3.1. Hybrid Deterministic/Stochastic Ray Tracing

Essentially, Radiance uses ray-tracing in a recursive evaluation of the following integral equation at each surface point:

L_r(\theta_r,\phi_r) = L_e(\theta_r,\phi_r) + \int_0^{2\pi}\!\int_0^{\pi} L_i(\theta_i,\phi_i)\,\rho_{bd}(\theta_i,\phi_i;\theta_r,\phi_r)\,|\cos\theta_i|\,\sin\theta_i\,d\theta_i\,d\phi_i        (1)

where:
θ is the polar angle measured from the surface normal
φ is the azimuthal angle measured about the surface normal
L_e(θ_r, φ_r) is the emitted radiance (watts/steradian/meter² in SI units)
L_r(θ_r, φ_r) is the reflected radiance
L_i(θ_i, φ_i) is the incident radiance
ρ_bd(θ_i, φ_i; θ_r, φ_r) is the bidirectional reflectance-transmittance distribution function (steradian⁻¹)

This equation is essentially Kajiya’s rendering equation [10] with the notion of energy transfer between two points replaced by energy passing through a point in a specific direction (i.e. the definition of radiance). This formula has been documented many times, going back before the standard definition of ρ_bd [16]. Its generality and simplicity provide the best foundation for building a lighting simulation. This formulation of the rendering problem is a natural for ray tracing because it gives outgoing radiance in terms of incoming radiance over the projected sphere, without any explicit mention of the model geometry. The only thing to consider at any one time is the light interaction with a specific surface point, and how best to compute this integral from spawned ray values.
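To connect Equation (1) with the "spawned ray values" just mentioned, the naive Monte Carlo estimator of the integral (a textbook construction, not a description of Radiance's actual sampling strategy) draws N directions (θ_k, φ_k) uniformly over the sphere of directions, traces a ray along each to obtain L_i, and averages:

L_r(\theta_r,\phi_r) \approx L_e(\theta_r,\phi_r) + \frac{4\pi}{N} \sum_{k=1}^{N} L_i(\theta_k,\phi_k)\,\rho_{bd}(\theta_k,\phi_k;\theta_r,\phi_r)\,|\cos\theta_k|

Each L_i(θ_k, φ_k) is itself evaluated by applying the same estimator recursively at whatever surface the spawned ray strikes. It is exactly this estimator's slow convergence that the techniques described next address, by pulling light sources and specular peaks out of the stochastic sum.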
Thus, no restrictions are placed on the number or shape of surfaces or surface elements, and discretization (meshing) of the scene is unnecessary and even irrelevant. Although it is possible to approximate a solution to this equation using uniform stochastic sampling (i.e. Monte Carlo), the convergence under most conditions is so slow that such a solution is impractical. For example, a simple outdoor scene with a ground plane, a brick and the sun would take days to compute using naive Monte Carlo simply because the sun is so small (0.5° of arc) in comparison to the rest of the sky. It would take many thousands of samples per pixel to properly integrate light coming from such a concentrated source. The key to fast convergence is in deciding what to sample by removing those parts of the integral we can compute deterministically and gauging the importance of the rest so as to maximize the payback from our ray calculations. In the case of the outdoor scene just described, we would want to consider the sun as an important contribution to be sampled separately, thus removing the biggest source of variance from our integral. Instead of relying on random samples over the hemisphere, we send a single sample ray towards the sun, and if it arrives unobstructed, we use a deterministic calculation of the total solar contribution based on the known size and luminosity of the sun as a whole. We are making the assumption that the sun is not partially occluded, but such an assumption would only be in error within the penumbra of a solar shadow region, and we know these regions to represent a very small portion of our scene. Light sources cause peaks in the incident radiance distribution, Li (θi ,φi ). Directional reflection and transmission cause peaks in the scattering function, ρbd . This will occur for reflective materials near the mirror angle, and in the refracted direction of dielectric surfaces (e.g. glass). Removing such peak reflection and transmission angles by sending separate samples reduces the variance of our integral at a comparatively modest cost. This approach was introduced at the same time as raytracing by Whitted [31]. Further improvements were made by adding stochastic sampling to the deterministic source and specular calculations by Cook in the first real linking of stochastic and deterministic techniques [5]. Radiance employs a tightly coupled source and specular calculation, described in [29]. 3.2. Cached Indirect Irradiances for Diffuse Interreflection No matter how successful we are at removing the specular reflections and direct illumination from the integral (1), the cost of determining the remaining diffuse indirect contributions is too great to recalculate at every pixel because this requires tracing hundreds of rays to reduce the variance to tolerable levels. Therefore, most ray-tracing calculations ignore diffuse interreflection between surfaces, using a constant "ambient" term to replace the missing energy. Part of the reason a constant ambient value has been accepted for so long (other than the cost of replacing it) is that diffuse interreflection changes only gradually over surfaces. Thus, the contrast-sensitive eye usually does not object to the loss of subtle shading that accompanies an ambient approximation. However, the inaccuracies that result are a problem if one wants to know light levels or see the effects of daylight or -- -- indirect lighting systems. 
Since indirect lighting changes gradually over surfaces, it should be possible to spread out this influence over many pixels to obtain a result that is smooth and accurate at a modest sampling cost. This is exactly what we have done in Radiance. The original method for computing and using cached irradiance values [25] has been enhanced using gradient information [28]. The basic idea is to perform a full evaluation of Equation (1) for indirect diffuse contributions only as needed, caching and interpolating these values over each surface. Direct and specular components are still computed on a per-pixel basis, but hemispherical sampling occurs less frequently. This gives us a good estimate of the indirect diffuse contribution when we need it by sending more samples than we would be able to afford for a pixel-independent calculation. The approach is effectively similar to finite element methods that subdivide surfaces into patches, calculate accurate illumination at one point on each patch and interpolate the results. However, an explicit mesh is not used in our method, and we are free to adjust the density of our calculation points in response to the illumination environment. Furthermore, since we compute these view-independent values only as needed, separate form factor and solution stages do not have to complete over the entire scene prior to rendering. This can amount to tremendous savings in large architectural models where only a portion is viewed at any one time. Figure 1 looks down on a diffuse sphere in a room with indirect lighting only. A blue dot has been placed at the position of each indirect irradiance calculation. Notice that the values are irregularly spaced and denser underneath the sphere, on the sphere and near the walls at the edges of the image. Thus, the spacing of points adapts to changing illumination to maintain constant accuracy with the fewest samples. To compute the indirect irradiance at a point in our scene, we send a few hundred rays that are uniformly distributed over the projected hemisphere. If any of our rays hits a light source, we disregard it since the direct contribution is computed separately. This sampling process is applied recursively for multiple reflections, and it does not grow exponentially because each level has its own cache of indirect values. rotational gradient is positive in this direction), and our hemisphere samples contain this information as well. Formalizing these observations, we have developed a numerical approximation to the irradiance gradient based on hemisphere samples. Unfortunately, its derivation does not fit easily into a general paper, so we refer the reader to the original research [28]. Figure 3 Figure 2 Our hemisphere samples not only tell us the total indirect illumination, they also give us more detailed information about the locations and brightnesses of surfaces visible from the evaluation point. This information may be used to predict how irradiance will change as a function of point location and surface orientation, effectively telling us the first derivative (gradient) of the irradiance function. For example, we may have a bright reflecting surface behind and to the right of a darker surface as shown in Figure 2. Moving our evaluation point to the right would yield an increase in the computed irradiance (i.e. the translational gradient is positive in this direction), and our samples can tell us this. A clockwise rotation of the surface element would also cause an increase in the irradiance value (i.e. 
the Figure 3a,b. Plots showing the superiority of gradient interpolation for indirect irradiance values. The reference curve is an exact calculation of the irradiance along the red line in Figure 1. The linear interpolation is equivalent to Gouraud shading between evenly spaced points, as in radiosity rendering. The Hermite cubic interpolation uses the gradient values computed by Radiance, and is not only smoother but demonstrably more accurate than a linear interpolation. Knowing the gradient in addition to the value of a function, we can use a higher order interpolation method to get a better irradiance estimate between the calculated points. In effect, we will obtain a smoother and more accurate result without having to do any additional sampling, and with very little overhead. (Evaluating the gradient formulas costs almost nothing compared to computing the hemisphere samples.) Figure 3a shows the irradiance function across the floor of Figure 1, along the red line. The exact curve is shown overlaid with a linearly interpolated value between regularly spaced calculation points, and a Hermite cubic interpolation using com- -- -- puted gradients. The cubic interpolation is difficult to separate from the exact curve. Figure 3b shows the relative error for these two interpolation methods, clearly demonstrating the advantage of using gradient information. Caching indirect irradiances has four important advantages over radiosity methods. First, no meshing is required, since a separate octree data structure is used to hold the calculated values. This lifts restrictions on geometric shapes and complexity, and greatly simplifies user input and scene analysis. Second, we only have to compute those irradiances affecting the portion of the scene being viewed. This speeds rendering time under any circumstance, since our view-independent values may be reused in subsequent images (unlike values computed with importance-driven radiosity [20]). Third, the density of irradiance calculations is reduced at each level of interreflection, maintaining constant accuracy while reducing the time required to compute additional bounces. Fourth, the technique adapts to illumination by spacing values more closely in regions where there may be large gradients, without actually using the gradient as a criterion. This eliminates errors that result from using initial samples to decide sampling density [12], and improves accuracy overall. The gradient is used to improve value interpolation, yielding a smoother and more accurate result without the Machbands that can degrade conventional radiosity images. 3.3. Adaptive Sampling of Light Sources Although sending one sample ray to each light source is quite reasonable for outdoor scenes, such an approach is impractical for indoor scenes that may have over a hundred light sources. Most rays in a typical calculation are in fact shadow rays. It is therefore worth our while to rule out light sources that are unimportant and avoid testing them for visibility. The method we use in Radiance for reducing the number of shadow rays is described in [26]. A prioritized list of potential source contributions is created at each evaluation of Equation (1). The largest potential contributors (contribution being a function of source output, proximity and ρbd ) are tested for shadows first, and we stop testing when the remainder of the source list is below some fraction of the unoccluded contributions. 
The remaining source contributions are then added based on statistical estimates of how likely each of them is to be visible. Figure 4 Figure 4 shows a simple example of how this works. The left column represents our sorted list of potential light source contributions for a specific sample point. We proceed down our list, checking the visibility of each source by tracing shadow rays, and summing together the unobstructed contributions. After each test, we check to see if the remainder of our potential contributions has fallen below some specified fraction of our accumulated total. If we set our accuracy goal to 10%, we can stop testing after four light sources because the remainder of the list is less than 10% of our known direct value. We could either add all of the remainder in or throw it away and our value would still be within 10% of the correct answer. But we can do better than that; we can make an educated guess at the visibility of the remaining sources using statistics. Taking the history of obstructed versus unobstructed shadow rays from previous tests of each light source, we multiply this probability of hitting an untested source by the ratio of successful shadow tests at this point over all successful shadow tests (2/(.9+.55+.65+.95) == 0.65 in this example), and arrive at a reasonable estimate of the remainder. (If any computed multiplier is greater than 1, 1 is used instead.) Our total estimate of the direct contribution at this point is then the sum of the tested light sources and our statistical estimate of the remainder, or 1616 in this example. We have found this method to be very successful in reducing the number of shadow test rays required, and it is possible to place absolute bounds on the error of the approximation. Most importantly, this type of adaptive shadow testing emphasizes contrast as the primary decision criterion. Contrast is defined as the difference between the luminance at a point and the background luminance divided by the background luminance. If a shadow boundary is below the visible contrast threshold, then an error in its calculation is undetectable by the viewer. Thus, this method produces no visible artifacts in its tradeoff of speed for accuracy. Accuracy is still lost in a controlled way, but the resulting image is subjectively flawless, due to the eye’s relative insensitivity to absolute light levels. Figure 5 shows a theater lighting simulation generated by Radiance in 1989. This image contains slightly over a hundred light sources, and originally took about 4 days to render on a Sun-4/260. (The equivalent of about 5 Vax-11/780’s.) Using our adaptive shadow testing algorithm reduced the rendering time to 2 days for the same image†. The time savings for scenes with more light sources can be better than 70%, especially if the light sources have narrow output distributions, such as the spotlights popular in overlighted retail applications. A different problem associated with ray-per-source shadow testing is inadequate sampling of large or nearby sources, which threatens simulation accuracy. For example, a single ray cannot adequately sample a fluorescent desk lamp for a point directly beneath it. The simplest approach for sources that are large relative to their distance is to send multiple sample rays. Unfortunately, breaking a source into pieces and sending many rays to it is inefficient for distant points in the room. Again, an adaptive sampling technique is the most practical solution. 
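Returning to the prioritized shadow-testing scheme described above, the following sketch shows the accumulate-until-the-remainder-is-small loop and the statistical estimate for untested sources. The object interface (potential(), hit_rate) and the 10% tolerance are assumptions made for illustration; this is not Radiance's C implementation.

def direct_contribution(point, sources, trace_shadow_ray, tolerance=0.10):
    # Sort sources by their potential (unoccluded) contribution at `point`.
    ranked = sorted(sources, key=lambda s: s.potential(point), reverse=True)
    potentials = [s.potential(point) for s in ranked]
    total = 0.0       # accumulated, shadow-tested contribution
    n_tested = 0      # number of sources shadow-tested at this point
    n_visible = 0     # how many of those proved unobstructed
    for src, pot in zip(ranked, potentials):
        remainder = sum(potentials[n_tested:])
        if total > 0.0 and remainder < tolerance * total:
            break     # the untested tail is within the error bound
        n_tested += 1
        if trace_shadow_ray(point, src):       # the expensive visibility test
            total += pot
            n_visible += 1
    # Statistical estimate of the untested tail: each source's historical
    # probability of being unobstructed, scaled by the ratio of successful
    # tests at this point to the number expected from history, clamped to 1.
    expected = sum(s.hit_rate for s in ranked[:n_tested])
    openness = n_visible / expected if expected > 0.0 else 1.0
    for src, pot in zip(ranked[n_tested:], potentials[n_tested:]):
        total += pot * min(1.0, src.hit_rate * openness)
    return total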
In our adaptive technique, we send multiple rays to a light source if its extent is large relative to its distance. We recursively divide such sources into smaller pieces until each piece satisfies some size/distance criterion. Figure 6a shows a long, skinny light source that has been broken into halves repeatedly until each source is small enough to keep penumbra and solid angle errors in check. Figure 6b shows a similar subdivision of a large rectangular source.

†The theater model was later rendered in [2] using automatic meshing and progressive radiosity. Meshing the scene caused it to take up about 100 Mbytes of memory, and rendering took over 18 hours on an SGI R3000 workstation for the direct component alone, compared to 5 hours in 11 Mbytes using Radiance on the same computer.

A point far away from either source will not result in subdivision, sending only a single ray to some (randomly) chosen location on the source to determine visibility.

3.4. Automatic Preprocessing of "Virtual" Light Sources

Thus far we have accounted for direct contributions from known light sources, specular reflections and transmission, and diffuse interreflections. However, there are still transfers from specular surfaces that will not be handled efficiently by our calculation. A mirror surface may reflect sunlight onto a diffuse or semispecular surface, for example. Although the diffuse interreflection calculation could in principle include such an effect, we are returning to the original problem of insufficient sampling of an intense light source. A small source reflected specularly is still too small to find in a practical number of naive Monte Carlo samples. We have to know where to look. We therefore introduce "virtual" light sources that do not exist in reality, but are used during the calculation to direct shadow rays in the appropriate directions to find reflected or otherwise transferred light sources. This works for any planar surface, and has been implemented for mirrors as well as prismatic glazings (used in daylighting systems [4]). For example, a planar mirror might result in a virtual sun in the mirror direction from the real sun. When a shadow ray is sent towards the virtual sun, it will be reflected off the mirror to intersect the real sun. An example is shown in Figure 7a. This approach is essentially the same as the "virtual worlds" idea put forth by Rushmeier [18] and exploited by Wallace [24], but it is only carried out for light sources and not for all contributing surfaces. Thus, multiple transfers between specular surfaces can be made practical with this method using intelligent optimization techniques. The first optimization we apply is to limit the scope of a virtual light source to its affected volume. Given a specific source and a specific specular surface, the influence is usually limited to a certain projected volume. Points that fall outside this volume are not affected and thus it is not necessary to consider the source everywhere. Furthermore, multiple reflections of the source are possible only within this volume. We can thus avoid creating virtual-virtual sources in cases where the volume of one virtual source fails to intersect the second reflecting surface, as shown in Figure 7b. The same holds for thrice redirected sources and so on, and virtual source volumes become less likely to intersect with each additional redirection, provided that the reflecting surfaces do not occupy a majority of the space.
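The geometric core of the virtual-source idea for a planar mirror is simply a reflection of the source position across the mirror's plane. The short sketch below illustrates this with hypothetical names and numpy; it is not Radiance's internal representation.

import numpy as np

def reflect_across_plane(point, plane_origin, plane_normal):
    # Mirror a world-space point across the plane defined by a point on
    # the plane and its (not necessarily unit-length) normal.
    n = plane_normal / np.linalg.norm(plane_normal)
    return point - 2.0 * np.dot(point - plane_origin, n) * n

# e.g. a sun position mirrored by a horizontal, mirrored light shelf at z = 2
sun = np.array([100.0, 0.0, 80.0])
virtual_sun = reflect_across_plane(sun,
                                   plane_origin=np.array([0.0, 0.0, 2.0]),
                                   plane_normal=np.array([0.0, 0.0, 1.0]))
# virtual_sun == [100, 0, -76]: a shadow ray aimed at this virtual position
# strikes the shelf and is specularly redirected toward the real sun.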
To minimize the creation of useless virtual light sources, we check very carefully to confirm that the light in fact has some free path between the source and the reflecting surface before creating the virtual source. For example, we might have an intervening surface that prevents all rays from reaching a reflecting surface from a specific light source, such as the situation shown in Figure 7c. We can test for this condition by sending a number of presampling rays between the light source and the reflecting surface, assuming if none of the rays arrives that the reflecting path must be completely obstructed. Conversely, if none of the rays is obstructed, we can save time during shadow testing later by assuming that any ray arriving at the reflecting surface in fact has a free path to the source, and further ray intersection tests are unnecessary. We have found presampling to be very effective in avoiding wasteful testing of completely blocked or unhindered virtual light source paths. Figure 8 shows a cross-section of an office space with a light shelf having a mirrored top surface. Exterior to this office is a building with a mirrored glass facade. Figure 9a shows the interior of the office with sunlight reflected by the shelf onto the ceiling. Light has also been reflected by the exterior, glazed building. Light shelf systems utilize daylight very effectively and are finding increasing popularity among designers. To make our calculation more efficient overall, we have made additional use of "secondary" light sources, described in the next section. -- -- Figure 8 3.5. User-directed Preprocessing of "Secondary" Sources What happens when daylight enters a space through a skylight or window? If we do not treat such "secondary" emitters specially in our calculation, we will have to rely on the ability of the naive Monte Carlo sampling to find and properly integrate these contributions, which is slow. Especially when a window or skylight is partially obscured by venetian blinds or has a geometrically complex configuration, computing its contribution requires significant effort. Since we know a priori that such openings have an important influence on indoor illumination, we can greatly improve the efficiency of our simulation by removing them from the indirect calculation and treating them instead as part of the direct (i.e. source) component. Radiance provides a practical means for the user to move such secondary sources into the direct calculation. For example, the user may specify that a certain window is to be treated as a light source, and a separate calculation will collect samples of the transmitted radiation over all points on the window over all directions, a 4-dimensional function. This distribution is then automatically applied to the window, which is treated as a secondary light source in the final calculation. This method was used in Figure 9a not only for the windows, but also for light reflected by the ceiling. Bright solar patches on interior surfaces can make important contributions to interior illumination. Since this was the desired result of our mirrored light shelf design, we knew in advance that treating the ceiling as a secondary light source might improve the efficiency of our calculation. Using secondary light sources in this scene reduced simulation time to approximately one fifth of what it would have been to reach the same accuracy using the default sampling techniques. 
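One way to picture the user-designated secondary source just described is as a coarse four-dimensional table of outgoing radiance, indexed by position on the window and by direction. The sketch below is an illustrative data structure only, with made-up resolutions; it is not Radiance's actual implementation.

import numpy as np

class SecondarySource:
    # Outgoing radiance of a window, tabulated over (u, v) position on the
    # window and a (phi, theta) direction bin; all coordinates normalised
    # to [0, 1).  Filled by a separate sampling pass, then used by the
    # direct calculation as a smoothed output distribution.
    def __init__(self, n_pos=(8, 8), n_dir=(16, 8)):
        self.sums = np.zeros(n_pos + n_dir)
        self.counts = np.zeros(n_pos + n_dir)

    def _index(self, u, v, phi, theta):
        return tuple(int(c * n) for c, n in zip((u, v, phi, theta),
                                                self.sums.shape))

    def accumulate(self, u, v, phi, theta, radiance):
        idx = self._index(u, v, phi, theta)
        self.sums[idx] += radiance
        self.counts[idx] += 1

    def emitted_radiance(self, u, v, phi, theta):
        idx = self._index(u, v, phi, theta)
        n = self.counts[idx]
        return self.sums[idx] / n if n else 0.0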
Figure 9b shows a Monte Carlo path tracing calculation of the same scene as 9a, and took roughly the same amount of time to compute. The usual optimizations of sending rays to light sources (the sun in this case) and in specular directions were used. Nevertheless, the image is very noisy due to the difficulty of computing interreflection independently at each pixel. Also, locating the sun reflected in the mirrored light shelf is hopeless with naive sampling; thus the ceiling is extremely noisy and the room is not as well lit as it should be. An important aspect of secondary light sources in Radiance is that they have a dual nature. When treated in the direct component calculation, they are merely surfaces with precalculated output distributions. Thus, they can be treated efficiently as light sources and the actual variation that may take place over their extent (e.g. the bright and dark slats of venetian blinds) will not translate into excessive variance in the calculated illumination. However, when viewed directly, they revert to their original form, showing all the appropriate detail. In our office scene example, we can still see through the window despite its treatment as a secondary light source. This is because we treat a ray coming from the eye differently, allowing it to interact with the actual window rather than seeing only a surface with a smoothed output distribution. In fact, only shadow rays see the simplified representation. Specular rays and other sampling will be carried out as if the window was not a light source at all. As is true with the computation of indirect irradiance described in section 3.2, extreme care must be exercised to avoid double-counting of light sources and other inconsistencies in the calculation. 3.6. Hierarchical Octrees for Spatial Subdivision One of the goals of our simulation is to model very complicated geometries. Ray-tracing is well-suited to calculations in complicated environments, since spatial subdivision structures reduce the number of ray-surface intersection tests to a tiny fraction of the entire scene. In Radiance, we use an octree spatial subdivision scheme similar to that proposed by Glassner [7]. Our octree starts with a cube encompassing the entire scene, and recursively subdivides the cube into eight equal subcubes until each voxel (leaf node) intersects or contains less than a certain number of surfaces, or is smaller than a certain size. Figure 10. Plot showing sublinear relationship of intersection time to number of surfaces in a scene. The best fit for γ in this test was 0.245, meaning the ray intersection time grew more slowly than the fourth root of N . The spheres were kept small enough so that a random ray sent from the field’s interior had about a 50% chance of hitting something. (I.e. the sphere radii were proportional to N 1/3.) This guarantees that we are really seeing the cost of complicated geometry, since each ray goes by many surfaces. Although it is difficult to prove in general, our empirical tests show that the average cost of ray intersection using this technique grows as a fractional power of the total number of surfaces, i.e. O (N γ) where γ < 1⁄2. The time to create the octree grows linearly with the number of surfaces, but it is usually only a tiny fraction of the time spent rendering. Figure 10 shows the relationship between ray intersection time and number of surfaces for a uniformly distributed random field of spheres. The basic surface primitives supported in Radiance are polygons, spheres and cones. 
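The recursive octree construction described above can be sketched as follows; the termination constants and the surface interface are illustrative assumptions, not Radiance's defaults.

class OctreeNode:
    # Subdivide a cube into eight subcubes until a voxel holds few enough
    # surfaces or becomes too small, as described above.  Surfaces are any
    # objects with an intersects_box(center, half_size) test.
    def __init__(self, center, half_size, surfaces,
                 max_surfaces=6, min_half_size=0.01):
        self.center, self.half_size = center, half_size
        self.children, self.surfaces = [], list(surfaces)
        if len(self.surfaces) > max_surfaces and half_size > min_half_size:
            h = half_size / 2.0
            for dx in (-h, h):
                for dy in (-h, h):
                    for dz in (-h, h):
                        c = (center[0] + dx, center[1] + dy, center[2] + dz)
                        inside = [s for s in self.surfaces
                                  if s.intersects_box(c, h)]
                        self.children.append(OctreeNode(
                            c, h, inside, max_surfaces, min_half_size))
            self.surfaces = []   # interior node: surfaces live in the leaves

During rendering, a ray visits only the voxels it actually passes through, which is what produces the sublinear growth in intersection cost reported in Figure 10.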
Generator programs provide conversion from arbitrary shape definitions (e.g. surfaces of -- -- revolution, prisms, height fields, parametric patches) to these basic types. Additional scene complexity is modeled using hierarchical instancing, similar to the method proposed by Snyder [21]. In our application of instancing, objects to be instanced are stored in a separate octree, then this octree is instanced along with other surfaces to create a second, enclosing octree. This process is repeated as many times and in as many layers as desired to produce the combined scene. It is possible to model scenes with a virtually unlimited number of surfaces using this method. Figure 11 shows a cabin in a forest. We began with a simple spray of 150 needles, which were put into an octree and instanced many times and combined with twigs to form a branch, which was in turn instanced and combined with larger branches and a trunk to form a pine tree. This pine tree was then put in another octree and instanced in different sizes and orientations to make a small stand of trees, which was combined with a terrain and cabin model to make this scene. Thus, four hierarchical octrees were used together to create this scene, which contains over a million surfaces in all. Despite its complexity, the scene still renders in a couple of hours, and the total data structure takes less than 10 Mbytes of RAM. 3.7. Patterns and Textures Another practical way to add detail to a scene is through the appropriate infusion of surface detail. In Radiance, we call a variation in surface color and/or brightness a pattern, and a perturbation of the surface normal a texture. This is more in keeping with the English definitions of these words, but sometimes at odds with the computer graphics community, which seems to prefer the term "texture" for a color variation and "bump-map" for a perturbation of the surface normal. In any case, we have extended the notion somewhat by allowing patterns and textures to be functions not only of surface position but also of surface normal and ray direction so that a pattern, for example, may also be used to represent a light source output distribution. Our treatment of patterns and textures was inspired by Perlin’s flexible shading language [17], to which we have added the mapping of lookup functions for multi-dimensional data. Using this technique, it is possible to interpret tabulated or image data in any manner desired through the same functional language used for procedural patterns and textures. Figure 12 shows a scene with many patterns and textures. The textures on the vases and oranges and lemons are procedural, as is the pattern on the bowl. The pattern on the table is scanned, and the picture on the wall is obviously an earlier rendering. Other patterns which are less obvious in this scene are the ones applied to the three light sources, which define their output distributions. The geometry was created with the generator programs included with Radiance, which take functional specifications in the same language as the procedural patterns and textures. The star patterns are generated using a Radiance filter option that uses the pixel magnitude in deciding how much to spread the image, showing one advantage of using a floatingpoint picture format [27]. (The main advantage of this format is the ability to adjust exposure after rendering, taking full advantage of tone mapping operators and display calibration [23,30].) 3.8. 
Parallel Processing One of the most practical ways to reduce calculation time is with parallel processing. Ray-tracing is a natural for parallel processing, since the calculation of each pixel is relatively independent. However, the caching of indirect irradiance values in Radiance means that we benefit from sharing information between pixels that may or may not be neighbors in one or more images. Sharing this information is critical to the efficiency of a parallel computation, and we want to do this in a system-independent way. We have implemented a coarse-grained, multiple instruction, shared data (MISD) algorithm for Radiance rendering†. This technique may be applied to a single image, where multiple processes on one or more machines work on small sections of the image simultaneously, or to a sequence of images, where each process works on a single frame in a long animation. In the latter case, we need only worry about the sharing of indirect irradiance values on multiple active invocations, since dividing the image is not an issue. The method we use is described below. Indirect irradiance values are written to a shared file whose contents are checked by each process prior to update. If the file has grown, the new values (produced by other processes) are read in before the irradiances computed by this process are written out. File consistency is maintained through the NFS lock manager, thus values may be shared transparently across the network. Irradiance values are written out in blocks big enough to avoid contention problems, but not so big that there is a lot of unnecessary computation caused by insufficient value sharing. We found this method to be much simpler, and about as efficient, as a remote procedure call (RPC) approach. Since much of the scene information is static throughout the rendering process, it is wasteful to have multiple copies on a multi-processing platform that is capable of sharing memory. As with value sharing, we wished to implement memory sharing in a system-independent fashion. We decided to use the memory sharing properties of the UNIX fork(2) call. All systems capable of sharing memory do so during fork on a copy-on-write basis. Thus, a child process need not be concerned that it is sharing its parent’s memory, since it will automatically get its own memory the moment it stores something. We can use this feature to our advantage by reading in our entire scene and initializing all the associated data structures before forking a process to run in parallel. So long as we do not alter any of this information during rendering, we will share the associated memory. Duplicate memory may still be required for data that is generated during rendering, but in most cases this represents a minor fraction of our memory requirements. 3.9. Animation Radiance is often used to create walk-through animations of static environments. Though this is not typically the domain of ray-tracing renderers, we employ some techniques to make the process more efficient. The most important technique is the use of recorded depth information at each pixel to interpolate fully ray-traced frames with a z-buffer algorithm. Our method is similar to the one explained by Chen et al [3], where pixel depths are used to recover an approximate 3-dimensional model of the visible portions of the scene, and a z-buffer is used to make visibility decisions for each intermediate view. 
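The fragment below sketches that depth-based reprojection on a toy 16x16 frame, in the spirit of the method of Chen et al. [3]. It is only a sketch of the general idea, not the filter Radiance actually uses: the camera model, frame contents, and every identifier are invented, and hole filling from additional key frames is omitted.

/* reproject.c -- sketch of depth-based frame interpolation (illustrative only) */
#include <stdio.h>
#include <math.h>

#define W 16
#define H 16
#define FOCAL 16.0           /* focal length in pixel units */
#define BIG 1e30

/* A ray-traced key frame: per-pixel color (here a single grey value) and depth. */
static double key_color[H][W], key_depth[H][W];
static double out_color[H][W], zbuf[H][W];

int main(void)
{
    double dx = 0.5;         /* camera translation between key frame and new frame */

    /* Fabricate a key frame: a near square patch (depth 4) over a far background (depth 10). */
    for (int j = 0; j < H; j++)
        for (int i = 0; i < W; i++) {
            int is_near = (i > 4 && i < 10 && j > 4 && j < 10);
            key_color[j][i] = is_near ? 1.0 : 0.2;
            key_depth[j][i] = is_near ? 4.0 : 10.0;
        }

    for (int j = 0; j < H; j++)
        for (int i = 0; i < W; i++) { out_color[j][i] = 0.0; zbuf[j][i] = BIG; }

    /* Reproject every key-frame pixel into the new view; the z-buffer keeps the nearest. */
    for (int j = 0; j < H; j++)
        for (int i = 0; i < W; i++) {
            double z = key_depth[j][i];
            double x = (i - W/2 + 0.5) * z / FOCAL;   /* recover approximate 3-D point */
            double y = (j - H/2 + 0.5) * z / FOCAL;
            double xn = x - dx;                        /* same point in the new camera frame */
            int in = (int)floor(FOCAL * xn / z + W/2);
            int jn = (int)floor(FOCAL * y  / z + H/2);
            if (in < 0 || in >= W || jn < 0 || jn >= H) continue;
            if (z < zbuf[jn][in]) {                    /* nearest surface wins */
                zbuf[jn][in] = z;
                out_color[jn][in] = key_color[j][i];
            }
        }

    /* Pixels left untouched are holes that a full system would fill from another key frame. */
    for (int j = 0; j < H; j++) {
        for (int i = 0; i < W; i++)
            putchar(zbuf[j][i] == BIG ? '?' : (out_color[j][i] > 0.5 ? '#' : '.'));
        putchar('\n');
    }
    return 0;
}

Because an intermediate frame costs only a reprojection pass rather than a full ray-traced rendering, most frames in a walk-through can be interpolated rather than computed from scratch.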
This makes it possible to generate 30 very good-looking frames for each second of animation while only having to render about 5 of them. Another technique we use is unique to Radiance, which is the sharing of indirect irradiance values. Since these values are view-independent, there is no sense in recomputing them each time, and sharing them during the animation process distributes the cost over so many frames that the incremental cost of simulating diffuse interreflection is negligible. Finally, it is possible to get interactive frame rates from advanced rendering hardware using illumination maps instead of ray-tracing the frames directly. (An illumination map is a 2-dimensional array of color values that defines the surface shading.) Such maps may be kept separate from the surfaces' own patterns and textures, then combined during rendering. Specular surfaces will not appear correct since they depend on the viewer's perspective, but this may be a necessary sacrifice when user control of the walk-through is desired.

† Data sharing is of course limited in the case of distributed processors, where each node must have its own local copy of scene data structures.

RADIANCE File Types

Data Type | Format | Created by | Used for
Scene Description | ASCII text | text editor, CAD translator | geometry, materials, patterns, textures
Function File | ASCII text | text editor | surface tessellation, patterns, textures, scattering functions, coordinate mappings, data manipulation
Data File | ASCII integers and floats | luminaire data translator, text editor | N-dimensional patterns, textures, scattering functions
Polygonal Font | ASCII integers | Hershey set, font design system, font translator, text editor | text patterns, label generator
Octree | Binary | scene compiler (oconv) | fast ray intersection, incremental scene compilation, object instancing
Picture | run-length encoded 4-byte/pixel floating-point | renderer, filter, image translator | interactive display, hard copy, lighting analysis, material pattern, rendering recovery
Ambient File | Binary | renderer, point value program | sharing view-independent indirect irradiance values

Table 1. All binary types in Radiance are portable between systems, and have a standard information header specifying the format and the originating command(s).
Interactive rendering has long been touted as a principal advantage of radiosity, when in fact complete view-independence is primarily a side-effect of assuming diffuse reflection. Radiance calculates the same values using a ray-tracing technique, and storage and rendering may even be more efficient since large polygons need not be subdivided into hundreds of little ones -- an illumination map works just as well or better.

3.10. Implementation Issues

Radiance is a collection of C programs designed to work in concert, communicating via the standard data types listed in Table 1. The system may be compiled directly on most UNIX platforms, including SGI, Sun, HP, DEC, Apple (A/UX), and IBM (RS/6000). Portability is maintained over 60,000+ lines of code using the Kernighan and Ritchie standard [11] and conservative programming practices that do not rely on system-specific behaviors or libraries. (In addition to UNIX support, there is a fairly complete Amiga port by Per Bojsen, and a limited MS-DOS port by Karl Grau.)

A typical rendering session might begin with the user creating or modifying a geometric model of the space using a CAD program. (The user spends about 90% of the time on geometric modeling.) The CAD model is then translated into a Radiance scene description file, using either a stand-alone program or a function within the CAD system itself. The user might then create or modify the materials, patterns and textures associated with this model, and add some objects from a library of predefined light sources and furnishings. The completed model would then be compiled by oconv into an octree file, which would be passed to the interactive renderer, rview, to verify the desired view and calculation parameters. Finally, a batch rendering would be started with rpict, and after a few minutes or a few hours, the raw picture would be filtered (i.e. anti-aliased via image reduction) by pfilt using a suitable exposure level and target resolution. This finished picture may be displayed with ximage, translated to another format, printed, or further analyzed using one of the many Radiance image utilities. This illustrates the basic sequence of:

model → convert → render → filter → display

all of which may be put in a single pipelined command if desired.

As Radiance has evolved over the years, it has become increasingly sophisticated, with nearly 100 programs that do everything from CAD translation to surface tessellation to lighting calculations and rendering to image filtering, composition and conversion. With this sophistication comes great versatility, but learning the ins and outs of the programs, even the few needed for simple rendering, is impractical for most designers. To overcome system complexity and improve the reliability of rendering results, we have written an executive control program, called rad. This program takes as its input a single file that identifies the material and scene description files needed as well as qualitative settings related to this environment and the simulation desired. The control program then calls the other programs with the proper parameters in the proper sequence.
The intricacies of the Radiance rendering pipeline are thus replaced by a few intuitive variable settings. For example, there is a variable called "DETAIL", which might be set to "low" for an empty room, "medium" for a room with a few pieces of furniture and "high" for a complicated room with many furnishings and textures. This variable will be used with a few others like it to determine how many rays to send out in the Monte Carlo sampling of indirect lighting, how closely to space these values, how densely to sample the image plane, and so on. One very important variable that affects nearly all rendering parameters is called "QUALITY". Low quality renderings come out quickly but may not look as good as medium quality renderings, and high quality renderings take a while but when they finish, the images can go straight into a magazine article. This notion of replacing many algorithm-linked rendering parameters with a few qualitative variables has greatly improved the usability of Radiance and the reliability of its output. The control program also keeps track of octree creation, secondary source generation, aborted job recovery, image filtering and anti-aliasing, and running the interactive renderer. The encoding of expertise in this program has been so successful, in fact, that we rely on it ourselves almost 100% for setting parameters and controlling the rendering process. Although the addition of a control program is a big improvement, there are still many aspects of Radiance that are not easily accessible to the average user. We have therefore added a number of utility scripts for performing specific tasks from the more general functions that constitute the system. One example of this is the falsecolor program, which calls other image filter programs and utilities to generate an image showing luminance contours or other data associated with a scene or -- -- rendering. Figure 9c shows our previous rendering (Figure 9a) superimposed with illuminance contours. These contours tell the lighting designer if there is enough light in the right places or too much light in the wrong places -- information that is difficult to determine from a normal image†. Even with a competent rendering control program and utility scripts for accomplishing specific tasks, there are still many designers who would not want to touch this system with an extended keyboard. Modern computer users expect a list of pull-down menus with point-and-click options that reduce the problem to a reasonably small and understandable set of alternatives. We are currently working on a graphical user interface (GUI) to the rad control program, which would at least put a friendlier face on the standard system. A more effective longterm solution is to customize the rendering interface for each problem domain, e.g. interior lighting design, daylighting, art, etc. Due to our limited resources and expertise, we have left this customization task to third parties who know more about specific applications, and who stand to benefit from putting their GUI on our simulation engine. So far, there are a half dozen or so developers working on interfaces to Radiance. 4. Applications and Results The real proof of a physically-based rendering system is the problems it solves. Here we see how well we have met the challenges and goals we set out. Radiance has been used by hundreds of people to solve thousands of problems over the years. In the color pages we have included some of the more recent work of some of the more skilled users. 
The results have been grouped into two application areas, electric lighting problems and daylighting problems. 4.1. Electric Lighting Electric lighting was the first domain of Radiance, and it continues to be a major strength. A model may contain any number of light sources of all shapes and sizes, and the output distributions may be entered as either near-field or far-field data. The dual nature of light sources (mentioned in section 3.5) also permits detailed modeling of fixture geometry, which is often important in making aesthetic decisions. There are several application areas where electric lighting is emphasized. The most obvious application is lighting design. Figure 13 shows a comparative study between three possible lighting alternatives in a hotel lobby space. Several other designs were examined in this exploration of design visualization. With such a presentation, the final decision could be safely left to the client. One design application that requires very careful analysis is indirect lighting. Figure 14 shows a simulation of a new control center for the London Underground. The unusual arrangement of upwardly directed linear fluorescents was designed to provide general lighting without affecting the visibility of the central display panel (image left). Stage lighting is another good application of physicallybased rendering. The designs tend to be complex and changing, and the results must be evaluated aesthetically (i.e. visually). Figure 15 shows a simulation of a scene from the play Julius Caesar . Note the complex shadows cast by the many struts in the stage set. Computing these shadows with a radiosity algorithm would be extremely difficult. hhhhhhhhhhhhhhhhhhhhh †Actually, Radiance pictures do contain physical values through a combination of the 4-byte floating-point pixel representation and careful tracking of exposure changes [27], but the fidelity of any physical image presentation is limited by display technology and viewing conditions. We therefore provide the convenience of extracting numerical values with our interactive display program. 4.2. Daylighting Daylight poses a serious challenge to physically-based rendering. It is brilliant, ever-changing and ever-present. At first, the daylight simulation capabilities in Radiance were modest, limited mostly to exteriors and interiors with clear windows or openings. Designers, especially architects, wanted more. They wanted to be able to simulate light through venetian blinds, intricate building facades and skylights. In 1991, the author was hired on sabbatical by EPFL to improve the daylight simulation capabilities of Radiance, and developed some of the techniques described earlier in this paper. In particular, the large source adaptive subdivision, virtual source and secondary source calculations proved very important for daylighting problems. The simplest application of daylight is exterior modeling. Many CAD systems have built-in renderers that will compute the solar position from time of day, year, and location, and generate accurate shadows. In addition to this functionality, we wanted Radiance to show the contributions of diffuse skylight and interreflection. Figure 16 shows the exterior of the Mellencamp Pavillion, an Indiana University project that recently acquired funding (along with its name). A more difficult daylighting problem is atrium design*. Designing an atrium requires thorough understanding of the daylight availability in a particular region to succeed. 
Figure 17 shows an atrium space modeled entirely within Radiance, without the aid of a CAD program [13]. The hierarchical construction of Radiance scene files and the many programmable object generators makes text-editor modeling possible, but most users prefer a "mousier" approach. Daylighted interiors pose one of the nastiest challenges in rendering. Because sunlight is so intense, it is usually diffused or baffled by louvers or other redirecting systems. Some of these systems can be quite elaborate, emphasizing the need for simulation in their design. Figure 18 shows the interior of the pavillion from Figure 16. Figure 19 shows a library room illuminated by a central skylight. Figure 20a shows a simulation of a daylighted museum interior. Daylight is often preferred in museums as it provides the most natural color balance for viewing paintings, but control is also very important. Figure 20b shows a false color image of the illuminance values on room surfaces; it is critical to keep these values below a certain threshold to minimize damage to the artwork. 5. Conclusion We have presented a physically-based rendering system that is accurate enough, general enough, and practical enough for the vast majority of lighting design and architectural applications. The simulation uses a light-backwards ray-tracing method with extensions to handle specular, diffuse and directional-diffuse reflection and transmission in any combination to any level in any environment. Not currently included in the calculation are participating media, diffraction and interference, phosphorescence, and polarization effects. There is nothing fundamental preventing us from modeling these processes, but so far there has been little demand for them from our users. The principle users of Radiance are researchers and educators in public and private institutions, and lighting specialists at large architectural, engineering and manufacturing firms. There are between 100 and 200 active users in the U.S. and Canada, and about half as many overseas. This community is continually growing, and as the Radiance interface and documentation improves, the growth rate is likely to increase. hhhhhhhhhhhhhhhhhhhhh *An atrium is an enclosed courtyard with a glazed roof structure for maximizing daylight while controlling the indoor climate. -- -- For the graphics research community, we hope that Radiance will provide a basis for evaluating new physically-based rendering techniques. To this end, we provide both the software source code and a set of precomputed test cases on our ftp server. The test suite includes diffuse and specular surfaces configured in a simple rectangular space with and without obstructions. More complicated models are also provided in object libraries and complete scene descriptions. 6. Acknowledgements Most of the color figures in this paper represent the independent work of Radiance users, and are reprinted with permission. Figure 5 was created by Charles Ehrlich from a design by Mark Mack Architects of San Francisco. Figure 11 was created by Cindy Larson. Figures 12 and 13 were created by Martin Moeck of Siemens Lighting, Germany. Figure 14 was created by Steve Walker of Ove Arup and Partners, London. Figure 15 was created by Robert Shakespeare of the Theatre Computer Visualization Center at Indiana University. Figures 16, 18 and 19 were created by Scott Routen and Reuben McFarland of the University Architect’s Office at Indiana University. 
Figures 17 and 20 were created by John Mardaljevic of the ECADAP Group at De Montfort University, Leicester.

The individuals who have contributed to Radiance through their support, suggestions, testing and enhancements are too numerous to mention. Nevertheless, I must offer my special thanks to: Peter Apian-Bennewitz, Paul Bourke, Raphael Compagnon, Simon Crone, Charles Ehrlich, Jon Hand, Paul Heckbert, Cindy Larson, Daniel Lucias, John Mardaljevic, Kevin Matthews, Don McLean, Georg Mischler, Holly Rushmeier, Jean-Louis Scartezzini, Jennifer Schuman, Veronika Summeraur, Philip Thompson, Ken Turkowski, and Florian Wenz.

Work on Radiance was sponsored by the Assistant Secretary for Conservation and Renewable Energy, Office of Building Technologies, Buildings Equipment Division of the U.S. Department of Energy under Contract No. DE-AC03-76SF00098. Additional funding was provided by the Swiss federal government as part of the LUMEN Project.

7. Software Availability

Radiance is available by anonymous ftp from two official sites:

hobbes.lbl.gov (128.3.12.38), Berkeley, California
nestor.epfl.ch (128.178.139.3), Lausanne, Switzerland

For convenience, Radiance 2.4 has been included on the CD-ROM version of these proceedings. From Mosaic, try the following URL: file://hobbes.lbl.gov/www/radiance/radiance.html

8. References

[1] Baum, Daniel, Holly Rushmeier, James Winget, "Improving Radiosity Solutions Through the Use of Analytically Determined Form-Factors," Computer Graphics, Vol. 23, No. 3, July 1989, pp. 325-334.
[2] Baum, Daniel, Stephen Mann, Kevin Smith, James Winget, "Making Radiosity Usable: Automatic Preprocessing and Meshing Techniques for the Generation of Accurate Radiosity Solutions," Computer Graphics, Vol. 25, No. 4, July 1991.
[3] Chen, Shenchang Eric, Lance Williams, "View Interpolation for Image Synthesis," Computer Graphics, August 1993, pp. 279-288.
[4] Compagnon, Raphael, B. Paule, J.-L. Scartezzini, "Design of New Daylighting Systems Using ADELINE Software," Solar Energy in Architecture and Urban Planning, proceedings of the 3rd European Conference on Architecture, Florence, Italy, May 1993.
[5] Cook, Robert, Thomas Porter, Loren Carpenter, "Distributed Ray Tracing," Computer Graphics, Vol. 18, No. 3, July 1984, pp. 137-147.
[6] Dorsey, Julie O'B., Francois Sillion, Donald Greenberg, "Design and Simulation of Opera Lighting and Projection Effects," Computer Graphics, Vol. 25, No. 4, July 1991, pp. 41-50.
[7] Glassner, Andrew S., "Space subdivision for fast ray tracing," IEEE Computer Graphics and Applications, Vol. 4, No. 10, October 1984, pp. 15-22.
[8] Goral, Cindy, Kenneth Torrance, Donald Greenberg, Bennet Battaile, "Modeling the Interaction of Light Between Diffuse Surfaces," Computer Graphics, Vol. 18, No. 3, July 1984, pp. 213-222.
[9] Grynberg, Anat, Validation of Radiance, LBID 1575, LBL Technical Information Department, Lawrence Berkeley Laboratory, Berkeley, California, July 1989.
[10] Kajiya, James T., "The Rendering Equation," Computer Graphics, Vol. 20, No. 4, August 1986.
[11] Kernighan, Brian, Dennis Ritchie, The C Programming Language, Prentice-Hall, 1978.
[12] Kirk, David, James Arvo, "Unbiased Sampling Techniques for Image Synthesis," Computer Graphics, Vol. 25, No. 4, July 1991, pp. 153-156.
[13] Mardaljevic, John and Kevin Lomas, "Creating the Right Image," Building Services / The CIBSE Journal, Vol. 15, No. 7, July 1993, pp. 28-30.
[14] Mardaljevic, John, K.J. Lomas, D.G. Henderson, "Advanced Daylighting Design for Complex Spaces," Proceedings of CLIMA 2000, 1-3 November 1993, London, UK.
[15] Meyer, Gary, Holly Rushmeier, Michael Cohen, Donald Greenberg, Kenneth Torrance, "An Experimental Evaluation of Computer Graphics Imagery," ACM Transactions on Graphics, Vol. 5, No. 1, pp. 30-50.
[16] Nicodemus, F.E., J.C. Richmond, J.J. Hsia, Geometrical Considerations and Nomenclature for Reflectance, U.S. Department of Commerce, National Bureau of Standards, October 1977.
[17] Perlin, Ken, "An Image Synthesizer," Computer Graphics, Vol. 19, No. 3, July 1985, pp. 287-296.
[18] Rushmeier, Holly, Extending the Radiosity Method to Transmitting and Specularly Reflecting Surfaces, Master's Thesis, Cornell Univ., Ithaca, NY, 1986.
[19] Rushmeier, Holly, Charles Patterson, Aravindan Veerasamy, "Geometric Simplification for Indirect Illumination Calculations," Proceedings of Graphics Interface '93, May 1993, pp. 227-236.
[20] Smits, Brian, James Arvo, David Salesin, "An Importance-Driven Radiosity Algorithm," Computer Graphics, Vol. 26, No. 2, July 1992, pp. 273-282.
[21] Snyder, John M., Alan H. Barr, "Ray Tracing Complex Models Containing Surface Tessellations," Computer Graphics, Vol. 21, No. 4, July 1987, pp. 119-128.
[22] Teller, Seth and Pat Hanrahan, "Global Visibility Algorithms for Illumination Computations," Computer Graphics, August 1993, pp. 239-246.
[23] Tumblin, Jack, Holly Rushmeier, "Tone Reproduction for Realistic Images," IEEE Computer Graphics and Applications, Vol. 13, No. 6, November 1993, pp. 42-48.
[24] Wallace, John, Michael Cohen, Donald Greenberg, "A Two-Pass Solution to the Rendering Equation: A Synthesis of Ray Tracing and Radiosity Methods," Computer Graphics, Vol. 21, No. 4, July 1987.
[25] Ward, Gregory, Francis Rubinstein, Robert Clear, "A Ray Tracing Solution for Diffuse Interreflection," Computer Graphics, Vol. 22, No. 4, August 1988.
[26] Ward, Gregory, "Adaptive Shadow Testing for Ray Tracing," Second EUROGRAPHICS Workshop on Rendering, Barcelona, Spain, April 1991.
[27] Ward, Gregory, "Real Pixels," Graphics Gems II, edited by James Arvo, Academic Press, 1991, pp. 80-83.
[28] Ward, Gregory, Paul Heckbert, "Irradiance Gradients," Third EUROGRAPHICS Workshop on Rendering, Bristol, United Kingdom, May 1992.
[29] Ward, Gregory, "Measuring and Modeling Anisotropic Reflection," Computer Graphics, Vol. 26, No. 2, July 1992, pp. 265-272.
[30] Ward, Gregory, "A Contrast-Based Scalefactor for Luminance Display," Graphics Gems IV, edited by Paul Heckbert, Academic Press, 1994.
[31] Whitted, Turner, "An Improved Illumination Model for Shaded Display," Communications of the ACM, Vol. 23, No. 6, June 1980, pp. 343-349.

Figure 2: Irradiance gradients due to bright and dark objects in the environment.
Figure 4: Adaptive shadow testing algorithm, explained in Section 3.4.
Figure 6a: Linear light source is adaptively split to minimize falloff and visibility errors.
Figure 6b: Area light source is subdivided in two dimensions rather than one.
Figure 7a: Virtual source caused by mirror reflection.
Figure 7b: Source reflection in mirror A cannot intersect mirror B, so no virtual-virtual source is created.
Figure 7c: Source rays cannot reach mirror surface, so no virtual source is created.
Figure 8: Cross-section of office space with mirrored light shelf. (Labels: light distribution on ceiling, mirrored upper surface, light distribution on window.)

[Plot: Irradiance Interpolation at x=6.875 -- linear and cubic interpolation versus actual irradiance as a function of Y position.]
[Plot: Irradiance Interpolation Error at x=6.875 -- relative error (%) of the linear and cubic interpolation versus Y position.]
[Plot: Ray Intersection Time vs. Number of Surfaces -- relative execution time data with best fit x^0.245.]

9. Appendix

Table 2 shows some of the more important variables used by the rad program, and the effect they have on the rendering process.

Rad Variable Settings

Variable Name | Interpretation | Legal Values | Affects
DETAIL | geometric detail | High, Med, Low | image sampling, irradiance value density
EXPOSURE | picture exposure | positive real | final picture brightness, ambient approximation
INDIRECT | importance of indirect diffuse contribution | 0, 1, 2, ... | number of diffuse interreflections
PENUMBRAS | importance of soft shadows | True, False | source subdivision, source sampling, image plane sampling
QUALITY | rendering quality/accuracy | High, Med, Low | nearly everything
VARIABILITY | light distribution in the space | High, Med, Low | indirect irradiance interpolation, hemisphere sampling
ZONE | region of interest | Interior/Exterior keyword plus bounding box | irradiance value density, standard viewpoints

Table 2. Rad rendering variables, their meanings, values and effects.

9.1. Comparison to Other Rendering Environments

Although a comprehensive comparison between rendering systems is beyond the scope of this paper, we can make a few simple observations. Keeping in mind the three challenges of accuracy, generality and practicality, we may judge how well each rendering system fares in light of the goals listed in section 2. Note that these goals are specific to predictive rendering for architectural and lighting design -- a system may fail one or more of these requirements and still be quite useful for other applications. The most heavily used commercial rendering environments are graphics libraries.
These libraries are often developed by computer manufacturers specifically for their graphics hardware. They tend to be very fast and eminently practical, but are not physically accurate or sufficiently general in terms of reflectance, transmittance and geometric modeling to be useful for lighting design. Accuracy and generality have been sacrificed for speed. So-called "photo-realistic" rendering systems may be general and practical, but they are not accurate. One of the best examples of photo-realistic rendering software is RenderMan [Upstill89], which is based on the earlier REYES system [Cook87]. Although Renderman can generate some beautiful pictures, global illumination is not incorporated, and it is difficult to create accurate light sources. Shadows are usually cast using a z-buffer algorithm that cannot generate penumbra accurately [Reeves87]. However, the system does provide a flexible shading language that can be programmed to simulate some global illumination effects [Cook84b][Hanrahan90]. In recent years, many free rendering packages have been distributed over a network. RayShade, one of the best free raytracers, does not simulate diffuse interreflection, and uses a nonphysical reflection model. As with most photo-realistic rendering software, accuracy is the key missing ingredient. Filling in this gap, some free radiosity programs are starting to show up on the network. Though the author has not had the opportunity to learn about all of them, most appear to use traditional approaches that are limited to diffuse surfaces and simple environments, and therefore are not general or practical enough for lighting design. Systems for research in rendering and global illumination algorithms exist at hundreds of universities around the world. Few of these systems ever make it out of the laboratory, so it is particularly difficult to judge them in terms of practicality. However, from what research has been published, it appears that most of these systems are based on classic or progressive radiosity techniques. As we have noted, radiosity relies on diffuse surfaces and relatively simple geometries, so its generality is limited. Extensions to non-diffuse environments tend to be very expensive in time and memory, since directionality greatly complicates the governing equations of a view-independent solution [Sillion91]. Recent work on extending an adjoint system of equations for a view-dependent solution [Smits92] to nondiffuse environments appears promising, but the techniques are still limited to simple geometries [Christensen93][Aupperle93]. The basic problem with the radiosity method is that it ties illumination information to surfaces, and this approach runs into trouble when millions of surfaces are needed to represent a scene. Rushmeier et al addressed this problem with their "clumping" approach, which partially divorces illumination from geometry [Rushmeier93]. Baum et al [Baum91] developed techniques for automatically meshing large models, which saves on manual labor but does not reduce time and space requirements. The theater model shown in Figure 5 was rendered in [Baum91] using automatic meshing and progressive radiosity. Meshing the scene caused it to take up about 100 Mbytes of memory, and rendering took over 18 hours on an SGI R3000 workstation for the direct component alone, compared to 5 hours in 11 Mbytes using Radiance on the same computer. 
Some of the larger architecture and engineering firms have the resources to create their own in-house lighting simulation and rendering software. Although it is difficult to speculate as to the nature and capabilities of these systems since they are hidden from public view, the author is aware of at least a halfdozen well-funded projects aimed at putting the state of the art in global illumination into practice. Most of these projects are based on progressive radiosity or something closely related. In at least two cases, Abacus Simulations in Scotland and Siemens Lighting in Germany, in-house software projects have been abandoned after considerable expense in favor of using Radiance. At least two other firms, Ove Arup in England and Phillips Lighting in the Netherlands, use Radiance side by side with inhouse software. Of course, we cannot conclude from this that Radiance is the best, but the trend is encouraging. By far the most relevant class of software to compare is commercial lighting simulation and rendering programs. Most of these systems are practical, or people would not buy them. Most are also accurate, or they would not qualify as lighting -- -- simulations. The problem is lack of generality. LumenMicro (Lighting Technologies, Boulder, Colorado) is the biggestselling program among lighting designers, yet it is limited to modeling environments built from grey, diffuse, axis-aligned rectangles. A more promising product is called LightScape (LightScape Graphics Software, Toronto, Canada). This software uses progressive radiosity and graphics rendering hardware to provide real-time update capabilities. LightScape’s ray tracing module may be used to improve shadow resolution and add specular effects, but this solution is expensive and incomplete. Also, memory costs associated with complicated geometries limit the practicality of this system. To be fair, LightScape is in its initial release, and has some very accomplished researchers working on it. One program that shows great potential has recently been released to the U.S. market, Arris Integra (Sigma Design in Baltimore). This program uses a bidirectional ray tracing technique developed by Fujimoto [Fujimoto92], and its capabilities have been demonstrated in [Scully93]. The chief drawback of this system seems to be that it is somewhat difficult and expensive to use, costing roughly 15,000 dollars for the basic software and taking many long hours to perform its calculations. 9.2. Prospects for the Future of Rendering Today’s researchers in global illumination have the opportunity to decide the future direction of rendering for decades to come. Most commercial rendering systems currently pay little attention to the physical behavior of light, providing shortcuts such as Phong shading and lights with linear fall-off that undermine realism and make the results useless for lighting design and other predictive applications. We believe that the golden road to realistic rendering is physical simulation, but it is necessary to decide which phenomena shall be included and which may safely be left out. If we choose a scope that is too broad, it will incur large computational expenses with little payoff for users. If our scope is too narrow, we will limit the application areas and realism and therefore limit our audience. Global illumination researchers must join together to set standards for physicallybased rendering; standards that will provide a basis for comparison between techniques, and the stability needed for practical progress. 
As part of this larger standardization effort, we would like to see a common scene description format adopted by the rendering community. There are many candidates at this point, but none of them contain physically valid light source and material descriptions. We would welcome the use of the Radiance format, but extending a conventional scene description language might work better. We suggest the formation of a small committee of global and local illumination researchers to decide what should be included in such a format. We further suggest that one or two graphics hardware or software vendors could cover expenses for this task. In return, the vendors would get a new, physically valid foundation for building the next generation of rendering solutions. The future of physically-based rendering depends on cooperation and agreement. We must agree on a starting point and work together towards a goal to bring science to this art. 9.3. Appendix References [Aupperle93] Aupperle, Larry, Pat Hanrahan, ‘‘Importance and Discrete Three Point Transport,’’ Proceedings of the Fourth EUROGRAPHICS Workshop on Rendering , Paris, France, June 1993, pp. 85-94. [Baum91] Baum, Daniel, Stephen Mann, Kevin Smith, James Winget, ‘‘Making Radiosity Usable: Automatic Prepro- cessing and Meshing Techniques for the Generation of Accurate Radiosity Solutions,’’ Computer Graphics , Vol. 25, No. 4, July 1991. [Cook84b] Cook, Robert, ‘‘Shade Trees,’’ Computer Graphics , Vol. 18, No. 3, July 1984, pp. 223-232. [Cook87] Cook, Robert, Loren Carpenter, Edwin Catmull, ‘‘The Reyes Image Rendering Architecture,’’ Computer Graphics , Vol. 21, No. 4, July 1987, pp. 95-102. [Christensen93] Christensen, Per, David Salesin, Tony DeRose, ‘‘A Continuous Adjoint Formulation for Radiance Transport,’’ Proceedings of the Fourth EUROGRAPHICS Workshop on Rendering , Paris, France, June 1993, pp. 95-104. [Fujimoto92] Fujimoto, Akira, Nancy Hays, ‘‘Mission Impossible: High Tech Made in Poland,’’ Computer Graphics and Applications , Vol. 12, No. 2, March 1992, pp. 8-13. [Hanrahan90] Hanrahan, Pat and Jim Lawson, ‘‘A Language for Shading and Lighting Calculations,’’ Computer Graphics , Vol. 24, No. 4, August 1990, pp. 289-298. [Reeves87] Reeves, William, David Salesin, Robert Cook, ‘‘Rendering Antialiased Shadows with Depth Maps,’’ Computer Graphics , Vol. 21, No. 4, July 1987, pp. 283-291. [Rushmeier93] Rushmeier, Holly, Charles Patterson, Aravindan Veerasamy, ‘‘Geometric Simplification for Indirect Illumination Calculations,’’ Proceedings of Graphics Interface ’93 , May 1993, pp. 227-236. [Scully93] Scully, Vincent, ‘‘A Virtual Landmark,’’ Progressive Architecture , September 1993, pp. 80-87. [Sillion91] Sillion, Francois, James Arvo, Stephen Westin, Donald Greenberg, ‘‘A Global Illumination Solution for General Reflectance Distributions,’’ Computer Graphics , Vol 25, No. 4, July 1991, pp. 187-196. [Upstill89] Upstill, Steve, The RenderMan Companion , AddisonWesley, 1989. This paper appeared in IEEE Transactions on Visualization and Computer Graphics, Vol. 3, No. 4, December 1997. A Visibility Matching Tone Reproduction Operator for High Dynamic Range Scenes Gregory Ward Larson† Building Technologies Program Environmental Energy Technologies Division Ernest Orlando Lawrence Berkeley National Laboratory University of California 1 Cyclotron Road Berkeley, California 94720 Holly Rushmeier IBM T.J. 
Watson Research Center Christine Piatko†† National Institute for Standards and Technology January 15, 1997 This paper is available electronically at: http://radsite.lbl.gov/radiance/papers Copyright 1997 Regents of the University of California subject to the approval of the Department of Energy † Author's current address: Silicon Graphics, Inc., Mountain View, CA. †† Author's current address: JHU/APL, Laurel, MD. LBNL 39882 UC 400 A Visibility Matching Tone Reproduction Operator for High Dynamic Range Scenes Gregory Ward Larson Lawrence Berkeley National Laboratory Holly Rushmeier IBM T.J. Watson Research Center Christine Piatko National Institute for Standards and Technology ABSTRACT We present a tone reproduction operator that preserves visibility in high dynamic range scenes. Our method introduces a new histogram adjustment technique, based on the population of local adaptation luminances in a scene. To match subjective viewing experience, the method incorporates models for human contrast sensitivity, glare, spatial acuity and color sensitivity. We compare our results to previous work and present examples of our techniques applied to lighting simulation and electronic photography. Keywords: Shading, Image Manipulation. 1 Introduction The real world exhibits a wide range of luminance values. The human visual system is capable of perceiving scenes spanning 5 orders of magnitude, and adapting more gradually to over 9 orders of magnitude. Advanced techniques for producing synthetic images, such as radiosity and Monte Carlo ray tracing, compute the map of luminances that would reach an observer of a real scene. The media used to display these results -either a video display or a print on paper -- cannot reproduce the computed luminances, or span more than a few orders of magnitude. However, the success of realistic image synthesis has shown that it is possible to produce images that convey the appearance of the simulated scene by mapping to a set of luminances that can be produced by the display medium. This is fundamentally possible because the human eye is sensitive to relative rather than absolute luminance values. However, a robust algorithm for converting real world luminances to display luminances has yet to be developed. The conversion from real world to display luminances is known as tone mapping. Tone mapping ideas were originally developed for photography. In photography or video, chemistry or electronics, together with a human actively controlling the scene lighting and the camera, are used to map real world luminances into an acceptable image on a January 15, 1997 page 1 display medium. In synthetic image generation, our goal is to avoid active control of lighting and camera settings. Furthermore, we hope to improve tone mapping techniques by having direct numerical control over display values, rather than depending on the physical limitations of chemistry or electronics. Consider a typical scene that poses a problem for tone reproduction in both photography and computer graphics image synthesis systems. The scene is a room illuminated by a window that looks out on a sunlit landscape. A human observer inside the room can easily see individual objects in the room, as well as features in the outdoor landscape. This is because the eye adapts locally as we scan the different regions of the scene. If we attempt to photograph our view, the result is disappointing. 
Either the window is overexposed and we can't see outside, or the interior of the room is under-exposed and looks black. Current computer graphics tone operators either produce the same disappointing result, or introduce artifacts that do not match our perception of the actual scene. In this paper, we present a new tone reproduction operator that reliably maps real world luminances to display luminances, even in the problematic case just described. We consider the following two criteria most important for reliable tone mapping: 1. Visibility is reproduced. You can see an object in the real scene if and only if you can see it in the display. Objects are not obscured in under- or over-exposed regions, and features are not lost in the middle. 2. Viewing the image produces a subjective experience that corresponds with viewing the real scene. That is, the display should correlate well with memory of the actual scene. The overall impression of brightness, contrast, and color should be reproduced. Previous tone mapping operators have generally met one of these criteria at the expense of the other. For example, some preserve the visibility of objects while changing the impression of contrast, while others preserve the overall impression of brightness at the expense of visibility. Figure 1. A false color image showing the world luminance values for a window office in candelas per meter squared (cd/m2 or Nits). January 15, 1997 page 2 The new tone mapping operator we present addresses our two criteria. We develop a method of modifying a luminance histogram, discovering clusters of adaptation levels and efficiently mapping them to display values to preserve local contrast visibility. We then use models for glare, color sensitivity and visual acuity to reproduce imperfections in human vision that further affect visibility and appearance. Figure 2. A linear mapping of the luminances in Figure 1 that overexposes the view through the window. Figure 3. A linear mapping of the luminances in Figure 1 that underexposes the view of the interior. January 15, 1997 page 3 Figure 4. The luminances in Figure 1 mapped to preserve the visibility of both indoor and outdoor features using the new tone mapping techniques described in this paper. 2 Previous Work The high dynamic range problem was first encountered in computer graphics when physically accurate illumination methods were developed for image synthesis in the 1980's. (See Glassner [Glassner95] for a comprehensive review.) Previous methods for generating images were designed to automatically produce dimensionless values more or less evenly distributed in the range 0 to 1 or 0 to 255, which could be readily mapped to a display device. With the advent of radiosity and Monte Carlo path tracing techniques, we began to compute images in terms of real units with the real dynamic range of physical illumination. Figure 1 is a false color image showing the magnitude and distribution of luminance values in a typical indoor scene containing a window to a sunlit exterior. The goal of image synthesis is to produce results such as Figure 4, which match our impression of what such a scene looks like. Initially though, researchers found that a wide range of displayable images could be obtained from the same input luminances -such as the unsatisfactory over- and under-exposed linear reproductions of the image in Figures 2 and 3. Initial attempts to find a consistent mapping from computed to displayable luminances were ad hoc and developed for computational convenience. 
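The two linear reproductions of Figures 2 and 3 can be illustrated with a few lines of C. The toy program below is an assumption-laden sketch, not code from any real system: the display limits, the sample luminances, and the two exposure factors are all invented to show that no single linear scale keeps both the dim interior and the bright window view inside a limited display range.

/* linear_expose.c -- toy illustration of the over- and under-exposed linear mappings */
#include <stdio.h>

#define LD_MIN 1.0      /* assumed display black level, cd/m^2  */
#define LD_MAX 100.0    /* assumed display white level, cd/m^2  */

/* Scale world luminance linearly, then clip to the displayable range. */
static double linear_map(double Lw, double scale)
{
    double Ld = scale * Lw;
    if (Ld < LD_MIN) Ld = LD_MIN;
    if (Ld > LD_MAX) Ld = LD_MAX;
    return Ld;
}

int main(void)
{
    /* Representative world luminances: room surfaces versus the sunlit view. */
    double Lw[] = { 2.0, 20.0, 80.0, 5000.0, 20000.0 };
    double expose_interior = 1.0;      /* keeps the room visible, clips the window */
    double expose_window   = 0.005;    /* keeps the window visible, crushes the room */

    for (int i = 0; i < 5; i++)
        printf("world %8.1f  ->  interior exposure %6.1f   window exposure %6.1f\n",
               Lw[i], linear_map(Lw[i], expose_interior),
                      linear_map(Lw[i], expose_window));
    return 0;
}

With the interior exposure, everything brighter than 100 cd/m^2 clips to white and the view through the window is lost; with the window exposure, all of the room surfaces collapse onto the display black level. This is exactly the dilemma the operator presented below is designed to resolve.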
One approach is to use a function that collapses the high dynamic range of luminance into a small numerical range. By taking the cube root of luminance, for example, the range of values is reduced to something that is easily mapped to the display range. This approach generally preserves visibility of objects, our first criterion for a tone mapping operator. However, condensing the range of values in this way reduces fine detail visibility, and distorts impressions of brightness and contrast, so it does not fully match visibility or reproduce the subjective appearance required by our second criterion. January 15, 1997 page 4 A more popular approach is to use an arbitrary linear scaling, either mapping the average of luminance in the real world to the average of the display, or the maximum non-light source luminance to the display maximum. For scenes with a dynamic range similar to the display device, this is successful. However, linear scaling methods do not maintain visibility in scenes with high dynamic range, since very bright and very dim values are clipped to fall within the display's limited dynamic range. Furthermore, scenes are mapped the same way regardless of the absolute values of luminance. A scene illuminated by a search light could be mapped to the same image as a scene illuminated by a flashlight, losing the overall impression of brightness and so losing the subjective correspondence between viewing the real and display-mapped scenes. A tone mapping operator proposed by Tumblin and Rushmeier [Tumblin93] concentrated on the problem of preserving the viewer's overall impression of brightness. As the light level that the eye adapts to in a scene changes, the relationship between brightness (the subjective impression of the viewer) and luminance (the quantity of light in the visible range) also changes. Using a brightness function proposed by Stevens and Stevens [Stevens60], they developed an operator that would preserve the overall impression of brightness in the image, using one adaptation value for real scene, and another adaptation value for the displayed image. Because a single adaptation level is used for the scene, though, preservation of brightness in this case is at the expense of visibility. Areas that are very bright or dim are clipped, and objects in these areas are obscured. Ward [Ward91] developed a simpler tone mapping method, designed to preserve feature visibility. In this method, a non-arbitrary linear scaling factor is found that preserves the impression of contrast (i.e., the visible changes in luminance) between the real and displayed image at a particular fixation point. While visibility is maintained at this adaptation point, the linear scaling factor still results in the clipping of very high and very low values, and correct visibility is not maintained throughout the image. Chiu et al. [Chiu93] addressed this problem of global visibility loss by scaling luminance values based on a spatial average of luminances in pixel neighborhoods. Values in bright or dark areas would not be clipped, but scaled according to different values based on their spatial location. Since the human eye is less sensitive to variations at low spatial frequencies than high ones, a variable scaling that changes slowly relative to image features is not immediately visible. However, in a room with a bright source and dark corners, the method inevitably produces display luminance gradients that are the opposite of real world gradients. 
To make a dark region around a bright source, the transition from a dark area in the room to a bright area shows a decrease in brightness rather than an increase. This is illustrated in Figure 5 which shows a bright source with a dark halo around it. The dark halo that facilitates rendering the visibility of the bulb disrupts what should be a symmetric pattern of light cast by the bulb on the wall behind it. The reverse gradient fails to preserve the subjective correspondence between the real room and the displayed image. Inspired by the work of Chiu et al., Schlick [Schlick95] developed an alternative method that could compute a spatially varying tone mapping. Schlick's work concentrated on improving computational efficiency and simplifying parameters, rather than improving the subjective correspondence of previous methods. January 15, 1997 page 5 Figure 5. Dynamic range compression based on a spatially varying scale factor (from [Chiu93]). Contrast, brightness and visibility are not the only perceptions that should be maintained by a tone mapping operator. Nakamae et al. [Nakamae90] and Spencer et al. [Spencer95] have proposed methods to simulate the effects of glare. These methods simulate the scattering in the eye by spreading the effects of a bright source in an image. Ferwerda et al. [Ferwerda96] proposed a method that accounts for changes in spatial acuity and color sensitivity as a function of light level. Our work is largely inspired by these papers, and we borrow heavily from Ferwerda et al. in particular. Besides maintaining visibility and the overall impression of brightness, the effects of glare, spatial acuity and color sensitivity must be included to fully meet our second criterion for producing a subjective correspondence between the viewer in the real scene and the viewer of the synthetic image. A related set of methods for adjusting image contrast and visibility have been developed in the field of image processing for image enhancement (e.g., see Chapter 3 in [Green83]). Perhaps the best known image enhancement technique is histogram equalization. In histogram equalization, the grey levels in an image are redistributed more evenly to make better use of the range of the display device. Numerous improvements have been made to simple equalization by incorporating models of perception. Frei [Frei77] introduced histogram hyperbolization that attempts to redistribute perceived brightness, rather than screen grey levels. Frei approximated brightness using the logarithm of luminance. Subsequent researchers such as Mokrane [Mokrane92] have introduced methods that use more sophisticated models of perceived brightness and contrast. The general idea of altering histogram distributions and using perceptual models to guide these alterations can be applied to tone mapping. However, there are two important January 15, 1997 page 6 differences between techniques used in image enhancement and techniques for image synthesis and real-world tone mapping: 1. In image enhancement, the problem is to correct an image that has already been distorted by photography or video recording and collapsed into a limited dynamic range. In our problem, we begin with an undistorted array of real world luminances with a potentially high dynamic range. 2. In image enhancement, the goal is to take an imperfect image and maximize visibility or contrast. Maintaining subjective correspondence with the original view of the scene is irrelevant. In our problem, we want to maintain subjective correspondence. 
We want to simulate visibility and contrast, not maximize it. We want to produce visually accurate, not enhanced, images. 3 Overview of the New Method In constructing a new method for tone mapping, we wish to keep the elements of previous methods that have been successful, and overcome the problems. Consider again the room with a window looking out on a sunlit landscape. Like any high dynamic range scene, luminance levels occur in clusters, as shown in the histogram in Figure 6, rather than being uniformly distributed throughout the dynamic range. The failure of any method that uses a single adaptation level is that it maps a large range of sparsely populated real world luminance levels to a large range of display values. If the eye were sensitive to absolute values of luminance difference, this would be necessary. However, the eye is only sensitive to the fact that there are bright areas and dim areas. As long as the bright areas are displayed by higher luminances than the dim areas in the final image, the absolute value of the difference in luminance is not important. Exploiting this aspect of vision, we can close the gap between the display values for high and low luminance regions, and we have more display luminances to work with to render feature visibility. Another failure of using a uniform adaptation level is that the eye rapidly adapts to the level of a relatively small angle in the visual field (i.e., about 1°) around the current fixation point [Moon&Spencer45]. When we look out the window, the eye adapts to the high exterior level, and when we look inside, it adapts to the low interior level. Chiu et al. [Chiu93] attempted to account for this using spatially varying scaling factors, but this method produces noticeable gradient reversals, as shown in Figure 5. Rather than adjusting the adaptation level based on spatial location in the image, we will base our mapping on the population of the luminance adaptation levels in the image. To identify clusters of luminance levels and initially map them to display values, we will use the cumulative distribution of the luminance histogram. More specifically, we will start with a cumulative distribution based on a logarithmic approximation of brightness from luminance values. January 15, 1997 page 7 Figure 6. A histogram of adaptation values from Figure 1 (1° spot luminance averages). First, we calculate the population of levels from a luminance image of the scene in which each pixel represents 1° in the visual field. We make a crude approximation of the brightness values (i.e., the subjective response) associated with these luminances by taking the logarithm of luminance. (Note that we will not display logarithmic values, we will merely use them to obtain a distribution.) We then build a histogram and cumulative distribution function from these values. Since the brightness values are integrated over a small solid angle, they are in some sense based on a spatial average, and the resulting mapping will be local to a particular adaptation level. Unlike Chiu's method however, the mapping for a particular luminance level will be consistent throughout the image, and will be order preserving. Specifically, an increase in real scene luminance level will always be represented by an increase in display luminance. The histogram and cumulative distribution function will allow us to close the gaps of sparsely populated luminance values and avoid the clipping problems of single adaptation level methods. 
By deriving a single, global tone mapping operator from locally averaged adaptation levels, we avoid the reverse gradient artifacts associated with a spatially varying multiplier.

We will use this histogram only as a starting point, and impose restrictions to preserve (rather than maximize) contrast based on models of human perception using our knowledge of the true luminance values in the scene. Simulations of glare and variations in spatial acuity and color sensitivity will be added into the model to maintain subjective correspondence and visibility. In the end, we obtain a mapping of real world to display luminance similar to the one shown in Figure 7. For our target display, all mapped brightness values below 1 cd/m² (0 on the vertical axis) or above 100 cd/m² (2 on the vertical axis) are lost because they are outside the displayable range. Here we see that the dynamic range between 1.75 and 2.5 has been compressed, yet we don't notice it in the displayed result (Figure 4). Compared to the two linear operators, our new tone mapping is the only one that can represent the entire scene without losing object or detail visibility.

Figure 7. A plot comparing the global brightness mapping functions for Figures 1, 2, and 3, respectively.

In the following section, we illustrate this technique for histogram adjustment based on contrast sensitivity. After this, we describe models of glare, color sensitivity and visual acuity that complete our simulation of the measurable and subjective responses of human vision. Finally, we complete the methods presentation with a summary describing how all the pieces fit together.

4 Histogram Adjustment

In this section, we present a detailed description of our basic tone mapping operator. We begin with the introduction of symbols and definitions, and a description of the histogram calculation. We then describe a naive equalization step that partially accomplishes our goals, but results in undesirable artifacts. This method is then refined with a linear contrast ceiling, which is further refined using human contrast sensitivity data.

4.1 Symbols and Definitions

L_w = world luminance (in cd/m²)
B_w = world brightness, log(L_w)
L_wmin = minimum world luminance for scene
L_wmax = maximum world luminance for scene
L_d = display luminance (in cd/m²)
L_dmin = minimum display luminance (black level)
L_dmax = maximum display luminance (white level)
B_de = computed display brightness, log(L_d) [Equation (4)]
N = the number of histogram bins
T = the total number of adaptation samples
f(b_i) = frequency count for the histogram bin at b_i
∆b = the bin step size in log(cd/m²)
P(b) = the cumulative distribution function [Equation (2)]
log(x) = natural logarithm of x
log10(x) = decimal logarithm of x

4.2 Histogram Calculation

Since we are interested in optimizing the mapping between world adaptation and display adaptation, we start with a histogram of world adaptation luminances. The eye adapts for the best view in the fovea, so we compute each luminance over a 1° diameter solid angle corresponding to a potential foveal fixation point in the scene. We use a logarithmic scale for the histogram to best capture luminance population and subjective response over a wide dynamic range. This requires setting a minimum value as well as a maximum, since the logarithm of zero is -∞. For the minimum value, we use either the minimum 1° spot average, or 10^-4 cd/m² (the lower threshold of human vision), whichever is larger.
The maximum value is just the maximum spot average.

We start by filtering our original floating-point image down to a resolution that roughly corresponds to 1° square pixels. If we are using a linear perspective projection, the pixels on the perimeter will have slightly smaller diameter than the center pixels, but they will still be within the correct range. The following formula yields the correct resolution for 1° diameter pixels near the center of a linear perspective image:

S = 2 tan(θ/2) / 0.01745    (1)

where:
S = width or height in pixels
θ = horizontal or vertical full view angle
0.01745 = number of radians in 1°

For example, the view width and height for Figure 4 are 63° and 45° respectively, which yield a sample image resolution of 70 by 47 pixels. Near the center, the pixels will be 1° square exactly, but near the corners, they will be closer to 0.85° for this wide-angle view. The filter kernel used for averaging will have little influence on our result, so long as every pixel in the original image is weighted similarly. We employ a simple box filter.

From our reduced image, we compute the logarithms of the floating-point luminance values. Here, we assume there is some method for obtaining the absolute luminances at each spot sample. If the image is uncalibrated, then the corrections for human vision will not work, although the method may still be used to optimize the visible dynamic range. (We will return to this in the summary.) The histogram is taken between the minimum and maximum values mentioned earlier in equal-sized bins on a log(luminance) scale. The algorithm is not sensitive to the number of bins, so long as there are enough to obtain adequate resolution. We use 100 bins in all of our examples. The resulting histogram for Figure 1 is shown in Figure 6.

4.2.1 Cumulative Distribution

The cumulative frequency distribution is defined as:

P(b) = \frac{\sum_{b_i < b} f(b_i)}{T}    (2)

where:

T = \sum_{b_i} f(b_i)  (i.e., the total number of samples)

Later on, we will also need the derivative of this function. Since the cumulative distribution is a numerical integration of the histogram, the derivative is simply the histogram with an appropriate normalization factor. In our method, we approximate a continuous distribution and derivative by interpolating adjacent values linearly. The derivative of our function is:

\frac{dP(b)}{db} = \frac{f(b)}{T \, \Delta b}    (3)

where:

\Delta b = \frac{\log(L_{wmax}) - \log(L_{wmin})}{N}  (i.e., the size of each bin)

Figure 8. Rendering of a bathroom model mapped with a linear operator.

4.3 Naive Histogram Equalization

If we wanted all the brightness values to have equal probability in our final displayed image, we could now perform a straightforward histogram equalization. Although this is not our goal, it is a good starting point for us. Based on the cumulative frequency distribution just described, the equalization formula can be stated in terms of brightness as follows:

B_{de} = \log(L_{dmin}) + [\log(L_{dmax}) - \log(L_{dmin})] \cdot P(B_w)    (4)

The problem with naive histogram equalization is that it not only compresses dynamic range (contrast) in regions where there are few samples, it also expands contrast in highly populated regions of the histogram. The net effect is to exaggerate contrast in large areas of the displayed image. Take as an example the scene shown in Figure 8. Although we cannot see the region surrounding the lamps due to the clamped linear tone mapping operator, the image appears to us as more or less normal.
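As a concrete illustration of Equations (1)-(4), the following sketch builds the brightness histogram from the reduced 1° image and maps every pixel through its cumulative distribution. It is a minimal NumPy transcription under stated assumptions: the foveal image has already been computed, luminances are calibrated in cd/m², and the array names and default display limits are illustrative rather than taken from the text.

    import numpy as np

    def foveal_resolution(view_angle_deg):
        # Equation (1): image width/height giving roughly 1-degree pixels
        return int(round(2.0 * np.tan(np.radians(view_angle_deg) / 2.0) / 0.01745))

    def naive_equalization(fov_lum, pix_lum, ld_min=1.0, ld_max=100.0, nbins=100):
        # Equations (2)-(4): histogram of world brightness from the foveal samples,
        # cumulative distribution P(b), and display brightness B_de for each pixel.
        lw_min = max(fov_lum.min(), 1e-4)              # lower threshold of vision
        lw_max = fov_lum.max()
        b = np.log(np.clip(fov_lum, lw_min, lw_max))
        edges = np.linspace(np.log(lw_min), np.log(lw_max), nbins + 1)
        f, _ = np.histogram(b, bins=edges)             # bin counts f(b_i)
        T = f.sum()
        cdf = np.concatenate(([0.0], np.cumsum(f))) / T    # P(b) at the bin edges
        bw = np.log(np.clip(pix_lum, lw_min, lw_max))      # world brightness B_w
        P = np.interp(bw, edges, cdf)                      # linear interpolation
        bde = np.log(ld_min) + (np.log(ld_max) - np.log(ld_min)) * P   # Eq. (4)
        return np.exp(bde)                                 # display luminance L_d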
Applying the naive histogram equalization, Figure 9 is produced. The tiles in the shower now have a mottled appearance. Because this region of world luminance values is so well represented, naive histogram equalization spreads it out over a relatively larger portion of the display's dynamic range, generating superlinear contrast in this region.

Figure 9. Naive histogram equalization allows us to see the area around the light sources but contrast is exaggerated in other areas such as the shower tiles.

4.4 Histogram Adjustment with a Linear Ceiling

If the contrast being produced is too high, then what is an appropriate contrast for representing image features? The crude answer is that the contrast in any given region should not exceed that produced by a linear tone mapping operator, since linear operators produce satisfactory results for scenes with limited dynamic range. We will take this simple approach first, and later refine our answer based on human contrast sensitivity. A linear ceiling on the contrast produced by our tone mapping operator can be written thus:

\frac{dL_d}{dL_w} \le \frac{L_d}{L_w}    (5a)

That is, the derivative of the display luminance with respect to the world luminance must not exceed the display luminance divided by the world luminance. Since we have an expression for the display luminance as a function of world luminance for our naive histogram equalization, we can differentiate the exponentiation of Equation (4) using the chain rule and the derivative from Equation (3) to get the following inequality:

\exp(B_{de}) \cdot \frac{f(B_w)}{T \, \Delta b} \cdot \frac{\log(L_{dmax}) - \log(L_{dmin})}{L_w} \le \frac{L_d}{L_w}    (5b)

Since L_d is equal to exp(B_de), this reduces to a constant ceiling on f(b):

f(b) \le \frac{T \, \Delta b}{\log(L_{dmax}) - \log(L_{dmin})}    (5c)

In other words, so long as we make sure no frequency count exceeds this ceiling, our resulting histogram will not exaggerate contrast. How can we create this modified histogram? We considered both truncating larger counts to this ceiling and redistributing counts that exceeded the ceiling to other histogram bins. After trying both methods, we found truncation to be the simplest and most reliable approach. The only complication introduced by this technique is that once frequency counts are truncated, T changes, which changes the ceiling. We therefore apply iteration until a tolerance criterion is met, which says that fewer than 2.5% of the original samples exceed the ceiling.¹ Our pseudocode for histogram_ceiling is given below:

boolean function histogram_ceiling()
    tolerance := 2.5% of histogram total
    repeat {
        trimmings := 0
        compute the new histogram total T
        if T < tolerance then
            return FALSE
        foreach histogram bin i do {
            compute the ceiling
            if f(b_i) > ceiling then {
                trimmings += f(b_i) - ceiling
                f(b_i) := ceiling
            }
        }
    } until trimmings <= tolerance
    return TRUE

This iteration will fail to converge (and the function will return FALSE) if and only if the dynamic range of the output device is already ample for representing the sample luminances in the original histogram. This is evident from Equation (5c), since ∆b is the world brightness range over the number of bins:

f(b_i) \le \frac{T}{N} \cdot \frac{\log(L_{wmax}) - \log(L_{wmin})}{\log(L_{dmax}) - \log(L_{dmin})}    (5d)

¹ The tolerance of 2.5% was chosen as an arbitrary small value, and it seems to make little difference either to the convergence time or the results.
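A runnable version of the truncation loop above might look like the following sketch. It passes the bin counts, world range and display limits in explicitly and keeps the 2.5% tolerance from the text; everything else (argument names, the None return on failure) is an illustrative choice, not part of the original pseudocode.

    import numpy as np

    def histogram_ceiling(f, lw_min, lw_max, ld_min=1.0, ld_max=100.0):
        # Iterative truncation against the linear ceiling of Equation (5c).
        # Returns the trimmed bin counts, or None when the iteration cannot
        # converge (the display range already accommodates the scene, so a
        # simple linear operator should be used instead).
        f = f.astype(float).copy()
        db = (np.log(lw_max) - np.log(lw_min)) / len(f)      # bin size, delta-b
        tolerance = 0.025 * f.sum()                          # 2.5% of the total
        while True:
            T = f.sum()
            if T < tolerance:                                # failure to converge
                return None
            ceiling = T * db / (np.log(ld_max) - np.log(ld_min))   # Equation (5c)
            over = f > ceiling
            trimmings = float(np.sum(f[over] - ceiling))
            f[over] = ceiling
            if trimmings <= tolerance:
                return f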
If the ratio of the world brightness range over the display brightness range is less than one (i.e., our world range fits in our display range), then our frequency ceiling is less than the total count over the number of bins. In that case the tolerance criterion can never be met, since even a uniform distribution of samples would be over the ceiling in every bin. It is easiest to detect this case at the outset by checking the respective brightness ranges, and applying a simple linear operator if compression is unnecessary.

We call this method histogram adjustment rather than histogram equalization because the final brightness distribution is not equalized. The net result is a mapping of the scene's high dynamic range to the display's smaller dynamic range that minimizes visible contrast distortions, by compressing under-represented regions without expanding over-represented ones. Figure 10 shows the results of our histogram adjustment algorithm with a linear ceiling. The problems of exaggerated contrast are resolved, and we can still see the full range of brightness. A comparison of these tone mapping operators is shown in Figure 11. The naive operator is superlinear over a large range, seen as a very steep slope near world luminances around 10^0.8.

Figure 10. Histogram adjustment with a linear ceiling on contrast preserves both lamp visibility and tile appearance.

Figure 11. A comparison of naive histogram equalization with histogram adjustment using a linear contrast ceiling.

The method we have just presented is itself quite useful. We have managed to overcome limitations in the dynamic range of typical displays without introducing objectionable contrast compression artifacts in our image. In situations where we want to get a good, natural-looking image without regard to how well a human observer would be able to see in a real environment, this may be an optimal solution. However, if we are concerned with reproducing both visibility and subjective experience in our displayed image, then we must take it a step further and consider the limitations of human vision.

4.5 Histogram Adjustment Based on Human Contrast Sensitivity

Although the human eye is capable of adapting over a very wide dynamic range (on the order of 10^9), we do not see equally well at all light levels. As the light grows dim, we have more and more trouble detecting contrast. The relationship between adaptation luminance and the minimum detectable luminance change is well studied [CIE81]. For consistency with earlier work, we use the same detection threshold function used by Ferwerda et al. [Ferwerda96]. This function covers sensitivity from the lower limit of human vision to daylight levels, and accounts for both rod and cone response functions. The piecewise fit is reprinted in Table 1.

log10 of just noticeable difference        applicable luminance range
-2.86                                      log10(La) < -3.94
(0.405 log10(La) + 1.6)^2.18 - 2.86        -3.94 ≤ log10(La) < -1.44
log10(La) - 0.395                          -1.44 ≤ log10(La) < -0.0184
(0.249 log10(La) + 0.65)^2.7 - 0.72        -0.0184 ≤ log10(La) < 1.9
log10(La) - 1.255                          log10(La) ≥ 1.9

Table 1. Piecewise approximation for ∆Lt(La).

We name this combined sensitivity function:

∆Lt(La) = "just noticeable difference" for adaptation level La    (6)

Ferwerda et al. did not combine the rod and cone sensitivity functions in this manner, since they used the two ranges for different tone mapping operators.
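For reference, here is a direct transcription of the Table 1 fit, combined into the single threshold function ∆Lt(La) used below; log10 of the just noticeable difference is computed piecewise and then exponentiated.

    import numpy as np

    def delta_L_t(la):
        # Just-noticeable luminance difference (cd/m^2) at adaptation level la,
        # from the piecewise fit of Table 1 (after Ferwerda et al.).
        x = np.log10(la)
        if x < -3.94:
            y = -2.86
        elif x < -1.44:
            y = (0.405 * x + 1.6) ** 2.18 - 2.86
        elif x < -0.0184:
            y = x - 0.395
        elif x < 1.9:
            y = (0.249 * x + 0.65) ** 2.7 - 0.72
        else:
            y = x - 1.255
        return 10.0 ** y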
Since we are using this function to control the maximum reproduced contrast, we combine them at their crossover point of 10^-0.0184 cd/m². To guarantee that our display representation does not exhibit contrast that is more noticeable than it would be in the actual scene, we constrain the slope of our operator to the ratio of the two adaptation thresholds for the display and world, respectively. This is the same technique introduced by Ward [Ward91] and used by Ferwerda et al. [Ferwerda96] to derive a global scale factor. In our case, however, the overall tone mapping operator will not be linear, since the constraint will be met at all potential adaptation levels, not just a single selected one. The new ceiling can be written as:

\frac{dL_d}{dL_w} \le \frac{\Delta L_t(L_d)}{\Delta L_t(L_w)}    (7a)

As before, we compute the derivative of the histogram equalization function (Equation (4)) to get:

\exp(B_{de}) \cdot \frac{f(B_w)}{T \, \Delta b} \cdot \frac{\log(L_{dmax}) - \log(L_{dmin})}{L_w} \le \frac{\Delta L_t(L_d)}{\Delta L_t(L_w)}    (7b)

However, this time the constraint does not reduce to a constant ceiling for f(b). We notice that since L_d equals exp(B_de) and B_de is a function of L_w from Equation (4), our ceiling is completely defined for a given P(b) and world luminance, L_w:

f(B_w) \le \frac{T \, \Delta b \, L_w}{[\log(L_{dmax}) - \log(L_{dmin})] \, L_d} \cdot \frac{\Delta L_t(L_d)}{\Delta L_t(L_w)}    (7c)

where: L_d = exp(B_de), B_de given in Equation (4)

Once again, we must iterate to a solution, since truncating bin counts will affect T and P(b). We reuse the histogram_ceiling procedure given earlier, replacing the linear contrast ceiling computation with the above formula.

Figure 12. Our tone mapping operator based on human contrast sensitivity compared to the histogram adjustment with linear ceiling used in Figure 10. Human contrast sensitivity makes little difference at these light levels.

Figure 12 shows the same curves for the linear tone mapping and histogram adjustment with linear clamping shown before in Figure 11, but with the curve for naive histogram equalization replaced by our human visibility matching algorithm. We see the two histogram adjustment curves are very close. In fact, we would have some difficulty differentiating images mapped with our latest method and histogram adjustment with a linear ceiling. This is because the scene we have chosen has most of its luminance levels in the same range as our display luminances. Therefore, the ratio between display and world luminance detection thresholds is close to the ratio of the display and world adaptation luminances. This is known as Weber's law [Riggs71], and it holds true over a wide range of luminances where the eye sees equally well. This correspondence makes the right-hand sides of Equations (5b) and (7b) equivalent, and so we should expect the same result as a linear ceiling.

Figure 13. The brightness map for the bathroom scene with lights dimmed to 1/100th of their original intensity, where human contrast sensitivity makes a difference.

To see a contrast sensitivity effect, our world adaptation would have to be very different from our display adaptation. If we reduce the light level in the bathroom by a factor of 100, our ability to detect contrast is diminished. This shows up in a relatively larger detection threshold in the denominator of Equation (7c), which reduces the ceiling for the frequency counts. The change in the tone mapping operator is plotted in Figure 13 and the resulting image is shown in Figure 14.
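In code, the only change from the linear case is that the constant ceiling of Equation (5c) becomes a per-bin ceiling from Equation (7c). The sketch below reuses delta_L_t from the previous listing; evaluating P(b) at the bin centers, and the argument layout, are assumptions of this sketch rather than details given in the text.

    import numpy as np

    def sensitivity_ceiling(f, edges, ld_min=1.0, ld_max=100.0):
        # Per-bin ceiling of Equation (7c), evaluated at the histogram bin centers.
        T = f.sum()
        db = edges[1] - edges[0]
        centers = 0.5 * (edges[:-1] + edges[1:])
        lw = np.exp(centers)                       # world luminance for each bin
        P = (np.cumsum(f) - 0.5 * f) / T           # P(b) at the bin centers
        bde = np.log(ld_min) + (np.log(ld_max) - np.log(ld_min)) * P    # Eq. (4)
        ld = np.exp(bde)                           # corresponding display luminance
        jnd_ratio = np.array([delta_L_t(v) for v in ld]) / \
                    np.array([delta_L_t(v) for v in lw])
        return T * db * lw * jnd_ratio / ((np.log(ld_max) - np.log(ld_min)) * ld)

Inside the truncation loop, each bin count would then be clipped against the corresponding entry of this array instead of a single scalar ceiling.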
Figure 13 shows that the linear mapping is unaffected, since we just raise the scale factor to achieve an average exposure. Likewise, the histogram adjustment with a linear ceiling maps the image to the same display range, since its goal is to reproduce linear contrast. However, the ceiling based on human threshold visibility limits contrast over much of the scene, and the resulting image is darker and less visible everywhere except the top of the range, which is actually shown with higher contrast since we now have display range to spare. Figure 14 is darker and the display contrast is reduced compared to Figure 10. Because the tone mapping is based on local adaptation rather than a single global or spot average, threshold visibility is reproduced everywhere in the image, not just around a certain set of values. This criterion is met within the limitations of the display's dynamic range.

Figure 14. The dimmed bathroom scene mapped with the function shown in Figure 13.

5 Human Visual Limitations

We have seen how histogram adjustment matches display contrast visibility to world visibility, but we have ignored three important limitations in human vision: glare, color sensitivity and visual acuity. Glare is caused by bright sources in the visual periphery, which scatter light in the lens of the eye, obscuring foveal vision. Color sensitivity is lost in dark environments, as the light-sensitive rods take over for the color-sensitive cone system. Visual acuity is also impaired in dark environments, due to the complete loss of cone response and the quantum nature of light sensation. In our treatment, we will rely heavily on previous work performed by Moon and Spencer [Moon&Spencer45] and Ferwerda et al. [Ferwerda96], applying it in the context of a locally adapted visibility-matching model.

5.1 Veiling Luminance

Bright glare sources in the periphery reduce contrast visibility because light scattered in the lens and aqueous humor obscures the fovea; this effect is less noticeable when looking directly at a source, since the eye adapts to the high light level. The influence of glare sources on contrast sensitivity is well studied and documented. We apply the work of Holladay [Holladay26] and Moon and Spencer [Moon&Spencer45], which relates the effective adaptation luminance to the foveal average and glare source position and illuminance. In our presentation, we will first compute a low resolution veil image from our foveal sample values. We will then interpolate this veil image to add glare effects to the original rendering. Finally, we will apply this veil as a correction to the adaptation luminances used for our contrast, color sensitivity and acuity models.

Moon and Spencer base their formula for adaptation luminance on the effect of individual glare sources measured by Holladay, which they converted to an integral over the entire visual periphery. The resulting glare formula gives the effective adaptation luminance at a particular fixation for an arbitrary visual field:

L_a = 0.913 \, L_f + K \iint_{\theta > \theta_f} \frac{L(\theta,\phi) \cos\theta \sin\theta}{\theta^2} \, d\theta \, d\phi    (8)

where:
L_a = corrected adaptation luminance (in cd/m²)
L_f = the average foveal luminance (in cd/m²)
L(θ,φ) = the luminance in the direction (θ,φ)
θ_f = foveal half angle, approx. 0.00873 radians (0.5°)
K = constant measured by Holladay, 0.0096

The constant 0.913 in this formula is the remainder from integrating the second part assuming one luminance everywhere.
In other words, the periphery contributes less than 9% to the average adaptation luminance, due to the small value Holladay determined for K. If there are no bright sources, this influence can be safely neglected. However, bright sources will significantly affect the adaptation luminance, and should be considered in our model of contrast sensitivity.

To compute the veiling luminance corresponding to a given foveal sample (i.e., fixation point), we can convert the integral in Equation (8) to an average over peripheral sample values:

L_{v_i} = 0.087 \cdot \frac{\sum_{j \ne i} L_j \cos(\theta_{i,j}) / \theta_{i,j}^2}{\sum_{j \ne i} \cos(\theta_{i,j}) / \theta_{i,j}^2}    (9)

where:
L_{v_i} = veiling luminance for fixation point i
L_j = foveal luminance for fixation point j
θ_{i,j} = angle between sample i and j (in radians)

Since we must compute this sum over all foveal samples j for each fixation point i, the calculation can be very time consuming. We therefore reduce our costs by approximating the weight expression as:

\frac{\cos\theta}{\theta^2} \approx \frac{\cos\theta}{2 - 2\cos\theta}    (10)

Since the angles between our samples are most conveniently available as vector dot products, which is the cosine, the above weight computation is quite fast. However, for large images (in terms of angular size), the L_{v_i} calculation is still the most computationally expensive step in our method due to the double iteration over i and j.

To simulate the effect of glare on visibility, we simply add the computed veil map to our original image. Just as it occurs in the eye, the veiling luminance will obscure the visible contrast on the display by adding to both the background and the foreground luminance.² This was the original suggestion made by Holladay, who noted that the effect glare has on luminance threshold visibility is equivalent to what one would get by adding the veiling luminance function to the original image [Holladay26]. This is quite straightforward once we have computed our foveal-sampled veiling image given in Equation (9). At each image pixel, we perform the following calculation:

L_{pv_k} = 0.913 \, L_{p_k} + L_v(k)    (11)

where:
L_{pv_k} = veiled pixel at image position k
L_{p_k} = original pixel at image position k
L_v(k) = interpolated veiling luminance at k

The L_v(k) function is a simple bilinear interpolation on the four closest samples in our veil image computed in Equation (9). The final image will be lighter around glare sources, and just slightly darker on glare sources (since the veil is effectively being spread away from bright points). Although we have shown this as a luminance calculation, we retain color information so that our veil has the same color cast as the responsible glare source(s).

² The contrast is defined as the ratio of the foreground minus the background over the background, so adding luminance to both foreground and background reduces contrast.

Figure 15 shows our original, fully lit bathroom scene again, this time adding in the computed veiling luminance. Contrast visibility is reduced around the lamps, but the veil falls off rapidly (as 1/θ²) over other parts of the image. If we were to measure the luminance detection threshold at any given image point, the result should correspond closely to the threshold we would measure at that point in the actual scene.

Figure 15. Our tone reproduction operator for the original bathroom scene with veiling luminance added.

Since glare sources scatter light onto the fovea, they also affect the local adaptation level, and we should consider this in the other parts of our calculation.
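Before turning to how the veil feeds back into the adaptation samples, here is a sketch of Equations (9)-(11). It assumes the foveal samples come with unit view vectors so that cos(θ_ij) is a dot product, and it simply clamps negative weights (samples more than 90° away) to zero; that clamp, and the omission of the bilinear interpolation step, are assumptions of this sketch rather than details from the text.

    import numpy as np

    def veil_samples(fov_lum, directions):
        # Equation (9) with the fast weight of Equation (10):
        # cos(theta)/theta^2 is approximated by cos(theta)/(2 - 2 cos(theta)).
        cos_t = np.clip(directions @ directions.T, -1.0, 1.0)
        w = cos_t / (2.0 - 2.0 * cos_t + 1e-9)
        np.fill_diagonal(w, 0.0)            # exclude j == i (the fovea itself)
        w = np.clip(w, 0.0, None)           # keep peripheral contributions only
        return 0.087 * (w @ fov_lum) / np.maximum(w.sum(axis=1), 1e-9)

    def add_veil(pixels, veil_interp):
        # Equation (11): L_pv = 0.913 L_p + L_v(k), with L_v bilinearly
        # interpolated from the foveal veil samples (interpolation not shown).
        return 0.913 * pixels + veil_interp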
We therefore apply the computed veiling luminances to our foveal samples as a correction before the histogram generation and adjustment described in Section 4. We deferred the introduction of this correction factor to simplify our presentation, since in most cases it only weakly affects the brightness mapping function.

The correction to local adaptation is the same as Equation (11), but without interpolation, since our veil samples correspond one-to-one:

L_{a_i} = 0.913 \, L_i + L_{v_i}    (12)

where:
L_{a_i} = adjusted adaptation luminance at fixation point i
L_i = foveal luminance for fixation point i

We will also employ these L_{a_i} adaptation samples for the models of color sensitivity and visual acuity that follow.

5.2 Color Sensitivity

To simulate the loss of color vision in dark environments, we use the technique presented by Ferwerda et al. [Ferwerda96] and ramp between a scotopic (grey) response function and a photopic (color) response function as we move through the mesopic range. The lower limit of the mesopic range, where cones are just starting to get enough light, is approximately 0.0056 cd/m². Below this value, we use the straight scotopic luminance. The upper limit of the mesopic range, where rods are no longer contributing significantly to vision, is approximately 5.6 cd/m². Above this value, we use the straight photopic luminance plus color. In between these two world luminances (i.e., within the mesopic range), our adjusted pixel is a simple interpolation of the two computed output colors, using a linear ramp based on luminance. Since we do not have a value available for the scotopic luminance at each pixel, we use the following approximation based on a least squares fit to the colors on the Macbeth ColorChecker Chart™:

Y_{scot} = Y \cdot \left[ 1.33 \left( 1 + \frac{Y + Z}{X} \right) - 1.68 \right]    (13)

where:
Y_{scot} = scotopic luminance
X, Y, Z = photopic color, CIE 2° observer (Y is luminance)

This is a very good approximation to scotopic luminance for most natural colors, and it avoids the need to render another channel. We also have an approximation based on RGB values, but since there is no accepted standard for RGB primaries in computer graphics, this is much less reliable.

Figure 16 shows our dimmed bathroom scene with the human color sensitivity function in place. Notice there is still some veiling, even with the lights reduced to 1/100th their normal level. This is because the relative luminances are still the same, and they scatter in the eye as before. The only difference here is that the eye cannot adapt as well when there is so little light, so everything appears dimmer, including the lamps. The colors are clearly visible near the light sources, but gradually less visible in the darker regions.

Figure 16. Our dimmed bathroom scene with tone mapping using human contrast sensitivity, veiling luminance and mesopic color response.

5.3 Visual Acuity

Besides losing the ability to see contrast and color, the human eye loses its ability to resolve fine detail in dark environments. The relationship between adaptation level and foveal acuity has been measured in subject studies reported by Shaler [Shaler37]. At daylight levels, human visual acuity is very high, about 50 cycles/degree. In the mesopic range, acuity falls off rapidly from 42 cycles/degree at the top down to 4 cycles/degree near the bottom. Near the limits of vision, the visual acuity is only about 2 cycles/degree.
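Returning briefly to the color sensitivity model of Section 5.2, a sketch of Equation (13) and the mesopic ramp described there follows. Representing the achromatic response as equal XYZ components is a simplification assumed here, not something stated in the text.

    import numpy as np

    def scotopic_luminance(X, Y, Z):
        # Equation (13): least-squares fit to the Macbeth ColorChecker colors
        return Y * (1.33 * (1.0 + (Y + Z) / X) - 1.68)

    def mesopic_color(xyz, lower=0.0056, upper=5.6):
        # Ramp between the grey (scotopic) and full-color (photopic) response
        # across the mesopic range, using a linear ramp on luminance Y.
        X, Y, Z = xyz
        y_scot = scotopic_luminance(X, Y, Z)
        grey = np.array([y_scot, y_scot, y_scot])
        if Y <= lower:
            return grey
        if Y >= upper:
            return np.asarray(xyz, dtype=float)
        t = (Y - lower) / (upper - lower)
        return (1.0 - t) * grey + t * np.asarray(xyz, dtype=float)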
Shaler's original data is shown in Figure 17 along with the following functional fit:

R(L_a) \approx 17.25 \arctan(1.4 \log_{10}(L_a) + 0.35) + 25.72    (15)

where:
R(L_a) = visual acuity in cycles/degree
L_a = local adaptation luminance (in cd/m²)

Figure 17. Shaler's visual acuity data and our functional fit to it.

In their tone mapping paper, Ferwerda et al. applied a global blurring function based on a single adaptation level [Ferwerda96]. Since we wish to adjust for acuity changes over a wide dynamic range, we must apply our blurring function locally according to the foveal adaptation computed in Equation (12). To do this, we implement a variable-resolution filter using an image pyramid and interpolation, which is the mip map introduced by Williams [Williams83] for texture mapping. The only difference here is that we are working with real values rather than integer pixels. At each point in the image, we interpolate the local acuity based on the four closest (veiled) foveal samples and Shaler's data. It is very important to use the foveal data (L_{a_i}) and not the original pixel value, since it is the fovea's adaptation that determines acuity. The resulting image will show higher resolution in brighter areas, and lower resolution in darker areas.

Figure 18 shows our dim bathroom scene again, this time with the variable acuity operator applied together with all the rest. Since the resolution of the printed image is low, we enlarged two areas for a closer look. The bright area has an average level around 25 cd/m², corresponding to a visual acuity of about 45 cycles/degree. The dark area has an average level of around 0.05 cd/m², corresponding to a visual acuity of about 9 cycles/degree.

Figure 18. The dim bathroom scene with variable acuity adjustment. The insets show two areas, one light and one dark, and the relative blurring of the two.

6 Method Summary

We have presented a method for matching the visibility of high dynamic range scenes on conventional displays, accounting for human contrast sensitivity, veiling luminance, color sensitivity and visual acuity, all in the context of a local adaptation model. However, in presenting this method in parts, we have not given a clear idea of how the parts are integrated together into a working program. The order in which the different processes are executed to produce the final image is of critical importance. These are the steps in the order they must be performed:

procedure match_visibility()
    compute 1° foveal sample image
    compute veil image
    add veil to foveal adaptation image
    add veil to image
    blur image locally based on visual acuity function
    apply color sensitivity function to image
    generate histogram of effective adaptation image
    adjust histogram to contrast sensitivity function
    apply histogram adjustment to image
    translate CIE results to display RGB values
end

We have not discussed the final step, mapping the computed display luminances and chrominances to appropriate values for the display device (e.g., monitor RGB settings). This is a well studied problem, and we refer the reader to the literature (e.g., [Hall89]) for details. Bear in mind that the mapped image accounts for the black level of the display, which must be subtracted out before applying the appropriate gamma and color corrections. Although we state that the above steps must be carried out in this order, a few of the steps may be moved around, or removed entirely for a different effect.
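As an aside before discussing which steps may be reordered, the acuity fit of Equation (15) is easy to transcribe, and a local blur radius can be derived from it. The Gaussian-style half-cycle radius below stands in for the mip-map interpolation described above; that substitution, and the parameter names, are assumptions of this sketch.

    import numpy as np

    def visual_acuity(la):
        # Equation (15): foveal acuity in cycles/degree at adaptation la (cd/m^2)
        return 17.25 * np.arctan(1.4 * np.log10(la) + 0.35) + 25.72

    def blur_radius_pixels(la, pixels_per_degree):
        # Features finer than the resolvable spatial frequency are attenuated;
        # return a rough half-cycle radius in pixels for a local blur.
        cutoff = max(visual_acuity(la), 0.5)       # cycles/degree
        return 0.5 * pixels_per_degree / cutoff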
Specifically, it makes little difference whether the luminance veil is added before or after the blurring function, since the veil varies slowly over the image. Also, the color sensitivity function may be applied anywhere after the veil is added so long as it is before histogram adjustment. If the goal is to optimize visibility and appearance without regard to the limitations of human vision, then all the steps between computing the foveal average and generating the histogram may be skipped, and a linear ceiling may be applied during histogram adjustment instead of the human contrast sensitivity function. The result will be an image with all parts visible on the display, regardless of the world luminance level or the presence of glare sources. This may be preferable when the only goal is to produce a nice-looking image, or when the absolute luminance levels in the original scene are unknown. 7 Results In our dynamic range compression algorithm, we have exploited the fact that humans are insensitive to relative and absolute differences in luminance. For example, we can see that it is brighter outside than inside on a sunny day, but we cannot tell how much brighter (3 times or 100) or what the actual luminances are (10 cd/m2 or 1000). With the additional display range made available by adjusting the histogram to close the gaps between luminance levels, visibility (i.e., contrast) within each level can be properly preserved. Furthermore, this is done in a way that is compatible with subjective aspects of vision. In the development sections, two synthetic scenes have served as examples. In this section, we show results from two different application areas -- lighting simulation and electronic photography. January 15, 1997 page 28 Figure 19. A simulation of a shipboard control panel under emergency lighting. Figure 20. A simulation of an air traffic control console. January 15, 1997 page 29 Figure 21. A Christmas tree with very small light sources. 7.1 Lighting Simulation In lighting design, it is important to simulate what it is like to be in an environment, not what a photograph of the environment looks like. Figures 19 and 20 show examples of real lighting design applications. In Figure 19, the emergency lighting of a control panel is shown. It is critical that the lighting provide adequate visibility of signage and levers. An image synthesis method that cannot predict human visibility is useless for making lighting or system design judgments. Figure 20 shows a flight controller's console. Being able to switch back and forth between the console and the outdoor view is an essential part of the controller's job. Again, judgments on the design of the console cannot be made on the basis of ill-exposed or arbitrarily mapped images. Figure 21 is not a real lighting application, but represents another type of interesting lighting. In this case, the high dynamic range is not represented by large areas of either high or low luminance. Very high, almost point, luminances are scattered in the scene. The new tone mapping works equally well on this type of lighting, preserving visibility January 15, 1997 page 30 while keeping the impression of the brightness of the point sources. The color sensitivity and variable acuity mapping also correctly represent the sharp color view of areas surrounding the lights, and the greyed blurring of more dimly lit areas. Figure 22. A scanned photograph of Memorial Church. 7.2 Electronic Photography Finally, we present an example from electronic photography. 
In traditional photography, it is impossible to set the exposure so all areas of a scene are visible as they would be to a human observer. New techniques of digital compositing are now capable of creating images with much higher dynamic ranges. Our tone reproduction operator can be applied to appropriately map these images into the range of a display device. Figure 22 shows the interior of a church, taken on print film by a 35mm SLR camera with a 15mm fisheye lens. The stained glass windows are not completely visible because the recording film has been saturated, even though the rafters on the right are too dark to see. Figure 23 shows our tone reproduction operator applied to a high dynamic range version of this image, called a radiance map. The radiance map was generated from 16 separate exposures, each separated by one stop. These images were scanned, registered, and the full dynamic range was recovered using an algorithm developed by Debevec and Malik January 15, 1997 page 31 [Debevec97]. Our tone mapping operator makes it possible to retain the image features shown in Figure 23, whose world luminances span over 6 orders of magnitude. The field of electronic photography is still in its infancy. Manufacturers are rapidly improving the dynamic range of sensors and other electronics that are available at a reasonable cost. Visibility preserving tone reproduction operators will be essential in accurately displaying the output of such sensors in print and on common video devices. Figure 23. Histogram adjusted radiance map of Memorial Church. 8 Conclusions and Future Work There are still several degrees of freedom possible in this tone mapping operator. For example, the method of computing the foveal samples corresponding to viewer fixation points could be altered. This would depend on factors such as whether an interactive system or a preplanned animation is being designed. Even in a still image, a theory of probable gaze could be applied to improve the initial adaptation histogram. Additional modifications could easily be made to the threshold sensitivity, veil and acuity models to simulate the effects of aging and certain types of visual impairment. This method could also be extended to other application areas. The tone mapping could be incorporated into global illumination calculations to make them more efficient by January 15, 1997 page 32 relating error to visibility. The mapping could also become part of a metric to compare images and validate simulations, since the results correspond roughly to human perception [Rushmeier95]. Some of the approximations in our operator merit further study, such as color sensitivity changes in the mesopic range. A simple choice was made to interpolate linearly between scotopic and photopic response functions, which follows Ferwerda et al. [Ferwerda96] but should be examined more closely. The effect of the luminous surround on adaptation should also be considered, especially for projection systems in darkened rooms. Finally, the current method pays little attention to absolute color perception, which is strongly affected by global adaptation and source color (i.e., white balance). The examples and results we have shown match well with the subjective impression of viewing the actual environments being simulated or recorded. On this informal level, our tone mapping operator has been validated experimentally. To improve upon this, more rigorous validations are needed. 
While validations of image synthesis techniques have been performed before (e.g., Meyer et al. [Meyer86]), they have not dealt with the level of detail required for validating an accurate tone operator. Validation experiments will require building a stable, non-trivial, high dynamic range environment and introducing observers to the environment in a controlled way. Reliable, calibrated methods are needed to capture the actual radiances in the scene and reproduce them on a display following the tone mapping process. Finally, a series of unbiased questions must be formulated to evaluate the subjective correspondence between observation of the physical scene and observation of images of the scene in various media. While such experiments will be a significant undertaking, the level of sophistication in image synthesis and electronic photography requires such detailed validation work. The dynamic range of an interactive display system is limited by the technology required to control continual, intense, focused energy over millisecond time frames, and by the uncontrollable elements in the ambient viewing environment. The technological, economic and practical barriers to display improvement are formidable. Meanwhile, luminance simulation and acquisition systems continue to improve, providing images with higher dynamic range and greater content, and we need to communicate this content on conventional displays and hard copy. This is what tone mapping is all about. 9 Acknowldgments The authors wish to thank Robert Clear and Samuel Berman for their helpful discussions and comments. This work was supported by the Laboratory Directed Research and Development Funds of Lawrence Berkeley National Laboratory under the U.S. Department of Energy under Contract No. DE-AC03-76SF00098. 10 References [Chiu93] K. Chiu, M. Herf, P. Shirley, S. Swamy, C. Wang and K. Zimmerman "Spatially nonuniform scaling functions for high contrast images," Proceedings of Graphics Interface '93, Toronto, Canada, May 1993, pp. 245253. January 15, 1997 page 33 [CIE81] CIE (1981) An analytical model for describing the influence of lighting parameters upon visual performance, vol 1. Technical foundations. CIE 19/2.1, Techical committee 3.1 [Debevec97] Debevec, Paul and Jitendra Malik, "Recovering High Dynamic Range Radiance Maps from Photographs," Proceedings of ACM SIGGRAPH '97. [Ferwerda96] J. Ferwerda, S. Pattanaik, P. Shirley and D.P. Greenberg. "A Model of Visual Adaptation for Realistic Image Synthesis," Proceedings of ACM SIGGRAPH '96, p. 249-258. [Frei77] W. Frei, "Image Enhancement by Histogram Hyperbolization," Computer Graphics and Image Processing, Vol 6, 1977 286-294. [Glassner95] A. Glassner, Principles of Digital Image Synthesis, Morgan Kaufman, San Francisco, 1995. [Green83] W. Green Digital Image Processing: A Systems Approach, Van Nostrand Reinhold Company, NY, 1983. [Hall89] R. Hall, Illumination and Color in Computer Generated Imagery, SpringerVerlag, New York, 1989. [Holladay26] Holladay, L.L., Journal of the Optical Society of America, 12, 271 (1926). [Meyer86] G. Meyer, H. Rushmeier, M. Cohen, D. Greenberg and K. Torrance. "An Experimental Evaluation of Computer Graphics Imagery," ACM Transactions on Graphics, January 1986, Vol. 5, No. 1, pp. 30-50. [Mokrane92] A. Mokrane, "A New Image Contrast Enhancement Technique Based on a Contrast Discrimination Model," CVGIP: Graphical Models and Image Processing, 54(2) March 1992, pp. 171-180. [Moon&Spencer45] P. Moon and D. 
Spencer, "The Visual Effect of Non-Uniform Surrounds", Journal of the Optical Society of America, vol. 35, No. 3, pp. 233-248 (1945) [Nakamae90] E. Nakamae, K. Kaneda, T. Okamoto, and T. Nishita. "A lighting model aiming at drive simulators," Proceedings of ACM SIGGRAPH 90, 24(3):395404, June, 1990. [Rushmeier95] H. Rushmeier, G. Ward, C. Piatko, P. Sanders, B. Rust, "Comparing Real and Synthetic Images: Some Ideas about Metrics,'' Sixth Eurographics Workshop on Rendering, proceedings published by Springer-Verlag. Dublin, Ireland, June 1995. January 15, 1997 page 34 [Schlick95] C. Schlick, "Quantization Techniques for Visualization of High Dynamic Range Pictures," Photorealistic Rendering Techniques (G. Sakas, P. Shirley and S. Mueller, Eds.), Springer, Berlin, 1995, pp.7-20. [Spencer95] G. Spencer, P. Shirley, K. Zimmerman, and D. Greenberg, "Physically-based glare effects for computer generated images," Proceedings ACM SIGGRAPH '95, pp. 325-334. [Stevens60] S. S. Stevens and J.C. Stevens, "Brightness Function: Parametric Effects of adaptation and contrast," Journal of the Optical Society of America, 53, 1139. 1960. [Tumbline93] J. Tumblin and H. Rushmeier. "Tone Reproduction for Realistic Images," IEEE Computer Graphics and Applications, November 1993, 13(6), 42-48. [Ward91] G. Ward, "A contrast-based scalefactor for luminance display," In P.S. Heckbert (Ed.) Graphics Gems IV, Boston, Academic Press Professional. [Williams83] L. Williams, "Pyramidal Parametrics," Computer Graphics, v.17,n.3, July 1983. January 15, 1997 page 35 High Dynamic Range Imaging Greg Ward Exponent – Failure Analysis Assoc. Menlo Park, California Abstract The ultimate in color reproduction is a display that can produce arbitrary spectral content over a 300-800 nm range with 1 arc-minute resolution in a full spherical hologram. Although such displays will not be available until next year, we already have the means to calculate this information using physically-based rendering. We would therefore like to know: how may we represent the results of our calculation in a device-independent way, and how do we map this information onto the displays we currently own? In this paper, we give an example of how to calculate full spectral radiance at a point and convert it to a reasonably correct display color. We contrast this with the way computer graphics is usually done, and show where reproduction errors creep in. We then go on to explain reasonable short-cuts that save time and storage space without sacrificing accuracy, such as illuminant discounting and human gamut color encodings. Finally, we demonstrate a simple and efficient tone-mapping technique that matches display visibility to the original scene. Introduction Most computer graphics software works in a 24-bit RGB space, with 8-bits allotted to each of the three primaries in a power-law encoding. The advantage of this representation is that no tone-mapping is required to obtain a reasonable reproduction on most commercial CRT display monitors, especially if both the monitor and the software adhere to the sRGB standard, i.e., CCIR-709 primaries and a 2.2 gamma [1]. The disadvantage of this practice is that colors outside the sRGB gamut cannot be represented, particularly values that are either too dark or too bright, since the useful dynamic range is only about 90:1, less than 2 orders of magnitude. 
By contrast, human observers can readily perceive detail in scenes that span 4-5 orders of magnitude in luminance through local adaptation, and can adapt in minutes to over 9 orders of magnitude. Furthermore, the sRGB gamut only covers about half the perceivable colors, missing large regions of blue-greens and violets, among others. Therefore, although 24-bit RGB does a reasonable job of representing what a CRT monitor can display, it does a poor job representing what a human observer can see. Display technology is evolving rapidly. Flat-screen LCD displays are starting to replace CRT monitors in many offices, and LED displays are just a few years off. Micromirror projection systems with their superior dynamic range and color gamut are already widespread, and laser raster projectors are on the horizon. It is an important question whether we will be able to take full advantage and adapt our color models to these new devices, or will we be limited as we are now to remapping sRGB to the new gamuts we have available -- or worse, getting the colors wrong? Unless we introduce new color models to our image sources and do it soon, we will never get out of the CRT color cube. The simplest solution to the gamut problem is to adhere to a floating-point color space. As long as we permit values greater than one and less than zero, any set of color primaries may be linearly transformed into any other set of color primaries without loss. The principal disadvantage of most floating-point representations is that they take up too much space (96-bits/pixel as opposed to 24). Although this may be the best representation for color computations, storing this information to disk or transferring it over the internet is a problem. Fortunately, there are representations based on human perception that are compact and sufficiently accurate to reproduce any visible color in 32-bits/pixel or less, and we will discuss some of these in this paper. There are two principal methods for generating high dynamic-range source imagery: physically-based rendering (e.g., [2]), and multiple-exposure image capture (e.g., [3]). In this paper, we will focus on the first method, since it is most familiar to the author. It is our hope that in the future, camera manufacturers will build HDR imaging principles and techniques into their cameras, but for now, the easiest path to full gamut imagery seems to be computer graphics rendering. Computer graphics lifts the usual constraints associated with physical measurements, making floating-point color the most natural medium in which to work. If a renderer is physically-based, it will compute color values that correspond to spectral radiance at each point in the rendered image. These values may later be converted to displayable colors, and the how and wherefore of this tone-mapping operation is the main topic of this paper. Before we get to tone-mapping, however, we must go over some of the details of physically-based rendering, and what qualifies a renderer in this category. Specifically, we will detail the basic lighting calculation, and compare this to common practice in computer graphics rendering. We highlight some common assumptions and approximations, and describe alternatives when these assumptions fail. Finally, we demonstrate color and tone mapping methods for converting the computed spectral radiance value to a displayable color at each pixel. The Spectral Rendering Equation Ro ( o , )= ∫∫ fr ( o ; i, ) Ri ( i, )cos i d i (1) The spectral rendering Eq. 
(1) expresses outgoing spectral radiance R o at a point on a surface in the direction o ( o, o) as a convolution of the bidirectional reflectance distribution function (BRDF) with the incoming spectral radiance over the projected hemisphere. This equation is the basis of many physically-based rendering programs, and it already contains a number of assumptions: 1. Light is reflected at the same wavelength at which it is received; i.e., the surface is not fluorescent. 2. Light is reflected at the same position at which it is received; i.e., there is no subsurface scattering. 3. Surface transmission is zero. 4. There are no polarization effects. 5. There is no diffraction. 6. The surface does not spontaneously emit light. In general, these assumptions are often wrong. Starting with the first assumption, many modern materials such as fabrics, paints, and even detergents, contain “whitening agents” which are essentially phosphors added to absorb ultraviolet rays and re-emit them at visible wavelengths. The second assumption is violated by many natural and man-made surfaces, such as marble, skin, and vinyl. The third assumption works for opaque surfaces, but fails for transparent and thin, translucent objects. The fourth assumption fails for any surface with a specular (shiny) component, and becomes particularly troublesome when skylight (which is strongly polarized) or multiple reflections are involved. The fifth assumption fails when surface features are on the order of the wavelength of visible light, and the sixth assumption is violated for light sources. Each of these assumptions may be addressed and remedied as necessary. Since a more general rendering equation would require a long and tedious explanation, we merely describe what to add to account for the effects listed. To handle fluorescence, the outgoing radiance at wavelength λ o may be computed from an integral of incoming radiance over all wavelengths λ I, which may be discretized in a matrix form [4]. To handle subsurface scattering, we can integrate over the surface as well as incoming directions, or use an approximation [5]. To handle transmission, we simply integrate over the sphere instead of the hemisphere, and take the absolute value of the cosine for the projected area [2]. To account for polarization, we add two terms for the transverse and parallel polarizations in each specular direction [4] [6]. To handle diffraction, we fold interactions between wavelength, polarization, amplitude and direction into the BRDF and the aforementioned extensions [7]. Light sources are the simplest exception to handle – we simply add in the appropriate amount of spontaneous radiance output as a function of direction and wavelength. Participating Media Implicitly missing from Eq. (1) is the interaction of light with the atmosphere, or participating media. If the space between surfaces contains significant amounts of dust, smoke, or condensation, a photon leaving one surface may be scattered or absorbed along the way. An additional equation is therefore needed to describe this volumetric effect, since the rendering equation only addresses interactions at surfaces. dR(s) =− ds a s 4 R(s) − ∫ R( i s i R(s) + )P( i ) d (2) Eq. (2) gives the differential change in radiance as a function of distance along a path. The coefficients σa and σs give the absorption and scattering densities respectively at position s, which correspond to the probabilities that light will be absorbed or scattered per unit of distance traveled. 
The scattering phase function, P( i ), gives the relative probability that a ray will be scattered in from direction i at this position. All of these functions and coefficients are also a function of wavelength. The above differential-integral equation is usually solved numerically by stepping through each position along the path, starting with the radiance leaving a surface given by Eq. (1). Recursive iteration from a sphere of scattered directions can quickly overwhelm such a calculation, especially if it is extended to multiple scattering events. Without going into details, Rushmeier et al. approached the problem of globally participating media using a zonal approach akin to radiosity that divides the scene into a finite set of voxels whose interactions are characterized in a formfactor matrix [8]. More recently, a modified ray-tracing method called the photon map has been applied successfully to this problem by Wann Jensen et al. [9]. In this method, photons are tracked as they scatter and are stored in the environment for later resampling during rendering . Solving the Rendering Equation Eq. (1) is a Fredholm integral equation of the second kind, which comes close to the appropriate level of intimidation but fails to explain why it is so difficult to solve in general [10]. Essentially, the equation defines outgoing radiance as an integral of incoming radiance at a surface point, and that incoming radiance is in turn defined by the same integral with different parameters evaluated at another surface point. Thus, the surface geometry and material functions comprise the boundary conditions of an infinitely recursive system of integral equations. In some sense, it is remarkable that researchers have made any progress in this area at all, but in fact, there are many people in computer graphics who believe that rendering is a solved problem. For over fifteen years, three approaches have dominated research and practice in rendering. The first approach is usually referred to as the local illumination approximation, and is the basis for most graphics rendering hardware, and much of what you see in movies and games. In this approximation, the integral equation is converted into a simple sum over light sources (i.e., concentrated emitters) and a general ambient term. The second approach is called ray tracing, and as its name implies, this method traces additional rays to determine specular reflection and transmission, and may be used to account for more general interreflections as well [11] [12]. The third approach is called radiosity after the identical method used in radiative transfer, where reflectances are approximated as Lambertian and the surfaces are divided into patches to convert the integral equation into a large linear system that may be solved iteratively [13]. Comparing these three approaches, local illumination is the cheapest and least accurate. Ray tracing has the advantage of coping well with complex geometry and materials, and radiosity does the best job of computing global interactions in simpler, diffuse environments. In truth, none of the methods currently in use provides a complete and accurate solution to the rendering equation for general environments, though some come closer than others. The first thing to recognize in computer graphics, and computer simulation in general, is that the key to getting a reasonable answer is finding the right approximation. 
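To make the first approach concrete, here is a minimal sketch of the local illumination approximation for a Lambertian surface: the integral of Eq. (1) collapses to a sum over point light sources plus an ambient term. The point-light representation, inverse-square falloff and parameter names are illustrative assumptions, not details taken from the text.

    import numpy as np

    def local_illumination(point, normal, diffuse_rgb, lights, ambient_rgb):
        # lights: list of (position, rgb_intensity) point sources
        color = np.asarray(ambient_rgb) * np.asarray(diffuse_rgb)
        for lpos, lint in lights:
            d = np.asarray(lpos) - np.asarray(point)
            r2 = float(d @ d)
            wi = d / np.sqrt(r2)                       # direction toward the light
            ndotl = max(float(np.asarray(normal) @ wi), 0.0)
            color = color + np.asarray(diffuse_rgb) * np.asarray(lint) * ndotl / r2
        return color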
The reason that local illumination is so widely employed when there are better techniques available is not simply that it's cheaper; it provides a reasonable approximation to much of what we see. With a few added tricks, such as shadow maps, reflection maps and ambient lights, local illumination in the hands of an expert does a very credible job. However, this is not to say that the results are correct or accurate. Even in perceptual terms, the colors produced at each pixel are usually quite different from those one would observe in a real environment. In the entertainment industry, this may not be a concern, but if the application is prediction or virtual reenactment, better accuracy is necessary. For the remainder of this paper, we assume that accuracy is an important goal, particularly color accuracy. We therefore restrict our discussion of rendering and display to physically-based global illumination methods, such as ray-tracing and radiosity.

Tone Mapping

By computing an approximate solution to Eq. (1) for a given planar projection, we obtain a spectral rendering that represents each image point in physical units of radiance per wavelength (e.g., SI units of watts/steradian/meter²/nm). Whether we arrive at this result by ray-tracing, radiosity, or some combination, the next important task is to convert the spectral radiances to pixel color values for display. If we fail to take this step seriously, it almost doesn't matter how much effort we put into the rendering calculation -- the displayed image will look wrong.

Converting a spectral image to a display image is usually accomplished in two stages. The first stage is to convert the spectral radiances to a tristimulus space, such as CIE XYZ. This is done by convolving each radiance spectrum with the three standard CIE observer functions. The second stage is to map each tristimulus value into our target display's color space. This process is called tone-mapping, and depending on our goals and requirements, we may take different approaches to arrive at different results. Here are a few possible rendering intents:

1. Colorimetric intent: Attempt to reproduce the exact color on the display, ignoring viewer adaptation. (The ICC Colorimetric intent is actually divided into relative and absolute intents, but this distinction is irrelevant to our discussion.)
2. Saturation intent: Maintain color saturation as far as possible, allowing hue to drift.
3. Perceptual intent: Attempt to match perception of color by remapping to display gamut and viewer adaptation.

The rendering intents listed above have been put forth by the ICC profile committee, and their exact meaning is somewhat open to interpretation, especially for out-of-gamut colors. Even for in-gamut colors, the perceptual intent, which interests us most, may be approached in several different ways. Here are a few possible techniques:

A. Shrink the source (visible) gamut to fit within the display gamut, scaling uniformly about the neutral line.
B. Same as A, except apply relative scaling so less saturated colors are affected less than more saturated ones. The extreme form of this is gamut-clipping.
C. Scale colors on a curve determined by image content, as in a global histogram adjustment.
D. Scale colors locally based on image spatial content, as in Land's retinex theory.

To any of the above, we may also add a white point transformation and/or contrast adjustment to compensate for a darker or brighter surround.
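To illustrate the scaling-about-the-neutral-line idea behind techniques A and B, here is a small Python sketch that moves a linear RGB color toward its equal-luminance gray just far enough to bring it into the [0, 1] display range. Applying one fixed scale to every pixel would correspond to technique A; computing the scale per color, as below, behaves more like technique B. It is only an illustration of the idea, not a recommended gamut-mapping algorithm.

```python
import numpy as np

def desaturate_into_gamut(rgb):
    """Scale a linear RGB color toward its CCIR-709 gray until every channel fits [0, 1].
    Assumes the luminance itself is already within the display range."""
    rgb = np.asarray(rgb, dtype=float)
    y = float(rgb @ np.array([0.2126, 0.7152, 0.0722]))   # CCIR-709 luminance
    scale = 1.0
    for c in rgb:
        if c > 1.0:
            scale = min(scale, (1.0 - y) / (c - y))        # pull bright channels down to 1
        elif c < 0.0:
            scale = min(scale, y / (y - c))                # pull negative channels up to 0
    return y + scale * (rgb - y)                           # hue roughly preserved, saturation reduced

print(desaturate_into_gamut([1.4, 0.3, -0.1]))             # out-of-gamut red pulled toward gray
```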
In general, it is impossible to reproduce exactly the desired observer stimulus unless the source image contains no bright or saturated colors or the display has an unusually wide gamut and dynamic range. (See www.hitl.washington.edu/research/vrd/ for information on Virtual Retinal Display technology.) Before we can explore any gamut-mapping techniques, we need to know how to get from a spectral radiance value to a tristimulus color such as XYZ or RGB. The calculation is actually straightforward, but the literature on this topic is vast and confusing, so we give an explicit example to make sure we get it right.

Correct Color Rendering

Looking at the simplest case, spectral reflection of a small light source from a diffuse surface in Eq. (1) reduces to the following formula for outgoing radiance:

    R_o(λ) = ρ_d(λ) E_i(λ)    (3)

where ρ_d(λ) is the diffuse reflectance as a function of wavelength, and E_i(λ) is the spectral irradiance computed by integrating radiance over the projected source. To convert this to an absolute XYZ color, we apply the standard CIE conversion, given below for SI units [16]:

    X = 683 ∫ x̄(λ) R(λ) dλ
    Y = 683 ∫ ȳ(λ) R(λ) dλ    (4)
    Z = 683 ∫ z̄(λ) R(λ) dλ

At this point, we may wish to convert to an opponent color space for gamut-mapping, or we may wait until we are in the device color space. If our tone-mapping is a simple scale factor as described in technique A above, we may apply it in any linear color space and the results will be the same. If we convert first to a nonlinear device color space, we need to be aware of the meaning of out-of-gamut colors in that space before we map them back into the legal range of display values. We demonstrate a consistent and reasonable method, then compare to what is usually done in computer graphics.
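As a concrete illustration of Eqs. (3) and (4), the Python sketch below integrates a reflected spectrum against CIE observer curves to get absolute XYZ. The wavelength grid, source irradiance, reflectance, and observer arrays are placeholder data the reader must replace with real tables (e.g., from [16]); it is a minimal sketch, not a calibrated implementation.

```python
import numpy as np

lam = np.arange(380.0, 721.0, 10.0)      # wavelength samples in nm
E_i = np.ones_like(lam)                  # E_i(lambda): spectral irradiance (placeholder)
rho_d = np.full_like(lam, 0.25)          # rho_d(lambda): diffuse spectral reflectance (placeholder)
# Stand-ins for the CIE standard observer curves x-bar, y-bar, z-bar sampled on 'lam';
# substitute the published tables for real work.
xbar = np.ones_like(lam)
ybar = np.ones_like(lam)
zbar = np.ones_like(lam)

R = rho_d * E_i                          # Eq. (3): reflected spectral radiance
XYZ = 683.0 * np.array([np.trapz(cmf * R, lam)      # Eq. (4), integrated numerically
                        for cmf in (xbar, ybar, zbar)])
print(XYZ)                               # absolute CIE XYZ (Y in cd/m^2 for SI inputs)
```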
BlueFlower Example

To compute the absolute CIE color for a surface point, we need to know the spectra of the source and the material. Fig. 1 shows the source spectra for standard illuminant A (2856K tungsten), illuminant B (simulated sunlight), and illuminant D65 (6500K daylight). Fig. 2 shows the reflected spectral radiance of the BlueFlower patch from the MacBeth chart under each of these illuminants. To these curves, we apply the CIE standard observer functions using Eq. (4).

[Figure 1. Spectral power of three standard illuminants (illuminant A, illuminant B, illuminant D65), plotted against wavelength in nm.]

[Figure 2. Spectral radiance of the MacBeth BlueFlower patch under the same three illuminants, plotted against wavelength in nm.]

The resulting XYZ values for the three source conditions are given in the first row of Table 1. Not surprisingly, there is a large deviation in color under different illuminants, especially tungsten. We can convert these colors to their RGB equivalents using Eq. (5), as given in the second row of Table 1. If we were to directly display the colors from the illuminant A and B conditions on the screen, they would likely appear incorrect because the viewer would be adapted to the white point of the monitor rather than the white point of the original scenes being rendered. If we assume the scene white point is the same color as the illuminant and the display white point is D65, then a white point adjustment is necessary for the other illuminants (A and B), as given in the third row of Table 1.

Table 1. Computed color values for BlueFlower under three standard illuminants.

                          Illum D65            Illum B              Illum A
    Source CIE (x,y)      (.3127, .3290)       (.3484, .3516)       (.4475, .4075)
    BlueFlower CIE XYZ    0.274 0.248 0.456    0.280 0.248 0.356    0.302 0.248 0.145
    709 RGB (absolute)    0.279 0.219 0.447    0.349 0.209 0.341    0.525 0.179 0.119
    709 RGB (adjusted)    0.279 0.219 0.447    0.285 0.218 0.444    0.306 0.215 0.426

We use a linear transform to adjust the white point from that of the illuminant to that of the display, which we assume to be D65 in this example. Eq. (5) gives the absolute transformation from XYZ to CCIR-709 linear RGB, and this is all we need for the D65 illuminant condition:

    [R]         [X]                [ 3.2410  -1.5374  -0.4986 ]
    [G] = C_709 [Y]        C_709 = [ -0.9692  1.8760   0.0416 ]    (5)
    [B]         [Z]                [ 0.0556  -0.2040   1.0570 ]

For the others, we apply the transformation shown in Eq. (6). Eq. (6) is the linear von Kries adaptation model with the CMCCAT2000 primary matrix [14], which does a reasonable job of accounting for chromatic adaptation when shifting from one dominant illuminant to another [15]. The original white point primaries (R_w, G_w, B_w) are computed from the illuminant XYZ using the M_CMCCAT matrix, and the destination primaries (R_w', G_w', B_w') for D65 are computed using the same transform to be (0.9478, 1.0334, 1.0850).

    [X']          [ R_w'/R_w   0          0        ]     [X]          [R_w]            [X_w]
    [Y'] = M^-1   [ 0          G_w'/G_w   0        ]  M  [Y]    ,     [G_w] = M_CMCCAT [Y_w]    (6)
    [Z']          [ 0          0          B_w'/B_w ]     [Z]          [B_w]            [Z_w]

               [ 0.7982   0.3389  -0.1371 ]
    M_CMCCAT = [ -0.5918  1.5512   0.0406 ]
               [ 0.0008   0.0239   0.9753 ]

The combined matrices for a white shift from standard illuminants B and A to D65 (whose chromaticities are given at the top of Table 1) and subsequent conversion from CIE XYZ to CCIR-709 RGB color space are given in Eq. (7) as C_B and C_A. Matrix C_709 from Eq. (5) was concatenated with the matrix terms in Eq. (6) to arrive at these results, which may be substituted for C_709 in Eq. (5) to get the adjusted RGB colors in the third row of Table 1 from the absolute XYZ values in the first row.

          [ 3.1273  -1.6836  -0.4867 ]           [ 2.9355  -2.0416  -0.5116 ]
    C_B = [ -0.9806  1.9476   0.0282 ]     C_A = [ -1.0247  2.1431  -0.0500 ]    (7)
          [ 0.0605  -0.2036   1.3404 ]           [ 0.0732  -0.1798   3.0895 ]
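To make Eqs. (5)-(7) concrete, the following Python sketch builds the von Kries adaptation matrix from the CMCCAT2000 primaries and concatenates it with C_709. The illuminant white XYZ values are derived from the chromaticities at the top of Table 1 (with Y = 1) and are assumptions of this sketch; the result should come close to C_A in Eq. (7), up to rounding.

```python
import numpy as np

M_CMCCAT = np.array([[ 0.7982,  0.3389, -0.1371],
                     [-0.5918,  1.5512,  0.0406],
                     [ 0.0008,  0.0239,  0.9753]])

C_709 = np.array([[ 3.2410, -1.5374, -0.4986],
                  [-0.9692,  1.8760,  0.0416],
                  [ 0.0556, -0.2040,  1.0570]])

def xyz_white(x, y):
    """White point XYZ (with Y = 1) from a chromaticity (x, y)."""
    return np.array([x / y, 1.0, (1.0 - x - y) / y])

def vonkries(xyz_src_white, xyz_dst_white, M=M_CMCCAT):
    """XYZ-to-XYZ chromatic adaptation matrix of Eq. (6)."""
    src = M @ xyz_src_white            # (Rw, Gw, Bw) for the scene illuminant
    dst = M @ xyz_dst_white            # (Rw', Gw', Bw') for the display white
    return np.linalg.inv(M) @ np.diag(dst / src) @ M

white_A, white_D65 = xyz_white(0.4475, 0.4075), xyz_white(0.3127, 0.3290)
C_A = C_709 @ vonkries(white_A, white_D65)           # combined matrix, cf. Eq. (7)
xyz_blueflower_A = np.array([0.302, 0.248, 0.145])   # first row of Table 1, illuminant A
print(C_A @ xyz_blueflower_A)                        # ~ (0.306, 0.215, 0.426), third row of Table 1
```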
Conventional CG Calculation

The standard approach in computer graphics color calculations is to assume all light sources are perfectly white and perform calculations in RGB color space. To display the results, a linear scale factor may be applied to bring the results into some reasonable range, and any values outside the sRGB gamut will be clamped. We obtain an RGB value for the BlueFlower material from its published (x,y) chromaticity of (0.265, 0.240) and reflectance of 24.3%. These published values correspond to viewing under standard illuminant C (simulated overcast), which is slightly bluer than D65. The linear RGB color for the flower material using the matrix C_709 from Eq. (5) is (0.246, 0.217, 0.495), which differs from the D65 results in Table 1 by 10 ∆E* units using the CIE L*u*v* perceptual metric [16]. Most of this difference is due to the incorrect scene illuminant assumption, since the ∆E* between illuminant C and D65 is also around 10. This demonstrates the inherent sensitivity of color calculations to source color.

Using the color corresponding to the correct illuminant is therefore very important. The reason CG lighters usually treat sources as white is to avoid the whole white balancing issue. As evident from the third row in Table 1, careful accounting of the light source and chromatic adaptation is almost a no-op in the end. For white points close to the viewing condition of D65, the difference is small: just 1 ∆E* for illuminant B. However, tungsten is very far from daylight, and the ∆E* for illuminant A is more than 5, which is definitely visible. Clearly, if we include the source spectrum, we need to include chromatic adaptation in our tone-mapping. Otherwise, the differences will be very visible indeed -- a ∆E* of 22 for illuminant B and nearly 80 for illuminant A! What if we include the source color, but use an RGB approximation instead of the full spectral rendering? Errors will creep in from the reduced spectral resolution, and their significance will depend on the source and reflectance spectra. Computing everything in CCIR-709 RGB for our BlueFlower example, the ∆E* from the correct result is 1 for illuminant B and nearly 8 for illuminant A. These errors are at least as large as ignoring the source color entirely, so there seems to be little benefit in this approach.

Relative Color Approximation

An improved method that works well for scenes with a single dominant illuminant is to compute the absolute RGB color of each material under the illuminant using a spectral precalculation from Eqs. (3) and (4). The source itself is modeled as pure white (Y,Y,Y) in the scene, and sources with a different color are modeled relative to this illuminant as (R_s/R_w, G_s/G_w, B_s/B_w), where (R_w, G_w, B_w) is the RGB value of the dominant illuminant and (R_s, G_s, B_s) is the color of the other source. In our example, the RGB colors of the BlueFlower material under the three standard illuminants are those given in the second row of Table 1. Prior to display, the von Kries chromatic adaptation in Eq. (6) is applied to the image pixels using the dominant source and display illuminants. The incremental cost of our approximation is therefore a single transform on top of the conventional CG rendering, and the error is zero by construction for direct reflection from a single source type. There may be errors associated with sources having different colors and multiple reflections, but these will be negligible in most scenes. Best of all, no software change is required -- we need only precalculate the correct RGB values for our sources and surfaces, and the rest comes for free. It is even possible to save the cost of the final von Kries transform by incorporating it into the precalculation, computing adjusted rather than absolute RGB values for the materials, as in Eq. (7). We would prefer to keep this transform separate to preserve the colorimetric nature of the rendered image, but as a practical matter, it is often necessary to record a white-balanced image anyway. As long as we record the scene white point in an image format that preserves the full gamut and dynamic range of our tristimulus pixels, we ensure our ability to correctly display the rendering in any device's color space, now and in the future.
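The precalculation behind the relative color approximation can be sketched in a few lines of Python. The spectra and observer curves are placeholder arrays (as in the earlier XYZ sketch) and the helper names are invented for illustration; the point is simply that each material is reduced to an RGB triple under the dominant illuminant, the dominant source becomes neutral, and any other source is expressed as a ratio.

```python
import numpy as np

C_709 = np.array([[ 3.2410, -1.5374, -0.4986],
                  [-0.9692,  1.8760,  0.0416],
                  [ 0.0556, -0.2040,  1.0570]])

lam = np.arange(380.0, 721.0, 10.0)
cie_bar = np.ones((3, lam.size))                 # stand-in for x-bar, y-bar, z-bar tables

def rgb_under(source, reflectance):
    """Absolute 709 RGB of a diffuse spectrum lit by 'source' (Eqs. (3)-(5))."""
    R = source * reflectance
    xyz = 683.0 * np.array([np.trapz(cie_bar[i] * R, lam) for i in range(3)])
    return C_709 @ xyz

illum_dominant = np.ones_like(lam)               # placeholder spectra
illum_other = np.linspace(0.5, 1.5, lam.size)
rho_material = np.full_like(lam, 0.25)

rgb_w = rgb_under(illum_dominant, np.ones_like(lam))      # (Rw, Gw, Bw) of dominant source
material_rgb = rgb_under(illum_dominant, rho_material)    # precomputed material color
dominant_rgb = np.array([rgb_w[1]] * 3)                   # dominant source modeled as (Y, Y, Y)
other_rgb = rgb_under(illum_other, np.ones_like(lam)) / rgb_w   # (Rs/Rw, Gs/Gw, Bs/Bw)
```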
High Dynamic Range Images

Real scenes and physically-based renderings of real scenes do not generally fit within a conventional display's gamut using any reasonable exposure value (i.e., scale factor). If we compress or remap the colors to fit an sRGB or similar gamut, we lose the ability to later adjust the tone-scale or show off the image on a device with a larger gamut or wider dynamic range. What we need is a truly device-independent image representation, which doesn't take up too much space, and delivers superior image quality whatever the destination. Fortunately, such formats exist.

Since its inception in 1985, the Radiance physically-based renderer has employed a 32-bit/pixel RGBE (Red-Green-Blue-Exponent) format to store its high dynamic range output [17]. Predating Radiance, Bill Reeves of Pixar created a 33-bit log RGB format for the REYES rendering system, and this format has a public version contributed by Dan McCoy in 1996 to Sam Leffler's free TIFF library (www.libtiff.org). While working at SGI, the author added to the same TIFF library a LogLuv format that captures 5 orders of magnitude and the full visible gamut in 24 bits using a perceptual color encoding [18]. The 32-bit version of this format holds up to 38 orders of magnitude, and often results in smaller files due to run-length encoding [19]. Both LogLuv formats combine a logarithmic encoding of luminance with a linear encoding of CIE (u',v') chromaticity to cover the full visible gamut as opposed to the gamut of a specific device or medium.

Of the formats mentioned, only SGI's LogLuv TIFF encoding covers the full gamut and dynamic range of perceivable colors. The Radiance RGBE format spans a large dynamic range but is restricted to positive RGB values, so there are visible chromaticities it cannot represent. There is an XYZE version of the same format, but the associated quantization errors make it a poor choice. The Pixar 33-bit log format also has a restricted RGB gamut and only covers 3.8 orders of magnitude, which is marginal for human perception. Since the TIFF library is well tested and free, there is really no reason not to use LogLuv, and many rendering packages now output in this format. Even shareware browsers such as ACDSee are able to read and display LogLuv TIFFs.
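The shared-exponent idea behind the Radiance RGBE pixel [17] is easy to sketch. The routines below follow the general scheme (three 8-bit mantissas sharing one 8-bit exponent) but are a simplified illustration, not the actual Radiance implementation, which also handles run-length encoding and various edge cases.

```python
import math

def rgbe_encode(r, g, b):
    """Pack linear (r, g, b) into four bytes sharing one exponent."""
    v = max(r, g, b)
    if v < 1e-38:                        # too dim to represent: store black
        return (0, 0, 0, 0)
    m, e = math.frexp(v)                 # v = m * 2**e with 0.5 <= m < 1
    scale = m * 256.0 / v                # largest channel maps just below 256
    return (int(r * scale), int(g * scale), int(b * scale), e + 128)

def rgbe_decode(rm, gm, bm, e):
    """Recover approximate linear values from the packed bytes."""
    if e == 0:
        return (0.0, 0.0, 0.0)
    f = math.ldexp(1.0, e - (128 + 8))   # undo the exponent bias and 8-bit mantissa scaling
    return ((rm + 0.5) * f, (gm + 0.5) * f, (bm + 0.5) * f)

print(rgbe_decode(*rgbe_encode(3.7, 120.0, 0.02)))   # small channels lose precision: ~(3.75, 120.25, 0.25)
```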
Gamut Mapping

In order to fit a high dynamic range image into the limited color space of a conventional display, we need to apply one of the gamut compression techniques mentioned at the beginning of this section. Specifically, we show how one might apply the third approach to display an image: C. Scale colors on a curve determined by image content, as in a global histogram adjustment.

We assume that the rendering system has calculated the correct color at each pixel and stored the result in a high dynamic-range image format. Our task is then to examine this image and choose an appropriate mapping to our display. This is a difficult process to automate, and there is no guarantee we will achieve a satisfactory result in all cases. The best we can do is codify a specific set of goals and requirements and optimize our tone-mapping accordingly.

One possible goal of physically-based rendering is to assess visibility in some hypothetical environment or situation, or to recreate a situation that is no longer readily available (e.g., a plane crash). In such cases, we want to say that anything visible to an observer in the actual scene will be visible on the tone-mapped display. Conversely, if something is not visible on the display, we want to say that it would not be visible to an observer in the actual scene. This kind of visibility-matching operator was described in [20], and we show the result in Fig. 4. Fig. 3 shows the image mapped to an sRGB gamut using technique B to desaturate out-of-gamut colors. As we can see, some of the detail in the planes outside the window was lost to clamping, whereas it is preserved in the visibility-matching histogram-adjustment procedure in Fig. 4. An optional feature of our tone operator is the ability to simulate disability glare, which reduces visible contrast due to the harsh backlighting in the tower environment. This is visible as a slight haze in front of the monitors in Fig. 4.

Fig. 5 demonstrates another type of tone operator. This is also a histogram adjustment method, but instead of attempting to reproduce visibility, this operator seeks to optimize contrast over the entire image while keeping colors within the printable gamut. Especially in digital photo printers, saturated colors may be difficult to reproduce, so it may be desirable to darken an image to avoid desaturating some regions. We see that this method produces good contrast over most of the image.

[Figure 3. Radiance rendering of a control tower clamped to a limited display gamut and dynamic range.]
[Figure 4. The same rendering displayed using a visibility-preserving tone operator including glare effects.]
[Figure 5. A tone operator designed to optimize print contrast.]

Fig. 6 shows the global mapping of these three operators from world (rendered) luminance to display value (fraction of maximum). Where the naive linear operator clamps a lot of information off the top end, the two histogram adjustment operators present this information at a reduced contrast. This compression is necessary in order to bring out detail in the darker regions. We can see that the slopes match the linear operator near black in Fig. 7, deviating from the linear clamping operator above a certain level, where compression begins. Fig. 8 plots the contrast-optimizing tone operator against the world luminance distribution. Peaks in the luminance histogram correspond to increases in contrast, visible in the tone-mapping as a slight increase in slope. Since this is a log-log luminance plot, a small change in slope corresponds to a large change in contrast. The dip between 1.5 and 2.0 corresponds to a more gradual slope in the tone-mapping and lower contrast. In the low end, we see that this operator tends to provide more contrast to compensate for veiling reflection typical of glossy prints.

[Figure 6. Comparison of the three tone-mapping operators (vis-match, contrast, clamped): display Y versus world luminance.]
[Figure 7. Close-up on the darker region of the tone-mappings.]
[Figure 8. Good global tone operators produce greater contrast at peaks in the input histogram (log display Y and frequency versus log10 world luminance).]
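The core of the histogram adjustment idea (technique C) can be sketched as follows: build a histogram of log luminance and use its normalized cumulative distribution as the tone curve. This toy version allocates display range purely in proportion to pixel population and omits the linear-ceiling and human-contrast constraints that make the operator of [20] visibility-preserving, so treat it only as an illustration of the principle that histogram peaks receive more contrast.

```python
import numpy as np

def histogram_tone_curve(world_lum, ld_min=0.01, ld_max=100.0, bins=100):
    """Map world luminances to normalized display values via a log-luminance CDF."""
    logl = np.log10(np.clip(world_lum, 1e-6, None))
    hist, edges = np.histogram(logl, bins=bins)
    cdf = np.cumsum(hist) / hist.sum()                     # fraction of pixels at or below each bin
    # Each pixel's place in the CDF is spread over the display's log-luminance range.
    log_ld = np.log10(ld_min) + np.interp(logl, edges[1:], cdf) * (np.log10(ld_max) - np.log10(ld_min))
    return (10.0 ** log_ld - ld_min) / (ld_max - ld_min)   # normalized display value in [0, 1]

world = np.concatenate([np.random.lognormal(2.0, 0.4, 5000),     # bright outdoor peak
                        np.random.lognormal(-1.0, 0.5, 5000)])   # dim indoor peak
display = histogram_tone_curve(world)
```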
Conclusion

The recommendations we make in this paper for accurate color rendering may be summarized as follows:

1. Use a global illumination method with appropriate solutions for all of the phenomena being simulated.
2. Follow accurate spectral calculations with a good chromatic adaptation model to avoid color casts in the displayed image.
3. Substitute full spectral rendering with a relative color approximation for scenes with a single dominant illuminant.
4. Record images in a high dynamic range format to preserve display options (i.e., SGI LogLuv TIFF).
5. Base tone-mapping and gamut-mapping operators on specific goals, such as matching visibility or optimizing color or contrast.

Floating-point spectral calculations and high dynamic-range image manipulation are critical to accurate color rendering. The original approach of rendering directly in 24-bit RGB was recognized as hopeless and abandoned decades ago, but much of the mentality behind it remains with us today. The methods outlined in this paper are not particularly expensive, either in terms of implementation effort or rendering cost. It's simply a matter of applying the right approximation. The author is not aware of any commercial software package that follows more than one or two of these principles, and it seems like a question of priorities. Most of the money in rendering is spent by the entertainment industry, either in movies or in games. Little emphasis has been placed on accurate color rendering, but with the recent increase in mixed-reality rendering, this is beginning to change. Mixed-reality special effects and games require rendered imagery to blend seamlessly with film or live footage. Since reality follows physics and color science, rendering software will have to do likewise. Those of us whose livelihood depends on predictive rendering and accurate color stand to benefit from this shift.

References

1. Michael Stokes, Matthew Anderson, Srinivasan Chandrasekar, Ricardo Motta, A Standard Default Color Space for the Internet, www.w3.org/Graphics/Color/sRGB
2. Greg Ward, The RADIANCE Lighting Simulation and Rendering System, Computer Graphics (Proceedings of SIGGRAPH 94), ACM, 1994.
3. Paul Debevec, Jitendra Malik, Recovering High Dynamic Range Radiance Maps from Photographs, Computer Graphics (Proceedings of SIGGRAPH 97), ACM, 1997.
4. Alexander Wilkie, Robert Tobler, Werner Purgathofer, Combined Rendering of Polarization and Fluorescence Effects, Proceedings of the 12th Eurographics Workshop on Rendering, June 2001.
5. Henrik Wann Jensen, Stephen Marschner, Marc Levoy, Pat Hanrahan, A Practical Model for Subsurface Light Transport, Computer Graphics (Proceedings of SIGGRAPH 01), ACM, 2001.
6. Xiaodong He, Ken Torrance, François Sillion, Don Greenberg, A Comprehensive Physical Model for Light Reflection, Computer Graphics (Proceedings of SIGGRAPH 91), ACM, 1991.
7. Jay Gondek, Gary Meyer, Jon Newman, Wavelength Dependent Reflectance Functions, Computer Graphics (Proceedings of SIGGRAPH 94), ACM, 1994.
8. Holly Rushmeier, Ken Torrance, The Zonal Method for Calculating Light Intensities in the Presence of a Participating Medium, Computer Graphics (Proceedings of SIGGRAPH 87), ACM, 1987.
9. Henrik Wann Jensen, Efficient Simulation of Light Transport in Scenes with Participating Media using Photon Maps, Computer Graphics (Proceedings of SIGGRAPH 98), ACM, 1998.
10. Jim Kajiya, The Rendering Equation, Computer Graphics (Proceedings of SIGGRAPH 86), ACM, 1986.
11. Greg Ward Larson, Rob Shakespeare, Rendering with Radiance, Morgan Kaufmann Publishers, 1997.
12. Henrik Wann Jensen, Realistic Image Synthesis Using Photon Mapping, A.K. Peters Ltd., 2001.
13. François Sillion, Claude Puech, Radiosity and Global Illumination, Morgan Kaufmann Publishers, 1994.
14. C. Li, M.R. Luo, B. Rigg, Simplification of the CMCCAT97, Proc. IS&T/SID 8th Color Imaging Conference, November 2000.
15. Sabine Süsstrunk, Jack Holm, Graham Finlayson, Chromatic Adaptation Performance of Different RGB Sensors, IS&T/SPIE Electronic Imaging, SPIE Vol. 4300, January 2001.
16. Günter Wyszecki, W.S. Stiles, Color Science, J. Wiley, 1982.
17. Greg Ward, Real Pixels, Graphics Gems II, edited by James Arvo, Academic Press, 1992.
18. Greg Ward Larson, Overcoming Gamut and Dynamic Range Limitations in Digital Images, Proc. IS&T/SID 6th Color Imaging Conference, November 1998.
19. Greg Ward Larson, The LogLuv Encoding for Full Gamut, High Dynamic Range Images, Journal of Graphics Tools, 3(1):15-31, 1998.
20. Greg Ward Larson, Holly Rushmeier, Christine Piatko, A Visibility Matching Tone Reproduction Operator for High Dynamic Range Scenes, IEEE Transactions on Visualization and Computer Graphics, Vol. 3, No. 4, December 1997.

Biography

Greg Ward (a.k.a. Greg Ward Larson) graduated in Physics from UC Berkeley in 1983 and earned a Master's in Computer Science from SF State University in 1985. Since 1985, he has worked in the field of light measurement, simulation, and rendering variously at the Berkeley National Lab, EPFL Switzerland, Silicon Graphics Inc., Shutterfly, and Exponent. He is author of the widely used Radiance package for lighting simulation and rendering.

Eurographics Workshop on Rendering (2002), pp. 1-7
Paul Debevec and Simon Gibson (Editors)

Picture Perfect RGB Rendering Using Spectral Prefiltering and Sharp Color Primaries

Greg Ward (Exponent - Failure Analysis Associates, Menlo Park, California)
Elena Eydelberg-Vileshin (Department of Computer Science, Stanford University, Palo Alto, California)

Abstract

Accurate color rendering requires the consideration of many samples over the visible spectrum, and advanced rendering tools developed by the research community offer multispectral sampling towards this goal. However, for practical reasons including efficiency, white balance, and data demands, most commercial rendering packages still employ a naive RGB model in their lighting calculations. This often results in colors that are qualitatively different from the correct ones. In this paper, we demonstrate two independent and complementary techniques for improving RGB rendering accuracy without impacting calculation time: spectral prefiltering and color space selection. Spectral prefiltering is an obvious but overlooked method of preparing input colors for a conventional RGB rendering calculation, which achieves exact results for the direct component, and very accurate results for the interreflected component when compared with full-spectral rendering. In an empirical error analysis of our method, we show how the choice of rendering color space affects final image accuracy, independent of prefiltering. Specifically, we demonstrate the merits of a particular transform that has emerged from the color research community as the best performer in computing white point adaptation under changing illuminants: the Sharp RGB space.

Categories and Subject Descriptors (according to ACM CCS): I.3.7 [Computer Graphics]: Color, shading, shadowing, and texture

1. Introduction

It is well-known that the human eye perceives color in a three-dimensional space, owing to the presence of three types of color receptors. Early psychophysical research demonstrated conclusively that three component values are sufficient to represent any perceived color, and these values may be quantified using the CIE XYZ tristimulus space [19]. However, because the spectrum of light is continuous, the interaction between illumination and materials cannot be accurately simulated with only three samples. In fact, no finite number of fixed spectral samples is guaranteed to be sufficient -- one can easily find pathological cases, for example, a pure spectral source mixed with a narrow-band absorber, that require either component analysis or a ludicrous number of fixed samples to resolve.
If the rendered spectrum is inaccurate, reducing it to a tristimulus value will usually not hide the problem.

Besides the open question of how many spectral samples to use, there are other practical barriers to applying full spectral rendering in commercial software. First, there is the general dearth of spectral reflectance data on which to base a spectral simulation. This is consistent with the lack of any kind of reflectance data for rendering. We are grateful to the researchers who are hard at work making spectral data available [3, 18], but the ultimate solution is to put the necessary measurement tools in the hands of people who care about accurate color rendering. Hand-held spectrophotometers exist and may be purchased for the cost of a good laser printer, but few people apply them in a rendering context, and to our knowledge, no rendering package takes spectrophotometer data as direct input.

The second practical barrier to spectral rendering is white balance. This is actually a minor issue once you know how to address it, but the first time you render with the correct source and reflectance spectra, you are likely to be disappointed by the strong color cast in your output. This is due to the change in illuminant from the simulated scene to the viewing condition, and there is a well-known method to correct for this, which we will cover in Section 2.

The third practical barrier to the widespread acceptance of spectral rendering is what we call the "data mixing problem." What if the user goes to the trouble of acquiring spectral reflectances for a set of surfaces, but they also want to include materials that are characterized in terms of RGB color, or light sources that are specified to a different spectral resolution? One may interpolate and extrapolate to some extent, but in the end, it may be necessary to either synthesize a spectrum from RGB triples à la Smits' method [13], or reduce all the spectral data to RGB values and fall back on three-component rendering again.

The fourth practical barrier to full spectral rendering is cost. In many renderings, shading calculations dominate the computation, even in RGB. If all of these calculations must be carried out at the maximum spectral resolution of the input, the added cost may not be worth the added benefit. Many researchers in computer graphics and color science have addressed the problem of efficient spectral sampling [7, 8]. Meyer suggested a point-sampling method based on Gaussian quadrature and a preferred color space, which requires only 4 spectral samples and is thus very efficient [10]. Like other point sampling techniques, however, Meyer's method is prone to problems when the source spectrum has significant spikes in it, as in the case of common fluorescent lighting. A more sophisticated approach employing orthonormal basis functions was presented by Peercy, who uses characteristic vector analysis on combinations of light source and reflectance spectra to find an optimal, orthonormal basis set [12]. Peercy's method has the advantage of handling spiked and smooth spectra with equal efficiency, and he demonstrated accurate results with as few as three orthonormal bases.
The additional cost is comparable to spectral sampling, replacing N multiplies in an N-sample spectral model with M × M multiplies in an M-basis vector model. Examples in his paper showed the method significantly out-performing uniform spectral sampling for the same number of operations. The cost for a 3-basis simulation, the minimum for acceptable accuracy in Peercy's technique, is roughly three times that of a standard RGB shading calculation.

In this paper, we present a method that has the same overall accuracy as Peercy's technique, but without the computational overhead. In fact, no modification at all is required to a conventional RGB rendering engine, which multiplies and sums its three color components separately throughout the calculation. Our method is not subject to point sampling problems in spiked source or absorption spectra, and the use of an RGB rendering space all but eliminates the data mixing problem mentioned earlier. White adaptation is also accounted for by our technique, since we ask the user to identify a dominant source spectrum for their scene. This avoids the dreaded color cast in the final image.

We start with a few simple observations:

1. The direct lighting component is the first order in any rendering calculation, and its accuracy determines the accuracy of what follows.
2. Most scenes contain a single dominant illuminant; there may be many light sources, but they tend to all have the same spectral power distribution, and spectrally differentiated sources make a negligible contribution to illumination.
3. Exceptional scenes, where spectrally distinct sources make roughly equal contributions, cannot be "white balanced," and will look wrong no matter how accurately the colors are simulated. We can be satisfied if our color accuracy is no worse on average than standard methods in the mixed illuminant case.

The spectral prefiltering method we propose is quite simple. We apply a standard CIE formula to compute the reflected XYZ color of each surface under the dominant illuminant, then transform this to a white-balanced RGB color space for rendering and display. The dominant sources are then replaced by white sources of equal intensity, and other source colors are modified to account for this adaptation. By construction, the renderer gets the exact answer for the dominant direct component, and a reasonably close approximation for other sources and higher order components. The accuracy of indirect contributions and spectrally distinct illumination will depend on the sources, materials, and geometry in the scene, as well as the color space chosen for rendering. We show by empirical example how a sharpened RGB color space seems to perform particularly well in simulation, and offer some speculation as to why this might be the case.

Section 2 details the equations and steps needed for spectral filtering and white point adjustment. Section 3 shows an example scene with three combinations of two spectrally distinct light sources, and we compare the color accuracy of naive RGB rendering to our prefiltering approach, each measured against a full spectral reference solution. We also look at three different color spaces for rendering: CIE XYZ, linear sRGB, and the Sharp RGB space. Finally, we conclude with a summary discussion and suggestions for future work.

2. Method

Spectral prefiltering is just a straightforward transformation from measured source and reflectance spectra to three separate color channels for rendering.
These input colors are then used in a conventional rendering process, followed by a final transformation into the display RGB space. Chromatic adaptation (i.e., white balancing) may take place either before or after rendering, as a matter of convenience and efficiency.

2.1. Color Transformation

Given a source I(λ) and a material ρm(λ) with arbitrary spectral distributions, the CIE describes a standard method for deriving a tristimulus color value that quantifies what the average human observer sees. The XYZ tristimulus color space is computed from the CIE "standard observer" response functions, x̄, ȳ, and z̄, which are integrated with an arbitrary source illuminant spectrum and surface reflectance spectrum as shown in Eq. (1), below:

    Xm = ∫ I(λ) ρm(λ) x̄(λ) dλ
    Ym = ∫ I(λ) ρm(λ) ȳ(λ) dλ    (1)
    Zm = ∫ I(λ) ρm(λ) z̄(λ) dλ

For most applications, the 1931 2° standard observer curves are used, and these may be found in Wyszecki and Stiles [19].

Eq. (1) claims to quantify the exact color an observer would experience if she were to look at a diffuse color patch with the given reflectance spectrum under the given illuminant, but does it really work? In reality, there is a strong tendency for viewers to discount the illuminant in their observations, and the color one sees depends strongly on the ambient lighting in the environment. For example, Eq. (1) might predict a yellow-orange color for a white patch under a tungsten illuminant, while a human observer would still call it "white" if they were in a room lit by the same tungsten source. In fact, a standard photograph of the patch would show its true yellow-orange color, and most novice photographers have the experience of being startled when the colors they get back from their indoor snapshots are not as they remembered them.

To provide for the viewer's chromatic adaptation and thus avoid a color cast in our image after all our hard work, we apply a von Kries style linear transform to our values prior to display [16]. This transform takes an XYZ material color computed under our scene illuminant, and shifts it to the equivalent, apparent color XYZ under a different illuminant that corresponds to our display viewing condition. All we need are the XYZ colors for white under the two illuminants as computed by Eq. (1) with ρm(λ) = 1, and a 3 × 3 transformation matrix, MC, that takes us from XYZ to an appropriate color space for chromatic adaptation. (We will discuss the choice of MC shortly.) The combined adaptation and display transform is given in Eq. (2), below:

    [Rm]                [ R'w/Rw   0        0      ]      [Xm]
    [Gm] = MD MC^-1     [ 0        G'w/Gw   0      ]  MC  [Ym]    (2)
    [Bm]                [ 0        0        B'w/Bw ]      [Zm]

where (Rw, Gw, Bw) = MC (Xw, Yw, Zw) for the scene illuminant, and similarly (R'w, G'w, B'w) for the display white point, (X'w, Y'w, Z'w). The display matrix, MD, that we added to the standard von Kries transform, takes us from CIE XYZ coordinates to our display color space. For an sRGB image or monitor with D65 white point [14], one would use the following matrix, followed by a gamma correction of 1/2.2:

            [ 3.2410  -1.5374  -0.4986 ]
    MsRGB = [ -0.9692  1.8760   0.0416 ]
            [ 0.0556  -0.2040   1.0570 ]

If we are rendering a high dynamic-range scene, we may need to apply a tone-mapping operator such as that of Larson et al. [6] to compress our values into a displayable range. The tone operator of Pattanaik et al. [11] even incorporates a partial chromatic adaptation model. The choice of which matrix to use for chromatic adaptation, MC, is an interesting one.
Much debate has gone on in the color science community over the past few years as to which space is most appropriate, and several contenders seem to perform equally well in side-by-side experiments [2]. However, it seems clear that RGB primary sets that are "sharper" (more saturated) tend to be more plausible than primaries that are inward of the spectral locus [4]. In this paper, we have selected the Sharp adaptation matrix for MC, which was proposed based on spectral sharpening of color-matching data [16]:

             [ 1.2694  -0.0988  -0.1706 ]
    MSharp = [ -0.8364  1.8006   0.0357 ]
             [ 0.0297  -0.0315   1.0018 ]

[Figure 1: A plot showing the relative gamuts of the sRGB and Sharp color spaces in CIE (u', v') coordinates.]

Figure 1 shows a CIE (u', v') plot with the locations of the sRGB and Sharp color primaries relative to the visible gamut. Clearly, one could not manufacture a color monitor with Sharp primaries, as they lie just outside the spectral locus. However, this poses no problem for a color transform or a rendering calculation, since we can always transform back to a displayable color space. In fact, the Sharp primaries may be preferred for rendering and RGB image representation simply because they include a larger gamut than the standard sRGB primaries. This is not an issue if one can represent color values less than zero and greater than one, but most image formats and some rendering frameworks do not permit this. As we will see in Section 3, the choice of color space plays a significant role in the final image accuracy, even when gamut is not an issue.

2.2. Application to Rendering

We begin with the assumption that the direct-diffuse component is most important to color and overall rendering accuracy. Inside the shader of a conventional RGB rendering system, the direct-diffuse component is computed by multiplying the light source color by the diffuse material color, where color multiplication happens separately for each of the three RGB values. If this calculation is accurate, it must give the same result one would get using Eq. (1) followed by conversion to the rendering color space. In general, this will not be the case, because the diffuse RGB for the surface will be based on some other illuminant whose spectrum does not match the one in the model. For example, the CIE (x, y) chromaticities and Y reflectances published on the back of the Macbeth ColorChecker chart [9] are measured under standard illuminant C, which is a simulated overcast sky. If a user wants to use the color Purple in his RGB rendering of an interior space with an incandescent (tungsten) light source, he might convert the published (Y, x, y) reflectances directly to RGB values using the inverse of MsRGB given earlier. Unfortunately, he makes at least three errors in doing so. First, he is forgetting to perform a white point transform, so there is a slight red shift as he converts from (Y, x, y) under the bluish illuminant C to the more neutral D65 white point of sRGB. Second, the tungsten source in his model has a slight orange hue he forgets to account for, and there should be a general darkening of the surface under this illuminant, which he fails to simulate.
Finally, the weak output at the blue end of a tungsten spectrum makes purple very difficult to distinguish from blue, and he has failed to simulate this metameric effect in his rendering. In the end, the rendering shows something more like violet than the dark blue one would actually witness for this color in such a scene.

If the spectra of all the light sources are equivalent, we can precompute the correct result for the direct-diffuse component and replace the light sources with neutral (white) emitters, inserting our spectrally prefiltered RGB values as the diffuse reflectances in each material. We need not worry about how many spectral samples we can afford, since we only have to perform the calculation once for each material in a preprocess. If we intend to render in our display color space, we may even perform the white balance transform ahead of time, saving ourselves the final 3 × 3 matrix transform at each pixel.

In Section 3, we analyze the error associated with three different color spaces using our spectral prefiltering method, and compare it statistically to the error from naive rendering. The first color space we apply is CIE XYZ space, as recommended by Borges [1]. The second color space we use is linear sRGB, which has the CCIR-709 RGB color primaries that correspond to nominal CRT display phosphors [14]. The third color space is the same one we apply in our white point transformation, the Sharp RGB space. We look at cases of direct lighting under a single illuminant, where we expect our technique to perform well, and mixed illuminants with indirect diffuse and specular reflections, where we expect prefiltering to work less effectively.

When we render in CIE XYZ space, it makes the most sense to go directly from the prefiltered result of Eq. (1) to XYZ colors divided by white under the same illuminant:

    X'm = Xm / Xw        Y'm = Ym / Yw        Z'm = Zm / Zw

We may then render with light sources using their absolute XYZ emissions, and the resulting XYZ direct diffuse component will be correct in absolute terms, since they will be remultiplied by the source colors. The final white point adjustment may then be combined with the display color transform exactly as shown in Eq. (2).

When we render in sRGB space, it is more convenient to perform white balancing ahead of time, applying both Eq. (1) and Eq. (2) prior to rendering. All light sources that match the spectrum of the dominant illuminant will be modeled as neutral, and spectrally distinct light sources will be modeled as having their sRGB color divided by that of the dominant illuminant.

When we render in the Sharp RGB space, we can eliminate the transformation into another color space by applying just the right half of Eq. (2) to the surface colors calculated by Eq. (1):

    [R'm]   [ 1/Rw   0      0    ]           [Xm]
    [G'm] = [ 0      1/Gw   0    ]   MSharp  [Ym]
    [B'm]   [ 0      0      1/Bw ]           [Zm]

Dominant illuminants will again be modeled as neutral, and spectrally distinct illuminants will use:

    R's = Rs / Rw        G's = Gs / Gw        B's = Bs / Bw

The final transformation to the display space will apply the remaining part of Eq. (2):

    [Rd]                   [ R'w   0     0   ]  [Rm]
    [Gd] = MD MSharp^-1    [ 0     G'w   0   ]  [Gm]
    [Bd]                   [ 0     0     B'w ]  [Bm]

with (R'w, G'w, B'w) the display white primaries as defined for Eq. (2).
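The following Python sketch strings the pieces of Section 2 together for the Sharp rendering space: prefilter a material's XYZ under the dominant illuminant (Eq. (1)), divide by the illuminant's Sharp white, and, after rendering, apply the remaining display transform. The spectra are placeholder arrays and the function names are ours; it is a sketch of the method under those assumptions, not the authors' code.

```python
import numpy as np

M_SHARP = np.array([[ 1.2694, -0.0988, -0.1706],
                    [-0.8364,  1.8006,  0.0357],
                    [ 0.0297, -0.0315,  1.0018]])
M_SRGB  = np.array([[ 3.2410, -1.5374, -0.4986],
                    [-0.9692,  1.8760,  0.0416],
                    [ 0.0556, -0.2040,  1.0570]])

lam = np.arange(380.0, 721.0, 5.0)
cmf = np.ones((3, lam.size))                     # placeholder CIE observer curves

def xyz(source, reflectance):
    """Eq. (1): tristimulus integral of source * reflectance against the observer curves."""
    return np.array([np.trapz(f * source * reflectance, lam) for f in cmf])

def prefilter_sharp(source, reflectance):
    """Material color in Sharp RGB, divided by the dominant-illuminant white."""
    white_sharp = M_SHARP @ xyz(source, np.ones_like(lam))      # (Rw, Gw, Bw)
    return (M_SHARP @ xyz(source, reflectance)) / white_sharp

def sharp_to_display(rgb_rendered, display_white_xyz):
    """Remaining part of Eq. (2): rescale by the display white, return to XYZ, then to sRGB."""
    white_sharp = M_SHARP @ display_white_xyz                    # (R'w, G'w, B'w)
    return M_SRGB @ np.linalg.inv(M_SHARP) @ (white_sharp * rgb_rendered)

# Usage: prefiltered reflectances go into the scene, dominant sources become neutral,
# and sharp_to_display() is applied per pixel (followed by the display gamma).
```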
3. Results

Our test scene was constructed using published spectral data and simple geometry. It consists of a square room with two light sources and two spheres. One sphere is made of a smooth plastic with a 5% specular component, and the other sphere is made of pure, polished gold (24 carat). The diffuse color of the plastic ball is Macbeth Green [9]. The color of elemental gold is computed from its complex index of refraction as a function of wavelength. The ceiling, floor, and far wall are made of the Macbeth Neutral.8 material. The left wall is Macbeth Red, and the right wall is Macbeth Blue. The near wall, seen in the reflection of the spheres, is the Macbeth BlueFlower color. The left light source is a 2856 K tungsten source (i.e., Standard Illuminant A). The right light source is a cool white fluorescent. All spectral data for our scene were taken from the material tables in Appendix G of Glassner's Principles of Digital Image Synthesis [5], and these are also available in the Materials and Geometry Format (MGF) [17]. For convenience, the model used in this paper has been prepared as MGF files and included with our image comparisons in the supplemental materials.

Figure 2 shows a Monte Carlo path tracing of this environment with fluorescent lighting using 69 evenly spaced spectral samples from 380 to 720 nm, which is the resolution of our input data. Using our spectral prefiltering method with the cool white illuminant, we recomputed the image using only three sRGB components, taking care to retrace exactly the same ray paths. The result shown in Figure 3 is nearly indistinguishable from the original, with the possible exception of the reflection of the blue wall in the gold sphere. This can be seen graphically in Figure 5, which plots the CIE 1994 Lab ∆E* color difference in false color. A ∆E* value of one is just noticeable if the colors are adjacent, and we have found values above five or so to be visible in side-by-side image comparisons.

Using a naive assumption of an equal-energy illuminant, we recomputed the sRGB material colors from their reflectance spectra and rendered the scene again, arriving at Figure 4. The rendering took the same time to finish, about a third as long as the full-spectral rendering, and the results are quite different. Both the red wall and the green sphere have changed lightness and saturation from the reference image, the blue wall is reflected as purple in the gold sphere, and the ∆E* errors shown in Figure 6 are over 20 in large regions. Clearly, this level of accuracy is unacceptable for critical color evaluations, such as selecting a color to repaint the living room.

We repeated the same comparisons in CIE XYZ and Sharp RGB color spaces, then changed the lighting configuration and ran them again. Besides the fluorescent-only lighting condition, we looked at tungsten-only and both sources together. Since the lumen output of the two sources is equal, it was not clear which one to choose as the dominant illuminant, so we applied our prefiltering technique first to one source then to the other. Altogether, we compared 21 combinations of light sources, color spaces, and rendering methods to our multispectral reference solution. The false color images showing the ∆E* for each comparison are included in the supplemental materials, and we summarize the results statistically in Table 1 and Figure 7.

Table 1: ∆E* percentiles for our example scene.

                                    XYZ               sRGB              Sharp
    Illum   Method             50%     98%        50%     98%       50%     98%
    tung    naive              8.4     39.2       3.9     16.3      0.8     4.4
            prefilt            1.3     6.6        0.3     2.4       0.2     0.9
    fluor   naive              5.3     25.5       5.8     29.3      0.9     4.6
            prefilt            0.8     6.3        0.2     1.3       0.1     0.9
    both    naive              5.0     27.1       4.2     14.0      0.6     2.5
            prefilt tung       3.5     14.9       0.6     2.3       0.7     2.2
            prefilt fluor      4.6     46.1       0.6     6.8       0.7     8.1
    Average                    4.1     23.7       2.2     10.4      0.6     3.4
Table 1 gives the 50th percentile (median) and 98th percentile ∆E* statistics for each combination of method, lighting, and color space. These columns are averaged to show the relative performance of the three rendering color spaces at the bottom. Figure 7 plots the errors in Table 1 as a bar chart. The 50th percentile errors are coupled with the 98th percentile errors in each bar.

In every simulation but one, the Sharp RGB rendering space keeps 98% of the pixels below a ∆E* of five relative to the reference solution, a level at which it is difficult to tell the images apart in side-by-side comparisons. The smallest errors are associated with the Sharp color space and spectral prefiltering with a single illuminant, where 98% of the pixel differences are below the detectable threshold. In the mixed illuminant condition, spectral prefiltering using tungsten as the dominant illuminant performs slightly better than a naive assumption, and prefiltering using cool white as the dominant illuminant performs slightly worse. The worst performance by far is seen when we use CIE XYZ as the rendering space, which produces noticeable differences above five for over 2% of the pixels in every simulation, and a median ∆E* over five for all the naive simulations.

[Figure 2: Our reference multi-spectral solution for the fluorescent-only scene.]
[Figure 3: Our prefiltered sRGB solution for the fluorescent-only scene.]
[Figure 4: Our naive sRGB solution for the fluorescent-only scene.]
[Figure 5: The ∆E* error for the prefiltered sRGB solution.]
[Figure 6: The ∆E* error for the naive sRGB solution.]
[Figure 7: Error statistics for all solutions and color spaces.]

4. Conclusions

In our experiments, we found spectral prefiltering to minimize color errors in scenes with a single dominant illuminant spectrum, regardless of the rendering color space. The median CIE 1994 Lab ∆E* values were reduced by a factor of eight, to levels that were below the detectable threshold when using the sRGB and Sharp color spaces. Of the three color spaces we used for rendering, CIE XYZ performed the worst, generating median errors that were just above the detectable threshold even with prefiltering, and five times the threshold without prefiltering, meaning the difference was clearly visible over most of the image in side-by-side comparisons to the reference solution. In contrast, the Sharp RGB color space, favored by the color science community for chromatic adaptation transforms, performed exceptionally well in a rendering context, producing median error levels that were below the detectable threshold both with and without prefiltering.

We believe the Sharp RGB space works especially well for rendering because it minimizes the representation error for tristimulus values by aligning its axes along the densest regions of XYZ space. This property is held in common with the AC1C2 color space recommended by Meyer for rendering for this very reason [10]. In fact, the AC1C2 space has also been favored for chromatic adaptation, indicating the strong connection between rendering calculations and von Kries style transforms. This is apparent when we notice how the white points in the diagonal matrix of Eq. (2) are multiplied in separate channels, analogous to the color calculations inside a three-component shader.

The combination of spectral prefiltering and the Sharp RGB space is particularly effective.
With prefiltering under a single illuminant, 98% of the pixels were below the detectable error threshold using the Sharp RGB space, and only certain reflections in the gold sphere were visibly different in a side-by-side comparison. We included a polished gold sphere because we knew its strong spectral selectivity and specularity violated one of our key assumptions, which is that the direct-diffuse component dominates the rendering. We saw in our results that the errors using prefiltering for the gold sphere are no worse than without, and it probably does not matter whether we apply our prefiltering method to specular colors or not. However, rendering in a sharpened RGB space always seemed to help.

We also tested the performance of prefiltering when we violated our second assumption of a single, dominant illuminant spectrum. When both sources were present and equally bright, the median error was still below the visible threshold using prefiltering in either the sRGB or Sharp color space. Without prefiltering, the median jumped significantly for the sRGB space, but was still below threshold for Sharp RGB rendering. Thus, prefiltering performed no worse on average than the naive approach for mixed illuminants, which was our goal as stated in the introduction.

In conclusion, we have presented an approach to RGB rendering that works within any standard framework, adding virtually nothing to the computation time while reducing color difference errors to below the detectable threshold in typical environments. The spectral prefiltering technique accommodates sharp peaks and valleys in the source and reflectance spectra, and user selection of a dominant illuminant avoids most white balance problems in the output. Rendering in a sharpened RGB space also greatly improves color accuracy independent of prefiltering. Work still needs to be done in the areas of mixed illuminants and colored specular reflections, and we would like to test our method on a greater variety of example scenes.

Acknowledgments

The authors would like to thank Maryann Simmons for providing timely reviews of the paper in progress, and Albert Meltzer for critical editing and LaTeX formatting assistance.

References

1. C. Borges. Trichromatic Approximation for Computer Graphics Illumination Models. Proc. Siggraph '91.
2. Anthony J. Calabria and Mark D. Fairchild. Herding CATs: A Comparison of Linear Chromatic-Adaptation Transforms for CIECAM97s. Proc. 9th Color Imaging Conf., pp. 174-178, 2001.
3. Kristin J. Dana, Bram van Ginneken, Shree K. Nayar and Jan J. Koenderink. Reflectance and Texture of Real World Surfaces. ACM TOG, 15(1):1-34, 1999.
4. G. D. Finlayson and P. Morovic. Is the Sharp Adaptation Transform more plausible than CMCCAT2000? Proc. 9th Color Imaging Conf., pp. 310-315, 2001.
5. Andrew S. Glassner. Principles of Digital Image Synthesis. Morgan Kaufmann, 1995.
6. G. W. Larson, H. Rushmeier and C. Piatko. A Visibility Matching Tone Reproduction Operator for High Dynamic Range Scenes. IEEE Transactions on Visualization and Computer Graphics, 3(4), December 1997.
7. Laurence T. Maloney. Evaluation of Linear Models of Surface Spectral Reflectance with Small Numbers of Parameters. J. Optical Society of America A, 3(10):1673-1683, October 1986.
8. David Marimont and Brian Wandell. Linear Models of Surface and Illuminant Spectra. J. Optical Society of America A, 9(11):1905-1913, November 1992.
9. C. S. McCamy, H. Marcus and J. G. Davidson. A color-rendition chart. J. Applied Photographic Engineering, 2(3):95-99, Summer 1976.
10. Gary Meyer. Wavelength Selection for Synthetic Image Generation. Computer Vision, Graphics and Image Processing, 41:57-79, 1988.
11. Sumanta N. Pattanaik, James A. Ferwerda, Mark D. Fairchild and Donald P. Greenberg. A multiscale model of adaptation and spatial vision for realistic image display. Proc. Siggraph '98.
12. Mark S. Peercy. Linear color representations for full speed spectral rendering. Proc. Siggraph '93.
13. Brian Smits. An RGB to Spectrum Conversion for Reflectances. J. Graphics Tools, 4(4):11-22, 1999.
14. Michael Stokes et al. A Standard Default Color Space for the Internet - sRGB. Ver. 1.10, November 1996. http://www.w3.org/Graphics/Color/sRGB.
15. S. Sueeprasan and R. Luo. Incomplete Chromatic Adaptation under Mixed Illuminations. Proc. 9th Color Imaging Conf., pp. 316-320, 2001.
16. S. Süsstrunk, J. Holm and G. D. Finlayson. Chromatic Adaptation Performance of Different RGB Sensors. IS&T/SPIE Electronic Imaging, SPIE Vol. 4300, January 2001.
17. Greg Ward et al. Materials and Geometry Format. http://radsite.lbl.gov/mgf.
18. Harold B. Westlund and Gary W. Meyer. A BRDF Database Employing the Beard-Maxwell Reflection Model. Graphics Interface 2002.
19. Günter Wyszecki and W. S. Stiles. Color Science: Concepts and Methods, Quantitative Data and Formulae. John Wiley & Sons, New York, 2nd ed., 1982.