Introduction

Various notes related to developing these units.

Table of contents:

  1. Author
  2. About this documentation
  3. Compiling
    1. Compiling these units with FPC
    2. Compiling these units with Delphi
  4. Requirements
    1. Libraries
    2. OS / architectures compatibility
  5. Units map
  6. Implementation notes (i.e. how does it work ?)
    1. OpenGL rendering optimization notes
      1. First, some terminology
      2. What is currently implemented ?
      3. What are the alternatives ?

Author

This is the documentation for the Kambi VRML game engine, see [http://vrmlengine.sourceforge.net/]. Developed by Michalis Kamburelis.

All comments, bug reports, questions, patches and anything else are welcome.

About this documentation

What you are reading right now is the documentation for my units, generated by PasDoc (see [http://pasdoc.sf.net/] and [http://vrmlengine.sourceforge.net/reference.php]). Every unit should start with a comment documenting its purpose, and every function (or group of functions) declared in a unit's interface should start with a comment documenting exactly what it does. I believe everything is really well documented. The only problem for English speakers is...

...Polish language: various parts of my Pascal sources may still be documented in Polish. I am constantly translating them to English, but there's a lot of Polish text, so a full translation isn't going to happen immediately. One thing you can do about that is to tell me "I am interested in using/understanding your sources and I really hate when I see a function/unit that looks like what I want but is documented in Polish". Such "positive criticism" should make me hasten my efforts to produce good English documentation, so don't hesitate to email me with such requests.

Compiling

Compiling these units with FPC

The latest stable version of FPC is always advised. See [http://vrmlengine.sourceforge.net/sources.php#section_fpc_ver] for more comments about the FPC versions allowed.

Compile my programs with the compile.sh script provided in the directory of each program. In summary, this script just calls fpc with the proper command-line options.

If you want to use my units in your own programs, you will probably want a way to compile only my units. Here it is:

  1. If you use Lazarus, the most comfortable way to use my units in your programs is probably the Lazarus kambi_units package, see kambi_vrml_game_engine/packages/README in the sources.

  2. If you don't use Lazarus, you should compile my units using the kambi_vrml_game_engine/Makefile file. GNU make is required: under Linux this is the default `make', under Windows it is already bundled with FPC (you can also get it from Cygwin or MinGW), and under FreeBSD it is called `gmake'.

  3. If you need more control over the compilation process, you can also use the kambi.cfg file directly. This is my configuration file for FPC. It's used by kambi_vrml_game_engine/Makefile and by the compile.sh scripts of all programs. All my programs and units must be compiled with FPC using exactly this configuration.

Note for x86_64: for FPC <= 2.2.0, the GLExt unit provided with FPC is buggy. The workaround is to add the directory kambi_vrml_game_engine/opengl/x86_64/ to your FPC config file, so that the fixed GLExt implementation from there will be used. I submitted a patch to FPC to fix this, so in FPC > 2.2.0 this is fixed.

Compiling these units with Delphi

Some of these units can be compiled with Delphi 7 for Windows (all versions, even Delphi 7 Personal). This concerns most of the units in the base/ subdirectory and all units in images/ and fonts/. Maybe some other units could be compiled with Delphi, but this would be only an accident.

Some parts could possibly be compiled with Kylix (Delphi for Linux), but this would also be only "by accident": I'm not planning to support Kylix. Let's be serious and use FPC/Lazarus.

Required Delphi compiler configuration:

This is a rather standard Delphi configuration, so it should not be a problem. Other configuration settings do not matter, i.e. my code should compile and work properly with any values of the other settings.

As usual, you can turn on Range checking, Overflow checking and Assertions for debugging purposes.

Requirements

Libraries

The Images unit requires libpng (libpng.so under UNIXes, libpng12.dll under Windows) to be installed if you want to load/save PNG images. Otherwise you will get an ELibPngNotAvailable exception when trying to load/save files in PNG format. Note that libpng also requires zlib to be installed.

The KambiZlib unit (required by the KambiZStream unit and used by VRMLNodes) requires zlib to be installed.

Some additional libraries are required for playing sound: the OggVorbis code requires the vorbisfile library, and the OpenAL code naturally requires the OpenAL library. But all sound units are implemented in such a way that even if the user doesn't have OpenAL/vorbisfile installed, programs using these units can still run fine. For example, the units will not raise an exception at unit initialization just because some library cannot be loaded; instead, an appropriate boolean variable will be set to False, and higher-level code can always check it.
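
The "set a boolean instead of raising" pattern described above can be sketched with FPC's standard dynlibs unit. This is only an illustration under stated assumptions: the names SoundLibAvailable, InitSoundLib and the library file name are hypothetical and are not the identifiers used by the actual sound units.

```pascal
program OptionalLibraryDemo;

{$mode objfpc}{$H+}

uses
  dynlibs;

var
  { Hypothetical flag; the real sound units expose their own variables. }
  SoundLibAvailable: Boolean;
  SoundLib: TLibHandle;

{ Try to load the library. On failure only record the fact, never raise. }
procedure InitSoundLib(const LibName: string);
begin
  SoundLib := LoadLibrary(LibName);
  SoundLibAvailable := SoundLib <> NilHandle;
end;

begin
  { The file name is made up, so loading is expected to fail here. }
  InitSoundLib('libhypothetical_sound.so');
  if SoundLibAvailable then
  begin
    WriteLn('Sound library loaded, sound enabled.');
    UnloadLibrary(SoundLib);
  end
  else
    WriteLn('Sound library missing, continuing without sound.');
end.
```

Higher-level code then branches on the flag, so a missing library only disables sound instead of stopping the program.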

For Windows, you can get some of the required DLLs easily from my site, [http://vrmlengine.sourceforge.net/miscella/win32_dlls.zip]. Inside this archive there are also pointers to where the "upstream" versions of these DLLs can be found on the web.

On Mac OS X there are some additional requirements, since the Mac OS X port currently works much like any other Unix port. So some units require typical Unix libraries, like X11, GTK, GTKGlExt etc. See [http://vrmlengine.sourceforge.net/macosx_requirements.php] for details.

OS / architectures compatibility

Tested and actually used on Linux, FreeBSD, Darwin/Mac OS X and Windows on i386 processors (32 bit). Linux on x86-64 (64 bit) also works flawlessly.

The engine is very portable, and a minimal amount of work and testing should get it to compile under any other modern platform supported by FPC. All Unix flavors may work out of the box. Windows on x86-64 is probably OK too, but needs to be tested. On big-endian processors (many non-x86 processors), some image loading code probably needs to be adjusted. Porters and testers are most welcome.

Units map

Here's a short map of the important units documented here. Units are divided into a couple of groups. A group's name (like "base") is also the name of the subdirectory where the source code of its units is placed. The "Uses:" notes below say which groups the units in a specific group are allowed to use.

base

These units provide some basic functions and classes. Some units hide OS-specific issues. Most important are KambiUtils, KambiClassUtils, KambiFilesUtils and other Kambi*Utils units which contain many useful routines.

There are also some templates (see dynarray.inc, dim2array.inc, dim3array.inc, objectslist.inc, and their automatically duplicated versions in templates/ subdirectory).

There are also units for parsing and evaluating mathematical expressions (MathExprLexer, MathExprParser, MathExpr), unit ParseParametersUnit for command-line parsing, VectorMath unit with a lot of primitive geometry types and functions. And some more.

Uses: Nothing. These units do not depend on units from other groups.

audio

These are my audio units: an OpenAL header translation (KambiOpenAL), OpenAL helper routines (ALUtils), a reader for sound file formats, currently WAV and OggVorbis (SoundFile), a smart allocator of sound sources (ALSourceAllocator) and finally a high-level class to comfortably use this whole thing (GameSoundEngine).

Uses: base.

images

This is my Images unit to load images into 2D pixel arrays. There is also an image cache unit (ImagesCache), to help avoid loading the same images more than once (which is important e.g. when loading game textures), some things to handle PasJPEG, my binding to the libpng library (KambiPng) and some helpers (KambiPngUtils).

Uses: base.

fonts

Units TTFontsTypes and BmpFontsTypes and many units named TTF_* and BFNT_* that depend on TTFontsTypes or BmpFontsTypes. Units TTF_* and BFNT_* are automatically generated using font2pascal program. Also some helpers for Windows fonts and for translating them to Pascal sources (like those TTF_* and BFNT_* units).

Uses: base.

3dgraph

Units dealing with basic 3D graphics. They do not depend on OpenGL and do nothing closely related to "processing 3D models". These units form a base for the units in the opengl and 3dmodels groups: operating on axis-aligned bounding boxes (Boxes3d), loading skyboxes (BackgroundBase), navigation in a 3D world (MatrixNavigation), ray-tracer helpers (SpaceFillingCurves and SphereSampling) etc.

Note: I consider the VectorMath unit to be something more general and useful, so it's in the base group, not here.

Uses: base, images.

opengl

Basic units that use OpenGL. GLWindow unit (something like "my glut (with many many enhancements)"), OpenGL fonts — bitmap (OpenGLBmpFonts) and outline (OpenGLTTFonts), various OpenGL helpers (KambiGLUtils), handling OpenGL images and textures (GLImages), progress bar display in OpenGL window (ProgressGL), messages (GLWinMessages, TimeMessages, GLWinInputs), helpers for shadows (ShadowVolumesHelper) and anti-aliasing (GLAntiAliasing).

Uses: base, images, 3dgraph, fonts.

3dmodels

Units to handle and process 3D models. VRML file reading, writing and processing (VRMLLexer, VRMLFields, VRMLNodes), building and using octree based on VRML model (VRMLOctree), ray-tracer based on VRML model (VRMLRayTracer), loading animations (VRMLAnimation), other 3d model formats reading (Object3dGEO, Object3ds, Object3dOBJ, ColladaToVRML), and converting to VRML (Object3dAsVRML).

These units do not depend on OpenGL.

Uses: base, images, 3dgraph.

3dmodels.gl

Units that handle 3D models and do something OpenGL-specific. Basically this means that they use units in both the opengl and 3dmodels groups.

First of all, there's the VRMLFlatSceneGL unit, which provides the most final and complete class for dealing with static VRML models and displaying them in OpenGL. And there's VRMLGLAnimation, which provides the same thing for animations (actually using VRMLFlatSceneGL underneath).

Uses: base, images, 3dgraph, fonts, opengl, 3dmodels.

Implementation notes (i.e. how does it work ?)

OpenGL rendering optimization notes

Notes about optimizing OpenGL rendering for the (most usual) case when you know that user will usually not see the whole scene.

First, some terminology

What is an "active part" of VRML model ?

This is simply the whole VRML model without parts excluded because they are inactive children of Switch or LOD nodes. E.g. consider this VRML file:

  #VRML V1.0 ascii
  Switch { Sphere { } }
  Cone { }

The active part of it consists of only one cone, since the sphere is not visible.

What is ShapeState ?

A so-called ShapeState is a VRML node that represents some visible shape (for VRML 1.0 this means one of the nodes AsciiText, Cone, Cube, Cylinder, IndexedFaceSet, IndexedLineSet, PointSet, Sphere) accompanied by the "state" of various things at the point where this node is used in the VRML file. E.g. this VRML file

  #VRML V1.0 ascii
  Texture { filename "aaa.png" }
  DEF MySphere Sphere { }
  DEF MyCone Cone { }
  Texture { filename "bbb.png" }
  USE MySphere

consists of two shape nodes (named MySphere and MyCone) and has three ShapeStates:

  1. MySphere textured by aaa.png

  2. MyCone textured by aaa.png

  3. MySphere textured by bbb.png

Many (although not all) VRML nodes are either shape nodes (that represent something visible) or nodes that affect subsequent state.

For more details, see source code:

Understanding the term "ShapeState" is important, since this is the basis of the optimizations outlined below. After loading the VRML model, the TVRMLFlatScene class creates a flat view of this model as a list of all ShapeStates that are in the active part of the model. I call this a "flat view" of the model because the original VRML model is a directed acyclic graph (no, it's not a tree, since you have the "USE" keyword).
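
To make the idea of a "flat view" concrete, here is a minimal sketch that flattens the second example file above. It is not how TVRMLFlatScene is implemented: the file is modelled as a plain list of nodes, the state is reduced to the current texture, and all type and function names (TNode, TShapeState, Flatten and so on) are hypothetical.

```pascal
program FlattenShapeStatesDemo;

{$mode objfpc}{$H+}

type
  TNodeKind = (nkTexture, nkShape);

  { One entry of a drastically simplified, linear VRML 1.0 file. }
  TNode = record
    Kind: TNodeKind;
    Name: string;  { shape name, or texture file name }
  end;

  { A shape together with the state that was current when it was met. }
  TShapeState = record
    ShapeName: string;
    Texture: string;
  end;

  TShapeStateArray = array of TShapeState;

function Node(Kind: TNodeKind; const Name: string): TNode;
begin
  Result.Kind := Kind;
  Result.Name := Name;
end;

{ Walk the nodes in file order, tracking the current texture. }
function Flatten(const Nodes: array of TNode): TShapeStateArray;
var
  I: Integer;
  CurrentTexture: string;
begin
  Result := nil;
  CurrentTexture := '';
  for I := 0 to High(Nodes) do
    case Nodes[I].Kind of
      nkTexture:
        CurrentTexture := Nodes[I].Name;
      nkShape:
        begin
          SetLength(Result, Length(Result) + 1);
          Result[High(Result)].ShapeName := Nodes[I].Name;
          Result[High(Result)].Texture := CurrentTexture;
        end;
    end;
end;

var
  Flat: TShapeStateArray;
  I: Integer;
begin
  { "USE MySphere" is modelled as a second reference to the same shape. }
  Flat := Flatten([
    Node(nkTexture, 'aaa.png'),
    Node(nkShape, 'MySphere'),
    Node(nkShape, 'MyCone'),
    Node(nkTexture, 'bbb.png'),
    Node(nkShape, 'MySphere')]);
  for I := 0 to High(Flat) do
    WriteLn(I + 1, '. ', Flat[I].ShapeName, ' textured by ', Flat[I].Texture);
end.
```

The sketch yields two shape nodes but three ShapeStates, matching the list above: the same MySphere appears twice because it was met under two different textures.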

What is currently implemented ?

  1. You can use TVRMLFlatSceneGL with Optimization = roSeparateShapeStates and use its Render* methods that take into account your visibility frustum (e.g. RenderFrustum) or visibility sphere or something like that (generally speaking: methods that generate some function TestShapeStateVisibility that can quickly decide that some ShapeStates are not visible to the user).

    The simple approach to this, i.e. testing each ShapeState for visibility separately, works great when you have relatively few ShapeStates in your scene (let's say <= 100, but this estimate may be too big or too small in some practical cases) and these ShapeStates have small BoundingBoxes. Then this scheme will often be able to quickly eliminate whole ShapeStates from the rendering pipeline, thus decreasing the number of triangles that we pass to OpenGL, thus speeding up the rendering.

    Drawbacks: this means that "internal" design of your model (how it's divided into ShapeStates) matters a lot. E.g. don't define your entire model as one IndexedFaceSet node. Don't define your model as many many many IndexedFaceSet nodes that have very few (let's say <= 10) triangles. Don't create IndexedFaceSet nodes with triangles that are scattered all around the whole scene (i.e., nodes that have BoundingBox that is very large and is visible from almost every camera position in the scene).

  2. This approach is the best thing implemented for now:

    It is actually approach (1) but done in a little more sophisticated way: I'm traversing down octree (entering only into nodes that collide with my viewing frustum), and this way I mark ShapeStates that are possibly visible.

    To use this: just build your octree with TVRMLFlatScene.CreateShapeStateOctree and then render using TVRMLFlatSceneGL.RenderFrustumOctree. TVRMLFlatSceneGL still (like in approach (1)) should be constructed with Optimization = roSeparateShapeStates.

    This way I keep using roSeparateShapeStates, so most of the drawbacks of (1) remain, but one drawback is removed: the case when you have many many ShapeStates in your scene is more tolerable now. You can even get better performance by dividing some of your ShapeStates into more ShapeStates. Of course, this will work only up to a certain point, i.e. at some point making more ShapeStates will again degrade performance. But now you can squeeze in more ShapeStates than with approach (1).

  3. There's also a modified version of approach (2). With Optimization = roSeparateShapeStates, I create a separate display list for each ShapeState, which includes the ShapeState's transformation and lighting settings.

    However, this prevents sharing display lists between various ShapeStates: if the scene uses the same ShapeState many times, but transformed differently, I would like to create only one display list for such a ShapeState. Moreover, when using TVRMLGLAnimation, if the same ShapeState is used within each animation frame (and it's only transformed differently), I would also like to use one display list. This is achieved by Optimization = roSeparateShapeStatesNoTransform.

    It is important to save on display lists, as they can eat a huge amount of memory (they have to store many things, like all vertex coordinates).
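
The per-ShapeState visibility test that approaches (1) and (2) rely on can be sketched as a check of a bounding box against the planes of the viewing frustum. This is only an illustrative sketch, not the engine's code: the types TBox and TPlane and the functions below are hypothetical, and the "frustum" in the demo is reduced to two planes to keep it short.

```pascal
program FrustumCullDemo;

{$mode objfpc}{$H+}

type
  TVec3 = array [0..2] of Single;

  { Axis-aligned bounding box. }
  TBox = record
    Min, Max: TVec3;
  end;

  { Plane N.P + D = 0; points with N.P + D >= 0 count as inside. }
  TPlane = record
    N: TVec3;
    D: Single;
  end;

function Vec3(X, Y, Z: Single): TVec3;
begin
  Result[0] := X;
  Result[1] := Y;
  Result[2] := Z;
end;

{ True if the box lies entirely on the outside of the plane. }
function BoxOutsidePlane(const Box: TBox; const Plane: TPlane): Boolean;
var
  I: Integer;
  Dist: Single;
begin
  { Test only the box corner that lies farthest along the plane normal. }
  Dist := Plane.D;
  for I := 0 to 2 do
    if Plane.N[I] >= 0 then
      Dist := Dist + Plane.N[I] * Box.Max[I]
    else
      Dist := Dist + Plane.N[I] * Box.Min[I];
  Result := Dist < 0;
end;

{ Conservative test: may answer True for some boxes just outside. }
function BoxPossiblyVisible(const Box: TBox;
  const Frustum: array of TPlane): Boolean;
var
  I: Integer;
begin
  for I := 0 to High(Frustum) do
    if BoxOutsidePlane(Box, Frustum[I]) then
      Exit(False);
  Result := True;
end;

var
  Frustum: array [0..1] of TPlane;
  NearBox, FarBox: TBox;
begin
  { Toy "frustum": only the slab -1 <= x <= 1; a real one has six planes. }
  Frustum[0].N := Vec3(1, 0, 0);
  Frustum[0].D := 1;
  Frustum[1].N := Vec3(-1, 0, 0);
  Frustum[1].D := 1;

  NearBox.Min := Vec3(0, 0, 0);
  NearBox.Max := Vec3(0.5, 1, 1);
  FarBox.Min := Vec3(5, 0, 0);
  FarBox.Max := Vec3(6, 1, 1);

  WriteLn('NearBox possibly visible: ', BoxPossiblyVisible(NearBox, Frustum));
  WriteLn('FarBox possibly visible: ', BoxPossiblyVisible(FarBox, Frustum));
end.
```

The test is cheap per box, which is why it pays off for ShapeStates with small BoundingBoxes and pays off poorly for one huge box that always intersects the frustum. An octree, as in approach (2), would apply the same kind of test to octree nodes first, so that whole groups of ShapeStates can be skipped at once.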

What are the alternatives ?

  1. Then comes the obvious idea not to use the scene division given by ShapeStates, but to use triangles instead. This means that you no longer depend on how the model designer divided your scene into ShapeStates. All the drawbacks of approach (1) from the previous section disappear: you can have large ShapeStates with large BoundingBoxes, you can have many many ShapeStates, etc.

    To make this work you have to traverse octree to decide which triangles are in your visibility frustum/sphere/etc. (Doing this without octree, i.e. testing each triangle against your visibility frustum/sphere/etc. would be pointless, since this is what OpenGL already does itself.)

    Such traversing of the octree should be the first pass, in which you somehow mark the visible triangles. Then, in the second pass, you should render your triangles ShapeState-by-ShapeState. Otherwise (if you tried to render triangles immediately when traversing your octree) you could produce too much overhead for OpenGL by changing your material/texture/etc. properties too often, since you will probably find triangles from various nodes (with various material/texture/etc. properties) very close together in some octree nodes/leaves. Well, OK, this seems negotiable: maybe in some cases it would be sensible and more efficient to render triangles immediately when traversing your octree (let's assign number (5) to this approach).

    Drawbacks: well, first of all your octree must be really good. Just like for a ray-tracer. An octree that is too shallow (like one that may be sufficient for collision detection) is not good. But this is not really a problem, since my octree can be constructed with quite aggressive requirements. Constructing such an octree may take a while, but that's another story.

    The real problem is that you will be unable to put large parts of rendering pipeline into OpenGL display lists, since you're deciding what to render on the "triangle-level granularity", so to speak. Of course, you could construct separate display list for each triangle, but this will (probably) not give you much speed.
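
The argument for rendering in two passes can be illustrated by counting state changes. The sketch below is hypothetical throughout: the first pass (octree traversal) is replaced by a stub that marks every triangle visible, no OpenGL calls are made, and the grouping loop is deliberately naive.

```pascal
program TwoPassRenderDemo;

{$mode objfpc}{$H+}

type
  TTriangle = record
    ShapeState: Integer;  { index of the owning ShapeState }
    Visible: Boolean;     { set by the first pass }
  end;

{ Second pass: visit visible triangles grouped by ShapeState and
  return how many times material/texture state had to be set. }
function RenderGrouped(const Tris: array of TTriangle;
  ShapeStateCount: Integer): Integer;
var
  S, T: Integer;
  StateSet: Boolean;
begin
  Result := 0;
  for S := 0 to ShapeStateCount - 1 do
  begin
    StateSet := False;
    for T := 0 to High(Tris) do
      if Tris[T].Visible and (Tris[T].ShapeState = S) then
      begin
        if not StateSet then
        begin
          Inc(Result);  { real code would set OpenGL state once here }
          StateSet := True;
        end;
        { real code would send triangle T to OpenGL here }
      end;
  end;
end;

{ Immediate rendering in traversal order: state is set again whenever
  the ShapeState differs from the previous visible triangle. }
function RenderImmediate(const Tris: array of TTriangle): Integer;
var
  T, Last: Integer;
begin
  Result := 0;
  Last := -1;
  for T := 0 to High(Tris) do
    if Tris[T].Visible and (Tris[T].ShapeState <> Last) then
    begin
      Inc(Result);
      Last := Tris[T].ShapeState;
    end;
end;

var
  Tris: array [0..5] of TTriangle;
  I: Integer;
begin
  { Stand-in for the first pass: pretend the octree traversal found all
    six triangles visible, in an order that interleaves two ShapeStates. }
  for I := 0 to High(Tris) do
  begin
    Tris[I].ShapeState := I mod 2;
    Tris[I].Visible := True;
  end;
  WriteLn('State changes, grouped:   ', RenderGrouped(Tris, 2));
  WriteLn('State changes, immediate: ', RenderImmediate(Tris));
end.
```

With triangles of two ShapeStates interleaved in traversal order, the grouped pass sets state once per ShapeState, while the immediate variant sets it once per triangle. Whether that difference matters in practice depends on the scene and the driver, which is why the immediate variant remains "negotiable".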

Final notes: don't blame me if you find an invalid statement in this document. This is only a quick draft of some of my thoughts and ideas. I'm no guru on this matter, and I'm learning new things every day. You think you have a better approach to some of the issues here (even if it works only in certain cases) ? I want to know about it. So email me.


Generated by PasDoc 0.10.0 on 2008-02-25 00:00:48