Outerra forum
Outerra Engine => Technology => Topic started by: hope on June 21, 2015, 09:42:10 am

First of all, I want to say great job with Outerra! It's one of the best, if not the best 'planet engine' out there.
It's been a dream of mine to make a procedural planet for the web. At least something that allows me to build a simple game. I successfully created a very simple procedural planet in ActionScript 3.0 by using a lot of AGAL shader magic. Writing shaders in a low level language was painful, so I moved to WebGL. Thankfully WebGL supports GLSL, which is like a dream compared to AGAL.
Anyway, to the point! I've seen a few threads in this forum where people discuss technical issues with their planet engines, and you've given some really good answers, cameni. I'm amazed that you even have time to answer these questions! :) There are few things that caught my attention that I would like to ask you about.
First question
In OT the individual cube face coordinates u,v (in range 1..1) are mapped to the sphere by normalization of vectors made from <u, v, ±k> set (order depends on the face), where k is:
k = 1 + 0.2071067812 (1  u²)(1  v²)
With 0.2071067812 ~= 0.5/(sqrt(2)  1)  1, which is creating most regularly sized tiles.
Most people are using an algorithm from Philip Nowell to map a cube to sphere  http://mathproofs.blogspot.com/2005/07/mappingcubetosphere.html
As you described, your method gives more regular result. After briefly looking at both of the algorithms, it seems like your algorithm could also perform better. Especially if you can precalculate the sqrt.
I'm also curious if you've implemented a sphere to cube mapping version of your algorithm, and are you willing to share the solution if you did? :) I think most people would move to your solution if one created a good example on how to use it. You'll be the next John Carmack! ;)
Second question
Procedural noise. I've been thinking about this topic way too much. I'm never satisfied with the performance. I've tried both Perlin and Simplex noise, all of them take some time. Currently I'm using Simplex noise in the GPU to generate heightmap. Eventually this data needs to be moved to the CPU for calculating the collision and other things. JavaScript/WebGL is not quite there yet, but sooner or later it'll support threads and other cool things such as WebAssembly.
Again, to the point, I'm pretty sure I could improve the performance many times by rethinking how the heightmap is generated and reusing resources.
The main issue that I'm running into is that I'm using about 1014 octaves to have the terrain more interesting when the camera is close to the surface of the planet. For each quadtree node/patch I'm generating a 1024x1024 heightmap. When that node is split, each children will generate a 1024x1024 heightmap, and so on. I'm not reusing the heightmap from the parent, and I'm pretty sure that could be a game changer in terms of performance.
Now, my question. Could I reduce the number of octaves if I somehow reused the noise or heightmap from the parent? If so, how exactly would I do that? I can't think of any 'seamless' method.
I've also been wondering if it's possible to use a fractal algorithm like mandelbrot. It repeats interesting patterns as you zoom in, and is super fast, but has repeating patterns that could be hidden with combination of other algorithms. Do you know if there has been any attempt to use fractal algorithms to generate a terrain?
Yet another thought, using 2D noise instead of 3D noise. I assume this would probably improve the performance slightly (I haven't measured the speed difference yet).
Absolute value of face coordinates summed (give the same number on the edges), and inherited height from parent (it's the same on the face edges as well) to break the patterns.
As I understand it I could use some tricks to make the noise seamless between cube faces, though there would always be some seam that would be visible, or at least a break in the pattern. I would like to simply ask, would you say it's worth changing to 2D noise for the added performance?
And the last... question. I promise! (for now).
Faces are indexed using signed 32b integers from 2^30 to +2^30, leaving space for overflows.
I think integers can be used in case of Perlin as well, by computing auxiliary coordinates first that are kind of floating with level. In each level you would shift the integer coordinates to the right by 328level bits and mask out the bits above eight lowest bits. Then you compute floatingpoint coordinates by multiplying with 1.0/256 and pass them further on. Could that work with that algorithm?
With all kinds of coherent noise one is basically computing an interpolated value between some discrete points in space when computing the value for given level, and the coordinates just have to change twice as fast with each level.
Also, we are not using a simple GetNoiseForPosition(int x, int y, int z) function as it has many disadvantages, slowness being just one of them. We are accumulating the noise and caching it so that we can also modify the values or to use height maps for certain levels etc.
Basically it is indeed more or less like that, and we are also using a couple of techniques to speed it up  like caching the generated values at certain levels to be used as the basis for finer details, so that the more detailed levels do not have to compute all octaves of noise. The system is designed so that it can handle external data or generated noise almost transparently.
Here's an old screenshot from one of predecessors of the algorithm (from 2003), the colored profile curves seen on the right defined how various properties of noise generator changed with the level (Xaxis corresponds to the level but the markings show the level size in meters instead). That version didn't handle existing terrain data, it was a pure fractal generator.
Can you share some more information on that fractal generator, was it not related to perlin noise at all? How did you solve the seams between cube faces, etc? :)
I know these are many questions, and some of them are very general. It would be highly appreciated if you could answer some of them!
Thank you! P.s. I promise to include some pictures and links to my WebGL planet generator.

I'm also curious if you've implemented a sphere to cube mapping version of your algorithm, and are you willing to share the solution if you did? :) I think most people would move to your solution if one created a good example on how to use it.
I'm planning to write a blog post about the mapping we are using for some time already, but alas the time ...
Here's the code for forward transformation:
///Convert cubeface coordinates to 3D normalized coordinates
//@param f face id
//@param faceh horizontal face coordinate in range <1 .. 1>
//@param facev vertical face coordinate in range <1 .. 1>
//@param xyz [out] 3D normalized coordinates
//@return xyz
//@note face coordinates (u,v) are converted to 3D by normalization of
// (u, v, 1 + M * (1  u^2) * (1  v^2))
template<class FLOAT>
inline FLOAT* cubeface_to_xyz( int f, FLOAT faceh, FLOAT facev, FLOAT xyz[3] )
{
FLOAT q0 = faceh*faceh;
FLOAT q1 = facev*facev;
static const FLOAT M = FLOAT(0.5/(M_SQRT2  1)  1);
FLOAT w = 1 + M * (1  q0) * (1  q1);
FLOAT d = 1 / sqrt( q0 + q1 + w*w );
FLOAT fv = f&1 ? facev : facev;
const FLOAT xyzxy[5] = { faceh, fv, f&1 ? w : w, faceh, fv };
const FLOAT* pxyz = xyzxy + 2  (f>>1);
xyz[0] = pxyz[0] * d;
xyz[1] = pxyz[1] * d;
xyz[2] = pxyz[2] * d;
return xyz;
}
For the reverse, analytical solutions so far ended up worse than running a few steps of Newton solver, and that's what we are using. Another point is that we only rarely need to use this function in any performance critical point.
Code for the reverse (from 3D to cube face):
///Convert xyz 3D coordinates to cubeface coordinates of given face
//@param xyz 3D coordinates (don't have to be normalized)
//@param force_face face to use
//@param face [out] 2D cube face coordinates in range <1 .. 1>
//@return face id
template<class FLOAT>
inline int xyz_to_cubeface_face( const FLOAT xyz[3], int force_face, FLOAT face[2] )
{
FLOAT yzxy[4] = { xyz[1], xyz[2], xyz[0], xyz[1] };
int f = force_face >> 1;
int plus = force_face & 1;
//change sign
yzxy[f+1] = plus ? yzxy[f+1] : yzxy[f+1];
static const FLOAT M = FLOAT(0.5/(M_SQRT2  1)  1);
//should be scaled so that
// K*z = 1 + M * (1  (K*x)^2) * (1  (K*y)^2)
//
// K*z = 1 + M * (1  (K*x)^2  (K*y)^2 + K^4*x^2*y^2)
// K^4*a + K^2*b + K*c + d = 0
//
// using Newton solver:
// f'(K) = 4*a*K^3 + 2*b*K + c
// K(n+1) = K(n)  f(K)/f'(K)
FLOAT af = fabs(xyz[f]);
FLOAT a = M * yzxy[f] * yzxy[f] * yzxy[f+1] * yzxy[f+1];
FLOAT b = M * (yzxy[f] * yzxy[f] + yzxy[f+1] * yzxy[f+1]);
FLOAT c = af;
FLOAT d = 1 + M;
FLOAT K = 1/af, fK, dK;
do {
fK = K*(K*(b + a*K*K) + c) + d;
dK = K*(4*a*K*K + 2*b) + c;
K = K  fK/dK;
} while(fabs(fK) > 1.e8);
face[0] = yzxy[f] * K;
face[1] = yzxy[f+1] * K;
return (f<<1) + plus;
}

Could I reduce the number of octaves if I somehow reused the noise or heightmap from the parent? If so, how exactly would I do that? I can't think of any 'seamless' method.
Well that's the main optimization you can do. The algorithm is simple, though it can have many implementation variants: take the parent data layer, rescale it and add a new layer of noncoherent noise. The noise doesn't have to be independent but it can also adapt based on the existing parent terrain parameters, for example adding more noise if there was a steeper slope in parent, or making its amplitude smaller where you want flat lands. The noise is actually computed as a pseudorandom perturbation where the seed is something that's always the same on given coordinate  i.e. some global coordinates or even the value of parent layer (though that may cause patterns) etc. It will also ensure that tile edges always use the same noise values.
And that's basically what the "fractal generator" is ... minus lots of devilish details =D

Wow, thank you for the quick answer!
I tested your cube to sphere mapping and it works beautifully! Do you mind if I write a blog on where I show with WebGL the difference between your cube to sphere algorithm vs the one from Philip Nowell? If I have time I'll also add performance comparison. The performance difference will be interesting since your algorithm uses 1 sqrt instead of 3.
I'm really glad that you've decided to share your solution, It'll definitely help a lot of people out there who are interested in making their own planet engine :)
Could I reduce the number of octaves if I somehow reused the noise or heightmap from the parent? If so, how exactly would I do that? I can't think of any 'seamless' method.
Well that's the main optimization you can do. The algorithm is simple, though it can have many implementation variants: take the parent data layer, rescale it and add a new layer of noncoherent noise. The noise doesn't have to be independent but it can also adapt based on the existing parent terrain parameters, for example adding more noise if there was a steeper slope in parent, or making its amplitude smaller where you want flat lands. The noise is actually computed as a pseudorandom perturbation where the seed is something that's always the same on given coordinate  i.e. some global coordinates or even the value of parent layer (though that may cause patterns) etc. It will also ensure that tile edges always use the same noise values.
And that's basically what the "fractal generator" is ... minus lots of devilish details =D
Interesting! This subject has always been a bit foggy to me. I should really study the subject more before asking more questions, but I have some questions that keep popping up in my head that I should probably just give a try. But hey, it can't hurt to ask! :)
Currently I've been using 6 octaves of simplex noise for all nodes, no matter the depth, which gets heavy when close to the surface of the planet. So based on your reply, plus other posts I've found on Google, I've come up with something that I'd like to try.
The first idea I have is to use 3 to 5 octaves of noise for the first QuadTree node on each cube face, that forms the planet. For the children I'd use part of the heightmap from the parent node, scale it 2x and then use that data. So basically a node will inherit the octaves from the parent node, and append one octave to it. This method should give me increased detail (more octaves) as I get closer to the ground, and should be cheaper! Is my premise correct, or are there any issues you see with this method?
And the last idea, which is more on noncoherent noise.
I could have simplex noise for the first 34 levels and then use noncoherent noise to add more detail based on the simplex noise in the parent node.
Could you give some example on a simple noncoherent noise function? Is it basically a simple function that takes in 2D coordinates and a height value, which then outputs height value? It would be really, really awesome to get some more insight into this subject!
Thank you again. I really appreciate all the wisdom you're sharing with the rest of the world!

Do you mind if I write a blog on where I show with WebGL the difference between your cube to sphere algorithm vs the one from Philip Nowell? If I have time I'll also add performance comparison. The performance difference will be interesting since your algorithm uses 1 sqrt instead of 3.
I'm really glad that you've decided to share your solution, It'll definitely help a lot of people out there who are interested in making their own planet engine :)
Sure go ahead :)
The first idea I have is to use 3 to 5 octaves of noise for the first QuadTree node on each cube face, that forms the planet. For the children I'd use part of the heightmap from the parent node, scale it 2x and then use that data. So basically a node will inherit the octaves from the parent node, and append one octave to it. This method should give me increased detail (more octaves) as I get closer to the ground, and should be cheaper! Is my premise correct, or are there any issues you see with this method?
It will work, but you may need to use a higher level filtering for the scaling to get rid of some artifacts. Alternatively, you could always use 23 octaves of noise, but superimpose it over the parent node that's also 23 levels higher. The advantage would be that there will be practically no artifacts from linear resampling.
I could have simplex noise for the first 34 levels and then use noncoherent noise to add more detail based on the simplex noise in the parent node.
Could you give some example on a simple noncoherent noise function? Is it basically a simple function that takes in 2D coordinates and a height value, which then outputs height value? It would be really, really awesome to get some more insight into this subject!
Yes, noncoherent noise is just a normal noise. In this case you want a pseudorandom permutation of some input seed values. If the seed is already randomized enough, you can use a very simple permutation, like
y = 3141592653UL * x + 1;
and then use low 16 bits and convert them to 1 .. 1 range or whatever you need.
For seed values that are not randomized, like some grid coordinates, you have to use something better. For example:
float random_2d_perm( int2 coord )
{
const int D = 1;
int xp = coord.x;
int yp = coord.y;
const int K = 2047483673;
int p = (K*xp + D)*xp;
p = (p&0xffff) + yp;
p = (K*p + D)*p;
p = (p&0xffff)  0x8000;
return float(p) * float(1.0/0x8000);
}