ShapeWorld
See the video: https://youtu.be/_tYYDTV2f8I (4.51 min)
Some sample frames from the video (reduced in size from their original 1280 x 720 capture resolution; the video itself is rendered in 1920 x 1080, ie 1080p HD):
The purpose of the ShapeWorld module is to allow an artificial mind to express itself. In this case the new construct represents an updated version of the OtoomCM program and its derivatives (see under AI Programs above). At the time of writing (March 2020) the main body of the program has been completed; what remains is to attach the visuals (ie, ShapeWorld) and the accompanying audio section. Because the updated version makes use of parallel processes (via CUDA), instead of something like 300 nodes requiring up to three hours for one cycle to complete in a linear fashion, over 60,000 nodes with cycle times of approximately 25 seconds are now possible. The exact numbers depend on the detailed deployment of the AI engine. (However, see the Note below.)
What 'language' should an artificial mind have? One could opt for the movie version where androids talk like humans (not that there couldn't be a time in the future when androids do indeed speak the way we do). However, let's focus on a truly artificial mind first: a mind that possesses cognitive dynamics yet exists in its own right.
There are obvious differences between organisms on this planet. On the one hand, they are all based on the common laws of physics, chemistry, and the rules underpinning non-linear systems in general. On the other, their particular physical configuration together with their brains' overall capacity makes for a distinctive language, a form of expression that over the evolutionary timeline has settled into a symbiosis between cognitive capacity and the body's framework. Hence insects can be differentiated by their intrinsic sounds, the same applies to dogs and cats, and even among humans there are, broadly speaking, differences between males and females, young and old, big and small.
How organisms, including humans, express themselves is a function of the above symbiosis; what they express is the result of their cognitive processes.
Given the complexity relationship between the two, it would be useless to impose a human-type language upon a system that lacks the complexity needed to process the corresponding information, and then expect the result to manifest itself as a linguistic framework requiring a degree of complexity the system simply does not have. The current attempts at human-like sentences coming from AI-based algorithms therefore rest on code sequences designed by programmers; they are a top-down implementation (which is not to diminish the programmers' skill). A truly artificial mind, however, creates output that comes from within, a bottom-up approach, just as any organism has its own language because that is what has emerged over the millennia. And so to ShapeWorld.
Since our AI program features cognitive dynamics yet answers to formal algorithms in the form of computer code, it makes sense (at least in my opinion) to use a language that is similarly constructed. There is the formality of the code, but there also needs to be a flexibility which is responsive to the virtually infinite variety the AI engine is capable of at its own level of complexity.
ShapeWorld makes use of the eight basic shapes found in nature, indeed in all our architecture: plane, block, pyramid, cone, sphere, cylinder, spiral, torus. Any shape, however complicated in the end, can be deconstructed into those basics. (Strictly speaking this is not entirely true: a torus can be deconstructed into a curved cylinder, for example, but in general each shape has its specific algorithm, and to jump from one to the other one algorithm has to be exited and another entered.)
ShapeWorld uses a single algorithm that includes sections for every one of the shape types; which shape, or which combination of shapes, appears depends on the 27 parameters making up those sections. Any shape, however distinctly expressed, is therefore a function of the combined influence of those 27 parameters. In the ShapeWorld video the module is run on its own, which means the parameters are continually defined by random number generators. Once a value has been selected, the parameter moves incrementally from its previous value to the new one, at which point another value is randomly selected, and so on. Once coupled to the AI program, the random number generators are replaced by output from the AI engine.
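Purely as an illustration (the actual parameter names, ranges and layout are not given here, so everything below is a hypothetical sketch), the scheme of 27 parameters, each drifting in small increments towards a randomly chosen target until a new target is drawn, could look something like this in C++. The random source is exactly the part that would be replaced by output from the AI engine.

#include <array>
#include <random>

// Hypothetical sketch: 27 parameters, each moving in small steps towards a
// target value; when the target is reached a new one is drawn at random.
// Coupled to the AI program, nextTarget() would take the engine's output instead.
class ParameterSet {
public:
    static constexpr int kCount = 27;       // number of shape parameters (per the text)
    static constexpr float kStep = 0.01f;   // assumed increment per update

    ParameterSet() : rng_(std::random_device{}()), dist_(0.0f, 1.0f) {
        for (int i = 0; i < kCount; ++i) {
            current_[i] = dist_(rng_);
            target_[i]  = dist_(rng_);
        }
    }

    // Advance every parameter by one increment; draw a new target once reached.
    void update() {
        for (int i = 0; i < kCount; ++i) {
            float diff = target_[i] - current_[i];
            if (diff > kStep)        current_[i] += kStep;
            else if (diff < -kStep)  current_[i] -= kStep;
            else {                   // target reached: pick the next one
                current_[i] = target_[i];
                target_[i]  = nextTarget();
            }
        }
    }

    float value(int i) const { return current_[i]; }

private:
    // Stand-alone mode: random numbers. With the AI engine attached,
    // its output would enter here instead.
    float nextTarget() { return dist_(rng_); }

    std::array<float, kCount> current_{};
    std::array<float, kCount> target_{};
    std::mt19937 rng_;
    std::uniform_real_distribution<float> dist_;
};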
The shapes can be compared to an organism's body language; they are the visuals. Then there is the sound. What does an artificial mind sound like? Or, in this case, what does a sphere, or a cylinder or cone for that matter, sound like? Since the entire system is a construct to begin with, derived from the considerations mentioned above, the sound should follow similar lines.
Each shape has its own particular sound, its own wave spectrum. The algorithm is loosely based on the specific form (ie, round, elongated, angular) but apart from that uses equations that generate the sound sent to the speakers. In other words, there is no recording of some sound-producing 'thing' that has subsequently been modified. Certain expressions within the equations contain parameters which are driven by the values defining the shapes on an ongoing basis, so a change in a shape is reflected in its sound, and a combination of shapes is reflected in a combination of sounds (a mix). Further considerations: Of frogs and things.
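The actual equations are not reproduced here; purely as a hypothetical sketch, a single shape-dependent voice whose pitch and brightness follow two assumed shape parameters might be computed as below, with the resulting samples handed to XAudio2 (mentioned in the workflow further down) or any other audio API. Mixing several such voices would correspond to a combination of shapes.

#include <cmath>
#include <vector>

// Hypothetical sketch: the sound is computed from equations, not from a
// recording. Here one oscillator's frequency and harmonic content are tied
// to two (assumed) shape parameters, so a change in the shape changes the sound.
std::vector<float> renderVoice(float shapeSize, float shapeAngularity,
                               int sampleRate, int numSamples) {
    std::vector<float> samples(numSamples);
    // Assumed mappings: larger shapes -> lower pitch, more angular -> brighter.
    const float frequency  = 110.0f + (1.0f - shapeSize) * 440.0f;  // Hz
    const float brightness = shapeAngularity;                        // 0..1
    const float twoPi = 6.283185307f;

    for (int n = 0; n < numSamples; ++n) {
        float t = static_cast<float>(n) / sampleRate;
        float phase = twoPi * frequency * t;
        // Fundamental plus a brightness-weighted third harmonic.
        samples[n] = 0.7f * std::sin(phase)
                   + 0.3f * brightness * std::sin(3.0f * phase);
    }
    return samples;
}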
The video shows a small selection of various parameter combinations as they change the appearance of the shapes, with colour and/or textures applied. Mostly the shapes are rendered as full forms, a few as adjacent vertices and/or lines. The sound is influenced by those parameter combinations and their changes.
Producing all the shape types through a single algorithm required a re-think of just about all geometric aspects. Colouring and texturing made use of shaders applying various methods so that features such as shadows and reflections could be produced in one render pass while still maintaining a reasonable frame rate.
Workflow:
- run the ShapeWorld module and/or the effects module for the Otoom fractal, texts and intro, capturing the frames as 1280 x 720 bmp files and the audio tracks using XAudio2;
- convert the bmp files to png images (see the sketch after this list);
- use Blender to turn the images of each scene into a 1920 x 1080 avi clip;
- use VideoPad to align the clips and audio tracks and export them to the final avi video.
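For the bmp-to-png step any image tool will do; the conversion tool actually used is not stated, so the following is merely one possible C++ sketch using the freely available stb_image / stb_image_write single-header libraries.

// Hypothetical sketch of the bmp -> png conversion step, using the
// stb_image / stb_image_write headers (https://github.com/nothings/stb).
#define STB_IMAGE_IMPLEMENTATION
#define STB_IMAGE_WRITE_IMPLEMENTATION
#include "stb_image.h"
#include "stb_image_write.h"
#include <cstdio>

bool bmpToPng(const char* bmpPath, const char* pngPath) {
    int width = 0, height = 0, channels = 0;
    unsigned char* pixels = stbi_load(bmpPath, &width, &height, &channels, 0);
    if (!pixels) {
        std::fprintf(stderr, "could not read %s\n", bmpPath);
        return false;
    }
    // stride = width * channels bytes per row
    int ok = stbi_write_png(pngPath, width, height, channels, pixels, width * channels);
    stbi_image_free(pixels);
    return ok != 0;
}

int main() {
    // Frame name taken from the text purely as an example; adjust as needed.
    return bmpToPng("011049.bmp", "011049.png") ? 0 : 1;
}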
Issues:
Unfortunately there is a certain amount of banding in the video, especially when the colour gradient extends over a larger area. The program itself (where the shapes are rendered in the first place) does not have this problem. See for example frame 011049 in sequence 29, or frame 011905 in sequence 33, all in their original size.
It seems the banding issue invites a considerable range of suggestions as well as debates around them. In this particular case some settings and codec choices had no effect while others made matters worse in another area; rather annoying after having taken so much care with shaders and textures to begin with. Perhaps higher-end video editing software on a 64-bit system would solve the problem (staying with the 720p format does not help, by the way, nor does using bmp files in Blender rather than the png versions). In any case, the problem has nothing to do with the initial module nor with its inclusion in the AI program.
Note (September 2021):
The program has been completed. See OCTAM for screenshots, the manual, and comments on its functional aspects. It turns out that using CUDA while maintaining the depth-first traversal of the AI Engine matrix would result in an overall speed loss, and changing to the breadth-first alternative yields no speed gain while unseating the functional integrity of the AI Engine. Therefore, no CUDA.
© Martin Wurzinger - see Terms of Use