Traditional 3D computer-aided design software, such as Maya and 3ds Max, provides all the tools required to build virtual worlds. The toolbox lets the user model, place, paint and animate objects with great expressivity.
Advanced operations, such as texturing, rely on well-defined but complex geometric algorithms. As a consequence, imagination is not the only limit of these systems, although it should be in a world-building application aimed at laypeople.

We implemented a scheme to allow fluid design of virtual worlds. Our idea is to combine a flavor of the Teddy 3D freeform modeling system described in Teddy with standard computer vision and machine learning algorithms.
This lets us strike the balance between expressiveness and ease of use that non-expert design tasks and rapid prototyping require. As we aim for intuitive editing, we provide hardware tools that match our virtual controllers, in the form of a ball and a pen.

Our ideas include:


We use a ball and an optical emitter/sensor and combine them into a new interaction metaphor. For a right-handed user, the ball is held in the left hand and the glove is worn on the right hand. The system can also be used with a traditional mouse, which can facilitate its adoption and its integration in industrial pipelines, at the price of less intuitive and natural movements.

We arranged the list of possible actions to balance the work between the ball and the glove. The main actions are modeling, selecting/moving/rotating/scaling objects, exploring the workspace, and coloring/texturing.

A stroke drawn with the finger can be seen as the object silhouette, or a cut path. User strokes are automatically processed by shape builders. Each builder applies a specific algorithm to the stroke in order to create a 3D model from it.
The generated shape is then placed in the workspace to directly match the silhouette location and orientation. The video shows the Teddy shape builder, the extruder shape builder, and the cutting tool we implemented.
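As an illustration of what a shape builder might do, here is a minimal Python sketch of an extruder-style builder: it takes a closed 2D stroke and sweeps it along the z axis to produce a prism mesh. The function name `extrude_stroke`, the `depth` parameter and the mesh layout are assumptions made for this example, not the project's actual code. The caps are triangulated as a simple fan, which is only valid for convex silhouettes.

```python
from typing import List, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]
Triangle = Tuple[int, int, int]


def extrude_stroke(stroke: List[Point2], depth: float = 1.0) -> Tuple[List[Point3], List[Triangle]]:
    """Sweep a closed 2D stroke along z to build a prism mesh (hypothetical helper)."""
    n = len(stroke)
    if n < 3:
        raise ValueError("a silhouette needs at least 3 points")

    # Front ring at z = 0 (indices 0..n-1), back ring at z = depth (indices n..2n-1).
    vertices = [(x, y, 0.0) for x, y in stroke] + [(x, y, depth) for x, y in stroke]

    triangles: List[Triangle] = []
    # Side walls: one quad (two triangles) per stroke segment.
    for i in range(n):
        j = (i + 1) % n
        triangles.append((i, j, n + j))
        triangles.append((i, n + j, n + i))
    # Caps: triangle fan around the first vertex (assumes a convex silhouette).
    for i in range(1, n - 1):
        triangles.append((0, i + 1, i))          # front cap
        triangles.append((n, n + i, n + i + 1))  # back cap
    return vertices, triangles


# A square silhouette extruded into a box: 8 vertices, 12 triangles.
verts, tris = extrude_stroke([(0, 0), (1, 0), (1, 1), (0, 1)], depth=2.0)
```

A real builder would also have to place the result in the workspace so that it matches the location and orientation of the drawn silhouette, which this sketch leaves out.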

The Teddy shape builder tends to produce cushion-like models with no sharp edges. As a consequence, it is difficult to draw shapes with triangular or square bases without using the cut tool. On top of this, our interaction method is very sensitive to motion noise, which our resampling algorithms sometimes fail to remove and which then remains visible in the output stroke. Thus, it is very difficult to draw perfect primitives such as spheres or squares.

To compensate for this, we use a machine learning algorithm. Offline, we train an artificial neural network to recognize and classify primitives from input strokes. The system makes suggestions in real time, depending on the input of the user. The suggestions include primitives as well as the output of the shape builders.

We provide the user with a hemispherical menu whose surface changes depending on the task being performed or the available choices.
Its hemispherical shape matches the ball motion to keep our interaction metaphor consistent. It offers intuitive and rapid exploration of the available options while keeping the left hand on the ball and without moving the glove.

Shape recognition module

This project includes a machine learning module which makes shape suggestions in real time depending on the input stroke of the user.
It compensates for the fact that regular primitive shapes are hard to draw with our freeform input system.
The machine learning algorithm relies on an artificial neural network built from the Fast Artificial Neural Network Library (FANN).

We developed a simple offline tool in C# / WPF to train our neural network.

The first interface of this tool is the Training mode.

The user specifies all the different shapes the neural network will be able to identify (3 in this screenshot: square, triangle and circle), then inputs many different samples for each one of them. The more diverse the samples, the better the recognition. Several different people should take part in the training session to account for differences in how the device is held, in handwriting, in handedness, and so on.

The system preprocesses each input stroke to bring it into a common comparison space (resampling, scaling...) and automatically generates additional input strokes by offsetting the starting point and reversing the stroke orientation (clockwise / counterclockwise).
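The preprocessing and sample generation described above can be sketched as follows, in Python. This is a minimal illustration under stated assumptions, not the tool's actual C# code: the helper names (`resample`, `normalize`, `augment`), the default of 32 points and the number of start-point shifts are all hypothetical, and the augmentation assumes closed strokes.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample(stroke: List[Point], n: int = 32) -> List[Point]:
    """Resample a stroke to n points evenly spaced along its arc length."""
    if len(stroke) < 2 or n < 2:
        raise ValueError("need at least 2 input points and n >= 2")
    # Cumulative arc length at each input point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        return [stroke[0]] * n
    out, seg = [], 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target length.
        while seg < len(stroke) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = stroke[seg], stroke[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def normalize(stroke: List[Point]) -> List[Point]:
    """Translate and uniformly scale a stroke into the unit square."""
    xs, ys = [p[0] for p in stroke], [p[1] for p in stroke]
    min_x, min_y = min(xs), min(ys)
    size = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / size, (y - min_y) / size) for x, y in stroke]


def augment(stroke: List[Point], shifts: int = 4) -> List[List[Point]]:
    """Generate variants of a closed stroke: shifted start point, both orientations."""
    n = len(stroke)
    variants = []
    for s in range(shifts):
        offset = s * n // shifts
        rotated = stroke[offset:] + stroke[:offset]
        variants.append(rotated)
        variants.append(rotated[::-1])  # same shape drawn in the opposite direction
    return variants


# One hand-drawn square becomes 8 training samples of 32 points each.
raw = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
samples = augment(normalize(resample(raw)))
```

Resampling to a fixed point count gives every stroke the same input size for the network, and the shifted and reversed copies make the classifier less sensitive to where and in which direction a user starts drawing.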

Then comes the Evaluation mode to assess the effectiveness of the training.

The user simply inputs a test stroke and the system graphically displays the probability that this stroke is each of the possible outputs. In the screenshot on the right we can see that the probability for the input stroke to be a triangle (in red) is much higher than that of any other shape (in green).

These similarity factors are then used in real time in our system to make relevant suggestions to the user.
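One simple way to turn per-shape scores into an ordered list of suggestions is sketched below in Python. It does not call FANN: the scores are assumed to be already available as values between 0 and 1, and the function name `rank_suggestions`, the `min_score` threshold and the `max_items` cap are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple


def rank_suggestions(
    scores: Dict[str, float], min_score: float = 0.2, max_items: int = 3
) -> List[Tuple[str, float]]:
    """Return the best-scoring shape labels, highest first (hypothetical helper)."""
    # Drop shapes the network considers unlikely, then sort the rest.
    kept = [(label, s) for label, s in scores.items() if s >= min_score]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_items]


# Example scores, made up for illustration.
suggestions = rank_suggestions({"square": 0.08, "triangle": 0.87, "circle": 0.31})
print(suggestions)  # [('triangle', 0.87), ('circle', 0.31)]
```

The threshold keeps unlikely primitives out of the menu, so the user only sees candidates that plausibly match the stroke.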