This project was developed by Constant Roux and Larissa Xavier Vitor, both final-year robotics students at the UPSSITECH engineering school. It is part of the Human-Computer Interaction (HCI) course taught by Philippe Truillet.
The project explores the use of a multimodal fusion engine in the field of Human-Computer Interaction. To learn more about the course and access the source code, please visit the GitHub repository associated with the project: Multimodal Human-Machine Interaction.
The system operates based on the following general principles:
- Data Transmission from Interfaces to FIFO:
  - Position data from mouse clicks.
  - Text data generated through voice input using a microphone and a language model.
  - Shape data from the camera (gesture detection).
- Process Thread Operation:
  - The Process Thread reads data from the FIFO and stores it in the Slot Array, a table containing the latest received data (this table is periodically reset).
  - It attempts to generate a possible command and verifies its coherence by communicating with the AppThread, which manages the application and its interface.
- AppThread Execution:
  - The AppThread examines the commands to be sent and executes them.
  - It has the capability to manage all shapes present on the screen.
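To illustrate the FIFO and Slot Array principle described above, here is a minimal Python sketch. It is not taken from the project's source code: the names `Event`, `SlotArray`, `drain`, the modality labels and the 3-second reset period are all assumptions chosen for the example.

```python
import queue
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Event:
    modality: str  # hypothetical labels: "position", "text" or "shape"
    value: object


class SlotArray:
    """Keeps the latest event per modality and is cleared periodically."""

    def __init__(self, reset_period: float = 3.0, now: Optional[float] = None) -> None:
        self.reset_period = reset_period  # seconds between resets (assumed value)
        self.slots: Dict[str, Event] = {}
        self.last_reset = time.monotonic() if now is None else now

    def store(self, event: Event) -> None:
        # A newer event of the same modality overwrites the older one.
        self.slots[event.modality] = event

    def reset_if_due(self, now: Optional[float] = None) -> bool:
        """Clear every slot once the reset period has elapsed."""
        now = time.monotonic() if now is None else now
        if now - self.last_reset >= self.reset_period:
            self.slots.clear()
            self.last_reset = now
            return True
        return False


def drain(fifo: "queue.Queue[Event]", slots: SlotArray) -> int:
    """Move every pending event from the FIFO into the slot array."""
    moved = 0
    while True:
        try:
            event = fifo.get_nowait()
        except queue.Empty:
            return moved
        slots.store(event)
        moved += 1


# The interfaces push events; the processing side drains them.
fifo: "queue.Queue[Event]" = queue.Queue()
fifo.put(Event("text", "create"))
fifo.put(Event("shape", "circle"))
fifo.put(Event("position", (120, 80)))

slots = SlotArray()
drain(fifo, slots)
```

`queue.Queue` is thread-safe, so in this sketch each interface could call `put` from its own thread while a single processing loop calls `drain` and `reset_if_due`.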
For more detailed information, please refer to the project report provided in the project source code.
The project encompasses the following key functionalities:
- Create a Shape:
  - Generate a shape from the selection of circle, rectangle, or triangle.
  - Assign a specific color to the shape, choosing from options like red, green, or blue.
  - Place the newly created shape at a designated location.
- Delete an Existing Shape:
  - Remove a previously created shape from the interface.
- Move an Existing Shape:
  - Adjust the position of an existing shape to a new location on the screen.
- Quit the Application:
  - Exit the application, terminating its execution.
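The four functionalities above can be pictured as commands that are checked for coherence before being executed. The following Python sketch shows one possible way to do this; it is an illustration under stated assumptions, not the project's implementation. The `Command`, `Shape` and `ShapeBoard` classes, the field names and the 20-pixel selection radius are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[int, int]
SHAPES = {"circle", "rectangle", "triangle"}
COLORS = {"red", "green", "blue"}


@dataclass
class Shape:
    kind: str
    color: str
    position: Point


@dataclass
class Command:
    action: str                       # "create", "delete", "move" or "quit"
    kind: Optional[str] = None
    color: Optional[str] = None
    position: Optional[Point] = None  # where to create, or which shape to pick
    target: Optional[Point] = None    # destination for "move"


class ShapeBoard:
    """Hypothetical stand-in for the part of the application that owns the shapes."""

    def __init__(self) -> None:
        self.shapes: List[Shape] = []
        self.running = True

    def find_at(self, position: Point, radius: int = 20) -> Optional[Shape]:
        # Return the most recently created shape close to the given point.
        for shape in reversed(self.shapes):
            dx = shape.position[0] - position[0]
            dy = shape.position[1] - position[1]
            if dx * dx + dy * dy <= radius * radius:
                return shape
        return None

    def is_coherent(self, cmd: Command) -> bool:
        if cmd.action == "quit":
            return True
        if cmd.action == "create":
            return cmd.kind in SHAPES and cmd.color in COLORS and cmd.position is not None
        if cmd.action == "delete":
            return cmd.position is not None and self.find_at(cmd.position) is not None
        if cmd.action == "move":
            return (cmd.position is not None and cmd.target is not None
                    and self.find_at(cmd.position) is not None)
        return False

    def execute(self, cmd: Command) -> bool:
        """Run the command if it is coherent; report whether it was applied."""
        if not self.is_coherent(cmd):
            return False
        if cmd.action == "create":
            self.shapes.append(Shape(cmd.kind, cmd.color, cmd.position))
        elif cmd.action == "delete":
            picked = self.find_at(cmd.position)
            self.shapes = [s for s in self.shapes if s is not picked]
        elif cmd.action == "move":
            self.find_at(cmd.position).position = cmd.target
        else:  # "quit"
            self.running = False
        return True


board = ShapeBoard()
board.execute(Command("create", kind="circle", color="red", position=(100, 100)))
board.execute(Command("move", position=(105, 98), target=(300, 200)))
```

In this sketch an incoherent command, such as deleting at a location where no shape exists, is simply rejected by `execute`, which mirrors the idea of verifying a candidate command before acting on it.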
Make sure you have the following dependencies installed before running the project:
- Windows OS: This project is designed to run on the Windows operating system.
- Processing: The OneDollarIvy component of the project requires the Processing software. You can download Processing from processing.org.
- Python 3.x: The project requires Python 3.0 or a later version.
- NumPy: NumPy is a fundamental package for scientific computing with Python. To install it, use the following command:
pip install numpy
- Ivy-Python: Ivy is a custom software bus used in this project. To install it, use the following command:
pip install ivy-python
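Before launching the project, it can help to confirm that the Python dependencies listed above are importable. The short script below is a suggested check and is not part of the repository. It assumes that the `ivy-python` package is imported under the module name `ivy`; the function names are hypothetical.

```python
import importlib.util
import platform
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def environment_problems() -> List[str]:
    """Collect human-readable problems with the current environment."""
    problems = []
    if platform.system() != "Windows":
        problems.append("the project is designed to run on Windows")
    # Assumption: ivy-python installs a module named "ivy".
    for name in missing_modules(["numpy", "ivy"]):
        problems.append("missing Python module: " + name)
    return problems


if __name__ == "__main__":
    for problem in environment_problems():
        print(problem)
```

The script prints nothing when no problem is detected. It does not check for Processing, which has to be installed separately.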
To get started with this project, clone the repository using the following command:
git clone https://github.com/ConstantRoux/multimodal-fusion.git
Follow these steps to use this project:
- Launch Ivy Visionneur: Execute the visionneur.bat file located in agent/libraries/visionneur:
cd agent/libraries/visionneur
visionneur.bat
- Launch Sra5: Execute the sra5.exe file located in agent/libraries/sra5:
cd agent/libraries/sra5
sra5.exe
- Run OneDollarIvy from Processing: Open the Processing software and load the OneDollarIvy file located in agent/libraries/OneDollarIvy.
- Run the Main Project: Execute the following command to run the main project:
python main.py