Multimodal Fusion

Context

This project was developed by Constant Roux and Larissa Xavier Vitor, both final-year robotics students at the UPSSITECH engineering school. It is part of the Human-Computer Interaction (HCI) course taught by Philippe Truillet.

The project explores the use of a multimodal fusion engine in the field of Human-Computer Interaction. To learn more about the course and access the source code, please visit the GitHub repository associated with the project: Multimodal Human-Machine Interaction.

Presentation

→ General principle

The system operates based on the following general principles:

Data Transmission from Interfaces to FIFO:
1. Position data from mouse clicks.
2. Text data generated through voice input using a microphone and a language model.
3. Shape data from the camera (gesture detection).
Process Thread Operation:
1. The Process Thread reads data from the FIFO and stores it in the Slot Array, a table containing the latest received data (this table is periodically reset).
2. It attempts to generate a possible command and verifies its coherence by communicating with the AppThread, which manages the application and its interface.
AppThread Execution:
1. The AppThread examines the commands to be sent and executes them.
2. It has the capability to manage all shapes present on the screen.
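
The flow above, from the FIFO through the Slot Array to a candidate command, can be sketched as follows. This is a minimal illustration and not the project's actual code: the names `Event`, `SlotArray`, `drain`, and `try_build_command` are hypothetical, and treating the recognized text as the action keyword is an assumption.

```python
import queue
from dataclasses import dataclass


@dataclass
class Event:
    """One message pushed by an interface (hypothetical structure)."""
    kind: str      # "position" (mouse), "text" (voice), or "shape" (gesture)
    value: object


class SlotArray:
    """Keeps only the latest value received for each kind of data."""

    def __init__(self):
        self.slots = {}

    def store(self, event):
        # A newer event of the same kind overwrites the previous one.
        self.slots[event.kind] = event.value

    def reset(self):
        # Called periodically so that stale inputs are not fused together.
        self.slots.clear()


def drain(fifo, slots):
    """Move every pending event from the FIFO into the slot array."""
    while True:
        try:
            event = fifo.get_nowait()
        except queue.Empty:
            return
        slots.store(event)


def try_build_command(slots):
    """Return a candidate command, or None if no action has been spoken yet.

    Assumption: the voice text carries the action; shape and position are
    attached when they are available.
    """
    action = slots.slots.get("text")
    if action is None:
        return None
    command = {"action": action}
    for kind in ("shape", "position"):
        if kind in slots.slots:
            command[kind] = slots.slots[kind]
    return command
```

In a threaded setup, each interface would call `fifo.put(Event(...))` while the Process Thread loops over `drain` and `try_build_command`; `queue.Queue` is thread-safe, so no extra locking is needed for the FIFO itself.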

For more detailed information, please refer to the project report provided in the project source code.

→ Functionalities

The project encompasses the following key functionalities:

Create a Shape:
1. Generate a shape from the selection of circle, rectangle, or triangle.
2. Assign a specific color to the shape, choosing from options like red, green, or blue.
3. Place the newly created shape at a designated location.
Delete an Existing Shape:
1. Remove a previously created shape from the interface.
Move an Existing Shape:
1. Adjust the position of an existing shape to a new location on the screen.
Quit the Application:
1. Exit the application, terminating its execution.
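
To illustrate how these four commands might be checked for coherence before execution, here is a small sketch. It is written under assumptions rather than taken from the project: the `Scene` class, the command dictionary keys (`action`, `shape`, `color`, `position`, `target`), and the 20-pixel picking tolerance are all hypothetical.

```python
from dataclasses import dataclass

SHAPES = {"circle", "rectangle", "triangle"}
COLORS = {"red", "green", "blue"}


@dataclass
class Shape:
    kind: str
    color: str
    position: tuple


class Scene:
    """Hypothetical stand-in for the application state managed by the AppThread."""

    def __init__(self):
        self.shapes = []
        self.running = True

    def find(self, position, tolerance=20):
        """Return the most recent shape within `tolerance` pixels, or None."""
        for shape in reversed(self.shapes):
            dx = shape.position[0] - position[0]
            dy = shape.position[1] - position[1]
            if dx * dx + dy * dy <= tolerance * tolerance:
                return shape
        return None

    def is_coherent(self, cmd):
        """Check that a command has the data it needs (assumed requirements)."""
        action = cmd.get("action")
        if action == "create":
            return (cmd.get("shape") in SHAPES
                    and cmd.get("color") in COLORS
                    and "position" in cmd)
        if action == "delete":
            return "position" in cmd and self.find(cmd["position"]) is not None
        if action == "move":
            return ("position" in cmd and "target" in cmd
                    and self.find(cmd["position"]) is not None)
        return action == "quit"

    def execute(self, cmd):
        """Apply a command; return False and change nothing if it is incoherent."""
        if not self.is_coherent(cmd):
            return False
        action = cmd["action"]
        if action == "create":
            self.shapes.append(Shape(cmd["shape"], cmd["color"], cmd["position"]))
        elif action == "delete":
            self.shapes.remove(self.find(cmd["position"]))
        elif action == "move":
            self.find(cmd["position"]).position = cmd["target"]
        else:  # "quit"
            self.running = False
        return True
```

Separating `is_coherent` from `execute` mirrors the description above, where a command is first validated against the application state and only then carried out.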

Getting started

→ Dependencies

Make sure you have the following dependencies installed before running the project:

Windows OS: This project is designed to run on the Windows operating system.
Processing: The OneDollarIvy component of the project requires the Processing software. You can download Processing from processing.org.
Python 3.x: The project requires Python 3.0 or a later version.
NumPy: NumPy is a fundamental package for scientific computing with Python. To install it, use the following command:
```
pip install numpy
```
Ivy-Python: Ivy is a custom software bus used in this project. To install it, use the following command:
```
pip install ivy-python
```

→ Installation

To get started with this project, clone the repository with the following command:

git clone https://github.com/ConstantRoux/multimodal-fusion.git

→ Usage

Follow these steps to use this project:

Launch Ivy Visionneur: Execute the visionneur.bat file located in agent/libraries/visionneur:
```
cd agent/libraries/visionneur
visionneur.bat
```
Launch Sra5: Execute the sra5.exe file located in agent/libraries/sra5:
```
cd agent/libraries/sra5
sra5.exe
```
Run OneDollarIvy from Processing: Open the Processing software and load the OneDollarIvy file located in agent/libraries/OneDollarIvy.
Run the Main Project: Execute the following command to run the main project:
```
python main.py
```
