Implementation of Qformer from BLIP2 in Zeta Lego blocks.
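For orientation, the Q-Former's core mechanism is a small set of learned query tokens that cross-attend to frozen image features. A minimal PyTorch sketch of that idea (module and dimension choices are illustrative, not this repo's API):

```python
import torch
from torch import nn

class QFormerBlock(nn.Module):
    """Minimal Q-Former-style block: learned queries cross-attend to image features."""
    def __init__(self, dim: int = 768, num_queries: int = 32, heads: int = 8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(1, num_queries, dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim))

    def forward(self, image_feats: torch.Tensor) -> torch.Tensor:
        # image_feats: (batch, num_patches, dim) from a frozen vision encoder
        q = self.queries.expand(image_feats.size(0), -1, -1)
        attended, _ = self.cross_attn(q, image_feats, image_feats)
        x = self.norm(q + attended)
        return x + self.mlp(x)

queries = QFormerBlock()(torch.randn(2, 196, 768))  # -> (2, 32, 768)
```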
The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters"
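Two of that paper's key architectural changes are running attention and MLP in parallel and applying LayerNorm to queries and keys for stability at scale. A rough PyTorch sketch of such a block (dimensions are illustrative):

```python
import torch
from torch import nn
import torch.nn.functional as F

class ParallelViTBlock(nn.Module):
    """Sketch of a ViT-22B-style block: attention and MLP applied in parallel,
    with LayerNorm on queries/keys to keep attention logits bounded."""
    def __init__(self, dim: int = 1024, heads: int = 16):
        super().__init__()
        self.heads, self.head_dim = heads, dim // heads
        self.norm = nn.LayerNorm(dim)
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.q_norm = nn.LayerNorm(self.head_dim)
        self.k_norm = nn.LayerNorm(self.head_dim)
        self.proj = nn.Linear(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim))

    def forward(self, x):
        b, n, d = x.shape
        h = self.norm(x)
        q, k, v = self.qkv(h).view(b, n, 3, self.heads, self.head_dim).unbind(2)
        q, k = self.q_norm(q), self.k_norm(k)  # QK-norm
        q, k, v = (t.transpose(1, 2) for t in (q, k, v))
        attn = F.scaled_dot_product_attention(q, k, v)
        attn = attn.transpose(1, 2).reshape(b, n, d)
        # parallel formulation: both branches read the same normalized input
        return x + self.proj(attn) + self.mlp(h)
```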
Limits the use of end-to-end data for Speech Translation by leveraging Automatic Speech Recognition and Machine Translation data instead, using zero-shot multilingual text translation techniques.
This repository provides a Streamlit application that lets a user upload a screenshot, which is then queried against a database of PDF documents. Both the image structure and any included text are used to find matching documents within a self-defined set.
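A minimal sketch of that flow, assuming pytesseract for the OCR step; `search_pdf_index` is a hypothetical stand-in for the embedding-based document search, not this repo's actual function:

```python
import streamlit as st
from PIL import Image
import pytesseract

st.title("Screenshot -> PDF matcher (sketch)")
uploaded = st.file_uploader("Upload a screenshot", type=["png", "jpg", "jpeg"])
if uploaded is not None:
    image = Image.open(uploaded)
    st.image(image, caption="Query screenshot")
    text = pytesseract.image_to_string(image)  # OCR the visible text
    # search_pdf_index is hypothetical: embed `text` (and image features)
    # and rank the pre-indexed PDF pages by similarity.
    results = search_pdf_index(text, image)
    st.write(results)
```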
Code for ASGEA: Exploiting Logic Rules from Align-Subgraphs for Entity Alignment
Repository for an air, water, and land surveillance robot developed as part of the DRDO Robotics and Unmanned Systems Exposition.
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta
Omni-Modality Processing, Understanding, and Generation
This library provides Python packages on DoubleML / causal machine learning and neural networks for simulations and case studies.
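For orientation, double/debiased ML for a partially linear model boils down to residual-on-residual regression with cross-fitting. A self-contained sklearn sketch of that idea (not this library's API):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

def double_ml_plr(y, d, X, seed=0):
    """Partially linear double ML: residual-on-residual with cross-fitting."""
    ml_y = RandomForestRegressor(random_state=seed)
    ml_d = RandomForestRegressor(random_state=seed)
    # out-of-fold predictions provide the cross-fitting step
    y_res = y - cross_val_predict(ml_y, X, y, cv=5)
    d_res = d - cross_val_predict(ml_d, X, d, cv=5)
    return np.dot(d_res, y_res) / np.dot(d_res, d_res)  # OLS of y_res on d_res

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
d = X[:, 0] + rng.normal(size=500)
y = 0.5 * d + X[:, 0] ** 2 + rng.normal(size=500)
print(double_ml_plr(y, d, X))  # should recover a value near 0.5
```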
Kedro pipelines for multimodal ML in TensorFlow.
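A multimodal Kedro pipeline typically wires per-modality feature nodes into a fusion node. A minimal sketch with illustrative node and dataset names (not necessarily this repo's layout):

```python
from kedro.pipeline import node, pipeline

# function bodies and dataset names ("raw_images", "captions", ...) are
# illustrative placeholders, not this repo's catalog
def extract_image_features(images):
    return images  # placeholder: e.g. run a TensorFlow vision encoder

def extract_text_features(texts):
    return texts  # placeholder: e.g. run a TensorFlow text encoder

def fuse_and_train(image_features, text_features):
    return {"image": image_features, "text": text_features}  # placeholder fusion + fit

multimodal = pipeline([
    node(extract_image_features, inputs="raw_images", outputs="image_features"),
    node(extract_text_features, inputs="captions", outputs="text_features"),
    node(fuse_and_train, inputs=["image_features", "text_features"], outputs="model"),
])
```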
An implementation of Knowledge Networks built from Knowledge Graphs.
Implementation of M2PT in PyTorch from the paper: "Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities"
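The core mechanism in M2PT, cross-modal re-parameterization, augments each linear layer with frozen weights from a counterpart trained on another modality. A minimal sketch from my reading of the paper (class and parameter names are illustrative, not this repo's API):

```python
import torch
from torch import nn

class CrossModalLinear(nn.Module):
    """Sketch of M2PT-style cross-modal re-parameterization:
    y = x @ (W + lam * W_aux)^T + b, with W_aux frozen weights from a model
    trained on another modality and lam learnable."""
    def __init__(self, target: nn.Linear, auxiliary: nn.Linear):
        super().__init__()
        self.target = target
        self.register_buffer("aux_weight", auxiliary.weight.detach().clone())
        self.lam = nn.Parameter(torch.zeros(1))  # starts at 0: pure target model

    def forward(self, x):
        weight = self.target.weight + self.lam * self.aux_weight
        return nn.functional.linear(x, weight, self.target.bias)

layer = CrossModalLinear(nn.Linear(512, 512), nn.Linear(512, 512))
out = layer(torch.randn(4, 512))
```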
Multi-Modal Image Generation for News Stories
This is a repository for the CoCa-pytorch-model, which can be used to train on your own dataset.
This repo builds on the paper "MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations", comparing the accuracy of shallow machine learning models with deep LSTM models using a bi-modal (text + audio) approach.
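A minimal sketch of the deep side of such a comparison: a bidirectional LSTM over per-utterance fused text + audio features (feature dimensions are assumptions; MELD has seven emotion classes):

```python
import torch
from torch import nn

class BiModalLSTM(nn.Module):
    """Late-fusion sketch: concatenate per-utterance text and audio features,
    run a BiLSTM over the conversation, classify each utterance's emotion."""
    def __init__(self, text_dim=100, audio_dim=64, hidden=128, n_classes=7):
        super().__init__()
        self.lstm = nn.LSTM(text_dim + audio_dim, hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, text_feats, audio_feats):
        # both inputs: (batch, utterances, dim)
        fused = torch.cat([text_feats, audio_feats], dim=-1)
        out, _ = self.lstm(fused)
        return self.head(out)  # per-utterance emotion logits

logits = BiModalLSTM()(torch.randn(2, 10, 100), torch.randn(2, 10, 64))
```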
Fast Tensors Packaging library for text, image, video, and audio data compatible with PyTorch, TensorFlow, & NumPy 🖼️🎵🎥 ➡️ 🧠
An image retrieval system that can find the matching image among thousands of images given a short description.
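One common way to build such a system is a CLIP-style dual encoder: embed the description and all candidate images in a shared space and return the nearest image. A sketch using Hugging Face's CLIP (not necessarily this repo's approach):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def retrieve(description: str, image_paths: list[str]) -> str:
    images = [Image.open(p) for p in image_paths]
    inputs = processor(text=[description], images=images,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    # logits_per_text: similarity of the description to every candidate image
    best = out.logits_per_text.argmax(dim=-1).item()
    return image_paths[best]
```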
Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"
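The paper's idea is to replace softmax attention with O(n) linear attention whose feature map is trained to mimic softmax weights. A loose sketch (the feature map here is a simplification of the paper's design):

```python
import torch
from torch import nn

class HedgehogStyleLinearAttention(nn.Module):
    """Sketch of linear attention with a trainable softmax-mimicking feature map
    (a linear layer + elementwise exp, loosely following the Hedgehog paper)."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.feature = nn.Linear(dim, dim)

    def phi(self, x):
        # "spiky" positive feature map; in the paper it is trained to
        # match softmax attention weights via a distillation loss
        return torch.exp(self.feature(x))

    def forward(self, q, k, v):
        q, k = self.phi(q), self.phi(k)
        # O(n) attention: associativity lets us compute (phi(K)^T V) first
        kv = torch.einsum("bnd,bne->bde", k, v)
        z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + 1e-6)
        return torch.einsum("bnd,bde,bn->bne", q, kv, z)

attn = HedgehogStyleLinearAttention()
out = attn(*(torch.randn(2, 128, 64) for _ in range(3)))
```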
Selected code from my M.Sc. thesis: Cyber-resilient multi-modal sensor fusion for autonomous ship navigation
A practice project for handling multi-modal datasets in a unified way.
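One simple way to unify multi-modal datasets is to make every sample a modality-keyed dict, so downstream code never branches on dataset-specific layouts. A PyTorch sketch (not necessarily this repo's design):

```python
import torch
from torch.utils.data import Dataset

class MultiModalDataset(Dataset):
    """Sketch of a unified interface: every sample is a dict keyed by modality."""
    def __init__(self, records):
        # records: list of dicts, e.g. {"image": tensor, "text": str, "label": int}
        self.records = records

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        return dict(self.records[idx])

ds = MultiModalDataset([{"image": torch.zeros(3, 224, 224), "text": "a cat", "label": 0}])
```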