This repo contains a big data project on "Real Time Twitter Sentiment Analysis via Kafka, Spark Streaming, MongoDB and Django Dashboard".
Updated May 20, 2024 - Jupyter Notebook
Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
Extensible data integration Java framework for building XML and non-XML fragment-based applications
A curated list of awesome tools and libraries for specific domains
Arkime is an open source, large scale, full packet capturing, indexing, and database system.
Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.
Apache DataFusion SQL Query Engine
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Postgres for Search and Analytics
AI + Data, online. https://vespa.ai
YTsaurus is a scalable and fault-tolerant open-source big data platform.
ClickHouse® is a free analytics DBMS for big data
Simple and Distributed Machine Learning
Stroom is a highly scalable data storage, processing and analysis platform.
Data-Centric Pipelines and Data Versioning
CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and artificial intelligence functionality along with secure user access and data governance features.
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs