# Tools, resources and services related to data management, processing, storage, transfer, etc.
#### Table of contents <a name="toc"></a>
* [Tools catalog / search / discovery / comparison service](#tool-search)
* [Database](#database)
  - [Embedded database](#embedded-db)
* [Database tools](#db-tools)
* [Data platform](#data-platform)
* [Data processing](#data-processing)
  - [JavaScript](#data-processing-js)
  - [Python](#data-processing-python)
### Tools catalog / search / discovery / comparison service <a name="tool-search"></a> [↑ ↑ ↑](#toc)
* [Database of Databases](https://dbdb.io/) - discover and learn about database management systems.
* [DB-Engines](https://db-engines.com/) - an initiative to collect and present information on database management systems (DBMS). Contains [DB-Engines Ranking](https://db-engines.com/en/ranking) - a list of DBMS ranked by their current popularity.
* [Embedded Database Systems](http://embedded-database.com/) - curated news and information about commercial and open source embedded database systems.
* [NoSQL Database](https://hostingdata.co.uk/nosql-database/) - a list of NoSQL database management systems.
### Database <a name="database"></a> [↑ ↑ ↑](#toc)
* [ArangoDB](https://www.arangodb.com/) - a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.
* [BangDB](https://bangdb.com/) - a multi-flavored, multi-model, embedded, distributed, high-performance, analytical, time-series NoSQL database written in C/C++ and designed from scratch to solve, simply and easily, contemporary and future problems that would otherwise require a huge amount of time and resources.
* [DuckDB](https://duckdb.org/) - a high-performance analytical database system. It is designed to be fast, reliable, portable, and easy to use. DuckDB provides a rich SQL dialect.
* [Firebird](https://firebirdsql.org/) - a relational database offering many ANSI SQL standard features that runs on Linux, Windows, and a variety of Unix platforms. Firebird offers excellent concurrency, high performance, and powerful language support for stored procedures and triggers.
* [ForerunnerDB](http://www.forerunnerdb.com/) - a NoSQL JavaScript JSON database that runs in browsers and Node.js, with a query language based on MongoDB's (with some differences).
* [H2](http://www.h2database.com/) - Java SQL database. Embedded and server modes; in-memory database.
* [OrientDB Community](https://orientdb.org/) - open source NoSQL DBMS that brings together the power of graphs and the flexibility of documents into one scalable high-performance operational database.
* [PostgreSQL](https://www.postgresql.org/) - a powerful, open source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads.
* [RxDB](https://rxdb.info/) - a realtime NoSQL database for JavaScript applications such as websites, hybrid apps, Electron apps, Progressive Web Apps, and Node.js. RxDB is short for Reactive Database: reactive means that you can not only query the current state but also subscribe to state changes, such as the result of a query or even a single field of a document.
* [SQLite](https://www.sqlite.org/) - a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.
* [YDB](https://ydb.tech/) - an AI-powered Distributed SQL DBMS that unifies transactional, analytical, federated, and streaming workloads, delivers strict consistency and high availability, and brings AI capabilities directly to developers.
* #### Embedded database <a name="embedded-db"></a> [↑ ↑ ↑](#toc)
  - [Berkeley DB](https://www.oracle.com/database/technologies/related/berkeleydb.html) - a family of embedded key-value database libraries providing scalable high-performance data management services to applications. The Berkeley DB products use simple function-call APIs for data access and management. Berkeley DB provides a collection of well-proven building-block technologies that can be configured to address any application need from the hand-held device to the data center, from a local storage solution to a world-wide distributed one, from kilobytes to petabytes.
  - [EJDB2](https://ejdb.org/) - an embeddable JSON database engine C library. Simple XPath-like query language (JQL). WebSocket / Android / iOS / React Native / Flutter / Java / Dart / Node.js bindings.
  - [LiteDB](http://www.litedb.org/) - embedded NoSQL database for .NET. An open source MongoDB-like database with zero configuration, mobile ready.
  - [ObjectBox](https://objectbox.io/) - an embedded, object-oriented database for Mobile Apps and IoT.
  - [Perst](https://www.mcobject.com/perst/) - open source, dual license, object-oriented embedded database system. It is available in one edition developed as an all-Java embedded database, and another implemented in C# (for Microsoft .NET Framework applications).
  - [Realm Mobile Database](https://www.mongodb.com/realm/mobile/database) - an open-source, object-oriented DBMS designed for mobile devices.
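
As a rough illustration of what "embedded" means for the entries above, the sketch below runs SQLite inside the application process through Python's standard-library `sqlite3` module, with no separate database server. The `top_tools` helper, the `tools` table, and the sample rows are hypothetical and exist only for this example.

```python
import sqlite3


def top_tools(rows, limit=2):
    """Load (name, score) rows into an in-memory SQLite database
    and return the names with the highest scores.

    `rows`, the `tools` table, and `score` are illustrative assumptions.
    """
    # ":memory:" keeps the whole database inside this process; no server is started.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tools (name TEXT PRIMARY KEY, score INTEGER NOT NULL)"
        )
        # Parameterized inserts avoid building SQL strings by hand.
        conn.executemany("INSERT INTO tools (name, score) VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT name FROM tools ORDER BY score DESC, name LIMIT ?", (limit,)
        )
        return [name for (name,) in cursor.fetchall()]
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("alpha", 10), ("beta", 30), ("gamma", 20)]
    print(top_tools(sample))
```

Passing a file path instead of `":memory:"` would persist the data in a single local file, which is the usual way an embedded engine is shipped inside an application.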
### Database tools <a name="db-tools"></a> [↑ ↑ ↑](#toc)
* [qStudio](https://www.timestored.com/qstudio/) - a free SQL client and notebook that lets you browse tables, run SQL scripts, and chart and export the results.
* [SQLiteStudio](https://sqlitestudio.pl/) ([GitHub repo](https://github.com/pawelsalawa/sqlitestudio)) - a free, open source, multi-platform SQLite database manager.
### Data platform <a name="data-platform"></a> [↑ ↑ ↑](#toc)
* [YTsaurus](https://ytsaurus.tech/) - a distributed storage and processing platform for large amounts of data. It includes a MapReduce computation model, a distributed file system, and a NoSQL key-value store.
### Data processing <a name="data-processing"></a> [↑ ↑ ↑](#toc)
* [Apache Spark](https://spark.apache.org/) - a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
* [Apache Zeppelin](https://zeppelin.apache.org/) - web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala, Python, R and more.
* [CUE](https://cuelang.org/) - an open source language, with a rich set of APIs and tooling, for defining, generating, and validating all kinds of data: configuration, APIs, database schemas, code, ...
* [Dasel](https://github.com/TomWright/dasel) (short for data-selector) - allows you to query and modify data structures using selector strings.
* [Huey](https://github.com/rpbouman/huey) - a browser-based application that lets you explore and analyze data. Huey supports reading from multiple file formats, like .csv, .parquet, .json data files as well as .duckdb database files.
* [Malloy](https://www.malloydata.dev/) - a modern open source language for analyzing, transforming, and modeling data.
* [Metabase](https://www.metabase.com/) - the easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data.
* [Miller](https://github.com/johnkerl/miller) - a command-line tool for querying, shaping, and reformatting data files in various formats including CSV, TSV, JSON, and JSON Lines.
* [Polars](https://pola.rs/) - an open-source library and tools for data manipulation, known for being one of the fastest data processing solutions on a single machine.
  - [polars-cli](https://github.com/pola-rs/polars-cli) - a command-line interface for running SQL queries with Polars as the backend.
* [RBQL - Rainbow Query Language](https://rbql.org/) - a technology that provides SQL-like language for data-transformation and data-analysis queries for structured data (e.g. CSV files, log files, Python lists, JS arrays).
* [Spyder](https://www.spyder-ide.org) - a powerful scientific environment written in Python, for Python, and designed by and for scientists, engineers and data analysts.
* #### JavaScript <a name="data-processing-js"></a> [↑ ↑ ↑](#toc)
  - [Arquero](https://idl.uw.edu/arquero/) - a JavaScript library for query processing and transformation of array-backed data tables.
  - [Danfo.js](https://danfo.jsdata.org/) - an open-source, JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.
  - [Simple data analysis (SDA)](https://github.com/nshiab/simple-data-analysis) - an easy-to-use and high-performance JavaScript library for data analysis. Works with tabular and geospatial data.
  - [SQLRooms](https://sqlrooms.org/) - a comprehensive framework for building powerful data analytics applications that run entirely in the browser. It combines DuckDB's SQL capabilities with React to create interactive, client-side analytics tools without requiring a backend.
* #### Python <a name="data-processing-python"></a> [↑ ↑ ↑](#toc)
  - [Amphi](https://amphi.ai/) - visual data transformation powered by Python. Designed for data preparation, reporting, and ETL.
  - [Ibis](https://ibis-project.org/) - the portable Python dataframe library.
  - [pandas](https://pandas.pydata.org/) - a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
  - [PyGWalker](https://kanaries.net/pygwalker) - a Python library for exploratory data analysis with visualization.