TDFS
Transwarp Distributed File System
High scalability, high availability, security, and reliability
Product Introduction
TDFS is a cloud-native, high-performance, strongly consistent distributed storage system that is compatible with the Hadoop ecosystem and supports both object storage and file systems. It is highly scalable, highly available, secure, and reliable. With TDFS, users get key functions such as multi-replica partition tolerance, "unlimited" capacity expansion, and data backup, migration, and recovery. TDFS meets users' massive data storage requirements, makes full use of resources through elastic allocation, and reduces the resource cost of analysis.
Core Capabilities
Mass Storage
TDFS provides unlimited file metadata storage without a single-node bottleneck. It meets the requirements of massive big data storage and analysis, improves resource utilization, and ensures high data availability.
Safe and Reliable
Built on a distributed architecture, TDFS stores data redundantly in multiple replicas to ensure data persistence and service availability. It is not affected by temporary failures, and it supports remote disaster recovery and resource isolation.
Data Management
TDFS provides a file directory structure and supports file-based data exchange during batch data import and export.
High Concurrency
TDFS has a tree-shaped directory structure similar to that of a traditional file system, which lets users quickly create and access directories, retrieve and query directory statistics, and manage permissions. TDFS also offers higher concurrency and faster operations on individual storage objects.
Ecosystem Compatibility
Built on a distributed storage architecture, TDFS is compatible with the HDFS communication protocol and can directly replace the NameNode. TDFS integrates with the Hadoop community and connects to the Hadoop ecosystem and to Transwarp's self-developed upstream components at zero cost.
Eight Reasons to Choose TDFS
High resource utilization
TDFS supports both object storage and file storage structures and covers most storage scenarios. More features will be developed in the future to meet the requirements of different business scenarios and to make resource use more elastic and efficient.
Highly controllable
TDFS is developed entirely by Transwarp, from the underlying architecture to the upper-level interfaces, which gives users greater control.
Seamless integration of multiple components
TDFS works together with Transwarp's internal components to improve the efficiency of big data storage. It provides low-latency, high-throughput, high-concurrency storage for business workloads and supports "real-time" business scenarios.
Elastic scaling
TDFS uses Transwarp's self-developed Raft implementation. When new TDFS NameManager and BlockManager nodes are added to the cluster, their Raft nodes automatically join the Raft group and historical data is synchronized automatically; retired nodes are automatically removed from the Raft group. The cluster can therefore be expanded and shrunk transparently to users.
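The join, catch-up, and retire steps described above can be pictured with a small toy model. This is only an illustrative sketch: the class `RaftGroup`, its methods, and the node names `nm1` to `nm3` are hypothetical and do not reflect TDFS's actual implementation or API.

```python
from typing import Dict, List


class RaftGroup:
    """Toy model of a Raft group's membership (hypothetical, not TDFS code)."""

    def __init__(self, leader: str, log: List[str]) -> None:
        self.leader = leader
        # Each member keeps its own copy of the replicated log.
        self.logs: Dict[str, List[str]] = {leader: list(log)}

    def add_node(self, node: str) -> None:
        # A joining node catches up by copying the leader's history.
        self.logs[node] = list(self.logs[self.leader])

    def remove_node(self, node: str) -> None:
        # Retiring the leader would need a leadership transfer first.
        if node == self.leader:
            raise ValueError("transfer leadership before retiring the leader")
        self.logs.pop(node, None)

    def quorum(self) -> int:
        # A majority of the current members must agree on each change.
        return len(self.logs) // 2 + 1


group = RaftGroup("nm1", ["mkdir /data", "create /data/a"])
group.add_node("nm2")
group.add_node("nm3")
print(group.quorum())      # 2 (majority of 3 members)
group.remove_node("nm3")
print(sorted(group.logs))  # ['nm1', 'nm2']
```

A real Raft implementation changes membership through the replicated log itself so that old and new configurations never elect two leaders at once; the toy model skips that step.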
Excellent performance
TDFS abandons the QJM high-availability solution in favor of Raft. When a node goes down, TDFS elects a new leader and continues to provide service, so failover between the active and standby nodes is low-cost. The Raft consensus mechanism ensures strong data consistency across nodes and multi-replica partition tolerance. Each TDFS BlockManager node maintains its block information in an embedded database and does not need to send a full block report every time it starts, which effectively avoids problems such as block report storms.
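The failover behavior above rests on a majority rule: a new leader can be chosen only while more than half of the group is reachable. The following sketch illustrates that rule under simplifying assumptions. The function `elect_leader` and the node names are hypothetical, and picking the live node with the longest log stands in for Raft's real voting with terms and randomized timeouts.

```python
from typing import Dict, Optional, Set


def elect_leader(last_log_index: Dict[str, int], alive: Set[str]) -> Optional[str]:
    """Pick a new leader, or return None if no majority is reachable.

    Toy rule (an assumption, not TDFS's algorithm): the live node with the
    most up-to-date log wins; ties are broken by node name.
    """
    quorum = len(last_log_index) // 2 + 1
    live = [node for node in last_log_index if node in alive]
    if len(live) < quorum:
        return None  # a minority partition must not elect a leader
    return max(live, key=lambda node: (last_log_index[node], node))


cluster = {"nm1": 10, "nm2": 10, "nm3": 8}

# nm1 (the old leader) goes down: two of three nodes still form a majority.
print(elect_leader(cluster, alive={"nm2", "nm3"}))  # nm2

# Only one node is reachable: no majority, so no leader is elected.
print(elect_leader(cluster, alive={"nm3"}))  # None
```

Refusing to elect a leader in a minority partition is what keeps two sides of a network split from both accepting writes.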
Lightweight cache
TDFS stores metadata in a lightweight embedded database and uses part of the memory as a cache, which effectively solves the memory bottleneck caused by storing a large number of small files.
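The idea of keeping all metadata in an embedded database while caching only hot entries in memory can be sketched as follows. This is a minimal illustration, not TDFS's design: SQLite stands in for whichever embedded database TDFS uses, and the class `MetadataStore`, its table layout, and the cache size are assumptions made for the example.

```python
import sqlite3
from collections import OrderedDict
from typing import Optional


class MetadataStore:
    """Toy metadata store: every entry lives in the embedded database,
    and only recently read entries are kept in a bounded LRU cache."""

    def __init__(self, path: str = ":memory:", cache_size: int = 2) -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS meta (path TEXT PRIMARY KEY, size INTEGER)"
        )
        self.cache: "OrderedDict[str, int]" = OrderedDict()
        self.cache_size = cache_size

    def put(self, path: str, size: int) -> None:
        self.db.execute("INSERT OR REPLACE INTO meta VALUES (?, ?)", (path, size))
        self.db.commit()
        self.cache.pop(path, None)  # drop any stale cached copy

    def get(self, path: str) -> Optional[int]:
        if path in self.cache:
            self.cache.move_to_end(path)  # mark as most recently used
            return self.cache[path]
        row = self.db.execute(
            "SELECT size FROM meta WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            return None
        self.cache[path] = row[0]
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return row[0]


store = MetadataStore(cache_size=2)
for name, size in [("/a", 1), ("/b", 2), ("/c", 3)]:
    store.put(name, size)
for name in ["/a", "/b", "/c"]:
    store.get(name)
print(len(store.cache))  # 2: memory use is bounded by the cache size
print(store.get("/a"))   # 1: evicted entries are still served from the database
```

Because memory holds at most `cache_size` entries, the number of files is limited by database capacity rather than by RAM, which is the point the paragraph above makes about small files.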
Easy O&M
Unlike the traditional HDFS + ZooKeeper deployment, the self-developed Raft is a core part of TDFS and does not need to run as a separate service process. This reduces O&M cost, makes TDFS more cohesive, and avoids the instability that external services can introduce.
Built with a low-level language
Thanks to the memory management model of the Rust language, TDFS needs no garbage collection and does not suffer from GC-induced "stuttering". Rust's compile-time checks effectively reduce many kinds of exceptions and concurrency problems, and no additional heap memory allocation is required. Memory is released as soon as it is no longer in use, which effectively reduces cost.
Application Scenarios
Data Lake
A unified storage pool that stores structured, semi-structured, and unstructured data at any scale.
Data Warehouse
Provides high-performance, highly reliable, low-latency, low-cost storage for data warehouses.
Lake House
Combining the characteristics of distributed file systems and object storage systems, TDFS supports most storage scenarios.
Data Migration
Migrates data and object storage to TDFS quickly, smoothly, and securely.
Transwarp, Shaping the Future Data World